Techno Blender
Digitally Yours.
Browsing Tag: Eirik

Pipelines in Scikit-Learn: An Amazing Way to Bundle Transformations | by Eirik Berge, PhD | Apr, 2023

One of the most popular Python libraries for dealing with machine learning tasks is scikit-learn. It went public in 2010 and has since been essential for implementing popular supervised ML algorithms like logistic regression, random forests, and support vector machines.

When writing code in scikit-learn, you can use a feature called pipelines. This feature allows you to bundle up several of the steps in the machine learning process into a single component. The use of pipelines is one of the single most determining factors…
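As a minimal sketch of what bundling steps into a single component can look like, the snippet below chains a scaler and a logistic regression with scikit-learn's `Pipeline`. The synthetic data, the step names `"scaler"` and `"classifier"`, and the choice of steps are assumptions made for illustration; they are not taken from the article.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data, used only for illustration (an assumption, not the article's data).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Bundle preprocessing and the model into one estimator.
# The step names are hypothetical labels chosen for this sketch.
pipeline = Pipeline(
    steps=[
        ("scaler", StandardScaler()),
        ("classifier", LogisticRegression()),
    ]
)

# fit() scales the training data and then trains the classifier on it.
pipeline.fit(X_train, y_train)

# predict() and score() apply the already-fitted scaler before the classifier.
predictions = pipeline.predict(X_test)
accuracy = pipeline.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```

Because the scaler is fitted inside the pipeline, it only ever sees the training data during `fit`, which is one common reason to prefer this pattern over scaling by hand.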

Don’t Underestimate the Elegant Power of Python Sets 💣 | by Eirik Berge, PhD | Aug, 2022

In Python there are four basic container types:

Lists: Lists in Python are mutable sequences that can contain data of different types. An example is dates = .

Tuples: Tuples in Python are similar to lists, except that they are not mutable. Their internal state does not change over time. An example is seasons = ("Spring", "Summer", "Autumn", "Winter").

Dictionaries: Dictionaries in Python are key-value storage where the keys point to a specific value. An example is age = {"Eirik": 27, "Grandma": 86}.

Sets: Sets in Python are…
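The container types listed above can be sketched side by side as follows. The `seasons` and `age` values come from the excerpt; the `visitors` list and the `warm_seasons` set are hypothetical values added for illustration, and the set operations shown are standard Python rather than anything the truncated excerpt states.

```python
# Tuple and dictionary examples taken from the excerpt.
seasons = ("Spring", "Summer", "Autumn", "Winter")
age = {"Eirik": 27, "Grandma": 86}

# Hypothetical list with a duplicate entry (not from the article).
visitors = ["Eirik", "Grandma", "Eirik"]

# Converting to a set drops duplicates; sets are unordered.
unique_visitors = set(visitors)

# Set difference: seasons that are not in the (hypothetical) warm set.
warm_seasons = {"Spring", "Summer"}
cold_seasons = set(seasons) - warm_seasons

# Set intersection: visitors whose age is recorded in the dictionary.
visitors_with_age = unique_visitors & set(age)

# Membership tests work on sets just like on the other containers.
print("Eirik" in unique_visitors)
print(sorted(cold_seasons))
```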

Five Essential Presenting Tips for Data Professionals | by Eirik Berge, PhD | Jul, 2022

People in data roles (data scientists, data engineers, data analysts, etc.) often have to present their work to various audiences:

You might be working for a consulting company and are asked to brief the customer on your findings for a project. Your customer has strong domain knowledge, but might be lacking in basic data literacy.

You might be working for a traditional company where you have recently implemented a machine learning solution to help automate a central process in the company. Great work! The higher-ups have…

Master CSV Files in the Terminal With the Csvkit Package | by Eirik Berge | Jun, 2022

Few things can seem as intimidating as working in the command line terminal. Many data scientists don’t have a computer science background, and so are not used to this. It might even seem like working in the command line terminal is a relic of the past 😧

I am here to tell you that this is false. In fact, with the rise of cloud-based shells such as the Azure Cloud Shell, learning the command line is more valuable than ever. The truth is that the command line is an efficiency booster. Using the graphical user…

How to Write High-Quality Python as a Data Scientist | by Eirik Berge | May, 2022

When starting out as a data scientist, one is told conflicting tales regarding code quality. Some people say that code quality is really important. Others say that data scientists are not software engineers and adopt the following mantra:

Who cares? If it works it works, right?

When presented with the option to care about code quality or not, it is tempting to choose the path of least resistance. Learning to write high-quality code takes time and effort. Why not simply disregard code quality and have one less thing to worry…

Why Software Development Skills are Essential for Data Science | by Eirik Berge | May, 2022

Data scientists were dubbed the Sexiest Job Of The Century by the Harvard Business Review, Forbes, and others almost a decade ago. No wonder, as data scientists were given high wages and interesting problems to solve. It quickly became a popular job for both college graduates and self-taught learners to aspire to.

The data scientist of the 2010s had an incredibly broad scope and ill-defined responsibilities. A data scientist was simply someone who could generate insight from data.

Two data scientists at different companies…