Techno Blender
Digitally Yours.
Browsing Tag

Bernardo

How Data Science Helped Sherlock Holmes Find a Murderer | by Bernardo Furtado | Apr, 2023

Solve the “Who Killed the Duke of Densmore” mystery using graph theory, constraint programming, and mixed-integer linear programming. Inventory management, portfolio optimization, machine scheduling, vehicle routing, and many other real-life problems are excellent examples of how data science and analytical techniques can be employed. It’s no surprise that these are some of the first problems we learn in university. However, I’m fascinated by the numerous other problems that can be…

Heuristics as Warm Start for Mixed Integer Programming (MIP) Models | by Bernardo Furtado | Apr, 2023

Setting a starting solution in MIP models: a scheduling application. In computer science, heuristics are techniques used to find a feasible solution to a given problem, typically faster than exact methods but without a guarantee of optimality. Exact methods, on the other hand, are computationally much more expensive but guarantee the optimal solution. Modeling a problem as a Mixed Integer Program (MIP) and solving it using a solver may give you the optimal solution. Usually, those solvers…

Unsupervised Learning Method Series — Exploring K-Means Clustering | by Ivo Bernardo | Apr, 2023

Let’s explore one of the most famous unsupervised learning methods and how it uses distances to map similar instances together. Unsupervised learning is a mysterious, yet fun, art. While there is no ground truth label to predict and it may be harder to evaluate the solution we come up with, unsupervised learning methods are extremely interesting techniques to understand our data’s structure and reduce its complexity. Along with visualization and dimensionality reduction techniques,…

How to Start Learning the R Programming Language | by Ivo Bernardo | Dec, 2022

In this guide, I will walk you through the steps to create a comprehensive study plan that will give you a solid foundation in the R language. For those unfamiliar with it, R is an open source programming language that’s widely used for data analysis and statistical computing. It’s a powerful tool that allows you to work with large datasets, create visualizations, and build algorithms, among other things. Most R courses or materials start by showing you how to work with DataFrames,…

A guide to using ggmap in R. Learn how to work with ggmap, a cool R… | by Ivo Bernardo | Dec, 2022

Learn how to work with ggmap, a cool R library to visualize data. Adding spatial and map capabilities can be a very good way to enhance your data science or analytics projects. Whether you want to showcase some examples on a map or you have some geographical features to build algorithms, having the ability to combine data and maps is a great asset for any…

Building your First Shiny app in R | by Ivo Bernardo | Nov, 2022

Learn how to build a Shiny app using R, and showcase your code and work interactively. So, you’ve developed your data science model or analysis using R, and now you are, maybe, thinking that you would like to showcase the results in a visual and intuitive way. You’ve tweaked a bit of your storytelling and added some plots, but you feel that your visuals are a bit static, making you repeat code or plots throughout the storytelling process. Also, you are having a hard time hiding your code from the…

A Guide to using h20.ai in R. Learn how to work with the h2o library… | by Ivo Bernardo | Nov, 2022

Learn how to work with the h2o library using the R language. R has many machine learning libraries one can use to train models. From caret to stand-alone libraries such as randomForest, rpart or glm, R provides a wide array of options when you want to perform some data science tasks. A curious library that you may have never heard of is h2o. An in-memory platform for distributed and scalable machine learning, h2o can run on powerful clusters when you need boosted computing power. The…

How to Use SubQueries in SQL. Learn how to make your SQL queries more… | by Ivo Bernardo | Oct, 2022

Learn how to make your SQL queries more flexible using subqueries and reduce code clutter. Subqueries are a cool concept that we can use when programming in Structured Query Language (SQL). Starting with the problem: sometimes we want to access data outside the context of our query to filter rows or perform some special filter based on an aggregation metric. When we want to do that, we may fall into the trap of creating too many checkpoints that depend on each other with several temporary…

6 Statistical Concepts for Data Scientists | by Ivo Bernardo | Sep, 2022

Know about some statistical concepts that will help you on your journey as a data scientist or analyst. Statistics is one of the major components of a data scientist’s job description. Knowing statistics is especially important when it comes to drawing conclusions about data or models, and to avoiding the common pitfalls we can fall into when building models or analyses. There might be a temptation to discard statistics in a world where computing power and AI innovation get a boost every single day. Why learn about p-values, data…

Using Resample in Pandas. Learn how to work with the Pandas… | by Ivo Bernardo | Sep, 2022

Learn how to work with the pandas resample method, a cool way to work with time-based data. Time-based data is one of the most common data formats that you, as a data scientist, have probably stumbled upon. Whether in the form of historical features (for instance, customer data) or time series data, it’s pretty common to have to deal with timestamp columns in data pipelines. If you work as a data scientist in 2022, it’s mandatory that you know how to work with pandas, one of the most…