Techno Blender
Digitally Yours.
Browsing Tag

Eduardo

Deploying a Data Science Platform on AWS: Running containerized experiments (Part II) | by Eduardo Blancas | Oct, 2022

Data Science Cloud Infrastructure. A step-by-step guide to deploy a Data Science platform on AWS with open-source software. Photo by Guillaume Bolduc on Unsplash. In our previous post, we saw how to configure AWS Batch and tested our infrastructure by executing a task that spun up a container, waited for 3 seconds, and shut down. In this post, we’ll leverage the existing infrastructure, but this time, we’ll execute a more interesting example. We’ll ship our code to AWS by building a container and storing it in Amazon ECR, a…

Deploying a Data Science Platform on AWS: Setting Up AWS Batch (Part I) | by Eduardo Blancas | Oct, 2022

Data Science Cloud Infrastructure. A step-by-step guide to deploy a Data Science platform on AWS with open-source software. Your laptop isn’t enough, let’s use the cloud. Photo by CHUTTERSNAP on Unsplash. In this series of tutorials, we’ll show you how to deploy a Data Science platform with AWS and open-source software. By the end of the series, you’ll be able to submit computational jobs to AWS scalable infrastructure with a single command. Architecture of the Data Science platform we’ll deploy. Image by author. Screenshot of…

Introducing Snapshot Testing for Jupyter Notebooks | by Eduardo Blancas | Jul, 2022

Software Engineering for Data Science. nbsnapshot is an open-source package that benchmarks notebook outputs to detect issues automatically. Image by author. When analyzing data in a Jupyter notebook, I unconsciously memorize “rules of thumb” to determine if my results are correct. For example, I might print some summary statistics and become skeptical of some outputs if they deviate too much from what I’ve seen historically.…

From Jupyter to Kubernetes: Refactoring and Deploying Notebooks Using Open-Source Tools | by Eduardo Blancas | Jun, 2022

Software Engineering for Data Science. A step-by-step guide to going from a messy notebook to a pipeline running in Kubernetes. Photo by Myriam Jessier on Unsplash. Notebooks are great for rapid iterations and prototyping but quickly get messy. After working on a notebook, my code becomes difficult to manage and unsuitable for deployment. In production, code organization is essential for maintainability (it’s much easier to improve and debug organized code than a long, messy notebook). In this post, I’ll describe how you can use…

Bruce Campbell teams with Eduardo Risso for Sgt. Rock

Main cover by Gary Frank. Bruce Campbell teams with Eduardo Risso for DC Horror Presents: Sgt. Rock vs. The Army of the Dead. That’s it. That’s the tweet. Or an even shorter version: Nazi zombies. Sgt. Rock is an iconic DC character created back in 1959 by Robert Kanigher and Joe Kubert in the pages of Our Army at War #83. As the leader of “Easy Company,” he became one of the most famous war comics characters of all time. Sgt. Rock had a long run as one of the very last non-superhero holdovers from the 50s and has been…

Analyze and plot 5.5M records in 20s with BigQuery and Ploomber | by Eduardo Blancas | May, 2022

Software Engineering for Data Science. Develop scalable pipelines on Google Cloud using open-source software. Image by author. This tutorial will show how you can use Google Cloud and Ploomber to develop a scalable and production-ready pipeline. We’ll use Google BigQuery (data warehouse) and Cloud Storage to show how we can transform big datasets with ease using SQL, plot the results with Python, and store the results in the cloud. Thanks to BigQuery scalability (we’ll use a dataset with 5.5M records!) and Ploomber’s…