Techno Blender
Digitally Yours.
Browsing Tag: YUNNA

Do you really need a Feature Store? | by YUNNA WEI | Mar, 2023

Feature Store — the interface between raw data and ML models. "Feature store" has been around for a few years. There are both open-source solutions (such as Feast and Hopsworks) and commercial offerings (such as Tecton, Hopsworks and Databricks Feature Store). Many articles and blogs have been published about what a feature store is and why it is valuable, and some organizations have already adopted a feature store as part of their ML applications. However, it is worthwhile to…
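To make the idea of the feature store as an "interface" more concrete, here is a minimal, purely illustrative sketch (not from the article): a single feature definition is reused by both the offline training path and the online lookup path, which is the core value proposition a feature store productizes. The DataFrames, column names and function names are all hypothetical.

```python
import pandas as pd

# Hypothetical raw event data: one row per customer transaction.
raw_events = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 2],
    "amount": [20.0, 35.0, 5.0, 12.5, 7.5],
})

def compute_customer_features(events: pd.DataFrame) -> pd.DataFrame:
    """Single, shared feature definition: the 'interface' both training
    and serving code depend on, instead of re-implementing the logic twice."""
    return (
        events.groupby("customer_id")
        .agg(txn_count=("amount", "count"), avg_amount=("amount", "mean"))
        .reset_index()
    )

# Offline path: build a training set by joining features onto labels.
labels = pd.DataFrame({"customer_id": [1, 2], "churned": [0, 1]})
training_set = labels.merge(compute_customer_features(raw_events), on="customer_id")

# Online path: the same definition backs lookups at prediction time.
feature_rows = compute_customer_features(raw_events).set_index("customer_id")
print(feature_rows.loc[2].to_dict())  # features for customer 2 at serving time
```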

MLOps Automation — CI/CD/CT for Machine Learning (ML) Pipelines | by YUNNA WEI | Feb, 2023

Scaling the use of AI/ML by building Continuous Integration (CI) / Continuous Delivery (CD) / Continuous Training (CT) pipelines for ML-based applications. Background: In my previous article, MLOps in Practice — De-constructing an ML Solution Architecture into 10 components, I talked about the importance of building CI/CT/CD solutions to automate the ML pipelines. The aim of MLOps automation is to continuously test and integrate code changes, continuously train new models with new data, and upgrade model performance when required,…
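As a rough illustration of the "CT" part of such a pipeline (not code from the article), the sketch below shows the kind of promotion gate a continuous-training job might run after retraining on fresh data. The metric, threshold and helper names are assumptions made purely for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Hypothetical "new data" that has arrived since the last training run.
X, y = make_classification(n_samples=2000, n_features=10, random_state=7)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=7)

def train_candidate(X, y):
    """Retrain a candidate model on the latest data (the CT step)."""
    return LogisticRegression(max_iter=1000).fit(X, y)

def evaluate(model, X, y) -> float:
    return accuracy_score(y, model.predict(X))

PRODUCTION_ACCURACY = 0.85   # assumed metric of the currently deployed model

candidate = train_candidate(X_train, y_train)
candidate_accuracy = evaluate(candidate, X_test, y_test)

# The CI/CD side of the pipeline would only package and deploy the
# candidate if it beats (or at least matches) the production model.
if candidate_accuracy >= PRODUCTION_ACCURACY:
    print(f"Promote candidate: accuracy {candidate_accuracy:.3f}")
else:
    print(f"Keep production model: candidate only reached {candidate_accuracy:.3f}")
```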

ML model registry — the “interface” that binds model experiments and model deployment | by YUNNA WEI | Feb, 2023

MLOps in Practice — a deep dive into ML model registries, model versioning and model lifecycle management. Background: In my previous article, MLOps in Practice — De-constructing an ML Solution Architecture into 10 components, I talked about the architectural importance of managing the model metadata and artifacts generated by ML experiment runs. We all know that the model training process produces many artifacts, both for further ML model performance tuning and for subsequent ML model deployment. These artifacts include the…
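As one concrete but hedged example of what this "interface" looks like in practice, the sketch below uses MLflow's registry, since MLflow appears elsewhere in this series. The model name is hypothetical, it assumes a registry-capable tracking backend, and the exact registry API (stage transitions vs. the newer aliases) varies across MLflow versions.

```python
import mlflow
from mlflow.tracking import MlflowClient
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# 1. An experiment run logs the trained model plus its metrics as artifacts.
with mlflow.start_run() as run:
    model = LogisticRegression(max_iter=500).fit(X, y)
    mlflow.log_metric("train_accuracy", model.score(X, y))
    mlflow.sklearn.log_model(model, artifact_path="model")

# 2. The registry binds those artifacts to a named model and an
#    auto-incremented version ("churn_classifier" is a made-up name).
model_uri = f"runs:/{run.info.run_id}/model"
registered = mlflow.register_model(model_uri, "churn_classifier")

# 3. Lifecycle management: promote a specific version towards production.
#    Note: this requires a registry-capable backend (e.g. an MLflow server
#    backed by a database), not the plain local file store.
client = MlflowClient()
client.transition_model_version_stage(
    name="churn_classifier", version=registered.version, stage="Staging"
)
```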

Why data scientists should adopt Machine Learning (ML) pipelines | by YUNNA WEI | Feb, 2023

Opinion. MLOps in Practice — as a data scientist, are you handing over a notebook or an ML pipeline to your ML engineers or DevOps engineers for the ML model to be deployed in a production environment? Background: In my previous articles, I talked about the importance of building ML pipelines. In today's article, I will dive deeper into the topic of ML pipelines and explain in detail: why it is necessary and important to build ML pipelines; what the key components of an ML pipeline are; and why and how data scientists should adopt ML…
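To illustrate the difference the question is pointing at, here is a minimal sketch (not taken from the article) of the same work structured as pipeline steps rather than anonymous notebook cells. The data and step names are hypothetical, but each step becomes something an engineer can test, schedule and re-run independently.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def extract() -> pd.DataFrame:
    # Hypothetical source; a real pipeline would read from a warehouse or lake.
    return pd.DataFrame({
        "age": [25, 32, 47, 51, 62, 23, 36, 44],
        "income": [30, 45, 80, 75, 90, 28, 50, 66],
        "bought": [0, 0, 1, 1, 1, 0, 0, 1],
    })

def prepare(df: pd.DataFrame):
    X = df[["age", "income"]]
    y = df["bought"]
    return train_test_split(X, y, test_size=0.25, random_state=0)

def train(X_train, y_train):
    return LogisticRegression().fit(X_train, y_train)

def evaluate(model, X_test, y_test) -> float:
    return model.score(X_test, y_test)

def run_pipeline():
    # The orchestration layer (scheduler, CI job, etc.) calls this entry point
    # instead of re-executing notebook cells by hand.
    X_train, X_test, y_train, y_test = prepare(extract())
    model = train(X_train, y_train)
    print("holdout accuracy:", evaluate(model, X_test, y_test))

if __name__ == "__main__":
    run_pipeline()
```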

Build low-latency and scalable ML model prediction pipelines using Spark Structured Streaming and MLflow | by YUNNA WEI | Jan, 2023

MLOps in Practice series — sharing design and implementation patterns of critical MLOps components. The focus of today's article is on building model prediction pipelines. To make ML models work in a real production environment, one of the most critical steps is to deploy the trained models for predictions. Model deployment (release) is the process of integrating trained ML models into production so that they can make decisions on real-world data. When it comes to model deployment, there are generally two types: one is batch…
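Since this entry names the exact stack, a short, hedged outline of the streaming variant may help; it is a sketch rather than the article's implementation. It assumes PySpark and MLflow are installed, and the model URI, schema, source and sink paths are all placeholders.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
import mlflow.pyfunc

spark = SparkSession.builder.appName("streaming-inference").getOrCreate()

# Wrap a logged/registered MLflow model as a Spark UDF so it can score
# each micro-batch of the stream (the model URI below is a placeholder).
predict_udf = mlflow.pyfunc.spark_udf(
    spark, model_uri="models:/churn_classifier/Production", result_type="double"
)

# Hypothetical streaming source: new feature records landing as JSON files.
feature_stream = (
    spark.readStream
    .schema("customer_id LONG, txn_count DOUBLE, avg_amount DOUBLE")
    .json("/data/incoming_features/")
)

# Low-latency scoring: apply the model UDF to every incoming micro-batch.
scored = feature_stream.withColumn(
    "prediction", predict_udf(F.struct("txn_count", "avg_amount"))
)

# Write predictions out continuously; sink and checkpoint paths are placeholders.
query = (
    scored.writeStream
    .format("parquet")
    .option("path", "/data/predictions/")
    .option("checkpointLocation", "/data/checkpoints/predictions/")
    .start()
)
query.awaitTermination()
```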

Have You Ever “Tested” Your Data Pipelines? | by YUNNA WEI | Dec, 2022

A comprehensive guide to making your data pipelines testable, maintainable and reliable. Why is it necessary to test your data pipelines? Embedding appropriate tests into your data pipelines makes them less bug-prone and ensures the data goes through proper data quality checks before flowing to the end data consumers. The two key components of any data pipeline are "code" and "data". Code is the tool that manages how the data is Extracted, Transformed and Loaded (ETL), while data is the ingredient of the data pipeline. To…
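As a small, hypothetical example of what testing both the "code" and the "data" can look like (not taken from the article), the pytest sketch below unit-tests a transformation function and also runs a quality check on a batch before it is published; the table and column names are made up.

```python
import pandas as pd
import pytest

def clean_orders(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical transform step: drop rows with missing ids and
    reject negative amounts before the data reaches downstream consumers."""
    df = df.dropna(subset=["order_id"])
    if (df["amount"] < 0).any():
        raise ValueError("Negative order amounts found")
    return df

# --- "Code" tests: the transformation logic behaves as intended. ---
def test_drops_rows_with_missing_order_id():
    raw = pd.DataFrame({"order_id": [1, None], "amount": [10.0, 5.0]})
    assert len(clean_orders(raw)) == 1

def test_rejects_negative_amounts():
    raw = pd.DataFrame({"order_id": [1], "amount": [-3.0]})
    with pytest.raises(ValueError):
        clean_orders(raw)

# --- "Data" test: a quality check on an actual batch before publishing. ---
def test_batch_has_no_duplicate_order_ids():
    batch = pd.DataFrame({"order_id": [1, 2, 3], "amount": [10.0, 5.0, 7.5]})
    cleaned = clean_orders(batch)
    assert cleaned["order_id"].is_unique
```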

MLOps in Practice — De-constructing an ML Solution Architecture into 10 components | by YUNNA WEI | Dec, 2022

A comprehensive introduction to the 10 key components of an end-to-end ML solution. In my previous blogs, I have talked about the three key pipelines, namely (1) Data and Feature Engineering Pipelines, (2) ML Model Training and Re-training Pipelines, and (3) ML Model Inference and Serving Pipelines, as well as the underlying infrastructure required to build a reliable and scalable MLOps solution. You can find the details in those blogs. From this blog onwards, I will focus on explaining the detailed implementation of an end-to-end…

Data Engineering Best Practice — Embedding Reliability and Integrity into Your Data Pipelines | by YUNNA WEI | Nov, 2022

Building highly reliable and trustworthy data pipelines to deliver high-quality data and information to downstream data consumers. The Importance of Data Quality and Data Reliability: It goes without saying that data is absolutely critical to the operations and applications of many organizations nowadays. For data engineers, it is not only about delivering data through Extract, Transform and Load (ETL) pipelines, but about delivering reliable data so that the business can make valid and well-informed data-driven…
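A minimal, hedged sketch of the kind of reliability check this implies (not from the article): a simple quality report computed at the end of an ETL job, with hypothetical column names, before the batch is handed to downstream consumers.

```python
import pandas as pd

def quality_report(df: pd.DataFrame) -> dict:
    """Hypothetical reliability checks run at the end of an ETL job."""
    return {
        "row_count": len(df),
        "null_customer_ids": int(df["customer_id"].isna().sum()),
        "duplicate_rows": int(df.duplicated().sum()),
        "negative_amounts": int((df["amount"] < 0).sum()),
    }

# A deliberately messy example batch.
batch = pd.DataFrame({
    "customer_id": [1, 2, 2, None],
    "amount": [10.0, 5.0, 5.0, -1.0],
})

report = quality_report(batch)
print(report)

# Quarantine or alert instead of silently publishing unreliable data.
issues = [name for name, count in report.items() if name != "row_count" and count > 0]
if issues:
    print("Batch failed quality checks:", issues)
else:
    print("Batch passed; safe to publish downstream")
```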

What Makes a Good Data Pipeline — A Pre-Production Checklist for Data Engineers | by YUNNA WEI | Nov, 2022

The most essential part of becoming a data engineer is building highly scalable and reliable data pipelines. In one of my previous articles on this subject, namely Learning the Core of Data Engineering — Building Data Pipelines, I talked about the 8 key components of building a data pipeline, also called an Extract, Transform and Load (ETL) pipeline. In today's article, I will go into more detail to explain what makes a data pipeline good enough for deployment in a production environment. Hopefully this can also provide you with a…

How is an ML-Driven System Unique? — Understanding Why MLOps Is Necessary and What Components of ML Infrastructure Are Required | by YUNNA…

A deeper explanation of why MLOps is required, and what problems MLOps is trying to solve to make AI work in a real-world production environment. For any organization that has already built ML solutions, or is even at the PoC stage of developing an ML model, I am sure MLOps has been a topic of discussion. Indeed, in order to develop and deploy an ML solution in a production environment in a reliable, secure and highly available manner, MLOps is required. In today's blog, I would like to take one step back and explain: why is MLOps necessary…