Techno Blender

A Survey of the Top Three MLOps tools | by Sadrach Pierre, Ph.D. | Oct, 2022



State of the art tools for machine learning model deployment and management

Image by Pixabay on Pexels

Deploying and maintaining machine learning models is essential for any company using predictive analytics to deliver value to its clients. MLOps, short for machine learning operations, refers to the set of practices used to reliably deploy and maintain machine learning models. With the rise of data science and machine learning teams across industries, companies increasingly rely on predictive analytics, which has created a need for systems that ensure machine learning pipelines are robust to change and produce reproducible results.

MLOps sits at the intersection of software development operations (DevOps), machine learning, and data engineering. It applies a specific set of practices to every step of the machine learning lifecycle: data collection and processing, labeling and feature engineering, model training and optimization, and endpoint deployment and monitoring. At a high level, MLOps seeks to automate these interconnected steps efficiently and reliably while maintaining the quality of model predictions and meeting business requirements.

For any company looking to scale its machine learning systems, implementing some basic MLOps practices is essential. For example, in data collection and processing, data checks should be in place before proceeding to the labeling and feature engineering steps of the lifecycle. A data refresh may introduce poor-quality data, creating bottlenecks for labeling and feature engineering. This will almost certainly degrade model performance and reduce the utility of the ML system for clients and end users.
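A pre-labeling data check of this kind can be sketched in a few lines. This is a minimal illustration using only the standard library; the field names and the 5% null-rate threshold are assumptions for the example, not values from any particular pipeline.

```python
# Minimal sketch of a data-quality gate run after a data refresh,
# before labeling and feature engineering. Field names and the
# null-rate threshold below are illustrative assumptions.
def check_batch(rows, required_fields=("user_id", "amount"), max_null_rate=0.05):
    """Return (ok, null_rate) for a batch of dict records."""
    if not rows:
        # An empty refresh is treated as a hard failure.
        return False, 1.0
    nulls = sum(
        1 for row in rows for field in required_fields if row.get(field) is None
    )
    null_rate = nulls / (len(rows) * len(required_fields))
    return null_rate <= max_null_rate, null_rate

# Toy batch: one record is missing a required value.
batch = [{"user_id": 1, "amount": 9.5}, {"user_id": 2, "amount": None}]
ok, rate = check_batch(batch)
```

In a real pipeline a failed check would halt the downstream labeling and feature engineering jobs rather than letting bad data flow through.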

Another example is monitoring the runtime of model training and model predictions. If a data refresh significantly increases model training time, there should be a process in place to switch to a more powerful machine or a cluster of machines so that deployment isn't significantly delayed. Further, regarding endpoint deployment and monitoring, the latency of a model predict call can have a significant impact on user experience. For example, if there are expectations for how quickly a tool returns a prediction, prediction call runtimes should be monitored whenever the training data, model features, or model type changes. Having checks in place for data ingestion, model validation, and business requirements can prevent poor-quality models, and the nonsensical insights they produce, from being shipped to a client. A wide variety of tools is available for monitoring data, models, and business rules. Here we will survey three of the most popular MLOps tools.
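The latency check described above can be sketched as a thin wrapper around the predict call. The 200 ms budget and the toy model below are assumptions for illustration; a real deployment would feed the elapsed times into a monitoring system rather than just returning a flag.

```python
import time

# Illustrative latency budget; the 200 ms figure is an assumption,
# not a value from the article.
LATENCY_BUDGET_S = 0.200

def timed_predict(predict_fn, features):
    """Run a predict call and flag it if it exceeds the latency budget."""
    start = time.perf_counter()
    prediction = predict_fn(features)
    elapsed = time.perf_counter() - start
    within_budget = elapsed <= LATENCY_BUDGET_S
    return prediction, elapsed, within_budget

# Toy model standing in for a real endpoint: predicts the mean of the inputs.
pred, elapsed, ok = timed_predict(lambda x: sum(x) / len(x), [1.0, 2.0, 3.0])
```

Re-running this wrapper after every change to the training data, features, or model type gives an early signal that a deployment would violate the response-time expectation.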

Kubeflow

Kubeflow is a machine learning toolkit developed by Google that allows data scientists and machine learning engineers to manage the machine learning lifecycle. Kubeflow builds on Kubernetes for deploying, scaling, and managing complex systems such as machine learning pipelines. Kubernetes is an open-source container orchestration system that has mostly been used in software engineering, where it automates software deployment, scaling, and management.
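To make the Kubernetes foundation concrete, here is a minimal Deployment manifest of the kind Kubeflow builds on. The resource name, image, and port are placeholders; the point is that Kubernetes declaratively keeps a requested number of containerized replicas running.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: model-server            # placeholder name
spec:
  replicas: 3                   # Kubernetes keeps three copies running
  selector:
    matchLabels:
      app: model-server
  template:
    metadata:
      labels:
        app: model-server
    spec:
      containers:
        - name: model-server
          image: registry.example.com/model-server:latest  # placeholder image
          ports:
            - containerPort: 8080
```

If a replica crashes or a node disappears, Kubernetes recreates the pod to restore the declared state, and changing `replicas` scales the service up or down. Kubeflow layers ML-specific workflows on top of exactly this machinery.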

Kubeflow is essentially an extension of Kubernetes for machine learning pipelines. The image below conceptually outlines the Kubeflow platform and how it fits into the different components of the machine learning workflow. Roughly, it is broken down into ML tools, model serving tools, and cloud computing platforms.

Image by Author

Kubeflow is typically most useful for DevOps engineers working closely with data scientists. It is probably the best choice for model serving at scale.

MLflow

Another similar toolkit is MLflow, which was developed by Databricks. MLflow has four main components: Tracking, Projects, Models, and Registry:

Image by Author

As the diagram indicates, these components support model experimentation, reproducible runs, model deployment, and model storage, respectively.

Tracking

Tracking is most useful for model experimentation. It provides an API for logging model hyperparameters, model performance metrics, and artifacts such as pickled trained models.

Projects

The Projects component is useful for packaging data science and machine learning code in a reproducible way. It also has an API and command-line tools that allow multiple projects to be chained together.
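Packaging is driven by an `MLproject` file at the project root. The sketch below shows the general shape of such a file; the project name, environment file, parameters, and command are illustrative placeholders.

```yaml
# MLproject file at the project root; names and parameters are illustrative.
name: demo_project

conda_env: conda.yaml

entry_points:
  main:
    parameters:
      alpha: {type: float, default: 0.5}
      data_path: {type: str, default: "data/train.csv"}
    command: "python train.py --alpha {alpha} --data-path {data_path}"
```

A project packaged this way can then be run reproducibly with the CLI, e.g. `mlflow run . -P alpha=0.1`, with MLflow recreating the declared environment before executing the entry point.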

Models

The Models component is used for serving models. For example, it can expose a model behind a REST API for real-time prediction serving.

Registry

Finally, the Registry component provides several tools for managing the full lifecycle of an MLflow model, including a centralized model store, APIs, and a UI that can be used for tracking model lineage, version control, model staging, and annotation.

MLflow is most useful for data scientists and machine learning engineers, as it allows them to easily design, run, and reproduce experiments.

AWS SageMaker

AWS SageMaker is another popular MLOps tool. It is a very user-friendly option for data scientists and business analysts, providing IDEs and no-code UIs for each respective persona. It also has tools for accessing, labeling, and processing both structured and unstructured data. Further, like Kubeflow and MLflow, it enables straightforward deployment and monitoring of machine learning models.

Image by Author

Overall, AWS SageMaker is probably the best option for the data engineering and data wrangling steps of the machine learning lifecycle, as it has a built-in IDE equipped with tools for these tasks.

Comparing Kubeflow, MLflow and AWS SageMaker

There are many similarities and differences among the three platforms discussed here. At a high level, each can be used for deploying, serving, and maintaining machine learning models. A big difference between Kubeflow and the other two platforms is that Kubeflow depends on Kubernetes. MLflow, by contrast, is a Python library that makes it very easy for data scientists to run and save modeling experiments; it can also be used with AWS SageMaker for model deployment. Kubeflow further differs from SageMaker in that it is free and open source, while SageMaker is a paid AWS service (with a limited free tier). Finally, AWS SageMaker is a better choice than Kubeflow or MLflow for data engineering and data wrangling tasks. Given that each platform has pros and cons, DevOps engineers, data scientists, and machine learning engineers should have some familiarity with all three.

Conclusions

Machine learning operations has become an important part of developing, monitoring, and maintaining the machine learning lifecycle. When managing data science and machine learning pipelines, it is important to have a general awareness of best practices and the state-of-the-art tools available. Kubeflow is best suited to serving machine learning models at scale. MLflow is best for model experimentation and logging. AWS SageMaker offers many of the same tools as Kubeflow and MLflow while also providing more support for data engineering tasks.

