
Various Types of Deployment in Machine Learning | by Suhas Maddali | Jan, 2023



Photo by Jiawei Zhao on Unsplash

There is considerable scope and demand for machine learning, especially in the self-driving industry, where drivers are assisted with the aid of AI. Other industries are benefiting as well: the pharmaceutical industry is beginning to use AI to build interesting products for predictive healthcare, and e-commerce platforms suggest the products most relevant to each user, increasing customers’ inclination to buy.

Oftentimes, there is much talk about the capabilities of machine learning models and how they achieve state-of-the-art accuracy on a large number of tasks. The least talked about topic, however, is deploying them in real time, along with the constant monitoring and evaluation required in production. This key consideration is also missing from a large number of machine learning and deep learning courses found online. A machine learning model is only as good as our ability to put it in front of end users as an application.

Taking a look at the different industries that rely on machine learning would make many people lean toward this field. There are a large number of online courses that highlight the key areas of machine learning such as feature engineering, data preparation, model building, hyperparameter tuning, and so on. However, one essential ingredient is missing from those courses: deployment.

Photo by Mediamodifier on Unsplash

In this article, we will take a look at the different types of deployment strategies that are essential to learn if one wants to bring AI capabilities to a team. Let us now go over the list of deployment strategies for machine learning in detail.

Batch Inference

Now that you have trained machine learning models and performed hyperparameter tuning, it is time to put the best model into production. Batch inference is a deployment strategy in which a machine learning model accepts batches of data on a periodic schedule rather than serving requests in real time. As a result, the model can mostly work offline on periodic tasks, such as generating reports or predictions.

Batch inference can be useful in scenarios where we want to classify the sentiment of customers towards various products. Customers submit reviews over time, and if we want to know their overall sentiment towards a product, scoring the accumulated reviews in batches is a great way to deploy machine learning models.
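The batch pattern above can be sketched in a few lines of Python. The keyword lookup here is a toy stand-in for a trained sentiment model (in a real job you would load a serialized model from disk), and the function names are illustrative, not from any particular library:

```python
from collections import Counter

# Toy stand-in for a trained sentiment model: a simple keyword lookup.
POSITIVE = {"great", "love", "excellent"}
NEGATIVE = {"bad", "broken", "terrible"}

def predict_sentiment(review: str) -> str:
    words = set(review.lower().replace(",", "").split())
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

def run_batch_job(reviews: list[str]) -> dict:
    """Score an accumulated batch of reviews and summarize the results.
    In production this would run on a schedule (e.g. nightly via cron)."""
    labels = [predict_sentiment(r) for r in reviews]
    return dict(Counter(labels))

batch = [
    "I love this product, it is great",
    "Arrived broken, terrible experience",
    "It is okay",
]
report = run_batch_job(batch)
print(report)  # summary counts per sentiment label
```

The key property of batch inference is visible here: nothing waits on a live request, so the job can run whenever it is scheduled and write its report for later consumption.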

Real-time Inference

This is a type of deployment in which machine learning models run on incoming data in real time. They stand ready to receive data in an expected format and return predictions immediately so that actions can be taken accordingly. Real-time inference also brings its own requirements, such as a low-latency serving system or higher predictive accuracy, depending on the project and the goals of the team.

One of the classic examples of real-time inference is detecting fraudulent activity as transactions are made. The machine learning model is first trained on data containing both fraudulent and non-fraudulent transactions. After the best model is selected, it is deployed for real-time inference so that customers can be alerted as soon as fraudulent activity is suspected.
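A minimal sketch of per-request scoring, with latency measured the way a real-time service would track it. The rule-based scorer is a hypothetical stand-in for a trained fraud classifier, and the field names are invented for illustration; in practice this function would sit behind a low-latency HTTP or gRPC endpoint:

```python
import time

# Hypothetical rule-based stand-in for a trained fraud model.
def fraud_score(txn: dict) -> float:
    score = 0.0
    if txn["amount"] > 5000:                   # unusually large transaction
        score += 0.5
    if txn["country"] != txn["home_country"]:  # foreign transaction
        score += 0.3
    if txn["hour"] < 6:                        # unusual time of day
        score += 0.2
    return score

def serve_request(txn: dict, threshold: float = 0.7) -> dict:
    """Score one incoming transaction as it arrives."""
    start = time.perf_counter()
    score = fraud_score(txn)
    latency_ms = (time.perf_counter() - start) * 1000
    return {"fraud": score >= threshold, "score": score, "latency_ms": latency_ms}

txn = {"amount": 9000, "country": "FR", "home_country": "US", "hour": 3}
result = serve_request(txn)
print(result["fraud"])  # True: this transaction trips the threshold
```

Unlike batch inference, each prediction happens synchronously with the event, which is why latency budgets matter here and not in the batch case.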

On-premises Deployment

ML teams often face requirements for strong security measures and data compliance before a product is deployed. In such scenarios, extra importance is given to the data and the ML code that is put into production.

On-premises deployment involves deploying ML models on physical devices or servers within an organization’s own facility. As a result, it offers a high degree of security and control over both the data and the models.

On-premises deployment can be useful for predictive maintenance, in which ML models estimate the likelihood of failure of various pieces of manufacturing equipment. Rather than relying on the internet for real-time predictions, we use our own servers and machines that provide the compute needed for machine learning predictions. Whenever the model predicts defects in manufactured materials, staff can take action by replacing those parts.
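The predictive-maintenance idea can be sketched as a check that runs entirely on local servers against local sensor data. The thresholds and reading format are illustrative assumptions, standing in for a trained failure-prediction model:

```python
# Illustrative thresholds; a real system would use a trained model instead.
VIBRATION_LIMIT = 7.0   # mm/s
TEMP_LIMIT = 90.0       # degrees C

def check_machine(reading: dict) -> bool:
    """Return True if the machine is predicted to be at risk of failure."""
    return reading["vibration"] > VIBRATION_LIMIT or reading["temp"] > TEMP_LIMIT

# Readings polled from equipment on the local network; no internet needed.
readings = [
    {"machine": "press-1", "vibration": 3.2, "temp": 71.0},
    {"machine": "press-2", "vibration": 8.9, "temp": 84.0},  # excessive vibration
]
at_risk = [r["machine"] for r in readings if check_machine(r)]
print(at_risk)  # ['press-2']
```

Everything in this loop runs inside the facility, which is exactly the property that makes on-premises deployment attractive when data cannot leave the building.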

Cloud Deployment

This is a type of deployment in which we host our machine learning services on the cloud. We draw compute resources and memory from a cluster of machines, so we can scale our applications up or down depending on the traffic generated by users’ ML operations.

Cloud deployment can be useful, for example, when we are not certain how many resources we will need to train and deploy models. Furthermore, these services can be spun up only when users are actually requesting predictions.

One popular example of cloud deployment is predicting the chances that a customer will churn from a particular set of services. If we have built a subscription service, we would predict whether a customer is going to leave based on a set of predictive features. Since we do not know in advance how many customers will enroll in or leave the service, deploying the models in the cloud is a good approach, as it allows us to scale up and down as traffic demands.
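The churn prediction itself can be sketched as a small logistic model. The feature names and weights below are hand-picked, hypothetical values, not a trained model; in a cloud deployment this function would sit behind an autoscaling endpoint so capacity grows and shrinks with traffic:

```python
import math

# Hypothetical logistic-regression weights for churn prediction.
WEIGHTS = {"months_subscribed": -0.08, "support_tickets": 0.6, "logins_per_week": -0.3}
BIAS = 0.5

def churn_probability(customer: dict) -> float:
    """Probability of churn via the logistic (sigmoid) function."""
    z = BIAS + sum(WEIGHTS[k] * customer[k] for k in WEIGHTS)
    return 1 / (1 + math.exp(-z))

# A long-time subscriber with many support tickets and few logins.
customer = {"months_subscribed": 24, "support_tickets": 5, "logins_per_week": 1}
p = churn_probability(customer)
print(round(p, 2))  # a churn probability between 0 and 1
```

The function is stateless, which is what makes it easy to replicate across cloud instances: any copy can answer any request, so a load balancer can spread traffic freely.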

Mobile Deployment

This is a type of deployment in which mobile devices such as smartphones and tablets serve as the platform on which machine learning models run. Examples include personal assistants, image recognition, and language translation applications.

Since we are deploying the models in resource-constrained environments, unlike models deployed on servers, hardware considerations must be taken into account before shipping an ML product. A machine learning application may be quite useful and reasonably accurate, but if the model cannot generate predictions with limited hardware resources, it is not a good fit for mobile applications.

Constraints to consider when deploying these products in real time on mobile devices include low-latency requirements and tradeoffs between model size and accuracy, among other factors.
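One common way to fit a model within a phone's constraints is post-training quantization: storing 32-bit float weights as 8-bit integers roughly quarters the model size, at a small cost in accuracy. This is a minimal sketch of the idea in plain Python, not any particular mobile framework's API:

```python
def quantize(weights: list[float]) -> tuple[list[int], float]:
    """Map float weights to int8 range [-127, 127] with one linear scale."""
    scale = max(abs(w) for w in weights) / 127
    return [round(w / scale) for w in weights], scale

def dequantize(q: list[int], scale: float) -> list[float]:
    """Recover approximate float weights from the quantized integers."""
    return [v * scale for v in q]

weights = [0.52, -1.27, 0.03, 0.9]
q, scale = quantize(weights)
restored = dequantize(q, scale)
print(q)         # small integers in [-127, 127], stored in 1 byte each
print(restored)  # close to the original weights
```

This single-scale scheme illustrates the size/accuracy tradeoff directly: the integers are 4x smaller to store, and the restored weights differ from the originals by at most half a quantization step.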

Edge Deployment

This is a type of deployment in which machine learning models run on edge devices such as Internet of Things (IoT) hardware. These devices sit at the edge of a network and typically rely on a steady internet connection to make predictions. Nonetheless, a few IoT devices do not require internet access and instead carry their own hardware capable of generating predictions locally.

There are quite a few requirements that must be taken into consideration when trying to use this type of deployment. Important considerations include processing power, memory capacity, and connectivity. These factors can have a strong influence on the performance of IoT devices that are used with machine learning capabilities.

Another important consideration is to optimize the model for edge deployment and to weigh whether parts of the workload are better run in the cloud. This can involve reducing the complexity or size of the ML model used on the edge device. The right type of deployment therefore depends on the devices providing the machine learning capabilities.
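The split between on-device and cloud inference described above can be sketched as a simple routing policy: use the small local model when it is confident, and defer to a (hypothetical) cloud model only when the device is online and the local answer is uncertain. Both "models" here are toy stand-ins:

```python
def local_model(x: float) -> tuple[str, float]:
    # Small, fast on-device model: less confident near the decision boundary.
    label = "anomaly" if x > 0.5 else "normal"
    confidence = abs(x - 0.5) * 2
    return label, confidence

def cloud_model(x: float) -> str:
    # Larger cloud-hosted model, reachable only when online.
    return "anomaly" if x > 0.55 else "normal"

def predict(x: float, online: bool, min_confidence: float = 0.4) -> str:
    label, conf = local_model(x)
    if conf < min_confidence and online:
        return cloud_model(x)   # defer uncertain cases to the cloud
    return label                # otherwise stay fully on-device

print(predict(0.9, online=False))   # confident local answer, no network needed
print(predict(0.52, online=True))   # uncertain, so deferred to the cloud
```

The policy keeps the common, confident cases on the device (saving bandwidth and latency) while still using cloud capacity where the small model is weakest, which is the core tradeoff in edge deployment.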

Conclusion

All in all, we have seen a wide range of deployment options for machine learning and deep learning models. Many online courses place a good amount of emphasis on machine learning models and how they work internally, and they do a good job of highlighting the nuances of these models. However, there should be an equally strong emphasis on deployment, as putting models into production in real time can be challenging due to the large number of considerations involved.

After going through this article, I hope you understand some of the common deployment options for machine learning. Thanks for reading.

Below are the ways where you could contact me or take a look at my work.

GitHub: suhasmaddali (Suhas Maddali ) (github.com)

YouTube: https://www.youtube.com/channel/UCymdyoyJBC_i7QVfbrIs-4Q

LinkedIn: Suhas Maddali, Northeastern University, Data Science | LinkedIn

Medium: Suhas Maddali — Medium


