Common Reasons why Machine Learning Projects Fail | by Suhas Maddali | Sep, 2022

By Jessie Hobb On Sep 20, 2022

Understanding the ways in which machine learning projects fail helps you take the right steps to avoid those failures before you run into them

Okay, you did a good job of making sure your machine learning model performs well on the training data. The next step is to deploy it and get feedback from users and from the business about whether it has increased the revenue of your data products. But a successful machine learning project involves many details that must be taken care of before the model is handed to users. Looking at a list of the reasons why machine learning projects fail raises our awareness of them and helps us devise the right steps to reduce the risk. In this article, we will explore the various ways in which machine learning projects fail so that we understand them better.

The amount of data available to companies has skyrocketed, and it is expected to keep growing. But having huge amounts of data and failing to leverage it is much the same as not having it at all. Companies are therefore looking for professionals with degrees in data science and machine learning who can apply state-of-the-art techniques and turn that data into predictive power.

As a result, companies put a tremendous amount of effort into building and using AI applications. AI is perhaps one of the most talked-about terms on the internet, and companies are gravitating towards the field. However, they must also realize that these projects can fail for a large number of reasons. Diagnosing where projects fail helps reduce those failures and builds trust among ML practitioners and researchers in the company.

Let us now explore the reasons behind the failure of various ML projects and also try to find ways to tackle them.

Reasons for Failure of ML projects

Machine learning projects can fail in numerous ways: the data may not be ready for ML, the model may overfit, or the system may not scale. We will explore each of these areas to gain a thorough understanding of them.

Not Being Able to Connect Machine Learning with Business — It is impressive that machine learning models can churn through vast amounts of data, extract insights from it, and make predictions on data they have not seen at a level similar to human performance or better. However, a project usually fails when the team cannot connect the dots between machine learning and the business, that is, whether the model can meet business objectives such as revenue or customer retention. To avoid this, hold a brainstorming session before the project starts on how machine learning and artificial intelligence can solve challenges that earlier, conventional methods could not. If there is little overlap between the business objectives and what machine learning offers, the project is much less likely to add value to the team and much more likely to fail.

Failure to Understand Resources Needed — Companies sometimes underestimate the total resources needed to run an AI project. It is sometimes assumed that a powerful graphics processing unit (GPU) and a capable CPU are sufficient for AI applications. However, more may be needed, such as load balancers and storage that can hold huge volumes of data before it is used for AI purposes. We are talking about terabytes of data rather than gigabytes. Reliable data processing pipelines also need to be set up. The best way to avoid this issue is to do market research and study case studies from other users before deciding how many resources to allocate to the project.
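
To make the terabytes-versus-gigabytes point concrete, here is a rough back-of-the-envelope sketch. The function name, the row and feature counts, and the replication factor are all hypothetical numbers chosen for illustration; they do not come from any real project.

```python
def dataset_size_tb(rows, features, bytes_per_value=4, replicas=1):
    """Raw size in terabytes (10**12 bytes) of a dense numeric table."""
    return rows * features * bytes_per_value * replicas / 10**12

# Hypothetical workload: one billion rows, 500 float32 features each.
single_copy = dataset_size_tb(1_000_000_000, 500)
# Assumption: the storage layer keeps three copies of every block.
replicated = dataset_size_tb(1_000_000_000, 500, replicas=3)

print(single_copy, replicated)  # 2.0 6.0
```

An estimate like this ignores compression, indexes, and intermediate pipeline outputs, so it is only a starting point for the resource discussion.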

ML Models Overfitting and Underfitting — Overfitting and underfitting are among the most prevalent issues in machine learning. Train a model that is too complex for the training data and you are likely to overfit: the model fits noise in the training set and performs worse on new data. On the other hand, training a model that is too simple for intricate data leads to underfitting. Success therefore usually means finding the right balance between the two, so that the model generalizes well to data it has not seen before (the test set). If you want a more thorough understanding of these topics, feel free to follow the link below, where I highlight the differences between these two terms in great detail.

Differences between Bias and Variance in Machine Learning | by Suhas Maddali | Sep, 2022 | Towards Data Science (medium.com)
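
The balance between overfitting and underfitting can be seen in a small experiment. The sketch below is an illustration under assumptions that are not in the article: it uses a hand-written k-nearest-neighbours classifier on synthetic one-dimensional data with 20% label noise, and the helper names (make_data, knn_predict, accuracy) are made up for this example. With k=1 the model memorizes every training point, with a very large k it nearly ignores the input, and a moderate k sits in between.

```python
import random

def make_data(n, rng, noise=0.2):
    """Points on [0, 1]; the true label is 1 when x > 0.5, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < noise:
            y = 1 - y  # label noise that a good model should not memorize
        data.append((x, y))
    return data

def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x (k should be odd)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return int(2 * votes > k)

def accuracy(train, data, k):
    hits = sum(knn_predict(train, x, k) == y for x, y in data)
    return hits / len(data)

rng = random.Random(0)
train = make_data(200, rng)
test = make_data(200, rng)

results = {}
for k in (1, 15, 199):  # very flexible, moderate, very rigid
    results[k] = (accuracy(train, train, k), accuracy(train, test, k))
    print(f"k={k:>3}  train={results[k][0]:.2f}  test={results[k][1]:.2f}")
```

With k=1 each training point is its own nearest neighbour, so the training accuracy is perfect by construction; a large gap between that number and the test accuracy is the usual sign of overfitting. With k=199 the vote is close to the majority label of the whole training set, which is what underfitting looks like here.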

Not Having the Right Talent — While this point may seem obvious, it is worth mentioning, because many people lack the skills that companies look for when filling data scientist or machine learning engineer roles. Online courses do a great job of reinforcing a large number of concepts and preparing students to tackle real-world problems. However, they do not cover the full picture of a data scientist's day-to-day work, such as extracting, loading, organizing, and transforming data before training ML models. As a result, companies receive a large number of applications yet still struggle to find the right candidates, because demand is high and trained professionals are in short supply.

Choosing the Wrong Error Metric — A lot of effort goes into building a machine learning pipeline and making sure the ML models are optimized for the chosen metric. The models may perform well on that metric, yet the metric itself might not be the best one for the problem at hand. Consider cancer diagnosis prediction: accuracy does not give a good picture of how well the model is actually performing, and choosing it can sometimes have devastating consequences for the business. Because the data set is highly imbalanced (most patients do not have cancer), accuracy gives an inflated picture that says little about the strength of the model. Alternative metrics such as precision and recall should be selected for these problems. It is therefore important to select the right metric for the problem at hand so that the model actually solves the business challenge.
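
The cancer example can be worked through with a few lines of code. The numbers below are hypothetical (990 healthy patients and 10 with cancer) and the function name is made up for this sketch; the formulas for accuracy, precision, and recall are the standard ones.

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0  # convention: 0.0 when nothing is predicted positive
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Hypothetical screening data: 990 healthy patients (0) followed by 10 with cancer (1).
y_true = [0] * 990 + [1] * 10

# A "model" that always predicts healthy.
always_negative = [0] * 1000
acc, prec, rec = classification_metrics(y_true, always_negative)
print(acc, prec, rec)  # 0.99 0.0 0.0

# A model that raises 30 false alarms but catches 8 of the 10 cancer cases.
useful_model = [1] * 30 + [0] * 960 + [1] * 8 + [0] * 2
acc2, prec2, rec2 = classification_metrics(y_true, useful_model)
```

The always-healthy model reaches 99% accuracy while finding none of the cancer cases. The second model has lower accuracy (0.968) but a recall of 0.8, which is the number that matters if missing a diagnosis is the costly mistake.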

Not Regularly Monitoring ML Models — After deploying models in production, it is important to check at regular intervals how they are performing on live data. For many reasons, data drift (the input distribution changes) and concept drift (the relationship between the inputs and the target changes) can set in and degrade the performance of ML models considerably. Regularly measuring and benchmarking model performance helps ensure that companies do not lose revenue to bad predictions at some stage of the machine learning project's life cycle.
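
One way to check a single feature for data drift is the Population Stability Index (PSI). The article does not name a specific technique, so treat this as one possible sketch: the function name, the equal-width binning, the synthetic data, and the 0.2 alert level (a commonly quoted rule of thumb, not a hard rule) are all assumptions made for illustration.

```python
import math
import random

def psi(reference, live, bins=10):
    """Population Stability Index between two samples of one numeric feature."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins
    if width == 0:
        width = 1.0  # degenerate case: every reference value is identical

    def shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clamp values outside the reference range
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    ref, cur = shares(reference), shares(live)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))

rng = random.Random(0)
training_feature = [rng.gauss(0.0, 1.0) for _ in range(5000)]
live_same = [rng.gauss(0.0, 1.0) for _ in range(5000)]     # same distribution as training
live_shifted = [rng.gauss(1.0, 1.0) for _ in range(5000)]  # mean has moved by one standard deviation

stable = psi(training_feature, live_same)
drifted = psi(training_feature, live_shifted)
ALERT_LEVEL = 0.2  # assumed threshold; tune it for your own data
print(f"stable={stable:.3f} drifted={drifted:.3f} alert={drifted > ALERT_LEVEL}")
```

A check like this only covers data drift on the inputs. Concept drift usually has to be caught by tracking the model's error on fresh labelled data once the labels arrive.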

Not Choosing the Right Model — There is a large set of machine learning models that can be trained and deployed. ML practitioners can become conditioned to use only a certain set of models, such as XGBoost, Random Forest, or Gradient Boosted Decision Trees (GBDT). Depending on the application, however, these might not be the right models to use, especially for tasks with limited training resources or tasks that demand low-latency predictions on a constrained budget. In those cases, simpler models such as linear regression or logistic regression (for classification) are easier to deploy and cheaper to serve.
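
To see why a simple model is easy to serve, note that a trained logistic regression needs only one dot product and one sigmoid per prediction. The sketch below uses made-up weights and a made-up function name purely for illustration; it is not taken from any real model.

```python
import math

def predict_proba(weights, bias, features):
    """Logistic regression inference: a dot product followed by a sigmoid."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical weights for a three-feature model.
weights, bias = [0.8, -1.2, 0.05], -0.3
p = predict_proba(weights, bias, [1.0, 0.5, 10.0])
print(round(p, 2))  # 0.6
```

The whole model is a short list of numbers, so it can run on modest hardware. Whether the simpler model is accurate enough for the task is a separate question that has to be checked on held-out data.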

Failure to Remove Outliers — Outliers can appear in both the training and the test set. Failing to treat them before the models are deployed complicates matters, and management raises a lot of questions once the models are in production. To avoid such costs, handling the outliers in both the training and test sets helps ML practitioners build models that are more robust and generalize well to new data. Time should therefore be taken to remove the outliers that would otherwise hurt performance once the models are deployed.
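
A simple way to find outliers in one numeric column is the interquartile range (IQR) rule. The article does not prescribe a method, so this is only one option: the sample values and the function name are made up, the factor 1.5 is the conventional default, and learning the bounds from the training data and then reusing them on new data is a design choice assumed here.

```python
import statistics

def iqr_bounds(values, k=1.5):
    """Lower and upper fences: k * IQR below the first and above the third quartile."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Hypothetical training column with one obviously extreme value.
train = [12, 14, 15, 15, 16, 17, 18, 19, 21, 250]
low, high = iqr_bounds(train)
clean_train = [v for v in train if low <= v <= high]

# Reuse the fences learned on the training data to flag new values.
new_values = [13, 20, 400]
flagged = [v for v in new_values if not (low <= v <= high)]
print(clean_train, flagged)
```

In production you usually cannot drop an incoming request, so flagged values are often clipped to the fences or routed for review rather than removed.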

Conclusion

After going through this article, you should understand the main reasons why machine learning projects fail. There can be many other reasons, but these are the ones most commonly found across organizations. Taking the right steps to avoid them up front serves the business better than spending a significant amount of money diagnosing them after the models are deployed. Thanks for taking the time to read this article.

Below are the ways you can contact me or take a look at my work.

GitHub: suhasmaddali (Suhas Maddali) (github.com)

YouTube: https://www.youtube.com/channel/UCymdyoyJBC_i7QVfbrIs-4Q

LinkedIn: Suhas Maddali, Northeastern University, Data Science | LinkedIn

Medium: Suhas Maddali — Medium