
Extreme Churn Prediction: Forecasting Without Features | by Marco Cerliani | Nov, 2022



Studying Events Frequency to Identify Unusual Behaviors

Photo by Isravel Raj on Unsplash

Nowadays we live in a data-centric world. With every action, we generate a significant amount of data that can be collected and used to produce valuable business insights. Big tech companies know these dynamics very well. By monitoring our daily activities, it’s possible to identify our habits and preferences, customize offers, and increase the probability of engagement.

Accessing and leveraging the huge variety of data produced by each individual user can improve business results, better satisfy customers, and help secure a leading market position. As we can imagine, the information produced and stored is enormous and varied. In this sense, business intelligence and machine learning play a crucial role in extracting valuable meaning from the data at our disposal and making the most effective decisions.

With the aim of maximizing business value, we can adopt machine learning to detect when customers are close to churning, identify the best retention strategies, recommend the best services or products according to customer preferences, and so on. There are many possible applications of artificial intelligence to customer engagement. Having a full and rich set of data at our disposal can be a great asset. As data scientists, we can build powerful and complex pipelines by merging different data sources and engineering meaningful features to feed our predictive algorithms.

This sounds like magic, and a dream for every data scientist. However, even though collecting and storing data should be a must for every data-centric company, it is not always easy or free. Working with few observations, low-quality data, or no meaningful predictors happens often. In these cases, we must reinvent ourselves to deliver the best solution with what we have at our disposal.

In this post, we develop a solution to detect when customers are about to leave. We don’t apply standard or complex machine learning approaches. Instead, we simulate operating in an extreme scenario where we have stored only the purchase dates for each customer, together with the corresponding amounts. We know nothing else about our customers and can’t merge any external data source to enrich our dataset. With simple but effective statistical techniques, we aim to achieve reasonable results that can also serve as a benchmark for more complex solutions.

We imagine being a company that sells products or services. We don’t offer subscriptions; our customers make purchases on their own, whenever they need to. For each customer’s order, we can retrieve the date and the corresponding monetary amount. This is the only information we can collect.

For the purposes of this post, we simulate some orders for a fixed number of customers in a given time range. We register a churn when a customer stops ordering from us and no longer buys anything.
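As a rough idea of what such a simulation might look like (the original code is not shown here, so the function name, distributions, and parameter values below are illustrative assumptions), each customer can be given a personal ordering pace and a churn day after which no more orders appear:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

def simulate_orders(n_customers=100, start="2021-01-01", end="2022-11-01"):
    """Simulate purchase histories: each customer orders at a personal
    average pace; every customer stops ordering at some point (churn)."""
    start, end = pd.Timestamp(start), pd.Timestamp(end)
    horizon = (end - start).days
    rows = []
    for cust in range(n_customers):
        pace = rng.integers(5, 40)             # typical days between orders
        churn_day = rng.integers(60, horizon)  # no more orders after this day
        day = 0
        while True:
            day += max(1, int(rng.normal(pace, pace * 0.3)))
            if day >= churn_day:
                break
            rows.append({"customer": cust,
                         "date": start + pd.Timedelta(days=day),
                         "amount": round(float(rng.gamma(2.0, 30.0)), 2)})
    return pd.DataFrame(rows)

orders = simulate_orders()
```

Any generator that produces irregular per-customer order dates with an eventual stopping point would serve the same purpose.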

Order frequency (image by the author)

From the information at our disposal, we can compute how many days pass between two consecutive purchases (i.e., the order frequency). Calculating the difference in days between two adjacent orders is fundamental for developing our solution. We aim to identify churns mainly by looking at the distribution of order frequencies for each customer. If the number of days since the last purchase is far from the customer’s usual order frequency, we can be confident we are facing a churn. In other words, for each customer we check whether the difference in days between today and the date of the last order is greater than in the past.
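With the order history stored as (customer, date) pairs, the order frequency is a simple grouped difference. A minimal pandas sketch on a hypothetical toy dataset:

```python
import pandas as pd

# Toy order history: customer 0 orders after 7 and 12 days,
# customer 1 after 10 days.
sample_orders = pd.DataFrame({
    "customer": [0, 0, 0, 1, 1],
    "date": pd.to_datetime(["2022-01-01", "2022-01-08", "2022-01-20",
                            "2022-02-01", "2022-02-11"]),
})

# Days between consecutive orders, computed per customer.
freq = (sample_orders
        .sort_values(["customer", "date"])
        .assign(days_between=lambda d: d.groupby("customer")["date"]
                                        .diff().dt.days))
print(freq["days_between"].dropna().tolist())  # [7.0, 12.0, 10.0]
```

The first order of each customer has no predecessor, so its gap is missing by construction.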

Number of orders vs mean/std of order frequency (image by the author)

Practically speaking, we must develop an algorithm that, by studying the order frequency distribution, can flag the risk of churning for every customer on any date. We propose two simple yet effective approaches:

  • sigma modeling
  • cumulative distribution function (CDF) modeling

With sigma modeling, we calculate the mean and the standard deviation of the order frequency for each customer over a historical reference period. During inference, we check whether the difference between today and the last order date is greater than the mean plus n times sigma. If it is, the customer is no longer buying at their usual frequency, which may indicate a possible churn.
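The sigma rule fits in a few lines. In this sketch the function name and the `n_sigma` level are assumptions, not values from the article:

```python
import numpy as np

def sigma_churn_alert(gaps, days_since_last, n_sigma=3.0):
    """Flag churn risk when the current silence exceeds
    mean + n_sigma * std of the customer's historical order gaps."""
    gaps = np.asarray(gaps, dtype=float)
    threshold = gaps.mean() + n_sigma * gaps.std(ddof=1)
    return days_since_last > threshold

# A customer who usually orders every ~10 days, silent for 40 days:
print(sigma_churn_alert([9, 10, 11, 10, 9, 12], days_since_last=40))  # True
```

Lowering `n_sigma` makes the rule more sensitive (more alerts, more false positives); raising it does the opposite.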

With the CDF approach, we model the order frequency as a random variable (precisely, a truncated normal), building the corresponding monotonic inverse CDF over a historical reference period for each customer. At inference time, we check whether the difference between today and the last order date exceeds the quantile corresponding to a predefined confidence level. As before, if it does, the customer is no longer buying from us at their usual frequency. To improve the effectiveness of CDF modeling, we can also include the order amount in the formulation, if we consider it meaningful for churn detection. This is useful if we want to give more weight to purchases with amounts close to the customer’s average historical expense.
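A sketch of the CDF rule using `scipy.stats.truncnorm`, truncating the fitted normal at zero since gaps in days cannot be negative. This omits the optional amount weighting mentioned above, and the function name and default confidence are assumptions:

```python
import numpy as np
from scipy import stats

def cdf_churn_alert(gaps, days_since_last, confidence=0.99):
    """Fit a normal truncated at 0 to the historical order gaps and
    alert when the current silence exceeds the inverse-CDF quantile
    at the chosen confidence level."""
    gaps = np.asarray(gaps, dtype=float)
    mu, sigma = gaps.mean(), gaps.std(ddof=1)
    a = (0 - mu) / sigma  # lower truncation bound in standardized units
    dist = stats.truncnorm(a, np.inf, loc=mu, scale=sigma)
    threshold = dist.ppf(confidence)  # inverse CDF
    return days_since_last > threshold, threshold

alert, thr = cdf_churn_alert([9, 10, 11, 10, 9, 12], days_since_last=40)
```

Compared with the sigma rule, the confidence level has a direct probabilistic reading: with `confidence=0.99`, a gap longer than the threshold would occur by chance only about 1% of the time under the fitted distribution.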

To test how the proposed techniques work, we apply them to all the simulated data at our disposal following a sliding window approach. We evaluate the churn propensity of each customer on the first day of each month. As ground truth, we know when each customer made their last order with us (i.e., we know when they churned). In this way, it’s possible to calculate standard metrics like accuracy, precision, and recall.

For both approaches, we evaluate each customer only after a fixed number of purchases (6 orders in our case), to build reliable order frequency distributions and make the procedures more trustworthy. Until then, we prefer not to make any prediction for these customers and wait to evaluate them in the future.
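Putting the pieces together, the monthly evaluation loop might look like the following sketch. It uses the sigma rule for illustration and defines the ground truth as “the customer never orders again”; the exact protocol in the article may differ:

```python
import numpy as np
import pandas as pd

def evaluate_monthly(orders, eval_dates, n_sigma=3.0, min_orders=6):
    """On each evaluation date, flag customers whose silence exceeds
    mean + n_sigma * std of their past order gaps, and compare with
    the ground truth (churned = no orders after the evaluation date)."""
    y_true, y_pred = [], []
    last_order = orders.groupby("customer")["date"].max()
    for today in eval_dates:
        past = orders[orders["date"] < today]
        for cust, grp in past.groupby("customer"):
            gaps = grp["date"].sort_values().diff().dt.days.dropna()
            if len(gaps) + 1 < min_orders:
                continue  # too few orders for a reliable distribution
            silence = (today - grp["date"].max()).days
            threshold = gaps.mean() + n_sigma * gaps.std(ddof=1)
            y_pred.append(silence > threshold)
            y_true.append(last_order[cust] < today)  # never orders again
    return np.array(y_true), np.array(y_pred)
```

From the two arrays, accuracy, precision, and recall follow directly (e.g. with `sklearn.metrics`). Note that near the end of the simulated time range every customer eventually looks churned, which is an artifact of this simple ground-truth definition.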

Recall results. Methodologies at comparison (image by the author)
Precision results. Methodologies at comparison (image by the author)
Accuracy results. Methodologies at comparison (image by the author)

The proposed methodologies seem to do their job well in identifying possible churns. The sigma approach shows the highest precision, while the CDF approach is more accurate and has a higher recall. The approaches can be tuned and adjusted by tweaking the sigma level or the CDF confidence.

In this post, we introduced two approaches that leverage the study of event frequency to identify possible unusual behaviors. We applied them to a churn application, but the same ideas can be used in any task where the time between events is crucial for decision-making. Thanks to their simplicity and adaptability, the proposed methodologies can also serve as a benchmark when more complex solutions can be adopted.


