
A Complete Guide to Cohort Analysis using BigQuery and Looker Studio | by Damien Azzopardi | Mar, 2023



Demystifying cohort analysis, step by step

Photo by Hunter Harritt on Unsplash

Let’s admit it: cohort analysis may seem intimidating at first glance. Still, it is a powerful instrument that can provide valuable insights and is sometimes the only correct way to visualize data. Mastering it will give you a clear advantage in your data analytics journey.

But first, what do we mean by cohort analysis?

A cohort analysis is a way to study and compare different groups of people over time.

These groups are defined by a common characteristic, such as the date they joined a service or made their first purchase.

Cohort analysis is frequently used to analyze the rate at which customers end their relationship with a product or a service, a concept commonly referred to as “churn.” For subscription-based businesses, churn represents the percentage of customers who cancel their subscriptions within a given period.

Churn is an important business metric that can significantly impact revenue and growth. While a high churn rate can be a sign of customer dissatisfaction, low churn, on the contrary, can be proof of customer loyalty and satisfaction.

Now let’s demonstrate the power of cohort analysis by running a churn analysis for a subscription-based app that wants to analyze its customers’ behavior across 2023.

Step 1 — Understanding your dataset

For this example, we’ll work with the subscriptions table stored in BigQuery. It contains a list of subscriptions made on our product, including the signup date and, most importantly, the information about their state, active or canceled. Here’s what the table looks like:

Query results (image by author)

Suppose you’d like to look at how churn evolves month by month. You might be tempted to use the following query, which takes the number of lost customers per month and divides it by the total number of customers over the same period:

Query results (image by author)
Query results plotted (image by author)
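
As a rough illustration of the calculation described above (not the author’s BigQuery query), here is a minimal Python sketch. The sample rows are made up, and the denominator is an assumption: it counts every customer who had signed up by the end of the month, since the text does not spell out exactly how “total number of customers” is defined.

```python
from collections import Counter
from datetime import date

# Hypothetical sample rows: (signup_date, cancellation_date or None if still active).
subscriptions = [
    (date(2023, 1, 5), date(2023, 2, 10)),
    (date(2023, 1, 20), None),
    (date(2023, 2, 3), date(2023, 2, 25)),
    (date(2023, 2, 14), None),
    (date(2023, 3, 1), None),
    (date(2023, 3, 9), date(2023, 3, 30)),
]

def naive_monthly_churn(rows, year=2023):
    """Cancellations in a month divided by all customers signed up by that month.

    Assumption: the denominator is the cumulative number of signups so far.
    """
    signups = Counter(s.month for s, _ in rows if s.year == year)
    cancels = Counter(c.month for _, c in rows if c is not None and c.year == year)
    result = {}
    total_customers = 0
    for month in range(1, 13):
        total_customers += signups[month]
        if total_customers:  # skip months before the first signup
            result[month] = cancels[month] / total_customers
    return result

churn = naive_monthly_churn(subscriptions)
```

Under this assumption, the denominator keeps growing as new customers sign up, so the ratio can fall even if every group of customers cancels at the same pace. That is one plausible reason why a single monthly figure can paint too rosy a picture.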

Good news: the churn rate has decreased over the year. But does this reflect reality? I’m afraid it doesn’t. Looking at churn this way only shows part of the picture. This is where cohorts come into play.

Step 2 — Data transformation

Remember, cohorts are groups defined by a common characteristic, in this case their signup date. So let’s break the data down into different cohorts: users who signed up in January, users who signed up in February, and so on. For each cohort, we want to know how many customers signed up and how many canceled within each time frame. In other words, how many canceled after one month, two months, etc.

To do so, let’s use the following query:

Query results where signup_date = January 2023 (image by author)

This query will return a total of 78 rows: 12 for January, 11 for February, 10 for March, and so on. Here’s a visual cue to help you better understand the results:

Query results illustrated (image by author)

Now let’s break down each of the fields in the query results:

  • signup_date: the date customers signed up.
  • cancellation_date: the date customers canceled.
  • cohort_month: the difference between the signup_date and the cancellation_date, in months.
  • max_subscriptions: the total number of customers who signed up in that month.
  • sum_cancellations: the number of customers who canceled their subscriptions each month.
  • r_sum_cancellations: the running sum of members who canceled their subscriptions over time. We will need this field later on when building our visualization.

For example, looking at row number 5, we see that, out of the 67 customers who signed up in January, 2 canceled their subscription in May, four months after joining the service, for a total of 10 customers canceling between January and May.
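
To make the shape of this result set more concrete, here is a minimal Python sketch that builds rows with the same field names from raw subscription records. It is an illustration under assumptions, not the author’s query: the helper names (month_diff, add_months, cohort_rows), the sample data, and the choice to stop at December 2023 are all hypothetical, and dates are truncated to the first day of the month.

```python
from collections import Counter
from datetime import date
from itertools import accumulate

def month_diff(start, end):
    """Whole calendar months between two dates, ignoring the day of month."""
    return (end.year - start.year) * 12 + (end.month - start.month)

def add_months(start, months):
    """First day of the month that lies `months` months after `start`."""
    index = start.month - 1 + months
    return date(start.year + index // 12, index % 12 + 1, 1)

def cohort_rows(subscriptions, last_month=date(2023, 12, 1)):
    """Build one row per (signup month, months since signup) up to last_month.

    subscriptions: iterable of (signup_date, cancellation_date or None).
    """
    cohorts = {}
    for signup, cancel in subscriptions:
        cohorts.setdefault(date(signup.year, signup.month, 1), []).append(cancel)

    rows = []
    for cohort_start in sorted(cohorts):
        cancel_dates = cohorts[cohort_start]
        per_month = Counter(
            month_diff(cohort_start, c) for c in cancel_dates if c is not None
        )
        horizon = month_diff(cohort_start, last_month)
        counts = [per_month[m] for m in range(horizon + 1)]
        for m, running in enumerate(accumulate(counts)):
            rows.append({
                "signup_date": cohort_start,
                "cancellation_date": add_months(cohort_start, m),
                "cohort_month": m,
                "max_subscriptions": len(cancel_dates),   # cohort size
                "sum_cancellations": counts[m],           # canceled in this month
                "r_sum_cancellations": running,           # canceled so far
            })
    return rows

# Hypothetical sample: three January signups and one February signup.
sample = [
    (date(2023, 1, 5), date(2023, 5, 10)),
    (date(2023, 1, 20), date(2023, 2, 1)),
    (date(2023, 1, 25), None),
    (date(2023, 2, 3), None),
]
table = cohort_rows(sample)
```

With one cohort per month from January to December 2023, this layout produces 12 + 11 + ... + 1 = 78 rows, which matches the row count mentioned above. Because cohort_month starts at 0, the fifth row of the January cohort is the one with cohort_month = 4.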

Step 3 — Putting it all together in Looker Studio

Now that our dataset is ready, let’s use it to visualize the cohorts in Looker Studio.

First, let’s create a new calculated field called churn_rate using the following formula:

SUM(r_sum_cancellations)/MAX(max_subscriptions)
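
As a quick sanity check of this formula, the short Python sketch below applies it to the January example from Step 2 (67 signups, 10 cancellations in total by May). It assumes that each pivot cell contains a single row of the dataset, so the SUM is simply that row’s running total; the churn_rate function is only a stand-in for the Looker Studio calculated field.

```python
def churn_rate(cell_rows):
    """Mirror SUM(r_sum_cancellations) / MAX(max_subscriptions) for one pivot cell."""
    total_running = sum(r["r_sum_cancellations"] for r in cell_rows)
    cohort_size = max(r["max_subscriptions"] for r in cell_rows)
    return total_running / cohort_size

# January cohort, cohort_month = 4 (May): values taken from the example in Step 2.
january_month_4 = [{"r_sum_cancellations": 10, "max_subscriptions": 67}]
rate = churn_rate(january_month_4)
print(f"{rate:.1%}")  # prints 14.9%
```

In other words, about 15% of the January cohort had churned four months after signing up.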

Then, let’s create a new Pivot Table chart with the following criteria:

  • Row dimension: signup_date
  • Column dimension: cohort_month
  • Metric: churn_rate as a percentage
  • Sorting Row #1: signup_date Ascending
  • Sorting Column #1: cohort_month Ascending

To add more context to your dashboard, let’s add a table with the total number of subscriptions to the left of the cohort chart. To do so, create a new Table chart with the following criteria:

  • Dimension: signup_date
  • Metric: max_subscriptions with a Max aggregation
  • Sort: signup_date Ascending

Add a bit of formatting, and voilà!

Medium — Cohort Analysis Looker Studio (image by author)

By looking at churn this way, we can draw quick conclusions regarding user engagement. For example, the April 2023 cohort outperformed all other cohorts: it has the lowest churn rate, which suggests that customers who joined in April were more committed to and engaged with the product. By analyzing the reasons behind this cohort’s success, we can use those insights to improve customer retention and loyalty in the future.

Conclusion

Cohort analysis is essential for any subscription-based business that wants to monitor customer behavior and churn. It provides valuable insights for making informed marketing and retention decisions, which can lead to higher revenue and customer satisfaction. By following the steps outlined in this article, you’re ready to implement cohort analysis and start reaping its benefits. Happy analyzing!


