Seasoning your AB testing experiments | by Mark Eltsefon | Mar, 2023


Photo by Manuel Asturias on Unsplash

AB testing is one of the best-known methods for measuring the effect of a new feature. The main idea is to split your traffic (or just a part of it) randomly into two or more groups. However, it’s important to ensure that the split is truly random and unbiased in order to be confident about the results. This is where salts come into play. The purpose of salts is to eliminate sources of bias or predictability that could affect the results of an AB test. Without salts, there is a risk that some users are more likely to be assigned to a particular variation of a webpage or application based on their characteristics, behavior, or other factors, leading to biased results.

Hashing and splitting

Let’s explore the main question of how to divide our users into different experiments and groups. To make it clearer, it’s better to use an example. We can divide our entire traffic into 10 buckets.

10 original buckets. Image by Author

Before actually assigning users to the buckets, let’s refresh our memory of what a hash function is.

Hash functions convert input data of any size into a fixed-size value that represents the original data. The same input always produces the same output, and a good hash function spreads its outputs evenly over the possible values. How can this be beneficial? It can transform a unique attribute of our user, typically the user ID, into a number that we can use to assign them to one of the buckets.

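import hashlib

user_id = "a4234jf3345"

# Get a stable hash of the user ID.
# By default, the built-in hash() of a string changes between Python runs,
# so a hashlib digest is used to keep the assignment reproducible.
hash_value = hashlib.sha256(user_id.encode('utf-8')).hexdigest()

# Convert the hex digest to a non-negative number
number = int(hash_value, 16)

# Print the number
print(number)

# Print the bucket that we assigned our user to
print(number % 10)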

A user ends up in bucket 1 if the remainder of the hash value divided by 10 equals 1, in bucket 2 if it equals 2, and so forth.
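The bucketing rule above can be wrapped in a small helper. This is a minimal sketch: the function name `assign_bucket`, the synthetic `user_<i>` IDs, and the choice of SHA-256 are assumptions made for illustration. A `hashlib` digest is used because it returns the same value on every run, so a user stays in the same bucket across sessions.

```python
import hashlib
from collections import Counter


def assign_bucket(user_id: str, n_buckets: int = 10) -> int:
    """Map a user ID to a bucket in 0..n_buckets-1 (hypothetical helper)."""
    # SHA-256 is an assumed choice; any stable, well-mixed hash would do.
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_buckets


# The same user always lands in the same bucket.
bucket = assign_bucket("a4234jf3345")
print(bucket)

# Synthetic user IDs, used only to check that the buckets fill up evenly.
counts = Counter(assign_bucket(f"user_{i}") for i in range(10_000))
print(sorted(counts.items()))
```

With 10,000 synthetic users, each bucket should hold roughly 1,000 of them.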

Why do we need salts?

To begin, let’s discuss the concept of a domain. A domain is a specific part of your product that you define yourself, such as the basket, login page, or checkout page. The most important feature of domains is that you can confidently overlap your experiments between them. This means that an experiment (or at least the vast majority of experiments) conducted in one domain should not cause any changes in another domain.

Using a domain salt helps us overcome two challenges:

  1. Without a domain salt, a user ends up in the same bucket for all experiments, so the same users are always exposed to the same combinations of changes, leading to biased experiment results.
  2. We are limited by our traffic, but by adding a domain salt, we can multiply the number of experiments we can conduct by the number of domains. For each domain, we create its own set of buckets using a unique salt.

In Python, the code looks something like this:

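import hashlib

user_id = "a3d45f6g6j7"

# Get a stable hash of the user with the domain salt.
# By default, the built-in hash() of a string changes between Python runs,
# so a hashlib digest is used to keep the assignment reproducible.
hash_value = hashlib.sha256((user_id + 'Basket').encode('utf-8')).hexdigest()

# Convert the hex digest to a non-negative number
number = int(hash_value, 16)

# Print the number
print(number)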
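To see why different domain salts let experiments overlap safely, one can check how users from one bucket of the Basket domain spread over the buckets of another domain. The sketch below is illustrative only: `salted_bucket`, the `Checkout` salt, and the synthetic user IDs are assumptions, and `hashlib` is used instead of the built-in `hash()` so that the result is reproducible across runs.

```python
import hashlib


def salted_bucket(user_id: str, salt: str, n_buckets: int = 10) -> int:
    """Bucket for a user under a given salt (hypothetical helper)."""
    digest = hashlib.sha256((user_id + salt).encode("utf-8")).hexdigest()
    return int(digest, 16) % n_buckets


# Synthetic user IDs; "Basket" and "Checkout" are example domain salts.
users = [f"user_{i}" for i in range(10_000)]

# Users that sit in bucket 1 of the Basket domain.
basket_bucket_1 = [u for u in users if salted_bucket(u, "Basket") == 1]

# How many of them also sit in bucket 1 of the Checkout domain?
also_checkout_1 = [u for u in basket_bucket_1 if salted_bucket(u, "Checkout") == 1]

overlap = len(also_checkout_1) / len(basket_bucket_1)
print(f"Share of Basket bucket 1 also in Checkout bucket 1: {overlap:.2f}")
```

With independent salts the share should be close to 1 in 10, which is what random assignment would give. Without a salt, the two buckets would contain exactly the same users and the share would be 1.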

Users’ allocation

To allocate users to experiments, specific buckets are assigned to each experiment. In this example, experiment #1 has been allocated buckets 1 and 2, with A representing the control group and B representing the treatment group. We assume that all experiments run within one domain.
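One simple way to represent such an allocation is a lookup table from bucket number to experiment and group. This is only a sketch: the name `ALLOCATION`, the `lookup` helper, and the mapping of bucket 1 to group A and bucket 2 to group B are assumptions for illustration.

```python
from typing import Optional, Tuple

# Bucket -> (experiment, group). Only experiment #1 is described in the text;
# the split "bucket 1 = A, bucket 2 = B" is an assumption for illustration.
ALLOCATION = {
    1: ("experiment_1", "A"),  # control
    2: ("experiment_1", "B"),  # treatment
}


def lookup(bucket: int) -> Optional[Tuple[str, str]]:
    """Return (experiment, group) for a bucket, or None if the bucket is free."""
    return ALLOCATION.get(bucket)


print(lookup(1))  # ('experiment_1', 'A')
print(lookup(2))  # ('experiment_1', 'B')
print(lookup(7))  # None
```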

Image by Author

During frequent experimentation, a majority of the buckets are usually allocated to ongoing experiments.

Red: experiment #1, green: experiment #2, yellow: experiment #3. Image by Author

However, what happens when experiment #1 (red) ends, freeing up some buckets that can be allocated to a new experiment?

Suppose 20% of the buckets are now free (because the first experiment has concluded) and we want to launch experiment #4.

Image by Author

Can we simply use hashing and the same domain salt as before?

The answer is no, and the reason is carryover effects.

Carryover effects occur when exposing a user to a treatment affects their behavior at a later time. Essentially, people tend to remember their past experiences, which can bias the results of future experiments. With the same domain salt, the freed buckets contain exactly the users who took part in experiment #1, and each of them would stay in the bucket they had before. To address this issue, we introduce a new salt called the shuffle salt, which is unique for each experiment.

Now, our code for bucket allocation becomes the following.

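import hashlib

user_id = "a3d45f6g6j7"

# Get a stable hash of the user with the domain and shuffle salts.
# By default, the built-in hash() of a string changes between Python runs,
# so a hashlib digest is used to keep the assignment reproducible.
key = user_id + 'Basket' + 'Experiment_with_basket'
hash_value = hashlib.sha256(key.encode('utf-8')).hexdigest()

# Convert the hex digest to a non-negative number
number = int(hash_value, 16)

# Print the number of the bucket
print(number % 10)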

How does the shuffle salt work?

We break down our first two buckets into 10 sub-buckets of their own and assign users to these sub-buckets, which gives us finer granularity for our experiments.
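The two-step assignment can be sketched as follows. Everything here is an assumption made for illustration rather than a reference implementation: the helper names, the `Experiment_4` shuffle salt, the even split of sub-buckets 0 to 4 (control) and 5 to 9 (treatment), and the synthetic user IDs. `hashlib` replaces the built-in `hash()` so that results are reproducible across runs.

```python
import hashlib
from typing import Optional, Set


def stable_bucket(key: str, n_buckets: int = 10) -> int:
    """Hash a salted key into a bucket in 0..n_buckets-1 (hypothetical helper)."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_buckets


def assign_group(user_id: str, domain_salt: str, shuffle_salt: str,
                 experiment_buckets: Set[int]) -> Optional[str]:
    """Return 'control', 'treatment', or None if the user is not in the experiment."""
    # Step 1: the domain salt decides whether the user sits in the
    # buckets reserved for this experiment.
    if stable_bucket(user_id + domain_salt) not in experiment_buckets:
        return None
    # Step 2: the shuffle salt re-buckets the user into one of 10 sub-buckets.
    sub_bucket = stable_bucket(user_id + domain_salt + shuffle_salt)
    # Assumed split: sub-buckets 0-4 are control, 5-9 are treatment.
    return "control" if sub_bucket < 5 else "treatment"


users = [f"user_{i}" for i in range(10_000)]

# Users from old bucket 2, assumed here to be the treatment group of experiment #1.
old_treatment = [u for u in users if stable_bucket(u + "Basket") == 2]
groups = [assign_group(u, "Basket", "Experiment_4", {1, 2}) for u in old_treatment]

share_treatment = groups.count("treatment") / len(groups)
print(f"Old treatment users now in treatment: {share_treatment:.2f}")
```

Users who shared a bucket in the finished experiment should now be split roughly evenly between control and treatment, so any carryover effect is spread over both groups instead of being concentrated in one.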

Breaking down the first two buckets. Image by Author

The image distinguishes the control and treatment groups of the experiment using violet and blue, respectively.

Here is how it fits into the bigger picture.

The launch of experiment #4. Image by Author

In situations where we require less than 20% of our traffic, we will lose a certain percentage of users. For instance, if we need 15% of the entire traffic, we have to acquire 2 buckets since we cannot get 1.5 buckets. Consequently, we end up losing 5% of our traffic, which is reserved for the experiment but not needed by it.

Unfortunately, there is no way to completely eliminate this loss.
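The rounding described above is a ceiling division. The sketch below reproduces the 15% example; the function name `buckets_needed` and the whole-number percentage input are assumptions for illustration.

```python
from typing import Tuple


def buckets_needed(required_pct: int, n_buckets: int = 10) -> Tuple[int, float, float]:
    """Return (buckets to reserve, traffic reserved in %, traffic lost in %)."""
    # Ceiling division: a bucket cannot be split, so we round up.
    needed = -(-required_pct * n_buckets // 100)
    reserved_pct = needed * 100 / n_buckets
    return needed, reserved_pct, reserved_pct - required_pct


print(buckets_needed(15))  # (2, 20.0, 5.0)
print(buckets_needed(20))  # (2, 20.0, 0.0)
```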

In practice, most companies use these two types of salts. However, some companies may also use additional salts, such as exclusive salts (to test major changes) or more granular-domain salts (to test changes in specific countries).

Building an AB testing system from scratch is not the simplest task, given the numerous nuances that people tend to overlook. Nevertheless, don’t be afraid to attempt it, and don’t forget to use salts.

May the Bias be without you!

