A/B Test:

Testing a New Feature of a Shopping Webpage

10 min read · Sep 21, 2022

This article was written by Alparslan Mesri and Hale Kizilduman.

This article works through a sample A/B testing problem. The data is openly accessible via the following link. The code in this study was adapted from similarly themed notebooks on Kaggle. You can access the .ipynb files via the link.

The story behind the data is given below. Descriptions of the attributes used are provided in the glossary at the end of the article.

A company recently introduced a new bidding type, “average bidding”, as an alternative to its existing bidding type, called “maximum bidding”. One of our clients, ….com, has decided to test this new feature and wants to conduct an A/B test to understand if average bidding brings more conversions than maximum bidding. The A/B test has run for 1 month and ….com now expects you to analyze and present the results of this A/B test [1].

Conversion Rate was therefore chosen as the A/B test criterion. Exploratory data analysis (EDA) was carried out first, followed by the A/B tests.

First, required libraries are imported.

The test group data is imported and split into columns using the sep parameter.

The column names were changed so that they no longer contain hash signs (#).
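As a rough illustration of these two steps, the sketch below reads a small in-memory sample instead of the real file. The semicolon separator, the "# of" prefix in the raw column names and the sample rows are all assumptions made for the example, not values taken from the dataset.

```python
import io
import pandas as pd

# Hypothetical two-row sample standing in for the real test-group CSV file.
raw_csv = io.StringIO(
    "Campaign Name;Date;Spend [USD];# of Impressions;# of Website Clicks\n"
    "Test Campaign;01/08/19;3000;40000;3000\n"
    "Test Campaign;02/08/19;2500;100000;4500\n"
)

# sep=";" is an assumed separator; pandas splits on commas by default.
df_test = pd.read_csv(raw_csv, sep=";")

# Replace the assumed "# of" prefix with a readable word in every column name.
df_test.columns = [col.replace("# of", "Number of") for col in df_test.columns]

print(df_test.columns.tolist())
```

With the real file, the path to test_group.csv would be passed to read_csv in place of the in-memory buffer.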

The data types of the columns were checked with the info command.

The data type of the Date column is converted from object to datetime.
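A minimal sketch of this conversion is shown here. The sample dates are made up, and the dd/mm/yy format string is an assumption based on the glossary entry for Date.

```python
import pandas as pd

# Hypothetical sample; the real dataframe has many more columns.
df_test = pd.DataFrame({"Date": ["01/08/19", "02/08/19", "15/08/19"]})

# Parse the strings as dates, assuming the dd/mm/yy layout from the glossary.
df_test["Date"] = pd.to_datetime(df_test["Date"], format="%d/%m/%y")

print(df_test["Date"].dtype)
```

Passing an explicit format avoids day and month being swapped silently for ambiguous values such as 01/08/19.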

A summary of the df_test dataframe is examined with the describe function.

The same steps applied to the test group were then applied to the control group, which turned out to contain NaN values. These missing values were filled in using the K-NN algorithm.
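One possible way to do K-NN imputation is scikit-learn's KNNImputer, sketched below on made-up numbers. Whether the original notebook used this class, and with how many neighbors, is not stated in the article, so both choices are assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer

# Hypothetical control-group sample with one missing purchase count.
df_control = pd.DataFrame({
    "Spend [USD]": [2000.0, 2100.0, 2200.0, 2300.0],
    "Number of Purchase": [500.0, np.nan, 540.0, 560.0],
})

# Each missing value is replaced by the mean of that column over the
# 2 nearest rows, where distance is measured on the non-missing columns.
imputer = KNNImputer(n_neighbors=2)
df_control = pd.DataFrame(
    imputer.fit_transform(df_control),
    columns=df_control.columns,
    index=df_control.index,
)

print(df_control)
```

In this sample the two rows closest in spend to the incomplete row have 500 and 540 purchases, so the gap is filled with their mean.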

The test and control dataframes were combined into a single dataframe, df. The index was reset for the new dataframe to avoid confusion. New attributes for the analysis were created through feature engineering: USD Spend per Purchase, CTR(%), Conversion Rate(%), and Add to Cart Rate(%).
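The combination and feature engineering steps might look roughly like the sketch below. The sample values are invented, and the four formulas are assumptions inferred from the glossary definitions rather than copied from the original notebook.

```python
import pandas as pd

# Hypothetical one-row-per-day samples for each group.
df_control = pd.DataFrame({
    "Campaign Name": ["Control Campaign"] * 2,
    "Spend [USD]": [2000.0, 2400.0],
    "Number of Impressions": [100000, 120000],
    "Number of Website Clicks": [5000, 6000],
    "Number of Add to Cart": [1000, 1500],
    "Number of Purchase": [500, 600],
})
df_test = pd.DataFrame({
    "Campaign Name": ["Test Campaign"] * 2,
    "Spend [USD]": [2500.0, 3000.0],
    "Number of Impressions": [80000, 90000],
    "Number of Website Clicks": [6000, 6300],
    "Number of Add to Cart": [800, 900],
    "Number of Purchase": [500, 540],
})

# Stack the two groups and rebuild a clean 0..n-1 index.
df = pd.concat([df_control, df_test]).reset_index(drop=True)

# Assumed definitions, inferred from the glossary.
df["USD Spend per Purchase"] = df["Spend [USD]"] / df["Number of Purchase"]
df["CTR(%)"] = df["Number of Website Clicks"] / df["Number of Impressions"] * 100
df["Conversion Rate(%)"] = df["Number of Purchase"] / df["Number of Website Clicks"] * 100
df["Add to Cart Rate(%)"] = df["Number of Purchase"] / df["Number of Add to Cart"] * 100

print(df.head())
```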

Exploratory Data Analysis

In order to explore the data, various graphs were created and the data was viewed from different angles.

According to the following graph, the control group performs better in Conversion Rate(%), while the test group performs better in CTR(%), that is, website clicks per ad impression.

The distributions look similar for the test and control groups; no notable difference was observed in Number of Website Clicks or Number of Purchase.

In the chart below, the number of products purchased per commercial transaction in the control group appears to be higher.

The test group seems to have made more expensive purchases than the control group, as the graph below shows.

As seen below, both the number of impressions and reach are higher in the control group compared to the test group.

When the sample data were examined, a total of 68,653 USD was spent in the control group, while 76,892 USD was spent in the test group. The total spend uplift was calculated as follows:

(76892 - 68653) / 68653 = 0.12

According to this result, in total, 12% more money was spent in the test group.

The average amount spent per purchase was 5.90 USD in the test group, compared with 4.99 USD in the control group, about 0.91 USD lower. The USD per purchase uplift was calculated as follows:

(5.90 - 4.99) / 4.99 = 0.18
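Both uplift figures follow the same relative-difference formula, which can be checked with a few lines of code. The helper name uplift is hypothetical and used only for this illustration; the input numbers are the ones quoted in the article.

```python
def uplift(test_value: float, control_value: float) -> float:
    """Relative change of the test group with respect to the control group."""
    return (test_value - control_value) / control_value

# Totals and per-purchase averages reported above.
total_spend_uplift = uplift(76892, 68653)
spend_per_purchase_uplift = uplift(5.90, 4.99)

print(round(total_spend_uplift, 2))         # 0.12
print(round(spend_per_purchase_uplift, 2))  # 0.18
```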

Although the number of commercial transactions decreased, more products may have been purchased in each transaction, or the type of product sold may have changed. This can be examined by looking at the number of products added to the cart.

In the test group, far fewer products were added to the cart. Therefore, the product type may have changed, or products added to the cart in the test group may have been removed less frequently.

The distribution of the groups according to the conversion rate (%) is shown below as a box plot.

The mean conversion rate (%) was 11.4 for the control group and 9.23 for the test group.
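Per-group means such as these can be obtained with a groupby, as in the sketch below. The campaign labels and the sample values are assumptions, so the output does not reproduce the figures quoted above.

```python
import pandas as pd

# Hypothetical sample of the combined dataframe.
df = pd.DataFrame({
    "Campaign Name": [
        "Control Campaign", "Control Campaign",
        "Test Campaign", "Test Campaign",
    ],
    "Conversion Rate(%)": [10.0, 12.0, 8.0, 9.0],
})

# Mean conversion rate for each campaign group.
mean_cr = df.groupby("Campaign Name")["Conversion Rate(%)"].mean()

print(mean_cr)
```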

The distribution of the groups according to CTR (%) is shown below as a box plot. The box plot suggests that the ads published for the test group may have captured the target audience better. Another possibility is that the ads served were made more eye-catching.

The distribution of the groups according to the number of impressions is shown below as a box plot.

The results are consistent across the box plots for number of impressions and reach: the values of the control group are higher than those of the test group.

A/B Test: Conversion Rate

First, df is copied to df_2 so that df itself is not altered. Then the dataset is split into two parts, test_g and control_g.

For the first hypothesis test, conversion rate columns are assigned to test_group_cr and control_group_cr variables.
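The copy, split and column selection could be sketched as follows. The variable names come from the article, while the campaign labels used for filtering and the sample values are assumptions.

```python
import pandas as pd

# Hypothetical sample of the combined dataframe.
df = pd.DataFrame({
    "Campaign Name": [
        "Control Campaign", "Test Campaign",
        "Control Campaign", "Test Campaign",
    ],
    "Conversion Rate(%)": [10.0, 8.0, 12.0, 9.0],
})

# Work on a copy so later changes do not touch the original df.
df_2 = df.copy()

# Split by group label (the label strings are assumed).
test_g = df_2[df_2["Campaign Name"] == "Test Campaign"]
control_g = df_2[df_2["Campaign Name"] == "Control Campaign"]

# Keep only the metric under test for each group.
test_group_cr = test_g["Conversion Rate(%)"]
control_group_cr = control_g["Conversion Rate(%)"]

print(test_group_cr.tolist(), control_group_cr.tolist())
```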

Before running a hypothesis test, the standard deviations of the two samples were compared. They differ: the test group std is 4.449 and the control group std is 6.722.

In order to detect the outlier values, the boxplot graph of the conversion rate was drawn.

After the outlier values were seen in the boxplot graph, the points were printed.

The detected outliers were removed from the data, which brought the variances of the two datasets closer together. This allows a standard independent two-sample t-test to be used for the A/B test.

After the outliers were removed, the distributions of both samples looked approximately normal.
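A common way to turn box plot outliers into a filter is the 1.5 × IQR rule, which is the default whisker length in most plotting libraries. The sketch below assumes that rule; the article does not say exactly how the outlier points were selected, and the helper name and sample values are hypothetical.

```python
import pandas as pd

def remove_iqr_outliers(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Drop points lying more than k * IQR outside the quartiles."""
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return values[(values >= lower) & (values <= upper)]

# Hypothetical conversion rates with one extreme value.
control_group_cr = pd.Series([8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 40.0])
cleaned = remove_iqr_outliers(control_group_cr)

print(control_group_cr.std(), cleaned.std())
```

Removing the extreme point shrinks the standard deviation considerably, which is the effect described above.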

The A/B test was performed. Since the p-value is greater than the significance level alpha = 0.05, the null hypothesis H0 could not be rejected. Therefore, there is no statistically significant difference between the control and the test group in terms of Conversion Rate(%).
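An independent two-sample t-test of this kind can be run with scipy.stats.ttest_ind, as in the sketch below. The sample values are made up, and using equal_var=True (Student's t-test) is an assumption that matches the equal-variance reasoning in the text.

```python
from scipy import stats

# Hypothetical daily conversion rates (%) after outlier removal.
test_group_cr = [9.1, 8.7, 10.2, 9.8, 9.4, 8.9, 10.0, 9.6]
control_group_cr = [9.3, 8.8, 10.1, 9.9, 9.2, 9.0, 10.3, 9.5]

alpha = 0.05

# H0: both groups have the same mean conversion rate.
result = stats.ttest_ind(test_group_cr, control_group_cr, equal_var=True)
p_value = result.pvalue

if p_value < alpha:
    decision = "reject H0"
else:
    decision = "fail to reject H0"

print(result.statistic, p_value, decision)
```

Failing to reject H0 means the data do not show a difference at the chosen significance level; it does not prove that the groups are identical.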

A/B Test: CTR

After Conversion Rate(%), the A/B test for CTR(%) is carried out. Outliers were visualized with a box plot.

The point with a CTR value of 34 was removed from the sample as an outlier.

As a rule of thumb, if the ratio of the larger variance to the smaller variance is less than 4, we can assume the variances are approximately equal and use Student’s t-test [2]. When the variances of the test and control groups were compared, they were found to be close enough for hypothesis testing with Student’s t-test.

The distribution plots of the sample groups were drawn to check whether they follow a normal distribution. Because visual inspection alone is not conclusive, it is useful to run a formal test of whether the groups are normally distributed.
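The rule of thumb can be written as a small check. The helper name and the sample values below are hypothetical; only the threshold of 4 comes from the text.

```python
import statistics

def variance_ratio_check(sample_a, sample_b, max_ratio=4.0):
    """Return the larger-to-smaller variance ratio and whether it is below max_ratio."""
    var_a = statistics.variance(sample_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(sample_b)
    ratio = max(var_a, var_b) / min(var_a, var_b)
    return ratio, ratio < max_ratio

# Hypothetical CTR(%) values for each group.
test_ctr = [4.0, 5.0, 6.0, 7.0, 8.0]
control_ctr = [4.5, 5.5, 7.0, 8.0, 10.0]

ratio, roughly_equal = variance_ratio_check(test_ctr, control_ctr)
print(ratio, roughly_equal)
```

When the ratio is 4 or more, Welch's t-test, which does not assume equal variances, is the usual alternative.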

The null hypothesis of both the Shapiro-Wilk and Kolmogorov-Smirnov tests is that the population is normally distributed. Thus, if the p-value is less than the chosen alpha level, the null hypothesis is rejected [3]. As the test results below show, the test group is not normally distributed. Therefore, Student’s t-test, which assumes normality, cannot be applied to CTR. As a recommendation, the A/B test for CTR can be revisited after the test data has been transformed to follow a normal distribution.
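The decision rule for the Shapiro-Wilk test can be illustrated with scipy.stats.shapiro. The two samples below are synthetic and built deterministically from normal quantiles, so they are assumptions for demonstration only and say nothing about the real CTR data; the helper name looks_normal is also hypothetical.

```python
import math
from statistics import NormalDist
from scipy import stats

# Synthetic samples: evenly spaced normal quantiles, and a skewed
# (log-normal shaped) version obtained by exponentiating them.
n = 50
probs = [(i + 0.5) / n for i in range(n)]
normal_like = [NormalDist().inv_cdf(p) for p in probs]
skewed = [math.exp(z) for z in normal_like]

def looks_normal(sample, alpha=0.05):
    """True when Shapiro-Wilk does not reject H0 (normality) at level alpha."""
    result = stats.shapiro(sample)
    return result.pvalue >= alpha

print(looks_normal(normal_like))
print(looks_normal(skewed))
```

A True result only means that normality could not be rejected, not that the sample is proven to be normal.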

A/B Test: Add to Cart Rate (%)

After CTR(%), the A/B test for Add to Cart Rate(%) is carried out. Outliers were visualized with a box plot.

The point with an Add to Cart Rate(%) value greater than 140 was removed from the sample as an outlier.

When the variances of the test and control groups were compared, they were found to be close enough for hypothesis testing with Student’s t-test.

The distribution plots of the sample groups were drawn to check whether they follow a normal distribution. Although the plots resemble a normal distribution, the Shapiro-Wilk test was still applied.

Since the p-value is greater than the alpha level of 0.05, the null hypothesis of normality is not rejected. Therefore, the A/B test can be applied.

The A/B test was performed. Since the p-value is smaller than the significance level alpha = 0.05, the null hypothesis H0 was rejected. Therefore, there is a statistically significant difference between the control and the test group in terms of Add to Cart Rate(%).

The means of the test and control groups were 61.79 and 42.00, respectively. This suggests that the new design of the web page is an improvement in terms of Add to Cart Rate.

To sum up, three parameters were examined in this study to measure the differences between the old and new design of a website: conversion rate, CTR and add to cart rate.

For Conversion Rate, there was no statistically significant difference between the old and the new design. CTR turned out to be unsuitable for this A/B test because of its distribution. For Add to Cart Rate, the mean of the new design was higher than that of the old one.

Glossary:

Control group: In an online controlled experiment, a group of (usually randomly assigned) users/sessions/pageviews/etc. who are not exposed to the experimental treatment(s). The performance of the Test Group(s) is compared to that of the Control Group to check for discrepancies large enough to reject the null hypothesis [4].

Test group: A group of (usually randomly assigned) users/sessions/pageviews/etc. that are exposed to a certain treatment. The test groups are then compared to the Control Group to check for discrepancies large enough to reject the null hypothesis of interest [5].

control_group.csv: Contains the control group data.

test_group.csv: Contains the test group data.

Campaign Name: Takes two values, indicating the test group or the control group.

Date: dd/mm/yy

Spend [USD]: The money spent per day.

Number of Impressions: The number of times an ad was shown to users.

Reach: The number of unique people who saw an ad.

Number of Website Clicks: The number of times users clicked the website link in the advertisement.

Number of Searches: The number of times users performed a search on the website.

Number of View Content: The number of times users viewed the details of a product.

Number of Add to Cart: The number of times users added a product to the cart.

Number of Purchase: The number of purchases made by users.

Conversion Rate: Commercial transactions per visit, shown as a percentage.

USD Spend per Purchase: Amount of USD spent per item purchased.

CTR(%): A ratio showing how often people who see your ad end up clicking it, shown as a percentage.

Add to Cart Rate(%): Purchase rate of products added to cart during the session, shown as a percentage.

References:

[1] https://www.kaggle.com/code/evaaasong/a-b-testing-analysis/notebook

[2] https://www.statology.org/determine-equal-or-unequal-variance/

[3] https://en.wikipedia.org/wiki/Shapiro%E2%80%93Wilk_test

[4] https://www.analytics-toolkit.com/glossary/control-group/

[5] https://www.analytics-toolkit.com/glossary/test-group/