A/B Testing: Calculating Your Sample Size Using Various Methods
One headache in A/B testing is determining the total number of users to put into the two groups, namely the control_group and the experimental_group (test_group). In this blog, I focus on methods you can use to compute your sample size, so that your A/B tests can reliably measure the performance of any feature before it is launched.
The first method uses the two-sample proportion hypothesis test from statistics, the same approach used by the G*Power method. The formula for this method is shown below.
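In its standard per-group form (different tools combine the variance terms slightly differently, so results can vary by a handful of users):

n = (Zα/2 + Zβ)² × [P1(1 − P1) + P2(1 − P2)] / (P2 − P1)²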
From the formula above, let's define the variables.
=> n = the sample size per group to be calculated.
=> α is the significance level of 5% = 0.05.
=> β is the false negative (Type II error) rate of 20%; the statistical power is 1 − β = 80%.
=> P1 is the baseline conversion rate: the current conversion rate of your control_group, i.e. the existing product as it is.
=> P2 is the conversion rate lifted by the absolute Minimum Detectable Effect (MDE): P2 = P1 + absolute MDE. The absolute MDE is the relative lift you want to detect multiplied by the baseline rate (equivalently, relative MDE % = (absolute lift / baseline conversion rate) × 100). The MDE is the marginal impact you want to be able to detect.
=> Zα/2 is the Z-score from the table that corresponds to α/2.
=> Zβ is the Z-score from the table that corresponds to β.
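To make these definitions concrete, here is a minimal Python sketch of the same formula (the function name and the use of scipy.stats.norm.ppf to look up the Z-scores are illustrative choices for this post, not part of any particular tool):

```python
from scipy.stats import norm

def sample_size_per_group(p1, p2, alpha=0.05, power=0.80):
    """Required users per group for a two-sample proportion test.

    p1    -- baseline conversion rate (control_group)
    p2    -- p1 + absolute MDE (treatment_group)
    alpha -- significance level
    power -- statistical power, i.e. 1 - beta
    """
    z_alpha = norm.ppf(alpha / 2)   # Z-score for alpha/2, e.g. -1.96 when alpha = 0.05
    z_beta = norm.ppf(1 - power)    # Z-score for beta, e.g. -0.84 when power = 0.80
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return (z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2
```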
With the variables explained, we can now work through a sample question to compute a sample size.
Practical Question
Assume you run an e-commerce platform and want to send out newsletters as reminders to members who have been inactive for about six months. The reactivation rate (members who click the newsletter and make a purchase) is 30%. Management considers this reactivation rate low and wants something done about it. So this is the hypothesis: "If we give all newly reactivated members $5, we will increase the reactivation rate by 50%." The $5 reactivation offer would be included as part of the newsletter sent to inactive members.
We will use this same example in subsequent posts to determine whether the test yielded the expected results and whether it makes statistical sense to roll the feature out. For now, we need to calculate the sample size required to test this hypothesis.
Solution using the G*Power method
=> n = the sample size per group to be calculated
=> α = 5% = 0.05
=> β = 20% = 0.2
=> P1 = control_group = 30% = 0.3
=> P2 = treatment_group = P1 + absolute MDE; absolute MDE = 50% × 30% = 15%, so P2 = 0.3 + 0.15 = 0.45
=> Zα/2 = −1.959963985, read from a negative Z-score table (two-tailed, so we look up α/2 = 0.025)
=> Zβ = −0.841621234, read from a negative Z-score table (we look up β = 0.2)

Substitute these values into the formula above.
From the above, if the team wants to increase the reactivation rate by 50%, the sample size needed is approximately 150 users per group: the control_group needs about 150 users and the treatment_group needs about 150 users to be able to test whether the hypothesis is true.
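As a rough cross-check, plugging these same values into the sample_size_per_group sketch above (p1 = 0.30, p2 = 0.45, α = 0.05, power = 0.80) returns roughly 160 per group; the exact figure depends on how a given tool combines the variance terms, which is why the numbers quoted in this post range from about 150 to 162.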
This calculation is equivalent to using the popular online tool Evan's Awesome A/B Tools. Understanding the statistical logic behind it helps you make better decisions when determining sample sizes for A/B testing.
Method 2: Using the statsmodels module in Python to compute the sample size. I have created a script for this in my GitHub repo; make sure to substitute your own values into the code before you run it. With the values from our example, it produces approximately 161.93.
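The general shape of that calculation looks like the sketch below (a minimal example, not necessarily the exact script in the repo). proportion_effectsize computes Cohen's h for the two conversion rates, and solve_power returns the required size of one group, so the figure for both groups combined is twice that:

```python
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

p1, p2 = 0.30, 0.45        # baseline rate and rate lifted by the absolute MDE
alpha, power = 0.05, 0.80  # significance level and statistical power (1 - beta)

# Cohen's h: standardized effect size for comparing two proportions
effect_size = proportion_effectsize(p2, p1)

# solve_power returns the required number of users in one group
n_per_group = NormalIndPower().solve_power(
    effect_size=effect_size,
    alpha=alpha,
    power=power,
    ratio=1.0,
    alternative="two-sided",
)

print(f"Per group: {n_per_group:.2f}")                 # about 81
print(f"Both groups combined: {2 * n_per_group:.2f}")  # about 161.9
```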
Method 3: Using Excel to calculate the sample size for A/B testing. In this spreadsheet here, I have done the calculation; it arrives at a result of approximately 150.38. Excel provides the NORM.S.INV function, which returns the Z-scores used in the calculation.
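For example, =NORM.S.INV(0.025) returns −1.95996 and =NORM.S.INV(0.20) returns −0.84162, the two Z-scores used in the calculation above.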
You can choose whichever single method best suits you. All of them give broadly consistent answers, and any one is good enough to use.
I hope this has been helpful to you as a product analyst or data analyst. A/B testing is a very important skill for an analyst: it helps product managers make informed decisions about feature rollouts. Thank you for reading. I hope you learnt something new and will apply A/B testing in your projects before you roll out any new feature or design. Contact me via email at henrykpano@gmail.com with your questions. Share and follow me.