A/B Test Significance Calculator
Determine if your A/B test results are statistically significant. Calculate p-values, confidence intervals, and conversion rate lift.
Formula
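z = (p₁ - p₂) / √(p̂(1 - p̂)(1/n₁ + 1/n₂))
Two-proportion z-test comparing conversion rates between control and treatment groups. Here p₁ and p₂ are the observed conversion rates of the two groups, n₁ and n₂ are their visitor counts, and p̂ is the pooled conversion rate (total conversions divided by total visitors across both groups).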
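To make the formula concrete, here is a minimal Python sketch of the same pooled two-proportion z-test. The function name and argument names are illustrative assumptions, not the calculator's actual code, and the two-sided p-value is an assumption too, since the page does not say whether the tool uses a one-sided or two-sided test.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for a pooled two-proportion z-test.

    conv_a, n_a: conversions and visitors in the first group (p1, n1)
    conv_b, n_b: conversions and visitors in the second group (p2, n2)
    """
    p1 = conv_a / n_a
    p2 = conv_b / n_b
    # Pooled rate: total conversions over total visitors.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution (assumption).
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


z, p_value = two_proportion_z_test(500, 10_000, 600, 10_000)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

With 500 conversions out of 10,000 visitors in control and 600 out of 10,000 in treatment, z comes out at about -3.10 and the p-value at about 0.002. The sign is negative only because the first group's rate is subtracted from a higher second rate.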
A/B Testing Best Practices
✓ Do
- Calculate sample size before starting
- Run tests for full business cycles
- Test one variable at a time
- Document your hypothesis
- Consider practical significance
✗ Don't
- Stop tests early when significant
- Run multiple tests simultaneously
- Ignore segment differences
- Test during unusual periods
- Make decisions on small samples
Understanding Your Results
| P-Value | Confidence | Interpretation |
|---|---|---|
| < 0.01 | > 99% | Very strong evidence |
| 0.01 - 0.05 | 95% - 99% | Strong evidence |
| 0.05 - 0.10 | 90% - 95% | Moderate evidence |
| > 0.10 | < 90% | Weak/no evidence |
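If you want to apply the table in a script, a small lookup like the sketch below would do it. The function name is hypothetical, and because the table does not say which row an exact boundary value (0.01, 0.05, 0.10) belongs to, this sketch assumes it falls into the weaker category.

```python
def interpret_p_value(p_value):
    """Map a p-value to the interpretation labels from the table.

    Assumption: exact boundary values (0.01, 0.05, 0.10) fall into
    the weaker-evidence row, since the table leaves this open.
    """
    if p_value < 0.01:
        return "Very strong evidence"
    if p_value < 0.05:
        return "Strong evidence"
    if p_value < 0.10:
        return "Moderate evidence"
    return "Weak/no evidence"


for p in (0.002, 0.03, 0.08, 0.4):
    print(p, "->", interpret_p_value(p))
```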
How to Use
1. Enter control data: Input the number of visitors and conversions for your control group (A).
2. Enter treatment data: Input the number of visitors and conversions for your treatment group (B).
3. Calculate: Click Calculate to analyze statistical significance.
4. Interpret results: Review the p-value, confidence level, and lift to make decisions.
Frequently Asked Questions
What is statistical significance in A/B testing?
Statistical significance indicates that the difference between your control and treatment groups is unlikely to be due to random chance. A p-value below 0.05 (95% confidence) is typically considered significant.
What is a good sample size for A/B tests?
Sample size depends on your baseline conversion rate and the minimum effect you want to detect. Generally, you need thousands of visitors per variation. Use the sample size recommendation in the results for guidance.
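As a rough illustration of how baseline rate and minimum detectable effect drive sample size, the sketch below uses a common normal-approximation formula for comparing two proportions. It is not necessarily the method behind the calculator's own recommendation, and the defaults of 5% significance (two-sided) and 80% power are assumptions.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_variation(baseline_rate, relative_mde,
                              alpha=0.05, power=0.80):
    """Approximate visitors needed per variation (two-sided test).

    baseline_rate: control conversion rate, e.g. 0.05 for 5%
    relative_mde:  smallest relative lift to detect, e.g. 0.20 for 20%
    alpha, power:  assumed defaults, adjust to your own standards
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_mde)
    p_bar = (p1 + p2) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


print(sample_size_per_variation(0.05, 0.20))
```

For a 5% baseline and a 20% relative lift, this gives a little over 8,000 visitors per variation, which matches the "thousands of visitors" rule of thumb. Smaller effects or lower baseline rates push the number up quickly.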
What does relative lift mean?
Relative lift is the percentage improvement of the treatment over the control. For example, if control converts at 5% and treatment at 6%, the relative lift is 20% ((6-5)/5 × 100).
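The same arithmetic as a short Python sketch (the function name is illustrative):

```python
def relative_lift(control_rate, treatment_rate):
    """Percentage improvement of treatment over control."""
    return (treatment_rate - control_rate) / control_rate * 100


# Control converts at 5%, treatment at 6%.
print(f"{relative_lift(0.05, 0.06):.1f}%")
```

A negative result means the treatment converted worse than the control.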
Should I stop my test when it reaches significance?
No! Checking interim results and stopping as soon as they look significant is called "peeking," and it can lead to false positives that invalidate your results. Plan your sample size in advance and run the test until you reach it, regardless of interim results.
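To see why early stopping is risky, the sketch below simulates A/A tests, where both groups share the same true conversion rate, so every "significant" result is a false positive. It compares stopping at the first significant interim look against checking only once at the planned sample size. All names, the 5% rate, the 10 looks, and the run counts are assumptions chosen for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist


def is_significant(conv_a, n_a, conv_b, n_b, alpha=0.05):
    """Pooled two-proportion z-test, two-sided, at level alpha."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    if pooled in (0, 1):
        return False  # no variation yet, so z is undefined
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (conv_a / n_a - conv_b / n_b) / se
    return 2 * (1 - NormalDist().cdf(abs(z))) < alpha


def simulate_aa_tests(rate=0.05, n_per_arm=1000, looks=10,
                      runs=300, seed=0):
    """Return (false positive rate with peeking, rate without)."""
    rng = random.Random(seed)
    step = n_per_arm // looks
    flagged_when_peeking = 0
    flagged_at_end = 0
    for _run in range(runs):
        conv_a = conv_b = 0
        ever_significant = False
        for look in range(1, looks + 1):
            # Add one batch of visitors to each arm.
            for _visitor in range(step):
                conv_a += rng.random() < rate
                conv_b += rng.random() < rate
            n = look * step
            significant_now = is_significant(conv_a, n, conv_b, n)
            ever_significant = ever_significant or significant_now
        flagged_when_peeking += ever_significant
        flagged_at_end += significant_now  # result at the final look only
    return flagged_when_peeking / runs, flagged_at_end / runs


peeking_rate, fixed_rate = simulate_aa_tests()
print(f"peeking: {peeking_rate:.1%}, fixed sample: {fixed_rate:.1%}")
```

Because the final look is one of the peeks, the peeking rate can never be lower than the fixed-sample rate, and in practice it is usually noticeably higher than the nominal 5%.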
What is the confidence interval?
The confidence interval shows the range where the true difference between treatment and control likely falls. For example, a 95% CI of [1%, 5%] means we're 95% confident the true difference is between 1% and 5%.
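One common way to compute such an interval is the normal-approximation (Wald) interval for the difference between two conversion rates, sketched below. The page does not say which method the calculator uses, so treat the method and the function name as assumptions. Note that this interval is for the absolute difference in rates, not the relative lift.

```python
from math import sqrt
from statistics import NormalDist


def diff_confidence_interval(conv_a, n_a, conv_b, n_b, confidence=0.95):
    """Wald interval for (treatment rate - control rate).

    conv_a, n_a: control conversions and visitors
    conv_b, n_b: treatment conversions and visitors
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Unpooled standard error of the difference.
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p_b - p_a
    return diff - z_crit * se, diff + z_crit * se


low, high = diff_confidence_interval(500, 10_000, 600, 10_000)
print(f"95% CI for the difference: [{low:.2%}, {high:.2%}]")
```

For 500 of 10,000 conversions in control and 600 of 10,000 in treatment, the interval is roughly [0.4%, 1.6%] in absolute terms. Since it excludes zero, it agrees with a significant result at the 5% level.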