Intuitive A/B Test Evaluations for Coders

Friday 14:40 in Ferrum

Making A/B Test Evaluations Intuitive for Coders: A Python-Based Approach

A/B testing is an essential method for data-driven decision-making, but interpreting the results can be daunting. Jargon around p-values and confidence intervals often creates barriers to understanding. This talk simplifies A/B testing by introducing a practical, Python-powered approach based on bootstrapping, a flexible and accessible resampling method that matches how software engineers think and requires no prior statistical training.

Session Highlights:

Statistical Significance and Hypothesis Testing:
- Why is statistical testing crucial for A/B tests? Simple comparisons overlook randomness.
- Using Python, we’ll demonstrate how to simulate "what-if" scenarios by shuffling and resampling data, allowing participants to compute p-values and understand the likelihood of observed differences occurring by chance.
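
The shuffling idea described above can be sketched with nothing beyond the standard library's `random` module. This is a minimal illustration of a permutation (label-shuffling) test for the difference in conversion rates, not necessarily the code shown in the talk. The function name `permutation_p_value`, the iteration count, and the example data are assumptions made for illustration.

```python
import random

def permutation_p_value(a, b, n_iter=5_000, seed=0):
    """Two-sided p-value for the difference in means, via label shuffling."""
    rng = random.Random(seed)
    observed = sum(b) / len(b) - sum(a) / len(a)
    pooled = list(a) + list(b)
    at_least_as_extreme = 0
    for _ in range(n_iter):
        # "What if the group labels meant nothing?" Shuffle and re-split.
        rng.shuffle(pooled)
        fake_a = pooled[:len(a)]
        fake_b = pooled[len(a):]
        diff = sum(fake_b) / len(fake_b) - sum(fake_a) / len(fake_a)
        if abs(diff) >= abs(observed):
            at_least_as_extreme += 1
    # Share of shuffles that produced a difference as large as the real one.
    return at_least_as_extreme / n_iter

# Hypothetical data: 1 = converted, 0 = did not convert.
control = [1] * 20 + [0] * 180   # 20 of 200 converted
variant = [1] * 35 + [0] * 165   # 35 of 200 converted

p_value = permutation_p_value(control, variant)
print(f"p-value ~ {p_value:.3f}")
```

A small p-value means that random relabeling rarely produces a gap as large as the one observed, so chance alone is an unlikely explanation for the difference.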
Confidence Intervals with Bootstrapping:
- Confidence intervals clarify the range of plausible outcomes.
- We’ll explore how to resample experiment data repeatedly to estimate variability and construct intuitive confidence intervals—all using basic tools like random number generators and loops, without requiring advanced math.
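
The resampling loop described above might look like the following sketch, which builds a percentile bootstrap interval for the difference in conversion rates. It assumes the percentile method (other bootstrap interval variants exist), and the function name `bootstrap_ci`, the default parameters, and the example data are hypothetical.

```python
import random

def bootstrap_ci(a, b, n_iter=5_000, level=0.95, seed=0):
    """Percentile bootstrap interval for mean(b) - mean(a)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_iter):
        # Resample each group with replacement, keeping its original size.
        resampled_a = rng.choices(a, k=len(a))
        resampled_b = rng.choices(b, k=len(b))
        diffs.append(sum(resampled_b) / len(b) - sum(resampled_a) / len(a))
    diffs.sort()
    # Cut off the most extreme resampled differences on both sides.
    lower_index = round((1 - level) / 2 * n_iter)
    upper_index = round((1 + level) / 2 * n_iter) - 1
    return diffs[lower_index], diffs[upper_index]

# Hypothetical data: 1 = converted, 0 = did not convert.
control = [1] * 20 + [0] * 180   # 20 of 200 converted
variant = [1] * 35 + [0] * 165   # 35 of 200 converted

low, high = bootstrap_ci(control, variant)
print(f"95% interval for the uplift: {low:.3f} to {high:.3f}")
```

If the interval excludes zero, the data are hard to reconcile with "no difference"; its width shows how precisely the uplift has been measured.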
Key Takeaways:
- Hands-on skills to compute p-values and confidence intervals using basic programming concepts.
- Clear, step-by-step demonstrations of shuffling, resampling, and generating statistical insights.
- Practical knowledge to move beyond black-box libraries and understand the "why" and "how" behind A/B test evaluations.

By the end of the session, attendees will be equipped to demystify A/B testing with a coder-friendly workflow, empowering them to make confident, data-driven decisions in their projects.

Talk Outline:

Setting the Stage (5 minutes)
- What is A/B testing?
- Why isn't it enough to just compare numbers? Why do we need statistics to interpret results?
Statistical Significance and P-Values (5 minutes)
- Statistical tests (t-test, z-test, binomial test) are frequently used, but what is the intuition behind them?
- Introducing the basic idea of bootstrapping.
Bootstrapping Explained (8 minutes)
- Step-by-step illustration of the bootstrapping approach.
- What is a p-value? An intuitive description using resampling.
Confidence Intervals Explained (7 minutes)
- Importance of confidence intervals and how they help interpret results.
- Intuitive computation of confidence intervals using bootstrapping.
- Impact of sample size on confidence intervals and certainty.
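
As a rough illustration of the sample-size point, the sketch below bootstraps a 95% interval for a single conversion rate at three sample sizes and compares the interval widths. The helper name `bootstrap_ci_width`, the 10% conversion rate, and the sample sizes are assumptions chosen for the example.

```python
import random

def bootstrap_ci_width(n, rate=0.1, n_iter=2_000, seed=0):
    """Width of a 95% percentile bootstrap interval for a conversion rate."""
    rng = random.Random(seed)
    conversions = round(n * rate)
    # Hypothetical sample of n visitors with a fixed share of conversions.
    sample = [1] * conversions + [0] * (n - conversions)
    rates = sorted(sum(rng.choices(sample, k=n)) / n for _ in range(n_iter))
    lower = rates[round(0.025 * n_iter)]
    upper = rates[round(0.975 * n_iter) - 1]
    return upper - lower

widths = {n: bootstrap_ci_width(n) for n in (100, 400, 1600)}
for n, width in widths.items():
    print(f"n={n:5d}  interval width ~ {width:.3f}")
```

Interval width shrinks roughly with the square root of the sample size, so quadrupling the traffic only about halves the uncertainty.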
Why These Statistics Matter (5 minutes)
- Discussion of why statistical techniques are a practical necessity.
- How these methods ensure data-driven decision-making in A/B testing.

Thomas Mayer

Thomas Mayer holds a PhD in Quantitative Language Comparison and brings a profound background in Machine Learning and Natural Language Processing (NLP) to his work. As Team Lead in the Data Intelligence team at HolidayCheck, Thomas combines his passion for data-driven insights with his expertise in linguistics and AI to drive innovation in the travel industry. With a deep understanding of both technical and business challenges, he plays a pivotal role in leveraging data to enhance customer experiences and inform strategic decisions.