Tutorial: Running AB Tests with aboba
This tutorial will guide you through the core concepts of the aboba library by walking through practical examples. You'll learn how to set up experiments, run tests, and analyze results.
Core Concepts
Before diving into examples, let's understand the key components:
- Test: The statistical test you want to run (t-test, HSD, etc.)
- Pipeline: A sequence of data processors and splitters that prepare your data
- Splitter: Determines how to split data into groups
- Processor: Transforms data (e.g., CUPED, bucketing)
- Effect Modifier: Simulates synthetic effects for power analysis
- Experiment: Orchestrates multiple test runs and visualizes results
Example 1: Basic Synthetic Data Experiment
Let's start with a simple example using synthetic data to understand the workflow.
Step 1: Generate Data
First, we'll create two groups from the same distribution N(0, 1):
import numpy as np
import pandas as pd
import scipy.stats as sps
from aboba import tests, splitters, effect_modifiers, experiment, pipeline
# Generate synthetic data
n = 1000
data_a = sps.norm.rvs(size=n, loc=0, scale=1)
data_b = sps.norm.rvs(size=n, loc=0, scale=1)
# Create a DataFrame with two groups
data = pd.DataFrame({
    'value': np.concatenate([data_a, data_b]),
    'b_group': np.concatenate([
        np.repeat(0, n),
        np.repeat(1, n),
    ]),
})
Step 2: Create a Pipeline
The pipeline defines how to sample data. Here we'll sample 100 observations from each group:
group_size = 100
data_pipeline = pipeline.Pipeline([
    ('GroupSplitter', splitters.GroupSplitter(size=group_size, column='b_group')),
])
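Conceptually, the splitter performs stratified sampling: on each iteration it draws group_size rows from every level of the grouping column. A rough pandas equivalent of that sampling step (a sketch of the idea, not aboba's actual implementation):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
data = pd.DataFrame({
    'value': rng.normal(size=2000),
    'b_group': np.repeat([0, 1], 1000),
})

group_size = 100
# Draw group_size rows from each b_group level, as a group splitter would
sampled = data.groupby('b_group', group_keys=False).sample(n=group_size, random_state=0)
groups = [g for _, g in sampled.groupby('b_group')]
```

Each experiment iteration repeats this sampling, so the test sees a fresh pair of groups every time.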
Step 3: Define the Test
We'll use an absolute independent t-test on the 'value' column:
test = tests.AbsoluteIndependentTTest(
    value_column='value',
)
Step 4: Create an Experiment
The experiment hub manages multiple test groups and visualizes results:
exp = experiment.AbobaExperiment()
Step 5: Run AA Test (Validation)
First, run an AA test to verify the test is working correctly (both groups from same distribution):
aa_group = exp.group(
    "AA Test",
    test=test,
    data=data,
    data_pipeline=data_pipeline,
    n_iter=100,
    joblib_kwargs={"n_jobs": -1, "backend": "threading"},
).run()
The AA test should show p-values uniformly distributed between 0 and 1, confirming that the false positive rate matches the chosen significance level (about 5% of p-values fall below 0.05).
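You can verify this behaviour without aboba at all; a quick scipy sketch that repeats an AA comparison many times and checks the rejection rate:

```python
import numpy as np
import scipy.stats as sps

rng = np.random.default_rng(42)
pvalues = []
for _ in range(500):
    # Both samples come from the same N(0, 1) distribution: there is no real effect
    a = rng.normal(size=100)
    b = rng.normal(size=100)
    pvalues.append(sps.ttest_ind(a, b).pvalue)

# Under H0 the p-values are ~Uniform(0, 1), so roughly 5% fall below 0.05
false_positive_rate = np.mean(np.array(pvalues) < 0.05)
```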
Step 6: Run AB Test with Synthetic Effect
Now add a synthetic effect to group 1 and run the test:
ab_group = exp.group(
    "AB Test (effect=0.3)",
    test=test,
    data=data,
    data_pipeline=data_pipeline,
    synthetic_effect=effect_modifiers.GroupModifier(
        effects={1: 0.3},  # Add 0.3 to group 1
        value_column='value',
        group_column='b_group',
    ),
    n_iter=100,
).run()
Step 7: Visualize Results
exp.draw()
This will show p-value distributions for both the AA and AB tests. The AB test should show most p-values near 0, indicating the test successfully detected the effect.
Extracting Results
You can access detailed results from each group via the objects returned by .run() (aa_group and ab_group above).
Example 2: Real Data with CUPED
Now let's work with real data and use CUPED (Controlled-experiment Using Pre-Experiment Data) for variance reduction.
Understanding CUPED
CUPED is a variance reduction technique that uses pre-experiment data (covariates) to improve test sensitivity. It adjusts your target metric using information from a correlated covariate.
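The adjustment itself is a one-liner; here is a minimal numpy sketch of the standard CUPED formula (independent of aboba's CupedProcessor), where theta is chosen to minimize the variance of the adjusted metric:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
covariate = rng.normal(size=n)                 # pre-experiment metric
metric = 2.0 * covariate + rng.normal(size=n)  # experiment metric, correlated with covariate

# CUPED: y_cuped = y - theta * (x - mean(x)), with theta = cov(x, y) / var(x)
theta = np.cov(covariate, metric)[0, 1] / np.var(covariate, ddof=1)
metric_cuped = metric - theta * (covariate - covariate.mean())
```

The mean of the metric is unchanged, but its variance drops by a factor of roughly 1 - corr(x, y)^2, which is where the extra sensitivity comes from.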
Step 1: Load Real Data
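The tutorial's real dataset isn't bundled here, so as a stand-in the sketch below generates a synthetic frame with the two columns the rest of this example relies on, price and totsp (total space); in practice, replace it with your own pd.read_csv(...) call:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
n = 2000
totsp = rng.uniform(30, 120, size=n)                  # apartment area, m^2
price = 2000 * totsp + rng.normal(0, 20_000, size=n)  # price strongly tied to area

data = pd.DataFrame({'price': price, 'totsp': totsp})
```

A covariate is only useful for CUPED if it correlates with the target, so it's worth checking data[['price', 'totsp']].corr() before relying on it.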
Step 2: Create a Custom Data Processor
Since CUPED needs the whole dataset before sampling, we'll create a processor to assign groups:
import aboba

class RandomGroupAssigner(aboba.base.BaseDataProcessor):
    """Randomly assigns each row to one of groups_n groups."""

    def __init__(self, groups_n=2, column_name='group'):
        self.groups_n = groups_n
        self.column_name = column_name

    def transform(self, data: pd.DataFrame):
        n = data.shape[0]
        groups = np.random.randint(0, self.groups_n, size=n)
        data[self.column_name] = groups
        # Processors return (data, artefacts); this one produces no artefacts
        return data, None
Step 3: Build CUPED Pipeline
sample_size = 100
covariate = 'totsp'  # Total space as covariate

cuped_pipeline = pipeline.Pipeline([
    RandomGroupAssigner(groups_n=2),
    aboba.processing.EnsureColsProcessor(['price', 'group', covariate]),
    aboba.processing.CupedProcessor(
        value_column='price',
        covariate_column=covariate,
        result_column='price_cuped',
        group_column='group',
        group_test=1,
        group_control=0,
    ),
    splitters.GroupSplitter(column='group', size=sample_size),
    aboba.processing.EnsureColsProcessor(['price_cuped']),
])
Step 4: Run Tests with CUPED
exp = experiment.AbobaExperiment()

cuped_test = tests.AbsoluteIndependentTTest(
    value_column='price_cuped',
)
# AA test with CUPED
exp.group(
    "AA, CUPED",
    test=cuped_test,
    data=data,
    data_pipeline=cuped_pipeline,
    n_iter=100,
).run()

# AB test with CUPED
exp.group(
    "AB, CUPED (effect=10)",
    test=cuped_test,
    data=data,
    data_pipeline=cuped_pipeline,
    synthetic_effect=effect_modifiers.GroupModifier(
        effects={1: 10},
        value_column='price_cuped',
        group_column='group',
    ),
    n_iter=100,
).run()
exp.draw()
CUPED typically shows greater statistical power compared to regular t-tests, meaning it can detect smaller effects with the same sample size.
Example 3: Creating Custom Tests
You can create custom tests by inheriting from BaseTest. Here's an example of a relative t-test:
class RelativeIndependentTTest(aboba.base.BaseTest):
    def __init__(self, value_column="target", alternative="two-sided"):
        super().__init__()
        assert alternative in {"two-sided", "less", "greater"}
        self.value_column = value_column
        self.alternative = alternative

    def test(self, groups, artefacts):
        control_group, test_group = groups
        Y, X = control_group[self.value_column], test_group[self.value_column]
        var_1, var_2 = np.var(X, ddof=1), np.var(Y, ddof=1)
        a_1, a_2 = np.mean(X), np.mean(Y)

        # Relative difference w.r.t. the control mean, with a delta-method variance
        R = (a_1 - a_2) / a_2
        var_R = var_1 / (a_2**2) + (a_1**2) / (a_2**4) * var_2
        n = len(test_group)
        stat = np.sqrt(n) * R / np.sqrt(var_R)

        if self.alternative == "two-sided":
            pvalue = 2 * min(sps.norm.cdf(stat), sps.norm.sf(stat))
        elif self.alternative == "less":
            pvalue = sps.norm.cdf(stat)
        else:  # "greater"
            pvalue = sps.norm.sf(stat)

        return aboba.base.TestResult(
            pvalue=pvalue,
            effect=R,
            effect_type="relative_control",
        )
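To sanity-check the delta-method formulas above without the aboba scaffolding, the same computation can be run on two plain numpy arrays (assuming equal group sizes, as the class does):

```python
import numpy as np
import scipy.stats as sps

rng = np.random.default_rng(1)
control = rng.normal(loc=100, scale=10, size=1000)
treatment = rng.normal(loc=105, scale=10, size=1000)  # ~5% relative uplift

a_c, a_t = control.mean(), treatment.mean()
var_c, var_t = control.var(ddof=1), treatment.var(ddof=1)

# Relative difference w.r.t. control, with its delta-method variance
R = (a_t - a_c) / a_c
var_R = var_t / a_c**2 + (a_t**2) / (a_c**4) * var_c
stat = np.sqrt(len(treatment)) * R / np.sqrt(var_R)
pvalue = 2 * min(sps.norm.cdf(stat), sps.norm.sf(stat))
```

With a genuine 5% uplift and this sample size, R lands near 0.05 and the p-value is effectively zero.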
Using Your Custom Test
Since the custom test reads the raw 'price' column, pair it with a plain pipeline that assigns random groups and samples them, without the CUPED steps:
random_pipeline = pipeline.Pipeline([
    RandomGroupAssigner(groups_n=2),
    splitters.GroupSplitter(column='group', size=sample_size),
])
relative_test = RelativeIndependentTTest(value_column='price')

exp.group(
    "AB, Relative Test",
    test=relative_test,
    data=data,
    data_pipeline=random_pipeline,
    synthetic_effect=effect_modifiers.GroupModifier(
        effects={1: 10},
        value_column='price',
        group_column='group',
    ),
    n_iter=100,
).run()
exp.draw()
Advanced: Flexible Effect Modifiers
Effect modifiers support multiple ways to add effects:
1. Constant Effect
effect_modifiers.GroupModifier(
    effects={1: 0.3},  # Add constant 0.3 to group 1
    value_column='value',
    group_column='b_group',
)
2. Function-Based Effect
def my_effect(obj):
    obj['value'] += 0.3
    return obj

effect_modifiers.GroupModifier(
    effects={0: my_effect},
    value_column='value',
    group_column='b_group',
)
3. Distribution-Based Effect
effect_modifiers.GroupModifier(
    effects={
        0: 0.9,
        1: sps.norm(0.3, 0.001),  # Random effect drawn from a normal distribution
    },
    value_column='value',
    group_column='b_group',
)
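A sketch of how these three effect kinds might be dispatched on a group's values (this is one reading of the semantics, not GroupModifier's actual implementation; apply_effect is a hypothetical helper): a frozen scipy distribution is sampled per row, a callable transforms the values itself, and a plain number is added as a constant:

```python
import pandas as pd
import scipy.stats as sps

def apply_effect(values: pd.Series, effect):
    """Hypothetical helper: apply one effect spec to a group's values."""
    if hasattr(effect, 'rvs'):
        # Distribution-based: draw an independent random effect per row
        return values + effect.rvs(size=len(values))
    if callable(effect):
        # Function-based: let the callable transform the values
        return effect(values)
    # Constant: shift every value by the same amount
    return values + effect

values = pd.Series([1.0, 2.0, 3.0])
shifted = apply_effect(values, 0.3)                 # adds 0.3 to each value
doubled = apply_effect(values, lambda v: v * 2)     # callable transform
noisy = apply_effect(values, sps.norm(0.3, 0.001))  # ~0.3 plus tiny per-row noise
```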
Next Steps
- Explore the API Reference for all available tests
- Learn about data processors for advanced transformations
- Check out splitters for different sampling strategies
- See multiple group tests for comparing more than two groups