Set up A/B testing framework
✓ Works with OpenClaude
You are a data engineer specializing in experimentation infrastructure. The user wants to set up a production-ready A/B testing framework that assigns users to variants, tracks metrics, and computes statistical significance.
What to check first
- Verify you have a database or analytics warehouse with a user/session table (SELECT COUNT(*) FROM users LIMIT 1)
- Confirm your event tracking system is capturing conversion events with timestamp and user_id
- Check that you have Python 3.8+ with pip access for scipy and numpy
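The last check can be scripted. The following is a minimal sketch that uses only the standard library; the function name `check_environment` is made up for illustration, and it only tests whether the packages are importable, not whether pip itself works.

```python
import importlib.util
import sys

REQUIRED_PACKAGES = ("numpy", "scipy")  # the packages named in the checklist


def check_environment(min_version=(3, 8), packages=REQUIRED_PACKAGES):
    """Return a dict reporting whether the interpreter and packages are usable."""
    report = {"python_ok": sys.version_info[:2] >= min_version}
    for name in packages:
        # find_spec returns None when the package cannot be located.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for item, ok in check_environment().items():
        print(f"{item}: {'ok' if ok else 'MISSING'}")
```

Anything reported as MISSING can usually be installed with `pip install numpy scipy`.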
Steps
- Create an experiment metadata table with columns: experiment_id, name, variant_a_name, variant_b_name, start_date, end_date, traffic_allocation (0-100 for variant allocation)
- Implement a consistent hashing function using user_id to deterministically assign users to variants (MD5 hash modulo 100)
- Set up event logging that captures user_id, experiment_id, variant, event_type, timestamp, and value (for revenue/metric)
- Build a query that aggregates metric by variant: count conversions, sum revenue, calculate conversion rate per variant
- Implement a statistical significance test: a two-sample t-test for continuous metrics such as revenue per user, or a chi-square test for binary metrics such as conversion
- Compute a 95% confidence interval for the uplift percentage between variants
- Build a results dashboard query that shows sample size, conversion rate, p-value, and confidence intervals for each variant
- Add guardrail metrics to detect unexpected negative effects (latency, error rate)
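The table and aggregation steps above could look roughly like the sketch below. It uses an in-memory SQLite database so that it runs anywhere; the column types, the `exposure` and `conversion` event type names, and the sample rows are all assumptions made for illustration, not part of the original skill. A production warehouse would use its own SQL dialect and placeholder style.

```python
import sqlite3

# Hypothetical schema based on the column lists in the steps above.
SCHEMA = """
CREATE TABLE experiments (
    experiment_id      TEXT PRIMARY KEY,
    name               TEXT NOT NULL,
    variant_a_name     TEXT NOT NULL,
    variant_b_name     TEXT NOT NULL,
    start_date         TEXT NOT NULL,
    end_date           TEXT,
    traffic_allocation INTEGER NOT NULL
        CHECK (traffic_allocation BETWEEN 0 AND 100)
);
CREATE TABLE experiment_events (
    user_id       TEXT NOT NULL,
    experiment_id TEXT NOT NULL REFERENCES experiments (experiment_id),
    variant       TEXT NOT NULL,
    event_type    TEXT NOT NULL,
    value         REAL,
    timestamp     TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""

# Per variant: distinct exposed users, distinct converting users, total revenue.
AGGREGATE_SQL = """
SELECT variant,
       COUNT(DISTINCT CASE WHEN event_type = 'exposure' THEN user_id END),
       COUNT(DISTINCT CASE WHEN event_type = 'conversion' THEN user_id END),
       COALESCE(SUM(CASE WHEN event_type = 'conversion' THEN value END), 0)
FROM experiment_events
WHERE experiment_id = ?
GROUP BY variant
ORDER BY variant
"""


def aggregate_by_variant(conn, experiment_id):
    """Return {variant: {users, conversions, revenue, conversion_rate}}."""
    results = {}
    for variant, users, conversions, revenue in conn.execute(
        AGGREGATE_SQL, (experiment_id,)
    ):
        results[variant] = {
            "users": users,
            "conversions": conversions,
            "revenue": revenue,
            "conversion_rate": conversions / users if users else 0.0,
        }
    return results


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute(
    "INSERT INTO experiments VALUES "
    "('exp1', 'Checkout button', 'control', 'treatment', '2024-01-01', NULL, 50)"
)
# Made-up sample events: (user_id, experiment_id, variant, event_type, value).
sample_events = [
    ("u1", "exp1", "variant_a", "exposure", None),
    ("u2", "exp1", "variant_a", "exposure", None),
    ("u1", "exp1", "variant_a", "conversion", 20.0),
    ("u3", "exp1", "variant_b", "exposure", None),
    ("u4", "exp1", "variant_b", "exposure", None),
    ("u3", "exp1", "variant_b", "conversion", 15.0),
    ("u4", "exp1", "variant_b", "conversion", 5.0),
]
conn.executemany(
    "INSERT INTO experiment_events "
    "(user_id, experiment_id, variant, event_type, value) VALUES (?, ?, ?, ?, ?)",
    sample_events,
)
results = aggregate_by_variant(conn, "exp1")
for variant, row in results.items():
    print(variant, row)
```

Counting distinct users rather than raw events keeps a user who converts twice from inflating the conversion rate; whether that is the right definition depends on the metric being tested.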
Code
import hashlib
import numpy as np
from scipy import stats
from datetime import datetime
import json


class ABTestFramework:
    def __init__(self, db_connection):
        self.db = db_connection

    def assign_variant(self, user_id, experiment_id):
        """Deterministically assign user to variant using consistent hashing."""
        hash_input = f"{user_id}_{experiment_id}".encode()
        hash_value = int(hashlib.md5(hash_input).hexdigest(), 16)
        return "variant_b" if (hash_value % 100) < 50 else "variant_a"

    def log_event(self, user_id, experiment_id, event_type, value=None):
        """Log conversion/metric event for analysis."""
        variant = self.assign_variant(user_id, experiment_id)
        query = """
            INSERT INTO experiment_events
            (user_id, experiment_id, variant, event_type, value, timestamp)
            VALUES (%s, %s, %s, %s, %s, NOW())
        """
        self.db.execute(query, (user_id, experiment_id, variant, event_type, value))

    def get_experiment_results(self, experiment_id):
        """Compute stats and significance for experiment."""
        query = """
            SELECT variant, COUNT(*
Note: this example was truncated in the source. See the GitHub repo for the latest full version.
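Because the truncated method above never reaches the statistics, here is a separate, hedged sketch of the significance and confidence-interval steps for a binary conversion metric. It is not the original author's code. It uses a two-proportion z-test, which for a 2x2 table is equivalent to the chi-square test without continuity correction, and it relies on the standard library's `statistics.NormalDist` (Python 3.8+) rather than scipy so that it runs without extra installs. The function name `compare_conversion_rates` and the example counts are made up. The interval is for the absolute difference in conversion rate, not the relative uplift.

```python
import math
from statistics import NormalDist


def compare_conversion_rates(conv_a, n_a, conv_b, n_b, confidence=0.95):
    """Two-proportion z-test plus a CI for the absolute difference (B minus A)."""
    if n_a <= 0 or n_b <= 0:
        raise ValueError("both variants need at least one user")
    rate_a, rate_b = conv_a / n_a, conv_b / n_b
    diff = rate_b - rate_a

    # Pooled standard error under the null hypothesis of equal rates.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = diff / se_pooled if se_pooled > 0 else 0.0
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided

    # Unpooled standard error for the interval around the observed difference.
    se_diff = math.sqrt(
        rate_a * (1 - rate_a) / n_a + rate_b * (1 - rate_b) / n_b
    )
    z_crit = NormalDist().inv_cdf(0.5 + confidence / 2)

    return {
        "rate_a": rate_a,
        "rate_b": rate_b,
        "absolute_diff": diff,
        # Relative uplift is undefined when the control rate is zero.
        "relative_uplift_pct": diff / rate_a * 100 if rate_a > 0 else None,
        "z": z,
        "p_value": p_value,
        "ci_low": diff - z_crit * se_diff,
        "ci_high": diff + z_crit * se_diff,
        "significant": p_value < 1 - confidence,
    }


# Illustrative counts only: 100/1000 conversions in A, 130/1000 in B.
result = compare_conversion_rates(100, 1000, 130, 1000)
for key, val in result.items():
    print(f"{key}: {val}")
```

For a continuous metric such as revenue per user, `scipy.stats.ttest_ind` with `equal_var=False` (Welch's t-test) is a common alternative to this z-test.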
Common Pitfalls
- Treating this skill as a one-shot solution — most workflows need iteration and verification
- Skipping the verification steps — you don't know it worked until you measure
- Applying this skill without understanding the underlying problem — read the related docs first
When NOT to Use This Skill
- When a simpler manual approach would take less than 10 minutes
- On critical production systems without testing in staging first
- When you don't have permission or authorization to make these changes
How to Verify It Worked
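- Re-run the checks under "What to check first" and confirm that events arrive in the events table with the expected variant labels
- Compare the observed split of users between variants against the configured traffic allocation, and the control conversion rate against its pre-experiment baseline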
- Check logs for any warnings or errors — silent failures are the worst kind
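One concrete verification is a sample ratio mismatch (SRM) check: assign a batch of users and test whether the observed split is consistent with the intended one. The sketch below is illustrative only. It assumes `traffic_allocation` means the percentage of users sent to variant B, which the source does not state explicitly, and the function names and simulated user IDs are made up. The p-value comes from a chi-square goodness-of-fit test with one degree of freedom, computed with `math.erfc` so that no extra packages are needed.

```python
import hashlib
import math


def assign_variant(user_id, experiment_id, traffic_allocation=50):
    """Hash into a bucket 0-99; buckets below the allocation get variant_b."""
    digest = hashlib.md5(f"{user_id}_{experiment_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return "variant_b" if bucket < traffic_allocation else "variant_a"


def srm_p_value(n_a, n_b, traffic_allocation=50):
    """Chi-square goodness-of-fit p-value (1 df) for the observed split."""
    if not 0 < traffic_allocation < 100:
        raise ValueError("traffic_allocation must be strictly between 0 and 100")
    total = n_a + n_b
    if total == 0:
        raise ValueError("no users assigned")
    expected_b = total * traffic_allocation / 100
    expected_a = total - expected_b
    chi2 = (n_a - expected_a) ** 2 / expected_a + (n_b - expected_b) ** 2 / expected_b
    # With 1 degree of freedom the chi-square survival function is erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


# Simulated user IDs, purely illustrative.
counts = {"variant_a": 0, "variant_b": 0}
for i in range(10_000):
    counts[assign_variant(f"user-{i}", "exp1")] += 1

p = srm_p_value(counts["variant_a"], counts["variant_b"])
print(counts, f"SRM p-value: {p:.3f}")
```

A very small p-value (a threshold such as 0.001 is often used) suggests that assignment or logging is broken and that the experiment results should not be trusted until the cause is found.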
Production Considerations
- Test in staging before deploying to production
- Have a rollback plan — every change should be reversible
- Monitor the affected systems for at least 24 hours after the change