ANOVA: Comparing Multiple Groups Efficiently
Analysis of Variance (ANOVA) is a statistical method used to determine if there are significant differences between the means of three or more groups. Instead of running multiple t-tests (which increases error rates), ANOVA provides a single test for overall group differences.
Practical Example: Marketing Campaign Testing
A company tests four different advertising strategies (A, B, C, D) to see which generates the most sales:
- Campaign A: 45, 48, 50, 47, 46 sales
- Campaign B: 52, 55, 53, 54, 51 sales
- Campaign C: 60, 58, 62, 59, 61 sales
- Campaign D: 40, 42, 38, 41, 39 sales
ANOVA answers: Are these sales differences statistically significant, or just random variation?
How It’s Used:
- Research: Compare multiple treatments in medicine
- Business: Test different pricing strategies
- Education: Evaluate teaching methods
- Manufacturing: Compare production methods
code
import scipy.stats as stats
import pandas as pd
# Sample data: Sales from 4 marketing campaigns
campaign_a = [45, 48, 50, 47, 46]
campaign_b = [52, 55, 53, 54, 51]
campaign_c = [60, 58, 62, 59, 61]
campaign_d = [40, 42, 38, 41, 39]
# Perform one-way ANOVA
f_stat, p_value = stats.f_oneway(campaign_a, campaign_b, campaign_c, campaign_d)
print(f"F-statistic: {f_stat:.2f}")
print(f"P-value: {p_value:.4f}")
if p_value < 0.05:
print("result: Significant differences exist between campaigns")
else:
print("result: no significant differences between campaigns")
Output Interpretation:
- P-value < 0.05: Significant differences exist
- P-value ≥ 0.05: No significant differences
When to Use ANOVA:
Comparing 3+ groups
-
Testing one categorical variable
-
Meeting assumptions: normal distribution, equal variances
resources :github
by gregory.tech