ANOVA or Analysis Of Variance

ANOVA: Comparing Multiple Groups Efficiently

Analysis of Variance (ANOVA) is a statistical method used to determine if there are significant differences between the means of three or more groups. Instead of running multiple t-tests (which increases error rates), ANOVA provides a single test for overall group differences.

Practical Example: Marketing Campaign Testing

A company tests four different advertising strategies (A, B, C, D) to see which generates the most sales:

  • Campaign A: 45, 48, 50, 47, 46 sales
  • Campaign B: 52, 55, 53, 54, 51 sales
  • Campaign C: 60, 58, 62, 59, 61 sales
  • Campaign D: 40, 42, 38, 41, 39 sales

ANOVA answers: Are these sales differences statistically significant, or just random variation?

How It’s Used:

  1. Research: Compare multiple treatments in medicine
  2. Business: Test different pricing strategies
  3. Education: Evaluate teaching methods
  4. Manufacturing: Compare production methods

code

import scipy.stats as stats
import pandas as pd

# Sample data: Sales from 4 marketing campaigns
campaign_a = [45, 48, 50, 47, 46]
campaign_b = [52, 55, 53, 54, 51]
campaign_c = [60, 58, 62, 59, 61]
campaign_d = [40, 42, 38, 41, 39]

# Perform one-way ANOVA
f_stat, p_value = stats.f_oneway(campaign_a, campaign_b, campaign_c, campaign_d)

print(f"F-statistic: {f_stat:.2f}")
print(f"P-value: {p_value:.4f}")

if p_value < 0.05:
    print("result: Significant differences exist between campaigns")
else:
    print("result: no significant differences between campaigns")

Output Interpretation:

  • P-value < 0.05: Significant differences exist
  • P-value ≥ 0.05: No significant differences

When to Use ANOVA:

Comparing 3+ groups

  • Testing one categorical variable

  • Meeting assumptions: normal distribution, equal variances

resources :github

by gregory.tech

Leave a Reply