Chi-Square Test
Chi-square goodness-of-fit between observed and expected counts.
Overview
The Chi-Square Test compares an observed set of counts against an expected set and reports the chi-square statistic, degrees of freedom and p-value. It answers the question, "Is the distribution of my categories different from what I expected?"
It is the daily workhorse for marketers checking whether a campaign attracted a representative sample, biologists comparing observed phenotype counts to Mendel's ratios and product teams validating that user signups are uniform across cohorts.
How it works
Given observed counts O_i and expected counts E_i for each of k categories, the goodness-of-fit statistic is χ² = Σ_i (O_i - E_i)^2 / E_i, summed over all k categories. Degrees of freedom equal k - 1 for a simple test, or k - 1 - m if you estimated m parameters of the expected distribution from the data.
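The p-value is 1 - F(χ², df), where F is the chi-square CDF with df degrees of freedom. A small p-value means that a discrepancy at least as large as the one observed would be unlikely if the null hypothesis (the data follow the expected distribution) were true. The conventional significance threshold is 0.05.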
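The calculation above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the tool's actual implementation. The function names chi2_sf and chi_square_gof and the estimated_params argument are assumptions made for this example. The upper-tail probability uses the standard closed-form sums that hold for integer degrees of freedom.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    t = x / 2.0
    if df % 2 == 0:
        # Even df = 2m: exp(-t) * sum_{j=0}^{m-1} t^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= t / j
            total += term
        return math.exp(-t) * total
    # Odd df = 2m + 1:
    # erfc(sqrt(t)) + exp(-t) * sum_{j=0}^{m-1} t^(j + 1/2) / Gamma(j + 3/2)
    total = sum(t ** (j + 0.5) / math.gamma(j + 1.5) for j in range(df // 2))
    return math.erfc(math.sqrt(t)) + math.exp(-t) * total

def chi_square_gof(observed, expected, estimated_params=0):
    """Return (statistic, degrees of freedom, p-value) for a goodness-of-fit test."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1 - estimated_params
    if df < 1:
        raise ValueError("need at least one degree of freedom")
    return stat, df, chi2_sf(stat, df)

stat, df, p = chi_square_gof([30, 25, 25, 20], [25, 25, 25, 25])
print(f"chi2 = {stat:.3f}, df = {df}, p = {p:.3f}")  # chi2 = 2.000, df = 3, p = 0.572
```

The sketch assumes every expected count is positive and that the two lists already share the same total.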
Examples
Observed: 30, 25, 25, 20
Expected: 25, 25, 25, 25
→ χ² = 2.0, df = 3, p ≈ 0.572 (not significant)
Observed: 50, 30, 20
Expected: 33.3, 33.3, 33.4
→ χ² ≈ 14.0, df = 2, p ≈ 0.0009 (very significant)
Coin flips, observed 60H / 40T, expected 50/50
→ χ² = 4.0, df = 1, p ≈ 0.046 (significant at the 0.05 level, but only marginally)
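The p-values in the last two examples can be checked by hand, because the chi-square upper-tail probability has a simple closed form for one and two degrees of freedom: erfc(sqrt(χ²/2)) for df = 1 and exp(-χ²/2) for df = 2. A short sketch, with variable names chosen only for illustration:

```python
import math

# df = 2: the upper-tail probability is exp(-chi2 / 2)
p_df2 = math.exp(-14.0 / 2)

# df = 1: the upper-tail probability is erfc(sqrt(chi2 / 2))
p_df1 = math.erfc(math.sqrt(4.0 / 2))

print(f"df=2, chi2=14.0: p = {p_df2:.4f}")  # p = 0.0009
print(f"df=1, chi2=4.0:  p = {p_df1:.3f}")  # p = 0.046
```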
FAQ
What's the minimum expected count per cell?
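A common rule of thumb is an expected count of at least 5 in every cell. Below that, the chi-square approximation becomes unreliable. Merge sparse categories, or use an exact test instead: the exact multinomial test for goodness-of-fit (the binomial test when there are only two categories). Fisher's exact test is the exact alternative for contingency tables, not for goodness-of-fit.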
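A check of this rule of thumb could look like the sketch below. The function name sparse_cells, the default threshold and the sample expected counts are all made up for illustration.

```python
def sparse_cells(expected, min_expected=5.0):
    """Return (index, expected count) pairs for cells below the rule-of-thumb minimum."""
    return [(i, e) for i, e in enumerate(expected) if e < min_expected]

# Hypothetical expected counts; the last two cells are too sparse.
flagged = sparse_cells([40.0, 12.5, 4.2, 3.3])
for index, count in flagged:
    print(f"cell {index}: expected count {count} is below 5")
```

An empty result means every cell meets the threshold.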
Is this the same as the chi-square test of independence?
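Not quite. Goodness-of-fit compares observed counts to a known expected distribution. The test of independence works on a two-way table and computes each expected count from the row and column totals: E_ij = (row i total × column j total) / grand total. The statistic has the same Σ (O - E)^2 / E form, but the degrees of freedom are (rows - 1) × (columns - 1).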
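To make the difference in setup concrete, here is a minimal sketch of how expected counts for an independence test are derived from the row and column totals of a two-way table. The function name independence_test and the 2 × 2 table are hypothetical. This is the textbook calculation, not necessarily what any particular tool does.

```python
def independence_test(table):
    """Return (expected table, chi-square statistic, df) for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # Expected count for cell (i, j) under independence.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, stat, df

# Hypothetical 2 x 2 table of counts.
expected, stat, df = independence_test([[20, 30], [30, 20]])
print(expected, stat, df)  # [[25.0, 25.0], [25.0, 25.0]] 4.0 1
```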
What if my expected counts don't match my observed total?
The tool normalises expected counts so the totals match. Better yet, supply expected proportions that sum to one and the right total is computed automatically.
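One plausible way to do this rescaling is sketched below. It illustrates the idea and is not the tool's actual code. The function names rescale_expected and expected_from_proportions are assumptions.

```python
def rescale_expected(observed, expected):
    """Scale expected counts so their total equals the observed total."""
    scale = sum(observed) / sum(expected)
    return [e * scale for e in expected]

def expected_from_proportions(observed, proportions):
    """Turn proportions that sum to one into expected counts for the observed total."""
    total = sum(observed)
    return [total * p for p in proportions]

# Expected weights of 1:1:1 rescaled to an observed total of 100.
print(rescale_expected([50, 30, 20], [1, 1, 1]))
# Equal proportions applied to an observed total of 100.
print(expected_from_proportions([30, 25, 25, 20], [0.25, 0.25, 0.25, 0.25]))
```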
Does the test tell me which category caused the discrepancy?
No — it gives an overall verdict. Look at each (O - E)^2 / E contribution to spot the biggest offenders.
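The per-category breakdown is easy to compute yourself. The sketch below uses the counts 50, 30, 20 against a uniform expectation. The function name contributions is chosen only for this example.

```python
def contributions(observed, expected):
    """Return each category's (O - E)^2 / E term; the terms sum to the chi-square statistic."""
    return [(o - e) ** 2 / e for o, e in zip(observed, expected)]

observed = [50, 30, 20]
expected = [100 / 3] * 3
parts = contributions(observed, expected)

# Index of the category with the largest contribution.
worst = max(range(len(parts)), key=parts.__getitem__)

for i, (o, e, part) in enumerate(zip(observed, expected, parts)):
    direction = "above" if o > e else "below"
    print(f"category {i}: contribution {part:.2f} ({direction} expected)")
print(f"largest contribution: category {worst}")
```

Here the first category contributes most of the total, with an observed count well above its expected value.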
What does a high p-value mean?
The data are consistent with the expected distribution. It does not prove the distribution is correct, only that you can't reject it with this sample.