Mann–Whitney U Test
Non-parametric two-sample rank test.
Overview
The Mann-Whitney U test compares two independent samples without assuming the data are normally distributed. It is the non-parametric counterpart to the two-sample t-test: it ranks all observations together and asks whether one group's ranks tend to be higher than the other's.
It suits researchers with small samples or skewed data, A/B testers comparing heavy-tailed engagement metrics, and quality engineers comparing two suppliers' defect rates. When your data look nothing like a bell curve, the t-test can lose power and, in small samples, give unreliable p-values; Mann-Whitney stays valid because it depends only on the ranks.
How it works
Combine the two samples and assign ranks 1, 2, 3, ... from smallest to largest, giving tied values the average of the ranks they span. Let n1 and n2 be the two sample sizes, and sum the ranks for sample 1 (call it R1). Then U1 = R1 - n1 * (n1 + 1) / 2 and U2 = n1 * n2 - U1. The test statistic is U = min(U1, U2).
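The ranking and U calculation described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation; the function names average_ranks and mann_whitney_u are made up for this example.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(sample1, sample2):
    """Return (U1, U2, U) using the rank-sum formulas from the text."""
    n1, n2 = len(sample1), len(sample2)
    ranks = average_ranks(list(sample1) + list(sample2))
    r1 = sum(ranks[:n1])              # rank sum of sample 1
    u1 = r1 - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    return u1, u2, min(u1, u2)
```

For Sample A = [3, 5, 7, 9] and Sample B = [2, 4, 6, 8], the rank sum is R1 = 20, so U1 = 10, U2 = 6 and U = 6.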
Under the null hypothesis that the two distributions are the same, U has a known sampling distribution. For large samples it is approximately normal with mean n1 * n2 / 2 and variance n1 * n2 * (n1 + n2 + 1) / 12, so z = (U - mean) / sqrt(variance) gives a z-score and, from the standard normal distribution, a p-value.
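As a rough sketch of the large-sample calculation, the snippet below turns U into a z-score and a two-sided p-value. It assumes no ties and applies no continuity correction, which some software adds by default; normal_approx_p is a hypothetical helper name.

```python
import math


def normal_approx_p(u, n1, n2):
    """Two-sided p-value for U from the normal approximation (no ties)."""
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean) / sd
    # Two-sided tail area of the standard normal: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p
```

With n1 = n2 = 20 and U = 120, the mean is 200 and the standard deviation is about 37, so z is roughly -2.16.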
Examples
Sample A: [3, 5, 7, 9]
Sample B: [2, 4, 6, 8]
→ U = 6, p ≈ 0.686 (no significant difference)
Sample A: [1, 2, 3, 4]
Sample B: [5, 6, 7, 8]
→ U = 0, p ≈ 0.029 (significant)
Sample A: [10, 20, 30, 40, 50]
Sample B: [15, 25, 35, 45, 55]
→ U = 10, p ≈ 0.690 (no significant difference)
FAQ
When should I prefer Mann-Whitney over a t-test?
When your data are clearly non-normal, contain outliers, or are ordinal rather than continuous. For roughly normal data with similar variances, the t-test is slightly more powerful.
Does it test equal means or equal distributions?
Strictly, it tests whether one distribution is stochastically larger than the other. Under the stronger assumption of identical shapes, it effectively tests equal medians.
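One way to see the "stochastically larger" reading: U1 equals the number of (x, y) pairs, with x from sample 1 and y from sample 2, in which x exceeds y, counting ties as one half. Dividing by n1 * n2 estimates P(X > Y) + 0.5 * P(X = Y). The sketch below illustrates that identity by brute force; the function names are hypothetical.

```python
def u_by_pair_counting(sample1, sample2):
    """Count pairs where the sample-1 value wins; ties count as half."""
    u1 = 0.0
    for x in sample1:
        for y in sample2:
            if x > y:
                u1 += 1.0
            elif x == y:
                u1 += 0.5
    return u1


def prob_superiority(sample1, sample2):
    """Estimate P(X > Y) + 0.5 * P(X = Y) as U1 / (n1 * n2)."""
    return u_by_pair_counting(sample1, sample2) / (len(sample1) * len(sample2))
```

For [3, 5, 7, 9] against [2, 4, 6, 8], sample 1 wins 10 of the 16 pairs, so the estimate is 0.625. A value near 0.5 means neither group tends to exceed the other.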
How are ties handled?
Tied observations get the average of the ranks they would have received. A correction to the variance is applied for the z-approximation when ties are present.
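The usual tie correction replaces the variance with n1 * n2 / 12 * ((N + 1) - sum(t^3 - t) / (N * (N - 1))), where N = n1 + n2 and t runs over the sizes of the groups of tied values. A minimal sketch, with tie_corrected_variance as a made-up name:

```python
from collections import Counter


def tie_corrected_variance(sample1, sample2):
    """Variance of U under the null hypothesis, adjusted for ties."""
    n1, n2 = len(sample1), len(sample2)
    n = n1 + n2
    # Size of each group of identical values in the pooled data.
    tie_sizes = Counter(list(sample1) + list(sample2)).values()
    tie_term = sum(t ** 3 - t for t in tie_sizes)  # untied values add 0
    return n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
```

When no values are tied, the tie term is zero and the result reduces to n1 * n2 * (n1 + n2 + 1) / 12. Ties always shrink the variance, so ignoring them makes the z-test slightly conservative.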
Is it the same as the Wilcoxon rank-sum test?
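Yes. Mann-Whitney U and Wilcoxon rank-sum are equivalent formulations of the same test. The Wilcoxon statistic is the rank sum R1 itself, and U1 = R1 - n1 * (n1 + 1) / 2 differs from it only by a constant, so both give the same p-value. Different textbooks pick different names.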
What's the minimum sample size?
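The test can be computed for any sample sizes, but a common rule of thumb is that the normal approximation needs n1 and n2 of at least 8 each (some texts ask for more). Below that, use exact tables or an exact calculation. Very small samples cannot reach significance at all: with n1 = n2 = 3, the smallest possible two-sided exact p-value is 0.1.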
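For small samples without ties, the exact p-value can be obtained by enumerating every way of assigning ranks to sample 1, which is what exact tables tabulate. The sketch below does this by brute force, so it is only practical for small n1 and n2; exact_two_sided_p is a hypothetical name.

```python
from itertools import combinations


def exact_two_sided_p(u_obs, n1, n2):
    """Exact two-sided p-value for U = min(U1, U2), assuming no ties."""
    total = 0
    extreme = 0
    # Under the null hypothesis, every choice of n1 ranks out of
    # 1..(n1 + n2) for sample 1 is equally likely.
    for ranks in combinations(range(1, n1 + n2 + 1), n1):
        u1 = sum(ranks) - n1 * (n1 + 1) // 2
        u = min(u1, n1 * n2 - u1)
        total += 1
        if u <= u_obs:
            extreme += 1
    return extreme / total
```

For n1 = n2 = 4 and U = 0, only 2 of the 70 possible rank assignments are as extreme, giving p = 2/70 ≈ 0.029, which matches the second example above.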