Simpson's paradox facts for kids
Simpson's paradox is a surprising idea in statistics. It's named after Edward H. Simpson, a British expert in statistics, who described it in 1951. Other experts, such as Karl Pearson (in 1899) and Udny Yule (in 1903), had noticed similar effects earlier. That is why people sometimes call it the Yule–Simpson effect.
This paradox happens when a trend appears in several separate groups of data, but disappears or even reverses when you combine those groups. You often see this in studies about people (called social sciences) or in medical research. It can be confusing because the combined numbers and the separate numbers seem to point to different causes. It's also known as the reversal paradox or amalgamation paradox.
How Simpson's Paradox Works
Let's look at a real example from a medical study. This study compared two ways to treat kidney stones.
The table below shows how successful two treatments were. It includes results for both small and large kidney stones. Treatment A was a type of open surgery. Treatment B was a less invasive method.
| Stone size | Treatment A group | Treatment A success, patients (%) | Treatment A failure, patients (%) | Treatment B group | Treatment B success, patients (%) | Treatment B failure, patients (%) |
|---|---|---|---|---|---|---|
| Small stones | Group 1 | 81 (93%) | 6 (7%) | Group 2 | 234 (87%) | 36 (13%) |
| Large stones | Group 3 | 192 (73%) | 71 (27%) | Group 4 | 55 (69%) | 25 (31%) |
| Both | Groups 1+3 | 273 (78%) | 77 (22%) | Groups 2+4 | 289 (83%) | 61 (17%) |
The Surprising Result
Here's the surprising part:
- For small stones, Treatment A worked better (93% success) than Treatment B (87% success).
- For large stones, Treatment A also worked better (73% success) than Treatment B (69% success).
So, if you look at each stone size separately, Treatment A seems better.
But now, look at the "Both" row in the table. This combines all patients.
- When all patients are grouped together, Treatment A had a 78% success rate.
- Treatment B had an 83% success rate.
This means when you combine the groups, Treatment B looks more effective! This is the paradox. Treatment A is better for small stones and better for large stones, but Treatment B looks better overall.
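You can check these percentages yourself with a few lines of Python. This is only a small sketch that uses the patient counts from the table. The names `results`, `success_rate`, and `combined` are made up for this example and do not come from the study.

```python
# Counts taken from the kidney stone table: (successes, failures)
results = {
    "small": {"A": (81, 6), "B": (234, 36)},
    "large": {"A": (192, 71), "B": (55, 25)},
}


def success_rate(successes, failures):
    """Share of patients who were treated successfully."""
    return successes / (successes + failures)


def combined(treatment):
    """Add up successes and failures across both stone sizes."""
    successes = sum(results[size][treatment][0] for size in results)
    failures = sum(results[size][treatment][1] for size in results)
    return successes, failures


# Compare the treatments one stone size at a time.
for size in results:
    a = success_rate(*results[size]["A"])
    b = success_rate(*results[size]["B"])
    print(f"{size} stones: A {a:.0%}, B {b:.0%}")

# Compare the treatments with all patients lumped together.
a_all = success_rate(*combined("A"))
b_all = success_rate(*combined("B"))
print(f"both sizes: A {a_all:.0%}, B {b_all:.0%}")
```

The first two printed lines put Treatment A ahead, and the last line puts Treatment B ahead, even though every number comes from the same patients.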
Why This Happens
This happens because of a "lurking variable" or hidden factor. In this example, the size of the kidney stone was that hidden factor. Doctors often gave the more serious cases (large stones) the stronger Treatment A. They gave milder cases (small stones) the less invasive Treatment B.
Two main things cause this paradox:
- Different Group Sizes: The groups being combined were very different in size. Many more patients with small stones got Treatment B (270 patients in Group 2) than Treatment A (87 patients in Group 1). Also, many more patients with large stones got Treatment A (263 patients in Group 3) than Treatment B (80 patients in Group 4).
- Hidden Factor's Big Impact: The size of the stone had a much bigger effect on success rates than the treatment itself. Patients with large stones generally did worse, no matter the treatment. Most of Treatment A's patients had large stones, so these harder cases pulled its overall success rate down, even though Treatment A was better for large stones. Most of Treatment B's patients had small stones, so these easier cases pushed its overall success rate up.
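One way to see both causes at once is to treat each overall success rate as a weighted average. Each stone size counts in proportion to how many of that treatment's patients had it. The Python sketch below uses the counts from the table. The names `patients`, `successes`, and `overall_rate` are made up for this example.

```python
# Patients per group and successes per group, from the kidney stone table.
patients = {"A": {"small": 87, "large": 263}, "B": {"small": 270, "large": 80}}
successes = {"A": {"small": 81, "large": 192}, "B": {"small": 234, "large": 55}}


def overall_rate(treatment):
    """Overall success rate, built as a weighted average of the per-size rates."""
    total = sum(patients[treatment].values())
    rate = 0.0
    for size, n in patients[treatment].items():
        weight = n / total  # share of this treatment's patients with this stone size
        size_rate = successes[treatment][size] / n
        print(f"  {size}: weight {weight:.0%}, success rate {size_rate:.0%}")
        rate += weight * size_rate
    return rate


for treatment in ("A", "B"):
    print(f"Treatment {treatment}")
    print(f"  overall: {overall_rate(treatment):.0%}")
```

About three out of four Treatment A patients had large stones, so its overall rate sits close to its large-stone rate. About three out of four Treatment B patients had small stones, so its overall rate sits close to its small-stone rate.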
This paradox shows that you need to be careful when combining data. A hidden factor can completely change what the numbers seem to tell you.