Simpson's paradox Facts for Kids

Simpson's paradox is a surprising idea in statistics. It's named after Edward H. Simpson, a British statistician, who described it in a paper in 1951. Other statisticians, such as Karl Pearson (in 1899) and Udny Yule (in 1903), had noticed similar effects even earlier. That is why some people call it the Yule–Simpson effect.

This paradox happens when a trend appears in several separate groups of data, but disappears or even reverses when you combine those groups. You often see this in the social sciences (studies about people and society) and in medical research. It can be confusing because it makes it hard to see what's really causing something. It's also known as the reversal paradox or amalgamation paradox.

How Simpson's Paradox Works
- The Surprising Result
- Why This Happens
Images for kids
See also

How Simpson's Paradox Works

Let's look at a real example from a medical study. This study compared two ways to treat kidney stones.

The table below shows how successful two treatments were. It includes results for both small and large kidney stones. Treatment A was a type of open surgery. Treatment B was a less invasive method.

Each cell shows the number of patients, with the percentage for that treatment and stone size in brackets.

Stone size	Treatment A success	Treatment A failure	Treatment B success	Treatment B failure
Small stones (A: Group 1, B: Group 2)	81 (93%)	6 (7%)	234 (87%)	36 (13%)
Large stones (A: Group 3, B: Group 4)	192 (73%)	71 (27%)	55 (69%)	25 (31%)
Both (A: Groups 1+3, B: Groups 2+4)	273 (78%)	77 (22%)	289 (83%)	61 (17%)
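If you want to check the percentages yourself, here is a small sketch in Python. The counts come straight from the table; the names `counts`, `success_rate` and `rates` are made up for this example.

```python
# Counts copied from the kidney stone table: (successes, failures).
counts = {
    ("small", "A"): (81, 6),
    ("small", "B"): (234, 36),
    ("large", "A"): (192, 71),
    ("large", "B"): (55, 25),
}


def success_rate(successes, failures):
    """Return the share of patients whose treatment succeeded."""
    return successes / (successes + failures)


# Work out one success rate for every stone size and treatment.
rates = {key: success_rate(*value) for key, value in counts.items()}

for (size, treatment), rate in rates.items():
    print(f"{size} stones, Treatment {treatment}: {rate:.0%}")
```

Each printed percentage should match the success percentage shown for that stone size and treatment.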

The Surprising Result

Here's the surprising part:

- For small stones, Treatment A worked better (93% success) than Treatment B (87% success).
- For large stones, Treatment A also worked better (73% success) than Treatment B (69% success).

So, if you look at each stone size separately, Treatment A seems better.

But now, look at the "Both" row in the table. This combines all patients.

- When all patients are grouped together, Treatment A had a 78% success rate.
- Treatment B had an 83% success rate.

This means when you combine the groups, Treatment B looks more effective! This is the paradox. Treatment A is better for small stones and better for large stones, but Treatment B is better overall.
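The flip can also be checked with a few lines of Python. This is only a sketch: the helper names `rate` and `better` are invented here, and each pair holds the successes and the total number of patients from the study's figures.

```python
# (successes, total patients) for each treatment, from the study's figures.
small = {"A": (81, 87), "B": (234, 270)}
large = {"A": (192, 263), "B": (55, 80)}


def rate(successes, patients):
    """Success rate as a fraction between 0 and 1."""
    return successes / patients


def better(group):
    """Return the treatment with the higher success rate in this group."""
    return max(group, key=lambda t: rate(*group[t]))


# Combine the two stone sizes by adding up successes and patients.
both = {
    t: (small[t][0] + large[t][0], small[t][1] + large[t][1])
    for t in ("A", "B")
}

print("Small stones:", better(small))  # A
print("Large stones:", better(large))  # A
print("Both together:", better(both))  # B
```

The winner is A in each separate group, yet B once the groups are added together.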

Why This Happens

This happens because of a "lurking variable" or hidden factor. In this example, the size of the kidney stone was that hidden factor. Doctors often gave the more serious cases (large stones) the stronger Treatment A. They gave milder cases (small stones) the less invasive Treatment B.

Two main things cause this paradox:

- Different Group Sizes: The groups being combined were very different in size. Many more patients with small stones got Treatment B (270 patients in Group 2) than Treatment A (87 patients in Group 1). Also, many more patients with large stones got Treatment A (263 patients in Group 3) than Treatment B (80 patients in Group 4).
- Hidden Factor's Big Impact: The size of the stone had a much bigger effect on success rates than the treatment itself. Patients with large stones generally did worse, no matter the treatment. So, even though Treatment A was better for large stones, the large number of patients with large stones in Treatment A's group pulled its overall success rate down.
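One way to see both causes at once is to treat each overall success rate as a weighted average: every group's rate counts in proportion to how many patients are in it. The sketch below assumes that way of looking at it; `overall_rate` is a made-up helper name.

```python
def overall_rate(groups):
    """Weighted average of group success rates, weighted by group size."""
    total = sum(n for _, n in groups)
    return sum(rate * n / total for rate, n in groups)


# Each pair is (success rate in the group, number of patients in the group).
treatment_a = [(81 / 87, 87), (192 / 263, 263)]   # small stones, large stones
treatment_b = [(234 / 270, 270), (55 / 80, 80)]   # small stones, large stones

# Share of each treatment's patients who had the harder, large-stone cases.
share_large_a = 263 / (87 + 263)
share_large_b = 80 / (270 + 80)

print(f"Treatment A overall: {overall_rate(treatment_a):.0%}")  # 78%
print(f"Treatment B overall: {overall_rate(treatment_b):.0%}")  # 83%
print(f"Large-stone share for A: {share_large_a:.0%}")          # 75%
print(f"Large-stone share for B: {share_large_b:.0%}")          # 23%
```

About three quarters of Treatment A's patients had large stones, so its overall rate sits close to its lower large-stone rate. For Treatment B it is the other way around.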

This paradox shows that you need to be careful when combining data. A hidden factor can completely change what the numbers seem to tell you.
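One possible way to be careful is to imagine that both treatments had been given to the same mix of small and large stones. The sketch below is a "what if" calculation, not something the original study reported. It assumes the mix is the one found among all 700 patients (357 with small stones, 343 with large stones), and the name `fair_rate` is invented for this example.

```python
# Success rates inside each stone size, from the study's counts.
rates = {
    "A": {"small": 81 / 87, "large": 192 / 263},
    "B": {"small": 234 / 270, "large": 55 / 80},
}

# Assumed common mix: every patient in the study, whichever treatment they got.
mix = {"small": 87 + 270, "large": 263 + 80}   # 357 small, 343 large


def fair_rate(treatment):
    """Success rate the treatment would have on the common mix of stones."""
    total = sum(mix.values())
    return sum(rates[treatment][size] * mix[size] for size in mix) / total


a_std = fair_rate("A")
b_std = fair_rate("B")

print(f"Treatment A on the same mix: {a_std:.1%}")  # 83.3%
print(f"Treatment B on the same mix: {b_std:.1%}")  # 77.9%
```

With the same mix of stones for both, Treatment A comes out ahead, which agrees with what each stone size showed on its own.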

Images for kids

This image helps you see how Simpson's paradox works. It shows how easy it can be to misunderstand what's really happening with data.
