Simpson's paradox facts for kids
Simpson's paradox is a paradox from statistics. It is named after Edward H. Simpson, a British statistician who first described it in 1951. The statistician Karl Pearson described a very similar effect in 1899, and Udny Yule's description dates from 1903. For this reason, it is sometimes called the Yule–Simpson effect. Other names for the paradox include reversal paradox and amalgamation paradox. In the paradox, a trend that shows up in each of several groups can disappear, or even reverse, when the groups are combined into one larger group. This often occurs in social sciences and medical statistics. It may confuse people if frequency data is used to explain a causal relationship.
Example: Kidney stone treatment
This is a real-life example from a medical study comparing the success rates of two treatments for kidney stones.
The table shows the success rates and numbers of patients for each treatment, split into small and large kidney stones. Treatment A includes all open procedures and Treatment B is percutaneous nephrolithotomy:
Stone size | Treatment | Group | Successes | Failures | Success rate | Failure rate
---|---|---|---|---|---|---
Small stones | A | Group 1 | 81 | 6 | 93% | 7%
Small stones | B | Group 2 | 234 | 36 | 87% | 13%
Large stones | A | Group 3 | 192 | 71 | 73% | 27%
Large stones | B | Group 4 | 55 | 25 | 69% | 31%
Both | A | Group 1+3 | 273 | 77 | 78% | 22%
Both | B | Group 2+4 | 289 | 61 | 83% | 17%
The paradoxical conclusion is that Treatment A is more effective when used on small stones, and also when used on large stones, yet Treatment B appears more effective when both sizes are combined. In this example, the combined figures do not take into account that the size of the kidney stone influences the result. A variable like this is called a hidden variable (or lurking variable) in statistics.
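The reversal can be checked by recomputing the rates from the patient counts in the table. The following is a minimal sketch; the names `counts`, `success_rate` and `combined_rate` are chosen here for illustration and do not come from the study.

```python
# Success and failure counts copied from the table.
# Keys are (stone size, treatment); values are (successes, failures).
counts = {
    ("small", "A"): (81, 6),
    ("small", "B"): (234, 36),
    ("large", "A"): (192, 71),
    ("large", "B"): (55, 25),
}


def success_rate(successes, failures):
    """Fraction of patients for whom the treatment succeeded."""
    return successes / (successes + failures)


def combined_rate(treatment):
    """Success rate for one treatment when stone size is ignored."""
    successes = sum(s for (_, t), (s, _) in counts.items() if t == treatment)
    failures = sum(f for (_, t), (_, f) in counts.items() if t == treatment)
    return success_rate(successes, failures)


# Compare the treatments within each stone size.
for size in ("small", "large"):
    rate_a = success_rate(*counts[(size, "A")])
    rate_b = success_rate(*counts[(size, "B")])
    print(f"{size} stones: A = {rate_a:.0%}, B = {rate_b:.0%}")

# Compare the treatments again with both sizes pooled together.
print(f"both sizes: A = {combined_rate('A'):.0%}, B = {combined_rate('B'):.0%}")
```

Within each stone size the rate for A is higher, but the pooled rates come out as 78% for A and 83% for B, matching the last two rows of the table.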
Which treatment is considered better is determined by an inequality between two ratios (successes/total). The reversal of the inequality between the ratios, which creates Simpson's paradox, happens because two effects occur together:
- The sizes of the groups, which are combined when the lurking variable is ignored, are very different. Doctors tend to give the severe cases (large stones) the better treatment (A), and the milder cases (small stones) the inferior treatment (B). Therefore, the totals are dominated by Group 3 (263 patients) and Group 2 (270 patients), and not by the much smaller Group 1 (87 patients) and Group 4 (80 patients).
- The lurking variable has a large effect on the ratios, i.e. the success rate is more strongly influenced by the severity of the case than by the choice of treatment. Therefore, the patients with large stones using treatment A (Group 3) do worse than the patients with small stones, even though the latter used the inferior treatment B (Group 2).
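The two effects above can be made visible by writing each overall success rate as a weighted average of the small-stone and large-stone rates, where the weights are the shares of patients in each group. The sketch below uses the counts from the table; the function names and the 50/50 mix are assumptions made only for illustration, not part of the original study.

```python
# (patients, successes) for each treatment and stone size, from the table.
groups = {
    "A": {"small": (87, 81), "large": (263, 192)},
    "B": {"small": (270, 234), "large": (80, 55)},
}


def rate(treatment, size):
    """Success rate of one treatment for one stone size."""
    patients, successes = groups[treatment][size]
    return successes / patients


def large_share(treatment):
    """Share of a treatment's patients who had large stones."""
    total = sum(patients for patients, _ in groups[treatment].values())
    return groups[treatment]["large"][0] / total


def rate_with_mix(treatment, share_large):
    """Overall rate as a weighted average for a given share of large stones."""
    return ((1 - share_large) * rate(treatment, "small")
            + share_large * rate(treatment, "large"))


for t in ("A", "B"):
    actual = rate_with_mix(t, large_share(t))   # the mix seen in the study
    equal = rate_with_mix(t, 0.5)               # hypothetical 50/50 mix
    print(f"Treatment {t}: large-stone share = {large_share(t):.0%}, "
          f"actual overall = {actual:.0%}, with equal mix = {equal:.0%}")
```

About three quarters of the patients on treatment A had large stones, compared with under a quarter of those on treatment B, so A's overall rate is pulled toward its lower large-stone rate. With the same hypothetical mix of stone sizes for both treatments, A comes out ahead again.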