Simpson’s Paradox
I haven’t put anything in the Bad Statistics file for a while, so I thought I’d put this interesting little example up for your perusal.
Although my own field of modern cosmology requires a great deal of complicated statistical reasoning, cosmologists have it relatively easy because there is not much chance that any errors we make will actually end up harming anyone. Speculations about the Anthropic Principle or Theories of Everything are sometimes reported in the mass media but, if they are, and are garbled, the resulting confusion is unlikely to be fatal. The same cannot be said of the field of medical statistics. I can think of scores of examples where poor statistical reasoning has been responsible for shambles in the domain of public health.
Here’s an example of how a relatively simple statistical test can lead to total confusion. In this version, it is known as Simpson’s Paradox.
A standard thing to do in a medical trial is to take a set of patients suffering from some condition and divide them into two groups. One group is given a treatment (T) and the other group is given a placebo; this latter group is called the control and I will denote it T* (no treatment).
To make things specific suppose we have 100 patients, of whom 50 are actively treated and 50 form the control. Suppose that at the end of the trial for the treatment, patients can be classified as recovered (“R”) or not recovered (“R*”). Consider the following outcome, displayed in a contingency table:
Group | R | R* | Total | Recovery rate |
T | 20 | 30 | 50 | 40% |
T* | 16 | 34 | 50 | 32% |
Totals | 36 | 64 | 100 | |
Clearly the recovery rate for those actively treated (40%) exceeds that for the control group, so the treatment seems at first sight to produce some benefit.
Now let us divide the group into older and younger patients: the young group Y contains those under 50 years old (carefully defined so that I would belong to it) and Y* contains those aged 50 or over.
The following results are obtained for the young patients.
Group | R | R* | Total | Recovery rate |
T | 19 | 21 | 40 | 47.5% |
T* | 5 | 5 | 10 | 50% |
Totals | 24 | 26 | 50 | |
The older group returns the following data:
Group | R | R* | Total | Recovery rate |
T | 1 | 9 | 10 | 10% |
T* | 11 | 29 | 40 | 27.5% |
Totals | 12 | 38 | 50 | |
For each of the two groups separately, the recovery rate for the control exceeds that of the treated patients. The placebo works better than the treatment for the young and the old separately, but for the population as a whole the treatment seems to work better than the placebo!
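For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The counts are taken straight from the three tables above; the names `counts`, `recovery_rate` and `pooled_counts` are my own choices for illustration, not part of any standard library.

```python
from fractions import Fraction

# (recovered, not recovered) for each age group and each arm of the trial,
# copied from the tables above. "T" is treated, "T*" is the control.
counts = {
    "young": {"T": (19, 21), "T*": (5, 5)},
    "old": {"T": (1, 9), "T*": (11, 29)},
}
arms = ("T", "T*")


def recovery_rate(recovered, not_recovered):
    """Fraction of patients who recovered, kept exact with Fraction."""
    return Fraction(recovered, recovered + not_recovered)


def pooled_counts(arm):
    """Add up the young and old counts for one arm of the trial."""
    recovered = sum(counts[group][arm][0] for group in counts)
    not_recovered = sum(counts[group][arm][1] for group in counts)
    return recovered, not_recovered


# Recovery rates within each age group, and for all patients together.
stratum_rates = {
    group: {arm: recovery_rate(*counts[group][arm]) for arm in arms}
    for group in counts
}
pooled_rates = {arm: recovery_rate(*pooled_counts(arm)) for arm in arms}

for group, rates in stratum_rates.items():
    print(f"{group}: T = {float(rates['T']):.1%}, T* = {float(rates['T*']):.1%}")
print(f"all: T = {float(pooled_rates['T']):.1%}, T* = {float(pooled_rates['T*']):.1%}")

# The reversal: the control does better in every age group,
# yet the treatment does better once the groups are pooled.
control_wins_each_group = all(
    stratum_rates[group]["T*"] > stratum_rates[group]["T"] for group in counts
)
treatment_wins_overall = pooled_rates["T"] > pooled_rates["T*"]
print(control_wins_each_group, treatment_wins_overall)
```

Both flags come out true for these numbers, which is exactly the reversal described above.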
This seems very confusing, and just think how many medical reports in newspapers contain results of this type: drinking red wine is good for you, eating meat is bad for you, and so on. What has gone wrong?
The key to this paradox is to note that many more of the younger patients are in the treatment group than in the control group (40 versus 10), while the situation is reversed for the older patients (10 versus 40). The result is to confuse the effect of the treatment with a perfectly possible dependence of recovery on the age of the recipient. In essence this is a badly designed trial, but there is no doubt that the effect is a subtle one, and not one that most people could understand without a great deal of careful explanation, which they are unlikely to get in the pages of a newspaper.
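The point about unequal age mixes can be made concrete with a short sketch. Each arm's overall recovery rate is just the average of its two age-group rates, weighted by how many young and old patients that arm happens to contain. The second half of the sketch goes a step beyond the discussion above: it re-weights both arms to the same age mix (50 young and 50 old, the make-up of the whole trial), which is my own illustrative choice and only one of several reasonable ways to adjust. The names `rates`, `sizes`, `weighted_rate` and `common_mix` are hypothetical.

```python
# Recovery rates within each age group, from the tables above.
rates = {
    "T": {"young": 19 / 40, "old": 1 / 10},
    "T*": {"young": 5 / 10, "old": 11 / 40},
}

# Number of young and old patients actually placed in each arm.
sizes = {
    "T": {"young": 40, "old": 10},
    "T*": {"young": 10, "old": 40},
}


def weighted_rate(arm_rates, weights):
    """Average the age-group rates using the given group weights."""
    total = sum(weights.values())
    return sum(arm_rates[group] * weights[group] / total for group in arm_rates)


# Weighting each arm by its own age mix reproduces the pooled table.
observed = {arm: weighted_rate(rates[arm], sizes[arm]) for arm in rates}

# Assumption for illustration: give both arms the same age mix,
# namely that of the whole trial (50 young, 50 old).
common_mix = {"young": 50, "old": 50}
adjusted = {arm: weighted_rate(rates[arm], common_mix) for arm in rates}

for arm in rates:
    print(f"{arm}: observed {observed[arm]:.4f}, same age mix {adjusted[arm]:.4f}")
```

With each arm's own age mix the treated group comes out ahead (0.40 against 0.32), but once both arms are given the same mix the control is ahead (0.3875 against 0.2875), in agreement with the two age-group tables.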
October 12, 2009 at 9:05 pm
Congratulations on getting the essence of Simpson’s paradox captured in few words.
re Bird. I stumbled across a 2-CD collection called “Birth of Be_Bop”. It contains a jam session recording of Cherokee, which is amazing. Interested in the CD coordinates? The personnel is announced at the end of the cut. There is one unfortunate other alto player who I guess happened to be in the wrong place at the wrong time. However the rhythm session is unbelievable; I’ll leave that as a surprise.
Cheers Dale