I Think You’ll Find It’s a Bit More Complicated Than That
Author: Ben Goldacre
The last two, at least, made a good effort to explain that this effect disappeared when the researchers accounted for social and demographic factors. But was there ever any point in reporting the raw finding, from before this correction was made?
I will now demonstrate, with a nerdy table illustration, how you correct for things such as social and demographic factors. You’ll have to pay attention, because this is a tricky concept; but at the end, when the mystery is gone, you will see why reporting the unadjusted figures as the finding, especially in a headline, is simply wrong.
Correcting for an extra factor is best understood by doing something called ‘stratification’. Imagine you do a study, and you find that people who drink are three times more likely to get lung cancer than people who don’t. The results are in Table 1. Your odds of getting lung cancer as a drinker are 0.16 (that’s 366 ÷ 2,300). Your odds as a non-drinker are 0.05. So your odds of getting lung cancer are three times higher as a drinker (0.16 ÷ 0.05 is roughly 3, and that figure is called the ‘odds ratio’) – as in Table 1 below.
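For readers who like to check the sums, the odds arithmetic can be reproduced in a few lines of Python. The drinker cell counts come from the text; the full non-drinker counts are not quoted here, so the non-drinker odds of 0.05 are used directly as stated:

```python
# Odds and odds ratio from the drinking/lung-cancer example.
drinker_cancer, drinker_no_cancer = 366, 2300

odds_drinkers = drinker_cancer / drinker_no_cancer  # ~0.16
odds_non_drinkers = 0.05                            # as given in the text

# "Roughly 3": drinkers' odds divided by non-drinkers' odds
odds_ratio = odds_drinkers / odds_non_drinkers

print(f"drinker odds: {odds_drinkers:.2f}")
print(f"odds ratio:   {odds_ratio:.1f}")
```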
But then some clever person comes along and says: Wait, maybe this whole finding is confounded by the fact that drinkers are more likely to smoke cigarettes. That could be an alternative explanation for the apparent relationship between drinking and lung cancer. So you want to factor smoking out.
The way to do this is to chop your data in half, and analyse non-smokers and smokers separately. So you take only the people who smoke, and compare drinkers against non-drinkers; then you take only the people who don’t smoke, and compare drinkers against non-drinkers in that group separately. You can see the results of this in the second and third tables.
Now your findings are a bit weird. Suddenly, since you’ve split the data up by whether people are smokers or not, drinkers and non-drinkers have exactly the same odds of getting lung cancer. The apparent effect of drinking has been eradicated, and this means that the observed risk of drinking was entirely due to smoking: smokers had a higher chance of lung cancer – in fact their odds were 0.3 rather than 0.03, ten times higher – and drinkers were more likely to also be smokers. Looking at the figures in these tables, 203 out of 1,954 non-drinkers smoked, whereas 1,430 out of 2,666 drinkers smoked.
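As a sketch, here is the stratified comparison in code. The individual cell counts below are my own reconstruction, chosen to be consistent with the totals and odds quoted in the text (1,430 of 2,666 drinkers smoke, 203 of 1,954 non-drinkers smoke, within-stratum odds of 0.3 and 0.03); the book’s actual tables may differ slightly:

```python
# Stratification: compare drinkers against non-drinkers separately
# among smokers and among non-smokers.

def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table: (a/b) / (c/d)."""
    return (a / b) / (c / d)

# (cancer, no cancer) counts: drinkers first, then non-drinkers.
# Reconstructed, hypothetical cell counts consistent with the text.
smokers     = odds_ratio(330, 1100, 47, 156)
non_smokers = odds_ratio(36, 1200, 51, 1700)

# Within each stratum the apparent effect of drinking vanishes:
print(f"OR among smokers:     {smokers:.2f}")      # ~1
print(f"OR among non-smokers: {non_smokers:.2f}")  # ~1
```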
I explained all this with a theoretical example, where the odds of cancer apparently trebled before correction for smoking. Why didn’t I just use the data from the unplanned pregnancies paper? Because in the real world of research, you’re often correcting for lots of things at once. In the case of this BMJ paper, the researchers corrected for parents’ socioeconomic position and qualifications, sex of child, age, language spoken at home, and a huge list of other factors.
When you’re correcting for so many things, you can’t use old-fashioned stratification, as I did in this simple example, because you’d be dividing your data up among so many smaller tables that some would have no people in them at all. That’s why you calculate your adjusted figures using cleverer methods, such as logistic regression and likelihood theory. But it all comes down to the same thing. In our example above, alcohol wasn’t really associated with lung cancer. And in this BMJ paper, unplanned pregnancy wasn’t really associated with slower development. Pretending otherwise is just silly.
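To show that the cleverer method reaches the same verdict as stratification, here is a hypothetical illustration (not the BMJ paper’s actual analysis): a logistic regression fitted by plain gradient ascent on the log-likelihood, using the reconstructed cell counts from the drinking-and-smoking example above. The adjusted odds ratio for drinking comes out at roughly 1, while smoking’s stays around 10:

```python
import math

# counts[(drink, smoke)] = (cancer, no_cancer)
# Reconstructed, hypothetical cell counts consistent with the text.
counts = {
    (1, 1): (330, 1100),
    (1, 0): (36, 1200),
    (0, 1): (47, 156),
    (0, 0): (51, 1700),
}

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

# Fit P(cancer) = sigmoid(b0 + b_drink*drink + b_smoke*smoke)
# by gradient ascent on the count-weighted log-likelihood.
b0 = b_drink = b_smoke = 0.0
lr = 0.5
n = sum(c + nc for c, nc in counts.values())
for _ in range(20000):
    g0 = gd = gs = 0.0
    for (drink, smoke), (cancer, no_cancer) in counts.items():
        p = sigmoid(b0 + b_drink * drink + b_smoke * smoke)
        # log-likelihood gradient contribution, summed over the cell
        resid = cancer * (1 - p) - no_cancer * p
        g0 += resid
        gd += resid * drink
        gs += resid * smoke
    b0 += lr * g0 / n
    b_drink += lr * gd / n
    b_smoke += lr * gs / n

# exp(coefficient) is the adjusted odds ratio for that factor
print(f"adjusted OR for drinking: {math.exp(b_drink):.2f}")  # ~1
print(f"adjusted OR for smoking:  {math.exp(b_smoke):.2f}")  # ~10
```

In real research you would use a statistics package rather than hand-rolled gradient ascent, but the principle is the same: once smoking is in the model, drinking has nothing left to explain.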
Ben Goldacre and David Spiegelhalter, British Medical Journal, 12 June 2013
We have both spent a large part of our working lives discussing statistics and risk with the general public. We both dread questions about bicycle helmets. The arguments are often heated and personal; but they also illustrate some of the most fascinating challenges for epidemiology, risk communication and evidence-based policy.
With regard to the use of bicycle helmets, science broadly tries to answer two main questions. At a societal level, ‘What is the effect of a public health policy that requires or promotes helmets?’ and at an individual level, ‘What is the effect of wearing a helmet?’ Both questions are methodologically challenging and contentious.
The linked paper by Dennis and colleagues (doi:10.1136/bmj.f2674) investigates the policy question and concludes that the effect of Canadian helmet legislation on hospital admission for cycling head injuries ‘seems to have been minimal’. Other ecological studies have come to different conclusions, but the current study has somewhat superior methodology – controlling for background trends and modelling head injuries as a proportion of all cycling injuries.
This finding of ‘no benefit’ is superficially hard to reconcile with case-control studies, many of which have shown that people wearing helmets are less likely to have a head injury. Such findings suggest that, for individuals, helmets confer a benefit. These studies, however, are vulnerable to many methodological shortcomings. If the controls are cyclists presenting with other injuries in the emergency department, then analyses are conditional on having an accident and therefore assume that wearing a helmet does not change the overall accident risk. There are also confounding variables that are generally unmeasured and perhaps even unmeasurable. People who choose to wear bicycle helmets will probably be different from those who ride without a helmet: they may be more cautious, for example, and so less likely to have a serious head injury, regardless of their helmets.
People who are forced by legislation to wear a bicycle helmet, meanwhile, may be different again. Firstly, they may not wear the helmet correctly, seeking only to comply with the law and avoid a fine. Secondly, their behaviour may change as a consequence of wearing a helmet through ‘risk compensation’, a phenomenon that has been documented in many fields. One study – albeit with a single author and subject – suggests that drivers give larger clearance to cyclists without a helmet.
Even if helmets do have an effect on head-injury rates, it would not necessarily follow that legislation would have public health benefits overall. This is because of ‘second-round’ effects, such as changes in cycling rates, which may affect individual and population health. Modelling studies have generally concluded that regular cyclists live longer because the health effects of cycling far outweigh the risk of crashes. This trade-off depends crucially, however, on the absolute risk of an accident: any true reduction in the relative risk of head injury will have a greater impact where crashes are more common, such as for children.
The impact on all-cause mortality, and on head injuries, may be even further complicated if such legislation has varying effects on different groups. For example, a recent study identified two broad subpopulations of cyclist: ‘one speed-happy group that cycle fast and have lots of cycle equipment including helmets, and one traditional kind of cyclist without much equipment, cycling slowly’. The study concluded that compulsory cycle-helmet legislation may selectively reduce cycling in the second group. There are even more complex second-round effects if each individual cyclist’s safety is improved by increased cyclist density through ‘safety in numbers’, a phenomenon known as Smeed’s law. Statistical models for the overall impact of helmet habits are therefore inevitably complex and based on speculative assumptions. This complexity seems at odds with the current official BMA policy, which confidently calls for compulsory helmet legislation.
Standing over all this methodological complexity is a layer of politics, culture and psychology. Supporters of helmets often tell vivid stories about someone they knew, or heard of, who was apparently saved from severe head injury by a helmet. Risks and benefits may be exaggerated or discounted depending on the emotional response to the idea of a helmet. For others, this is an explicitly political matter, where an emphasis on helmets reflects a seductively individualistic approach to risk management (or even ‘victim blaming’), while the real gains lie elsewhere. It is certainly true that in many countries, such as Denmark and the Netherlands, cyclists have low injury rates, even though rates of cycling are high and almost no cyclists wear helmets. This seems to be achieved through interventions such as good infrastructure, stronger legislation to protect cyclists, and a culture of cycling as a popular, routine, non-sporty, non-risky behaviour.
In any case, the current uncertainty about any benefit from helmet wearing or promotion is unlikely to be substantially reduced by further research. Equally, we can be certain that helmets will continue to be debated, and at length. The enduring popularity of helmets as a proposed major intervention for increased road safety may therefore lie not with their direct benefits – which seem too modest to capture compared with other strategies – but more with the cultural, psychological and political aspects of popular debate around risk.
Guardian, 12 January 2008
So we’re all going to get screened for our health problems, by some businessmen who’ve bought a CT scanner and put an advert in the paper maybe, or perhaps by Gordon Brown: because screening saves lives, data is good, and it’s always better to do something rather than nothing.
Unfortunately, it’s a tiny bit more complicated than that.
Screening is a fascinating area, mainly because of the maths of rare events, but also because of the ethics. Screening isn’t harmless, as tests – inevitably – aren’t perfect. You might get a false alarm, causing stress and anxiety (‘the worst time in my life’ said women in one survey on breast screening). Or you might have to endure more invasive medical investigations to follow up the early warning: even something as innocuous as a biopsy can sometimes result in harmful adverse events, and if you do a lot of those, unnecessarily, in a population, then you’re hurting people, sometimes more than you’re helping. Lastly, people might get false reassurance from a false negative result, and ignore other niggles, which can in turn delay the diagnosis of genuine problems.
Then, there are the interesting ethical issues. One of the proposed screening programmes is intended to catch abdominal aortic aneurysms earlier. An AAA is a swelling of the main blood-vessel trunk in your belly: they can rupture without much warning, and when they do, people often die fast and frighteningly. But if you know the AAA is there, and do the repair operation at your leisure before it ruptures, then survival is far better.
Screening and repairing have been shown to reduce mortality by around 40 per cent, looking at the whole population, which is a good thing.
But remember, you will operate on some people – as a preventive measure, because you picked up their aneurysm on screening – who would never have died from their aneurysm: it would have just ticked away quietly, not rupturing. And some of the people you operate on unnecessarily (and remember, there’s no crystal ball to identify these people) will die of complications on the operating table. They only died because of your screening programme. It saves lives overall, but Fred Bloggs – loving husband of Winona Bloggs – who would have lived, is now dead, thanks to you.
That’s Vegas, you could say. But it’s tricky, and the sums are often close. For example, mammogram screening for breast cancer every two years has been estimated to prevent two deaths per thousand women aged fifty to fifty-nine over ten years: that is good. But achieving this requires 5,000 screenings among those thousand women, resulting in 242 recalls, and sixty-four women having at least one biopsy. Five women will have cancer detected and treated. Again, this isn’t an argument against screening, we’re just walking through some example numbers.
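Those example numbers can be turned into trade-off ratios with a little Python, using only the figures quoted above:

```python
# Breast-screening figures per 1,000 women aged 50-59,
# screened every two years for ten years (as quoted in the text).
women = 1000
screenings = 5000       # five rounds each over the decade
recalls = 242
biopsied = 64
cancers_treated = 5
deaths_prevented = 2

print(f"screenings per death prevented: {screenings / deaths_prevented:.0f}")
print(f"recalls per death prevented:    {recalls / deaths_prevented:.0f}")
print(f"number needed to screen:        {women / deaths_prevented:.0f}")
```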
Although, interestingly, that’s not something everybody is keen to do with screening. People in healthcare can be zealots, and enthusiasts, and we can often project our own values and preferences onto everyone else.
Researchers have studied the invitation letters sent out for screening programmes, along with the websites and pamphlets, and they have repeatedly been shown to be biased in favour of participation, and lacking in information.