I Think You'll Find It's a Bit More Complicated Than That
Ben Goldacre
By working backwards and sideways from these kinds of calculations, Ioannidis was able to determine, from the sizes of effects measured, and from the numbers of people scanned, how many positive findings could plausibly have been expected, and compare that to how many were actually reported. The answer was stark: even being generous, there were twice as many positive findings as you could realistically have expected from the amount of data reported on.
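The arithmetic behind this kind of check can be sketched in a few lines of Python. This is a toy illustration only: the powers and counts below are invented for demonstration, not Ioannidis's actual figures. The idea is that if every claimed effect were real, the expected number of positive findings is simply the sum of each study's statistical power, which you can then compare with the number of positives actually reported.

```python
# Illustrative excess-significance check, in the spirit of Ioannidis's
# method. All numbers here are made up for demonstration purposes.

# Estimated statistical power of each study to detect the effect size
# it claims (i.e. the probability of a true positive, if the effect is real).
powers = [0.3, 0.4, 0.25, 0.5, 0.35, 0.2, 0.45, 0.3]

reported_positives = 6  # how many of these studies reported a positive finding

# If every effect were real, the expected number of positive findings
# is the sum of the individual powers.
expected_positives = sum(powers)

print(f"expected ~{expected_positives:.1f} positives, observed {reported_positives}")
# An observed count far above the expected one suggests selective
# reporting, or some other bias, is inflating the literature.
```

Here six positives against an expectation of under three would be the kind of stark mismatch the column describes.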
What could explain this? Inadequate blinding is an issue: a fair amount of judgement goes into measuring the size of a brain area on a scan, so wishful nudges can creep in. And boring old publication bias is another: maybe whole negative papers aren’t getting published.
But a final, more interesting explanation is also possible. In these kinds of studies, it’s possible that many brain areas are measured to see if they’re bigger or smaller, and maybe then only the positive findings get reported within each study.
There is one final line of evidence to support this. In studies of depression, for example, thirty-one studies report data on the hippocampus, six on the putamen, and seven on the prefrontal cortex. Maybe, perhaps, more investigators really did focus solely on the hippocampus. But given how easy it is to measure the size of another area – once you’ve recruited and scanned your participants – it’s also possible that people are measuring these other areas, finding no change, and not bothering to report that negative result in their paper alongside the positive ones they’ve found.
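The scale of the problem is easy to simulate. The sketch below — a purely hypothetical simulation, with parameters I've chosen for illustration rather than taken from the studies above — shows how often a study finds at least one ‘significant’ brain area when there is no real effect at all, just because it measured several areas and reported the best ones.

```python
# Hypothetical simulation: if each study measures many brain areas but
# reports only those crossing p < 0.05, positive findings accumulate
# even when nothing real is going on. Parameters are invented.
import random

random.seed(0)

n_studies = 100
areas_per_study = 8         # e.g. hippocampus, putamen, prefrontal cortex...
false_positive_rate = 0.05  # chance each null comparison looks 'significant'

studies_with_a_finding = 0
for _ in range(n_studies):
    # Each area independently has a 5% chance of a spurious positive.
    hits = sum(random.random() < false_positive_rate
               for _ in range(areas_per_study))
    if hits > 0:
        studies_with_a_finding += 1

# With 8 comparisons per study, roughly 1 - 0.95**8, about a third,
# of these null studies can still report at least one 'positive' area.
print(studies_with_a_finding, "of", n_studies, "null studies found something")
```

If only those ‘positive’ areas make it into each paper, the published literature fills up with findings that were never there.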
There’s only one way to prevent this: researchers would have to publicly pre-register what areas they plan to measure, before they begin, and report all findings. In the absence of that process, the entire field might be distorted by a form of exaggeration that is – we trust – honest and unconscious, but more interestingly, collective and disseminated.
Guardian, 15 January 2011
Sometimes something will go wrong with an academic paper, and it will need to be retracted: that’s entirely expected. What matters is how academic journals deal with problems when they arise.
In 2004 the Annals of Thoracic Surgery published a study comparing two heart drugs. This week it was retracted. Ivan Oransky and Adam Marcus are two geeks who set up a website called RetractionWatch because it was clear that retractions are often handled badly: they contacted the editor of ATS, Dr L. Henry Edmunds Jr, MD, to find out why the paper was retracted. ‘It’s none of your damn business,’ replied Dr Edmunds, before railing against ‘journalists and bloggists’. The retraction notice, he said, was merely there ‘to inform our readers that the article is retracted’. ‘If you get divorced from your wife, the public doesn’t need to know the details.’
ATS’s retraction notice on this paper is equally uninformative and opaque. The paper was retracted ‘following an investigation by the University of Florida, which uncovered instances of repetitious, tabulated data from previously published studies’. Does that mean duplicate publication, two bites of the cherry? Or maybe plagiarism? And if so, of what, by whom? And can we still trust the authors’ numerous other papers?
What’s odd is that this is not uncommon. Academic journals have high expectations of academic authors, with explicit descriptions of every step in an experiment, clear references, peer review, declarations for financial conflicts of interest, and so on, for a good reason: academic journals are there to inform academics about the results of experiments, and to discuss their interpretation. Retractions form an important part of that record.
Here’s one example of why. In October 2010 the Journal of the American Chemical Society retracted a 2009 paper about a new technique for measuring DNA, explaining it was because of ‘inaccurate DNA hybridization detection results caused by application of an incorrect data processing method’. This tells you nothing. When RetractionWatch got in touch with the author, he explained that his team forgot to correct for something in their analysis, which made the technique they were testing appear to be more powerful than it really was; they actually found it’s no better than the process it was proposed to replace.
That’s useful information, much more informative than the paper simply disappearing one morning, and it clearly belongs in the academic journal the original paper appeared in, not in an email to two people from the internet running an ad hoc blog tracking down the stories behind retractions.
This all becomes especially important when you think through how academic papers are used: that JACS paper has now been cited fourteen times, by people who believed it to be true. And we know that news of even the simple fact of a retraction fails to permeate through to consumers of information.
Researcher Stephen Breuning faked huge amounts of trial data on the drug Ritalin, and was found guilty of scientific misconduct in 1988 by a US federal judge – which is unusual and extreme in itself – so most of his papers were retracted. A study last year chased up all the references to Breuning’s work from 1989 to 2007, and found over a dozen academic papers still citing his work. Some discussed it as a case of fraud, but around half – in more prominent journals – still cited it as if it was valid, twenty-four years after its retraction.
The role of journals in policing academic misconduct is still unclear, but obviously, explaining the disappearance of a paper you published is a bare minimum. Like publication bias, whereby negative findings are less likely to be published, this is a systemic failure, across all fields, so it has far greater ramifications than any one single, eye-catching academic cock-up or fraud. Unfortunately it’s also a boring corner in the technical world of academia, so nobody has been shamed into fixing it. Eyeballs are an excellent disinfectant: you should read RetractionWatch.
Twelve Monkeys. No … Eight. Wait, Sorry, I Meant Fourteen
Guardian, 23 January 2010
Like many people, you’re possibly afraid to share your views on animal experiments, because you don’t want anyone digging up your grandmother’s grave, or setting fire to your house, or stuff like that. Animal experiments are necessary, they need to be properly regulated, and we have some of the tightest regulation in the world.
But it’s easy to assess whether animals are treated well, or whether an experiment was necessary. In the nerd corner there is another issue: is the research well conducted, and are the results properly communicated? If not, then animals have suffered – whatever you believe that might mean for an animal – partly in vain.
The National Centre for the Replacement, Refinement and Reduction of Animals in Research was set up by the government in 2004. It has published, in the academic journal PLoS One, a systematic survey of the quality of reporting, experimental design and statistical analysis of recently published biomedical research using laboratory animals. These results are not good news.
The study is pretty solid. It describes the strategy they used to search for papers, which is important, because you don’t want to be like a homeopath, and only quote the papers that support your conclusions: you want to have a representative sample of all the literature. And the papers they found covered a huge range of publicly funded research: behavioural and diet studies, drug and chemical testing, immunological experiments, and more.
Some of the flaws they discovered were bizarre. Four per cent of papers didn’t mention how many animals were used in the experiment, anywhere. The researchers looked in detail at forty-eight studies that did say how many were used: not one explained why that particular number of animals had been chosen. Thirty-five per cent of the papers gave one figure for the number of animals used in the methods, and then a different number of animals appeared in the results. That’s pretty disorganised.
They looked at how many studies used basic strategies to reduce bias in their results, like randomisation and blinding. If you’re comparing one intervention against another, for example, and you don’t randomly assign animals to each group, then it’s possible you might unconsciously put the stronger animals in the group getting a potentially beneficial experimental intervention, or vice versa, thus distorting your results.
If you don’t ‘blind’, then you know, as the experimenter, which animals had which intervention. So you might allow that knowledge, even unconsciously, to affect close calls on measurements you take. Or maybe you’ll accept a high blood-pressure reading when you expected it to be high, knowing what you do about your own experiment, but then double-check a high blood-pressure measurement in an animal where you expected it to be low.
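Both safeguards are cheap to implement, which is what makes their absence so striking. Here is a minimal sketch of what they amount to in code — the animal IDs and group sizes are invented for illustration:

```python
# Minimal sketch of the two safeguards discussed above: randomised
# allocation and blinded assessment. All identifiers are invented.
import random

random.seed(42)

animals = [f"rat_{i}" for i in range(20)]

# Randomise: shuffle before splitting, so the experimenter can't steer
# stronger animals into the group getting the experimental intervention.
random.shuffle(animals)
treatment, control = animals[:10], animals[10:]

# Blind: record outcomes under coded labels, so whoever takes the
# measurements doesn't know which group an animal belongs to.
codes = {animal: f"subject_{i}" for i, animal in enumerate(animals)}

print(len(treatment), len(control))  # 10 10
```

The point is not the code itself but how little effort it represents, compared with how much bias it removes.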
Only 12 per cent of the animal studies used randomisation. Only 14 per cent used blinding. And the reporting was often poor. Only 8 per cent gave the raw data, allowing you to go back and do your own analysis. About half the studies left the numbers of animals in each group out of their tables.
I grew up friends with the daughters of Colin Blakemore, a neuroscientist in Oxford who has taken courageous risks over many decades to speak out and defend necessary animal research. My first kiss – not one of those sisters, I should say – was outside a teenage party in a church hall, in front of two Special Branch officers sitting in a car with their lights off.
People who threaten the lives of fifteen-year-old girls, to shut their father up, are beneath contempt. People who fail to damn these threats are similarly contemptible. That’s why it sticks in the throat to say that the reporting and conduct of animal research is often poor; but we have to be better.
Medical Hypotheses Fails the Aids Test
Guardian, 12 September 2009
This week the peer-review system has been in the newspapers, after a survey of scientists suggested it had some problems. This is barely news. Peer review – where articles submitted to an academic journal are reviewed by other scientists from the same field for an opinion on their quality – has always been recognised as problematic. It is time-consuming, it can be open to corruption, and it cannot always prevent fraud, plagiarism or duplicate publication, although in a more obvious case it might. The main problem with peer review is: it’s hard to find anything better.
Here is one example of a failing alternative. This month, after a concerted campaign by academics aggregating around websites such as Aidstruth.org, academic publishers Elsevier have withdrawn two papers from a journal called Medical Hypotheses. This academic journal is a rarity: it does not have peer review; instead, submissions are approved for publication by its one editor.
Articles from Medical Hypotheses have appeared in this column quite a lot. It carried one almost surreally crass paper in which two Italian doctors argued that ‘mongoloid’ really was an appropriate term for people with Down’s syndrome after all, because they share many characteristics with Oriental populations (including: sitting cross-legged, eating small amounts of lots of different types of food with MSG in it, and an enjoyment of handicrafts). You might also remember two pieces discussing the benefits and side effects of masturbation as a treatment for nasal congestion.
The papers withdrawn this month step into a new domain of foolishness. Both were from the community whose members characterise themselves as ‘Aids dissidents’, and one was co-authored by its figureheads, Peter Duesberg and David Rasnick.
To say that a peer reviewer might have spotted the flaws in their paper – which had already been rejected by the Journal of Aids – is an understatement. My favourite part is the whole page they devote to arguing that there cannot be lots of people dying of Aids in South Africa, because the population of that country has grown over the past few years.
We might expect anyone to spot such poor reasoning – and only two days passed between this paper’s submission and its acceptance – but they also misrepresent landmark papers from the literature on Aids research. Rasnick and Duesberg discuss antiretroviral medications, which have side effects, but which have stopped Aids being a death sentence, and attack the notion that their benefits outweigh the toxicity: ‘Contrary to these claims,’ they say, ‘hundreds of American and British researchers jointly published a collaborative analysis in the Lancet in 2006, concluding that treatment of Aids patients with anti-viral drugs has “not translated into a decrease in mortality”.’
This is a simple, flat, unambiguous misrepresentation of the Lancet paper to which they refer. Antiretroviral medications have repeatedly been shown to save lives in systematic reviews of large numbers of well-conducted randomised controlled trials. The Lancet paper they reference simply surveys the first decade of patients who received highly active antiretroviral therapy (HAART) – modern combinations of multiple antiretroviral medications – to see if things had improved, and they had not. Patients receiving HAART in 2003 did no better than patients receiving HAART in 1995. This doesn’t mean that HAART is no better than placebo. It means outcomes for people on HAART didn’t improve over an eight-year period of their use. This would be obvious to anyone familiar with the papers, but also to anyone who thought to spend the time checking the evidence for an obviously improbable assertion.