12 December 1998

Lies, Damned Lies and Statistics

According to the US National Cancer Institute, the survival prospects of cancer patients have improved by just 4% since 1971, despite the regular introduction of clinically proven therapies.

According to Dr Nigel Brown of the Queen's Medical Centre in Nottingham, UK, trials of heart drugs suggest that the survival rate of heart attack victims should have doubled since 1982. On the wards, there has been no such increase.

The answer to this conundrum lies in the statistical basis of significance testing. If a drug in clinical trials appears to improve upon the standard therapy, statistics are brought to bear to determine whether the improvement can be deemed significant.

Relied on by the scientific community for over 70 years, significance tests underpin thousands of research papers and millions of pounds of funding. However, it is becoming increasingly clear that these tests are fatally flawed, and the result can be seen in the lack of overall improvement in drug therapy compared with the improvement each trial individually claims.
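
A quick simulation makes the arithmetic behind that conundrum concrete. The sketch below (in Python, with purely invented figures: a thousand drugs that in truth do nothing, each tested on 50 patients against a standard therapy that cures half of them) shows how fluke alone can stock the literature with 'significant' therapies:

    import random

    random.seed(1)  # reproducible illustration

    TRIALS, PATIENTS, CURE_RATE = 1000, 50, 0.5
    CUTOFF = 32  # smallest recovery count with one-sided P < 0.05 under Binomial(50, 0.5)

    significant = 0
    for _ in range(TRIALS):
        # Each 'new drug' here is genuinely useless: patients recover at
        # exactly the standard therapy's 50% rate.
        recoveries = sum(random.random() < CURE_RATE for _ in range(PATIENTS))
        if recoveries >= CUTOFF:
            significant += 1

    print(f"{significant} of {TRIALS} useless drugs declared 'significant'")
    # Typically around 30 -- fluke alone delivers a steady stream of
    # 'clinically proven' therapies that do nothing on the wards.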

The accusatory finger is being pointed at Professor Ronald Aylmer Fisher, a distinguished Cambridge (UK) geneticist working in the 1920s. From his work on plant breeding trials he appeared to have discovered a truly objective way of drawing conclusions from data. He recommended turning raw data into probability values, or P-values: the probability of getting results at least as impressive as those obtained, assuming mere fluke was their cause. If this P-value was less than 0.05, the results could be declared 'significant'. This P-value has been used to determine significance ever since. It is, however, fundamentally flawed.
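
To see what Fisher's recipe actually computes, here is a minimal sketch with invented figures: suppose the standard therapy cures half of all patients, and a new drug cures 36 of 50 in a trial. The P-value is the chance of a result at least that impressive if the drug did nothing:

    from math import comb

    def p_value(successes, n, p_null):
        """One-sided P-value: the probability of at least `successes`
        recoveries among `n` patients if mere fluke were the cause,
        i.e. if the true cure rate were still `p_null`."""
        return sum(comb(n, k) * p_null**k * (1 - p_null)**(n - k)
                   for k in range(successes, n + 1))

    # Invented trial: the standard therapy cures 50%, the new drug cured 36 of 50.
    print(f"P = {p_value(36, 50, 0.5):.4f}")
    # P is about 0.001, comfortably below 0.05, so the result is 'significant'.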

If a scientist is observing the effect of a new drug, the question they want to answer is whether the drug is having some genuine effect on patients. The P-value seems to answer that question, but it does not. The P-value only tells the scientist how likely the observed data would be, assuming mere chance were the cause. If this probability is less than 0.05, the result is taken to be significant. But this is not the same as asking how likely it is that chance really is responsible for the observed effect. To answer that question, a scientist would have to return to the work of Thomas Bayes, an 18th-century English mathematician. The same Thomas Bayes whose work was considered the basis of scientific inference throughout the 18th and 19th centuries, until it was rubbished by a scientific community led by a certain Professor Ronald Aylmer Fisher, who was looking for a way of supporting his work on plant breeding.
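
The difference is easiest to see by applying Bayes' theorem directly. The sketch below uses purely illustrative assumptions: only one candidate drug in ten genuinely works, a real effect produces data this striking 80% of the time, and a dud produces it 5% of the time (the tail probability a P-value of 0.05 reports):

    # All three numbers below are illustrative assumptions, not data.
    prior_real   = 0.10  # P(drug genuinely works), before the trial
    p_data_real  = 0.80  # P(data this striking | drug works)
    p_data_fluke = 0.05  # P(data this striking | mere fluke) -- the P-value

    # Bayes' theorem: P(fluke | data) = P(data | fluke) P(fluke) / P(data)
    evidence = p_data_real * prior_real + p_data_fluke * (1 - prior_real)
    p_fluke_given_data = p_data_fluke * (1 - prior_real) / evidence

    print(f"P(chance is responsible | data) = {p_fluke_given_data:.2f}")
    # Comes out at 0.36: over a third of these 'significant' results are
    # flukes, even though every one of them passed the P < 0.05 test.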

Back to the schoolroom for the clinical trial scientists.