Things We Don't Know: Why don’t we know more about causation?

You’ve seen them – every week a new story about genes for this or that, or the environmental cause of some effect or other. Often the stories say opposite things; red wine is good for us in one article, bad in the next, depending on the study. Or it might be about coffee, eggs or vitamin supplements, and health advice to pregnant women is often contradictory. The gene for autism or dyslexia or asthma is found one week, but the finding can’t be replicated the next. Does hydraulic fracturing of rocks to mine natural gas and oil contaminate ground water? Are GMO foods good or bad for us? You’d think figuring out these things would be easy – genes do something, they make us what we are, while people who eat a given food get sick and those who don’t stay well, so what’s the problem?

The problem is complexity. Modern scientific methods are very good at finding a cause when its effect is large or clear-cut. That smoking causes lung cancer was easy to see once the question was asked, and the genetic variant that causes cystic fibrosis was relatively straightforward to find because the effect of the mutation is so major. But determining causation gets thorny when a disease or effect is caused by multiple genes, or by genes plus some environmental factor, or when there are many pathways to the same outcome, all of which are common scenarios. The fact that everyone is unique to start with makes it even thornier.

Photograph entitled "The way it is" by Dhilung Kirat (creative commons)

When an object casts a shadow, we know the shadow isn't causing the light or creating the object; cause and effect are easy to tell apart. But sometimes causality is a lot trickier to determine. Image credit: Dhilung Kirat

Simple

Epidemiology, the study of patterns of disease in populations, got off the ground in 1854 with a now-famous cholera epidemic in the Soho area of London. At the time it was widely believed that cholera was caused by "miasma", or bad air, but physician John Snow showed it to be caused by contaminated water.

Since then, epidemiology has had great success in identifying the causes of disease (malaria, yellow fever, measles, Legionnaires' disease and many more), at least until the latter part of the 20th century, when infectious diseases seemed to be on the decline. Epidemiology today is still good at quickly understanding newly emerging diseases like SARS and MERS, the new coronavirus from the Middle East. Epidemiological methods are also great at determining other kinds of causes with large effects, like cigarettes or asbestos or toxic chemicals. Here, the cause produces the effect most, if not every, time it is present. But epidemiology is not so good at finding the causes of chronic diseases, and in fact has turned to genetics for help.

But genetics has its own troubles. In the 1980s, genetics began to identify genes for diseases or traits that behaved according to Mendel's rules: genetic variants that seem to determine whether or not someone has a particular trait. The number of genes with variants known to cause a disorder is now around 3,000, and the number of diseases and traits with a known molecular basis is about 5,000¹. But these are generally rare, paediatric diseases.

Not so simple

X-linked Recessive inheritance, by the National Institutes of Health (public domain)

Some diseases are linked to sex, such as Fragile X syndrome. Image credit: National Institutes of Health (public domain)

Mendel chose traits for his experiments that were determined by a single gene. Unfortunately, most traits aren't as simple as this, and yet the belief, or hope, that they are continues to guide much of the work in genetics to this day. But methods built on that assumption don't work so well with the diseases that most of us will get. Complex traits like type 2 diabetes, heart disease or asthma seem to be caused by multiple genes, each with a small effect, often triggered by environmental factors or lifestyle, and the genes or other factors involved can differ from person to person.

The cause of traits like these can rarely be reduced to single factors, but science is reductionist. It works best when all the irrelevant 'noise' can be ignored, leaving just one or a few clear causes. When the noise can be part of the cause, that's a problem. But in so many areas of modern science, what 'causes' do is raise the chance (probability) of an effect; the effect doesn't always happen, and often the chance that it will happen is very difficult to estimate from data.
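To see why a cause that only raises the chance of an effect is so hard to pin down, it can help to simulate it. The sketch below is purely illustrative: the function names are made up for this example, and the numbers (a 10% baseline risk and an exposure that multiplies risk by 1.2) are assumptions, not figures from any real study. It repeats the same imaginary study many times at two sample sizes and reports how widely the estimated risk ratio swings, and how often a study would wrongly suggest that the exposure is protective.

```python
import random


def observed_risk_ratio(n_per_group, base_risk, true_ratio, rng):
    """Simulate one study and return the risk ratio it would report.

    Each unexposed person falls ill with probability base_risk; each
    exposed person with probability base_risk * true_ratio. The groups
    are the same size, so the ratio of case counts equals the ratio of
    observed risks. Returns None if no unexposed person falls ill.
    """
    exposed_cases = sum(rng.random() < base_risk * true_ratio
                        for _ in range(n_per_group))
    unexposed_cases = sum(rng.random() < base_risk
                          for _ in range(n_per_group))
    if unexposed_cases == 0:
        return None
    return exposed_cases / unexposed_cases


def summarize_studies(n_per_group, base_risk=0.10, true_ratio=1.2,
                      n_studies=100, seed=1):
    """Repeat the study; return (lowest, highest, share of estimates below 1)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    ratios = [observed_risk_ratio(n_per_group, base_risk, true_ratio, rng)
              for _ in range(n_studies)]
    ratios = [r for r in ratios if r is not None]
    if not ratios:
        return None
    wrong_way = sum(r < 1 for r in ratios) / len(ratios)
    return min(ratios), max(ratios), wrong_way


if __name__ == "__main__":
    # Hypothetical sample sizes: a small study and a much larger one.
    for n in (200, 5000):
        low, high, wrong_way = summarize_studies(n)
        print(f"n={n}: estimates range {low:.2f} to {high:.2f}; "
              f"{wrong_way:.0%} of studies suggest the exposure is protective")
```

Under these assumptions the small studies scatter widely around the true value, and some of them point in the wrong direction altogether, which is one way the same food can look good in one article and bad in the next.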

This I believe

Add to this the problem of belief and how it influences the lessons we take from science. Scientific hypotheses are supposed to be falsifiable. The classic example is white swans. You can believe that all swans are white… until you see a black one. Once you see that black one you are duty bound to throw out your 'all swans are white' idea. But people can get very fond of their hypotheses, and construct explanations such as "maybe that black swan ran into an oil slick", or "it's a mutant and the only black swan there is, and so really we needn't throw away our hypothesis after all".

Does fracking add methane to well-water? The answer depends on who you ask. A new study² in the US journal Proceedings of the National Academy of Sciences says wells within a kilometre of a fracking site are six times more likely to be contaminated with methane than wells further away. Industry spokespeople don't question this finding, but they criticize the choice of test wells: because the wells were in areas where home owners had complained, they were more likely to be contaminated than a random sample of wells near fracking sites would be. And methane has been found in well water regardless of distance from fracking sites.³
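The objection about how the test wells were chosen is an argument about selection bias, and a little arithmetic shows why it can matter. The sketch below is hypothetical throughout: the function names and every number in it are invented for illustration and are not taken from the study or from the industry response. It assumes that nearby wells are tested only when owners complain, that owners of contaminated wells complain more often, and that distant wells are sampled at random. It says nothing about whether the critique is actually correct.

```python
def sampled_contamination_rate(true_rate, complain_if_contaminated,
                               complain_if_clean):
    """Share of contaminated wells among wells whose owners complained.

    This is Bayes' rule: P(contaminated | complained).
    """
    contaminated_and_complained = true_rate * complain_if_contaminated
    clean_and_complained = (1 - true_rate) * complain_if_clean
    total_complained = contaminated_and_complained + clean_and_complained
    if total_complained == 0:
        raise ValueError("nobody complains, so no wells would be sampled")
    return contaminated_and_complained / total_complained


def apparent_ratio(true_rate_near, true_rate_far,
                   complain_if_contaminated, complain_if_clean):
    """Ratio a study would see if near wells are sampled via complaints
    and far wells are sampled at random (both are assumptions)."""
    near_sampled = sampled_contamination_rate(
        true_rate_near, complain_if_contaminated, complain_if_clean)
    return near_sampled / true_rate_far


if __name__ == "__main__":
    # Hypothetical: 5% of wells are contaminated everywhere, so distance
    # has no real effect; owners of contaminated wells complain half the
    # time, owners of clean wells one time in ten.
    ratio = apparent_ratio(0.05, 0.05, 0.5, 0.1)
    print(f"apparent ratio with no real effect: {ratio:.1f}")  # prints 4.2
```

With these invented numbers, nearby wells look about four times as likely to be contaminated even though distance makes no difference at all. Whether anything like this happened in the real study is exactly the kind of question that is hard to settle.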

So, is the science right or isn't it? The scientific method is rigorous, with guidelines for sample selection, data collection, analytic methods and so forth, but no study is perfect, and it's always difficult to know whether the flaws are fatal or not. And some causes do seem to actually be elusive or only probabilistic. This is where we know that we don't know. Enter belief. If you don't like the results it's a lot easier to find a flaw you consider fatal.

On the face of it, it looks as though determining causation should be easy. But there are too many methodological weaknesses, human foibles and unknowables for that to be so.

This guest post is by Anne Buchanan and Ken Weiss.

Anne Buchanan is a Senior Research Associate in the Anthropology Department at Penn State. She's a long-time collaborator with Ken Weiss on developmental and evolutionary genetics as well as complex traits and why they are so difficult to understand.

Ken Weiss is Evan Pugh Professor of Anthropology and Genetics at Penn State University. His lab does developmental and evolutionary genetics, and he's had a long-standing interest in complexity and epistemological questions of how we know what we know. He writes a regular column on these issues in the journal Evolutionary Anthropology. The columns can be accessed via his webpage.

Anne and Ken co-authored a book, The Mermaid's Tale, 2009, about genetics and evolution, and they blog regularly at a site of the same name, The Mermaid's Tale.

References

¹ OMIM, Johns Hopkins University School of Medicine
² Jackson, Robert B. et al. "Increased stray gas abundance in a subset of drinking water wells near Marcellus shale gas extraction." Proceedings of the National Academy of Sciences 110.28 (2013): 11250-11255. doi: 10.1073/pnas.1221635110
³ Sloto, R.A., 2013, Baseline groundwater quality from 20 domestic wells in Sullivan County, Pennsylvania, 2012: U.S. Geological Survey Scientific Investigations Report 2013–5085, 27 p.

Saturday 20 July 2013
