Probability, Bayesian theory, and causation

Let’s say that one day you go into the doctor’s office. He asks you what’s wrong, and you say that you have a fever and a headache.

“Hm…,” he says. “Is the headache particularly sharp, and towards the back of your head? And your fever, is it exactly 101.3 degrees, and it’s been that way for a couple days?”

“Yes,” you say, “exactly that, for both of those things.”

“I read about this in the medical literature,” the doctor responds. “I think you have a brain parasite from Nicaragua. 99% of Nicaraguan brain parasite sufferers have that headache and that fever, and it’s the most reliable way of diagnosing Nicaraguan brain parasites. We need to start treatment immediately.”

This would be an alarming thing to hear from your doctor, and you’d likely want to start treatment immediately. That is, unless you’d been reading my philosophy essays and were wondering if this is really sufficient justification to start treatment for brain parasites. The answer is that it’s not, and I’ll tell you why.

The doctor read in the medical literature that 99% of Nicaraguan brain parasite sufferers have those symptoms, and assumed that 99% of people with those symptoms have brain parasites. Unfortunately, that’s not a fair assumption. It’s a big world, and there are many diseases. Presumably, at least one of them could cause those symptoms without it having to be a Nicaraguan brain parasite. In fact, the doctor should at least check if the patient has been to Nicaragua, because, if they haven’t, the chance that they have a brain parasite from Nicaragua is presumably zero. In this case, correlation does not mean causation.

But let’s take this a step further. Let’s say the doctor isn’t quite so hasty. Instead, he says:

“Have you been to Nicaragua?”

“Yes,” you respond.

“Oh, that’s quite bad. Of the 100 people in Nicaragua with the brain parasite, 99 of them had your exact symptoms. Only 1% of Nicaragua’s population at large has these symptoms. That means it’s a specific, reliable test, which both tells us reliably when people have the parasite and when they do not. We need to start treatment immediately.”

Is that better? Well, slightly, but still not by much. In fact, the doctor has now given us a mathematical way of proving why it’s not a good idea to trust his judgment. Not everyone in Nicaragua has the parasite. In fact, almost nobody does. We’re told that 100 people had the parasite, and Nicaragua has a population of about 6 million. 1% of 6 million is 60,000 people with these symptoms. Only 99 of them have the parasite, and 99 is about 0.17% of 60,000. That means if you picked a person at random from Nicaragua who had those symptoms, there would be less than a 0.2% chance of them being a sufferer of a brain parasite. It is almost certain you don’t have a brain parasite.
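If you want to check the doctor’s numbers yourself, here is a minimal Python sketch. It uses only the figures from the example (6 million people, 100 sufferers, 99 of them with the symptoms, 1% of the population with the symptoms). The variable names are just illustrative.

```python
# Figures quoted by the doctor in the example.
population = 6_000_000          # rough population of Nicaragua
sufferers = 100                 # people with the brain parasite
sufferers_with_symptoms = 99    # sufferers who show the headache and fever
symptom_rate = 0.01             # share of the whole population with the symptoms

# Everyone in the country who has the symptoms, parasite or not.
people_with_symptoms = round(population * symptom_rate)

# Chance that a randomly chosen person with the symptoms has the parasite.
p_parasite_given_symptoms = sufferers_with_symptoms / people_with_symptoms

print(f"People with symptoms: {people_with_symptoms:,}")
print(f"P(parasite | symptoms) = {p_parasite_given_symptoms:.4%}")
```

The test looks impressive only until you divide by everyone who has the symptoms rather than by everyone who has the parasite.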

This method of reasoning is called Bayesian probability, after Thomas Bayes, the guy who (sort-of) discovered it. In brief, it’s about taking prior probabilities into account when discussing current probabilities. When the doctor read about the power of the test, he imagined the test being given to 100 people, each of whom might have a parasite. Then the test would identify with 99% accuracy who had the parasite and who didn’t. But the doctor didn’t consider the prior probability of a person having the parasite in the first place. This was probably in part due to how hard it is for us to think intuitively about big numbers, which I’ve discussed previously.

This might seem like a strange and silly example to you, but it’s actually pretty important. Breast cancer screenings have this trouble all the time. When we say that mammograms are 90% accurate at early detection, we normally mean that if 100 women have breast cancer, on average mammograms can detect 90 of them early. But we should also be asking how often a woman who doesn’t have breast cancer gets a “false positive” from a mammogram.
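To see why the false positive rate matters so much, here is a rough sketch of Bayes’s theorem as a Python function. Only the 90% detection figure comes from the paragraph above. The prevalence and the false positive rate are hypothetical placeholders I’ve picked for illustration, not real mammography statistics, and the function name `posterior` is my own.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Bayes's theorem: chance of having the condition given a positive test."""
    true_positives = sensitivity * prior                  # sick and flagged
    false_positives = false_positive_rate * (1 - prior)   # healthy but flagged
    return true_positives / (true_positives + false_positives)

# sensitivity=0.90 is the "90% accurate" figure from the text.
# prior and false_positive_rate are HYPOTHETICAL values, not real data.
p = posterior(prior=0.01, sensitivity=0.90, false_positive_rate=0.10)
print(f"P(cancer | positive mammogram) = {p:.1%}")
```

With these made-up inputs, most positive results come from healthy women, simply because there are so many more of them. Plug in different numbers and the answer moves a lot, which is exactly why the prior can’t be skipped.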

So far, so good. Bayesian proponents (of whom there are many who are surprisingly fervent) claim Bayesian statistics should also be used to evaluate evidence itself. So, for instance, there was a murder case in Britain in the 1990s. In this case, a mother was accused of murdering her children, because both infants had died in their sleep. Roy Meadow, a pediatrician, calculated that if only about 1 in 8,500 infants died in their sleep, then the chance of it happening twice was that figure squared, or roughly 1 in 73,000,000.

Bayesians were outraged. They said that that was true for just the current probability, but the prior probabilities that had to be considered were: the chance a mother would murder her children (unlikely), the chance of it being completely random that some infants die and others live (very unlikely), and the chance that one environmental or health condition caused the death of two children who lived in the same house (very likely). In other words, Bayes’s theorem wasn’t just used to evaluate the tests, but the evidence. Nobody knew the probabilities of mothers murdering their children, the randomness of infant deaths, or environmental conditions causing two infants’ deaths, but Bayesians refused to let them just be ignored.
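The squaring step only works if the two deaths have nothing to do with each other. Here is a small Python sketch of the difference. The 1 in 8,500 figure is the rounded one from the text (squaring it gives about 1 in 72 million, close to the 73 million quoted). The conditional probability for a second death is a made-up number, there purely to show how sensitive the result is to the independence assumption.

```python
p_one_death = 1 / 8500   # rounded single-death figure quoted in the text

# The pediatrician's step: treat the two deaths as independent and multiply.
p_two_independent = p_one_death ** 2

# HYPOTHETICAL: suppose a shared environmental or health condition makes a
# second death much more likely once a first has happened.
p_second_given_first = 1 / 100   # made-up illustrative value, not real data
p_two_dependent = p_one_death * p_second_given_first

print(f"Assuming independence: 1 in {1 / p_two_independent:,.0f}")
print(f"With the hypothetical shared cause: 1 in {1 / p_two_dependent:,.0f}")
```

Nobody knows the real conditional probability, which is the Bayesians’ point: the headline number depends entirely on an assumption that was never checked.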

You can see their logic. Meadow claimed that almost no children die of natural causes, so the mother shouldn’t be believed when she said two of her children died of natural causes. The equivalent is that almost nobody is born on January 1, so any friend of yours who claims to be born on January 1 is a liar. You also have to take into account the prior probability that someone would lie about their birthday in the first place.

So far, so good. But Bayesians are confident people, and they take these ideas further. To them, you can set up an entire system of science like this. You take a hypothesis, see what the hypothesis entails, and start gathering evidence. If the evidence fits into the hypothesis, it helps confirm the hypothesis. If it doesn’t fit into the hypothesis, it helps disprove the hypothesis. And, of course, you have to consider your prior probabilities.

So let’s go back to the brain parasites. Bayesians gather their evidence, meaning your symptoms, the frequency of brain parasites in Nicaragua, and the frequency of your symptoms among sufferers of brain parasites. If you have a headache and fever, the Bayesians are slightly more confident that you have brain parasites, because they’ve considered the prior probability and know that many people with those symptoms do not have brain parasites. If you don’t have a headache and fever, the Bayesians are very confident that you don’t have brain parasites, because someone with brain parasites should have those symptoms and it was unlikely for you to have brain parasites in the first place.
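Here is how that update looks with the doctor’s own numbers, as a minimal Python sketch. All of the figures come from the Nicaragua example earlier in the essay, and the variable names are just illustrative.

```python
# Figures from the Nicaragua example.
population = 6_000_000
sufferers = 100
sufferers_with_symptoms = 99
people_with_symptoms = 60_000    # 1% of the population

# Belief before hearing anything about symptoms.
prior = sufferers / population

# Belief after learning the patient HAS the headache and fever.
p_given_symptoms = sufferers_with_symptoms / people_with_symptoms

# Belief after learning the patient does NOT have them.
sufferers_without_symptoms = sufferers - sufferers_with_symptoms
people_without_symptoms = population - people_with_symptoms
p_given_no_symptoms = sufferers_without_symptoms / people_without_symptoms

print(f"Prior:                     {prior:.6%}")
print(f"Posterior with symptoms:   {p_given_symptoms:.6%}")
print(f"Posterior without them:    {p_given_no_symptoms:.6%}")
```

The symptoms multiply the probability many times over, yet it stays well under 1%, because the starting point was so low. The absence of symptoms pushes an already tiny number even closer to zero.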

The Bayesians claim that this way they solve the problem of induction. They see themselves as like Popper, careful not to ever say causation is assured, but they believe their method is more rigorous. Forcing people to assign probabilities and likelihoods to evidence means that you actually get a number for how likely a theory is, and the Bayesians see this as a distinct advantage.

Let’s evaluate their method. Popper, for one, hated this idea, although he was an ornery guy in general. He thought that it didn’t matter whether you were somewhat or completely sure that you’d proved causation. Either way, your method of induction is still flawed. You are still assuming that two things happening simultaneously is a reason to believe they have a necessary connection, which Hume told us is never justified.

The Bayesians also, ironically, suffer from the same problem as Popper, in that this is a practically difficult way to do science. Knowing the probabilities of everything is impossible (like Einstein knowing the probability that gravity would bend light), and forcing yourself to assign probabilities when you don’t know them is a recipe for disaster. Bacon warned us of humanity’s tendency to leap from evidence to conclusion, and here the Bayesians tell us that we need to start forming conclusions about the likelihood of things before science begins.

However, even considering these criticisms, it still seems like there’s a place for Bayesian statistics in science. After all, the Nicaraguan brain parasite example was convincing, as was the example of the mother’s conviction. At the very least, it serves as a proper framework for statistics, and for evaluating the likelihood of events. If we have the sort of evidence that the Bayesian framework requires, it’s useful. If we don’t, we shouldn’t force it. And, as the mother’s conviction example shows, we can also use Bayesian statistics to show that other sorts of statistics are inappropriate as well.

However, one advantage of our era of computers is that probabilities aren’t as hard to come by as they used to be. In my field of Geosciences, Monte Carlo simulations were often used to come up with probabilities. So, for instance, a scientist would come up with an equation that he believed described some system, like the chance of rain. Then, using a computer, he’d put in all the possible values for the system into the equation, like temperatures ranging from 0 degrees to 100 degrees. He could then come up with a probability for the weather. If, for instance, the weather has to be 45 degrees for it to rain, and, if it’s 45 degrees, there’s a 50% chance of rain, then all he needs is for someone to tell him the probability of 45 degree weather for him to say the probability of rain. And he got these probabilities just by feeding a bunch of random numbers into his computer, and seeing how often and in what cases rain occurred.
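The toy rain model above is simple enough to simulate in a few lines of Python. This is only a sketch under my own assumptions: temperatures are whole degrees from 0 to 100, every temperature is equally likely, and the function name `estimate_rain_probability` is made up for the example. A real study would draw temperatures from measured weather data instead.

```python
import random

def estimate_rain_probability(trials=200_000, seed=0):
    """Monte Carlo estimate of the chance of rain in the toy model."""
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    rainy_days = 0
    for _ in range(trials):
        # ASSUMPTION: whole-degree temperatures, each equally likely.
        temperature = rng.randint(0, 100)
        # Rule from the text: rain needs 45 degrees, and then it's a coin flip.
        if temperature == 45 and rng.random() < 0.5:
            rainy_days += 1
    return rainy_days / trials

estimate = estimate_rain_probability()
exact = (1 / 101) * 0.5   # 101 possible temperatures, half of 45-degree days rain
print(f"Simulated chance of rain: {estimate:.4%}")
print(f"Exact answer for this toy model: {exact:.4%}")
```

For a model this small you can work out the exact answer by hand, which makes it a handy check that the simulation is doing what you think. The payoff comes when the equation is too messy to solve directly and random sampling is the only practical way to get a probability out of it.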

If you’re computer savvy and interested in predicting things, check out Monte Carlo simulations, and try to figure out how to apply them to Bayesian statistics. The math is pretty easy to do through programs like Excel, but the tricky part is knowing what it means and what it doesn’t. If you’re not computer savvy, though, you can use Bayesian statistics to come up with your own ideas of what’s likely, and avoid being tricked by the statistics of others. After all, as the line Mark Twain made famous goes, “There are three kinds of lies: lies, damned lies, and statistics.”
