Thursday, March 11, 2010

Fooled by Rates and Ratios

In the book Fooled by Randomness, Nassim Nicholas Taleb wrote:
I found in the behavioral literature at least forty damning examples of such acute biases, systematic departures from rational behavior widespread across professions and fields. Below is the account of a well-known test, and an embarrassing one for the medical profession. The following famous quiz was given to medical doctors (which I borrowed from the excellent Deborah Bennett's Randomness).

A test of a disease presents a rate of 5% false positives. The disease strikes 1/1,000 of the population. People are tested at random, regardless of whether they are suspected of having the disease. A patient's test is positive. What is the probability of the patient being stricken with the disease?

Most doctors answered 95%, simply taking into account the fact that the test has a 95% accuracy rate. The answer is the conditional probability that the patient is sick and the test shows it ─ close to 2%. Less than one in five professionals got it right.

I will simplify the answer (using the frequency approach). Assume no false negatives. Consider that out of 1,000 patients who are administered the test, one will be expected to be afflicted with the disease. Out of a population of the remaining 999 healthy patients, the test will identify about 50 with the disease (it is 95% accurate). The correct answer should be that the probability of being afflicted with the disease for someone selected at random who presented a positive test is the following ratio:

Number of afflicted persons / Number of true and false positives, here 1 in 51.

Think of the number of times you will be given a medication that carries damaging side effects for a given disease you were told you had, when you may only have a 2% probability of being afflicted with it!
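As a quick check of the arithmetic quoted above, the short Python sketch below reruns the frequency argument with the stated figures (base rate of 1/1,000, 5% false positives, and, as in Taleb's simplification, no false negatives). The numbers are illustrative only.

# Frequency-approach check of the quiz, assuming no false negatives.
population = 1000            # imagine 1,000 people tested
base_rate = 1 / 1000         # disease strikes 1 in 1,000
false_positive_rate = 0.05   # 5% of healthy people test positive

sick = population * base_rate                    # expected true cases: 1
healthy = population - sick                      # remaining 999 people
true_positives = sick                            # no false negatives assumed
false_positives = healthy * false_positive_rate  # about 50 healthy positives

p_sick_given_positive = true_positives / (true_positives + false_positives)
print(f"P(sick | positive test) = {p_sick_given_positive:.3f}")  # ~0.02, about 1 in 51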

I recall that when I first came across the subject of verification, I often mixed up rates and ratios. I later learned that the false alarm ratio and the false alarm rate are not the same thing, and that the hit rate and the hit ratio are likewise different. In the example above, a "rate of 5% false positives" means the false alarm rate is 5% (equivalently, the test correctly clears 95% of the healthy), whereas what the question is asking about is the hit ratio, the complement of the false alarm ratio. The sketch after this paragraph makes the distinction concrete.
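Here are the four cells of the standard 2x2 contingency table for the quiz, with the scores computed from them (the score names follow common forecast-verification usage; the counts simply plug in the quiz's figures and are illustrative only):

# 2x2 contingency table for the quiz, per 1,000 people tested
# (assuming no false negatives, as in the simplified answer above).
hits = 1.0                  # sick and test positive
false_alarms = 49.95        # healthy but test positive (5% of 999)
misses = 0.0                # sick but test negative
correct_negatives = 949.05  # healthy and test negative (the "null-null" cell)

hit_rate = hits / (hits + misses)                                     # 1.00: every sick person is caught
false_alarm_rate = false_alarms / (false_alarms + correct_negatives)  # 0.05: 5% of healthy people flagged
hit_ratio = hits / (hits + false_alarms)                              # ~0.02: share of positives that are real
false_alarm_ratio = false_alarms / (hits + false_alarms)              # ~0.98: share of positives that are false

print(f"hit rate          = {hit_rate:.2f}")
print(f"false alarm rate  = {false_alarm_rate:.2f}")
print(f"hit ratio         = {hit_ratio:.2f}")
print(f"false alarm ratio = {false_alarm_ratio:.2f}")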

What is more striking is that the false alarm rate and the false alarm ratio, both of which describe how often false alarms occur, can differ enormously when the event in question is rare. In the given example, with an occurrence rate of 1/1,000, a false alarm rate of 5% stands in contrast to a false alarm ratio of over 98%!

The lesson to learn is this: when someone touts the performance of a system with charts and numbers, beware of being misled by a small false alarm rate. It can be leveraged by the large count in the null-null quadrant of the contingency table into a large false alarm ratio.
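A small sketch of that leverage effect: hold the false alarm rate at 5% and shrink the base rate, and the null-null cell swells while the false alarm ratio climbs toward 100% (illustrative numbers again, with no false negatives assumed).

# How a fixed 5% false alarm rate turns into a huge false alarm ratio
# as the event becomes rarer (no false negatives assumed).
false_alarm_rate = 0.05

for base_rate in (1 / 10, 1 / 100, 1 / 1000, 1 / 10000):
    hits = base_rate                                   # per person tested
    false_alarms = (1 - base_rate) * false_alarm_rate  # healthy people wrongly flagged
    false_alarm_ratio = false_alarms / (hits + false_alarms)
    print(f"base rate 1/{round(1 / base_rate):>5}: "
          f"false alarm ratio = {false_alarm_ratio:.1%}")

Running it prints a false alarm ratio of about 31% at a base rate of 1/10, rising to roughly 98% at 1/1,000 and nearly 100% at 1/10,000, even though the false alarm rate never moves from 5%.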

The 2002 Nobel Prize in Economics was notable because one of the laureates, Daniel Kahneman, is not an economist but a psychologist. Kahneman, in collaboration with Amos Tversky, developed Prospect Theory, a psychologically realistic alternative to classical Expected Utility Theory.

In their work, Kahneman and Tversky clarified the kinds of misperceptions of randomness that fuel many common fallacies and biases. What is illustrated in the original posting on this subject is commonly known as the base rate fallacy. In the legal arena, this fallacy, together with related ones, is collectively referred to as the prosecutor's fallacy.