The last word: When numbers deceive
Our understanding of the world around us is profoundly influenced by statistics. Unfortunately, say authors Michael Blastland and Andrew Dilnot, we often have no idea what they really mean.
Counting is easy when you don’t have to count anything with it: 1, 2, 3 … it could go on forever, precise, regular, and blissfully abstract. This is how children learn to count in kindergarten.
But for grown-ups hopeful of putting counting to practical use, it has to lose that innocence. The difference is between counting, which is easy, and counting something, which is anything but.
How many centenarians are there in the United States? Someone must know, don’t they? They just go out and count ’em.
“How old are you?”
“One hundred and one, if I’m a day.”
“Thank you.” And they tick the box.
We often talk of social statistics, especially those that seem as straightforward as age, as if a bureaucrat were poised with a clipboard, peering through every window, counting; or, better still, had some machine to do it for them. The unsurprising truth is that, for many of the statistics we take for granted, there is no such bureaucrat, no machine, no easy count. What is out there, more often than not, is thick strawberry jam, through which someone with a bad back on a tight schedule has to wade—and then try to tell us how many strawberries are in it.
When the U.S. Census Bureau distributed forms asking people how old they were, the answer that came back in 1970 was that there were about 106,000 centenarians in the United States.
Job done? The bureau believes there probably were, in truth, not 106,000 but a little fewer than 5,000. That is, the count was roughly 22 times the estimated reality. And though the 1970 Census produced a particularly inaccurate tally, persistent overcounting of centenarians remains a problem.
What happened? The Census Bureau describes the count of people at older ages as “hounded” by problems, including: “Lack of birth records … low literacy levels … functional and/or cognitive disability … individuals who deliberately misreport their age for a variety of reasons, as well as those who mistakenly report an incorrect age in surveys.”
The raw data shows, in short, a mess. And this is at the relatively simple end of counting, where definitions are clear and everyone knows what they are looking for. If the simple, well-defined facts are not always easy to come by, what happens when the task becomes just a little more complicated?
When mammograms cry wolf
Percentages confuse even the experts. The accuracy of a typical medical test, for instance, is usually expressed as a percentage: “The test is 90 percent reliable.” But it has been found that doctors, no less than patients, are often hopelessly confused when it comes to interpreting what this means in human terms.
Gerd Gigerenzer, a German psychologist, asked two dozen physicians to tell him the chance that a patient truly had breast cancer when her mammogram came back positive. The test, he told them, was 90 percent accurate at spotting those who had the disease and 93 percent accurate at spotting those who did not.
Of the 24 doctors, just two worked out correctly the chance of the patient really having the condition. Most were hopelessly wrong.
Gigerenzer had added one other important piece of information: that the condition affected about 0.8 percent of the 40- to 50-year-old women being tested. Quite a few of the physicians assumed that, since the test was 90 percent accurate, a positive result meant a 90 percent chance of having the condition. In fact, under these assumptions more than nine out of 10 positive tests are false positives, and the women who receive them are in the clear.
To see why, look at the question again, this time expressed in terms that make more human sense: natural frequencies.
Imagine 1,000 women. Typically, eight have cancer, and for them the test, which is fairly but not perfectly accurate, comes back positive in seven cases. The remaining 992 do not have cancer, but remember that the test can be inaccurate for them, too: It wrongly flags about 7 percent of them. Nearly 70 of them will also have a positive result. These are the false positives, people with positive results that are wrong.
Now we can see that there will be about 77 positive results in total (the true positives and the false positives combined), but that only about seven of them will be accurate. This means that for any one woman with a positive test, the chance that it is accurate is low (one in about 11) and not, as most physicians thought, high.
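For readers who want to check the arithmetic, here is a minimal sketch of the natural-frequency calculation in Python. The function and variable names are illustrative assumptions, not anything from Gigerenzer's study; the inputs are the figures quoted above (0.8 percent prevalence, 90 percent accuracy for those with cancer, 93 percent for those without).

```python
def natural_frequencies(population, prevalence, sensitivity, specificity):
    """Return expected true positives, false positives, and the chance
    that a positive result is correct, for a group of the given size."""
    with_condition = population * prevalence
    without_condition = population - with_condition
    # People who have the condition and test positive.
    true_positives = with_condition * sensitivity
    # People who do not have the condition but test positive anyway.
    false_positives = without_condition * (1 - specificity)
    chance_correct = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, chance_correct


tp, fp, chance = natural_frequencies(1000, 0.008, 0.90, 0.93)
print(f"true positives:  {tp:.1f}")    # 7.2
print(f"false positives: {fp:.1f}")    # 69.4
print(f"chance a positive result is right: {chance:.1%}")  # 9.4%
```

Without rounding to whole women, the chance comes out at a little over 9 percent, in line with the "one in about 11" obtained from the rounded figures of seven and 77.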
Taming those gazillions
Numbers, when they’re large enough, simply blow our mental fuses. People often find anything with an “-illion” on the end incomprehensible. They make a useless mental shortcut: “lots of zeros = big.”
But any time a reporter or a politician tells of millions or billions spent, cut, lost, added, saved, it is worth asking, in all innocence: “Is that a big number?”
Millions, billions … if they all sound like the same thing, one useful trick is to imagine those numbers as seconds. A million seconds is about 11.5 days. A billion seconds is nearly 32 years.
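The seconds trick is easy to reproduce. The sketch below assumes an average year of 365.25 days (an assumption of this example, not a figure from the text), and the function names are made up for illustration.

```python
SECONDS_PER_DAY = 60 * 60 * 24
DAYS_PER_YEAR = 365.25  # assumption: average year length, leap years included


def seconds_to_days(seconds):
    """Convert a number of seconds to days."""
    return seconds / SECONDS_PER_DAY


def seconds_to_years(seconds):
    """Convert a number of seconds to years of DAYS_PER_YEAR days."""
    return seconds / (SECONDS_PER_DAY * DAYS_PER_YEAR)


million_days = seconds_to_days(1_000_000)
billion_years = seconds_to_years(1_000_000_000)
print(f"a million seconds is about {million_days:.1f} days")   # 11.6 days
print(f"a billion seconds is about {billion_years:.1f} years")  # 31.7 years
```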
What you usually need, though, is a way to think about a number on a human scale. Often, if you divide a big number by all the people it is supposed to affect, it becomes strangely humble and manageable.
A convenient number to help in this sort of calculation is 15.6 billion (15,600,000,000), which is the U.S. population (300 million) multiplied by 52, the number of weeks in a year. This is about how much the U.S. government needs to spend annually on any program for the program to cost $1 per American citizen per week. Divide any public spending announcement by 15.6 billion to gauge its real size.
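As a rough sketch of that division, the Python below uses the population figure from the text (300 million). The function name is an illustrative assumption, and the $1 billion program is a hypothetical example, not a real announcement.

```python
US_POPULATION = 300_000_000  # the round figure used in the text
WEEKS_PER_YEAR = 52


def dollars_per_person_per_week(annual_spending, population=US_POPULATION):
    """Spread an annual spending figure over every person and every week."""
    return annual_spending / (population * WEEKS_PER_YEAR)


# The benchmark itself: $15.6 billion a year is $1 per person per week.
benchmark = dollars_per_person_per_week(15_600_000_000)
print(f"${benchmark:.2f} per person per week")  # $1.00

# Hypothetical announcement: a program costing $1 billion a year.
example = dollars_per_person_per_week(1_000_000_000)
print(f"${example:.3f} per person per week")  # $0.064, about 6 cents
```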
In defense of bacon
Numbers have an amazing power to put life’s anxieties into proportion: We actually have the ability to measure uncertainty. Yet this power is squandered through an often-needless misalignment of the way people habitually think, and the way risks and uncertainties are typically reported. The news says, “Risk up 42 percent.” All you want to know is, “Does that mean me?”
“Don’t eat bacon,” for instance. That bit of advice comes from the American Institute for Cancer Research. Not “cut down,” or “limit your intake.” The AICR says “avoid” processed meat.
The AICR also says, “Research on processed meat shows cancer risk starts to increase with any portion.” And the institute is right; this is what the research shows. A massive joint report in 2007 by the AICR and the World Cancer Research Fund found that an extra ounce of bacon a day increased the risk of colorectal cancer by 21 percent. A single sausage is just as dangerous.
You will sense there is a “but” coming. The but is that nothing we have said so far gives you the single most essential piece of information: namely, what the risk actually is.
First, let’s look at the way the AICR report does it. On Page 23, the authors acknowledge that the incidence of colorectal cancer in the United States is about 45 per 100,000 for men and about 40 per 100,000 for women. A hundred pages later, readers find the 21 percent increase owing to bacon. None of this is conveniently presented, and media coverage was by and large even worse, usually ignoring the baseline risk altogether.
Fortunately, there is another way to present the report’s featured finding. Here it is:
“About five men in a hundred typically get colorectal cancer in a lifetime. If each of the 100 ate an extra couple of slices of bacon every day, about six would.”
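The step from "risk up 21 percent" to "five becomes six" is a single multiplication. Here is a small sketch, with a made-up function name, using the two figures from the text: a lifetime baseline of about five men in a hundred and a 21 percent relative increase.

```python
def cases_per_hundred(baseline_per_hundred, relative_increase):
    """Apply a relative risk increase to a baseline count per 100 people."""
    return baseline_per_hundred * (1 + relative_increase)


baseline = 5     # about five men in a hundred, per the text
increase = 0.21  # the 21 percent relative increase

with_bacon = cases_per_hundred(baseline, increase)
extra = with_bacon - baseline
print(f"cases per hundred with extra bacon: {with_bacon:.2f}")  # 6.05
print(f"extra cases per hundred: {extra:.2f}")                  # 1.05
```

The same 21 percent applied to a smaller baseline would add correspondingly fewer cases, which is why the baseline matters as much as the percentage.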
The geography of cancer
Chance is a concept we tend to believe we do understand. But chance has a genius for disguise. Frequently it appears in numbers that seem to form a pattern. People feel an overwhelming temptation to deduce that there is more to the events they witness than chance alone. Sometimes we are right. Often, though, we are suckered, and the apparent order merely resembles one.
To see why, take a bag of rice and chuck the contents straight into the air.
Observe the way the rice is scattered on the carpet at your feet. What you have done is create a chance distribution of rice grains. There will be thin patches here, thicker ones there, and every so often a much larger and distinct pile of rice. It has clustered.
Now imagine each grain of rice as a cancer case falling across a map of the United States. Wherever cases of cancer bunch, people demand an explanation. The rice patterns, however, don’t need an explanation. The rice shows that clustering, as the result of chance alone, is to be expected. The truly weird result would be if the rice had spread itself in a smooth, regular layer. Similarly, the genuinely odd pattern of illness would be an even spread of cases across the population.
This analogy draws no moral equivalence between cancer and rice patterns. Sometimes, certainly, a cancer cluster will point to a shared local cause. Often, though, the explanation lies in the complicated and myriad causes of disease, mingled with the complicated and myriad influences on where we choose to live, combined with accidents of timing, all in a collision of endless possibilities that, just like the endless collisions of those flying rice grains, come together to produce a cluster.
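Anyone reluctant to sacrifice a bag of rice can try the experiment in code. The sketch below is an illustration under simple assumptions, not a model of real disease data: it drops 500 "grains" uniformly at random onto a 10-by-10 grid, with every square equally likely, and counts what lands in each one. The grid size, grain count, seed, and function name are all arbitrary choices made for this example.

```python
import random
from collections import Counter


def scatter(grains, rows, cols, seed=None):
    """Drop grains uniformly at random onto a rows-by-cols grid.

    Returns a list of rows, each a list of per-square grain counts.
    """
    rng = random.Random(seed)
    counts = Counter(
        (rng.randrange(rows), rng.randrange(cols)) for _ in range(grains)
    )
    # Counter returns 0 for squares that received no grains.
    return [[counts[(r, c)] for c in range(cols)] for r in range(rows)]


grid = scatter(grains=500, rows=10, cols=10, seed=1)
flat = [n for row in grid for n in row]

print(f"average per square: {sum(flat) / len(flat):.1f}")
print(f"emptiest square: {min(flat)}, fullest square: {max(flat)}")
for row in grid:
    print(" ".join(f"{n:2d}" for n in row))
```

Although every square has the same chance of catching each grain, the printed grid should show thin patches and piles rather than a uniform five per square. Changing the seed moves the piles around but does not make them go away.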
Why numbers matter
Asked about basic facts and figures of American life, individuals are often terribly wrong.
Michael Ranney likes asking questions. Being a professor of education at the University of California, Berkeley, he has plenty of young people around who can act as guinea pigs. For example: For every 1,000 U.S. residents, how many legal immigrants are there each year? How many people, per thousand residents, are incarcerated? For every thousand people, how many computers are there? How many abortions?
Few of us spend our leisure hours looking up and memorizing data. But many of us flatter ourselves that we know about these issues. And yet …
On abortion and immigration, says Ranney, about 80 percent of those questioned base their opinions on inaccurate information. For example, students at one college typically estimated annual legal immigration at about 10 percent of the U.S. population (implying 30 million legal immigrants every year). Nonstudents guessed even higher. The actual rate in 2006 was about 0.3 percent. That is, even the lower estimates were more than 30 times too high.
The students’ estimates for the number of abortions varied widely, but the middle of the range was about 5,000 for every million live births. The actual figure in the United States in 2006 was 335,000 per million live births—67 times higher than the typical estimate.
If anyone conversing with you made the numerically equivalent mistake of telling you they were at least 370 feet tall, it’s unlikely you’d take seriously their views on any subject.
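The size of these misses is easy to verify. In the sketch below, the estimates and actual figures are those quoted above; the 5.5-foot height is an assumption made for this illustration (the text does not state one), and the function name is invented.

```python
def times_off(estimate, actual):
    """How many times larger the bigger figure is than the smaller one."""
    return max(estimate, actual) / min(estimate, actual)


# Abortions per million live births: typical guess vs. 2006 figure.
abortion_factor = times_off(5_000, 335_000)
# Annual legal immigration as a percentage of population: guess vs. 2006 rate.
immigration_factor = times_off(10, 0.3)

print(f"abortion estimate off by a factor of {abortion_factor:.0f}")        # 67
print(f"immigration estimate off by a factor of {immigration_factor:.0f}")  # 33

ASSUMED_HEIGHT_FEET = 5.5  # assumption for illustration, not from the text
claimed_height = ASSUMED_HEIGHT_FEET * abortion_factor
print(f"equivalent height claim: about {claimed_height:.0f} feet")
```

With that assumed height, the equivalent claim lands just under 370 feet.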
The good news: Many respondents found the correct answers so surprising that they adjusted their political views on the spot.
From The Numbers Game by Michael Blastland and Andrew Dilnot. ©2009 by Michael Blastland and Andrew Dilnot. Published by arrangement with Gotham Books, a member of Penguin Group