You've probably heard the saying, attributed to Mark Twain in the United States, to Benjamin Disraeli in England, and to numerous others in various languages around the world, that there are three kinds of lies: lies, damned lies, and statistics. Of course, Mark Twain never said half of the things that are attributed to him. We just like to associate popular sentiments with famous figures. In this case, the popular sentiment reveals a deep distrust of statistics.
In fact, the problem is not really with statistics themselves, but with the way we use and understand them. By themselves, statistics mean nothing: they serve merely as evidence to support a claim. And, as with all evidence, we must evaluate both the accuracy and the application of all statistics. In other words, we need to ask whether a statistic is true, and whether it supports the argument.
Unfortunately, many people are persuaded by the mere use of statistics. Consider your own reactions: what if we had written above, "Mark Twain never said 50% of the things that are attributed to him." "50%" means exactly the same thing as "half," yet using a number instead of a word may make the statement sound more authoritative or more definite. Anyway, who could possibly know how many sayings are attributed to Mark Twain, in order to derive a ratio such as this? In this case, we understand "half" as implying simply that "a lot" of the things attributed to Mark Twain did not originate with him; but the apparent concreteness of "50%" suggests there must be more evidence behind that statistic. Of course, there isn't. Statistics, like all evidence, can be erroneous, misrepresented, manufactured, and ambiguous. The evaluation of statistics, then, begins with an understanding of what statistics are and how they are generated.
The term statistics refers to quantitative data, that is, information that has been measured or calculated, and can be expressed as a numerical value. As such, statistics can sometimes bring a sense of clarity to very complex problems. But that clarity often comes at the cost of oversimplification, and in order to guard against this there are some basic questions you should ask whenever dealing with statistics.
1. Is the statistic absolute or relative?
An absolute statistic is one that gives the total number; a relative statistic presents that information in terms of some other kind of reference. Consider, for example, the following statements:
- There were 117 deaths attributed to drunken driving in the county last year.
- Every three days last year, someone died due to drunk driving.
- Last year, 10% fewer people were killed in the county by drunk drivers.
The first statement uses an absolute statistic; the other two employ relative ones. The second relates the death rate from drunk driving to the number of days in a year, the third to the rate from the previous year. Because each of these usages contributes slightly different information, they are all valuable expressions of the same statistic.
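The relationships among these three expressions can be checked with a few lines of arithmetic. The sketch below assumes a hypothetical prior-year total of 130 deaths, a figure not given above, chosen only so that the 10% decrease works out:

```python
# Three expressions of the same statistic.
deaths_this_year = 117    # absolute: total deaths last year
deaths_prior_year = 130   # hypothetical prior-year total (not in the text)

# Relative to the calendar: roughly one death every three days.
days_per_death = 365 / deaths_this_year
print(round(days_per_death, 1))  # → 3.1

# Relative to the previous year: a 10% decrease.
percent_change = (deaths_this_year - deaths_prior_year) / deaths_prior_year * 100
print(round(percent_change))  # → -10
```

Note that the "every three days" phrasing rounds 3.1 days down; the relative expression sounds tidier than the underlying number actually is.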
The problem is that we are often given only one sort of statistic--the one that is most favorable to the position of the person or group making the argument. Consider these very different statements:
- We have to do something about the crime rate! It doubled last year.
- Crime remains low here. There were only two burglaries committed last year.
- Crime is out of hand. Last year, everyone was directly affected by a serious crime.
Each of these claims is based on a single statistic: the first and third are relative figures, the second absolute. But a little more information might significantly affect our reaction to these statistics. What if, for example, the area being discussed had only two residents: the previous year, there was only one crime, a burglary; this year, there were two crimes, both burglaries; and both residents were burglarized. "Doubling the crime rate," a relative expression, seems like a lot, until you find out that in absolute terms crime only went from one incident to two. And the absolute number of "two burglaries" sounds low, until you find out that it means everyone in the area was victimized.
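The two-resident town reduces to a few lines of arithmetic, which makes the gap between the three presentations plain:

```python
# The same two facts, expressed three ways.
population = 2
crimes_last_year = 1
crimes_this_year = 2

relative_change = crimes_this_year / crimes_last_year   # "crime doubled"
absolute_count = crimes_this_year                       # "only two burglaries"
share_affected = crimes_this_year / population          # "everyone was affected"

print(relative_change)  # → 2.0
print(absolute_count)   # → 2
print(share_affected)   # → 1.0
```

All three numbers are true; which one gets quoted depends entirely on the arguer's purpose.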
The examples above illustrate how statistics can be presented in very misleading ways. Without further information, we might have accepted either of the first two as convincing evidence of the crime rate. The presentation of either relative or absolute statistics on their own should certainly increase your skepticism; in that case, it is important to imagine how more information or another kind of presentation might affect the slant. But important details can be left out even when a statistic is given both relative and absolute expressions, so you should not let your guard down even when a figure is given multiple presentations.
2. Is the statistic individual or collective?
An easier problem to deal with is whether the statistic given is an average or a cumulative total--that is, whether it applies to each individual in a group, or to the group collectively. Consider these claims:
- Students put in more hours of study than their professors.
- Fans of professional sports earn more money than the players do.
- Children under 16 earn millions of dollars annually.
In each of these examples, the statistic offered is collective. Since there are twenty times more students than professors, it is not hard to believe that students as a group put in more hours than their professors do. And, although professional athletes earn high average salaries, since there are thousands of fans for each player, it should be obvious that the fans as a group earn more. Finally, while it is rare for an individual child under 16 to earn a million dollars, children as a group no doubt earn far in excess of that amount.
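The fans-versus-players claim can be made concrete with a quick calculation. All of the round numbers below are hypothetical, invented only to illustrate how an individual comparison and a collective comparison can point in opposite directions:

```python
# Hypothetical round numbers -- none of these figures appear in the text.
players = 700                   # players in a league
avg_player_salary = 2_000_000   # high individual earnings
fans = players * 5_000          # thousands of fans per player
avg_fan_income = 40_000         # modest individual earnings

# Individually, each player out-earns each fan.
print(avg_player_salary > avg_fan_income)                    # → True

# Collectively, the fans out-earn the players by a wide margin.
print(fans * avg_fan_income > players * avg_player_salary)   # → True
```

Both comparisons are true at once; the claim only misleads if a collective figure is read as if it applied to individuals.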
As with relative and absolute statistics, the best solution for any problem with individual or collective figures is to insist on as much information as possible. If you are unsure about whether a statistic is to be applied individually or collectively, and that statistic is of central importance to the strength of an argument, it is probably best to suspend judgment.
3. How were the statistics generated?
Statistics can be generated in two ways: by enumeration and by estimation. Enumeration requires a direct counting or measuring of the entire subject, while estimation studies only part of the subject and then approximates what the results would have been had they been based on the whole. Obviously, enumeration produces more reliable figures, but it is often impractical or impossible when the subject is extremely large, diverse, or distant.
If, for example, we wanted to know the average number of credits earned by a student at San Jose State University last semester, we might simply have the computer that keeps track of such data calculate the result. Since that result would be an average based on the record of every student at SJSU during the semester, it would be an enumeration.
But what if the computer were not available? We might try asking each and every student, but with nearly 30,000 students this enumeration would be a long and expensive task, especially since students from last semester who are no longer attending SJSU would need to be identified, located, and contacted off campus. Instead, we might try a process of estimation known as sampling. Rather than ask every student, we could ask a representative sample of the students who attended last semester, and then estimate the accuracy of our results when applied to the whole, based on the size and representativeness of our sample. So, if we sampled a group of 1,000 students who were representative of the whole in every way we could imagine, including major, year in school, age, gender, race, marital status, number of children, weekly hours of work, amount of financial aid, and so on, we could assume that our results would closely approximate the results we might have obtained by enumerating the entire student body.
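The enumeration-versus-sampling contrast can be simulated. The sketch below invents a hypothetical population of 30,000 credit loads (simple random sampling stands in for the carefully matched sample described above), then compares the enumerated average against an estimate from 1,000 of them:

```python
import random

random.seed(42)  # fixed seed so the simulation is repeatable

# Hypothetical population: a credit load for each of 30,000 students.
# These values are invented for illustration only.
population = [random.choice([3, 6, 9, 12, 15]) for _ in range(30_000)]

# Enumeration: average over every student's record.
true_mean = sum(population) / len(population)

# Estimation: average over a simple random sample of 1,000 students.
sample = random.sample(population, 1_000)
sample_mean = sum(sample) / len(sample)

print(round(true_mean, 2), round(sample_mean, 2))
# The two means typically differ by only a fraction of a credit.
```

The sample answers the same question at a thirtieth of the cost, at the price of a small, quantifiable risk of error, which is the subject of the next question.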
4. If an estimation, what is the margin of error?
Whenever we rely on a sampling technique to create an estimate, we take the risk that our results do not accurately represent the whole. Ways to minimize these risks will be covered in a later section, on surveys and experiments, but nothing can make those risks disappear entirely. They are usually expressed as a range, and that range is called the margin of error or the confidence interval. In practice, this means that if the sampling were repeated, the results would fall within the margin of error 95% of the time.
Let's imagine a poll of 1,000 likely voters showing 54% supporting Clinton, 46% supporting Dole, and a margin of error of plus or minus 3%. That means we can be 95% sure that Clinton is favored by 51-57% of likely voters, and Dole by 43-49%. Notice that we cannot be sure where within the margin of error the true figure lies: it is just as likely that Clinton is leading 57% to Dole's 43%, or that Dole is just two points back, at 49% to Clinton's 51%, as it is that the initial figures of 54% to 46% are accurate. In fact, all we can really conclude is that, if the difference between two results is greater than twice the margin of error--so that the two ranges cannot overlap--then that difference is statistically significant; if it is not, no matter how large it may appear, then it is statistically insignificant.
The most important factor in figuring the margin of error is the sample size: as the size goes up, the margin of error goes down. Typically, a sample of 500 produces a margin of error of about plus or minus 5%; doubling the sample to 1,000 knocks a couple of percentage points off the margin of error; a sample of only 100 puts the margin of error at roughly plus or minus 10%. Had our sample in the Clinton-Dole poll above been smaller, the margin of error would have been larger. At a margin of error of plus or minus 4%, the difference between 54% and 46% becomes statistically insignificant, and we cannot say with 95% confidence who is leading.
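The rough figures above follow from a standard approximation for the 95% margin of error of a sampled proportion near 50%, namely 1.96 times the standard error. This formula is a textbook simplification, not something given in the text itself:

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion p in a sample of n."""
    return z * math.sqrt(p * (1 - p) / n)

for n in (100, 500, 1000):
    print(n, round(margin_of_error(n) * 100, 1))
# 100  → about ±9.8 points
# 500  → about ±4.4
# 1000 → about ±3.1
```

The inverse-square-root relationship explains why shrinking the margin of error gets expensive: halving it requires roughly quadrupling the sample.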
It is also important to remember that the margin of error is not based on the whole sample, but on the number of individuals in a group whose answers are being considered together. Thus, in the Clinton-Dole example above, if exactly half the 1000-person sample were women and we were looking at the difference between the way men and women were voting, then the size of the sample group would be 500 for men and 500 for women, and the margin of error would go up accordingly.
Too often, those using surveys and estimations fail to give the margin of error or ignore its significance. In such cases, it will be up to you to determine whether to consider the information at all, or to suspend judgment on the argument because the evidence has been presented in an unreliable form.
5. Were the statistics generated by a survey or an experiment?