Statistical Studies and Experiments
A. Polls, Studies and Experiments: Sampling Phase
As we saw in the section on statistics, statistical support for claims can be generated either by enumeration (counting each instance in an entire population), or by estimation (counting each instance in only a subset of the entire population). Though enumeration is in principle the more accurate of the two, it is far from perfect, especially as the size of the population increases. In the United States, the best known example of enumeration in a large population is probably the national census, which takes place every decade, and which is usually considered to be full of errors. Since, however, the Constitution specifically requires enumeration, and since a more accurate count might have political consequences (for example, representation in the U.S. House of Representatives is based on the census), we continue to use the very expensive and time-consuming method of enumeration. It is possible that we could get just as accurate a picture of the U.S. population more quickly and cost-effectively through estimation, but the assumptions involved in devising that estimation would be even more subject to political pressure and influence.
When large populations are involved, estimation is usually employed. (Even the U.S. Census, which by law counts the population by enumeration, uses estimation to produce most of its analysis of American society.) Estimation involves two stages: first, selection of the group or population to study, and then the investigation itself, which involves collection and analysis of information.
The selection stage is roughly equivalent in all forms of estimation. The process of selection is, first, to identify the group or population that the estimation will describe (the target), and then to select from that target a smaller but representative group (the sample). In theory, if the sample is fully representative of the target, then what is true of the sample is true of the target. Unfortunately, a fully representative sample can be obtained in only a few situations, such as checking the specifications of a mass-produced engine part. As a result, researchers have devised methods for selecting samples sufficiently representative of the target to make estimations about it. These methods can be divided into two types:
We have seen that one of the central areas of concern for any estimation is the way in which the sample is selected. The next section will continue this discussion, by distinguishing between polls, studies and experiments, and by looking at how the results of estimations should be interpreted.
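The idea that what is true of a representative sample is approximately true of the target can be illustrated with a short simulation. This is only a sketch under stated assumptions: the target population, its 40% share, the sample size of 1,000, and the function name sample_share are all hypothetical, and simple random selection is used here as one common way of seeking a representative sample, not as the only one.

```python
import random

def sample_share(population, sample_size, seed=0):
    """Draw a simple random sample (without replacement) and return
    the share of True values found in it."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    sample = rng.sample(population, sample_size)
    return sum(sample) / sample_size

# Hypothetical target: 100,000 people, 40% of whom hold some opinion.
target = [True] * 40_000 + [False] * 60_000
target_share = sum(target) / len(target)   # enumeration: count everyone

# Estimation: count only 1,000 randomly chosen members of the target.
estimate = sample_share(target, 1_000)

print(f"target share: {target_share:.3f}, sample estimate: {estimate:.3f}")
```

The estimate will usually land close to the true share without matching it exactly, which is why the results of estimation are reported with a margin of error.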
B. Polls, Studies and Experiments: Investigation Phase
The investigation phase of estimation can take one of three forms: polls, studies, or experiments. As you read in the last section, all three of these begin with a rather similar process of identifying the target population, and then selecting a representative sample from that target. After that, polls, studies, and experiments become quite different.
The differences between polls, studies, and experiments are easy to spot.
In experiments, the researchers themselves actively control something related to the sample group, either by introducing it where there was none before, or by removing it where it once existed. Experiments always move from cause to effect, by manipulating the suspected cause, and then gathering data about the results of that manipulation.
Generally speaking, samples in experiments tend to be smaller than in other forms of estimation, and each sample is divided further into at least two groups: the experimental group and the control group. The manipulation of the suspected cause occurs only in the experimental group. Because it is difficult to trace effects to a single cause, it is important to have a second group, the control group, which is statistically similar to the experimental group, and which undergoes all the experiences of the experimental group except the introduction or removal of the single cause under study.
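One common way to obtain two statistically similar groups is to assign subjects to them at random. The text above does not say how the groups are formed, so random assignment is an assumption of this sketch, as are the function name assign_groups and the subject labels.

```python
import random

def assign_groups(subjects, seed=0):
    """Randomly split subjects into an experimental group and a
    control group of (nearly) equal size."""
    rng = random.Random(seed)      # fixed seed so the split is repeatable
    shuffled = list(subjects)      # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical sample of 20 subjects.
subjects = [f"subject-{i}" for i in range(20)]
experimental, control = assign_groups(subjects)

# Only the experimental group receives the suspected cause;
# the control group shares every other experience.
print("experimental:", experimental)
print("control:", control)
```

Because chance alone decides who lands in which group, neither the researchers nor the subjects can tilt one group toward a particular outcome.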
In a medical experiment, for example, researchers know that some patients respond favorably to any medication, at least at first. So when the experimental group is given a pill containing the drug under study, the control group is often given a harmless sugar pill, with no active ingredient, in order to simulate the taking of medication as it occurs in the experimental group. Those sugar pills are known as placebos, a term that can be applied generally to any neutral activity or stimulus introduced in the control group for the sake of reproducing the experiences of the experimental group. The tendency of subjects to respond favorably to any treatment, including sugar pills, is known as the placebo effect.
In studies, the researchers only passively collect data, whether they record the data from their own observations or analyze existing records. On one hand, because they do not involve the active control of a suspected cause, studies can only show correlation, never causation. On the other, studies have the flexibility of moving from the effect back to its cause, as well as from the cause forward to its effect.
Studies, then, are largely statistical analysis. They do not have the component of direct manipulation, as do experiments, and they usually do not need to rely on the statements of individuals, as do polls. Depending on the design of the study, a control group--that is, a second sample group similar to the first but missing the factor under study--may be used in order to help strengthen the causal arguments, which will be discussed briefly below.
In polls, researchers rely on what people say, rather than studying a phenomenon itself. Polls are the most common type of estimation, and require the least amount of investigative effort because, once the sample is chosen, pollsters simply ask that sample questions and record the responses. Unlike experiments and studies, polls can only be conducted on human populations, since only humans can communicate their responses. (Exceptions such as signing apes, talking parrots, and clicking cetaceans suggest that polls may be done among non-human species in the future, but not yet.) Unfortunately, polls must rely on the veracity of their subjects--and humans are notorious liars, especially on subjects of enough consequence to warrant study, such as sexual practices, food consumption, voting preferences, spending habits, and so on.
Sometimes, polls do not seem to identify a correlation. Asking likely voters whom they favor, for example, does not appear to involve correlation or causation. However, pollsters are usually looking for patterns that associate the relevant qualities of their adjusted sample (such things as age, race or ethnicity, gender, education, party affiliation, occupation, income, and so on) with the results of their poll.
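The kind of pattern-seeking described above can be sketched as a simple tabulation of poll answers by one quality of the sample. The responses, the age groups, the candidate labels "A" and "B", and the function name support_by_group are all invented for illustration; a real poll would involve far more respondents.

```python
from collections import Counter

# Hypothetical poll responses: (age group, preferred candidate).
responses = [
    ("18-34", "A"), ("18-34", "A"), ("18-34", "B"),
    ("35-64", "A"), ("35-64", "B"), ("35-64", "B"),
    ("65+", "B"), ("65+", "B"), ("65+", "A"),
]

def support_by_group(responses, candidate):
    """Return the share of each group that favors the given candidate."""
    totals = Counter(group for group, _ in responses)
    favoring = Counter(group for group, choice in responses
                       if choice == candidate)
    return {group: favoring[group] / totals[group] for group in totals}

shares = support_by_group(responses, "A")
for group, share in shares.items():
    print(f"{group}: {share:.0%} favor candidate A")
```

If support differs noticeably from one group to the next, the pollster has found an association between that quality and the poll result, which is a correlation and not yet a demonstrated cause.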
Since all forms of estimation are usually looking to show either correlation or causation, they all employ causal reasoning, such as you read about in the section by that name at the beginning of Part 3. Arguing that one factor is the difference or the commonality between sample groups that show a particular outcome is the whole purpose of estimation; and usually both forms of causal reasoning (difference and commonality) need to be employed in order to demonstrate the causation or correlation convincingly.
Polls, studies, and experiments usually produce results that have, at best, a 95% chance of being repeated if the estimation were run again. In addition, as you read in the section on statistics, the results of all estimations are limited by a factor called the "margin of error," which depends largely on the size of the sample used. The results of estimations cannot be precise, but must be expressed within the range of the margin of error. If, for example, George W. Bush received 49% support in a Florida opinion poll during the election, and the opinion poll has a margin of error of plus or minus 3 percentage points, then we can be 95% sure that Bush's actual support at that time in Florida was somewhere between 46% (49-3) and 52% (49+3). Note that the 49% figure is only the center of that range: Bush's actual support could plausibly have been anywhere between 46% and 52%, although figures near 49% are more likely than figures near the edges of the range.
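The link between sample size and margin of error can be made concrete with the standard normal-approximation formula for a proportion from a simple random sample, z * sqrt(p * (1 - p) / n), where z is about 1.96 at the 95% level. This is a sketch under assumptions: the text does not say how many people were polled, so the sample sizes of 1,067 and 4,268 are hypothetical, and real polls often adjust the formula for their sampling design.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate margin of error for a proportion p observed in a
    simple random sample of size n (z = 1.96 for the 95% level)."""
    return z * math.sqrt(p * (1 - p) / n)

def poll_interval(p, n, z=1.96):
    """Return the (low, high) range implied by the margin of error."""
    moe = margin_of_error(p, n, z)
    return p - moe, p + moe

# 49% support; a hypothetical sample of 1,067 respondents.
moe = margin_of_error(0.49, 1067)
low, high = poll_interval(0.49, 1067)
print(f"margin of error: +/- {moe * 100:.1f} points")
print(f"range: {low * 100:.1f}% to {high * 100:.1f}%")

# Quadrupling the (hypothetical) sample size halves the margin of error.
moe_large = margin_of_error(0.49, 4268)
print(f"with 4,268 respondents: +/- {moe_large * 100:.1f} points")
```

A sample of roughly a thousand respondents is enough to produce the plus or minus 3 points used in the example, and narrowing that margin requires a disproportionately larger sample.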
Also note that, to be statistically significant, the difference between two results (say, the support for Bush and John Kerry) must be large enough that their ranges do not overlap; as a rule of thumb, it must exceed twice the margin of error. If, in the same poll, Kerry received support from 46% of likely voters and Bush from 49%, and if that poll had a margin of error of plus or minus 3 percentage points, then Kerry's result should actually be reported as falling between 43% and 49%, while Bush's should fall between 46% and 52%. Because these ranges overlap, and despite the apparent 3-point lead that Bush seemed to enjoy, we must conclude that the difference between Bush's 49% and Kerry's 46% is not statistically significant.
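The comparison in this example can be sketched as a simple overlap check on the two ranges. The function names poll_range and ranges_overlap are illustrative, figures are kept in whole percentage points, and the second comparison (53% against 46%) is a made-up contrast case. Checking for overlap follows the reasoning of the example above; it is a cautious rule of thumb, not a formal significance test.

```python
def poll_range(share, margin):
    """Return the (low, high) range for a result, in percentage points."""
    return share - margin, share + margin

def ranges_overlap(a, b):
    """True if the two (low, high) ranges share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]

bush = poll_range(49, 3)    # (46, 52)
kerry = poll_range(46, 3)   # (43, 49)

# The ranges share the values from 46 to 49, so the 3-point lead
# is not treated as statistically significant.
print(ranges_overlap(bush, kerry))    # True

# Hypothetical contrast: a 7-point lead with the same margin of error.
leader = poll_range(53, 3)  # (50, 56)
print(ranges_overlap(leader, kerry))  # False
```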