ABUSING STATISTICS FOR POLITICS AND PROFIT
Beware of these abusive techniques.
There are many statistical functions that can be used to compare values. Unfortunately, there are more ways to use them wrong
than to use them right. Statistics can be abused in four general ways:
- Bad data collection
- Bad application of statistical functions
- Forming the wrong conclusion
- Sneaky tricks to mislead the unwary
Here is a list of the wrong things to do in statistics, with some horrific examples of how politicians and
"scientists" have used them to mislead people. The list is sorted according to the four categories above:
Bad data collection
- Sampling the wrong population: In a study on whether a gene causes
homosexuality, one research group studied 15 homosexuals to see if they had a certain gene. They did NOT study anyone who
is not homosexual, so they had nothing to compare their results with.
- Lack of a control group: This is also true in the case above.
- Nonrandom sampling methods: Placing a survey form in a magazine that is read mostly by liberals will
tend to elicit a liberal viewpoint, not an average viewpoint.
- Not checking for other variables acting on the sample: The statistic "deaths caused by automobile
airbags in the years 1990-1998" fails to reveal that most of them happened in 1997 and 1998. The number of airbags
in existence (a confounding variable) was increasing during that period.
- Not collecting a large enough sample: In the homosexual-genetics study above, they studied only 15
cases. This made the margin of error so large that the results would have been meaningless even if everything else had
been done correctly.
- Not noting whether the sampling design allows an already sampled item to be sampled again (sampling with
or without replacement): Different formulas must be used for each design.
- Using experimental conditions that change the nature of the system being tested: Megadosing lab
rats may cause metabolic changes that are not present with smaller doses over longer periods.
- Creating false data to replace data that were lost: In one suspected-carcinogen case, most of the
lab rats were accidentally killed by fumigators spraying for bugs. Not wanting to lose a semester on his thesis, the
research student obtained new rats to replace the dead ones (for display only), and faked data for the new rats based on
data collected from the remaining original rats. He did not revise his error estimate to reflect the smaller number of
rats. Unfortunately, the fumigation pesticide had induced tumors in the surviving original rats, and a perfectly safe
product was removed from the market as a result.
- Discarding samples that don't fit the desired theory: Other laboratories were unable to duplicate
the results of a study that concluded that 60 Hz power line radiation causes cell changes. The "researcher"
had discarded all of the data that didn't fit his theory, leaving only about a tenth of the data he had collected.
- Reporting more precision than is available: Proponents of the "global warming" theory
state that average temperatures have risen by half a degree since temperature readings were first recorded. The
problem with this is that early thermometers had an accuracy of plus or minus one degree or worse.
- Changing the collection method in the middle of the experiment: Changes made in the methods used
to obtain data may affect a time study.
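To make the sample-size point concrete, here is a rough Python sketch (mine, not the original author's) of the margin of error on the 4-of-15 result discussed above, using the standard normal approximation for a sample proportion:

```python
import math

# Figures from the gene study discussed in this article.
n = 15          # subjects tested
p = 4 / 15      # proportion found to carry the gene (about 26.7%)

# Standard error of a sample proportion, and a 95% confidence interval
# using the normal approximation (z = 1.96). With n this small the
# approximation itself is shaky, which only strengthens the point.
se = math.sqrt(p * (1 - p) / n)
margin = 1.96 * se

print(f"proportion: {p:.1%}")
print(f"95% margin of error: +/- {margin:.1%}")
# The margin comes out to roughly +/- 22 percentage points -- the true
# rate could plausibly be anywhere from about 4% to 49%.
```

With an interval that wide, the reported "26 percent" carries essentially no information.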
Bad application of statistical functions
- Using the wrong statistic: The nature of the population under study and the sampling method used
must be studied very carefully to select the proper parameter or statistic to use when evaluating the data. In the
homosexual-genetics study cited above, they should have tested for a correlation between homosexuality and the presence
of the gene. Since they had no data on non-homosexuals, they could not very well do this. Therefore, they just reported
the percentage of homosexuals who had the gene (which by itself means nothing -- it could just as well have been an
eye-color gene).
- Using the population standard deviation instead of the sample one: This is probably the most oft-repeated
mistake, because the sample standard deviation is harder to calculate. Using the wrong one makes the calculated margin
of error smaller than it should be.
- Using the wrong error calculating function: The error function must be carefully chosen to match
the parameter or statistic used.
- Using the wrong averaging method: The averaging method must be carefully chosen to match the type
of data used. For example, velocities cannot be averaged using an arithmetic mean. Here is an example of an attempt to
qualify a race car:
By rule, the driver must make an average speed of 60 mi/hr over four laps on a one-mile oval in order to qualify
for the race. He killed his engine, and averaged only 30 mi/hr on his first lap. He averaged 60 mi/hr on the second.
Then he floorboarded the car and averaged 90 mi/hr on the last two laps. He used his calculator to average his lap
speeds using the arithmetic mean, and got 67.5 mi/hr. He thought he was in. But the official (who knew his math)
said he did not make the race. Here's why:
                   FIRST LAP    SECOND LAP   THIRD LAP    FOURTH LAP
   Distance        1 mi         1 mi         1 mi         1 mi
   Time            2 min        1 min        2/3 min      2/3 min
   Speed           30 mi/hr     60 mi/hr     90 mi/hr     90 mi/hr

   Arithmetic mean of the speeds:          67.5 mi/hr
   Geometric mean of the speeds:           61.8 mi/hr
   Harmonic mean of the speeds:            55.4 mi/hr
   Sum of parts (distance / total time):   55.4 mi/hr
In order to qualify, he had to average 60 mi/hr. That means he had to go four miles in less than four minutes, since
it takes one minute to go one mile at 60 mi/hr. He took 3 and 2/3 minutes to finish the first 3 laps. That meant he had
to go 180 mi/hr on the last lap to make the race.
The sum of the distances must be divided by the total time taken to traverse them, instead of relying on a formula
applied to the speeds alone. Notice that for the arithmetic mean to work here, the time durations of all of the speeds
must be the same, and for the harmonic mean to work, the distances for all of the speeds must be the same (as they were
in this special case).
- Comparing samples with different characteristics as though they were not different: Assuming that
accident rates before a particular hazard is publicized are the same as those after publication can cause wrong results
when comparing different kinds of accidents to each other.
- Misunderstanding or misapplying a formula: The 55 mi/hr speed limit was the result of misapplying
Power = Force X Velocity.
It applies only during the acceleration of a car up to speed, not while the car is moving at a steady speed.
- Assuming that an effect is linear: Many economists assume that the effect on the economy of
increasing the tax rate is linear. But the effect is really more like subtracting the tax rate from 1 and squaring the
result.
- Bad mathematics: In the homosexual-genetics study cited above, the "scientists" couldn't
even get their mathematics right. 26 percent is not an even multiple of 1/15. They should have either reported 26.7
percent, or rounded it to 27 percent. The ignorance of even the most basic principles of scientific method shows
throughout the procedures and calculations they performed.
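The race-car example above can be recomputed in a short Python sketch (mine, not part of the original article) to show how far the averaging methods diverge:

```python
import math

# Lap speeds from the qualifying example; each lap covers the same 1-mile distance.
speeds = [30, 60, 90, 90]                           # mi/hr
distances = [1, 1, 1, 1]                            # miles
times = [d / s for d, s in zip(distances, speeds)]  # hours per lap

arithmetic = sum(speeds) / len(speeds)
geometric = math.prod(speeds) ** (1 / len(speeds))
harmonic = len(speeds) / sum(1 / s for s in speeds)
true_average = sum(distances) / sum(times)          # total distance / total time

print(f"arithmetic mean: {arithmetic:.1f} mi/hr")   # 67.5 -- what the driver computed
print(f"geometric mean:  {geometric:.1f} mi/hr")    # 61.8
print(f"harmonic mean:   {harmonic:.1f} mi/hr")     # 55.4
print(f"distance/time:   {true_average:.1f} mi/hr") # 55.4 -- the real average speed
```

Because every lap covers the same distance, the harmonic mean of the speeds agrees exactly with total distance divided by total time, which is why the last two rows of the table match.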
Forming the wrong conclusion
- Forgetting to compare the experimental value to the control value: In the homosexual-genetics
study above, they obtained a result that "26 percent" of the homosexuals tested had the gene. Without
comparing their result to a control group, they "concluded" that the gene causes homosexuality.
- Failing to consider the margin of error: In the homosexual-genetics study, no margin of error was
reported at all.
- Confusing correlation with causality: Because the incidence of the disease rickets was strongly
correlated to being in certain families, early scientists concluded that the disease was hereditary. Later it was
shown that rickets was the result of malnutrition, and that poverty, not rickets, was inherited.
In the homosexual-genetics study above, they concluded that the gene "caused homosexuality" when they
had not even demonstrated a correlation between the presence of the gene and the presence of homosexuality.
- Using faulty logic to reach conclusions: In the homosexual-genetics study above, they deviated
from sound logic in several ways:
(Some might think that I keep pointing to this study because of a supposed hatred for homosexuals. NOT SO! I keep
citing the study because there was so much wrong with the "scientific" methods used. It is
the proverbial horrid example of science done wrong.)
- Even though only 26 percent (sic) of the subjects had the gene, they concluded that it "caused"
homosexuality. So how do they explain the other 74 percent?
- They didn't even think about the rate of occurrence of the gene in the heterosexual population. It might be
26 percent too.
- 25 percent is the expected value for a randomly distributed recessive gene. 4/15 (26.7%) is as close as one can
get to 25 percent with a sample size of 15.
- Affirming the Consequent: Those saying Global Warming is real use the effects that Global Warming
is expected to produce to "prove" its presence. But they ignore the
fact that those effects might have other causes besides Global Warming.
In addition, their conclusion that man is causing Global Warming is also affirming the consequent. There could
be other causes, including an increase in the output of the sun (demonstrated by the presence of Global Warming on
Venus, Mars, Europa, and Titan).
- False Premise: Some UFO investigators used the following logic to show that a particular UFO
sighting was unidentified:
"The object was seen high in the sky, moving toward the east. The weather bureau said that they had
launched a balloon fifteen minutes before the sighting. Although the object appeared similar to a balloon, and
the motion was characteristic of a balloon, it could not have been the balloon. The weather bureau also stated that
there was a west wind. If the object was a balloon, it was heading directly into the wind."
What's wrong with this conclusion? The premise that the west wind would make the object move west is false. By
the definition that weathermen use, a west wind comes from the west. It doesn't blow toward the
west. The object was probably the weather balloon.
- Confusing a curative effect with removing the cause: Because removing salt from the diet has a
curative effect on some cases of high blood pressure, some people (including some government regulators) jumped to
the conclusion that salt causes high blood pressure. Actually, removing salt abnormally thins the blood, allowing it
to flow through constricted spaces more easily.
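The missing baseline comparison in the gene study can be illustrated with a short Python sketch (mine, not the original author's): if the gene really does occur in about 25 percent of everyone, as the recessive-gene figure cited above suggests, how surprising is seeing 4 carriers out of 15?

```python
from math import comb

# Observed result from the study, and the hypothetical 25% base rate
# mentioned in the text for a randomly distributed recessive gene.
n, k, base_rate = 15, 4, 0.25

# Exact binomial probability of seeing k or more carriers by chance alone.
p_value = sum(comb(n, i) * base_rate**i * (1 - base_rate)**(n - i)
              for i in range(k, n + 1))

print(f"P(>= {k} carriers out of {n} at a {base_rate:.0%} base rate) = {p_value:.2f}")
# This comes out to about 0.54 -- the observed result is entirely
# consistent with the gene having nothing to do with homosexuality.
```

A result you would expect more than half the time under the null hypothesis proves nothing, which is exactly the point the bullets above are making.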
Sneaky tricks to mislead the unwary
One has to be very wary when reading and interpreting the claims made by others. Many have axes to grind, and they
want to use you for a whetstone.