
Size Matters, a Faculty Essay, by Stephen T. Ziliak
University News

Stephen T. Ziliak has been an associate professor of economics at Roosevelt University since 2003. He is the author of award-winning articles on misuse of statistics by economists, and is currently working on a book about William Sealy Gosset, who is considered to be the "father" of Oomph in economics.

"Statistical significance" is, you once learned, crucial for getting a scientific result. Now it’s time to unlearn it. Because it’s not.

Suppose you want to help your mother lose weight, and are looking at two diet pills nearly identical: in price, side effects, style. The only difference between them is in amount of probable weight loss. Pill "Oomph" will on average take off 20 pounds but it’s a little shaky, at plus or minus 14 pounds. Not bad. Alternatively, the pill "Precision" will take off only five pounds on average, but it’s much more certain in its effect: Precision brings a probable error of plus or minus 0.5 pound. Sweet!

Scientists say the "signal-to-noise ratio" of diet pill Oomph is 1.43-to-1, because the predicted effect of 20 pounds divided by the probable error of 14 pounds is 1.43. It’s error-ridden. But the ratio for pill Precision is higher, 10-to-1. Error-ridden, yes, but much more precise, you see. Which pill for Mom? "Well," say our scientific colleagues, "the one with the highest ‘signal-to-noise ratio’ is Precision. So Precision, right?" Wrong. Yet a distressingly large number of scientists in fields from agronomy to zoology choose Precision over Oomph. They decide whether something is important or not, whether it has an effect, by looking not at its Oomph but at how precisely it is estimated. Oomph pills promise to shed from six to 34 pounds. The much less effective Precision will shed no more than five and a half pounds. Anyone with common sense could figure out which pill is best: obviously Oomph. Get me to the drug store. But the precision-minded nutritionist or economist or biologist picks the wrong pill. Who cares if the spread around the average of pill Precision is less? No one who wants to lose weight, or choose the most effective cancer drug or choose the best economic policy, will care. Mom cares about the spread around her hips, not around her estimate.
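The arithmetic behind the two pills is easy to check. The sketch below is only an illustration: the function names are invented for this example, and it reads "plus or minus" as a simple symmetric band around the average, which is an assumption about how the probable error is meant.

```python
def signal_to_noise(effect, error):
    """Predicted effect divided by its probable error."""
    return effect / error


def likely_range(effect, error):
    """Symmetric band around the average (an assumed reading of 'plus or minus')."""
    return (effect - error, effect + error)


# Figures from the essay: average pounds lost, plus-or-minus error in pounds.
pills = {"Oomph": (20, 14), "Precision": (5, 0.5)}

for name, (effect, error) in pills.items():
    low, high = likely_range(effect, error)
    ratio = signal_to_noise(effect, error)
    print(f"{name}: ratio {ratio:.2f}-to-1, sheds {low} to {high} pounds")
```

On these numbers the low end of Oomph's band (6 pounds) still sits above the high end of Precision's (5.5 pounds), even though Precision has by far the larger ratio.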

The phrase for this singular pursuit of precision is "statistical significance." Interestingly, it’s almost never pursued by atomic physicists, say, or by cell biologists. Wildlife biologists are a lot more confused. Economists still more so. The very worst are medical scientists and epidemiologists: they take precision over Oomph, then equate them, nearly every time, as if inference were possible relative to no currency. Soon-to-be-dead sperm and minke whales of Antarctica, and the makers and users of Vioxx®, are only the most recent victims of this strange ritual.

In "The Standard Error of Regressions" (1996) I showed with Deirdre McCloskey how significance testing was used during the 1980s in the leading journal of mainstream economics, the American Economic Review. Of 182 papers published in the Review, 70 percent did not distinguish statistical from policy or substantive significance, that is, from what we call "economic significance." And fully 96 percent misused a statistical test in some (shall I say) significant way or another. Of the 70 percent that flatly mistook statistical significance for economic significance, further, again about 70 percent failed to report any magnitudes of Oomph. Not for price controls on gasoline or the money supply on interest rates. In other words, during the 1980s about one half of the papers published in the top journal of economics did not establish their claims as economically significant. At all. Pretty startling. Maybe even "significant."
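The "about one half" figure follows from multiplying the two shares. A minimal sketch, assuming the rounded percentages quoted above (so the paper count it prints is approximate, not a figure from the original study):

```python
papers = 182                  # papers surveyed in the 1980s sample
share_no_distinction = 0.70   # did not separate statistical from economic significance
share_no_magnitude = 0.70     # of those, "about 70 percent" reported no magnitudes

# Share of all papers that made both mistakes.
share_both = share_no_distinction * share_no_magnitude
approx_count = round(papers * share_both)

print(f"{share_both:.0%} of papers, roughly {approx_count} of {papers}")
```

Seventy percent of 70 percent is 49 percent, which is where "about one half" comes from.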

Proof that this mistaken use of chance is causing a loss of jobs and justice can be found in a September 1987 study of the state of Illinois unemployment insurance system.

The authors estimate benefit-cost ratios for the state of Illinois from a pilot experiment. The intent of the experiment was to find a cash bonus that would reduce the duration of insurance claims. One group of unemployed workers was given a cash bonus for getting a job quickly and keeping the job for several months. In the other group the employers were given the bonus if claimants got a job with them and kept it for several months.

Like Mom and other users of Oomph, taxpayers want to know the economic significance of the experiment. But the authors focused instead on the statistical significance. They argued that the benefit-cost ratio of the employer-based subsidy, which was $4.29 of unemployment benefit for each $1 of bonus paid out to employers, was "not statistically different from zero." Strange. Illinois was ahead of the game by $3.29 for every dollar spent, and the workers furthermore got employed. The authors didn’t buy it. They said the noise was higher than one normally reports in the scientific journals so they ignored the benefit-cost ratio.

Bad idea. The signal was ample, actually—they could reject the hypothesis of zero effect with 88 percent assurance—but anyway the "zero" talk is the crazy part. The authors found and then ignored an efficient government program, a diet pill that’s easy to swallow.
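To see how the Illinois numbers fit together, here is a rough sketch. It rests on assumptions the essay does not state: that "88 percent assurance" corresponds to a two-sided test under a normal approximation, and that the signal-to-noise ratio is the benefit-cost estimate divided by its standard error. The variable names are illustrative only.

```python
from statistics import NormalDist

benefit_per_dollar = 4.29   # benefit per $1 of employer bonus (from the essay)
net_gain = benefit_per_dollar - 1.00

# ASSUMPTION: two-sided test, normal approximation.
assurance = 0.88
implied_ratio = NormalDist().inv_cdf(1 - (1 - assurance) / 2)

print(f"Net gain per dollar spent: ${net_gain:.2f}")
print(f"Implied signal-to-noise ratio: about {implied_ratio:.2f}")
```

Under those assumptions the implied ratio comes out a little above 1.5, below the conventional cutoff of about two, while the estimated payoff itself stays at $3.29 net for every dollar spent.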

Some of my economist colleagues said, "Fine. But you bring old news. After the 1980s, best practice improved." So in a new paper, "Size Matters," I applied the same analysis to all the papers of the next decade, the 1990s. Unhappily, statistical practice is not getting better. It’s getting worse. Of 137 papers in the 1990s, 82 percent mistook a statistically significant finding for an economically significant finding. Precision conquered Oomph. Market forecast? A dismal science as barbaric as medical science. Little wonder that students have dubbed statistics "sadistics."

How do scientists manage to get something so simple so wrong? I’m not too sure, though I have hunches, and have said so field by field, back to 1885, in a forthcoming book, Size Matters: How Some Sciences Lost Interest in Magnitude, and What to Do About It (The MIT Press, 2006). Francis Y. Edgeworth, who coined in 1885 the very term statistical "significance," warned readers of the mistake. Other theorists, notably William Sealy Gosset, the very inventor of "Student’s t test," greatly amplified his warning. But in the 1920s a statistician and rhetorical magician, the forceful eugenicist Ronald A. Fisher, invented a "rule of two": if the signal-to-noise ratio is equal to two or higher, Fisher insisted, the finding is "significant." If not, not. In a book of 1925, now reprinted many times over, Fisher nowhere confronted the main goal of science, which is to find and explain Oomph. Fisher’s rule can’t help your dieting mother. It can only sharpen your opinions about a less effective diet pill. Scientists listened to Fisher. His philosophy of neo-positivism they found persuasive. And with the advent of desktop computers, the rule of two has stuck. An eminent statistician at the University of Chicago, the late William Kruskal, reminded us that Fisher’s rule "is the cheapest way to get marketable results." Bingo. Costco science.
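Applied to the two diet pills from earlier in the essay, the rule of two gives exactly the wrong advice. The sketch below is a hedged illustration, not Fisher's own procedure: the threshold constant and function name are made up for the example, and the pill figures are the essay's.

```python
RULE_OF_TWO = 2.0  # threshold as described in the essay


def passes_rule_of_two(effect, error):
    """True when the signal-to-noise ratio is two or higher."""
    return effect / error >= RULE_OF_TWO


# Average pounds lost, plus-or-minus error in pounds.
pills = {"Oomph": (20, 14), "Precision": (5, 0.5)}

by_rule = [name for name, (effect, error) in pills.items()
           if passes_rule_of_two(effect, error)]
by_size = max(pills, key=lambda name: pills[name][0])

print("Passes the rule of two:", by_rule)
print("Largest average effect:", by_size)
```

The rule passes only Precision; ranking by size of effect picks Oomph.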

The temptation is certainly there. Think of the O-rings of the space shuttle Challenger, and the "scientific" cover-up. Sir Francis Galton said if "the Greeks" had known about the bell curve they would have "personified" and "deified" it. Apparently we’re all idol-worshippers now. One can only hope that scientists will abandon their little deity and embrace again the real prime mover of science: size matters. "No size," we should say, as noisily as possible, "no significance."

Or, Precision is nice but Oomph is the bomb.


© 2006, Roosevelt University, All Rights Reserved
Chicago  430 S. Michigan Ave, Chicago, IL 60605 | 312-341-3500
Schaumburg 1400 N. Roosevelt Blvd, Schaumburg, IL 60173 | 847-619-7300