Retire Standard Deviation

Wednesday, January 22nd, 2014

“The statistician cannot evade the responsibility for understanding the process he applies or recommends,” Sir Ronald A. Fisher said, so it is time to retire standard deviation from common use and replace it with mean deviation, Nassim Nicholas Taleb suggests:

Standard deviation, STD, should be left to mathematicians, physicists and mathematical statisticians deriving limit theorems. There is no scientific reason to use it in statistical investigations in the age of the computer, as it does more harm than good — particularly with the growing class of people in social science mechanistically applying statistical tools to scientific problems.

Say someone just asked you to measure the “average daily variations” for the temperature of your town (or for the stock price of a company, or the blood pressure of your uncle) over the past five days. The five changes are: (–23, 7, –3, 20, –1). How do you do it?

Do you take every observation: square it, average the total, then take the square root? Or do you remove the sign and calculate the average? For there are serious differences between the two methods. The first produces an average of 15.7, the second 10.8. The first is technically called the root mean square deviation. The second is the mean absolute deviation, MAD. It corresponds to “real life” much better than the first — and to reality. In fact, whenever people make decisions after being supplied with the standard deviation number, they act as if it were the expected mean deviation.

It is all due to a historical accident: in 1893, the great Karl Pearson introduced the term “standard deviation” for what had been known as “root mean square error”. The confusion started then: people thought it meant mean deviation. The idea stuck: every time a newspaper has attempted to clarify the concept of market “volatility”, it defined it verbally as mean deviation yet produced the numerical measure of the (higher) standard deviation.

But it is not just journalists who fall for the mistake: I recall seeing official documents from the department of commerce and the Federal Reserve partaking of the conflation, even regulators in statements on market volatility. What is worse, Goldstein and I found that a high number of data scientists (many with PhDs) also get confused in real life.
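
For readers who want to check the two figures in the excerpt, here is a minimal Python sketch of both calculations on the five changes. The function names and the `ddof` parameter are illustrative choices, not anything from Taleb's article. The five changes happen to average exactly zero, so deviations from the mean are the same as the raw observations. Note also that 15.7 is what you get when the sum of squares is divided by n - 1 (the usual sample standard deviation); dividing by n, as "average the total" literally reads, gives roughly 14.1.

```python
import math

# The five daily changes from the excerpt.
changes = [-23, 7, -3, 20, -1]


def mean_absolute_deviation(xs):
    """Average of the absolute deviations from the mean ("remove the sign, then average")."""
    m = sum(xs) / len(xs)
    return sum(abs(x - m) for x in xs) / len(xs)


def root_mean_square_deviation(xs, ddof=0):
    """Square root of the averaged squared deviations from the mean.

    ddof=0 divides by n; ddof=1 divides by n - 1 (sample standard deviation).
    """
    m = sum(xs) / len(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - ddof))


mad = mean_absolute_deviation(changes)
std_n = root_mean_square_deviation(changes)           # divide by n
std_n1 = root_mean_square_deviation(changes, ddof=1)  # divide by n - 1

print(round(mad, 1), round(std_n, 1), round(std_n1, 1))
```

Whichever divisor is used, the root mean square figure comes out above the mean absolute deviation for these data, which is the gap the excerpt is pointing at.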

Comments

  1. James James says:

    You seem to be focusing on the sensible and interesting answers to this year’s Edge question, whereas Steve Sailer is focusing on the wrong, dishonest ones.

  2. Swierczekml says:

    It’s interesting that the numbers are reversed in the excerpted part of the article above (15.7 is actually the STD, 10.8 is the MAD), but if you go to the original article (via the link), the numbers are listed correctly. Was there a correction made to the original article that didn’t make it into the excerpt? If the author didn’t recognize that the numbers were listed incorrectly in the original article, I question his judgement as to their relative usefulness!
