Regression to the mean

Regression to the mean is a technical way of saying that things tend to even out over time. The sprinter that breaks the world record will probably run closer to their average time on the next race, or the medical treatment that achieves stunning results on the first trial will probably not be as efficacious on the second. Specifically, it refers to the tendency of a random variable that is highly distinct from the norm to return to "normal" over repeated tests. On average, observations tend to cluster around the mean (forming a normal distribution), whether or not they follow an unusual value. It only becomes most obvious when a strange result (e.g. a hole-in-one in golf) is followed by something much more ordinary (like a double-bogey). Regression to the mean forms the basis for the Central Limit Theorem (CLT), which allows statisticians to do calculations on samples that are very large even if the sample isn't known to have a normal distribution.

In medicine
Unfortunately, many of the effects claimed by alternative medicine can often be explained simply as regression to the mean. When Aunt Jane's acne gets better after rubbing mint leaves on her face, that's "anecdotal evidence" based almost entirely on regression to the mean. Many symptoms will come and go randomly if recorded in an objective way &mdash; headaches, for example, tend to disappear without the aid of any treatment over time. People seek treatment when their symptoms are particularly severe when they are at their respective "top". Regression to the mean, therefore, suggests that if symptoms are excessively severe this week, then next week they should be less severe simply by random fluctuations. If treatment is only sought when these symptoms are at their worst, there will almost always be a coincidental recovery. This appears even if the treatment has no effect whatsoever.

A placebo control group in controlled trials removes the effect of regression to the mean. Both groups, on average, experience a tendency to regress to the mean. If the treatment group shows a statistically significant increase in the speed that symptoms regress, it can be attributed to the effects of the treatment with some certainty, not the placebo effect or regression to the mean.

Testing
For example, if a researcher gave a large group of people a test and selected the top-performing 5%, these people would be likely to score worse, on average, if re-tested. Similarly, the bottom 5% would be likely to score better on a retest. In either case, the extremes of the distribution are likely to "regress to the mean" due to simple luck and natural random variation in the results.

Sports
One way of thinking about "regression to the mean" is in terms of sports performance. To win a football championship, for example, it is not enough only to be a good team &mdash; one needs to be both good and lucky. The team at the top of the standings in mid-season is likely to have been both good and lucky to that point, but cannot count on still being lucky for the rest of the season. For this reason, the team that is at the top of the standings at midseason is more likely to drop in standings than to remain at the top, and more likely to remain at the top than to improve (how does one improve from "the top," anyway?).

This observation has been tagged the "Sports Illustrated Jinx". The jinx states that a player or team featured on the cover of a sports magazine such as SI is likely to have a disappointing year the following season (or even a disappointing game the following week). But if you think about it, a player is only likely to make the cover once, and for some surprisingly good performance &mdash; something truly spectacular that requires not only their superlative skill but also lots of luck to beat the superlative skill of their competitors. Athletes on the cover of Sports Illustrated are likely to be at the very top of their game, and at the top, the most likely direction to move next is down. The next year, although the player may still be as skilled, they will not be as lucky, and post scores closer to "typical".

Traffic cameras
A good example of how regression to the mean can seem to prove the effectiveness of almost any intervention is that of the installation of road traffic safety measures &mdash; say speed cameras, a very common device in the UK. A flukey cluster of motor vehicle collisions one year seems to show that a stretch of road is becoming more dangerous, and it is suggested that a speed camera be installed. In the year after the camera is installed, the number of collisions is roughly average again.