Central tendency

Central tendency is a term in descriptive statistics for a measure that indicates the typical score in a data set. The three most common measures are the mean, the median (not to be confused with Medium) and the mode. However, there are other measures of central tendency.

Average is an often-used term that may refer to any measure of central tendency, though in casual conversation it is generally assumed to refer to the mean.

Arithmetic mean
The arithmetic mean is easily calculated by summing up all scores in the sample and then dividing by the number of scores in that sample. The mean for the sample 5, 6, 9, 2 would therefore be calculated as follows:
 * $$\frac{5+6+9+2}{4} = 5.5$$

The mathematical formula can be expressed as:
 * $$\frac1n\sum_{i=1}^n x_i$$

where $$n$$ is the total number of scores in the sample and $$x_i$$ is the $$i$$-th score.
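As a minimal sketch of the calculation above, the following Python snippet sums the scores and divides by their count. The helper name `arithmetic_mean` is an illustrative choice, not something from the text; `statistics.mean` is the standard-library equivalent.

```python
from statistics import mean


def arithmetic_mean(scores):
    """Sum the scores and divide by how many there are."""
    if not scores:
        raise ValueError("need at least one score")
    return sum(scores) / len(scores)


sample = [5, 6, 9, 2]              # the sample used in the text
result = arithmetic_mean(sample)   # (5 + 6 + 9 + 2) / 4
print(result)                      # 5.5
print(mean(sample))                # same value via the standard library
```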

The above is the arithmetic mean; there are also other means, such as the geometric mean and the harmonic mean, but usually the arithmetic mean is meant if the type of mean is not specified.

Geometric mean
The geometric mean is obtained by multiplying all the values and then taking the nth root (where n is the total number of scores). It is useful in some geometric (heh) contexts; for example, the area of an ellipse is equal to that of a circle whose radius is the geometric mean of the ellipse's semi-major and semi-minor axes. For positive values, it has the neat property of being equal to e raised to the arithmetic mean of the natural logarithms of the values being averaged. (The same principle holds for other bases, so for example it could be defined as ten raised to the arithmetic mean of the common logarithms.) The mathematical formula can be expressed as:
 * $$\left(\prod_{i=1}^n x_i\right)^\frac1n$$
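The two descriptions of the geometric mean (nth root of the product, and e raised to the mean of the logarithms) can be checked against each other with a short Python sketch. The function names and the ellipse axes (9 and 4) are assumptions made up for illustration, and the values are assumed to be positive.

```python
import math


def geometric_mean(values):
    """n-th root of the product of n positive values."""
    return math.prod(values) ** (1 / len(values))


def geometric_mean_via_logs(values):
    """e raised to the arithmetic mean of the natural logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical ellipse: semi-major axis 9, semi-minor axis 4.
a, b = 9.0, 4.0
r = geometric_mean([a, b])        # radius of the circle with the same area
ellipse_area = math.pi * a * b
circle_area = math.pi * r ** 2

print(r, geometric_mean_via_logs([a, b]))   # both approximately 6
print(math.isclose(ellipse_area, circle_area))
```

Python 3.8 and later also ship `statistics.geometric_mean`, which should be preferred in practice for long inputs, since a plain product can overflow or underflow.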

Harmonic mean
The harmonic mean is the reciprocal of the arithmetic mean of the reciprocals of a set of (nonzero) numbers. One of the most memorable uses of the harmonic mean is in physics, where the equivalent resistance of a set of resistors in parallel is the harmonic mean of their resistances divided by the number of resistors. The same principle applies to the equivalent capacitance of capacitors placed in series. The mathematical formula can be expressed as:
 * $$\frac{n}{\displaystyle\sum\limits_{i=1}^n\frac1{x_i}}$$
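The resistor relationship can be illustrated with a small Python sketch. The resistor values are hypothetical and the helper name `harmonic` is an illustrative choice; `statistics.harmonic_mean` is the standard-library counterpart.

```python
from statistics import harmonic_mean


def harmonic(values):
    """n divided by the sum of the reciprocals of n nonzero values."""
    return len(values) / sum(1 / v for v in values)


# Hypothetical resistances in ohms, connected in parallel.
resistors = [100.0, 200.0, 400.0]

# Textbook parallel-resistance formula: 1/R = 1/R1 + 1/R2 + ...
parallel = 1 / sum(1 / r for r in resistors)

# Harmonic mean divided by the number of resistors gives the same value.
via_harmonic = harmonic(resistors) / len(resistors)

print(parallel, via_harmonic)
print(harmonic_mean(resistors))   # standard-library harmonic mean
```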

Other means

 * Weighted mean: If the values have different "weights" (not in the physical sense), the most appropriate way to calculate an average is to calculate a weighted mean. This is done by multiplying every value $$x_i$$ with its corresponding weight $$w_i$$ ($$i$$ being the index of the value in the data set), then summing up all the products (i.e. $$\sum_{i=1}^n w_i x_i$$) and dividing the result by the sum of the weights (i.e. $$\sum_{i=1}^n w_i$$). Ergo, the final formula:
 $$\frac{\displaystyle\sum\limits_{i=1}^n w_i x_i}{\displaystyle\sum\limits_{i=1}^n w_i}$$

For example, a school test covers 4 subjects with different weights (in parentheses), each graded from 0 (minimum) to 10 (maximum): mathematics (3), physics (2), chemistry (1) and biology (1). If Student A's grades in these subjects are, respectively, $$g_1, g_2, g_3, g_4$$, the weighted mean is:
 $$\frac{3g_1 + 2g_2 + g_3 + g_4}{3+2+1+1} = \frac{3g_1 + 2g_2 + g_3 + g_4}{7}$$

It can also be used to combine measurements from classes with different sizes, where the weight corresponds to the relative size of each class.
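The weighted mean (sum of each value times its weight, divided by the sum of the weights) can be sketched in Python as follows. The subject weights come from the school-test example; the grades are hypothetical values invented for this sketch, and `weighted_mean` is an illustrative name.

```python
def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Weights from the example: mathematics, physics, chemistry, biology.
weights = [3, 2, 1, 1]
# Hypothetical grades (0 to 10) for illustration only.
grades = [8, 6, 7, 9]

print(weighted_mean(grades, weights))        # (24 + 12 + 7 + 9) / 7
print(weighted_mean(grades, [1, 1, 1, 1]))   # equal weights: plain mean, 7.5
```

With equal weights the result reduces to the ordinary arithmetic mean, which is a quick sanity check for any implementation.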


 * Truncated mean (or trimmed mean): this is similar to the arithmetic mean, but outliers are discarded by dropping a fixed share of the lowest and highest values before averaging. If you discard the lowest 25% and highest 25%, this is known as the interquartile mean. It is used in the ISU Judging System for figure skating, where the highest and lowest scores are dropped and the rest are averaged.


 * Mid-range: the arithmetic mean of the highest and lowest values. It is very vulnerable to outliers and of little use in most circumstances.
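To see how the truncated mean and the mid-range react to an outlier, here is a rough Python sketch. The scores are hypothetical, the function names are illustrative, and dropping a fixed count from each end is only one simple way to truncate (it is not meant to reproduce any official scoring procedure).

```python
def truncated_mean(values, drop_each_end=1):
    """Mean after dropping the lowest and highest `drop_each_end` values."""
    ordered = sorted(values)
    if 2 * drop_each_end >= len(ordered):
        raise ValueError("nothing left to average after truncation")
    kept = ordered[drop_each_end:len(ordered) - drop_each_end]
    return sum(kept) / len(kept)


def mid_range(values):
    """Arithmetic mean of the lowest and highest value."""
    return (min(values) + max(values)) / 2


# Hypothetical scores with one obvious outlier (40).
scores = [2, 5, 6, 6, 7, 8, 40]

print(sum(scores) / len(scores))   # plain mean, pulled up by the outlier
print(truncated_mean(scores))      # drops 2 and 40, averages the rest
print(mid_range(scores))           # (2 + 40) / 2, dominated by the outlier
```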

Median
The median is defined as the value that lies in the middle of the sample when the data set is ordered (by rank).

It is calculated by ranking all the scores in order and taking the one in the middle. When there is an even number of scores, conventionally, the mean of the two in the middle is taken.

For example, take an already ordered data set with 9 data points (n = 9; it is the number of "observations" in that data set). The value in the middle is the one at the fifth rank, which in this example is 19, so the median is 19.

However, as another example, if the data set has an even number of observations (n = 10), with 19 at the fifth rank and 20 at the sixth rank, the median lies between those two scores in the middle. Therefore, it is the mean of those two, which is 19.5.
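The odd and even cases can be sketched in Python. The two data sets below are hypothetical, chosen only so that the fifth rank is 19 and (in the even case) the sixth rank is 20, matching the figures quoted above; `median_of` is an illustrative name.

```python
from statistics import median


def median_of(values):
    """Middle value of the ordered data; mean of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one value")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical ordered samples (not the original data sets).
odd_sample = [11, 14, 16, 18, 19, 20, 23, 25, 30]        # n = 9
even_sample = [11, 14, 16, 18, 19, 20, 23, 25, 30, 31]   # n = 10

print(median_of(odd_sample))    # 19
print(median_of(even_sample))   # 19.5
print(median(even_sample))      # standard-library equivalent
```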

Mode
The mode is simply the most frequently occurring value in the data set.

For example, in a sample where the value 3 appears three times and no other value appears as often, the mode is 3. If there is a second number that occurs just as frequently within the sample, then the sample can be described as having two modes (bimodal). However, some sources will identify such a sample as having no mode.
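A short Python sketch makes the unimodal and bimodal cases concrete. Both samples are hypothetical, and the helper `modes` (an illustrative name) follows the convention of reporting every tied value rather than declaring "no mode".

```python
from collections import Counter


def modes(values):
    """All values tied for the highest frequency, in order of first appearance."""
    counts = Counter(values)
    highest = max(counts.values())   # raises ValueError on empty input
    return [value for value, count in counts.items() if count == highest]


# Hypothetical samples for illustration.
unimodal = [1, 3, 3, 3, 4, 5]         # 3 appears three times
bimodal = [1, 3, 3, 3, 4, 4, 4, 5]    # 3 and 4 both appear three times

print(modes(unimodal))   # [3]
print(modes(bimodal))    # [3, 4]
```

Python 3.8 and later offer `statistics.multimode`, which behaves similarly for non-empty input.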

Comparison
The three measures can be useful in different ways, and how they relate can give information about your statistical sample. The mean is the most intuitive measure of the concept of "average", while the median is most useful for breaking samples into groups (e.g., quartiles). For uniformly distributed samples, or other symmetrically distributed samples, the median and mean (and usually the mode) should match. The difference between them is an indicator of how skewed the data is. For instance, in economics, income is not evenly distributed (not by a long shot), nor even symmetrically distributed, and the mean value is easily shifted by those high earners at the top; for this reason the median is most often used.
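The effect of skew on the mean and the median can be seen in a small Python sketch. The income figures are entirely hypothetical and chosen only to include one very high earner.

```python
from statistics import mean, median

# Hypothetical annual incomes; the last entry is a single very high earner.
incomes = [22_000, 25_000, 28_000, 30_000, 32_000, 35_000, 40_000, 1_000_000]

mean_income = mean(incomes)       # pulled far upward by the top earner
median_income = median(incomes)   # mean of the two middle values

print(mean_income)
print(median_income)
print(mean_income - median_income)   # a large gap hints at strong skew
```

In this made-up sample, seven of the eight incomes lie well below the mean, which is why the median is the more representative "typical" figure here.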