Home
About Crown Hill
Exchange Traded
Products
Institutional Investors
Investment Process
Advanced ALM
 
 
   

First Moment  

Mean

Arithmetic Mean

The arithmetic mean is what is commonly called the average. When the word "mean" is used without a modifier, it can be assumed to refer to the arithmetic mean. The mean is the sum of all the scores divided by the number of scores. The formula in summation notation is:

$$\mu = \frac{\sum X}{N}$$

where µ is the population mean and N is the number of scores. The mean is a good measure of central tendency for roughly symmetric distributions but can be misleading in skewed distributions, since it can be greatly influenced by extreme scores. Other statistics, such as the median, may therefore be more informative for distributions such as reaction time or family income, which are frequently very skewed.


For normal distributions, the mean is the most efficient and therefore the least subject to sample fluctuations of all measures of central tendency.

The formal definition of the arithmetic mean is µ = E[X], where µ is the population mean of the variable X and E[X] is the expected value of X. The expected value of a variable is the long-run average value of that variable. The expected value of a statistic is therefore the mean of the sampling distribution of the statistic.
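As a minimal sketch of why the mean can mislead in a skewed distribution, the Python snippet below compares the mean and the median of a small set of incomes. The income figures are hypothetical and chosen only to include one extreme score.

```python
from statistics import mean, median

# Hypothetical family incomes in thousands of dollars (illustrative only);
# the last value is an extreme score.
incomes = [28, 31, 35, 38, 40, 42, 45, 1000]

# Arithmetic mean: sum of all the scores divided by the number of scores.
arithmetic_mean = sum(incomes) / len(incomes)

# Median: the middle of the sorted scores (average of the two middle values here).
middle = median(incomes)

print(arithmetic_mean)  # 157.375
print(middle)           # 39.0
```

The single extreme income pulls the mean far above every other score, while the median stays near the bulk of the data.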


Geometric Mean


The geometric mean is the nth root of the product of the scores. Thus, the geometric mean of the scores 1, 2, 3, and 10 is the fourth root of 1 × 2 × 3 × 10, which is the fourth root of 60, or approximately 2.78. The formula can be written as: Geometric mean = $\left(\prod X\right)^{1/N}$, where $\prod X$ means to take the product of all the values of X and N is the number of scores.

If any one of the scores is zero then the geometric mean is zero. The geometric mean does not make sense if any scores are less than zero.

The geometric mean is less affected by extreme values than is the arithmetic mean and is useful as a measure of central tendency for some positively skewed distributions.
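The calculation for the scores 1, 2, 3, and 10 can be sketched in Python as follows. The function name `geometric_mean` is an illustrative choice, and rejecting negative scores is one reasonable way to reflect the restriction stated above.

```python
import math

def geometric_mean(values):
    """Return the nth root of the product of the values."""
    if any(v < 0 for v in values):
        # The geometric mean does not make sense for negative scores.
        raise ValueError("geometric mean is undefined for negative scores")
    return math.prod(values) ** (1 / len(values))

scores = [1, 2, 3, 10]
gm = geometric_mean(scores)

print(round(gm, 2))               # 2.78
print(geometric_mean([0, 5, 9]))  # 0.0, because one score is zero
```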


Harmonic Mean

The harmonic mean is often used to take the mean of sample sizes. If there are k samples with sizes n_1, n_2, ..., n_k, then the harmonic mean is defined as:

$$H = \frac{k}{\frac{1}{n_1} + \frac{1}{n_2} + \cdots + \frac{1}{n_k}}$$

For the numbers 1, 2, 3, and 10, the harmonic mean is 2.069. This is less than the geometric mean of 2.78 and the arithmetic mean of 4.
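A short Python sketch reproduces these figures, taking the harmonic mean as the number of values divided by the sum of their reciprocals. The function name is illustrative.

```python
def harmonic_mean(values):
    """Return the count of values divided by the sum of their reciprocals."""
    if any(v <= 0 for v in values):
        raise ValueError("this sketch only handles positive values")
    return len(values) / sum(1 / v for v in values)

numbers = [1, 2, 3, 10]

hm = harmonic_mean(numbers)
gm = (1 * 2 * 3 * 10) ** (1 / 4)   # geometric mean, for comparison
am = sum(numbers) / len(numbers)   # arithmetic mean, for comparison

print(round(hm, 3), round(gm, 2), am)  # 2.069 2.78 4.0
```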

 

 

Second Moment

Variance

The variance is a measure of how spread out a distribution is. It is computed as the average squared deviation of each number from the mean. For example, for the numbers 1, 2, and 3, the mean is 2 and the variance is:

$$\sigma^2 = \frac{(1-2)^2 + (2-2)^2 + (3-2)^2}{3} = \frac{2}{3} \approx 0.667$$

The formula (in summation notation) for the variance in a population is

$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$

where µ is the mean and N is the number of scores.

When the variance is computed in a sample, the following statistic can be used:

$$S^2 = \frac{\sum (X - M)^2}{N}$$

where M is the mean of the sample. S² is a biased estimate of σ², however. By far the most common formula for computing variance in a sample is:

$$s^2 = \frac{\sum (X - M)^2}{N - 1}$$

which gives an unbiased estimate of σ². Since samples are usually used to estimate parameters, s² is the most commonly used measure of variance.
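The difference between dividing by N and dividing by N - 1 can be seen on the numbers 1, 2, and 3. This is a sketch only; the variable names are illustrative, and the standard-library functions `pvariance` and `variance` are imported as a cross-check.

```python
from statistics import pvariance, variance

scores = [1, 2, 3]
n = len(scores)
m = sum(scores) / n  # mean of the scores, 2.0

# Squared deviation of each score from the mean.
squared_deviations = [(x - m) ** 2 for x in scores]

# Divide by N: the population formula (biased when applied to a sample).
population_variance = sum(squared_deviations) / n

# Divide by N - 1: the unbiased sample estimate.
sample_variance = sum(squared_deviations) / (n - 1)

print(round(population_variance, 3))  # 0.667
print(sample_variance)                # 1.0
```

The standard library's `pvariance(scores)` and `variance(scores)` return the same two values.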

Standard Deviation

The formula for the standard deviation is very simple: it is the square root of the variance. It is the most commonly used measure of spread.

An important attribute of the standard deviation as a measure of spread is that if the mean and standard deviation of a normal distribution are known, it is possible to compute the percentile rank associated with any given score. In a normal distribution, about 68% of the scores are within one standard deviation of the mean and about 95% of the scores are within two standard deviations of the mean.

The standard deviation has proven to be an extremely useful measure of spread in part because it is mathematically tractable. Many formulas in inferential statistics use the standard deviation.
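The percentile-rank property can be illustrated with Python's standard-library `statistics.NormalDist`. The mean of 100 and standard deviation of 15 are hypothetical values chosen for the example.

```python
from statistics import NormalDist

# Hypothetical normal distribution of scores: mean 100, standard deviation 15.
dist = NormalDist(mu=100, sigma=15)

# Percentile rank of a score one standard deviation above the mean.
percentile_rank = dist.cdf(115) * 100

# Proportion of scores within one and within two standard deviations of the mean.
within_one_sd = dist.cdf(115) - dist.cdf(85)
within_two_sd = dist.cdf(130) - dist.cdf(70)

print(round(percentile_rank, 1))  # 84.1
print(round(within_one_sd, 3))    # 0.683
print(round(within_two_sd, 3))    # 0.954
```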

 

 

Third Moment

Skewness

Skewness is a measure of the asymmetry of the data around the sample mean. If skewness is negative, the data are spread out more to the left of the mean than to the right. If skewness is positive, the data are spread out more to the right. The skewness of the normal distribution (or any perfectly symmetric distribution) is zero.

The skewness of a distribution is defined as:

$$\text{skewness} = \frac{E(x - \mu)^3}{\sigma^3}$$

where µ is the mean of x, σ is the standard deviation of x, and E(t) represents the expected value of the quantity t.

A distribution is skewed if one of its tails is longer than the other. The first distribution shown has a positive skew. This means that it has a long tail in the positive direction. The distribution below it has a negative skew since it has a long tail in the negative direction. Finally, the third distribution is symmetric and has no skew. Distributions with positive skew are sometimes called "skewed to the right" whereas distributions with negative skew are called "skewed to the left."

Distributions with positive skew are more common than distributions with negative skew. One example is the distribution of income. Most people make under $40,000 a year, but some make quite a bit more, with a small number making many millions of dollars per year. The positive tail therefore extends out quite a long way, whereas the negative tail stops at zero.

Skew can be calculated as:

$$\text{Skew} = \frac{\sum (X - \mu)^3}{N\sigma^3}$$

where µ is the mean and σ is the standard deviation.

The normal distribution has a skew of 0 since it is a symmetric distribution.

As a general rule, the mean is larger than the median in positively skewed distributions and less than the median in negatively skewed distributions. Although counterexamples can be found, they are very rare in real data.
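As a rough sketch, skew can be computed in Python as the average cubed deviation from the mean divided by the cube of the standard deviation, treating the data as a whole population. The data sets and the function name are illustrative assumptions.

```python
from statistics import mean, median

def skewness(values):
    """Average cubed deviation from the mean, divided by the cubed standard deviation."""
    n = len(values)
    m = sum(values) / n
    sd = (sum((x - m) ** 2 for x in values) / n) ** 0.5  # population standard deviation
    return sum((x - m) ** 3 for x in values) / (n * sd ** 3)

right_skewed = [1, 2, 3, 10]     # long tail in the positive direction
symmetric = [1, 2, 3, 4, 5]      # no skew
left_skewed = [-10, -3, -2, -1]  # mirror image of right_skewed

print(round(skewness(right_skewed), 3))  # 1.018
print(skewness(symmetric))               # 0.0
print(round(skewness(left_skewed), 3))   # -1.018

# The mean exceeds the median in the positively skewed example.
print(mean(right_skewed), median(right_skewed))  # 4 2.5
```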

 

 

Fourth Moment

Kurtosis

Kurtosis is a measure of how outlier-prone a distribution is. The kurtosis of the normal distribution is 3. Distributions that are more outlier-prone than the normal distribution have kurtosis greater than 3; distributions that are less outlier-prone have kurtosis less than 3.

The kurtosis of a distribution is defined as

$$\text{kurtosis} = \frac{E(x - \mu)^4}{\sigma^4}$$

where µ is the mean of x, σ is the standard deviation of x, and E(t) represents the expected value of the quantity t.

Kurtosis is based on the size of a distribution's tails. Distributions with relatively large tails are called "leptokurtic"; those with small tails are called "platykurtic." A distribution with the same kurtosis as the normal distribution is called "mesokurtic."

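The following formula can be used to calculate kurtosis:

$$\text{kurtosis} = \frac{\sum (X - \mu)^4}{N\sigma^4} - 3$$

where µ is the mean and σ is the standard deviation. Because 3 is subtracted, this version (often called excess kurtosis) gives the normal distribution a kurtosis of 0 rather than 3.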
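The text quotes the kurtosis of the normal distribution as both 3 and 0. The two values correspond to two conventions: the raw fourth-moment ratio, and the same ratio with 3 subtracted. The Python sketch below computes both for a population of values. The function name, the `excess` flag, and the data sets are illustrative assumptions.

```python
def kurtosis(values, excess=False):
    """Average fourth-power deviation from the mean, divided by the squared variance.

    With excess=True, 3 is subtracted so that a normal distribution scores 0.
    """
    n = len(values)
    m = sum(values) / n
    var = sum((x - m) ** 2 for x in values) / n  # population variance
    k = sum((x - m) ** 4 for x in values) / (n * var ** 2)
    return k - 3 if excess else k

flat = [1, 2, 3, 4, 5]                        # small tails
heavy = [-10, -1, 0, 0, 0, 0, 0, 1, 10]       # a few extreme values

print(round(kurtosis(flat), 2))               # 1.7, below 3 (platykurtic)
print(round(kurtosis(flat, excess=True), 2))  # -1.3, below 0
print(kurtosis(heavy) > 3)                    # True (leptokurtic)
```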
