Variance
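The variance of a set of data is the mean of the squared deviations from the Arithmetic Mean of the same set of data. Because this calculation sums squared deviations, we can conclude two things: the variance is never negative, since each squared deviation is positive or zero; and the unit of the variance is the square of the unit of the data.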
*Variance = {(x[1]-µ)^2 + (x[2]-µ)^2 + ... + (x[n]-µ)^2} / n; where µ is the Arithmetic Mean of the data set.
When the set of data is a Population, we call this the population variance. If the set is a Sample, we call it the sample variance.
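The definition above can be sketched in a few lines of Python. The function name population_variance and the data values are illustrative assumptions, not part of the original text. The sketch also compares the result with Python's standard statistics module, where pvariance divides by n as the formula here does, while variance divides by n - 1, a different convention often used for samples.

```python
import statistics


def population_variance(data):
    """Mean of the squared deviations from the arithmetic mean (divides by n)."""
    n = len(data)
    mu = sum(data) / n
    return sum((x - mu) ** 2 for x in data) / n


data = [5, 7, 8, 10, 10]  # illustrative values

by_definition = population_variance(data)
by_stdlib = statistics.pvariance(data)  # divides by n, like the formula above
with_n_minus_1 = statistics.variance(data)  # divides by n - 1 instead

print(by_definition, by_stdlib, with_n_minus_1)
```

The first two results agree because both divide by n; the third is larger because the same sum of squared deviations is divided by a smaller number.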
The method of calculation may be more easily understood from the table below where the mean is 8.
i      x[i]     x[i]-mean      (x[i]-mean)^2
                (deviation)    (squared deviation)
1      5        -3             9
2      7        -1             1
3      8         0             0
4      10        2             4
5      10        2             4
---    ----     ---            ---
n=5    sum=40    0             18
mean = 40/5 = 8
variance = 18/5 = 3.6
standard deviation = square root of 3.6 = 1.897366596101, or 1.9 after rounding
Note that the column of deviations sums to zero. This is always the case, because the deviations above the mean exactly balance the deviations below it. Note also that we round the standard deviation to one more significant digit than the mean has.
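The worked table can be reproduced with a short Python sketch. The variable names are chosen for illustration only. It builds the deviation and squared-deviation columns, then checks that the deviations sum to zero before computing the variance and the standard deviation.

```python
import math

data = [5, 7, 8, 10, 10]
n = len(data)

mean = sum(data) / n
deviations = [x - mean for x in data]     # third column of the table
squared = [d ** 2 for d in deviations]    # fourth column of the table

variance = sum(squared) / n
std_dev = math.sqrt(variance)

# Print one row per observation: i, x[i], deviation, squared deviation.
for i, (x, d, s) in enumerate(zip(data, deviations, squared), start=1):
    print(i, x, d, s)

print("sum of deviations:", sum(deviations))
print("variance:", variance)
print("standard deviation (rounded):", round(std_dev, 1))
```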
There is another formula for calculating variance which you may see. It uses the sum of all the data and the sum of the squares. The formula is:
*Variance = [n{x[1]^2 + x[2]^2 + ... + x[n]^2} - {x[1] + x[2] + ... + x[n]}^2] / n^2
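As a check on this second formula, the following Python sketch applies it to the same five values used in the table. The function name variance_shortcut is a hypothetical label for illustration. For this data the sum is 40 and the sum of squares is 338, so the formula gives (5 x 338 - 40^2) / 5^2 = 90 / 25, which matches the variance found from the deviations.

```python
def variance_shortcut(data):
    """Variance from the sum of the data and the sum of their squares."""
    n = len(data)
    total = sum(data)                     # x[1] + x[2] + ... + x[n]
    total_sq = sum(x * x for x in data)   # x[1]^2 + x[2]^2 + ... + x[n]^2
    return (n * total_sq - total * total) / (n * n)


data = [5, 7, 8, 10, 10]
shortcut = variance_shortcut(data)
print(shortcut)
```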
This formula was introduced when the prevailing calculators made it much easier to sum the raw data and their squares than to sum the squared deviations. Because it subtracts two large, nearly equal quantities, it can lose precision, and it is no longer recommended except for small exercises.
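The loss of precision can be seen in a small Python experiment, offered here as a sketch under stated assumptions: the function names are hypothetical, and the offset of 1e9 is an arbitrary choice that makes the mean very large compared with the spread. Adding a constant to every value does not change the variance, so both methods should still give 3.6 in exact arithmetic.

```python
def variance_two_pass(data):
    """Variance from squared deviations (mean first, then deviations)."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / n


def variance_shortcut(data):
    """Variance from the sum of the data and the sum of their squares."""
    n = len(data)
    total = sum(data)
    total_sq = sum(x * x for x in data)
    return (n * total_sq - total * total) / (n * n)


small = [5.0, 7.0, 8.0, 10.0, 10.0]
shifted = [x + 1e9 for x in small]  # same spread, much larger mean

two_pass = variance_two_pass(shifted)
shortcut = variance_shortcut(shifted)

print("two-pass:", two_pass)
print("shortcut:", shortcut)
```

With double-precision floating point, the squares of the shifted values are near 1e18, far beyond the range where individual units can be represented, so the final subtraction in the shortcut formula cannot recover the true difference. The deviation-based calculation is unaffected here because the deviations themselves remain small.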
:see also Standard Deviation