Intro to Statistics: Part 3: A Random Variable's Variance
A random variable is described mathematically using the characteristics of its distribution. In the previous article we learned about the expected value of the distribution, E[X], which is the weighted average of all possible outcomes. In this post we'll cover another important characteristic of distributions: variance.
The variance of a random variable measures the dispersion of the outcomes in its distribution. Variance gives you an idea of how "spread out" the possible outcomes are. Higher variance means the outcomes are more dispersed (further apart), whereas lower variance means the outcomes are less dispersed (closer together).
Here's Wikipedia's definition of variance:
The variance of a random variable X is its second central moment, the expected value of the squared deviation from the mean µ:
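$$ \operatorname{Var}(X) = E\big[(X - \mu)^2\big] $$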
In other words, variance is calculated by first taking the difference between each outcome and the expected value, squaring that difference, multiplying the squared-difference by the outcome's probability (its weight), then summing across all outcomes:
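$$ \operatorname{Var}(X) = \sum_i p_i \, (x_i - \mu)^2 $$

... where $p_i$ is the probability of outcome $x_i$.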
Note that this is basically the same formula we use to calculate the expected value E[X]; the only difference is that instead of summing the probability-weighted outcomes, xi, we sum the probability-weighted squared diffs, (xi - µ)^2.
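Side by side:

$$ E[X] = \sum_i p_i \, x_i \qquad \operatorname{Var}(X) = \sum_i p_i \, (x_i - \mu)^2 $$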
Aside: Why square the diffs, you ask? The main reason is that the raw diffs always cancel out -- the positive and negative deviations from the mean sum to exactly zero -- so we need to make them all positive somehow, and squaring turns out to be much nicer to work with mathematically than, say, taking absolute values.
Another useful form of the variance formula, which follows from expanding the definition above, is:
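$$ \operatorname{Var}(X) = E[X^2] - \big(E[X]\big)^2 = E[X^2] - \mu^2 $$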
... where E[X^2] is computed similarly to E[X], the only difference being that each outcome value is squared:
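$$ E[X^2] = \sum_i p_i \, x_i^2 $$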
Standard deviation
A random variable's variance is sometimes expressed using its standard deviation. The standard deviation is simply the square root of the variance.
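In symbols:

$$ \sigma = \sqrt{\operatorname{Var}(X)} $$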
Standard deviation is sometimes more convenient and intuitive to use, since its value is in the same units as the random variable, whereas variance is expressed in the squared units of the random variable.
Aside: Note that the standard deviation is NOT the same as simply averaging the non-squared diffs. In fact, the average of the raw diffs, E[X-μ], is always exactly zero -- the positive and negative deviations cancel each other out. The closest non-squared relative is the average of the absolute diffs, E[|X-μ|], known as the "mean absolute deviation". It's rarely used, so feel free to forget I mentioned it. Just be sure not to confuse it with the standard deviation, which is specifically the square root of the variance.
Variance visualized
Conceptually it might be easier to understand what variance is by taking a look at it graphically. The following chart depicts the complete set of outcomes for a random variable representing the toss of a single die. There are six possible outcomes, 1 through 6, listed along the y-axis.
The variance measures the "spread" of the outcomes -- i.e. how much the outcomes vary from the expected value. The expected value, 3.5, is depicted below by the blue dashed line that cuts through the center of the data. The variance is illustrated using the dotted red lines, which measure the distance between each outcome and the expected value.
The closer the data are to the expected value, the smaller the variance. The further away, the larger the variance. Below are two charts of two random variables, both with the same expected value, E[X] = 10, but different variances. The first chart has smaller variance than the second chart.
Calculating Variance: Some Quick Examples
Let's calculate the variance of a random variable X that represents the tossing of a single die. Recall that the expected value, E[X], is calculated by:
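$$ E[X] = \sum_i p_i \, x_i = \tfrac{1}{6}(1 + 2 + 3 + 4 + 5 + 6) = \tfrac{21}{6} = 3.5 $$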
Now let's plug the expected value into the variance formula. We'll use both forms of the variance formula shown above, just to check our work and verify they both produce the same result.
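$$ \operatorname{Var}(X) = \sum_i p_i \, (x_i - \mu)^2 = \tfrac{1}{6}\big[(1-3.5)^2 + (2-3.5)^2 + (3-3.5)^2 + (4-3.5)^2 + (5-3.5)^2 + (6-3.5)^2\big] $$

$$ = \tfrac{1}{6}(6.25 + 2.25 + 0.25 + 0.25 + 2.25 + 6.25) = \tfrac{17.5}{6} \approx 2.92 $$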
And using the other formula:
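$$ E[X^2] = \tfrac{1}{6}(1^2 + 2^2 + 3^2 + 4^2 + 5^2 + 6^2) = \tfrac{91}{6} \approx 15.17 $$

$$ \operatorname{Var}(X) = E[X^2] - \mu^2 = \tfrac{91}{6} - 3.5^2 \approx 15.17 - 12.25 \approx 2.92 $$

Same answer either way, which is what we wanted to see.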
For another example, let's calculate the variance of flipping a coin. Remember there are just two outcomes, 0 and 1, and the expected value is E[X] = 0.5:
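$$ \operatorname{Var}(X) = \tfrac{1}{2}(0 - 0.5)^2 + \tfrac{1}{2}(1 - 0.5)^2 = \tfrac{1}{2}(0.25) + \tfrac{1}{2}(0.25) = 0.25 $$

If you'd rather verify the arithmetic in code, here's a minimal sketch in Python; the variance helper and the outcome/probability lists are just illustrative names I'm using here, not from any particular library:

```python
# Minimal sketch: variance of a discrete random variable, computed as the
# probability-weighted average of squared deviations from the mean.

def variance(outcomes, probs):
    mu = sum(p * x for x, p in zip(outcomes, probs))                # E[X]
    return sum(p * (x - mu) ** 2 for x, p in zip(outcomes, probs))  # E[(X - mu)^2]

die_outcomes = [1, 2, 3, 4, 5, 6]
die_probs = [1 / 6] * 6
print(variance(die_outcomes, die_probs))    # ~2.9167, matches the die example

coin_outcomes = [0, 1]
coin_probs = [0.5, 0.5]
print(variance(coin_outcomes, coin_probs))  # 0.25, matches the coin example
```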
Recap
- A random variable is described by the characteristics of its distribution
- Each outcome in the distribution has an associated probability of occurring
- The sum of the probabilities of all possible outcomes equals 1
- The expected value, E[X], of a distribution is the weighted average of all outcomes, where each outcome is weighted by its probability
- The variance, Var(X), of a distribution is a "measure of dispersion". It's calculated by taking the weighted average of the squared differences between each outcome and the expected value, where each squared diff is weighted by the probability associated with the outcome
- The standard deviation of a distribution is the square root of its variance
- The standard deviation is expressed in the same units as the random variable, whereas variance is expressed in squared units of the random variable