Conditional Means and Variances

Now that we've mastered the concept of a conditional probability mass function, we'll turn our attention to finding conditional means and variances. We'll start by giving formal definitions of the conditional mean and conditional variance when X and Y are discrete random variables. And then we'll end by actually calculating a few!

Definition.  Suppose X and Y are discrete random variables. Then, the conditional mean of Y given X = x is defined as:

\(\mu_{Y|x}=E[Y|x]=\sum\limits_y yh(y|x)\)

And, the conditional mean of X given Y = y is defined as:

\(\mu_{X|y}=E[X|y]=\sum\limits_x xg(x|y)\)

The conditional variance of Y given X = x is:

\(\sigma^2_{Y|x}=E\{[Y-\mu_{Y|x}]^2|x\}=\sum\limits_y [y-\mu_{Y|x}]^2 h(y|x)\)

or, alternatively, using the usual shortcut:

\(\sigma^2_{Y|x}=E[Y^2|x]-\mu^2_{Y|x}=\left[\sum\limits_y y^2 h(y|x)\right]-\mu^2_{Y|x}\)

And, the conditional variance of X given Y = y is:

\(\sigma^2_{X|y}=E\{[X-\mu_{X|y}]^2|y\}=\sum\limits_x [x-\mu_{X|y}]^2 g(x|y)\)

or, alternatively, using the usual shortcut:

\(\sigma^2_{X|y}=E[X^2|y]-\mu^2_{X|y}=\left[\sum\limits_x x^2 g(x|y)\right]-\mu^2_{X|y}\)

As you can see from the formulas, a conditional mean is calculated much like a mean is, and a conditional variance is calculated much like a variance is, except that in each case you replace the probability mass function with a conditional probability mass function. Let's return to one of our examples to get practice calculating a few of these.
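To make the definitions concrete, here is a minimal Python sketch of the three formulas. The function names and the small conditional pmf at the bottom are hypothetical, chosen only for illustration; a conditional pmf is assumed to be stored as a dictionary mapping each value to its conditional probability.

```python
from fractions import Fraction


def conditional_mean(cond_pmf):
    """Sum of value * conditional probability over the support."""
    return sum(value * prob for value, prob in cond_pmf.items())


def conditional_variance(cond_pmf):
    """Definition: expected squared deviation from the conditional mean."""
    mu = conditional_mean(cond_pmf)
    return sum((value - mu) ** 2 * prob for value, prob in cond_pmf.items())


def conditional_variance_shortcut(cond_pmf):
    """Shortcut: conditional second moment minus squared conditional mean."""
    mu = conditional_mean(cond_pmf)
    second_moment = sum(value ** 2 * prob for value, prob in cond_pmf.items())
    return second_moment - mu ** 2


# Hypothetical conditional pmf, not taken from the lesson's example.
illustrative_pmf = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 4)}

print(conditional_mean(illustrative_pmf))
print(conditional_variance(illustrative_pmf))
print(conditional_variance_shortcut(illustrative_pmf))
```

Using `Fraction` rather than floating-point numbers keeps the arithmetic exact, so the definition and the shortcut can be compared with a plain equality check.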

Example

Let X be a discrete random variable with support S1 = {0, 1}, and let Y be a discrete random variable with support S2 = {0, 1, 2}. Suppose, in tabular form, that X and Y have the following joint probability distribution f(x,y):

[Table: joint probability mass function f(x,y) of X and Y]

What is the conditional mean of Y given X = x?

Solution. We previously determined that the conditional distribution of Y given X is:

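\(h(0|0)=\dfrac{1}{4},\; h(1|0)=\dfrac{2}{4},\; h(2|0)=\dfrac{1}{4}\)

\(h(0|1)=\dfrac{2}{4},\; h(1|1)=\dfrac{1}{4},\; h(2|1)=\dfrac{1}{4}\)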

Therefore, we can use h(y|x) and the formula for the conditional mean of Y given X = x to calculate the conditional mean of Y given X = 0. It is:

\(\mu_{Y|0}=E[Y|0]=\sum\limits_y yh(y|0)=0\left(\dfrac{1}{4}\right)+1\left(\dfrac{2}{4}\right)+2\left(\dfrac{1}{4}\right)=1\)

And, we can use h(y|x) and the formula for the conditional mean of Y given X = x to calculate the conditional mean of Y given X = 1. It is:

\(\mu_{Y|1}=E[Y|1]=\sum\limits_y yh(y|1)=0\left(\dfrac{2}{4}\right)+1\left(\dfrac{1}{4}\right)+2\left(\dfrac{1}{4}\right)=\dfrac{3}{4}\)

Note that the conditional mean of Y|X = x depends on x, and depends on x alone. You might want to think about these conditional means in terms of sub-populations again. The mean of Y is likely to depend on the sub-population, as it does here. The mean of Y is 1 for the X = 0 sub-population, and the mean of Y is ¾ for the X = 1 sub-population. Intuitively, this dependence should make sense. Rather than calculating the average weight of an adult, for example, you would probably want to calculate the average weight for the sub-population of females and the average weight for the sub-population of males, because the average weight no doubt depends on the sub-population!
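The two conditional means of Y can be checked with a few lines of Python. This is only a sketch: the dictionary `h` is a hypothetical name, and its entries are the values of h(y|x) that appear in the two calculations above.

```python
from fractions import Fraction

# h[x][y] holds h(y|x), read off the worked calculations above.
h = {
    0: {0: Fraction(1, 4), 1: Fraction(2, 4), 2: Fraction(1, 4)},
    1: {0: Fraction(2, 4), 1: Fraction(1, 4), 2: Fraction(1, 4)},
}

# One conditional mean per sub-population, that is, per value of x.
mean_y_given_x = {
    x: sum(y * prob for y, prob in pmf.items()) for x, pmf in h.items()
}

for x, mu in mean_y_given_x.items():
    print(f"E[Y | X = {x}] = {mu}")
# E[Y | X = 0] = 1
# E[Y | X = 1] = 3/4
```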

What is the conditional mean of X given Y = y?

Solution. We previously determined that the conditional distribution of X given Y is:

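\(g(0|0)=\dfrac{1}{3},\; g(1|0)=\dfrac{2}{3}\)

\(g(0|1)=\dfrac{2}{3},\; g(1|1)=\dfrac{1}{3}\)

\(g(0|2)=\dfrac{1}{2},\; g(1|2)=\dfrac{1}{2}\)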

As the conditional distribution of X given Y suggests, there are three sub-populations here, namely the Y = 0 sub-population, the Y = 1 sub-population and the Y = 2 sub-population. Therefore, we have three conditional means to calculate, one for each sub-population. Now, we can use g(x|y) and the formula for the conditional mean of X given Y = y to calculate the conditional mean of X given Y = 0. It is:

\(\mu_{X|0}=E[X|0]=\sum\limits_x xg(x|0)=0\left(\dfrac{1}{3}\right)+1\left(\dfrac{2}{3}\right)=\dfrac{2}{3}\)

And, we can use g(x|y) and the formula for the conditional mean of X given Y = y to calculate the conditional mean of X given Y = 1. It is:

\(\mu_{X|1}=E[X|1]=\sum\limits_x xg(x|1)=0\left(\dfrac{2}{3}\right)+1\left(\dfrac{1}{3}\right)=\dfrac{1}{3}\)

And, we can use g(x|y) and the formula for the conditional mean of X given Y = y to calculate the conditional mean of X given Y = 2. It is:

\(\mu_{X|2}=E[X|2]=\sum\limits_x xg(x|2)=0\left(\dfrac{1}{2}\right)+1\left(\dfrac{1}{2}\right)=\dfrac{1}{2}\)

Note that the conditional mean of X|Y = y depends on y, and depends on y alone. The mean of X is 2/3 for the Y = 0 sub-population, the mean of X is 1/3 for the Y = 1 sub-population, and the mean of X is 1/2 for the Y = 2 sub-population.
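The same kind of check works for the three conditional means of X. Again, this is a sketch under the assumption that g(x|y) takes the values used in the three calculations above; the name `g` for the dictionary is hypothetical.

```python
from fractions import Fraction

# g[y][x] holds g(x|y), read off the worked calculations above.
g = {
    0: {0: Fraction(1, 3), 1: Fraction(2, 3)},
    1: {0: Fraction(2, 3), 1: Fraction(1, 3)},
    2: {0: Fraction(1, 2), 1: Fraction(1, 2)},
}

# Each conditional pmf should sum to 1 over the support of X.
for y, pmf in g.items():
    assert sum(pmf.values()) == 1, f"g(x|{y}) does not sum to 1"

# One conditional mean per value of y.
mean_x_given_y = {
    y: sum(x * prob for x, prob in pmf.items()) for y, pmf in g.items()
}

for y, mu in mean_x_given_y.items():
    print(f"E[X | Y = {y}] = {mu}")
# E[X | Y = 0] = 2/3
# E[X | Y = 1] = 1/3
# E[X | Y = 2] = 1/2
```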

What is the conditional variance of Y given X = 0?

Solution. We previously determined that the conditional distribution of Y given X is:

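\(h(0|0)=\dfrac{1}{4},\; h(1|0)=\dfrac{2}{4},\; h(2|0)=\dfrac{1}{4}\)

\(h(0|1)=\dfrac{2}{4},\; h(1|1)=\dfrac{1}{4},\; h(2|1)=\dfrac{1}{4}\)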

Therefore, we can use h(y|x) and the formula for the conditional variance of Y given X = x to calculate the conditional variance of Y given X = 0. It is:

\begin{align}
\sigma^2_{Y|0} &= E\{[Y-\mu_{Y|0}]^2|0\}=E\{[Y-1]^2|0\}=\sum\limits_y (y-1)^2 h(y|0)\\
&= (0-1)^2 \left(\dfrac{1}{4}\right)+(1-1)^2 \left(\dfrac{2}{4}\right)+(2-1)^2 \left(\dfrac{1}{4}\right)=\dfrac{1}{4}+0+\dfrac{1}{4}=\dfrac{2}{4}
\end{align}

Alternatively, we could have used the shortcut formula. Doing so, we had better get the same answer:

\begin{align}
\sigma^2_{Y|0} &= E[Y^2|0]-\mu^2_{Y|0}=\left[\sum\limits_y y^2 h(y|0)\right]-1^2\\
&= \left[(0)^2\left(\dfrac{1}{4}\right)+(1)^2\left(\dfrac{2}{4}\right)+(2)^2\left(\dfrac{1}{4}\right)\right]-1\\
&= \left[0+\dfrac{2}{4}+\dfrac{4}{4}\right]-1=\dfrac{2}{4}
\end{align}

And we do! That is, no matter how we choose to calculate it, we get that the conditional variance of Y is ½ for the X = 0 sub-population.
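As a cross-check, the following Python sketch computes the conditional variance of Y both ways, from the definition and from the shortcut. The function names and the dictionary `h` are hypothetical; the probabilities are the values of h(y|x) used in the worked calculations of this example.

```python
from fractions import Fraction

# h[x][y] holds h(y|x), as used in the worked calculations of this example.
h = {
    0: {0: Fraction(1, 4), 1: Fraction(2, 4), 2: Fraction(1, 4)},
    1: {0: Fraction(2, 4), 1: Fraction(1, 4), 2: Fraction(1, 4)},
}


def variance_by_definition(pmf):
    """Expected squared deviation of Y from its conditional mean."""
    mu = sum(y * prob for y, prob in pmf.items())
    return sum((y - mu) ** 2 * prob for y, prob in pmf.items())


def variance_by_shortcut(pmf):
    """Conditional second moment minus the squared conditional mean."""
    mu = sum(y * prob for y, prob in pmf.items())
    return sum(y ** 2 * prob for y, prob in pmf.items()) - mu ** 2


for x, pmf in h.items():
    print(x, variance_by_definition(pmf), variance_by_shortcut(pmf))
```

For X = 0 both functions return 1/2, matching the hand calculation. Running the same code for the X = 1 sub-population gives 11/16 from both formulas.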