12.1 - Notations and Terminology

Notation Section

Collect all of the variables X 's into a vector \(\mathbf{X}\) for each individual subject. Let \(\mathbf{X_i}\) denote observable trait i. These are the data from each subject, and are collected into a vector of traits.

\(\textbf{X} = \left(\begin{array}{c}X_1\\X_2\\\vdots\\X_p\end{array}\right) = \text{vector of traits}\)

This is a random vector, with a population mean. Assume that vector of traits \(\mathbf{X}\) is sampled from a population with population mean vector:

\(\boldsymbol{\mu} = \left(\begin{array}{c}\mu_1\\\mu_2\\\vdots\\\mu_p\end{array}\right) = \text{population mean vector}\)

Here, \(\mathrm { E } \left( X _ { i } \right) = \mu _ { i }\) denotes the population mean of variable i.

Consider m unobservable common factors \(f _ { 1 } , f _ { 2 } , \dots , f _ { m }\). The \(i^{th}\) common factor is \(f _ { i } \). Generally, m is going to be substantially less than p .

The common factors are also collected into a vector,

\(\mathbf{f} = \left(\begin{array}{c}f_1\\f_2\\\vdots\\f_m\end{array}\right) = \text{vector of common factors}\)

Model Section

Our factor model can be thought of as a series of multiple regressions, predicting each of the observable variables \(X_{i}\) from the values of the unobservable common factors \(f_{i}\) :

\begin{align} X_1 & =  \mu_1 + l_{11}f_1 + l_{12}f_2 + \dots + l_{1m}f_m + \epsilon_1\\ X_2 & =  \mu_2 + l_{21}f_1 + l_{22}f_2 + \dots + l_{2m}f_m + \epsilon_2 \\ &  \vdots \\ X_p & =  \mu_p + l_{p1}f_1 + l_{p2}f_2 + \dots + l_{pm}f_m + \epsilon_p \end{align}

Here, the variable means \(\mu_{1}\) through \(\mu_{p}\) can be regarded as the intercept terms for the multiple regression models.

The regression coefficients \(l_{ij}\) (the partial slopes) for all of these multiple regressions are called factor loadings. Here, \(l_{ij}\) = loading of the \(i^{th}\) variable on the \(j^{th}\) factor. These are collected into a matrix as shown here:

\(\mathbf{L} = \left(\begin{array}{cccc}l_{11}& l_{12}& \dots & l_{1m}\\l_{21} & l_{22} & \dots & l_{2m}\\ \vdots & \vdots & & \vdots \\l_{p1} & l_{p2} & \dots & l_{pm}\end{array}\right) = \text{matrix of factor loadings}\)

And finally, the errors \(\varepsilon _{i}\) are called the specific factors. Here, \(\varepsilon _{i}\) = specific factor for variable i. The specific factors are also collected into a vector:

\(\boldsymbol{\epsilon} = \left(\begin{array}{c}\epsilon_1\\\epsilon_2\\\vdots\\\epsilon_p\end{array}\right) = \text{vector of specific factors}\)

In summary, the basic model is like a regression model. Each of our response variables X is predicted as a linear function of the unobserved common factors \(f_{1}\) , \(f_{2}\) through \(f_{m}\) . Thus, our explanatory variables are \(f_{1}\) , \(f_{2}\) through \(f_{m}\). We have m unobserved factors that control the variation among our data.

We will generally reduce this into matrix notation as shown in this form here:

\(\textbf{X} = \boldsymbol{\mu} + \textbf{Lf}+\boldsymbol{\epsilon}\)

Note! In general we want m << p.