PCA (Principal Component Analysis)
Dimensionality reduction by projecting data onto orthogonal directions of maximum variance.
Definition
PCA finds linear combinations of variables (e.g. returns) that explain the most variance. The first PC is the direction of maximum variance; the second is the direction of maximum variance among those orthogonal to the first; and so on.
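One way to make the definition concrete is an eigendecomposition of the sample covariance matrix. The sketch below is illustrative only: the helper name `pca_eig` and the simulated one-factor data are assumptions, not part of the definition.

```python
import numpy as np

def pca_eig(X):
    """Return (variances, loadings) of the sample covariance, largest variance first."""
    Xc = X - X.mean(axis=0)                 # center each column
    cov = np.cov(Xc, rowvar=False)          # sample covariance (columns = variables)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]       # reorder so PC1 comes first
    return eigvals[order], eigvecs[:, order]

rng = np.random.default_rng(0)
# Hypothetical data: 500 observations of 4 series driven by one common factor plus noise
common = rng.normal(size=(500, 1))
X = common @ np.array([[1.0, 0.8, 1.2, 0.9]]) + 0.3 * rng.normal(size=(500, 4))

eigvals, loadings = pca_eig(X)
explained = eigvals / eigvals.sum()         # share of total variance per PC
```

Each column of `loadings` is one direction, the columns are mutually orthogonal, and `explained` is ordered from the first PC downward.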
Why it matters
- Risk: The first few PCs often capture most of the correlation structure (e.g. a “market” factor).
- Signals: Residuals after subtracting the first PC can be used for relative value or alpha.
- Parsimony: Fewer dimensions to estimate in covariance or factor models.
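The risk and signal points can be sketched together: estimate the first PC, measure how much variance it carries, and subtract its contribution to obtain residuals. This is a minimal sketch on simulated data; the factor structure, the `betas`, and all variable names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical returns: one "market" factor with asset-specific betas, plus idiosyncratic noise
market = rng.normal(scale=0.01, size=(250, 1))
betas = np.array([[0.8, 1.0, 1.2, 0.9, 1.1]])
returns = market @ betas + rng.normal(scale=0.005, size=(250, 5))

Rc = returns - returns.mean(axis=0)               # center each asset's returns
U, s, Vt = np.linalg.svd(Rc, full_matrices=False)

v1 = Vt[0]                                        # loadings of the first PC
pc1_scores = Rc @ v1                              # first PC time series, shape (250,)
residuals = Rc - np.outer(pc1_scores, v1)         # returns with the first PC removed
share_pc1 = s[0] ** 2 / np.sum(s ** 2)            # variance share of the first PC
```

By construction `residuals` has no exposure to `v1`; whether it contains usable relative-value information is an empirical question the sketch does not answer.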
Common mistakes
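- Running PCA on raw prices, which are typically non-stationary (use returns or standardized returns instead).
- Interpreting PCs as “fundamental” without checking that they are stable across samples or time periods.
- Overfitting the number of components (choose it with out-of-sample tests or information criteria).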
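For the first mistake, a minimal sketch of the preprocessing: convert prices to log returns, then standardize so that PCA works on the correlation matrix rather than being dominated by the most volatile series. The simulated price paths and volatilities are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical prices: geometric random walks for 3 assets with different volatilities
vols = np.array([0.01, 0.02, 0.04])
prices = 100.0 * np.exp(np.cumsum(rng.normal(size=(300, 3)) * vols, axis=0))

log_returns = np.diff(np.log(prices), axis=0)     # shape (299, 3); one row is lost
z = (log_returns - log_returns.mean(axis=0)) / log_returns.std(axis=0, ddof=1)

corr = np.cov(z, rowvar=False)                    # covariance of z = correlation of returns
eigvals = np.linalg.eigvalsh(corr)[::-1]          # PC variances, largest first
```

With standardized inputs the eigenvalues sum to the number of assets, so each one can be read directly against a baseline of 1 per series.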
Notation
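\( X = U \Lambda V' \) is the singular value decomposition of the column-centered data matrix \( X \), with \( \Lambda \) the diagonal matrix of singular values; columns of \( V \) are the loadings; \( XV = U \Lambda \) are the principal component scores.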
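The notation can be checked numerically. The sketch below assumes X is column-centered and that the decomposition is the thin SVD, so the diagonal of Λ holds singular values; the random data and variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100
X = rng.normal(size=(n, 4))
X = X - X.mean(axis=0)                            # center columns (assumed by the notation)

U, s, Vt = np.linalg.svd(X, full_matrices=False)  # X = U @ diag(s) @ Vt
V = Vt.T                                          # columns of V are the loadings
scores = X @ V                                    # principal component scores

# Eigenvalues of the sample covariance, largest first, for comparison with s
cov_eigvals = np.linalg.eigvalsh(np.cov(X, rowvar=False))[::-1]
```

Under these assumptions `scores` equals `U * s`, and the variance of the k-th PC is `s[k] ** 2 / (n - 1)`, which matches the k-th covariance eigenvalue.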