PCA
How to calculate PCA
Given a data matrix $X$ with $n$ samples (rows) and $d$ features (columns):
- Center the data (subtract the mean of each column).
- Compute the covariance matrix of the centered data: $C = \frac{1}{n-1} X^\top X$
- Compute the eigenvectors and eigenvalues of the covariance matrix
- Sort the eigenvectors by decreasing eigenvalues
- Select the top $k$ eigenvectors -- these are your principal components.
- Project the centered data onto these vectors: $Z = XW$, where $X$ is the centered data matrix and $W$ is a $d \times k$ matrix whose columns are the selected eigenvectors.
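The steps above can be sketched in NumPy as follows. This is a minimal illustration, not a reference implementation: the function name `pca`, the variable names, the demo data, and the choice of the $1/(n-1)$ normalization are all assumptions made for the example.

```python
import numpy as np

def pca(X, k):
    """Project X (n samples x d features) onto its top-k principal components.

    Hypothetical helper for illustration; returns the projected data,
    the selected eigenvectors, and all eigenvalues in decreasing order.
    """
    X = np.asarray(X, dtype=float)

    # Step 1: center the data by subtracting each column's mean.
    X_centered = X - X.mean(axis=0)

    # Step 2: covariance matrix (d x d), assuming the 1/(n-1) convention.
    n = X_centered.shape[0]
    C = X_centered.T @ X_centered / (n - 1)

    # Step 3: eigendecomposition. eigh is meant for symmetric matrices
    # and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(C)

    # Step 4: reorder so the largest eigenvalue comes first.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Step 5: keep the top-k eigenvectors as the columns of W (d x k).
    W = eigvecs[:, :k]

    # Step 6: project the centered data onto the principal components.
    Z = X_centered @ W
    return Z, W, eigvals

# Made-up demo data: 100 samples, 3 features with very different scales.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 3)) @ np.diag([3.0, 1.0, 0.1])
Z_demo, W_demo, eigvals_demo = pca(X_demo, k=2)
print(Z_demo.shape)  # (100, 2)
```

In this sketch the sample variance of each column of `Z_demo` equals the corresponding eigenvalue, which is a quick way to check that the sort and the projection line up.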
Pros & Cons
Cons: Assumes linear relationships between features, so nonlinear structure in the data is not captured.
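A small sketch of the linearity limitation, using made-up data: points on a unit circle are described by a single parameter (the angle), yet no single linear direction captures them. The variable names here are assumptions for the example only.

```python
import numpy as np

# Points on a unit circle: one underlying parameter (the angle t),
# but the relationship between the two features is nonlinear.
t = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
X_circle = np.column_stack([np.cos(t), np.sin(t)])

# Same recipe as PCA: center, form the covariance matrix, get eigenvalues.
X_centered = X_circle - X_circle.mean(axis=0)
C = X_centered.T @ X_centered / (X_centered.shape[0] - 1)
eigvals = np.linalg.eigvalsh(C)[::-1]  # decreasing order

# Fraction of the total variance explained by the first component.
ratio_first = eigvals[0] / eigvals.sum()
print(ratio_first)  # roughly 0.5
```

The first principal component explains only about half of the variance, so reducing to one dimension with PCA discards much of the structure even though the data is intrinsically one-dimensional.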