Notes on MPPCA denoising
Introduction
This note is on the paper titled "Denoising of diffusion MRI using random matrix theory", published in NeuroImage in 2016.
Considering the publication date, the paper compares the proposed method against only two denoising approaches then in common use for diffusion MRI: adaptive non-local means (ANLM) and total (generalized) variation (TV/TGV). The authors summarized the shortcomings of these two approaches:
ANLM: loss of spatial resolution of the image (blur) and introduction of additional partial volume effects that lead to complications in further quantitative analyses or to biases in diffusion modeling.
TV: dependency on a regularization term, introduction of reconstruction artifacts, and the fact that thermal noise is not the sole source of local variations. Indeed, fine anatomical details might be removed as well by this non-selective technique.
Then the authors pointed out the main problem of PCA denoising. They wrote:
The number of signal-carrying principal components, i.e. the number of components that significantly contribute to the description of the underlying diffusion process, is unknown and is expected to depend on imaging factors such as resolution, b-value and SNR. Hence, an objective criterion to discriminate between the signal-carrying and noise-only components has been missing... Commonly used criteria include thresholding of the eigenvalues associated with the principal components by an empirically set value.
Based on that, they objectified the threshold for PCA denoising by exploiting the fact that noise-only eigenvalues are expected to obey the universal Marchenko-Pastur law, a result of random matrix theory for noisy covariance matrices. Terence Tao has a blog covering topics in random matrix theory.
The question is: what is the Marchenko-Pastur law, and how do we use it to objectify the threshold for PCA?
Method
Marchenko-Pastur distribution
A redundant $M \times N$ data matrix $X$ is one that can be synthesized by a combination of a few, $P \ll M$, linearly independent sources, or principal components, derived via the singular value decomposition of $X$:
$$X = U S V^{\dagger},$$
where $U$ and $V$ are unitary matrices whose columns are the left-singular and right-singular vectors of $X$. Without loss of generality, we assume $M < N$. The diagonal elements $s_i$ of the matrix $S$ are the singular values, with $\lambda_i$ being the $i$-th eigenvalue of the covariance matrix:
$$\lambda_i = \frac{s_i^2}{N} = \operatorname{eig}_i\!\left(\frac{1}{N} X X^{\dagger}\right).$$
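This relation between singular values and covariance eigenvalues is easy to check numerically. A minimal NumPy sketch, assuming the normalization $\lambda_i = s_i^2 / N$ for the eigenvalues of $X X^{\dagger}/N$ (the matrix here is random, not diffusion data):

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 4, 10                                       # M < N, as assumed above
X = rng.normal(size=(M, N))

U, s, Vt = np.linalg.svd(X, full_matrices=False)   # X = U @ diag(s) @ Vt
lam_from_svd = np.sort(s**2 / N)                   # lambda_i = s_i^2 / N
lam_from_cov = np.sort(np.linalg.eigvalsh(X @ X.T / N))

print(np.allclose(lam_from_svd, lam_from_cov))     # True
```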
Marchenko-Pastur law: If $X$ denotes an $M \times N$ random matrix whose entries are independent identically distributed (iid) random variables with mean $0$ and variance $\sigma^2 < \infty$, let $Y = \tfrac{1}{N} X X^{\dagger}$ and let $\lambda_1, \dots, \lambda_M$ be the eigenvalues of $Y$. Finally, consider the random measure
$$\mu_M(A) = \frac{1}{M}\, \#\{\lambda_j \in A\}, \quad A \subset \mathbb{R},$$
counting the number of eigenvalues in the subset $A$ included in $\mathbb{R}$.
Assume that $M, N \to \infty$ so that the ratio $M/N \to \gamma \in (0, 1]$. Then $\mu_M \to \mu$ in distribution, where
$$\mathrm{d}\mu(\lambda) = p(\lambda)\,\mathrm{d}\lambda, \qquad p(\lambda) = \frac{\sqrt{(\lambda_+ - \lambda)(\lambda - \lambda_-)}}{2\pi \gamma \lambda \sigma^2} \ \text{ for } \lambda \in [\lambda_-, \lambda_+], \text{ and } 0 \text{ otherwise},$$
with the edges $\lambda_\pm = \sigma^2 (1 \pm \sqrt{\gamma})^2$.
Hence, the smallest $M - P$ nonzero eigenvalues, i.e. the noise-only ones, obey the MP distribution, with $\lambda_-$ and $\lambda_+$ bounding the noise bulk and $\sigma$ being the noise level. Note that the width of the MP bulk spectrum equals:
$$\lambda_+ - \lambda_- = 4\sqrt{\gamma}\,\sigma^2,$$
and the expectation value of an MP distribution is given by:
$$\langle \lambda \rangle = \int \lambda\, p(\lambda)\,\mathrm{d}\lambda = \sigma^2.$$
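Both facts are easy to verify numerically for a pure-noise matrix. A short NumPy sketch, assuming edges $\lambda_\pm = \sigma^2(1 \pm \sqrt{\gamma})^2$ and mean $\sigma^2$ (the sizes are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, sigma = 100, 1000, 1.0                 # gamma = M / N = 0.1
X = rng.normal(0.0, sigma, size=(M, N))      # pure iid noise

lam = np.linalg.eigvalsh(X @ X.T / N)        # eigenvalues of the covariance

gamma = M / N
lam_minus = sigma**2 * (1 - np.sqrt(gamma))**2
lam_plus = sigma**2 * (1 + np.sqrt(gamma))**2

# The empirical spectrum sits inside the MP bulk [lam_minus, lam_plus]
# up to finite-size fluctuations, and its mean approaches sigma^2.
print(lam.min(), lam.max(), lam.mean())
```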
The distribution edge $\lambda_+$ distinguishes between noise-only and significant signal-carrying principal components. Nullifying all $\lambda_i < \lambda_+$ and reconstructing the matrix results in a denoised matrix:
$$\hat{X} = \sum_{i=1}^{P} s_i\, u_i v_i^{\dagger},$$
where $P$ is the number of eigenvalues exceeding $\lambda_+$ and $u_i$, $v_i$ are the corresponding singular vectors.
Using the expectation value, the variance accumulated in the omitted eigenvalues is given by:
$$\tilde{\sigma}^2 = \frac{1}{M} \sum_{i=P+1}^{M} \lambda_i \approx \frac{M - P}{M}\,\sigma^2.$$
As the omitted and residual principal components are linearly uncorrelated, the variance of the residual noise, contained within the significant components, is given by:
$$\sigma_r^2 = \sigma^2 - \tilde{\sigma}^2 = \frac{P}{M}\,\sigma^2.$$
The SNR after denoising should thus scale with $\sqrt{M/P}$.
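The $P/M$ scaling of the residual noise variance can be illustrated by projecting pure noise onto a fixed $P$-dimensional subspace. A sketch under my own setup, where a random orthonormal basis stands in for the $P$ signal components:

```python
import numpy as np

rng = np.random.default_rng(2)
M, N, P, sigma = 80, 800, 10, 1.0
noise = rng.normal(0.0, sigma, size=(M, N))

# Random orthonormal M x P basis standing in for the P signal components.
Q, _ = np.linalg.qr(rng.normal(size=(M, P)))
residual = Q @ (Q.T @ noise)            # noise retained in those P components

print(residual.var())                   # ~ (P / M) * sigma**2 = 0.125
```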
Denoising algorithm
Now it is clear that we need to know $P$, or equivalently the noise level $\sigma$; the noise level is obtained as a by-product. In this paper, the number of significant components $\hat{P}$ is found by incrementally increasing $P$ until the remaining $M - P$ eigenvalues are consistent with an MP bulk. Two estimates of $\sigma^2$ are compared: one from the spectrum-width equation above,
$$\hat{\sigma}^2_{\mathrm{width}}(P) = \frac{\lambda_{P+1} - \lambda_M}{4\sqrt{(M-P)/N}},$$
and one from the expectation value,
$$\hat{\sigma}^2_{\mathrm{mean}}(P) = \frac{1}{M - P} \sum_{i=P+1}^{M} \lambda_i,$$
with the eigenvalues sorted in descending order. As long as signal-carrying eigenvalues remain in the putative bulk, the width estimate overshoots the mean estimate, so $P$ is increased until $\hat{\sigma}^2_{\mathrm{width}}(P) \le \hat{\sigma}^2_{\mathrm{mean}}(P)$. Once $\hat{P}$ is determined, the estimated noise level is $\hat{\sigma}^2 = \hat{\sigma}^2_{\mathrm{mean}}(\hat{P})$.
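Putting the pieces together, the whole procedure fits in a few lines. A minimal NumPy sketch of the idea; `mppca_denoise` is my own name, and this is not the authors' reference implementation (the exact stopping rule in the published code may differ):

```python
import numpy as np

def mppca_denoise(X):
    """Denoise an M x N matrix (M <= N) by thresholding PCA eigenvalues
    against the Marchenko-Pastur bulk. Returns (X_hat, sigma2_hat, P_hat).
    A sketch of the MPPCA idea, not the reference implementation."""
    M, N = X.shape
    assert M <= N
    # Eigen-decomposition of the sample covariance X X^T / N.
    lam, U = np.linalg.eigh(X @ X.T / N)   # ascending order
    lam, U = lam[::-1], U[:, ::-1]         # make it descending

    P_hat, sigma2_hat = M, 0.0
    for P in range(M):
        bulk = lam[P:]                     # putative noise-only eigenvalues
        # Noise level from the MP expectation: mean of the bulk.
        sigma2_mean = bulk.mean()
        # Noise level from the MP bulk width, lam_+ - lam_- = 4 sqrt(gamma) sigma^2,
        # with gamma = (M - P) / N for the remaining components.
        gamma = (M - P) / N
        sigma2_width = (bulk[0] - bulk[-1]) / (4.0 * np.sqrt(gamma))
        if sigma2_width <= sigma2_mean:    # bulk consistent with the MP law
            P_hat, sigma2_hat = P, sigma2_mean
            break

    # Nullify the noise-only components and reconstruct.
    W = U[:, :P_hat]
    X_hat = W @ (W.T @ X)
    return X_hat, sigma2_hat, P_hat

# Synthetic check: rank-5 signal plus iid noise.
rng = np.random.default_rng(1)
M, N, P_true, sigma = 60, 600, 5, 0.1
signal = rng.normal(size=(M, P_true)) @ rng.normal(size=(P_true, N))
X = signal + rng.normal(0.0, sigma, size=(M, N))
X_hat, sigma2_hat, P_hat = mppca_denoise(X)
```

On this synthetic example, `P_hat` recovers the true rank (up to a component), `sigma2_hat` is close to $\sigma^2$, and the reconstruction error of `X_hat` against the clean signal drops well below that of `X`.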
Discussion
The assumption that the omitted and residual principal components are linearly uncorrelated may fail for diffusion-weighted images with non-Cartesian readout. There is a manuscript on arXiv attempting to solve this problem.
Denoising must be the first step of the post-processing pipeline, because data interpolation or smoothing changes the noise characteristics on which MPPCA relies.
A tensor-based MPPCA was recently published in MRM to address the issue of denoising high-dimensional data.
A question still in my mind: what is the difference between low-rank approximation and PCA? There is a PDF I found online about it.