Rician nonlocal means denoising for MR images using nonparametric principal component analysis

Denoising is always a challenging problem in magnetic resonance imaging (MRI) and is important for clinical diagnosis and computerized analysis, such as tissue classification and segmentation. The noise in MRI has a Rician distribution. Unlike additive Gaussian noise, Rician noise is signal dependent, and separating the signal from the noise is a difficult task. In this paper, we propose a useful alternative of the nonlocal mean (NLM) filter that uses nonparametric principal component analysis (NPCA) for Rician noise reduction in MR images. This alternative is called the NPCA-NLM filter, and it results in improved accuracy and computational performance. We present an applicable method for estimating smoothing kernel width parameters for a much larger set of images and demonstrate that the number of principal components for NPCA is robust to variations in the noise as well as in images. Finally, we investigate the performance of the proposed filter with the standard NLM filter and the PCA-NLM filter on MR images corrupted with various levels of Rician noise. The experimental results indicate that the NPCA-NLM filter is the most robust to variations in images, and shows good performance at all noise levels tested.


Introduction
Magnetic resonance (MR) images are affected by several types of artifact and noise sources, such as random fluctuations in the MR signal mainly due to the thermal vibrations of ions and electrons. Such noise markedly degrades the acquisition of quantitative measurements from the data. The noise in MR images obeys a Rician distribution [1][2][3]. Unlike additive Gaussian noise, Rician noise is signal dependent, and consequently separating the signal from the noise is difficult.
There is an extensive literature on Rician noise reduction in magnetic resonance imaging (MRI), varying from the use of traditional smoothing filters to more elegant methods. Most conventional mask-based denoising filters, such as Gaussian and Wiener filters [4], are conceptually simple. However, they will most likely fail to reduce Rician noise in MRI, as they usually assume that the noise is Gaussian. Restored images may often look blurred and may be corrupted by artifacts that are usually visible around the edges. One way to overcome the problems of simple smoothing is to use a nonlocal means (NLM) filter [5][6][7][8][9][10]. These methods make use of the self-similarity of images, in that many structures show up more than once in the image. The NLM filter takes advantage of the high degree of redundancy of any natural image and produces an optimal denoising result if the noise can be modeled as Gaussian. Unfortunately, the method requires computation of the weighting terms for all possible pairs of pixels, making it computationally expensive. A number of recent reports on NLM denoising focused on shortcuts to make the method computationally practical [11][12][13][14]. One of the most compelling strategies is to exclude many weight computations between the image neighborhood feature vectors. Azzabou, et al. [11] and Tasdizen [13,14] proposed the so-called PCA-NLM filter, which uses the lower dimensional subspace of the space of image neighborhood vectors in conjunction with NLM using principal component analysis (PCA). More important, this approach was also shown to result in increased accuracy over those that use the full-dimensional ambient space.
There are, however, some applications for which the PCA-NLM filter is not recommended because PCA is sensitive to image features and the presence of noise in the data, and the PCA-NLM filter is, therefore, highly dependent on the settings of its parameters.
In this paper, we propose a nonparametric PCA-NLM filter that is a useful alternative to the PCA-NLM filter for Rician noise reduction in MR images. The proposed filter uses PCA with ranked data instead of the original pixel data. We refer to this as the NPCA-NLM filter. We estimate the subspace dimensionality from parallel analysis [15][16][17] based on the artificial rank correlation matrix. In contrast to the method reported by Tasdizen [13,14], our estimation does not require the assumption of a Gaussian distribution and produces a more robust subspace dimensionality regardless of the images being denoised. We also propose a nonparametric method for optimal smoothing kernel width selection.

Rician noise in MRI
In MRI, raw data are intrinsically complex valued and corrupted with zero mean Gaussian distributed noise with equal variance [1]. MR magnitude images are formed by simply taking the square root of the sum of the square of the two independent Gaussian random variables (real and imaginary images) pixel by pixel. After this nonlinear transformation, MR magnitude data can be shown to have a Rician distribution [1][2][3]9,18]. For an MR magnitude image, the Rician probability density function of the measured pixel intensity x is given by where I 0 is the modified zeroth-order Bessel function of the first kind, A is the underlying noise-free signal amplitude, and s denotes the standard deviation of the Gaussian noise in the real and imaginary images. When A/s is high, the Rician distribution approaches a Gaussian; when A/s approaches 0, i.e., when only noise is present, the Rician distribution becomes Rayleigh distributed and Equation 1 reduces to [1]

NLM filter
Starting from a true, discrete image u, a noisy observation of u at pixel i is defined as υ(i) = u(i) + n(i). Let N i and v (N i ) denote a square neighborhood centered around pixel i and the image neighborhood vector, the elements of which are the gray-level values of υ at N i , respectively. Also, let S i be a square search-window of fixed size centered around pixel i. Then, the NLM filter [5,6] defines an estimator for u(i) aŝ where Z i is a normalization constant and h acts as a smoothing parameter controlling the decay of the exponential function.
This method is too slow to be practical. The high computational complexity is due to the cost of weight calculation for all pixels in the image during the process of denoising. For every pixel being processed, the whole image is searched and the differences between corresponding neighborhoods computed.

NPCA approach
We include a brief description of the projections of the image neighborhood vectors onto a lower dimensional space by NPCA, which uses their ranks instead of the original data. Let Ω denote the entire set of pixels in the image and let Ψ be a randomly chosen subset of Ω. Also, let R i denote the rank vector of {x i , i = 1, ..., Q} in each window of Q size, where Q = r × r. The principal components of Q rank vector can be obtained from the eigenvectors of the rank covariance matrix whereR = (1/| |) i∈ R i is the rank mean and |Ψ| is the number of elements in the set Ψ. Let {e p : p = 1,...,Q} be the eigenvectors of C R , i.e., the principal neighborhoods, sorted in order of descending eigenvalues. Let the d-dimensional NPCA subspace be the space spanned by {e p : p = 1,..., d}. Then, the projection of the image neighborhood vectors onto d-dimensional NPCA subspace is where (R i • e p ) denotes the inner product of the two vectors. Let T be the d-dimensional vector of projection coefficients. Then, because of the orthonormality of the basis Then, the d-dimension NLM algorithm in NPCA space iŝ where is the new normalizing term. Note that the proposed approach with d = Q is equivalent to the useful NLM filter when applied to ranks rather than the original observations, i.e., Equation 4 with d = Q becomes Equation 2 when calculated on the ranks.

Optimal smoothing parameter selection
Given a noisy image and a combination of N and d, there exists an optimal choice of the parameter h in Equation 3.2 that yields the best output in terms of peak signal-to-noise ratio (PSNR). Figure 1 shows the PSNR of the estimator output û d as a function of h for an image corrupted with Rician noise (s= 30). Buades et al. [5,6] reported that the smoothing parameter h depends on the standard deviation of the noise s, and typically a good choice for 2D images is h ≈ 10s. Tasdizen [14] used simple linear regression based on the least squares method as an automatic way of choosing h given an image neighborhood size and subspace dimensionality size. Working under the normality assumption, he achieved a reasonable applicable rule for parameter selection. If the pixel values are not normally distributed, then h in [14] may not be optimal. In practice, the choice of h that is applicable in situations where the normal procedures cannot be utilized may be appropriate. We take as our model where b(r, d) and a(r, d) are unknown parameters and is an independent and identically distributed random variable with an arbitrary continuous distribution function F. We use a nonparametric method [19] for the linear regression problem as an automatic way of choosing h given an image neighborhood size and d. Table 1 shows the linear fit parameters for several choices of d of 7 × 7 image neighborhoods. Parameters such as those shown in Table 1 can be precomputed for all N and d of interest.
Unlike Tasdizen [14], the linear fit parameters in Table 1 do not require the assumption of a Gaussian distribution of the noise with which the images are corrupted. Therefore, we expect that h produced by these parameters will yield results for a much larger set of images than that from which they were learned.

Automatic dimensionality selection
Determining the number of components to retain is a crucial problem when using PCA. Of several methods proposed for determining the significance of principal components, the K1 method proposed by Kaiser [20] is the best known and most commonly utilized in practice [21]. However, this method often overextracts components [22]. Another commonly used approach is based on Cattell's Scree test [23], which involves the visual exploration of a graphical representation of the eigenvalues. This method has been criticized for its subjectivity, as there is no objective definition of the cutoff point between the important and trivial factors. Parallel analysis [15] is more accurate than the above methods for determining the number of retained components, but it tends to overextract components [22]. Tasdizen [14] proposed a modification of the parallel analysis algorithm for determining the number of components in PCA of image neighborhoods for denoising. One of the main drawbacks of parallel analysis is that the number of principal components to retain is highly dependent on the images and the noise. Therefore, different numbers of principal components are required for images with different features. A quick solution to this problem is to use a modified procedure for parallel analysis of ranked data. Let l p for 1 ≤ p ≤ Q denote the eigenvalues of C R in Equation 3 sorted in descending order. Similarly, let a p denote the sorted eigenvalues of the artificial   Figure 2 shows the automatic dimensionality selection results for brain, body, and knee images with noise s ranging from 5 to 50 in increments of 5. The numbers of significant components for PCA as shown in Figure 2 were computed as about 10, 14, and 16 for brain, body, and knee images, respectively; however, the numbers of significant components for NPCA were all computed as about 15. It is important to note that the numbers of components vary more significantly with noise levels for PCA than for NPCA. Therefore, the number of principal components for NPCA is more robust to variations in noise as well as in images than for PCA. Figure 3 shows the PSNR of the estimator output û as a function of d for a brain image that was corrupted with Rician noise(s = 10). The curve for PCA showed a steep increase until the peak d, for example, d = 15, after which the curve declined significantly. In contrast, the curve for NPCA increased similarly until the peak point, after which it became considerably flat. Therefore, as expected, the incorrect determination of the number of components for PCA can result in remarkable PSNR loss.

Experiments and results
The proposed NPCA-NLM filter was tested using 256 × 256, 8-bits/pixel MR images, i.e., the brain, body, and knee images shown in Figure 4. The performance of the proposed filter was tested for various levels of noise corruption and compared with the standard NLM filter and recently proposed PCA-NLM filter.
We generated MR magnitude data by adding Rician noise to noise-free images. The Rician noise was created as y e (t i ) = [y(t i ) + e 1 ] 2 + e 2 2 , , where y is the true signal and e 1 and e 2 are random numbers from a Gaussian distribution with zero mean and standard deviation s [9]. Four levels of noise were tested with s = [10,20,30,40]. Figure 5 shows close-up images of the test images in Figure 4, which were corrupted with Rician noise s = 10 and 40.
In addition to the visual quality, the performance of the proposed filter was measured by the following criteria: PSNR, mean absolute error (MAE), and structural dissimilarity (DSSIM). For 8-bit images, PSNR and MAE are defined as follows: where Î(i,j) and I(i,j) denote the pixel values of the restored image and the original image, respectively, at location (i,j) and M × N is the size of the image. Higher PSNR values indicate better restoration, and smaller MAE values indicate that the filter can preserve more details and edges. However, they were not very well matched to perceived visual quality. DSSIM is a distance metric derived from structural similarity (SSIM), which takes into account the human visual system, and was defined as follows [24]: where SSIM is given by where μ x and μ y are means of x and y, respectively; σ 2 x and σ 2 y are variances of x and y, respectively; and cov xy is covariance of x and y. The constants were set as follows: c 1 = 0.01L and c 2 = 0.03L, and L was 255 for 8bits/pixel gray scale images. Figures 6 and 7 show the results of applying the three filters to noisy images corrupted with Rician noise s = 10 in Figure 5a-c, and Rician noise s = 40 in Figure 5df, respectively.

Visual quality comparison
As shown in Figure 6, the three filters based on the NLM filter performed well on images with low noise variance (s = 10). The differences in performance of these filters are difficult to distinguish in the restored images for low noise, but inspection of images with high noise in Figure 7 showed that the denoising effects of the NPCA-NLM filter and PCA-NLM filter were almost identical, except for slight blurring of the output from the PCA-NLM filter. However, the impact of noise on the standard NLM filter was clearly visible, and the restored images contained considerable noise spots.

Quantitative comparison
The quantitative performances in terms of PSNR, MAE, and DSSIM for all of the algorithms are given in Tables 2, 3, and 4, respectively. The results shown in these tables indicate that the NPCA-NLM filter had the best performance among the filters examined for the brain image over all noise ranges s = 10, 20, 30, 40. The NPCA-NLM filter showed better performance than the NLM filter and PCA-NLM filter for body and knee images, which were corrupted with s = 10 and 40. It should be noted that the performance gap between the NPCA-NLM filter and the PCA-NLM filter increased on images with high noise. This was expected, as the PCA-NLM filter is more sensitive to noise. Our filter with the PCA-NLM filter still showed good performance on images that were corrupted with s = 20 and 30.   Of the three filters investigated, the NPCA-NLM filter appeared to be the most robust to variations in images, performing well at all noise distributions tested.

Conclusion
We proposed an NPCA-NLM filter, which is a useful alternative to the PCA-NLM filter for Rician noise reduction in MR images. The filter uses PCA with   ranked data instead of the original pixel data. The image neighborhood vectors used in the NLM filter are projected onto a lower dimensional subspace using NPCA. Therefore, the lower dimensional projections are not only used as search criteria, but also for computing similarity weights resulting in better accuracy in addition to reduced computational cost. We estimated the subspace dimensionality from parallel analysis based on the artificial rank correlation matrix. We demonstrated that the numbers of components varied more significantly with noise level for PCA than for NPCA. Therefore, the number of principal components was more robust to variations in the noise as well as in the images for NPCA than for PCA. We also proposed a nonparametric method for optimal smoothing kernel width selection that produces results for a much larger set of images than that from which they were learned.
We investigated the performance of the proposed filter in comparison with the standard NLM filter and the recently proposed the PCA-NLM filter for various levels of Rician noise corruption. The experimental results showed that the NPCA-NLM filter was the most robust to variations in images, performing well at all noise levels tested.