
An adaptive bandwidth nonlocal means image denoising in wavelet domain


This paper proposes a new wavelet domain denoising algorithm. In the results of conventional wavelet domain denoising methods, ringing artifacts or wavelet-shaped noise are sometimes observed due to thresholding of small but important coefficients or due to generation of large coefficients in flat areas. In this paper, nonlocal means filtering is applied to each subband of the wavelet decomposition, which can keep small coefficients and does not generate unwanted large coefficients. Since the performance of nonlocal means filtering depends on an appropriate kernel bandwidth, we also propose a method to find global and local kernel bandwidths for each subband. In comparison with conventional methods, the proposed method shows lower PSNR than BM3D when pseudo white Gaussian noise is added, but higher PSNR than the spatial nonlocal means filtering and wavelet thresholding methods. For mixture noise or Poisson noise, which may better explain the real noise from camera sensors, the proposed method shows results that are better than or comparable to those of the state-of-the-art methods. Also, it is believed that the proposed method shows better subjective quality for noisy images captured in low-illumination conditions.


Denoising is one of the fundamental image processing problems and thus has been studied for a long time. To name a few of the existing methods that are related to our work and the state-of-the-art methods, there are wavelet shrinkage methods [1, 2], total variation minimization [3], prior probability modeling [4], nonlocal means filtering [5], and BM3D [6]. Among these, BM3D generally shows the highest PSNR when the noise is additive white Gaussian.

In the case of wavelet domain thresholding methods [1, 2], an image is transformed into the wavelet domain, and the coefficients in each subband are suppressed by hard or soft thresholding. The advantage of wavelet shrinkage methods is that they require little computation while providing pleasing results. The probabilistic wavelet coefficient modeling method [4] fits the neighborhoods of coefficients with a Gaussian scale mixture (GSM) model and applies the Bayesian least squares (BLS) technique to adjust the coefficients. Although wavelet shrinkage methods and BLS-GSM provide relatively high PSNR improvement, shrinking or modifying wavelet coefficients sometimes brings ringing or wavelet-shaped artifacts. For example, the wavelet transformation of a step edge generates small coefficients up to the highest subbands. Hence, when the small coefficients are removed by thresholding and the result is inverse transformed, ringing artifacts arise due to the loss of high frequencies. In the case of probabilistic wavelet coefficient modeling, unwanted coefficients can be generated in homogeneous regions, which results in wavelet-shaped artifacts in the spatial domain. Another popular denoising method is nonlocal means filtering [5], which substitutes a noisy pixel by the weighted sum of neighborhood pixels. The weights are determined based on kernel density estimation, which can be regarded as a Nadaraya-Watson estimator, i.e., a kind of local constant regression [7]. In other words, the smooth kernel estimate in the nonlocal means approach is a sum of bumps placed on the data points. The kernel function determines the shape of the bumps, and the ‘smoothing parameter’ or ‘bandwidth’ controls the degree of smoothness. In [8], an automatic bandwidth selection method was proposed based on the reduction of entropy of image patterns, and the global bandwidth was applied to the overall area of the image.
However, it is noted that narrower kernels are suitable for the complex regions, whereas larger kernels would be better for more sparse areas. Hence, it is important to find an appropriate bandwidth according to the local characteristics, which is not an easy task. One of the main factors that strongly influence the local properties of the image is the noise statistics in the neighborhood, and thus, the bandwidth needs to be adaptively determined according to the local noise variance. In summary, we need to estimate the local noise statistics for finding an appropriate bandwidth for the given region. There are many methods for estimating the variance of white additive noise in images, but they cannot be used for the images with non-uniform noise variance. In this consideration, the estimation of local noise statistics is necessary to find the appropriate bandwidth for the given area.

In this paper, inspired by the performance of nonlocal means filtering method in keeping the structures of the image while suppressing the noise, we attempt to apply the nonlocal means filter to the wavelet coefficients. The wavelet coefficients contain the information on the structures of the image, which have different but related characteristics depending on the subbands. This property has been extensively and effectively exploited in many image processing applications, including denoising. However, as stated previously, manipulation of wavelet coefficients sometimes brings ringing artifacts and wavelet-shaped noise. Hence, instead of thresholding or generating the wavelet coefficients, we filter the coefficients based on the nonlocal means approach. This approach keeps small but important wavelet coefficients which would have been thresholded in the conventional schemes and also does not generate large coefficients in homogeneous regions while effectively suppressing noisy ones. In applying the nonlocal means filter, determining the bandwidth is also an important factor for successful filtering. Hence, we also propose a method that gives different bandwidths to each subband and region, depending on its properties and noise statistics.

The experiments are conducted with various types of pseudo noises and also with real noise that is observed in the images taken in low-illumination conditions. It is shown that the proposed method gives lower PSNR than the state-of-the-art methods such as BM3D [6] and BLS-GSM [4] when the white Gaussian noise is added. However, it gives higher PSNR than the conventional wavelet shrinkage methods and the spatial nonlocal means filtering method. Also, it gives higher PSNR than BM3D when the noise is a mixture of Gaussian and impulsive noises and when the noise model is Poisson which better explains the real noise from CCD/CMOS sensors [9]. For the experiments with real noise, images taken under low-illumination conditions and film images are denoised by various denoising methods. Subjective comparison shows that the wavelet domain nonlocal means filtering provides competitive results for real noises, which supports the simulation results with non-Gaussian noises.

The rest of this paper is organized as follows. In the second section, we review the nonlocal means filter and its bandwidth parameter estimation. In the third section, we propose the extension of nonlocal means filter to the wavelet domain denoising with the bandwidth selection method. Then, we show some experimental results on the images degraded by various pseudo noise and the images with real noise. The last section concludes this paper.

Related works

Nonlocal means filter

Let us denote a noisy observation of an image as $y(i) = u(i) + n(i)$, where $y(i)$, $u(i)$, and $n(i)$ are the noisy observation, the original image, and the noise, respectively, at the $i$th pixel. Also, we define $N_i$ and $S_i$ as a square neighborhood and a square search window centered at the pixel $i$, respectively. Then, the nonlocal means filter can be described as [5]

$$\hat{u}(i) = \sum_{j \in S_i} \frac{1}{Z(i)} \, e^{-\frac{\|Y_i - Y_j\|^2}{h^2}} \, y(j), \tag{1}$$

where $Y_i$ represents the vector of pixel intensities in $N_i$, $Z(i) = \sum_{j \in S_i} e^{-\|Y_i - Y_j\|^2 / h^2}$ is a normalizing factor, and $h$ is the smoothing kernel width which controls the degree of averaging. The denoised pixel $\hat{u}(i)$ is obtained by locally weighted averaging, which corresponds to the Nadaraya-Watson estimator [10]. From Equation (1), it can be seen that a small $h$ shrinks the area of averaging, and thus, the noise is not suppressed enough. Conversely, if $h$ is too large, the weights at the boundary of $S_i$ are also very large, which results in a blurry output. In the conventional work [5], $h$ is set between 10σ and 15σ, where the noise standard deviation σ is estimated from the image statistics.
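As an illustration of the weighted averaging described above, the following is a minimal Python sketch of the nonlocal means filter. It is written for a 1-D signal to keep it short; the function name `nlm_1d`, the border clamping, and the default patch and search radii are assumptions made for this example and are not taken from [5].

```python
import math

def nlm_1d(y, h, patch_radius=1, search_radius=5):
    """Nonlocal means on a 1-D signal y (list of floats) with kernel width h."""
    n = len(y)

    def patch(i):
        # Neighborhood N_i around sample i; indices are clamped at the borders
        # (an assumption of this sketch).
        return [y[min(max(i + k, 0), n - 1)]
                for k in range(-patch_radius, patch_radius + 1)]

    out = []
    for i in range(n):
        y_i = patch(i)
        weighted_sum = 0.0
        z = 0.0  # normalizing factor Z(i)
        # Search window S_i centered at i.
        for j in range(max(0, i - search_radius), min(n, i + search_radius + 1)):
            y_j = patch(j)
            dist2 = sum((a - b) ** 2 for a, b in zip(y_i, y_j))
            w = math.exp(-dist2 / h ** 2)
            weighted_sum += w * y[j]
            z += w
        out.append(weighted_sum / z)
    return out
```

With a very small `h`, only samples whose patches match exactly contribute, so a clean step edge is returned unchanged; with a very large `h`, every sample in the search window receives almost the same weight and the output approaches a plain local average.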

Bandwidth selection

Choosing an appropriate bandwidth is thus very important for the balanced nonlocal means filtering. Traditionally, the bandwidth h is selected to minimize the error between the estimate and true density. For this purpose, the mean square error (MSE) at a point x is defined as [7]

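$$\mathrm{MSE}_x(p_{\mathrm{KDE}}) = E\left[\left(p_{\mathrm{KDE}}(x) - p(x)\right)^2\right] = \left(E\left[p_{\mathrm{KDE}}(x) - p(x)\right]\right)^2 + \mathrm{var}\left(p_{\mathrm{KDE}}(x)\right), \tag{2}$$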

where $p_{\mathrm{KDE}}(x)$ is the kernel density estimate of the true density $p(x)$ at a point $x$. This shows that there is a tradeoff between the bias and variance, which also means that a large bandwidth reduces the variance of the estimator but increases the bias and vice versa.

There have been several approaches to bandwidth estimation [11], which include subjective choice based on the asymptotic mean squared error, cross-validation methods using pseudo-likelihood maximization by the leave-one-out criterion, and plug-in estimator based on the asymptotically best choice of h. Also, the existing methods can be categorized as global or local bandwidth adjustment, where the local adaptivity gives better performance but requires heavy computational burden.

Wavelet denoising

The wavelet transform has an excellent localization property and has thus been shown to be effective in many image processing applications. Since the work of Donoho and Johnstone [1], there has been much research on wavelet shrinkage methods. These wavelet denoising methods suppress the noisy coefficient magnitudes while keeping the local structures. Ideally, only the wavelet coefficients that correspond to the noise component should be removed, whereas the coefficients containing a significant structure component should be less reduced. Figure 1 shows the comparison of wavelet coefficients of noisy and noiseless images, where it can be seen that small coefficients appear in the subbands of the noisy image. Hence, one of the popular wavelet domain denoising methods is to shrink the coefficients by thresholding [1, 2], i.e., the coefficients under a certain magnitude are treated as nonsignificant and are set to zero, while the remaining significant ones are kept unmodified (hard-thresholding) or their magnitudes are reduced (soft-thresholding). Unfortunately, edges often generate small wavelet coefficients up to the highest bands, along with the significant ones. Hence, the suppression of the small coefficients around the significant ones results in ringing artifacts, i.e., the Gibbs phenomenon.
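The two thresholding rules mentioned above can be sketched in a few lines of Python. The function names and the sample coefficients are hypothetical and serve only to illustrate the difference between the rules.

```python
def hard_threshold(c, t):
    """Keep a coefficient unchanged if its magnitude exceeds t, else set it to zero."""
    return c if abs(c) > t else 0.0

def soft_threshold(c, t):
    """Zero small coefficients and reduce the magnitude of the others by t."""
    if abs(c) <= t:
        return 0.0
    return c - t if c > 0 else c + t

# Hypothetical subband coefficients: two significant ones and two small ones.
coeffs = [5.0, 0.4, -0.3, -2.0]
hard = [hard_threshold(c, 0.5) for c in coeffs]   # [5.0, 0.0, 0.0, -2.0]
soft = [soft_threshold(c, 0.5) for c in coeffs]   # [4.5, 0.0, 0.0, -1.5]
```

In both cases the small coefficients 0.4 and -0.3 are discarded regardless of whether they belong to noise or to an edge, which is the source of the ringing discussed above.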

Figure 1

Noise in the wavelet domain. Wavelet coefficients of the vertical subband in a noisy image and a noise-free image.

Thus, instead of thresholding or making a probabilistic decision, bilateral filtering of wavelet coefficients has been shown to provide competitive results [12]. In our previous work [13], we have also shown that the wavelet domain nonlocal means filter provides higher PSNR than the spatial domain nonlocal means filtering. In this paper, we improve the performance by finding the locally adaptive bandwidth for each subband and region, whereas the previous work applied a global bandwidth. In addition, we test the algorithm for various kinds of noise models such as mixture noise or Poisson noise, which better explain the real camera noise. Also, it is tested on real noise that arises in low-illumination conditions and on film grain noise.

Wavelet domain nonlocal means filter with adaptive bandwidth

The main idea of our work is to apply the nonlocal means filtering to scaling and wavelet coefficients of an image to keep small but important coefficients which might have been shrunk in conventional wavelet denoising. Another contribution is the derivation of global and local bandwidth for the wavelet-domain nonlocal means filtering, according to the subband’s statistics. It is noted that each subband has different noise statistics which may also vary depending on the location in each band. Hence, for each subband, we first find a global bandwidth that can be applied to the overall subband, based on the plug-in method [14]. Then, from Abramson’s rule [15] using the statistics derived in this process, we also find the locally adaptive bandwidth in each subband.

In kernel density estimation, the estimated density at any point x is formulated as

$$\hat{f}_h(x) = \frac{1}{n} \sum_{i=1}^{n} K\left(\frac{x - x(i)}{h}\right), \tag{3}$$

where $x(i)$ is a neighboring point to $x$, $n$ is the number of neighbors, $K(\cdot)$ is the kernel function, and $h$ is its bandwidth. The kernel function can be considered a weighting factor that gives a larger value when $x(i)$ is close to $x$, and $h$ is also called the smoothing constant. A typical shape of the kernel function is Gaussian, and the bandwidth $h$ determines its width and thus the smoothing factor in estimating the kernel density. To obtain the globally optimal bandwidth for the given data, denoted as $h_{go}$, the conventional method is to find the $h$ that minimizes the mean integrated squared error (MISE) between $\hat{f}_h(x)$ defined above and the true but unknown density $f(x)$ as [7]

$$h_{go} = \arg\min_h \mathrm{MISE}(\hat{f}_h) = \arg\min_h E \int \left\{\hat{f}_h(x) - f(x)\right\}^2 dx. \tag{4}$$

Since the plug-in approach [14] is known to be one of the best data-driven bandwidth selection methods, we employ this approach to minimize the MISE, and we obtain the bandwidth

$$h_{go} = \left[\frac{\|K\|_2^2}{\|f''\|_2^2 \, \{\mu_2(K)\}^2 \, M}\right]^{1/5}, \tag{5}$$

where $\|K\|_2^2 = \int K^2(x)\,dx$ and $\mu_2(K) = \int x^2 K(x)\,dx$ are constants depending on the kernel function, and $M$ is the number of sample data in the subband. Note that $\|f''\|_2^2$ is the only unknown term in Equation (5), and the idea behind the plug-in approach is to replace $f''$ by an estimate from the data. Silverman’s rule of thumb [16] computes $f''$ as if $f$ had the density of the normal distribution $N(\mu, \sigma^2)$, and then the optimal global bandwidth for the subband can be approximated as

$$h_{go} = \left(\frac{4\hat{\sigma}^5}{3M}\right)^{1/5} \approx 1.06\,\hat{\sigma}\,M^{-1/5}, \tag{6}$$

where $\hat{\sigma}$ is the standard deviation of the noise to be estimated. To obtain $\hat{\sigma}$, we employ the empirical preliminary estimation from each subband’s wavelet coefficients as introduced in [17]:

$$\hat{\sigma} = 1.4826\,\mathrm{med}\left(\left|r - \mathrm{med}(r)\right|\right), \tag{7}$$

where $r = \{r_1, r_2, \ldots, r_{|X|}\}$ is the set of residuals of all the wavelet coefficients in the subband, and $|X|$ is the total number of coefficients in the subband. The residual $r_i$ is defined as

$$r_i = \frac{2X_{m,n} - (X_{m+1,n} + X_{m,n+1})}{\sqrt{6}}, \tag{8}$$

where $i$ is the index for the pixel position $(m,n)$, and $X_{m,n}$ is the wavelet coefficient at that position. The residual can be considered a prediction error of $X_{m,n}$ from its neighboring data, and the median operation over the residuals in Equation (7) gives an approximate standard deviation of the data. In summary, we estimate the noise standard deviation by Equation (7) for each subband, and then the nonlocal means filter with the bandwidth in Equation (6) can be applied to the given subband.
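The global bandwidth computation summarized above (residuals, median-based noise estimate, and rule-of-thumb bandwidth) can be sketched in Python as follows. The subband is assumed to be a 2-D list `X[m][n]`; skipping the last row and column, where the neighbors are unavailable, and the function names are assumptions of this sketch.

```python
import math
import statistics

def residuals(X):
    """Prediction residuals r_i from the lower and right neighbors of each coefficient."""
    rows, cols = len(X), len(X[0])
    return [(2 * X[m][n] - (X[m + 1][n] + X[m][n + 1])) / math.sqrt(6)
            for m in range(rows - 1) for n in range(cols - 1)]

def mad_sigma(r):
    """Noise standard deviation from the median absolute deviation of r."""
    med = statistics.median(r)
    return 1.4826 * statistics.median([abs(v - med) for v in r])

def global_bandwidth(X):
    """Return (h_go, sigma_hat) for one subband."""
    sigma_hat = mad_sigma(residuals(X))
    M = len(X) * len(X[0])  # number of coefficients in the subband
    return 1.06 * sigma_hat * M ** (-1 / 5), sigma_hat
```

For a subband filled with independent Gaussian samples of standard deviation 2, `global_bandwidth` should return a `sigma_hat` close to 2, because the residual is scaled so that it has the same variance as the noise.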

In addition to the above global characteristics, the consideration of local statistics brings better denoising results. That is, applying a locally adaptive bandwidth yields a better result than applying the above $h_{go}$ to the overall subband. For the derivation of the locally adaptive bandwidth, let us denote the $i$th wavelet coefficient in the $l$th subband as

$$X_l(i) = \alpha_l(i) + \varepsilon_l(i), \tag{9}$$

where $\alpha_l(i)$ denotes a noise-free coefficient, and $\varepsilon_l(i)$ denotes a random variable assumed to be $N(0, \sigma_{\varepsilon_l}^2(i))$. When there seems to be no confusion, we will drop the indexes and subscripts of the above notation in the rest of the paper.

The main step for the locally adaptive filtering is to estimate the local noise statistics $\hat{\sigma}_{lo}$ from a set of $L \times L$ coefficients centered at a pixel of interest. This requires a hypothesis test for determining whether a coefficient $X$ is a noise coefficient or not, and then we compute the variance of the noise coefficients within the $L \times L$ window. Here, the hypothesis test follows the algorithm in probabilistic wavelet shrinkage [18]. To be precise, we first model the wavelet coefficient as a sample of a generalized Gaussian random variable with the probability density function

$$f_G(\alpha) = \frac{\lambda \nu}{2\Gamma(1/\nu)} \exp\left(-|\lambda\alpha|^{\nu}\right), \tag{10}$$

where $\Gamma(z) = \int_0^{\infty} t^{z-1} e^{-t}\,dt$, $z > 0$, is the Gamma function, $\lambda > 0$ is the scale parameter, and $\nu$ is the shape parameter. To test whether a given $X$ is a noise coefficient or not, we test its significance by a binary hypothesis test: $H_0$ is the hypothesis that $X$ is a noise coefficient, and $H_1$ is the hypothesis that it is a significant one. To assess the hypotheses, we use Bayes’ rule, where it is assumed that a prior is known, and its parameters are random variables. Bayes’ rule produces the conditional probability $P(H_1 \mid X) = \mu\eta/(1 + \mu\eta)$, where $\mu = P(H_1)/P(H_0)$, $\eta = f_G(X \mid H_1)/f_G(X \mid H_0)$, and the product $\mu\eta$ defines the generalized likelihood ratio. From the assumption of the generalized Gaussian prior and Bayes’ rule, the estimate of the true coefficient is represented as

$$\hat{\beta} = P(H_1 \mid X)\,X = \frac{\mu\eta}{1 + \mu\eta}\,X, \tag{11}$$

which is a simple shrinkage rule for the given wavelet coefficient under the hypothesis test. We use $P(H_1 \mid X) = \mu\eta/(1 + \mu\eta)$ as a measure to decide the noisy coefficients.

For details of calculating the parameters above, the conditional densities of the noisy coefficients $f_G(X \mid H_0)$ and the noise-free coefficients $f_G(X \mid H_1)$ are defined as the following convolutions [18]:

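$$\begin{aligned} f_G(X \mid H_0) &= \int_{-\infty}^{\infty} \phi(X - \alpha; \sigma)\, f_G(\alpha \mid H_0)\, d\alpha, \\ f_G(X \mid H_1) &= \int_{-\infty}^{\infty} \phi(X - \alpha; \sigma)\, f_G(\alpha \mid H_1)\, d\alpha, \end{aligned}$$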

where $\phi(X;\sigma)$ is the zero-mean Gaussian density, and the standard deviation $\sigma$ is computed as in Equation (7). The hypothesis test defines an element as a significant one when its magnitude is larger than a threshold and as a noise coefficient otherwise, and thus the conditional densities are defined as

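$$f_G(\alpha \mid H_0) = \begin{cases} B_0 \exp\left(-|\lambda\alpha|^{\nu}\right), & \text{if } |\alpha| \le T_\alpha \\ 0, & \text{if } |\alpha| > T_\alpha \end{cases}$$

$$f_G(\alpha \mid H_1) = \begin{cases} 0, & \text{if } |\alpha| \le T_\alpha \\ B_1 \exp\left(-|\lambda\alpha|^{\nu}\right), & \text{if } |\alpha| > T_\alpha, \end{cases}$$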

where $T_\alpha = \hat{\sigma}$ is the threshold, and $B_0$ and $B_1$ are the normalizing constants

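$$B_0 = \left[\int_{-T}^{T} \exp\left(-|\lambda\alpha|^{\nu}\right) d\alpha\right]^{-1} \quad \text{and} \quad B_1 = \left[2\int_{T}^{\infty} \exp\left(-|\lambda\alpha|^{\nu}\right) d\alpha\right]^{-1}.$$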
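To make the hypothesis test more concrete, the following Python sketch evaluates the two conditional densities by numerical integration and forms the posterior probability of a significant coefficient for a given prior ratio μ. The midpoint rule, the truncation of the integrals to `[-half_width, half_width]`, the step count, and all function names are assumptions made for illustration; they are not the procedure of [18].

```python
import math

def gauss_pdf(x, sigma):
    """Zero-mean Gaussian density phi(x; sigma)."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def conditional_densities(X, sigma, lam, nu, T, half_width, steps=4000):
    """Approximate f_G(X|H0) and f_G(X|H1) with the midpoint rule.

    The integrals over alpha are truncated to [-half_width, half_width],
    which must be chosen large enough for the tails of the prior.
    """
    d = 2.0 * half_width / steps
    num0 = num1 = mass0 = mass1 = 0.0
    for k in range(steps):
        a = -half_width + (k + 0.5) * d
        prior = math.exp(-abs(lam * a) ** nu)  # unnormalized generalized Gaussian
        lik = gauss_pdf(X - a, sigma)
        if abs(a) <= T:       # support of the prior under H0
            mass0 += prior
            num0 += lik * prior
        else:                 # support of the prior under H1
            mass1 += prior
            num1 += lik * prior
    # mass0 * d and mass1 * d approximate 1/B0 and 1/B1; the step d cancels.
    return num0 / mass0, num1 / mass1

def posterior_h1(X, sigma, lam, nu, T, mu, half_width, steps=4000):
    """P(H1|X) = mu * eta / (1 + mu * eta), with eta = f(X|H1) / f(X|H0)."""
    f0, f1 = conditional_densities(X, sigma, lam, nu, T, half_width, steps)
    if f0 == 0.0:  # guard against underflow for very large |X|
        return 1.0
    eta = f1 / f0
    return mu * eta / (1.0 + mu * eta)
```

For fixed parameters, the returned probability grows with the magnitude of `X`, so a coefficient near zero is more likely to be classified as noise than a large one.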

The following steps are also from the wavelet shrinkage method [18] and are repeated here for convenience. The parameters $\lambda$ and $\nu$ are determined by the noise statistics in each band. To be precise, let $\sigma_X^2$ be the variance of the overall coefficients in the subband, $m_{4,X}$ be their fourth moment, and $\sigma_\alpha$ be the standard deviation of noise-free coefficients in the band. Then, from [19], $\nu$ and $\lambda$ are found from the following equations:

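$$\frac{\Gamma(1/\nu)\,\Gamma(5/\nu)}{\Gamma^2(3/\nu)} = \frac{m_{4,X} + 3\sigma_\alpha^4 - 6\sigma_\alpha^2\sigma_X^2}{\left(\sigma_X^2 - \sigma_\alpha^2\right)^2}, \quad \text{and} \quad \lambda = \left[\left(\sigma_X^2 - \sigma_\alpha^2\right) \frac{\Gamma(1/\nu)}{\Gamma(3/\nu)}\right]^{-1/2}.$$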

From the left equation above, ν can be derived numerically, and it is used for computing λ from the right equation above. The next step is to compute P(H 0) and P(H 1) as [18]:
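One possible way to carry out the numerical step for ν is bisection, since the Gamma-function ratio on the left-hand side decreases monotonically as ν grows. The sketch below is an illustration only: the search interval [0.1, 10], the tolerance, and the function names are assumptions, and `var_diff` stands for the difference of variances that appears in the equation for λ.

```python
import math

def gamma_ratio(nu):
    """Left-hand side of the shape equation: G(1/nu) G(5/nu) / G(3/nu)^2."""
    return math.gamma(1 / nu) * math.gamma(5 / nu) / math.gamma(3 / nu) ** 2

def solve_shape(target, lo=0.1, hi=10.0, tol=1e-10):
    """Find nu with gamma_ratio(nu) = target by bisection.

    gamma_ratio decreases in nu, so the target should lie between
    gamma_ratio(hi) (about 1.88) and gamma_ratio(lo); otherwise the
    result is simply pushed to the nearest end of the interval.
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gamma_ratio(mid) > target:
            lo = mid   # ratio too large: nu must increase
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

def scale_from_shape(var_diff, nu):
    """Scale parameter lambda from the variance difference and the shape nu."""
    return (var_diff * math.gamma(1 / nu) / math.gamma(3 / nu)) ** -0.5
```

As a sanity check, a target ratio of 3 gives ν = 2 (the Gaussian case) and a target of 6 gives ν = 1 (the Laplacian case).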

$$P(H_0) = \Gamma_{inc}\left((\lambda T)^{\nu}, \frac{1}{\nu}\right) \quad \text{and} \quad P(H_1) = 1 - P(H_0),$$

where $\Gamma_{inc}(x, a) = \frac{1}{\Gamma(a)} \int_0^x t^{a-1} e^{-t}\,dt$ is the incomplete gamma function.
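The prior probability of the noise hypothesis can be evaluated with a short series expansion of the incomplete gamma function, as in the following Python sketch. The power series, the term limit, and the function names are choices made for this example; a library routine for the regularized lower incomplete gamma function could be used instead.

```python
import math

def gamma_inc(x, a, max_terms=500):
    """Regularized lower incomplete gamma function via its power series.

    Uses gamma(a, x) = x^a e^(-x) * sum_k x^k / (a (a+1) ... (a+k));
    adequate for moderate x, which is all this sketch needs.
    """
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    for k in range(1, max_terms):
        term *= x / (a + k)
        total += term
        if term < 1e-16 * total:
            break
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))

def prior_probabilities(lam, nu, T):
    """Return (P(H0), P(H1)) for scale lam, shape nu, and threshold T."""
    p0 = gamma_inc((lam * T) ** nu, 1.0 / nu)
    return p0, 1.0 - p0
```

Two special cases give a quick check: for ν = 2 the value of P(H0) reduces to erf(λT), and for ν = 1 it reduces to 1 − exp(−λT).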

From the parameters derived above, we can compute $P(H_1 \mid X) = \mu\eta/(1 + \mu\eta)$ in Equation (11) and test whether this is above a certain threshold $T_\alpha$. If $P(H_1 \mid X) < T_\alpha$, then the corresponding coefficient is considered a noisy one. For all the coefficients that are determined to be noisy in the $L \times L$ window, the local noise standard deviation $\hat{\sigma}_{lo}$ is estimated, and in the same manner as $h_{go}$ in Equation (6) is derived, the local bandwidth is determined as

$$h_{lo}(X_i) \approx 1.06\,\hat{\sigma}_{lo}\,L_{lo}^{-1/5},$$

where $L_{lo}$ is the number of noise coefficients in the $L \times L$ window centered at $X_i$. In summary, Figure 2 shows the block diagram of the overall process of the proposed method, and the algorithm is summarized as follows:

Figure 2

Block diagram of the proposed method.

Algorithm 1 Summary of wavelet domain denoising
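The local bandwidth step can be sketched in Python as follows, assuming that the posterior probability P(H1|X) has already been computed for every coefficient in the window. Using the population standard deviation of the selected coefficients as the local noise estimate, returning a caller-supplied fallback (for example the global bandwidth) when fewer than two noise coefficients are found, and the function name are all assumptions of this sketch.

```python
import statistics

def local_bandwidth(window, p_h1, threshold=0.3, fallback=None):
    """Local bandwidth from the coefficients of one L x L window.

    window    : flat list of the wavelet coefficients in the window
    p_h1      : P(H1|X) for each coefficient, in the same order
    threshold : coefficients with P(H1|X) below it are treated as noise
    fallback  : value returned when too few noise coefficients are found
    """
    noise = [x for x, p in zip(window, p_h1) if p < threshold]
    if len(noise) < 2:
        return fallback
    sigma_lo = statistics.pstdev(noise)  # local noise standard deviation
    return 1.06 * sigma_lo * len(noise) ** (-1 / 5)

# Hypothetical window: four noise-like coefficients and one significant one.
h = local_bandwidth([1.0, -1.0, 1.0, -1.0, 50.0], [0.1, 0.1, 0.2, 0.1, 0.9])
```

In this example the large coefficient is excluded by the test, so the bandwidth is determined only by the four small coefficients.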

Finally, it is worth mentioning that these procedures are applied regardless of noise statistics in our experiments, i.e., the same algorithm is applied without any modification to the images corrupted by Gaussian noise, Poisson noise, and mixture noise, and to real images. The hypothesis test that we adopt is based on the generalized Gaussian distribution of coefficients, whereas the Poisson noise or real noise may not meet this assumption. However, we found that almost all the noise types that we test show Gaussian statistics in the wavelet domain, especially in the local patches that we process. Figure 3 shows some evidence for this in the form of histograms of the noise over local patches. It can be seen that the histograms in the wavelet domain show a generalized Gaussian shape, unlike the histograms in the spatial domain. More precisely, Figure 3a is the histogram of Poisson noise, which shows a somewhat peaky distribution, whereas Figure 3b shows a distribution closer to Gaussian, although a long tail remains. The experiments in [20] also show that camera noise has a Gaussian shape in local areas even though the noise is actually signal dependent.

Figure 3

Histograms of the noise signal in the spatial domain and in local patches of the wavelet domain. Histograms for (a) Poisson noise in the spatial domain. (b) Poisson noise in the local patches of the wavelet domain. (c) Real camera noise in the spatial domain. (d) Real camera noise in the local patches of the wavelet domain.

Experimental results

Summarizing the experimental results in advance, the proposed method yields higher PSNR than the spatial nonlocal means filtering and conventional wavelet shrinkage methods. However, it shows lower PSNR than the BM3D for the denoising of images corrupted by additive pseudo white Gaussian noise. A recent wavelet domain approach, BLS-GSM, also gives quite high PSNR, slightly less than BM3D. However, for more complex noise models such as mixture of impulsive and Gaussian, uniform noise, or Poisson noise that better explains the real camera noise, the proposed method yields higher PSNR and better subjective quality than the above referenced methods. Also, it is believed that the proposed method yields subjectively better output for the real noise, especially the noises that are often observed in the images taken under low-illumination conditions.

In the implementation of the proposed method, Daubechies’ orthogonal wavelet is used for the subband decomposition; specifically, the db8 filters in MATLAB are used for a one-level multiresolution analysis. The window size centered at a wavelet coefficient is 21×21, and the threshold $T_\alpha$ for classifying a noise coefficient is set to 0.3. In the experiments with the pseudo Gaussian noise, the images are artificially corrupted by the addition of noise with standard deviation σ = 20 and 30. The proposed method is compared with several state-of-the-art algorithms, and the results are summarized in Table 1. It can be seen that BM3D provides the best PSNR for the white Gaussian noise, the proposed method gives the second best PSNR for Barbara, and BLS-GSM shows the second best PSNR for the rest.

Table 1 Denoising performances (dB) of various methods for the pseudo white Gaussian noise

For the subjective comparison, Figure 4 shows cropped parts of the Lena image, processed by several methods referenced in this paper and by the proposed method. It can be observed that the proposed method gives the least blurry output. Figure 5 shows another result in which the proposed method yields less blurry output than the other methods that show higher PSNR. Figures 6 and 7 show that the result of the proposed method has fewer of the artifacts that are commonly found in conventional wavelet domain denoising. Specifically, Figure 6a shows the result of BLS-GSM, where many wavelet-like noise patterns are observed (in the circles, and magnified in Figure 6b) due to the generation of large wavelet coefficients, whereas there are no such artifacts in Figure 6h. Figure 7d shows the ringing artifacts due to the shrinkage of wavelet coefficients in the conventional method, whereas the proposed method shows fewer artifacts.

Figure 4

Denoising results with Lena, σ = 30. Comparison of various methods for the Lena image. (a) Cropped image from Lena, σ = 30. (b) Nonlinear TV. (c) Nonlocal means. (d) ProbShrink. (e) Multiresolution bilateral. (f) BLS-GSM. (g) BM3D. (h) Proposed method.

Figure 5

Denoising results with Barbara, σ = 20. Comparison of various methods for the Barbara image. (a) Cropped image from Barbara, σ = 20. (b) Nonlinear TV. (c) Nonlocal means. (d) ProbShrink. (e) Multiresolution bilateral. (f) BLS-GSM. (g) BM3D. (h) Proposed method.

Figure 6

Denoising results with Lena, σ = 20. Comparison of ringing artifacts. (a) Denoising result by BLS-GSM. (b) Magnified region around the circles near the lip in (a). (c) Nonlinear TV. (d) Nonlocal means. (e) ProbShrink. (f) Multiresolution bilateral. (g) BM3D. (h) Proposed method.

Figure 7

Denoising results for Lena corrupted by additive Gaussian noise, σ = 30. Denoising result comparison of the cropped part of Lena (a) noisy image, σ = 30, (b) nonlinear TV [3], (c) nonlocal means filter [5], (d) ProbShrink [18], (e) multiresolution bilateral [12], (f) BLS-GSM [4], (g) BM3D [6], and (h) the proposed method.

Experiments for the mixture noise model and Poisson noise model, which might explain the real noise in the low-illumination condition [21], are also conducted. The results are summarized in Table 2, where it can be seen that the proposed method provides higher PSNR than the state-of-the-art methods in many cases, except for the Poisson-specific NL method in [21] in the case of Poisson noise corruption. Figures 8, 9, 10, 11, and 12 show the outputs by several methods for subjective comparison. Finally, the subjective comparisons for the real noisy images are shown in Figures 13, 14, 15, 16, and 17. From these results, it is believed that the proposed method effectively reduces the real noise that arises due to low-light conditions. Original-sized images of all the results in this paper and additional ones can be found in

Table 2 Denoising performances (dB) of various methods for the mixed noise, additive non-Gaussian noise, and Poisson noise
Figure 8

Denoising results for Lena corrupted by mixture noise with 20% impulse noise and Gaussian noise, with σ = 10. Denoising result comparison of the mixture noise for Lena. (a) Lena corrupted by mixed noise (PSNR = 17.98 dB), (b) denoising by nonlinear TV (PSNR = 19.36 dB) [3], (c) denoising by nonlocal means filter (PSNR = 21.91 dB) [5], (d) denoising by ProbShrink method (PSNR = 22.05 dB) [18], (e) denoising by multiresolution bilateral (PSNR = 18.09 dB) [12], (f) denoising by BLS-GSM (PSNR = 22.06 dB) [4], (g) denoising by BM3D (PSNR = 25.27 dB) [6], (h) denoising by the proposed method (PSNR = 28.92 dB).

Figure 9

Denoising results for Boat corrupted by mixture noise with 10% impulse noise and Gaussian noise, with σ = 10. Denoising result comparison of the mixture noise for Boat. (a) Boat corrupted by mixed noise (PSNR = 18.14 dB), (b) denoising by nonlinear TV (PSNR = 25.78 dB) [3], (c) denoising by nonlocal means filter (PSNR = 21.90 dB) [5], (d) denoising by ProbShrink method (PSNR = 21.23 dB) [18], (e) denoising by multiresolution bilateral (PSNR = 18.24 dB) [12], (f) denoising by BLS-GSM (PSNR = 22.15 dB) [4], (g) denoising by BM3D (PSNR = 24.85 dB) [6], (h) denoising by the proposed method (PSNR = 27.85 dB).

Figure 10

Denoising results for Barbara corrupted by 20% impulse noise only. Denoising result comparison of impulse noise for Barbara. (a) Barbara corrupted by impulse noise (PSNR = 18.42 dB), (b) denoising by nonlinear TV (PSNR = 24.84 dB) [3], (c) denoising by nonlocal means filter (PSNR = 22.36 dB) [5], (d) denoising by ProbShrink method (PSNR = 21.15 dB) [18], (e) denoising by multiresolution bilateral (PSNR = 18.52 dB) [12], (f) denoising by BLS-GSM (PSNR = 22.14 dB) [4], (g) denoising by BM3D (PSNR = 25.50 dB) [6], (h) denoising by the proposed method (PSNR = 28.23 dB).

Figure 11

Denoising results for Lena corrupted by Poisson noise σ = 2. Comparison of Poisson noise removal. (a) Lena corrupted by Poisson noise (PSNR = 16.33 dB), (b) denoising by MA filter (PSNR = 25.09 dB), (c) denoising by NLM (PSNR = 27.30 dB) [5], (d) denoising by the Poisson NL (PSNR = 29.38 dB) [21], (e) denoising by the PURE LET (PSNR = 27.47 dB) [22], and (f) denoising by the proposed method (PSNR = 28.54 dB).

Figure 12

Denoising results for Boat corrupted by Poisson noise σ = 2. Comparison of denoising results for the Poisson noise. (a) Boat corrupted by Poisson noise (PSNR = 15.95 dB), (b) denoising by MA filter (PSNR = 22.80 dB), (c) denoising by NLM (PSNR = 24.76 dB) [5], (d) denoising by the Poisson NL (PSNR = 26.70 dB) [21], (e) denoising by the PURE LET (PSNR = 26.16 dB) [22], and (f) denoising by the proposed method (PSNR = 26.56 dB).

Figure 13

Denoising results for a real image. (a) Real noisy image and the results of (b) nonlinear TV [3], (c) NLM [5], (d) ProbShrink method [18], (e) multiresolution bilateral [12], (f) BLS-GSM [4], (g) BM3D [6], and (h) the proposed method. The results in (b) to (g) contain more ringing artifacts and have worse visual quality than the result of the proposed method in (h).

Figure 14

Denoising results for another real image. (a) Real noisy image and the results of (b) nonlinear TV [3], (c) NLM [5], (d) ProbShrink method [18], (e) multiresolution bilateral [12], (f) BLS-GSM [4], (g) BM3D [6], and (h) the proposed method. The results in (b) to (g) contain more ringing artifacts and have worse visual quality than the result of the proposed method in (h).

Figure 15

Denoising results for real images taken under low-light conditions. (a) Real noisy image and the results of (b) nonlinear TV [3], (c) NLM [5], (d) ProbShrink method [18], (e) multiresolution bilateral [12], (f) BLS-GSM [4], (g) BM3D [6], and (h) the proposed method. The results in (b) to (g) contain more ringing artifacts and have worse visual quality than the result of the proposed method in (h).

Figure 16

Denoising results for a real image from the image database by Liu [23]. (a) Real noisy image and the results of (b) nonlinear TV [3], (c) NLM [5], (d) ProbShrink method [18], (e) multiresolution bilateral [12], (f) BLS-GSM [4], (g) BM3D [6], and (h) the proposed method. The results in (b) to (g) contain more ringing artifacts and have worse visual quality than the result of the proposed method in (h).

Figure 17

Denoising results for an image with grain noise. (a) Real noisy image and the results of (b) nonlinear TV [3], (c) NLM [5], (d) ProbShrink method [18], (e) multiresolution bilateral [12], (f) BLS-GSM [4], (g) BM3D [6], and (h) the proposed method. The results in (b) to (g) contain more ringing artifacts and have worse visual quality than the result of the proposed method in (h).

Finally, we have also performed experiments changing the decomposition levels and kind of wavelet filters. About the decomposition level, the maximum PSNR was attained for the 1-level decomposition as in our experiment. Further decomposition does not improve the gain because the lower- and mid-band images become small, and thus, there are not enough patches to be used for smoothing the data. In the case of experiments on different kinds of wavelet filters, there was not much difference in the PSNR gain. It is believed that the denoising effect is not much affected by the shape of the coefficients which differs depending on the given wavelet filters as long as the coefficients are effectively denoised.


We have proposed a new image denoising algorithm based on nonlocal means filtering in the wavelet domain. By the nonlocal means filtering, small wavelet coefficients that constitute an important image structure are well preserved, while noisy coefficients are suppressed. Since the local adaptation of the kernel bandwidth gives better results, we have also proposed a method to find the appropriate kernel bandwidth for each region for effective nonlocal means filtering. As a result, the proposed method provides comparable or sometimes higher PSNR than the state-of-the-art algorithms. Also, subjective comparisons show that the proposed method keeps the structures of the images very well and gives fewer ringing artifacts.


  1. Donoho DL, Johnstone IM: Ideal spatial adaptation by wavelet shrinkage. Biometrika 1994, 81: 425-445. 10.1093/biomet/81.3.425

  2. Chang SG, Yu B, Vetterli M: Adaptive wavelet thresholding for image denoising and compression. IEEE Trans. Image Process 2000, 9(9):1532-1546. 10.1109/83.862633

  3. Rudin LI, Osher S, Fatemi E: Nonlinear total variation based noise removal algorithms. Phys. D 1992, 60: 259-268. 10.1016/0167-2789(92)90242-F

  4. Portilla J, Strela V, Wainwright M, Simoncelli E: Image denoising using scale mixtures of Gaussians in the wavelet domain. IEEE Trans. Image Process 2003, 12(11):1338-1351. 10.1109/TIP.2003.818640

  5. Buades A, Coll B, Morel JM: A non-local algorithm for image denoising. CVPR 2005, 2: 60-65.

  6. Dabov K, Foi A, Katkovnik V, Egiazarian K: Image denoising by sparse 3-D transform-domain collaborative filtering. IEEE Trans. Image Process 2007, 16(8):2080-2095.

  7. Wand M, Jones M: Kernel Smoothing. Chapman and Hall, London, UK; 1995.

  8. Awate SP, Whitaker RT: Unsupervised, information-theoretic, adaptive image filtering for image restoration. IEEE Trans. Pattern Anal. Mach. Intell 2006, 28(3):364-376.

  9. Alter F, Matsushita Y, Tang X: An intensity similarity measure in low-light conditions. In Proceeding of European Conference on Computer Vision. Edited by: Leonardis A, Bischof H, Pinz A. Springer, Berlin; 2006:267-280.

  10. Chu CK, Marron JS: Choosing a kernel regression estimator. Stat. Sci 1992, 6: 404-436.

  11. Jones MC, Marron JS, Sheather SJ: A brief survey of bandwidth selection for density estimation. J. Am. Stat. Assoc 1996, 91(3):401-407.

  12. Zhang M, Gunturk BK: Multiresolution bilateral filtering for image denoising. IEEE Trans. Image Process 2008, 17(12):2324-2333.

  13. You SJ, Cho NI: A new image denoising method based on the wavelet domain nonlocal means filtering. In Proceeding of the International Conference on Acoustics, Speech, and Signal Processing. Prague, Czech Republic; 22–27 May 2011.

  14. Sheather SJ, Jones MC: A reliable data-based bandwidth selection method for kernel density estimation. J. R. Stat. Soc. Series B 1991, 53(3):683-690.

  15. Abramson I: On bandwidth variation in kernel estimates—a square root law. Ann. Stat 1982, 10: 1217-1223. 10.1214/aos/1176345986

  16. Silverman BW: Monographs on Statistics and Applied Probability: Density Estimation for Statistics and Data Analysis. Chapman and Hall, London.

  17. Black MJ, Sapiro G: Edges as outliers: anisotropic smoothing using local image statistics. In Scale-Space Theories in Computer Vision, Second Int. Conf. Edited by: Nielsen M, Johansen P, Olsen OF, Weickert J. Springer, Berlin; 1999:259-270.

  18. Pizurica A, Philips W: Estimating the probability of the presence of a signal of interest in multiresolution single- and multiband image denoising. IEEE Trans. Image Process 2006, 15(3):654-665.

  19. Simoncelli EP, Adelson EH: Noise removal via Bayesian wavelet coring. Int. Conf. Image Process 1996, 1: 379-382.

  20. Hirakawa K, Parks TW: Joint demosaicing and denoising. IEEE Trans. Image Process 2006, 15(8):2146-2157.

  21. Deledalle C, Tupin F, Denis L: Poisson NL means: unsupervised non local means for Poisson noise. In Proceeding of the International Conference on Image Processing. Hong Kong; 26–29 Sept 2010.

  22. Luisier F, Blu T, Unser M: Image denoising in mixed Poisson-Gaussian noise. IEEE Trans. Image Process 2011, 20(3):696-708.

  23. Liu C: The image database. Accessed Jan 2010


We thank the anonymous reviewers for their valuable comments and helpful suggestions. This research was supported by Samsung Electronics and by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT and Future Planning (2009-0083495).

Author information

Corresponding author

Correspondence to Nam Ik Cho.

Additional information

Competing interests

The authors declare that they have no competing interests.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

You, S.J., Cho, N.I. An adaptive bandwidth nonlocal means image denoising in wavelet domain. J Image Video Proc 2013, 60 (2013).
