Skip to main content

Image denoising with morphology- and size-adaptive block-matching transform domain filtering


BM3D is a state-of-the-art image denoising method. Its denoised results in the regions with strong edges can often be better than in the regions with smooth or weak edges, due to more accurate block-matching for the strong-edge regions. So using adaptive block sizes on different image regions may result in better image denoising. Based on these observations, in this paper, we first partition each image into regions belonging to one of the three morphological components, i.e., contour, texture, and smooth components, according to the regional energy of alternating current (AC) coefficients of discrete cosine transform (DCT). Then, we can adaptively determine the block size for each morphological component. Specifically, we use the smallest block size for the contour components, the medium block size for the texture components, and the largest block size for the smooth components. To better preserve image details, we also use a multi-stage strategy to implement image denoising, where every stage is similar to the BM3D method, except using adaptive sizes and different transform dimensions. Experimental results show that our proposed algorithm can achieve higher PSNR and MSSIM values than the BM3D method, and also better visual quality of denoised images than by the BM3D method and some other existing state-of-the-art methods.

1 Introduction

Image denoising, as a basic topic of pattern recognition and computer vision, has been studied for many years. However, there are still many new methods and algorithms that have been proposed in recent years. Particularly, the non-local method has become a mainstream method. For example, Buades et al. [1] proposed a novel method for image denoising, which was named as non-local means (NL_means). Afterwards, many other non-local methods have been proposed, in which the block-matching 3D (BM3D) [2] transform domain filtering method is the most successful. BM3D is the current state-of-the-art image denoising method [3, 4], and, as a special non-local image denoising model, can achieve very precise image denoising results.

Recently, a variety of new image denoising methods have also been proposed, but very few approaches can perform better than BM3D. Many of these methods are based on the non-local idea, i.e., using similar image blocks (or patches) to explore new image denoising methods. For example, Zhang et al. [5] proposed a two-stage principal component analysis (PCA) on local pixel grouping, with its local pixel grouping achieved by block matching. Although this method successfully combined the classical PCA with non-local idea and achieved better results than those traditional local methods, the denoising performance of this method is still lower than the BM3D method. Rajwade et al. [6] used the higher order singular value decomposition (HOSVD) for image denoising, by modifying the 3D transform in BM3D to HOSVD. For the color images with strong noise, HOSVD method can slightly perform better than BM3D, however, this method is not better than BM3D for the gray images or lower noise level situations. Papyan et al. [7] proposed a multi-scale patch-based method by improving the expected patch log likelihood (EPLL) method [8], this method can achieve better subjective visual quality of the denoised images with less artifacts, it can also be applied to other image processing problems, such as deblurring and super-resolution. There are also many other patch-based image denoising methods [9,10,11,12,13,14,15,16,17,18] in the literature. Some of them are improved non-local methods, while others are improved BM3D methods. Except patch-based methods, dictionary-learning based methods [19,20,21,22,23,24,25,26,27,28] can also achieve good image denoising results. For example, Dong et al. [29] proposed a combined non-local and bilateral variance estimation method, it provided a conceptually simple interpretation from a bilateral variance estimation perspective to further proposed a spatially adaptive iterative singular-value thresholding (SAIST), this method can achieve better denoised results in higher noise level situations, but it is only partially superior to BM3D. Most recently, there have been some new denoising methods developed. For example, Lebrun et al. [30] proposed a blind denoising algorithm, which can automatically estimate image noise. Ghimpeteanu et al. [31] proposed a decomposition framework for image denoising, and can better preserve image geometry (i.e., directions of gradients and level lines), when used with some existing denoising methods, such as NL_means and BM3D. Romano et al. [32] proposed a boosting strategy, i.e., employing a “SOS” procedure, to improve image denoising performance of some existing image denoising methods. But this method can only improve very little on the BM3D method, Romano et al. [33] also proposed a patch similarity measurement problem and used it to three kinds of image processing application. J. Mairal et al. [34] proposed non-local sparse models (LSSC, for learned simultaneous sparse coding) for image restoration which can better preserve image details than BM3D method on image denoising. S. Gu et al. [35] proposed a weighted nuclear norm minimization (WNNM)-based image denoising method, which achieved nice results especially on some texture-rich images, for example, House, and Barbara, however, this method usually produce some artifacts in some parts of the denoised images, especially in higher level noise situations.

BM3D method’s success in image denoising mainly comes from the two main characteristics: (1) natural images usually hold a large number of similar image blocks, and (2) contents in small image block is often locally highly correlated. Based on these two characteristics, grouping operation assembles the highly correlated blocks into each slide of the 3D matrix, and then sparse representation of real signal can be achieved by the de-correlation 3D transform. Due to the sparsity, effective image denoising can be realized just by the coefficients shrinkage with hard threshold. BM3D filtering is a powerful denoising method, and its denoising results are often much better than most existing denoising algorithms. Moreover, the BM3D-based denoising effects will be especially prominent when many easy matching blocks (such as with textures and/or contours) can be found. However, the basic assumption of high correlation of local image contents in the fixed-size square image block is not always holding. For example, if some image blocks contain weak image details, singularity, and sharp edges, the non-adaptive transform usually cannot obtain effective sparse representation. Therefore, in this case, the BM3D filtering may introduce some artifacts, and denoising is often not very effective. But this kind of image contents is often the most important part of the human visual attention.

On the other hand, shape-adaptive discrete cosine transform (SA-DCT) [36] is another type of image denoising methods, which uses neighborhoods with adaptive shapes to local image contents. Thus, in each shape-adaptive neighborhood, the discrete cosine transform can achieve sparse representation of real signal, effectively shrinking the transform domain coefficients and achieving image denoising. Due to the adaptability of neighborhood to local image details, SA-DCT can better preserve image details after denoising. However, SA-DCT has a disadvantage, e.g., it can fail in the texture-rich regions since local homogeneity in these regions is very limited. In addition, SA-DCT is a kind of local filter, and thus cannot make full use of repetitive structures or patterns in the natural images.

In order to achieve better results for image denoising, a shape-adaptive BM3D (SA-BM3D) [37] has been proposed by combining the advantages of both BM3D and SA-DCT. Specifically, SA-BM3D groups similar shape-adaptive neighborhoods (such as image patches), instead of shape-fixed image blocks in the BM3D method. In this way, the adaptation of non-local model, and also the local image characteristics, can be simultaneously used by particularly improving the spatial correlation within each image patch. As each image patch is not necessarily square, the orthogonal wavelet transform cannot be directly applied to the shape-adaptive image patch. Therefore, SA-DCT is first used on each shape-adaptive image patch, and then one-dimensional orthogonal wavelet transform is performed in the third dimension, followed by shrinking the transform coefficients with hard thresholding or Wiener filtering to finally achieve image denoising.

Another improvement to BM3D is, namely, BM3D shape adaptive principal component analysis (BM3D-SAPCA) [38], by combining the advantages of the SA-BM3D and the principal component analysis (PCA) methods. Note that this improved method changed DCT in SA-BM3D to PCA, by using eigenvalue decomposition of each image patch to get a PCA basis for selecting some eigenvectors with eigenvalues greater than a certain threshold (determined by noise level) as principal components. As a result, the whole 3D transform is changed to a new kind of separable transform combinations, i.e., performing PCA on each image patch, and then performing one-dimensional orthogonal transform on the third dimension. BM3D-SAPCA achieved better denoising results than the classical BM3D and SA-BM3D, by preserving better image details and introducing less image artifacts. Besides, Chen et al. [39] proposed a bounded BM3D, which has a little bit improvement on BM3D only for relatively higher noise level. Zhong et al. [40] proposed modified BM3D algorithm for image denoising using non-local centralization prior; this method removed the 1D transform inter-blocks and utilized a prior to improve the BM3D method; however, the denoised results are only partially better than BM3D method.

The successes of both BM3D-SAPCA and SA-BM3D are primarily by the use of shape-adaptive image patches/neighborhoods. But, the procedure for computing local adaptive shapes is relatively complex. For example, PCA often needs greater calculation time than the two-dimensional orthogonal transformation, and thus the whole operation of BM3D-SAPCA is time-consuming. Most importantly, it is difficult to make shapes adaptive, when the noise level is relatively high. To address these issues, in this paper, we propose an improved block-size-adaptive BM3D method for image denoising. First, DCT is performed on the reference image block before conducting the block matching. Then, all image blocks can be divided into three morphological components (namely, smooth, texture, and contour regions) based on the regional energy of alternating current (AC) component in the DCT coefficients. For different morphological component, the size of the reference image block will be enlarged or reduced. For example, the size for smooth-component image block will be enlarged appropriately, and the size for the contour-component image block will be reduced, while the size for the texture-component image block will be kept as the original size. Experimental results show that our proposed method can achieve better image denoising results than both BM3D and BM3D-SAPCA, in terms of PSNR and MSSIM values, and can also preserve better image details and introduce less image artifacts.

2 Morphological component representation in image

In recent years, a morphological component analysis method [41] has been proposed and widely used in image processing. The main idea of morphological component analysis is to divide the image contents into different components, such as smooth, texture, and contour. As DCT is a tool that can effectively depict periodic signals, we perform DCT on image blocks and then classify the image blocks into different morphological components according to their respective energies of the alternating current (AC) coefficients.

2.1 2D discrete cosine transform (DCT)

DCT, normally used in signal processing and image processing, is often used for compression of signal and image data (including still images and motion images). This is because DCT has relatively strong “energy concentration” features, i.e., the energy of most natural images is concentrated in the low frequency part after DCT. A 2D DCT is defined as

$$ F\left(u,v\right)=c(u)c(v){\sum}_{i=0}^{N-1}{\sum}_{j=0}^{N-1}f\left(i,j\right)\cos \left[\frac{\left(i+0.5\right)\pi }{N}u\right]\mathit{\cos}\left[\frac{\left(j+0.5\right)\pi }{N}v\right] $$

where \( c(u)=\left\{\begin{array}{c}\sqrt{\frac{1}{N}},\kern0.5em u=0\\ {}\sqrt{\frac{2}{N}},\kern0.5em u\ne 0\end{array}\right.\kern1em \), and \( c(v)=\left\{\begin{array}{c}\sqrt{\frac{1}{N}},\kern0.5em v=0\\ {}\sqrt{\frac{2}{N}},\kern0.5em v\ne 0\end{array}\right.\kern1em . \)

For faster and more convenient implementation of DCT on each image block, the BM3D method first generates a DCT forward transform matrix Tfor and also an inverse transform matrix Tinv, where \( {T}_{\mathrm{for}}\left(i,j\right)=c(i)\cos \left[\frac{\left(j+0.5\right)\pi }{N}i\right] \) and \( {T}_{\mathrm{inv}}={T}_{\mathrm{for}}^T \).

The forward transform on each image block B is defined as the following:

$$ \widehat{B}={T}_{\mathrm{for}}\bullet B\bullet {T}_{\mathrm{for}}^T $$

The inverse transform is similar to the forward one, i.e.,

$$ \widehat{\widehat{B}}={T}_{\mathrm{inv}}\bullet \widehat{B}\bullet {T}_{\mathrm{inv}}^T $$

A 8 × 8 DCT forward transform matrix is given as the following:

$$ {T}_{\mathrm{for}}=\left(\begin{array}{l}0.3536\kern1.3em 0.3536\kern2em 0.3536\kern1.4em 0.3536\kern1.4em 0.3536\kern1.2em 0.3536\kern1.4em 0.3536\kern1.4em 0.3536\\ {}0.4904\kern1.3em 0.4157\kern2em 0.2778\kern1.4em 0.0975\kern0.75em -0.0975\kern0.75em -0.2778\kern0.75em -0.4157\kern0.75em -0.4904\\ {}0.4619\kern1.3em 0.1913\kern1.5em -0.1913\kern0.75em -0.4619\kern0.75em -0.4619\kern0.75em -0.1913\kern1.2em 0.1913\kern1.4em 0.4619\\ {}0.4157\kern0.75em -0.0975\kern1.5em -0.4904\kern0.75em -0.2778\kern1.3em 0.2778\kern1.3em 0.4904\kern1.3em 0.0975\kern0.75em -0.4157\\ {}0.3536\kern0.75em -0.3536\kern1.5em -0.3536\kern1.4em 0.3536\kern1.3em 0.3536\kern0.75em -0.3536\kern0.75em -0.3536\kern1.3em 0.3536\\ {}0.2778\kern0.75em -0.4904\kern2.00em 0.0975\kern1.4em 0.4157\kern0.75em -0.4157\kern0.75em -0.0975\kern1.3em 0.4904\kern0.75em -0.2778\\ {}0.1913\kern0.75em -0.4619\kern2.00em 0.4619\kern0.75em -0.1913\kern0.75em -0.1913\kern1.3em 0.4619\kern0.75em -0.4619\kern1.3em 0.1913\\ {}0.0975\kern0.75em -0.2778\kern2.00em 0.4157\kern0.75em -0.4904\kern1.3em 0.4904\kern0.75em -0.4157\kern1.3em 0.2778\kern0.75em -0.0975\end{array}\right) $$

For visual inspection, the 128 × 128 Tfor and Tinv matrices can also be displayed as images in Fig. 1, respectively.

Fig. 1
figure 1

Illustration of DCT transform matrices. a Forward transform matrix, and b inverse transform matrix

2.2 Classification of morphological components in 2D image

In this section, we give a classification method for determining the morphological component of each image block. Specifically, we first implement the DCT forward transform on a given image block BR,

$$ {\widehat{B}}_R={T}_{for}\bullet {B}_R\bullet {T}_{\mathrm{for}}^T $$

Then, we compute the AC energy of the transform spectrum \( {\widehat{B}}_R \) by the following equation:

$$ {E}_{\mathrm{AC}\kern0.5em }={\sum}_{i=1}^N{\sum}_{j=1}^N\left|{\widehat{B}}_R\left(i,j\right)\right|-\left|{\widehat{B}}_R\left(1,1\right)\right| $$

Finally, we can classify the morphological component of an image block as follows:

$$ \left\{\begin{array}{c}{C}_{\mathrm{contour}},\kern6.25em if\kern0.5em {E}_{\mathrm{AC}}- c\sigma \ge {K}_1\ \\ {}{C}_{\mathrm{texture}},\kern4em if\kern0.5em {K}_2\le {E}_{\mathrm{AC}}- c\sigma <{K}_1\\ {}{C}_{\mathrm{smooth}},\kern6.75em if\kern0.5em {E}_{\mathrm{AC}}- c\sigma <{K}_2\end{array}\right. $$

Where K1 and K2 are the two empirical values used to classify the morphological component. In this paper, we use the following method to define the parameter c: (1) add different-level Gaussian white noises to a constant value 8 × 8 image block, (2) perform DCT transform on this noisy image block, and (3) use Eq. (6) to calculate the energy EAC , we found that EAC has proportional relationship with the standard deviation of noise, σ, by many experiments, the relationship between them are as follows,

$$ {E}_{\mathrm{AC}\kern0.5em }= c\sigma $$

Thus, we can get value for c as c = 0.18.

3 Influence of morphological components on image denoising with BM3D

The number of similar blocks is an important factor on non-local image denoising problem, the accuracy of block matching in transform domain filtering is also critical in influencing the denoising performance. Romano and Elad [33] and Levin et al. [42] respectively analyzed the influence of block size on image block matching accuracy and denoising performance; they theoretically discussed the effect of block size on the accuracy of image block matching. In the classic BM3D algorithm, the size of the image block is fixed. However, for image blocks with different morphological components, using fixed block size could limit the accuracy of block matching results or the denoising ability. Actually, the bigger block size has the stronger denoising ability in advance of the accurate block matching; however, image blocks with contour component are difficult to get higher accuracy of block matching when bigger image blocks are used than those image blocks in smooth areas. So we had better use bigger blocks in smooth areas but smaller blocks in texture or contour areas. This is consistent with human visual perception, and inspired us to use different block size in block matching, according to different morphological components.

To verify this idea, we extract three image blocks with different morphological components from House, Barbara, and Cameraman images, and then perform denoising with BM3D, as shown in Fig. 2. Later, we first add the Gaussian white noise with the standard deviation 25 to these three image blocks, and then use BM3D to denoise them. In the process of denoising, all other parameters remain the same, by changing only the block size, to fairly compare the denoising results. From the PSNR values of denoised image blocks shown in Fig. 3, we can draw the two following observations: (1) under the same noise level, the denoising result of the image block with smooth component is the best, followed by the image block with contour component and the image block with texture component; and (2) the image block with smooth component can usually achieve better denoised results using large block size, while the image block with texture component needs median block size and the image block with contour component needs small block size. These two observations, especially the second observation, inspire us to adaptively select block size according to the morphological components, when performing block-matching filtering for the image denoising.

Fig. 2
figure 2

Illustration of image blocks with three morphological components, i.e., with a smooth, b texture, and c contour components, respectively

Fig. 3
figure 3

Changes of denoising result with respect to the use of different block size, for three image blocks with a smooth component, b texture component, and c contour component, respectively

4 Methods

It is worth noting that the 3D transform in BM3D algorithm is separable. For example, we can first implement 2D transform on each block in every image block group, which is obtained by block matching, and then implement the 1D transform along the third dimension. Since the 2D transform is still local, there exist inevitable disadvantages of local transform, such as introduction of artifacts as well as blurry image edges after denoising. If an enough number of similar noisy image blocks can be obtained, the 1D transform can help denoise the noisy image very well. In particular, no transformation is required, as we can simply average those similar blocks in the case of added white Gaussian noise (AWGN) and achieve the ideal denoised results. Unfortunately, in reality, there are not many similar image blocks in the single image. Therefore, to avoid performing 2D transform on each image block, we can iteratively perform 1D transforms, thus better preserving edges and introducing less artifacts.

On the other hand, however, there are also two problems of iteratively performing 1D transforms. The first problem is that the image blocks are not completely the same. Even if we just use 1D transform, the image edges are only smoothed to a certain degree. The second problem is that some isolated strong noises will be retained if we just use 1D transform to denoise the image, since the transformed coefficient magnitude of strong noise will be very large, i.e., even larger than the coefficient magnitude of real signal. Thus, when we use a hard threshold to shrink the transformed coefficients, we cannot remove those strong noises. To solve these two problems and also achieve better image denoising results, we propose an improved BM3D algorithm. The respective improvements include (1) the use of adaptive block size, and (2) the use of multi-stage strategy. This improved BM3D algorithm includes four steps, as shown in Fig. 4 . As for the multi-stage strategy, we have presented more details in a previous conference paper [43].

Fig. 4
figure 4

The flowchart of the proposed algorithm

4.1 Step 1: block matching 1D (Haar) transform domain filtering

In the original BM3D algorithm, the first step is block-matching 3D transform. Since the 2D transform on each block is still the local transform, some artifacts will be introduced after inverse transform. Also, since the transformed coefficient magnitude of strong noise is very large, the hard-thresholding operation can only remove the coefficients of those relatively small noises. To avoid the introduction of artifacts, we can only perform 1D transform along the third dimension, in which this 1D transform is a real non-local transform. So, in this step, we first implement block-matching operation according to the BM3D method, and then perform 1D Haar transform on the third dimension, i.e., the inter-block Haar transform. Note that we will not implement the 2D transform on each block, so we call this step as block matching 1D transform (BM1D). Also, since we just implement 1D transform, only a few noises can be separated from the real signal, while many other noises, especially strong noises, are still retained after this step. On the other hand, image edges can be better preserved and few artifacts are introduced.

To further efficiently preserve image edges, we use a perturbation strategy, i.e., we amplify the low frequency coefficients after the 1D transform as follows:

$$ {\widehat{B}}_G={T}^{-1}\left(\mathrm{shrink}\left(\left[{\left(T\left({B}_G\right)\right)}_1\bullet \gamma, {\left(T\left({B}_G\right)\right)}_2,\cdots, {\left(T\left({B}_G\right)\right)}_K\right]\right)\right) $$

where (T(BG))1 is the low frequency subband of the 1D transform on image block group BG, and γ > 1 is a gain factor, which is used to amplify the low frequency coefficients. T and T−1 are the forward and inverse 1D transforms, respectively. And shrink denotes a hard-thresholding operation. This perturbation strategy has two major functions: (1) it can protect low frequency coefficients without shrinkage, because most coefficients of image edges belong to low frequency subband after the 1D transform, and (2) it can reduce the effect of strong noise in the next step of block-matching, thus improving the block-matching accuracy. On the other hand, however, it will reduce denoising performance. So we next implement BM3D Wiener filtering to further denoise the result of this step.

4.2 Step 2: block matching 3D Wiener filtering

We use the resulted image of the first step as reference, and perform empirical Wiener filtering on the original noisy image. The purpose of using BM3D-based Wiener filtering in this step is to enhance weakened image edges after using the first step, as image edges are always weakened to some extent after applying the first step of noise reduction. The following is the empirical Wiener filtering:

$$ {\mathrm{Shrink}}_{\mathrm{wie}}\left(\theta \right)=\frac{\theta \bullet {\left|\widehat{\theta}\right|}^2}{{\left|\widehat{\theta}\right|}^2+{\sigma}^2} $$

where \( \widehat{\theta} \) is the block matching 3D transformed coefficients of the result image of the first step, θ is the block matching 3D transformed coefficients of the original noisy image, and σ is the standard deviation of noise. Actually, we can use a smaller σ in this step to better preserve image edges, the smaller σ can obtain the similar effects with the low frequency coefficients perturbation in the first step.

Note that here we use 3D transform, instead of 1D transform in the first step, because using 3D transform and hard-threshold operation can remove strong noises. Of course, 3D transform is not good at preserving image edges than 1D transform. After completing this second step, noises can be further removed partly.

4.3 Step 3: size-adaptive block matching 1D transform domain filtering

In this third step, we first perform DCT on each reference block before using the block matching operation, then use Eq. (6) to calculate the AC energy of the transformed coefficients, and finally determine the class of morphological component by Eq. (7), which will be used to adjust the block size according to the descriptions in Section 3. Note that this step is the most critical step in the whole algorithm. This is because, after the second step, there still exist a lot of noises. Moreover, since DCT is operated on each block of BM3D Wiener filtering in the second step, some strong noises become the certain pseudo textures, which are no longer subject to Gaussian distribution. Thus, we should not perform block matching on the results produced by the second step. To get rid of these noises with certain pseudo textures, we perform block matching on the original noisy image, and then use the block matching results to extract image blocks at the same locations to implement 1D Haar transform among blocks. Finally, we similarly use a hard-threshold operation to remove the noise, which can be effectively done except for few isolated strong noises.

It is worth noting that we use size-adaptive blocks to improve block-matching accuracy. In the smooth region, if we use a small size to perform block matching, the noise, especially isolated strong noises, would influence the block-matching result. In other words, it would result in noise matching, instead of real signal matching. In the contour region, if we use a large size to perform block matching, it would be difficult to obtain ideal contour matching result, as contours are illustrated as strait lines. However, if we use a small size to perform block matching in these contour regions, we can easily obtain accurate contour matching results. In the texture region, the block size is in-between the former two. This is because if the size is too small, the block matching result will be influenced by the noise; while, if the size is too big, it is still difficult to obtain accurate block matching result as the contour region.

4.4 Step 4: size-adaptive block matching 3D Wiener filtering

In the fourth step, similar to the third step, we first use Eqs. (5)–(7) to determine the class of morphological component for the reference block, and then adjust the size for the reference block. The rest of the procedure in this fourth step is the same as the classic BM3D Wiener filtering.

After completing the above four steps, we are able to obtain a final denoised image. Notably, all of the first three steps are equivalent to the first step in the original BM3D algorithm, i.e., the basic estimation stage in the original BM3D algorithm. Because we use both 1D and 3D transforms as well as the size adaptive block-matching in the proposed algorithm, we call the proposed algorithm as SA-BM1-3D in the rest of this paper. To better understand the procedure of SA-BM1-3D, we use Fig. 5 to show the results of each step.

Fig. 5
figure 5

Denoised results of House image at each step of SA-BM1-3D. a Original image, b noisy image (σ = 50), c step 1, d step 2, e step 3, and f step 4

To denoise color images, we also transform RGB image to luminance-chrominance image just like BM3D; there are three channels in a luminance-chrominance image, i.e., Y, U, and V, respectively. We also use the opponent color transformation to obtain luminance-chrominance image, its transform matrix refers to [2]. In each step of SA-BM1-3D, we use Y channel to perform block-matching, then apply the block-matching result to other two channels, i.e., U and V. We still use the proposed 4-step algorithm to denoise each channel.

5 Experimental results and discussion

5.1 Parameter values

In this section, we give all the parameter values for SA-BM1-3D algorithm, determined based on our experiments. The name of each parameter in SA-BM1-3D is summarized below.

5.1.1 Step 1 (1D transform)

S1: block size, the block size is gradually increased with the noise level becoming higher and higher. The block matching can easily achieve better results as the noise level is low, so we use the smaller size, however, the higher noise level will significantly influence the block-matching accuracy, so we should utilize relatively bigger block size. N1: the number of blocks in each group, when noise level is low we use small number so as to guarantee the sufficient similarity among the matched blocks to better preserve weak texture and contour after denoising. The relatively bigger number for higher noise level to achieve the better denoising; T1: hard threshold, we use the same value in this stage, because this stage is just slightly denoising, we need not use different hard threshold whether lower or higher noise level, the common value is enough; γ: perturbation factor, the purpose of this parameter is to reduce the influential of the isolated strong noise, in other words, increase the block matching accuracy in the next step. With the increase of the noise level, the isolated strong noise will also be stronger, so we use the bigger value for higher noise level.

5.1.2 Step 2 (Wiener filtering)

S2: block size; N2: the number of blocks in each group; T2D: 2D transform on each block with DCT. All the parameters in this step are the same as BM3D Wiener filtering step, because we implement the same Wiener filtering operation in BM3D method in this step.

5.1.3 Step 3 (size-adaptive 1D transform)

S3: initial block size, we implement DCT on an initial 8 × 8 image, then use its AC component to decide the morphological component, then further adjust the block size according different morphological components; N3: the number of blocks in each group, the principle is the same as step 1; T2: hard threshold. Because we utilized bigger block size for higher noise level in step 1, the denoising performance is stronger than the lower noise level ones, so we use the relatively smaller hard threshold for higher noise level situations in this step to better preserve texture or contour details.

5.1.4 Step 4 (size-adaptive Wiener filtering)

S4: initial block size, the same selection principle as the S3; N4: the number of blocks in each group, the same principle as the Wiener filtering step in BM3D method; T2D: 2D transform on each block with DCT.

Table 1 shows all the parameter values at different noise levels. Three other parameters (not listed in Table 1) are also used in each step, such as (1) Nstep=3 for the sliding step size of reference block, (2) NS=39 × 39 for the searching neighborhood size in block matching, and (3) Haar transformation on the third dimension.

Table 1 Parameter values used in SA-BM1-3D

Both steps 3 and 4 use adaptive block size, with the initial block size as 8 × 8. Then, we adaptively determine the block size according to the AC energy of DCT coefficients, i.e., using the block size of 17 × 17 for the smooth component, 7 × 7 for the texture component, and 4 × 4 for the contour component, respectively.

5.2 Experimental results

We use the standard images provided in the BM3D website to conduct the denoising experiments. Table 2 shows the comparisons of PSNR values for the denoised results by proposed SA-BM1-3D, BM3D, EPLL in [8], SAIST in [29], LSSC in [34], and WNNM in [35]. On the other hand, Table 3 shows the comparisons of MSSIM values [44] for the denoised results by the BM3D, BM3D-SAPCA algorithm, and SA-BM1-3D. From these two tables, we can see that both PSNR and MSSIM values of SA-BM1-3D are consistently higher than those existing state-of-the-art algorithms, and are mostly higher than those of the BM3D-SAPCA and WNNM algorithms. Table 4 shows the comparison of PSNR for the denoised results on color images by the BM3D algorithm and SA-BM1-3D. From Table 4, we can see that the PSNR values of SA-BM1-3D are almost consistently higher than those of the BM3D algorithm. When the noise standard deviation is higher than 40, all the PSNR values of BM3D on both gray and color images are given by our previous improvement on BM3D [45]. Figure 6 shows the visual comparison of results by the BM3D algorithm, the BM3D-SAPCA algorithm, and SA-BM1-3D on a grayscale image. From this figure, we can see that when the noise level is relatively high, SA-BM1-3D still hardly introduces any artifacts, whereas the BM3D algorithm introduces a lot of periodic artifacts. The result by the BM3D-SAPCA algorithm is even worse than that obtained by the BM3D algorithm, since the step of determining adaptive shape in the BM3D-SAPCA algorithm fails for the case of strong noise. Figure 7 shows the comparison of three algorithms in denoising the House image. It can be seen that SA-BM1-3D can better preserve image details, but the BM3D-SAPCA algorithm introduces artifacts even with lower noise level. Figures 8 and 9 show the denoising results comparison between BM3D and SA-BM1-3D for color House, Lena, and Barbara images, as well as some zoomed in fragments, verifying the performance of SA-BM1-3D in the case of denoising color images. We can see from these two figures that SA-BM1-3D can better preserve image edge information than original BM3D method.

Table 2 Comparison of PSNR values obtained by BM3D, BM3D-SAPCA, EPLL in [8], SAIST in [29], LSSC in [34], WNNM in [35], and proposed SA-BM1-3D
Table 3 Comparison of MSSIM values obtained by BM3D, BM3D-SAPCA, and SA-BM1-3D. In each cell, the upper number is for BM3D, the middle number is for BM3D-SAPCA, while the lower number is for SA-BM1-3D
Table 4 Comparison of PSNR values obtained by BM3D and SA-BM1-3D on color images. In each cell, the upper number is for BM3D while the lower number is for SA-BM1-3D
Fig. 6
figure 6

Comparison of denoised results by b BM3D, c BM3D-SAPCA, and d SA-BM1-3D, for the case of adding noise (σ = 100) on the original image shown in a

Fig. 7
figure 7

Comparison of denoised results by BM3D, BM3D-SAPCA, and SA-BM1-3D. a Noisy image (σ = 15), b BM3D, c BM3D-SAPCA, d SA-BM1-3D, e zoomed patch of (c), and f zoomed patch of (d)

Fig. 8
figure 8

The denoised results comparison of color Lena and House images between BM3D and SA-BM1-3D (σ = 35). a Original images, b noisy images, c denoised images by BM3D, and d denoised images by SA-BM1-3D

Fig. 9
figure 9

The denoised results comparison of color image fragments between BM3D and SA-BM1-3D (σ = 35). Left: BM3D, right: SA-BM1-3D

Because SA-BM1-3D can better simulate human visual perception, it can achieve better image denoising results than the BM3D algorithm. In comparison to current state-of-the-art image denoising methods, such as BM3D-SAPCA, SA-BM1-3D is also generally competitive. Especially, by evaluating the denoising results with MSSIM, SA-BM1-3D can obtain better values than those by BM3D-SAPCA in most cases. We use Fig. 10 to intuitively show the denoised PSNRs comparison among BM3D, BM3D-SAPCA, and SA-BM1-3D for gray Lena and House images. We can see from Fig. 10 that SA-BM1-3D can achieve higher PSNR values than other two methods, BM3D-SAPCA can obtain higher PSNR values than BM3D in the low level noise case; however, it will be lower than BM3D when the noise level is relatively higher.

Fig. 10
figure 10

The denoised PSNR values comparison among BM3D, BM3D-SAPCA, and SA-BM1-3D for gray Lena and House images

5.3 Computational complexity

In the total four stages of the proposed method, the number of operations per pixel is approximately

$$ \frac{\left({S}_1^2+{N}_1\right){N}_S^2}{N_{\mathrm{step}}^2}+\frac{S_1^2{C}_{T_{1D}}}{N_{\mathrm{step}}^2}+\frac{\left({S}_2^2+{N}_2\right){N}_S^2}{N_{\mathrm{step}}^2}+\frac{2{S}_2^2{C}_{T_{1D}}+2{N}_2{C}_{T_{2D}}}{N_{\mathrm{step}}^2}+\frac{\left({S}_3^2+{N}_3\right){N}_S^2}{N_{\mathrm{step}}^2}+\frac{S_3^2{C}_{T_{1D}}}{N_{\mathrm{step}}^2}+\frac{\left({S}_4^2+{N}_4\right){N}_S^2}{N_{\mathrm{step}}^2}+\frac{2{S}_4^2{C}_{T_{1D}}+2{N}_4{C}_{T_{2D}}}{N_{\mathrm{step}}^2}+\frac{2{C}_{T_{2D}}}{N_{\mathrm{step}}^2}, $$

Where \( {C}_{T_{1\mathrm{D}}} \) and \( {C}_{T_{2\mathrm{D}}} \) denote the number of arithmetic operations required for a 1D and 2D transforms respectively.

The \( \frac{\left({S}_i^2+{N}_i\right){N}_S^2}{N_{\mathrm{step}}^2},i=1,2,3,4 \) denotes the number of block-matching operations, the \( \frac{S_i^2{C}_{T_{1D}}}{N_{\mathrm{step}}^2},i=1,3 \) denotes the number of 1D transform in stages 1 and 3, \( \frac{2{S}_i^2{C}_{T_{1D}}+2{N}_i{C}_{T_{2D}}}{N_{\mathrm{step}}^2},i=2,4 \) denotes the number of separable 3D transform in stages 2 and 4 for Wiener filtering. The last addend \( \frac{2{C}_{T_{2D}}}{N_{\mathrm{step}}^2} \) denotes the 2D-DCT transform number in stages 3 and 4 for morphological component analysis.

Comparing the proposed method with BM3D, the computational complexity of the proposed method is higher than BM3D method indeed, the running time comparison between BM3D on a same computer and using the same MATLAB software is shown in Figs. 11 and 12; Fig. 11 shows the running time of SA-BM1-3D and BM3D on gray scale images with size 256 × 256 and 512 × 512 and Fig. 12 shows the running time for SA-BM1-3D and BM3D on color images of the size 256 × 256 and 512 × 512. We can see that the time complexity of the proposed method is much higher than BM3D, the reasons are mainly from the following two aspects: firstly, our programs did not optimized better than BM3D ones, for example, the computational complexity of the stage 1 in the proposed method is actually lower than the first stage in BM3D; however, the time complexity of the proposed method in this stage is much higher than BM3D. Secondly, the proposed method includes four stages but BM3D includes only two stages, especially, the size adaptive stages in the proposed method usually use bigger size image blocks than BM3D method, so it always increases the computational complexity. We will try to use the parallel computing or other strategies to lower the time complexity of the proposed method in the future work.

Fig. 11
figure 11

The time complexity comparison between BM3D and SA-BM1-3D for gray images

Fig. 12
figure 12

The time complexity comparison between BM3D and SA-BM1-3D for color images

6 Conclusions

Based on human visual perception of noise in images, the image blocks in natural images, under denoising, can be divided into three morphological components, i.e., smooth, contour, and texture. Since the impact of noise is different in different regions, i.e., strongest in the smooth region followed by the texture and the contour regions, we propose using different parameter values (such as different thresholds and block sizes) on regions with different morphological components during image denoising. For example, we use a relatively large block size for the smooth regions since noise effect in the smooth regions seems strongest for human visual perception. On the other hand, we use the smallest block size for the contour regions, which are affected less by the noise.

Since DCT can depict the periodic signals very well, we use the AC energy of DCT coefficients to classify image blocks into three morphological components, i.e., smooth, texture, and contour. For the same block size, the image blocks with smooth component have the minimum AC energy followed by the image blocks with texture component and the image blocks with contour component. Experimental results have shown the robustness of our proposed algorithm to noise. Also, our proposed algorithm can achieve better denoising results than both BM3D and BM3D-SAPCA, in terms of PSNR and MSSIM values as well as visual inspection.



Alternating current


Additive white Gaussian noise


Block matching and 1D filtering


Block matching and 3D filtering


Shape adaptive principal component analysis block matching and 3D filtering


Discrete cosine transform


Expected patch log likelihood


Higher order singular value decomposition


Learned simultaneous sparse coding


Mean structural similarity


Non-local means


Principal component analysis


Peak signal to noise ratio


Size-adaptive block matching 1D and 3D filtering


Shape-adaptive block matching and 3D filtering


Shape-adaptive discrete cosine transform


Spatially adaptive iterative singular-value thresholding


Weighted nuclear norm minimization


  1. A Buades, B Coll, JM Morel, A review of image denoising algorithms, with a new one. Multiscale Model. Simul. 4(2), 490–530 (2005)

    Article  MathSciNet  MATH  Google Scholar 

  2. K Dabov, A Foi, V Katkovnik, et al., Image denoising by sparse 3D transformdomain collaborative filtering. IEEE Trans. Image Process. 16(8), 2080–2095 (2007)

    Article  MathSciNet  Google Scholar 

  3. V Katkovnik, A Foi, K Egiazarian, et al., From local kernel to nonlocal multiple-model image denoising. Int. J. Comput. Vis. 86(1), 1–32 (2010)

    Article  MathSciNet  Google Scholar 

  4. MH Alkinani, MR El-Sakka, Patch-based models and algorithms for image denoising: a comparative review between patch-based images denoising methods for additive noise reduction. EURASIP J. Image Video Proces. 58, 1–27 (2017)

    Google Scholar 

  5. Z Lei, W Dong, D Zhang, G Shi, Two-stage image denoising by principal component analysis with local pixel grouping. Pattern Recogn. 43(4), 1531–1549 (2010)

    Article  MATH  Google Scholar 

  6. R Ajit, A Rangarajan, A Banerjee, Image denoising using the higher order singular value decomposition. IEEE Trans. Pattern Anal. Mach. Intell. 35(4), 849–862 (2013)

    Article  Google Scholar 

  7. P Vardan, M Elad, Multi-scale patch-based image restoration. IEEE Trans. Image Process. 25(1), 249–261 (2016)

    Article  MathSciNet  Google Scholar 

  8. D Zoran, Y Weiss, in Proc. IEEE Int. Conf. Comput. Vis. (ICCV). From learning models of natural image patches to whole image restoration (2011), pp. 479–486

    Google Scholar 

  9. D Charles-Alban, L Denis, F Tupin, Iterative weighted maximum likelihood denoising with probabilistic patch-based weights. IEEE Trans. Image Process. 18(12), 2661–2672 (2009)

    Article  MathSciNet  MATH  Google Scholar 

  10. B Antoni, B Coll, J-M Morel, Image denoising methods. A new nonlocal principle. SIAM Rev. 52(1), 113–147 (2010)

    Article  MathSciNet  MATH  Google Scholar 

  11. B Thomas, O Kleinschmidt, D Cremers, Efficient nonlocal means for denoising of textural patterns. IEEE Trans. Image Process. 17(7), 1083–1092 (2008)

    Article  MathSciNet  Google Scholar 

  12. D Weisheng, L Zhang, G Shi, X Li, Nonlocally centralized sparse representation for image restoration. IEEE Trans. Image Process. 22(4), 1620–1630 (2013)

    Article  MathSciNet  MATH  Google Scholar 

  13. VDV Dimitri, M Kocher, SURE-based non-local means. IEEE Signal Process Lett. 6(11), 973–976 (2009)

    Google Scholar 

  14. C Priyam, P Milanfar, Patch-based near-optimal image denoising. IEEE Trans. Image Process. 21(4), 1635–1649 (2012)

    Article  MathSciNet  MATH  Google Scholar 

  15. S Joseph, On two parameters for denoising with non-local means. IEEE Signal Process Lett. 17(3), 269–272 (2010)

    Article  MathSciNet  Google Scholar 

  16. D Charles-Alban, V Duval, J Salmon, Non-local methods with shape-adaptive patches (NLM-SAP). J. Math. Imaging Vision 43(2), 103–120 (2012)

    Article  MathSciNet  MATH  Google Scholar 

  17. J Zexuan, Q Chen, Q-S Sun, D-S Xia, A moment-based nonlocal-means algorithm for image denoising. Inf. Process. Lett. 109(23), 1238–1244 (2009)

    MathSciNet  MATH  Google Scholar 

  18. F Shu, N Fukushima, M Kimura, Y Ishibashi, in SIGGRAPH Asia 2015 Technical Briefs. Randomized redundant DCT: efficient denoising by using random subsampling of DCT patches (2015), pp. 7–7

    Google Scholar 

  19. M Julien, G Sapiro, M Elad, Learning multiscale sparse representations for image and video restoration. Multiscale Model. Simul. 7(1), 214–241 (2008)

    Article  MathSciNet  MATH  Google Scholar 

  20. X Li, C Lu, Y Xu, et al., Image smoothing via L 0 gradient minimization. ACM Trans. Graph. 30(6), 174 (2011)

    Article  Google Scholar 

  21. D Weisheng, X Li, L Zhang, G Shi, in IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Sparsity-based image denoising via dictionary learning and structural clustering (2011), pp. 457–464

    Google Scholar 

  22. C Priyam, P Milanfar, Clustering-based denoising with locally learned dictionaries. IEEE Trans. Image Process. 18(7), 1438–1451 (2009)

    Article  MathSciNet  MATH  Google Scholar 

  23. S Ling, R Yan, X Li, Y Liu, From heuristic optimization to dictionary learning: a review and comprehensive comparison of image denoising algorithms. IEEE Trans. Cybern. 44(7), 1001–1013 (2014)

    Article  Google Scholar 

  24. Y Ruomei, L Shao, Y Liu, Nonlocal hierarchical dictionary learning using wavelets for image denoising. IEEE Trans. Image Process. 22(12), 4689–4698 (2013)

    Article  MathSciNet  MATH  Google Scholar 

  25. P Javier, A Tristan-Vega, IW Selesnick, Efficient and robust image restoration using multiple-feature L2-relaxed sparse analysis priors. IEEE Trans. Image Process. 24(12), 5046–5059 (2015)

    Article  Google Scholar 

  26. J Sulam, B Ophir, M Zibulevsky, et al., Trainlets: Dictionary learning in high dimensions. IEEE Trans. Signal Process. 64(12), 3180–3193 (2016)

    Article  MathSciNet  Google Scholar 

  27. Z Xianhua, W Bian, W Liu, J Shen, D Tao, Dictionary pair learning on Grassmann manifolds for image Denoising. IEEE Trans. Image Process. 24(11), 4556–4569 (2015)

    Article  MathSciNet  Google Scholar 

  28. R Ron, T Peleg, M Elad, Analysis K-SVD: A dictionary-learning algorithm for the analysis sparse model. IEEE Trans. Signal Process. 61(3), 661–677 (2013)

    Article  MathSciNet  Google Scholar 

  29. D Weisheng, G Shi, X Li, Nonlocal image restoration with bilateral variance estimation: a low-rank approach. IEEE Trans. Image Process. 22(2), 700–711 (2013)

    Article  MathSciNet  MATH  Google Scholar 

  30. L Marc, M Colom, J-M Morel, Multiscale image blind denoising. IEEE Trans. Image Process. 24(10), 3149–3161 (2015)

    Article  MathSciNet  Google Scholar 

  31. G Gabriela, T Batard, M Bertalmio, S Levine, A decomposition framework for image Denoising algorithms. IEEE Trans. Image Process. 25(1), 388–399 (2016)

    Article  MathSciNet  Google Scholar 

  32. R Yaniv, M Elad, Boosting of image Denoising algorithms. SIAM J. Imag. Sci. 8(2), 1187–1219 (2015)

    Article  MathSciNet  MATH  Google Scholar 

  33. Y Romano, M Elad, Con-patch: When a patch meets its context. IEEE Trans. Image Process. 25(9), 3967–3978 (2016)

    Article  MathSciNet  Google Scholar 

  34. J. Mairal, F. Bach, J. Ponce, G. Sapiro, and A. Zisserman., Non-local sparse models for image restoration, International Conference on Computer Vision (ICCV), 2009.

    Book  Google Scholar 

  35. S Gu, Q Xie, D Meng, W Zuo, X Feng, L Zhang, Weighted nuclear norm minimization and its applications to low level vision. Int. J. Comput. Vis. 121(2), 183–208 (2017)

    Article  Google Scholar 

  36. A Foi, K Dabov, V Katkovnik, K Egiazarian, in International Society for Optics and Photonics Electronic Imaging. Shape-adaptive DCT for denoising and image reconstruction (2006), pp. 60640N–60640N

    Google Scholar 

  37. K Dabov, A Foi, V Katkovnik, et al., A nonlocal and shape-adaptive transform-domain collaborative filtering, Proceedings of International Workshop Local and Non-Local Approximation Image Processing (2008), pp. 179–186

    Google Scholar 

  38. K Dabov, A Foi, V Katkovnik, et al., BM3D image denoising with shape-adaptive principal component analysis, Proceedings of Workshop on Signal Processing with Adaptive Sparse Structured Representations (2009), pp. 221–226

    Google Scholar 

  39. Q Chen, W D, Image denoising by bounded block matching and 3D filtering. Signal Process. 90(9), 2778–2783 (2010)

    Article  MATH  Google Scholar 

  40. H Zhong, K Ma, Y Zhou, Modified BM3D algorithm for image denoising using nonlocal centralization prior. Signal Process. 106, 342–347 (2015)

    Article  Google Scholar 

  41. J Bobin, JL Starck, JM Fadili, et al., Morphological component analysis: an adaptive thresholding strategy. IEEE Trans. Image Process. 16(11), 2675–2681 (2007)

    Article  MathSciNet  MATH  Google Scholar 

  42. A Levin, B Nadler, F Durand, WT Freeman, Patch complexity, finite pixel correlations and optimal denoising, European Conference on Computer Vision (2012), pp. 73–86

    Google Scholar 

  43. Yingkun Hou, Tao Chen, Deyun Yang, Lili Zhu, and Hongxiang Yang, Image denoising by block-matching and 1D filtering, Proc. SPIE 8349, Fourth International Conference on Machine Vision (ICMV 2011): Machine Vision, Image Processing, and Pattern Analysis, 83490A, 2012.

    Google Scholar 

  44. Z Wang, AC Bovik, HR Sheikh, et al., Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13(4), 600–612 (2004)

    Article  Google Scholar 

  45. Y Hou, C Zhao, D Yang, Y Cheng, Comments on image Denoising by sparse 3-D transform-domain collaborative filtering. IEEE Trans. Image Process. 20(1), 268–270 (2011)

    Article  MathSciNet  MATH  Google Scholar 

Download references


The authors thank the editor and reviewers.


This work was supported by the National Science Foundation of China [grant numbers 61379015, 61463008]; the Natural Science Foundation of Shandong Province [grant number ZR2011FM004]; the Science and Technology Development Project of Taian City [grant number 20113062]; and the Talent Introduction Project of the Taishan University.

Availability of data and materials

The datasets supporting the conclusions of this article are included within the article.

Author information

Authors and Affiliations



YH initiated the research, designed the experiments, and wrote the paper. DS revised the paper. Both authors read and approved the final manuscript.

Corresponding authors

Correspondence to Yingkun Hou or Dinggang Shen.

Ethics declarations

Authors’ information

Yingkun Hou is an Associate Professor. He has a PhD degree of pattern recognition and intelligent system. He is from the School of Information Science and Technology, Taishan University, China. He has published more than 50 papers in the international journals and conference proceedings. His main research interests include image processing, pattern recognition, and medical image analysis.

Dinggang Shen is a Professor of Radiology, Biomedical Research Imaging Center (BRIC), Computer Science, and Biomedical Engineering in the University of North Carolina at Chapel Hill (UNC-CH). He is currently directing the Center for Image Analysis and Informatics, the Image Display, Enhancement, and Analysis (IDEA) Lab in the Department of Radiology, and also the medical image analysis core in the BRIC. He was a tenure-track assistant professor in the University of Pennsylvanian (UPenn) and a faculty member in the Johns Hopkins University. Dr. Shen’s research interests include medical image analysis, computer vision, and pattern recognition. He has published more than 600 papers in the international journals and conference proceedings. He serves as an editorial board member for six international journals. He also serves in the Board of Directors, The Medical Image Computing and Computer Assisted Intervention (MICCAI) Society.

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Hou, Y., Shen, D. Image denoising with morphology- and size-adaptive block-matching transform domain filtering. J Image Video Proc. 2018, 59 (2018).

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: