Smooth Adaptation by Sigmoid Shrinkage
© Abdourrahmane M. Atto et al. 2009
Received: 27 March 2009
Accepted: 6 August 2009
Published: 4 October 2009
This paper addresses the properties of a subclass of sigmoid-based shrinkage functions: the non-zero-forcing smooth sigmoid-based shrinkage functions, or SigShrink functions. It provides a SURE optimization for the parameters of the SigShrink functions. The optimization is performed on an unbiased estimation risk obtained by using the functions of this subclass. The SURE SigShrink performance measurements are compared to those of the SURELET (SURE linear expansion of thresholds) parameterization. It is shown that the SURE SigShrink performs well in comparison to the SURELET parameterization. The relevance of SigShrink lies in the physical meaning and the flexibility of its parameters. The SigShrink functions perform weak attenuation of data with large amplitudes and stronger attenuation of data with small amplitudes, the shrinkage process introducing little variability among data with close amplitudes. In the wavelet domain, SigShrink is particularly suitable for reducing noise without significantly impacting the signal to recover. A remarkable property of this class of sigmoid-based functions is the invertibility of its elements. This property makes it possible to smoothly tune contrast (enhancement, reduction).
1. Introduction
The Smooth Sigmoid-Based Shrinkage (SSBS) functions introduced in  constitute a wide class of WaveShrink functions. The WaveShrink (Wavelet Shrinkage) estimation of a signal involves projecting the observed noisy signal on a wavelet basis, estimating the signal coefficients with a thresholding or shrinkage function, and reconstructing an estimate of the signal by means of the inverse wavelet transform of the shrunken wavelet coefficients. The SSBS functions derive from the sigmoid function and perform an adjustable wavelet shrinkage thanks to parameters that control the degree of attenuation imposed on the wavelet coefficients. As a consequence, these functions allow for a very flexible shrinkage.
The present work addresses the properties of a subclass of the SSBS functions, the non-zero-forcing SSBS functions, hereafter called the SigShrink (Sigmoid Shrinkage) functions. First, we provide a discussion on the optimization of the SigShrink parameters in the context of WaveShrink estimation. The optimization exploits the new Stein Unbiased Risk of Estimation ((SURE), ) proposed in . SigShrink performance measurements are compared to those obtained when using the parameterization of , which consists of a sum of Derivatives of Gaussian (DOG). We then address the main features of the SigShrink functions; artifact-free denoising and smooth contrast functions make SigShrink a worthy tool for various signal and image processing applications.
The presentation of this paper is as follows. Section 2 presents the SigShrink functions. Section 3 briefly describes the nonparametric estimation by wavelet shrinkage and addresses the optimization of the SigShrink parameters with respect to the new SURE approach described in . Section 4 discusses the main properties of the SigShrink functions by providing experimental tests. These tests assess the quality of the SigShrink functions for image processing: adjustable and artifact-free denoising as well as contrast functions. Finally, Section 5 concludes this paper.
2. Smooth Sigmoid-Based Shrinkage
(P1) Smoothness. The shrinkage function is smooth, so as to induce small variability among data with close values.
(P2) Penalized Shrinkage. A strong (resp., a weak) attenuation is imposed for small (resp., large) data.
(P3) Vanishing Attenuation at Infinity. The attenuation decreases to zero when the amplitude of the coefficient tends to infinity.
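where $\mathbb{1}_{\Delta}$ is the indicator function of a given set $\Delta$: $\mathbb{1}_{\Delta}(x) = 1$ if $x \in \Delta$ and $\mathbb{1}_{\Delta}(x) = 0$ if $x \notin \Delta$. It follows that $\lambda$ acts as a threshold. Note that the function so defined sets a coefficient with amplitude $\lambda$ to half of its value and so minimizes the local variation (second derivative) around $\lambda$, since .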
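The defining formula of the SigShrink functions did not survive in this section, so the following is only a minimal sketch under a stated assumption: it takes the non-zero-forcing sigmoid form $\delta_{\tau,\lambda}(x) = x / (1 + e^{-\tau(|x| - \lambda)})$, with $\lambda$ the threshold and $\tau > 0$ a slope that plays the role of the attenuation degree. The names `sigshrink`, `tau`, and `lam` are illustrative choices, and the paper's own parameterization of the attenuation degree may differ. Under this assumption, the sketch lets one check (P1) to (P3) numerically: a coefficient of amplitude $\lambda$ is halved, large coefficients are almost untouched, and a large slope approaches hard thresholding.

```python
import math


def sigshrink(x, tau, lam):
    """Assumed SigShrink form: x / (1 + exp(-tau * (|x| - lam))).

    tau > 0 is a slope (stand-in for the attenuation degree) and
    lam >= 0 is the threshold. Both names are illustrative.
    """
    z = tau * (abs(x) - lam)
    # Numerically stable evaluation of the logistic function.
    if z >= 0:
        gain = 1.0 / (1.0 + math.exp(-z))
    else:
        ez = math.exp(z)
        gain = ez / (1.0 + ez)
    return x * gain


if __name__ == "__main__":
    lam = 2.0
    amplitudes = (0.5, 2.0, 4.0, 20.0)
    for tau in (1.0, 5.0, 50.0):
        # Small amplitudes are strongly attenuated, large ones barely changed.
        row = [round(sigshrink(x, tau, lam), 4) for x in amplitudes]
        print("tau =", tau, "->", row)
```

Increasing `tau` sharpens the transition around `lam`, which is one way to picture the passage from smooth shrinkage to the degenerate (hard-thresholding) case.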
3. Sigmoid Shrinkage in the Wavelet Domain
3.1. Estimation via Shrinkage in the Wavelet Domain
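Let us recall the main principles of the nonparametric estimation by wavelet shrinkage (the so-called WaveShrink estimation) in the sense of . Let $y = \{y_i\}_{1 \le i \le N}$ stand for the sequence of noisy data $y_i = f(t_i) + e_i$, $i = 1, 2, \ldots, N$, where $f$ is an unknown deterministic function and the random variables $\{e_i\}_{1 \le i \le N}$ are independent and identically distributed (iid), Gaussian with null mean and variance $\sigma^2$; in short, $e_i \sim \mathcal{N}(0, \sigma^2)$ for every $i = 1, 2, \ldots, N$.
Applying an orthonormal transform, represented by an $N \times N$ matrix $\mathcal{W}$, to $y$ yields the coefficients $c_i = d_i + \epsilon_i$, $i = 1, 2, \ldots, N$, where $c = \mathcal{W} y$, $d = \mathcal{W} f$, and $\epsilon = \mathcal{W} e$, with $f = \{f(t_i)\}_{1 \le i \le N}$ and $e = \{e_i\}_{1 \le i \le N}$. The random variables $\{\epsilon_i\}_{1 \le i \le N}$ are iid and $\epsilon_i \sim \mathcal{N}(0, \sigma^2)$. The transform $\mathcal{W}$ is assumed to achieve a sparse representation of the signal in the sense that, among the coefficients $d_i$, only a few have large amplitudes and, as such, characterize the signal. In this respect, simple estimators such as "keep or kill" and "shrink or kill" rules are proved to be nearly optimal, in the Mean Square Error (MSE) sense, in comparison with oracles (see  for further details). The wavelet transform is sparse in the sense given above for smooth and piecewise regular signals . Hereafter, the matrix $\mathcal{W}$ represents an orthonormal wavelet transform. Let $\hat{d} = \{\delta(c_i)\}_{1 \le i \le N}$ be the sequence resulting from the shrinkage of $c$ by using a function $\delta(\cdot)$. We obtain an estimate $\hat{f}$ of $f$ by setting $\hat{f} = \mathcal{W}^{T} \hat{d}$, where $\mathcal{W}^{T}$ is the transpose, and thus the inverse, of the orthonormal wavelet transform $\mathcal{W}$.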
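The three WaveShrink steps recalled above (orthonormal transform, coefficient-wise shrinkage, inverse transform) can be sketched as follows. This is an illustration only, not the experimental setup of the paper: it uses a one-dimensional orthonormal Haar transform as a simple stand-in for the wavelet transform, and the shrinkage rule `sigshrink` relies on the assumed form $x / (1 + e^{-\tau(|x| - \lambda)})$. All function names are hypothetical.

```python
import math
import random


def haar_forward(signal, levels):
    """Orthonormal Haar DWT. Returns (approximation, details coarse-to-fine)."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    r = math.sqrt(0.5)
    approx = list(signal)
    details = []
    for _ in range(levels):
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.insert(0, [(a - b) * r for a, b in pairs])
        approx = [(a + b) * r for a, b in pairs]
    return approx, details


def haar_inverse(approx, details):
    """Inverse (transpose) of haar_forward."""
    r = math.sqrt(0.5)
    out = list(approx)
    for band in details:  # coarse to fine
        rebuilt = []
        for s, d in zip(out, band):
            rebuilt.extend(((s + d) * r, (s - d) * r))
        out = rebuilt
    return out


def sigshrink(x, tau, lam):
    """Assumed SigShrink form (illustrative parameter names)."""
    z = max(-700.0, min(700.0, tau * (abs(x) - lam)))  # avoid overflow in exp
    return x / (1.0 + math.exp(-z))


def waveshrink(noisy, shrink, levels=3):
    """Transform, shrink the detail coefficients, transform back."""
    approx, details = haar_forward(noisy, levels)
    shrunk = [[shrink(c) for c in band] for band in details]
    return haar_inverse(approx, shrunk)


if __name__ == "__main__":
    rng = random.Random(0)
    clean = [0.0] * 32 + [4.0] * 32
    noisy = [v + rng.gauss(0.0, 0.5) for v in clean]
    estimate = waveshrink(noisy, lambda c: sigshrink(c, tau=4.0, lam=1.5))

    def mse(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b)) / len(a)

    print("MSE noisy   :", mse(noisy, clean))
    print("MSE denoised:", mse(estimate, clean))
```

Because the transform is orthonormal, the noise keeps the same variance in the coefficient domain, which is what allows a single shrinkage rule to be applied to all detail coefficients.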
In , the hard and soft-thresholding functions are proposed for wavelet coefficient estimation of a signal corrupted by Additive, White and Gaussian Noise (AWGN). Using these thresholding functions adjusted with suitable thresholds,  shows that, in AWGN, the wavelet-based estimators thus obtained perform within a factor of $2 \log N$, where $N$ is the number of data samples, of the performance achieved with the aid of an oracle. Despite the asymptotic near-optimality of these standard thresholding functions, they have the following limitations. The hard-thresholding function is not everywhere continuous and its discontinuities generate a high variance of the estimate; on the other hand, the soft-thresholding function is continuous but attenuates large coefficients, which results in oversmoothing and a significant bias for the estimate . In practice, these thresholding functions (and their alternatives "nonnegative garrote" function , "smoothly clipped absolute deviation" function ) yield musical noise in speech denoising and visual artifacts or oversmoothing of the estimate in image processing (see, e.g., the experimental results given in Section 4.1).
Moreover, although thresholding rules are proved to be relevant strategies for estimating sparse signals , wavelet representations of many signals encountered in practical applications such as speech and image processing fail to be sparse enough (see illustrations given in [8, Figure 3]). For a signal whose wavelet representation fails to be sparse enough, it is more convenient to impose the penalized shrinkage condition (P2) instead of zero forcing, since small coefficients may contain significant information about the signal. Condition (P1) guarantees the regularity of the shrinkage process, and the role of condition (P3) is to avoid oversmoothing of the estimate (noise mainly affects small wavelet coefficients). SigShrink functions are thus suitable for such an estimation since they satisfy conditions (P1), (P2), and (P3).
The following addresses the optimization of the SigShrink parameters.
3.2. SURE-Based Optimization of SigShrink Parameters
for a shrinkage function $\delta$. The SURE approach  involves computing an unbiased estimate of this risk. The SURE optimization then consists in finding the set of parameters that minimizes this unbiased estimate. The following result is a consequence of [3, Theorem 1].
the result derives from (1), (8), and (9).
As a consequence of Proposition 3.1, minimizing the risk of (6) amounts to minimizing the unbiased (SURE) estimator given by (7). The next section presents experimental tests illustrating the SURE SigShrink denoising of some natural images corrupted by AWGN. For every tested image and every noise standard deviation considered, the optimal SURE SigShrink parameters are those minimizing this estimator evaluated on the vector of wavelet coefficients of the noisy image.
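Since equations (6) and (7) are not reproduced in this section, the following sketch falls back on the classical Stein identity for a coefficient-wise estimator in Gaussian noise: the quantity $\frac{1}{N}\sum_i \left[(\delta(c_i) - c_i)^2 + 2\sigma^2 \delta'(c_i)\right] - \sigma^2$ is an unbiased estimate of the per-coefficient MSE. It is written for the assumed SigShrink form $x / (1 + e^{-\tau(|x| - \lambda)})$, and it may differ in normalization from the estimator of Proposition 3.1. The grid search is a deliberately simple stand-in for a numerical optimizer, and all names are hypothetical.

```python
import math


def _sigmoid(z):
    z = max(-700.0, min(700.0, z))  # avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def sigshrink(x, tau, lam):
    """Assumed SigShrink form (illustrative parameter names)."""
    return x * _sigmoid(tau * (abs(x) - lam))


def sigshrink_prime(x, tau, lam):
    """Derivative of the assumed SigShrink form with respect to x."""
    s = _sigmoid(tau * (abs(x) - lam))
    return s + tau * abs(x) * s * (1.0 - s)


def sure(coeffs, sigma, tau, lam):
    """Stein unbiased estimate of the per-coefficient MSE."""
    total = 0.0
    for c in coeffs:
        total += (sigshrink(c, tau, lam) - c) ** 2
        total += 2.0 * sigma ** 2 * sigshrink_prime(c, tau, lam)
    return total / len(coeffs) - sigma ** 2


def sure_grid_search(coeffs, sigma, taus, lams):
    """Return (tau, lam, sure_value) minimizing SURE over a finite grid."""
    value, tau, lam = min(
        (sure(coeffs, sigma, t, l), t, l) for t in taus for l in lams
    )
    return tau, lam, value


if __name__ == "__main__":
    noisy_coeffs = [0.3, -0.8, 6.1, 0.1, -5.4, 0.6, -0.2, 0.9]
    best = sure_grid_search(noisy_coeffs, sigma=1.0,
                            taus=[0.5, 1.0, 2.0, 4.0],
                            lams=[0.5, 1.0, 2.0, 3.0])
    print("tau, lam, SURE:", best)
```

The estimate only requires the noisy coefficients and the noise standard deviation, which is why the parameters can be tuned without access to the clean signal.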
3.3. Experimental Results
The SURE optimization approach for SigShrink is now illustrated on some standard test images corrupted by AWGN. We consider the standard 2-dimensional Discrete Wavelet Transform (DWT) computed with the Symlet wavelet of order 8 ("sym8" in the Matlab Wavelet toolbox).
The SigShrink estimation is compared with that of the SURELET "sum of DOGs" (Derivatives Of Gaussian). SURELET (free Matlab software is available at http://bigwww.epfl.ch/demo/suredenoising/) is a SURE-based method that moreover includes an interscale predictor with a priori information about the position of significant wavelet coefficients. For the comparison with SigShrink, we only use the "sum of DOGs" parameterization, that is, the SURELET method without interscale predictor and Gaussian smoothing. By so proceeding, we compare two shrinkage functions: SigShrink and "sum of DOGs."
In the sequel, the SURE SigShrink parameters (attenuation degree and threshold) are those obtained by performing the SURE optimization on the whole set of the detail DWT coefficients. The attenuation degree and threshold thus computed are then applied at every decomposition level to the detail DWT coefficients. We also introduce the SURE Level-Dependent SigShrink (SURE LD-SigShrink) parameters. These parameters are obtained by applying a SURE optimization at every detail (horizontal, vertical, diagonal) subimage located at the different resolution levels concerned (4 resolution levels in our experiments).
Table 1: Means, variances, minima, and maxima of the PSNRs computed over 25 noise realizations, when denoising test images by the SURE SigShrink, SURE LD-SigShrink, and "sum of DOGs" methods. The tested images are corrupted by AWGN with standard deviation $\sigma$. The DWT is computed by using the "sym8" wavelet. Some statistics are given in Tables 2, 3, 4, and 5 for the SigShrink and LD-SigShrink optimal SURE parameters.
Table 2: Mean values (based on 25 noise realizations) for optimal DWT "sym8" SURE SigShrink parameters, when denoising the "Lena" image corrupted by AWGN. The SURE SigShrink parameters are the SigShrink parameters (attenuation degree and threshold) obtained by performing the SURE optimization on the whole set of the detail DWT coefficients. It follows from these results that the threshold height as well as the attenuation degree tend to be increasing functions of the noise standard deviation $\sigma$.
Table 3: Variances (based on 25 noise realizations) for the optimal SURE SigShrink parameters whose means are given in Table 2.
Table 4: Mean values of the optimal SURE LD-SigShrink parameters, for the denoising of the "Lena" image corrupted by AWGN. The DWT with the "sym8" wavelet is used. The SURE LD-SigShrink parameters are obtained by applying a SURE optimization at every detail (Hori. for Horizontal, Vert. for Vertical, Diag. for Diagonal) subimage located at the different resolution levels concerned. We remark first that the threshold height, as well as the attenuation degree, tend to be increasing functions of the noise standard deviation $\sigma$. In addition, for every $\sigma$ considered, the attenuation degree as well as the threshold tend to decrease when the resolution level increases.
Table 5: Variances (based on 25 noise realizations) for the optimal SURE LD-SigShrink parameters whose means are given in Table 4.
We use the Matlab routine fmincon to compute the optimal SURE SigShrink parameters. This function computes the minimum of a constrained multivariable function by using nonlinear programming methods (see Matlab help for the details). Note the following. First, one can use a test set and average the optimal parameter values on this set for application to images other than those used in the test set. By so proceeding, we avoid the systematic use of optimization algorithms such as fmincon on images that do not pertain to the test class. The low variability that holds among the optimal parameters given in Tables 2, 3, 4, and 5 ensures the robustness of the average values. Second, instead of using optimal parameters, one can use heuristic ones (calculated by taking into account the physical meaning of these parameters and the noise statistical properties) such as the standard minimax or universal thresholds, which are shown to perform well with SigShrink (see Section 4).
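As an illustration of the heuristic alternative mentioned above, the universal threshold has the well-known closed form $\sigma\sqrt{2 \ln N}$ for $N$ coefficients. The sketch below computes it, together with the usual robust estimate of $\sigma$ from the median absolute value of the finest-scale detail coefficients. That noise estimate is an added assumption here, since the experiments of this section use a known noise standard deviation. The minimax threshold is tabulated rather than given in closed form and is not reproduced. Function names are hypothetical.

```python
import math
import statistics


def universal_threshold(sigma, n):
    """Universal threshold sigma * sqrt(2 ln n) for n coefficients."""
    return sigma * math.sqrt(2.0 * math.log(n))


def mad_sigma(finest_details):
    """Robust noise estimate: median(|d|) / 0.6745 on finest-scale details.

    Assumption: these coefficients are dominated by Gaussian noise.
    """
    return statistics.median([abs(c) for c in finest_details]) / 0.6745


if __name__ == "__main__":
    details = [0.4, -1.1, 0.2, 0.9, -0.3, 1.5, -0.7, 0.1]
    sigma_hat = mad_sigma(details)
    print("estimated sigma    :", sigma_hat)
    print("universal threshold:", universal_threshold(sigma_hat, 512 * 512))
```

Such a threshold can be plugged into the shrinkage function directly, leaving only the attenuation degree to be chosen.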
From Table 1, it follows that the 3 methods yield PSNRs of the same order. The level dependent strategy for SigShrink (LD-SigShrink) tends to achieve better results than the SigShrink and the "sum of DOGs." For every method, the difference (over the 25 noise realizations) between the minimum and maximum PSNR is less than 0.2 dB.
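For reference, the PSNR figures discussed above can be computed as in the following sketch. It assumes the usual definition $10 \log_{10}(\mathrm{peak}^2 / \mathrm{MSE})$ with a peak value of 255 for 8-bit images; the exact convention used for Table 1 is not stated in this section, so the peak value is an assumption, and the function name is hypothetical.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """PSNR in dB between two images given as lists of rows.

    Assumption: peak = 255 (8-bit images).
    """
    ref = [p for row in reference for p in row]
    est = [p for row in estimate for p in row]
    if len(ref) != len(est):
        raise ValueError("images must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(ref, est)) / len(ref)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    clean = [[10, 20], [30, 40]]
    restored = [[12, 19], [29, 43]]
    print("PSNR (dB):", psnr(clean, restored))
```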
For every tested $\sigma$, the SURE level-dependent attenuation degree and threshold tend to decrease when the resolution level increases (see Table 4).
4. Smooth Adaptation
In this section, we highlight specific features of SigShrink functions with respect to several issues in image processing.
Besides its simplicity (function with explicit closed form, in contrast to parametric methods such as Bayesian shrinkages [9–14]), the main features of the SigShrink functions in image processing are the following.
The flexibility of the SigShrink parameters makes it possible to choose the denoising level. From hard denoising (degenerate SigShrink) to smooth denoising, there exists a wide class of regularities that can be attained for the denoised signal by adjusting the attenuation degree and threshold.
The smoothness of the nondegenerate SigShrink functions allows for reducing noise without significantly impacting the signal; a better preservation of the signal characteristics (visual perception) and its statistical properties is guaranteed because the shrinkage is performed with less variability among coefficients with close values.
The SigShrink function and its inverse, the SigStretch function, can be seen as contrast functions. The SigShrink function enhances contrast, whereas the SigStretch function reduces contrast.
In what follows, we detail these characteristics. The following proposition characterizes the SigStretch function.
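To make the invertibility concrete, the following sketch inverts SigShrink numerically. It is not the closed-form characterization given by the proposition: it assumes the SigShrink form $x / (1 + e^{-\tau(|x| - \lambda)})$ with $\tau > 0$, which is strictly increasing in $x$, and recovers the preimage by bisection. The names `sigstretch`, `tau`, and `lam` are hypothetical.

```python
import math


def sigshrink(x, tau, lam):
    """Assumed SigShrink form (illustrative parameter names)."""
    z = max(-700.0, min(700.0, tau * (abs(x) - lam)))  # avoid overflow in exp
    return x / (1.0 + math.exp(-z))


def sigstretch(y, tau, lam, tol=1e-12):
    """Numerical inverse of sigshrink by bisection (requires tau > 0)."""
    target = abs(y)
    if target == 0.0:
        return 0.0
    lo, hi = 0.0, max(1.0, 2.0 * target)
    # Grow the bracket until it contains the preimage.
    while sigshrink(hi, tau, lam) < target:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if sigshrink(mid, tau, lam) < target:
            lo = mid
        else:
            hi = mid
        if hi - lo <= tol * max(1.0, hi):
            break
    # The function is odd, so the sign of y is restored at the end.
    return math.copysign(0.5 * (lo + hi), y)


if __name__ == "__main__":
    for x in (0.3, 2.0, -7.5):
        y = sigshrink(x, tau=2.0, lam=2.0)
        print(x, "->", y, "->", sigstretch(y, tau=2.0, lam=2.0))
```

For very large slopes, coefficients well below the threshold are mapped extremely close to zero, so the numerical inverse becomes ill-conditioned there; this matches the intuition that the degenerate (hard-thresholding) case is no longer invertible.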
In the rest of the paper, the wavelet transform used is the Stationary (also called shift-invariant or redundant) Wavelet Transform (SWT) . This transform has appreciable properties in denoising. Its redundancy makes it possible to reduce residual noise due to the translation sensitivity of the orthonormal wavelet transform.
4.1. Adjustable and Artifact-Free Denoising
For a fixed attenuation degree, we observe that smoother denoising is obtained with the larger threshold (universal threshold). A small value for the threshold (minimax threshold) leads to better preservation of the textural information contained in the image (compare in Figure 4, image (a) versus image (d); image (b) versus image (e); image (c) versus image (f); or equivalently, compare the zooms of these images shown in Figure 5).
Now, for a fixed threshold, the SigShrink shape is controllable via the attenuation degree (see Figure 2). The attenuation degree reflects the regularity of the shrinkage and the attenuation imposed on data with small amplitudes (mainly noise coefficients). The larger the attenuation degree, the stronger the noise reduction. However, SigShrink functions are more regular for small values of the attenuation degree, and thus small values lead to fewer artifacts (in Figure 5, compare images 5(d), 5(e), and 5(f)).
It follows that SigShrink denoising is flexible thanks to its two parameters (attenuation degree and threshold), preserves the image features, and leads to artifact-free denoising. It is thus possible to reduce noise without impacting the signal characteristics significantly. Artifact-free denoising is relevant in many applications, in particular for medical imagery where visual artifacts must be avoided. In this respect, we henceforth consider small values for the attenuation degree.
At this stage, it is worth mentioning the following. Some parametric shrinkages using a priori distributions for modeling the signal wavelet coefficients can sometimes be described by nonparametric functions with explicit formulas (e.g., a Laplacian assumption leads to a soft-thresholding shrinkage). In this respect, one can wonder about possible links between SigShrink and the Bayesian Sigmoid Shrinkage (BSS) of . BSS is a one-parameter family of shrinkage functions, whereas SigShrink functions depend on two parameters. Fixing one of these two parameters yields a subclass of SigShrink functions. It is then reasonable to think that, depending on the distribution of the signal and noise wavelet coefficients, these functions should somehow relate to BSS. However, such a link has not yet been established.
4.2. Speckle Denoising
In SAR, oceanography, and medical ultrasonic imagery, sensors record many gigabits of data per day. These images are mainly corrupted by speckle noise. If postprocessing such as segmentation or change detection has to be performed on these databases, it is essential to be able to reduce speckle noise without impacting the signal characteristics significantly. The following illustrates that SigShrink makes it possible to achieve this because of its flexibility (see the shapes of SigShrink functions given in Figure 2) and the artifact-free denoising it performs (see Figures 4 and 5). In addition, since SigShrink is invertible, it is not essential to store a copy of the original database (thousands and thousands of gigabits recorded every year); one can retrieve an original image by simply applying the inverse SigShrink denoising procedure (SigStretch functions). More precisely, the following illustrates that SigShrink performs well for denoising speckle noise in the wavelet domain.
Speckle noise is a multiplicative type noise inherent to signal acquisition systems using coherent radiation. This multiplicative noise is usually modeled as a correlated stationary random process independent of the signal reflectance.
Two different additive representations are often used for speckle noise. The first model is a "signal-dependent" stationary noise model; noise, assumed to be stationary, depends on the signal reflectance. This model is simply obtained by noting that $z = x\xi = x + x(\xi - 1)$, with $z$ being the observed noisy image, $x$ being the signal reflectance, and $\xi$ being a stationary random process independent of $x$. The second model is a "signal-independent" model obtained by applying a logarithmic transform to the noisy image.
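The difference between the two additive representations can be checked numerically. In the sketch below, a multiplicative noise with unit mean is drawn from a gamma distribution, which is an illustrative assumption and not a model taken from this paper. The same noise draws are applied to two constant reflectance levels: the additive residual (observation minus reflectance) scales with the reflectance, whereas the residual in the logarithmic domain is the same for both levels. Function names and the `looks` parameter are hypothetical.

```python
import math
import random
import statistics


def speckle(reflectance, rng, looks=4):
    """Multiply each sample by unit-mean gamma noise (illustrative model)."""
    return [x * rng.gammavariate(looks, 1.0 / looks) for x in reflectance]


def residuals(level, seed, n=1000):
    """Return additive and log-domain residuals for a constant reflectance."""
    rng = random.Random(seed)
    clean = [level] * n
    noisy = speckle(clean, rng)
    additive = [z - x for z, x in zip(noisy, clean)]
    log_domain = [math.log(z) - math.log(x) for z, x in zip(noisy, clean)]
    return additive, log_domain


if __name__ == "__main__":
    for level in (10.0, 100.0):
        additive, log_domain = residuals(level, seed=0)
        print("reflectance", level,
              "| std additive:", statistics.pstdev(additive),
              "| std log:", statistics.pstdev(log_domain))
```

This is why a shrinkage designed for signal-independent additive noise is more naturally applied after the logarithmic transform.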
In addition, we consider the speckle signal-independent model. We use the estimation procedure described above for denoising the logarithmic transformed noisy image. The results are given in Figures 7(d) and 7(e).
By comparing the results of Figure 7, we observe that the PSNRs achieved are of the same order whatever the model. However, the denoising obtained with the additive independent noise model (logarithmic transform) has a better visual quality than that obtained with the additive signal-dependent speckle model. In fact, one can note, from this figure, the ability of SigShrink functions to reduce speckle noise without impacting structural features and textural information of the image. Note also that the gain in PSNR is larger than 10 dB, a performance of the same order as that of the best up-to-date speckle denoising techniques ([17–22] among others).
4.3. Contrast Function
To conclude this section, we now present the SigShrink and SigStretch functions as contrast functions. Contrast functions are very useful in medical image processing. As a matter of fact, medical monitoring for arthroplasty (replacement of certain bone surfaces by implants due to lesions of the articular surfaces) requires 2D-3D registration of the implant, and thus, requires knowing exactly the position of the implant contour. Precise edge detection is no easy task  because edge detection methods are sensitive to contrast (global contrast for the image and local contrast around a contour). The following briefly describes how to use SigShrink-SigStretch functions as contrast functions.
5. Conclusion
This work proposes the use of SigShrink-SigStretch functions for practical engineering problems such as image denoising, image restoration, and image enhancement. These functions perform adjustable adaptation of data in the sense that they can enhance or reduce the variability among data, the adaptation process being regular and invertible. Because of the smoothness of the function used (infinitely differentiable in ), the data adaptation is performed with little variability so that the signal characteristics are better preserved. The SigShrink and SigStretch methods are simple and flexible in the sense that the parameters of these classes of functions allow for a fine tuning of the data adaptation. This adaptation is nonparametric because no prior information about the signal is taken into account. A SURE-based optimization of the parameters is possible.
The denoising achieved by a SigShrink function is almost artifact-free due to the little variability introduced among data with close amplitudes. This artifact-free denoising is relevant for many applications, in particular for medical imagery where visual artifacts must be avoided. In addition, a fine calibration of SigShrink parameters allows noise reduction without impacting the signal characteristics. This is important when some postprocessing (such as a segmentation) must be performed on the signal estimate.
As far as perspectives are concerned, we can reasonably expect to improve SigShrink denoising performance by introducing interscale and/or intrascale predictors, which could provide information about the position of significant wavelet coefficients. It could also be relevant to undertake a complete theoretical and experimental comparison between SigShrink and Bayesian sigmoid shrinkage .
In addition, application of SigShrink to speech processing could also be considered. Since SigShrink yields denoised images that are almost artifact-free, would it be possible that such an approach denoises speech signals corrupted by AWGN without returning musical noise, in contrast to classical shrinkages using thresholding rules?
Another perspective is the SigShrink-SigStretch calibration of contrast in order to improve edge detection in medical imagery. Exact edge detection is necessary for 2D-3D registration of images. Subpixel measurement of edges is possible by using, for example, the moment-based method of . However, the method is very sensitive to contrast. Images with low, varying contrast result in multiple contours, whereas high, varying contrast leads to good precision for certain contour points but induces a lack of detection for points in lower contrast zones. The idea is to use the SigShrink-SigStretch functions to improve image contrast so as to facilitate edge detection in medical imagery. For instance, we can expect that combining SigShrink-SigStretch with edge detection methods such as  can lead to good subpixel measurement of the contour in an image.
Proof of Proposition 4.1
References
- Atto AM, Pastor D, Mercier G: Smooth sigmoid wavelet shrinkage for non-parametric estimation. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '08), March-April 2008, Las Vegas, Nev, USA 3265-3268.
- Stein C: Estimation of the mean of a multivariate normal distribution. The Annals of Statistics 1981, 9: 1135-1151. doi:10.1214/aos/1176345632
- Luisier F, Blu T, Unser M: A new SURE approach to image denoising: interscale orthonormal wavelet thresholding. IEEE Transactions on Image Processing 2007, 16(3): 593-606.
- Donoho DL, Johnstone IM: Ideal spatial adaptation by wavelet shrinkage. Biometrika 1994, 81(3): 425-455. doi:10.1093/biomet/81.3.425
- Bruce AG, Gao H-YE: Understanding WaveShrink: variance and bias estimation. Biometrika 1996, 83(4): 727-745. doi:10.1093/biomet/83.4.727
- Gao H-Y: Wavelet shrinkage denoising using the non-negative garrote. Journal of Computational and Graphical Statistics 1998, 7(4): 469-488. doi:10.2307/1390677
- Antoniadis A, Fan J: Regularization of wavelet approximations. Journal of the American Statistical Association 2001, 96(455): 939-955. doi:10.1198/016214501753208942
- Simoncelli EP, Adelson EH: Noise removal via Bayesian wavelet coring. Proceedings of the IEEE International Conference on Image Processing (ICIP '96), September 1996, Lausanne, Switzerland 1: 379-382.
- Do MN, Vetterli M: Wavelet-based texture retrieval using generalized Gaussian density and Kullback-Leibler distance. IEEE Transactions on Image Processing 2002, 11(2): 146-158. doi:10.1109/83.982822
- Şendur L, Selesnick IW: Bivariate shrinkage functions for wavelet-based denoising exploiting interscale dependency. IEEE Transactions on Signal Processing 2002, 50(11): 2744-2756. doi:10.1109/TSP.2002.804091
- Portilla J, Strela V, Wainwright MJ, Simoncelli EP: Image denoising using scale mixtures of Gaussians in the wavelet domain. IEEE Transactions on Image Processing 2003, 12(11): 1338-1351. doi:10.1109/TIP.2003.818640
- Johnstone IM, Silverman BW: Empirical Bayes selection of wavelet thresholds. Annals of Statistics 2005, 33(4): 1700-1752. doi:10.1214/009053605000000345
- ter Braak CJF: Bayesian sigmoid shrinkage with improper variance priors and an application to wavelet denoising. Computational Statistics and Data Analysis 2006, 51(2): 1232-1242. doi:10.1016/j.csda.2006.06.011
- Coifman RR, Donoho DL: Translation Invariant de-Noising, Lecture Notes in Statistics. Springer, New York, NY, USA; 1995.
- Xie H, Pierce LE, Ulaby FT: SAR speckle reduction using wavelet denoising and Markov random field modeling. IEEE Transactions on Geoscience and Remote Sensing 2002, 40(10): 2196-2212. doi:10.1109/TGRS.2002.802473
- Argenti F, Alparone L: Speckle removal from SAR images in the undecimated wavelet domain. IEEE Transactions on Geoscience and Remote Sensing 2002, 40(11): 2363-2374. doi:10.1109/TGRS.2002.805083
- Achim A, Tsakalides P, Bezerianos A: SAR image denoising via Bayesian wavelet shrinkage based on heavy-tailed modeling. IEEE Transactions on Geoscience and Remote Sensing 2003, 41(8): 1773-1784. doi:10.1109/TGRS.2003.813488
- Argenti F, Bianchi T, Alparone L: Multiresolution MAP despeckling of SAR images based on locally adaptive generalized Gaussian pdf modeling. IEEE Transactions on Image Processing 2006, 15(11): 3385-3399.
- Achim A, Kuruoglu EE, Zerubia J: SAR image filtering based on the heavy-tailed Rayleigh model. IEEE Transactions on Image Processing 2006, 15(9): 2686-2693.
- Sen D, Swamy MNS, Ahmad MO: Computationally fast techniques to reduce AWGN and speckle in videos. IET Image Processing 2007, 1(4): 319-334. doi:10.1049/iet-ipr:20060299
- Antoniadis A: Wavelet methods in statistics: some recent developments and their applications. Statistics Surveys 2007, 1: 16-55. doi:10.1214/07-SS014
- Mahfouz MR, Hoff WA, Komistek RD, Dennis DA: Effect of segmentation errors on 3D-to-2D registration of implant models in X-ray images. Journal of Biomechanics 2005, 38(2): 229-239. doi:10.1016/j.jbiomech.2004.02.025
- Atto AM, Pastor D, Mercier G: Detection threshold for non-parametric estimation. Signal, Image and Video Processing 2008, 2(3): 207-223. doi:10.1007/s11760-008-0051-x
- Lyvers EP, Mitchell OR, Akey ML, Reeves AP: Subpixel measurements using a moment-based edge operator. IEEE Transactions on Pattern Analysis and Machine Intelligence 1989, 11(12): 1293-1309. doi:10.1109/34.41367
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.