Robust fuzzy scheme for Gaussian denoising of 3D color video
- Alberto Jorge Rosales-Silva^{1},
- Francisco Javier Gallegos-Funes^{1},
- Ivonne Bazan Trujillo^{1} and
- Alfredo Ramírez García^{1}
https://doi.org/10.1186/1687-5281-2014-20
© Rosales-Silva et al.; licensee Springer. 2014
Received: 2 March 2012
Accepted: 12 March 2014
Published: 2 April 2014
Abstract
We propose a three-dimensional Gaussian denoising scheme for color video frames, with time as the third dimension. The algorithm is built on fuzzy rules and directional techniques. A fuzzy parameter characterizes the differences among pixels, based on gradients and angle deviations, and also serves for motion detection and noise estimation. Using only two frames of a video sequence, the scheme efficiently suppresses Gaussian noise. The filter relies on a noise estimator that is spatio-temporally adapted in a local manner, combining the aforementioned techniques in a novel way, and the proposed fuzzy methodology improves noise suppression compared with the other methods employed. Simulation results demonstrate the effectiveness of the novel color video denoising algorithm.
1. Introduction
where x_{ β } represents the original pixel component value, β = {Red, Green, Blue} indexes the color components (or channels) of each pixel, and σ is the standard deviation of the noise. In our case, the Gaussian noise is applied independently to each channel component of every frame in order to obtain the corrupted video sequence.
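As a concrete illustration of this degradation model, the sketch below corrupts each RGB channel of a frame independently with zero-mean Gaussian noise. This is a minimal sketch, not the authors' test harness; the function name, the seed, and the flat test frame are assumptions.

```python
import numpy as np

def add_gaussian_noise(frame, sigma, seed=0):
    """Corrupt each RGB channel independently with zero-mean Gaussian noise.

    `frame` is an H x W x 3 uint8 array; `sigma` is the noise standard
    deviation on the [0, 255] intensity scale.
    """
    rng = np.random.default_rng(seed)
    noisy = frame.astype(np.float64)
    # Noise is drawn independently for each channel beta in {R, G, B}.
    for beta in range(3):
        noisy[..., beta] += rng.normal(0.0, sigma, size=frame.shape[:2])
    return np.clip(noisy, 0, 255).astype(np.uint8)

clean = np.full((144, 176, 3), 128, dtype=np.uint8)  # flat QCIF-sized frame
noisy = add_gaussian_noise(clean, sigma=20.0)
print(noisy.shape)  # (144, 176, 3)
```

In the experiments the noise level is stated as a variance on a normalized scale; here sigma is given directly in intensity units for simplicity.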
Noise-reducing pre-processing is a main stage of any computer vision application. It should reduce the impact of noise in a video without degrading quality, edges, fine detail, or color properties.
The current proposal aims to enhance quality when processing color video sequences corrupted by Gaussian noise; the methodology extends a method originally proposed for impulsive noise removal [1]. Numerous algorithms process 3D signals using only spatial information [2]. Others use only temporal information [3, 4]; one example applies wavelet procedures to reduce the delay in video coding [5]. There are also interesting applications that use spatio-temporal information [6–13]. The disadvantage of these 3D solutions is that they often require large amounts of memory and may introduce a significant time delay when more than one frame must be processed. This is undesirable in interactive applications such as infrared camera-assisted driving or videoconferencing. Moreover, full 3D techniques tend to require more computation than separable ones, and their optimal performance can be very difficult to determine. For example, integrating video coding and denoising is a novel processing paradigm that brings mutual benefits to both tools; in Jovanov et al. [14], the main idea is to reuse the motion estimation resources of the video coding module for video denoising. Disadvantages of the work by Dai et al. [15] are that it uses a number of reference frames, which increases the computational load, and that the MHMCF algorithm was originally applied to grayscale video; in [14], it was adapted to color video denoising by transforming the RGB video into a luminance-color-difference space proposed by the authors.
Other state-of-the-art algorithms found in the literature work in a similar manner. For example, Liu and Freeman [16] integrate robust optical flow into a non-local means framework with noise level estimation, taking temporal coherence into account when removing structured noise. Dabov et al. [17] propose a method based on a highly sparse signal representation in a local 3D transform domain; a noisy video is processed blockwise, and for each processed block, a data array is formed by stacking together blocks found to be similar to the current one. In [18], Mairal et al. presented a framework for learning multiscale sparse representations of color images and video with overcomplete dictionaries. They propose a multiscale learned representation obtained through an efficient quadtree decomposition of the learned dictionary and overlapping image patches, providing an alternative to predefined dictionaries such as wavelets.
The effectiveness of the designed algorithm is justified by comparing it with four other state-of-the-art approaches. The first is the ‘Fuzzy Logic Recursive Spatio-Temporal Filter’ (FLRSTF), a fuzzy logic recursive scheme for motion detection and spatio-temporal filtering capable of dealing with Gaussian noise and unsteady illumination conditions in both the temporal and spatial directions [19]. The second, the ‘Fuzzy Logic Recursive Spatio-Temporal Filter using Angles’ (FLRSTF_ANGLE), uses angle deviations instead of gradients as the difference measure between pixels in the FLRSTF algorithm. The ‘Video Generalized Vector Directional Filtering in Gaussian Denoising’ (VGVDF_G) [20] is a directional technique that computes the angle deviations between pixels as the difference criterion among them; as a consequence, vector directional filters (VDF) do not take image brightness into account when processing the image vectors. Finally, the ‘Video Median M-type K-Nearest Neighbor in Gaussian Denoising’ filter (VMMKNN_G) [21, 22] uses order statistics techniques to characterize the pixel differences.
The proposed algorithm employs only two frames in order to reduce the computational load and memory requirements, yielding an efficient denoising framework. Additionally, it exploits the relationship of the neighboring pixels to the central one in magnitude and angle deviation, connecting them through fuzzy logic rules designed to estimate the motion and noise parameters. The effectiveness of the present approach is justified by comparison with the four state-of-the-art algorithms described above.
The digital video database is formed by the Miss America, Flowers, and Chair color video sequences; this database is well known in the scientific literature [23]. Frames are 24-bit true-color images of 176 × 144 pixels, matching the Quarter Common Intermediate Format (QCIF). These video sequences were selected because of their different natures and textures. The database was contaminated by Gaussian noise at different intensity levels, independently for each channel. This was used to characterize the performance and to justify the robustness of the novel framework.
2. Proposed fuzzy design
The first frame of the color video sequence is processed as follows. First, the histogram and the mean value $\overline{x}_{\beta}$ of each pixel component are calculated over a 3 × 3 processing window. Then, the angle deviation between the two vectors $\overline{x}$ and $x_c$, each containing Red, Green, and Blue components, is computed as $\theta_c = A\left(\overline{x}, x_c\right)$, where $\theta_c = \cos^{-1}\left\{\frac{\overline{x}\cdot x_c}{\left|\overline{x}\right|\cdot\left|x_c\right|}\right\}$ is the angle of the mean-value vector $\overline{x}$ with respect to the central pixel vector $x_c$ in the 3 × 3 processing window. Color-image processing has traditionally been approached in a component-wise manner, that is, by processing the image channels separately. Such approaches ignore the inherent correlation between the different channels and may produce pixel output values that differ from the input values, with possible shifts in chromaticity [24]. It is therefore desirable to employ vector approaches in color image processing to obtain the angle deviations.
The angle interval [0, 1] is used to build the histogram. Pixel intensities take values from 0 to 255 in each channel, and the angle deviation θ_{ c } of any pixel with respect to another falls within $\left[0,\frac{\pi}{2}\right]$. Angle deviations outside the proposed interval [0, 1] are not taken into account when forming the histogram. The noise estimator is therefore obtained using only values inside this interval; this avoids over-smoothing fine details and improves the criteria results.
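The angle-deviation computation and the [0, 1] interval filtering described above can be sketched as follows. The helper names and the toy 3 × 3 window are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def angle_deviation(mean_vec, center_vec):
    """theta_c = arccos( x_bar . x_c / (|x_bar| |x_c|) ) between RGB vectors."""
    num = float(np.dot(mean_vec, center_vec))
    den = float(np.linalg.norm(mean_vec) * np.linalg.norm(center_vec)) + 1e-12
    return float(np.arccos(np.clip(num / den, -1.0, 1.0)))

def histogram_angle(window):
    """Angle of the 3x3 local mean vector against the central pixel; only
    angles inside [0, 1] rad enter the histogram used by the noise
    estimator, as stated above (values outside are discarded)."""
    mean_vec = window.reshape(-1, 3).mean(axis=0)
    theta = angle_deviation(mean_vec, window[1, 1])
    return theta if 0.0 <= theta <= 1.0 else None

# Toy 3x3 RGB window whose central pixel deviates strongly in hue.
win = np.array([[[200.0, 10.0, 10.0]] * 3] * 3)
win[1, 1] = [10.0, 200.0, 10.0]
theta = angle_deviation(win.reshape(-1, 3).mean(axis=0), win[1, 1])
print(0.0 <= theta <= np.pi / 2)  # True: the raw angle lies in [0, pi/2]
print(histogram_angle(win))       # None: this angle exceeds the [0, 1] interval
```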
To summarize, the SD parameter is used as a noise estimate in the spatial algorithm and is then updated by a temporally adaptive filter, ultimately yielding an adaptive spatio-temporal noise estimator.
2.1. Spatial algorithm
where N = 8 is the number of data samples taken into account, in agreement with Figure 2; the computed fuzzy weight produces an output in the interval [0, 1] and corresponds to each computed angle deviation value, excluding the central one.
If the Spatial Filtering Algorithm was selected, the sample probably contains edges and/or fine details. To implement this filter, the following methodology is proposed. The procedure computes a new locally adapted SD (σ_{ β }) for each plane of the color image, using a 5 × 5 processing window (see Figure 2a). The local SD is then updated by retaining the larger of the two values: if ${\sigma}_{\beta}\geq {\sigma}_{\beta}^{\text{'}}$, then ${\sigma}_{\beta}^{\text{'}}={\sigma}_{\beta}$; otherwise ${\sigma}_{\beta}^{\text{'}}$ is left unchanged, where ${\sigma}_{\beta}^{\text{'}}$ was previously defined. This is because a sample containing edges and details presents a large dispersion among its pixels, which the largest SD value describes best.
Subsequently, the following condition is verified: IF ∇_{γβ(M,D 1,D 2)} < T_{ β } (where T_{ β } = 2 · σ_{ β }), THEN a membership value using ∇_{γβ(M,D 1,D 2)} is computed; otherwise, the membership value is 0. The threshold T_{ β } was obtained experimentally according to the PSNR and MAE criteria. Here γ indexes the eight cardinal directions, and ∇_{γβ(M,D 1,D 2)} denotes the gradient values computed for each neighboring pixel with respect to the central one within a sliding window. These are called ‘main gradient values’. Two ‘derived gradient values’ are also employed, which avoids blurring the image when an edge, rather than a noisy pixel, is present.
The detailed procedures used to compute the main and derived gradient values for the eight cardinal directions are described by Zlokolica et al. [19]. If ∇_{γβ(M,D 1,D 2)} < T_{ β } for all three gradient values (the main and the two derived, according to Figure 3a), then the angle deviation is calculated in the corresponding direction γ. In other words, if the main and derived gradient values are all below the threshold T_{ β }, the angle deviations are obtained for the three gradients; for any gradient that does not satisfy the condition, the corresponding angle deviation is set to 0.
where x_{γβ(M,D 1,D 2)} is the pixel component in the associated direction. For example, the x_{ γβM } component of the pixel has coordinate (0, 0) as shown in Figure 3a, so for the ‘SE’ cardinal direction the component x ' _{ γβM } has coordinate (1, 1), and so on. This parameter indicates that the smaller the angular difference between the pixels involved, the greater the weight of the pixel in the associated direction.
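The gradient test above can be sketched for one cardinal direction. The offsets used for the two derived gradients are illustrative assumptions; the exact layout is given by Zlokolica et al. [19].

```python
import numpy as np

def gradient_triplet_passes(window, T_beta):
    """Check the main and two derived gradients for the SE direction of a
    5x5 grayscale window against the threshold T_beta = 2 * sigma_beta.
    The angle deviation is computed only when all three pass; otherwise
    the contribution of this direction is set to 0 (per the text)."""
    c = float(window[2, 2])                                  # central pixel
    main   = abs(float(window[3, 3]) - c)                    # SE main gradient
    deriv1 = abs(float(window[2, 3]) - float(window[1, 2]))  # derived 1 (assumed offsets)
    deriv2 = abs(float(window[3, 2]) - float(window[2, 1]))  # derived 2 (assumed offsets)
    return all(g < T_beta for g in (main, deriv1, deriv2))

sigma_beta = 10.0
T_beta = 2.0 * sigma_beta          # T_beta = 2 * sigma_beta as in the text
w = np.full((5, 5), 100.0)
print(gradient_triplet_passes(w, T_beta))  # True: flat window, all gradients 0
w[3, 3] = 200.0                            # strong edge in the SE direction
print(gradient_triplet_passes(w, T_beta))  # False: main gradient exceeds T_beta
```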
Finally, the main and derived vectorial gradient values are used to compute a degree of membership via membership functions, which return a value between 0 and 1 indicating the degree to which an element belongs to a set (in our case, we define a BIG fuzzy set). We can then characterize how close the components of the central pixel are to those of its neighbors, and decide whether a component is noisy or in motion, or free of motion and/or low in noise.
As mentioned above, the BIG fuzzy set characterizes the presence of noise in the sample to be processed. The values that belong to this fuzzy set, wholly or partially, represent the noise level present in the pixel.
A fuzzy rule is created from this membership function by applying fuzzy operators to it. In this case, the fuzzy operator OR is defined as OR(f_{1}, f_{2}) = max(f_{1}, f_{2}).
Each pixel has one returned value defined by its level of corruption: one says ‘the pixel is corrupted’ if its BIG membership value is 1, and ‘the pixel is low-noise corrupted’ when its BIG membership value is 0. These linguistic labels indicate the degree of belonging to each of the possible states in which the pixel can be found.
From the fuzzy rules, we obtain outputs that are used to make decisions. The function defined by Equation 4 returns values between 0 and 1, indicating how the parameter behaves with respect to the proposed fuzzy set. Finally, the following fuzzy rule is designed to connect gradient values with angle deviations, forming the ‘fuzzy vectorial-gradient values’.
Fuzzy rule 1 helps to detect edges and fine details using the membership values of the BIG fuzzy set obtained by Equation 4. The fuzzy values obtained by this rule are taken as fuzzy weights and used in a fast processing algorithm to reduce the computational load. This fast processing algorithm is defined by Equation 5.
Fuzzy rule 1: the fuzzy vectorial-gradient value is defined as ∇_{ γβ }α_{ γβ }, so: IF ((∇_{ γβM }, α_{ γβ }) is BIG AND (∇_{γβD 1}, α_{γβD 1}) is BIG) OR ((∇_{ γβM }, α_{ γβ }) is BIG AND (∇_{γβD 2}, α_{γβD 2}) is BIG), THEN ∇_{ γβ }α_{ γβ } is BIG. In this fuzzy rule, the ‘AND’ and ‘OR’ operators are defined as algebraic operations, consequently: AND = A · B, and OR = A + B - A · B.
where x_{ γβ } represents the component magnitudes of the neighboring pixels around the central pixel within the pre-processing window (Figure 2b) in the respective cardinal direction, and y_{β out} is the output of the spatial algorithm applied to the first frame of the video sequence. This yields the spatially filtered t frame, which is then passed to the temporal algorithm together with the t + 1 frame, according to the scheme described in Figure 1.
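The algebraic connectives of fuzzy rule 1 can be sketched directly. The membership degrees below stand for the BIG memberships of the (gradient, angle) pairs and are illustrative values only.

```python
# Algebraic fuzzy connectives from fuzzy rule 1: AND = A*B, OR = A + B - A*B.
def f_and(a, b):
    return a * b

def f_or(a, b):
    return a + b - a * b

def fuzzy_rule_1(mu_main, mu_d1, mu_d2):
    """BIG degree of the fuzzy vectorial-gradient value:
    (main AND derived1) OR (main AND derived2)."""
    return f_or(f_and(mu_main, mu_d1), f_and(mu_main, mu_d2))

# Main and one derived pair fully BIG -> the rule fires completely.
print(fuzzy_rule_1(1.0, 1.0, 0.0))  # 1.0
# Partial memberships combine smoothly: 0.2 + 0.1 - 0.02 = 0.28.
print(fuzzy_rule_1(0.5, 0.4, 0.2))  # 0.28
```

Note that the algebraic OR, unlike max, accumulates partial evidence from both derived directions.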
2.2. Temporal algorithm
The outlined spatial algorithm smooths Gaussian noise efficiently but still loses some of the image's fine details and edges. To avoid these undesirable outputs, a temporal algorithm is proposed, using only two frames of the video sequence: the spatially filtered t frame, obtained once with the methodology of Section 2 and used as a reference so that the temporal algorithm starts from a filtered frame without losing significant results, and the corrupted t + 1 frame.
Analogously to the BIG fuzzy set, a SMALL fuzzy set is defined. The expressions ‘the pixel is corrupted’ and ‘the pixel is low-noise corrupted’ keep the same meanings, but in the opposite direction. Since a fuzzy set is totally characterized by its membership function, the membership function μ_{SMALL} (of the SMALL fuzzy set) is introduced to characterize the values associated with no movement and low noise. One thus obtains a value in [0, 1] measuring membership in the SMALL fuzzy set, where 1 implies that the sample shows no movement and low noise, and 0 implies the opposite.
When χ = θ_{ βγ } (angle deviations), the selected parameters are standard deviation σ = 0.3163, mean μ_{1} = 0.2, and mean μ_{2} = 0.615; when χ = ∇_{ βγ } (gradient values), they are σ = 31.63, μ_{1} = 60, and μ_{2} = 140. These values were obtained through extensive simulations on the color video sequences used in this study, seeking the optimal values according to the PSNR and MAE criteria. For χ = θ_{ βγ }, the procedure started by varying the standard deviation from 0.1 until the PSNR and MAE criteria reached their optimal values, with μ_{1} = 0.1 and μ_{2} = 0.1 held fixed. Once the optimal PSNR and MAE values were found, the standard deviation was fixed and μ_{1} was increased until the criteria were again optimal. Finally, with the standard deviation and μ_{1} fixed, μ_{2} was varied until the criteria once more reached their optimal values. The same approach was used to calculate the parameters for χ = ∇_{ βγ }. These experimental results were obtained using the well-known Miss America and Flowers color video sequences.
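The coordinate-wise tuning described above (σ first, then μ_1, then μ_2, each fixed once optimal) can be sketched as follows. The `score` function stands in for the PSNR/MAE evaluation on the training sequences; here it is a toy placeholder with a known optimum, not the authors' actual objective.

```python
def coordinate_search(score, start, grids):
    """Tune sigma, then mu1, then mu2, fixing each at the value that
    maximizes `score` while the remaining parameters stay fixed."""
    params = dict(start)
    for name in ("sigma", "mu1", "mu2"):  # fixed tuning order from the text
        best = max(grids[name], key=lambda v: score({**params, name: v}))
        params[name] = best               # fix this parameter, move on
    return params

# Toy separable score whose optimum is (0.3, 0.2, 0.6) -- illustrative only.
toy_score = lambda p: -(p["sigma"] - 0.3) ** 2 - (p["mu1"] - 0.2) ** 2 - (p["mu2"] - 0.6) ** 2
grids = {"sigma": [0.1, 0.2, 0.3, 0.4],
         "mu1":   [0.1, 0.2, 0.3],
         "mu2":   [0.4, 0.5, 0.6, 0.7]}
best = coordinate_search(toy_score, {"sigma": 0.1, "mu1": 0.1, "mu2": 0.1}, grids)
print(best)  # {'sigma': 0.3, 'mu1': 0.2, 'mu2': 0.6}
```

This coordinate search finds the global optimum here only because the toy score is separable; for a real PSNR/MAE objective it is a heuristic, as in the paper.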
The fuzzy rules of Figure 5 were designed to characterize, in a fuzzy directional manner, the relationship between pixels in a sliding window spanning two frames; from this, the movement and the noise level in the central pixel of the sample are determined. To understand these fuzzy rules, consider the following: if the fuzzy directional values obtained from the membership function of the SMALL fuzzy set are close to one, the central pixel component shows neither motion nor significant noise. Conversely, if the membership values are close to one for the BIG fuzzy set, the central pixel component is noisy and/or in motion. Thus, for fuzzy rule 2, the values SMALL, BIG, and BIG (SBB) characterize a pixel in motion: the SMALL value indicates the closeness of a component of the central pixel in the t + 1 frame to the component of a neighbor in the t frame; the first BIG value indicates that the component of the pixel in the t frame and the component of the pixel in the t + 1 frame are unrelated; and the second BIG indicates that the component of a pixel in the t + 1 frame differs from the component of the central pixel of the t + 1 frame, so that pixel is highly likely to belong to an edge and/or to be in motion. These findings reinforce the parameters obtained for the other neighboring pixel components. In this way, the proximity relationship between the central pixel of the t + 1 frame and the neighboring pixels of the t and t + 1 frames is obtained.
where x_{ βi } represents each pixel in a 3 × 3 × 2 pre-processing window, and N = 17 is selected so as to take into account all pixel components of the two frames being processed.
The general standard deviation used in this stage of the algorithm is adapted locally according to the pixels of the current sample that agree with Figure 5. To obtain a new locally adapted SD, to be used for the next frame of the video sequence, a sensitivity parameter α is introduced; it describes the current distribution of the pixels and provides a measure of the temporal relationship between the t and t + 1 frames. The main idea of the sensitivity parameter is to control the amount of filtering; its value adjusts itself to agree with the locally adapted SD. The same parameter allows updating the SD that describes the relationship between the t and t + 1 frames, producing a temporal parameter. When the Mean Filter is applied, the sensitivity parameter is α = 0.125.
Any drastic change in fine details, edges, or movement in the current sample is reflected in the parameter values: the membership functions, the SD, the sensitivity parameter, and the fuzzy vectorial-gradient values. The consequences applied by each fuzzy rule are based on the different conditions present in the sample.
The aim of this equation is to control the locally adapted spatial SD and, likewise, the temporal SD, which in turn controls the amount of filtering by modifying the T_{ β } threshold, as will be shown later.
Parameters $\sigma_{\beta}^{\text{'}}$, $\sigma_{\beta}^{\text{''}}$, and $\sigma_{\mathrm{total}}$ describe how the pixels in the t and t + 1 frames are related to each other spatially ($\sigma_{\beta}^{\text{'}}$) and temporally ($\sigma_{\beta}^{\text{''}}$ and $\sigma_{\mathrm{total}}$). The SD $\sigma_{\mathrm{total}}$ is updated as $\sigma_{\mathrm{total}} = \left(\sigma_{\mathrm{Red}}^{\text{''}} + \sigma_{\mathrm{Green}}^{\text{''}} + \sigma_{\mathrm{Blue}}^{\text{''}}\right)/3$, the average of the temporal SD over the three color planes of the images. This relationship is designed so that the other color components of the image contribute to the sensitivity parameter.
The structure of Equation 11 can be illustrated with an example. If the Mean Filter Algorithm was selected instead of fuzzy rules 2, 3, 4, and 5, the sensitivity parameter α = 0.125 indicates that the t and t + 1 frames are closely related. This means that the pixels in the t frame carry low noise, since the spatial algorithm was applied to that frame (see Subsection 2.1), and the pixels in the t + 1 frame are probably low-noise too. However, because the t frame has already been filtered by the spatial algorithm, it is better to give more weight to the spatial SD $\sigma_{\beta}^{\text{'}}$ obtained from the t frame than to the temporal SD $\sigma_{\beta}^{\text{''}}$ obtained from the t + 1 frame. That is why $\sigma_{\mathrm{total}}$ is weighted by α = 0.125 and $\sigma_{\beta}^{\text{'}}$ by (1 - α) = 0.875.
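A hedged sketch of this weighted SD update follows (the exact form of Equation 11 is not reproduced in this excerpt, so the blend below is an assumption consistent with the stated weights): the spatial SD gets weight 1 - α = 0.875 and the averaged temporal SD gets α = 0.125 in the Mean Filter branch.

```python
def update_sd(sigma_prime, sigma_dprime_rgb, alpha=0.125):
    """Blend the spatial SD (sigma_prime, from the filtered t frame) with
    sigma_total, the average of the temporal SDs over the R, G, B planes.
    alpha = 0.125 is the Mean Filter sensitivity parameter from the text."""
    sigma_total = sum(sigma_dprime_rgb) / 3.0  # average temporal SD over R, G, B
    return (1.0 - alpha) * sigma_prime + alpha * sigma_total

# Example: spatial SD 8.0, temporal SDs (10, 12, 14) -> sigma_total = 12.
print(update_sd(8.0, (10.0, 12.0, 14.0)))  # 0.875*8 + 0.125*12 = 8.5
```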
Procedure 1: consider the nine fuzzy vectorial-gradient values obtained from the BBB_{ βi } values. The central value is selected along with three neighboring fuzzy values in order to detect motion. The conjunction of the four subfacts is performed, combined by a triangular norm [19]. Intersecting the central BBB_{ βi } value with all possible combinations of three different neighboring membership degrees gives ${C}_{N-1}^{K}=\binom{N-1}{K}=\binom{8}{3}=56$ values, where N = 9 and K = 3 elements are included in each intersection. The values are aggregated with the algebraic sum (sum = A + B - A · B) [19] over all instances to obtain the motion-noise confidence parameter.
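Procedure 1 can be sketched as follows. The minimum is used as the triangular norm (one common choice; the paper's exact t-norm follows [19]), and the membership degrees are illustrative values only.

```python
from itertools import combinations

def motion_noise_confidence(center, neighbors, k=3):
    """Combine the central BBB value with every choice of k of its
    len(neighbors) neighbors via a t-norm (min here), then aggregate the
    C(8, 3) = 56 results with the algebraic sum A + B - A*B."""
    acc = 0.0
    for trio in combinations(neighbors, k):
        t = min((center,) + trio)   # t-norm over the four subfacts
        acc = acc + t - acc * t     # algebraic-sum aggregation
    return acc

neigh = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]  # 8 neighbor memberships
conf = motion_noise_confidence(0.95, neigh)
print(len(list(combinations(neigh, 3))))  # 56 combinations, C(8, 3)
print(0.0 <= conf <= 1.0)                 # True: algebraic sum stays in [0, 1]
```

The algebraic sum keeps the confidence in [0, 1] no matter how many of the 56 intersections contribute.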
The motion-noise confidence parameter is used to update the SD and to obtain the output pixel through the following rule: ${y}_{\beta\mathrm{out}}=\left(1-\alpha\right)\cdot {x}_{\beta c}^{t+1}+\alpha\cdot {x}_{\beta c}^{t}$, where α = 0.875 if motion-noise = 1, and α = 0.125 if motion-noise = 0. If no fuzzy rule gathers a majority of the pixels, the output pixel is computed with α = 0.5: ${y}_{\beta\mathrm{out}}=0.5\cdot {x}_{\beta c}^{t+1}+0.5\cdot {x}_{\beta c}^{t}$.
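The temporal blending rule just given can be sketched directly; only the function name is an assumption, the weights come from the text.

```python
def temporal_output(x_t, x_t1, motion_noise=None):
    """y = (1 - alpha) * x^{t+1} + alpha * x^t, with alpha = 0.875 when the
    motion-noise confidence is 1 (trust the filtered t frame), 0.125 when it
    is 0, and 0.5 when no fuzzy rule gathers a majority (motion_noise=None)."""
    if motion_noise is None:
        alpha = 0.5
    else:
        alpha = 0.875 if motion_noise == 1 else 0.125
    return (1.0 - alpha) * x_t1 + alpha * x_t

print(temporal_output(100.0, 180.0, motion_noise=1))  # 0.125*180 + 0.875*100 = 110.0
print(temporal_output(100.0, 180.0))                  # equal weights: 140.0
```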
Finally, the algorithm applies the above-outlined spatial algorithm to smooth the non-stationary noise remaining after the temporal filter, with the only modification being its threshold value ${T}_{\beta}=0.25\,{\sigma}_{\beta}^{\text{'}}$, in agreement with Figure 1.
All parameters used in the algorithm and their optimal values
| Parameter | Value | Description |
|---|---|---|
| τ_{1} | 0.1 | threshold used in the ‘Spatial Filtering Algorithm’ (used only once) |
| T_{ β } | 0.25 σ^{'}_{ β } | threshold value used throughout the video sequence |
| σ, μ_{1}, μ_{2} (χ = θ_{ βγ }) | 0.3163, 0.2, 0.615 | membership function parameters for angle deviations |
| σ, μ_{1}, μ_{2} (χ = ∇_{ βγ }) | 31.63, 60, 140 | membership function parameters for gradient values |
| α | 0.125 | sensitivity parameter |
| 1 - α | 0.875 | complement of the sensitivity parameter |
All other parameters used in the algorithm are locally updated in agreement with the adaptive method; that is, these parameters change locally across all frames of the video sequences.
3. Simulation results
The results presented show the effectiveness of the proposed algorithm against the others used for comparison. Video sequences containing different features and textures were used: the Miss America, Flowers, and Chair sequences, all contaminated by Gaussian noise with variance (VAR) ranging from 0.0 to 0.05. The color video sequences processed for this work were 24-bit true color and 176 × 144 pixels (QCIF format).
The proposed ‘Fuzzy Directional Adaptive Recursive Temporal Filter for Gaussian Denoising’ algorithm, referred to as FDARTF_G, was compared with the FLRSTF algorithm, which uses similar fuzzy techniques [19]; the FLRSTF_ANGLE; the VGVDF_G; and the VMMKNN_G [21, 22], which uses order statistics techniques for Gaussian noise removal.
Comparative restoration results according to the MAE, PSNR, and NCD criteria
**Gaussian noise VAR = 0.005**

| Algorithm | Criterion | Chair (20) | Chair (30) | Miss America (20) | Miss America (30) | Flowers (20) | Flowers (30) |
|---|---|---|---|---|---|---|---|
| FLRSTF | MAE | 6.375 | 6.613 | 5.818 | 5.622 | 9.628 | 9.551 |
| | PSNR | 28.979 | 28.720 | 29.926 | 30.157 | 26.192 | 26.263 |
| | NCD | 0.0110 | 0.0112 | 0.0263 | 0.0261 | 0.0167 | 0.0168 |
| FLRSTF_ANGLE | MAE | 6.374 | 6.620 | 5.826 | 5.629 | 9.825 | 9.643 |
| | PSNR | 28.968 | 28.716 | 29.905 | 30.133 | 26.007 | 26.164 |
| | NCD | 0.0110 | 0.0112 | 0.0263 | 0.0262 | 0.0175 | 0.0171 |
| FDARTF_G | MAE | 5.501 | 5.603 | 4.459 | 4.279 | 8.503 | 8.281 |
| | PSNR | 29.170 | 29.291 | 32.510 | 32.917 | 27.309 | 27.611 |
| | NCD | 0.0105 | 0.0106 | 0.0219 | 0.0216 | 0.0155 | 0.0151 |
| VMMKNN_G | MAE | 7.987 | 7.886 | 6.178 | 6.103 | 8.777 | 8.459 |
| | PSNR | 25.368 | 25.607 | 29.799 | 29.826 | 25.348 | 25.801 |
| | NCD | 0.0146 | 0.0141 | 0.0286 | 0.0285 | 0.0153 | 0.0147 |
| CBM3D | MAE | 4.863 | 4.854 | 3.512 | 3.509 | 7.821 | 7.795 |
| | PSNR | 33.185 | 33.193 | 33.658 | 33.931 | 30.790 | 30.670 |
| | NCD | - | - | - | - | - | - |

**Gaussian noise VAR = 0.01**

| Algorithm | Criterion | Chair (20) | Chair (30) | Miss America (20) | Miss America (30) | Flowers (20) | Flowers (30) |
|---|---|---|---|---|---|---|---|
| FLRSTF | MAE | 8.400 | 8.325 | 7.477 | 7.362 | 11.932 | 11.652 |
| | PSNR | 26.701 | 26.589 | 27.686 | 27.720 | 24.363 | 24.559 |
| | NCD | 0.0140 | 0.0137 | 0.0257 | 0.0260 | 0.0205 | 0.0203 |
| FLRSTF_ANGLE | MAE | 8.458 | 8.319 | 7.500 | 7.386 | 11.971 | 11.661 |
| | PSNR | 26.638 | 26.605 | 27.681 | 27.687 | 24.340 | 24.556 |
| | NCD | 0.0140 | 0.0137 | 0.0258 | 0.0259 | 0.0205 | 0.0203 |
| FDARTF_G | MAE | 7.416 | 7.245 | 6.069 | 5.909 | 10.438 | 10.056 |
| | PSNR | 27.454 | 27.391 | 30.059 | 30.300 | 25.717 | 26.054 |
| | NCD | 0.0135 | 0.0128 | 0.0220 | 0.0216 | 0.0185 | 0.0182 |
| VMMKNN_G | MAE | 9.250 | 9.677 | 8.143 | 8.081 | 9.916 | 9.634 |
| | PSNR | 25.159 | 24.427 | 27.612 | 27.683 | 24.629 | 24.978 |
| | NCD | 0.0174 | 0.0164 | 0.0287 | 0.0282 | 0.0167 | 0.0167 |
| CBM3D | MAE | 7.712 | 7.729 | 4.518 | 4.526 | 9.934 | 9.896 |
| | PSNR | 32.1 | 32.06 | 32.58 | 32.87 | 29.45 | 29.363 |
| | NCD | - | - | - | - | - | - |

Columns (20) and (30) refer to frames 20 and 30 of each sequence.
A more sophisticated filter used for comparison is CBM3D [17]. This filter operates in a transform domain, in two steps: blocks are grouped by spatio-temporal predictive block-matching, and each 3D group is filtered by shrinkage in a 3D transform domain. Both CBM3D and the complex 3D wavelet transform method 3DWF show better results in terms of the PSNR and MAE criteria than our proposed filter. For the Flowers sequence, the results of our algorithm are worse because the additional time-recursive filtering in pixels where no motion is detected becomes less effective with a moving camera. Advantages of our filtering method include the prevention of spatio-temporal blur: when motion is detected, only neighboring pixels from the current frame are considered. Another advantage is the preservation of detail in the frame content: filtering is weakened when large spatial activity, e.g., a large variance, is detected in the current filtering window. As a consequence, more noise is left, but large spatial activity corresponds to high spatial frequencies, to which the eye is not sensitive enough to notice it. In homogeneous areas, strong filtering is performed to remove as much noise as possible. The performance of our methodology is similar to that achieved in the paper of Mélange et al. [27], and it too was outperformed by the CBM3D method.
Averaged results for the PSNR, MAE, and NCD criteria
| Sequence | Criterion | Filter | VAR 0.001 | 0.005 | 0.01 | 0.015 | 0.02 | 0.03 | 0.04 | 0.05 |
|---|---|---|---|---|---|---|---|---|---|---|
| Miss America | MAE | FLRSTF | 3.624 | 5.967 | 7.625 | 8.853 | 9.847 | 11.414 | 12.999 | 14.477 |
| | | FDARTF_G | 3.549 | 4.542 | 6.126 | 7.453 | 8.465 | 9.832 | 10.822 | 11.694 |
| | | VMMKNN_G | 4.377 | 6.217 | 8.198 | 9.871 | 11.372 | 13.934 | 16.072 | 17.921 |
| | | VGVDF_G | 3.71 | 5.685 | 7.419 | 8.92 | 10.253 | 12.563 | 14.609 | 16.441 |
| | PSNR | FLRSTF | 34.303 | 29.73 | 27.573 | 26.267 | 25.328 | 23.998 | 22.827 | 21.888 |
| | | FDARTF_G | 34.013 | 32.303 | 29.929 | 28.258 | 27.127 | 25.811 | 25.02 | 24.404 |
| | | VMMKNN_G | 32.057 | 29.689 | 27.55 | 26.045 | 24.887 | 23.23 | 22.083 | 21.213 |
| | | VGVDF_G | 33.048 | 30.384 | 28.383 | 26.921 | 25.789 | 24.111 | 22.847 | 21.85 |
| | NCD | FLRSTF | 0.013 | 0.021 | 0.027 | 0.031 | 0.034 | 0.039 | 0.044 | 0.049 |
| | | FDARTF_G | 0.013 | 0.016 | 0.022 | 0.027 | 0.031 | 0.036 | 0.04 | 0.042 |
| | | VMMKNN_G | 0.015 | 0.022 | 0.029 | 0.034 | 0.04 | 0.049 | 0.058 | 0.066 |
| | | VGVDF_G | 0.013 | 0.021 | 0.027 | 0.033 | 0.037 | 0.046 | 0.053 | 0.06 |
| Flowers | MAE | FLRSTF | 6.011 | 9.866 | 12.013 | 13.823 | 14.911 | 17.142 | 20.408 | 20.664 |
| | | FDARTF_G | 7.295 | 8.847 | 10.647 | 12.262 | 13.539 | 15.288 | 16.464 | 17.39 |
| | | VMMKNN_G | 7.754 | 9.07 | 10.31 | 11.357 | 12.258 | 13.896 | 15.346 | 16.642 |
| | | VGVDF_G | 8.04 | 9.588 | 10.786 | 11.736 | 12.571 | 14.036 | 15.331 | 16.459 |
| | PSNR | FLRSTF | 30.289 | 25.969 | 24.289 | 23.052 | 22.411 | 21.182 | 19.569 | 19.525 |
| | | FDARTF_G | 28.139 | 26.906 | 25.427 | 24.255 | 23.424 | 22.375 | 21.722 | 21.24 |
| | | VMMKNN_G | 26.128 | 25.324 | 24.523 | 23.858 | 23.303 | 22.347 | 21.564 | 20.907 |
| | | VGVDF_G | 25.765 | 24.835 | 24.135 | 23.602 | 23.139 | 22.358 | 21.693 | 21.127 |
| | NCD | FLRSTF | 0.011 | 0.018 | 0.021 | 0.025 | 0.026 | 0.03 | 0.033 | 0.036 |
| | | FDARTF_G | 0.014 | 0.017 | 0.02 | 0.023 | 0.025 | 0.028 | 0.03 | 0.032 |
| | | VMMKNN_G | 0.014 | 0.016 | 0.018 | 0.02 | 0.021 | 0.023 | 0.025 | 0.027 |
| | | VGVDF_G | 0.017 | 0.019 | 0.021 | 0.023 | 0.024 | 0.026 | 0.028 | 0.03 |
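The criteria reported in the table are standard objective measures. A minimal sketch of how MAE and PSNR are typically computed for 8-bit color frames follows; NCD additionally requires a conversion to a perceptual color space such as L*u*v*, which is omitted here:

```python
import numpy as np

def mae(orig, filt):
    """Mean absolute error, averaged over all pixels and RGB channels."""
    return np.abs(orig.astype(np.float64) - filt.astype(np.float64)).mean()

def psnr(orig, filt, peak=255.0):
    """Peak signal-to-noise ratio in dB for 8-bit frames (peak = 255)."""
    mse = ((orig.astype(np.float64) - filt.astype(np.float64)) ** 2).mean()
    return 10.0 * np.log10(peak ** 2 / mse)
```

Lower MAE and higher PSNR indicate better restoration, which is how the rows of the table above are to be read.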
Since the proposed algorithm is adaptive, it is difficult to count exactly how many additions, multiplications, divisions, and trigonometric operations are carried out. Instead, we measured real-time performance on a Texas Instruments DM642 DSP (Texas Instruments, Dallas, TX, USA) [32], with the following results: the proposed FDARTF_G spent an average of 17.78 s per frame, while the fully directional VGVDF_G algorithm spent an average of 25.6 s per frame, both in QCIF format.
4. Conclusions
The fuzzy and directional techniques working together have proven to be a powerful framework for image filtering applied to color video denoising in QCIF sequences. This robust algorithm performs motion detection and local estimation of the noise standard deviation. These video-sequence characteristics are obtained and converted into parameters used as thresholds in the different stages of the proposed filter. The algorithm processes only the t and t + 1 video frames, producing appreciable savings in the time and resources expended in computational filtering.
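The motion-gated two-frame processing summarized above can be sketched as follows. This is an illustrative simplification of the idea, not the paper's fuzzy rules; `motion_thr` and `alpha` are hypothetical parameters, not values from the paper:

```python
import numpy as np

def two_frame_temporal(prev, curr, motion_thr=20.0, alpha=0.5):
    """Motion-gated two-frame temporal averaging (illustrative sketch).

    Where the inter-frame difference is small (no motion detected), the
    current pixel is averaged with the co-located pixel of the previous
    frame; where motion is detected, the current pixel is kept unchanged
    to avoid temporal blur. motion_thr and alpha are hypothetical.
    """
    prev = prev.astype(np.float64)
    curr = curr.astype(np.float64)
    static = np.abs(curr - prev) < motion_thr       # per-pixel motion mask
    out = curr.copy()
    out[static] = alpha * prev[static] + (1 - alpha) * curr[static]
    return out
```

Because only two frames are held in memory at a time, this kind of recursion keeps the buffering and delay low, which is the resource saving the conclusion refers to.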
Using the advantages of both techniques (directional and fuzzy), it was possible to design an algorithm that preserves the edges and fine details of video frames while maintaining their inherent color, improving the preservation of color texture compared with the results obtained by the comparative algorithms. Another important conclusion is that, for sequences obtained by a still camera, our method performs better in terms of PSNR than other multiresolution filters of similar complexity, but it is outperformed by some more sophisticated methods (CBM3D).
Simulation results under the PSNR, MAE, and NCD criteria were used to characterize the algorithm's efficiency in noise suppression and in the preservation of fine details, edges, and chromatic properties. The perceptual errors demonstrate the advantages of the proposed filtering approach.
Declarations
Acknowledgements
The authors thank the Instituto Politécnico Nacional de México (National Polytechnic Institute of Mexico) and CONACYT for their financial support.
References
- Rosales-Silva AJ, Gallegos-Funes FJ, Ponomaryov V: Fuzzy Directional (FD) filter for impulsive noise reduction in colour video sequences. J. Vis. Commun. Image Represent. 2012, 23(1):143-149. doi:10.1016/j.jvcir.2011.09.007
- Amer A, Schröder H: A new video noise reduction algorithm using spatial subbands. Int. Conf. on Electronic Circuits and Systems, 13-16 October 1996, 1:45-48.
- De Haan G: IC for motion-compensated deinterlacing, noise reduction, and picture rate conversion. IEEE Trans. Consum. Electron. 1999, 45(3):617-624. doi:10.1109/30.793549
- Rajagopalan R, Orchard M: Synthesizing processed video by filtering temporal relationships. IEEE Trans. Image Process. 2002, 11(1):26-36. doi:10.1109/83.977880
- Seran V, Kondi LP: New temporal filtering scheme to reduce delay in wavelet-based video coding. IEEE Trans. Image Process. 2007, 16(12):2927-2935.
- Zlokolica V, De Geyter M, Schulte S, Pizurica A, Philips W, Kerre E: Fuzzy logic recursive change detection for tracking and denoising of video sequences. Paper presented at the IS&T/SPIE Symposium on Electronic Imaging, San Jose, CA, USA, 14 March 2005. doi:10.1117/12.585854
- Pizurica A, Zlokolica V, Philips W: Noise reduction in video sequences using wavelet-domain and temporal filtering. Paper presented at the SPIE Conference on Wavelet Applications in Industrial Processing, USA, 27 February 2004. doi:10.1117/12.516069
- Selesnick W, Li K: Video denoising using 2D and 3D dual-tree complex wavelet transforms. Proc. SPIE on Wavelet Applications in Signal and Image Processing, vol. 5207, pp. 607-618, 14 November 2003. doi:10.1117/12.504896
- Rajpoot N, Yao Z, Wilson R: Adaptive wavelet restoration of noisy video sequences. Paper presented at the IEEE International Conference on Image Processing, pp. 957-960, October 2004. doi:10.1109/ICIP.2004.1419459
- Ercole C, Foi A, Katkovnik V, Egiazarian K: Spatio-temporal pointwise adaptive denoising of video: 3D nonparametric regression approach. Paper presented at the First Workshop on Video Processing and Quality Metrics for Consumer Electronics, January 2005.
- Rusanovskyy D, Egiazarian K: Video denoising algorithm in sliding 3D DCT domain. In Advanced Concepts for Intelligent Vision Systems, Lecture Notes in Computer Science 3708. Springer-Verlag; 2005:618-625.
- Ponomaryov V, Rosales-Silva A, Gallegos-Funes F: Paper presented at the Proc. of SPIE-IS&T, SPIE Proceedings vol. 6811: Real-Time Image Processing 2008, 4 March 2008. doi:10.1117/12.758659
- Varghese G, Wang Z: Video denoising based on a spatio-temporal Gaussian scale mixture model. IEEE Trans. Circ. Syst. Video Technol. 2010, 20(7):1032-1040.
- Jovanov L, Pizurica A, Schulte S, Schelkens P, Munteanu A, Kerre E, Philips W: Combined wavelet-domain and motion-compensated video denoising based on video codec motion estimation methods. IEEE Trans. Circ. Syst. Video Technol. 2009, 19(3):417-421.
- Dai J, Oscar C, Yang W, Pang C, Zou F, Wen X: Color video denoising based on adaptive color space conversion. Proceedings of the 2010 IEEE International Symposium on Circuits and Systems (ISCAS), June 2010, pp. 2992-2995. doi:10.1109/ISCAS.2010.5538013
- Liu C, Freeman WT: A high-quality video denoising algorithm based on reliable motion estimation. In Proceedings of the 11th European Conference on Computer Vision: Part III. Heraklion, Crete, Greece: Springer-Verlag; 2010:706-719.
- Dabov K, Foi A, Egiazarian K: Video denoising by sparse 3D transform-domain collaborative filtering. In Proc. 15th European Signal Processing Conference (EUSIPCO 2007), Poznan, Poland, September 2007.
- Mairal J, Sapiro G, Elad M: Learning multiscale sparse representations for image and video restoration. SIAM Multiscale Model. Simul. 2008, 7(1):214-241. doi:10.1137/070697653
- Zlokolica V, Schulte S, Pizurica A, Philips W, Kerre E: Fuzzy logic recursive motion detection and denoising of video sequences. J. Electron. Imag. 2006, 15(2):1-13. doi:10.1117/1.2201548
- Trahanias PE, Karakos D, Venetsanopoulos AN: Directional processing of color images: theory and experimental results. IEEE Trans. Image Process. 1996, 5(6):868-880. doi:10.1109/83.503905
- Ponomaryov VI: Real-time 2D-3D filtering using order statistics based algorithms. J. Real-Time Image Proc. 2007, 1(3):173-194. doi:10.1007/s11554-007-0021-5
- Ponomaryov V, Rosales-Silva A, Golikov V: Adaptive and vector directional processing applied to video color images. Electron. Lett. 2006, 42(11):1-2.
- Arizona State University video trace library. http://trace.eas.asu.edu/yuv/ . Accessed October 2010
- Zheng J, Valavanis KP, Gauch JM: Noise removal from color images. J. Intell. Robot. Syst. 1993, 7:3.
- Plataniotis KN, Venetsanopoulos AN: Color Image Processing and Applications. Springer-Verlag; 2000.
- Pearson A: Fuzzy Logic Fundamentals. Chapter 3, 2001, pp. 61-103. www.informit.com/content/images/0135705991/samplechapter/0135705991.pdf . Accessed August 2008
- Mélange T, Nachtegael M, Kerre EE, Zlokolica V, Schulte S, Witte VD, Pizurica A, Philips W: Video denoising by fuzzy motion and detail adaptive averaging. J. Electron. Imag. 2008, 17(4):043005. doi:10.1117/1.2992065
- Yu S, Ahmad O, Swamy MNS: Video denoising using motion compensated 3-D wavelet transform with integrated recursive temporal filtering. IEEE Trans. Circ. Syst. Video Technol. 2010, 20(6):780-791.
- Chatterjee P, Milanfar P: Clustering-based denoising with locally learned dictionaries. IEEE Trans. Image Process. 2009, 18(7):1438-1451.
- Zuo C, Liu Y, Tan X, Wang W, Zhang M: Video denoising based on a spatiotemporal Kalman-bilateral mixture model. Sci. World J. (Hindawi) 2013.
- Li S, Yin H, Fang L: Group-sparse representation with dictionary learning for medical image denoising and fusion. IEEE Trans. Biomed. Eng. 2012, 59(12).
- Texas Instruments DM642 evaluation module. http://www.ti.com/tool/tmdsevm642 . Accessed January 2008
Copyright
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.