
Image decomposition-based structural similarity index for image quality assessment

Abstract

Perceptual image quality assessment (IQA) uses a computational model to assess image quality in a manner consistent with the human visual system (HVS). From the viewpoint of the HVS, different image regions have different importance. Based on this fact, we propose a simple and effective IQA method based on image decomposition. In our method, we first divide an image into two components: an edge component and a texture component. To separate the edge and texture components, we use the TV flow-based nonlinear diffusion method rather than the classic TV regularization methods, for computational efficiency. Unlike existing content-based IQA methods, we apply different measures to different components to compute image quality. More specifically, the luminance and contrast similarities are computed in the texture component, while the structural similarity is computed in the edge component. After obtaining the local quality map, we use the texture component again as a weight function to derive a single quality score. Experimental results on five datasets show that, compared with previous approaches in the literature, the proposed method is more efficient and delivers higher prediction accuracy.

1 Introduction

With the wide use of digital images, image quality assessment (IQA) has become extremely important in many applications, such as image acquisition, watermarking, compression, transmission, restoration, enhancement, and denoising [1–3]. Major advances in IQA have occurred during the past decades. IQA methods can generally be classified into two classes. One is subjective assessment, where the image quality is judged by human observers. The other is objective assessment, whose goal is to design algorithms that mimic the subjective judgment accurately and automatically. In practice, subjective assessment is usually inconvenient, time-consuming, and expensive, which makes it impractical in real-world applications. According to the availability of a reference image, objective IQA indices can be classified as full-reference (FR), no-reference (NR), and reduced-reference (RR) methods.

Due to the significant advantages of objective IQA, many excellent schemes have been proposed. These schemes can generally be categorized into three types: intensity-based methods, human visual system (HVS)-based methods, and structure feature-based methods [4]. The classical intensity-based methods, including the mean squared error (MSE) and the peak signal-to-noise ratio (PSNR) [5], are widely used in FR-IQA because of their simplicity and clear meaning. However, they treat the image as a plain signal when evaluating its quality, so their scores often do not agree with human subjective evaluation.

To address this problem, many FR-IQA methods based on HVS properties have been proposed. Unlike MSE or PSNR, these methods try to construct a mathematical model that simulates HVS characteristics, including the visual masking effect [6], contrast [7], and just noticeable differences [8]. The noise quality measure index and the visual signal-to-noise ratio index (VSNR) emphasize the importance of HVS sensitivity to different visual signals, such as the luminance, the contrast, the frequency content, and the interaction between them. However, as pointed out in [9, 10], since knowledge about the various processing stages in the HVS is limited, there is no satisfactory visual perception model that accounts for all the experimental findings on the HVS.

The structural similarity (SSIM) index proposed by Wang et al. [3] brought FR-IQA to the structure-based stage [11]. The method is derived from the hypothesis that the HVS is highly adapted for extracting structural information from the visual scene, and therefore, a measurement of structural similarity can provide a good approximation of the perceived image quality. Due to the success of SSIM, contrast and structure information are considered two important factors in FR-IQA. Based on this idea, a number of modifications have been proposed to improve SSIM’s performance [12–15]. Based on the fact that the HVS is selective for a certain range of spatial frequencies [16], the multi-scale method is introduced into SSIM in [12]; this method (MSSSIM) incorporates the SSIM at five different resolutions with the application of successive low-pass filtering. In [13], Wang and Li improved the original MSSSIM to the information content weighted SSIM index (IWSSIM) by introducing a new information content weighting (IW)-based quality score. In [14], Chen et al. proposed gradient SSIM (G-SSIM), in which the contrast similarity and structural similarity are computed in the gradient domain. In [15], SSIM is applied directly in the discrete wavelet decomposition bands, and the overall image quality is then evaluated by the weighted mean over all bands. In [17], Wang et al. proposed a patch-based objective quality assessment method that uses an adaptive representation of local patch structure and evaluates the perceptual distortions in different ways. Since SSIM has proven to be more effective in quantifying suprathreshold compression artifacts, such as artifacts that distort the structure of an image [18], it has been used in various scenarios, such as video coding and image denoising [19, 20]. In [19], Wang et al. proposed a perceptual video coding framework based on SSIM-inspired divisive normalization. In [20], the SSIM index is embedded into the framework of non-local means image denoising.

In the last few decades, some effective features that characterize contrast and structural information in images well have been employed to improve the performance of FR-IQA metrics [11, 21–23]. For example, the gradient magnitude has been used to characterize contrast and structural information and has played an important role in recent FR-IQA methods.

Based on the fact that different image regions have different importance for the HVS, some researchers assign visual importance weights to improve the performance of FR-IQA indices [11, 23, 24]. Zhang et al. proposed a Riesz transform-based feature similarity (RFSIM) index [23] for FR-IQA. This method consists of three steps. First, the first- and second-order Riesz transforms are introduced to characterize local structures in images. Then, based on the assumption that the HVS is sensitive to image edges, key locations are marked by a mask formed by the Canny operator. Finally, only the Riesz transform coefficients within key locations are used for evaluating visual quality scores. Recently, Zhang et al. [11] also proposed a feature similarity (FSIM) index, where the phase congruency and the gradient magnitude are used to measure the local structures. However, the above-mentioned methods are time-consuming, which prevents their use in real-time applications.

In this paper, we take two important facts into consideration: different image regions have different importance for the HVS, and different quality metrics have different sensitivities in different regions. Inspired by this, we propose a simple and effective image decomposition-based structural similarity (IDSSIM) index for image quality assessment. In our method, we first partition an image into two components, the edge and texture components, using the TV flow-based nonlinear diffusion method. Then, the mean and standard deviation of the texture component are used to evaluate the local luminance and contrast similarity, and the gradient magnitude of the edge component is used to evaluate the local structural similarity. The effects of the changes in edge and texture are integrated using different weights to obtain the local image quality score. Finally, the texture component is employed as a weight function to derive a single similarity score. Since chrominance information also affects how the HVS understands images, we further extend IDSSIM by incorporating the chrominance information for color IQA, and we call this extension IDSSIMc. The experimental results on five benchmark datasets demonstrate that the proposed method provides reliable FR-IQA performance.

The rest of this paper is organized as follows. In Section 2, we describe the proposed model in detail. Experimental results on five datasets are given in Section 3, and the conclusion follows in Section 4.

2 Image decomposition-based structural similarity index

2.1 Motivation

The rationale behind the proposed method is that the edge and texture regions have different importance for visual perception. As shown in Fig. 1, panel a is a reference image, while panels d and g are two distorted versions of it (the distortion types are additive Gaussian noise and non-eccentricity pattern noise, respectively). Panels b, e, and h are the edge components of panels a, d, and g, respectively. Panels c, f, and i are the texture components of a, d, and g, respectively. We can see that the images in panels c and f differ more obviously than those in panels b and e. In contrast, the differences between panels b and h are more obvious than those between panels c and i. This example clearly illustrates that different regions show different sensitivity to different distortion types. To examine this statement further, we analyze three representative methods, PSNR, SSIM, and FSIM, in different image regions on TID2013. The following steps describe the process:

Fig. 1 Examples of image decomposition. a is a reference image, while d and g are two distorted versions of it (the distortion types are additive Gaussian noise and non-eccentricity pattern noise, respectively); b, e, and h are the edge components of a, d, and g, respectively; c, f, and i are the texture components of a, d, and g, respectively

  1. Divide an image into two components, edge and texture, using the TV flow-based image decomposition. More details about this method are given below.

  2. Compute the PSNR, SSIM, and FSIM indices for each region.

  3. Compute the Spearman correlation coefficient for each component.

From Table 1, the best results for PSNR and SSIM are obtained only when considering the texture regions of an image. These PSNR and SSIM results can be explained by the contrast sensitivity function (CSF), according to which human eyes are more sensitive to middle frequencies than to lower and higher frequencies. Since the gradient magnitude is a high-frequency component of an image, FSIM shows its best performance in the edge region. Motivated by this observation, we propose to apply different methods to different regions to compute image quality. The luminance and contrast are two important attributes for characterizing the quality of an image [3]. Since human contrast sensitivity is highest at middle frequencies, we compute the luminance and contrast similarity in the texture region. Besides the luminance and contrast, the structure also plays an important role in the perceived visual quality. Here, we compute the structural similarity in the edge image. In the following, we explain the proposed method in detail.

Table 1 Performance of FSIM and SSIM in different regions

2.2 Proposed method

In this section, we propose a novel FR-IQA method based on image decomposition. The proposed image quality metric works with luminance only. RGB color inputs are converted into the YIQ color space [25], defined as

$$ \left[ \begin{array}{c} Y\\ I\\ Q \end{array} \right] = \left[\begin{array}{ccc} 0.299 &0.587 &0.114 \\ 0.596& -0.274& -0.322\\ 0.211& -0.523 &0.312 \end{array}\right]\left[\begin{array}{c} R\\ G\\ B \end{array}\right] $$
(1)

where Y represents the luminance information, and I and Q convey the chrominance information.
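As an illustration, the conversion of Eq. (1) can be sketched in plain Python. The function names and the nested-list image layout are assumptions made for this example only; they are not part of the proposed method.

```python
# Sketch only: names and the list-of-rows image layout are illustrative assumptions.
YIQ_MATRIX = (
    (0.299, 0.587, 0.114),    # Y: luminance
    (0.596, -0.274, -0.322),  # I: chrominance
    (0.211, -0.523, 0.312),   # Q: chrominance
)


def rgb_to_yiq(r, g, b):
    """Apply Eq. (1) to one RGB pixel and return the (Y, I, Q) triple."""
    return tuple(m[0] * r + m[1] * g + m[2] * b for m in YIQ_MATRIX)


def split_channels(image):
    """Convert a 2D list of (R, G, B) pixels into separate Y, I, and Q planes."""
    planes = ([], [], [])
    for row in image:
        converted = [rgb_to_yiq(*pixel) for pixel in row]
        for k in range(3):
            planes[k].append([p[k] for p in converted])
    return planes
```

A gray pixel with R = G = B maps to Y equal to that gray level and to I = Q = 0, because the first matrix row sums to one and the other two rows sum to zero.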

The framework of IDSSIM, shown in Fig. 2, consists of the following four steps:

  1. Partition an image into edge and texture component images using the TV flow image decomposition.

  2. Compare the luminance and contrast similarity in the texture image.

  3. Compare the structural similarity in the edge image.

  4. Compute the global perceptual quality score with the texture as the weight function.

Fig. 2 The framework of the proposed approach. First, the RGB color reference and distorted images are converted into the YIQ color space. The luminance channels of the reference and distorted images are divided into edge and texture components. Then, the mean and standard deviation of the texture component are used to evaluate the local luminance and contrast similarity; the gradient magnitude of the edge component is used to evaluate the local structural similarity. The effects of the changes in edge and texture are integrated using different weights to obtain the local image quality score. Moreover, I and Q, the two chrominance channels, are used as features to characterize the quality degradation caused by color distortions. Finally, the texture component is employed as a weighting function to derive a single similarity score

2.3 TV flow-based image decomposition model

An image f can be regarded as the sum of the edge image \(f_u\) (piecewise smooth, with sharp edges along contours) and the texture image \(f_v\) (containing only fine-scale details, usually of an oscillatory nature): \(f = f_u + f_v\). Image decomposition is widely used in the literature on image coding, image denoising, image registration, and texture discrimination. A general way to obtain this decomposition with the variational approach is to solve the problem \(\min \{ TV(f_u) \mid \| f_u - f \|_B \le \sigma \}\), where \(TV(f_u)\) denotes the total variation of \(f_u\) and \(\| \cdot \|_B\) is a norm. The total variation of \(f_u\) is minimized to regularize \(f_u\) while keeping edges, such as object boundaries of f, in \(f_u\) [26]. In our method, we use a TV flow-based nonlinear diffusion technique [27], which is the parabolic counterpart to TV regularization [18], instead of TV regularization. In 1D, TV flow and TV regularization yield exactly the same output. In 2D, this equivalence has not been proven so far; however, both processes at least approximate each other very well [27].

The edge image \(f_u\) evolves with the artificial time t according to the partial differential equation (PDE)

$$ {f}_{u} = {u}^{t+1}, \quad {u}^{t+1}={u}^{t}+\text{div}\left(g\left(\left|\nabla {u}^{t} \right| \right)\nabla {u}^{t} \right) $$
(2)

where t is the iteration number, div is the divergence operator, \(\nabla\) is the gradient operator, and g(·) is the diffusivity function.

Note that it is critical to choose a proper diffusivity function g(·). In order to reduce the smoothing at edges, the diffusivity g(·) is chosen as a decreasing function of the edge detector \(\left|\nabla u^{t}\right|\). In this paper, we choose the TV flow diffusivity [28], defined as:

$$ g(x)=\frac{1}{\epsilon+x} $$
(3)

where ε is a small positive constant.

In practice, nonlinear diffusion is computationally expensive, which limits its practical application. To overcome this limitation, we adopt an efficient approach, the additive operator splitting (AOS) scheme, defined as:

$$ {u}^{t+1}=\frac{1}{2}\left({\left(I-2\tau {A}_{x} \left({u}^{t} \right)\right)}^{-1}+ {\left(I-2\tau {A}_{y} \left({u}^{t} \right)\right)}^{-1} \right){u}^{t} $$
(4)

where I is the identity matrix, τ is the time step, and \(A_x\) and \(A_y\) denote the diffusion matrices computed in the horizontal and vertical directions, respectively. Unlike explicit schemes, this semi-implicit scheme uses a backward Euler-type discretization to obtain systems of linear equations and is stable for any time step. The efficiency of the diffusion can therefore be improved by using a larger time step. More details about the method can be found in [29].

In the following, the texture image is defined as:

$$ {f}_{v} = {f}-{u}^{t} $$
(5)

where t is the number of iterations. Examples of edge and texture images are shown in Fig. 1. The performance variations according to the time step τ and iteration number t settings are given in Section 3.
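The decomposition of Eqs. (2) to (5) can be sketched in plain Python as follows. This is a minimal illustration, not the implementation used in the experiments: the starting point u^0 = f, the central-difference gradient, the averaging of neighboring diffusivities, the replicated borders, and the default ε are all assumptions of this sketch, and the function names are hypothetical. Each one-dimensional system (I − 2τA) x = u is tridiagonal and is solved with the Thomas algorithm.

```python
import math


def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system; lower[0] and upper[-1] are ignored."""
    n = len(rhs)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def solve_line(values, g, tau):
    """Solve (I - 2*tau*A) x = values along one row or column."""
    n = len(values)
    if n == 1:
        return list(values)
    w = [(g[i] + g[i + 1]) / 2.0 for i in range(n - 1)]  # weights between neighbors
    lower = [0.0] + [-2.0 * tau * w[i - 1] for i in range(1, n)]
    upper = [-2.0 * tau * w[i] for i in range(n - 1)] + [0.0]
    diag = [1.0 - lower[i] - upper[i] for i in range(n)]
    return thomas(lower, diag, upper, list(values))


def diffusivity(u, eps):
    """TV flow diffusivity of Eq. (3), evaluated on central differences."""
    h, w = len(u), len(u[0])
    g = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            gx = (u[r][min(c + 1, w - 1)] - u[r][max(c - 1, 0)]) / 2.0
            gy = (u[min(r + 1, h - 1)][c] - u[max(r - 1, 0)][c]) / 2.0
            g[r][c] = 1.0 / (eps + math.hypot(gx, gy))
    return g


def aos_step(u, tau, eps=1e-3):
    """One AOS update, Eq. (4): average of the row-wise and column-wise solves."""
    h, w = len(u), len(u[0])
    g = diffusivity(u, eps)
    rows = [solve_line(u[r], g[r], tau) for r in range(h)]
    cols = [solve_line([u[r][c] for r in range(h)],
                       [g[r][c] for r in range(h)], tau) for c in range(w)]
    return [[0.5 * (rows[r][c] + cols[c][r]) for c in range(w)] for r in range(h)]


def decompose(f, tau=500.0, iterations=1):
    """Return (edge, texture) with texture = f - edge, as in Eq. (5)."""
    u = [list(row) for row in f]  # assumed initial condition u^0 = f
    for _ in range(iterations):
        u = aos_step(u, tau)
    v = [[a - b for a, b in zip(fr, ur)] for fr, ur in zip(f, u)]
    return u, v
```

Because each line system is symmetric with unit row sums, the sketch preserves the mean gray value of the image, and a constant image yields an all-zero texture component.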

2.4 Image decomposition-based structural similarity

With the extracted edge and texture images, we now present the IDSSIM index for FR-IQA. Suppose that we want to calculate the similarity between a reference image \(f_1\) and a distorted image \(f_2\). The computation of IDSSIM consists of two stages. In the first stage, the local similarity map is computed; in the second stage, we pool the similarity map into an overall quality score. We separate the IDSSIM measurement between \(f_1(x)\) and \(f_2(x)\) into two components, one for the edge image and one for the texture image.

For the texture image, we divide the similarity measurement into two components: luminance similarity and contrast similarity. Similar to [3], we use the mean and standard deviation as estimates of the signal luminance and contrast, respectively. Let \(\mu_1\) and \(\mu_2\) denote the means of the texture images \(f_{v1}\) and \(f_{v2}\), and let \(\sigma_1\) and \(\sigma_2\) denote their standard deviations. The similarity of the local statistics is defined as:

$$ S_{\mu }(x)=\frac{2{\mu }_{1}\left (x \right)\cdot {\mu }_{2}\left (x \right)+{C}_{1}}{{{\mu }_{1}\left (x \right)}^{2}+{{\mu }_{2}\left (x \right)}^{2}+{C}_{1}} $$
(6)
$$ S_{\sigma }(x)=\frac{2{\sigma }_{1}\left (x \right)\cdot {\sigma }_{2}\left (x \right)+{C}_{2}}{{{\sigma }_{1}\left (x \right)}^{2}+{{\sigma}_{2}\left (x \right)}^{2}+{C}_{2}} $$
(7)

where \(C_1\) and \(C_2\) are positive constants that increase the stability of \(S_\mu(x)\) and \(S_\sigma(x)\).

Specifically, we use a K × K circular-symmetric Gaussian weighting function \(W = \{w_i \mid i = 1, 2, \ldots, N\}\), with a standard deviation of 1.5 samples, normalized to unit sum, the same as [3]. The estimates of μ(x) and σ(x) are then computed as:

$$ \mu \left (x \right)=\sum_{i=1}^{N}w_{i}x_{i} $$
(8)
$$ \sigma\left (x \right)=\left (\sum_{i=1}^{N} w_{i}\left (x_{i}-\mu \left (x \right) \right)^{2}\right)^{\frac{1}{2}} $$
(9)

Finally, \(S_\mu(x)\) and \(S_\sigma(x)\) are combined to get the local similarity of the texture image, TS(x), defined as:

$$ TS\left (x \right) = \left [ S_{\mu }(x)\right ]^{\alpha }\cdot\left [ S_{\sigma }(x) \right ]^{\beta } $$
(10)

where α and β are two parameters used to adjust the relative importance of \(S_\mu(x)\) and \(S_\sigma(x)\). In our experiments, we set α=β=1.
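For a single window position, Eqs. (6) to (10) can be sketched as below. The window size of 11, the values of C_1 and C_2, and all function names are placeholders assumed for this example; the settings actually used are those listed in Table 3. With α = β = 1 the exponents have no effect; for fractional exponents a negative similarity value would need separate treatment, which this sketch does not attempt.

```python
import math


def gaussian_window(size=11, sigma=1.5):
    """Circular-symmetric Gaussian weights, normalized to unit sum."""
    half = (size - 1) / 2.0
    w = [[math.exp(-((r - half) ** 2 + (c - half) ** 2) / (2.0 * sigma ** 2))
          for c in range(size)] for r in range(size)]
    total = sum(map(sum, w))
    return [[x / total for x in row] for row in w]


def weighted_stats(patch, window):
    """Weighted mean and standard deviation of one patch, Eqs. (8) and (9)."""
    pairs = [(w, x) for wr, pr in zip(window, patch) for w, x in zip(wr, pr)]
    mu = sum(w * x for w, x in pairs)
    var = sum(w * (x - mu) ** 2 for w, x in pairs)
    return mu, math.sqrt(var)


def ratio_similarity(a, b, c):
    """The ratio (2ab + c) / (a^2 + b^2 + c) shared by Eqs. (6) and (7)."""
    return (2.0 * a * b + c) / (a * a + b * b + c)


def texture_similarity(patch1, patch2, window, c1=0.01, c2=0.03,
                       alpha=1.0, beta=1.0):
    """Local texture similarity TS of Eq. (10); c1 and c2 are placeholder values."""
    mu1, sd1 = weighted_stats(patch1, window)
    mu2, sd2 = weighted_stats(patch2, window)
    s_mu = ratio_similarity(mu1, mu2, c1)
    s_sigma = ratio_similarity(sd1, sd2, c2)
    return (s_mu ** alpha) * (s_sigma ** beta)
```

Sliding the window over every pixel of the two texture images produces the TS map; two identical patches give TS = 1.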

Now, we describe how to compute the structural similarity in the edge image. Structural information is an excellent attribute for characterizing the quality of an image. Proper structural change may even improve the perceptual quality of images. There are different measures of structure, such as the gradient modulus (GM) and the Harris response. Here, we choose the gradient modulus to compute the structural similarity. Several differentiation operators can accomplish this task [30–34], such as the Sobel operator [30], the Prewitt operator [31], and the Scharr operator. In this paper, we choose the Prewitt operator. With the Prewitt operator, the partial derivatives \(G_x(x)\) and \(G_y(x)\) are calculated as:

$$ {G}_{x}\left (x \right)= \left[ \begin{array}{ccc} -1 & 0 &1 \\ -1& 0 &1 \\ -1& 0 & 1 \end{array}\right] *f\left (x \right), \quad {G}_{y}\left (x \right)= \left[ \begin{array}{ccc} 1 & 1 &1 \\ 0 & 0 & 0\\ -1& -1& -1 \end{array}\right] *f\left (x \right) $$

The GM of f(x) is then computed as \(G\left (x \right) = \sqrt {{G}^{2}_{x}\left (x \right)+{G}^{2}_{y}\left (x \right)}\). Let \(G_1\) and \(G_2\) denote the GMs of the edge images \(f_{u1}\) and \(f_{u2}\); the structural similarity is then defined as:

$$ ES(x)=\frac{2{G}_{1}\left (x \right)\cdot {G }_{2}\left (x \right)+{C}_{3}}{{{G }_{1}\left (x \right)}^{2}+{{G}_{2}\left (x \right)}^{2}+{C}_{3}} $$
(11)
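The gradient modulus and Eq. (11) can be sketched as follows. The border handling (replicated pixels), the value of C_3, and the function names are assumptions of this example. The filter is applied as a correlation; for these antisymmetric kernels this differs from a convolution only in the sign of the derivatives, which does not change the gradient modulus.

```python
import math

PREWITT_X = ((-1, 0, 1), (-1, 0, 1), (-1, 0, 1))
PREWITT_Y = ((1, 1, 1), (0, 0, 0), (-1, -1, -1))


def filter3x3(image, kernel):
    """Correlate a 2D list with a 3x3 kernel, replicating border pixels."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), h - 1)
                    cc = min(max(c + dc, 0), w - 1)
                    acc += kernel[dr + 1][dc + 1] * image[rr][cc]
            out[r][c] = acc
    return out


def gradient_modulus(image):
    """G(x) = sqrt(G_x(x)^2 + G_y(x)^2) with the Prewitt operator."""
    gx = filter3x3(image, PREWITT_X)
    gy = filter3x3(image, PREWITT_Y)
    return [[math.hypot(a, b) for a, b in zip(ra, rb)] for ra, rb in zip(gx, gy)]


def edge_similarity(edge1, edge2, c3=1.0):
    """Structural similarity map ES of Eq. (11); c3 is a placeholder value."""
    g1 = gradient_modulus(edge1)
    g2 = gradient_modulus(edge2)
    return [[(2.0 * a * b + c3) / (a * a + b * b + c3) for a, b in zip(ra, rb)]
            for ra, rb in zip(g1, g2)]
```

Identical edge images give ES = 1 everywhere, and the value drops toward zero where one image has a strong edge that the other lacks.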

Then, TS(x) and ES(x) are combined, with the exponents γ and δ adjusting their relative importance, to get the local similarity S(x) of \(f_1(x)\) and \(f_2(x)\), defined as:

$$ S\left (x \right) = \left [ TS\left (x \right) \right ]^{\gamma }\cdot\left [ ES\left (x \right) \right ]^{\delta } $$
(12)

After computing the local similarity S(x) at each location x, the overall similarity can be calculated. The most commonly used pooling strategy is average pooling, i.e., simply averaging the local quality map to obtain the final FR-IQA score. However, different locations contribute differently to the HVS’s perception of the image [11]. In [11] and [35], the phase congruency and the visual saliency map are used as the weighting functions in the overall similarity. Based on the analysis above, for a given location x, if either \(f_{v1}\) or \(f_{v2}\) has a significant value at x (i.e., a large difference between the image and its diffused version), this position will have a high impact on the HVS. Therefore, we use \(TM_m(x) = \max(f_{v1}(x), f_{v2}(x))\) to weight the importance of S(x) in the overall similarity, and the IDSSIM index is defined as:

$$ \text{IDSSIM}=\frac{\sum_{x\in \eta}^{}S\left (x \right)\cdot TM_{m}\left (x \right)}{\sum_{x\in \eta}^{}TM_{m}\left (x \right)} $$
(13)

where η denotes the whole image spatial domain.
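The pooling of Eq. (13) amounts to a weighted average of the local similarity map. A minimal sketch follows; the function name and the fallback to plain averaging when all weights are zero are assumptions of this example, since the text does not discuss that case.

```python
def idssim_pool(similarity_map, texture1, texture2):
    """Pool S(x) with weights TM_m(x) = max(f_v1(x), f_v2(x)), as in Eq. (13)."""
    num = den = 0.0
    count = 0
    plain_sum = 0.0
    for s_row, t1_row, t2_row in zip(similarity_map, texture1, texture2):
        for s, t1, t2 in zip(s_row, t1_row, t2_row):
            weight = max(t1, t2)
            num += s * weight
            den += weight
            plain_sum += s
            count += 1
    if den == 0.0:
        # Assumed fallback: average pooling when every weight vanishes.
        return plain_sum / count
    return num / den
```

For example, with S = [1.0, 0.5] and weights 1 and 3, the pooled score is (1.0 · 1 + 0.5 · 3) / 4 = 0.625.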

2.5 Extension to color IQA

It is known that variations of the chrominance components also affect the perceived visual quality of color images. To reflect this effect in IDSSIM, we devise two similarity measures \(S_I\) and \(S_Q\) that compare the chrominance values, defined as:

$$ S_{I}(x)=\frac{2{I}_{1}\left (x \right)\cdot {I }_{2}\left (x \right)+{C}_{4}}{{{I }_{1}\left (x \right)}^{2}+{{I}_{2}\left (x \right)}^{2}+{C}_{4}} $$
(14)
$$ S_{Q}(x)=\frac{2{Q}_{1}\left (x \right)\cdot {Q }_{2}\left (x \right)+{C}_{5}}{{{Q }_{1}\left (x \right)}^{2}+{{Q}_{2}\left (x \right)}^{2}+{C}_{5}} $$
(15)

where \(C_4\) and \(C_5\) are positive constants. Finally, the IDSSIM index can be extended to IDSSIMc, defined as:

$$ \text{IDSSIMc}=\frac{\sum_{x\in \eta}^{}S\left (x \right)\cdot\left [S_{I}(x)\cdot S_{Q}(x)\right ]^{\lambda }\cdot TM_{m}\left (x \right)}{\sum_{x\in \eta}^{}TM_{m}\left (x \right)} $$
(16)

where λ is a parameter used to adjust the relative importance of chrominance features.
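Eqs. (14) to (16) can be sketched as follows, with every per-pixel quantity passed as a flat sequence in the same pixel order. The function names are hypothetical, and the constants C_4, C_5, and λ must be supplied by the caller because their values are not stated in the text. Since I and Q can be negative, the product S_I · S_Q can become negative, and a fractional power of it is then not real-valued; how that case is handled is not specified here, so the sketch simply raises an error.

```python
def ratio_similarity(a, b, c):
    """The ratio (2ab + c) / (a^2 + b^2 + c) shared by Eqs. (14) and (15)."""
    return (2.0 * a * b + c) / (a * a + b * b + c)


def idssimc(s, i1, i2, q1, q2, tm, c4, c5, lam):
    """Pool per-pixel values into the IDSSIMc score of Eq. (16)."""
    num = den = 0.0
    for k in range(len(s)):
        s_i = ratio_similarity(i1[k], i2[k], c4)
        s_q = ratio_similarity(q1[k], q2[k], c5)
        chroma = s_i * s_q
        if chroma < 0.0:
            # Not covered by the text: a negative base with a fractional exponent.
            raise ValueError("negative chrominance similarity product")
        num += s[k] * (chroma ** lam) * tm[k]
        den += tm[k]
    return num / den
```

When the chrominance channels of the two images are identical, S_I = S_Q = 1 and the score reduces to the IDSSIM value of Eq. (13).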

3 Simulation results and discussion

3.1 Databases and evaluation criteria

The performance of the proposed method is tested on five well-known image quality assessment databases: the TID2013 database [36], the TID2008 database [37], the Categorical Image Quality (CSIQ) database [38], the LIVE database [39], and the A57 database [40]. The characteristics of these databases are listed in Table 2.

Table 2 Benchmark datasets for evaluating IQA indices

In the following experiments, we use four evaluation criteria to compare the performance of the FR-IQA methods: the Spearman rank order correlation coefficient (SROCC), the Kendall rank order correlation coefficient (KROCC), the Pearson linear correlation coefficient (PLCC), and the root-mean-squared error (RMSE). The SROCC and KROCC measure the prediction monotonicity of an IQA index; the larger the value, the better the performance. These two criteria consider only the rank of the data points and ignore the relative distance between them. Before computing the other two criteria, it is customary to apply a logistic transform to obtain a nonlinear mapping between the objective scores and the subjective mean opinion scores (MOS). The PLCC measures the correlation between the objective scores and the MOS after nonlinear regression; a larger value means better performance. The RMSE measures the prediction consistency; a smaller value means better performance. For the nonlinear regression, we use the following mapping function [39]:

$$ f\left (x \right)=\beta_{1}\cdot \left (\frac{1}{2}-\frac{1}{1+\text{exp}\left (\beta_{2}\cdot \left (x-\beta_{3} \right) \right)} \right)+\beta_{4}\cdot x+\beta_{5} $$
(17)

where \(\beta_i\), i=1,2,…,5, are parameters to be fitted. More details about the four performance metrics can be found in [13]. We compare our method with 10 other state-of-the-art and representative FR-IQA methods, including VIF [41], GSM [21], PSNR [5], VSNR [40], SSIM [3], MSSSIM [12], IWSSIM [13], RFSIM [23], FSIM/FSIMc [11], and SFF [42].
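The mapping of Eq. (17) and three of the evaluation criteria can be sketched in plain Python as below. The function names are hypothetical, and the fitting of the parameters β_1 to β_5 by nonlinear regression is deliberately left out; the sketch only evaluates the mapping for given parameters. Degenerate inputs (constant score lists, very large exponents) are not handled.

```python
import math


def logistic_map(x, b1, b2, b3, b4, b5):
    """Five-parameter mapping of Eq. (17) for given, already fitted parameters."""
    return b1 * (0.5 - 1.0 / (1.0 + math.exp(b2 * (x - b3)))) + b4 * x + b5


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def plcc(a, b):
    """Pearson linear correlation coefficient."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def srocc(a, b):
    """Spearman rank order correlation: Pearson correlation of the ranks."""
    return plcc(ranks(a), ranks(b))


def rmse(predicted, target):
    """Root-mean-squared error between mapped scores and subjective scores."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(predicted, target))
                     / len(predicted))
```

Because the SROCC depends only on ranks, it can be computed directly on the raw objective scores, whereas the PLCC and RMSE are computed on the scores after the mapping.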

3.2 Determination of parameters

Several parameters need to be determined for IDSSIM/IDSSIMc. We tuned the parameters on the TID2013 database, which contains 25 reference images and the associated 3000 distorted images. The tuning criterion is that the parameter value leading to a higher SROCC is chosen. To show how the performance of IDSSIM/IDSSIMc depends on the time step τ and the iteration number t, we conducted experiments in which both are varied. As shown in Fig. 3, the SROCC increases with the time step and the iteration number. It is also noteworthy that a small number of iterations combined with a large time step already yields a significant improvement, with less processing time. Considering the overall performance on all the benchmark databases, we set the number of iterations to t=1 and the time step to τ=500. The parameters of IDSSIM/IDSSIMc are listed in Table 3.

Fig. 3 Performance with different parameters. The SROCC of IDSSIM with different parameter settings on the a TID2008 and b TID2013 datasets

Table 3 Parameter settings for IDSSIM

In the IDSSIM pooling stage, the texture image is used as a weighting function. Figure 4 shows the influence of using the texture component as a weighting function. This experiment is carried out on five databases: the TID2013 database [36], the TID2008 database [37], the CSIQ database [38], the LIVE database [39], and the A57 database [40]. The SROCC is used as the evaluation criterion here. From Fig. 4, we observe that IDSSIM performs better when the texture image is adopted as the weight function.

Fig. 4 Weight function performance. The SROCC of IDSSIM with and without the weight function on the LIVE, TID2008, TID2013, CSIQ, and A57 databases

3.3 Performance evaluation

In this section, we compare the performance of the competing FR-IQA models on the five FR-IQA databases in terms of SROCC, KROCC, PLCC, and RMSE. Note that, except for FSIMc, SFF, and IDSSIMc, all the IQA indices are based on the luminance component of the image. The results are listed in Table 4. For each performance measure, the three FR-IQA indices producing the best results are highlighted in italics. In Table 5, we list the performance ranking of all the IQA metrics according to their SROCC values. For fairness, the FSIMc, SFF, and IDSSIMc indices, which also exploit the chrominance information of images, are excluded from Table 5. Notice that most of the metrics perform well on the LIVE database, which contains only a few distortion types. Therefore, the experimental results on the TID databases are more reliable.

Table 4 Comparison of 8 IQA indices on three benchmark datasets
Table 5 Ranking of IQA metrics’ performance (except for FSIMc, SFF, and IDSSIMc) on five databases

In Table 4, we can see that the proposed IDSSIMc performs consistently well on all the benchmark databases. On the largest database TID2013, the proposed method IDSSIMc achieves the best results. SFF is the second best performing method. On TID2008, IDSSIMc shows the best performance, closely followed by FSIMc. The results on CSIQ and LIVE databases show that, even though it is not the best, IDSSIMc performs only slightly worse than the best results. On the A57 database, VSNR performs the best, and IDSSIM and IWSSIM perform almost the same. In Table 5, we can see that our methods achieve the best results on almost all the databases, except for TID2008 and LIVE. Even on these two databases, however, the proposed IDSSIM is only slightly worse than the best results.

Table 6 shows the weighted-average SROCC, KROCC, and PLCC results over three datasets. The weight assigned to each dataset depends linearly on the number of distorted images it contains. The results show that the performance of the proposed IDSSIM/IDSSIMc is superior to that of the other methods. Moreover, Fig. 5 shows the scatter plots of the subjective scores against the objective scores predicted on TID2013. Compared with the other scatter plots, those of the proposed IDSSIM and IDSSIMc show better linearity and correlation. It is, therefore, reasonable to conclude that the objective scores predicted by IDSSIM/IDSSIMc are more correlated with subjective ratings than those of the other methods.
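The weighted averaging can be illustrated with a short sketch. The function name is hypothetical, and the dataset names and numbers in the usage example are made-up placeholders, not results from Table 6.

```python
def weighted_average(scores, sizes):
    """Average per-dataset scores with weights proportional to dataset size.

    scores: dict mapping dataset name to a criterion value (e.g., SROCC).
    sizes: dict mapping dataset name to its number of distorted images.
    """
    total = sum(sizes[name] for name in scores)
    return sum(scores[name] * sizes[name] for name in scores) / total


# Placeholder values for illustration only.
example = weighted_average({"A": 0.9, "B": 0.8}, {"A": 300, "B": 100})
```

With these placeholder values, the larger dataset dominates the average: (0.9 · 300 + 0.8 · 100) / 400 = 0.875.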

Fig. 5 Scatter plots of subjective MOS against scores obtained by model prediction on the TID2013 database: a IFC, b VIF, c GSM, d PSNR, e VSNR, f SSIM, g MSSSIM, h IWSSIM, i RFSIM, j FSIM, k IDSSIM, and l IDSSIMc

Table 6 Weighted-average performances over three datasets

3.4 Statistical significance

In order to draw statistically meaningful conclusions about the models’ performance, the left-tailed F-test is conducted on the prediction residuals between the metric outputs (after nonlinear mapping) and the subjective ratings. Let F denote the ratio between the residual variances of two different metrics; \(F_{\text{critical}}\) is calculated based on the number of residuals and a given confidence level. If F is larger than \(F_{\text{critical}}\), then the difference between the two metrics is considered to be significant at the specified confidence level. The test results at the 95 % confidence level are shown in Fig. 6 for the TID2008 and TID2013 databases, where the proposed metric is compared with the other metrics regarding the statistical significance. In each entry, the symbol “1” means that the model in the row is statistically (with 95 % confidence) better than the model in the column, while “0” means that it is not. We can see that on the TID2013 database, IDSSIMc is significantly better than all the other models except for FSIMc. On the TID2008 database, IDSSIMc is significantly better than all the other models except for SFF and FSIMc. Note that on the two databases, no IQA model performs significantly better than IDSSIMc.
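The variance-ratio comparison can be sketched as follows. The function names are hypothetical, the choice of which metric goes in the numerator is an assumption of this example, and the critical value is taken as an input (for instance from an F-distribution table for the given number of residuals and confidence level) rather than computed.

```python
def residual_variance(predicted, subjective):
    """Sample variance of the residuals between mapped scores and ratings."""
    residuals = [p - s for p, s in zip(predicted, subjective)]
    mean = sum(residuals) / len(residuals)
    return sum((r - mean) ** 2 for r in residuals) / (len(residuals) - 1)


def f_ratio(pred_a, pred_b, subjective):
    """F = residual variance of metric A divided by that of metric B."""
    return residual_variance(pred_a, subjective) / residual_variance(pred_b, subjective)


def b_significantly_better(pred_a, pred_b, subjective, f_critical):
    """True if metric B has significantly smaller residual variance than A."""
    return f_ratio(pred_a, pred_b, subjective) > f_critical
```

A large ratio indicates that metric B predicts the subjective ratings with much smaller residual variance than metric A.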

Fig. 6 The results of statistical significance tests of the competing IQA models on the a TID2013 and b TID2008 databases. The value of “1” (highlighted in green) indicates that the model in the row is significantly better than the model in the column, while the value of “0” (highlighted in red) indicates that the first model is not significantly better than the second one

3.5 Performance comparison on individual distortion types

To further examine the robustness of the FR-IQA schemes, we compare the performance of our method with that of the other methods on each distortion type in the TID2013 and TID2008 databases. In this experiment, we use only the SROCC values as the performance measure. For each distortion type, the top three results are highlighted in italics. From Table 7, we can clearly see that IDSSIMc is among the top three indices 15 times on TID2013 and 10 times on TID2008. Thus, we can conclude that the proposed method also performs well when the distortion is of a specific type.

Table 7 SROCC values of IQA indices for each type of distortion in TID2013 and TID2008

3.6 Computational cost

The computational cost of each FR-IQA method is also measured. This experiment is performed on a 2.5-GHz Intel Core i5 processor with 10 GB RAM. The software is MATLAB R2014a. All distorted images in the TID2013 dataset are used. To analyze the processing time in detail, we divide the proposed scheme into four main steps: image decomposition, comparison of the luminance and contrast similarity in the texture image, comparison of the structural similarity in the edge image, and computation of the global perceptual quality score. The average processing time for the test dataset is shown in Table 8. These results suggest that the proposed method is fast enough to be implemented in real-time applications. It should be noted that the image decomposition consumes most of the processing time. The average processing time of each FR-IQA method is listed in Table 9. From Table 9, we can see that PSNR and GMS are much faster than IDSSIM. However, their prediction accuracy is considerably lower than that of IDSSIM. Moreover, IDSSIM runs much faster than the other modern IQA indices that achieve state-of-the-art prediction performance.

Table 8 Analysis of the processing time for the proposed method
Table 9 Time cost of each FR-IQA index
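A per-step timing breakdown of this kind can be sketched as follows. This is an illustration only: the original measurements were made in MATLAB, the stage callables below are hypothetical placeholders that simply pass their input through, and only the four stage names come from the text. The sketch assumes a non-empty list of inputs.

```python
import time

# Stage names follow the four steps listed in the text.
STAGE_NAMES = [
    "image decomposition",
    "luminance and contrast similarity (texture image)",
    "structural similarity (edge image)",
    "global quality score",
]


def time_stages(stages, inputs, repeats=1):
    """Average wall-clock seconds per input for each named stage.

    stages: list of (name, callable); each callable receives the output
    of the previous stage, so the stages form a simple pipeline.
    """
    totals = {name: 0.0 for name, _ in stages}
    for item in inputs:
        for _ in range(repeats):
            data = item
            for name, fn in stages:
                start = time.perf_counter()
                data = fn(data)
                totals[name] += time.perf_counter() - start
    runs = len(inputs) * repeats
    return {name: total / runs for name, total in totals.items()}


# Placeholder pipeline: every stage is the identity function (hypothetical).
placeholder_stages = [(name, lambda x: x) for name in STAGE_NAMES]

if __name__ == "__main__":
    for name, seconds in time_stages(placeholder_stages, [0, 1, 2]).items():
        print(f"{name}: {seconds:.6f} s")
```

Replacing the placeholders with real stage functions would show which step dominates the total time.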

As mentioned earlier, IQA algorithms can be used not only for quality assessment tasks but also in many other applications. A direct application of IQA measures is to benchmark image processing algorithms and systems [43]. For example, rate distortion (RD) curves are often used to characterize the performance of image coding systems, where the RD function relates the bit rate to the distortion between the original and decoded images. A lower RD curve indicates a better image coder. To compute this distortion and obtain the RD curve, many methods based on MSE have been proposed. However, these methods suffer from low accuracy, and an RD curve can evaluate an image coder precisely only if the underlying IQA method is accurate. To improve the accuracy, VIF, FSIM, and MSSSIM have been proposed. However, these methods suffer from low computational efficiency, which prevents their use in many applications. Different from previous work, the proposed IDSSIM achieves both high accuracy and high efficiency, which makes it attractive and competitive for real-time applications.
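The RD-curve benchmark described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: images are flat lists of pixel values, rate is measured in bits per pixel, and MSE is used as the distortion because it is the baseline named in the text. The function names and the toy pixel data are hypothetical. The `distortion` argument is pluggable, so any measure where a lower value means less distortion could be passed in.

```python
def mse(reference, decoded):
    """Mean squared error between two equally sized flat pixel sequences."""
    if len(reference) != len(decoded):
        raise ValueError("images must have the same number of pixels")
    return sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)


def rd_curve(reference, encodings, distortion=mse):
    """Build (bits per pixel, distortion) points sorted by rate.

    encodings: list of (compressed_size_in_bits, decoded_pixels).
    """
    n_pixels = len(reference)
    points = [(bits / n_pixels, distortion(reference, dec)) for bits, dec in encodings]
    return sorted(points)


def is_lower(curve_a, curve_b):
    """True if curve A has no more distortion than curve B at every shared rate."""
    b_by_rate = dict(curve_b)
    shared = [(d, b_by_rate[r]) for r, d in curve_a if r in b_by_rate]
    return bool(shared) and all(da <= db for da, db in shared)


# Toy data only: two hypothetical coders evaluated at the same two rates.
reference = [10, 20, 30, 40]
coder_a = rd_curve(reference, [(8, [10, 20, 30, 44]), (16, [10, 20, 30, 41])])
coder_b = rd_curve(reference, [(8, [14, 20, 30, 44]), (16, [10, 22, 30, 40])])
```

In this toy case `is_lower(coder_a, coder_b)` holds, which corresponds to coder A being the better coder under the chosen distortion measure.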

4 Conclusions

In this paper, we propose an efficient and robust method for image quality assessment. Unlike prior work, we apply different methods to different image components to compute image quality. The motivation is that different quality metrics have different sensitivities in different regions. We also exploit the AOS scheme to compute the diffusion map efficiently. In the pooling stage, the texture component image is used to weight the local quality map. We then extend IDSSIM to IDSSIMc by taking the chromatic features of the image into consideration. Finally, we conduct extensive experiments on five databases; the results demonstrate that the proposed methods outperform the other state-of-the-art methods.
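The weighted pooling step can be illustrated with a short sketch. It assumes the weights are non-negative per-pixel values derived from the texture component; the exact weight function is defined earlier in the paper and is not reproduced here, so `weighted_pool` and the toy maps are hypothetical. The single score is the weighted average of the local quality map.

```python
def weighted_pool(quality_map, weight_map, eps=1e-12):
    """Collapse a local quality map into one score using per-pixel weights.

    Both maps are lists of rows with identical shape. If the weights sum
    to (almost) zero, fall back to the plain mean of the quality map.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    plain_sum = 0.0
    count = 0
    for q_row, w_row in zip(quality_map, weight_map):
        for q, w in zip(q_row, w_row):
            weighted_sum += w * q
            weight_total += w
            plain_sum += q
            count += 1
    if weight_total <= eps:
        return plain_sum / count
    return weighted_sum / weight_total


# Toy 2x2 maps; the values are illustrative only.
toy_quality = [[1.0, 0.5], [0.5, 1.0]]
toy_weights = [[1.0, 3.0], [3.0, 1.0]]
```

Here the low-quality pixels carry the larger weights, so the pooled score (0.625) falls below the unweighted mean (0.75).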

References

  1. S Bharadwaj, M Vatsa, R Singh, Biometric quality: a review of fingerprint, iris, and face. EURASIP J. Image Video Process. 2014(1), 1–28 (2014).

  2. L Zhang, S Battiato, Z Wang, R Schettini, KR Rao, Emerging methods for color image and video quality enhancement. EURASIP J. Image Video Process. 2010(1), 891703 (2011).

  3. Z Wang, AC Bovik, HR Sheikh, EP Simoncelli, Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13(4), 600–612 (2004).

  4. EC Larson, DM Chandler, Most apparent distortion: full-reference image quality assessment and the role of strategy. J. Electron. Imaging. 19(1), 011006 (2010).

  5. Z Wang, AC Bovik, Mean squared error: love it or leave it? A new look at signal fidelity measures. IEEE Signal Proc. Mag. 26(1), 98–117 (2009).

  6. J Ross, HD Speed, Contrast adaptation and contrast masking in human vision. Proc. R. Soc. Lond. B Biol. Sci. 246(1315), 61–70 (1991).

  7. SJ Daly, Application of a noise-adaptive contrast sensitivity function to image data compression. Opt. Eng. 29(8), 977–987 (1990).

  8. J Lubin, in International Broadcasting Convention, 1997. A human vision system model for objective picture quality measurements (IET, Amsterdam, 1997), pp. 498–503.

  9. W Xue, L Zhang, X Mou, AC Bovik, Gradient magnitude similarity deviation: a highly efficient perceptual image quality index. IEEE Trans. Image Process. 23(2), 684–695 (2014).

  10. A Beghdadi, R Iordache, Image quality assessment using the joint spatial/spatial-frequency representation. EURASIP J. Adv. Signal Process. 2006(1), 1–8 (2006).

  11. L Zhang, L Zhang, X Mou, D Zhang, FSIM: a feature similarity index for image quality assessment. IEEE Trans. Image Process. 20(8), 2378–2386 (2011).

  12. Z Wang, EP Simoncelli, AC Bovik, in Conference Record of the Thirty-Seventh Asilomar Conference on Signals, Systems and Computers. Multiscale structural similarity for image quality assessment (IEEE, Pacific Grove, 2003), pp. 1398–1402.

  13. Z Wang, Q Li, Information content weighting for perceptual image quality assessment. IEEE Trans. Image Process. 20(5), 1185–1198 (2011).

  14. G-H Chen, C-L Yang, S-L Xie, in 2006 IEEE International Conference on Image Processing. Gradient-based structural similarity for image quality assessment (IEEE, Atlanta, 2006), pp. 2929–2932.

  15. C-L Yang, W-R Gao, L-M Po, in 15th IEEE International Conference on Image Processing (ICIP 2008). Discrete wavelet transform-based structural similarity for image quality assessment (IEEE, San Diego, 2008), pp. 377–380.

  16. S Winkler, Issues in vision modeling for perceptual video quality assessment. Signal Process. 78(2), 231–252 (1999).

  17. S Wang, K Ma, H Yeganeh, Z Wang, W Lin, A patch-structure representation method for quality assessment of contrast changed images. IEEE Signal Process. Lett. 22(12), 2387–2390 (2015).

  18. AC Brooks, X Zhao, TN Pappas, Structural similarity quality metrics in a coding context: exploring the space of realistic distortions. IEEE Trans. Image Process. 17(8), 1261–1273 (2008).

  19. T-S Ou, H Chen, Y-H Chen, SSIM-based perceptual rate control for video coding. IEEE Trans. Circ. Syst. Video Technol. 21(5), 682–691 (2011).

  20. A Rehman, Z Wang, in 18th IEEE International Conference on Image Processing (ICIP 2011). SSIM-based non-local means image denoising (IEEE, Brussels, 2011), pp. 217–220.

  21. A Liu, W Lin, M Narwaria, Image quality assessment based on gradient similarity. IEEE Trans. Image Process. 21(4), 1500–1512 (2012).

  22. D-O Kim, H-S Han, R-H Park, Gradient information-based image quality metric. IEEE Trans. Consum. Electron. 56(2), 930–936 (2010).

  23. L Zhang, L Zhang, X Mou, in 17th IEEE International Conference on Image Processing (ICIP 2010). RFSIM: a feature based image quality assessment metric using Riesz transforms, (2010), pp. 321–324.

  24. K Gu, S Wang, G Zhai, S Ma, X Yang, W Zhang, Content-weighted mean-squared error for quality assessment of compressed images. SIViP. 10(5), 803–810 (2016).

  25. J-M Geusebroek, R Van den Boomgaard, AW Smeulders, H Geerts, Color invariance. IEEE Trans. Pattern Anal. Mach. Intell. 23(12), 1338–1350 (2001).

  26. W Yin, D Goldfarb, S Osher, A comparison of three total variation based texture extraction models. J. Vis. Commun. Image Represent. 18(3), 240–252 (2007).

  27. T Brox, J Weickert, in Computer Vision-ECCV 2004. A TV flow based local scale measure for texture discrimination (Springer, Berlin Heidelberg, 2004), pp. 578–590.

  28. F Andreu, C Ballester, V Caselles, JM Mazón, Minimizing total variation flow. Differ. Integral Equ. 14(3), 321–360 (2001).

  29. J Weickert, BH Romeny, MA Viergever, Efficient and reliable schemes for nonlinear diffusion filtering. IEEE Trans. Image Process. 7(3), 398–410 (1998).

  30. RC Gonzalez, RE Woods, Digital image processing (Prentice Hall, 2002).

  31. JMS Prewitt, in Picture Processing and Psychopictorics. Object enhancement and extraction (Academic Press, New York, 1970), pp. 75–149.

  32. LG Roberts, Machine perception of three-dimensional solids. Ph.D. thesis (Massachusetts Institute of Technology, Cambridge, 1963).

  33. MJ Bentum, BBA Lichtenbelt, T Malzbender, Frequency analysis of gradient estimators in volume rendering. IEEE Trans. Vis. Comput. Graph. 2(3), 242–254 (1996).

  34. AR Rivera, JR Castillo, O Chae, Local directional number pattern for face analysis: face and expression recognition. IEEE Trans. Image Process. 22(5), 1740–1752 (2013).

  35. L Zhang, Y Shen, H Li, VSI: a visual saliency-induced index for perceptual image quality assessment. IEEE Trans. Image Process. 23(10), 4270–4281 (2014).

  36. NN Ponomarenko, O Ieremeiev, VV Lukin, C-CJ Kuo, in 2013 4th European Workshop on Visual Information Processing (EUVIP). Color image database TID2013: peculiarities and preliminary results (IEEE, Paris, 2013), pp. 106–111.

  37. NN Ponomarenko, VV Lukin, AA Zelensky, F Battisti, TID2008 - a database for evaluation of full-reference visual quality assessment metrics. Adv. Mod. Radioelectron. 10(4), 30–45 (2009).

  38. EC Larson, DM Chandler, Most apparent distortion: full-reference image quality assessment and the role of strategy. J. Electron. Imaging. 19(1), 011006 (2010).

  39. HR Sheikh, MF Sabir, AC Bovik, A statistical evaluation of recent full reference image quality assessment algorithms. IEEE Trans. Image Process. 15(11), 3440–3451 (2006).

  40. DM Chandler, SS Hemami, VSNR: a wavelet-based visual signal-to-noise ratio for natural images. IEEE Trans. Image Process. 16(9), 2284–2298 (2007).

  41. HR Sheikh, AC Bovik, Image information and visual quality. IEEE Trans. Image Process. 15(2), 430–444 (2006).

  42. H-W Chang, H Yang, Y Gan, M-H Wang, Sparse feature fidelity for perceptual image quality assessment. IEEE Trans. Image Process. 22(10), 4007–4018 (2013).

  43. Z Wang, Applications of objective image quality assessment methods. IEEE Signal Proc. Mag. 28(6), 137–142 (2011).

Authors’ contributions

JFY proposed the framework of this work, carried out the whole experiments, and drafted the manuscript. YPL supervised the whole work, participated in its design, offered useful suggestions, and helped to modify the manuscript. BO and XCZ participated in the discussion of this work and helped to polish the manuscript. All authors read and approved the final manuscript.

Competing interests

The authors declare that they have no competing interests.

Author information

Corresponding author

Correspondence to Yaping Lin.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Yang, J., Lin, Y., Ou, B. et al. Image decomposition-based structural similarity index for image quality assessment. J Image Video Proc. 2016, 31 (2016). https://doi.org/10.1186/s13640-016-0134-5

  • DOI: https://doi.org/10.1186/s13640-016-0134-5

Keywords