
Prediction error preprocessing for perceptual color image compression

Abstract

In this article, a prediction error preprocessor based on the just noticeable distortion (JND) for a color image compression scheme is presented. The more the dynamic range of the prediction error signals is reduced, the lower the bit rate of the reconstructed image that can be obtained at high visual quality. We propose a color JND estimator that is incorporated into the design of the preprocessor in the compression scheme. The color JND estimator operates in the wavelet domain to provide good estimates of the available amount of masking. The estimated JND is used to preprocess the signal and is also incorporated into the design of the quantization stage in the compression scheme for higher performance. Simulation results show that the bit rate required by the compression scheme with the preprocessor is lower at high visual quality of the reconstructed color image. The preprocessor is further applied to the input color image of the JPEG and JPEG2000 coders for better performance.

1. Introduction

On the Internet, where transmission bandwidth is limited, the demand for representing high-quality color images keeps growing. Since human eyes are the ultimate receivers of visual information, color image compression that is perceptually lossless to human visual perception is required. The color image compression scheme should take into account the properties of the human visual system (HVS) when image quality is the critical performance target. The goal of perceptual image compression is to represent a digital image at the lowest possible bit rate without introducing perceivable distortion. To reach this goal, the perceptual coder has to remove not only statistical redundancy but also perceptual redundancy of images. Accurately measuring the perceptual redundancy is important to the success of perceptual coding. Perceptual redundancy can be measured quantitatively as error detection thresholds or noise amplitudes of just noticeable distortion (JND) [1], by which signals can be neither undercoded nor overcoded.

Research efforts in perceptual image coding have so far focused on determining proper quantization steps with JND. Once the JND profile of the image is accurately measured, the quantization step size can be determined such that the coding distortion is properly distributed and shaped with less objective distortion. By combining band sensitivities, background luminance, and texture masking, Safranek and Johnston [2] measured the JND threshold for each coefficient in a given subband to set the quantization level in a differential pulse code modulation (DPCM) quantizer. In [3], quantization matrices for use in DCT-based compression were designed by exploiting visibility thresholds that are experimentally measured for quantization errors of the DCT coefficients. In [4], a JPEG-compliant encoder utilizing perceptually based quantization was proposed to produce a perceptually equivalent image with a high compression ratio. Chou and Li [5] estimated the JND threshold by the dominant effect between luminance masking and texture masking. The estimated JND profile is incorporated to tune the step size of a uniform quantizer in the proposed subband image coder. In [6], a model of the HVS based on the wavelet transform was proposed. The model has a number of modifications that make it more amenable to potential integration into a wavelet-based image compression scheme, and the author concludes with suggestions on how the model can be used to determine a visually optimal quantization strategy for wavelet coefficients and produce a quantitative measure of image quality. In [7], masking thresholds derived in a locally adaptive fashion based on subband decomposition are applied to the design of a locally adaptive perceptual quantization scheme for achieving high performance in terms of quality and bit rate. Tang [8] further investigated perceptual video coding by incorporating a motion attention model, a visual sensitivity model, and a visual masking model for the purpose of adaptive quantization. In [9], the sensitivity of the HVS to edges is exploited to construct a classified vector quantization method for image compression. Nevertheless, these research efforts focused on developing coding schemes for grayscale images. Perceptual compression schemes designed for color images can be found in [10–13]. Yang et al. [10] proposed a nonlinear additive model to estimate the spatial JND profiles for color image processing. In [11], wavelet-based color image compression exploiting the contrast sensitivity function (CSF) was presented. The method implements the CSF measure over the spatial frequency of luminance and chrominance components in the task of noise spectrum shaping and achieves a visually optimal compression quality. Based on the uniformity of the uniform color space, Liu and Chou [12] built a color visual model that estimates the perceptual redundancy of each color pixel as a visibility threshold of color difference to design the quantization strategy of a locally adaptive perceptual compression scheme for color images. In [13], the visual model proposed in [12] is modified and incorporated into the JPEG-LS and JPEG2000 coders to improve the performance in both cases.

Based on the JNDs of images, most research efforts in perceptual coding have concentrated on the design of proper quantizers. They attempt to discriminate between signal components which are and are not detected by human eyes [1]. The main idea in perceptual coding is to hide the quantization error below the detection threshold. Meanwhile, perceptually irrelevant signal information is also removed to improve the standard coding paradigm of redundancy removal. Besides using JND thresholds to adapt quantization step sizes for image coding, the JND thresholds can also be applied effectively to other stages of the image coding. In this article, a prediction error preprocessor based on the JND is investigated for higher performance in the design of the color image compression scheme. The proposed method is designed under the guidance of visual tolerance such that the dynamic range of the prediction error signals is reduced to obtain lower coding bit rates without decreasing the visual quality of the reconstructed color image. That is, the prediction error preprocessor is adapted by the JNDs of the color image. Since the measurement of JND profiles of the color image dominates this study, a JND estimator for color images is designed. In this article, the wavelet-domain JND of each coefficient in the luminance and chrominance components of color images is estimated in a locally adaptive fashion based on the wavelet decomposition. For a coefficient in the luminance component, its visual tolerance is measured coefficient by coefficient using the visual masking effects, taking into account the luminance content and the texture of grayscale images. For a coefficient in the chrominance components, its visual tolerance is measured by combining not only the visual masking effects but also the effect given by the variance within the local region of the target coefficient, while considering that the HVS is less sensitive to chrominance than to luminance. The preprocessor is then designed by adjusting an appropriate quantity regulated by the JND profiles to shape the prediction error signals such that the perceptual distortion of the reconstructed color image can be reduced. Furthermore, for any standard color image coding scheme, the proposed preprocessor, which is independent of image coders, can also be used to preprocess the input color image such that the processed signal can be coded with higher performance. The rest of this article is organized as follows. In Section 2, the estimation of subband JND profiles for color images is described. The proposed prediction error preprocessor based on the estimated subband JND profiles for color image compression is presented in Section 3. The simulation results on the overall performance of the coding scheme are given in Section 4. In Section 5, conclusions are drawn.

2. Subband JND profiles for color images

In the application of perceptual color image compression, where high visual quality of the reconstructed color image at lower bit rates is required, the appropriate choice of color space is important in determining the coding performance. Most image compression schemes expect the correlation between color channels to be reduced. Since the redundancy among color channels of the color image in the YUV and YCbCr color spaces is less than that in the RGB color space, most compression techniques use the former color spaces for coding images. For example, the Y channel in the YCbCr color space contains almost all the luminance information, while the Cb and Cr channels, which may easily be downsampled, carry less information. The subband JND profiles for color images in the YCbCr color space are thus estimated to build the prediction error preprocessor in this article. To obtain better visual quality of the reconstructed color image, the JND of each wavelet subband coefficient in the luminance and chrominance components is estimated to preprocess each coefficient signal.
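As a small illustration, a minimal RGB-to-YCbCr conversion is sketched below. The full-range BT.601 matrix (the variant used by JPEG) is an assumption on our part, since the article does not state which YCbCr definition is used.

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Convert an 8-bit RGB image (H, W, 3) to YCbCr (full-range BT.601)."""
    rgb = rgb.astype(np.float64)
    m = np.array([[ 0.299,     0.587,     0.114   ],   # Y
                  [-0.168736, -0.331264,  0.5     ],   # Cb
                  [ 0.5,      -0.418688, -0.081312]])  # Cr
    ycc = rgb @ m.T
    ycc[..., 1:] += 128.0  # center the two chroma channels on 128
    return ycc
```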

Measurement of the receptive fields shows that the multi-channel frequency- and orientation-selective components demonstrate approximately a dyadic structure. The measurement can be approached by the dyadic structure of the pyramid wavelet transformation that decomposes the input image into subbands of different levels and orientations. The subbands differ from one another in terms of their sensitivity and visual masking properties. By using such characteristics, the wavelet-domain JND of each coefficient in the luminance (Y) and chrominance (Cb and Cr) components of color images can be estimated in a locally adaptive fashion based on the wavelet decomposition. The estimated JND of the wavelet coefficient at location (i, j) in the subband with transform level λ and orientation θ of color component O in the color image is represented as

$$d_O(\lambda,\theta,i,j) = d_{O,D}(\lambda,\theta,i,j) \cdot a_O(\lambda,\theta,i,j) \quad \text{for } O = \text{Y, Cb, Cr}$$
(1)

where $d_{O,D}(\lambda,\theta,i,j)$ is the luminance-adapted base detection threshold and $a_O(\lambda,\theta,i,j)$ is the visual masking adjustment. The indexing for locating each wavelet subband is shown in Figure 1.

Figure 1. The indexing of subbands in a three-level wavelet decomposition.
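The sketch below shows one way the dyadic decomposition and the per-coefficient combination of Equation (1) might be organized. It assumes PyWavelets, with 'bior4.4' standing in for the 9/7 filter; the orientation labels follow Figure 1, and the two factors of Equation (1) are filled in by Sections 2.1 and 2.2.

```python
import pywt

def subbands(channel, wavelet='bior4.4', levels=3):
    """Yield (level, orientation, coefficients) for a dyadic decomposition.
    PyWavelets orders detail tuples from the coarsest level to the finest;
    mapping (cH, cV, cD) to ('HL', 'LH', 'HH') is a naming convention here."""
    coeffs = pywt.wavedec2(channel, wavelet, level=levels)
    yield levels, 'LL', coeffs[0]
    for k, detail in enumerate(coeffs[1:]):
        lam = levels - k  # transform level lambda of this detail tuple
        for theta, band in zip(('HL', 'LH', 'HH'), detail):
            yield lam, theta, band

def jnd(d_luminance_adapted, masking_adjustment):
    """Equation (1): elementwise product of the two factors for one subband."""
    return d_luminance_adapted * masking_adjustment
```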

2.1. Luminance-adapted base detection threshold, $d_{O,D}(\lambda,\theta,i,j)$

The threshold is measured for signals presented against a specified uniform luminance background. The intensity-based contrast sensitivity model proposed in [2, 7] is adopted here. The mathematical model used to measure the base detection threshold $d_{O,\text{base}}(\lambda,\theta)$ of the (λ, θ) subband in the O color component can be found in [14]. The contrast sensitivity is measured while a uniform background intensity level is held fixed. Table 1 shows the base detection threshold for each subband of the four-level 9/7 DWT. The base detection threshold is lowest for the lowest frequency band, while higher frequency bands have higher thresholds.

Table 1 Base detection threshold for each subband of four-level 9/7 DWT [14]
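For illustration, a sketch of the base-threshold model behind Table 1 follows. The functional form and the constants (amplitude a, steepness k, corner frequency f0, and per-orientation gains g) are those we associate with the luminance model of Watson et al. [14]; treat them as assumptions to be checked against [14] and Table 1 rather than as values taken from this article.

```python
import numpy as np

def d_base(lam, theta, r=32.0):
    """Base detection threshold for a (lam, theta) subband of the 9/7 DWT.
    r is the display visual resolution in pixels/degree; f is the nominal
    spatial frequency of transform level lam."""
    a, k, f0 = 0.495, 0.466, 0.401  # assumed constants from [14]
    g = {'LL': 1.501, 'LH': 1.0, 'HL': 1.0, 'HH': 0.534}[theta]
    f = r * 2.0 ** (-lam)
    return a * 10.0 ** (k * np.log10(f / (g * f0)) ** 2)
```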

In fact, the detection threshold varies with the background intensity. This is the so-called "luminance adaptation": variations in sensitivity depend on the local mean of the luminance component in color images. That is, variations in the local mean luminance within the color image result in substantial variations in the wavelet thresholds. In this article, the luminance adaptation factor, $f_{Y,l}(\lambda,\theta,i,j)$, is given by the power function proposed in [3] and is described as

$$f_{Y,l}(\lambda,\theta,i,j) = \left( \frac{z_Y(\lambda_{\max},\theta,i',j')}{z_{Y,\text{mean}}} \right)^{\alpha} \quad \text{for } i' = \left\lfloor \frac{i}{2^{(\lambda_{\max}-\lambda)}} \right\rfloor \text{ and } j' = \left\lfloor \frac{j}{2^{(\lambda_{\max}-\lambda)}} \right\rfloor$$
(2)

where ⌊·⌋ is the operator of rounding to the nearest smaller integer, α is a parameter with a suggested value of 0.649 [3], $z_{Y,\text{mean}}$ is the LL subband constant corresponding to the mean luminance of the display (128 for an 8-bit image), and $z_Y(\lambda_{\max},\theta,i',j')$ is the wavelet coefficient at location (i', j') in the subband with the highest level $\lambda_{\max}$ and orientation θ of the luminance component Y. According to the intensity-based contrast sensitivity model proposed in [2, 7], the luminance-adapted base detection threshold is therefore computed as

$$d_{O,D}(\lambda,\theta,i,j) = d_{O,\text{base}}(\lambda,\theta) \cdot f_{Y,l}(\lambda,\theta,i,j) \quad \text{for } O = \text{Y, Cb, Cr}$$
(3)

Herein, the model designed for grayscale images is also applied to the chrominance components, since human visual perception is more sensitive to the luminance component than to the chrominance components.
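A minimal sketch of Equations (2) and (3) is given below, reusing d_base() and the decomposition sketched earlier. Note that z_mean = 128 presumes the LL coefficients stay on the 8-bit intensity scale; in practice it must match the scaling of the chosen wavelet implementation.

```python
import numpy as np

def luminance_adaptation(z_LL, lam, lam_max, shape, z_mean=128.0, alpha=0.649):
    """Equation (2): map each (i, j) of a level-lam subband to its ancestor
    in the LL band by floor division, then apply the power function.
    LL coefficients are assumed nonnegative."""
    i = np.arange(shape[0])[:, None] // 2 ** (lam_max - lam)
    j = np.arange(shape[1])[None, :] // 2 ** (lam_max - lam)
    return (z_LL[i, j] / z_mean) ** alpha

def d_adapted(lam, theta, shape, z_LL, lam_max):
    """Equation (3): luminance-adapted base detection threshold."""
    return d_base(lam, theta) * luminance_adaptation(z_LL, lam, lam_max, shape)
```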

2.2. Visual masking adjustment, $a_O(\lambda,\theta,i,j)$

This adjustment is a measure used to increase the detection threshold by taking visual masking effects into account. For the luminance component, the contrast masking effect and the crossed masking effect are considered in this article. The visual masking adjustment is defined as

$$a_Y(\lambda,\theta,i,j) = \max\left( a_{Y,c}(\lambda,\theta,i,j),\; a_{Y,\text{cross}}(\lambda,\theta,i,j) \right)$$
(4)

where $a_{Y,c}(\lambda,\theta,i,j)$ is the contrast masking adjustment and $a_{Y,\text{cross}}(\lambda,\theta,i,j)$ is the crossed masking adjustment at location (i, j) of the (λ, θ) subband in the Y component. The contrast masking effect means that the visual sensitivity to stimuli is reduced by increasing spatial non-uniformity of the background luminance. That is, human eyes are more sensitive to noise in smooth regions than in textured regions. In subband image coding, the sensitivity to a particular coefficient's quantization error is affected by the magnitudes of other coefficients. In [2], the contrast masking adjustment of a coefficient is a function of the "texture energy" of the neighboring coefficients at the same location in different subbands. In [3], this masking is strongest when both target and masking components are of the same spatial frequency, orientation, and location. To reduce the complexity of measuring detection thresholds in luminance, the contrast masking adjustment presented in [15] is simplified as

$$a_{Y,c}(\lambda,\theta,i,j) = \max\left( 1,\; \left( \frac{|z_Y(\lambda,\theta,i,j)|}{d_{Y,D}(\lambda,\theta,i,j)} \right)^{\beta} \right)$$
(5)

where β is an exponent that lies between 0 and 1. In this article, β = 0 for the LL band and β = 0.4 for the other bands are determined experimentally. The crossed masking effect is given by the interaction between the luminance and chrominance components of color images. It denotes the masking between a luminance masker and a chrominance target, or between a chrominance masker and a luminance target. The research results in [16, 17] suggest that the rather substantial masking of luminance signals by chromatic masks is worth incorporating into the masking model. The study in [18] shows that luminance masks have little effect on color contrast detection, while chromatic masks greatly affect the detectability of luminance contrast. In this article, the crossed masking adjustment is obtained by modifying the elevation factor presented in [19] as

$$a_{Y,\text{cross}}(\lambda,\theta,i,j) = \max\left( \max\left( 1,\; \left( \frac{|z_{Cb}(\lambda,\theta,i,j)|}{d_{Cb,D}(\lambda,\theta,i,j)} \right)^{\eta} \right),\; \max\left( 1,\; \left( \frac{|z_{Cr}(\lambda,\theta,i,j)|}{d_{Cr,D}(\lambda,\theta,i,j)} \right)^{\kappa} \right) \right)$$
(6)

where $d_{Cb,D}(\lambda,\theta,i,j)$ and $d_{Cr,D}(\lambda,\theta,i,j)$ denote the visibility thresholds of the coefficients at location (i, j) of the (λ, θ) subband in the Cb and Cr components, respectively, and η and κ are parameters that are allowed to vary with frequency and perceptual color channel [17]. The larger the values of η and κ, the greater the crossed masking effect. When η and κ are set to 0, no crossed masking occurs and the crossed masking adjustment is constant at 1. Through experiments, η = 1.0 and κ = 1.0 for all bands are determined in this article.
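Under the assumption that all five subband arrays are aligned elementwise, Equations (4) through (6) reduce to a few array operations, as sketched below.

```python
import numpy as np

def masking_adjust_Y(zY, zCb, zCr, dY, dCb, dCr, beta=0.4, eta=1.0, kappa=1.0):
    """Equations (4)-(6) for one (lam, theta) luminance subband: zO are the
    coefficient arrays and dO the luminance-adapted thresholds of Eq. (3).
    Use beta = 0 for the LL band."""
    a_c = np.maximum(1.0, (np.abs(zY) / dY) ** beta)                     # Eq. (5)
    a_cross = np.maximum(np.maximum(1.0, (np.abs(zCb) / dCb) ** eta),
                         np.maximum(1.0, (np.abs(zCr) / dCr) ** kappa))  # Eq. (6)
    return np.maximum(a_c, a_cross)                                      # Eq. (4)
```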

Also, masking effects exist in the chrominance components of the color image and affect the sensitivity to the chrominance components of a target color pixel. They cannot easily be identified since masking effects in chrominance components involve complex human vision mechanisms. This makes the estimation of noise detection thresholds in chrominance difficult. In [20], it is clearly shown that the perceptibility of color difference depends on the local contents of the color image. The research results presented in [21] further describe that masking by intense dynamic white noise certainly elevates the distortion thresholds. The conventional interpretation of this elevation is that the noise adds variance to the decision factor. This masking situation involves the so-called noise masking effect and is another area in need of clarification. The authors of [14] suggest that the visually effective variance should be used to compute the distortion thresholds. We apply the idea proposed in [14, 21] to the chrominance components of color images to simplify the estimation of the chromatic JND, which is related to complex features of the HVS. The visual masking adjustment for chrominance components is therefore defined by computing a measure of variance within the local region of a target coefficient, scaled by the visibility of the coefficient:

$$a_O(\lambda,\theta,i,j) = \left( 1 + \frac{\sigma_O^2(\lambda,\theta,i,j)}{d_{O,\text{base}}^2(\lambda,\theta)} \right)^{\frac{1}{2}} \quad \text{for } O = \text{Cb, Cr}$$
(7)

where $\sigma_O^2(\lambda,\theta,i,j)$ is the local variance measured in the O component. The local region used to calculate the local variance contains the coefficients in the same subband that lie within a window centered at location (i, j).
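Equation (7) can be computed with a running-mean filter, as in the sketch below; the 5 × 5 window size is an assumption on our part, since the article does not state the window's extent.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def masking_adjust_chroma(z, d_base_val, window=5):
    """Equation (7): local-variance-based adjustment for a Cb/Cr subband."""
    mean = uniform_filter(z, size=window)
    var = uniform_filter(z * z, size=window) - mean ** 2
    var = np.maximum(var, 0.0)  # guard against small negatives from rounding
    return np.sqrt(1.0 + var / d_base_val ** 2)
```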

The subjective viewing test used in [22–24] is performed to accomplish the quality assessment and determine the parameters in Equations (5) and (6) by comparing the original color image with its noise-contaminated version, produced by randomly adding to or subtracting from each wavelet coefficient its estimated JND value. The thumbnail of the test color image set is shown in Figure 2. The two color images are displayed side by side on an LCD monitor (ViewSonic VP2365), and the subject observes the image pair in a dark room at a viewing distance of six times the image height [1, 2, 24]. In this case, the test image of 512 × 512 pixels is inspected with a viewing angle close to 10 degrees and at a resolution of 54 pixels per degree. Under this viewing condition, the parameters are adjusted until the distortion between the two images is perceivable. The adjustment that results in a JND is then recorded to determine the corresponding parameter.

Figure 2. Thumbnail of the color image set for testing.

3. Perceptual color image compression scheme

In this article, a prediction error preprocessor built by utilizing the estimated JNDs of one achromatic and two chromatic components is proposed and integrated into a perceptual color image compression scheme using the DPCM technique. Rather than investigating adaptive prediction, the JNDs are mainly used to design a prediction error preprocessor that shapes the prediction error signals to be smoother, so that the same visual quality of the reconstructed color image is achieved at a lower compression bit rate. The scheme then varies the quantization level to constrain the quantization error under the visual tolerance for higher quality of the reconstructed image.

The functional block diagram of the proposed perceptual compression scheme for color images in the wavelet domain is given in Figure 3, where $z_O(\lambda,\theta,i,j)$, $\tilde{z}_O(\lambda,\theta,i,j)$, $e_O(\lambda,\theta,i,j)$, $\tilde{e}_O(\lambda,\theta,i,j)$, and $d_O(\lambda,\theta,i,j)$, respectively, denote the original current wavelet coefficient, the predicted coefficient, the prediction error before preprocessing, the prediction error after preprocessing, and the JND value for the coefficient located at (i, j) of the (λ, θ) subband in the O color component of the color image. The subband JND profiles of the input color image, obtained by using the JND estimator presented in Section 2, are incorporated into the proposed prediction error preprocessor to shape the prediction error and decide the reconstruction level for achieving increased performance in terms of bit rate at a specified visual quality.

Figure 3. Block diagram of the proposed perceptually adaptive coding scheme.

In the proposed compression scheme, the fixed prediction mode that uses the same set of coefficients for every image to be coded is adopted so that the performance given by the proposed JND-based preprocessor can clearly be verified. As shown in Figure 3, the prediction error is the difference between the current signal and its predicted signal:

$$e_O(\lambda,\theta,i,j) = z_O(\lambda,\theta,i,j) - \tilde{z}_O(\lambda,\theta,i,j) \quad \text{for } O = \text{Y, Cb, Cr}$$
(8)

The more the dynamic range of the prediction error signals is reduced, the less objective distortion of the reconstructed color image is obtained for a given bit rate [25]. In order to shape the prediction error for higher performance, the prediction error preprocessor utilizes the JND profiles to process the prediction error signals such that the dynamic range of the processed prediction error signals is reduced to achieve a lower bit rate or better reconstructed image quality. Each prediction error signal is adjusted by the preprocessor as

$$\tilde{e}_O(\lambda,\theta,i,j) = \begin{cases} e_O(\lambda,\theta,i,j) + \gamma_O \cdot d_O(\lambda,\theta,i,j), & \text{if } e_O(\lambda,\theta,i,j) < -\gamma_O \cdot d_O(\lambda,\theta,i,j) \\ e_O(\lambda,\theta,i,j) - \gamma_O \cdot d_O(\lambda,\theta,i,j), & \text{if } e_O(\lambda,\theta,i,j) > \gamma_O \cdot d_O(\lambda,\theta,i,j) \\ e_O(\lambda,\theta,i,j), & \text{otherwise} \end{cases}$$
(9)

where $\gamma_O$ is the parameter used to make a trade-off between the visual quality and the coding bit rate of the reconstructed color image for color component O. The constraint $\gamma_O \in [0, 1]$ avoids introducing perceptual distortion into the prediction errors in the JND-based preprocessor. If the estimated JNDs accurately approximate the actual JNDs for human eyes, the preprocessor with $\gamma_O = 1$ adjusts each prediction error signal of the color image to its critical, just-visible bound to shape the signals for the highest performance. In this article, the $\gamma_O$ value is used conservatively in the experiments.

In the stage of quantizing the preprocessed prediction error signal, the proposed compression scheme uses a locally adaptive perceptual quantization method. The strategy of locally adaptive perceptual quantization is to adapt the quantization step size of each coefficient in each subband to the actual amount of locally varying masking threshold. However, the main problem of this strategy is that both the encoder and the decoder are required to calculate these local masking thresholds, which would require transmitting side information to the decoder to guarantee reconstruction of the coded image. In order to avoid sending a large amount of side information to the decoder, the idea proposed in [7] is extended to eliminate the need to transmit side information for each step size of each coefficient in each subband of the color image by estimating the available masking from the already received data and a prediction of the transform coefficient to be quantized. That is, the already received data and a prediction of the transform coefficient to be quantized are utilized in the measurement of the estimated JND, $\tilde{d}_O(\lambda,\theta,i,j)$, of the coefficient in color images to approximate the actual amount of masking. The properties of human visual perception of natural color images, including flatness and smooth color transitions, allow good estimates of the available amount of masking. This is illustrated by the modified JND estimator shown in Figure 3. In order to synchronize the estimated JND profiles at the coder and the decoder in the proposed compression scheme, a scanning order for the subband coefficients is needed (Figure 4). As shown in Figure 4a, the subbands are scanned from the highest subband to the lowest subband in each color component. In each subband, the coefficients are scanned row-wise in the order of the Cb, Cr, and Y components (Figure 4b). The quantization step sizes for each coefficient in the subbands of the Y, Cb, and Cr components are thus given by

Figure 4. Scanning order for subband coefficients: (a) subband scanning order in each color component, (b) row-wise scanning in the order of Cb, Cr, and Y components for coefficients in each subband. (The symbol "ο" denotes a transform coefficient.)

$$\Delta_O(\lambda,\theta,i,j) = 2\,\phi_O \cdot \tilde{d}_O(\lambda,\theta,i,j)$$
(10)

where $\phi_O$ is the step size multiplier, whose value can be chosen such that the compression distortion is uniformly distributed over the reconstructed image when a tight entropy (bit-rate) budget is required. In this article, $\phi_Y = 1.0$, $\phi_{Cb} = 1.0$, and $\phi_{Cr} = 1.0$ are used to achieve perceptually lossless visual quality of the reconstructed image for the variable uniform mid-riser quantizer in the proposed compression scheme.
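The two operations of Equations (9) and (10) can be sketched as follows: preprocess() leaves errors inside the ±γ·JND dead zone untouched and shifts larger errors toward zero, and quantize() is a per-coefficient uniform mid-riser quantizer driven by the decoder-side JND estimate. This is a minimal sketch, not the article's exact implementation.

```python
import numpy as np

def preprocess(e, d, gamma=0.4):
    """Equation (9): errors within +/- gamma*d are kept; errors outside are
    moved toward zero by gamma*d, shrinking the overall dynamic range."""
    t = gamma * d
    return np.where(e < -t, e + t, np.where(e > t, e - t, e))

def quantize(e_tilde, d_est, phi=0.5):
    """Equation (10): mid-riser quantization with step 2*phi*d_est, where
    d_est is the JND estimated from already decoded data (no side info)."""
    step = 2.0 * phi * d_est
    q = np.floor(e_tilde / step)         # decision boundaries at multiples of step
    return q.astype(int), (q + 0.5) * step  # symbol index, reconstruction value
```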

4. Simulation results

To evaluate the performance of the proposed compression scheme, the scheme has been implemented by incorporating the proposed prediction error preprocessor into the DPCM coder for compressing color images. A variety of color images representing a great diversity of visual information is used in the experiments. The size of each color image is 512 × 512 with a color depth of 24 bits in the RGB color space. Since the emphasis is on the compression performance achieved by the proposed prediction error preprocessor, the stage of entropy coding is not further discussed in this article. In the simulations, the proposed compression scheme therefore uses entropies rather than bit rates to represent its performance while a specified visual quality of the reconstructed color image is obtained. Meanwhile, the subjective viewing test for assessing the visual quality of the compressed color image is conducted in the simulation.
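Since entropies stand in for bit rates here, a first-order entropy of the quantizer indices is the natural measure; a minimal version is sketched below.

```python
import numpy as np

def entropy_bits(symbols):
    """First-order entropy, in bits per symbol, of an array of quantizer indices."""
    _, counts = np.unique(symbols, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())
```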

First, the simulation of the proposed compression scheme with $\gamma_O = 0$ and $\phi_O = 1.0$ is carried out. That is, the proposed scheme is implemented by incorporating the perceptual quantization ($\phi_O = 1.0$) without using the prediction error preprocessor ($\gamma_O = 0$). To evaluate the visual quality of the compressed color images, the same subjective viewing test and viewing condition based on the method presented in [22–24] are used in the simulation (the same test used in Section 2). Twenty subjects between 21 and 26 years of age with normal or corrected-to-normal eyesight take part in the test. As mentioned above, the test is carried out in a dark room, with the subject observing the image on the monitor screen at a viewing distance of six times the image height [1, 2, 24]. The thumbnail of the color image set for testing is shown in Figure 2. In each viewing test, the original image and its compressed image are displayed side by side on the screen for evaluating the perceptual difference between the two images. According to the subjective rating criterion [23, 24] shown in Table 2, subjects are asked to vote on the comparative quality of the two images after viewing them for at least 2-3 s. In order to achieve a fair comparison and evaluation, the presentation order of the image pairs is randomized, and the compressed image is randomly displayed on the right or left side of the screen.

The results of the mean subjective scores of all subjects for the 32 compressed color images are shown in Figure 5. The mean subjective scores are all close to zero. Meanwhile, most of the associated standard deviations are quite small in comparison with the range of the subjective rating criterion from -3 to 3. The overall subjective scores have a mean of 0.024 with an associated standard deviation of 1.079. This indicates that a color image compressed by the proposed compression scheme is hardly distinguishable from its original for most subjects. That is, perceptually lossless visual quality of the compressed image is achieved by the proposed perceptual color image compression scheme. In the above subjective viewing test, the compression outputs for the original color "Sail" and "Leaf" images (Figures 6a and 7a) are shown in Figures 6b and 7b, respectively, where the peak signal-to-noise ratio (PSNR) of the compressed "Sail" image is 35.36 dB at an entropy of 0.739 and the PSNR of the compressed "Leaf" image is 34.73 dB at an entropy of 0.491. If the PSNR of the compressed image can be kept as low as possible while the visual quality of the compressed image remains the same as that of the original under the specified viewing condition, then the visual model has been effectively incorporated into the compression scheme for color images. In addition to the above subjective viewing test, representative objective quality metrics based on image structure features in accordance with HVS perception are used to verify the visual quality of the compressed image. The metrics of structural similarity (SSIM) [26], visual information fidelity (VIF) [27], visual SNR (VSNR) [28], weighted SNR (WSNR) [29], and PSNR HVS masking (PSNR-HVS-M) [30] are computed and shown in Figures 6 and 7, respectively. From the values computed by these objective measures, the visual quality of the compressed color images can hardly be distinguished from that of the corresponding original color images. This indicates that the estimated JNDs indeed give satisfactory estimates of the available amount of masking and that the proposed visual model successfully increases the compression performance.

Table 2 Subjective rating criterion for visual quality of an image pair

Figure 5. Results of mean subjective scores for visual quality of the compressed color images for twenty subjects.

Figure 6. Results of compressing the "Sail" color image with the proposed compression scheme. (a) Original image, (b) reconstructed image (PSNR = 35.36 dB, SSIM = 0.932, VIF = 0.927, VSNR = 42.87 dB, WSNR = 40.25 dB, PSNR-HVS-M = 44.45 dB, entropy = 0.739) with $\gamma_O = 0$ and $\phi_O = 1.0$, and (c) reconstructed image (PSNR = 34.57 dB, SSIM = 0.918, VIF = 0.904, VSNR = 42.23 dB, WSNR = 40.01 dB, PSNR-HVS-M = 43.37 dB, entropy = 0.717) with $\gamma_O = 0.4$ and $\phi_O = 0.5$.

Figure 7. Results of compressing the "Leaf" color image with the proposed compression scheme. (a) Original image, (b) reconstructed image (PSNR = 34.73 dB, SSIM = 0.926, VIF = 0.918, VSNR = 43.64 dB, WSNR = 41.73 dB, PSNR-HVS-M = 43.17 dB, entropy = 0.491) with $\gamma_O = 0$ and $\phi_O = 1.0$, and (c) reconstructed image (PSNR = 34.19 dB, SSIM = 0.903, VIF = 0.900, VSNR = 42.55 dB, WSNR = 40.32 dB, PSNR-HVS-M = 43.12 dB, entropy = 0.465) with $\gamma_O = 0.4$ and $\phi_O = 0.5$.

Second, we assume that half of the reconstruction error of each signal is induced by the quantization error of the perceptual quantizer and the other half is induced by the prediction error from the preprocessed signals. We use a simple condition to restrict both parts of the reconstruction error of the signal within half of its visibility threshold such that the overall reconstruction error stays under the JND value. That is, the simulation with the condition $\gamma_O = 0.4$ and $\phi_O = 0.5$ is implemented. As described in Section 3, the conservative constraint of $\gamma_O = 0.4$ (instead of $\gamma_O = 0.5$) is used in the simulation. Under the same viewing condition, the resulting "Sail" image (Figure 6c), despite its lower PSNR, shows consistent visual quality at a lower entropy. The same result for the "Leaf" image is shown in Figure 7c. To clarify the performance of the proposed prediction error preprocessor for a variety of color images, Table 3 compares the entropies obtained using the proposed compression scheme with $\gamma_O = 0.4$ and $\phi_O = 0.5$ against those obtained using the proposed compression scheme with $\gamma_O = 0$ and $\phi_O = 1.0$ for the same high-quality compression as described above. One can see that the proposed prediction error preprocessor indeed improves the performance of the proposed compression scheme. The compression results of the other test color images are depicted in Table 3, in which the dynamic range of the prediction error signal and the preprocessed prediction error signal is estimated in terms of variance (see the sketch below). The results confirm that the entropies of the reconstructed image are lower at nearly the same perceived quality when the dynamic range of the prediction error signals is effectively reduced by the JND-based prediction error preprocessor.
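As a quick sanity check of the dynamic-range reduction that Table 3 quantifies through variances, one can compare the error signal before and after the preprocessor of Equation (9); e and d below are assumed to be the prediction error and JND arrays of one subband from the earlier sketches.

```python
import numpy as np

e_tilde = preprocess(e, d, gamma=0.4)  # preprocess() from the Section 3 sketch
print('variance before preprocessing:', np.var(e))
print('variance after  preprocessing:', np.var(e_tilde))  # expected to be lower
```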

Table 3 Entropies of the proposed compression scheme with preprocessor, without preprocessor, and Watson's compression method at nearly the same visual quality of the reconstructed color images

The proposed compression scheme is also compared with the compression method proposed by Watson et al. [14] to show the performance of compressing color images. Watson's method uses a perceptual quantization matrix to compress color images, in which the quantization step size of each subband of the color image is determined by the base JND [14] within the subband. In order to make a fair comparison, the same observers took part in the subjective viewing tests. Table 3 lists the entropies of Watson's compression method when its reconstructed color image shows nearly no loss in perceptual quality at a viewing distance equal to six times the image height. The entropy gains, G1 and G2, are evaluated by comparing the proposed scheme under the above two conditions with Watson's method. From the G2 values shown in the table, it is obvious that the proposed compression scheme achieves better performance when the prediction error preprocessor is applied. Furthermore, the G1 values of the proposed compression scheme using only the perceptual quantization also show better results, since the JNDs of the wavelet coefficients are effectively estimated to calculate the quantization step size for each coefficient as shown in Equation (10).

For the image coding standards, the proposed preprocessor can also be used to preprocess the input color image such that the input signal has a smaller dynamic range, resulting in lower bit rates at the same image quality of the reconstructed color image. Figure 8 compares the results of coding the "Lena" color image using conventional JPEG coding with those obtained using JPEG coding incorporating the proposed preprocessor, in terms of the bit rate at the same JPEG quality factor. At the JPEG quality factor of 85, JPEG coding with the proposed preprocessor results in a reconstructed image (Figure 8b) with a bit rate of 0.45 bits per pixel (bpp), while conventional JPEG coding results in a reconstructed image (Figure 8c) with a bit rate of 0.54 bpp. JPEG coding with the proposed preprocessor thus attains a notably lower bit rate than the conventional one, while the objective quality measures of SSIM, VIF, VSNR, WSNR, and PSNR-HVS-M for the JPEG coder with and without the proposed preprocessor are close. As described in Section 2, the proposed preprocessor successfully reduces the dynamic range of the input color signals to achieve better performance in terms of bit rate while maintaining nearly the same image quality. Figure 9 illustrates the bit rate comparison for the representative "Lena" and "Goldhill" color images under test between the conventional JPEG coder (JPEG) and the one incorporating the proposed preprocessor (pre_JPEG). The results confirm that the preprocessor is independent of the JPEG coding standard and can be applied to color images for higher performance at any specified quality factor. For the JPEG2000 coding standard [31], the comparison results of coding the "Lena" color image in the lossless coding mode are shown in Figure 10. Better performance can also be seen when JPEG2000 coding with the proposed preprocessor is adopted.
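One plausible way to realize this coder-independent preprocessing is sketched below: transform each channel, shrink each wavelet coefficient toward a local-mean prediction by up to γ·JND as in Equation (9), invert the transform, and hand the result to a stock JPEG encoder. The local-mean predictor and the jnd_for() hook into the Section 2 estimator are assumptions for illustration, not the article's exact pipeline.

```python
import numpy as np
import pywt
from scipy.ndimage import uniform_filter

def preprocess_channel(channel, jnd_for, gamma=0.4, wavelet='bior4.4', levels=3):
    """Shape one color channel in the wavelet domain, then invert."""
    coeffs = pywt.wavedec2(channel, wavelet, level=levels)
    out = [coeffs[0]]  # leave the LL band untouched in this sketch
    for k, detail in enumerate(coeffs[1:]):
        lam = levels - k
        bands = []
        for theta, z in zip(('HL', 'LH', 'HH'), detail):
            d = jnd_for(lam, theta, z)        # hypothetical JND hook (Section 2)
            pred = uniform_filter(z, size=3)  # assumed local-mean predictor
            e = z - pred
            bands.append(pred + preprocess(e, d, gamma))  # Eq. (9) shrinkage
        out.append(tuple(bands))
    return pywt.waverec2(out, wavelet)

# The preprocessed YCbCr planes would then be converted back to RGB and saved,
# e.g. with Pillow: Image.fromarray(rgb_u8).save('lena_pre.jpg', quality=85)
```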

Figure 8. Results of coding the "Lena" color image at the JPEG quality factor of 85. (a) Original image, (b) reconstructed image (PSNR = 35.29 dB, SSIM = 0.905, VIF = 0.893, VSNR = 41.68 dB, WSNR = 40.33 dB, PSNR-HVS-M = 44.18 dB, bit rate = 0.45 bpp) with the JPEG coding with the proposed preprocessor, and (c) reconstructed image (PSNR = 36.86 dB, SSIM = 0.921, VIF = 0.914, VSNR = 42.34 dB, WSNR = 41.88 dB, PSNR-HVS-M = 45.24 dB, bit rate = 0.54 bpp) with the conventional one.

Figure 9. Comparison of the bit rate of coding the "Lena" and "Goldhill" color images between the conventional JPEG coder (JPEG) and the one incorporating the proposed preprocessor (pre_JPEG).

Figure 10. Results of coding the "Lena" color image in the lossless coding mode of the JPEG2000 standard. (a) Reconstructed image (bit rate = 2.72 bpp) with the JPEG2000 coding with the proposed preprocessor and (b) reconstructed image (bit rate = 4.53 bpp) with the conventional one.

5. Conclusions

A prediction error preprocessor is presented with the goal of reducing the dynamic range of the prediction error signals of the color image to be compressed. A lower bit rate of the reconstructed image can be obtained by using the preprocessor while still reaching high visual quality. For this purpose, a color JND estimator that takes into account various masking effects of human visual perception is proposed and incorporated into the preprocessor for the design of the perceptual color image compression scheme using the DPCM technique. The proposed compression scheme with the preprocessor generates images of consistent quality at a lower entropy compared with the scheme without the prediction error preprocessor and with an existing compression method. At the same quality factor of the coding standard, the preprocessor is also applied to the input color image of the JPEG and JPEG2000 coders to provide a compatible bitstream and to achieve higher performance in terms of bit rate.

References

1. Jayant N, Johnston J, Safranek R: Signal compression based on models of human perception. Proc IEEE 1993, 81(10):1385-1422.
2. Safranek RJ, Johnston JD: A perceptually tuned subband image coder with image dependent quantization and post-quantization data compression. In Proc IEEE Int Conf Acoust, Speech, Signal Processing. Volume 3. Glasgow, Scotland; 1989:1945-1948.
3. Watson AB: DCT quantization matrices visually optimized for individual images. In Proc SPIE Int Conf Human Vision, Visual Processing, and Digital Display IV. Volume 1913. San Jose, CA, USA; 1993:202-216.
4. Safranek RJ: A JPEG compliant encoder utilizing perceptually based quantization. In Proc SPIE Int Conf Human Vision, Visual Processing, and Digital Display V. Volume 2179. Bellingham; 1994:117-126.
5. Chou CH, Li YC: A perceptually tuned subband image coder based on the measure of just-noticeable-distortion profile. IEEE Trans Circ Syst Video Technol 1995, 5(6):467-476.
6. Bradley AP: A wavelet visible difference predictor. IEEE Trans Image Process 1999, 8(5):717-730.
7. Höntsch I, Karam LJ: Locally adaptive perceptual image coding. IEEE Trans Image Process 2000, 9(9):1472-1483.
8. Tang CW: Spatiotemporal visual considerations for video coding. IEEE Trans Multimedia 2007, 9(2):231-238.
9. Al-Fayadh A, Hussain AJ, Lisboa P, Al-Jumeily D: An adaptive hybrid classified vector quantization and its application to image compression. In Proc European Symp Computer Modeling and Simulation. Cambridge, UK; 2008:253-256.
10. Yang XK, Lin WS, Lu Z, Ong EP, Yao S: Just-noticeable-distortion profile with nonlinear additivity model for perceptual masking in color images. In Proc IEEE Int Conf Acoustics, Speech, Signal Processing. Volume 3. Hong Kong; 2003:609-612.
11. Nadenau MJ, Reichel J, Kunt M: Wavelet-based color image compression: exploiting the contrast sensitivity function. IEEE Trans Image Process 2003, 12(1):58-70.
12. Liu KC, Chou CH: Locally adaptive perceptual compression for color images. IEICE Trans Fund Electron Commun Comput Sci 2008, E91-A(8):2213-2222.
13. Chou CH, Liu KC: Colour image compression based on the measure of just noticeable colour difference. IET Image Process 2008, 2(6):304-322.
14. Watson AB, Yang G, Solomon JA, Villasenor J: Visibility of wavelet quantization noise. IEEE Trans Image Process 1997, 6(8):1164-1175.
15. Teo PC, Heeger DJ: Perceptual image distortion. In Proc IEEE Int Conf Image Processing. Austin, Texas, USA; 1994:982-986.
16. de Valois KK, Switkes E: Simultaneous masking interactions between chromatic and luminance gratings. J Opt Soc Am 1983, 73(1):11-18.
17. Watson AB: Perceptual optimization of DCT color quantization matrices. In Proc IEEE Int Conf Image Processing. Austin, Texas, USA; 1994:100-104.
18. Le Callet P, Saadane A, Barba D: Interactions of chromatic components on the perceptual quantization of the achromatic component. In Proc SPIE Human Vision and Electronic Imaging. Volume 3644. San Jose, CA, USA; 1999:121-128.
19. Meng Y, Guo L: Color image coding by utilizing the crossed masking. In Proc IEEE Int Conf Acoust, Speech, Signal Processing. Philadelphia, Pennsylvania, USA; 2005:389-392.
20. Imai FH, Tsumura N, Miyake Y: Perceptual color difference metric for complex images based on Mahalanobis distance. J Electron Imag 2001, 10(2):385-393.
21. Watson AB, Solomon JA: Model of visual contrast gain control and pattern masking. J Opt Soc Am A 1997, 14(9):2379-2391.
22. Longère P, Zhang X, Delahunt PB, Brainard DH: Perceptual assessment of demosaicing algorithm performance. Proc IEEE 2002, 90(1):123-132.
23. Yang XK, Lin WS, Lu ZK, Ong EP, Yao SS: Just noticeable distortion model and its applications in video coding. Signal Process: Image Commun 2005, 20(7):662-680.
24. Berger T: Rate Distortion Theory. Prentice-Hall, Englewood Cliffs, NJ; 1971.
25. Jia YT, Lin WS, Kassim AA: Estimating just-noticeable distortion for video. IEEE Trans Circ Syst Video Technol 2006, 16(7):820-829.
26. Wang Z, Bovik AC, Sheikh HR, Simoncelli EP: Image quality assessment: from error visibility to structural similarity. IEEE Trans Image Process 2004, 13(4):600-612.
27. Sheikh HR, Bovik AC: Image information and visual quality. IEEE Trans Image Process 2006, 15(2):430-444.
28. Chandler DM, Hemami SS: VSNR: a wavelet-based visual signal-to-noise ratio for natural images. IEEE Trans Image Process 2007, 16(9):2284-2298.
29. Damera-Venkata N, Kite TD, Geisler WS, Evans BL, Bovik AC: Image quality assessment based on a degradation model. IEEE Trans Image Process 2000, 9(4):636-650.
30. Ponomarenko N, Silvestri F, Egiazarian K, Carli M, Astola J, Lukin V: On between-coefficient contrast masking of DCT basis functions. In Proc 3rd Int Workshop on Video Processing and Quality Metrics for Consumer Electronics. Scottsdale, Arizona, USA; 2007.
31. JPEG2000 reference software (JasPer version 1.900.1) [http://www.ece.uvic.ca/~frodo/jasper/]


Acknowledgements

The study was supported by the National Science Council, Taiwan, under contract NSC99-2221-E-278-001 and the Image & Video Processing Laboratory, National Dong Hwa University, Taiwan.

Author information

Correspondence to Kuo-Cheng Liu.

Competing interests

The author declares that they have no competing interests.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Cite this article

Liu, KC. Prediction error preprocessing for perceptual color image compression. J Image Video Proc 2012, 3 (2012). https://doi.org/10.1186/1687-5281-2012-3
