- Research Article
- Open Access

# Video Enhancement from Multiple Compressed Copies in Transform Domain

- Viet Anh Nguyen^{1} (Email author)
- Zhenzhong Chen^{2}
- Yap-Peng Tan^{2}

**2010**:404137

https://doi.org/10.1155/2010/404137

© Viet Anh Nguyen et al. 2010

**Received:** 1 May 2010 **Accepted:** 24 September 2010 **Published:** 27 September 2010

## Abstract

Increasingly, we can obtain more than one compressed copy of the same video content with different levels of visual quality over the Internet. As the original source video is not always available, how to choose or derive a video of the best quality from these copies becomes a challenging and interesting problem. In this paper, we address this new research problem by blindly enhancing the quality of the video reconstructed from such multiple compressed copies. The aim is to reconstruct a video that achieves a better quality than any of the available copies. Specifically, we propose to reconstruct each coefficient of the video in the transform domain by using a narrow quantization constraint set derived from the multiple compressed copies together, using a Laplacian or Cauchy distribution model for each AC transform coefficient to minimize the distortion. Analytical and experimental results show the effectiveness of the proposed method.

## Keywords

- Discrete Cosine Transform
- Video Frame
- Video Quality
- Video Content
- Reconstructed Video

## 1. Introduction

Over the past few decades, transform-based coding has been widely used in lossy image and video compression to exploit the spatial correlation of visual signals. Achieving good energy compaction over a wide class of visual signals, the block-based discrete cosine transform (DCT) is commonly adopted in most popular video compression standards, including H.261/3/4 and MPEG-1/2/4 [1, 2]. While block-based coding can attain good quality at high bit rates, it often suffers from undesirable coding artifacts (such as blocking artifacts, ringing noise, and corner outliers) at moderate-to-low bit rates. These coding artifacts are mainly due to the error introduced by the quantization/dequantization process, which may result in severe loss in visual quality and fidelity of the reconstructed video.

To alleviate this problem, postprocessing is one of the most promising solutions, as it can improve the video quality without changing the encoder structure. Many postprocessing techniques have been proposed to reduce the quantization artifacts of block-based coding. These include block-boundary postfiltering techniques, such as adaptive filtering and wavelet-based filtering, that smooth the discontinuities in either the spatial [3–7] or the transform domain [8–13]. Also proposed are more sophisticated methods that enhance the reconstructed video by using image/video restoration techniques such as iterative methods based on the theory of projection onto convex sets (POCS) or constrained minimization [14–18], maximum a posteriori probability (MAP) estimation [19–21], and regularized image/video restoration [22–27]. These methods consider the compressed images/videos to be distorted by a codec system and apply restoration techniques to reduce the quantization noise and coding artifacts.

With the development of network and communication techniques as well as the popularity of video-centric websites such as YouTube, Facebook, and Google Video, delivery of visual signals over the network has become more and more popular. Given the phenomenal rate at which image and video contents are being generated and distributed, we can now easily obtain many copies of the same video content with different levels of visual quality. For example, different people may record the same interesting soccer match or a piece of news from a television channel and encode it in different formats or using different coding parameters to meet their constraints (e.g., transmission bandwidth, storage capacity, etc.) before sharing it over the network. Similarly, one can gain access to many copies of movie trailers or video clips extracted from DVDs, which have exactly the same content but different visual quality.

Employing the existing postprocessing techniques, one can possibly enhance the quality of each of these compressed copies independently of the other copies. However, as the original source video or information on the video quality is not always available, how to obtain the best video from these multiple compressed copies becomes an interesting problem. The problem shares some similarity with the well-known superresolution (SR) restoration problem, which has been addressed intensively in the literature. For example, Gunturk et al. [28, 29] and Segall et al. [30] proposed to reconstruct high-resolution images by using multiple neighboring low-resolution frames of compressed videos. It should be noted that the restoration or enhancement of high-resolution images in SR requires a set of low-resolution observations, which usually contain different but related views of the scene (e.g., images taken from different cameras, view angles, illumination conditions, or even a sequence of frames from a video). What we consider here is, however, to enhance the video quality from multiple compressed copies of the same content (i.e., no spatial variations) with different levels of quantization noise.

In this paper, we address this new research problem by blindly enhancing the quality of the video reconstructed from multiple compressed copies of the same visual content, where existing postprocessing techniques may no longer be suitable or effective as they usually consider only a single compressed video. Our aim is to reconstruct a video that achieves better quality than any of the available copies. The proposed method is considered to be a "blind" approach as the original source video is not available, and this makes the problem particularly challenging as we do not definitively know which of the multiple copies, which frame of a copy, and which region of a frame have the best quality.

In our previous work, we proposed a scheme based on the theory of POCS to improve the video quality from multiple video copies [31]. However, because it iteratively projects the reconstructed video onto the quantization constraint sets in the transform domain and the smoothness constraint sets in the spatial domain, the method incurs high computational complexity. Here, we consider a different approach and propose a fast method to reconstruct the enhanced video in the transform domain. By exploiting the information from the quantization constraint sets and transform coefficient statistics only in the transform domain, the proposed method can provide an enhanced video with better quality than any of the available copies while incurring much lower computational complexity compared with the previous method.

Specifically, we propose to reconstruct each coefficient of the video in the transform domain by using a narrow quantization constraint set derived from the multiple compressed copies and in which the exact value of the coefficient should lie. In addition, a Laplacian or Cauchy distribution model is utilized to further reduce the distortion of each AC transform coefficient. Analytical and experimental results show that the video reconstructed by the proposed method generally yields a distortion smaller than that of any of the compressed copies available. In many scenarios, the proposed method can attain a notable gain in terms of average peak-signal-to-noise ratio (PSNR) compared to the best video from the multiple compressed copies.

The remainder of the paper is organized as follows. Section 2 briefly reviews some key features of transform-based coding in the latest H.264/AVC standard exploited in this paper. Section 3 formulates the problem of blindly enhancing the video content from multiple compressed copies and describes the proposed method with the assumption that the temporal resolutions of the available copies are well aligned. An effective method for temporal registration of multiple compressed copies is presented in Section 4. Mathematical analysis and experimental results to justify the performance of the proposed method are presented in Sections 5 and 6, respectively. In Section 7, we conclude the paper by summarizing our main contributions.

## 2. Brief Overview of H.264/AVC Transform

Many video coding standards, such as H.261/3/4 and MPEG-1/2/4, have been developed and standardized to address the need of efficient storage and delivery of video content. Although the recent video coding standard H.264/AVC has incorporated a number of advanced coding features to achieve its high coding efficiency, it still employs the so-called block-based hybrid video coding approach. The basic algorithm of a hybrid video coding approach makes use of motion-compensated prediction to exploit the temporal statistical dependency, and transform coding to exploit the spatial statistical dependency. In this section, we will review some key features of transform coding through the state-of-the-art video coding standard H.264/AVC, which will be exploited in this paper to demonstrate the effectiveness of the proposed method.

In essence, the existing video coding standards support intracoding and intercoding. While intercoding employs temporal prediction (motion compensation) from previously encoded pictures, intracoding only uses the information contained in the picture itself. The (either intra- or inter-) prediction residue, which is the difference between the original and the predicted picture, is then transformed, quantized, and coded. Instead of using the discrete cosine transform (DCT) as in the previous standards (e.g., H.263 and MPEG-1/2/4), an integer transform, which basically has the same properties as a DCT, is applied in H.264/AVC to avoid the mismatch between encoders and decoders.

Here, *⊗* represents point-to-point multiplication (i.e., each element of one matrix is multiplied by the element in the same position of the other matrix); the forward transformation matrix and the forward postscaling factor matrix are defined in [32].

Note that if the macroblock is coded in intra prediction mode, the DC coefficients of the luma residue blocks will be transformed again using a Hadamard transform to decorrelate the DC coefficients before the quantization process (see [32] for details).
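As a rough illustration of the forward transform described above, the following Python sketch applies the 4×4 core matrix and the postscaling factors that are commonly quoted in descriptions of H.264/AVC. The names `CF`, `EF`, and `forward_transform` are illustrative assumptions rather than identifiers from the paper or from any codec, and the Hadamard step for intra DC coefficients is not included.

```python
import math

# Core forward matrix of the H.264/AVC 4x4 integer transform
# (values as commonly quoted for the standard; the name CF is an assumption).
CF = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]

# Postscaling factors, with a = 1/2 and b = sqrt(2/5).
A = 0.5
B = math.sqrt(2.0 / 5.0)
EF = [
    [A * A,     A * B / 2, A * A,     A * B / 2],
    [A * B / 2, B * B / 4, A * B / 2, B * B / 4],
    [A * A,     A * B / 2, A * A,     A * B / 2],
    [A * B / 2, B * B / 4, A * B / 2, B * B / 4],
]


def matmul(p, q):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    """Return the transpose of a matrix given as nested lists."""
    return [list(row) for row in zip(*m)]


def forward_transform(block):
    """Core transform CF * X * CF^T followed by element-wise postscaling."""
    core = matmul(matmul(CF, block), transpose(CF))
    return [[core[i][j] * EF[i][j] for j in range(4)] for i in range(4)]
```

For a flat residue block only the DC coefficient is nonzero. In practical codecs the postscaling is usually folded into the quantizer; it is kept separate here only for readability.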

Quantization step size conversion function.

QP mod 6 | QP2QSTEP |
---|---|
0 | 0.6250 |
1 | 0.6875 |
2 | 0.8125 |
3 | 0.8750 |
4 | 1.0000 |
5 | 1.1250 |
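The table lists only six base values because, in H.264/AVC, the quantization step size doubles for every increase of 6 in the quantization parameter (QP). A minimal sketch of the conversion, assuming the 8-bit QP range of 0 to 51 (the function name `qstep` is illustrative):

```python
# Base step sizes for QP mod 6, taken from the table above.
QP2QSTEP = [0.6250, 0.6875, 0.8125, 0.8750, 1.0000, 1.1250]


def qstep(qp):
    """Quantization step size for a given QP (doubles every 6 QP values)."""
    if not 0 <= qp <= 51:
        raise ValueError("QP is assumed to lie in the range 0..51")
    return QP2QSTEP[qp % 6] * (2 ** (qp // 6))
```

For example, `qstep(4)` is 1.0 and `qstep(10)` is 2.0, since QP 10 is one full period of 6 above QP 4.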

## 3. Problem Formulation and Proposed Method

where is the total number of pixels in the video frame, and denotes the value of the th integer transform coefficient of the original frame.

Equation (11) shows that by having multiple compressed copies, the size of the QCS for each integer transform coefficient can likely be reduced. The reduction in QCS size allows us to estimate a more accurate integer transform coefficient and thus reconstruct a video frame with a lower distortion. Ideally, we can obtain the exact integer transform coefficient if the QCS becomes a scalar, a scenario that rarely occurs. Hence, we propose in this paper to reconstruct each integer transform coefficient by using the corresponding narrow QCS and justify that by doing so the constraints in (9) can be satisfied.
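The narrowing effect can be sketched as a plain interval intersection. The snippet below is an illustration under the assumption that each copy contributes one closed interval `(lower, upper)` that contains the original coefficient; how these intervals are derived from the quantized values and predictions is not shown, and the function name `narrow_qcs` is hypothetical.

```python
def narrow_qcs(intervals):
    """Intersect per-copy quantization constraint intervals.

    intervals: iterable of (lower, upper) pairs, one per compressed copy,
    each assumed to contain the original coefficient value.
    """
    intervals = list(intervals)
    lower = max(lo for lo, _ in intervals)
    upper = min(hi for _, hi in intervals)
    if lower > upper:
        # An empty intersection suggests the copies are not aligned.
        raise ValueError("QCS intervals do not overlap")
    return lower, upper
```

With intervals `(8, 16)`, `(10, 15)`, and `(12, 18)` the result is `(12, 15)`, which is narrower than any single interval. A very coarse interval such as `(0, 32)` combined with `(10, 15)` leaves `(10, 15)` unchanged, so a much coarser copy adds no information.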

Since each integer transform coefficient is quantized independently by a quantization step size, minimizing the MSE subject to the constraints specified in (9) is equivalent to minimizing the distortion caused by each integer transform coefficient. To better fit the nonuniform distribution of the integer transform coefficient over the QCS, the H.264/AVC decoder uses the rounding control parameter in (3) to control the position of the reconstructed value inside the QCS interval. Because the distribution is not symmetric within the interval, the reconstructed value is not located at the center of the corresponding QCS, as it is in previous coding standards such as H.263 and MPEG-1/2/4. A fixed value of this parameter, smaller than half of the QCS size, is used to reduce the quantization error. However, to minimize the quantization error, the reconstructed value should be decided adaptively based on the probability distribution of the integer transform coefficient over the corresponding QCS.

Previous studies have shown that, in the DCT domain of a natural image, the DC coefficients can be approximated by a uniform distribution, while the AC coefficient distribution can be modeled by a generalized Gaussian [35, 36] or Laplacian [37, 38] probability density function. Although the generalized Gaussian model gives the most accurate representation of the AC coefficient distribution, the Laplacian model is commonly employed because it is more tractable both mathematically and computationally. Recently, Kamaci et al. [39] proposed to use the Cauchy model, which is shown to be a better choice than the Laplacian model for estimating the actual probability distribution of AC coefficients in H.264/AVC. In this paper, both Laplacian and Cauchy models will be examined for the estimation of the reconstructed video. In what follows, we present a method to estimate the parameters of both distribution models by using the decoded values from the compressed videos.
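To make the role of the distribution model concrete, the sketch below computes the centroid (conditional mean) of a coefficient over an interval under the three models mentioned above. It assumes zero-mean densities in their standard forms, (λ/2)·exp(−λ|x|) for the Laplacian and μ/(π(μ² + x²)) for the Cauchy; the function names and the parameter names `lam` and `mu` are illustrative, and the closed forms are textbook integrals rather than the paper's own equations.

```python
import math


def centroid_uniform(lo, hi):
    """Centroid of a uniform density over [lo, hi] (the midpoint)."""
    return 0.5 * (lo + hi)


def centroid_laplacian(lo, hi, lam):
    """Centroid of a zero-mean Laplacian density restricted to [lo, hi]."""
    if lo == hi:
        return lo

    def signed_mass(x):
        # Probability mass between 0 and x, negative for x < 0.
        m = 0.5 * (1.0 - math.exp(-lam * abs(x)))
        return m if x >= 0 else -m

    def first_moment(x):
        # Integral of t * p(t) from 0 to x; an even function of x.
        u = abs(x)
        return 0.5 * (1.0 / lam - (u + 1.0 / lam) * math.exp(-lam * u))

    mass = signed_mass(hi) - signed_mass(lo)
    if mass <= 0.0:
        # Numerical underflow far in the tail: fall back to the midpoint.
        return 0.5 * (lo + hi)
    return (first_moment(hi) - first_moment(lo)) / mass


def centroid_cauchy(lo, hi, mu):
    """Centroid of a zero-mean Cauchy density restricted to [lo, hi]."""
    if lo == hi:
        return lo
    mass = math.atan(hi / mu) - math.atan(lo / mu)
    moment = 0.5 * mu * math.log((mu * mu + hi * hi) / (mu * mu + lo * lo))
    return moment / mass
```

For an interval on the positive side, such as `[2, 6]`, both peaked models pull the centroid below the midpoint 4, toward zero, which is the behavior the fixed rounding offset only approximates.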

### Laplacian Model Parameter Estimation

where is the original th coefficient value for a given AC frequency and is the number of coefficients.

whose solution can be found by using an iterative root-finding algorithm. In our implementation, we used the Newton-Raphson root-finding method [41].
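The estimating equation of the paper is not reproduced in this text, so the following sketch uses a stand-in: a maximum-likelihood estimate of the Laplacian parameter when each coefficient magnitude is only known to lie in an interval `[lo, hi]`, solved by Newton-Raphson iteration. The likelihood model, the starting point, and the name `laplacian_ml_from_intervals` are all assumptions made for illustration; only the use of Newton-Raphson comes from the text.

```python
import math


def laplacian_ml_from_intervals(intervals, tol=1e-10, max_iter=100):
    """Newton-Raphson estimate of the Laplacian parameter lambda.

    intervals: list of (lo, hi) bounds on coefficient magnitudes, 0 <= lo < hi.
    Solves d/d(lambda) of sum(log(exp(-lambda*lo) - exp(-lambda*hi))) = 0.
    """
    if not intervals or any(lo < 0 or hi <= lo for lo, hi in intervals):
        raise ValueError("each interval must satisfy 0 <= lo < hi")
    if all(lo == 0 for lo, _ in intervals):
        raise ValueError("no finite estimate: every magnitude may be zero")

    # Start from the midpoint-based estimate, which lies at or below the
    # root, so the Newton iterates increase monotonically toward it.
    lam = len(intervals) / sum(0.5 * (lo + hi) for lo, hi in intervals)
    for _ in range(max_iter):
        g = 0.0   # derivative of the log-likelihood (the score)
        dg = 0.0  # derivative of the score
        for lo, hi in intervals:
            w = hi - lo
            em1 = math.expm1(lam * w)
            g += -lo + w / em1
            dg -= w * w * (em1 + 1.0) / (em1 * em1)
        step = g / dg
        lam -= step
        if abs(step) < tol:
            break
    return lam
```

When the intervals are very narrow, the estimate approaches the familiar closed form N / Σ|x|. For equal-width bins the root can also be written in closed form, which gives a convenient check of the iteration.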

### Cauchy Model Parameter Estimation

whose solution can also be found by using an iterative root-finding algorithm. As in the case of the Laplacian model, the Newton-Raphson root-finding method [41] was used in our implementation.

In short, our proposed method for enhancing the video reconstructed from multiple compressed copies can be summarized as follows.

Step 1. Estimate the parameters of the Laplacian and Cauchy distribution for each AC coefficient using (20) and (24), respectively.

Step 2. Obtain the narrow QCS for each integer transform coefficient from the multiple copies using (11).

Step 3. Reconstruct each integer transform coefficient as the centroid of the narrow QCS obtained in Step 2 using (14).

Note that although H.264/AVC was used in the implementation for evaluating the performance of the proposed method, the method can be readily extended to other video coding standards such as H.263 or MPEG-1/2/4. Because the quantization methods differ among coding standards such as H.263 and H.264/AVC, only a minor modification of (10) is required to obtain the quantization constraint sets for a different video coding standard. Nevertheless, the proposed method may not be readily applicable in the multicodec case involving both H.264/AVC and earlier coding standards such as H.263. This is due to the different types of transformation used in different coding standards (i.e., the integer transform in H.264/AVC and the DCT in H.263 and MPEG-1/2/4). However, in the multicodec case involving only the earlier coding standards (e.g., multiple compressed videos encoded by H.263 and MPEG-1/2/4), the proposed method remains applicable.

### Complexity Analysis

It is easy to see that the most computationally intensive part of the proposed method is to construct the narrow QCS and to estimate the model parameter of the distribution for each AC integer transform frequency. Other than the quantization parameters and quantized values available in the compressed bitstream, the prediction values in the integer transform domain are also needed to compute the narrow QCS, which requires fully decoding every available compressed copy. As only simple and straightforward calculations are required to compute the narrow QCS using (10) and (11) and the reconstructed integer transform coefficient as the centroid of the narrow QCS using (14), this amount of computation is rather insignificant. By applying a root-finding algorithm such as the Newton-Raphson method, the model parameter estimation for the distribution of each AC integer transform frequency does not require much computation either in comparison with the full decoding process. Thus, the complexity of the proposed method is approximately equal to the complexity required to decode all available compressed input copies.

In addition, in comparison with our previous method [31] or any relevant SR and postprocessing methods for quantization error reduction, the proposed method generally incurs much lower computational complexity. Note that these methods widely employ constraint-based techniques built on the popular theory of projection onto convex sets (POCS). One of the necessary constraint sets is the smoothness constraint set (SCS) computed in the spatial domain, which also requires fully decoding all the compressed input copies, not to mention the computational load required for the computation of the smoothness criteria. Furthermore, the iterative projection process among various constraint sets requires a number of conversions among the SCS and other constraint sets (e.g., between the spatial domain for the SCS and the transform domain for the QCS), which results in a heavy computational load. In order to converge to the optimal solution, a number of iterations is generally required, which makes the computational complexity of these methods significantly higher than that of the proposed method.

## 4. Video Alignment

In Section 3, we proposed an effective method to enhance the reconstructed video from multiple compressed copies of the same video content under the assumption that the frames of the available copies are well aligned. However, this assumption may not always hold in practice. For example, the same broadcast video can be encoded by different people starting at slightly different time instances. The same video may also be edited, encoded at a different frame rate (e.g., 3-2 pull down), or subjected to frame dropping during the video compression process.

where is the distance function representing the difference or dissimilarity between frame and , is the weighting function which could place different emphasis on different aligned frame pairs, and . Frame is considered similar to frame if their frame distance measure is sufficiently small. In addition, it should be noted that the minimization is subject to a causal constraint on and that is and with .

It can be seen that the accuracy of the alignment will partly depend on how efficiently the frame distance measure is able to differentiate dissimilar frames. Many sophisticated frame distance measures have been proposed in the literature for image/video matching, such as color histograms, image signatures, and so forth. Since compressed copies of the same video content exhibit no spatial variations such as different view angles or illumination conditions, unlike in existing image/video matching problems, we use here a simple but effective frame distance measure based on the side information extracted from the compressed videos.

where for or . Hence, the matching frame pairs between video sequences and can be found by determining the optimal path (i.e., the path with minimum final matching cost ). It should be noted that only frame pairs obtained from the optimal path whose distance measures are smaller than some predefined threshold will be utilized to enhance the reconstructed video by using the proposed method.

It is easy to see that the most computationally intensive part of the proposed alignment method is to compute the minimum matching cost function using (27). The computation of each needs only three algebraic operations (each algebraic operation consists of one addition and one multiplication) and two numerical comparisons. Therefore, the optimal path, and hence matching frame pairs between video sequences and , can be obtained with algebraic operations and numerical comparisons. Thus, the complexity of the proposed alignment method is .
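The recursion described above has the shape of a weighted dynamic-programming alignment: each cell is obtained from three predecessors with one weighted addition each, followed by comparisons. The sketch below illustrates that structure under several assumptions: one scalar feature per frame stands in for the side-information-based distance measure, the three weights are uniform by default, and the names `align_frames`, `feat_a`, and `feat_b` are hypothetical.

```python
def align_frames(feat_a, feat_b, threshold, weights=(1.0, 1.0, 1.0)):
    """Align two frame sequences by dynamic programming.

    feat_a, feat_b: one scalar feature per frame (a stand-in for the
    paper's frame distance measure). Returns the final matching cost and
    the frame pairs on the optimal path whose distance is below threshold.
    """
    m, n = len(feat_a), len(feat_b)
    w_up, w_left, w_diag = weights

    def dist(i, j):
        return abs(feat_a[i] - feat_b[j])

    cost = [[float("inf")] * n for _ in range(m)]
    back = [[None] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            d = dist(i, j)
            if i == 0 and j == 0:
                cost[i][j] = d
                continue
            # Up to three predecessors, each one weighted addition.
            candidates = []
            if i > 0:
                candidates.append((cost[i - 1][j] + w_up * d, (i - 1, j)))
            if j > 0:
                candidates.append((cost[i][j - 1] + w_left * d, (i, j - 1)))
            if i > 0 and j > 0:
                candidates.append((cost[i - 1][j - 1] + w_diag * d, (i - 1, j - 1)))
            cost[i][j], back[i][j] = min(candidates)

    # Trace the optimal path back from the last cell to the first.
    path = []
    node = (m - 1, n - 1)
    while node is not None:
        path.append(node)
        node = back[node[0]][node[1]]
    path.reverse()

    pairs = [(i, j) for i, j in path if dist(i, j) < threshold]
    return cost[m - 1][n - 1], pairs
```

With `feat_a = [0, 1, 2, 3, 4]` and `feat_b = [1, 2, 3]`, which mimics a copy that starts one frame late and ends one frame early, the path keeps the pairs `(1, 0)`, `(2, 1)`, and `(3, 2)` for a threshold of 0.5. The two nested loops visit every cell once, so the cost grows with the product of the two sequence lengths.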

To evaluate the performance of the proposed alignment method, we conducted experiments on a large number of test sequences. To create misalignment among the compressed video inputs, the original video sequence was encoded starting from different time instances. Furthermore, we purposely dropped some video frames randomly from the original test sequence before encoding to obtain a compressed copy. The experimental results show that the proposed method can obtain the matching frame pairs among these misaligned compressed copies with 100% accuracy.

## 5. Analytical Justification

We justify in this section that reconstructing integer transform coefficients using the narrow QCS can generally yield a lower distortion than that of using only the QCS of any single copy.

Let be a random variable representing an integer transform coefficient, which can be either uniform (for a DC coefficient) or Laplacian/Cauchy (for an AC coefficient). denotes the reconstructed value of as the centroid of the QCS . The estimated mean-squared error can be obtained by (13).

Lemma 1.

Consider a quantization constraint set , and its subset where (i.e., and ). Then, for any such subset , we have

Proof.

*β*. Then, is also a function of *β* and can be easily obtained as . We first prove that is an increasing function of *β* by showing . Rearranging as and taking the derivative of the above function with respect to *β*, we have

*β*, we have

Since is a symmetric function, we only need to consider . Let . It is easy to see that . Hence, is a decreasing function with the increase of *β*. It follows that and , hence . Thus, increases with *β*, and . This leads to increasing with *β* too, and . Similarly, we can prove that is a decreasing function of *α*. Hence, the assertion in Lemma 1 holds.

Lemma 1 implies that reconstructing quantized coefficients as the centroid over a narrow QCS can yield a lower distortion on average. Since the proposed method reconstructs the value of each integer transform coefficient by using a narrow QCS that is a subset of the QCSs obtained from the multiple copies, we can reconstruct a video which has a lower distortion, on average, than the video decoded from any given compressed copy.
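For the uniform (DC) case, the claim of Lemma 1 can be checked directly. The worked equation below is an illustrative special case rather than the proof given in the paper; the symbols $a$ and $b$ for the endpoints of the larger QCS and $\hat{x}$ for the centroid are assumed notation, with $a \le \alpha < \beta \le b$.

```latex
% X uniform on [a, b], reconstructed at the centroid \hat{x} = (a + b)/2:
\mathrm{E}\big[(X - \hat{x})^2\big]
  = \frac{1}{b - a}\int_a^b \Big(x - \frac{a + b}{2}\Big)^2 \, dx
  = \frac{(b - a)^2}{12}.
% Restricted to the subset [\alpha, \beta], X is still uniform, so
\frac{(\beta - \alpha)^2}{12} \le \frac{(b - a)^2}{12}
  \quad \text{because } \beta - \alpha \le b - a.
```

Halving the width of the QCS therefore cuts the expected squared error of a DC coefficient to a quarter in this special case.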

## 6. Experimental Results

Standard test video sequences.

Sequence | No. of frames |
---|---|
Flower | 300 |
Silent | 300 |
Foreman | 300 |
Mother & Daughter | 300 |
Coastguard | 300 |
News | 300 |
Stefan | 300 |
Tempete | 250 |
Tennis | 150 |
Mobile | 300 |

The experiments were conducted by using the state-of-the-art transform-coding-based video compression standard, namely, the H.264/AVC encoder. The multiple copies of the input video were obtained by encoding the same video content using the coding standard with different target bit rates and coding parameters such as the structure of the group of pictures (GOP). In what follows, we will discuss various scenarios in which the multiple video copies were compressed in different ways, resulting in various possible performance gains.
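The results in this section are reported as average PSNR. The sketch below shows the usual definition for 8-bit video, under the assumptions that PSNR is computed per frame against the original source and then averaged over frames; the paper does not spell out these details here, and the function names are illustrative.

```python
import math


def psnr(reference, reconstructed, peak=255.0):
    """PSNR in dB between two equally sized sequences of pixel values."""
    if len(reference) != len(reconstructed) or len(reference) == 0:
        raise ValueError("frames must be non-empty and of equal size")
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * math.log10(peak * peak / mse)


def average_psnr(reference_frames, reconstructed_frames):
    """Mean of the per-frame PSNR values over a sequence."""
    values = [psnr(r, x) for r, x in zip(reference_frames, reconstructed_frames)]
    return sum(values) / len(values)
```

Under this definition, the "Gain" columns in the tables of this section correspond to the difference between the average PSNR of the proposed reconstruction and that of the best input copy.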

### 6.1. Laplacian and Cauchy Probability Distribution Model

### 6.2. Multiple Copies Compressed at Different Target Bit Rates

Coding parameters of the standard test video sequences.

Video copy no. | Target bitrate (kbits/s), Set 1 | Target bitrate (kbits/s), Set 2 | GOP structure, Case 1 | GOP structure, Case 2 |
---|---|---|---|---|
1 | 400 | 700 | , | , |
2 | 600 | 900 | , | , |
3 | 800 | 1000 | , | , |

Average PSNR results (in dB) of the best input copy and the video reconstructed by the proposed method from multiple input copies compressed with the same GOP structure at different target bit rates.

Sequence | No. of copies | Best copy (400–600–800 kbits/s) | Proposed (400–600–800 kbits/s) | Gain (400–600–800 kbits/s) | Best copy (700–900–1000 kbits/s) | Proposed (700–900–1000 kbits/s) | Gain (700–900–1000 kbits/s) |
---|---|---|---|---|---|---|---|
Foreman | 2 | 37.19 | 37.49 | 0.30 | 38.93 | 39.48 | 0.55 |
Foreman | 3 | 38.42 | 38.99 | 0.57 | 39.37 | 40.33 | 0.96 |
Mobile | 2 | 22.54 | 22.86 | 0.32 | 23.86 | 24.28 | 0.43 |
Mobile | 3 | 23.37 | 23.84 | 0.47 | 24.85 | 25.48 | 0.63 |
Mother and Daughter | 2 | 43.93 | 44.28 | 0.35 | 45.23 | 45.79 | 0.56 |
Mother and Daughter | 3 | 44.87 | 45.52 | 0.65 | 45.58 | 46.54 | 0.97 |
News | 2 | 42.29 | 42.62 | 0.33 | 44.34 | 44.88 | 0.54 |
News | 3 | 43.77 | 44.33 | 0.56 | 44.91 | 45.80 | 0.89 |
Silent | 2 | 39.02 | 39.14 | 0.12 | 41.40 | 41.69 | 0.29 |
Silent | 3 | 40.72 | 41.21 | 0.49 | 42.03 | 43.93 | 0.90 |
Flower | 2 | 31.18 | 31.55 | 0.38 | 33.03 | 33.70 | 0.66 |
Flower | 3 | 32.49 | 33.22 | 0.72 | 33.49 | 34.65 | 1.16 |
Stefan | 2 | 30.45 | 30.73 | 0.27 | 32.49 | 33.05 | 0.57 |
Stefan | 3 | 31.82 | 32.43 | 0.61 | 32.95 | 34.08 | 1.13 |
Tennis | 2 | 31.45 | 31.78 | 0.33 | 32.75 | 33.29 | 0.54 |
Tennis | 3 | 32.40 | 32.97 | 0.57 | 32.89 | 34.17 | 1.28 |
Coastguard | 2 | 32.45 | 32.80 | 0.35 | 34.06 | 34.65 | 0.59 |
Coastguard | 3 | 33.58 | 34.24 | 0.66 | 34.43 | 35.66 | 1.23 |
Tempete | 2 | 31.52 | 31.85 | 0.32 | 33.23 | 33.81 | 0.58 |
Tempete | 3 | 32.74 | 33.35 | 0.61 | 33.59 | 34.74 | 1.16 |

Average PSNR results (in dB) of the best input copy and the video reconstructed by the proposed method from multiple input copies compressed with different GOP structures at different target bit rates.

Sequence | No. of copies | Best copy (400–600–800 kbits/s) | Proposed (400–600–800 kbits/s) | Gain (400–600–800 kbits/s) | Best copy (700–900–1000 kbits/s) | Proposed (700–900–1000 kbits/s) | Gain (700–900–1000 kbits/s) |
---|---|---|---|---|---|---|---|
Foreman | 2 | 37.19 | 37.55 | 0.36 | 38.93 | 39.61 | 0.68 |
Foreman | 3 | 38.41 | 39.45 | 1.04 | 39.45 | 41.01 | 1.56 |
Mobile | 2 | 22.54 | 23.03 | 0.49 | 23.86 | 25.26 | 1.40 |
Mobile | 3 | 25.12 | 25.93 | 0.81 | 26.65 | 28.17 | 1.52 |
Mother and Daughter | 2 | 43.93 | 44.30 | 0.37 | 45.23 | 45.87 | 0.64 |
Mother and Daughter | 3 | 44.87 | 45.77 | 0.90 | 45.59 | 46.92 | 1.33 |
News | 2 | 42.29 | 42.51 | 0.22 | 44.34 | 44.78 | 0.44 |
News | 3 | 44.04 | 44.67 | 0.63 | 45.19 | 46.23 | 1.04 |
Silent | 2 | 39.02 | 39.33 | 0.31 | 41.40 | 41.91 | 0.51 |
Silent | 3 | 41.33 | 41.87 | 0.54 | 42.66 | 43.58 | 0.92 |
Flower | 2 | 31.18 | 31.34 | 0.16 | 33.03 | 33.54 | 0.51 |
Flower | 3 | 32.94 | 33.87 | 0.93 | 33.78 | 35.44 | 1.66 |
Stefan | 2 | 30.45 | 31.14 | 0.69 | 32.49 | 33.70 | 1.21 |
Stefan | 3 | 31.32 | 33.01 | 1.69 | 32.51 | 35.02 | 2.51 |
Tennis | 2 | 31.45 | 31.70 | 0.25 | 32.75 | 33.41 | 0.66 |
Tennis | 3 | 32.62 | 33.48 | 0.86 | 33.14 | 34.65 | 1.51 |
Coastguard | 2 | 32.45 | 32.88 | 0.43 | 34.06 | 34.89 | 0.83 |
Coastguard | 3 | 33.42 | 34.58 | 1.16 | 34.23 | 36.20 | 1.97 |
Tempete | 2 | 31.52 | 31.83 | 0.31 | 33.23 | 33.87 | 0.64 |
Tempete | 3 | 33.03 | 34.04 | 1.01 | 33.82 | 35.53 | 1.71 |

The experimental results also show that the PSNR improvement obtained from the set of low-bit-rate inputs is lower than that of the high-bit-rate set. This is because, in the low-bit-rate range, coarse quantization step sizes are generally used for encoding, resulting in a large QCS for each integer transform coefficient. Furthermore, the QCSs of the low-quality copies (e.g., copies 1 and 2 in Set 1) do not contribute much to reducing the size of the narrow QCS obtained by the proposed method, because the quantization step sizes used in these copies are generally too large compared to that of the best copy. As a result, the size of the narrow QCS cannot be significantly reduced, and it usually remains the same as that of the best copy. Thus, not much quality improvement compared to the best copy can be obtained (see the results of Set 1 in Tables 4 and 5 and the discussion in Section 5). In addition, we can generally obtain a better PSNR gain when the similar frames from the available copies are coded using different picture coding types (Case 2 in Table 5) than when they are coded using the same picture coding types (Case 1 in Table 4). Note that the size of the narrow QCS depends not only on the relation among the sizes of the QCSs from the multiple compressed copies but also on the relative position of the QCS intervals. As explained in Section 5, the relative position of each independent QCS is partly determined by the prediction value, which can differ considerably when different picture types are used to code similar frames from the available copies. This helps to reduce the size of the narrow QCS obtained by the proposed method significantly, resulting in more distortion reduction.

Note that the reconstructed frame from the best input copy, in terms of average PSNR, may not always provide better quality than those reconstructed from other copies as shown in Figure 7. However, the reconstructed frame obtained by the proposed method can still achieve better quality, in terms of both PSNR and visual quality, than the best frame reconstructed from the available input copies (i.e., the reconstructed frame from Copy 2 in this case).

### 6.3. Multiple Copies Compressed at the Same Target Bit Rates

In another set of experiments, the input copies were obtained by encoding the test sequences at the same target bit rates. For simplicity, the same GOP structure was used but with different starting frames for different video copies. This is likely to occur in practice, for example, when different people encode the same broadcast video starting at slightly different time instances and upload the compressed videos to websites such as YouTube and Google Video. Thus, the encoded picture type (i.e., I, P, or B) for each particular frame may not be the same among different compressed copies (e.g., it can be an I frame in one copy and a B or P frame in other copies).

### 6.4. Multiple Copies Compressed as Variable and Constant Bit Rates

### 6.5. Application to Real Video Sequences

Coding parameters of real video sequences.

Video copy no. | Target bitrate (kbits/s), Set 1 | Target bitrate (kbits/s), Set 2 | GOP structure |
---|---|---|---|
1 | 500 | 900 | , |
2 | 700 | 1100 | , |
3 | 900 | 1300 | , |

Average PSNR results (in dB) of the best input copy and the video reconstructed by the proposed method from multiple copies of a situation comedy compressed with different GOP structures and different target bit rates.

Sequence | No. of copies | Best copy (500–700–900 kbits/s) | Proposed (500–700–900 kbits/s) | Gain (500–700–900 kbits/s) | Best copy (900–1100–1300 kbits/s) | Proposed (900–1100–1300 kbits/s) | Gain (900–1100–1300 kbits/s) |
---|---|---|---|---|---|---|---|
Ballroom dancing | 2 | 37.48 | 38.11 | 0.63 | 39.51 | 40.50 | 0.99 |
Ballroom dancing | 3 | 39.01 | 39.78 | 0.77 | 40.46 | 41.74 | 1.28 |
Superbowl | 2 | 37.23 | 38.02 | 0.79 | 39.38 | 40.50 | 1.12 |
Superbowl | 3 | 38.96 | 39.84 | 0.88 | 40.40 | 41.86 | 1.46 |
Football | 2 | 36.44 | 37.11 | 0.67 | 38.56 | 39.65 | 1.09 |
Football | 3 | 38.00 | 38.85 | 0.85 | 39.46 | 40.86 | 1.40 |
London | 2 | 36.89 | 37.44 | 0.55 | 38.67 | 39.60 | 0.93 |
London | 3 | 37.67 | 38.75 | 1.08 | 38.97 | 40.62 | 1.65 |
Routine | 2 | 34.68 | 34.95 | 0.27 | 36.80 | 37.48 | 0.68 |
Routine | 3 | 35.50 | 36.34 | 0.84 | 37.21 | 38.58 | 1.37 |
Rugby | 2 | 33.35 | 33.76 | 0.41 | 35.54 | 36.38 | 0.84 |
Rugby | 3 | 34.82 | 35.61 | 0.79 | 36.51 | 37.84 | 1.33 |
Soldier | 2 | 36.70 | 37.38 | 0.68 | 38.62 | 39.78 | 1.16 |
Soldier | 3 | 38.44 | 39.24 | 0.80 | 40.63 | 42.43 | 1.80 |
Vegas | 2 | 35.73 | 36.12 | 0.39 | 38.01 | 38.84 | 0.83 |
Vegas | 3 | 36.86 | 37.69 | 0.83 | 38.57 | 40.01 | 1.44 |

## 7. Conclusion

We have addressed a new and interesting research problem of blindly enhancing the video reconstructed from multiple compressed video copies of the same video content with different levels of quality. Without making reference to the original source video or information on the quality of the compressed copies, the proposed method effectively exploits the compressed information of different video copies to reconstruct a video that has a better quality in terms of PSNR than the best compressed copy. Specifically, each coefficient of the reconstructed video in the transform domain is estimated using a narrow quantization constraint set obtained from the multiple compressed copies together with a Laplacian or Cauchy distribution model for each AC frequency to minimize the distortion. By reconstructing the enhanced video in the transform domain, the proposed method incurs much lower computational complexity compared with the previous method. In addition, analytical and experimental results show that the video reconstructed by the proposed method not only yields a lower distortion than any given compressed copy but also achieves a significant PSNR gain compared to the best copy. Furthermore, a similar approach can be easily extended to other transform-based coding schemes such as DCT-based or wavelet-based transform coding.

## Declarations

### Acknowledgment

This research is partially supported by a research grant awarded by The Agency for Science, Technology and Research (A*STAR), Singapore, under the Mobile Media Thematic Strategic Research Programme of the Science and Engineering Research Council.

## References

1. Musmann HG, Pirsch P, Grallert H-J: **Advances in picture coding.** *Proceedings of the IEEE* 1985, **73**(4):523-548.
2. Netravali AN, Haskell BG: *Digital Pictures: Representation and Compression*. 2nd edition. Plenum Press, New York, NY, USA; 1995.
3. Reeves HC III, Lim JS: **Reduction of blocking effects in image coding.** *Optical Engineering* 1984, **23**(1):34-37.
4. Ramamurthi B, Gersho A: **Nonlinear space-variant postprocessing of block coded images.** *IEEE Transactions on Acoustics, Speech, and Signal Processing* 1986, **34**(5):1258-1268. doi:10.1109/TASSP.1986.1164961
5. Shen M-Y, Kuo C-C: **Review of postprocessing techniques for compression artifact removal.** *Journal of Visual Communication and Image Representation* 1998, **9**(1):2-14. doi:10.1006/jvci.1997.0378
6. Kwon K-K, Im S-H, Lim D-S: **Deblocking algorithm in MPEG-4 video coding using block boundary characteristics and adaptive filtering.** *Proceedings of the IEEE International Conference on Image Processing (ICIP '05), September 2005* 541-544.
7. Kong H-S, Nie Y, Vetro A, Sun H, Barner KE: **Coding artifacts reduction using edge map guided adaptive and fuzzy filtering.** *Proceedings of the IEEE International Conference on Multimedia and Expo (ICME '04), June 2004* 1135-1138.
8. Chen T, Wu HR, Qiu B: **Adaptive postfiltering of transform coefficients for the reduction of blocking artifacts.** *IEEE Transactions on Circuits and Systems for Video Technology* 2001, **11**(5):594-602. doi:10.1109/76.920189
9. Liu S, Bovik AC: **Efficient DCT-domain blind measurement and reduction of blocking artifacts.** *IEEE Transactions on Circuits and Systems for Video Technology* 2002, **12**(12):1139-1149. doi:10.1109/TCSVT.2002.806819
10. Liew AW-C, Yan H: **Blocking artifacts suppression in block-coded images using overcomplete wavelet representation.** *IEEE Transactions on Circuits and Systems for Video Technology* 2004, **14**(4):450-461. doi:10.1109/TCSVT.2004.825555
11. Luo Y, Ward RK: **Removing the blocking artifacts of block-based DCT compressed images.** *IEEE Transactions on Image Processing* 2003, **12**(7):838-842. doi:10.1109/TIP.2003.814252
12. Kim S, Jeong J: **Enhancement of wavelet-coded images via novel directional filtering.** *Proceedings of the International Conference on Neural Networks and Signal Processing, December 2003* **2**:1062-1065.
13. Ismaeil IR, Ward RK: **Removal of DCT blocking artifacts using DC and AC filtering.** *Proceedings of the IEEE Pacific Rim Conference on Communications Computers and Signal Processing (PACRIM '03), August 2003* 229-232.
14. Rosenholtz RE, Zakhor A: **Iterative procedures for reduction of blocking effects in transform image coding.** *Image Processing Algorithms and Techniques II, February 1991, San Jose, Calif, USA, Proceedings of SPIE* **1452**:116-126.
15. Zakhor A: **Iterative procedures for reduction of blocking effects in transform image coding.** *IEEE Transactions on Circuits and Systems for Video Technology* 1992, **2**(1):91-95. doi:10.1109/76.134377
16. Yang Y, Galatsanos NP, Katsaggelos AK: **Projection-based spatially adaptive reconstruction of block-transform compressed images.** *IEEE Transactions on Image Processing* 1995, **4**(7):896-908. doi:10.1109/83.392332
17. Paek H, Kim R-C, Lee S-U: **On the POCS-based postprocessing technique to reduce the blocking artifacts in transform coded images.** *IEEE Transactions on Circuits and Systems for Video Technology* 1998, **8**(3):358-367. doi:10.1109/76.678636
18. Guleryuz OG: **Linear, worst-case estimators for denoising quantization noise in transform coded images.** *IEEE Transactions on Image Processing* 2006, **15**(10):2967-2986.
19. Mateos J, Katsaggelos AK, Molina R: **A Bayesian approach for the estimation and transmission of regularization parameters for reducing blocking artifacts.** *IEEE Transactions on Image Processing* 2000, **9**(7):1200-1215. doi:10.1109/83.847833
20. Li Z, Delp EJ: **MAP-based post processing of video sequences using 3-D Huber-Markov random field model.** *Proceedings of the IEEE International Conference on Multimedia and Expo (ICME '02), August 2002* **1**:153-156.
21. Li J, Kuo C-CJ: **Coding artifact removal with multiscale postprocessing.** *Proceedings of the International Conference on Image Processing, October 1997, Santa Barbara, Calif, USA* **1**:45-48.
22. Yang T, Galatsanos P, Katsaggelos AK: **Regularized reconstruction to reduce blocking artifacts of block discrete cosine transform compressed images.** *IEEE Transactions on Circuits and Systems for Video Technology* 1993, **3**(6):421-432. doi:10.1109/76.260198
23. Choi MG, Yang Y, Galatsanos NP: **Multichannel regularized recovery of compressed video sequences.** *IEEE Transactions on Circuits and Systems II* 2001, **48**(4):376-387. doi:10.1109/82.933797
24. Li Z, Delp EJ: **Block artifact reduction using a transform-domain Markov random field model.** *IEEE Transactions on Circuits and Systems for Video Technology* 2005, **15**(12):1583-1593.
25. Zou JJ, Yan H: **A deblocking method for BDCT compressed images based on adaptive projections.** *IEEE Transactions on Circuits and Systems for Video Technology* 2005, **15**(3):430-435.
26. Kartalov T, Ivanovski ZA, Panovski L, Karam LJ: **An adaptive POCS algorithm for compression artifacts removal.** *Proceedings of the 9th International Symposium on Signal Processing and Its Applications (ISSPA '07), February 2007* 1-6.
27. Liew AW-C, Yan H, Law N-F: **POCS-based blocking artifacts suppression using a smoothness constraint set with explicit region modeling.** *IEEE Transactions on Circuits and Systems for Video Technology* 2005, **15**(6):795-800.
28. Gunturk BK, Altunbasak Y, Mersereau RM: **Multiframe resolution-enhancement methods for compressed video.** *IEEE Signal Processing Letters* 2002, **9**(6):170-174. doi:10.1109/LSP.2002.800503
29. Gunturk BK, Altunbasak Y, Mersereau RM: **Multiframe blocking-artifact reduction for transform-coded video.** *IEEE Transactions on Circuits and Systems for Video Technology* 2002, **12**(4):276-282. doi:10.1109/76.999205
30. Segall CA, Molina R, Katsaggelos AK: **High-resolution images from low-resolution compressed video.** *IEEE Signal Processing Magazine* 2003, **20**(3):37-48. doi:10.1109/MSP.2003.1203208
31. Wang C, Yang G, Tan Y-P: **Reconstructing videos from multiple compressed copies.** *IEEE Transactions on Circuits and Systems for Video Technology* 2009, **19**(9):1342-1351.
32. Richardson I: *H.264 and MPEG-4 Video Compression*. John Wiley & Sons, New York, NY, USA; 2003.
33. **JM 12.4-H.264 reference software.** http://iphome.hhi.de/suehring/tml/
34. de Queiroz RL: **Processing JPEG-compressed images and documents.** *IEEE Transactions on Image Processing* 1998, **7**(12):1661-1672. doi:10.1109/83.730378
35. Müller F: **Distribution shape of two-dimensional DCT coefficients of natural images.** *Electronics Letters* 1993, **29**(22):1935-1936. doi:10.1049/el:19931288
36. Eude T, Grisel R, Cherifi H, Debrie R: **On the distribution of the DCT coefficients.** *Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '94), April 1994, Adelaide, Australia* **5**:365-368.
37. Lam EY, Goodman JW: **A mathematical analysis of the DCT coefficient distributions for images.** *IEEE Transactions on Image Processing* 2000, **9**(10):1661-1666. doi:10.1109/83.869177
38. Smoot SR, Rowe LA: **DCT coefficient distributions.** *Human Vision and Electronic Imaging, February 1996, San Jose, Calif, USA, Proceedings of SPIE* 403-411.
39. Kamaci N, Altunbasak Y, Mersereau RM: **Frame bit allocation for the H.264/AVC video coder via Cauchy-density-based rate and distortion models.** *IEEE Transactions on Circuits and Systems for Video Technology* 2005, **15**(8):994-1006.
40. Brandão T, Queluz MP: **No-reference image quality assessment based on DCT domain statistics.** *Signal Processing* 2008, **88**(4):822-833. doi:10.1016/j.sigpro.2007.09.017
41. Kelley CT: *Solving Nonlinear Equations with Newton's Method, Fundamentals of Algorithms*. SIAM, Philadelphia, Pa, USA; 2003.
42. Tan YP, Kulkarni SR, Ramadge PJ: **A framework for measuring video similarity and its application to video query.** *Proceedings of the IEEE International Conference on Image Processing (ICIP '99), 1999, Kobe, Japan* **2**:106-110.

## Copyright

This article is published under license to BioMed Central Ltd. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.