Adaptive distributed video coding with motion vectors through a back channel
© Min and Sim; licensee Springer. 2013
- Received: 14 March 2012
- Accepted: 11 April 2013
- Published: 25 April 2013
In this paper, a new adaptive distributed video coding (DVC) method with received motion vectors (MVs) through a back channel is proposed. The MVs estimated for side information (SI) generation in the proposed DVC decoder are transmitted to the proposed DVC encoder to reconstruct a predicted SI (PSI) which is the same as the SI at the decoder with minimum computational load. Therefore, the proposed DVC encoder can determine the positions and magnitudes of errors using the PSI. With available error information, the proposed encoder adaptively determines appropriate slice partitions by maintaining fixed per-slice error rates for prevention of channel decoding failure. The encoder also sends a coded SI accuracy map to the decoder and sets the conditional probability of each variable node of the low-density parity-check accumulator (LDPCA) to the correct crossover probability according to the coded map as received on the decoder side. The performance of the LDPCA can also be significantly improved with a minimum number of parity bits and low computational complexity because accurate belief propagation can be carried out using the correct crossover probabilities in the LDPCA decoder. Experimental results show that the rate-distortion performance of the proposed algorithm is twice that of several conventional DVC methods and that the proposed decoder is approximately 24.5% faster than these methods.
- Motion Vector
- Side Information
- Variable Node
- Distribute Video Code
- Discrete Cosine Transform Domain
As many portable video devices such as wireless low-power surveillance cameras, mobile phones, and multimedia sensor networks have been developed, many people have come to enjoy taking videos and transmitting them easily to friends or Web sites using these portable devices. To improve video quality and delivery speed for portable devices, the demand for low-cost, low-power encoders has been continuously increasing [1, 2]. However, conventional video codecs such as MPEG-x and H.26x cannot satisfy these requirements because they were designed according to a philosophy of high-complexity encoding and low-complexity decoding. Several distributed video coding methods have been proposed to satisfy the new demands for low-complexity encoding and high-complexity decoding. The distributed video coding (DVC) technology is based on the migration of computational complexity from the encoder to the decoder and can achieve coding gain with decoder-side predictions [3–5].
DVC was developed to reduce the complexity of video encoders based on the Slepian-Wolf information theory [1–8]. Two different signals can be coded without prediction between them, and the signals can then be decoded with prediction at the decoder. Slepian and Wolf proved that compression with decoder-side prediction can approach the performance attained by efficient conventional compression that codes signals with prediction at the encoder side. Therefore, the DVC encoder does not need to perform motion estimation, which drastically reduces its computational complexity. Wyner and Ziv (WZ) proposed a lossy DVC approach using side information (SI). The approach codes original input frames using the key frame mode and the WZ mode [4–6]. For the WZ mode, channel coding techniques such as the low-density parity-check accumulator (LDPCA) coder and the turbo coder are often used [7–12].
As mentioned before, the computational complexity of DVC encoders is significantly lower than that of conventional video encoders. However, the corresponding DVC decoders have high computational complexity because SI generation requires at least as much computation as conventional motion estimation, and channel decoding also requires many iterative computations. Therefore, several methods have been proposed to reduce the computational complexity of the DVC decoder. One of these estimates error rates to reduce the number of feedback iterations and the computational complexity of channel decoding [4–6]. The performance and speed of the channel decoder can be improved with accurate crossover probabilities that are close to the actual SI error rates. The crossover probability is defined as the probability that a bit value (0 or 1) will be changed into the other value (1 or 0) at each variable node of the LDPCA channel decoder. Because the LDPCA is based on a sparse bipartite graph, errors are corrected by a belief propagation algorithm, and the crossover probabilities influence the convergence of the belief propagation. The crossover probability is used to determine the conditional entropy of a variable node of the LDPCA, which directly affects the error-correction capability of the LDPCA. Therefore, the computational complexity of the channel decoder can be reduced by correctly setting the crossover probabilities in the variable nodes of the LDPCA based on correct SI error rates. However, neither the encoder nor the decoder can compute SI error rates directly, because doing so requires both the original frame and the SI frame on the same side. Therefore, several SI error-probability prediction methods have been proposed [7–9], which can be divided into two groups depending on whether the error-probability prediction is performed on the encoder or the decoder side.
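To make the role of the crossover probability concrete, the following minimal sketch shows how a crossover probability p could be turned into the initial log-likelihood ratio (LLR) of a variable node and into a conditional entropy, assuming the SI bit is modeled as the original bit passed through a binary symmetric channel. The function names `intrinsic_llr` and `binary_entropy` and the clamping constant `eps` are illustrative assumptions, not part of the described codec.

```python
import math

def intrinsic_llr(si_bit: int, crossover_p: float, eps: float = 1e-12) -> float:
    """Initial LLR log(P(x=0|y) / P(x=1|y)) of a variable node, assuming a
    binary symmetric channel with the given crossover probability."""
    # Clamp p away from 0 and 1 so the logarithm stays finite.
    p = min(max(crossover_p, eps), 1.0 - eps)
    magnitude = math.log((1.0 - p) / p)
    # An SI bit of 0 favours x = 0 (positive LLR); an SI bit of 1 favours x = 1.
    return magnitude if si_bit == 0 else -magnitude

def binary_entropy(p: float) -> float:
    """Conditional entropy H(X|Y) in bits for a binary symmetric channel."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)
```

Under this model, a small p yields a large LLR magnitude (the decoder trusts the SI bit strongly), while p = 0.5 yields an LLR of zero and an entropy of one bit, which is why a mismatched p can slow or mislead belief propagation.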
Error-probability estimation methods on the decoder side assume that the error rate of a bitplane is correlated with that of a previous frame, a neighboring bitplane, or both. These methods can work well for slow-moving sequences and have no impact on the computational complexity of the encoder side. However, iterative feedback is still required because the encoder does not have any information about the SI. Predicted error probabilities are also likely to differ from actual SI error rates for fast-moving sequences, which have low correlations in both the temporal and bitplane domains. Error-probability estimation methods on the encoder side generate a coarse SI by simple linear interpolation, rough block-based motion estimation, motion compensation, and so on. Although the feedback channel can be removed and adaptive DVC modules can be employed with these approaches, the computational complexity of the encoder increases. Furthermore, the estimated error probability is not likely to be accurate, and therefore, coding performance can decrease. Alternatively, the error probability can be estimated with motion vectors received from the decoder side. Motion vector feedback algorithms have been presented to improve the coding efficiency of the DVC paradigm [13–15].
The proposed DVC encoder can generate a predicted SI (PSI) that is the same as the SI of the decoder side with low computational complexity because only motion compensation is performed with received motion vectors (MVs) and reference key frames on the encoder side [13, 14]. Because the proposed PSI is identical to the SI for both slow-moving and fast-moving sequences, the correct SI error can be estimated using the PSI, and the WZ frames can be adaptively encoded depending on the target per-slice error rate. After SI error estimation, the proposed encoder performs block-based coding and determines the correct slice partitions to prevent errors from concentrating in certain areas because a channel decoder cannot work on data with an excessively high error rate. In addition, the proposed DVC encoder sends a coded map indicating whether each block has been coded or not. The coded map leads to faster and more accurate convergence in LDPCA decoding because the proposed DVC decoder marks the variable nodes associated with non-coded blocks as not altered. Therefore, decoding failure rates and delays can be reduced simultaneously.
The rest of this paper is organized as follows: Section 2 introduces several conventional DVC algorithms. Section 3 presents details of the proposed method. In Section 4, experimental results are presented and discussed. Finally, Section 5 gives the concluding remarks.
As mentioned before, multimedia DVC is a new video coding paradigm which makes it possible to shift complexity from an encoder to a decoder. Two signals can be independently coded in the encoder and then reconstructed with a prediction of the cross-correlation between them in the decoder. Slepian and Wolf proved that coding performance can be improved with decoder-side prediction. Wyner and Ziv developed several lossy DVC systems based on the Slepian-Wolf information theory [1, 2]. They also proposed a novel SI generation algorithm to improve DVC coding performance [16–18].
2.1. Wyner-Ziv distributed video coding
In the DVC encoder, key frames are encoded by a conventional intra-frame encoding method such as H.264/AVC intra-frame mode. For the WZ mode, two approaches have been proposed: pixel-based DVC and transform-based DVC. For pixel-based DVC, quantization is carried out as a pre-processing step. For transform-based DVC, transform and quantization are performed before the channel coder. After pre-processing, the channel coder produces a message (the original WZ frame) and its parity bits for each bitplane. Among the outputs of the channel coder, only the parity bits are transmitted to the corresponding DVC decoder when a request is received from the decoder.
In the DVC decoder, key frames are reconstructed using a conventional intra-frame decoder. Then an SI is generated using the reconstructed key frames. Many SI generation algorithms have been proposed because the rate-distortion (RD) performance of DVC depends highly on SI accuracy. The generated SI still differs from the original WZ frame, but the difference can be corrected by the channel decoder using the received parity information. Because the feedback iterations continue until channel decoding is successful, delays in DVC decoding are not negligible for real-time applications.
On the encoder side, the original frames are available, but the SI is not; on the decoder side, the SI is available, but the original frame is not. As a result, error rates cannot be computed on either the decoder or the encoder side. Therefore, conventional DVC systems cannot supply correct error probabilities to the variable nodes in LDPCA decoding, which decreases coding performance and increases the number of feedback iterations in LDPCA decoding. To improve the performance of channel decoding, several conventional algorithms have been proposed to estimate the SI error rate.
2.2. Estimation of error rates
Either hard decisions or soft decisions can be used in a message-passing module for belief propagation in LDPCA. Soft decisions for message passing are known to lead to better coding performance than hard decisions because the soft decision algorithm can use statistical characteristics to set the channel modeling parameters. Note that the log-likelihood ratio can be used to represent conditional entropy in soft decision-making, while only two values ((-1,1) or (0,1)) can be used for hard decision-making. Although LDPCA decoding based on soft decisions requires more computation, soft decision methods are adopted by many LDPCA systems because of their decoding accuracy.
To determine accurate crossover probabilities for LDPCA decoding, several methods have been proposed to predict SI error probabilities [4–6]. In conventional DVC algorithms, the SI is available on the decoder side, but the original frame is not. Several methods have been proposed to predict error probabilities using neighboring information such as previous frames or adjacent bitplanes. These methods assume that changed bit rates in previous bitplanes have an influence on error rates in the target bitplane. The error rates of the target bitplane are also assumed to be similar to those of the same bitplane in the previous frame. Therefore, several conventional algorithms set crossover probabilities equal to the error probabilities of the previous corresponding bitplane or frame using an error model of the difference between the original frame and the SI, such as a Laplacian distribution for DVC. On the other hand, the error probabilities can be estimated on the encoder side. Several existing algorithms have tried to estimate a PSI similar to the SI with minimum computational complexity. Because of this computational constraint, rough SI generation algorithms are used, for example, temporal linear interpolation or simple block-matching algorithms [7–9, 19, 20]. If accurate error rates between the original and SI frames are known on the encoder side, the correct number of parity bits to be sent can be determined. As a result, the iterative feedback can be removed and delays can be drastically reduced for practical DVC. In addition, it is easy to decide whether a block can be coded as an intra-coded or a skipped block, given the number of errors in the target block. However, it is not easy to obtain an accurate SI on the encoder side because this entails a computational load as heavy as that of a DVC decoder. If the estimated error probability is far from the correct one, a channel decoder is likely to converge to an incorrect solution.
2.3. Belief propagation of fixed crossover probability to LDPCA
Although conventional algorithms can estimate a per-slice error probability, they set all variable nodes to a fixed crossover probability for an entire slice. This is because conventional DVC methods assume that the states of the variable nodes are independent and identically distributed. However, SI errors tend to concentrate in certain regions, commonly the boundaries of objects, so the error probabilities can vary significantly depending on location. Note that LDPCA decoding based on belief propagation influences all the connected variable nodes if the local crossover probabilities are incorrect. Channel decoding can become slower and error correction performance can also be degraded when a fixed crossover probability is used for all the variable nodes. The correction performance of LDPCA decoding can be improved using adaptive conditional probabilities based on local error rates.
3.1. Generation of SI and PSI
In the motion-compensated prediction of the PSI, b_i is the ith predicted block for the target frame, (x, y) is the spatial position at pixel resolution, B(x, y, t - 1) and B(x, y, t + 1) are the key frames reconstructed using H.264/AVC intra-frame coding, t is a time index, w_i is the flag bit indicating whether a backward or a forward frame is used, and MV_i is the motion vector received from the decoder.
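The symbols defined above suggest a block-wise motion compensation that needs no motion search at the encoder. The sketch below illustrates one possible reading, assuming that each block is copied from exactly one of the two key frames (selected by the flag) at the position displaced by its motion vector. The function name `build_psi`, the (dy, dx) vector layout, the default block size of 8, and the border clamping are assumptions made for illustration. Frames are plain nested lists whose dimensions are assumed to be multiples of the block size.

```python
def build_psi(prev_key, next_key, mvs, use_prev, block=8):
    """Assemble a predicted frame block by block from two key frames.

    prev_key, next_key: 2-D lists of pixel values, B(x, y, t-1) and B(x, y, t+1).
    mvs: mvs[by][bx] is the (dy, dx) motion vector of block (by, bx).
    use_prev: use_prev[by][bx] is True when the block is taken from prev_key.
    """
    h, w = len(prev_key), len(prev_key[0])
    psi = [[0] * w for _ in range(h)]
    for by in range(h // block):
        for bx in range(w // block):
            ref = prev_key if use_prev[by][bx] else next_key
            dy, dx = mvs[by][bx]
            # Clamp the displaced block so that it stays inside the frame.
            y0 = min(max(by * block + dy, 0), h - block)
            x0 = min(max(bx * block + dx, 0), w - block)
            for r in range(block):
                row = ref[y0 + r][x0:x0 + block]
                psi[by * block + r][bx * block:(bx + 1) * block] = row
    return psi
```

Because the loop only copies pixels, its cost is linear in the frame size, which is consistent with the claim that the encoder avoids motion estimation.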
3.2. Triangular-shaped quantization and block-based channel coding
It is helpful to control and balance the bit errors with the aid of a new bit geometry. The proposed method uses isosceles triangular-shaped quantization (TSQ) of the DCT domain to correctly distribute the error rates. As a result, the proposed algorithm can reduce the probability of decoding failure, and the target bitrate can be gradually adjusted. In other words, error rates can be counterbalanced by combining lower frequency components (which have lower error probabilities) and higher frequency components (which have higher error probabilities) of a block. Therefore, the error rate of a quantized block as determined by the proposed TSQ is not as high as that of the LSB bitplanes or the high-frequency components. The proposed method can also control target bitrates by adjusting quantization parameters and error rates.
3.3. Slice partitioning
3.4. Generation of coded map
To alleviate these problems, the proposed method estimates accurate per-block error rates and generates a coded map. The coded map is a group of flags, with each flag representing a block. Each flag indicates whether its block has an error rate greater than a threshold (T_MAP = 10). Note that the threshold influences video quality and rates: when the threshold is set to a small value, video quality improves and rates increase, so the threshold can be adjusted to control the rate and quality. The coded map received from the encoder is used to set each variable node to the correct crossover probability in the proposed decoder. Because the crossover probabilities of the variable nodes associated with non-coded blocks are set to zero and those of the other variable nodes are set to the target error rate (= 700/6,336), the error-free variable nodes will not be altered, and the others will be corrected with the predetermined probability.
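The mapping from the coded map to per-node crossover probabilities can be sketched as follows. This is a minimal illustration, assuming that the per-block error measure is an error count compared against T_MAP and that every block contributes the same number of variable nodes. The names `coded_map`, `node_crossover`, and `bits_per_block` are hypothetical.

```python
T_MAP = 10                # per-block threshold quoted in the text
TARGET_RATE = 700 / 6336  # target error rate quoted in the text

def coded_map(block_errors, threshold=T_MAP):
    """Return one flag per block: True when the block exceeds the threshold."""
    return [errors > threshold for errors in block_errors]

def node_crossover(flags, bits_per_block, target=TARGET_RATE):
    """Expand per-block flags into per-variable-node crossover probabilities.

    Nodes of non-coded blocks get probability 0 (treated as error-free);
    nodes of coded blocks get the target error rate.
    """
    probs = []
    for coded in flags:
        probs.extend([target if coded else 0.0] * bits_per_block)
    return probs
```

For example, `coded_map([0, 11, 10, 25])` flags only the second and fourth blocks, so only their variable nodes receive a nonzero crossover probability.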
For the performance evaluation of the proposed algorithm, the RD performance of the proposed and conventional algorithms was compared. Six test sequences (‘Akko,’ ‘Ballroom,’ ‘Exit,’ ‘Flamenco2,’ ‘Race1,’ and ‘Rena’) were used, with a format and size of 4:0:0 YUV and 640 × 480, respectively. Key frames were coded using JM 17.0, and five quantization parameters (QPs) (33, 37, 41, 45, and 49) were used. All the sequences consist of 100 frames, with 50 of these frames coded as key frames. The SI was reconstructed based on an adaptive search range and an LDPCA channel coder with a matrix length of 6,336. Slice lengths, MVs, and the coded map were coded using Exp-Golomb. The two quantization parameters, the height and width of the isosceles TSQ, were set to ((7, 8), (6, 7), (5, 6), (4, 5), (3, 4)), respectively, for the five QPs of the key frames. For example, when the QP for the key frames is 33, the height and width for the two quantization parameters are 7 and 8. The target error rate (T_Slice) for the slice length was set to 700, and T_MAP for the coded map was set to 10.
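Since the side data (slice lengths, MVs, and the coded map) is coded with Exp-Golomb, a small sketch of that code may be useful. The text does not state the order of the code or how signed MV components are mapped, so the example assumes order-0 Exp-Golomb with the signed mapping used in H.264/AVC. The function names `ue`, `se`, and `decode_ue` are illustrative.

```python
def ue(value: int) -> str:
    """Order-0 Exp-Golomb code word for a non-negative integer, as a bit string."""
    if value < 0:
        raise ValueError("ue() expects a non-negative integer")
    code = bin(value + 1)[2:]            # binary representation of value + 1
    return "0" * (len(code) - 1) + code  # prefix of len-1 zeros, then the value

def se(value: int) -> str:
    """Signed variant: positive k maps to 2k - 1, non-positive k maps to -2k."""
    mapped = 2 * value - 1 if value > 0 else -2 * value
    return ue(mapped)

def decode_ue(bits: str, pos: int = 0):
    """Decode one code word starting at pos; return (value, next position)."""
    zeros = 0
    while bits[pos + zeros] == "0":
        zeros += 1
    end = pos + 2 * zeros + 1
    return int(bits[pos + zeros:end], 2) - 1, end
```

Small values receive short code words (`ue(0)` is a single bit), which suits motion vectors and slice lengths whose distributions are concentrated near small magnitudes.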
Because neither the encoder nor the decoder has both the original frame and the SI, it is not easy to predict accurate SI error rates on either side with minimum computational complexity. To predict SI error rates with minimum computational load on the encoder side, linear interpolation is widely used. However, the peak signal-to-noise ratio (PSNR) between the SI on the decoder side and the PSI obtained by linear interpolation was only approximately 27.99 dB for the test sequences, so conventional DVC that generates the PSI by linear interpolation does not work well for fast-moving sequences. In contrast, the proposed method generates a PSI identical to the SI on the decoder side and therefore works well regardless of the sequence. Furthermore, the additional computational load is negligible; for example, it takes 8 ms to generate the PSI using a 64-bit Intel Core i5 with a 2.53-GHz CPU.
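For reference, the PSNR between two frames such as the SI and a PSI can be computed as in the following sketch, assuming 8-bit luma samples (peak value 255) stored as nested lists of equal size. The function name `psnr` is illustrative.

```python
import math

def psnr(frame_a, frame_b, peak=255.0):
    """PSNR in dB between two equally sized 2-D frames."""
    squared = [
        (a - b) ** 2
        for row_a, row_b in zip(frame_a, frame_b)
        for a, b in zip(row_a, row_b)
    ]
    mse = sum(squared) / len(squared)
    # Identical frames have zero error, i.e. unbounded PSNR.
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)
```

A PSI that is identical to the SI gives an unbounded PSNR under this definition, whereas the interpolated PSI discussed above reaches only about 27.99 dB.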
where T_C and T_P are the running times of the conventional and the proposed algorithms, respectively. Compared with DISCOVER, the computation time of the proposed encoder increased by approximately 57.493% because the encoder must generate the PSI and compute slice partitions, the coded map, and an 8 × 8 DCT; however, the proposed encoder can still run in real time even with this additional load. On the other hand, the complexity of the decoder was significantly reduced. The reduction in complexity for the LDPCA decoder itself is as much as 99.03%, but the complexity reduction of the whole DVC decoder is approximately 24.50% according to Equation 3, compared with that of DISCOVER. The proposed DVC decoder does not require iterative feedback, and the number of iterations in the LDPCA itself is also significantly reduced by setting accurate crossover probabilities in all the variable nodes of the LDPCA decoder.
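The time-saving measure itself is not reproduced in this text. A relative measure consistent with the definitions of the two running times and with the reported percentages would take the following form; this is an assumption about its shape, not a quotation of Equation 3.

```latex
\Delta T = \frac{T_C - T_P}{T_C} \times 100\ (\%)
```

Under this assumed form, a positive value indicates that the proposed algorithm is faster than the conventional one, and a negative value indicates that it is slower.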
In this paper, a new adaptive distributed video coder is proposed based on MVs received from the DVC decoder side. In the proposed encoder, the PSI is reconstructed using reference key frames and MVs without motion estimation. The correct number and positions of SI errors can be computed on the encoder side using the PSI because the PSI is the same as the SI in the decoder. Based on this error information, the encoder determines slice partitions that maintain a fixed per-slice error rate, generates a coded map, and transmits the map to the decoder side. The proposed decoder can then set accurate crossover probabilities for the variable nodes according to the received coded map. The proposed method improves the performance of the channel coder by providing accurate crossover probabilities. It also reduces coding delays in the decoder by eliminating iterative feedback. However, a back channel is still needed for the proposed method. In the future, the authors will focus on a new framework to remove the back channel.
This work was partly supported by the IT R&D program of MKE/KEIT (10039199, A Study on Core Technologies of Perceptual Quality based Scalable 3D Video Codecs) and by The Ministry of Knowledge Economy (MKE), Korea, under the Information Technology Research Center (ITRC) support program (NIPA-2013-H0301-13-1011) supervised by the National IT Industry Promotion Agency (NIPA).
- Slepian D, Wolf J: Noiseless coding of correlated information sources. IEEE Trans. on Information Theory 1973, 19(4):471-480. 10.1109/TIT.1973.1055037
- Wyner A, Ziv J: The rate-distortion function for source coding with side information at the decoder. IEEE Trans. on Information Theory 1976, 22(1):1-10. 10.1109/TIT.1976.1055508
- Micallef J, Farrugia JR, Debono C: Low-density parity-check codes for asymmetric distributed source coding. In Conference on ICITIS 2010. Beijing; 17–19 Dec 2010:985-988.
- Linbo Q, Xiaohai H, Rui L, Xiewei D: Application of punctured turbo codes in distributed video coding. In Conference on ICIG 2007. Sichuan; 22–24 Aug 2007:241-245.
- Aaron A, Zhang R, Girod B: Wyner-Ziv coding of motion video. In Conference on Conference Record of the Thirty-Sixth Asilomar Conference on Signals, Systems, and Computers 2002, vol. 1. Pacific Grove; 3–6 Nov 2002:240-244.
- Varodayan D, Aaron A, Girod B: Rate-adaptive codes for distributed source coding. EURASIP Signal Processing Journal, Special Section on Distributed Source Coding 2006, 86(11):3123-3130.
- Brites C, Pereira F: Encoder rate control for transform domain Wyner-Ziv video coding. In Conference on ICIP 2007, vol. 2. San Antonio; 16 Sep–19 Oct 2007:5-8.
- Zhai F, Fair IJ: Techniques for early stopping and error detection in turbo decoding. Trans. on IEEE Communications 2003, 51: 1617-1623. 10.1109/TCOMM.2003.818099
- Chien WJ, Karam LJ, Abousleman GP: Rate-distortion based selective decoding for pixel-domain distributed video coding. In Conference on ICIP 2008. San Diego; 12–15 Oct 2008:1132-1135.
- Skorupa J, Slowack J, Mys S, Lambert P, Van de Walle R, Grecos C: Stopping criterions for turbo coding in a Wyner-Ziv video codec. In Conference on PCS 2009. Chicago; 6–8 May 2009:1-4.
- Martinez JL, Holder C, Fernandez GE, Kalva H, Quiles F: DVC using a half-feedback based approach. In Conference and Expo on Multimedia. Hannover; 23 Jun–26 Apr 2008:1125-1128.
- Du B, Shen H: Encoder rate control for pixel-domain distributed video coding without feedback channel. In Conference on MUE 2009. Qingdao; 4–6 Jun 2009:9-13.
- Min K, Park S, Sim D: Distributed video coding based on adaptive slice size using received motion vectors. In 28th Picture Coding Symposium. Nagoya; 8–10 Dec 2010:262-265.
- Min K, Park S, Sim D: Distributed video coding based on adaptive crossover probability using received motion vectors. In CEWIT 2010. Incheon; 27–29 Sep 2010.
- Kim J-S, Kim J-G, Seo K-D: A selective block encoding scheme based on motion information feedback in distributed video coding. IEICE Trans. on Comm. 2011, E94.B(3):860-862. 10.1587/transcom.E94.B.860
- Jia W, Xiaolin W, Songyu Y, Jun S: New results on multiple descriptions in the Wyner-Ziv setting. IEEE Trans. on Information Theory 2009, 55(4):1710-1708.
- Liu R, Yue Z, Chen C: Side information generation based on hierarchical motion estimation in distributed video coding. Journal of Aeronautics 2009, 22(2):167-173. 10.1016/S1000-9361(08)60083-7
- Shuiming Y, Ouaret M, Dufaux F, Ebrahimi T: Improved side information generation with iterative decoding and frame interpolation for distributed video coding. In Conference on ICIP 2008. San Diego; 12–15 Oct 2008:2228-2231.
- Min KY, Park SN, Nam JH, Sim DG, Kim SH: Distributed video coding based on adaptive block quantization using received motion vectors. KICS Journals 2010, 35(2):172-181.
- Huchet G, Demin W: Distributed video coding without channel codes. In Symposium on IEEE BMSB 2010. Shanghai; 24–26 Mar 2010:1-5.
- Min KY, Park SN, Sim DG: Side information generation using adaptive search range for distributed video coding. In Conference on PacRim 2009. Victoria; 23–26 Aug 2009:854-857.
- Artigas X, Ascenso J, Dalai M, Klomp S, Kubasov D, Ouaret M: The discover codec, architecture, techniques and evaluation. In Conference on PCS 2007. Lisbon; 7–9 Nov 2007:6-9.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.