Iterative Multiview Side Information for Enhanced Reconstruction in Distributed Video Coding
- Mourad Ouaret^{1}Email author,
- Frédéric Dufaux^{1} and
- Touradj Ebrahimi^{1}
https://doi.org/10.1155/2009/591915
© Mourad Ouaret et al. 2009
Received: 30 May 2008
Accepted: 15 December 2008
Published: 16 March 2009
Abstract
Distributed video coding (DVC) is a new paradigm for video compression based on the information theoretical results of Slepian and Wolf (SW) and Wyner and Ziv (WZ). DVC entails low-complexity encoders as well as separate encoding of correlated video sources. This is particularly attractive for multiview camera systems in video surveillance and camera sensor network applications, where low complexity is required at the encoder. In addition, the separate encoding of the sources implies no communication between the cameras in a practical scenario. This is an advantage since communication is time and power consuming and requires complex networking. In this work, different intercamera estimation techniques for side information (SI) generation are explored and compared in terms of estimating quality, complexity, and rate distortion (RD) performance. Further, a technique called iterative multiview side information (IMSI) is introduced, where the final SI is used in an iterative reconstruction process. The simulation results show that IMSI significantly improves the RD performance for video with significant motion and activity. Furthermore, DVC outperforms AVC/H.264 Intra for video with average and low motion but it is still inferior to the Inter No Motion and Inter Motion modes.
Keywords
1. Introduction
Multiview video is attractive for a wide range of applications such as free viewpoint television (FTV) [1] and video surveillance camera networks. The increased use of multiview video systems is mainly due to the improvements in video technology. In addition, the reduced cost of cameras encourages the deployment of multiview video systems.
FTV is one of the promising applications of multiview. FTV is a 3D multiview system that allows viewing the scene from a view point chosen by the viewer. Video surveillance is another area where multiview can be beneficial for monitoring purposes. In addition, the multiple views can be used to improve the performance of event detection and recognition algorithms. However, the amount of data generated by multiview systems increases rapidly with the number of cameras. This makes data compression a key issue in such systems.
In DVC [2], the source statistics are exploited at the decoder by computing the SI of the WZ frame using different techniques. In this paper, a review of different SI techniques for multiview DVC is first provided, including a thorough evaluation of their estimation quality, complexity, and RD performance. Moreover, all the SI techniques are combined in the ground truth (GT) fusion, which combines the different SIs using the original WZ frame at the decoder. Even though this is not feasible in practice, it gives the maximum achievable DVC performance. Further, a new technique called iterative multiview side information (IMSI) is proposed to improve the DVC RD performance especially for video with significant motion. IMSI uses an initial SI to decode the WZ frame and then constructs a final SI which is used in a second reconstruction iteration. Finally, the performance of multiview DVC is compared with respect to AVC/H.264 [3] Intra, Inter No Motion (i.e., zero motion vectors), and Inter Motion.
The paper is structured as follows. First, the paradigm of distributed video coding is presented in Section 2. Multiview DVC is described in Section 3, whereas, Section 4 reviews the different intercamera estimation techniques. The IMSI technique is proposed in Section 5. Then, the test material and simulation results are presented and discussed in Section 6. Finally, some concluding remarks are drawn in Section 7.
2. Distributed Video Coding (DVC)
2.1. Theoretical DVC
DVC is the result of the information-theoretic bounds established for distributed source coding (DSC) by Slepian and Wolf [4] for lossless coding, and by Wyner and Ziv [5] for lossy coding with SI at the decoder. Lossless DSC refers to two correlated random sources separately encoded and jointly decoded by exploiting the statistical dependencies.
2.2. Practical DVC
At the encoder, the frames are separated into two sets. The first one is the key frames which are fed to a conventional AVC/H.264 Intra encoder. The second set is the WZ frames. The latter are transformed and then quantized prior to WZ encoding. The same separable integer transform as in AVC/H.264 is used with properties similar to the discrete cosine transform (DCT) [7]. Then, the same bands are grouped together and the different bit planes are extracted and then fed to a turbo encoder [8]. The latter offers near-channel capacity error correcting capability. Furthermore, a cyclic redundancy check (CRC) [9] is computed for each quantized bit plane and transmitted to the decoder. The frequency of the key frames is defined by the group of pictures (GOPs).
A virtual channel is used to model the correlation between the DCT coefficients of the original and SI frames. It is shown that the residual of the DCT coefficients follows the Laplacian distribution [2]. The reconstruction process [11] uses the SI along with decoded bins to recover the original frame up to a certain quality. The decoder accepts the SI DCT value as a reconstructed one if it fits into the quantization interval corresponding to the decoded bin. Otherwise, it truncates the DCT value into the quantization interval. This DVC scheme is decoder driven as the request for parity bits from the encoder is performed via a feedback channel until successful decoding. The decoding is considered successful if the decoded bit plane error probability is lower than and its CRC matches the one received from the encoder.
The multiview DVC scheme used in this research is exactly the same as the monoview DVC described above except for the SI extraction module as it is explained further in Section 3.
3. Multiview DVC (MDVC)
It differs from monoview DVC in the decoder. More precisely, the SI is constructed not only using the frames within the same camera but using frames from the other cameras as well.
In [14], the wavelet transform is combined with turbo codes to encode a multiview camera array in a distributed way. At the decoder, a fusion technique is introduced to combine temporal and homography-based side information. It thresholds the motion vectors and the difference between the corresponding backward and forward estimations to obtain a fusion mask. The mask assigns the regions with significant motion vector and estimation error to homography SI, and the rest is assigned to temporal SI (i.e., regions with low motion and relatively small prediction error). It is reported that the hybrid SI outperforms the temporal one by around 1.5 dB in PSNR. In addition, it outperforms H.263+ Intra by around dB. A video content with spatial resolution is used in the evaluation.
Further, a flexible estimation technique that can jointly utilize temporal and view correlations to generate side information is proposed in [15]. More specifically, the current pixel in the WZ frame is mapped using homography to the left and right camera frames. Then, AVC/H.264 decision modes are applied to the pixel blocks in the left and right camera frames. If both resulting modes are intermodes, the SI value is taken from temporal SI. Otherwise, it is taken from homography SI. The simulation results show that this technique significantly outperforms conventional H.263+ Intra coding. Nevertheless, comparison with AVC/H.264 Intra would be beneficial as it represents state-of-the-art for conventional coding.
Finally, ways of improving the performance of multiview DVC are explored in [20]. Several modes to generate homography-based SI are introduced. The homography is estimated using a global motion estimation technique. The results show an improvement of SI quality by around 6.0 dB and a gain in RD performance by around dB for video content with a spatiotemporal resolution of at 15 fps. However, the reported results assume an ideal fusion mask, which requires the knowledge of the original at the decoder. This is not feasible in a practical scenario.
4. Intercamera Prediction
4.1. Disparity Compensation View Prediction (DCVP)
DCVP [16] is based on the same idea as MCTI, but the motion compensation is performed between the frames from the side cameras. A slight modification is applied to DCVP to improve the SI quality. Instead of interpolating the motion vectors at midpoint, an optimal weight is computed in [16]. For this purpose, the first frame of each camera is conventionally decoded. Then, motion compensation is performed between the side camera frames. The motion vectors are weighted with the weights Further, the SI PSNR is computed for each weight. The weight with maximum PSNR is maintained and used for the rest of the sequence. Nevertheless, the SI generated by DCVP has usually a poorer quality than the one generated by MCTI. This is due to the larger disparity between the side camera frames when compared to the one between the previous and forward frames.
4.2. Homography
where T is a threshold.
The advantage of this technique is that once the homography relating the central camera with the side ones is estimated, computing the SI becomes very simple in terms of computational complexity when compared to techniques based on exhaustive block-based motion estimation. Moreover, this technique is suitable for scenarios, where the global motion is highly dominant with respect to local variations as it would generate a good estimation in this case. On the other hand, if the scene has multiple significant objects moving in different directions, the estimation would be of a poor quality as the technique would only account for global motion.
4.3. View Synthesis Prediction (VSP)
The previously mentioned techniques do not take advantage of some important features of multiview. That is, the speed at which an object is moving in a view depends on its depth information. In addition to this, rotations, zooms, and different intrinsic parameters are difficult to model using a motion vector, which is a simple translational model. Furthermore, the homography tries to estimate a global motion and ignores local motion using a truncated error function, which is not the case of VSP [22]. In the latter, the camera parameters, intrinsic and extrinsic, are used to predict one camera view from its neighbors.
In the multiview camera setup used in this research, the pixel in the central camera is mapped to both side cameras. The pixel value is taken as average of both side camera pixels.
The drawback of this technique is the difficulty to estimate depth for real-world complex scenes. In addition, the quality of the SI depends on the precision of the camera calibration and depth estimation.
4.4. View Morphing (VM)
The problem with VM is that it works very well for simple scenes with a central object infront a uniform background. In this case, extracting matched feature points with a high degree of accuracy from the scene is simple as these points are used to compute the fundamental matrix. On the other hand, VM fails for real-world scenes as the matched feature points task becomes a more challenging task.
4.5. Multiview Motion Estimation (MVME)
5. Iterative Multiview Side Information (IMSI)
Initially, the reconstruction process of DVC is described in this section. Then, IMSI is introduced.
5.1. DVC Reconstruction
This stage in the decoding process is opposite to the quantization step at the encoder. After turbo decoding, the decoder knows perfectly the quantization bin of each decoded band. Relying on the assumption that the WZ frame is correlated with the SI, the reconstruction block uses the SI along with decoded bins to improve the reconstruction quality as described in [11]. The principal consists in either accepting an SI value as a reconstructed value if it fits into the quantization interval corresponding to the decoded bin or truncating the SI value into this quantization interval. The reconstruction is performed independently for every transform coefficient of every band.
For the AC bands, the reconstructed value is computed in a similar way. The only difference is that a quantizer with a dead zone is used for the AC coefficients as they take positive and negative values. On the other hand, the DC coefficient takes only positive value.
5.2. IMSI for Enhanced Reconstruction
- (i)
First, the initial SI to use in the WZ frame decoding is chosen depending on the nature of the video. This is done by computing the average luma variation per pixel between the key frames at the decoder, which is compared to a threshold. If it is below the threshold, the motion is considered not significant and MCTI is used as the initial SI. Otherwise, MVME is taken as initial SI. This is motivated by the results presented further in Section 6.2. Namely, MCTI shows better estimation quality for low-motion video content. On the other hand, MVME is shown to have a better performance for video with significant motion.
- (ii)
WZ decoding is performed using the initial SI, which implies turbo decoding followed by a first reconstruction stage.
- (iii)
The decoded WZ frame from the first stage is then predicted by block-based motion search and compensation as in conventional video coding using four references: the previous, forward, left camera, and right camera frames. More specifically, for each block in the decoded frame, the best matching block with minimum distortion is selected using the square absolute difference (SAD) as the distortion metric as shown in Figure 21. This generates a final SI.
- (iv)
Finally, the final SI is used in a second iteration in the reconstruction block.
The improvement for low-motion video is negligible as both side information, initial and final, are close in terms of estimation quality.
IMSI generates a better estimation of the WZ frame than the initial SI, since it uses the decoded WZ frame from the first iteration to compute the estimation. On the other hand, the price to pay for this good estimation is the initial WZ rate spent to initially decode the WZ frame. In addition, there is an increase in the decoder complexity due to the additional motion search task.
6. Simulation Results
6.1. Test Material and Evaluation Methodology
- (i)
Only luminance data is coded.
- (ii)
The central camera is the only one containing WZ frames. The side cameras (i.e., left and right) are conventionally encoded in the intramode, while the central one contains WZ frames, as depicted in Figure 10.
- (iii)
Each element of the matrices corresponds to the number of quantization levels to the corresponding coefficient band. For example, the DC coefficient has 32, 32, 64, and 128 quantization levels, respectively, in the 1st, 2nd, 3rd, and 4th RD points, and so on.
- (iv)
The same quantization parameter (QP) is used for the side cameras and the key frames of the central camera. A QP is defined per quantization matrix such that the decoded key and WZ frames have a similar quality.
- (v)
The GOP size is equal to 2.
- (a)
Intra, Inter No Motion, and Inter Motion modes. For the Inter No Motion mode, each motion vector is equal to zero, which means that each block in a P frame is predicted from the colocated block in the previous I frame. For the Inter Motion mode, the motion search range is set to 32. In both modes, the GOP size is equal to 12;
- (b)
high profile with CABAC;
- (c)
6.2. Side Information Estimation Quality
In this section, the SI PSNR is evaluated for the SI techniques at the different RD points. Uli is not provided with depth maps. In addition, the feature point matching performs poorly due to highly textured scene background in the sequence. For this reason, the VSP and VM techniques are not evaluated for Uli.
For Ballet, MVME produces the best SI quality followed by MCTI. Ballet contains motion but it is less significant than in the Breakdancers case. This explains the increase in PSNR gap between MCTI and the other SI techniques. As for Breakdancers, we have homography followed by DCVP, then VM, and finally VSP in a decreasing order in terms of SI quality.
Since Uli contains little motion, we expect MCTI and MVME to work very well, since MCTI performs a pure temporal interpolation and MVME performs an intercamera disparity estimation followed by a temporal motion estimation.
In summary, we can see clearly that MVME and MCTI produce by far better estimations than other SI generation techniques for Ballet and Uli. On the other hand, MVME, MCTI, homography, and DCVP are not very far from each other in terms of SI quality for Breakdancers.
For the three sequences, homography-based SI is the one that brings most innovations to the GT fusion as it is the least correlated SI with MCTI. Therefore, we can conclude that possible fusion algorithms combining MCTI and homography-based SI represent a good tradeoff between performance improvement and complexity increase.
6.3. Side Information Complexity
The different techniques complexities are compared in terms of the total number of arithmetic operations (i.e., additions, subtractions, multiplications, and divisions) required to generate the side information. The image dimensions are the height, H, and the width, W. For the block-based methods, a search range and block size are considered.
6.3.1. MCTI and DCVP
Both MCTI and DCVP have the same complexity. The only difference between both techniques is the input frames. For each block match, subtractions are required. Then, the error is computed, which requires additions. This is performed for each position within the search range. Thus, operations are required to find a match for each block. Finally, all the blocks should be processed. Therefore, is the number of operations required to estimate the motion between the two frames.
6.3.2. MVME
There is a maximum of 8 paths. For each one, motion estimation is performed twice with the Intracamera and then across the side and the central cameras. Therefore, operations are required for each path. Thus, a total of operations is required for all the paths. In other words, MVME is approximately 16 times more complex than MCTI.
6.3.3. Homography
Initially, the homography matrices are computed offline. A total of 15 operations is required to compute the mapping for each pixel using the homography matrix. Therefore, the complexity of the homography-based side information generation from both view is
6.3.4. VM
In VM, both side frames are warped, which requires operations. Then, the resulting warped frames are morphed across the virtual camera position. The latter needs operations. Finally, the morphed frame is unwarped to obtain the side information. Therefore, the total complexity is operations.
6.3.5. VSP
For each pixel, the projection from the image plane to the 3D world coordinates requires 38 operations. Moreover, the projection back to the central camera requires 23 operations. This is performed for each pixel, which results in a total complexity of It important to mention that this estimation does not take into account the depth estimation. This complexity applies given that the depth map is already available.
6.3.6. IMSI
The complexity of IMSI depends on the initial SI used, which is either MVME or MCTI. Then, the final SI generations requires operations. This implies a maximum complexity of when MVME is used as the initial SI.
6.4. RD Performance
In this section, the RD plots for the different sequences are presented for the different side information. It is important to mention that only SI with a significant RD performance is presented. Therefore, the performance of VM and VSP is not plotted for Breakdancers and Ballet. For Uli, only IMSI, MCTI, and MVME are plotted as they significantly outperform the other side information. On the other hand, the GT fusion combines all the side information even the ones that are not plotted.
Next, the GT fusion, IMSI, and the fusion techniques introduced in [12, 16], combining MCTI and homography (i.e., the least correlated side information), are compared to AVC/H.264 Intra, Inter No Motion, and Inter Motion. The choice of the Intra and Inter No Motion modes is motivated by the fact they are very close to DVC in terms of encoding complexity. In addition, the DSC theorems state that the performance of a codec that performs joint encoding and decoding (i.e., Inter Motion Mode) should also be achievable (asymptotically) by a DVC codec.
For Ballet, IMSI is superior to AVC/H.264 Intra by around 1.0 dB, and significantly outperformed by AVC/H.264 Inter No Motion and Inter Motion. Both fusions in this case improve the performance over IMSI. More specifically, the decoder-driven fusion improvement is around 0.25 dB. Moreover, the encoder-driven fusion improves the performance even further especially at low and average bit rates by a maximum gap of around 1.0 dB.
For Uli, IMSI, which is similar to MCTI in performance, improves the performance over AVC/H.264 Intra by around 3.0 dB. Moreover, it has a poorer performance than AVC/H.264 Inter No Motion and Inter Motion. The fusions do not result in any improvements as the decision is always made in favor of MCTI for the decoder-driven fusion. In other words, performing the fusion in this case is useless for Uli. For the encoder-driven fusion, the improvement in SI estimation quality is insignificant, and since additional rate is spent to send the binary mask, the overall performance drops below MCTI.
Overall, the performance of DVC is superior to AVC/H.264 Intra for two sequences out of three. On the other hand, it has a poorer performance than AVC/H.264 Inter Inter No Motion and Inter Motion for all the sequences, even with the GT fusion. Concerning DVC, IMSI is better for video content with very significant motion occupying a large part of the scene. MCTI is suitable for more or less static video content as it generates highly correlated SI with the WZ frame, resulting in superior compression efficiency than intraconventional coding, but inferior to conventional intercoding. For video with average motion, the encoder driven fusion produces the best performance for the DVC compression. Finally, the GT fusion shows that there still a large gap for improvement as it reduces the bit rate for DVC up to 50% for video with significant motion with respect to MCTI.
7. Conclusion
In this work, different SI generation techniques are studied for multiview DVC. For video with significant motion, the proposed IMSI significantly improves the performance over other SI techniques. It is followed by MVME and then MCTI. On the other hand, IMSI is more complex than MVME, which is much more complex than MCTI. For videos with average and low motion, MCTI and MVME improve the RD performance over AVC/H.264 Intra. Nevertheless, MCTI has the advantage of having a similar or better RD performance and being less complex than MVME in this case.
Further, we show that it is possible to reduce up to the bit rate with respect to monoview DVC (i.e., MCTI) with the GT fusion. Nevertheless, the GT fusion requires the original video at the decoder, which is not feasible but it shows the maximum possible gain when the different SIs are ideally combined. It shows as well that MCTI, MVME, and DCVP generate highly correlated side information since they belong to the same block-based category techniques. On the other hand, MCTI and homography represent a good tradeoff between performance improvement and complexity increase. Moreover, fusion techniques combining these two side information show significant improvement for video with high motion.
Many improvements are possible over this work. Initially, a better fusion algorithm should be found to exploit the combination of the different side information without needing the original frame and close the gap on the GT fusion. Moreover, fusion between MCTI and homography should be considered as they produce the least-correlated side information, and represent a good tradeoff between performance improvement and complexity increase.
Further, the MVME technique is very complex. Therefore, the complexity of this technique can be reduced by using fast motion search techniques such as a multigrid [27] approach instead of a fixed block size in addition to an N-step [28] search instead of a full search.
Finally, the additional complexity in the IMSI technique can be significantly reduced by selecting the blocks for which the reestimation is performed as defined in [25]. More specifically, a block is reestimated in the final SI if the residual error between the initially decoded WZ frame and the initial SI is greater than a certain threshold for this block. Otherwise, the block from the initial SI is just copied into the final SI.
Declarations
Acknowledgments
This work was partially supported by the European project Discover (http://www.discoverdvc.org) (IST Contract 015314) and the European Network of Excellence VISNET II (http://www.visnet-noe.org) (IST Contract 1-038398), both funded under the European Commission IST 6th Framework Program. The authors also would like to acknowledge the use of the DISCOVER codec, a software which started from the IST WZ software developed at the Image Group from Instituto Superior Técnico (IST) of Lisbon by Catarina Brites, João Ascenso, and Fernando Pereira.
Authors’ Affiliations
References
- Free Viewpoint Television (FTV) http://www.tanimoto.nuee.nagoya-u.ac.jp/study/FTV
- Girod B, Aaron AM, Rane S, Rebollo-Monedero D: Distributed video coding. Proceedings of the IEEE 2005,93(1):71-83.View ArticleGoogle Scholar
- Wiegand T, Sullivan GJ, Bjøntegaard G, Luthra A: Overview of the H.264/AVC video coding standard. IEEE Transactions on Circuits and Systems for Video Technology 2003,13(7):560-576.View ArticleGoogle Scholar
- Slepian D, Wolf J: Noiseless coding of correlated information sources. IEEE Transactions on Information Theory 1973,19(4):471-480. 10.1109/TIT.1973.1055037View ArticleMathSciNetMATHGoogle Scholar
- Wyner A, Ziv J: The rate-distortion function for source coding with side information at the decoder. IEEE Transactions on Information Theory 1976,22(1):1-10. 10.1109/TIT.1976.1055508View ArticleMathSciNetMATHGoogle Scholar
- Artigas X, Ascenso J, Dalai M, Klomp S, Kubasov D, Ouaret M: The DISCOVER codec: architecture, techniques and evaluation. Proceedings of the Picture Coding Symposium (PCS '07), November 2007, Lisbon, PortugalGoogle Scholar
- Malvar HS, Hallapuro A, Karczewicz M, Kerofsky L: Low-complexity transform and quantization in H.264/AVC. IEEE Transactions on Circuits and Systems for Video Technology 2003,13(7):598-603. 10.1109/TCSVT.2003.814964View ArticleGoogle Scholar
- Berrou C, Glavieux A, Thitimajshima P: Near Shannon limit error-correcting coding and decoding: turbo-codes.1. Proceedings of the IEEE International Conference on Communications (ICC '93), May 1993, Geneva, Switzerland 2: 1064-1070.View ArticleGoogle Scholar
- Peterson WW, Brown DT: Cyclic codes for error detection. Proceedings of the IRE 1961,49(1):228-235.View ArticleMathSciNetGoogle Scholar
- Ascenso J, Brites C, Pereira F: Improving frame interpolation with spatial motion smoothing for pixel domain distributed video coding. Proceedings of the 5th EURASIP Conference on Speech and Image Processing, Multimedia Communications and Services, July 2005, Smolenice, SlovakGoogle Scholar
- Aaron A, Zhang R, Girod B: Wyner-ziv coding for motion video. Proceedings of the 36th Asilomar Conference on Signals, Systems and Computers, November 2002, Pacific Grove, Calif, USAGoogle Scholar
- Ouaret M, Dufaux F, Ebrahimi T: Fusion-based multiview distributed video coding. Proceedings of the 4th ACM International Workshop on Video Surveillance and Sensor Networks (VSSN '06), October 2006, Santa Barbara, Calif, USA 139-144.View ArticleGoogle Scholar
- Artigas X, Angeli E, Torres L: Side information generation for multiview distributed video coding using a fusion approach. Proceedings of the 7th Nordic Signal Processing Symposium (NORSIG '06), June 2007, Reykjavik, Iceland 250-253.Google Scholar
- Guo X, Lu Y, Wu F, Gao W, Li S: Distributed multi-view video coding. Visual Communications and Image Processing (VCIP), January 2006, San Jose, Calif, USA, Proceedings of SPIE 6077:Google Scholar
- Guo X, Lu Y, Wu F, Zhao D, Gao W: Wyner-ziv-based multiview video coding. IEEE Transactions on Circuits and Systems for Video Technology 2008,18(6):713-724.View ArticleGoogle Scholar
- Ouaret M, Dufaux F, Ebrahimi T: Multiview distributed video coding with encoder driven fusion. Proceedings of the European Conference on Signal Processing (EUSIPCO '07), September 2007, Poznan, PolandGoogle Scholar
- Joint Bi-Level Image Experts Group http://www.jpeg.org/jbig
- Flierl M, Girod B: Coding of multi-view image sequences with video sensors. Proceedings of the International Conference on Image Processing (ICIP '06), October 2006, Atlanta, Ga, USA 609-612.Google Scholar
- Flierl M, Girod B: Video coding with motion-compensated lifted wavelet transforms. Signal Processing: Image Communication 2004,19(7):561-575. 10.1016/j.image.2004.05.002Google Scholar
- Dufaux F, Ouaret M, Ebrahimi T: Recent advances in multiview distributed video coding. Mobile Multimedia/Image Processing for Military and Security Applications, April 2007, Orlando, Fla, USA, Proceedings of SPIE 6579: 1-11.Google Scholar
- Dufaux F, Konrad J: Efficient, robust, and fast global motion estimation for video coding. IEEE Transactions on Image Processing 2000,9(3):497-501. 10.1109/83.826785View ArticleGoogle Scholar
- Martinian E, Behrens A, Xin J, Vetro A: View synthesis for multiview video compression. Proceedings of the 25th Picture Coding Symposium (PCS '06), April 2006, Beijing, ChinaGoogle Scholar
- Seitz SM, Dyer CR: View morphing. Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH '96), August 1996, New Orleans, La, USA 21-30.View ArticleGoogle Scholar
- Artigas X, Tarres F, Torres L: Comparison of different side information generation methods for multiview distributed video coding. Proceedings of the International Conference on Signal Processing and Multimedia Applications (SIGMAP '07), July 2007, Barcelona, SpainGoogle Scholar
- Ye S, Ouaret M, Dufaux F, Ebrahimi T: Improved side information generation with iterative decoding and frame interpolation for distributed video coding. Proceedings of the 15th International Conference on Image Processing (ICIP '08), October 2008, San Deigo, Calif, USA 2228-2231.Google Scholar
- AVC/H.264 software http://iphome.hhi.de/suehring/tml
- Dufaux F: Multigrid Block Matching Motion Estimation for Generic Video Coding, Ph.D. thesis. Ecole Polytechnique Federale de Lausanne, Lausanne, Switzerland; 1994.Google Scholar
- Coban MZ, Mersereau RM: Fast rate-constrained N-step search algorithm for motion estimation. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '98), May 1998, Seattle, Wash, USA 5: 2613-2616.Google Scholar
Copyright
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.