Open Access

Unequal Error Protection Techniques Based on Wyner-Ziv Coding

EURASIP Journal on Image and Video Processing 2009, 2009:474689

DOI: 10.1155/2009/474689

Received: 31 May 2008

Accepted: 17 March 2009

Published: 7 June 2009

Abstract

Compressed video is very sensitive to channel errors. A few bit losses can stop the entire decoding process. Therefore, protecting compressed video is always necessary for reliable visual communications. Utilizing unequal error protection schemes that assign different protection levels to the different elements in a compressed video stream is an efficient and effective way to combat channel errors. Three such schemes, based on Wyner-Ziv coding, are described herein. These schemes independently provide different protection levels to motion information and the transform coefficients produced by an H.264/AVC encoder. One method adapts the protection levels to the content of each frame, while another utilizes feedback regarding the latest channel packet loss rate to adjust the protection levels. All three methods demonstrate superior error resilience compared with equal error protection in the face of packet losses.

1. Introduction

Channel errors can result in serious loss of decoded video quality. Many error resilience and concealment schemes have been proposed [1]. However, when large errors occur, most of the proposed techniques are not sufficient to recover the loss. In recent years, error resilience approaches employing Wyner-Ziv lossy coding theory [2] have been developed and have resulted in improvement in the visual quality of the decoded frames [3-13]. Other works applying distributed source coding to error resilience include [14-17].

In 1976, Wyner and Ziv proved that when the side information is known only to the decoder, the minimum required source coding rate is greater than or equal to the rate when the side information is available at both the encoder and the decoder (see Figure 1). Denote the source data by X and the side information by Y, where X and Y are correlated but Y is available only at the decoder; the decoder manages to reconstruct a version X̂ of X, subject to the constraint that at most a distortion D is incurred. It was shown that R_WZ(D) ≥ R_X|Y(D) [2], where R_WZ(D) is the data rate used when the side information is available only to the decoder and R_X|Y(D) represents the data rate required when the side information is available at both the encoder and the decoder.
Figure 1

Side information available at decoder only.

Wyner and Ziv also proved that equality can be achieved when X is a Gaussian memoryless source and the distortion measure is the mean squared error E[(X − X̂)²], as well as when the source data X is the sum of an arbitrarily distributed side information Y and independent Gaussian noise N, that is, X = Y + N. In addition, they derived the rate boundary R_WZ(D) = (1/2) log(σ_N²/D), achievable for 0 < D ≤ σ_N², where σ_N² and σ_X² are the variances of the Gaussian noise and the source data [2].
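This Gaussian special case is easy to check numerically. The sketch below assumes the standard closed form R(D) = (1/2) log2(σ_N²/D) for X = Y + N with decoder-only side information and mean-square-error distortion; the function name and the example values are illustrative, not taken from the paper.

```python
import math

def wz_rate_gaussian(noise_var: float, distortion: float) -> float:
    """Wyner-Ziv rate (bits/sample) for X = Y + N with Gaussian noise N
    and MSE distortion D, valid for 0 < D <= noise_var. In this regime
    the rate with decoder-only side information equals the conditional
    rate R_{X|Y}(D) = 0.5 * log2(noise_var / D)."""
    if not 0.0 < distortion <= noise_var:
        raise ValueError("distortion must lie in (0, noise_var]")
    return 0.5 * math.log2(noise_var / distortion)

# Halving the tolerated distortion costs exactly half a bit per sample.
r1 = wz_rate_gaussian(noise_var=1.0, distortion=0.5)
r2 = wz_rate_gaussian(noise_var=1.0, distortion=0.25)
print(r1, r2)  # 0.5 and 1.0
```

Note that the rate depends on the side information only through the conditional (noise) variance, not on the distribution of Y itself.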

One of the earliest works applying Wyner-Ziv lossy coding theory to error resilient video transmission was proposed in [3] in 2003. The general approach is to use an independent Wyner-Ziv codec (as shown in Figure 3) to protect a coarse version of the input video sequence, which can be decoded together with the side information from the primary MPEG-x/H.26x decoder. The basic system structure is shown in Figure 2. The approach proposed in [3] is known as systematic lossy forward error protection (SLEP).
Figure 2

Error resilient video streaming using Wyner-Ziv coding.

Figure 3

Wyner-Ziv codec.

SLEP, in addition to an MPEG-2 encoder, uses a Wyner-Ziv encoder made up of a coarse quantizer and a lossless Slepian-Wolf encoder that utilizes Turbo coding. The input to the Wyner-Ziv encoder consists of the reconstructed frames obtained from the MPEG-2 encoder. These are first coarsely quantized and then passed to a Turbo encoder [18, 19], which outputs selected parity bits. At the receiving end, a Turbo decoder uses the output of the MPEG-2 decoder, as side information, together with the received parity bits to recover the lost video data. In the absence of any channel errors, the output of the SLEP decoder is the same as that of the MPEG-2 decoder. If, however, channel errors corrupt the MPEG-2 stream, then SLEP attempts to reconstruct a coarse version of the MPEG-2 stream via the received parity bits, which may themselves have been corrupted. The quality of the reconstructed version depends on the quantization step used by the coarse quantizer as well as on the strength of the Turbo code.

Improvements to SLEP have been proposed in [9, 12], and have resulted in a lower data rate for Wyner-Ziv coding as well as improved decoded video quality. It is noted that the SLEP method has been applied to H.264 in [12].

Another approach to using Wyner-Ziv coding for robust video transmission was proposed in [20], in which the Wyner-Ziv encoder consists of a discrete cosine transform, a scalar quantizer, and an irregular repeat-accumulate code as the Slepian-Wolf coder.

Our approach to unequal error protection is also based on Wyner-Ziv coding and is motivated by the SLEP approach. The overall goal of our schemes is to correct errors in each frame by protecting motion information and the transform coefficients. The primary codec is an H.264/AVC codec and the Wyner-Ziv codec utilizes coarse quantization and a Turbo codec. Instead of protecting everything associated with the coarsely reconstructed frames, we separately protect the motion information and the transform coefficients produced by the primary H.264 encoder. The idea is that since the loss of motion information impacts the quality of decoded video differently from the loss of transform coefficients, the two should receive unequal levels of protection that are commensurate with their respective contributions to the quality of the video reconstructed by the decoder [21]. The motion information is protected via Turbo coding whereas the transform coefficients are protected via Wyner-Ziv coding. This approach is referred to as unequal error protection using Pseudo Wyner-Ziv (UEPWZ) coding.

We improve the performance of our unequal error protection technique by adapting the parity data rates used for protecting the video information to the content of each frame. This is referred to as content adaptive unequal error protection (CAUEP) [22]. In this scheme, a content adaptive function evaluates the normalized sum of absolute differences (SAD) between the reconstructed frames and the predicted frames. Depending on pre-selected thresholds, the parity data rates assigned to the motion information and the transform coefficients are varied for each frame. This results in a more effective and flexible error resilience technique with improved performance compared to the original UEPWZ.

Another approach to improving the proposed unequal error protection is to send feedback regarding the current channel packet loss rates to the Pseudo Wyner-Ziv encoder, in order to correspondingly adjust the number of parity bits needed for correcting the corrupted slices at the decoder [23]. This approach is referred to as feedback aided unequal error protection (FBUEP). At the decoder, the current packet loss rate is estimated based on the received data and sent back to the Pseudo Wyner-Ziv encoder via the real-time transport control protocol (RTCP) feedback mechanism. This information is utilized by the Turbo encoders to update the parity data rates of the motion information and the transform coefficients, which are still protected independently. At the Wyner-Ziv decoder, the received parity bits together with the side information from the primary decoder are used to decode and restore corrupted slices. These in turn are sent back to the primary decoder to replace their corrupted counterparts. It is to be noted that simply increasing the parity bits when the packet loss rate increases is not advisable, since it will exacerbate network congestion [24]. Instead, the total transmission data rate should be kept constant, which means that when the packet loss rate increases, the primary data transmission rate should be lowered in order to spare more bits for parity transmission.

Our proposed error resilience schemes aim to improve both the rate distortion performance as well as the visual quality of the decoded video frames when video has been streamed over data networks such as wireless networks that experience high packet losses. In our experiments, we only consider packet erasures whether due to network congestion or uncorrected bit errors. The main focus of our scheme is for applications such as video conferencing, especially in a wireless network scenario, where serious packet losses will result in unpleasant distortion during real time video streaming.

In this paper, UEPWZ is described in Section 2, and the details of CAUEP and FBUEP as well as the improvement in performance achieved are presented in Section 3. The experimental results of the three techniques are compared and analyzed in Section 4, showing the significant improvement the CAUEP and the FBUEP achieved in rate distortion performance and the visual quality of the decoded frames. Finally, the conclusion is provided in Section 5.

2. Unequal Error Protection Based on Wyner-Ziv Coding

As mentioned previously, the approach to unequal error protection undertaken here is based on Wyner-Ziv coding and is motivated by the SLEP approach. The primary codec is an H.264/AVC codec and the Wyner-Ziv codec utilizes coarse quantization and two pairs of Turbo codecs. Instead of protecting everything associated with the coarsely reconstructed frames, we separately protect the motion information and the transform coefficients produced by the primary H.264 encoder. The idea is that since the loss of motion information impacts the quality of decoded video differently from the loss of transform coefficients, the two should receive unequal levels of protection that are commensurate with their respective contributions to the quality of the video reconstructed by the decoder [21]. The block diagram depicting the unequal error protection system is shown in Figure 4.
Figure 4

Unequal error protection based on Wyner-Ziv coding.

In H.264/AVC, a block in an I frame is predicted from its previously decoded neighbors; nine intra prediction modes are available for 4×4 luma blocks and four modes for 16×16 blocks [25, 26]. The mode index and the transform coefficients are critical for proper frame reconstruction at the decoder. In the case of P and B frames, the H.264/AVC standard allows the encoder the flexibility to choose among different reference frames and block sizes for motion prediction. In particular, the standard permits block sizes of 16×16, 16×8, 8×16, 8×8, 8×4, 4×8, and 4×4. Since motion vectors belonging to neighboring blocks are highly correlated, motion vector differences (MVD) are encoded and transmitted to the decoder, together with the reference frame index, mode information, and the residual transform coefficients.

In the unequal error protection scheme, the important video information is protected through the Pseudo Wyner-Ziv coder. For I frames, the mode information (MI) as well as the transform coefficients are protected, whereas for P and B frames the motion vector differences, mode information, and reference frame index (RI) are protected. These are scanned and used to create long symbol blocks that are sent to the Turbo encoder.

In order to mitigate the mismatch between the transform coefficients input to the Wyner-Ziv encoder and the corresponding side information at the Wyner-Ziv decoder, an inverse quantizer, identical to the one used in the H.264/AVC decoder, is first used to de-quantize the coefficients. These are then coarsely quantized by a uniform scalar quantizer, whose number of levels, and hence quantization step size, is a design parameter, and used to form a block of symbols that is passed to the Turbo encoder. In all cases, the output of the Turbo encoder is punctured to reduce the overall data rate.
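As a concrete illustration of this coarse quantization step, the sketch below maps de-quantized coefficients to symbol indices for the Turbo encoder. The number of levels and the clipping range are hypothetical parameters, since their values in the paper are not reproduced here.

```python
import numpy as np

def coarse_quantize(coeffs: np.ndarray, num_levels: int, max_abs: float):
    """Uniform scalar quantization of de-quantized transform coefficients.
    Returns the integer symbol indices (the input to the Turbo encoder)
    and the reconstruction values. num_levels and max_abs are
    illustrative design parameters."""
    step = 2.0 * max_abs / num_levels          # quantization step size
    idx = np.clip(np.floor(coeffs / step) + num_levels // 2,
                  0, num_levels - 1).astype(int)
    recon = (idx - num_levels // 2 + 0.5) * step
    return idx, recon

coeffs = np.array([-3.7, -0.2, 0.0, 1.4, 3.9])
idx, recon = coarse_quantize(coeffs, num_levels=8, max_abs=4.0)
print(idx)  # symbol indices in [0, 7]
```

A coarser quantizer (fewer levels) lowers the Wyner-Ziv data rate but raises the maximum reconstruction error when the coarse version must be used.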

Because maintaining its accuracy is important, the motion information is not quantized. Instead, the Turbo encoder takes in the motion information directly and outputs the selected parity bits. Note that without quantization, Turbo coding the motion information is not, strictly speaking, Wyner-Ziv coding. We therefore call the whole secondary encoder a Pseudo Wyner-Ziv encoder, and we refer to this scheme as unequal error protection using Pseudo Wyner-Ziv coding (UEPWZ). The application of Turbo coding in our schemes nevertheless differs from straightforward error control coding: only the parity bits p produced by the Turbo encoder are transmitted to the decoder, while the systematic output stream u from the first branch is not transmitted. This is illustrated in Figure 5. The corresponding error-prone primary video data decoded by the H.264 decoder is used in place of u to decode the parity bits received by the Turbo decoders.
Figure 5

Parallel turbo encoder.

Because of the independent processing of the motion data and the transform coefficients in the Pseudo Wyner-Ziv encoder, the parity data rates in the corresponding Turbo encoder can be assigned separately.

The Turbo encoder we used consists of two identical recursive systematic convolutional encoders (see Figure 5) [27], sharing the same generator function. The input symbols sent to the second recursive encoder are first interleaved by a permuter. A puncturing mechanism deletes some of the parity bits output by the two recursive encoders in order to meet a target parity data rate. Only parity bits are transmitted to the decoder; the first (systematic) branch of data, indicated by the dashed line in Figure 5, is not transmitted. The error correction capability of the Turbo coder also depends on the length of the symbol blocks. In our scheme, the symbol block length corresponds to a frame rather than a slice. For the transform coefficients, the symbol block length is 25344 for a QCIF sequence (176 × 144 pixels). The motion vectors are obtained for each 4 × 4 block, which gives a symbol block length of 3168. The experimental results show that the Turbo encoder maintains strong error correction ability at these symbol block lengths.
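The recursive systematic encoding and puncturing described above can be sketched as follows. The (7, 5)-octal generator pair, the permuter seed, and the puncturing period are illustrative placeholders; the paper's actual generator function and puncturing patterns are not reproduced here.

```python
import random

def rsc_parity(bits):
    """Parity stream of a rate-1/2 recursive systematic convolutional
    encoder with (feedback, feedforward) = (1 + D + D^2, 1 + D^2),
    the classic (7, 5) octal pair -- an illustrative choice."""
    s1 = s2 = 0
    out = []
    for u in bits:
        a = u ^ s1 ^ s2        # recursive feedback bit
        out.append(a ^ s2)     # feedforward tap pattern 1 + D^2
        s1, s2 = a, s1
    return out

def puncture(parity, keep_every):
    """Keep one parity bit out of every `keep_every` to meet a target
    parity data rate (an illustrative regular puncturing pattern)."""
    return parity[::keep_every]

# Two component encoders: the second sees an interleaved copy of the
# input; only the punctured parity streams would be transmitted.
msg = [1, 0, 1, 1, 0, 0, 1, 0]
perm = list(range(len(msg)))
random.Random(0).shuffle(perm)              # fixed pseudo-random permuter
p1 = puncture(rsc_parity(msg), 2)
p2 = puncture(rsc_parity([msg[i] for i in perm]), 2)
print(p1, p2)
```

The systematic bits never leave the encoder; at the receiver the (error-prone) H.264 output plays their role, which is what makes the construction a Slepian-Wolf style coder rather than conventional FEC.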

The Turbo decoder utilizes the received parity bits and the side information from the H.264/AVC decoder to perform iterative decoding using two BCJR-MAP decoders [27]. The error-corrected information is then sent back to the H.264/AVC decoder to replace the corrupted data. In this process, the decoded error-prone transform coefficients are first sent to a coarse quantizer identical to the one used at the Pseudo Wyner-Ziv encoder side. The reason is that, at the encoder, a coarse version of the transform coefficients is Turbo encoded in order to save data rate, and only the output parity bits are transmitted; the video data u output from the Turbo encoder is not transmitted. Instead, the H.264-decoded transform coefficients stand in for u and, together with the received parity bits of the Turbo-encoded coarse-version transform coefficients, are used to decode the error-corrected coarse version of the transform coefficients.

When using the real-time transport protocol (RTP), packet loss can easily be inferred at the decoder by checking the sequence number field in the RTP headers. Wyner-Ziv decoding is performed only when the decoder detects packet losses. When no packets are lost, the H.264-decoded transform coefficients are used for decoding the residual frames; when packets are lost, the coarser version of the transform coefficients decoded by the Turbo decoder is used to limit the maximum degradation that can occur. In a parallel process, the error-corrupted motion information received by the H.264/AVC decoder is sent directly to the corresponding Turbo decoder, together with the corresponding received parity bits, to decode the error-corrected motion information. This is then sent back to the H.264/AVC decoder to replace the error-corrupted motion information. The reconstructed frames can further be used as reference frames in the subsequent decoding process.
Therefore, the final version of the decoded video sequence is obtained from the error-corrected motion information and transform coefficients, which results in good-quality decoded frames, as shown in Section 4. However, in the case of serious channel loss and/or a limited data rate available for error protection, the Pseudo Wyner-Ziv coder might not be strong enough to recover all the lost video information, and there is no fallback mechanism in place to guarantee correct Turbo decoding. Here UEPWZ has the advantage of allocating different protection levels to the protected video data elements depending on their overall impact on the decoded video sequence. The experiments showed that by assigning unequal data rates for protecting the motion information and the transform coefficients, the rate distortion performance can be improved compared to the equal parity data rate allocation case.

3. Improved Unequal Error Protection Techniques

In this section, the two approaches developed to improve the UEPWZ technique are introduced in detail. Content adaptive unequal error protection (CAUEP) improves UEPWZ from the encoder side by analyzing the content of each frame, while feedback aided unequal error protection (FBUEP) utilizes channel loss information conveyed from the H.264 decoder side. Each approach improves the original UEPWZ in a different aspect, resulting in more efficient data rate allocation and a significant improvement in the visual quality of the decoded frames.

3.1. Content-Adaptive Unequal Error Protection

In UEPWZ, the parity data rates for Turbo coding the motion information and the transform coefficients are set in advance and fixed throughout. However, different parts of a video sequence may require different amounts of protection for the corresponding video data elements. The amount of motion contained in each frame may change over time: parts of the sequence may contain a large amount of motion while other parts contain only slow motion. For such video sequences, fixed parity data rate assignment may result in inefficient error protection: when the motion content increases, the pre-assigned parity data rate may become insufficient to correct the errors, while redundant parity bits may be sent when the motion content decreases.

The goal of developing an efficient error resilience technique is to make the algorithm applicable to all types of video sequences. Therefore, a function needs to be embedded in the Wyner-Ziv coder to analyze the video content, such as the amount of motion, in each frame. CAUEP improves UEPWZ by adapting the protection levels of the different video data elements to the content of each frame.

In order to achieve this goal, a content adaptive function (CAF) that utilizes the normalized sum of absolute differences (SAD) between each reconstructed frame and its predicted counterpart is used. This is given by SAD_n = (1/N) Σ_(i,j) |r_n(i,j) − p_n(i,j)|, where r_n(i,j) denotes the reconstructed pixel value at position (i,j), p_n(i,j) is the value of the predicted pixel at position (i,j), N is the number of pixels in the frame, and SAD_n represents the normalized total SAD of the nth frame in the sequence.
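A minimal sketch of this content adaptive function, assuming per-pixel normalization of the SAD (the paper's exact normalization constant is not reproduced):

```python
import numpy as np

def normalized_sad(recon: np.ndarray, pred: np.ndarray) -> float:
    """Content adaptive function: sum of absolute differences between a
    reconstructed frame and its predicted counterpart, normalized by
    the number of pixels."""
    assert recon.shape == pred.shape
    diff = np.abs(recon.astype(np.int32) - pred.astype(np.int32))
    return float(diff.sum() / recon.size)

recon = np.full((144, 176), 120, dtype=np.uint8)   # QCIF-sized frame
pred = recon.copy()
pred[:, :88] += 4                                   # half the frame differs by 4
print(normalized_sad(recon, pred))  # 2.0
```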

The SAD of each frame is compared to three pre-defined thresholds T1, T2, and T3 in order to decide the relative importance of the motion information and the transform coefficients. The thresholds and the corresponding sets of parity data rate assignments were chosen experimentally (see Table 1). In these experiments, the normalized average SADs of different types of video sequences were analyzed under the same encoding conditions, and different thresholds were chosen for different types of video sequences based on extensive test results. The parity data rates for the different SAD ranges are not designed to add up to the same number. When the SAD is small (SAD_n < T1), the smallest number of parity bits is transmitted to the decoder; as the SAD increases, more parity bits are needed to correct the lost packets. It should also be mentioned that the threshold selection depends on the encoding data rate. The parity data rates given in Table 1 are the puncturing rates of the code words, that is, the fraction of the parity bits output from each convolutional encoder (refer to Figure 5). The experimental results in Section 4 show that, using the parity data rate allocation and threshold decisions of Table 1, content adaptive unequal error protection provides better rate distortion performance and visual quality of the decoded video sequences than our previously proposed unequal error protection. Both techniques outperform the equal error protection case and H.264 with error concealment, as shown in Section 4. However, depending on the channel conditions and sequence characteristics, perfect recovery of the lost data is not guaranteed in all cases.
The calculation of the SAD and the comparison to the thresholds are straightforward, so they add little complexity to the system. The block diagram of the system is shown in Figure 6.
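The threshold comparison can be sketched as a simple lookup. The thresholds and parity-rate pairs below are hypothetical stand-ins for the experimentally chosen values of Table 1.

```python
def select_parity_rates(sad: float, thresholds=(1.0, 3.0, 6.0),
                        pdr_table=((1/8, 1/8), (1/6, 1/8),
                                   (1/4, 1/6), (1/3, 1/4))):
    """Map a frame's normalized SAD to a (motion, coefficient) parity
    data rate pair. Thresholds T1 < T2 < T3 and the rate table are
    placeholders for the experimentally tuned values."""
    t1, t2, t3 = thresholds
    if sad < t1:
        return pdr_table[0]
    if sad < t2:
        return pdr_table[1]
    if sad < t3:
        return pdr_table[2]
    return pdr_table[3]

print(select_parity_rates(0.5))  # lowest protection for a low-motion frame
```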
Table 1

Setting of parity data rate (PDR). Columns: SAD range; PDR assignment.

Figure 6

Content adaptive unequal error protection using Wyner-Ziv coding.

3.2. Feedback Aided Unequal Error Protection

Another approach to improving the unequal error protection is to exploit feedback of the channel loss rate from the decoder side. The parity data rates assigned for Turbo encoding the protected video information can then be adjusted accordingly.

It is to be noted that data networks suffer from two types of transmission errors, namely, random bit errors due to noise in the channels and packet losses due to network congestion. When transmitting a data packet, a single uncorrected bit error in the packet header or body may result in the whole packet being discarded [28-33]. In the current work, we only consider packet losses, whether due to network congestion or uncorrected bit errors. When using the real-time transport protocol (RTP), determining which packets have been lost can easily be achieved by monitoring the sequence number field in the RTP headers [24, 34]. Therefore, the packet loss rate of each frame can easily be obtained at the decoder.
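A sketch of this loss-detection step, assuming 9 RTP packets per frame as in the experiments of Section 4; the helper name and the 16-bit sequence-number wrap handling are ours.

```python
def frame_loss_rate(received_seq, first_seq, packets_per_frame=9):
    """Infer the per-frame packet loss rate from RTP sequence numbers.
    Assumes `packets_per_frame` RTP packets per frame (9 slices per
    QCIF frame) and 16-bit wrapping sequence numbers."""
    expected = {(first_seq + i) % 65536 for i in range(packets_per_frame)}
    lost = expected - set(received_seq)
    return len(lost) / packets_per_frame

# Packets with sequence numbers 103 and 107 never arrived:
rx = [100, 101, 102, 104, 105, 106, 108]
print(frame_loss_rate(rx, first_seq=100))  # 2/9
```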

Figure 7 depicts a block diagram of the FBUEP. At the H.264/AVC encoder, each frame is divided into several slices. Both the motion information and the transform coefficients of each slice are sent to the Pseudo Wyner-Ziv encoder to be encoded independently by the two Turbo encoders. As for UEPWZ, the parity data rates allocated to protecting the different elements of the video sequence are assigned independently.
Figure 7

Feedback aided unequal error protection based on Wyner-Ziv coding.

At the decoder, the packet loss rate of each frame is evaluated based on the received video information. It is then sent back to the two Turbo encoders via the RTCP feedback packets. Depending on the channel packet loss rates conveyed, the two Turbo encoders adjust the parity data rates for encoding the motion information and the transform coefficients of the current frame.

3.2.1. RTCP Feedback

At the decoder, the channel packet loss rate is obtained based on the received data and sent back to the Pseudo Wyner-Ziv encoder. If the available bandwidth for transmitting the feedback packets is above a certain threshold, an immediate mode RTCP feedback message is sent; otherwise the early feedback RTCP mode is used [35]. The two Turbo encoders update the parity data rates for encoding the motion information and the transform coefficients based upon the received RTCP feedback conveying the packet loss rates. In this way the Pseudo Wyner-Ziv encoder attempts to adapt to the decoder's needs, while avoiding blindly sending a large number of parity bits that may not be needed when the packet loss is low or zero. In the case of a high channel packet loss rate, the Pseudo Wyner-Ziv encoder enhances the protection by allocating more data rate to the Turbo-encoded data, especially the motion information, while correspondingly decreasing the data rate used for encoding the main data stream by the H.264/AVC encoder. The total data rate is thus kept constant so as not to exacerbate possible congestion in the network.

According to the RTCP feedback profile detailed in [35], when there is sufficient bandwidth, each loss event can be reported by means of a virtually immediate RTCP feedback packet. In the RTCP immediate mode, a feedback message can be sent to the encoder for each frame. In our scheme, an initial parity data rate value is set at the beginning of video transmission. When the channel loss condition changes, the immediate mode RTCP feedback packet conveys the latest channel packet loss rate to the Turbo encoders, which adjust the parity data rate assignment for the next frame. If we let λ denote the average number of loss events to be reported per interval by a decoder, B the RTCP bandwidth fraction for our decoder, and R the average RTCP packet size, then feedback can be sent via the immediate feedback mode when
λ · R ≤ B. (1)

In the RTCP protocol profile [35], it is assumed that 5 percent of the RTP session bandwidth is available for RTCP feedback from the decoder to the encoder. This fixes the bandwidth available for transmitting the RTCP feedback for a given stream rate. Given the average RTCP packet size and the frame rate, (1) then determines whether the RTCP immediate mode can be used to send one feedback message per frame to the encoder.

When (1) is not satisfied, the available bandwidth is not sufficient for transmitting a feedback message via the immediate mode. In this case, the early RTCP mode is turned on. In this mode, the feedback message is scheduled for transmission to the encoder at the earliest possible time, although it cannot necessarily react to each packet loss event, so a feedback message received at the encoder may not reflect the latest channel loss rate. We therefore propose to send an estimated average channel packet loss rate based on the packet loss rates of the previous K frames, which gives a better estimate of the recent channel packet loss rate. This scheme is detailed in Section 3.2.2.
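The mode decision can be sketched as a direct application of the bandwidth test in (1). The symbol names and the example numbers are illustrative; only the 5% RTCP bandwidth fraction comes from the RTP/RTCP profile.

```python
def choose_feedback_mode(loss_events_per_interval, rtcp_bw_bits,
                         avg_rtcp_packet_bits):
    """Decide between immediate and early RTCP feedback mode: immediate
    mode is possible when the expected feedback traffic fits inside the
    RTCP bandwidth fraction, as in the test of (1)."""
    if loss_events_per_interval * avg_rtcp_packet_bits <= rtcp_bw_bits:
        return "immediate"
    return "early"

# Illustrative numbers: 5% of a 500 kbit/s session = 25 kbit/s of RTCP
# bandwidth; 30 reports/s of 640 bits each need 19.2 kbit/s, so the
# immediate mode fits.
print(choose_feedback_mode(30, 25_000, 640))  # immediate
```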

When the Pseudo Wyner-Ziv encoder does not receive feedback regarding the current packet loss rates (because the feedback packet was lost on its way back to the Turbo encoders, or the available bandwidth is not sufficient for immediate mode feedback), the Turbo encoders keep using the last received channel packet loss rate to decide the parity data rates for encoding the motion information and the transform coefficients of the currently encoded frame.

3.2.2. Delay Analysis

Delay must be considered when feedback is used. In our system, an RTCP feedback message is transmitted via the immediate mode if the available RTCP transmission data rate is above the threshold defined in (1). Through this mechanism the decoder reports the packet loss rate associated with each received frame to the encoder. The Pseudo Wyner-Ziv encoder then utilizes this information to select the parity data rates for encoding the motion information and the transform coefficients of the currently encoded frame.

In early feedback mode, rather than sending feedback on a frame-by-frame basis, we propose to send the feedback packets to the Pseudo Wyner-Ziv encoder every K frames (K > 1). The feedback in this case is the average channel packet loss rate P̄_m, evaluated from the history of the received video information of the past K frames, as given in (2). F_m represents the mth set of K frames received at the decoder, c_i is a counter of the number of error-corrupted slices in the ith received frame, m is incremented once every K frames, j is the index of a received slice, and each frame is assumed to be partitioned into S slices. The parity data rate assignment for Turbo encoding the motion information and the transform coefficients of the next K frames is then updated once every K frames and is therefore more resilient to the delay problem:
P̄_m = (1/(K · S)) Σ_{i ∈ F_m} c_i, (2)

c_i = Σ_{j=1}^{S} e_{i,j}, (3)

where e_{i,j} = 1 if slice j of frame i is lost or corrupted and e_{i,j} = 0 otherwise.

Furthermore, in the frame-by-frame feedback strategy, if the packet loss rate of the current decoded frame is the same as that of the previous frame, no feedback message needs to be sent back to the encoder. In the same way, if the average channel packet loss rate of the current K received frames (P̄_m) is equal to the average packet loss rate of the past K frames (P̄_{m−1}), no feedback needs to be sent back to the Pseudo Wyner-Ziv encoder. In other words, a feedback message is sent back to the encoder only when the packet loss rate has changed. There are therefore three scenarios in which no feedback is received by the Turbo encoders: the channel packet loss rate is currently constant; the feedback packet was lost on its way back to the Turbo encoders; or the available bandwidth is not sufficient for immediate mode feedback. Accordingly, the Turbo encoders update the parity data rates for encoding the motion information and the transform coefficients only when they receive updated feedback regarding the latest packet loss rate.
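The windowed averaging of (2) and the change-triggered feedback rule can be sketched together; the values of K, S, and the per-frame corruption counts below are illustrative.

```python
def average_loss_rate(corrupted_slices_per_frame, slices_per_frame):
    """Average channel packet loss rate over a window of K frames,
    following (2): total corrupted slices divided by K * S."""
    k = len(corrupted_slices_per_frame)
    return sum(corrupted_slices_per_frame) / (k * slices_per_frame)

def should_send_feedback(current_avg, previous_avg):
    """Feedback suppression: a message is sent back only when the
    average packet loss rate has changed."""
    return current_avg != previous_avg

window1 = [1, 0, 2, 0]   # corrupted slices in each of K = 4 frames
window2 = [0, 1, 1, 2]
p1 = average_loss_rate(window1, slices_per_frame=9)
p2 = average_loss_rate(window2, slices_per_frame=9)
print(p1, p2, should_send_feedback(p2, p1))
```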

3.2.3. Data Rate Assignment between Primary Encoding and the Pseudo Wyner-Ziv Encoding

When packet loss rates increase, simply increasing the parity data rates for Turbo encoding the motion information and the transform coefficients while keeping the same data rate for the primary video coding would only exacerbate channel congestion [24]. A better way is to reduce the data rate allocated to the primary video data transmission slightly and correspondingly increase the data rate allocated to the transmission of parity bits, so that the total transmission data rate is kept constant at any packet loss rate. Furthermore, more efficient use of the data rate can be achieved by assigning different protection levels to the motion data and the transform coefficients in the Pseudo Wyner-Ziv encoder at different channel packet loss rates.
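A minimal sketch of this constant-total-rate constraint. The mapping from the fed-back loss rate to the parity share is the experimentally chosen assignment of Table 2, which is not reproduced here, so the fraction is passed in directly.

```python
def split_total_rate(total_kbps, parity_fraction):
    """Keep the total transmission rate constant: as the packet loss
    rate rises, a larger fraction of the fixed budget moves from the
    primary H.264 stream to Wyner-Ziv parity bits."""
    parity = total_kbps * parity_fraction
    primary = total_kbps - parity
    return primary, parity

# Loss rate rises -> parity share grows, primary share shrinks,
# but the total stays at 400 kbit/s (an illustrative budget).
print(split_total_rate(400, 0.10))  # (360.0, 40.0)
print(split_total_rate(400, 0.25))  # (300.0, 100.0)
```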

In our scheme, the parity data rates assigned to the motion information and the transform coefficients were determined experimentally. The parity data rate settings for different ranges of channel packet loss rate were tested by extensive experiments on different video sequences. The experimental results showed that enough of the lost information can be corrected to reconstruct decoded frames of visually good quality (see Table 2).
Table 2

Parity data rate (PDR) assignment for the FBUEP method. Columns: packet loss rate; parity data rate assignment.

4. Experiments

To evaluate the proposed techniques, experiments were carried out using the H.264/AVC reference software. The frame rate and GOP structure were fixed for each sequence. In our experiments, each QCIF frame is divided into 9 slices. The primary encoded video data output from the H.264 encoder is packetized into 9 packets per frame, each containing the video information of one slice. The Turbo-encoded parity bits of the motion information and the transform coefficients corresponding to each slice are also sent in separate packets. All three types of packets are subjected to random losses over the transmission channel. We did not attempt to make all packets the same size: since the packets containing the parity bits of the motion information or the transform coefficients are much smaller than the H.264 packets, their probability of being lost over a wireless network is correspondingly smaller. All experimental results were averaged over multiple lossy-channel transmission realizations.

As mentioned in Section 3.2, data networks suffer from two types of transmission errors: random bit errors and packet drops. In our experiments, we only consider packet erasures, whether due to network congestion or to uncorrected bit errors. Lowering the total data rate to reduce network congestion is a realistic solution when the packet loss rate is very high. However, since our main application is video streaming over wireless networks, where the packet loss situation is more complicated, we did not consider it in our current experiments. Note that simply increasing the parity bits when the packet loss rate increases is not advisable, since it will exacerbate network congestion (see [20]). Instead, the total transmission data rate should be kept constant, which means that when the packet loss rate increases, the primary data transmission rate should be lowered in order to spare more bits for parity bit transmission.

In our experiments, channel packet loss is simulated using uniform random number generators. Our algorithm targets wireless network applications, in which severe packet loss can occur. In the case of wireless network transmission, the probability that a packet arrives in error is approximately proportional to its length [12]. Assume the length of the H.264 data packet is L, and the lengths of the parity bit packets containing the motion information and the transform coefficients are L_m and L_c, respectively. If the probability of losing an H.264 data packet is p, then the probabilities of losing the motion information and transform coefficient packets are p·L_m/L and p·L_c/L, respectively. This is implemented in our packet loss simulation.
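The length-proportional loss model above can be sketched as a small simulation. The function and parameter names are assumptions; only the scaling of each packet's loss probability by its length, relative to the H.264 packet, follows the text.

```python
import random

def packet_loss_events(p_h264, len_h264, len_motion, len_coef, rng=None):
    """Decide, for one slice, which of the three packets are lost.

    Loss probability is scaled in proportion to packet length: if the H.264
    packet of length len_h264 is lost with probability p_h264, a packet of
    length l is lost with probability p_h264 * l / len_h264.
    """
    rng = rng or random.random
    p_motion = p_h264 * len_motion / len_h264
    p_coef = p_h264 * len_coef / len_h264
    # One independent uniform draw per packet decides its fate.
    return (rng() < p_h264, rng() < p_motion, rng() < p_coef)
```

Because the parity packets are much shorter than the H.264 packets, their simulated loss probabilities come out proportionally smaller, matching the observation in the previous paragraph.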

In our Wyner-Ziv based schemes, different parity data rate settings have been tested extensively on different types of video sequences. For the tested sequences, the parity data rate assignments given in this paper achieved better rate-distortion performance, decoded visual quality, and overall data rate usage than other parity data rate values.

Figures 8 and 9 show the results for fixed packet losses, in which the channel packet loss rate is always fixed at for the two sequences foreman and carphone, respectively. To examine performance at a different fixed packet loss rate, the stefan.qcif sequence is used to generate the results for the 22% packet loss case, as shown in Figure 10. Note that for fixed losses FBUEP offers no advantage over UEPWZ; in fact, when both use the same parity data rates, their performance is identical. For this reason, we do not include the results of FBUEP in Figures 8, 9, and 10. For the EEPWZ and UEPWZ methods, the PDR is fixed throughout the transmission of a video sequence. For primary video encoding at a data rate of  kbps, the corresponding parity data rates assigned to Turbo encoding the motion information and the transform coefficients are and . For the EEPWZ method, the parity data rates allocated to the motion information and the transform coefficients in this case are both . For the UEPWZ and EEPWZ methods, the parity data rate assignment is always fixed, and the data rate allocation between the primary video layer and the parity layer is kept at . For the FBUEP and CAUEP methods, the parity data rate assignments adapt to the content of the frame or to the channel packet loss rate, and the overall average data rate used for parity bits and primary video data transmission is kept equal to or less than .
Figure 8

Rate-distortion performance of foreman.qcif at fixed packet loss rate. The results of CAUEP, UEPWZ, EEPWZ and H.264 + ER + EC are compared. CAUEP achieved the best performance, though it is close to that of UEPWZ due to the content of the video sequence.

Figure 9

Rate-distortion performance of carphone.qcif at fixed packet loss rate. For this sequence CAUEP outperforms UEPWZ by 0.3 to 1 dB.

Figure 10

Rate-distortion performance of stefan.qcif at a fixed loss rate. For this sequence and a packet loss rate of 22%, CAUEP outperforms UEPWZ by 0.3 to 1.12 dB.

As can be observed from the figures, CAUEP has the best performance, outperforming UEPWZ by around 0.2 dB for the foreman sequence and by around 0.3 to 1 dB for carphone and stefan. With EEPWZ, the motion information and the transform coefficients are provided the same protection level. EEPWZ is similar to SLEP in that the motion information and the transform coefficients are protected at the same parity data rate; the difference is that in EEPWZ the parity bits of the motion information and the transform coefficients are sent in individual packets. This makes the experimental results comparable with our unequal error protection based methods. The H264 + ER + EC curve shows the result of H.264 using the slice group feature for error resilience in the encoding process and previous colocated slice replacement as the error concealment strategy in the decoding process. All four schemes use the same total data rate. The Wyner-Ziv based methods allocate part of the total data rate budget to transmitting the information protected via the Pseudo Wyner-Ziv codec, whereas in H.264 with error concealment the total data rate is entirely allocated to transmitting the H.264 encoded video information. We consider this a fair comparison since the total data rate is the same for all tested schemes. The experimental results show that both the rate-distortion performance and the visual quality can be improved by sparing a certain amount of the total data rate to protect the important video information with Pseudo Wyner-Ziv coding.

Figures 11 and 12 exhibit the average performance of the four strategies when the packet losses range from to for the foreman and carphone qcif sequences. The total data rate was kept at around  kbps, and packet loss rates of and were tested. Again, CAUEP outperforms the other three techniques. Compared to UEPWZ, CAUEP gains about 0.2-0.3 dB for foreman and 0.5-1 dB for carphone, and its performance converges to that of UEPWZ as the packet loss rate becomes severe. This is because both techniques break down when the packet loss is too severe and the data rate available for error correction is insufficient.
Figure 11

Packet loss rate performance of foreman.qcif.

Figure 12

Packet loss rate performance of carphone.qcif.

In general, channel conditions change over time, resulting in variable packet loss rates. In the following experiments, the channel packet loss rates were therefore varied during the transmission of the video sequences. In our simulation, the lowest packet loss rate is while the highest possible packet loss rate is . The mean of the overall channel packet loss rate is . The parity data rates allocated to the motion information and the transform coefficients in the case of FBUEP are shown in Table 2.

Figures 13, 14, and 15 depict the results for the dynamic packet loss case. As can be seen, in the dynamic packet loss case the CAUEP and UEPWZ schemes achieved lower PSNRs at the same data rates compared to those in Figures 8, 9, and 10. One reason is that the CAUEP and UEPWZ schemes were not able to allocate enough parity bits to protect the important video information when the channel packet loss rates became higher. Furthermore, distortion accumulates over a sequence of successive P frames due to motion compensation, until a new I frame is inserted. Unlike the other schemes, FBUEP tracks the varying packet loss rates and is therefore able to adjust the parity data rates accordingly.
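FBUEP's feedback-driven rate selection can be sketched as a table lookup keyed on the loss rate reported by the decoder. The thresholds and parity fractions below are placeholders standing in for the Table 2 entries; only the lookup mechanism follows the text.

```python
# Each row: (loss-rate upper bound, motion-info PDR, transform-coefficient PDR).
# All values are illustrative placeholders, not the paper's Table 2 settings.
PDR_TABLE = [
    (0.05, 0.04, 0.02),
    (0.15, 0.08, 0.04),
    (1.00, 0.12, 0.06),
]

def select_pdr(reported_loss_rate):
    """Return the (motion, coefficient) parity data rates for the next frames,
    given the latest packet loss rate fed back from the decoder."""
    for upper_bound, pdr_motion, pdr_coef in PDR_TABLE:
        if reported_loss_rate <= upper_bound:
            return pdr_motion, pdr_coef
    # Loss rate above every bound: fall back to the strongest protection.
    return PDR_TABLE[-1][1], PDR_TABLE[-1][2]
```

Note that the motion information always receives the larger parity share in this sketch, reflecting the unequal protection principle of the scheme.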
Figure 13

Rate distortion performance (foreman-qcif) (dynamic packet loss case).

Figure 14

Rate distortion performance (carphone-qcif) (dynamic packet loss case).

Figure 15

Rate distortion performance (stefan-qcif) (dynamic packet loss case).

For visual comparison, the 85th frame of foreman, protected via the various schemes described above, has been decoded and is depicted along with the original frame in Figures 16, 17, and 18. The results presented are for dynamic packet losses. It can be seen that both UEPWZ and CAUEP produce block artifacts on the left and right cheeks of the person in the figure, with CAUEP generating fewer artifacts than UEPWZ. It is also observed that the use of feedback, as in FBUEP, produced the most visually pleasing image, which also has higher PSNR values than the others.
Figure 16

Visual comparison between the original 85th frame (a) and that produced by CAUEP (b) (PSNR = 38.42 dB).

Figure 17

Visual comparison between the 85th frame produced by FBUEP (a) (k = 5, PSNR = 39.75 dB) and UEP (b) (PSNR = 37.15 dB).

Figure 18

Visual comparison between the 85th frame produced by EEP (a) (PSNR = 34.64 dB) and H264 + EC (b) (PSNR = 29.12 dB).

Figure 18 compares the results from using EEP and the H.264/AVC with error concealment applied to the decoded frames. Both the visual quality and the PSNRs are much worse than those of UEPWZ, CAUEP and FBUEP.

5. Conclusion

This paper described and compared three error resilience techniques, each utilizing a Pseudo Wyner-Ziv codec to protect important video information produced by an H.264/AVC codec. In each scheme the motion information and the transform coefficients are protected independently. In the first scheme, unequal error protection using Pseudo Wyner-Ziv coding (UEPWZ), the motion information and the transform coefficients are provided fixed, albeit different, protection levels for the entire video sequence. In the second method, content adaptive unequal error protection (CAUEP), the protection levels afforded to the motion information and the transform coefficients are updated each frame according to the frame content. The third technique, feedback aided unequal error protection (FBUEP), utilizes packet loss rates reported from the decoder to the encoder to choose the parity data rates allocated to encoding the motion information and the transform coefficients. It was demonstrated that UEPWZ, CAUEP, and FBUEP are more resilient to packet losses than equal error protection techniques and provide more visually pleasing images. It was also shown that FBUEP is better suited to handling time-varying losses, while CAUEP performs better in the presence of fixed losses. This paper aims to show the contribution that can be obtained from each algorithm. Future work will focus on combining CAUEP and FBUEP to develop a more efficient error resilience technique, and on studying the system's complexity.

Declarations

Acknowledgment

This work was supported by a grant from the Indiana Twenty-First Century Research and Technology Fund.

Authors’ Affiliations

(1)
Video and Image Processing Laboratory (VIPER), School of Electrical and Computer Engineering, Purdue University
(2)
Department of Electrical and Computer Engineering, Indiana University—Purdue University at Indianapolis

References

  1. Wang Y, Zhu Q-F: Error control and concealment for video communication: a review. Proceedings of the IEEE 1998, 86(5):974-997. doi:10.1109/5.664283
  2. Wyner AD, Ziv J: The rate-distortion function for source coding with side information at the decoder. IEEE Transactions on Information Theory 1976, 22(1):1-10. doi:10.1109/TIT.1976.1055508
  3. Aaron A, Rane S, Zhang R, Girod B: Wyner-Ziv coding for video: applications to compression and error resilience. Proceedings of the IEEE Data Compression Conference (DCC '03), March 2003, Snowbird, Utah, USA, 93-102.
  4. Aaron A, Rane S, Rebollo-Monedero D, Girod B: Systematic lossy forward error protection for video waveforms. Proceedings of the IEEE International Conference on Image Processing (ICIP '03), September 2003, Barcelona, Spain, 1:609-612.
  5. Rane S, Aaron A, Girod B: Systematic lossy forward error protection for error resilient digital video broadcasting. Visual Communications and Image Processing, January 2004, San Jose, Calif, USA, Proceedings of SPIE 5308:588-595.
  6. Rane S, Girod B: Systematic lossy error protection versus layered coding with unequal error protection. Wyner-Ziv Video Coding, January 2005, San Jose, Calif, USA, Proceedings of SPIE 5685:663-671.
  7. Rane S, Girod B: Analysis of error-resilient video transmission based on systematic source-channel coding. Proceedings of Picture Coding Symposium (PCS '04), December 2004, San Francisco, Calif, USA, 453-458.
  8. Rane S, Aaron A, Girod B: Systematic lossy forward error protection for error resilient digital video broadcasting: a Wyner-Ziv coding approach. Proceedings of the IEEE International Conference on Image Processing (ICIP '04), October 2004, Singapore, 5:3101-3104.
  9. Rane S, Aaron A, Girod B: Error resilient video transmission using multiple embedded Wyner-Ziv descriptions. Proceedings of the IEEE International Conference on Image Processing (ICIP '05), September 2005, Genoa, Italy, 2:666-669.
  10. Rane S, Girod B: Systematic lossy error protection of video based on H.264/AVC redundant slices. Visual Communications and Image Processing, January 2006, San Jose, Calif, USA, Proceedings of SPIE 6077:1-9.
  11. Rane S, Baccichet P, Girod B: Modeling and optimization of a systematic lossy error protection system based on H.264/AVC redundant slices. Proceedings of the 25th Picture Coding Symposium (PCS '06), April 2006, Beijing, China.
  12. Baccichet P, Rane S, Girod B: Systematic lossy error protection based on H.264/AVC redundant slices and flexible macroblock ordering. Journal of Zhejiang University 2006, 7(5):900-909. doi:10.1631/jzus.2006.A0900
  13. Girod B, Aaron AM, Rane S, Rebollo-Monedero D: Distributed video coding. Proceedings of the IEEE 2005, 93(1):71-83.
  14. Puri R, Majumdar A, Ramchandran K: PRISM: a video coding paradigm with motion estimation at the decoder. IEEE Transactions on Image Processing 2007, 16(10):2436-2448.
  15. Sehgal A, Jagmohan A, Ahuja N: Wyner-Ziv coding of video: an error resilient compression framework. IEEE Transactions on Multimedia 2004, 6(2):249-258. doi:10.1109/TMM.2003.822995
  16. Wang J, Majumdar A, Ramchandran K, Garudadri H: Robust video transmission over a lossy network using a distributed source coded auxiliary channel. Proceedings of Picture Coding Symposium (PCS '04), December 2004, San Francisco, Calif, USA, 41-46.
  17. Wang J, Prabhakaran V, Ramchandran K: Syndrome-based robust video transmission over networks with bursty losses. Proceedings of the IEEE International Conference on Image Processing (ICIP '06), October 2006, Atlanta, Ga, USA, 741-744.
  18. Berrou C, Glavieux A, Thitimajshima P: Near Shannon limit error-correcting coding and decoding: turbo-codes. Proceedings of the International Conference on Communications (ICC '93), May 1993, Geneva, Switzerland, 1064-1070.
  19. Berrou C, Glavieux A: Near optimum error correcting coding and decoding: turbo-codes. IEEE Transactions on Communications 1996, 44(10):1261-1271. doi:10.1109/26.539767
  20. Xu Q, Stankovic V, Xiong Z: Layered Wyner-Ziv video coding for transmission over unreliable channels. Signal Processing 2006, 86(11):3212-3225. doi:10.1016/j.sigpro.2006.03.017
  21. Liang L, Salama P, Delp EJ: Unequal error protection using Wyner-Ziv coding for error resilience. Visual Communications and Image Processing, January 2007, San Jose, Calif, USA, Proceedings of SPIE 6508:1-9.
  22. Liang L, Salama P, Delp EJ: Content-adaptive unequal error protection based on Wyner-Ziv coding. Proceedings of Picture Coding Symposium (PCS '07), November 2007, Lisbon, Portugal.
  23. Liang L, Salama P, Delp EJ: Feedback-aided error resilience technique based on Wyner-Ziv coding. Visual Communications and Image Processing, January 2008, San Jose, Calif, USA, Proceedings of SPIE 6822:1-9.
  24. Johanson M: Adaptive forward error correction for real-time internet video. Proceedings of the 13th Packet Video Workshop (PV '03), April 2003, Nantes, France.
  25. Richardson IEG: H.264 and MPEG-4 Video Compression. Wiley, Chichester, UK; 2003.
  26. Wiegand T, Sullivan GJ, Bjontegaard G, Luthra A: Overview of the H.264/AVC video coding standard. IEEE Transactions on Circuits and Systems for Video Technology 2003, 13(7):560-576.
  27. Ryan WE: Concatenated Codes and Iterative Decoding. John Wiley & Sons, New York, NY, USA; 2003.
  28. Wenger S: H.264/AVC over IP. IEEE Transactions on Circuits and Systems for Video Technology 2003, 13(7):645-656. doi:10.1109/TCSVT.2003.814966
  29. Zhu X, Girod B: Video streaming over wireless networks. Proceedings of the European Signal Processing Conference (EUSIPCO '07), September 2007, Poznan, Poland.
  30. Liang YJ, Apostolopoulos JG, Girod B: Analysis of packet loss for compressed video: does burst-length matter? Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '03), April 2003, Hong Kong, 5:684-687.
  31. Stuhlmuller K, Farber N, Link M, Girod B: Analysis of video transmission over lossy channels. IEEE Journal on Selected Areas in Communications 2000, 18(6):1012-1032. doi:10.1109/49.848253
  32. Wu D, Hou YT, Zhu W, Zhang Y-Q, Chao HJ: MPEG-4 compressed video over the Internet. Proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS '99), May-June 1999, Orlando, Fla, USA, 4:327-331.
  33. Hou YT, Wu D, Zhu W, Lee H-J, Chiang T, Zhang Y-Q: End-to-end architecture for MPEG-4 video streaming over the Internet. Proceedings of the IEEE International Conference on Image Processing (ICIP '99), October 1999, Kobe, Japan, 1:254-257.
  34. Stockhammer T, Hannuksela MM, Wiegand T: H.264/AVC in wireless environments. IEEE Transactions on Circuits and Systems for Video Technology 2003, 13(7):657-673. doi:10.1109/TCSVT.2003.815167
  35. Ott J, Wenger S, Sato N, Burmeister C, Ray J: Extended RTP profile for real-time transport control protocol (RTCP)-based feedback (RTP/AVPF). July 2006.

Copyright

© Liang Liang et al. 2009

This article is published under license to BioMed Central Ltd. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.