
Adaptive error protection coding for wireless transmission of motion JPEG 2000 video

Abstract

The delivery of video over wireless, error-prone transmission channels requires careful allocation of the channel and source code rates, given the available bandwidth. In this paper, we present a theoretical framework to find an optimal joint channel and source code rate allocation, considering an intra-coded video compression standard such as Motion JPEG 2000 and an error-prone wireless transmission channel. Lagrangian optimization is used to find the optimal code rate allocation, from a PSNR perspective, starting from commonly available source coding outputs, such as intermediate rate-distortion traces. The algorithm is simple, adaptive to both the available bandwidth and the transmission channel conditions, and has a low computational complexity. Simulation results, using Reed-Solomon (R-S) coding, show that the achieved performance, in terms of PSNR and MSSIM, is comparable with that of other methods reported in the literature. In addition, a simplified and sub-optimal expression for determining the channel code assignment is also provided.

1 Introduction

Many multimedia devices are being turned into complete entertainment centers, in part by exploiting wireless transmission. Several industry-backed alliances aim at transmitting wireless audio/video content between multimedia home appliances using either the 60-GHz band [1–3] or the 2.4–5.0-GHz unlicensed spectrum [4, 5], relying on techniques such as UltraWideBand, orthogonal frequency division multiplexing, and multi-antenna links.

Moreover, JPEG 2000 [6] is rapidly spreading as a valuable intra-coding scheme for video contribution applications [7], owing to its high compression efficiency, wide range of encoding profiles from lossless to lossy, and low latency. Recently, the International Organization for Standardization (ISO), jointly with the International Electrotechnical Commission (IEC) and the International Telecommunications Union (ITU), added new profiles to the JPEG 2000 standard for broadcast video contribution and distribution, with an amendment to the JPEG 2000 core coding system [8]. This amendment defines three new profiles, aimed at studio contribution links, specifying encoding parameters and rate limits over seven operating levels for video encoded with JPEG 2000. JPEG 2000 over MPEG-2 Transport Stream has also been standardized recently for this scenario [9]. In this kind of application, wireless cameras may produce a video contribution that has to adapt, in real time, to time-varying transmission channel profiles. In such a case, both the available bandwidth and the wireless link bit error rate (BER) may be considered slowly varying with respect to the video frame rate [10].

The streaming of video, either directly over the physical layer or using IP packets, is subject to transmission errors. Retransmission of lost or corrupted data or packets is viable, but it reduces interactivity and conflicts with real-time requirements. Thus, forward error correction (FEC) is generally adopted, and the channel code rate may be matched to the error sensitivity of the compressed data, performing unequal error protection (UEP) [11–13].

Many researchers have investigated optimal methods for the protection of intra-coded video streams. For instance, JPEG 2000 for wireless (JPWL) [14] has been standardized for this purpose, and several works have shown its good performance either when used on IP networks [15–17] or directly over the physical layer [18–23].

In [24, 25], the authors addressed similar problems, showing how rate-distortion optimized (RaDiO) audio/video streaming over packetized networks can be achieved, and they solved this problem using Lagrangian optimization. Cataldi et al. proposed a technique based on raptor codes and sliding windows, where different H.264 [26] code rates are associated with each quality layer [27]. In [28], the authors proposed Wyner-Ziv coding for the protection of a coarse version of the video, where side information is provided by a primary H.26x decoder. Ahmad et al. proposed a UEP system using fountain codes [29]. In general, many of the solutions reported in the literature for searching optimal UEP strategies are based on heuristic methods or use optimization algorithms [30–33]. It should be noted, however, that such solutions are based on search strategies characterized by a variable amount of computational complexity, which could prevent their use in real-time and bandwidth-adaptive video transmission.

1.1 Review of recent works

In recent years, several researchers have investigated techniques capable of applying differentiated FEC levels to wavelet-based image/video compressors, when the multimedia stream is delivered over unreliable or wireless channels.

In [34], a motion-compensated temporal filtering discrete wavelet transform (DWT) video coder is coupled with double binary turbo codes: the joint source-channel coding strategy is based on distortion profiling and code statistics. Ho and Tsai [35] also used 3D wavelets, data interleaving, and Reed-Solomon (R-S) codes for their UEP streaming system. In [36], the authors investigate the performance of MJPEG 2000 and rate-compatible punctured convolutional codes for streaming over a time-varying binary symmetric channel (BSC); in their work, the rate-distortion tradeoff of the coding units is used to adapt the error correction code to the bandwidth and error characteristics. Schwartz et al. [37] adopted the DWT-based compression and convolutional coding FEC of the CCSDS 122.0-B-1 and 131.0-B-1 satellite standards. Their results show that wavelet coefficient UEP outperforms the equal error protection (EEP) method over the simulated AWGN channel.

In [15, 38], the authors focus on JPEG 2000, R-S coding, and interleaving over wireless channels, simulated by a time-varying BSC with the Gilbert-Elliot (GE) model. UEP is performed with variable FEC rates defined by solving a convex optimization problem. Based on interleaving effects, they derive a lower bound for the successful image decoding rate in wireless environments. In [39], several UEP schemes are compared and layered JPWL streaming with RTP packetization on wireless channels is studied; their FEC allocation method is faster and less complex than the alternatives, while yielding comparable quality.

JPWL has been shown to be both flexible and reactive to variable channel status. In [21, 40], streaming performance is simulated over realistic wireless channels, such as multiple-input multiple-output or Rayleigh fading ones. In [17], the authors combine JPWL with a dynamic bandwidth estimation scheme in order to provide the best layering, scaling, and protection of video streams. Even if the source distortion is only coarsely estimated, it has been shown that it can be effectively used to find an optimal rate allocation that outperforms EEP [23].

In our previous papers [16, 41], we used JPWL and interleaving over lossy packet networks. The UEP solution, found by means of a recursive, dichotomic search algorithm, was shown to always outperform EEP, and a low-complexity interleaving strategy was devised for a JPWL implementation on a DSP device. Iqbal et al. [42, 43] devised a family of dynamic programming code allocation methods for FEC-protected wireless video streaming. Their protection assignment can provide a variable trade-off between performance and implementation complexity.

A different view was adopted by Bahmani et al. [44]. The method devised by the authors operates mainly at the decoder side, leaving the particular UEP implementation open. By leveraging the error resilience features of JPEG 2000, their method guesses the erased received symbols and improves the error correction capability.

Ouaret et al. [45] compared R-S coded JPEG 2000 to the Slepian-Wolf/Wyner-Ziv distributed video coding (DVC) approach. Their results show that JPEG 2000 yields better quality at high error rates, even if only an EEP scheme was used, whereas DVC performs better at lower packet loss rates. In [46], JPEG 2000 and H.264 streams are protected with UEP and transmitted over lossy packet networks. A performance comparison with multiple description coding shows that UEP achieves better quality. Chen et al. [47], by using progressive digital fountain codes, allowed different users to receive broadcast video at different qualities, depending on the reliability of their UDP-based WiFi link. In [48], the authors describe the emerging MPEG multimedia transport standard for delivering high bit rate video over packet-lossy networks, using a low-density generator-matrix FEC. They show the effectiveness of their method on the streaming of JPEG 2000 digital cinema.

1.2 Proposed contributions

In this paper, we present a simple mathematical and algorithmic solution for finding an optimal UEP channel code rate allocation strategy. The method used for solving the problem is based on Lagrangian optimization, which is well known in the literature and has been applied in several works by other authors [25, 34, 36]. Our proposed solution can be calculated with a closed-form expression, directly from knowledge of the rate-distortion characteristics of the data units and of the transmission channel.

Unlike other existing strategies, such as those reviewed in Sec. 1.1, our method has a closed-form representation of the solution to the UEP problem, and it does not require iterative or dynamic programming-based strategies. Moreover, the presence of data interleaving enables optimal video quality when channel conditions are time-varying, and both data and channel adaptivity can be achieved.

The devised method does not rely on a particular channel coding technique, since it can be applied to any block-based FEC scheme. The algorithm itself is also lightweight, as it gives a closed-form expression for the optimal channel code allocation strategy, which can be computed in real time and can adaptively respond to changes in the available transmission bandwidth and in the experienced channel BER. This makes the technique suitable for channels whose error rate, and even available bandwidth, is unknown and slowly changing; the video stream receiver should communicate the experienced error rate to the sender side, which in turn would change the UEP solution accordingly. We also show how the algorithm can be practically applied, using JPEG 2000 source coding and R-S channel coding, and present some performance results expressed in terms of either PSNR (peak signal-to-noise ratio) or MSSIM (mean structural similarity index metric).

Some mathematical derivations and performance results shown in this paper have already been partially presented in [49]. However, in this paper, we present additional derivations and novel simulation results. First, we describe in detail a method for assigning codewords of a channel code and implementing the designed protection profile. Furthermore, we work out a formula that allows approximating the optimal protection profile without knowledge of the image content, but only by means of a statistical entropy approach. Finally, we also present some results on the simulated transmission of a complete video sequence.

The paper is structured as follows. First, the theoretical framework for optimized unequal error protection is presented in Sec. 2, and a Lagrangian optimization strategy is shown to be able to find a UEP solution, both for particular and general cases. Then, a practical UEP code rate assignment using R-S codes is presented in Sec. 3. In Sec. 4, the Monte Carlo simulation setup is described, and the results of several simulated scenarios are presented and discussed. Finally, conclusions are drawn, followed by an Appendix that contains proofs of assumptions and lemmas.

2 Optimized unequal error protection

In this paper, we model the distortion as a function of the combined channel characteristics/channel code performance in terms of probability of data loss. This approach is commonly used in the reference literature [16, 30]. We use Motion JPEG 2000 as source coding algorithm and R-S as channel coding algorithm, but the same optimization technique can be applied to other source/channel coding combinations as well. Table 1 summarizes the mathematical notation that will be used throughout the following sections and in the appendices.

Table 1 Summary of mathematical notation

Since we are considering an intra-video coding method, the video stream is regarded as a sequence of frames that are compressed independently of each other [50]. Each frame is compressed using JPEG 2000, and the size of the compressed bitstream is B bits or, equivalently, the source code rate is r_s bits/pixel. The compressed bitstream is then divided into N_k pieces, each one of (k−16) bits; a two-byte cyclic redundancy check (CRC) codeword, used to test for the error-free condition, is appended at the end of each piece (see Fig. 1). The CRC codeword is considered in our method even if it is not necessary when R-S codes are used, since they are able to detect the presence of residual errors. However, for different types of channel codes (e.g., convolutional codes), the CRC codeword is required. In this work, the detection of possible erasures is not handled differently from the detection of errors. The message words of the k-bit pieces are then R-S encoded, and the codewords are assembled into n_i-bit pieces, 0≤i<N_k. If the average channel code rate is \(\bar {r}\), the combined source/channel code rate for the whole codestream will be \(R = r_{s} / \bar {r}\).
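As a back-of-the-envelope illustration of this packetization, the short Python sketch below computes the number of pieces and the combined rate; the frame size, source rate, and code rate are arbitrary example values, not figures taken from the paper.

```python
import math

# Example values (illustrative only): one CIF luminance frame compressed at
# r_s = 0.4 bit/pixel, pieces of k = 8192 bits (1024 bytes), average R-S rate 32/48.
width, height = 352, 288
r_s = 0.4                      # source code rate, bits/pixel
B = int(r_s * width * height)  # compressed bitstream size, bits
k = 1024 * 8                   # piece size after CRC insertion, bits
r_bar = 32 / 48                # average channel code rate

N_k = math.ceil(B / (k - 16))  # pieces of (k - 16) payload bits plus a 16-bit CRC
n_bar = k / r_bar              # average protected piece length, bits
R = r_s / r_bar                # combined source/channel code rate, bits/pixel
print(N_k, n_bar, R)
```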

Fig. 1 Organization of the protected bitstream after division into N_k pieces

The packetized codestream is then transmitted over a Q-ary symmetric transmission channel, where Q is the number of available different transmission symbols, that is, the pieces are assembled into symbols of log_2 Q bits. We define p_i as the probability that no errors occur up to piece i, and M_i as the mean-square error (MSE) distortion of the decompressed image using all pieces up to piece i. The combination of transmission channel performance and channel coding correction capability results in the probability h(n_l) that a piece of n_l bits (after channel decoding) is received with errors (residual error rate); this model has already been used by many other researchers (e.g., [30]). In this paper, we assume that this probability is approximated by a log-linear model [51], such as

$$ h\left(n_{l}\right) \cong C e^{-s\left({n_{l} - d} \right)}, $$
((1))

where C, s, and d represent the combined transmission channel/channel code performance characteristics. In particular, the parameter s is a decay connected to the correcting performance of the code, d is an offset to n_l used to improve the validity of this model, and C is an amplitude normalization constant. Further details on this approximation are given in Appendix A.1. This approximation is valid when the channel error probability is lower than 10^{-1}, since for higher values the log-linear relationship no longer holds; however, in such cases, any practical rate allocation/FEC method is hardly operational without increasing the channel coding redundancy beyond a limit at which the video quality is severely impaired by the high decoding latency and the low source code rate. In particular, for the model presented in (1), a single parameter, s, can be used to represent the transmission scenario, and its value increases as the combined transmission channel/channel code performance improves (see Appendix A.1). At the receiving side, the probability of having no errors up to piece i depends on the chosen sequence of codeword lengths (up to piece i) as p_i = p_i(n_0, n_1, …, n_i). The optimization objective is that of minimizing the average MSE distortion, given that a constant amount of combined source and channel coding bits is sent on the channel, i.e.,

$$ \min \limits_{\lbrace n_{i} \rbrace} \sum\limits_{i = 0}^{{N_{k}} - 1}{{M_{i}}{p_{i}}} \quad {\mathrm{s.t.}} \quad \sum\limits_{i = 0}^{{N_{k}} - 1} {{n_{i}}} = \frac{B}{{\bar{r}}} . $$
((2))

Lemma 1.

The optimal UEP profile, i.e., the set of codeword lengths {n i } to be used for the pieces, which minimizes the overall distortion on the received image, is given by

$$ n_{i} = \bar{n} + \frac{1}{s}\ln \frac{m_{i}}{\hat{m}}, \ 0 \le i < {N_{k}} \, $$
((3))

where \(\bar {n}=k/\bar {r}\) is the average protected piece length, m i is a complementary cumulative distortion (CCD), \({m_{i}} =\! \Sigma _{j = i}^{{N_{k}} - 1}{M_{j}}\), and \(\hat {m}\) is the CCD geometric mean \(\hat {m} = {\left (\Pi _{i = 0}^{{N_{k}} - 1}{m_{i}}\right)^{1/{N_{k}}}}\).

Appendix A.2 shows how the closed-form expression (3), the mathematical solution of (2), can be obtained. The relationship in (3) can be commented as follows:

  • The protection rate for piece i depends on the average protection rate \(\bar {n}\) plus a modification term.

  • The modification term logarithmically weights the CCD at piece i, normalized by the geometric mean of the CCD.

  • If the channel code has high-error correction performance and/or the channel conditions are good, the parameter s is large, which gives a small modification term.

  • The protection profile depends on the equivalent transmission channel conditions only by means of the parameter s, not C and d.

  • Since m_i is monotonically nonincreasing and ln(·) is a monotonically increasing function, the modification term is monotonically nonincreasing, i.e., pieces at the beginning of the bitstream are more protected than those at the end.

  • The protection level at piece i depends on the cumulative amount of distortion of all following pieces.

  • The shape of the protection profile is determined by the CCD; its ordinate extrema are determined only by the combined channel/code performance.

With respect to other similar solutions presented in the literature, and reviewed in Sec. 1.1, the main advantage of our method is that a closed-form solution to the UEP problem is readily available, without requiring iterative or dynamic programming-based solving strategies. The proposed solution is data and channel adaptive; regular, low bit rate feedback from the receiver lets the transmitter modify the UEP strategy, which, considering also the presence of data interleaving, enables optimal decompressed video quality when channel conditions change with time. Moreover, the side information produced at no cost during the compression process allows implementing a well-crafted protection profile, which minimizes the expected amount of distortion due to missing or corrupted data at the receiver. Finally, we also point out that when stringent real-time requirements must be met and the wireless channel is time-varying, deep interleaving matrices are necessary to spread the symbol losses occurring on the channel far apart (for high Doppler spread f_D), and the decoding delay increases correspondingly.
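As an illustration of how (3) is evaluated in practice, the following Python sketch computes a UEP profile from a toy distortion trace; the distortion values, the average piece length, and the channel parameter s are illustrative placeholders, not data from the paper. The geometric mean is evaluated in the log domain, as suggested in Sec. 4.3.

```python
import numpy as np

def uep_profile(M, n_bar, s):
    """Codeword lengths n_i from Lemma 1, eq. (3).

    M     : per-piece MSE distortions M_i, i = 0..N_k-1 (non-increasing)
    n_bar : average protected piece length (same unit as the returned n_i)
    s     : decay parameter of the residual-error model (1)
    """
    M = np.asarray(M, dtype=float)
    m = np.cumsum(M[::-1])[::-1]        # CCD: m_i = sum of M_j for j >= i
    log_m_hat = np.mean(np.log(m))      # log of the geometric mean, cf. eq. (11)
    return n_bar + (np.log(m) - log_m_hat) / s

# Toy distortion trace and channel parameter (illustrative values only, lengths in bytes)
M = [4.0e3, 1.5e3, 6.0e2, 2.5e2, 1.0e2, 4.0e1, 1.5e1, 6.0]
n_i = uep_profile(M, n_bar=1536, s=0.0127)
print(np.round(n_i))                    # earlier pieces receive longer codewords
```

Note that the resulting profile still has to satisfy the constraint n_i ≥ k discussed next; if it does not, the average protection must be increased according to Lemma 2.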

Upon looking carefully at the solution (3) proposed in Lemma 1, it can be noticed that an additional condition to be satisfied is that n_i ≥ k, 0 ≤ i < N_k, meaning that we cannot overprotect the pieces at the beginning, since otherwise there would not be enough bits left to allocate to the last pieces, not even the source coding bits; especially at high symbol error probabilities, the protection profile, given the total bit budget, could be extremely unbalanced and might provide values lower than k.

Lemma 2.

The minimum average channel code rate \(\bar {r}_{\text {min}}\), expressed in terms of minimum average piece size after channel coding, is approximated by

$$ {\bar{n}_{{\text{min}}}} = \max \limits_{i < {N_{k}}} \left({k + \frac{{\ln \left(\hat{m} / m_{i} \right)}}{s}} \right) \, $$
((4))

for a given equivalent channel error performance s.

Lemma 2 can be proved after some work on (3), supposing that \(\ln \left (m_{i} / \hat {m} \right) < 0\) for values of i close to N_k.
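Continuing the sketch after Lemma 1, the bound (4) can be evaluated directly from the same CCD; again, the numbers below are illustrative.

```python
import numpy as np

def n_bar_min(M, k, s):
    """Minimum average protected piece length from Lemma 2, eq. (4)."""
    m = np.cumsum(np.asarray(M, dtype=float)[::-1])[::-1]
    log_m_hat = np.mean(np.log(m))
    return np.max(k + (log_m_hat - np.log(m)) / s)

M = [4.0e3, 1.5e3, 6.0e2, 2.5e2, 1.0e2, 4.0e1, 1.5e1, 6.0]   # toy trace, lengths in bytes
print(n_bar_min(M, k=1024, s=0.0127))   # smallest admissible n_bar for this trace
```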

In all cases where the exact rate-distortion curve of the compressed image is not known or cannot be calculated exactly, (3) can be approximated, provided that the source coding process is expected to generate an ideal progressive codestream with a typical rate-distortion curve.

Lemma 3.

The approximated protection profile for progressively source encoded codestreams is given by

$$ {n'_{i}} \cong \bar{n} + \frac{{\Delta \rho \ln 2}}{s}\left({{N_{k}} - 1 - 2i} \right), \ 0 \le i \ll {N_{k}} \, $$
((5))

for N k large, where Δ ρ is the bit rate sampling step of the MSE profile.

Proof of this lemma is given in Appendix A.3 Proof of Lemma 3.

Our proposed UEP method has been devised to be as general as possible, with application scenarios that extend also to other source and channel coding methods. For instance, similar considerations on the distortion reduction carried by data packets can be made for the network abstraction layer (NAL) units used in H.264 or H.265. In this case, NAL units naturally segment the video stream into pieces that correspond to the pieces used in Fig. 1. The distortion profile can be computed in several ways, whether or not temporal and spatial intra/inter-frame dependencies are maintained; layering methods for post-compression reordering of NAL units, similar to those adopted by the scalable video coding (SVC) extension of H.264, are then possible [52].

3 UEP profile generation using R-S coding

In the scenario adopted in this paper, when R-S coding is used, a strategy must be devised for the practical implementation of the optimal UEP profile found with (3). First, we consider a list of \(\tilde {N}\) mother code rates \(\{\tilde {r}_{l}\} = \{(\tilde {k}_{l} / \tilde {n}_{l}) \}\), \(l = 0, 1, \ldots, \tilde {N} - 1\), ordered by decreasing code rate and not containing repeated code values, i.e., \( \tilde {r}_{l} > \tilde {r}_{l + 1}\), \(l = 0, 1, \ldots, \tilde {N} - 2\).

Given the actual code rate for a generic piece i, r_i = k_i/n_i (it was previously assumed that pieces are of fixed length, i.e., k_i = k, so this is a generalization), an index w_i can be determined such that \( \tilde {r}_{w_{i} + 1} \le r_{i} \le \tilde {r}_{w_{i}} \). The numbers α_i of R-S codewords with rate \(\tilde {r}_{w_{i}}\) and β_i of codewords with rate \(\tilde {r}_{w_{i} + 1}\) are chosen in such a way as to achieve the target code rate r_i for piece i. Due to the constraints

$$ \alpha_{i} \tilde{k}_{w_{i}} + \beta_{i} \tilde{k}_{w_{i} + 1} = k_{i} \quad \text{and} \quad \alpha_{i} \tilde{n}_{w_{i}} + \beta_{i} \tilde{n}_{w_{i} + 1} = n_{i} \, $$

the number of codewords for each code is

$$\begin{array}{*{20}l} {\alpha_{i}} &= \frac{{{k_{i}}{\tilde{n}_{w_{i} + 1}} - {n_{i}}{\tilde{k}_{w_{i}+1}}}}{{{\tilde{k}_{w_{i}}}{\tilde{n}_{w_{i} + 1}} - {\tilde{n}_{w_{i}}}{\tilde{k}_{w_{i}+1}}}} \, \\ {\beta_{i}} &= \frac{{{\tilde{k}_{w_{i}}}{n_{i}} - {\tilde{n}_{w_{i}}}{k_{i}}}}{{{\tilde{k}_{w_{i}}}{\tilde{n}_{w_{i}+1}} - {\tilde{n}_{w_{i}}}{\tilde{k}_{w_{i}+1}}}} . \end{array} $$
((6))

Positive-valued solutions always exist for (6), since the constraint \({\tilde {r}_{w_{i}+1}} \le {r_{i}} \le {\tilde {r}_{w_{i}}}\) gives

$$ {\tilde{k}_{w_{i}}}{n_{i}} - {k_{i}}{\tilde{n}_{w_{i}}} > 0, \quad {k_{i}}{\tilde{n}_{w_{i}+1}} - {n_{i}}{\tilde{k}_{w_{i}+1}} > 0 $$

and the constraint \({\tilde {r}_{w_{i}}} > {\tilde {r}_{w_{i}+1}}\) gives

$$ {\tilde{k}_{w_{i}}}{\tilde{n}_{w_{i}+1}} - {\tilde{k}_{w_{i}+1}}{\tilde{n}_{w_{i}}} > 0 . $$

The values of α_i and β_i obtained from (6) are, in general, fractional numbers. In order to best approximate them with integers, we first compute α'_i by rounding α_i, and then use this result to compute β'_i, as

$$\begin{array}{*{20}l} \alpha'_{i} &= \left\lfloor {1/2 + \frac{{{k_{i}}{\tilde{n}_{w_{i}+1}} - {n_{i}}{\tilde{k}_{w_{i}+1}}}}{{{\tilde{k}_{w_{i}}}{\tilde{n}_{w_{i}+1}} - {\tilde{n}_{w_{i}}}{\tilde{k}_{w_{i}+1}}}}} \right\rfloor \, \\ \beta'_{i} &= \frac{k_{i} - \alpha'_{i} \tilde{k}_{w_{i}}}{\tilde{k}_{w_{i}+1}} . \end{array} $$
((7))
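The allocation in (6) and (7) can be written compactly as follows; this is a sketch assuming the mother code list used later in Sec. 4 (all codes with a 32-byte message size), and the example piece parameters are invented.

```python
from math import floor

# Mother R-S codes of Sec. 4, ordered by decreasing rate: 32/36, 32/38, ..., 32/80 (sizes in bytes)
k_tilde = [32] * 23
n_tilde = list(range(36, 81, 2))
rates = [kt / nt for kt, nt in zip(k_tilde, n_tilde)]

def allocate_codewords(k_i, n_i):
    """Numbers of codewords (alpha'_i, beta'_i) of the two adjacent mother codes that
    realize the target rate k_i/n_i for piece i, following eqs. (6)-(7).
    Assumes k_i/n_i falls inside the range spanned by the mother code list."""
    r_i = k_i / n_i
    w = next(j for j in range(len(rates) - 1) if rates[j] >= r_i >= rates[j + 1])
    den = k_tilde[w] * n_tilde[w + 1] - n_tilde[w] * k_tilde[w + 1]
    alpha = (k_i * n_tilde[w + 1] - n_i * k_tilde[w + 1]) / den   # eq. (6), fractional
    alpha_p = floor(0.5 + alpha)                                  # eq. (7), rounded
    beta_p = (k_i - alpha_p * k_tilde[w]) / k_tilde[w + 1]        # eq. (7)
    return w, alpha_p, beta_p

# Example: 1024 message bytes to be protected into 1430 coded bytes
print(allocate_codewords(1024, 1430))   # -> (4, 21, 11.0): 21 RS(44,32) and 11 RS(46,32) codewords
```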

4 System simulation and performance

4.1 Simulation setup

For the purpose of assessing the performance of the technique presented in this paper, we have prepared a 512-frame video composed of the first 32 frames of each of the following 16 clips with CIF resolution (352 × 288, 30 frames/s) and YUV 4:2:0 format, combined in sequence: akiyo, bus, coastguard, crew, flower, football, foreman, harbor, husky, ice, news, soccer, stefan, tempete, tennis, and waterfall [53]. Only the luminance (Y) component of the video frames has been used to perform the optimization strategy and the transmission and reception over a simulated channel.

First, the partial distortions M_i for each frame in the video sequence have been calculated. To this purpose, JPEG 2000 compression has been performed using Kakadu 6.0 [54], with default parameters, no visual weighting, and the ‘-rate’ option on every frame. The portion of each JPEG 2000 codestream located after the start-of-data (SOD) marker has been split into multiple pieces, each one with a size of (k−2) bytes (after CRC insertion, each piece is k bytes long). Then, a new codestream has been constructed using the original header data, with an amended start-of-tile (SOT) marker to account for the new codestream length, a number i of codestream pieces, and the end-of-codestream (EOC) marker. The obtained codestream has been decompressed using Kakadu 6.0, and M_i has been calculated as the distortion of the reconstructed frame. Although this process of determining M_i is cumbersome, the JPEG 2000 encoding process is able, per se, to provide such values: during encoding, an accurate rate-distortion estimate of the compressed frame is calculated, since the distortion values are gathered for the selection of the compressed wavelet coefficients by the embedded block coding with optimized truncation (EBCOT) of the bit-stream [55]. In this work, we have favored a direct calculation of the distortion values, in order to achieve more precise results.

When the piece boundaries are not coincident with the codestream interruption points decided by the JPEG 2000 compressor, we adopt a continuum hypothesis, i.e., we assume that the intermediate distortions at the piece boundaries can be calculated using linear interpolation from the nearest known, available distortions. This assumption is generally valid, since JPEG 2000 is a position-progressive encoder, and distortions are related to the way wavelet coefficients are truncated by EBCOT, in order to best satisfy the quality-rate constraints imposed on the compression process.
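Under this continuum hypothesis, the interpolation itself is a one-line operation; the rate points and MSE values below are made-up numbers used only to show the mechanics.

```python
import numpy as np

# Known (cumulative bytes, MSE) samples at the truncation points produced by the encoder (toy values)
rate_pts = np.array([2048.0, 4096.0, 6144.0, 8192.0])
mse_pts  = np.array([180.0,   95.0,   52.0,   30.0])

piece_boundaries = np.array([3072.0, 5120.0, 7168.0])   # byte offsets where pieces end
M_i = np.interp(piece_boundaries, rate_pts, mse_pts)     # linearly interpolated distortions M_i
print(M_i)                                               # [137.5  73.5  41. ]
```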

Moreover, in order to make a fair comparison among different channel code rates, we have kept fixed the total amount of data sent on the channel, i.e., the combined source and channel code rate R.

The transmission of the compressed stream has been simulated, using MATLAB, on three different types of channels. The first type is a Q-ary symmetric channel (Q=256), characterized by symbol error rates P_S ranging from 10^{-3} to 10^{-1}. In this type of channel, errors occur at symbol level; since the bit errors are equiprobable and uniformly distributed over the symbol bits (log_2 Q = 8 bits/symbol), there is a simple relationship between bit and symbol error rates when the number of bits per symbol is large, i.e., P_b ≅ P_S/2 ([56], Section 4.4-1).

The second type of channel uses binary phase shift keying (BPSK) and additive white Gaussian noise (AWGN), in order to represent a transmission condition akin to physical level signaling on an actual, but ideal, channel. In this case, the performance depends on the signal-to-noise ratio (SNR) expressed by γ b . Finally, the last type of simulated channel uses BPSK, AWGN, and Rayleigh-distributed flat fading, which represents a condition similar to that experienced on actual, wireless, non-line-of-sight (NLOS) channels. In this case, the performance depends on the SNR γ b and on the correlation degree among fades, expressed by the Doppler spread f D . For channels using BPSK, the expressions used to determine the average BER (and the corresponding channel parameter s), given a certain value of γ b , are

$$\begin{array}{*{20}l} P_{b,\textrm{AWGN}} &= \textrm{Q}\left(\sqrt{2\gamma_{b}}\right) \, \end{array} $$
((8))
$$\begin{array}{*{20}l} P_{b,\text{Rayleigh}} &= \frac{1}{2} \left(1 - \sqrt{\frac{\gamma_{b}}{1+\gamma_{b}}} \right) \, \end{array} $$
((9))

where Q(·) is the Gaussian tail function, and the Rayleigh channel BER is calculated for the maximum uncorrelated Doppler spread [56].
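Both closed forms are straightforward to evaluate numerically; a minimal sketch using the identity Q(x) = erfc(x/√2)/2 follows (the 10 dB operating point is just an example).

```python
from math import erfc, sqrt

def ber_bpsk_awgn(gamma_b):
    """Eq. (8): BPSK over AWGN; gamma_b is Eb/N0 on a linear scale."""
    return 0.5 * erfc(sqrt(gamma_b))      # Q(sqrt(2*gamma)) = erfc(sqrt(gamma))/2

def ber_bpsk_rayleigh(gamma_b):
    """Eq. (9): BPSK over uncorrelated Rayleigh flat fading; gamma_b is the average Eb/N0."""
    return 0.5 * (1.0 - sqrt(gamma_b / (1.0 + gamma_b)))

gamma_b = 10 ** (10.0 / 10)               # 10 dB, linear scale
print(ber_bpsk_awgn(gamma_b), ber_bpsk_rayleigh(gamma_b))   # about 3.9e-6 and 2.3e-2
```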

The UEP profile has been generated using (3), given the distortion profile M i and the channel parameter s. Then, the codestream has been split into pieces that have been protected according to the determined UEP profile, using R-S coding with \(\tilde {N} = 24\) mother code rates \(\left \lbrace \tilde {r}_{l} \right \rbrace = \left \lbrace 32/36, 32/38, 32/40, \ldots, 32/80 \right \rbrace \), and adopting the codeword allocation strategy given by (7). The effect of the channel is simulated by randomly changing the transmitted bytes according to the simulation symbol error rate P S . The error-affected codestream has been recomposed by terminating it at the last error-free received piece (thanks to the CRC codeword). In this way, any image reconstruction artifact due to wrong/erased codestream bytes has been eliminated, and the reconstructed image MSE is that used by the UEP allocation strategy. The JPEG 2000 header (about 300 bytes) has been considered as transmitted on a reliable channel, since it represents the most critical section of the codestream. At the receiving side, the JPEG 2000 header has been pre-pended to the JPEG 2000 bitstream bytes, and only the portion of the header carrying information on the bitstream size (Psot field of the SOT marker) has been changed accordingly. Performance has been evaluated as objective visual quality, and Y-PSNR has been used as objective quality indicator. In addition, we used MSSIM to faithfully represent the subjective evaluation by a human observer. The overall performance has been calculated by averaging the PSNR and MSSIM values calculated at each frame of the video sequence. The performance of the UEP method has been directly compared with that of an EEP method.
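The receiver-side truncation rule described above can be emulated without running an actual R-S decoder, by drawing the loss of each piece from the residual-error model (1); the sketch below does exactly that. It is a simplified stand-in for the MATLAB simulation, and all numerical values (profile, C, s, d) are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_truncation(n_i, C, s, d, n_trials=2000):
    """Monte Carlo sketch of the truncation rule: a frame is reconstructed from the
    pieces preceding the first residually erroneous one. Instead of R-S decoding,
    piece i is declared corrupted with probability h(n_i) from the model (1)."""
    h = np.clip(C * np.exp(-s * (np.asarray(n_i, dtype=float) - d)), 0.0, 1.0)
    kept = np.empty(n_trials, dtype=int)
    for t in range(n_trials):
        errors = rng.random(h.size) < h          # which pieces are lost in this trial
        kept[t] = h.size if not errors.any() else int(np.argmax(errors))
    return kept                                  # number of error-free leading pieces per trial

# Toy UEP profile (bytes) and made-up model parameters
profile = [1798, 1723, 1652, 1581, 1507, 1430, 1346, 1248]
kept = simulate_truncation(profile, C=1e-3, s=0.0127, d=1200.0)
print(kept.mean())                               # average number of decodable pieces
```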

Additionally, comparisons with existing techniques in the literature have been done using a static reference image, the 512 × 512 pixel grayscale version of lena, compressed at a total bit rate (joint source and channel code rate) of R=0.5 bits/pixel. For each simulated channel error rate, at least 100 independent transmissions of the image have been repeated, and the results averaged. Since both the video sequence and the static image cases cover a standard-definition application scenario, we have also used a static image frame from the high-definition 1920 × 1080 pixel crowdrun RGB sequence [53], in order to show some properties of the calculated UEP profiles.

4.2 Simulated performance results

4.2.1 Performance for static images on BSC

We first report the performance obtained with static images. Both images (lena and crowdrun) were compressed to a total rate of R=2.5 bits/pixel, comprising both the source and the channel code bits.

Figure 2 (top) presents the UEP profile for lena, calculated with (3) at an average channel code rate of \(\bar {r}=32/44\) and a channel symbol error probability P_S = 5×10^{-2}. The UEP profile, represented by the solid line, is shown in terms of the protected piece length n_i versus the piece index i; the average protected piece length \(\bar {n}\), coincident with the EEP profile, is represented by the dashed line. Figure 2 (bottom) shows the equivalent case for crowdrun. In both cases, the piece length is k=1 600 bytes, and the channel parameter is s=0.0127 (see Appendix A.1 and Table 3). The dot-dashed line in Fig. 2 represents the protection profile obtained using (5), at rate steps corresponding to the situation illustrated so far. The UEP profiles begin with a high protection level (low code rate), which then gradually decreases as the piece index increases, as expected.

Fig. 2 UEP profiles for lena (top) and crowdrun (bottom), with code rate of \(\bar {r} = 32/44\), for k=1 600 bytes and equivalent channel characterized by s=0.0127 (P_S = 5×10^{-2}). The values of k and n are expressed in bytes

Figure 3 shows how the UEP profile is practically generated, for lena, using a proper combination of the \(\left \lbrace \tilde {r}_{l} \right \rbrace \) mother codes with the numbers of codewords α'_i and β'_i, for each pair of codes, as given by (7). The top subplot shows the mother code rates \(\tilde {r}_{w_{i}}\) and \(\tilde {r}_{w_{i}+1}\) used in each piece i, expressed in terms of codeword size \(\tilde {n}_{w_{i}}\). The bottom subplot shows the normalized amount \(\delta _{\alpha '_{i}}\) of codewords with rate \(\tilde {r}_{w_{i}}\) with respect to the total number of codewords used in piece i,

$$ \delta_{\alpha'_{i}} = \frac{\alpha'_{i}}{\alpha'_{i}+\beta'_{i}} . $$
((10))
Fig. 3 Generation of the UEP profile for lena with an average channel code rate of \(\bar {r} = 32/44\), k=1 600 and equivalent channel s=0.0127 (P_S = 5×10^{-2}). The used mother code rates are \(\left \lbrace \tilde {r}_{l} \right \rbrace = \left \lbrace 32/36, 32/38, 32/40, \ldots, 32/80 \right \rbrace \). The values of k and n are expressed in bytes

Clearly, \((1-\delta _{\alpha '_{i}})\) is the normalized amount of codewords with rate \(\tilde {r}_{w_{i} + 1}\).

Figure 4 shows the UEP profile n_i obtained for several values of the channel symbol error rate P_S (from 10^{-3} to 10^{-1}) and an average code rate \(\bar {r} = 32/46\), for lena. The used message word size is k=1 600 bytes, which corresponds to the floor of the plot. For low error rates, the protection profile is almost flat, meaning that an EEP solution is nearly optimal. At higher error rates, on the other hand, the profile climbs above the average protection level at the beginning and sinks below it toward the end. At even higher error rates, the average protection is not sufficient to keep the profile above the message size floor, and an increased protection rate is required for correct operation of the algorithm.

Fig. 4 UEP profiles for different channel symbol error rates for lena (k=1 600). The average protection code rate is \(\bar {r} = 32/46\). The floor of the plot represents k. The k and n values are given in bytes

Figure 5 shows the minimum required average protected piece size \(\bar {n}_{\text {min}}\) given by (4), plotted versus the channel symbol error rate P_S and the total bit rate R, using a message word size of k=1 024 bytes and the crowdrun image. The plot has been generated by including the channel code rate in the total bit rate. We note that, using this relationship, the system can adaptively respond to variations of the transmission channel conditions and of the available bandwidth, keeping the received video quality at the optimal level in all cases. Given a channel error rate, the minimum average protection slightly increases as the available bandwidth increases, meaning that higher protection is required to achieve the optimal UEP profile for that error rate P_S. It is also evident that, when the available bandwidth increases, the algorithm selects progressively higher levels of protection to maximize image quality. Clearly, in order to achieve the optimal profile for a given error rate, there must be some knowledge of the channel status at the transmitter. Thus, the receiver must be able to calculate an estimate \(\tilde {P}_{S}\) of the current channel symbol error rate and feed this information back to the transmission side. If channel conditions are slowly varying with respect to the information exchange rate, then this feedback can take place with minimal signaling requirements. Since our method relies on the secure delivery of the JPEG 2000 header (using a reliable transmission technique), these data can be repeated periodically (e.g., 1–2 times per second) on the same channel and used as pilot information for estimating \(\tilde {P}_{S}\).

Fig. 5 Minimum average protection required for crowdrun, plotted versus the channel symbol error rate and the total bit rate, in case of k=1 024. The values of k and n are expressed in bytes

The performance of our method has been measured using the PSNR and MSSIM quality metrics. In order to compare the results with similar methods referenced in [16, 57, 58], Monte Carlo simulations have been done at a total bit rate (joint source and channel code rate) of R=0.5 bits/pixel, and the metrics have been averaged. Figure 6 shows the achieved performance plotted versus the actual bit error probability P b before channel decoding at the receiver side. The simulations have been done with a total of nine different configurations: three possible piece length values k (512, 1 024, and 1 600 bytes) and three different average channel code rates \(\bar {r}\) (32/40, 32/44, and 32/48). In all the presented cases, it can be seen that shorter values of k allow achieving slightly higher PSNR/MSSIM values, for the same error rate on the channel.

Fig. 6 Performance plotted in terms of PSNR (a) and MSSIM (b) versus channel bit error rate, for \(\bar {r} = 32/40\) (solid lines), \(\bar {r} = 32/44\) (dashed lines), and \(\bar {r} = 32/48\) (dotted lines), respectively, and k=512 (diamond), k=1 024 (plus), k=1 600 (asterisk). The image is lena, compressed at 0.5 bpp

The results summarized in Table 2, for the case \(\bar {r} = 32/40\) and k=512, show that the achieved PSNR is comparable to or better than that obtained in [16, 57, 58]. There is an exception in the comparison with [58], which reports a higher PSNR; in that work, the authors used turbo codes with codewords longer than those used in our work, resulting in improved error correction capability. However, we point out that our algorithm is designed to find optimal UEP profiles, and using different error-correcting codes would result in different final PSNR values.

Table 2 PSNR (dB) for lena at 0.5 bpp, \(\bar {r} = 32/40\), and k=512 bytes

4.2.2 Performance for video sequence on BSC

As for the performance obtained on the video sequence, Fig. 7 shows the history of the quality metrics along the N_F = 512 frames of the test video. The piece length is k=1 024 bytes, the average code rate is \(\bar {r} = 32/48\), and the total rate is R=2.5 bits/pixel. Given the video frame rate (30 frames/s) and resolution (352 × 288), this corresponds to a transmitted bit rate of nearly R_b = 7.6 Mbit/s. The solid dark-green line (MAX) represents the maximum theoretically achievable PSNR at the receiver; it is due only to the compression artifacts introduced by the JPEG 2000 lossy source encoding. The performance indicators have been measured in the following way. First, the MSE ε_i of the luminance component of every decoded frame has been converted into logarithmic PSNR as Γ_i = 10 log_{10}(1/ε_i) and plotted versus the frame number i. Then, the average MSE has been calculated using the arithmetic mean of all MSEs, \(\bar {\epsilon } = \left (1/N_{F}\right)\sum _{i=0}^{N_{F} - 1}\epsilon _{i}\). Finally, the average PSNR is calculated from the average MSE as \(\bar {\Gamma }=10 \log _{10}{\left (1/\bar {\epsilon }\right)}\). For the MSSIM, the values ι_i obtained at each frame have been plotted versus the frame number i and arithmetically averaged as \(\bar {\iota } = \left (1/N_{F}\right)\sum _{i=0}^{N_{F} - 1}\iota _{i}\). Also in this case, the actual BER P_b, before channel decoding at the receiver side, is calculated and used to compare the performance.
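Note that the averaging operates on the MSE values, not on the per-frame PSNRs; a short sketch (toy values, with luminance normalized to [0, 1] as implied by Γ_i = 10 log_{10}(1/ε_i)):

```python
import numpy as np

eps  = np.array([0.0012, 0.0009, 0.0031, 0.0008])   # per-frame MSEs (toy values)
iota = np.array([0.91, 0.93, 0.82, 0.94])           # per-frame MSSIM values (toy)

psnr_frames = 10 * np.log10(1.0 / eps)              # Gamma_i, plotted frame by frame
psnr_avg    = 10 * np.log10(1.0 / eps.mean())       # average PSNR from the average MSE
mssim_avg   = iota.mean()
print(psnr_frames, psnr_avg, mssim_avg)
```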

Fig. 7 PSNR and MSSIM history for the test video sequence on the Q-ary channel. The piece length is k=1 024 bytes, the average code rate is \(\bar {r} = 32/48\), and the total rate is R=2.5 bits/pixel. The simulated channel has an error rate of P_S = 7×10^{-2}

The PSNR and MSSIM histories Γ_i and ι_i plotted in Fig. 7 are obtained on a simulated channel with an error rate of P_S = 7×10^{-2}, equivalent to a bit error rate of P_b = 3.5×10^{-2}. The red line depicts the quality of the received UEP frames after decompression, with additional artifacts due to the errors introduced by the loss of pieces during transmission on the channel. For comparison purposes, we have also reported the quality of an EEP profile (blue line), having the same total rate R. The average performance of the UEP method results in a PSNR of \(\bar {\Gamma }=25.3\) dB, while the PSNR of the EEP method is \(\bar {\Gamma }=19.6\) dB. Similarly, we have an MSSIM of \(\bar {\iota }=0.84\) and \(\bar {\iota }=0.66\) for the UEP and EEP methods, respectively.

Figure 8 shows the obtained average PSNR \(\bar {\Gamma }\) and MSSIM \(\bar {\iota }\) for different values of the channel SER P_S, varying from 2×10^{-2} to 10^{-1} (P_b varying from 10^{-2} to 5×10^{-2}). The results are plotted both for the UEP and EEP cases. The curves show that the proposed UEP method outperforms the EEP method, in terms of PSNR, by up to nearly 7 dB for P_S = 10^{-1} (P_b = 5×10^{-2}). It is also evident that the UEP method begins to provide better results than EEP starting from a SER of P_S = 3×10^{-2} (P_b = 1.5×10^{-2}), while for SER values lower than P_S = 2×10^{-2} (P_b = 10^{-2}) the two methods are equivalent. Similar considerations can be made for the MSSIM. In this case, the quality index of UEP begins to improve over the EEP one for SER higher than P_S = 4×10^{-2} (P_b = 2×10^{-2}); the protection advantage given by UEP over EEP is thus only slightly reduced when considering this more subjective-quality-related metric.

Fig. 8 Average PSNR and MSSIM for the test video sequence at different values of Q-ary channel BER. The piece length is k=1 024 bytes, the average code rate is \(\bar {r} = 32/48\), and the total rate is R=2.5 bits/pixel

Figure 9 shows the a posteriori cumulative distribution function (CDF) of the per-frame MSE, F_E(ε), defined as the computed probability that the MSE of a decompressed frame is lower than ε. The figure plots the CDFs for the UEP and EEP cases, and for two different values of channel SER, P_S = 5×10^{-2} (P_b = 2.5×10^{-2}) and P_S = 8×10^{-2} (P_b = 4×10^{-2}). By setting, for instance, a threshold probability of 0.9, we find that, for the higher SER, the UEP MSEs are lower than 0.009 whereas the EEP MSEs are lower than 0.05. Moreover, at the higher SER, the UEP curve gives lower MSEs than the EEP curve for every threshold probability. Similarly, at the lower SER, the 90 % threshold values are 0.0017 and 0.0025 for UEP and EEP, respectively. However, in this case, for a probability of 79 % and an MSE of 0.0008, the two curves cross each other. This peculiar fact can be explained in the following way: in the EEP case, there are a few values of ε_i that are much worse than the worst values obtained in the UEP case. On the contrary, UEP gives many ε_i values that are somewhat lower than the EEP ones, but they never become much worse. This fact is mainly responsible for the improved average PSNR \(\bar {\Gamma }\) exhibited by the UEP method over the EEP method in Fig. 8.

Fig. 9 CDF of MSE for the test video sequence at different values of Q-ary channel BER. The piece length is k=1 024 bytes, the average code rate is \(\bar {r} = 32/48\), and the total rate is R=2.5 bits/pixel

Further evidence that this phenomenon affects the decoded video quality is given by Fig. 10. The figure shows the measured probability that no decoding happens at all for the received compressed video frame, due to the presence of uncorrectable errors in the first piece (i=0). Both methods can successfully decode at least the first packet up to a SER of P_S = 4×10^{-2} (P_b = 2×10^{-2}). For larger error rates, the UEP method attains a maximum of about 5 % probability of no decoding, whereas the EEP method failure probability is an order of magnitude higher and grows up to 80 %. Figure 11 shows two decompressed frames (frames no. 6 and 360) obtained during the transmission on a channel with an error rate of P_S = 7×10^{-2} (P_b = 3.5×10^{-2}). In this case, the error sequence has been exactly the same for the UEP and the EEP methods. Figure 11 a and c display the frames obtained with EEP, while Fig. 11 b and d present the frames obtained with UEP. Simulation results and samples of the original and decompressed video clips are available for download [59].

Fig. 10 Probability of failed decoding for the test video sequence at different values of Q-ary channel BER. The piece length is k=1 024 bytes, the average code rate is \(\bar {r} = 32/48\), and the total rate is R=2.5 bits/pixel

Fig. 11 Comparison between decompressed frames using the EEP (a, c) and UEP (b, d) methods, for a Q-ary channel simulated error rate of P_S = 7×10^{-2}. Top row contains frame no. 6, bottom row contains frame no. 390

4.2.3 Performance on AWGN and Rayleigh channels

The performance of the proposed system has also been verified using the AWGN and uncorrelated (f_D ≥ R_b) Rayleigh flat fading channels, for the video sequence only (without loss of generality, the results apply also to the static image case). Figure 12 shows the average PSNR and MSSIM versus the average channel SNR γ_b. For both types of channels, UEP outperforms EEP. This is not surprising, as the proper combination of interleaving depth and channel coding results in an equivalent BSC, which we have already simulated. In case of correlated Rayleigh fading (f_D < R_b), the bit interleaver size has to be chosen to span over an amount of bits such that, after deinterleaving, the fades are practically uncorrelated. If transmission on a channel with Doppler spread f_D adopts an N_row × N_col block interleaver, then, after deinterleaving, the equivalent Doppler spread becomes N_row times higher, N_row f_D [60]. Thus, by properly choosing the interleaver dimension N_row, one can revert to the condition of uncorrelated Rayleigh fading, for which the performance is plotted in the right-side curves of Fig. 12. If, on the other hand, the Doppler spread is so low (as happens on nearly static NLOS channels, f_D ≪ R_b) that the interleaver size would exceed the available memory or the decoding delay bounds, then the periodic feedback from the receiver would allow the transmitter to adapt the protection profile and coding rate to the measured channel conditions. In this case, in the short term, the performance will be practically that of the AWGN case, for which the curves on the left side of Fig. 12 apply.

Fig. 12 Average PSNR and MSSIM for the test video sequence at different values of γ_b on the AWGN and Rayleigh fading channels. The piece length is k=1 024 bytes, the average code rate is \(\bar {r} = 32/48\), and the total rate is R=2.5 bits/pixel

4.3 Computational complexity

The optimization problem requires the knowledge of the distortion profile of the image. Using JPEG 2000 compression, the partial distortions M_i (and the CCD m_i) can be easily gathered during the rate allocation step of the JPEG 2000 bitstream preparation [55]; thus, these values are available virtually at no cost.

For the calculation of the C, s, and d coefficients, a look-up table (LUT) can be used to store the parameters, for different values of the packet size N k , of the channel bit/symbol error rate P b / P S , and possibly even for different channel coding algorithms (e.g., convolutional, binary R-S, low-density parity check codes). The LUT can then be accessed to provide the parameters that will be used in the protection profile generation, with a large saving with respect to storing the entire UEP profile, for each combination of the three variables. As for generating the coefficients stored in the LUT, they can be calculated off-line and smoothly interpolated to provide all the intermediate values that could be requested by the system.
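A minimal sketch of such a LUT, keyed by the channel symbol error rate and interpolated on a log scale, is shown below; the stored rows are placeholders (apart from the s = 0.0127 point quoted in Sec. 4.2.1), so only the mechanism is meaningful.

```python
import numpy as np

# Placeholder LUT rows for one piece size and one channel code: columns P_S, C, s, d
lut_ps = np.array([1e-3, 1e-2, 5e-2, 1e-1])
lut_C  = np.array([2e-4, 3e-3, 4e-2, 1.5e-1])
lut_s  = np.array([0.080, 0.035, 0.0127, 0.006])    # s decreases as the channel worsens
lut_d  = np.array([1650.0, 1700.0, 1800.0, 1900.0])

def channel_params(P_S):
    """Interpolate (C, s, d) at the estimated symbol error rate, log-spaced in P_S."""
    x, xs = np.log10(P_S), np.log10(lut_ps)
    return np.interp(x, xs, lut_C), np.interp(x, xs, lut_s), np.interp(x, xs, lut_d)

print(channel_params(3e-2))
```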

The calculation of the optimal protection profile in (3) depends on the geometric mean of the CCD function. In order to avoid overflow or underflow problems due to floating point operations rounding during the computation, the geometric mean should be calculated logarithmically as

$$ \hat{m} = e^{\frac{1}{N_{k}}\sum_{i=0}^{N_{k}-1} \ln m_{i}} \, $$
((11))

in which case it takes N_k logarithms, (N_k−1) additions, 1 division, and 1 exponentiation to be computed. Then, we need (N_k−1) additions for the computation of the CCD function, N_k multiplications for the logarithm operand (one division to obtain the inverse of the geometric mean), N_k logarithm operations, N_k multiplications for scaling the logarithm result (one division to obtain the inverse of s, if not already saved in this form in the LUT), and N_k additions. In summary, to implement (3), a total of (3N_k−2) additions, 2N_k multiplications, 2N_k logarithms, 3 divisions, and 1 exponentiation are needed. Assuming that natural logarithms and powers of e can be implemented by means of another LUT, with sufficient precision once the dynamic ranges of the operands have been characterized, the asymptotic complexity becomes \(\mathcal {O}\left (N_{k}\right)\). In contrast, the solutions presented in [16] or [58] require multiple evaluations of expressions similar to (2), which are computationally more cumbersome.

5 Conclusions

The transmission of video over error-prone wireless channels is a problem that can be solved by adding an adequate error protection layer to the streams, once the characteristics of the channel are known. In this work, we have presented a UEP strategy devised to allocate channel code bits, using an optimization algorithm that is computationally light in the search for the UEP profile. The general formulation of the problem has been solved using a Lagrangian minimization method. The derived closed-form UEP expression requires only readily available data, such as the image rate-distortion curve, the average error protection code rate, the typical allowed packet size for transmission, and the channel error model (represented by one parameter). In addition, we have also presented a practical method for implementing the derived UEP profile using R-S codes. The simulated performance of the proposed UEP strategy shows that it outperforms an EEP strategy, that its results are comparable with the UEP performance of other methods presented in similar works while having a lower computational complexity, and that this UEP method can be used to effectively counteract the impairments introduced by an error-prone transmission channel.

6 Appendix A

6.1 A.1 Loglinear approximation of residual BER curves

The expression (1), derived from a more general expression found in [51], approximates the functions h(n_l) with exponentials, at least in the regions where h(·) is lower than 10^{-1}, which is a common requirement. In order to show the validity of this approximation, we have simulated the performance of R-S error coding applied to the devised packetization scheme. Each piece of k bytes has been split into message words of \(\tilde {k}= 32\) bytes, and an R-S code with rate \(\tilde {k}/\tilde {n}\) has been applied to each word. Then, the resulting codewords have been concatenated, producing a piece of n bytes. Multiple pieces have been transmitted over a Q-ary channel (Q=256) with a defined symbol error rate P_S, and the residual error probability after decoding, h(n), has been measured. For a fixed P_S, the value of n has been increased and the measurement of the residual error rate performed again. This procedure has been repeated for several different values of P_S. Figure 13 shows the set of residual error probability curves obtained adopting a piece length k=1 600 bytes, for channel symbol error rates from P_S = 10^{-3} (P_b = 5×10^{-4}) to P_S = 10^{-1} (P_b = 5×10^{-2}). Similar sets of error rate curves have been obtained for different piece lengths. The resulting curves have been fitted using a nonlinear least-squares method in the logarithmic domain, thus providing the relevant model parameters; Table 3 lists the parameters C, s, and d for several piece lengths and channel symbol error probabilities P_S.
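In the region where (1) holds, ln h(n) is linear in n, so the decay s can be recovered with an ordinary least-squares fit in the log domain; the sketch below uses invented measurements. Note that, over a purely exponential segment, C and d are only jointly identifiable (only the product C e^{sd} matters), so one of them has to be fixed in order to report the other.

```python
import numpy as np

# Measured residual piece error rates versus coded piece size n (toy values, bytes)
n_obs = np.array([1700.0, 1800.0, 1900.0, 2000.0, 2100.0])
h_obs = np.array([8.0e-2, 2.3e-2, 6.5e-3, 1.8e-3, 5.1e-4])

# ln h(n) = (ln C + s d) - s n: a straight-line fit gives the decay s directly
slope, intercept = np.polyfit(n_obs, np.log(h_obs), 1)
s = -slope
d = 1600.0                                   # fixed here, e.g., at the piece size k
C = np.exp(intercept + slope * d)            # C consistent with the chosen d
print(s, C)                                  # s close to 0.0127 for these toy numbers
```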

Fig. 13 Simulated (solid line) and modeled (dashed line) piece error rate performance for R-S coding, using a mother code from \(\left \lbrace \tilde {r}_{l} \right \rbrace \) and a piece size of k=1 600 (n and k are expressed in bytes). The channel symbol error rate P_S varies from 10^{-3} to 10^{-1} (curve styles are in the left-side legend)

Table 3 Fitting parameters C, s, and d found for the curves in Fig. 13

In Fig. 13, solid lines represent the results of simulations, whereas the dashed lines are obtained by evaluating (1) with the best-fit model parameters of Table 3, for every channel symbol error rate.

The following can be said on C, s, and d parameters:

  • The amplitude C is generally much lower than 1.

  • The decay s increases as the channel/channel code performance improves.

  • The offset d is the point where the error rate curve becomes linear, and it is higher than 10^{-2}.

  • d is greater than the values of k that we have used.

For other types of transmission methods, such as, for instance, BPSK on AWGN or Rayleigh fading channels, if a bit/byte interleaver that spreads consecutive errors apart is present, then the results and comments discussed above are still valid.

6.2 A.2 Proof of Lemma 1

Proof.

The constrained minimization problem expressed by (2) can be solved using the Lagrange multipliers method, as

$$\begin{array}{@{}rcl@{}}{} &&\frac{\partial}{\partial{n_{i}}}\sum\limits_{j=0}^{{N_{k}}-1}{\left({M_{j}}{p_{j}}\left(n_{0},n_{1}, \ldots,n_{j}\right)-\lambda{n_{j}}\right)} = 0 \, ,\\ && \ 0 \le i < {N_{k}} . \end{array} $$
((12))

First, we simplify the probability of having no received errors up to piece i, p_i = p_i(n_0, n_1, …, n_i). We suppose that this probability is expressed by the product of the correct reception probabilities of the individual pieces, since they are decoded independently of each other, as

$$ {p_{j}}\left({{n_{0}},{n_{1}}, \ldots,{n_{j}}}\right) = \prod\limits_{l = 0}^{j} {\left({1 - h\left({{n_{l}}} \right)} \right)} \, $$
((13))

where h(n_l) is the probability that a piece of k bits (n_l bits after channel encoding) is received with errors. When the residual error probabilities h(n_l) are small, the products in (13) can be approximated as

$$ \prod\limits_{l = 0}^{j} {\left({1 - h\left({{n_{l}}} \right)} \right)} \cong 1 - \sum\limits_{l = 0}^{j} {h\left({{n_{l}}} \right)} \, $$
((14))

since products of h(n_l) terms can be neglected. Supposing that all h(n_l) have the same value, it can be found that when h(n_l) < 2×10^{-2}, the approximation (14) is valid with an error lower than 10 %.

Differentiating (13) with respect to n_i and applying the approximation (14) to the remaining product, we obtain

$$ \frac{{\partial{p_{j}}}}{{\partial{n_{i}}}} = - \frac{{\partial h\left({{n_{i}}} \right)}}{{\partial {n_{i}}}}\left({1 - \sum\limits_{\substack{l = 0 \\ l \ne i} }^{j} {h\left({{n_{l}}} \right)}} \right) . $$
((15))

With the approximations (1) and (15), and neglecting the terms in C^2 (since C ≪ 1), (12) becomes

$$ {e^{- s{n_{i}}}} = \frac{\lambda}{{sC{e^{sd}}\sum\limits_{j = i}^{{N_{k}} - 1} {{M_{j}}} }}, \ 0 \le i < {N_{k}} . $$
((16))

The summation at the denominator of (16) is the CCD \({m_{i}} = \Sigma _{j = i}^{{N_{k}} - 1}{M_{j}}\), which is a nonincreasing function of i (as i increases, there are fewer M_j's to sum). Then, n_i is found to be

$$ {n_{i}} = d + \frac{1}{s}\ln \left({\frac{{sC{m_{i}}}}{\lambda }} \right), \ 0 \le i < {N_{k}} . $$
((17))

In order to find the Lagrange multiplier λ, we use the constraint in (2). After some work, the constraint becomes

$$ \lambda = sC{e^{- \frac{{s\left({T - D} \right)}}{{{N_{k}}}}}}\hat{m} \, $$
((18))

where \(\hat {m} = {(\Pi _{i = 0}^{{N_{k}} - 1}{m_{i}})^{1/{N_{k}}}}\) is the geometric mean of the CCD, \(T = B(\bar {n} - k)/k\), and \(D=N_{k}\left(d-k\right)\). We can substitute (18) into (17) and find the closed-form solution to the minimization problem

$$\begin{array}{*{20}l} {n_{i}} &= \frac{k}{{\bar{r}}} + \frac{1}{{s{N_{k}}}}\ln \left({\frac{{{{\left({\sum\limits_{j = i}^{{N_{k}} - 1} {{M_{j}}}} \right)}^{{N_{k}} - 1}}}}{{\prod\limits_{\substack{l = 0 \\ l \ne i} }^{{N_{k}} - 1} {\left({\sum\limits_{j = l}^{{N_{k}} - 1} {{M_{j}}}} \right)} }}} \right) \\ &= \bar{n} + \frac{1}{s}\ln \frac{{{m_{i}}}}{{\hat{m}}}, \ 0 \le i < {N_{k}} . \end{array} $$
((19))
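
As an implementation-oriented note, the closed-form allocation (19) maps directly to a few lines of code, given a per-piece distortion trace \(M_j\), the decay s of the error rate model, and the average codeword length \(\bar{n}\) allowed by the rate budget. The sketch below is ours and uses purely illustrative values; in practice, the resulting lengths would also be rounded to admissible R-S codeword sizes.

```python
import numpy as np

def uep_profile(M, n_bar, s):
    """Closed-form UEP allocation of (19): n_i = n_bar + (1/s) * ln(m_i / m_hat)."""
    M = np.asarray(M, dtype=float)
    m = np.cumsum(M[::-1])[::-1]          # CCD: m_i = sum of M_j for j >= i
    m_hat = np.exp(np.mean(np.log(m)))    # geometric mean of the CCD
    return n_bar + np.log(m / m_hat) / s

# Illustrative, exponentially decaying per-piece MSE trace (not from the paper).
M = 1000.0 * 0.5 ** np.arange(8)
n = uep_profile(M, n_bar=220.0, s=0.14)
print(np.round(n, 1))   # the mean equals n_bar; earlier pieces get longer codewords
```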

6.3 A.3 Proof of Lemma 3

Proof.

If the exact R-D curve of the image is not known, it is still possible to calculate an approximate UEP profile from the MSE profile, sampled at steps of \(\Delta\rho\) bit/symbol, using the bounds

$$ \frac{1}{{2\pi e}}{2^{- 2\left({i \Delta \rho - H} \right)}} < {M_{i}} < {\sigma^{2}}{2^{- 2 i \Delta \rho}} \, $$
((20))

for \(0 \le i < N_k\), where the lower bound involves the differential entropy H of the actual source, while the upper bound is obtained under the hypothesis of a Gaussian source, i.e., encoded image coefficients that are Gaussian distributed with variance \(\sigma^2\).

We start by expressing the CCD, for both bounds in (20), as

$$ {m_{i}} = K\sum\limits_{j = i}^{{N_{k}} - 1} {{{\left({{2^{- 2\Delta \rho }}} \right)}^{j}}} = K\frac{{{2^{- 2\Delta \rho i}} - {2^{- 2\Delta \rho {N_{k}}}}}}{{1 - {2^{- 2\Delta \rho }}}} \, $$
((21))

where either \(K = 2^{2H-1}/(\pi e)\) or \(K = \sigma^{2}\). Considering \(i \ll N_{k}\) and \(N_{k}\) large, (21) can be approximated by \(m_{i} \cong K\,2^{-2\Delta\rho i}/(1 - 2^{-2\Delta\rho})\). The geometric mean is expressed and approximated by

$$\hat{m} \cong \frac{K}{{1 - {2^{- 2\Delta \rho }}}}{2^{- \frac{{2\Delta \rho }}{{{N_{k}}}}\sum\limits_{i = 0}^{{N_{k}} - 1} i }} = K\frac{{{2^{- \Delta \rho \left({{N_{k}} - 1} \right)}}}}{{1 - {2^{- 2\Delta \rho }}}} \, $$

when \(N_k\) is large. Finally, the approximated protection profile \(n^{\prime }_{i}\) is given by

$$\begin{array}{*{20}l} n'_{i} &\cong \bar{n} + \frac{1}{s}\ln \left({\frac{{K{2^{- 2\Delta \rho i}}}}{{\hat{m}\left({1 - {2^{- 2\Delta \rho }}} \right)}}} \right)\\ & = \bar{n} + \frac{{\Delta \rho \ln 2}}{s}\left({{N_{k}} - 1 - 2i} \right) \,\\ & 0 \le i \ll {N_{k}}, \ {N_{k}} \, \, {\text{large}} . \end{array} $$
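
The approximate profile is therefore linear in the piece index i, decreasing by \(2\Delta\rho \ln 2 / s\) for each successive piece, so that earlier (more important) pieces receive longer codewords. The following sketch (ours; all numerical values are illustrative) compares this approximation with the exact allocation (19) when the MSE trace itself decays as \(2^{-2\Delta\rho j}\):

```python
import numpy as np

N_k, delta_rho, s, n_bar = 32, 0.25, 0.14, 220.0

# Exact profile (19) for the model MSE trace M_j = K * 2**(-2 * delta_rho * j);
# the constant K cancels in the ratio m_i / m_hat, so it is omitted here.
M = 2.0 ** (-2.0 * delta_rho * np.arange(N_k))
m = np.cumsum(M[::-1])[::-1]
m_hat = np.exp(np.mean(np.log(m)))
n_exact = n_bar + np.log(m / m_hat) / s

# Approximate linear profile of Lemma 3.
i = np.arange(N_k)
n_approx = n_bar + (delta_rho * np.log(2.0) / s) * (N_k - 1 - 2 * i)

print(np.round(n_exact[:4], 1))   # close to the linear profile for i << N_k
print(np.round(n_approx[:4], 1))
```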

References

  1. H Singh, J Oh, C Kweon, X Qin, H-R Shao, CY Ngo, A 60 GHz wireless network for enabling uncompressed video communication. IEEE Commun. Mag. 46(12), 71–78 (2008). doi:10.1109/MCOM.2008.4689210.

  2. CJ Hansen, WiGiG: Multi-gigabit wireless communications in the 60 GHz band. IEEE Wireless Commun. 18(6), 6–7 (2011). doi:10.1109/MWC.2011.6108325.

  3. S Yong, C-C Chong, An overview of multigigabit wireless through millimeter wave technology: potentials and technical challenges. EURASIP J. Wirel. Commun. Netw. 2007(078907) (2007). doi:10.1155/2007/78907.

  4. G Lawton, Wireless HD video heats up. IEEE Computer. 41(12), 18–20 (2008). doi:10.1109/MC.2008.509.

  5. S Srinivasan, in Proc. of 20th Int. Conf. on Computer Commun. and Networks (ICCCN). An assessment of technologies for in-home entertainment (IEEE, Maui, Hawaii, 2011), pp. 1–6. doi:10.1109/ICCCN.2011.6005803.

  6. ISO/IEC, JPEG 2000 image coding system – part 1: core coding system (ISO/IEC 15444-1, Int. Standards Org./Int. Electrotech. Comm., Geneva, Switzerland, 2004).

  7. F-O Devaux, C De Vleeschouwer, Parity bit replenishment for JPEG 2000-based video streaming. EURASIP J. Image Video Process. 2009(1), 683820 (2009). doi:10.1155/2009/683820.

  8. ISO/IEC, JPEG 2000 profiles for broadcast applications (ISO/IEC 15444-1:2004/Amd 3:2010, Int. Standards Org./Int. Electrotech. Comm., Geneva, Switzerland, 2010).

  9. ISO/IEC, ISO/IEC 13818-1:2007/FPDAM 5 (Int. Standards Org./Int. Electrotech. Comm., Geneva, Switzerland, 2010).

  10. S Pejoski, V Kafedziski, in Proc. of 5th European Conf. on Circuits and Sys. for Commun. (ECCSC). Causal video transmission over fading channels with full channel state information (IEEE, Belgrade, Serbia, 2010), pp. 294–297.

  11. M Etoh, T Yoshimura, Advances in wireless video delivery. Proc. IEEE. 93(1), 111–122 (2005). doi:10.1109/JPROC.2004.839605.

  12. A Albanese, J Blomer, J Edmonds, M Luby, M Sudan, Priority encoding transmission. IEEE Trans. Inf. Theory. 42(6), 1737–1744 (1996). doi:10.1109/18.556670.

  13. R de Albuquerque, D Cunha, C Pimentel, On the complexity-performance trade-off in soft-decision decoding for unequal error protection block codes. EURASIP J. Adv. Signal Process. 2013(1), 28 (2013). doi:10.1186/1687-6180-2013-28.

  14. ISO/IEC, JPEG 2000 image coding system – part 11: wireless (ISO/IEC WD2.0 15444-11, Int. Standards Org./Int. Electrotech. Comm., Geneva, Switzerland, 2003).

  15. M Agueh, J-F Diouris, M Diop, F-O Devaux, C De Vleeschouwer, B Macq, Optimal JPWL forward error correction rate allocation for robust JPEG 2000 images and video streaming over mobile ad hoc networks. EURASIP J. Adv. Signal Process. 2008(1), 192984 (2008). doi:10.1155/2008/192984.

  16. G Baruffa, P Micanti, F Frescura, Error protection and interleaving for wireless transmission of JPEG 2000 images and video. IEEE Trans. Image Process. 18(2), 346–356 (2009). doi:10.1109/TIP.2008.2008421.

  17. C Mairal, M Agueh, in Proc. of 2nd Int. Conf. on Advan. in Multimedia (MMEDIA). Smooth and scalable wireless JPEG 2000 images and video streaming with dynamic bandwidth estimation (IARIA, Athens, Greece, 2010), pp. 174–179. doi:10.1109/MMEDIA.2010.40.

  18. M Murroni, A power-based unequal error protection system for digital cinema broadcasting over wireless channels. Signal Process. Image Commun. 22(3), 331–339 (2007). doi:10.1016/j.image.2006.12.006.

  19. MI Iqbal, H-J Zepernick, U Engelke, in Proc. of 2nd Int. Conf. on Signal Process. and Commun. Sys. (ICSPCS 2008). Error sensitivity analysis for wireless JPEG 2000 using perceptual quality metrics (IEEE, Gold Coast, Australia, 2008), pp. 1–9. doi:10.1109/ICSPCS.2008.4813665.

  20. MI Iqbal, H-J Zepernick, U Engelke, in Proc. of 6th Int. Symp. on Wireless Commun. Sys. (ISWCS 2009). Perceptual-based quality assessment of error protection schemes for wireless JPEG 2000 (IEEE, Siena, Italy, 2009), pp. 348–352. doi:10.1109/ISWCS.2009.5285217.

  21. W Xiang, A Clemence, J Leis, Y Wang, in Proc. of 7th Int. Conf. on Inf., Commun. and Signal Process. (ICICS 2009). Error resilience analysis of wireless image transmission using JPEG, JPEG 2000 and JPWL (IEEE, Macau, China, 2009), pp. 1–6. doi:10.1109/ICICS.2009.5397742.

  22. KM Alajel, W Xiang, J Leis, in Proc. of 4th Int. Conf. on Signal Proc. and Commun. Sys. (ICSPCS). Error resilience performance evaluation of H.264 I-frame and JPWL for wireless image transmission (IEEE, Gold Coast, Australia, 2010), pp. 1–7. doi:10.1109/ICSPCS.2010.5709766.

  23. C Bergeron, B Gadat, C Poulliat, D Nicholson, in Proc. of 17th IEEE Int. Conf. on Image Process. (ICIP). Extrinsic distortion based source-channel allocation for wireless JPEG2000 transcoding systems (IEEE, Hong Kong, China, 2010), pp. 4469–4472. doi:10.1109/ICIP.2010.5651228.

  24. J Chakareski, PA Chou, Application layer error-correction coding for rate-distortion optimized streaming to wireless clients. IEEE Trans. Commun. 52(10), 1675–1687 (2004). doi:10.1109/TCOMM.2004.836436.

  25. PA Chou, Z Miao, Rate-distortion optimized streaming of packetized media. IEEE Trans. Multimedia. 8(2), 390–404 (2006). doi:10.1109/TMM.2005.864313.

  26. ISO/IEC, Coding of audio-visual objects – part 10: advanced video coding (ISO/IEC 14496-10, Int. Standards Org./Int. Electrotech. Comm., Geneva, Switzerland, 2010).

  27. P Cataldi, M Grangetto, T Tillo, E Magli, G Olmo, Sliding-window raptor codes for efficient scalable wireless video broadcasting with unequal loss protection. IEEE Trans. Image Process. 19(6), 1491–1503 (2010). doi:10.1109/TIP.2010.2042985.

  28. L Liang, P Salama, E Delp, Unequal error protection techniques based on Wyner-Ziv coding. EURASIP J. Image Video Process. 2009(1), 474689 (2009). doi:10.1155/2009/474689.

  29. S Ahmad, R Hamzaoui, MM Al-Akaidi, Unequal error protection using fountain codes with applications to video communication. IEEE Trans. Multimedia. 13(1), 92–101 (2011). doi:10.1109/TMM.2010.2093511.

  30. J Lu, A Nosratinia, B Aazhang, in Proc. of Int. Conf. on Image Process. (ICIP 98), 2. Progressive source-channel coding of images over bursty error channels (IEEE, Chicago, Illinois, 1998), pp. 127–131. doi:10.1109/ICIP.1998.723331.

  31. VM Stankovic, R Hamzaoui, Z Xiong, Real-time error protection of embedded codes for packet erasure and fading channels. IEEE Trans. Circuits Syst. Video Technol. 14(8), 1064–1072 (2004). doi:10.1109/TCSVT.2004.831964.

  32. N Thomos, NV Boulgouris, MG Strintzis, Optimized transmission of JPEG2000 streams over wireless channels. IEEE Trans. Image Process. 15(1), 54–67 (2006). doi:10.1109/TIP.2005.860338.

  33. Y Zhang, S Qin, B Li, Z He, Rate-distortion optimized unequal loss protection for video transmission over packet erasure channels. Signal Process. Image Commun. 28(10), 1390–1404 (2013). doi:10.1016/j.image.2013.05.009.

  34. N Ramzan, S Wan, E Izquierdo, Joint source-channel coding for wavelet-based scalable video transmission using an adaptive Turbo code. EURASIP J. Image Video Process. 2007(1), 047517 (2007). doi:10.1155/2007/47517.

  35. C-P Ho, C-J Tsai, Content-adaptive packetization and streaming of wavelet video over IP networks. EURASIP J. Image Video Process. 2007(1), 045201 (2007). doi:10.1155/2007/45201.

  36. S Bezan, S Shirani, RD optimized, adaptive, error-resilient transmission of MJPEG2000-coded video over multiple time-varying channels. EURASIP J. Adv. Signal Process. 2006(1), 079769 (2006). doi:10.1155/ASP/2006/79769.

  37. C Schwartz, F da Silva Marques, M da Silva Pinho, in International Telecommunications Symposium (ITS). An unequal coding scheme for remote sensing systems based on CCSDS recommendations (IEEE, Sao Paulo, Brazil, 2014), pp. 1–5. doi:10.1109/ITS.2014.6947971.

  38. D Pascual Biosca, M Agueh, in Mobile Multimedia Communications. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, 79, ed. by L Atzori, J Delgado, and D Giusto. Optimal interleaving for robust wireless JPEG 2000 images and video transmission (Springer, Berlin Heidelberg, 2012), pp. 217–226. doi:10.1007/978-3-642-30419-4-19.

  39. M Agueh, S Ataman, H Soude, in Fourth International Conference on Communications and Networking in China (ChinaCOM 2009). A low time-consuming smart FEC rate allocation scheme for robust wireless JPEG 2000 images and video transmission (IEEE, Xi’an, China, 2009), pp. 1–5. doi:10.1109/CHINACOM.2009.5339854.

  40. J Abot, C Olivier, C Perrine, Y Pousset, A link adaptation scheme optimized for wireless JPEG 2000 transmission over realistic MIMO systems. Signal Process. Image Commun. 27(10), 1066–1078 (2012). doi:10.1016/j.image.2012.08.003.

  41. F Fiorucci, G Baruffa, P Micanti, F Frescura, in IEEE International Conference on Multimedia and Expo (ICME). A real-time, DSP-based JPWL implementation for wireless High Definition video transmission (IEEE, Barcelona, Spain, 2011), pp. 1–4. doi:10.1109/ICME.2011.6012054.

  42. MI Iqbal, H-J Zepernick, in International Symposium on Communications and Information Technologies (ISCIT). Error protection for wireless imaging: providing a trade-off between performance and complexity (IEEE, Tokyo, Japan, 2010), pp. 249–254. doi:10.1109/ISCIT.2010.5664847.

  43. MI Iqbal, H-J Zepernick, A framework for error protection of region of interest coded images and videos. Signal Process. Image Commun. 26(4–5), 236–249 (2011). doi:10.1016/j.image.2011.03.001.

  44. S Bahmani, IV Bajic, A HajShirmohammadi, Joint decoding of unequally protected JPEG2000 bitstreams and Reed-Solomon codes. IEEE Trans. Image Process. 19(10), 2693–2704 (2010). doi:10.1109/TIP.2010.2049529.

  45. M Ouaret, F Dufaux, T Ebrahimi, Error-resilient scalable compression based on distributed video coding. Signal Process. Image Commun. 24(6), 437–451 (2009). doi:10.1016/j.image.2009.02.011.

  46. E Baccaglini, T Tillo, G Olmo, Image and video transmission: a comparison study of using unequal loss protection and multiple description coding. Multimedia Tools Appl. 55(2), 247–259 (2011). doi:10.1007/s11042-010-0574-3.

  47. Z Chen, M Xu, L Yin, J Lu, in International Conference on Wireless Communications and Signal Processing (WCSP). Unequal error protected JPEG 2000 broadcast scheme with progressive fountain codes (IEEE, Nanjing, China, 2011), pp. 1–5. doi:10.1109/WCSP.2011.6096843.

  48. T Nakachi, Y Tonomura, T Fujii, in 7th International Conference on Signal Processing and Communication Systems (ICSPCS). A conceptual foundation of NSCW transport design using an MMT standard (IEEE, Gold Coast, Australia, 2013), pp. 1–6. doi:10.1109/ICSPCS.2013.6723976.

  49. G Baruffa, F Frescura, P Micanti, B Villarini, in Proc. of 19th IEEE Int. Conf. on Image Process. (ICIP 2012). An optimal method for searching UEP profiles in wireless JPEG 2000 video transmission (IEEE, Orlando, FL, 2012), pp. 1645–1648. doi:10.1109/ICIP.2012.6467192.

  50. L Pu, MW Marcellin, B Vasic, A Bilgin, in Proc. of IEEE Int. Conf. on Image Process. (ICIP 2005), 3. Unequal error protection and progressive decoding for JPEG2000 (IEEE, Genoa, Italy, 2005), pp. 896–899. doi:10.1109/ICIP.2005.1530537.

  51. S Feldmann, M Radimirsch, in IEEE Int. Symp. on Pers., Indoor and Mobile Radio Commun. (PIMRC 2002), 3. A novel approximation method for error rate curves in radio communication systems (IEEE, Lisboa, Portugal, 2002), pp. 1003–1007. doi:10.1109/PIMRC.2002.1045178.

  52. I Amonou, N Cammas, S Kervadec, S Pateux, Optimized rate-distortion extraction with quality layers in the scalable extension of H.264/AVC. IEEE Trans. Circuits Syst. Video Technol. 17(9), 1186–1193 (2007). doi:10.1109/TCSVT.2007.906870.

  53. Xiph.org media. https://media.xiph.org/video/derf/. Accessed 1 March 2016.

  54. Kakadu v. 6.0. http://www.kakadusoftware.com. Accessed 1 March 2016.

  55. D Taubman, M Marcellin, JPEG2000: Image Compression Fundamentals, Standards and Practice (Springer, New York, NY, 2002).

  56. JG Proakis, M Salehi, Digital Communications: Fifth Edition (McGraw-Hill Education, Singapore, 2007).

  57. V Sanchez, MK Mandal, in Proc. of Int. Conf. on Consumer Electronics (ICCE 2002). Robust transmission of JPEG 2000 images over noisy channels (IEEE, Los Angeles, CA, USA, 2002), pp. 80–81. doi:10.1109/ICCE.2002.1013935.

  58. BA Banister, B Belzer, TR Fischer, Robust image transmission using JPEG2000 and turbo-codes. IEEE Signal Process. Lett. 9(4), 117–119 (2002). doi:10.1109/97.1001646.

  59. G Baruffa, F Frescura, Adaptive error protection coding for wireless transmission of motion JPEG 2000 video. http://dante.diei.unipg.it/~baruffa/uep2015/. Accessed 1 March 2016.

  60. J Lai, NB Mandayam, Performance of Reed-Solomon codes for hybrid-ARQ over Rayleigh fading channels under imperfect interleaving. IEEE Trans. Commun. 48(10), 1650–1659 (2000). doi:10.1109/26.871390.


Acknowledgements

The authors thank Paolo Micanti and Barbara Villarini for their help with the theoretical aspects and the simulations carried out for this paper.

Author information


Corresponding author

Correspondence to Giuseppe Baruffa.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

GB and FF worked jointly on the development of the theoretical model. GB performed the simulations and drafted the manuscript; FF carried out the statistical analysis of video and helped to draft the manuscript. All authors read and approved the final manuscript.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.


About this article


Cite this article

Baruffa, G., Frescura, F. Adaptive error protection coding for wireless transmission of motion JPEG 2000 video. J Image Video Proc. 2016, 10 (2016). https://doi.org/10.1186/s13640-016-0111-z


Keywords