
Efficient bitstream switching for streaming of H.264/AVC coded video


In this article, a novel scheme is proposed for switching among multiple bitrate video streams over the Internet to achieve the best quality of service (QoS). The research focuses on selecting the best switching points inside pre-coded variable bit rate video streams, to facilitate bandwidth scalability among multiple non-scalable bitstreams. With a stepwise monotonically decreasing ("downstairs") bandwidth reservation scheme, it is shown that the best QoS is achieved if switching occurs at the transition points of the "downstairs" function. However, insertion of switching pictures at these transition points may change the characteristics of the "downstairs" function and hence the bandwidth requirement. Here, we propose a scheme for proper management of the allocated resources to facilitate stream switching in the "downstairs" through SP-frames. The performance is measured in terms of the wastage of bits in the receiver buffer and the bandwidth utilization at the switching instants.

1. Introduction

In video streaming applications, such as video-on-demand, archived video news, and non-interactive distance learning, video sequences are generally encoded offline and stored in a server [1]. To receive and play the stored video at any time, users may access the server over a shared channel such as the Internet. For continuous display of variable bit rate (VBR)-coded video over such a time varying network, a part of the video bitstream needs to be pre-loaded in the receiver buffer to ensure that every frame is decoded at its scheduled time.

One of the key problems in video streaming is matching the data rate of the transmitted video to the varying network conditions [2]. This problem can be addressed either by adapting the video bit rate to the available channel bandwidth or by configuring the network resources to accommodate the video bit rate through some quality of service (QoS) control mechanism [3]. To adapt the bit rate of the transmitted video to the available bandwidth, codecs may generate a scalable bitstream [4, 5]. In the standard codecs, limited scalability is achieved using a layered bitstream, but every additional scalable layer reduces the coding efficiency [6]. A better method for bandwidth adaptation is to switch dynamically among multiple independently encoded bitstreams of the same video that differ in quality and bit rate [7].

Video can be coded as VBR for better quality or as constant bit rate (CBR) for easy transmission over bandwidth-varying channels [8]. Usually, for streaming purposes, the video is transmitted at a rate much higher than the actual bit rate of the VBR stream [9], leading to bit accumulation in the receiver buffer and poor bandwidth utilization. To transmit high-quality VBR video easily over the Internet, a number of techniques are used in which the VBR is mapped to CBR and bandwidth is reserved using a QoS control mechanism that supports resource reservation, such as RSVP [10]. Some of these approaches are as follows: PEAK [11] is based on the peak rate, where the servers set up service connections with a bandwidth equal to the peak rate of the video; the bandwidth allocation mechanism in [12] allocates the peak data rate until the frame having the maximum data rate is transmitted, then reduces the allocation to the next highest data rate, and so on; and a monotonically decreasing rate scheduler is suggested in [8], in which the steps are determined according to the maximum slope of the bit rate. Another monotonically decreasing rate scheduler (referred to as "downstairs") is proposed in [13], which is based on the moving average of bit rates. However, all these techniques suffer from poor bandwidth utilization and bit wastage when the bandwidth is adapted dynamically to map the bit rate to the available channel capacity.

Among the various rate allocation algorithms, "downstairs" has some interesting properties that make it suitable for video streaming, and it can be used for efficient switching among H.264/AVC coded video streams of different qualities, to map the rate of VBR video content to the available bandwidth. Its most important characteristic is that the reservation function has a "downstairs" shape: every new step has a value less than its previous step, and therefore there is no risk of reservation modification failure. The first step in the "downstairs" curve corresponds to the initial bandwidth required to start the streaming. The motivation behind switching between various "downstairs" functions in a resource reservation environment is to reduce the initial reserved bandwidth. For instance, in a highly congested network, the requested bandwidth for a high-quality video stream may be denied. Transmission may therefore be initiated with a relatively lower quality video (hence lower bit rate) and then switched to a higher quality video, either when sufficient bandwidth becomes available or when the bandwidth requirement of the high-quality video has decreased with time, which is a property of the monotonically decreasing "downstairs" reservation function. Also, as the bandwidth requirement in "downstairs" decreases monotonically, once the bandwidth is allocated the possibility of denial of future requests is greatly reduced.

In H.264/AVC, a new type of frame, namely, the switching predictive frame (SP-frame), has been defined for drift-free switching between streams [14, 15]. The standard specifies two types of SP-frames, namely, primary and secondary SP-frames (throughout this article, they are referred to as SP and SSP frames, respectively, while "switching frame" refers to the whole concept). The SP-frames are inserted at the probable switching points, whereas the SSP-frames, which are a mismatch-free version of the SP-frames, are only used when the actual switching occurs.

To utilize the available bandwidth and to reduce the wastage of bits, this article combines the concept of switching frames introduced in H.264/AVC with the rate allocation algorithm "downstairs" for drift-free switching among multiple bitrate video streams over the Internet, to achieve best QoS. The QoS is achieved in terms of bit wastage and bandwidth utilization by selecting best switching points inside the pre-coded VBR video streams.

The rest of the article is organized as follows: the "downstairs" function and some fundamental parameters necessary for the analysis of the "downstairs" scheme are described in Section 2. Analysis of "downstairs" with and without switching frames is given in Section 3. Section 4 describes switching with "downstairs" reservation scheme. Section 5 includes simulation results and finally concluding remarks are given in Section 6.

2. Preliminary concepts for streaming using "downstairs"

In this section, we review the concept of monotonically decreasing rate scheduler (i.e., "downstairs"), switching frames, and some fundamental parameters used to evaluate the QoS in video streaming that are necessary to follow the rest of the article.

2.1 Review of monotonically decreasing rate scheduler (downstairs)

The basic philosophy of the "downstairs" reservation scheme is that certain characteristics of a pre-coded VBR video can be exploited for efficient network resource allocation, e.g., to reduce the required bandwidth, some form of bit rate smoothing can be done [16].

In the "downstairs" reservation scheme based on a moving-average bit rate strategy, the bandwidth at each step is calculated as follows [13]. Consider an encoded video sequence of $N$ frames (numbered 0 to $N - 1$) in which the $k$th frame is coded with $r_k$ bits. The moving average $A_i$ of the first $i + 1$ frames is then defined as:

$$A_i = \frac{\sum_{k=0}^{i} r_k}{i + 1}, \qquad 0 \le i < N \tag{1}$$

The largest average $A_j$ among the moving averages $A_0, A_1, A_2, \ldots, A_{N-1}$ is selected as the height of the first step, whose width extends from frame 0 to the $j$th frame. The moving average is then recomputed starting from the $(j + 1)$th frame, and a new largest value (say $A_m$) is selected as the height of the next step, extending from the $(j + 1)$th frame to the $m$th frame ($m > j$). It can easily be verified that $A_m$ is always less than $A_j$. This process is repeated until the last frame of the video sequence, thus generating a "downstairs" type curve.
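The step construction described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in [13]: the function name `downstairs_steps`, the exact-arithmetic choice (`Fraction`), the tie-breaking rule (on equal averages the later frame is taken, so that the next step is strictly lower), and the sample bit counts are all assumptions made here for illustration.

```python
from fractions import Fraction
from itertools import accumulate


def downstairs_steps(bits):
    """Return [(last_frame_index, step_height), ...] for per-frame bit counts.

    For each step, the moving average of Equation 1 is computed from the
    first frame not yet covered, and the largest average becomes the height.
    """
    steps = []
    start = 0
    while start < len(bits):
        running = accumulate(bits[start:])
        averages = [Fraction(total, i + 1) for i, total in enumerate(running)]
        # Largest moving average; ties go to the later frame (assumption).
        best = max(range(len(averages)), key=lambda i: (averages[i], i))
        steps.append((start + best, averages[best]))
        start += best + 1
    return steps


if __name__ == "__main__":
    # Hypothetical per-frame bit counts, not taken from the article.
    for last_frame, height in downstairs_steps([10, 2, 6, 1, 1]):
        print(last_frame, float(height))
```

Recomputing the averages for every step makes this quadratic in the number of frames in the worst case, which is acceptable for an offline, pre-coded sequence.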

2.2 Function of reservation (FOR) and function of transmission (FOT)

For a resource reservation-based transmission scheme, two functional curves, the FOR and the FOT, are commonly used. FOR specifies a time varying function to reserve the network resources (e.g., bandwidth) during video transmission, whereas FOT specifies the time varying nature of the transmitted video bits (packets).

2.3 Bandwidth utilization factor

Consider streaming $N$ frames of video with FOT $f(t)$ of duration $\tau$, with bandwidth allocated according to the FOR $g(t)$. The utilization of the allocated bandwidth is defined as the ratio of the area under the FOT to the area under the FOR:

$$U = \frac{\int_0^{\tau} f(t)\,dt}{\int_0^{\tau} g(t)\,dt} \tag{2}$$

In general, the FOT and FOR may not be exactly equal, but it is desirable that they have equal lifetimes. In this article, it is assumed that the reserved bandwidth is fully utilized, i.e., the FOR is equal to the FOT.

2.4 Latency of start (LOS)

The LOS, or start-up delay, is the time between the instant at which the first bit of the bitstream leaves the server and the instant at which the first frame is displayed at the receiver/decoder side. The user sees it as a delay in the start of the video, and it should be as small as possible.

2.5 Receiver buffer requirement

In VBR video with varying bits per frame, a buffer is needed at the receiver to store a sufficient amount of bits to decode and play the video at its scheduled time and so maintain display continuity. However, large buffers introduce LOS and small ones may lead to buffer overflow. For a LOS of $t_0$, the minimum required buffer size ($B$) can be estimated as

$$B \ge b_0 + \int_{t_0}^{\tau} f(t)\,dt - \sum_{k=0}^{n} r_k \tag{3}$$

where $\tau$ is the video duration, $b_0$ is the number of pre-fetched bits at the decoder before the first frame is displayed, $r_k$ is the number of bits for the $k$th frame, and $n$ is the frame number ($n$th frame) to be displayed at time $t$.

2.6 Switching frames concept

In H.264/AVC, a new type of frame, SP-frame, has been defined for the switching purposes [14, 15, 17]. It specifies two types of switching frames, the primary and the secondary SP-frames. They use motion compensated predictive coding and thus have better compression efficiency than the I-frames [14]. The primary SP-frames are similar to the P-frames and the secondary SP-frames have special encoding which leads to the same reconstructed picture as the primary SP-frame [14, 15, 18].

To enable drift-free switching, the streaming server stores several copies of the same sequence, encoded at different quantization parameters and hence different qualities, along with the SSP-frames [15]. These bitstreams are populated with SP-frames at the intended switching locations. As long as switching is not desired, the primary SP-frames are transmitted instead of P-frames at the preselected positions [15]. If switching becomes necessary, the secondary SP-frame is transmitted, replacing the SP-frame [14, 15]. For consistency throughout the article, the bitstream being transmitted before the switching is called bitstream1, and the bitstream transmitted after the switching is called bitstream2 or the target bitstream. A typical switching scenario between two H.264/AVC coded bitstreams is shown in Figure 1. Both streams are populated with SP-frames at the points where switching is desired. The arrows indicate the direction of transmission, starting from bitstream1 and including the first SP-frame (no switching occurs at the first SP-frame, so it is transmitted as such). Switching is done at the second SP-frame by sending an SSP-frame instead of the second SP-frame, followed by the frames of the target bitstream.

Figure 1. Switching between H.264/AVC bitstreams using switching frames.
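The transmission order in this switching scenario can be illustrated with a small sketch. It is a simplified model under stated assumptions, not server code: frames are represented by plain labels, the function name `transmitted_sequence` and the dictionary `ssp_frames` (mapping a switching position to its stored SSP-frame) are hypothetical, and only a single switch is modelled.

```python
def transmitted_sequence(stream1, stream2, ssp_frames, switch_at):
    """Return the frames sent when switching from stream1 to stream2.

    Frames before `switch_at` come from bitstream1 (primary SP-frames at
    earlier switching positions are sent as such), the SSP-frame replaces
    the SP-frame at `switch_at`, and the rest comes from the target stream.
    """
    if switch_at not in ssp_frames:
        raise ValueError("no SSP-frame stored for this position")
    return stream1[:switch_at] + [ssp_frames[switch_at]] + stream2[switch_at + 1:]


if __name__ == "__main__":
    # Hypothetical labels: positions 2 and 4 hold SP-frames in both streams.
    bitstream1 = ["A0", "A1", "A2(SP)", "A3", "A4(SP)", "A5"]
    bitstream2 = ["B0", "B1", "B2(SP)", "B3", "B4(SP)", "B5"]
    ssp = {2: "SSP2", 4: "SSP4"}
    print(transmitted_sequence(bitstream1, bitstream2, ssp, switch_at=4))
```

In this example the first SP-frame (`A2(SP)`) is transmitted unchanged, and the switch happens at the second one.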

3. Analysis of the "downstairs" reservation scheme

As discussed in Section 2.1, the "downstairs" reservation function is based on the moving average and guarantees a monotonically decreasing step function. This feature of "downstairs" ensures that once sufficient bandwidth is available to start the streaming, the bandwidth requirement for the rest of the stream is guaranteed. However, in "downstairs" the transition points do not occur periodically, whereas most existing work assumes that the switching frames are inserted at regular intervals [14, 15, 17–19]. Therefore, to utilize the allocated resources properly, SP-frames need to be inserted in such a way that switching always occurs at the transition points, which are the best switching points because they exploit the characteristics of the "downstairs" reservation scheme. However, due to the different coding nature of the switching frames, the transition points of the "downstairs" may change. These changes must be handled in such a way that the shape of the "downstairs" is preserved, i.e., it remains a monotonically decreasing curve [20, 21]. To implement the idea of switching at the transition points in a standard video codec, the source code of the H.264/AVC encoder is modified such that switching frames can be inserted at any desired location.

In this section, the behavior of the "downstairs" scheme is analyzed for the H.264/AVC coded bitstreams with and without SP-frames.

3.1 "Downstairs" reservation function without SP-frames

"Downstairs" reservation functions (i.e., FOR) for the "Mobile" sequence (CIF resolution and 4:2:0 YUV format) coded with H264/AVC (with I,P,P,P,... coding patterns) at two different bit rates and hence different qualities are shown in Figure 2. It can be observed that these "downstairs" functions have coinciding transition points, although the step heights are different. These transition points can be considered as possible switching points, due to the reasons discussed below.

Figure 2. "Downstairs" function for "Mobile" sequence.

To investigate the switching points for the best QoS, we calculate the bandwidth utilization factor and the number of bits available in the decoder buffer just before the switching. It should be noted here that the bits available in the buffer other than those needed to reconstruct the frame just before the switching are useless and are simply wasted. Therefore, at the switching instant the buffer content should be as minimal as possible and ideally it should be empty. Figures 3 and 4 show the bandwidth utilization factor defined in Equation 2 and the minimum required size of the receiver buffer at each frame, defined in Equation 3, respectively, for the "Mobile" sequence at QP = 30. From these figures, the following interesting and unique properties of the "downstairs" scheme can be observed.

Figure 3. Bandwidth utilization curve at each frame for "Mobile" sequence.

Figure 4. Amount of bits in the receiver buffer at each frame position for "Mobile" sequence.

  1. It can be observed from Figure 2 that the "downstairs" functions at two different bit rates have almost coinciding transition points.

  2. From Figure 3, it can be observed that at the end of every step of "downstairs" (e.g., frames 159, 229, 245, 264, 276, 299), the bandwidth utilization factor is maximum (i.e., 100%).

  3. Figure 4 shows that at the end of every step, the receiver buffer is empty and all the received bits have been used by the decoder for reconstruction.

  4. If bitstreams are switched at these points, then no bits in the buffer are wasted, and the buffered bits before and after the switching will be independent.

  5. At a fixed QP, the largest peak in Figure 4 corresponds to the minimum buffer size required to receive a video of the specified quality and play it back at its scheduled time.

These observations imply that the transition points in the "downstairs" curve are the suitable switching points from the QoS point of view. Hence, SP-frames should be inserted at these points to facilitate switching between the streams. However, due to different rate-distortion characteristics of P- and SP-frames, after inserting SP-frames the FOR curve and hence "downstairs" is likely to change, changing the position of transition points as well.
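The buffer and utilization observations listed above can be checked numerically on a toy example. The sketch below rests on assumptions that are not spelled out in the article: in each frame interval the sender delivers exactly the reserved step height (FOT equal to FOR), the decoder then removes the bits of one frame, and utilization is taken as the cumulative ratio of consumed bits to reserved bits. The function name and the sample numbers are hypothetical.

```python
from fractions import Fraction


def buffer_and_utilization(bits, steps):
    """Track receiver buffer content and cumulative utilization per frame.

    bits  : per-frame bit counts
    steps : [(last_frame_index, step_height), ...] of a "downstairs" FOR
    """
    buffer_content, utilization = [], []
    reserved = consumed = Fraction(0)
    first = 0
    for last, height in steps:
        for n in range(first, last + 1):
            reserved += height      # bits delivered in this frame interval
            consumed += bits[n]     # bits removed by the decoder
            buffer_content.append(reserved - consumed)
            utilization.append(consumed / reserved)
        first = last + 1
    return buffer_content, utilization


if __name__ == "__main__":
    # Hypothetical bit counts with their moving-average "downstairs" steps.
    frame_bits = [10, 2, 6, 1, 1]
    for_steps = [(0, 10), (2, 4), (4, 1)]
    buf, util = buffer_and_utilization(frame_bits, for_steps)
    for n, (b, u) in enumerate(zip(buf, util)):
        print(n, float(b), round(float(u), 3))
```

In this toy run the buffer content is zero and the utilization is 100% exactly at the last frame of each step (frames 0, 2, and 4), which mirrors properties 2 and 3.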

3.2 Effect of the switching frames on the "downstairs" reservation curve

The effect of the switching frames on the "downstairs" reservation function is twofold: (1) changes in the "downstairs" function due to replacing the P-frames with primary SP-frames at all the transition points, and (2) changes due to the SSP-frames, which occur only at the switching instant. Both are discussed in the following sections.

3.2.1. Effect due to the primary SP-frames

First we look at the effect of the SP-frames inserted at the transition points of the "downstairs" function of the bitstream with the I,P,P,P... coding pattern. These SP-frames will change the average bit rate of the sequence and hence the "downstairs" reservation curve (which is based on moving averages) with and without SP-frames will be different as is evident from Figures 5 and 6 for "Mobile" sequence with quantization parameters of 25 and 30, respectively. Here, QP values correspond to the quantization parameters of the P-frames. The SP-frames are coded at the quantization parameter pairs, QPSP = QP-3 and QPSP2 = QP (taken from [19]). Both FORs, with and without the SP-frames are derived from Equation 1. It can be observed from these two figures that FORs with and without SP-frames differ in terms of number of transition points, step heights, and locations of transition points. Further, it can be observed that new "downstairs" at different QP do not have all transition points coinciding.

Figure 5. "Downstairs" functions with and without SP-frames for "Mobile" sequence with QP 25.

Figure 6. "Downstairs" functions with and without SP-frames for "Mobile" sequence with QP 30.

These differences are due to the fact that the coding gains of P- and SP-frames at the same quantizer step size are different [14, 18]; the use of two different quantizers for the SP-frame results in a loss of coding efficiency compared with the single quantizer used in the P-frames. This is because every combined quantization and dequantization operation reduces visual quality [18]. One may argue that if QPSP and QPSP2 are adjusted such that the SP-frames are coded at exactly the same bit rate as the corresponding P-frames, then the change in the "downstairs" characteristics may be avoided. However, we have observed that it is very difficult to match the two bit rates exactly for all the corresponding frames, and that even if they are somehow matched, the two "downstairs" functions will still differ. One reason for this difference is that the rate of a P-frame referenced from a P-frame differs from the rate of a P-frame referenced from an SP-frame, because the visual quality of the two reference frames is different. The average bit rate is therefore affected, resulting in changes in the "downstairs". Hence, making the rate of the SP-frame equal to the rate of the P-frame does not guarantee that the steps of the "downstairs" will not change.

Further, it is observed that for the same sequence, the relative changes between the P and SP frames differ for different quantization parameter pairs (QPSP and QPSP2) [14, 18], and hence so do the relative changes between the "downstairs" with and without SP-frames. To study the effect of QPSP and QPSP2 on the changes in the "downstairs" characteristics, we have considered different sets of (QPSP, QPSP2) pairs and video test sequences. The values of the quantization parameter pairs were chosen from the literature, such as (QP, QP-6) [14], (QP-3, QP), (QP-2, QP-5) [19], and (QP-2, QP-3) [18], as well as some arbitrary values. It was observed that the relative difference between the bit rates of the P and SP frames is different in all the cases, and hence so are the "downstairs" characteristics. Thus, a new FOR needs to be calculated to reduce these changes and to ensure that switching is possible at most of the transition points. This will be discussed in Section 4.

3.2.2. Effect of the secondary SP-frames

The difference between the bit rates of the SP and SSP frames is a tradeoff between QPSP and QPSP2 [14]. Figures 7 and 8 show switching between two bitstreams (of the same video sequence but of different quality), from low bit rate to high bit rate (up switching) and vice versa (down switching), respectively. The high spikes indicate that the instantaneous bandwidth requirement for transmission of the SSP-frames (which are used only once, at the switching instant) will be much higher than the bandwidth derived from the FOR with SP-frames. This high bit rate of the SSP-frame can be reduced by changing the (QPSP, QPSP2) pair, but this may also affect the bit rate of the SP-frames. As the number of SP-frames is usually larger than the number of SSP-frames, one may set the bit rate of the SP-frames as low as possible to keep the overall average bit rate of the stream low.

Figure 7. "Downstairs" function for "Mobile" sequence with upswitching.

Figure 8. "Downstairs" function for "Mobile" sequence with downswitching.

The excess bandwidth required for the SSP-frame at the switching instant needs to be managed properly so that its effect on the reserved bandwidth is minimized. This issue will be discussed in detail in Section 4.

4. Switching with "downstairs" reservation scheme

As discussed earlier, one way to adapt to the changing network conditions is to switch among the multiple bitstreams of the same video at different qualities. To enable drift-free switching, the streaming server in addition to storing several copies of the video encoded with different quantization parameters also stores the SSP-frames. As discussed, the SP-frames are inserted at the transition points of each stream and the SSP-frames are only transmitted when necessary.

In a "downstairs" reservation scheme based on the moving average, there will always be accumulation of bits in the receiver buffer. After switching, the leftover bits of bitstream1 in the buffer are of no use to the decoder. The buffer needs to drain these bits before the bits of the target bitstream start to flow into the decoder buffer. Ideally, the best switching instant is one at which the buffer content is zero and the bandwidth utilization is maximum. As discussed in Section 3, both conditions are met at the transition points. Hence, if bitstream1 is terminated after the completion of a step and the target bitstream is then started, better utilization of the allocated resources is guaranteed.

Based on the discussions in Section 3, the following two problems caused by the insertion of switching frames were identified: (1) changes in the FOR due to the insertion of SP-frames and (2) additional bandwidth required at the switching instant. To the best of the authors' knowledge, these issues have not been considered previously in the literature. Solutions to these problems, which are the main contributions of this article, are suggested below.

4.1 New FOR after insertion of SP-frames

As is evident from Figures 5 and 6, the "downstairs" curves with inserted SP-frames corresponding to different QPs, calculated using Equation 1, do not have coinciding transition points. In addition, the transition points of the "downstairs" with SP-frames do not coincide with the positions of the inserted SP-frames, because the transition points drift from their original positions. The implication of this observation is that not all the inserted SP-frames are suitable for switching, since the conditions of zero buffer content and high bandwidth utilization at the time of switching cannot be satisfied. To make most of the SP-frames suitable for switching, the "downstairs" functions corresponding to different QPs after inserting SP-frames must have coinciding transition points, and those transition points must carry SP-frames. In other words, the "downstairs" with inserted SP-frames should maintain the monotonicity of the original "downstairs" (without SP-frames), and most of the transition points should remain unaltered. Here, we propose a novel algorithm for obtaining the new "downstairs" while achieving these objectives.

The primary SP-frames replace the P-frames of the bitstream at the transition points of the original FOR (the FOR of the original bitstream without SP-frames). The bits per frame of the SP-frames differ from those of the replaced P-frames, and the bits per frame of the subsequent P-frames referenced from a newly inserted SP-frame also differ from those of the corresponding P-frames in the sequence without SP-frames. Thus, the derived moving averages will differ, and the new step transitions may not coincide with the old ones. Therefore, a new FOR must be calculated for the sequence with SP-frames, without violating the monotonicity of the "downstairs" reservation function and without altering the transition points. The method defined in Equation 1 can no longer be used; instead of calculating the moving average of the whole sequence, a new average is calculated for each step of the original FOR for the new sequence with SP-frames, as follows.

Consider a function $f(t)$ calculated using the procedure in Equation 1. Let $S_1, S_2, S_3, \ldots$ be the step heights of this "downstairs" (in decreasing order), let $N_1, N_2, N_3, \ldots$ be the frame positions at which the step changes occur, and let $r_i$ be the number of bits of the $i$th frame. The step heights $S'_k$ of the new "downstairs" are then

$$S'_k = \frac{\sum_{i=N_{k-1}+1}^{N_k} r_i}{N_k - N_{k-1}} \tag{4}$$

If $S'_{k+1} > S'_k$, i.e., the average of a step (its step height) is greater than the height of the previous step, the averaging range of Equation 4 is extended to the transition point of the next step, and so on, until the new average becomes less than the height of the previous step.

These new averages replace the original "downstairs" function, creating a new FOR. This procedure is repeated until the end of the sequence, ensuring monotonicity of the new reservation scheme. Figures 9 and 10 show the "downstairs" reservation curves for the "Mobile" sequence coded with and without SP-frames for QP = 25 and QP = 30, respectively, with the same values of QPSP and QPSP2 as in Figures 5 and 6. It can be observed from Figures 9 and 10 that the "downstairs" curves before and after SP-frame insertion are not identical in height. The insertion thus modifies the FOR by altering the minimal bandwidth requirement at every step, while keeping the step positions unchanged. In this process, some of the transition points of the original FOR may merge to keep the function monotonic, thereby reducing the number of transition points. However, every transition point of the new "downstairs" function coincides with one of the transition points of the original "downstairs" reservation scheme; therefore, all the switching points of the new "downstairs" are suitable for switching. In Figures 9 and 10, the last two steps are merged to keep the function monotonic, whereas all other steps are at the same frame positions as in the original "downstairs".

Figure 9. New "downstairs" reservation functions with SP-frames for "Mobile" sequence with QP 25.

Figure 10. New "downstairs" reservation functions with SP-frames for "Mobile" sequence with QP 30.

Based on the above discussion, the new resource allocation algorithm is as follows:

  a. First, the FORs of the original bitstreams (i.e., the streams without SP-frames) are calculated. The step transition points are then identified.

  b. Primary SP-frames are inserted at the transition points in all the copies of the video sequence.

  c. The FORs of the resulting new bitstreams with inserted SP-frames are calculated as discussed above.

  d. These FORs are then used for the resource reservation and switching.
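Step c, the per-step averaging with merging of steps, can be sketched as follows. This is one possible reading of the procedure rather than the authors' implementation. In particular, the article describes extending a violating step forward to the next transition point; the fallback used here when no later transition point remains (merging backward into the previous step) is an assumption, as are the function name `new_for` and the sample numbers.

```python
from fractions import Fraction


def new_for(bits, transitions):
    """Per-step averages over the original transition points.

    bits        : per-frame bit counts of the stream WITH SP-frames
    transitions : last-frame indices of the original steps (ascending,
                  the final entry being len(bits) - 1)
    Returns [(last_frame_index, step_height), ...], non-increasing in height.
    """
    prefix = [0]
    for b in bits:
        prefix.append(prefix[-1] + b)

    def average(first, last):
        return Fraction(prefix[last + 1] - prefix[first], last - first + 1)

    steps = []  # (first_frame, last_frame, height)
    first, k = 0, 0
    while k < len(transitions):
        last = transitions[k]
        height = average(first, last)
        # Extend forward while the step is higher than its predecessor.
        while steps and height > steps[-1][2] and k + 1 < len(transitions):
            k += 1
            last = transitions[k]
            height = average(first, last)
        # Assumed fallback: no transition point left, so merge backward.
        while steps and height > steps[-1][2]:
            first = steps.pop()[0]
            height = average(first, last)
        steps.append((first, last, height))
        first, k = last + 1, k + 1
    return [(last, height) for _, last, height in steps]


if __name__ == "__main__":
    # Hypothetical stream whose third step would exceed the second one.
    for last_frame, height in new_for([10, 4, 4, 6, 6, 1, 1], [0, 2, 4, 6]):
        print(last_frame, float(height))
```

In the example, the third and fourth original steps are merged, so every transition point of the result is still one of the original transition points.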

Here, it is important to note that in the original "downstairs" the highest average among all the moving averages of the sequence is selected, while in the new FOR with SP-frames an individual average is calculated for every step. This average is the minimum bandwidth required to transmit all the frames of that step. Because the SP-frame has a higher bit rate than all the other frames of the step and is the first frame of every step, there is a possibility of buffer underflow at the beginning of the step. This underflow can be avoided if some bits are pre-fetched at the beginning of the transmission, at the cost of a slight increase in LOS. Here, the I-frame is taken as the pre-fetched bits, which are more than the bits of the SP-frames. Figures 11 and 12 show the bandwidth utilization and the buffer contents, respectively, for the "Mobile" sequence coded with QP = 30. From Figure 11, it is clear that the bandwidth utilization is maximized at the end of each step, similar to the bandwidth utilization of the original "downstairs" shown in Figure 3. Figure 12 shows the amount of bits in the receiver buffer. It is clear that all the bits of the frames of a step have been consumed from the receiver buffer by the end of the step. Therefore, the new "downstairs" retains the basic features of the original "downstairs" while increasing the suitability of the SP-frames for switching.
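The underflow risk at the start of a step can be quantified with a short sketch that computes the smallest pre-fetch needed for a given per-step FOR. The timing model is an assumption made here (one step height of bits arrives per frame interval, after which the frame's bits are removed), and the function name `min_prefetch` and the sample values are hypothetical.

```python
from fractions import Fraction


def min_prefetch(bits, steps):
    """Smallest number of pre-fetched bits that keeps the buffer non-negative.

    bits  : per-frame bit counts of the stream with SP-frames
    steps : [(last_frame_index, step_height), ...] of the per-step FOR
    """
    worst_deficit = 0
    arrived = consumed = 0
    first = 0
    for last, height in steps:
        for n in range(first, last + 1):
            arrived += height
            consumed += bits[n]
            # Largest shortfall of delivered bits seen so far.
            worst_deficit = max(worst_deficit, consumed - arrived)
        first = last + 1
    return worst_deficit


if __name__ == "__main__":
    # Hypothetical stream: frames 1 and 3 are large SP-frames opening a step.
    frame_bits = [10, 6, 3, 5, 2]
    for_steps = [(0, 10), (2, Fraction(9, 2)), (4, Fraction(7, 2))]
    print(float(min_prefetch(frame_bits, for_steps)))
```

Any pre-fetch at least as large as the returned value avoids underflow in this model, which is consistent with pre-fetching an I-frame that is larger than the SP-frames.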

Figure 11. Bandwidth utilization in new "downstairs" reservation function for "Mobile" sequence with QP 30.

Figure 12. Buffer contents in new "downstairs" reservation function for "Mobile" sequence with QP 30.

4.2 Additional bandwidth and transition points management at the switching instant

The "downstairs" steps are derived, before the transmission starts, from the statistics of the SP-frames at the transition points. Therefore, at the time of switching, the bandwidth requirement for the SSP-frame will be much larger than the allocated bandwidth, as is evident from Figures 7 and 8. This additional bandwidth requirement needs to be managed properly.

First consider the case for the coinciding transition points (both bitstream1 and target bitstream have coinciding transition points at the switching instant). In this article, we propose to solve the excess bandwidth requirement by recalculating the step of the target bitstream at which switching occurred (step with SSP-frame), by replacing SP-frame with the SSP-frame using the procedure given below maintaining the monotonicity of the "downstairs" reservation function.

Consider the "downstairs" function $f'(t)$ calculated using the procedure in Section 4.1, with step heights $S'_k$ and transition points $N_k$. Let $r_{ssp,k}$ be the number of bits of the SSP-frame used for switching at the $k$th transition point, and let $r_{s,k}$ be the number of bits of the primary SP-frame that it replaces. The $k$th step height $S''_k$ of the new "downstairs" function $f''(t)$ is obtained as follows:

$$S''_k = \frac{(N_k - N_{k-1})\,S'_k + (r_{ssp,k} - r_{s,k})}{N_k - N_{k-1}} \tag{5}$$

If $S''_{k+1} > S''_k$, i.e., the average of a step (its step height) is greater than the height of the previous step, the averaging range of Equation 5 is extended to the transition point of the next step, and so on, until the new average becomes less than the average of the previous step.

If the switching point in the target bitstream is not at a transition point, all the bits accumulated up to that point need to be transmitted in a very short time to satisfy the continuous bandwidth reservation requirements, which increases the bandwidth of the step containing the SSP-frame. The first step of the target bitstream is calculated with the same procedure as in Equation 5; the only difference is that the additional bandwidth now consists of the bandwidth required for the SSP-frame plus the bandwidth required for the transmission of the accumulated bits $b_{acc}$, as given below.

$$S''_k = \frac{(N_k - N_{k-1})\,S'_k + (r_{ssp,k} - r_{s,k}) + b_{acc}}{N_k - N_{k-1}} \tag{6}$$

Note that this recalculation does not disturb the whole "downstairs" up to the end. Only one step in the FOR of the target bitstream changes, and it can be updated in the specifications for resource reservation. Once the step containing the SSP-frame is completed, the remaining steps are the same as those calculated for the target bitstream before the transmission started.
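As a small numerical illustration of the two cases in this section, the sketch below evaluates the recalculated step height with and without accumulated bits. The function name and all numbers are hypothetical, and step heights are assumed to be expressed in bits per frame.

```python
def step_height_after_switch(step_height, n_prev, n_k, r_ssp, r_s, b_acc=0):
    """Step height after replacing the SP-frame with the SSP-frame.

    With b_acc == 0 this follows the coinciding-transition-point case;
    a positive b_acc adds the accumulated bits that must also be sent.
    """
    frames = n_k - n_prev  # number of frames covered by the step
    return (frames * step_height + (r_ssp - r_s) + b_acc) / frames


# Hypothetical step: 100 frames at 4000 bits per frame,
# SSP-frame of 60,000 bits replacing an SP-frame of 20,000 bits.
print(step_height_after_switch(4000, 100, 200, 60_000, 20_000))          # no accumulated bits
print(step_height_after_switch(4000, 100, 200, 60_000, 20_000, 30_000))  # 30,000 accumulated bits
```

The accumulated bits raise the step height further, which is why switching away from a transition point costs more bandwidth in this model.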

5. Simulation results

To perform drift-free switching among multiple bitstreams, the source code of H.264/AVC (JM10.2) was modified so that switching frames could be inserted at any desired location. The original codec supports only the periodic insertion of SP-frames. Each video sequence was encoded at different bit rates and the corresponding FORs were derived. The modified H.264/AVC encoder was then instructed to insert SP-frames at the transition points of the derived FORs, and upswitching and downswitching between the bitstreams were performed. We first consider single switching and compare the results with the conventional method of periodic insertion of SP-frames for bitstream switching. Finally, the results of multiple switching are reported. These results were derived for a number of test sequences and quantization parameters and were found to be consistent, hence only a few results are given.

5.1 Single bitstream switching

First, switching was performed only once between the streams, at different frame positions corresponding to the transition points. The corresponding results for the "Coastguard" and "Mobile" sequences are shown in Table 1. In the "Coastguard" sequence, downswitching is done from high bit rate (QP = 20) to low bit rate (QP = 30) at frame numbers 142, 179, and 222, and for the "Mobile" sequence upswitching is performed from low bit rate (QP = 30) to high bit rate (QP = 25) at frame numbers 160, 230, and 246. At all these frame positions, the FORs of both bitstreams have coinciding transition points. It is clear from Table 1 that at these switching points the bandwidth utilization is maximum and the bit wastage in the receiver buffer is zero in all cases.

Table 1 Single switching between the streams

The average PSNR and average bit rate of the whole sequence after switching (a sequence containing portions from both streams, i.e., target and bitstream1) depend on the number of frames taken from each bitstream. For example, in Table 1 for the "Coastguard" sequence with switching at frame number 222, the portion taken from the target bitstream is smaller than that from bitstream1, so the average PSNR and bit rate of the switched bitstream are closer to those of bitstream1. A similar explanation holds for the PSNR and average bit rate at the other switching positions and for the other sequences.
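The dependence on the number of frames can be illustrated with a frame-weighted average. This is only a sketch under stated assumptions: the sequence length, the per-stream averages, and the function name are hypothetical, and each stream's average over its transmitted portion is assumed to be known.

```python
def switched_average(avg_before, avg_after, switch_frame, total_frames):
    """Frame-weighted average of a per-frame quantity (PSNR or bits per frame).

    avg_before -- average over the frames taken from bitstream1
    avg_after  -- average over the frames taken from the target bitstream
    """
    frames_after = total_frames - switch_frame
    return (avg_before * switch_frame + avg_after * frames_after) / total_frames


# Hypothetical values: 300-frame sequence, switch at frame 222,
# 40 dB before the switch and 33 dB after it.
print(switched_average(40.0, 33.0, 222, 300))
```

Because 222 of the 300 frames come from bitstream1 in this example, the result lies much closer to 40 dB than to 33 dB.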

5.1.1. Additional bandwidth and transition points management

Experiments were performed with the "Mobile" sequence to verify the effectiveness of our method in managing the additional bandwidth required by SSP-frames at the switching instants. Figures 13 and 14 show the upswitching and downswitching of the "Mobile" sequence after merging the additional bandwidth into the step of the FOR of the target bitstream that follows the switch. It can be seen that the new "downstairs" step is lower than the previous step of the target bitstream, which preserves the monotonically decreasing nature of the "downstairs" reservation function. These figures also show that only the step containing the SSP-frame is changed, while all other steps are unchanged.

Figure 13 Upswitching for the "Mobile" sequence, after bandwidth management for the SSP-frame.

Figure 14 Downswitching for the "Mobile" sequence, after bandwidth management for the SSP-frame.

Thus, it can be concluded that switching at the transition points leads to no bit wastage in the receiver buffer and to maximum utilization of the bandwidth. In addition, merging the additional bandwidth into the next step of the target bitstream does not disturb the whole "downstairs" up to the end. Only one step (the one containing the SSP-frame) needs to be recalculated and updated in the specifications for resource reservation.

5.1.2. Comparison with other methods

As discussed above, most previous studies assume that switching frames can be inserted at regular intervals without any concern for resource utilization, and no effort has been made to investigate the best switching instant. The same idea of periodic switching is implemented in H.264/AVC. In this section, switching between bitstreams with periodic SP-frames is compared with our proposed scheme. Since SP-frames have poorer coding efficiency than P-frames [14, 18, 19], the coding efficiency of the sequence decreases as the number of SP-frames increases, so it is necessary to keep the number of SP-frames as low as possible. For a fair comparison between the two schemes, the periodicity of the SP-frames is chosen such that the number of SP-frames inserted in the test (periodic) sequence equals the number of SP-frames used in the proposed method. The quantization parameters for this comparison were kept the same as those used for the previous results, i.e., QPSP = QP - 3 and QPSP2 = QP. Since the proposed method has SP-frames only at the step transitions, only those SP-frames of the periodic method that are temporally nearest to the SP-frames of the proposed algorithm are compared (e.g., frame 144 of the periodic scheme, which is closest to frame 142 of the proposed scheme), as shown in Table 2. The two methods are compared in terms of bit wastage and bandwidth utilization. For the periodic SP-frame sequence, the minimum bandwidth necessary to avoid underflow in the receiver buffer is used, which also keeps the number of accumulated bits at the switching instant limited. It is clear from Table 2 that the bit wastage of the periodic scheme is much higher than that of our proposed method, and that the bandwidth utilization of our scheme at the switching instant is higher than that of the periodic scheme.
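The pairing of each proposed switching frame with its temporally nearest periodic SP-frame can be sketched as below. The SP-frame period of 12 and the sequence length of 300 frames are hypothetical values chosen only for illustration, as is the function name. The source does not state the period that was used.

```python
def nearest_periodic_sp(transition_frame, period, total_frames):
    """Return the periodic SP-frame position closest to a transition frame.

    Periodic SP-frames are assumed to sit at period, 2 * period, ...
    strictly below total_frames.
    """
    candidates = range(period, total_frames, period)
    return min(candidates, key=lambda frame: abs(frame - transition_frame))


# Hypothetical setup: 300-frame sequence with an SP-frame every 12 frames.
for transition in (142, 179):
    print(transition, nearest_periodic_sp(transition, 12, 300))
```

With these assumed values, frame 142 is paired with periodic frame 144, which matches the example given in the text.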

Table 2 Comparison of bit wastage in periodic SP-frames and our proposed algorithm

Therefore, periodic insertion of SP-frames without any concern for bit wastage, bandwidth utilization, and overall average bit rate is not ideal for switching. Switching at specific frames, on the other hand, limits the number of surplus SP-frames and hence reduces the overall average bit rate, while also minimizing the wastage of bits and maximizing the bandwidth utilization.

5.2 Multiple bitstream switching

For multiple switching, the bandwidth, buffer, and transition point management are similar to those of single switching. Figures 15 and 16 show multiple switching between the two bitstreams for the "Coastguard" and "Mobile" sequences, respectively. In Figure 15, an upswitching occurs at frame 97 and a downswitching at frame 222. In Figure 16, the first switching is a downswitching at frame 160 and the second is an upswitching at frame 246. In both cases, the additional bandwidth required at the switching instant is calculated as discussed in Section 4.2. Only the step with the SSP-frame is changed, while all other steps are the same as calculated before the transmission. There is no need to merge more steps to accommodate the additional bandwidth, as the new steps are lower than the previous step of the target bitstream in both sequences. Table 3 shows the results compared with periodic switching at the frame positions nearest to those of the proposed algorithm, showing the same behavior as in the case of single switching.

Figure 15 Multiple switching for the "Coastguard" sequence, after bandwidth management for SSP-frames.

Figure 16 Multiple switching for the "Mobile" sequence, after bandwidth management for SSP-frames.

Table 3 Multiple switching between the streams

6. Conclusions

In this article, the monotonically decreasing "downstairs" reservation scheme and the bitstream switching approach for streaming of pre-coded VBR video are analyzed. Since the allocation scheme alone is not sufficient to provide good QoS, it is proposed that the "downstairs" resource allocation strategy, in conjunction with bitstream switching, offers the best QoS within the minimum available resources. The proposed scheme can be used as an alternative by the service provider when network demand is high. Our main aim was to investigate the best switching points in pre-coded VBR video streams. It was shown through simulation that the transitions of the "downstairs" reservation function are the best switching points in terms of bit wastage in the receiver buffer and bandwidth utilization. Within the H.264/AVC framework, the effect of inserting SP-frames on the "downstairs" reservation scheme was analyzed, and solutions were proposed to minimize the side effects while retaining the best features of the "downstairs". The results are derived for both single and multiple switching. The results of our algorithm are compared with an existing algorithm, and a clear improvement is shown.


a. The naming convention for primary and secondary SP-frames (SP-frame and SSP-frame) is adapted from [9].


  1. Pao IM, Sun MT: Encoding stored video for streaming applications. IEEE Trans. Circ. Syst. Video Technol 2001, 11(2):199-209. 10.1109/76.905985

  2. Cai J, He Z, Chen CW: Rate-reduction transcoding design for video streaming applications. Proceedings of IEEE Packet Video Workshops (PV'2002) 2002, 1: 29-32.

  3. Foster I, Roy A: A quality of service architecture that combines resource reservation and application adaptation. Proceedings of Eighth International Workshop on Quality of Service (IWQoS 2000) 2000, 181-188.

  4. Rejaie R, Handley M, Estrin D: Layered quality adaptation for Internet video streaming. IEEE J. Select. Areas Commun 2000, 18(12):2530-2543. 10.1109/49.898735

  5. Feamster N, Bansal D, Balakrishnan H: On the interaction between layered quality adaptation and congestion control for streaming video. In Proceedings 11th International Packet Video Workshop (PV2001). Kyongju, Korea; 2001.

  6. Ghanbari M: Standard Codecs: Image Compression to Advanced Video Coding. The Institution of Electrical Engineers (IEE), London; 2003. Telecommunication Series 49

  7. Apostolopoulos JG, Tan WT, Wee SJ: Video Streaming: Concepts, Algorithms, and Systems. HP Laboratories, Palo Alto; 2002.

  8. Lai H, Lee JY, Chen L: A monotonic-decreasing rate scheduler for variable-bit-rate video streaming. IEEE Trans. Circ. Syst. Video Technol 2005, 15(2):221-231.

  9. Guo L, Tan E, Chen S, Xiao Z, Spatscheck O, Zhang X: Delving into internet streaming media delivery: a quality and resource utilization perspective. Proceedings of ACM SIGCOMM Internet Measurement Conference (IMC'06) 2006.

  10. Braden R, Zhang L, Berson S, Herzog S, Jamin S: Resource Reservation Protocol (RSVP). RFC 2205 1997.

  11. Nichols K, Jacobson V, Zhang L: A two-bit differential services architecture for the Internet. Internet Draft 1999.

  12. Furini M, Towsley DF: Real-time traffic transmissions over the Internet. IEEE Trans. Multimedia 2001, 3(1):33-40. 10.1109/6046.909592

  13. Sun K, Ghanbari M: An algorithm for VBR video transmission scheme over the Internet. In Proceedings of International Symposium on Telecommunication (IST2003). Isfahan, Iran; 2003.

  14. Karczewicz M, Kurceren R: The SP- and SI-frames design for H.264/AVC. IEEE Trans. Circ. Syst. Video Technol 2003, 13(7):637-644. 10.1109/TCSVT.2003.814969

  15. Stockhammer T, Liebel G, Walter M: Optimized H.264/AVC-based bit stream switching for mobile video streaming. EURASIP J. Appl. Signal Process 2006, 2006: 1-19.

  16. Feng WC, Sechrest S: Critical bandwidth allocation for the delivery of compressed video. Comput. Commun 1995, 18(10):709-717. 10.1016/0140-3664(95)98484-M

  17. Setton E, Girod B: Rate-distortion analysis and streaming of SP and SI frames. IEEE Trans. Circ. Syst. Video Technol 2006, 16(6):733-743.

  18. Stockhammer T, Liebel G, Walter M: Advanced bitstream switching for wireless video streaming. Diploma Thesis, Institute for Communication Engineering, Munich University of Technology 2004.

  19. Setton E, Ramanathan P, Girod B: Rate-distortion analysis of SP and SI frames. Presented at the Visual Communication Image Processing (VCIP), San Jose, CA; 2006.

  20. Altaf M, Khan E, Ghanbari M: Affects of SP-pictures on bitstream switching through monotonically decreasing rate schedulers for streaming of H.264/AVC coded video. In Proceeding of International Symposium on Telecommunication (IST-2008). Tehran, Iran; 2008.

  21. Alam F, Khan E, Ghanbari M: Multiple bitstream switching for video streaming in monotonically decreasing rate scheduler. In Proceeding of IEEE International Conference on Industrial Technology (ICIT-06). Mumbai, India; 2006.

Author information

Corresponding author

Correspondence to Muhammad Altaf.

Additional information

7. Competing interests

The authors declare that they have no competing interests.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Altaf, M., Khan, E., Ghanbari, M. et al. Efficient bitstream switching for streaming of H.264/AVC coded video. J Image Video Proc. 2011, 7 (2011).
