Detailed Performance and Waiting-Time Predictability Analysis of Scheduling Options in On-Demand Video Streaming
© M. A. Alsmirat and N. J. Sarhan. 2010
Received: 2 May 2009
Accepted: 24 November 2009
Published: 3 January 2010
The number of on-demand video streams that can be supported concurrently is highly constrained by the stringent requirements of real-time playback and high transfer rates. To address this problem, stream merging techniques utilize the multicast facility to increase resource sharing. The achieved resource sharing depends greatly on how the waiting requests are scheduled for service. We investigate the effectiveness of the recently proposed cost-based scheduling in detail and analyze opportunities for further tunings and enhancements. In particular, we analyze alternative ways to compute the delivery cost. In addition, we propose a new scheduling policy, called Predictive Cost-Based Scheduling (PCS), which applies a prediction algorithm to predict future scheduling decisions and then uses the prediction results to potentially alter its current scheduling decisions. Moreover, we propose an enhancement technique, called Adaptive Regular Stream Triggering (ART), which significantly enhances stream merging behavior by selectively delaying the initiation of full-length video streams. We analyze the effectiveness of the proposed strategies in terms of their performance effectiveness and impacts on waiting-time predictability through extensive simulation. The results show that significant performance benefits as well as better waiting-time predictability can be attained.
The interest in video streaming has grown dramatically across the Internet and wireless networks and continues to evolve rapidly. Unfortunately, the number of on-demand video streams that can be supported concurrently is highly constrained by the stringent requirements of multimedia data, which require high transfer rates and must be presented continuously in time. Stream merging techniques [1–7] address this challenge by aggregating clients into successively larger groups that share the same multicast streams. These techniques include Patching [1, 3], Transition Patching [8, 9], and Earliest Reachable Merge Target (ERMT) [2, 10]. Periodic broadcasting techniques [11–15] also address this challenge but can be used only for popular videos and require the requests to wait until the next broadcast times of the first corresponding segments. This paper considers the stream merging approach.
The resource sharing achieved by stream merging depends greatly on how the waiting requests are scheduled for service. Despite the many proposed stream merging techniques and the numerous possible variations, there has been only limited work on scheduling in the context of these scalable techniques. The choice of a scheduling policy can be as important as, or even more important than, the choice of a stream merging technique, especially when the server is loaded. Minimum Cost First (MCF) is a cost-based scheduling policy that has recently been proposed for use with stream merging. MCF captures the significant variation in stream lengths caused by stream merging by selecting the requests requiring the least cost (measured in bytes of the delivered video data).
Motivated by the development of cost-based scheduling, we investigate its effectiveness in detail and discuss opportunities for further tunings and enhancements. In particular, we initially seek to answer the following two important questions. First, is it better to consider the stream cost only at the current scheduling time or consider the expected overall cost over a future period of time? Second, should the cost computation consider future stream extensions done by advanced stream merging techniques (such as ERMT) to satisfy the needs of new requests? These questions are important because the current scheduling decision can affect future scheduling decisions, especially when stream merging and cost-based scheduling are used.
Additionally, we analyze the effectiveness of incorporating video prediction results into the scheduling decisions. The prediction of videos to be serviced and the prediction of waiting times for service have recently been proposed. These prediction results, however, were not used to alter the scheduling decisions. We propose a scheduling policy, called Predictive Cost-Based Scheduling (PCS). Like MCF, PCS is cost-based, but it predicts the future system state and uses the prediction results to potentially alter the scheduling decisions. It delays servicing requests at the current scheduling time (even when resources are available) if it is expected that shorter streams will be required at the next scheduling time. We present two alternative implementations of PCS.
We also propose an enhancement technique, called Adaptive Regular Stream Triggering (ART), which can be applied with any scheduling policy to enhance stream merging. The basic idea of ART is to selectively delay the initiation of full-length video streams.
We study the effectiveness of various strategies and design options through extensive simulation in terms of performance effectiveness as well as waiting-time predictability. The ability to inform users about how long they need to wait for service has become of great importance, especially considering the growing interest in human-centered multimedia systems. Today, even for short videos with medium quality, users of online video websites may experience significant delays. Providing users with waiting-time feedback enhances their perceived quality-of-service and encourages them to wait, thereby increasing throughput. The analyzed metrics include customer defection (i.e., turn-away) probability, average waiting time, unfairness against unpopular videos, average cost per request, waiting-time prediction accuracy, and percentage of clients receiving expected waiting times. The waiting-time prediction accuracy is determined by the average deviation between the expected and actual waiting times. We consider the impacts of customer waiting tolerance, server capacity, request arrival rate, number of videos, video length, and skew in video access. We also study the impacts of different request arrival processes and video workloads. Furthermore, in contrast with prior studies, we analyze the impact of flash crowds, whereby the arrival rate experiences sudden spikes.
The results demonstrate that the proposed PCS and ART strategies significantly enhance system throughput and reduce the average waiting time for service, while providing accurate predicted waiting times.
The rest of the paper is organized as follows. Section 2 provides background information on main performance metrics, stream merging, and request scheduling techniques. Section 3 analyzes cost-based scheduling and explores alternative ways to compute the cost. Sections 4 and 5 present the proposed PCS and ART strategies, respectively. Section 6 discusses the performance evaluation methodology and Section 7 presents and analyzes the main results.
2. Background Information
In this section, we discuss the main performance metrics used to evaluate scheduling policies in streaming servers. We then discuss stream merging and request scheduling.
2.1. Main Performance Metrics of Video Streaming Servers
Unfairness is computed from the per-video defection probabilities, where $d_i$ is the defection probability for video $i$, $\bar{d}$ is the mean defection probability across all videos, and $n$ is the number of videos. In this paper, we also consider waiting-time predictability metrics, as will be discussed in Section 7.
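As an illustration, these metrics can be computed from simple per-video counters. The sketch below assumes that unfairness is the sample standard deviation of the per-video defection probabilities; this form is an assumption that fits the three quantities named in the text, not a formula confirmed by it. The function names and the sample counts are hypothetical.

```python
import math

def defection_probabilities(arrived, defected):
    """Per-video defection probability: defected requests / arrived requests."""
    return [d / a if a > 0 else 0.0 for a, d in zip(arrived, defected)]

def unfairness(probs):
    """Assumed form: sample standard deviation of per-video defection probabilities."""
    n = len(probs)
    if n < 2:
        return 0.0
    mean = sum(probs) / n
    return math.sqrt(sum((p - mean) ** 2 for p in probs) / (n - 1))

# Hypothetical counters for three videos (most popular first).
arrived = [100, 50, 10]
defected = [10, 10, 5]
probs = defection_probabilities(arrived, defected)
print(probs, unfairness(probs))
```

A value of zero means that all videos see the same defection probability; larger values indicate a stronger bias against some videos.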
2.2. Scalable Delivery of Video Streams with Stream Merging
Stream merging techniques aggregate users into larger groups that share the same multicast streams. In this subsection, we discuss three main stream merging techniques: Patching [1, 3, 18], Transition Patching [8, 9], and ERMT [2, 10].
Patching, Transition Patching, and ERMT differ in complexity and performance. Both the implementation complexity and performance increase from Patching to Transition Patching to ERMT. Selecting the most appropriate stream merging technique depends on a tradeoff between the required implementation complexity and the achieved performance.
2.3. Request Scheduling of Waiting Video Requests
A scheduling policy selects an appropriate video for service whenever it has an available channel. A channel is a set of resources (network bandwidth, disk I/O bandwidth, etc.) needed to deliver a multimedia stream. All waiting requests for the selected video can be serviced using only one channel. The number of channels is referred to as server capacity.
MCF-P selects the video with the least delivery cost per request, $L_i R_i / N_i$, where $L_i$ is the required stream length for the requests in queue $i$, $R_i$ is the (average) data rate for the requested video, and $N_i$ is the number of waiting requests for video $i$. To reduce the bias against videos with higher data rates, $R_i$ can be removed from the objective function (as done in this paper). MCF-P has two variants: Regular as Full (RAF) and Regular as Patch (RAP). RAP treats regular and transition streams as if they were patches, whereas RAF uses their normal costs. MCF-P performs significantly better than all other scheduling policies when stream merging techniques are used. In this paper, we simply refer to MCF-P (RAP) as MCF-P unless the situation calls for specificity.
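The selection rule can be sketched as follows. This is a minimal illustration that assumes the cost per request is the required stream length (optionally multiplied by the data rate) divided by the number of waiting requests; the `VideoQueue` type, its field names, and `mcf_p_select` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass

@dataclass
class VideoQueue:
    video_id: int
    waiting: int            # number of waiting requests for this video
    stream_length: float    # required stream length, in seconds
    data_rate: float = 1.0  # average data rate of the video

def mcf_p_select(queues, use_data_rate=False):
    """Return the non-empty queue with the least delivery cost per request."""
    candidates = [q for q in queues if q.waiting > 0]
    if not candidates:
        return None

    def cost_per_request(q):
        # The data rate is dropped by default to avoid bias against high-rate videos.
        cost = q.stream_length * (q.data_rate if use_data_rate else 1.0)
        return cost / q.waiting

    return min(candidates, key=cost_per_request)

queues = [
    VideoQueue(video_id=1, waiting=2, stream_length=7200.0),  # full stream
    VideoQueue(video_id=2, waiting=5, stream_length=600.0),   # short patch
    VideoQueue(video_id=3, waiting=0, stream_length=10.0),    # nobody waiting
]
print(mcf_p_select(queues).video_id)
```

Under RAP, the `stream_length` passed in for a regular or transition stream would be its cost as if it were a patch, whereas under RAF it would be the normal stream length.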
3. Analysis of Cost-Based Scheduling
We seek to understand the behavior of cost-based scheduling and its interaction with stream merging. Understanding this behavior helps in developing solutions that optimize the overall performance. One of the issues that we explore in this study is determining the duration over which the cost should be computed. In particular, we seek to determine whether the cost should be computed only at the current scheduling time or over a future duration of time, called the prediction window. In other words, should the system select the video with the least cost per request at the current scheduling time or the video with the least cost per request during the prediction window? The latter requires prediction of the future system state. We devise and explore two ways to analyze the effectiveness of computing the cost over a period of time: Lookahead and Combinational scheduling.
3.1. Lookahead Scheduling
3.2. Combinational Scheduling
4. Proposed Predictive Cost-Based Scheduling
The prediction of videos to be serviced and the prediction of waiting times for service have recently been proposed. These prediction results, however, were not used to alter the scheduling decisions. In this paper, we analyze the effectiveness of incorporating video prediction results into the scheduling decisions. We propose a scheduling policy, called Predictive Cost-Based Scheduling (PCS). PCS is based on MCF, but it predicts the future system state and uses this prediction to possibly alter the scheduling decisions. The basic idea can be explained as follows. When a channel becomes available, PCS uses the MCF-P objective function to determine the video that is tentatively to be serviced at the current scheduling time, together with its associated delivery cost. To avoid unfairness against videos with high data rates, we use the required stream length as the cost. Before actually servicing that video, PCS predicts the system state at the next scheduling time and estimates the delivery cost at that time, assuming that the tentatively selected video is not serviced at the current scheduling time. If the delivery cost at the next scheduling time is lower than that at the current scheduling time, PCS does not service any request at the current scheduling time and thus postpones the service of the tentatively selected video. Otherwise, that video is serviced immediately.
To reduce possible server underutilization, PCS delays the service of streams only if the number of available server channels (freeChannels) is smaller than a certain threshold (freeChannelThresh). Algorithm 1 shows a proposed algorithm to dynamically find the best value of freeChannelThresh. The algorithm changes the value of the threshold and observes its impact on customer defection probability over a certain time interval. The value of the threshold is then updated based on the trend in defection probability (increase or decrease) and the last action (increase or decrease) performed on the threshold. The algorithm is to be executed periodically but not frequently to ensure stable system behavior.
Algorithm 1: Simplified algorithm for dynamically computing freeChannelThresh.
else if (last action was increment)
else if (last action was decrement)
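The update rule described in the text can be sketched as a simple hill-climbing step. This is a minimal sketch reconstructed from the prose description, not the authors' Algorithm 1. The function name, the step size, the bounds, and the choice to keep the last direction whenever the defection probability does not get worse are all assumptions.

```python
def update_threshold(thresh, last_action, prev_defection, curr_defection,
                     step=1, min_thresh=0, max_thresh=None):
    """Perform one periodic update and return (new_thresh, action)."""
    if curr_defection <= prev_defection:
        # The last change helped (or was neutral): keep moving the same way.
        action = last_action
    elif last_action == "increment":
        # Defection got worse after an increment: back off.
        action = "decrement"
    else:
        # Defection got worse after a decrement: go back up.
        action = "increment"

    new_thresh = thresh + step if action == "increment" else thresh - step
    new_thresh = max(min_thresh, new_thresh)
    if max_thresh is not None:
        new_thresh = min(max_thresh, new_thresh)
    return new_thresh, action

# Hypothetical observation: defection rose from 0.10 to 0.20 after an increment.
print(update_threshold(5, "increment", 0.10, 0.20))
```

Calling this function once per observation interval, rather than at every scheduling time, matches the stated goal of keeping the system behavior stable.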
We present two alternative implementations of PCS: PCS-V and PCS-L. These two implementations differ in how they compute the delivery cost or required stream length at the next scheduling time. PCS-V predicts the video to be serviced at the next scheduling time and simply uses its required stream length. The video prediction is done by utilizing detailed information about the current state of the server in a manner similar to that of the previously proposed waiting-time prediction approach. This information includes the number of waiting requests for each video, the completion times of running streams, and statistics such as the average request arrival rate for each video (which is to be updated periodically). Algorithm 2 shows a simplified algorithm for PCS-V.
Algorithm 2: Simplified algorithm for PCS-V.
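The PCS-V decision can be sketched as follows. This is a hedged illustration built from the prose description, not the authors' Algorithm 2. The predictor is a deliberately crude stand-in that ranks videos by their expected number of requests at the next scheduling time; the dictionary keys, `predict_next_video`, and `pcs_v_decide` are hypothetical names.

```python
def predict_next_video(videos, dt):
    """Hypothetical predictor: pick the video with the least expected cost per request.

    Each video is a dict with keys 'waiting' (current queue length),
    'rate' (average arrivals per second) and 'length_next' (required stream
    length if the video is serviced at the next scheduling time, dt seconds away).
    """
    best, best_cost = None, float("inf")
    for v in videos:
        expected_waiting = v["waiting"] + v["rate"] * dt
        if expected_waiting <= 0:
            continue
        cost = v["length_next"] / expected_waiting
        if cost < best_cost:
            best, best_cost = v, cost
    return best

def pcs_v_decide(current_length, videos, dt, free_channels, free_channel_thresh):
    """Return 'serve' or 'postpone' for the tentatively selected video."""
    if free_channels >= free_channel_thresh:
        return "serve"  # enough free channels: never delay
    nxt = predict_next_video(videos, dt)
    if nxt is not None and nxt["length_next"] < current_length:
        return "postpone"  # a shorter stream is expected at the next scheduling time
    return "serve"

videos = [
    {"waiting": 1, "rate": 0.0, "length_next": 7200.0},
    {"waiting": 0, "rate": 0.5, "length_next": 300.0},
]
print(pcs_v_decide(7200.0, videos, dt=10, free_channels=1, free_channel_thresh=3))
```

A real predictor would also use the completion times of running streams to estimate when the next channel becomes free, as the text notes.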
In computing the probability that a video has a waiting request at the next scheduling time, $\lambda_v$ is the request arrival rate for video $v$, and a Poisson arrival process is assumed. If the video already has a waiting request, then this probability is 1. Sorting the videos according to the scheduling objective function is required to determine the probability that all videos with lower cost (or higher objective) are not selected.
Algorithm 3: Simplified algorithm for PCS-L.
Sort videos from best to worst according to objective function;
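The expected stream length used by PCS-L can be sketched as below. This is an illustrative reconstruction, not the authors' Algorithm 3. It assumes that a video is selected when it has at least one request and no better-ranked video has one, and that under Poisson arrivals the probability of at least one arrival within `dt` seconds is `1 - exp(-rate * dt)`. The ranking key `cost` is taken as given, and all names are hypothetical.

```python
import math

def pcs_l_expected_length(videos, dt):
    """Expected required stream length at the next scheduling time.

    Each video is a dict with keys 'cost' (ranking key, lower is better),
    'waiting' (current queue length), 'rate' (arrivals per second) and
    'length' (required stream length if serviced at the next scheduling time).
    """
    ranked = sorted(videos, key=lambda v: v["cost"])  # best video first
    expected = 0.0
    p_none_better = 1.0  # probability that no better-ranked video is selected
    for v in ranked:
        if v["waiting"] > 0:
            p_has_request = 1.0
        else:
            p_has_request = 1.0 - math.exp(-v["rate"] * dt)
        expected += p_none_better * p_has_request * v["length"]
        p_none_better *= 1.0 - p_has_request
    return expected

videos = [
    {"cost": 1, "waiting": 0, "rate": math.log(2), "length": 100.0},
    {"cost": 2, "waiting": 1, "rate": 0.0, "length": 1000.0},
]
print(pcs_l_expected_length(videos, dt=1.0))
```

In this example the better-ranked video has a 0.5 chance of receiving a request within one second, so the result is an even mix of the two stream lengths. PCS would then compare this value with the stream length required at the current scheduling time.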
As can be clearly seen from the algorithms, the time overhead of both PCS-V and PCS-L depends on the number of videos, assuming that a priority queue structure is used to rank the videos according to the objective function.
5. Proposed Adaptive Regular Stream Triggering (ART)
As will be shown later, our analysis reveals a significant interaction between stream merging and scheduling decisions. One of the pertinent issues is how to best handle regular (i.e., full) streams. MCF-P (RAP) considers the cost of a regular stream as a patch and thus treats it in a differentiated manner. The question arises, however, as to whether it is worthwhile to delay regular streams in certain situations. Guided by analysis, we propose a technique, called Adaptive Regular Stream Triggering (ART). A possible implementation is shown in Algorithm 4. The basic idea here is to delay regular streams as long as the number of free channels is below a certain threshold, which is to be computed dynamically based on the current workload and system state. ART uses the same algorithm as PCS (Algorithm 1) to dynamically find the best value of freeChannelThresh.
Algorithm 4: Simplified implementation of ART.
else // full stream
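The basic ART rule can be sketched in a few lines. This is a minimal illustration of the idea stated in the text (serve patches immediately, delay full streams while free channels are scarce), not the authors' Algorithm 4; the function and parameter names are hypothetical.

```python
def art_decide(is_full_stream, free_channels, free_channel_thresh):
    """Return 'serve' or 'postpone' for the video picked by the scheduling policy."""
    if not is_full_stream:
        return "serve"  # patches and other short streams are never delayed
    # Full stream: start it only when enough channels are free.
    if free_channels >= free_channel_thresh:
        return "serve"
    return "postpone"

# Hypothetical trace: a full stream is requested while channels free up over time.
for free in (1, 3, 5):
    print(free, art_decide(True, free, free_channel_thresh=4))
```

Because the rule only inspects the selected stream and the channel count, it can sit behind any scheduling policy, which is consistent with the claim that ART is policy-independent in principle.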
In principle, ART can be used with any scheduling policy, including PCS, although some negative interference happens when it is combined with PCS, as will be shown in Section 7.
6. Evaluation Methodology
6.1. Workload Characteristics
Summary of Workload Characteristics.
Request Arrival Process: Poisson Process (default)
Request Arrival Rate: Variable, Default is 40 Req./min
Server Capacity: 200 to 750 channels
Video Access Skew: 0.1 to 0.6, Default = 0.271
Number of Videos: Variable, Default is 120
Video Length: Fixed-Length Video Workload (Default) with length of 60 to 180 min (same for all videos), Default = 120 min; Variable-Length Video Workload with lengths randomly in the range 60 to 180 min
Waiting Tolerance Model: A, B, C, Default is A
Flash Crowds: two movie lengths and flash crowds arrival rate is variable; Default: no flash crowds
We characterize the waiting tolerance of customers by three models. In Model A, the waiting tolerance follows an exponential distribution with a given mean [24, 25]. In Model B, users whose expected waiting times are less than a given threshold will wait, and the others have the same waiting tolerance as in Model A [24, 25]. We use Model C to capture situations in which users either wait or defect immediately depending on the expected waiting times. The user waits if the expected waiting time is less than a lower threshold and defects immediately if it is greater than an upper threshold. Otherwise, the defection probability increases linearly from 0 to 1 for expected waiting times between the two thresholds.
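The three tolerance models can be sketched as small sampling functions. This is an illustrative sketch under stated assumptions: the parameter names (`mean_tolerance`, `threshold`, `lower`, `upper`) are hypothetical stand-ins for the model parameters, whose original symbols and values are not reproduced here.

```python
import random

def tolerance_model_a(mean_tolerance, rng=random):
    """Model A: waiting tolerance drawn from an exponential distribution."""
    return rng.expovariate(1.0 / mean_tolerance)

def tolerance_model_b(expected_wait, threshold, mean_tolerance, rng=random):
    """Model B: always wait if the expected wait is short, otherwise behave as Model A."""
    if expected_wait < threshold:
        return float("inf")  # the user waits for as long as needed
    return tolerance_model_a(mean_tolerance, rng)

def defection_probability_model_c(expected_wait, lower, upper):
    """Model C: probability that the user defects immediately."""
    if not lower < upper:
        raise ValueError("lower must be smaller than upper")
    if expected_wait < lower:
        return 0.0
    if expected_wait > upper:
        return 1.0
    # Linear ramp from 0 to 1 between the two thresholds.
    return (expected_wait - lower) / (upper - lower)

print(defection_probability_model_c(4.0, lower=2.0, upper=6.0))
```

Passing a seeded `random.Random` instance as `rng` makes simulation runs repeatable.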
As in most previous studies, we generally study a server with 120 videos, each of which is 120 minutes long. We examine the server at different loads by fixing the request arrival rate at 40 requests per minute and varying the number of channels (server capacity) generally from 200 to 750. In addition to the fixed-length video workload (in which all videos have the same length), we experiment with a variable-length video workload. Moreover, we study the impacts of arrival rate, user's waiting tolerance, number of videos, and video length (in the fixed-length workload).
The flash-crowd workload characteristics were adopted from prior work.
6.2. Considered Performance Metrics
To evaluate the effectiveness of the proposed schemes, we consider the main performance metrics discussed in Section 2.1. In addition, we analyze waiting-time predictability by two metrics: waiting-time prediction accuracy and the percentage of clients receiving expected waiting times. The waiting-time prediction accuracy is determined by the average deviation between the expected and actual waiting times. For waiting-time prediction, we use a previously proposed algorithm. Note that this algorithm may not provide an expected waiting time to each client because the prediction may not always be performed accurately.
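The two predictability metrics can be computed from a log of serviced requests, as sketched below. The sketch assumes that the deviation is the absolute difference between expected and actual waiting times and that a client without a prediction is recorded with `None`; both the record layout and the function name are hypothetical.

```python
def predictability_metrics(records):
    """Return (average deviation, percentage of clients given an expected wait).

    records: list of (expected_wait, actual_wait) pairs in seconds, where
    expected_wait is None when no prediction was given to the client.
    """
    with_prediction = [(e, a) for e, a in records if e is not None]
    if records:
        pct_with_prediction = 100.0 * len(with_prediction) / len(records)
    else:
        pct_with_prediction = 0.0
    if with_prediction:
        avg_deviation = sum(abs(e - a) for e, a in with_prediction) / len(with_prediction)
    else:
        avg_deviation = None  # no client received a prediction
    return avg_deviation, pct_with_prediction

# Hypothetical log: two clients received predictions, two did not.
log = [(10.0, 12.0), (None, 5.0), (20.0, 17.0), (None, 3.0)]
print(predictability_metrics(log))
```

Reporting both numbers together matters because a policy can trade one for the other: it may give predictions to more clients while being less accurate on average.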
7. Result Presentation and Analysis
7.1. Comparing the Effectiveness of Different Cost-Computation Alternatives
Although computing the cost over a time interval seems intuitively to be an excellent choice, it interferes negatively with stream merging. Later in this paper, we discuss how the interaction between stream merging and scheduling can be exploited by the proposed ART technique, which can be used with any scheduling policy. Based on these results, in the remainder of the paper we consider only computing the cost at the current scheduling time.
7.2. Effectiveness of the Proposed PCS Policy
7.3. Effectiveness of the Proposed ART Enhancement
7.4. Comparing the Effectiveness of PCS and ART
With ERMT, MCF-P combined with ART performs better than PCS-V in terms of the customer defection probability and average waiting time. The results with Transition Patching and Patching exhibit different behavior from those with ERMT: MCF-P combined with ART gives almost the same results as PCS-V in terms of customer defection probability, but it reduces the average waiting time significantly. The unfairness with PCS-V is lower than that with ART under all stream merging techniques because ART favors videos with shorter streams more than PCS-V does. These results indicate that MCF-P combined with ART is the best overall performer.
7.5. Impact of Workload Parameters on the Effectiveness of PCS and ART
7.6. Comparing Waiting-Time Predictability with PCS and ART
7.7. Impact of Flash Crowds on the Effectiveness of PCS and ART
7.8. Effectiveness of Combining ART with PCS
There is no clear advantage of computing the cost over a future time window, compared with computing the cost only at the current scheduling time.
The proposed PCS scheduling policy outperforms the best existing policy (MCF-P) in terms of customer defection probability and average waiting time. The waiting times can also be predicted more accurately with PCS. The two variations of PCS (PCS-V and PCS-L) perform nearly the same and thus the simpler variant (PCS-V) is preferred because of its lower implementation complexity.
By enhancing stream merging behavior, the proposed ART technique substantially improves both the customer defection probability and the average waiting time.
Although ART can in principle be applied with any scheduling policy, including PCS, negative interference exists between ART and PCS, and thus their combination generally performs worse than either of them applied individually. Removing this interference by modifying these two strategies is a challenging task and is left for future study.
The best overall performer is "MCF-P combined with ART", followed by PCS. With ART, significantly more clients can receive expected waiting times for service than with PCS, but at a somewhat lower waiting-time prediction accuracy.
This paper is a revised and extended version of our paper "Performance and waiting-time predictability analysis of design options in cost-based scheduling for scalable media streaming," which was presented at the International MultiMedia Modeling Conference (MMM 2009), Antipolis, France, January 2009. It also combines the MMM 2009 paper with a short paper "Predictive cost-based scheduling for scalable video streaming," presented at the IEEE International Conference on Multimedia & Expo (ICME 2008), Hannover, Germany, June 2008. This work was supported in part by NSF Grants CNS-0626861 and CNS-0834537.
[1] Hua KA, Cai Y, Sheu S: Patching: a multicast technique for true video-on-demand services. Proceedings of the 6th ACM International Conference on Multimedia, 1998, 191-200.
[2] Eager DL, Vernon MK, Zahorjan J: Optimal and efficient merging schedules for video-on-demand servers. Proceedings of the 7th ACM International Conference on Multimedia, 1999, 199-202.
[3] Cai Y, Hua KA: Sharing multicast videos using patching streams. Multimedia Tools and Applications 2003, 21(2):125-146. doi:10.1023/A:1025516608573
[4] Rocha M, Maia M, Cunha I, Almeida J, Campos S: Scalable media streaming to interactive users. Proceedings of the 13th Annual ACM International Conference on Multimedia, 2005, 966-975.
[5] Ma H, Shin GK, Wu W: Best-effort patching for multicast true VoD service. Multimedia Tools and Applications 2005, 26(1):101-122. doi:10.1007/s11042-005-6851-x
[6] Huang C-J, Chuang Y-T, Guan C-T, Luo Y-C, Hu K-W, Chen C-H: A hybrid priority-based video-on-demand resource sharing scheme. Computer Communications 2008, 31(10):2231-2241. doi:10.1016/j.comcom.2008.02.007
[7] Dai H, Chan E: Quick patching: an overlay multicast scheme for supporting video on demand in wireless networks. Multimedia Tools and Applications 2008, 36(3):221-242. doi:10.1007/s11042-007-0143-6
[8] Cai Y, Hua KA: An efficient bandwidth-sharing technique for true video on demand systems. Proceedings of the 7th ACM International Conference on Multimedia, 1999, 211-214.
[9] Cai Y, Tavanapong W, Hua K: Enhancing patching performance through double patching. Proceedings of the 9th International Conference on Distributed Multimedia Systems, 2003, 72-77.
[10] Eager DL, Vernon MK, Zahorjan J: Bandwidth skimming: a technique for cost-effective video-on-demand. Multimedia Computing and Networking 2000, January 2000, San Jose, Calif, USA, Proceedings of SPIE 3969:206-215.
[11] Juhn L-S, Tseng L-M: Harmonic broadcasting for video-on-demand service. IEEE Transactions on Broadcasting 1997, 43(3):268-271. doi:10.1109/11.632927
[12] Paris J-F, Carter SW, Long DDE: Efficient broadcasting protocols for video on demand. Proceedings of the IEEE International Workshop on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS '98), July 1998, Montreal, Canada, 127-132.
[13] Huang C, Janakiraman R, Xu L: Loss-resilient on-demand media streaming using priority encoding. Proceedings of the 12th ACM International Conference on Multimedia, October 2004, New York, NY, USA, 152-159.
[14] Shi L, Sessini P, Mahanti A, Li Z, Eager DL: Scalable streaming for heterogeneous clients. Proceedings of the 14th Annual ACM International Conference on Multimedia (MM '06), October 2006, Santa Barbara, Calif, USA, 337-346.
[15] Gill P, Shi L, Mahanti A, Li Z, Eager DL: Scalable on-demand media streaming for heterogeneous clients. ACM Transactions on Multimedia Computing, Communications and Applications 2008, 5(1):1-24.
[16] Sarhan NJ, Qudah B: Efficient cost-based scheduling for scalable media streaming. Multimedia Computing and Networking 2007, January 2007, San Jose, Calif, USA, Proceedings of SPIE 6504.
[17] Alsmirat MA, Al-Hadrusi M, Sarhan NJ: Analysis of waiting-time predictability in scalable media streaming. Proceedings of the 15th ACM International Conference on Multimedia (MM '07), September 2007, Augsburg, Bavaria, 727-736.
[18] Carter SW, Long DDE: Improving video-on-demand server efficiency through stream tapping. Proceedings of the 6th International Conference on Computer Communications and Networks (ICCCN '97), September 1997, Las Vegas, Nev, USA, 200-207.
[19] Eager D, Vernon M, Zahorjan J: Minimizing bandwidth requirements for on-demand data delivery. IEEE Transactions on Knowledge and Data Engineering 2001, 13(5):742-757. doi:10.1109/69.956098
[20] Bar-Noy A, Goshi J, Ladner RE, Tam K: Comparison of stream merging algorithms for media-on-demand. Multimedia Systems 2004, 9(5):411-423. doi:10.1007/s00530-003-0114-3
[21] Dan A, Sitaram D, Shahabuddin P: Scheduling policies for an on-demand video server with batching. Proceedings of the 2nd ACM International Conference on Multimedia, 1994, 391-398.
[22] Aggarwal CC, Wolf JL, Yu PS: The maximum factor queue length batching scheme for video-on-demand systems. IEEE Transactions on Computers 2001, 50(2):97-110. doi:10.1109/12.908987
[23] Costa C, Cunha I, Borges A, et al.: Analyzing client interactivity in streaming media. Proceedings of the 13th International World Wide Web Conference (WWW '04), May 2004, New York, NY, USA, 534-543.
[24] Tsiolis AK, Vernon MK: Group-guaranteed channel capacity in multimedia storage servers. Proceedings of the ACM International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS '97), June 1997, Seattle, Wash, USA, 285-297.
[25] Sarhan NJ, Das CR: A new class of scheduling policies for providing time of service guarantees in video-on-demand servers. Proceedings of the 7th IFIP/IEEE International Conference on Management of Multimedia Networks and Services, 2004, 127-139.
[26] Ari I, Hong B, Miller E, Brandt S, Long D: Managing flash crowds on the internet. Proceedings of the 11th IEEE/ACM International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS '03), 2003, 246-249.
[27] Qudah B, Sarhan NJ: Analysis of resource sharing and cache management techniques in scalable video-on-demand. Proceedings of the 14th IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS '06), 2006, 327-334.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.