Full-reference video quality metric assisted the development of no-reference bitstream video quality metrics for real-time network monitoring
© Sedano et al.; licensee Springer. 2014
Received: 18 March 2013
Accepted: 23 December 2013
Published: 14 January 2014
High-quality video is being increasingly delivered over Internet Protocol networks, which means that network operators and service providers need methods to measure the quality of experience (QoE) of the video services. In this paper, we propose a method to speed up the development of no-reference bitstream objective metrics for estimating QoE. This method uses full-reference objective metrics, which makes the process significantly faster and more convenient than using subjective tests. In this process, we have evaluated six publicly available full-reference objective metrics in three different databases, the EPFL-PoliMI database, the HDTV database, and the Live Video Wireless database, all containing transmission distortions in H.264 coded video. The objective metrics could be used to speed up the development process of no-reference real-time video QoE monitoring methods that are receiving great interest from the research community. We show statistically that the full-reference metric Video Quality Metric (VQM) performs best considering all the databases. In the EPFL-PoliMI database, SPATIAL MOVIE performed best and TEMPORAL MOVIE performed worst. When transmission distortions are evaluated, using the compressed video as the reference provides greater accuracy than using the uncompressed original video as the reference, at least for the studied metrics. Further, we use VQM to train a lightweight no-reference bitstream model, which uses the packet loss rate and the interval between instantaneous decoder refresh frames, both easily accessible in a video quality monitoring system.
Streaming high-quality digital video over Internet Protocol (IP)-based networks is increasing in popularity both among users and operators. Two examples of these applications are IPTV and Over The Top (OTT) Video. IPTV systems are managed by one operator, from video head-end to the user, and are based on ordinary broadcast television, using IP multicast. OTT Video is used to describe the delivery of TV over the public Internet, using unicast.
In order to ensure a high Quality of Experience (QoE), the network operators and service providers need methods to monitor the quality of the video services. The monitoring and prediction should be performed in real time and in different parts of the network. Since users' experienced quality is not easily understood and depends on many aspects, subjective assessments involving a panel of observers constitute the most accurate method to measure the video QoE. However, in a monitoring situation, subjective assessments are very hard to perform, and therefore objective measurement methods are desirable. Even the development of these measurement methods usually requires subjective data, which may be cumbersome and time consuming to obtain when developing real-time monitoring systems. Furthermore, for subjective testing to be accurate, it requires careful planning, preparation, and the involvement of a number of viewers. This makes it costly to conduct.
Instead, objective metrics, which accurately characterize the video quality and predict viewer quality of experience, have evolved for some time now, but there is still a long way to go before they, in general, can accurately predict the results of subjective measurements. The objective metrics can be classified as no reference, reduced reference, and full reference. Traditionally, in the full-reference scenario, an original undistorted high-quality video is compared to a degraded version of the same video, for example, pixel by pixel or block based. Reduced-reference methods require partial or parameterized information about the original video sequence. No-reference methods rely only on the degraded video. Here, we generalize the concept of FR, RR, and NR by also including packet header models, bitstream models, and hybrid models together with the pure video-based models based on the amount of reference information used by the models, as suggested in Barkowsky et al.
Definition of scope
Find FR model for the scope
Train NR model using FR model
Evaluate the performance of NR model
To illustrate the methodology, we have selected a concrete example in which we perform all the necessary steps of the procedure described above. We do not claim that this is the first time FR models are evaluated for packet loss, but to our knowledge, it has not been done for this particular scope, i.e., packet loss combined with a coded reference. The evaluation is presented to illustrate the methodology. If it is already known which FR model is best for a particular scope, this step can be omitted. Better NR models may exist for this scope, but the relatively high performance of the model presented here should be weighed against its simplicity and, most importantly, its low development effort.
Traditionally, video quality metrics base their predictions on computations on the video data itself. Nowadays, there are emerging models that also utilize network information, either by itself, i.e., bitstream models, or in combination with the video data, i.e., hybrid models. A review of no-reference video quality estimators can be found in Hemami and Reibman [7].
Most of the objective metrics have been developed and tested to estimate the perceived quality when the video is only compression degraded, for example, see [8, 9]. However, video delivered over the Internet will be degraded by transmission distortions, for example, packet loss. Several studies have shown that even a low packet loss rate can and most often will affect the video quality, for example, see Greengrass et al. [10].
Also, for objective quality monitoring, there are two other aspects of performance apart from prediction accuracy that are important: computational requirements and run time. To be used in a real-time monitoring and prediction system, the objective quality methods must be lightweight and cannot require the original video reference [12, 13]. Independent evaluations are scarce with the notable exception of the work by the Video Quality Experts Group (VQEG). One of the problems a developer or a tester must face is the unavailability of video databases, especially if they contain videos subject to packet losses. Also, in real network deployments, the uncompressed original sequence is usually not available. Therefore, we believe that it is important to evaluate the performance of the metrics when a compressed reference is used instead. For example, the video quality degradation introduced by a network node could be evaluated applying a full-reference metric such as VQM comparing the compressed reference that is available before the video enters the network node and the degraded video due to transmission distortions that is available after the video exits the network node.
The paper starts by describing the publicly available video databases that have been used for the development work. The details about the video sequences and how they have been compressed and distorted are described. Also, the subjective tests that have been performed with the aforementioned video sequences are described in terms of number of viewers, viewing conditions, etc.
In the following section, the objective full-reference assessment algorithms are reviewed, and the scenarios in which they are used are outlined.
The following section contains the results of the evaluation of the objective full-reference assessment algorithms against the databases with subjective test data. The means of evaluation are the Spearman Rank Order Correlation Coefficient (SROCC), the Pearson correlation coefficient, the Root Mean Square Error (RMSE), and the Outlier Ratio (OR).
In the last section of the paper, we show how a no-reference objective model can be developed by training it against a full-reference objective metric. Naturally, we choose the metric with the best performance, as evaluated in the previous section.
3. Video databases
Summary of conditions of all databases

| | EPFL-PoliMI video database | HDTV video database | LIVE Wireless video database |
|---|---|---|---|
| Number of sequences | Total 78, 6 different source sequences | Total 45, 9 different source sequences in compressed and uncompressed formats | Total 170, 10 different source sequences |
| Resolution | 4CIF (704 × 576) | 1,920 × 1,080 | 768 × 480 |
| Duration | 10 and 8 s | | |
| Reference | Compressed (with high quality) | Compressed (with high quality) and uncompressed | Compressed (with high quality) |
| Compression | Fixed QP between 28 and 32 | Fixed QP value 26 | Reference video: fixed QP value 18. Degraded videos: bitrates 500 kbps, 1 Mbps, 1.5 Mbps and 2 Mbps |
| I-frame period | | | Reference video: 14 frames. Degraded videos: 96 |
| Frame rate | 25 and 30 fps | 59.94 fields per second (interlaced) | |
| Packet loss rate (PLR) | PLR 0.1%, 0.4%, 1%, 3%, 5%, 10%. Two different channel realizations for each PLR | PLR 0.7% (from 42% to 56% of the way), 4.2% (from 21% to 64% of the way), 4.2% (from 42% to 56% of the way) | PLR 0.5%, 2%, 5% and 17% |
| Codec software | H.264/AVC JM reference software | | H.264/AVC JM reference software |
| Number of subjects | 21 at PoliMI lab and 19 at EPFL lab | | |
| Processing of subjective scores | Difference scores → Z-scores (with outliers detection) → re-scaling to range [0,5] → DMOS | Difference scores → Z-scores (with outliers detection) → re-scaling to range [0,5] → DMOS both for compressed and uncompressed reference | Difference scores → DMOS → re-scaling to range [0,5] |
3.1 EPFL-PoliMI video database
The freely available EPFL-PoliMI (Ecole Polytechnique Fédérale de Lausanne and Politecnico di Milano) video quality assessment database [16–18] was specifically designed for the evaluation of transmission distortions. The database contains 78 video sequences at 4CIF spatial resolution (704 × 576 pixels). The distorted videos were created from five 10-s-long and one 8-s-long uncompressed video sequences in planar I420 raw progressive format.
The reference videos were lightly compressed to ensure high video quality in the absence of packet losses. Therefore, a fixed Quantization Parameter between 28 and 32 was selected for each sequence. The Quantization Parameter regulates how much spatial detail is saved. The sequences were encoded and decoded in H.264/AVC High Profile in the H.264/AVC reference software. B-pictures and Context-adaptive binary arithmetic coding (CABAC) were enabled for coding efficiency. Each frame was divided into a fixed number of slices, where each slice consists of a full row of macroblocks.
The compressed videos in the absence of packet losses were used as the reference for the computation of the DMOS (Differential Mean Opinion Score) values. Three of the reference videos have a frame rate of 25 frames per second (fps). This was accomplished by cropping HD resolution video sequences down to 4CIF resolution and reducing the frame rate from 50 to 25 fps. The other three videos have a frame rate of 30 fps.
The transmission distortions were simulated at different packet loss rates (PLR) (0.1%, 0.4%, 1%, 3%, 5%, 10%). The packet loss was generated using a two-state Gilbert's model with an average burst length of three packets and two different channel realizations were selected for each PLR.
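The loss process described above can be illustrated with a minimal sketch of a two-state Gilbert model. The parameterization is an assumption, not taken from the database documentation: every packet sent in the Bad state is lost, the Bad-to-Good probability is the inverse of the average burst length, and the Good-to-Bad probability is chosen so that the stationary loss rate equals the target PLR. The function names are hypothetical.

```python
import random


def gilbert_transition_probs(plr, mean_burst_len):
    """Return (p_gb, p_bg), the Good->Bad and Bad->Good transition probabilities.

    Assumption: every packet in the Bad state is lost, so the stationary
    loss rate is p_gb / (p_gb + p_bg) and the mean burst length is 1 / p_bg.
    """
    if not 0.0 < plr < 1.0 or mean_burst_len < 1.0:
        raise ValueError("need 0 < plr < 1 and mean_burst_len >= 1")
    p_bg = 1.0 / mean_burst_len
    p_gb = p_bg * plr / (1.0 - plr)
    if p_gb > 1.0:
        raise ValueError("target PLR is not reachable with this burst length")
    return p_gb, p_bg


def simulate_losses(n_packets, plr, mean_burst_len=3.0, seed=0):
    """Return a list of booleans, True where the packet is lost."""
    rng = random.Random(seed)  # a different seed gives a different channel realization
    p_gb, p_bg = gilbert_transition_probs(plr, mean_burst_len)
    bad = rng.random() < plr   # start from the stationary distribution
    lost = []
    for _ in range(n_packets):
        lost.append(bad)
        if bad:
            bad = rng.random() >= p_bg  # stay in Bad with probability 1 - p_bg
        else:
            bad = rng.random() < p_gb   # enter Bad with probability p_gb
    return lost
```

Calling `simulate_losses` twice with the same PLR and two different seeds corresponds to the two channel realizations per PLR mentioned above.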
Forty naive subjects took part in the subjective tests. The subjective evaluation was done using the ITU continuous scale in the range [0–5]. Twenty-one subjects participated in the evaluation at the PoliMI lab and 19 at the EPFL lab. More details about the subjective evaluation can be found in [16–18].
3.1.2 Processing of subjective scores
Although the raw subjective scores were already processed in the EPFL-PoliMI database, we processed them in a different way in order to merge the data from the two labs.
First, we calculated the difference scores by subtracting the scores of the degraded videos from the scores of the corresponding reference videos. The difference scores for the reference videos were set to 0 and were removed. Accordingly, a lower difference score indicates a higher quality.
Each subject may have used the rating scale differently and with a different offset. To account for this, the Z-scores were computed for each subject separately by means of the MATLAB zscore function. The Z-scores transform the original distribution into one in which the mean is zero and the standard deviation is one. This normalization reduces the gain and offset differences between the subjects. Subsequently, the outliers were detected according to the guidelines described in ITU-T Rec 910 Annex 2 Section 2.3.1 and removed.
Finally, the Difference Mean Opinion Score (DMOS) of each video was computed as the mean of the re-scaled Z-scores from the 36 subjects that remained after rejection. Additionally, the confidence intervals were also computed. The methodology for the processing of the scores shown in this paper has been applied by many authors. For example, see .
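The processing chain described in this section can be sketched as follows. This is an illustration under stated assumptions, not the exact procedure used for the databases: the outlier rejection step is omitted, the Z-scores use the sample standard deviation, and the re-scaling to [0, 5] is assumed to be the linear map 5(z + 3)/6, which is not specified in the text. The function names are hypothetical.

```python
from statistics import mean, stdev


def zscores(values):
    """Per-subject Z-scores using the sample standard deviation.

    Raises ZeroDivisionError if a subject gave identical difference scores.
    """
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def dmos_from_raw(ref_scores, deg_scores):
    """Compute one DMOS value per video.

    ref_scores[s][v] and deg_scores[s][v] are the raw scores that subject s
    gave to the reference and to the degraded version of video v.
    """
    per_subject = []
    for ref, deg in zip(ref_scores, deg_scores):
        # Difference score: reference minus degraded (lower means better quality).
        diff = [r - d for r, d in zip(ref, deg)]
        z = zscores(diff)
        # ASSUMPTION: linear re-scaling that maps z in [-3, 3] onto [0, 5].
        per_subject.append([5.0 * (x + 3.0) / 6.0 for x in z])
    n_videos = len(per_subject[0])
    return [mean(subj[v] for subj in per_subject) for v in range(n_videos)]
```

Because the Z-scores are computed per subject, two subjects who rank the videos identically but use different portions of the rating scale contribute the same values to the DMOS.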
3.2 HDTV video database
The HDTV video database was made freely available by Barkowsky et al. The video database contains nine different source video sequences, and we selected three different conditions corresponding only to transmission distortions. In that work, these are referred to as the Hypothetical Reference Circuits (HRC) 5, 6, and 7. HRCs 5 to 7 are coded with high quality (QP26) and contain simulated transmission errors, mainly blurriness and motion artifacts. The errors were inserted in the middle of the video sequence. In HRC 5, from 42% to 56% of the way through the 14-s sequence's bitstream (before removing the beginning and end of the sequence), 0.7% of packets were randomly lost. HRC 6 contained 4.2% of packets randomly lost from 21% to 64% of the way through the bitstream. HRC 7 contained 4.2% of packets randomly lost from 42% to 56% of the way through the bitstream.
The encoder always used two interlaced slice groups of two macroblock lines. For error recovery, an intra image was forced every 24 frames and the ratio of intra macroblock refresh was 5%. The video resolution was 1,920 × 1,080 pixels at 59.94 fields-per-second in interlaced format. The sequences have a duration of 10 s. In total, 24 naive observers viewed the content. The Absolute Category Rating with Hidden Reference (ACR-HR) conforming to ITU-T P.910 with a five-point rating scale was used. The subjects viewed the content at a distance of 1.5 m corresponding to three times the picture height. More details about the subjective experiment can be found in .
The processing of the subjective scores was performed in the same way as for the EPFL-PoliMI video database. The DMOS values were calculated both for the scenario with compressed reference (QP26, HRC1) and with uncompressed reference (HRC0). Two outliers were found in the case of compressed reference and no outliers in the case of uncompressed reference.
3.3 LIVE Wireless video database
Moorthy et al. evaluated publicly available full-reference video quality assessment algorithms on the LIVE Wireless Video Quality Assessment database. The LIVE Wireless video database contains ten source sequences, each 10 s long at a rate of 30 frames per second. The source videos are in RAW uncompressed progressive scan YUV420 format with a resolution of 768 × 480. However, the videos used as reference were already compressed with high quality (average PSNR > 45 dB). For the reference sequences, the Quantization Parameter was set to 18 and the I-frame period to 14. One hundred sixty distorted videos were created (4 bitrates × 4 packet loss rates = 16 distorted videos per reference sequence). The simulated wireless transmission errors were inserted into the H.264 compressed videos, which were generated with the JM reference software (Version 13.1). The source videos were encoded using different bitrates: 500 kbps, 1 Mbps, 1.5 Mbps, and 2 Mbps with three different slice groups and an I-frame period of 96. The RD Optimization was enabled, and the baseline profile was used for encoding and hence did not include B-frames. The packet size was set to between 100 and 300 bytes. The Flexible Macroblock Ordering (FMO) mode was set as 'dispersed'.
Packet loss rates of 0.5%, 2%, 5%, and 17% were simulated using bit-error patterns captured from different real or emulated mobile radio channels. The JM reference software was used to decode the compressed video stream.
For the subjective test, the Single Stimulus Continuous Quality Evaluation with hidden reference was used. A total of 31 subjects participated in the study. The difference scores were calculated by subtracting the score that the subject assigned to the distorted sequence from the score that the subject assigned to the reference sequence. One subject was rejected. The scores from the remaining subjects were then averaged to form a Differential Mean Opinion Score (DMOS) for each sequence. No Z-scores were used. Finally, we re-scaled the DMOS values to the range [0–5]. More details on the subjective study can be found on . The LIVE Wireless video database is no longer publicly available because of the uniformity and simplicity of the content. However, we use this database because our study involves various video databases.
4. Objective assessment algorithms
The video quality metrics that were evaluated are the following well-known publicly available algorithms: Peak Signal-to-Noise Ratio (PSNR), Structural SIMilarity (SSIM) index, Multi-scale SSIM (MS-SSIM), Video Quality Metric (VQM), Visual Signal to Noise Ratio (VSNR), and MOtion-based Video Integrity Evaluation (MOVIE). The performance of the objective models is evaluated using the Spearman Rank Order Correlation Coefficient, the Pearson Linear Correlation Coefficient, the Root-Mean-Square Error (RMSE) and the Outlier Ratio. A non-linear regression was done using a monotonic function. The performance of the different metrics was compared by means of a statistical significance analysis based on the Pearson, RMSE, and Outlier Ratio coefficients.
4.2 Video quality algorithms
We have evaluated and compared several well-known objective video quality algorithms using the videos and subjective results in the three databases. The objective algorithms are described below. The default values of the metrics were used for all the metrics. No registration problems, i.e., a misalignment between the reference and degraded videos due to the loss of entire frames, occurred in the dataset.
4.2.1 Peak Signal-to-Noise Ratio
PSNR is computed using the mean of the MSE vector (which contains the Mean Square Error of each frame). The MSE is computed per frame. The implementation used is based on the 'PSNR of YUV videos' program (yuvpsnr.m) by Dima Pröfrock available in the MATLAB Central file repository. Only the luminance values were considered.
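The aggregation described above (per-frame MSE, then PSNR from the mean of the MSE vector) can be sketched as follows. This is not the yuvpsnr.m code; it is a minimal illustration that assumes 8-bit luminance samples (peak value 255) and frames given as flat sequences of luma values. The function names are hypothetical.

```python
import math


def frame_mse(ref_luma, deg_luma):
    """Mean square error between two frames given as flat luma sequences."""
    if len(ref_luma) != len(deg_luma):
        raise ValueError("frames must have the same number of samples")
    return sum((r - d) ** 2 for r, d in zip(ref_luma, deg_luma)) / len(ref_luma)


def sequence_psnr(ref_frames, deg_frames, peak=255.0):
    """PSNR in dB computed from the mean of the per-frame MSE values."""
    mse_per_frame = [frame_mse(r, d) for r, d in zip(ref_frames, deg_frames)]
    mean_mse = sum(mse_per_frame) / len(mse_per_frame)
    if mean_mse == 0.0:
        return math.inf  # identical sequences
    return 10.0 * math.log10(peak ** 2 / mean_mse)
```

Averaging the MSE before taking the logarithm is not the same as averaging per-frame PSNR values; the former gives more weight to heavily distorted frames.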
4.2.2 Structural SIMilarity
SSIM is computed for each frame. After that, an average value is produced. The implementation used is an improved version of the original in which the scale parameter of SSIM is estimated. The implementation, named ssim.m, can be downloaded from the author's implementation home page. Only the luminance values were considered.
4.2.3 Multi-scale SSIM
MS-SSIM is computed for each frame. Afterwards, an average value is produced. The implementation used was downloaded from the Laboratory for Image & Video Engineering (LIVE) at the University of Texas at Austin. Only the luminance values were considered.
4.2.4 Video Quality Metric
For VQM, we used the software version 2.2 for Linux that was downloaded from the author's implementation home page. We used the following parameters: parsing type none, spatial, valid, gain and temporal calibration automatic, temporal algorithm sequence, temporal valid uncertainty false, alignment uncertainty 15, calibration frequency 15, and video model general model. The files were converted from planar 4:2:0 to the format required by VQM (Big-YUV file format, 4:2:2) using ffmpeg.
4.2.5 Visual Signal-to-Noise Ratio
VSNR is computed using the total signal and noise values of the sequence. We modified the authors' implementation to extract the signal and noise values in order to sum them separately. Only the luminance values were considered. The VSNR was obtained by dividing the total amount of signal by the total amount of noise.
4.2.6 MOtion-based Video Integrity Evaluation
MOVIE includes three different versions: the Spatial MOVIE index, the Temporal MOVIE index and the MOVIE index. The MOVIE Index version 1.0 for Linux was used. The optional parameters framestart, frameend, or frameint were not used. Only EPFL-PoliMI was analyzed with MOVIE.
4.3 Statistical analysis
In order to test the performance of the objective algorithms, we computed the Spearman Rank Order Correlation Coefficient (SROCC), the Pearson correlation coefficient, the RMSE, and the Outlier Ratio (OR). The Spearman coefficient assesses how well the relationship between two variables can be described using a monotonic function. The Pearson coefficient measures the linear relationship between a model's predictions and the subjective data. The RMSE provides a measure of the prediction accuracy. Finally, the consistency attribute of the objective metric is evaluated by the Outlier Ratio.
In the above equation, the DMOSp is the predicted value. The four parameters were obtained using the MATLAB function 'nlinfit'.
In each of the databases, we used the function providing the best fitting. The performance of the metrics is compared by means of a statistical significance analysis based on the Pearson, RMSE, and Outlier Ratio coefficients.
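The four performance measures named in this section can be sketched in plain Python as follows. Two details are assumptions rather than statements from the text: the RMSE is computed without a degrees-of-freedom correction, and a prediction counts as an outlier when its absolute error exceeds twice a per-video spread value supplied by the caller (for example, the standard error of the subjective score). The inputs are the fitted predictions and the subjective DMOS values.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank order correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def rmse(pred, dmos):
    """Root mean square error (ASSUMPTION: divides by N, no correction)."""
    return math.sqrt(sum((p - d) ** 2 for p, d in zip(pred, dmos)) / len(pred))


def outlier_ratio(pred, dmos, spread):
    """Fraction of predictions with |error| > 2 * spread (ASSUMED criterion)."""
    outliers = sum(1 for p, d, s in zip(pred, dmos, spread) if abs(p - d) > 2.0 * s)
    return outliers / len(pred)
```

A monotonic but non-linear relation between metric and DMOS gives a Spearman coefficient of 1 and a Pearson coefficient below 1, which is why the non-linear fitting step precedes the Pearson, RMSE, and Outlier Ratio computations.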
5. Evaluation of full-reference objective metrics
In this section, we present the results of the statistical analysis. Also, in several figures, the scatter plots of the VQM objective metric scores vs. DMOS for the different databases are shown. We show the plots of the VQM objective metric because the VQM metric performs very well in all the video databases. The fitting function is also plotted.
EPFL-POLIMI video database
5.2 HDTV video database
HDTV video database compressed reference
HDTV video database uncompressed reference
5.3 Live Wireless database
Live Wireless video database
Our results show that VQM has a very good performance in all the databases, being the best metric among those studied in the HDTV video database (uncompressed reference) and in the LIVE Wireless video database. In the EPFL-PoliMI video database, SPATIAL MOVIE performed better than the other metrics. On the other hand, the performance of TEMPORAL MOVIE was lower than that of the other metrics, at least for the EPFL-PoliMI video database.
The performance of MOVIE, SPATIAL MOVIE, and TEMPORAL MOVIE was not evaluated in the HDTV video database and in the LIVE Wireless video database because the execution of the metric requires a very significant amount of time (many days) in comparison with the other metrics. This fact decreases the usability of these metrics considerably. It may be argued that for development purposes, it is less important, but with computation times of several hours, this is a problem also for this usage.
In the results from the HDTV video database, we can see that the prediction accuracy is higher when the reference is compressed than when the reference is uncompressed.
6. No-reference bitstream model development
In this section, we demonstrate how the full-reference objective metrics can be used to speed up the development process of no-reference bitstream real-time video QoE monitoring methods. In particular, we develop a no-reference bitstream model using the VQM full-reference metric, and we validate it using the subjective databases EPFL-PoliMI and LIVE Wireless Video Quality Assessment database.
We present a lightweight no-reference bitstream method that uses the packet loss rate and the interval between instantaneous decoder refresh frames (IDR frames) to estimate the video quality. IDR frames are 'delimiters' in the stream. After receiving an IDR frame, frames prior to the IDR frame can no longer be used for prediction. In addition, IDR frames are also I-frames, so they do not reference data from any other frame. This means they can be used as seek points in a video. The no-reference bitstream model was fitted using several videos from the Consumer Digital Video Library (CDVL) database and the VQM metric. Then, it was validated with the video databases EPFL-PoliMI and LIVE Wireless Video Quality Assessment database. The VQM metric has been used to train the no-reference bitstream model regarding only the transmission distortions, and no compression distortions such as QP have been taken into consideration because it has been shown that VQM is very accurate when only transmission distortions are considered using a compressed reference. The case where VQM is used to measure a combination of compression and transmission distortions (for example, different QP and packet loss rate with uncompressed reference) is not evaluated in this paper.
6.1 Framework for model development
We selected the VQM metric to develop a no-reference bitstream model because of the very good performance shown in the previous section.
List of SRCs used in model development

| Description | Name of the sequence in CDVL |
|---|---|
| Woman smoking and people on a street, high contrast in the rock | NTIA outdoor mall with tulips (3e) |
| Kayaking, scene changes, fast moving water | NTIA Red Kayak |
| Trees, leaves, short and numerous movements in most of the image, scene changes | NTIA Aspen Trees in Fall Color, Slow Scene Cuts |
| Mountain with snow and moving fog in a sunny day, high brightness, scene changes | NTIA Snow Mountain |
| Global view of a city, buildings, scene changes, rather static | NTIA Denver Skyscrapers |
| Two people speaking in a table and showing an electronic device | NTIA Front End (Part of a Longer Talk) |
The videos were converted from YUV packed 4:2:2 to YUV planar 4:2:0 and compressed with the Quantization Parameter (QP) set to 26, 32, 38, and 44, so that the no-reference model is valid for different compression qualities. However, the performance of VQM with a compressed reference has only been tested for a high-quality compressed reference, which may not correspond to a QP value of 44. This introduces a small degree of uncertainty in the obtained results, because the scenario in which the compressed reference has low quality remains to be verified. The keyint parameter of the x264 encoder, corresponding to the interval between IDR frames, was set to 12, 36, 60, and 84. The maximum slice size was set to 1400 bytes. We consider the keyint parameter important because the distortion due to a packet loss propagates until the next IDR frame; thus, a higher value implies more error propagation and lower video quality. Finally, the packet loss rate was set to 0.1%, 1%, 3%, 5%, and 10%. In total, 6 × 4 × 4 × 5 = 480 distorted videos were evaluated using the VQM metric.
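The full-factorial design above can be enumerated with a few lines of code. This is a small sketch of the bookkeeping only: the parameter values and sequence names come from the text and the SRC table, while the tuple layout and variable names are hypothetical.

```python
from itertools import product

# Source sequences (names as listed for the CDVL database).
SOURCES = [
    "NTIA outdoor mall with tulips (3e)",
    "NTIA Red Kayak",
    "NTIA Aspen Trees in Fall Color, Slow Scene Cuts",
    "NTIA Snow Mountain",
    "NTIA Denver Skyscrapers",
    "NTIA Front End (Part of a Longer Talk)",
]
QP_VALUES = [26, 32, 38, 44]              # Quantization Parameter
KEYINT_VALUES = [12, 36, 60, 84]          # interval between IDR frames
PLR_PERCENT = [0.1, 1.0, 3.0, 5.0, 10.0]  # packet loss rate in percent

# One (source, qp, keyint, plr) tuple per distorted video.
conditions = list(product(SOURCES, QP_VALUES, KEYINT_VALUES, PLR_PERCENT))

print(len(conditions))  # 6 * 4 * 4 * 5 = 480
```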
The videos were encoded with the x264 encoder, random packet losses were inserted using a packet loss simulator, and the videos were decoded with the ffmpeg decoder. The ffmpeg decoder produces incomplete video files when random packet losses are inserted. To be able to apply the VQM metric, the videos were reconstructed so that they have the same length as the original. The reconstruction was done in two steps. First, the frame numbers were inserted into the luminance information of the uncompressed original sequence. Then, after decoding the videos, the frame numbers were read and used to identify the missing frames and reconstruct the decoded video. The reconstruction method is explained in detail in .
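The second reconstruction step can be illustrated with the sketch below. How the missing frames are filled is not stated here, so the sketch makes a loud assumption: each missing frame is replaced by the most recent decoded frame (a frame freeze), and frames lost at the very beginning are replaced by the first available frame. Reading the frame number out of the luminance data is also outside the sketch; the decoded frames are assumed to arrive already paired with their frame numbers. The function name is hypothetical.

```python
def reconstruct_sequence(decoded, total_frames):
    """Rebuild a sequence of total_frames frames from an incomplete decode.

    decoded: list of (frame_number, frame) pairs, frame_number starting at 0.
    Returns (frames, missing), where missing lists the frame numbers that
    had to be filled in.
    ASSUMPTION: a missing frame is filled by repeating the previous frame.
    """
    if not decoded:
        raise ValueError("no decoded frames available")
    by_number = dict(decoded)
    last = by_number[min(by_number)]  # used if the first frames are missing
    frames, missing = [], []
    for n in range(total_frames):
        if n in by_number:
            last = by_number[n]
        else:
            missing.append(n)
        frames.append(last)
    return frames, missing
```

The output has the same length as the original sequence, so a full-reference metric can compare the two frame by frame without temporal misalignment.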
6.2 Model development
6.3 Validation of the model
To validate the no-reference bitstream model, we applied the model to the EPFL-PoliMI and LIVE Wireless Video Quality Assessment databases, and we calculated the linear correlation coefficient with the subjective values. The model was not checked on the HDTV database because, in that database, packet loss was applied only to a portion of each sequence, whereas our model expects a constant packet loss rate along the whole sequence. As the interval between IDR frames is fixed in all the databases used, we are only able to verify the part of the equation related to the packet loss rate. For the EPFL-PoliMI database, we obtained a linear correlation coefficient of 0.945, and for the LIVE Wireless Video Quality Assessment database, we obtained a linear correlation coefficient of 0.903. We believe that the model can be improved by adding new parameters and improving the fitting function used. The important fact is that these results validate the methodology followed in order to develop a no-reference bitstream model.
High-quality video streaming services over the Internet are increasing in popularity, and as people start to pay for the services, the quality must be guaranteed. Therefore, video quality monitoring and prediction become important in the development of Internet service management systems. Numerous objective assessment methods have been proposed; however, independent comparisons are scarce. Also, real-time monitoring requires lightweight no-reference bitstream models that perform accurately enough.
In this paper, we propose a strategy for developing new no-reference objective video quality metrics by using well-performing full-reference objective video quality metrics to reduce the development time. The starting point is to define a relatively narrow scope. An FR model that performs well for this scope is then found and used to create a large training database by varying the parameters that will be present. The NR model is trained on this database and can then be validated using a smaller subjective test. If the model needs to be used outside the scope, the strategy is to retrain the model for the new scope.
This strategy is illustrated on the scope of transmission distortions in the case of compressed reference. As a first step, we have evaluated six publicly available full-reference metrics using three freely available video databases. The main objective of the evaluation was to compare the performance of the metrics when transmission distortions in the form of packet loss were introduced. The results show that VQM performs very well in all the video databases, being the best metric among those studied in the HDTV video database (uncompressed reference) and in the LIVE Wireless video database. In the EPFL-PoliMI database, SPATIAL MOVIE performed best and TEMPORAL MOVIE performed worst. When transmission distortions are evaluated, using the compressed video as the reference provides greater accuracy than using the uncompressed original video as the reference, at least for the studied metrics.
We believe that the correlation values obtained would be lower if registration problems occurred and different error concealment strategies were applied.
Further, to demonstrate the suggested strategy of model development, we present a no-reference bitstream model trained and optimized using full-reference model evaluation. The objective of the model is to predict the video quality with sufficient accuracy when transmission distortions are introduced. We fit the model using videos from the Consumer Digital Video Library (CDVL) database and the VQM metric. Then, the model is validated using the video databases EPFL-PoliMI and LIVE Wireless Video Quality Assessment database with reasonable performance.
This work was developed inside the Future Internet project supported by the Basque Government within the ETORTEK Programme and the FuSeN project supported by the Spanish Ministerio de Ciencia e Innovación, under grant from Fundación Centros Tecnológicos - Iñaki Goenaga and Tecnalia Research & Innovation. The work was also partly financed by the CELTIC project IPNQSIS, and the national project EFRAIM, with the Swedish Governmental Agency for Innovation Systems (VINNOVA) supporting the Swedish contribution.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.