CU splitting early termination based on weighted SVM
 Xiaolin Shen^{1, 2} and Lu Yu^{1, 2}
https://doi.org/10.1186/1687-5281-2013-4
© Shen and Yu; licensee Springer. 2013
 Received: 14 May 2012
 Accepted: 20 December 2012
 Published: 9 January 2013
Abstract
High efficiency video coding (HEVC) is the latest video coding standard, developed by the JCT-VC. It employs many efficient coding tools (e.g., highly flexible quadtree coding block partitioning) and outperforms H.264/AVC by a 35–43% bitrate reduction. However, it imposes enormous computational complexity on the encoder because of the optimization performed in these coding tools, especially the rate distortion (RD) optimization over coding unit (CU), prediction unit, and transform unit sizes. In this article, we propose a CU splitting early termination algorithm to reduce the heavy computational burden on the encoder. CU splitting is modeled as a binary classification problem, to which a support vector machine (SVM) is applied. To reduce the impact of outliers and to maintain RD performance when a misclassification occurs, the RD loss due to misclassification is introduced as a weight in SVM training. Efficient and representative features are extracted and optimized by a wrapper approach to eliminate dependency on video content and on encoding configurations. Experimental results show that, compared with the HEVC reference software, the proposed algorithm achieves about 44.7% complexity reduction on average with only a 1.35% BD-rate increase under the “random access” configuration, and 41.9% time saving with a 1.66% BD-rate increase under the “low delay” setting.
Keywords
 HEVC
 fast coding unit decision
 classification
 SVM
 feature selection
1. Introduction
High definition (HD) and ultra-high definition (UHD) video content has become increasingly popular worldwide, so demand for video compression technologies that provide higher coding efficiency for HD/UHD video can be expected in the near future. In view of this, the high efficiency video coding (HEVC) standard is being developed by the Joint Collaborative Team on Video Coding (JCT-VC) [1], which was established by the ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts Group. HEVC outperforms the H.264/AVC high profile by a 35–43% bitrate reduction at the same reconstructed video quality [2]. HEVC inherits the well-known block-based hybrid coding scheme [3] used by previous coding standards, e.g., H.264/AVC, and extends the framework by introducing highly flexible quadtree coding block partitioning. The quadtree partitioning introduces the concepts of coding unit (CU), prediction unit (PU), and transform unit (TU). The CU is the basic unit of region splitting used for inter/intra coding; it extends the traditional macroblock (MB) with a hierarchical structure in which the block size varies from 64 × 64 to 8 × 8 pixels. A CU can be recursively split into four smaller CUs of equal size. In this manner, a picture is represented by a content-adaptive coding tree structure composed of CU blocks of different sizes. The PU is the basic unit used for the prediction process and has a rectangular shape. A PU is encoded with one of the modes in a candidate set, which is similar in spirit to the MB mode of H.264/AVC. The pixels in one PU share prediction information, e.g., modes, motion vectors (MV), and reference index. The TU is the basic unit for transform and quantization. It is defined in a similar way to the CU, and its size varies from 4 × 4 to 32 × 32.
As reported in [4, 5], the flexible data structure representation (extending the MB size up to 64 × 64) yields over 10% bitrate saving in comparison with the 16 × 16-based configuration in H.264/AVC, since flexible block partitioning deals effectively with the diversity of picture content.
However, the flexibility of block partitioning in HEVC imposes a significant computational burden on the encoder, which must search for the optimal combination of CU, PU, and TU sizes. It is therefore crucial for practical implementations of the new standard to reduce the complexity while maintaining the coding performance. Research on accelerating the encoder of the HEVC test model (HM) is emerging. A fast intra mode decision algorithm was proposed in [6], which uses the direction information of neighboring blocks to reduce the number of directions that take part in the rate distortion optimization (RDO) process. To reduce the computational complexity of TU size selection, a fast algorithm for residual quadtree mode decision was proposed in [7]. In that algorithm, the depth-first decision process for TU size selection in HM is replaced by a merge-and-split decision process, which also reduces unnecessary computation by using the inheritance property of zero-blocks and early termination schemes for nonzero blocks.
In this article, we focus on CU size selection for HEVC. A content-based fast CU decision algorithm was developed for the HEVC TMuC (test model under consideration) [8]. At the frame level, it analyzed the ratio of utilized CUs to the total number of CUs at each depth and skipped the rarely used CU depths. At the CU level, information from neighboring and co-located CUs was used to skip unnecessary depths. The algorithm investigated temporal and spatial correlations of CU depth and designed different thresholds to control the number of CU depths to be evaluated. However, the correlations were data dependent, and the ratio was affected by encoding configurations such as the hierarchical depth in a hierarchical prediction structure. In [9], the spatial correlation of CU depth and the probability that neighboring CUs were coded in SKIP mode were used to design an adaptive weighting factor, which adjusted the threshold for early termination of the remaining RD calculations of the current CU. In [10], a complexity control method was proposed that limits the number of coding decision tests and comparisons according to temporal correlations. All these works exploited spatial and/or temporal correlations of CU depth to eliminate specific CU depths with a trivial impact on RD performance. However, they were not robust enough, owing to the diversity of video content. More statistics need to be considered to obtain a more accurate and stable model that simplifies CU splitting.
In the field of accelerating H.264/AVC encoders and their extensions, various properties have been investigated and employed to simplify mode decision. In [11], a nearly sufficient condition for early zero-block detection was constructed from an analysis of the prediction error to speed up the motion estimation of the H.264/AVC JM reference software. It indicated that prediction error offers a valuable clue for encoder acceleration. Spatial and temporal correlations were exploited to predict the skip mode [12] and reduce encoder complexity. In [13, 14], the distribution of MVs in an MB was chosen as a feature to predict the optimal mode rather than performing an exhaustive search over all modes. A hierarchical algorithm proposed in [15] categorized all modes into three levels, which were triggered by evaluating the SAD (sum of absolute differences) between the current MB and its co-located MB, the high-frequency energy in the DCT domain, and the RD cost of mode P8 × 8. In [16], a fast mode decision algorithm named motion activity-based mode decision was proposed. It classified MBs into different classes using predefined thresholds and motion activity, and each class corresponded to a different number of modes to be checked. Tiesong et al. [17] projected encoding modes onto a 2D map and predicted an optimal 2D map using spatial and temporal information. A priority-based mode candidate list was then constructed from the optimal 2D map, and mode decision was performed starting with the most important mode in the list, subject to early termination conditions. In this way, the number of modes to be evaluated was reduced and acceleration was achieved. Changsung and Kuo [18] presented a feature-based fast inter/intra mode decision algorithm, which computed three features regarding spatial and temporal correlations to determine whether to use inter or intra mode. The feature space was partitioned into three regions, i.e., risk-free, risk-tolerable, and risk-intolerable regions, by checking the RD loss due to a wrong mode decision and the probability distribution of inter/intra modes. Depending on the region, mechanisms of different complexity were applied for the final mode decision. Martinez-Enriquez et al. [19] analyzed the conditional pdfs for every mode and estimated the RD cost to decide the optimal mode. A fast stereo video encoding algorithm based on a hierarchical two-stage neural network was proposed in [20]. Local properties of the input data and the prediction error were extracted as input features to train a neural network designed to predict the optimal partition mode. SVMs were also introduced in the study of fast mode decision [21, 22]. However, MBs were treated equally in the classification problem, and the RD performance of an MB was ignored. In general, these works exploited various mode-related features to predict the optimal mode or to reduce the number of modes to be evaluated. The features included spatial and temporal correlations, the gradient or high-frequency energy, the RD cost of a specific mode, motion activity, and local properties such as the prediction error or SAD/sum of absolute transformed differences (SATD).
As shown in previous research, the CU size selection process with RD optimization can be unacceptably time-consuming for practical implementation, as analyzed further in Section 2. To solve this problem, we propose a method that uses machine learning to accelerate the CU size selection process. By properly modeling the problem and applying a machine learning algorithm, our method can accurately predict the optimal CU splitting decision instead of searching exhaustively over all possibilities. To derive a more accurate prediction model, the RD difference is introduced as a weight in the SVM training procedure to alleviate the RD performance degradation caused by misclassification. Furthermore, various features are extracted from the input video and from previously encoded data, and an optimal feature subset is derived by a wrapper feature selection algorithm.
The rest of the article is organized as follows. Section 2 briefly reviews the CU size selection process of HM and presents the motivation for the proposed algorithm. Section 3 elaborates the modeling of the CU splitting problem and its solution based on a machine learning algorithm, i.e., SVM. Experimental results in Section 4 demonstrate the effectiveness of the proposed algorithm, and Section 5 concludes the article.
2. CU size optimization in HM
3. CU splitting early termination algorithm based on weighted SVM
3.1. Problem formulation
where φ(·) is a nonlinear operator that maps the input x_{i} into a higher-dimensional space, and K(x_{i}, x_{j}) = φ(x_{i})^T φ(x_{j}) is the kernel function.
where α_{i} and β_{i} are Lagrange multipliers associated with the constraints in Equation (6).
It is clear from Equation (10) that the α_{i} associated with training point x_{i} expresses the strength with which that point is embedded in the final decision function. Notice that the nonlinear mapping φ(·) never appears explicitly in the training or in the decision; only the kernel is evaluated. In general, the kernel is linear, polynomial, radial basis function (RBF), or sigmoid. In this article, we use the RBF kernel, since it can handle both linear and nonlinear relations between the class labels and the input vector. Furthermore, the model complexity of the RBF kernel is lower than that of the polynomial kernel, and the RBF kernel has fewer numerical difficulties [25].
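As a minimal sketch of how a trained RBF-kernel SVM classifies one input, the Python below evaluates the standard kernel-expansion decision function, sign(Σ α_i y_i K(x_i, x) + b), which is assumed here to be the form that Equation (10) takes. The function names, the parameter `gamma`, and the toy support vectors are illustrative assumptions and are not taken from the trained model in this article.

```python
import math

def rbf_kernel(x, z, gamma):
    """RBF kernel K(x, z) = exp(-gamma * ||x - z||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)

def svm_decision(x, support_vectors, labels, alphas, bias, gamma):
    """Return +1 or -1 from the kernel expansion over the support vectors."""
    score = bias
    for sv, y, alpha in zip(support_vectors, labels, alphas):
        # Each support vector contributes in proportion to its alpha.
        score += alpha * y * rbf_kernel(sv, x, gamma)
    return 1 if score >= 0 else -1

# Toy model with hypothetical numbers: one support vector per class.
SUPPORT_VECTORS = [(0.0, 0.0), (2.0, 2.0)]
LABELS = [-1, +1]
ALPHAS = [1.0, 1.0]
```

Only kernel evaluations appear in the code; the mapping φ(·) is never computed, which is the point made in the paragraph above.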
3.2. Proposed CU splitting early termination algorithm
3.3. CU splitting early termination algorithm based on weighted SVM
3.3.1. Offline training and weights generation
In the field of machine learning, accuracy is one of the most important measurements for classification algorithms. However, in this scenario, not only the ratio of correct classification, but also the loss of RD performance introduced by misclassifications is important.
where C_{i}(s) and C_{i}(n) are the RD cost of splitting the CU into four sub-CUs and the RD cost of not splitting it, respectively. A CU with a small RD cost difference is assigned a small weight, while a CU with a large RD cost difference is assigned a large weight. Note that the weights are needed only in the training procedure; they are not needed when the trained model is used to predict the class label during encoding.
The upper bound of each α_{i} is the sample-dependent value C·W_{i} instead of a constant C. CUs whose RD cost differs more between being encoded as one CU and as four sub-CUs therefore have a larger weight W_{i} and a stronger influence on the optimal hyperplane.
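The weighting idea can be sketched in a few lines of Python. The exact weight formula is not reproduced in this text, so the sketch assumes, purely for illustration, that W_i is the absolute RD cost difference |C_i(s) − C_i(n)| scaled to a mean of 1. The label convention (+1 for split) and all names are likewise hypothetical.

```python
from dataclasses import dataclass

@dataclass
class CuSample:
    cost_split: float     # C_i(s): RD cost when the CU is split into four sub-CUs
    cost_nonsplit: float  # C_i(n): RD cost when the CU is not split

def class_label(sample):
    """+1 if splitting gives the lower RD cost, else -1 (assumed convention)."""
    return 1 if sample.cost_split < sample.cost_nonsplit else -1

def training_weights(samples):
    """Assumed weight: |C_i(s) - C_i(n)|, scaled so the mean weight is 1."""
    raw = [abs(s.cost_split - s.cost_nonsplit) for s in samples]
    mean_raw = sum(raw) / len(raw)
    if mean_raw == 0:
        return [1.0] * len(raw)  # all samples are ties; fall back to equal weights
    return [r / mean_raw for r in raw]

def alpha_upper_bounds(samples, penalty_c):
    """Per-sample box constraint 0 <= alpha_i <= C * W_i."""
    return [penalty_c * w for w in training_weights(samples)]
```

A sample whose two RD costs are nearly equal gets a bound close to zero, so it can barely move the hyperplane. In scikit-learn, passing such weights as `sample_weight` to `SVC.fit` rescales C per sample in the same way.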
3.3.2. Feature selection
 (1) Collect training samples by running the HEVC reference software HM6.0.
 (2) Calculate the F-score of every feature in the training set and sort the features in descending order of F-score.
 (3) Start from a subset F that contains only the feature with the highest F-score.
   (a) Randomly divide the training set into S_{tr} and S_{cv}.
   (b) Train the SVM model using S_{tr}.
   (c) Predict S_{cv} and obtain the cross validation (CV) accuracy rate.
   (d) Add the remaining feature with the highest F-score to subset F and repeat the steps in (3) until all features are evaluated, or terminate early once a predefined maximum number of features is reached.
 (4) Find the optimal feature subset with the lowest validation error.
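The steps above can be sketched as follows. The F-score is computed as defined by Chen and Lin [27]. The SVM training and validation in steps (a) to (c) are replaced by a stand-in that replays the depth-0 CV accuracies reported in this section, so the sketch does not train any model. The function names and the `max_features` parameter are illustrative assumptions.

```python
def f_score(pos_values, neg_values):
    """F-score of one feature from its values in the two classes."""
    n_pos, n_neg = len(pos_values), len(neg_values)
    mean_pos = sum(pos_values) / n_pos
    mean_neg = sum(neg_values) / n_neg
    mean_all = (sum(pos_values) + sum(neg_values)) / (n_pos + n_neg)
    between = (mean_pos - mean_all) ** 2 + (mean_neg - mean_all) ** 2
    within = (sum((v - mean_pos) ** 2 for v in pos_values) / (n_pos - 1)
              + sum((v - mean_neg) ** 2 for v in neg_values) / (n_neg - 1))
    return between / within

# Depth-0 F-scores and CV accuracies (%) as reported in this section.
DEPTH0_F_SCORES = {
    "x_std": 0.2170, "x_vrs": 0.2155, "x_gm": 0.1248, "x_sl": 0.0734,
    "x_sa": 0.0802, "x_tp": 0.6605, "x_cbf": 1.5537, "x_mc": 0.0532,
    "x_drc": 0.3966, "x_si": 0.7687, "x_hrc": 0.0099,
}
DEPTH0_CV = [93.40, 93.80, 94.02, 93.98, 95.93, 95.96, 95.84, 95.84, 95.81, 95.78]

def rank_by_f_score(scores):
    """Step (2): feature names sorted by descending F-score."""
    return sorted(scores, key=scores.get, reverse=True)

def greedy_forward_selection(ranked, evaluate, max_features):
    """Steps (3)-(4): grow the subset in rank order and keep the best one."""
    best_subset, best_accuracy = [], float("-inf")
    subset = []
    for feature in ranked[:max_features]:
        subset.append(feature)
        accuracy = evaluate(subset)  # stands for: train on S_tr, validate on S_cv
        if accuracy > best_accuracy:
            best_subset, best_accuracy = list(subset), accuracy
    return best_subset, best_accuracy

def replay_reported_cv(subset):
    """Stand-in evaluator: look up the reported CV accuracy by subset size."""
    return DEPTH0_CV[len(subset) - 1]

RANKED = rank_by_f_score(DEPTH0_F_SCORES)
BEST_SUBSET, BEST_CV = greedy_forward_selection(RANKED, replay_reported_cv, max_features=10)
```

With the replayed numbers, the search stops improving after six features, which corresponds to the depth-0 subset X_6 with the highest reported CV accuracy. In a real run, `evaluate` would train and validate an SVM on each candidate subset.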
To set up a rich feature set, diverse features are introduced and evaluated. Considering as many features as possible and then optimizing the feature subset also helps to eliminate the dependency on video content. The features we consider as potential candidates are summarized as follows.

Prediction error-related features, such as SATD and CBF, denoted as x_{std}, x_{vrs}, and x_{cbf}. x_{std} is defined as the SATD between the prediction and the original pixel values, and x_{vrs} is the variance of the SATDs of the four sub-blocks. x_{cbf} is the coded block flag (CBF) of the inter 2N × 2N mode. The CBF indicates the complexity of the prediction error under a specific quantization parameter (QP). As discussed in [11–15], these features are correlated with CU partitioning.

CU depth information of the context [8], denoted as x_{sl}, x_{sa}, and x_{tp}. x_{sl} and x_{sa} are the CU depths of the left-neighboring and above-neighboring CUs, respectively. x_{tp} is the CU depth of the co-located CU. Since the video signal is substantially correlated in the spatial and temporal domains, this context is informative.

Gradient magnitude of the current CU [18], denoted as x_{gm}. It is the sum of the gradients of all pixels in the current CU obtained by applying the Sobel operator, and it reveals the flatness of the CU.

Motion consistency-related feature [13, 14], denoted as x_{mc}, which is defined as the variance of the MVs of the four sub-blocks in inter N × N mode. Regions with inconsistent motion activity are more likely to be encoded in small CUs.

RD cost difference between the skip and inter 2N × 2N modes, denoted as x_{drc}. If the skip mode is better than inter 2N × 2N, the CU is likely to be background and it may not be necessary to partition it into smaller CUs. Conversely, if the inter 2N × 2N mode is better, a smaller partition mode or a smaller CU size may be preferable.

Side information in the RD cost, denoted as x_{si}. Small motion partitions provide good RD performance for blocks with high motion activity or rich content. However, more bits must be spent to signal the side information. Therefore, the percentage of side information in the total RD cost of the inter 2N × 2N mode gives a good indication of the optimal CU size.

Hierarchical structure-related feature, denoted as x_{hrc}. In the hierarchical prediction structure of HEVC, a small CU size is preferred for frames with low temporal depth, and a large CU size is more likely to be optimal for frames with high temporal depth.
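Two of the candidate features above are simple enough to sketch directly. The Python below is a rough illustration that assumes the per-pixel gradient is |G_x| + |G_y| from 3 × 3 Sobel kernels evaluated on interior pixels only, and that the MV variance is the sum of the population variances of the horizontal and vertical components. The text does not specify either detail, so both choices and the function names are assumptions.

```python
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))

def gradient_magnitude_sum(block):
    """x_gm sketch: sum of |Gx| + |Gy| over the interior pixels of a 2-D block."""
    height, width = len(block), len(block[0])
    total = 0
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            gx = gy = 0
            for dr in range(3):
                for dc in range(3):
                    pixel = block[r + dr - 1][c + dc - 1]
                    gx += SOBEL_X[dr][dc] * pixel
                    gy += SOBEL_Y[dr][dc] * pixel
            total += abs(gx) + abs(gy)
    return total

def motion_consistency(motion_vectors):
    """x_mc sketch: variance of MV x-components plus variance of y-components."""
    count = len(motion_vectors)
    mean_x = sum(mv[0] for mv in motion_vectors) / count
    mean_y = sum(mv[1] for mv in motion_vectors) / count
    return sum((mv[0] - mean_x) ** 2 + (mv[1] - mean_y) ** 2
               for mv in motion_vectors) / count
```

A flat block yields a gradient sum of zero, and four identical MVs yield a motion consistency of zero, which matches the intuition that such CUs are unlikely to need splitting.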
Table 1 F-score of features at different CU depths

| Feature | Depth 0 | Depth 1 | Depth 2 |
|---|---|---|---|
| x_{std} | 0.2170 | 0.3988 | 0.2858 |
| x_{vrs} | 0.2155 | 0.3969 | 0.2846 |
| x_{gm} | 0.1248 | 0.2121 | 0.1680 |
| x_{sl} | 0.0734 | 0.1239 | 0.0496 |
| x_{sa} | 0.0802 | 0.1062 | 0.0463 |
| x_{tp} | 0.6605 | 0.4157 | 0.1139 |
| x_{cbf} | 1.5537 | 0.9419 | 0.2997 |
| x_{mc} | 0.0532 | 0.0967 | 0.0528 |
| x_{drc} | 0.3966 | 0.6852 | 0.0002 |
| x_{si} | 0.7687 | 0.9693 | 0.2424 |
| x_{hrc} | 0.0099 | 0.0112 | 0.0061 |
Table 2 CV of different feature subsets

| Subset | Depth 0 input features | Depth 0 CV (%) | Depth 1 input features | Depth 1 CV (%) | Depth 2 input features | Depth 2 CV (%) |
|---|---|---|---|---|---|---|
| X_{1} | [x_{cbf}] | 93.40 | [x_{si}] | 83.13 | [x_{cbf}] | 91.86 |
| X_{2} | [X_{1}, x_{si}] | 93.80 | [X_{1}, x_{cbf}] | 84.90 | [X_{1}, x_{std}] | 92.90 |
| X_{3} | [X_{2}, x_{tp}] | 94.02 | [X_{2}, x_{drc}] | 86.74 | [X_{2}, x_{vrs}] | 93.18 |
| X_{4} | [X_{3}, x_{drc}] | 93.98 | [X_{3}, x_{tp}] | 87.11 | [X_{3}, x_{si}] | 93.18 |
| X_{5} | [X_{4}, x_{std}] | 95.93 | [X_{4}, x_{std}] | 87.19 | [X_{4}, x_{gm}] | 93.22 |
| X_{6} | [X_{5}, x_{vrs}] | 95.96 | [X_{5}, x_{vrs}] | 87.40 | [X_{5}, x_{tp}] | 93.15 |
| X_{7} | [X_{6}, x_{gm}] | 95.84 | [X_{6}, x_{gm}] | 87.39 | [X_{6}, x_{mc}] | 93.17 |
| X_{8} | [X_{7}, x_{sa}] | 95.84 | [X_{7}, x_{sl}] | 87.35 | [X_{7}, x_{sl}] | 93.12 |
| X_{9} | [X_{8}, x_{sl}] | 95.81 | [X_{8}, x_{sa}] | 87.28 | [X_{8}, x_{sa}] | 93.15 |
| X_{10} | [X_{9}, x_{mc}] | 95.78 | [X_{9}, x_{mc}] | 87.28 | [X_{9}, x_{drc}] | 93.25 |
4. Experimental results
4.1. Experimental results on the proposed CU splitting early termination algorithm
To verify the efficiency of the proposed CU splitting early termination algorithm, we conduct comprehensive experiments that compare it with the HEVC reference software HM6.0. The encoding configuration follows the recommendations in [29], and the test sequences cover a variety of content. The sequences used to train the SVM prediction model are “Cactus”, “BQMall”, and “FourPeople”, denoted as TS1 (training set 1); they are excluded from the performance comparison. The offline training is carried out with the SVM training software [30], and the proposed CU early termination algorithm is incorporated into HM6.0.
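Table 3 Complexity and RD performance comparison in TS1 (average of 4 QP points)

| Class | Sequence | Random access ΔT (%) | Random access BDBR (%) | Low delay ΔT (%) | Low delay BDBR (%) |
|---|---|---|---|---|---|
| A | PeopleOnStreet | 32.98 | 1.7 | – | – |
| A | Traffic | 59.40 | 1.8 | – | – |
| B | BasketBallDrive | 52.63 | 1.5 | 52.53 | 1.9 |
| B | BQTerrace | 53.65 | 1.3 | 51.00 | 1.1 |
| B | Kimono | 52.63 | 1.3 | 43.43 | 1.4 |
| B | ParkScene | 56.73 | 1.7 | 34.40 | 1.9 |
| C | BasketBallDrill | 48.43 | 1.5 | 47.93 | 2.2 |
| C | PartyScene | 36.17 | 1.0 | 37.90 | 1.7 |
| C | RaceHorses | 33.55 | 1.4 | 36.15 | 1.4 |
| D | BasketBallPass | 35.65 | 1.6 | 34.43 | 1.6 |
| D | BlowingBubbles | 38.5 | 1.0 | 36.05 | 1.5 |
| D | BQSquare | 39.90 | 0.6 | 36.50 | 1.0 |
| D | RaceHorses | 30.05 | 1.2 | 28.80 | 1.4 |
| E | Johnny | – | – | 54.08 | 2.5 |
| E | KristenAndSara | – | – | 51.60 | 1.9 |
| Average | | 44.7 | 1.35 | 41.9 | 1.66 |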
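Table 4 Complexity and RD performance comparison in TS1 (data per QP; BDBR is reported once per sequence)

| Class | Sequence | QP | Random access ΔT (%) | Random access BDBR (%) | Low delay ΔT (%) | Low delay BDBR (%) |
|---|---|---|---|---|---|---|
| B | BQTerrace | 22 | 19.0 | 1.3 | 16.3 | 1.1 |
| B | BQTerrace | 27 | 54.5 | | 49.3 | |
| B | BQTerrace | 32 | 67.4 | | 65.8 | |
| B | BQTerrace | 37 | 73.7 | | 72.6 | |
| B | Kimono | 22 | 35.9 | 1.3 | 30.4 | 1.4 |
| B | Kimono | 27 | 55.5 | | 36.6 | |
| B | Kimono | 32 | 66.4 | | 46.0 | |
| B | Kimono | 37 | 68.7 | | 60.7 | |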
Regarding complexity, the proposed algorithm achieves a maximum running-time reduction of 73.7% with respect to HM6.0, with an average of 44.7%, under the “Random Access, main” configuration, as shown in Tables 3 and 4. In Table 3, the “ΔT” column is the average ΔT over the four QP points. Concerning RD performance, the BD-rate increases by 1.35% on average, with a worst case of 1.8% for the sequence “Traffic”. The RD loss is not significant. For the “Low Delay, main” configuration, also shown in Tables 3 and 4, the proposed algorithm behaves very similarly to the “Random Access, main” case: it reduces the complexity by 41.9% with a 1.66% BD-rate loss on average. Table 4 lists part of the experimental results under different QPs. As can be seen, more complexity reduction is achieved in the low bitrate scenario (i.e., with high QP values). In such cases, larger CUs are more efficient in RD performance than smaller CUs, and large CUs account for a high percentage of the coded blocks. The proposed algorithm accurately terminates the RDO procedure early at large CU sizes and avoids unnecessary RD calculations at small CU sizes. Therefore, greater complexity reduction is achieved in the low bitrate case than in the high bitrate case.
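The relation between the per-QP figures and the per-sequence averages can be checked with a short calculation. The sketch below assumes that ΔT is the encoding-time reduction relative to HM6.0, expressed as a percentage; the function names are illustrative. The per-QP values are the BQTerrace entries of Table 4.

```python
def time_saving_percent(reference_seconds, proposed_seconds):
    """Assumed definition of delta-T: time saved relative to the reference encoder."""
    return 100.0 * (reference_seconds - proposed_seconds) / reference_seconds

def average(values):
    """Arithmetic mean, used to average delta-T over the four QP points."""
    return sum(values) / len(values)

# Per-QP time savings (%) reported for BQTerrace at QP 22, 27, 32, 37.
BQTERRACE_RANDOM_ACCESS = [19.0, 54.5, 67.4, 73.7]
BQTERRACE_LOW_DELAY = [16.3, 49.3, 65.8, 72.6]

RANDOM_ACCESS_MEAN = average(BQTERRACE_RANDOM_ACCESS)
LOW_DELAY_MEAN = average(BQTERRACE_LOW_DELAY)
```

The two means agree with the BQTerrace row of Table 3 (53.65 and 51.00), and the spread from 19.0% at QP 22 to 73.7% at QP 37 shows how strongly the saving depends on bitrate.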
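Table 5 Complexity and RD performance comparison in TS2 (average of 4 QP points)

| Class | Sequence | Random access ΔT (%) | Random access BDBR (%) | Low delay ΔT (%) | Low delay BDBR (%) |
|---|---|---|---|---|---|
| A | PeopleOnStreet | 37.94 | 1.8 | – | – |
| A | Traffic | 62.58 | 1.8 | – | – |
| B | BasketBallDrive | 51.32 | 1.2 | 43.72 | 1.8 |
| B | BQTerrace | 52.16 | 0.6 | 44.03 | 0.7 |
| B | Cactus | 51.71 | 1.3 | 42.29 | 1.8 |
| B | Kimono | 56.75 | 1.0 | 50.14 | 1.8 |
| C | BQMall | 45.18 | 2.2 | 44.03 | 3.3 |
| C | PartyScene | 34.33 | 1.1 | 27.51 | 1.8 |
| C | RaceHorses | 33.32 | 1.6 | 39.82 | 1.7 |
| D | BasketBallPass | 39.06 | 1.5 | 36.80 | 1.7 |
| D | BlowingBubbles | 45.96 | 1.0 | 38.96 | 1.6 |
| D | BQSquare | 42.70 | 0.7 | 40.32 | 1.0 |
| D | RaceHorses | 29.74 | 1.2 | 29.86 | 1.4 |
| E | FourPeople | – | – | 44.82 | 2.6 |
| E | KristenAndSara | – | – | 47.97 | 1.8 |
| Average | | 44.83 | 1.29 | 40.00 | 1.77 |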
Both the weighted SVM training algorithm and the wrapper feature selection algorithm are designed to generalize well. First, the weighted SVM is based on the structural risk minimization (SRM) principle, as opposed to the empirical risk minimization principle employed by conventional learning algorithms. SRM minimizes an upper bound on the expected risk, which gives the SVM a strong ability to generalize. Introducing the RD difference as a weight suppresses the influence of outliers. In other words, training samples whose misclassification causes little RD performance degradation are “almost excluded” by their small weights, and more attention is paid to “important” samples. Second, a large number of relevant features are evaluated and assessed. The diversity of features reduces the risk of dependence on the training set. The feature selection algorithm chooses the optimal feature subset based on the CV error, so that the subset does not depend on a specific training set. Therefore, the algorithm performs stably.
4.2. Additional overhead of SVM classification
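Table 6 Computational complexity overheads of SVM prediction

| Sequence | QP | Encode time (s) | Depth 0 (s) | Depth 1 (s) | Depth 2 (s) | Total SVM (s) | Percentage (%) |
|---|---|---|---|---|---|---|---|
| BasketBallDrive | 22 | 26623.89 | 55.13 | 299.35 | 603.01 | 957.50 | 3.60 |
| BasketBallDrive | 27 | 17178.92 | 60.57 | 130.78 | 195.82 | 387.16 | 2.25 |
| BasketBallDrive | 32 | 12863.57 | 53.95 | 91.44 | 75.88 | 221.27 | 1.72 |
| BasketBallDrive | 37 | 10754.91 | 57.71 | 59.57 | 23.91 | 141.18 | 1.31 |
| BQTerrace | 22 | 38162.54 | 64.27 | 355.10 | 1260.28 | 1679.66 | 4.40 |
| BQTerrace | 27 | 16533.07 | 70.44 | 172.71 | 182.48 | 425.63 | 2.57 |
| BQTerrace | 32 | 10971.15 | 64.40 | 85.81 | 50.90 | 201.10 | 1.83 |
| BQTerrace | 37 | 8600.12 | 69.38 | 35.46 | 12.28 | 117.13 | 1.36 |
| Cactus | 22 | 23983.84 | 53.70 | 294.53 | 681.30 | 1029.52 | 4.29 |
| Cactus | 27 | 14079.50 | 59.71 | 123.27 | 163.39 | 346.38 | 2.46 |
| Cactus | 32 | 10797.25 | 54.29 | 87.95 | 69.85 | 212.10 | 1.96 |
| Cactus | 37 | 8967.79 | 58.32 | 55.40 | 20.64 | 134.36 | 1.50 |
| Kimono | 22 | 10746.46 | 26.03 | 141.21 | 284.47 | 451.71 | 4.20 |
| Kimono | 27 | 6743.09 | 27.98 | 62.31 | 90.87 | 181.15 | 2.69 |
| Kimono | 32 | 4778.21 | 25.67 | 43.66 | 13.07 | 82.40 | 1.72 |
| Kimono | 37 | 4217.04 | 27.51 | 29.98 | 3.29 | 60.78 | 1.44 |
| ParkScene | 22 | 9920.39 | 24.85 | 138.25 | 232.19 | 395.30 | 3.98 |
| ParkScene | 27 | 6248.77 | 27.02 | 62.18 | 68.37 | 157.57 | 2.52 |
| ParkScene | 32 | 4695.61 | 25.44 | 39.24 | 25.83 | 90.52 | 1.93 |
| ParkScene | 37 | 3809.69 | 27.52 | 21.49 | 7.26 | 56.27 | 1.48 |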
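As a quick reading aid for the overhead figures, the sketch below assumes that the “Percentage” column is the total SVM prediction time divided by the total encode time. The function name is illustrative, and the numbers are the BasketBallDrive entries at QP 22.

```python
def svm_overhead_percent(encode_seconds, svm_seconds_per_depth):
    """Return (total SVM time, its share of the encode time in percent)."""
    total_svm = sum(svm_seconds_per_depth)
    return total_svm, 100.0 * total_svm / encode_seconds

# BasketBallDrive at QP 22: encode time and SVM time at depths 0, 1, and 2.
TOTAL_SVM, SHARE = svm_overhead_percent(26623.89, [55.13, 299.35, 603.01])
```

Under this assumption the computed share rounds to the reported 3.60%, and the per-depth times sum to within rounding of the reported total. The share shrinks at higher QP mainly because fewer CUs reach depths 1 and 2, where most of the prediction time is spent.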
5. Conclusion
In this article, a CU splitting early termination algorithm is proposed. The CU splitting optimization in HEVC is formulated as a binary classification problem and is solved by support vector classification. To maintain the RD performance of the early termination algorithm, the RD loss due to misclassification is introduced as a weighting factor for the training samples in the offline training procedure, so that the training pays special attention to CUs that are prone to degrade RD performance when a suboptimal partition is used. Furthermore, diverse features are considered, such as the correlation between CUs in the spatial and temporal domains, prediction errors, motion activity, and the RD cost of modes. To select the optimal feature subset, a wrapper feature selection approach is used. It embeds the model training in the selection process and performs a simple greedy search based on F-score ranking. In this way, the proposed algorithm performs well and stably across different configurations and various video content. Since the CU splitting early termination model is trained offline and the optimal feature subset is small, the proposed algorithm is computationally simple. As demonstrated by the experimental results, the proposed algorithm achieves a 44.7% reduction in computational complexity with a 1.35% BD-rate increase in the “Random Access, main” configuration, and a 41.9% complexity reduction with a 1.66% BD-rate increase in the “Low Delay, main” configuration.
Declarations
Acknowledgements
This work is supported by the National Basic Research Program of China (973) under Grant No. 2009CB320903 and Specialized Research Fund for the Doctoral Program of Higher Education (SRFDP) No. 20120101110032.
Authors’ Affiliations
References
 [1] ITU-T SG16 Q6 and ISO/IEC JTC1/SC29/WG11: Joint Call for Proposals on Video Compression Technology. ITU-T SG16 Q6 document VCEG-AM91 and ISO/IEC JTC1/SC29/WG11 document N11113, Kyoto, Japan; 2010.
 [2] Bin L, Sullivan GJ, Jizheng X: Comparison of compression performance of HEVC working draft 5 with AVC high profile. ITU-T/ISO/IEC Joint Collaborative Team on Video Coding (JCT-VC) document JCTVC-H0360, in 8th Meeting of JCT-VC, San Jose, USA; 2012.
 [3] Bross B, Han WJ, Sullivan GJ, Ohm JR, Wiegand T: High efficiency video coding (HEVC) text specification draft 6. ITU-T/ISO/IEC Joint Collaborative Team on Video Coding (JCT-VC) document JCTVC-H1003, in 8th Meeting of JCT-VC, San Jose, USA; 2012.
 [4] Kim J, Kim M, Kim HY, Sato K, Shen X, Yu L, Choi K, Jang ES, Bross B, Han WJ, Jo JK, Park SN, Sim DG, Oh SJ: JCT-VC TE9: Report on large block structure testing. ITU-T/ISO/IEC Joint Collaborative Team on Video Coding (JCT-VC) document JCTVC-C067, in 3rd Meeting of JCT-VC, Guangzhou, China; 2010.
 [5] Qualcomm Inc: Video Coding Using Extended Block Sizes, ITU-T Q.6/SG16 document COM16-C123-E. VCEG 36th Meeting, Geneva, Switzerland; 2009.
 [6] Liang Z, Li Z, Siwei M, Debin Z: Fast mode decision algorithm for intra prediction in HEVC. 2011 IEEE Visual Communications and Image Processing (VCIP), Tainan; 2011:1–4.
 [7] Su-Wei T, Hsueh-Ming H, Yi-Fu C: Fast mode decision algorithm for residual quadtree coding in HEVC. 2011 IEEE Visual Communications and Image Processing (VCIP), Tainan; 2011:1–4.
 [8] Jie L, Lei S, Ikenaga T, Sakaida S: Content based hierarchical fast coding unit decision algorithm for HEVC. 1st edition. 2011 International Conference on Multimedia and Signal Processing (CMSP), Guilin, Guangxi; 2011:56–59.
 [9] Jongho K, Seyoon J, Sukhee C, Jin Soo C: Adaptive coding unit early termination algorithm for HEVC. 2012 IEEE International Conference on Consumer Electronics (ICCE), Las Vegas, NV; 2012:261–262.
 [10] Correa G, Assuncao P, Agostini L, da Silva Cruz LA: Complexity control of high efficiency video encoders for power-constrained devices. IEEE Trans Consum Electron 2011, 57(4):1866–1874. doi:10.1109/TCE.2011.6131165
 [11] Lee YM, Tsai YJ, Lin Y: Improved motion estimation using early zero-block detection. EURASIP J Image Video Process 2008, 2008: 524793. doi:10.1155/2008/524793
 [12] Byung-Gyu K: Novel inter-mode decision algorithm based on macroblock (MB) tracking for the P-slice in H.264/AVC video coding. IEEE Trans Circuits Syst Video Technol 2008, 18(2):273–279. doi:10.1109/TCSVT.2008.918121
 [13] Tien-Ying K, Chen-Hung C: Fast variable block size motion estimation for H.264 using likelihood and correlation of motion field. IEEE Trans Circuits Syst Video Technol 2006, 16(10):1185–1195. doi:10.1109/TCSVT.2006.883512
 [14] Zhi L, Liquan S, Zhaoyang Z: An efficient inter mode decision algorithm based on motion homogeneity for H.264/AVC. IEEE Trans Circuits Syst Video Technol 2009, 19(1):128–132. doi:10.1109/TCSVT.2008.2005804
 [15] Yu ACW, Martin GR, Heechan P: Fast inter-mode selection in the H.264/AVC standard using a hierarchical decision process. IEEE Trans Circuits Syst Video Technol 2008, 18(2):186–195. doi:10.1109/TCSVT.2007.913970
 [16] Huanqiang Z, Canhui C, Kai-Kuang M: Fast mode decision for H.264/AVC based on macroblock motion activity. IEEE Trans Circuits Syst Video Technol 2009, 19(4):491–499. doi:10.1109/TCSVT.2009.2014014
 [17] Tiesong Z, Hanli W, Kwong S, Kuo CCJ: Fast mode decision based on mode adaptation. IEEE Trans Circuits Syst Video Technol 2010, 20(5):697–705. doi:10.1109/TCSVT.2010.2045812
 [18] Changsung K, Kuo CCJ: Feature-based intra/inter coding mode selection for H.264/AVC. IEEE Trans Circuits Syst Video Technol 2007, 17(4):441–453. doi:10.1109/TCSVT.2006.888829
 [19] Martinez-Enriquez D, Jimenez-Moreno A, Diaz-de-Maria F: An adaptive algorithm for fast inter mode decision in the H.264/AVC video coding standard. IEEE Trans Consum Electron 2010, 56(2):826–834. doi:10.1109/TCE.2010.5506008
 [20] Jui-Chiu C, Wei-Chih C, Lien-Ming L, Kuo-Feng H, Wen-Nung L: A fast H.264/AVC-based stereo video encoding algorithm based on hierarchical two-stage neural classification. IEEE J Sel Topics Signal Process 2011, 5(2):309–320. doi:10.1109/JSTSP.2010.2066956
 [21] Chen-Kuo C, Wei-Hau P, Chiuan H, Shin-Shan Z, Shang-Hong L: Fast H.264 encoding based on statistical learning. IEEE Trans Circuits Syst Video Technol 2011, 21(9):1304–1315. doi:10.1109/TCSVT.2011.2147250
 [22] Jaeil K, Munchurl K, Sangjin H, Injoon C, Changsub P: Block-mode classification using SVMs for early termination of block mode decision in H.264|MPEG-4 part 10 AVC. Seventh International Conference on Advances in Pattern Recognition, ICAPR'09, Kolkata; 2009:83–86.
 [23] Corinna C, Vapnik V: Support-vector networks. Mach Learn 1995, 20(3):273–297.
 [24] Scholkopf B, Burges C, Smola A: Advances in Kernel Methods: Support Vector Learning. MIT Press, Cambridge, MA; 1999.
 [25] Hsu CW, Chang CC, Lin CJ: A practical guide to support vector classification, Tech. rep. Department of Computer Science, National Taiwan University; 2003. http://www.csie.ntu.edu.tw/cjlin/guide/guide.pdf
 [26] Isabelle G, André E: An introduction to variable and feature selection. J Mach Learn Res 2003, 3: 1157–1182.
 [27] Chen YW, Lin CJ: Combining SVMs with Various Feature Selection Strategies. Springer, New York; 2006.
 [28] HM Software. https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/tags/HM6.0
 [29] Bossen F: Common test conditions and software reference configurations, ITU-T/ISO/IEC Joint Collaborative Team on Video Coding (JCT-VC) document JCTVC-H1100. 8th Meeting of JCT-VC, San Jose, USA; 2012.
 [30] Chih-Chung C, Chih-Jen L: LIBSVM: a library for support vector machines. ACM Trans Intell Syst Technol 2011, 2(27):1–27.
 [31] Bjontegaard G: Improvements of the BD-PSNR model, ITU-T SG16/Q6 document VCEG-AI11. 35th VCEG Meeting, Berlin, Germany; 2008.
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.