Digital video stabilizer by adaptive fuzzy filtering
- Mohammad J Tanakian^{1},
- Mehdi Rezaei^{1} (corresponding author) and
- Farahnaz Mohanna^{1}
https://doi.org/10.1186/1687-5281-2012-21
© Tanakian et al.; licensee Springer. 2012
Received: 8 November 2011
Accepted: 6 November 2012
Published: 7 December 2012
Abstract
Digital video stabilization (DVS) allows video sequences to be acquired without disturbing jerkiness by removing unwanted camera movements. A good DVS should remove the unwanted camera movements while maintaining the intentional ones. In this article, we propose a novel DVS algorithm that compensates for camera jitter by applying an adaptive fuzzy filter to the global motion of video frames. The adaptive fuzzy filter is a simple infinite impulse response filter that a fuzzy system tunes adaptively to the camera motion characteristics. The fuzzy system itself is also tuned during operation according to the amount of camera jitter. The fuzzy system uses two inputs, which are quantitative representations of the unwanted and the intentional camera movements. The global motion of video frames is estimated from the block motion vectors produced by the video encoder during motion estimation. Experimental results indicate good performance for the proposed algorithm.
Keywords
Adaptive; Digital video stabilizer; Motion estimation; Fuzzy filter; Motion vector; Video coding; Video stabilization
1. Introduction
Digital video stabilization (DVS) techniques have been studied for decades to improve the visual quality of image sequences captured by compact and lightweight digital video cameras. When such cameras are hand held or mounted on unstable platforms, the captured video generally looks shaky because of undesired camera motions. Unwanted video vibrations degrade the viewing experience and also greatly affect the performance of applications such as video encoding [1–4] and video surveillance [5, 6]. With recent advances in wireless technology, video stabilization systems are also considered for integration into wireless video communication equipment to stabilize acquired sequences before transmission, not only to improve visual quality but also to increase the compression performance [1]. Solutions to the stabilization problem involve either hardware or software to compensate for the unwanted camera motion. Hardware-based stabilizers are generally expensive and lack the compactness that is crucial for today’s consumer electronic devices [7, 8]. In contrast, a DVS system implemented in software can easily be miniaturized and updated. Consequently, a DVS system is suitable for portable digital devices, such as digital cameras and mobile phones.
In general, a DVS system consists of two principal units: a motion estimation (ME) unit and a motion correction (MC) unit. The ME unit estimates a global motion vector (GMV) between every two consecutive frames of the video sequence. Using the GMVs, the MC unit then generates the smoothing motion vectors (SMVs) needed to compensate for the frame jitter and warps the frames to create a more visually stable image sequence.
According to the motion models being considered, the global ME techniques already proposed for DVS systems can roughly be divided into two categories: (1) two-dimensional stabilization techniques, which deal with translational jitter only [9–20], and (2) multi-dimensional stabilization techniques, which aim at stabilizing more complicated fluctuations in addition to translation [21–25]. Most of the existing algorithms fall into the first category because translation is the most commonly encountered motion and the complexity of estimating translation parameters is relatively low for real-time stabilization.
Regarding the ME task of DVS systems, most previous approaches attempt to reduce the computational cost by using fast ME algorithms, e.g., gray-coded bit-plane matching [9], two-bit transform [10], multiplication-free one-bit transform [11], Laplacian two-bit transform [12], and binary image matching of color weight [13]. In another approach, the global ME is limited to small, pre-defined regions [16, 17]. Such approaches consider DVS and video encoding separately and trade the accuracy of motion vectors (MVs) for computational efficiency, which degrades the accuracy of the ME task and, consequently, of the MC task.
In video frames with smooth or complex texture regions, the estimated block motion vectors (BMVs) may not coincide with the real motion of the blocks. Although such local motion vectors (LMVs) are applicable to the local motion compensation executed in the encoder, they cannot be used for the global motion compensation executed by the DVS, because they contain noise that degrades the global ME task. Algorithms for removing the noisy LMVs in these regions are proposed in [27–30]. The valid BMVs are then used as LMVs for the global ME and MC in the next steps.
After global ME, the next essential task of a DVS system is MC, in which the unwanted camera jitter is separated and removed from the intentional camera movement. Among the various MC algorithms proposed in the literature, smoothing of the GMV by low-pass filtering is the most popular. For instance, an MV integration method is used in [9, 31], which utilizes a first-order infinite impulse response (IIR) low-pass filter to integrate differential motion and to smooth the global movement trajectory. A frame position smoothing (FPS) algorithm, which smooths absolute frame positions and achieves successful stabilization while retaining smooth camera movements, is utilized for MC in [17, 32–39]. Off-line discrete Fourier transform (DFT) domain filtering is proposed for FPS-based stabilization in [32]. Kalman filters and fuzzy systems have been widely used in DVS applications [33–39]. A real-time FPS-based stabilizer using Kalman filtering of absolute frame positions has been proposed in [17, 33]. It is shown in [34] that the stabilization performance can be improved by a fuzzy adaptive Kalman filter, which yields a stabilization system that is adjusted according to the camera motion characteristics. Fuzzy stabilization systems improve the stabilization performance when their membership functions (MFs) are optimized to the motion dynamics [35]. A membership selective fuzzy stabilization, in which the stabilization system selects from a pre-determined set of MFs according to instantaneous motion characteristics, is proposed in [36]. An MF adaptive fuzzy filter for video stabilization is presented in [37], and a fuzzy Kalman system consisting of a fuzzy system with a Kalman filter is presented in [38].
Regarding the MC task of a DVS system, almost all published algorithms try to smooth the global movement trajectory by some kind of low-pass filtering. An important drawback of low-pass filtering is that the smoothed movement trajectory is delayed with respect to the desired camera displacements. Stricter filtering provides more stabilization at the expense of more trajectory delay, and vice versa. More trajectory delay means losing more image content after stabilization.
A good MC unit should remove the unwanted camera motion while tracking the intentional motion without any delay. For this purpose, it should discriminate between the unwanted and intentional camera motions and adjust the smoothing filter adaptively according to the amount of each. The published MC algorithms that we studied lack some of these features. For example, the algorithms presented in [27, 37] do not discriminate between unwanted and intentional camera motions. Moreover, the adaptive algorithm proposed in [27] lacks continuous and good adaptation: it uses an adaptive filter whose smoothing factor is switched between only two values, which leads to undesirable jumps in frame position. The algorithm proposed in [38] shows a high performance but still lacks good adaptation.
In this article, we propose a DVS algorithm with new features in the ME and MC units. The ME unit estimates a GMV based on the BMVs that are estimated by the video encoder. Therefore, accurate motion information is used without extra computational cost. Moreover, an adaptive thresholding algorithm is used to remove the noisy invalid LMVs. The MC unit of the proposed DVS system applies a fuzzy adaptive IIR filter to smooth the camera movement trajectory adaptively to the characteristics of the unwanted and intentional camera motions. The fuzzy system adjusts the IIR filter by using two novel inputs, which are quantitative representations of the unwanted and the intentional camera motions. Experimental results show good performance for the proposed DVS algorithm.
The remainder of this article is organized as follows. The details of the proposed video stabilization algorithm are described in Section 2. Some experimental results are presented in Section 3, and the article is concluded in Section 4.
2. The proposed method
2.1. Block-based ME
The block-based ME is used to generate the LMVs. Since the ME is done by the video encoder, the computational complexity of the DVS is very low. In this article, to test the proposed DVS system independently of the encoder, a full-search ME algorithm with full-pixel resolution is applied to 8 × 8 blocks over a search range of 33 × 33 pixels to obtain the BMVs.
Here p defines the motion search range; a 33 × 33 pixel search window corresponds to p = 16.
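The full-search matching described above can be sketched in plain Python. This is only an illustration under stated assumptions: the mean absolute difference (MAD) is used as the matching cost because later sections refer to MAD values computed during ME, the function names `mad` and `full_search` are hypothetical, frames are lists of rows of gray values, and candidates that would read outside the reference frame are simply skipped.

```python
def mad(cur, ref, bx, by, dx, dy, n):
    """Mean absolute difference between the n x n block of `cur` whose
    top-left corner is (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total / (n * n)


def full_search(cur, ref, bx, by, n=8, p=16):
    """Exhaustively test every integer displacement in [-p, p] x [-p, p].

    Returns (best_mv, mad_min, mad_avg) for one block. Candidates that
    would fall outside `ref` are skipped (an assumption; an encoder
    might pad the reference frame instead).
    """
    h, w = len(ref), len(ref[0])
    best_mv, mad_min, mads = (0, 0), float("inf"), []
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            inside = (0 <= bx + dx and bx + dx + n <= w and
                      0 <= by + dy and by + dy + n <= h)
            if not inside:
                continue
            m = mad(cur, ref, bx, by, dx, dy, n)
            mads.append(m)
            if m < mad_min:
                best_mv, mad_min = (dx, dy), m
    return best_mv, mad_min, sum(mads) / len(mads)
```

The defaults `n=8` and `p=16` mirror the 8 × 8 blocks and 33 × 33 search window mentioned in the text. Returning the minimum and average MAD alongside the vector is a convenience for the validation tests of Section 2.2.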
2.2. LMV validation
The ME unit plays an important role in a DVS system, and its estimation accuracy is a decisive factor for the overall performance of the stabilization system. The block ME process typically computes some wrong MVs that do not coincide with the real motion direction of the blocks. Although such MVs can be useful for motion compensation in the encoder, they contain noise and should not be used for global motion compensation and video stabilization. The noisy MVs are mostly obtained from two types of regions: very smooth regions that lack features and very complex, uneven regions [27–30]. Inspired by the algorithm presented in [27], two qualifying tests, namely the “Smoothness Test” and the “Complexity Test”, are used to detect and remove the noisy MVs by an adaptive thresholding method as follows.
2.2.1. Smoothness test
where MAD_{min}^{n} and MAD_{avg}^{n} denote the minimum and the average values, respectively, of the MADs computed within the search area during ME of the n-th block. T_{1} is an experimentally defined constant coefficient of about 0.45, and Mean(MAD_{avg}^{n}) denotes the average of MAD_{avg}^{n} over all blocks of the frame. In effect, the threshold th_{1} consists of a global average value over the frame plus a margin.
2.2.2. Complexity test
where T_{2} is an experimentally defined constant coefficient of about 0.45, and Max(MAD_{min}^{n}) denotes the maximum value of MAD_{min}^{n} over all blocks of the frame. According to the equations above, MAD_{min}^{n} is compared against a portion of its global maximum over a frame.
Note that the MAD is computed by the encoder during ME. Therefore, the smoothness test and the complexity test add no computational cost to the proposed DVS system.
A similar thresholding approach is presented in [27–30], in which fixed values are used for the thresholds th_{1} and th_{2}. Our simulation results on different video contents show that fixed thresholds may leave a considerable number of invalid noisy LMVs in place or remove a notable number of valid LMVs. To solve this problem, the values of the thresholds th_{1} and th_{2} are adjusted adaptively for each frame based on the video content. Note that if ME is executed at the encoder by a fast search algorithm rather than a full-search algorithm, the MADs calculated during that ME are used for adaptation of the thresholds th_{1} and th_{2}.
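The two tests can be illustrated with a short Python sketch. The exact inequalities are not reproduced in the text above, so the forms used here are assumptions chosen only to match the verbal description: a block is treated as smooth when its error surface is flat, i.e. MAD_{avg}^{n} − MAD_{min}^{n} falls below th_{1} = T_{1} · Mean(MAD_{avg}^{n}), and as complex when MAD_{min}^{n} exceeds th_{2} = T_{2} · Max(MAD_{min}^{n}). The function name `validate_lmvs` is hypothetical.

```python
def validate_lmvs(mad_min, mad_avg, t1=0.45, t2=0.45):
    """Return one boolean per block, True where the LMV is kept.

    mad_min[n], mad_avg[n]: minimum and average MAD of block n.
    Both thresholds are recomputed per frame, so they adapt to content.
    """
    th1 = t1 * (sum(mad_avg) / len(mad_avg))  # assumed smoothness threshold
    th2 = t2 * max(mad_min)                   # a portion of the frame maximum
    valid = []
    for lo, avg in zip(mad_min, mad_avg):
        is_smooth = (avg - lo) < th1   # assumed form: flat error surface
        is_complex = lo > th2          # even the best match is poor
        valid.append(not is_smooth and not is_complex)
    return valid
```

For example, with `mad_min = [1, 2, 40, 3]` and `mad_avg = [30, 3, 60, 25]`, the second block is rejected as smooth and the third as complex, so the result is `[True, False, False, True]`.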
2.3. Global ME
The global ME unit produces a unique GMV for each video frame, which represents the camera movement during the time interval of two frames. Since the LMVs obtained from the image background tend to be very similar in both magnitude and direction, we used a clustering process to classify the motion field into clusters corresponding to the background and foreground objects. The global motion induced by camera movement is determined by a clustering process that consists of the following steps.
Step 1. Construct the histogram H of the valid LMVs. The value of H(x, y) is incremented by one each time the LMV(x, y) is encountered.
Step 2. As long as the scene is not dominated by moving objects, the cluster corresponding to background blocks has the maximum votes in the clustering process. The position (x, y) of the largest cluster or histogram bin is considered as the GMV.
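The two steps above amount to taking the mode of the valid LMVs. A minimal Python sketch follows; the function name `global_motion_vector`, the optional validity mask, and the fallback to (0, 0) when no LMV survives validation are assumptions made for illustration.

```python
from collections import Counter


def global_motion_vector(lmvs, valid=None):
    """Pick the most frequent valid LMV as the GMV.

    lmvs:  list of (x, y) integer motion vectors, one per block.
    valid: optional list of booleans from the LMV validation stage.
    """
    if valid is None:
        valid = [True] * len(lmvs)
    # Step 1: histogram H(x, y) of the valid LMVs.
    hist = Counter(mv for mv, ok in zip(lmvs, valid) if ok)
    if not hist:
        return (0, 0)  # assumed fallback when every LMV was rejected
    # Step 2: the bin with the most votes is taken as the GMV.
    return hist.most_common(1)[0][0]
```

As the text notes, this choice is only reliable while background blocks outnumber the blocks of any single moving object.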
2.4. Unwanted ME and correction
where the index n indicates the frame number. The parameter α (0 ≤ α ≤ 1) can be regarded as the smoothing factor of the filter, which the fuzzy system adjusts for each frame. A larger smoothing factor yields a smoother image sequence but a larger lag during intentional camera motion, which makes the stabilized sequence look artificial. Therefore, a fixed value of α hardly leads to well-stabilized image sequences. To avoid the lag during intentional movement and to smooth the unwanted camera motion efficiently, the following fuzzy adaptation mechanism for α is proposed.
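The trade-off between smoothness and lag can be seen with a small Python sketch of a first-order IIR low-pass filter whose smoothing factor changes per frame. The recursion y(n) = α(n) · y(n − 1) + (1 − α(n)) · x(n), the initialization with the first sample, and the name `smooth_motion` are assumptions for illustration; the text above does not reproduce the exact filter equation.

```python
def smooth_motion(x, alpha):
    """First-order IIR low-pass filter with a per-frame smoothing factor.

    x:     one motion component per frame (e.g. horizontal motion).
    alpha: one smoothing factor in [0, 1] per frame.
    """
    out = []
    prev = x[0]  # assumed initial state: the first input sample
    for sample, a in zip(x, alpha):
        prev = a * prev + (1.0 - a) * sample
        out.append(prev)
    return out
```

For a step input `[0, 10, 10, 10]` with a fixed α of 0.5, the output is `[0.0, 5.0, 7.5, 8.75]`: the filter needs several frames to catch up with an intentional move. With α = 0 the output equals the input, so there is no lag but also no smoothing.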
2.4.1. Fuzzy adaptation of smoothing filter
where x_{1} and x_{2} denote the inputs of the fuzzy system used for the adaptive filtering of the horizontal motion component, and y_{1} and y_{2} are the inputs of the fuzzy system used for the adaptive filtering of the vertical motion component. GMV_{x}(n) and GMV_{y}(n) indicate the horizontal and vertical components of the GMV of the most recent frame, and M + 1 is the number of most recent GMVs used for the decision. The fuzzy system inputs, Input1 (x_{1}, y_{1}) and Input2 (x_{2}, y_{2}), are used as quantitative representations of the unwanted and intentional camera movements, respectively. The value of Input1 is proportional to the noise amplitude, and the value of Input2 is proportional to the intentional camera motion when that motion is accelerating.
Sample scenarios for combination of unwanted and intentional camera motion
Graph | Noise | Velocity | Acceleration |
---|---|---|---|
a | High | High | High |
b | Low | High | High |
c | High | Zero | Zero |
d | Low | Low | Low |
e | Zero | High | Zero |
f | High | High | Zero |
From the adaptive filtering point of view, it is important to measure the amount of noise and the velocity and acceleration of the intentional camera movement. A stricter smoothing filter is needed to remove the noise when the noise amplitude is high. On the other hand, a strict smoothing filter prevents the output from following the camera path when the camera has a high intentional acceleration. Therefore, the smoothing factor of the filter should be tuned carefully according to the amount of noise and the camera movement acceleration. Accordingly, we defined the fuzzy inputs so that Input1 gives information about the amount of noise and Input2 gives information about the amount of camera movement acceleration. Note that the camera movement velocity itself does not impose any constraint on the filtering, so it is neither measured nor used here.
Central values of fuzzy system output
Input2 \ Input1 | L | ML | M | MH | H | VH |
---|---|---|---|---|---|---|
L | 0.85 | 0.87 | 0.9 | 0.94 | 0.97 | 0.97 |
ML | 0.8 | 0.85 | 0.87 | 0.9 | 0.94 | 0.97 |
M | 0.7 | 0.8 | 0.85 | 0.87 | 0.9 | 0.97 |
MH | 0.6 | 0.7 | 0.8 | 0.85 | 0.87 | 0.97 |
H | 0.5 | 0.6 | 0.7 | 0.8 | 0.85 | 0.94 |
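How such a rule table can produce a smoothing factor is sketched below in Python. Only the central values come from the table; everything else is an assumption made for illustration: both inputs are taken as already normalized to [0, 1], the membership functions are evenly spaced triangles, rule strengths are products of memberships, and the output is the weighted average of the central values. The names `CENTRAL`, `tri_memberships`, and `smoothing_factor` are hypothetical.

```python
# Central output values: rows are Input2 levels (L, ML, M, MH, H),
# columns are Input1 levels (L, ML, M, MH, H, VH).
CENTRAL = [
    [0.85, 0.87, 0.90, 0.94, 0.97, 0.97],
    [0.80, 0.85, 0.87, 0.90, 0.94, 0.97],
    [0.70, 0.80, 0.85, 0.87, 0.90, 0.97],
    [0.60, 0.70, 0.80, 0.85, 0.87, 0.97],
    [0.50, 0.60, 0.70, 0.80, 0.85, 0.94],
]


def tri_memberships(u, k):
    """Degrees of u in k evenly spaced triangular sets on [0, 1] (assumed MFs)."""
    u = min(max(u, 0.0), 1.0)
    step = 1.0 / (k - 1)
    return [max(0.0, 1.0 - abs(u - i * step) / step) for i in range(k)]


def smoothing_factor(input1, input2):
    """Weighted average of the central values, weighted by rule strength."""
    m1 = tri_memberships(input1, 6)  # noise measure
    m2 = tri_memberships(input2, 5)  # acceleration measure
    num = den = 0.0
    for r, w2 in enumerate(m2):
        for c, w1 in enumerate(m1):
            w = w1 * w2
            num += w * CENTRAL[r][c]
            den += w
    return num / den
```

Under these assumptions, low noise with high acceleration gives an output close to 0.5 (the filter follows the camera), while very high noise with low acceleration gives an output close to 0.97 (strict smoothing), which matches the behavior described in the text.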
2.4.2. Adaptive fuzzy MFs
where Input1 and Input2 are clipped to a range from 1 to 10% of the video frame height in pixels, and K corresponds to the number of frames received in the last few seconds, e.g., 2 s. This means that the system adapts to time-varying noise conditions while taking the frame size and frame rate into account.
2.4.3. MC
where m is the frame number of the last scene cut frame.
3. Experimental results
The performance of the proposed DVS method is evaluated against 15 video sequences covering different types of scenes.
4. Conclusion
In this article, we proposed a computationally efficient DVS algorithm that uses motion information obtained from a hybrid block-based video encoder. Since some of the obtained MVs are not valid, an adaptive thresholding method was developed to filter out the invalid MVs and to compute an accurate GMV for each frame. The GMVs are smoothed with an IIR low-pass filter that is tuned adaptively to the unwanted and intentional camera movements. The filter is adjusted by a fuzzy system with two inputs that quantify the unwanted and intentional camera movements. The proposed method fulfills two apparently conflicting requirements: close follow-up of the intentional camera movement and removal of the unwanted camera motion. To improve the stabilization performance, the input MFs of the fuzzy system are continuously adapted according to the motion properties of a number of recently received video frames. Simulation results show a high performance for the proposed algorithm. With its low computational complexity, the proposed scheme can be used effectively for mobile video communications as well as for conventional video coding applications to improve the visual quality of digital video and to provide a higher compression performance.
References
- Engelsberg A, Schmidt G: A comparative review of digital image stabilizing algorithms for mobile video communications. IEEE Trans. Consum. Electron. 1999, 45(3):592-597.
- Peng YC, Liang CK, Chang HA, Chen HH, Kao CJ: Integration of image stabilizer with video codec for digital video cameras. In Proceedings of the International Symposium on Circuits and Systems. 5th edition. Kobe, Japan; 2005:4781-4784.
- Liang CK, Peng YC, Chang HA, Su CC, Chen H: The effect of digital image stabilization on coding performance. IEEE Proceedings of the 2004 International Symposium on Intelligent Multimedia, Video and Speech Processing 2004, 402-405.
- Chen HH, Liang CK, Peng YC, Chang HA: Integration of image stabilizer with video codec for digital video cameras. IEEE Trans. Video Technol. 2007, 17: 801-813.
- Marcenaro L, Vernazza G, Regazzoni CS: Image stabilization algorithms for video surveillance applications. In Proceedings of the International Conference on Image Processing. 1st edition. Thessaloniki; 2001:349-352.
- Zhou J, He H, Wan D: Video stabilization and completion using two cameras. IEEE Trans. Circuit Syst. Video Technol. 2011, 99: 1.
- Oshima M, Hayashi T, Fujioka S, Inaji T, Mitani H, Kajino J, Ikeda K, Komoda K: VHS camcorder with electronic image stabilizer. IEEE Trans. Consum. Electron. 1989, 35(4):749-758. 10.1109/30.106892
- Sato K, Ishizuka S, Nikami A, Sato M: Control techniques for optical image stabilizing system. IEEE Trans. Consum. Electron. 1993, 39(3):461-466. 10.1109/30.234621
- Ko SJ, Lee SH, Jeon SW, Kang ES: Fast digital image stabilizer based on gray-coded bit-plane matching. IEEE Trans. Consum. Electron. 1999, 45(3):598-603. 10.1109/30.793546
- Ertürk A, Ertürk S: Two-bit transform for binary block motion estimation. IEEE Trans. Circuit Syst. Video Technol. 2005, 15(7):938-946.
- Ertürk S: Multiplication-free one-bit transform for low complexity block-based motion estimation. IEEE Signal Process. Lett. 2007, 14(2):109-112.
- Kim NJ, Lee HJ, Lee JB: Probabilistic global motion estimation based on Laplacian two-bit plane matching for fast digital image stabilization. EURASIP J. Adv. Signal Process. 2008, 43: 10.1155/2008/180582
- Nan W, Xiaowei H, Gang W, Zhonghu Y: An approach of electronic stabilization based on binary image matching of color weight. In 2nd International Asia Conference on Informatics in Control, Automation and Robotics (CAR). 2nd edition. Wuhan; 2010:155-158.
- Battiato S, Puglisi G: Fast block based local motion estimation for video stabilization. In IEEE Computer Society Conference on Pattern Recognition Workshops (CVPRW). Colorado Springs, CO; 2011:50-57.
- Paik JK, Park YC, Kim DW: An adaptive motion decision system for digital image stabilizer based on edge pattern matching. IEEE Trans. Consum. Electron. 1992, 38(3):607-616. 10.1109/30.156744
- Yeni AA, Ertürk S: Fast digital image stabilization using one bit transform based sub-image motion estimation. IEEE Trans. Consum. Electron. 2005, 51(3):917-921. 10.1109/TCE.2005.1510503
- Ertürk S: Digital image stabilization with sub-image phase correlation based global motion estimation. IEEE Trans. Consum. Electron. 2003, 49(4):1320-1325. 10.1109/TCE.2003.1261235
- Ertürk S: Image sequence stabilization: motion vector integration (MVI) versus frame position smoothing (FPS). In Proceedings of the 2nd IEEE R8-EURASIP Symposium on Image and Signal Processing and Analysis, ISPA'01. Croatia, Pula; 2001:266-271.
- Xiang ZY, Jian W, Gong ZW, Quan Z, Rui D: An improved algorithm of electronic image stability based on block matching. In IEEE 5th Conference on Industrial Electronics and Applications (ICIEA). Taichung; 2010:1924-1927.
- Zhu J, Guo B: Fast layered bit-plane matching for electronic video stabilization. In International Conference on Multimedia and Signal Processing (CMSP). 1st edition. Guilin, Guangxi; 2011:276-280.
- Chang JY, Hu WF, Cheng MH, Shang BS: Digital image translation and rotation motion stabilization using optical flow technique. IEEE Trans. Consum. Electron. 2002, 48(1):108-115. 10.1109/TCE.2002.1010098
- Erturk S: Translation, rotation and scale stabilization of image sequences. IEE Electron. Lett. 2003, 39(17):1245-1246. 10.1049/el:20030816
- Tsoligkas NA, Xalkiadis S, Donglai X, French I: A guide to digital image stabilization procedure: an overview. In 18th International Conference on Systems Signals and Image Processing (IWSSIP). Sarajevo; 2011:1-4.
- Wang JM, Chou HP, Chen SW, Fuh CS: Video stabilization for a hand-held camera based on 3D Motion model. In 16th IEEE International Conference on Image Processing (ICIP). Cairo; 2009:3477-3480.
- Nestares O, Gat Y, Haussecker H, Kozinsev I: Video stabilization to a global 3D frame of reference by fusing orientation sensor and image alignment data. In 9th IEEE International Symposium on Mixed and Augmented Reality (ISMAR). Seoul; 2010:257-258.
- Mohammadi M, Fathi M, Soryani M: A new decoder side video stabilization using particle filter. In 18th International Conference on Systems Signals and Image Processing (IWSSIP). Sarajevo; 2011:1-4.
- Yang SH, Jheng FM: An adaptive image stabilization technique. In Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics (SMC2006). 3rd edition. Taipei; 2006:1968-1973.
- Vella F, Castorina A, Mancuso M, Messina G: Digital image stabilization by adaptive block motion vectors filtering. IEEE Trans. Consum. Electron. 2002, 48: 796-801. 10.1109/TCE.2002.1037077
- Battiato S, Puglisi G, Bruna AR: A robust video stabilization system by adaptive motion vectors filtering. In IEEE International Conference on Multimedia and Expo. Hannover; 2008:373-376.
- Hsiao JP, Hsu CC, Shih TC, Hsu PL, Yeh SS, Wang BC: The real-time video stabilization for the rescue robot. 2009, 4364-4369.
- Uomori K, Morimura A, Ishii H: Electronic image stabilization system for video cameras and VCRs. J. Soc. Motion Picture Television Eng. 1992, 101: 66-75.
- Ertürk S, Dennis TJ: Image sequence stabilization based on DFT filtering. IEE Proc. Image Vis. Signal Process. 2000, 127: 95-102.
- Ertürk S: Real-time digital image stabilization using Kalman filters. Real-Time Imaging 2002, 8: 317-328. 10.1006/rtim.2001.0278
- Güllü MK, Yaman E, Ertürk S: Image sequence stabilization using fuzzy adaptive Kalman filtering. Electron. Lett. 2003, 39(5):429-431. 10.1049/el:20030323
- Güllü MK, Ertürk S: Fuzzy image sequence stabilization. Electron. Lett. 2003, 39(16):1170-1172. 10.1049/el:20030781
- Güllü MK, Ertürk S: Image sequence stabilization using membership selective fuzzy filtering. Lect. Notes Comput. Sci. (LNCS) 2003, 2869: 497-504. 10.1007/978-3-540-39737-3_62
- Güllü MK, Ertürk S: Membership function adaptive fuzzy filter for image sequence stabilization. IEEE Trans. Consum. Electron. 2004, 50(1):1-7. 10.1109/TCE.2004.1277834
- Kyriakoulis N, Gasteratos A: A Recursive Fuzzy System for Efficient Digital Image Stabilization (Hindawi Advances in Fuzzy Systems). 10.1155/2008/920615
- Pinto B, Anurenjan PR: Video stabilization using speeded up robust features. In International Conference on Communications and Signal Processing (ICCSP). Kerala; 2011:527-531.
- Road http://www.jnack.com/adobe/photoshop/videostabilization/
- Shaky Car matlabroot\toolbox\vipblks\vipdemos\shaky_car.avi
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.