  • Research Article
  • Open access

A Robust Subpixel Motion Estimation Algorithm Using HOS in the Parametric Domain

Abstract

Motion estimation techniques are widely used in today's video processing systems. The most frequently used techniques are the optical flow method and the phase correlation method. The vast majority of these algorithms assume noise-free data. Thus, when the image sequences are severely corrupted by additive Gaussian (or perhaps non-Gaussian) noise of unknown covariance, the classical techniques fail because they also estimate the spatial correlation of the noise. In this paper, we study this topic from a different viewpoint to explore the fundamental limits in image motion estimation. Our scheme is a subpixel motion estimation algorithm that uses the bispectrum in the parametric domain. The motion vector of a moving object is estimated by solving linear equations involving the third-order hologram and a matrix containing the Dirac delta function. Simulation results are presented and compared to those of the optical flow and phase correlation algorithms; the proposed approach provides more reliable displacement estimates, particularly for complex noisy image sequences. In our simulations, we used a database freely available on the web.

1. Introduction

The importance of image sequence processing is constantly growing with the ever increasing use of television and video systems in consumer, commercial, medical, and scientific applications. Image sequences can be acquired by film-based motion picture cameras or electronic video cameras. In either case, there are several factors related to imaging sensor limitations that contribute to the graininess (noise) of resulting images. Electronic sensor noise and film grain are among these factors [1]. In many cases, graininess may result in visually disturbing degradation of the image quality, or it may mask important image information. Even if the noise may not be perceived at full-speed video due to the temporal masking effect of the eye, it often leads to unacceptable single-frame hardcopies and to poor-quality freeze-frames that adversely affect the performance of subsequent image analysis [2].

The motion estimation process must be able to track objects within a noisy source. In a noisy source, objects appear to change from frame to frame because of the noise, not necessarily as the result of object motion [3]. Tracking objects within a noisy environment is difficult, especially if the image frames are severely corrupted by additive Gaussian noise of unknown covariance, in which case methods based on second-order statistics do not work well.

Higher-order statistics (HOS) in general, and the bispectrum (order 3) in particular, have recently been widely used as an important tool for signal processing. The classical methods based on the power spectrum are now being effectively superseded by bispectral ones because of some definite disadvantages of the former. These include the inability to identify systems fed by non-Gaussian noise (NGN) inputs, to identify nonminimum phase (NMP) systems, and to identify system nonlinearity [4]. In these cases, the autocorrelation-based methods offer no answer. Of all these, the identifiability of NMP systems has received the most attention from researchers.

HOS-based methods have been proposed to estimate motion between image frames [5–9]. In [5], the motion estimation is based on the bispectrum method for subpixel resolution of noisy image sequences. In [7], the displacement vector is obtained by maximizing a third-order statistics criterion. In [8], the global motion parameters are obtained by a new region-recursive algorithm. In [6], several algorithms are developed based on a parametric cumulant method, a cumulant-matching method, and a mean kurtosis error criterion. The latter is an extension of the quadratic pixel-recursive method by Netravali and Robbins [10]. In [11], it is shown that such statistical parameters are insensitive to additive Gaussian noise. In particular, bispectrum parameters are insensitive to any symmetrically distributed noise and also exhibit the capability of better characterizing NGN and identifying NMP linear systems as well as nonlinear systems. Therefore, transformation to a higher-order domain reduces the effect of noise significantly.

In this paper, a novel algorithm for the detection of motion vectors in video sequences is proposed. The algorithm uses bispectrum model-based subpixel motion estimation in the parametric domain for noisy image sequences to obtain a measure of content similarity for temporally adjacent frames, and it responds very well to scene motion vectors. The algorithm is insensitive to the presence of symmetrically distributed noise.

The outline of this paper is as follows. The problem formulation is introduced in Section 2. In Section 3, we first briefly present the definitions and properties of the bispectrum and cross-bispectrum and then describe motion estimation in the parametric domain. High-accuracy subpixel motion estimation is discussed in Section 4. Section 5 presents an evaluation of the computational complexity of our algorithm. The results of the experimental evaluation of the proposed method are shown in Section 6 and compared to existing methods, while Section 7 concludes the paper.

2. Problem Formulation

The problem of motion estimation can be stated as follows: "Given an image sequence, compute a representation of the motion field that best aligns pixels in one frame of the sequence with those in the next" [9]. This is formulated as follows:

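$$g_{k-1}(\mathbf{x}) = f_{k-1}(\mathbf{x}) + n_{k-1}(\mathbf{x}), \tag{1}$$
$$g_{k}(\mathbf{x}) = f_{k}(\mathbf{x}) + n_{k}(\mathbf{x}) = f_{k-1}(\mathbf{x} - \mathbf{d}) + n_{k}(\mathbf{x}), \tag{2}$$

where $\mathbf{x} = (x, y)$ denotes the spatial image position of a point; $g_{k-1}(\mathbf{x})$ and $g_{k}(\mathbf{x})$ are the observed image intensities at instants $k-1$ and $k$, respectively; $f_{k-1}(\mathbf{x})$ and $f_{k}(\mathbf{x})$ are the noise-free frames; $n_{k-1}(\mathbf{x})$ and $n_{k}(\mathbf{x})$ are assumed to be spatially and temporally stationary, zero-mean Gaussian (or non-Gaussian) image noise sequences with unknown covariance; and $\mathbf{d} = (d_x, d_y)$ is the displacement vector of the object during the time interval $[k-1, k]$.

The goal is to estimate $\mathbf{d}$ from $g_{k-1}(\mathbf{x})$ and $g_{k}(\mathbf{x})$.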
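To make the observation model concrete, the following minimal Python sketch synthesizes a frame pair of the kind described above: a noise-free frame, a displaced copy of it, and independent additive zero-mean Gaussian noise on each. The function name `make_noisy_pair`, the circular (wrap-around) integer shift, and the white-noise assumption are illustrative choices made here, not part of the original method.

```python
import numpy as np

def make_noisy_pair(frame, d, noise_std, rng):
    """Return (g_prev, g_curr) for a displacement d = (rows, cols).

    Assumption: the displacement is an integer, circular shift, so that
    g_curr(x) = frame(x - d) + noise holds exactly on the whole grid.
    """
    frame = np.asarray(frame, dtype=float)
    # np.roll moves the content by +d, i.e. shifted[x] == frame[x - d].
    shifted = np.roll(frame, shift=d, axis=(0, 1))
    g_prev = frame + noise_std * rng.standard_normal(frame.shape)
    g_curr = shifted + noise_std * rng.standard_normal(frame.shape)
    return g_prev, g_curr

rng = np.random.default_rng(0)
f = rng.random((32, 32))            # stand-in for a noise-free frame
g_prev, g_curr = make_noisy_pair(f, d=(2, -1), noise_std=0.1, rng=rng)
```

Real sequences involve subpixel, non-circular motion and occlusions; the sketch only fixes the notation used when testing an estimator against a known displacement.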

3. Bispectrum-Based Image Motion Estimation

3.1. Definitions and Properties

In this subsection, some HOS functions are defined and their properties are described in order to provide the necessary tools to understand the motion estimation methodology.

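The third-order autocumulant/moment sequence, $C^{g}(\boldsymbol{\tau}_1, \boldsymbol{\tau}_2)$, of $g(\mathbf{x})$ is defined as follows [5]:

$$C^{g}(\boldsymbol{\tau}_1, \boldsymbol{\tau}_2) = E\{ g(\mathbf{x})\, g(\mathbf{x} + \boldsymbol{\tau}_1)\, g(\mathbf{x} + \boldsymbol{\tau}_2) \} \tag{3}$$

where $E\{\cdot\}$ denotes the expectation operation; $g(\mathbf{x} + \boldsymbol{\tau}_1)$ and $g(\mathbf{x} + \boldsymbol{\tau}_2)$ are two shifted versions of $g(\mathbf{x})$.

To understand the theory of triple correlations physically for 2D data [4], the reader is referred to Figure 1. The figure shows the spaces occupied by the original data (denoted by the continuous box) and two shifted versions of the same data (denoted by the dashed boxes). The shifts are made by the amounts $\boldsymbol{\tau}_1$ and $\boldsymbol{\tau}_2$, respectively. The product of the data at the overlapping positions (shown by the shaded portion) then gives the triple correlation function as defined by (3).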

Figure 1

Physical representation of cumulants [4].

For the noisy observation $g = f + n$, the third-order autocumulant/moment sequence $C^{g}$ decomposes as follows [12]:

$$C^{g}(\boldsymbol{\tau}_1, \boldsymbol{\tau}_2) = C^{f}(\boldsymbol{\tau}_1, \boldsymbol{\tau}_2) + C^{n}(\boldsymbol{\tau}_1, \boldsymbol{\tau}_2) \tag{4}$$

Equation (4) states that the triple correlation of $f$ plus $n$ is equal to the triple correlation of $f$ plus the triple correlation of the noise. For zero-mean Gaussian noise, the triple correlation is identically zero [7, 12]. This provides a theoretical basis for using the triple correlation (or the bispectrum) as a method of reducing the effects of additive noise. The term $C^{n}$ is then negligible, which renders the triple correlation very effective in detecting a signal embedded in noise. Therefore,

$$C^{g}(\boldsymbol{\tau}_1, \boldsymbol{\tau}_2) \approx C^{f}(\boldsymbol{\tau}_1, \boldsymbol{\tau}_2) \tag{5}$$

Also, $n$ can be non-Gaussian if it is independent and identically distributed (i.i.d.) and nonskewed (e.g., symmetrically distributed).
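The noise-suppression argument can be checked numerically. The sketch below estimates a third-order moment by a spatial average with circular shifts and compares a skewed synthetic texture, pure Gaussian noise, and their sum. The function name `third_order_moment`, the exponential test texture, and the replacement of the expectation by a spatial average are assumptions made for illustration only.

```python
import numpy as np

def third_order_moment(g, tau1, tau2):
    """Sample estimate of E{g(x) g(x + tau1) g(x + tau2)} for a 2D array.

    Assumptions: the mean is removed first, the expectation is replaced by
    a spatial average, and shifts wrap around the frame borders.
    """
    g = np.asarray(g, dtype=float)
    g = g - g.mean()
    # np.roll by -tau gives an array whose value at x is g(x + tau).
    g_tau1 = np.roll(g, shift=(-tau1[0], -tau1[1]), axis=(0, 1))
    g_tau2 = np.roll(g, shift=(-tau2[0], -tau2[1]), axis=(0, 1))
    return float(np.mean(g * g_tau1 * g_tau2))

rng = np.random.default_rng(0)
signal = rng.exponential(1.0, size=(256, 256)) - 1.0   # skewed, zero mean
noise = rng.normal(0.0, 1.0, size=(256, 256))          # symmetric, zero mean

c_signal = third_order_moment(signal, (0, 0), (0, 0))
c_noise = third_order_moment(noise, (0, 0), (0, 0))
c_noisy = third_order_moment(signal + noise, (0, 0), (0, 0))
print(c_signal, c_noise, c_noisy)
```

With these settings the noise-only estimate should be close to zero, while the noisy and noise-free estimates should be close to each other, in line with (4) and (5). The residual differences come from the finite sample size.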

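The bispectrum, $B^{g}(\mathbf{u}, \mathbf{v})$, is defined as the 4D Fourier transform of the third-order autocumulant (or moment) [4]:

$$B^{g}(\mathbf{u}, \mathbf{v}) = \mathcal{F}_4\{ C^{g}(\boldsymbol{\tau}_1, \boldsymbol{\tau}_2) \} \tag{6}$$

where $\mathcal{F}_4\{\cdot\}$ denotes the 4D Fourier transform operation; $\mathbf{u} = (u_1, u_2)$ and $\mathbf{v} = (v_1, v_2)$ are the frequency coordinates for the 2D Fourier transform.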

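Let $G(\mathbf{u})$ be the discrete Fourier transform (DFT) of the frame $g(\mathbf{x})$. Each component of the bispectrum is estimated by a triple product of Fourier coefficients as follows [12]:

$$B^{g}(\mathbf{u}, \mathbf{v}) = G(\mathbf{u})\, G(\mathbf{v})\, G^{*}(\mathbf{u} + \mathbf{v}) \tag{7}$$

where $*$ indicates the complex conjugate.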
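The triple product of Fourier coefficients can be sketched directly with NumPy. The code below forms every component B(u, v) = G(u) G(v) conj(G(u + v)) of a small frame as a 4D array, with frequency sums taken modulo the frame size. The function name `bispectrum_2d` and the single-realization estimate (no averaging over blocks) are assumptions made here. Memory grows with the fourth power of the frame side, so the direct form is only practical for small blocks.

```python
import numpy as np

def bispectrum_2d(g):
    """Direct bispectrum estimate indexed as B[u1, u2, v1, v2].

    Assumption: a single-frame estimate using the 2D DFT, with the
    frequency sum u + v wrapped modulo the frame size.
    """
    G = np.fft.fft2(np.asarray(g, dtype=float))
    n1, n2 = G.shape
    k1 = np.arange(n1)
    k2 = np.arange(n2)
    s1 = (k1[:, None] + k1[None, :]) % n1      # (u1 + v1) mod n1
    s2 = (k2[:, None] + k2[None, :]) % n2      # (u2 + v2) mod n2
    # Gather G(u + v) with shape (n1, n2, n1, n2).
    G_sum = G[s1[:, None, :, None], s2[None, :, None, :]]
    return G[:, :, None, None] * G[None, None, :, :] * np.conj(G_sum)

rng = np.random.default_rng(0)
g = rng.random((8, 8))
B = bispectrum_2d(g)
B_shifted = bispectrum_2d(np.roll(g, shift=(2, 3), axis=(0, 1)))
```

In this sketch `B` equals `B.transpose(2, 3, 0, 1)`, which is the symmetry B(u, v) = B(v, u). `B_shifted` matches `B` because the auto-bispectrum of a single frame is unchanged by a circular translation, so displacement information has to come from a cross-bispectrum between two frames.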

The cross-bispectrum is obtained in a similar manner as the bispectrum. Thus,

(8)

On close observation and after certain algebraic manipulations, (3) shows the third-order cumulant sequence to possess the following symmetry properties [4]:

(9)

This directly leads to the following symmetry properties of the bispectrum sequence [4]:

(10)

These symmetry properties reduce the computational burden while calculating the bispectrum.

3.2. Parametric Model-Based Motion Estimation

The problem of signal processing using bispectrum in the parametric domain has recently been widely addressed by researchers. Two primary schools of thought exist for this. The first line is headed by Raghuveer and Nikias [13], who have parametrized the bispectrum through solution of a cumulant matrix equation. The other school of thought headed by Giannakis [14] calculates the system impulse response coefficients directly using a linear combination of cumulant slices. Other papers have been subsequently published which give extensions and applications of these basic approaches (see [15] and references therein). Here, we propose the bispectrum model-based subpixel motion estimation in the parametric domain. Simulations demonstrate that this method requires large blocks of data and thus may be appropriate for estimating object displacement in background noise. Consequently, our approach will be derived in this context and hence, . in (2). Substituting from (1) into (2) we obtain

(11)

where, in theory, for all except , and . If the search region contains the largest possible horizontal and vertical delays, and if we take for , we obtain

(12)

By multiplying both sides of (12) by and taking expectations, we obtain

(13)

Therefore,

(14)

However, and are identically zero for all due to the fact that the signal and noise are zero-mean and independent. Consequently, (14) becomes

(15)

Taking the 4D Fourier transform of (15) and rearranging, the following pulse transfer function for the system is obtained:

(16)

The third-order hologram, , is then defined by

(17)

where denotes the 4D inverse Fourier transform operation.

Selecting various integers for and , we form an overdetermined system of equations. For example, if the chosen search region is a rectangle that varies from to in the horizontal direction and to in the vertical direction, and if ranges from to , and ranges from to , a set of linear equations can be produced as follows:

(18)

where

(19)

The least-squares solution of (18) is given by:

(20)

The least-squares solution is obtained and its maximum is determined. The image motion estimate is then .

Although we assume that the signals are non-Gaussian, it can be shown that for binary, deterministic signals and large image sizes the bispectrum is insensitive to Gaussian noise, and thus (20) is approximately true. Therefore,

(21)
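The idea of recovering the displacement from the least-squares solution of an overdetermined linear system can be illustrated with a simplified, spatial-domain sketch. It is not the Fourier-domain hologram formulation of this section: it assumes the model g_k(x) = sum over s of a(s) g_{k-1}(x - s) with a(s) ideally a delta at the displacement, and it uses only the diagonal slice of third-order moments, m(t) = mean[g(x) g_{k-1}(x + t)^2], estimated with circular shifts. All function names, the choice of slice, and the lag ranges are assumptions made for illustration.

```python
import numpy as np

def diag_moment(a, b, t):
    """Sample third-order moment mean_x[a(x) * b(x + t)**2], circular shifts."""
    b_t = np.roll(b, shift=(-t[0], -t[1]), axis=(0, 1))
    return float(np.mean(a * b_t ** 2))

def estimate_shift_parametric(g_prev, g_curr, max_shift=2):
    """Least-squares estimate of a(s) over a square search region.

    Hypothetical simplification: equations come from the slice
    mean[g_curr(x) g_prev(x+t)^2] = sum_s a(s) mean[g_prev(x) g_prev(x+t+s)^2].
    """
    g_prev = g_prev - g_prev.mean()
    g_curr = g_curr - g_curr.mean()
    r = range(-max_shift, max_shift + 1)
    shifts = [(i, j) for i in r for j in r]          # unknowns a(s)
    q = range(-2 * max_shift, 2 * max_shift + 1)
    lags = [(i, j) for i in q for j in q]            # one equation per lag
    A = np.array([[diag_moment(g_prev, g_prev, (t[0] + s[0], t[1] + s[1]))
                   for s in shifts] for t in lags])
    b = np.array([diag_moment(g_curr, g_prev, t) for t in lags])
    a, *_ = np.linalg.lstsq(A, b, rcond=None)
    return shifts[int(np.argmax(a))], a

rng = np.random.default_rng(0)
f = rng.exponential(1.0, size=(64, 64)) - 1.0        # skewed test texture
g_prev = f + 0.3 * rng.standard_normal(f.shape)
g_curr = np.roll(f, shift=(2, -1), axis=(0, 1)) + 0.3 * rng.standard_normal(f.shape)
d_hat, a_hat = estimate_shift_parametric(g_prev, g_curr, max_shift=2)
print(d_hat)
```

The location of the largest coefficient plays the role of the motion estimate. Because only third-order moments enter the equations, zero-mean Gaussian noise contributes little to either side of the system, although the synthetic i.i.d. texture used here is a much easier case than natural images.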

4. High-Accuracy Subpixel Motion Estimation

Subpixel performance is a critical element of the proposed algorithm. With reference to our previously published work [16, 17], we are introducing a number of important new features, which improve the accuracy of the motion estimates.

The coordinates of the maximum of the real-valued array can be used as an estimate of the horizontal and vertical components of motion between and as follows:

(22)

where denotes the real part of complex array .

Subpixel accuracy of motion measurements is obtained by variable-separable fitting performed in the neighborhood of the maximum using one-dimensional quadratic function. Using the notation in (22), prototype functions are fitted to the triplets:

(23)
(24)

that is, the maximum peak of the phase correlation surface and its two neighboring values on either side, vertically and horizontally.

The location of the maximum of the fitted function provides the required subpixel motion estimate . Fitting a parabolic function horizontally to the data triplet (23) yields a closed-form solution for the horizontal component of the motion estimate as follows:

(25)

where .

The fractional part of the vertical component can be obtained in a similar way using (24) instead of (23).

Finally the horizontal and vertical components of the subpixel accurate motion estimate are obtained by computing the location of the maxima of each of the above fitted quadratics.
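The separable parabolic fit can be sketched as follows. For three samples taken at positions -1, 0, and +1 around the integer peak, the vertex of the interpolating parabola lies at an offset of 0.5 (c_minus - c_plus) / (c_minus - 2 c_peak + c_plus). The function names, the wrap-around handling of border peaks, and the optional rounding to half-pixel precision are assumptions made here for illustration.

```python
import numpy as np

def parabolic_offset(c_minus, c_peak, c_plus):
    """Vertex offset of the parabola through (-1, c_minus), (0, c_peak), (1, c_plus)."""
    denom = c_minus - 2.0 * c_peak + c_plus
    if denom == 0.0:          # flat triplet: no refinement possible
        return 0.0
    return 0.5 * (c_minus - c_plus) / denom

def subpixel_peak(surface, half_pixel=False):
    """Return the (row, col) peak location refined by separable 1D fits."""
    surface = np.asarray(surface, dtype=float)
    ny, nx = surface.shape
    py, px = np.unravel_index(np.argmax(surface), surface.shape)
    # Neighbours are taken modulo the size (an assumption for border peaks).
    dy = parabolic_offset(surface[(py - 1) % ny, px], surface[py, px],
                          surface[(py + 1) % ny, px])
    dx = parabolic_offset(surface[py, (px - 1) % nx], surface[py, px],
                          surface[py, (px + 1) % nx])
    est = np.array([py + dy, px + dx])
    if half_pixel:
        est = np.round(est * 2.0) / 2.0   # quantize to half-pixel steps
    return float(est[0]), float(est[1])

# Synthetic surface with a known subpixel maximum at (4.3, 2.6).
yy, xx = np.mgrid[0:9, 0:9]
surface = -(yy - 4.3) ** 2 - (xx - 2.6) ** 2
peak = subpixel_peak(surface)
peak_half = subpixel_peak(surface, half_pixel=True)
```

On this exactly quadratic surface the fit recovers the true maximum. On a real correlation surface the result is only an approximation whose quality depends on how well the peak is modelled by a parabola.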

In [18], it is shown that half-pixel accuracy motion vectors lead to a very significant improvement when compared to one pixel accuracy, whereas a higher precision results in negligible changes. Therefore, a half-pixel accuracy was chosen in our simulations.

5. Computational Cost Comparison

The majority of the computational cost of the proposed bispectrum method is due to the fast Fourier transform (FFT). The fundamental computation required for bispectral estimates is given by (7), the triple product of three Fourier coefficients. While this computation is straightforward, limitations on computer time and statistical variance severely restrict the direct implementation of the definition of the bispectrum [19]. On the other hand, we take advantage of the symmetry properties of the bispectrum to reduce the computational complexity and memory requirements of calculating third-order statistics: the bispectrum can be calculated in any one sector and mapped onto the others [20].

The phase correlation is estimated by multiplying each coefficient by its complex conjugate, but each component of the bispectrum is estimated by a triple product of Fourier coefficients as demonstrated in (7). Thus, the number of operations required to compute the bispectrum is significantly increased relative to the phase correlation. There are independent components of the bispectrum while there are only independent components of the phase correlation for an image [21].
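For reference, the phase correlation baseline can be sketched in a few lines: the cross-power spectrum of the two frames is normalized to unit magnitude and inverse transformed, and the peak of the resulting surface gives the displacement. The function name `phase_correlation`, the small constant `eps` guarding the division, and the integer, circular test shift are assumptions of this sketch.

```python
import numpy as np

def phase_correlation(g_prev, g_curr, eps=1e-12):
    """Return (integer shift, correlation surface) between two frames."""
    G_prev = np.fft.fft2(g_prev)
    G_curr = np.fft.fft2(g_curr)
    cross = G_curr * np.conj(G_prev)                 # cross-power spectrum
    surface = np.real(np.fft.ifft2(cross / (np.abs(cross) + eps)))
    peak = np.unravel_index(np.argmax(surface), surface.shape)
    # Map peak indices above N/2 to negative displacements.
    shift = tuple(int(p) if p <= n // 2 else int(p) - n
                  for p, n in zip(peak, surface.shape))
    return shift, surface

rng = np.random.default_rng(0)
frame = rng.random((32, 32))
moved = np.roll(frame, shift=(3, -2), axis=(0, 1))
shift, surface = phase_correlation(frame, moved)
print(shift)
```

The division by the magnitude gives every frequency the same weight, which is the normalization step that makes the method sensitive to noise at frequencies where the signal is weak.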

6. Simulation Results

Our experiments aimed at evaluating the performance of the proposed approach and comparing it with that of the optical flow and phase correlation techniques. For the optical flow method, we used an implementation of the method of Bruhn et al. [22]. In our simulations, we used the database freely available on the web at http://vision.middlebury.edu/flow/. We use three types of data to test different aspects of all techniques: real sequences of independent motion, realistic synthetic sequences, and high frame-rate video. These sequences have been chosen for their difficult motion and their different characteristics. Although the original sequences are in color, only the luminance component is used to estimate the motion vectors.

Figure 2 shows the estimated motion vector fields for the Grove sequence using the three aforementioned motion estimation methods. For a fair comparison, the optical flow technique and the phase correlation algorithm were also used with half-pixel accuracy. The motion vectors estimated between frames 6 and 7 are shown for the Grove sequence. For this particular sequence, our scheme provides the most consistent and reliable motion vector field. Both the optical flow and phase correlation algorithms fail to detect the true motion vector. Similar results are shown in Figures 3 and 4 for the motion vectors estimated between frames 2 and 3, and between frames 5 and 6, in the Walking and Mequon sequences, respectively. Both the optical flow and phase correlation algorithms produce abrupt motion vector fields. Although these abrupt motion vectors may lead to lower numerical mean squared errors (MSEs), they are incorrect motion vectors. Because of its noise-resistant property, the parametric bispectrum method produces more reliable estimates. Therefore, our approach yields motion fields that are globally more representative of the true motion in the scene.

Figure 2

Motion vector fields of Grove sequence in the presence of noise using (a) our algorithm, (b) optical flow algorithm, (c) phase correlation algorithm.

Figure 3

Motion vector fields of Walking sequence in the presence of noise using (a) our algorithm, (b) optical flow algorithm, (c) phase correlation algorithm.

Figure 4

Motion vector fields of Mequon sequence in the presence of noise using (a) our algorithm, (b) optical flow algorithm, (c) phase correlation algorithm.

To assess the correctness of the motion estimation more clearly, we use the Beanbags sequence as an example. The motion-compensated pictures obtained with the three methods are shown in Figure 5. Portions of these three pictures are enlarged in Figure 6 to show the differences. The proposed method yields better compensated images, which are much closer to the original images. Thus, the scheme measures the motion vector more accurately and is more robust in general. Overall, the parametric bispectrum scheme typically offers images of better visual quality than the other methods.

Figure 5

Prediction for frame 5 of the Beanbags sequence in the presence of noise: (a) original image, (b) our algorithm, (c) optical flow algorithm, (d) phase correlation algorithm.

Figure 6

Enlarged portions of the motion compensated pictures of the Beanbags sequence using (a) our algorithm, (b) optical flow algorithm, (c) phase correlation algorithm.

The detection of motion vectors relies on successive phase correlation operations applied to pairs of consecutive block partitioned frames of a video sequence. The heights of the dominant peaks are monitored, and when a sudden magnitude change is detected, then this is interpreted as a displacement vector. Figure 7 shows sample phase correlation surface between two blocks and , related to frames 3 and 4 of the Hydrangea sequence, respectively. The bispectrum retains both amplitude and phase information from the Fourier transform of a signal, unlike the other methods. The phase of the Fourier transform contains important shape information. Therefore, the bispectrum minimizes the influence of the noise and simplifies the identification of the dominant peak on the correlation surface.

Figure 7

Phase correlation surfaces between two blocks using (a) our algorithm, (b) optical flow algorithm, (c) phase correlation algorithm.

The PSNR of the motion-compensated frame is a popular performance measure for motion estimation, giving insight into the quality of the prediction. The PSNRs of the three motion estimation algorithms are shown in Figure 8. This result is obtained by using two real video sequences, Tempete and Stefan. These sequences were run for 60 frames with a frame rate of 30 frames/s. Both sequences are degraded with additive zero-mean Gaussian noise to a signal-to-noise ratio (SNR) of 10 dB. Here we define

$$\mathrm{SNR} = 10 \log_{10}\left( \frac{\sigma_f^2}{\sigma_n^2} \right) \tag{26}$$

where $\sigma_f^2$ is the variance of the frame and $\sigma_n^2$ is the variance of the noise. From Figure 8, it is clear that the implemented optical flow technique is significantly less efficient than the parametric bispectrum technique. This is mainly due to the difficulty the optical flow technique has in coping with large displacements and discontinuities in the motion field. On the other hand, the normalization (equalization) operation in the phase correlation technique enhances the noise power at high frequencies, and it produces incorrect displacement estimates on noisy image sequences. On the whole, the bispectrum retains both amplitude and phase information from the Fourier transform of a signal, unlike the other techniques. This confirms the notion that the proposed technique is a superior feature selector, utilizing the portions of the image spectrum most likely to contribute to reliable motion estimation.
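The degradation step and the quality measure can be sketched as follows, using the variance-ratio definition of SNR in decibels. The function names are hypothetical, and the PSNR helper assumes 8-bit frames with a peak value of 255, which is a common convention rather than something specified in the text.

```python
import numpy as np

def add_gaussian_noise(frame, snr_db, rng):
    """Add zero-mean white Gaussian noise so that 10*log10(var_f/var_n) = snr_db."""
    frame = np.asarray(frame, dtype=float)
    noise_var = frame.var() / (10.0 ** (snr_db / 10.0))
    return frame + rng.normal(0.0, np.sqrt(noise_var), size=frame.shape)

def measured_snr_db(frame, noisy):
    """SNR in dB computed from a clean frame and its noisy version."""
    noise = np.asarray(noisy, dtype=float) - np.asarray(frame, dtype=float)
    return 10.0 * np.log10(np.var(frame) / np.var(noise))

def psnr_db(reference, predicted, peak=255.0):
    """PSNR in dB; peak=255 assumes 8-bit intensities."""
    diff = np.asarray(reference, dtype=float) - np.asarray(predicted, dtype=float)
    return 10.0 * np.log10(peak ** 2 / np.mean(diff ** 2))

rng = np.random.default_rng(0)
frame = rng.uniform(0.0, 255.0, size=(128, 128))
noisy = add_gaussian_noise(frame, snr_db=10.0, rng=rng)
print(measured_snr_db(frame, noisy), psnr_db(frame, noisy))
```

The measured SNR differs slightly from the target because the noise variance is itself a sample estimate. Averaging `psnr_db` over the frames of a sequence gives a single per-sequence figure.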

Figure 8

PSNR obtained for noisy sequences (SNR = 10 dB).

Complexity is measured here by computation time. All computations were performed on an Intel Centrino Duo machine (Toshiba Satellite A100-579, T5500, 2 GHz, 2 CPUs) running Windows XP. The three algorithms were implemented as prototypes written in Matlab 6.5 R13. The motion estimation computation time (MECT) of the three methods is compared in Table 1.

Table 1 The comparison between three methods for the computation time.

We employ 60 frames of the video Tempete sequence. We perform the motion compensation procedure for each current frame with respect to reference frames , where and . The average PSNR of the motion compensated images is given in Table 2, with Tempete sequence degraded with additive zero-mean Gaussian noise to an SNR of 10 dB.

Table 2 Average PSNR of motion compensated images for the three motion estimation techniques (unit: dB) for Tempete sequence.

The average PSNR, $\overline{\mathrm{PSNR}}$, is given as follows:

$$\overline{\mathrm{PSNR}} = \frac{1}{N} \sum_{i=1}^{N} \mathrm{PSNR}_i \tag{27}$$

where $\mathrm{PSNR}_i$ is the measured PSNR for frame $i$ and $N$ is the total number of frames. In Table 2, we observe that $\overline{\mathrm{PSNR}}$ decreases with larger apparent disparity between the global motion of the background and the local motion of the foreground. For each value of , we see that $\overline{\mathrm{PSNR}}$ is higher for the proposed scheme than for the other methods.

7. Conclusion

In this paper, a subpixel motion estimation algorithm using the bispectrum in the parametric domain was presented. We evaluated our method on a collection of datasets available on the web at http://vision.middlebury.edu/flow/. When the data are severely corrupted by additive Gaussian noise of unknown covariance, our method suppresses the effects of the noise and simplifies the identification of the dominant peak on the correlation surface, unlike other techniques. At high noise levels (SNR around 10 dB), the optical flow and phase correlation techniques fail, yet even under these extreme conditions the parametric bispectrum provides an improvement in performance over the other algorithms. Overall, our scheme produces a smoother displacement vector field with a more accurate measure of object motion in different SNR scenarios.

References

  1. Benmoussat N, Faouzi Belbachir M, Benamar B: Motion estimation and compensation from noisy image sequences: a new filtering scheme. Image and Vision Computing 2007, 25(5):686-694. 10.1016/j.imavis.2006.05.010

  2. Brailean JC, Kleihorst RP, Efstratiadis S, Katsaggelos AK, Lagendijk RL: Noise reduction filters for dynamic image sequences: a review. Proceedings of the IEEE 1995, 83(9):1272-1292. 10.1109/5.406412

  3. Armitano RM, Schafer RW, Kitson FL, Bhaskaran V: Robust block-matching motion-estimation technique for noisy sources. Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '97), April 1997, Munich, Germany 4: 2685-2688.

  4. Bhattacharya S, Ray NC, Sinha S: 2-D signal modelling and reconstruction using third-order cumulants. Signal Processing 1997, 62(1):61-72. 10.1016/S0165-1684(97)00115-1

  5. Ismaili Aalaoui EM, Ibn-Elhaj E: Estimation of subpixel motion using bispectrum. Research Letters in Signal Processing 2008, 2008:-5.

  6. Anderson JMM, Giannakis GB: Image motion estimation algorithms using cumulants. IEEE Transactions on Image Processing 1995, 4(3):346-357. 10.1109/83.366482

  7. Kleihorst RP, Lagendijk RL, Biemond J: Noise reduction of severely corrupted image sequences. Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '93), April 1993, Minneapolis, Minn, USA 5: 293-296.

  8. Ibn-Elhaj E, Aboutajdine D, Pateux S, Morin L: HOS-based method of global motion estimation for noisy image sequences. Electronics Letters 1999, 35(16):1320-1322. 10.1049/el:19990913

  9. Sayrol E, Gasull A, Fonollosa JR: Motion estimation using higher order statistics. IEEE Transactions on Image Processing 1996, 5(6):1077-1084. 10.1109/83.503924

  10. Netravali AN, Robbins JD: Motion-compensated television coding: part I. Bell System Technical Journal 1979, 58(3):629-668.

  11. Murino V, Ottonello C, Pagnan S: Noisy texture classification: a higher-order statistics approach. Pattern Recognition 1998, 31(4):383-393. 10.1016/S0031-3203(97)00055-1

  12. Sadler BM, Giannakis GB: Shift- and rotation-invariant object reconstruction using the bispectrum. Journal of the Optical Society of America A 1992, 9(1):57-69. 10.1364/JOSAA.9.000057

  13. Raghuveer MR, Nikias CL: Bispectrum estimation: a parametric approach. IEEE Transactions on Acoustics, Speech and Signal Processing 1985, 33(5):1213-1230. 10.1109/TASSP.1985.1164679

  14. Giannakis GB: On the identifiability of non-Gaussian ARMA models using cumulants. IEEE Transactions on Automatic Control 1990, 35(1):18-26. 10.1109/9.45139

  15. Mendel JM: Tutorial on higher-order statistics (spectra) in signal processing and system theory: theoretical results and some applications. Proceedings of the IEEE 1991, 79(3):278-305. 10.1109/5.75086

  16. Ismaili Aalaoui EM, Ibn-Elhaj E: Estimation of motion fields from noisy image sequences: using generalized cross-correlation methods. Proceedings of the IEEE International Conference on Signal Processing and Communications (ICSPC '07), November 2007, Dubai, UAE

  17. Ismaili Aalaoui EM, Ibn Elhaj E: Estimation of displacement vector field from noisy data using maximum likelihood estimator. Proceedings of the 14th IEEE International Conference on Electronics, Circuits, and Systems (ICECS '07), December 2007, Marrakech, Morocco 1380-1383.

  18. Madec G: Half pixel accuracy in block matching. Proceedings of the Picture Coding Symposium (PCS '90), March 1990, Cambridge, Mass, USA

  19. Lii KS, Helland KN: Cross-bispectrum computation and variance estimation. ACM Transactions on Mathematical Software 1981, 7(3):284-294. 10.1145/355958.355961

  20. Le Caillec J-M, Garello R: Comparison of statistical indices using third order statistics for nonlinearity detection. Signal Processing 2004, 84(3):499-525. 10.1016/j.sigpro.2003.11.013

  21. Means RW, Wallach B, Busby D: Bispectrum signal processing on HNC's SIMD numerical array processor (SNAP). Proceedings of the ACM/IEEE Conference on Supercomputing (SC '93), November 1993, Portland, Ore, USA 535-537.

  22. Bruhn A, Weickert J, Schnörr C: Lucas/Kanade meets Horn/Schunck: combining local and global optic flow methods. International Journal of Computer Vision 2005, 61(3):211-231.

Author information

Corresponding author

Correspondence to EM Ismaili Aalaoui.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


About this article

Cite this article

Ismaili Aalaoui, E., Ibn-Elhaj, E. & Bouyakhf, E. A Robust Subpixel Motion Estimation Algorithm Using HOS in the Parametric Domain. J Image Video Proc 2009, 381673 (2009). https://doi.org/10.1155/2009/381673

  • DOI: https://doi.org/10.1155/2009/381673
