
Visual contour tracking based on inner-contour model particle filter under complex background

Abstract

In this paper, a novel particle filter–based visual contour tracking method is proposed, which uses an inner-contour model to track contour objects under complex backgrounds. The purpose is to achieve effectiveness and robustness against complex backgrounds. To that end, the proposed method first utilizes a Sobel edge detector to detect edge information along the normal lines of the contour. Then, it samples the inner part of each normal line to obtain local color information, which is combined with the edge information to construct a new normal-line likelihood. After that, all the inner color information is used to construct a global color likelihood. Finally, the edge information, local color information, and global color information are fused into a new observation likelihood. Experimental results show that the proposed method is robust for contour tracking under complex backgrounds, computationally efficient, and able to run fully in real time.

1 Introduction

Nowadays, visual object tracking is becoming increasingly important in many fields, such as surveillance, robotics, and human-computer interfaces [1]. However, reliable tracking remains a challenging task due to cluttered backgrounds, occlusion, and varying illumination. To overcome these challenges and achieve robust tracking, many tracking methods have been published during the last two decades. Most of these methods use rectangles or other rigid shapes to represent the target, which loses detailed shape and edge information. Furthermore, rectangles or other rigid shapes contain background pixels outside the real target region, which reduces the robustness of tracking. To overcome this problem, some researchers use contours to represent deformable targets [2,3,4,5].

1.1 Related work

Most contour tracking methods can be grouped into two categories with different representations of contour curves: parametric active contours [6,7,8] and geometric active contours [5, 9]. In the former, the contour is approximated by an explicit parametric model, typically a set of control points and B-splines. In the latter, the contour is typically represented by an implicit function, as in the level set method. In general, parametric contour methods are more efficient and thus more suitable for real-time contour tracking.

In order to track contours with non-Gaussian and nonlinear state densities in cluttered video sequences, Isard and Blake [6] introduced the CONDENSATION algorithm. They used B-splines to represent object contours and particle filters to track the curve parameters given noisy observations. However, this work used a very simple measurement term, so the method had difficulty dealing with complex background clutter. Several methods were proposed to improve performance under complex backgrounds. Li and Zhang [3] proposed the Unscented Kalman Particle Filter (UKPF) to construct a new observation model, in which a Kalman filter and an unscented particle filter were used to adopt sub-optimal proposal distributions. Chen et al. [10] proposed a Multicue Hidden Markov Model Unscented Particle Filter (MHMM-UPF) for contour detection and tracking based on multiple visual cues in the spatial domain, improving performance by joint probability matching to reduce background clutter. Although these two methods improved tracking performance in some simple environments, they still could not handle objects that are similar to the target. Another contour tracking strategy is tracking by segmentation, in which contours are represented with segmentation masks. For instance, Godec et al. [11] proposed the Hough Tracking (HT) algorithm, one of the state-of-the-art contour tracking algorithms of recent years. In this algorithm, the object location is determined with the Hough voting technique, and a mask is obtained with the GrabCut algorithm. However, due to the shape constraints it imposes, the HT algorithm is very time-consuming and is not suitable for real-time object tracking.

1.2 Our approach

In contrast to the above methods, this paper proposes a novel multi-feature fusion approach called the inner-contour model. This method fuses the color feature with the gradient feature to construct a new observation model within the particle filter framework, realizing robust contour tracking in cluttered backgrounds. The proposed method first utilizes a Sobel edge detector to detect edge information along the normal lines of the contour. Then, it samples the inner part of each normal line to obtain local color information, which is combined with the edge information to construct a new normal-line likelihood. After that, all the inner color information is used to construct a global color likelihood. Finally, the edge information, local color information, and global color information are fused into a new observation likelihood. Experimental results show that the proposed method is robust for contour tracking under complex backgrounds, computationally efficient, and able to run fully in real time. The pipeline of the proposed method is shown in Fig. 1.

Fig. 1 The pipeline of the proposed method

1.3 Paper organization

Section 2 summarizes the relevant methods for visual contour tracking; Section 3 deals with the proposed inner-contour model for visual contour tracking; Section 4 introduces the experimental results and compares them with UKPF [3], MHMM-UPF [10], and HT [11]; and Section 5 outlines the conclusions and suggestions for future research.

2 Visual contours tracking based on particle filter

This section briefly overviews the main concepts of the related methods discussed in this paper, including the basic formulae of particle filter for visual tracking and the visual contour observation model for contour appearance representation.

2.1 Particle filter

Particle filter is a Monte Carlo approximation to the optimal Bayesian filter. It provides robust tracking of moving objects in cluttered environments, especially for nonlinear and non-Gaussian problems where the interest lies in the detection and tracking of moving objects. It is a probabilistic framework for sequentially estimating the state of the target, recursively calculating the posterior density p(st | z1:t) of the current object state st conditioned on all observations z1:t = (z1, z2, …, zt) up to time t. The posterior density p(st | z1:t) can be obtained recursively in two stages, prediction and update, which are written, respectively, as follows:

$$ p\left({s}_t|{z}_{1:t}\right)=\frac{p\left({z}_t|{s}_t\right)p\left({s}_t|{z}_{1:t-1}\right)}{p\left({z}_t|{z}_{1:t-1}\right)}={k}_pp\left({z}_t|{s}_t\right)p\left({s}_t|{z}_{1:t-1}\right) $$
(1)
$$ p\left({s}_t|{z}_{1:t-1}\right)=\int p\left({s}_t|{s}_{t-1}\right)p\left({s}_{t-1}|{z}_{1:t-1}\right)d{s}_{t-1} $$
(2)

According to formulas (1) and (2), we obtain the following formula:

$$ p\left({s}_t|{z}_{1:t}\right)={k}_pp\left({z}_t|{s}_t\right)\int p\left({s}_t|{s}_{t-1}\right)p\left({s}_{t-1}|{z}_{1:t-1}\right)d{s}_{t-1} $$
(3)

where kp is a normalizing constant that is independent of st, p(zt| st) is the likelihood function, p(st| st − 1) is the dynamic model, and p(st| z1 : t − 1) is the temporal prior over st given the prior observations.

The integral in formula (3) has no closed-form solution except in the most basic cases, so the particle filter approximates formula (3) with a set of weighted particles \( {\left\{{s}_t^{(i)},{w}_t^{(i)}\right\}}_{i=1,\dots ,n} \), where each particle represents a hypothetical state of the object. Under this representation, formula (3) can be approximated as follows:

$$ p\left({s}_t|{z}_{1:t}\right)\approx {k}_pp\left({z}_t|{s}_t\right)\sum \limits_{i=1}^n{w}_{t-1}^{(i)}p\left({s}_t^{(i)}|{s}_{t-1}^{(i)}\right) $$
(4)

where \( {w}_t^{(i)} \) is the weight for particle \( {s}_t^{(i)} \).

To implement a standard particle filter, a state representation st must be identified; in object tracking, it might include the location, scale, and rotation of the object. Moreover, three distributions must be designed: the process dynamical distribution p(st | st − 1), which describes how the object moves between frames; the proposal distribution q(st | s1:t − 1, z1:t), which is sampled each time the particle distribution is updated; and the observation likelihood distribution p(zt | st), which describes how the object appears in the video frame. This paper focuses on the likelihood, which is discussed in detail in the following sections; a minimal sketch of one filtering step is given below.
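To make the recursion concrete, the following is a minimal sketch of one predict-weight-resample step, assuming the dynamics p(st | st − 1) of formula (5) serve as the proposal distribution. The Particle layout, the function names, and the use of systematic resampling are illustrative assumptions, not the authors' exact implementation.

```cpp
#include <functional>
#include <random>
#include <vector>

struct Particle {
    std::vector<double> state;  // hypothesized object state s_t^(i)
    double weight;              // importance weight w_t^(i)
};

// One filtering step: predict with the dynamic model, re-weight with the
// observation likelihood (formula (4)), then resample to avoid degeneracy.
void particleFilterStep(std::vector<Particle>& particles,
                        const std::function<void(Particle&, std::mt19937&)>& propagate,
                        const std::function<double(const Particle&)>& likelihood,
                        std::mt19937& rng)
{
    // Prediction: sample each particle from p(s_t | s_{t-1}).
    for (auto& p : particles) propagate(p, rng);

    // Update: weight by p(z_t | s_t) and normalize.
    double total = 0.0;
    for (auto& p : particles) { p.weight *= likelihood(p); total += p.weight; }
    for (auto& p : particles) p.weight /= total;

    // Systematic resampling: draw n particles in proportion to their weights.
    const std::size_t n = particles.size();
    std::uniform_real_distribution<double> u0(0.0, 1.0 / n);
    std::vector<Particle> resampled;
    resampled.reserve(n);
    double cum = particles[0].weight;
    const double start = u0(rng);
    for (std::size_t i = 0, j = 0; i < n; ++i) {
        const double u = start + static_cast<double>(i) / n;
        while (j + 1 < n && cum < u) cum += particles[++j].weight;
        resampled.push_back(particles[j]);
        resampled.back().weight = 1.0 / n;
    }
    particles = std::move(resampled);
}
```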

The dynamical distribution p(st | st − 1) can usually be represented as a linear stochastic difference equation:

$$ {s}_t=A{s}_{t-1}+B{\omega}_{t-1} $$
(5)

where A defines the deterministic component of the dynamic model, st is the state vector at time t, ωt − 1 is the system noise with zero mean and unit variance, usually a uniform random variable or a multivariate Gaussian random variable, and B is the propagation distance, indicating how far the particles can propagate in the next frame.
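As a concrete illustration, the propagation of one particle under formula (5) can be written as below with OpenCV. The dimension of the shape-space state and the contents of A and B are assumptions of the sketch (the dynamic parameters actually used are listed in Table 2); this routine could serve as the propagate step of the filtering sketch above.

```cpp
#include <opencv2/core.hpp>

// Propagate one state vector: s_t = A * s_{t-1} + B * w_{t-1}, with w ~ N(0, I).
void propagateState(cv::Mat& state,        // column vector in the shape space
                    const cv::Mat& A,      // deterministic dynamics
                    const cv::Mat& B,      // propagation distance (noise scaling)
                    cv::RNG& rng)
{
    cv::Mat noise(state.rows, 1, CV_64F);
    rng.fill(noise, cv::RNG::NORMAL, 0.0, 1.0);  // i.i.d. standard Gaussian noise
    state = A * state + B * noise;
}
```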

2.2 Visual contour observation model

In this paper, the visual contour object was modeled as a B-spline curve restricted to the shape space proposed by Isard and Blake [6]. The observation model of the tracking process was based on the model introduced by MacCormick [12], and the assumptions and propositions in this reference were adopted as the basis for further derivation. This model can be described briefly as follows.

Given a candidate contour represented by a B-spline curve, a finite number of points are sampled on the curve, and the normals li (i = 1, 2, …, m) to the curve at these points (hereafter called measurement lines) are searched for edge features; all measurement lines have the same length L. A Sobel edge detector is applied to each measurement line, and edge features are characterized as local maxima of its response. The model makes the following hypotheses.

Each detected feature may correspond either to the real edge of the target or to a clutter feature. The model assumes that clutter features are uniformly distributed on the measurement line and that at most one true edge feature can be detected on each measurement line. The number n of clutter features observed on a measurement line of length L obeys a Poisson law with density λ: bL(n) = e−λL(λL)n/n!. There is a fixed probability q01 that the edge feature is not detected, and the distance between the detected edge feature and the true contour location of the real object is Gaussian distributed with zero mean and variance σ2.

Based on these hypotheses, the likelihood of the measurement line li can be expressed as (see reference [13] for details):

$$ {p}_i\left(n;z|v={v}_i\right)={e}^{-\lambda L}\frac{{\left(\lambda L\right)}^n}{n!}\left({q}_{01}+\frac{q_{11}}{\gamma}\sum \limits_{k=1}^n\frac{1}{\sqrt{2\pi}\sigma }{e}^{-\frac{{\left({z}_k^i-{v}_i\right)}^2}{2{\sigma}^2}}\right) $$
(6)

where q11 = 1 − q01. Given that the observations of the m measurement lines li (i = 1, 2, …, m) are statistically independent, the likelihood of the entire contour becomes:

$$ p\left(z|x\right)=\prod \limits_{i=1}^m{p}_i\left(n;z|v={v}_i\right) $$
(7)
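For illustration, Eq. (6) for one measurement line and the product of Eq. (7) can be computed as in the sketch below. The feature positions z are assumed to be the local maxima of the Sobel response already extracted along the line; γ is kept as an opaque model parameter since the text does not define it further, and all names are illustrative.

```cpp
#include <cmath>
#include <vector>

constexpr double kPi = 3.14159265358979323846;

// Likelihood of one measurement line, Eq. (6).
double lineLikelihood(const std::vector<double>& z, // detected feature positions z_k^i
                      double v,        // hypothesized contour crossing v_i
                      double L,        // measurement line length
                      double lambda,   // clutter density of the Poisson law
                      double q01,      // probability of missing the true edge
                      double gamma,    // model parameter of Eq. (6)
                      double sigma)    // std. dev. of the true edge position
{
    const double q11 = 1.0 - q01;
    const std::size_t n = z.size();
    double sum = 0.0;
    for (double zk : z)
        sum += std::exp(-(zk - v) * (zk - v) / (2.0 * sigma * sigma))
               / (std::sqrt(2.0 * kPi) * sigma);
    const double poisson = std::exp(-lambda * L)
                         * std::pow(lambda * L, static_cast<double>(n))
                         / std::tgamma(static_cast<double>(n) + 1.0); // (lambda*L)^n / n!
    return poisson * (q01 + (q11 / gamma) * sum);
}

// Likelihood of the entire contour, Eq. (7): product over all m lines.
double contourLikelihood(const std::vector<double>& perLine)
{
    double p = 1.0;
    for (double pi : perLine) p *= pi;
    return p;
}
```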

3 The proposed method

3.1 Inner-contour model

Using only the gradient feature of the input image, contour tracking algorithms based on the general model may perform well against a simple background without many edge features. However, in a highly cluttered environment, the tracker easily drifts to noise edge features, causing the tracking process to fail. In order to improve the robustness of contour tracking, it is necessary to introduce other useful features into this model and to fuse all the features naturally into a new observation likelihood. Inspired by this idea, this paper proposes a new observation likelihood that naturally combines the gradient feature and the color feature.

The color distribution is a widely used target representation model because of its robustness against non-rigidity, rotation, and partial occlusion. In this paper, the color distribution in HSV color space was used to express the color features.
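As a sketch of how such a color model can be built with OpenCV, the function below computes an L1-normalized Hue-Saturation histogram over a masked region. Dropping the V channel and the particular bin counts are common choices assumed here for illustration; the paper does not state its exact binning.

```cpp
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>

// L1-normalized H-S histogram of the pixels selected by `mask` (CV_8U, same size).
cv::Mat hsvHistogram(const cv::Mat& bgr, const cv::Mat& mask,
                     int hBins = 30, int sBins = 32)
{
    cv::Mat hsv;
    cv::cvtColor(bgr, hsv, cv::COLOR_BGR2HSV);
    const int channels[] = {0, 1};               // H and S channels
    const int histSize[] = {hBins, sBins};
    float hRange[] = {0, 180}, sRange[] = {0, 256};
    const float* ranges[] = {hRange, sRange};
    cv::Mat hist;
    cv::calcHist(&hsv, 1, channels, mask, hist, 2, histSize, ranges);
    cv::normalize(hist, hist, 1.0, 0.0, cv::NORM_L1);  // bins sum to 1
    return hist;
}
```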

With the contour as the boundary, each of the m measurement lines li (i = 1, 2, …, m) can be separated into two parts, the inner part and the outer part of the normal line, as shown in Fig. 2. The inner part reflects characteristics of the target, while the outer part often lies in the cluttered background, so using the inner parts of the measurement lines can strengthen the observation model during tracking.

Fig. 2 Inner-contour model

For a single measurement line li, the histogram of its inner part is hi = {hi,u}u = 1…q, and the corresponding reference histogram is ri = {ri,u}u = 1…q; both histograms are computed over the 1 × L/2 inner segment. The similarity of the two histograms can be measured by the Bhattacharyya distance:

$$ {d}_i=\sqrt{1-\rho \left[{h}_i,{r}_i\right]} $$
(8)

where \( \rho \left[{h}_i,{r}_i\right]=\sum \limits_{u=1}^q\sqrt{h_{i,u}{r}_{i,u}} \) is the Bhattacharyya coefficient. Then, the local color likelihood of li is:

$$ {p}_{i, LC}\left(z|x\right)=\frac{1}{\sqrt{2\pi}{\delta}_{LC}}{e}^{-\frac{d_i^2}{2{\delta}_{LC}^2}} $$
(9)
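A small sketch of Eqs. (8) and (9): for L1-normalized histograms, OpenCV's HISTCMP_BHATTACHARYYA comparison equals the distance of Eq. (8), and a Gaussian kernel turns it into the local color likelihood. The bandwidth value below is an illustrative assumption; with the pooled inner histogram as input, the same routine also yields the global color likelihood of Eqs. (12) and (13).

```cpp
#include <cmath>
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>

// Local color likelihood of one measurement line, Eqs. (8)-(9).
double localColorLikelihood(const cv::Mat& h,      // inner-part histogram h_i
                            const cv::Mat& r,      // reference histogram r_i
                            double deltaLC = 0.2)  // bandwidth (illustrative)
{
    // For L1-normalized histograms this equals sqrt(1 - rho), i.e., Eq. (8).
    const double d = cv::compareHist(h, r, cv::HISTCMP_BHATTACHARYYA);
    return std::exp(-d * d / (2.0 * deltaLC * deltaLC))
           / (std::sqrt(2.0 * CV_PI) * deltaLC);   // Eq. (9)
}
```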

The new likelihood for measurement line li, combining the gradient feature and the local color feature, can be written as:

$$ {p}_{l_i}={p}_i\left(n;z|v={v}_i\right){p}_{i, LC}\left(z|x\right) $$
(10)

The likelihood of the whole contour, combining all measurement lines, then becomes:

$$ p\left(z|x\right)=\prod \limits_{i=1}^m{p}_{l_i} $$
(11)

However, the local color information of each measurement line alone cannot provide the overall color distribution of the contour target. To address this problem, all the inner-part measurement lines are combined to construct a global color histogram, as shown in Fig. 3.

Fig. 3 Global histogram of inner normals

The histogram over all m measurement lines li (i = 1, 2, …, m) is H = {Hu}u = 1…q, and the corresponding reference histogram is Q = {Qu}u = 1…q; both histograms are computed over the 1 × L/2 inner segments. The similarity of the two histograms can again be measured by the Bhattacharyya distance:

$$ D=\sqrt{1-\rho \left[H,Q\right]} $$
(12)

where \( \rho \left[H,Q\right]=\sum \limits_{u=1}^q\sqrt{H_u{Q}_u} \) is the Bhattacharyya coefficient. Then, the global color likelihood of the contour is:

$$ {p}_{GC}\left(z|x\right)=\frac{1}{\sqrt{2\pi}{\delta}_{GC}}{e}^{-\frac{D^2}{2{\delta}_{GC}^2}} $$
(13)

The final likelihood of the contour, which combines the gradient information, local color information, and global color information, can be expressed as follows:

$$ p\left(z|x\right)={p}_{GC}\left(z|x\right)\prod \limits_{i=1}^m{p}_{l_i} $$
(14)
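Putting the pieces together, Eq. (14) is simply a product of the terms computed above. A minimal sketch, assuming the per-line edge and local color likelihoods and the global color likelihood have already been evaluated for one particle:

```cpp
#include <vector>

// Fused observation likelihood of one contour hypothesis, Eq. (14).
double fusedLikelihood(const std::vector<double>& edge,       // p_i,    Eq. (6)
                       const std::vector<double>& localColor, // p_i,LC, Eq. (9)
                       double globalColor)                    // p_GC,   Eq. (13)
{
    double p = globalColor;
    for (std::size_t i = 0; i < edge.size(); ++i)
        p *= edge[i] * localColor[i];  // per-line term p_{l_i} of Eq. (10)
    return p;
}
```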

4 Results and discussion

4.1 Experiment setting

To demonstrate the effectiveness and robustness of the proposed tracking scheme, seven color videos were used in our experiments: six were acquired indoors and outdoors with a SONY EX-FCB48 CCD camera, and the seventh was taken from the public tracking dataset of Babenko et al. [14]. These videos contain several challenging conditions, such as partial or total occlusion and similar objects in the background. The ground-truth contours of the object in all test videos were marked manually frame by frame. For all videos, the target object was manually selected in the first frame.

The control points of the three particle filter–based algorithms (UKPF, MHMM-UPF, and the proposed algorithm) were generated randomly according to formula (5), and the dynamic parameters used in the formula are listed in Table 2.

All the algorithms were implemented in C++ using the OpenCV library and run on a 1.8 GHz Pentium Dual-Core CPU with 2 GB of DDR memory.

The proposed tracking method was compared with UKPF, MHMM-UPF, and HT algorithms, and the tracking results of each algorithm were marked with different colors to demonstrate the differences. The parameters used in the experiments are shown in Tables 1 and 2.

Table 1 Test videos information and parameters used in the experiments
Table 2 Parameter values used in the experiments

In the following subsections, we first present the tracking results of all the algorithms on the same test videos, drawn as curves of different colors, and then give detailed evaluations and comparisons to demonstrate the effectiveness of the algorithms.

4.2 Performance and results overview

For comparison, we first implemented all the algorithms separately, and recorded the related data. Then, we redrew all the curves in the same video with different colors in order to observe the differences between the algorithms, as shown in Figs. 4, 5, 6, 7, 8, 9, and 10.

Fig. 4 Tracking result of “Hand1.” Yellow—UKPF, white—MHMM-UPF, red—Hough Tracker, green—proposed method

Fig. 5 Tracking result of “Hand3.” Yellow—UKPF, white—MHMM-UPF, red—Hough Tracker, green—proposed method

Fig. 6 Tracking result of “Body.” Yellow—UKPF, white—MHMM-UPF, red—Hough Tracker, green—proposed method

Fig. 7 Tracking result of “Leaf1.” Yellow—UKPF, white—MHMM-UPF, red—Hough Tracker, green—proposed method

Fig. 8 Tracking result of “Leaf2.” Yellow—UKPF, white—MHMM-UPF, red—Hough Tracker, green—proposed method

Fig. 9 Tracking result of “Taxi.” Yellow—UKPF, white—MHMM-UPF, red—Hough Tracker, green—proposed method

Fig. 10 Tracking result of “David.” Yellow—UKPF, white—MHMM-UPF, red—Hough Tracker, green—proposed method

The test videos “Hand1” and “Hand3” were used to test tracking performance under cluttered backgrounds with abundant edge features and obvious affine transformations as the hand moved. In “Hand1,” the collars, cuffs, pockets, and wrinkles in clothes made the interfering background edge features very noticeable. In “Hand3,” the background was even more chaotic, with many dense edge features, as shown in Fig. 5. Therefore, stable tracking would be difficult if only the edge feature were used. As shown in the 285th, 361st, and 373rd frames in Fig. 4 and the 291st, 488th, and 514th frames in Fig. 5, the tracking results of UKPF and MHMM-UPF drifted to the background, while in the early frames of the HT result the contours were not close to the real target, because the HT algorithm needed time to segment the target, as shown in the 43rd frame in Fig. 4. With the proposed inner-contour model, tracking was more reliable and robust thanks to the local and global color information, as shown by the green curves in Figs. 4 and 5. At the same time, the algorithm responded to affine deformations of the target promptly and accurately during tracking. In Fig. 4, the target moved close to the camera and then away from it, so that it grew from small to large and shrank again, and it also rotated through a large angle; throughout all these affine transformations, the proposed method maintained good tracking performance, while UKPF and MHMM-UPF did not.

The test video “Body” was used to test the robustness and effectiveness of the algorithm in complex dynamic backgrounds with similar-target interference and occlusion. In the “Body” video, a cartoon movie played on the projection screen in the background, so the edge and color features of the background changed simultaneously, and another person walked in front of the target, as shown in Fig. 6. When only the edge feature was used to track human contours, the tracker was susceptible to the dynamic background and similar-target occlusions, resulting in tracking failures, as shown in the 174th, 177th, and 183rd frames of Fig. 6. When combined with the inner-contour information, the significant differences between the clothes worn by the two people made it easier to distinguish the two targets, even under partial occlusion, so the target could be tracked accurately. Although the tracking contours of HT were not as accurate as those of the proposed method, its target center positions were the most accurate among all the tested algorithms.

The test videos “Leaf1” and “Leaf2” were used to test tracking performance when a large number of similar targets appeared in the background. These two test videos were the most challenging, because the target was one bunch among many nearly identical leaves fluttering in the wind, and the system dynamic model was more complicated, so stable tracking was difficult to achieve. As shown in the 118th, 121st, and 130th frames of Fig. 7 and the 377th, 411th, and 466th frames of Fig. 8, when only the edge information was used, the tracker easily drifted to other leaves during the intense motion and deformation caused by the wind. When the inner-contour model was adopted, the tracker was more stable under the same conditions: it not only accurately tracked the position of the target but also followed the correct affine changes as the leaves swung and deformed. The HT tracker did not perform well on these two test videos because it was hard to segment the real target among so many similar leaves around it.

The test video “Taxi” was used to test tracking performance when the target moved quickly against a simple background, as shown in Fig. 9. In this test, all the algorithms tracked the target well, and the HT tracker performed better than the three particle filter–based algorithms, because with such a simple background it was relatively easy to segment the real target precisely.

The test video “David” is a well-known sequence used to test the robustness of tracking algorithms when the environment luminance changes significantly. As shown in Fig. 10, the experimental results showed that the proposed method and HT tracked the person well from the first frame to the end, while UKPF and MHMM-UPF lost the target as the luminance changed.

In order to evaluate the tracking performance differences of the algorithms more objectively and accurately, three measures were used in this paper: the Euclidean distance of the center-of-gravity coordinates, the average Euclidean distance of the control points, and the algorithm time. The center-of-gravity Euclidean distance is the Euclidean distance between the center of gravity of the target contour calculated by the algorithm and that of the ground truth; it characterizes the accuracy of the overall position of the contour. The average Euclidean distance of the control points is the average Euclidean distance between each control point of the calculated target contour and the corresponding ground-truth point; it characterizes the accuracy of the tracked contour location. The algorithm time measures the time spent by each algorithm tracking different targets; it characterizes the execution efficiency and real-time performance of the algorithm.
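A minimal sketch of the two accuracy measures, assuming each contour is available as an ordered list of control points (all names and types here are illustrative):

```cpp
#include <cmath>
#include <vector>
#include <opencv2/core.hpp>

static double dist(const cv::Point2d& a, const cv::Point2d& b)
{
    return std::hypot(a.x - b.x, a.y - b.y);
}

// Euclidean distance between the centers of gravity of two contours.
double centerOfGravityError(const std::vector<cv::Point2d>& tracked,
                            const std::vector<cv::Point2d>& truth)
{
    auto centroid = [](const std::vector<cv::Point2d>& pts) {
        cv::Point2d c(0.0, 0.0);
        for (const auto& p : pts) c += p;
        return cv::Point2d(c.x / pts.size(), c.y / pts.size());
    };
    return dist(centroid(tracked), centroid(truth));
}

// Average Euclidean distance between corresponding control points.
double meanControlPointError(const std::vector<cv::Point2d>& tracked,
                             const std::vector<cv::Point2d>& truth)
{
    double sum = 0.0;
    for (std::size_t i = 0; i < tracked.size(); ++i)
        sum += dist(tracked[i], truth[i]);
    return sum / tracked.size();
}
```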

As shown in Fig. 11, the left plot shows the difference between the horizontal position of the center of gravity of the tracked contour and the ground truth for the “Hand1” video, for each algorithm, and the right plot shows the corresponding difference in the vertical position. Figure 12 shows the per-frame Euclidean distance between the center of gravity of the control points and the ground truth. From these figures, it can be seen that the proposed algorithm performed best among all the tested algorithms.

Fig. 11 Tracking result of target center of gravity of “Hand1.” Left—horizontal center of gravity; right—vertical center of gravity

Fig. 12 Tracking result of Euclidean distance of “Hand1”

For performance evaluation and comparison, the three PF-based algorithms (UKPF, MHMM-UPF, and the proposed algorithm) were tested with different numbers of particles: 100, 150, 200, 250, and 300. The tracking results are shown in Table 3. The tracking performance improves as the number of particles increases and plateaus once the number of particles exceeds 200.

Table 3 Average Euclidean distance using different particle numbers

The average Euclidean distances over all the test videos can be seen in Table 3, and the time consumption in Table 4. In terms of average Euclidean distance, the proposed algorithm and the HT tracker outperformed UKPF and MHMM-UPF; the proposed algorithm had the smallest average Euclidean distance on the test videos “Hand1,” “Hand3,” “Leaf1,” and “Leaf2,” while the HT tracker performed better on “Body,” “Taxi,” and “David.”

The difference in Euclidean distance between the four algorithms was most obvious in the test video “Leaf2,” followed by “Hand1” and “Hand3,” and was smallest in “Taxi.” The reason for this is that the leaf videos contained abundant edge information and several similar targets, and the pan-tilt-zoom (PTZ) camera underwent intense motion in three degrees of freedom. When only edge information was used, the calculated particle weights could be similar at many positions, so the effect of resampling was not obvious and the estimated target position deviated substantially. When the color information was introduced, the small differences between the target and the surrounding leaves increased the weights of particles on the target and decreased the weights of particles on background edges; after resampling, more particles moved closer to the target leaves, and the tracking position became more accurate. In the “Body” video, there were few distinct edge features around the body, so the edge features alone could distinguish the target from the background well, and the differences between the algorithms were very small.

From the perspective of time consumption (Table 4), the HT tracker was the most time-consuming: it took much more time than the other three algorithms due to its complex segmentation algorithm and the shape constraints it imposes. Among the three particle filter–based algorithms, the proposed algorithm was more time-consuming than UKPF but less than MHMM-UPF; the differences were very small, however, about 1 to 2 ms. At the same time, the time consumption of each algorithm increased with the number of control points. If only the coordinate transformation of the control points were considered, the algorithm time would be only slightly affected by the number of control points thanks to the shape-space representation; the main influencing factor was the time needed to draw the B-spline curve. With 50 equal divisions between every two control points, the time to draw the B-spline curve increased approximately linearly with the number of control points, as shown in Fig. 13. With 18 control points (test video “Hand3”), the average algorithm time was 40.26 ms, which basically meets real-time requirements; the real-time performance could be further improved with fast spatial-temporal mechanisms and algorithms, as mentioned in [15,16,17].

Table 4 Time consumption (ms) of tested algorithms
Fig. 13 The relationship between time consumption and the number of control points for test video “Hand3”

5 Conclusions

This paper has presented a visual contour tracking method based on a particle filter with an inner-contour model for complex backgrounds. The method fuses the gradient feature, local color feature, and global color feature naturally to achieve robust contour tracking in cluttered environments. Specifically, the proposed algorithm first uses a Sobel edge detector to detect edge information along the normal lines of the contour, and then samples the inner part of the normal lines to obtain local color information, which is combined with the edge information to construct a new normal-line likelihood. After that, all the inner color information is used to construct a global color likelihood. Finally, the edge information, local color information, and global color information are fused into a new observation likelihood. The experimental results demonstrate that, compared with a gradient-only method, the proposed algorithm is effective and robust in dealing with cluttered backgrounds, computationally efficient, and able to run fully in real time.

The proposed algorithm was inspired by the gradient-only contour tracking method (UKPF) and achieved better results; it may also be helpful for other tracking methods that need to fuse multiple cues in cluttered backgrounds.

Availability of data and materials

None

Abbreviations

HT:

Hough Tracking

MHMM-UPF:

Multicue Hidden Markov Model Unscented Particle Filter

UKPF:

Unscented Kalman Particle Filter

References

  1. A. Yilmaz, O. Javed, M. Shah, Object tracking: a survey. ACM Comput. Surv. 38(4), 13 (2006)

  2. M. Isard, A. Blake, CONDENSATION—conditional density propagation for visual tracking. Int. J. Comput. Vis. 29(1), 5–28 (1998)

  3. P. Li, T. Zhang, A.E.C. Pece, Visual contour tracking based on particle filters. Image Vis. Comput. 21(1), 111–123 (2003)

  4. A.W.M. Smeulders et al., Visual tracking: an experimental survey. IEEE Trans. Pattern Anal. Mach. Intell. 36(7), 1442–1468 (2014)

  5. P. Lv et al., Multiple cues-based active contours for target contour tracking under sophisticated background. Vis. Comput. 33(9), 1103–1119 (2016)

  6. M. Isard, A. Blake, Contour tracking by stochastic propagation of conditional density, in European Conference on Computer Vision (1996)

  7. N. Peterfreund, Robust tracking of position and velocity with Kalman snakes. IEEE Trans. Pattern Anal. Mach. Intell. 22(6), 564–569 (1999)

  8. F.Y. Shih, K. Zhang, Locating object contours in complex background using improved snakes. Comput. Vis. Image Underst. 105(2), 93–98 (2007)

  9. P. Chockalingam, N. Pradeep, S. Birchfield, Adaptive fragments-based tracking of non-rigid objects using level sets, in IEEE International Conference on Computer Vision (2009)

  10. Y. Chen, Y. Rui, T.S. Huang, Multicue HMM-UKF for real-time contour tracking. IEEE Trans. Pattern Anal. Mach. Intell. 28(9), 1525–1529 (2006)

  11. M. Godec, P.M. Roth, H. Bischof, Hough-based tracking of non-rigid objects. Comput. Vis. Image Underst. 117(10), 1245–1256 (2013)

  12. J. MacCormick, A. Blake, A probabilistic exclusion principle for tracking multiple objects, in Proceedings of the Seventh IEEE International Conference on Computer Vision (1999)

  13. J. MacCormick, Stochastic Algorithms for Visual Tracking (Springer, London, 2002)

  14. B. Babenko, M.H. Yang, S. Belongie, Visual tracking with online multiple instance learning, in IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2009)

  15. C. Yan et al., Cross-modality bridging and knowledge transferring for image understanding. IEEE Trans. Multimedia (2019)

  16. C. Yan et al., STAT: spatial-temporal attention mechanism for video captioning. IEEE Trans. Multimedia (2019)

  17. C. Yan et al., A fast Uyghur text detector for complex background images. IEEE Trans. Multimedia 20(12), 3389–3398 (2018)


Acknowledgements

We express our most heartfelt gratitude to Ke Xiang for his helpful discussion of and feedback on the algorithm, and to Fangge Lu for proofreading the English writing.

Funding

None

Author information


Contributions

SC implemented the core algorithm, designed all the experiments, addressed the resulting data, and drafted the manuscript. XW participated in the design and construction of the inner-contour model and helped draft the manuscript. All authors have read and approved the final manuscript.

Corresponding author

Correspondence to Songxiao Cao.

Ethics declarations

Ethics approval and consent to participate

Not applicable

Consent for publication

Not applicable

Competing interests

The authors declare that they have no competing interests.


Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.



Cite this article

Cao, S., Wang, X. Visual contour tracking based on inner-contour model particle filter under complex background. J Image Video Proc. 2019, 85 (2019). https://doi.org/10.1186/s13640-019-0487-7
