Computing Reviews

Interacting multiview tracker
Yoon J., Yang M., Yoon K. IEEE Transactions on Pattern Analysis and Machine Intelligence 38(5): 903-917, 2016. Type: Article
Date Reviewed: 02/15/17

Tracking objects in image sequences, in the presence of blurring, occlusion, and variations in illumination and object pose, is an important and challenging research area. Today’s demanding applications, such as autonomous navigation and scene understanding, require the ability to track characteristic image regions accurately, all the more so because poor performance at one point in a sequence can degrade overall accuracy to an unacceptable level. Yoon, Yang, and Yoon present a method for synthesizing the results of multiple object-tracking algorithms to provide the best performance, where the synthesis is done dynamically from probabilistic information on the likely accuracy of each tracker. Two of the authors coauthored a 2012 conference paper [1] disclosing the method; the current journal paper gives more detail, additional results, and extended work.

The Multiview method assumes the existence of multiple individual trackers, each with at least reasonable performance, that differ in the feature representation used to compute probable tracking-vector values. These tracking vectors encode the object’s position, angle, scale, aspect ratio, and skew direction. Suitable trackers exist in the literature and are briefly described here. For example, the histogram of oriented gradients (HOG) may be used to model the object in a pose-robust manner, while representations based on Haar wavelets show promise for robustness in the presence of occlusions. These two examples suggest that dynamically combining the results of multiple trackers may cope well with unpredictable image degradation, and that simple high-level approaches, such as averaging results, will be ineffective.
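To make the contrast between feature representations concrete, here is a minimal, illustrative sketch (not the paper's implementations, and all function names are my own): two appearance models over a grayscale patch, each producing a normalized histogram that a tracker could compare against its object template, for example via the Bhattacharyya coefficient. A HOG-style orientation histogram degrades differently under occlusion than a raw intensity histogram, which is exactly why no single representation suffices.

```python
import math

def intensity_histogram(patch, bins=8):
    """Normalized histogram of raw pixel intensities (0-255)."""
    hist = [0.0] * bins
    for row in patch:
        for v in row:
            hist[min(int(v * bins / 256), bins - 1)] += 1.0
    total = sum(hist)
    return [h / total for h in hist]

def orientation_histogram(patch, bins=8):
    """Magnitude-weighted histogram of gradient orientations,
    the core idea behind HOG (computed here over one whole patch,
    without HOG's cell/block normalization)."""
    hist = [0.0] * bins
    rows, cols = len(patch), len(patch[0])
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            gx = patch[y][x + 1] - patch[y][x - 1]
            gy = patch[y + 1][x] - patch[y - 1][x]
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue
            theta = math.atan2(gy, gx) % math.pi  # unsigned orientation
            hist[min(int(theta / math.pi * bins), bins - 1)] += mag
    total = sum(hist) or 1.0
    return [h / total for h in hist]

def bhattacharyya(p, q):
    """Similarity in [0, 1] between two normalized histograms;
    an observation likelihood could be derived from this score."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))
```

A tracker built on `intensity_histogram` stays stable under rotation but fails when lighting shifts, while one built on `orientation_histogram` shows the opposite behavior, so their errors are complementary rather than redundant.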

The authors propose a framework in which results from multiple trackers have likelihoods calculated at each new frame. These likelihoods are based on prior likelihoods (via interaction probability calculation using a particle filter) and a dynamic motion model that is specific to each tracker. Because the different representations are not independent, they use a transition probability matrix to model the interaction between the behaviors of the various trackers. A good description of the method, including pseudocode, is included in the paper.
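The interaction step can be sketched in the style of an interacting multiple model (IMM) filter. This is a simplified, hypothetical illustration of the general idea, not the authors' algorithm: prior tracker weights are mixed through a transition probability matrix, updated by each tracker's measurement likelihood for the new frame, and the fused state is a weighted combination of the individual tracking vectors.

```python
def mix_weights(prior, transition):
    """Predicted weight of tracker j: sum_i prior[i] * transition[i][j].
    The transition matrix models how trust shifts between trackers."""
    n = len(prior)
    return [sum(prior[i] * transition[i][j] for i in range(n))
            for j in range(n)]

def update_weights(predicted, likelihoods):
    """Bayes update: weight each tracker by its measurement likelihood
    for the current frame, then renormalize."""
    unnorm = [w * l for w, l in zip(predicted, likelihoods)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

def fuse_states(weights, states):
    """Fused tracking vector: weighted average of each component
    (position, angle, scale, ...) across trackers."""
    dims = len(states[0])
    return [sum(w * s[d] for w, s in zip(weights, states))
            for d in range(dims)]
```

For example, with two trackers, a sticky transition matrix `[[0.9, 0.1], [0.2, 0.8]]`, and frame likelihoods of 0.2 and 0.8, the second tracker's weight rises above the first's within a single update, so an occlusion-robust tracker can quickly take over when an appearance-based one fails.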

Results of the Multiview tracker are presented for a variety of image sequences of low to moderate resolution; notably, high resolution is not required for good tracking performance. Up to 11 different published trackers were used as input to the Multiview framework. The results are encouraging, particularly in the presence of various types of degradation such as motion blur, scale change, and occlusion. The proposed method was superior to all single trackers, except in cases where the deviation from ground truth is small.

As mentioned, this paper elaborates and extends a 2012 paper on the same subject; there is significantly more and newer material here. The paper is fairly readable, given the dense mathematics involved in the method. I was confused by the choice of red, green, and blue to depict the findings and results of the three trackers explored in depth; some readers will associate these plots and graphics with the three color channels (though the method is explicitly monochrome). In spite of this, the reporting of results is well done and easy to follow; the comparison with state-of-the-art trackers is a model for others to follow. Those interested in tracking algorithms for difficult conditions and/or high-accuracy requirements should read this paper carefully.


1) Yoon, J. H.; Kim, D. Y.; Yoon, K.-J. Visual tracking via adaptive tracker selection with multiple features. In Computer Vision -- ECCV 2012 (LNCS 7575). Springer, 2012, 28–41.

Reviewer: Creed Jones Review #: CR145065 (1705-0314)
