In this era of heightened security concerns, computer-aided and fully automatic surveillance applications represent a key challenge to computer vision researchers. To that end, Xiang and Gong present an approach for unsupervised learning of normal behavior patterns in a given setting, pairing it with a method for anomaly detection. The approach performs unsupervised clustering to define its “behavior patterns,” segments of representative video on which behavior models are learned. The behavior models are the authors’ multi-object hidden Markov models, which have the advantage of far fewer free parameters than standard hidden Markov models; the models are then clustered to define the normal behavior classes. Online classification is a two-step process. First, the event is tested for anomalies (based on a threshold). Second, it is classified with likelihood ratio tests into one of the discovered normal classes.
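To make the two-step online classification concrete, the following is a minimal sketch and not the authors' implementation. It assumes each discovered normal class already yields a log-likelihood for the observed event; the function name `classify_event`, both threshold parameters, and the use of the best class's log-likelihood as the normality score are hypothetical simplifications chosen for illustration.

```python
def classify_event(log_likelihoods, anomaly_threshold, ratio_threshold=0.0):
    """Two-step decision over per-class log-likelihoods (illustrative only).

    log_likelihoods: dict mapping a normal-class label to the event's
        log-likelihood under that class's model (assumed precomputed).
    anomaly_threshold: hypothetical cutoff on the best log-likelihood.
    ratio_threshold: hypothetical minimum log-likelihood ratio between
        the best and second-best classes.
    """
    # Rank classes from most to least likely.
    ranked = sorted(log_likelihoods.items(), key=lambda kv: kv[1], reverse=True)
    best_class, best_ll = ranked[0]

    # Step 1: anomaly test. If even the best-fitting normal class
    # explains the event poorly, flag it as an anomaly.
    if best_ll < anomaly_threshold:
        return "anomaly"

    # Step 2: likelihood ratio test. Accept the best class only if it
    # beats the runner-up by a sufficient margin (difference of logs).
    if len(ranked) > 1:
        log_ratio = best_ll - ranked[1][1]
        if log_ratio < ratio_threshold:
            return "ambiguous"
    return best_class
```

In this sketch both thresholds are free parameters that must be supplied by the user, which is the same kind of choice the paper leaves to each application.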
Although the authors’ goal is commendable, the paper is lacking in some respects. First, the approach is complex and involves numerous interdependent processing steps, such as segmentation and clustering, which makes analysis difficult. For example, readers are unable to determine which processing module may be responsible for certain errors. Second, the method uses an off-the-shelf video segmentation engine to build the underlying behavior patterns. The robustness of this segmentation is crucial to the success of the method, but the paper does not analyze it. Third, many of the algorithmic steps rest on unjustified assumptions, such as Gaussian distributions. Fourth, the anomaly detection is ultimately performed with an ad hoc threshold that “should be set according to the detection and false alarm rates required by each particular surveillance application.” Readers expect at least some underlying theoretical rigor for such an important part of the paper.