Video classification is a challenging problem. One difficulty is that "there is [a] very limited amount of training data with manual annotations in the video domain." A two-stream convolutional neural network (CNN), with one stream operating on static frames (appearance) and the other on temporal motion, has been proposed to tackle this problem. The authors of the paper conduct a comparative study to demonstrate the competitive performance of the two-stream CNN against state-of-the-art methods.
The authors explain: "state-of-the-art video classification systems are usually built on top of multiple discriminative feature representations." These features are usually handcrafted. CNNs, by contrast, learn feature representations directly from raw data. In this study, the authors examine several design choices, "including [neural] network architectures, model fusion, learning parameters, and the final prediction methods." The network architectures considered include CNN_M and VGG_19; the fusion strategies include model fusion and modality fusion; and the learning parameters include the learning rate, dropout ratio, and number of training iterations. Combinations of these choices form the experimental settings. The authors use two datasets in their experiments. One, UCF-101, "consists of 13,320 video clips"; "there are 101 annotated classes that can be divided into five types." The other, Columbia Consumer Videos (CCV), "contains 9,317 YouTube videos annotated according to 20 classes." The results are compared against four previous studies on each dataset. The two-stream CNN outperforms the methods in these previous studies, especially on the CCV dataset.
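To make the fusion idea concrete, the sketch below illustrates late fusion, a common way to combine a two-stream network's outputs: each stream produces per-class scores, which are converted to probabilities and averaged with a fusion weight. The class labels, scores, and the 0.5 weight are hypothetical, chosen only for illustration; the paper's own fusion strategies (model fusion and modality fusion) may differ in detail.

```python
import math

def softmax(scores):
    """Convert raw per-class scores to probabilities."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def late_fusion(spatial_scores, temporal_scores, weight=0.5):
    """Weighted average of the two streams' class probabilities (late fusion)."""
    spatial_probs = softmax(spatial_scores)
    temporal_probs = softmax(temporal_scores)
    return [weight * a + (1 - weight) * b
            for a, b in zip(spatial_probs, temporal_probs)]

# Hypothetical per-class scores for a five-class example (made up for illustration)
spatial = [2.0, 0.5, 0.1, -1.0, 0.3]    # static-frame (appearance) stream
temporal = [1.5, 2.2, 0.0, -0.5, 0.1]   # temporal-motion stream
fused = late_fusion(spatial, temporal)
predicted = fused.index(max(fused))     # index of the highest fused probability
```

Varying the fusion weight is one of the "model fusion" knobs such a study can explore alongside the learning rate and dropout ratio.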
The paper is very well written, and the project described should be of high interest to researchers working in the area of video classification.