Computing Reviews
Evaluating two-stream CNN for video classification
Ye H., Wu Z., Zhao R., Wang X., Jiang Y., Xue X. ICMR 2015 (Proceedings of the 5th ACM International Conference on Multimedia Retrieval, Shanghai, China, Jun 23-26, 2015), 435-442. 2015. Type: Proceedings
Date Reviewed: Aug 4 2015

Video classification is a challenging problem. One of the difficulties is that “there is [a] very limited amount of training data with manual annotations in the video domain.” A two-stream convolutional neural network (CNN), with one stream working on static frames and the other on temporal motion, has been proposed to tackle this problem. The authors of the paper conduct a comparative study to show that two-stream CNN is competitive with state-of-the-art methods.

The authors explain: “state-of-the-art video classification systems are usually built on top of multiple discriminative feature representations.” These features are usually handcrafted. A CNN is a more recent approach in which the features are learned directly from raw data rather than designed by hand. In this study, the authors consider many parameters in CNN, “including [neural] network architectures, model fusion, learning parameters, and the final prediction methods.” The network architecture choices presented in the study include CNN_M and VGG_19; fusion strategies include model fusion and modality fusion; and learning parameters include learning rate, dropout ratio, and the number of training iterations. Combinations of these parameters form the experimental settings. The authors use two datasets in their experiments. The first, UCF-101, “consists of 13,320 video clips”; “there are 101 annotated classes that can be divided into five types.” The second, Columbia Consumer Videos (CCV), “contains 9,317 YouTube videos annotated according to 20 classes.” The results are compared against four previous studies on each dataset. Two-stream CNN performs better than the methods in these previous studies, especially on the CCV dataset.
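
As a rough illustration of what fusing the two streams can mean, the sketch below combines per-class scores from a spatial (static frame) stream and a temporal (motion) stream and picks the highest-scoring class. The review does not describe the exact fusion scheme used in the paper, so the weighted-average rule, the function names (softmax, fuse_streams, predict_class), the spatial_weight parameter, and the example scores are all assumptions made for illustration only.

```python
import math


def softmax(logits):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_streams(spatial_logits, temporal_logits, spatial_weight=0.5):
    """Weighted average of the two streams' class probabilities.

    Assumption: simple late fusion by averaging; the paper's own
    fusion strategies may differ.
    """
    if len(spatial_logits) != len(temporal_logits):
        raise ValueError("both streams must score the same classes")
    if not 0.0 <= spatial_weight <= 1.0:
        raise ValueError("spatial_weight must lie in [0, 1]")
    spatial_probs = softmax(spatial_logits)
    temporal_probs = softmax(temporal_logits)
    w = spatial_weight
    return [w * s + (1.0 - w) * t
            for s, t in zip(spatial_probs, temporal_probs)]


def predict_class(spatial_logits, temporal_logits, spatial_weight=0.5):
    """Return the index of the class with the highest fused score."""
    fused = fuse_streams(spatial_logits, temporal_logits, spatial_weight)
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical scores for three classes from each stream.
spatial_scores = [2.0, 1.0, 0.0]
temporal_scores = [0.0, 1.0, 3.0]
fused_scores = fuse_streams(spatial_scores, temporal_scores)
best_class = predict_class(spatial_scores, temporal_scores)
```

With equal weights, the more confident stream dominates the fused prediction; moving spatial_weight toward 1.0 or 0.0 reduces the rule to a single stream, which makes it easy to compare a fused model against each stream on its own.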

The paper is very well written, and the project described should be of high interest to researchers working in the area of video classification.

Reviewer:  Xiannong Meng Review #: CR143666 (1510-0908)
Classifier Design And Evaluation (I.5.2 ...)
Content Analysis And Indexing (H.3.1)
