Computing Reviews
Dense 3D-convolutional neural network for person re-identification in videos
Liu J., Zha Z., Chen X., Wang Z., Zhang Y.  ACM Transactions on Multimedia Computing, Communications, and Applications 15 (1s): 1-19, 2019. Type: Article
Date Reviewed: Apr 19 2019

It is well known that current neural networks perform quite well at identifying faces in (still) images. But what about re-identifying moving pedestrians in non-overlapping video sequences taken from different cameras?

The paper’s novel approach increases the accuracy of re-identification (here: rank-1 recognition rate and mean average precision) by 40.8 percent and 4.2 percent, respectively, compared to the best alternative video-based and image-based algorithms. This is essentially achieved by the following two innovative design choices for the authors’ 56-layer (generalized) convolutional neural network with 3.2 million trainable parameters.

First, the authors compose the whole network of four so-called dense 3D blocks, in which each layer is connected in a feed-forward fashion to all(!) subsequent layers in the same block (as opposed to just the immediately following layer). This design was chosen to “enlarge the receptive ... neurons in both spatial and temporal dimensions,” enabling the network to discriminate “short-term and long-term motion patterns.”
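
To make the dense connectivity pattern concrete, here is a minimal sketch in plain Python. It is not the authors' network: it replaces the 3D convolutions with arbitrary callables and only shows how every layer in a block receives the concatenated outputs of all earlier layers. The names `dense_block_channels`, `dense_block_forward`, and `growth_rate`, as well as all the numbers used, are illustrative assumptions and are not taken from the paper.

```python
def dense_block_channels(in_channels, growth_rate, num_layers):
    """Channel bookkeeping for one dense block (hypothetical sizes).

    Each layer is assumed to emit `growth_rate` new channels, and its input
    is the concatenation of the block input and all earlier layer outputs.
    Returns the input-channel count of each layer and the block's output count.
    """
    per_layer_inputs = []
    channels = in_channels
    for _ in range(num_layers):
        per_layer_inputs.append(channels)
        channels += growth_rate
    return per_layer_inputs, channels


def dense_block_forward(x, layers):
    """Toy forward pass: `x` is a flat list of features, each layer a callable.

    A real dense 3D block would concatenate feature maps along the channel
    axis; flat Python lists stand in for that here.
    """
    features = [x]
    for layer in layers:
        # Every layer sees everything produced so far in this block.
        concatenated = [v for f in features for v in f]
        features.append(layer(concatenated))
    return [v for f in features for v in f]


if __name__ == "__main__":
    print(dense_block_channels(in_channels=16, growth_rate=12, num_layers=4))
    # Stand-in "layers" that each emit one feature: the sum of their inputs.
    toy_layers = [lambda v: [sum(v)]] * 3
    print(dense_block_forward([1.0, 2.0], toy_layers))
```

In this sketch the input width of each layer grows linearly with depth, which is the price paid for giving later layers direct access to all earlier features in the block.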

Second, they extend the classical identification loss function with a second term aiming to minimize center loss. That is, the algorithm maintains the “centers” of the training samples of each class and uses these center points for feature embedding.
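
The combined loss can be sketched as follows. The review does not give the paper's exact formula, so this example assumes a common formulation: softmax cross-entropy as the identification loss, plus half the squared Euclidean distance between a sample's feature vector and the center of its class, scaled by a weighting factor. The function names and the `weight` parameter are hypothetical, and the updating of the centers during training is omitted.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample (the identification loss)."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def center_loss(feature, label, centers):
    """Half the squared distance between a feature and its class center."""
    center = centers[label]
    return 0.5 * sum((f - c) ** 2 for f, c in zip(feature, center))


def joint_loss(logits, feature, label, centers, weight=0.01):
    """Identification loss plus weighted center loss (weight is an assumption)."""
    return cross_entropy(logits, label) + weight * center_loss(feature, label, centers)


if __name__ == "__main__":
    centers = {0: [0.0, 0.0], 1: [1.0, 1.0]}  # one center per identity
    print(joint_loss(logits=[2.0, 0.5], feature=[0.2, -0.1], label=0, centers=centers))
```

The second term pulls features of the same identity toward a shared point, so the embedding becomes more compact within each class while the first term keeps the classes separable.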

As is evident from the decidedly technical description above, the paper is geared toward neural network specialists, as even advanced concepts are used without any explanation in the text. Complemented by highly illustrative graphics and even pseudocode for the overall algorithm, the authors rightfully concentrate on providing all the details necessary for (re)building their nontrivial neural network. And while I have not checked this meticulously, I am confident that they are very close to achieving this goal.

Reviewer:  Christoph F. Strnadl Review #: CR146538
  Editor Recommended
Featured Reviewer
Object Recognition (I.4.8 ... )
Computer Vision (I.5.4 ... )
Neural Nets (I.5.1 ... )
Other reviews under "Object Recognition": Date
Facial expression analysis and expression-invariant face recognition by manifold-based synthesis
Peng Y., Yin H.  Machine Vision and Applications 29(2): 263-284, 2018. Type: Article
Aug 23 2018
Instance-based object recognition in 3D point clouds using discriminative shape primitives
Zhang J., Sun J.  Machine Vision and Applications 29(2): 285-297, 2018. Type: Article
Jun 5 2018
Salient object detection: a discriminative regional feature integration approach
Wang J., Jiang H., Yuan Z., Cheng M., Hu X., Zheng N.  International Journal of Computer Vision 123(2): 251-268, 2017. Type: Article
Nov 1 2017

Reproduction in whole or in part without permission is prohibited.   Copyright © 2000-2019 ThinkLoud, Inc.