Computing Reviews
A discriminatively learned CNN embedding for person reidentification
Zheng Z., Zheng L., Yang Y. ACM Transactions on Multimedia Computing, Communications, and Applications 14(1): 1-20, 2017. Type: Article
Date Reviewed: May 4 2018

To determine whether the same person appears in two images (reidentification), one can use either an identification model (which classifies a person by identity) or a verification model (which classifies whether two images show the same person). Each approach has its own advantages and disadvantages, and the two differ in their inputs, feature extraction, and the loss function used to train them. The verification model forces two images of the same person to be mapped to nearby points in the resulting feature space. In contrast, the identification network tries to identify the person rather than discriminate one person from another. A verification network does not consider the relationship between the given image pair and the other images in the dataset, whereas the identification model tunes its features to classify each person accurately. In their first result, the authors show that using the verification model alone performs worse on the reidentification task than using the identification model alone.

This paper combines the two models to obtain more discriminative features. Specifically, the authors take some well-known image classification networks (such as CaffeNet, VGG16, and ResNet-50), use the input to their last layers as a nonlinear embedding of each image, and feed these embeddings to both models, simultaneously minimizing the identification loss and the verification loss. Thus, for a pair of images, the network predicts the identity of each image and whether the two images show the same person. The authors show that their method improves Rank 1 accuracy by up to 5 to 11 percent (depending on the network) compared to using only the identification model, and by up to 8 to 21 percent compared to using only the verification model.
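As a rough illustration of what minimizing both losses at once involves, the following minimal Python sketch adds an identification term for each image to a verification term for the pair. It is not the authors' implementation: the function names, the `margin` and `weight` parameters, and the contrastive form of the verification term are assumptions made for illustration, and the exact formulation in the paper may differ.

```python
import math

def softmax_cross_entropy(logits, label):
    """Identification loss: negative log-probability of the true identity."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

def contrastive_loss(f1, f2, same, margin=1.0):
    """Verification loss in an assumed contrastive form.

    Pulls embeddings of the same person together and pushes embeddings
    of different people at least `margin` apart (margin is hypothetical).
    """
    d = math.sqrt(sum((a - b) ** 2 for a, b in zip(f1, f2)))
    if same:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2

def joint_loss(logits1, logits2, id1, id2, f1, f2, weight=1.0):
    """Sum of both identification losses and the weighted verification loss."""
    ident = softmax_cross_entropy(logits1, id1) + softmax_cross_entropy(logits2, id2)
    verif = contrastive_loss(f1, f2, same=(id1 == id2))
    return ident + weight * verif
```

Here `logits1` and `logits2` stand for the identity scores predicted for each image, and `f1` and `f2` for the two embeddings; in a real network all four would come from shared convolutional layers, and the gradient of `joint_loss` would update those shared layers.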

The paper is well written, with a well-articulated problem statement and clear coverage of its differences from prior work, the evaluation metrics, and the results and their various aspects. The authors describe both models in detail, including their salient points, and compare the loss functions used by the two models: cross-entropy loss (as the identification loss) and contrastive loss (as the verification loss). The results show that their method achieves 45 percent Rank 1 accuracy even on images from low-resolution cameras. The presentation of the formulas could have been improved; they are given without the kind of explanation that would be useful for a general reader. Overall, though, this is nice work with good results.
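For readers unfamiliar with the Rank 1 figures quoted in this review, the metric can be sketched as the fraction of query images whose nearest gallery image belongs to the same person. The Python sketch below is a simplified illustration only: the function names and the Euclidean distance are assumptions, and standard reidentification benchmarks apply further rules (such as excluding gallery images taken by the same camera as the query) that are omitted here.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two embeddings (an assumed choice of metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank1_accuracy(query_feats, query_ids, gallery_feats, gallery_ids):
    """Fraction of queries whose closest gallery embedding has the same identity."""
    hits = 0
    for feat, pid in zip(query_feats, query_ids):
        # Index of the gallery embedding nearest to this query
        nearest = min(range(len(gallery_feats)),
                      key=lambda i: euclidean(feat, gallery_feats[i]))
        if gallery_ids[nearest] == pid:
            hits += 1
    return hits / len(query_feats)
```

A more discriminative embedding places images of the same person closer together than images of different people, which is what raises this number.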

Reviewer: Rajeev Gupta
Review #: CR146017 (1807-0403)
Editor Recommended
Image Representation (I.4.10)