Computing Reviews
Semi-supervised hybrid clustering by integrating Gaussian mixture model and distance metric learning
Zhang Y., Wen J., Wang X., Jiang Z.  Journal of Intelligent Information Systems 45 (1): 113-130, 2015. Type: Article
Date Reviewed: Jan 20 2016

Zhang et al. propose a semi-supervised clustering algorithm, called SSCGD, that addresses a specific class of such techniques: probabilistic clustering. The algorithm optimizes a given Gaussian mixture model (GMM) by adding, on the one hand, more probabilistic information, and on the other, knowledge derived from the geometrical organization of the labeled and unlabeled elements in the training set. To this end, the authors adapt the original objective function to include Kullback-Leibler divergences among model components and weighted distance measurements among training elements. The expectation-maximization (EM) algorithm is then applied to estimate the new parameters of the model. The work relates to earlier attempts to refine the GMM structure by altering its objective function, such as Laplacian regularized GMM (LapGMM) and locally consistent GMM (LCGMM), or to those based on Jensen-Shannon divergence. Most of them are mentioned in the introductory part, which surveys the state of the art in the design of supervised or hybrid clustering techniques.
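For readers unfamiliar with the divergence term, the Kullback-Leibler divergence between two Gaussian components has a closed form. The following is a minimal sketch of that standard formula only, not the SSCGD objective itself; the function name `gaussian_kl` and its arguments are illustrative assumptions and do not come from the paper.

```python
import numpy as np


def gaussian_kl(mu0, cov0, mu1, cov1):
    """Closed-form KL(N(mu0, cov0) || N(mu1, cov1)) in nats.

    Hypothetical helper for illustration; not taken from the reviewed paper.
    """
    mu0 = np.asarray(mu0, dtype=float)
    mu1 = np.asarray(mu1, dtype=float)
    cov0 = np.asarray(cov0, dtype=float)
    cov1 = np.asarray(cov1, dtype=float)
    d = mu0.shape[0]
    diff = mu1 - mu0
    # Solve linear systems rather than forming the explicit inverse of cov1.
    trace_term = np.trace(np.linalg.solve(cov1, cov0))
    maha_term = diff @ np.linalg.solve(cov1, diff)
    # slogdet returns (sign, log|det|); only the log-determinant is needed.
    _, logdet0 = np.linalg.slogdet(cov0)
    _, logdet1 = np.linalg.slogdet(cov1)
    return 0.5 * (trace_term + maha_term - d + logdet1 - logdet0)
```

The divergence is zero for identical components and is not symmetric in its arguments, which is one reason symmetric alternatives such as the Jensen-Shannon divergence appear in related work.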

The experimental section evaluates the SSCGD algorithm against plain GMM and k-means methods, and against semi-supervised algorithms such as PCK-Means or transductive support vector machine (T-SVM), at varying rates of labeled data. The evaluation procedure is based on an adapted F1 formula. The authors use real-world experimental datasets; one, the Chinese Word Sense Induction dataset, is fully labeled. It would be interesting to know how the labeled data can be obtained in the case of unlabeled datasets.
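The review does not reproduce the adapted F1 formula, so the sketch below shows only one common way of computing an F1 score for a clustering, the pairwise variant, which may differ from the measure the authors use. The function name `pairwise_f1` and its parameters are assumptions made for illustration.

```python
from itertools import combinations


def pairwise_f1(true_labels, cluster_ids):
    """Pairwise F1 of a clustering against ground-truth classes.

    A pair of items counts as a true positive when it shares both a
    cluster and a class. Illustrative only; not the paper's adapted F1.
    """
    if len(true_labels) != len(cluster_ids):
        raise ValueError("label sequences must have the same length")
    tp = fp = fn = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_class = true_labels[i] == true_labels[j]
        same_cluster = cluster_ids[i] == cluster_ids[j]
        if same_cluster and same_class:
            tp += 1
        elif same_cluster:
            fp += 1  # grouped together but from different classes
        elif same_class:
            fn += 1  # same class but split across clusters
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Because it compares pairs rather than cluster names, this score is unaffected by how the clusters happen to be numbered.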

Apart from some careless formulations (for instance, the experimental results “indicate that the SSCGD algorithm to integrated distance metric and Gaussian mixture model in clustering can lead to improvements in cluster quality”), the work demonstrates solid grounding and a keen investigation of new facets of clustering structure, and represents a worthy attempt to enhance classification techniques. These are good reasons for pattern recognition researchers to try it.

Reviewer:  Svetlana Segarceanu Review #: CR144111 (1605-0353)
  Editor Recommended
Clustering (I.5.3)
Numerical Analysis (G.1)

Reproduction in whole or in part without permission is prohibited.   Copyright © 2000-2019 ThinkLoud, Inc.