Computing Reviews
A new random approach for initialization of the multiple restart EM algorithm for Gaussian model-based clustering
Kwedlo W. Pattern Analysis & Applications 18(4): 757-770, 2015. Type: Article
Date Reviewed: Mar 7 2016

Initialization of Gaussian mixture model (GMM) parameters is crucial to the quality of the model because training with the expectation-maximization (EM) algorithm is strongly influenced by the initial values. An accepted way to cope with this sensitivity is to execute multiple runs, each starting from a different random choice of the parameters, and to keep the configuration with the highest performance (multiple restart EM, MREM). The problem is related to the more general issue of initializing finite mixture models, which are extensively applied in many pattern recognition fields, such as image and speech recognition.
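
The multiple-restart idea can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes one-dimensional data, takes "highest performance" to mean highest log-likelihood, and uses hypothetical names (`em_once`, `multiple_restart_em`) and default settings chosen only for illustration.

```python
import math
import random


def _density(x, w, m, v):
    """Weighted 1-D Gaussian density of x for one component."""
    return w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2 * math.pi * v)


def log_likelihood(data, weights, means, variances):
    """Total log-likelihood of the data under the mixture."""
    total = 0.0
    for x in data:
        p = sum(_density(x, w, m, v) for w, m, v in zip(weights, means, variances))
        total += math.log(max(p, 1e-300))  # guard against log(0)
    return total


def em_once(data, k, rng, n_iter=50, min_var=1e-3):
    """One EM run from a random start (assumed: means drawn from the data)."""
    n = len(data)
    mean_all = sum(data) / n
    var_all = sum((x - mean_all) ** 2 for x in data) / n
    means = rng.sample(data, k)
    variances = [max(var_all, min_var)] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            dens = [_density(x, w, m, v) for w, m, v in zip(weights, means, variances)]
            s = sum(dens) or 1e-300
            resp.append([d / s for d in dens])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, min_var)  # floor avoids degenerate components
    return log_likelihood(data, weights, means, variances), (weights, means, variances)


def multiple_restart_em(data, k, n_restarts=10, seed=0):
    """Run EM from several random starts and keep the best-scoring result."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_restarts):
        ll, params = em_once(data, k, rng)
        if best is None or ll > best[0]:
            best = (ll, params)
    return best
```

Because every restart is scored with the same criterion, the result returned can never be worse than that of the first run alone; the quality of the individual starting points is what an initialization method tries to improve.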

This paper describes a new solution for the initialization stage of the EM algorithm, intended for use in an MREM framework. The generated values should be characterized by a high degree of randomness in order to comply with MREM principles. Three elements characterize a GMM: the component centers, the spread around the centers (the covariance matrices), and the component weights. Each component center is selected as the element farthest from the centers already defined, picked from a randomly chosen set of elements in the training set. The algorithm is an adaptation of earlier techniques devised in the context of either vector quantization (VQ) or GMMs. It preserves a degree of randomness, although the selection is somewhat guided. The covariance matrices are generated using eigenvalue and QR decomposition techniques, based on sets of randomly generated values. The weights appear to be set to equal values. The paper also provides a thorough analysis of the algorithm's complexity.
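
The initialization steps described above can be sketched roughly as follows. This is a hedged illustration built only from the description in this review, not the paper's rnd-maxmin code: the function names, the resampling of candidates at every step, the eigenvalue range, and the use of Gram-Schmidt orthogonalization (which yields the Q factor of a QR decomposition) are all assumptions.

```python
import math
import random


def maxmin_centers(data, k, n_candidates, rng):
    """Pick k centers: the first at random, each later one the candidate
    farthest from the centers chosen so far. Candidates are a random
    subset of the data, redrawn at every step (an assumption)."""
    centers = [rng.choice(data)]
    while len(centers) < k:
        candidates = rng.sample(data, min(n_candidates, len(data)))
        farthest = max(
            candidates,
            key=lambda x: min(math.dist(x, c) for c in centers),
        )
        centers.append(farthest)
    return centers


def random_orthogonal(d, rng):
    """Random orthonormal basis from Gram-Schmidt on Gaussian vectors."""
    basis = []
    while len(basis) < d:
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for q in basis:
            dot = sum(a * b for a, b in zip(v, q))
            v = [a - dot * b for a, b in zip(v, q)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 1e-8:  # discard nearly dependent draws
            basis.append([a / norm for a in v])
    return basis  # each row is one orthonormal direction


def random_covariance(d, rng, lam_min=0.5, lam_max=2.0):
    """Covariance built as sum_i lam_i * q_i q_i^T, with random eigenvalues
    lam_i in [lam_min, lam_max] (hypothetical range) and random directions q_i."""
    q = random_orthogonal(d, rng)
    lams = [rng.uniform(lam_min, lam_max) for _ in range(d)]
    return [
        [sum(l * qi[r] * qi[c] for l, qi in zip(lams, q)) for c in range(d)]
        for r in range(d)
    ]


def equal_weights(k):
    """Equal mixing weights, as the review suggests."""
    return [1.0 / k] * k
```

Restricting the farthest-point search to a random candidate subset is what keeps successive restarts different from one another; with the full data set as candidates, only the first center would vary between runs.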

The experimental section forms a substantial part of the paper. Its objective is to compare the devised method, called rnd-maxmin, with several established techniques that are applied in the MREM framework and referred to throughout the paper. The evaluation uses several data sets, some synthetic and some real-life. The criteria for judging clustering performance are derived from the adjusted Rand index (ARI) and log-likelihood measures, and the degree of overlap among clusters is used as an additional decision criterion. Statistical tests are provided to assess the significance of the evaluation results. The results show that the new method seems better suited to cases with lower degrees of component overlap. This conclusion is somewhat expected, as similar approaches applied to vector quantizer (VQ) initialization tend to emphasize outliers, that is, isolated values [1].
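
For readers unfamiliar with the ARI mentioned above, the following minimal sketch computes it from two label lists using the standard pair-counting formula. The function name is a placeholder, and nothing here reflects how the paper's own evaluation code is written.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same items."""
    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: labelings trivially agree

    # Pair counts within contingency cells, true clusters, predicted clusters.
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    expected = sum_rows * sum_cols / total_pairs  # agreement expected by chance
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both labelings
    return (sum_cells - expected) / (max_index - expected)
```

The index equals 1 for identical partitions regardless of how the clusters are named and is close to 0 for random labelings, which makes it convenient for comparing a clustering against known component memberships.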

The paper gathers a great deal of information on improving the performance of GMMs and is an insightful study of this field; it is useful material for researchers and students. The results are backed by proprietary software solutions, which add further value to the work.

Reviewer:  Svetlana Segarceanu Review #: CR144207 (1605-0351)
1) Katsavounidis, I.; Kuo, C.-C. J.; Zhang, Z. A new initialization technique for generalized Lloyd iteration. IEEE Signal Processing Letters 1, 10 (1994), 144–146.
Pattern Analysis (I.5.2 ... )
 

Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®