Initialization of Gaussian mixture model (GMM) parameters is crucial to model quality because training with the expectation-maximization (EM) algorithm is strongly influenced by the initial values. A widely accepted way to cope with this sensitivity of the EM algorithm is to execute multiple runs, each starting from a different random choice of parameters, and to keep the configuration with the best performance (multiple restart EM, MREM). The problem is related to the more general issue of initializing finite mixture models, which are extensively applied in many pattern recognition fields, such as image and speech recognition.
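The MREM scheme described above is easy to sketch: run EM from several random initializations and keep the run with the highest log-likelihood. The following is a minimal one-dimensional NumPy illustration (not the paper's code; function names, the restart count, and the variance floor are assumptions):

```python
import numpy as np

def loglik(x, mu, var, w):
    # per-point component densities, then total log-likelihood
    dens = w * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
    return np.log(dens.sum(axis=1)).sum(), dens

def em_gmm_1d(x, k, rng, n_iter=50):
    # random initialization: means drawn from the data, shared variance, equal weights
    mu = rng.choice(x, size=k, replace=False)
    var = np.full(k, x.var())
    w = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        _, dens = loglik(x, mu, var, w)
        r = dens / dens.sum(axis=1, keepdims=True)       # E-step: responsibilities
        nk = r.sum(axis=0)
        w = nk / len(x)                                  # M-step: weights
        mu = (r * x[:, None]).sum(axis=0) / nk           # M-step: means
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        var = np.maximum(var, 1e-6)                      # guard against collapse
    ll, _ = loglik(x, mu, var, w)
    return ll, mu, var, w

def mrem(x, k, n_restarts=10, seed=0):
    # multiple restart EM: several random starts, keep the best log-likelihood
    rng = np.random.default_rng(seed)
    return max((em_gmm_1d(x, k, rng) for _ in range(n_restarts)),
               key=lambda run: run[0])
```

The selection by maximum log-likelihood over restarts is the defining step of MREM; everything else is a standard EM iteration.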
This paper describes a new solution for the initialization stage of the EM algorithm, to be applied in an MREM framework. The generated values should exhibit a high degree of randomness in order to comply with MREM principles. Three sets of parameters characterize a GMM: the component centers, the spread (covariance) around the centers, and the component weights. The component centers are selected as the elements farthest from the already-chosen ones, picked from a randomly drawn subset of the training set. The algorithm adapts earlier techniques devised in the context of either vector quantization (VQ) or GMMs; it preserves a degree of randomness, although the selection is partly guided. The covariance matrices are generated using eigenvalue and QR decomposition techniques applied to sets of randomly generated values, and the component weights are simply set to equal values. A thorough analysis of the algorithm's complexity is also provided.
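The two ingredients described above can be sketched briefly: maxmin-style center selection from a random candidate subset, and covariance generation from random eigenvalues combined with a random orthogonal matrix obtained by QR decomposition. This is an illustration under stated assumptions, not the paper's exact procedure; the subset size `m` and the eigenvalue range are hypothetical:

```python
import numpy as np

def maxmin_centers(X, k, m, rng):
    # first center: a random training point
    centers = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        # draw a random candidate subset, then take the candidate farthest
        # (in min-distance sense) from the already-selected centers
        cand = X[rng.choice(len(X), size=m, replace=False)]
        d = np.linalg.norm(
            cand[:, None, :] - np.asarray(centers)[None, :, :], axis=2
        ).min(axis=1)
        centers.append(cand[np.argmax(d)])
    return np.asarray(centers)

def random_covariance(d, rng, eig_low=0.5, eig_high=2.0):
    # random orthogonal matrix from the QR decomposition of a Gaussian matrix
    q, r = np.linalg.qr(rng.standard_normal((d, d)))
    q *= np.sign(np.diag(r))                     # sign correction
    evals = rng.uniform(eig_low, eig_high, d)    # random positive eigenvalues
    return (q * evals) @ q.T                     # symmetric positive definite
```

The randomness enters through the candidate subset and the generated eigenvalues, while the farthest-point rule supplies the guided component of the selection.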
The experimental section represents a substantial part of the paper; its objective is to compare the proposed method, called rnd-maxmin, with several acknowledged techniques applied in the MREM framework and referenced throughout the paper. The evaluation uses several data sets, both synthetically generated and real-life. Clustering performance is judged by criteria derived from the adjusted Rand index (ARI) and log-likelihood measures, with the degree of overlap among clusters used as an additional decision criterion. Statistical tests are provided to appraise the significance of the evaluation results. The results show that the new method is better suited to cases with lower degrees of component overlap. This conclusion is somewhat expected, as similar approaches applied in VQ initialization tend to emphasize outliers, that is, isolated values [1].
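For reference, the ARI criterion used in the evaluation can be computed directly from the contingency table of true versus predicted labels. A minimal NumPy version of the standard formula (not tied to the paper's tooling) is:

```python
import numpy as np

def comb2(x):
    # number of unordered pairs, C(x, 2)
    return x * (x - 1) / 2.0

def adjusted_rand_index(labels_true, labels_pred):
    # contingency table of true classes vs. predicted clusters
    _, ci = np.unique(labels_true, return_inverse=True)
    _, cj = np.unique(labels_pred, return_inverse=True)
    table = np.zeros((ci.max() + 1, cj.max() + 1))
    np.add.at(table, (ci, cj), 1)
    sum_ij = comb2(table).sum()
    sum_a = comb2(table.sum(axis=1)).sum()
    sum_b = comb2(table.sum(axis=0)).sum()
    n = len(labels_true)
    expected = sum_a * sum_b / comb2(n)        # chance-level agreement
    max_index = 0.5 * (sum_a + sum_b)
    return (sum_ij - expected) / (max_index - expected)
```

ARI equals 1 for identical partitions (up to label permutation) and is near 0 for random labelings, which is what makes it a suitable clustering criterion alongside log-likelihood.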
The paper gathers a wealth of information on improving the performance of GMMs and represents an insightful study of the field; it is useful material for researchers and students. The results are backed by proprietary software solutions, which adds further value to the work.