Computing Reviews, the leading online review service for computing literature.

Search

Clustering with qualitative information
Charikar M., Guruswami V., Wirth A. Journal of Computer and System Sciences71 (3):360-383,2005.Type:Article

Date Reviewed: May 23 2006

Grouping or clustering elements based on their relative similarities is a principal step in data analysis, with bioinformatics being one of the applications. Similarity can be defined by a number that measures the strength of the relation, leading to the well-studied k-center, k-sum, and k-median problems. Alternatively, the similarity of a pair of elements can be given as a label “+” (similar) or “-” (dissimilar); such a qualitative notion of similarity is considered in this paper. When cast as an optimization problem for graphs, two versions are of interest. MaxAgree tries to maximize the sum of the number of pluses within the clusters, and minuses between clusters. MinDisAgree is interested in minimizing the sum of minuses within clusters, and pluses between them. This paper follows the study spearheaded by Bansal et al. [1], and answers several questions that provide a good insight into approximability of clustering with qualitative information, both for complete and for general graphs. Among the results for general graphs (that is, where similarity is known only between some pairs), the authors provide a factor O(log n) approximation for MinDisAgree, as well as an approximable (APX) hardness and a factor 0.766 approximation for MaxAgree. In addition to shedding new light on inapproximability, the paper demonstrates the use of a number of algorithmic techniques for designing approximation algorithms. Among them are general techniques of LP relaxation, semidefinite relaxation, and, more specifically, region growing for minimum multicut and reductions to 3-satisfiability (3SAT). Altogether, thanks to its comprehensive discussion, this paper represents a valuable addition to the literature for students and researchers interested in approximation and in clustering in particular.

Reviewer: Jerzy W. Jaromczyk	Review #: CR132819 (0703-0303)

1)	Bansal, N.; Blum, A.; Chawla, S. Correlation clustering. In Proc. of the 43rd Annual IEEE Symposium on Foundations of Computer Science (FOCS '02) IEEE, 2002, 238–247.

Algorithms (I.5.3 ... )

Computations On Discrete Structures (F.2.2 ... )

Would you recommend this review?

yes

Other reviews under "Algorithms":	Date

Monte Carlo comparison of six hierarchical clustering methods on random data Jain N., Indrayan A., Goel L. Pattern Recognition 19(1): 95-99, 1986. Type: Article	Nov 1 1987

A parallel nonlinear mapping algorithm Shen C., Lee R., Chin Y. International Journal of Pattern Recognition and Artificial Intelligence 1(1): 53-69, 1987. Type: Article	Jun 1 1988

Algorithms for clustering data Jain A., Dubes R., Prentice-Hall, Inc., Upper Saddle River, NJ, 1988. Type: Book (9780130222787)	Jun 1 1989

more...

Reproduction in whole or in part without permission is prohibited. Copyright 1999-2024 ThinkLoud, Inc.^®
Terms of Use | Privacy Policy