Computing Reviews
Today's Issue Hot Topics Search Browse Recommended My Account Log In
Review Help
Search
Clustering with qualitative information
Charikar M., Guruswami V., Wirth A. Journal of Computer and System Sciences71 (3):360-383,2005.Type:Article
Date Reviewed: May 23 2006

Grouping or clustering elements based on their relative similarities is a principal step in data analysis, with bioinformatics being one of the applications. Similarity can be defined by a number that measures the strength of the relation, leading to the well-studied k-center, k-sum, and k-median problems. Alternatively, the similarity of a pair of elements can be given as a label “+” (similar) or “-” (dissimilar); such a qualitative notion of similarity is considered in this paper. When cast as an optimization problem for graphs, two versions are of interest. MaxAgree tries to maximize the sum of the number of pluses within the clusters, and minuses between clusters. MinDisAgree is interested in minimizing the sum of minuses within clusters, and pluses between them.

This paper follows the study spearheaded by Bansal et al. [1], and answers several questions that provide a good insight into approximability of clustering with qualitative information, both for complete and for general graphs. Among the results for general graphs (that is, where similarity is known only between some pairs), the authors provide a factor O(log n) approximation for MinDisAgree, as well as an approximable (APX) hardness and a factor 0.766 approximation for MaxAgree.

In addition to shedding new light on inapproximability, the paper demonstrates the use of a number of algorithmic techniques for designing approximation algorithms. Among them are general techniques of LP relaxation, semidefinite relaxation, and, more specifically, region growing for minimum multicut and reductions to 3-satisfiability (3SAT). Altogether, thanks to its comprehensive discussion, this paper represents a valuable addition to the literature for students and researchers interested in approximation and in clustering in particular.

Reviewer:  Jerzy W. Jaromczyk Review #: CR132819 (0703-0303)
1) Bansal, N.; Blum, A.; Chawla, S. Correlation clustering. In Proc. of the 43rd Annual IEEE Symposium on Foundations of Computer Science (FOCS '02) IEEE, 2002, 238–247.
Bookmark and Share
 
Algorithms (I.5.3 ... )
 
 
Computations On Discrete Structures (F.2.2 ... )
 
Would you recommend this review?
yes
no
Other reviews under "Algorithms": Date
Monte Carlo comparison of six hierarchical clustering methods on random data
Jain N., Indrayan A., Goel L. Pattern Recognition 19(1): 95-99, 1986. Type: Article
Nov 1 1987
A parallel nonlinear mapping algorithm
Shen C., Lee R., Chin Y. International Journal of Pattern Recognition and Artificial Intelligence 1(1): 53-69, 1987. Type: Article
Jun 1 1988
Algorithms for clustering data
Jain A., Dubes R., Prentice-Hall, Inc., Upper Saddle River, NJ, 1988. Type: Book (9780130222787)
Jun 1 1989
more...

E-Mail This Printer-Friendly
Send Your Comments
Contact Us
Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud, Inc.®
Terms of Use
| Privacy Policy