Computing Reviews, the leading online review service for computing literature.

Document clustering using NMF and fuzzy relation
Park S., An D., Yoo H. ICUIMC 2011 (Proceedings of the 5th International Conference on Ubiquitous Information Management and Communication, Seoul, Korea, Feb 21-23, 2011) 1-5, 2011. Type: Proceedings

Date Reviewed: Jun 28 2011

Park et al. offer a unique combination of nonnegative matrix factorization (NMF) and fuzzy relations to produce a better document clustering algorithm. They note that clustering is useful for document organization, automatic summarization, topic extraction, and information filtering or retrieval. They also note that their algorithm “can extract important cluster label terms [...] using semantic features” via NMF, and “can remove the dissimilar documents [in clusters] using [a] fuzzy relation between semantic features and document terms.”

After showing how NMF and fuzzy relations work, the authors begin describing their algorithm, which employs preprocessing to remove stop words and to stem the remaining terms in a document set. They then use NMF to manipulate the document-term matrix in order to generate cluster label terms. Finally, they perform document clustering using a fuzzy relation.

The authors use a standard test database of documents to measure the performance of their algorithm against other clustering mechanisms. The results show that their algorithm offers improved performance. I wish, however, that they had discussed in more detail the normalized mutual information metric that they employ.
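For readers unfamiliar with normalized mutual information (NMI), the following is a minimal sketch of how the metric is commonly computed when clustering output is compared against reference categories. It is not taken from the paper. The function names are illustrative, and the normalization shown (dividing by the geometric mean of the two entropies) is an assumption, since several variants exist and the one the authors use is not stated here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information between two label assignments of equal length."""
    n = len(a)
    count_a = Counter(a)
    count_b = Counter(b)
    joint = Counter(zip(a, b))
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (count_a[x] * count_b[y]))
    return mi


def normalized_mutual_information(true_labels, cluster_labels):
    """NMI = I(T; C) / sqrt(H(T) * H(C)).

    Assumption: geometric-mean normalization. Other papers divide by
    max(H(T), H(C)) or by the arithmetic mean of the two entropies.
    """
    if len(true_labels) != len(cluster_labels):
        raise ValueError("label sequences must have the same length")
    h_true = entropy(true_labels)
    h_pred = entropy(cluster_labels)
    if h_true == 0.0 or h_pred == 0.0:
        # Convention chosen for this sketch: a single-group partition
        # scores 1 only against another single-group partition.
        return 1.0 if h_true == h_pred else 0.0
    return mutual_information(true_labels, cluster_labels) / math.sqrt(h_true * h_pred)


if __name__ == "__main__":
    # Hypothetical toy data: six documents, three reference topics.
    reference = ["sports", "sports", "politics", "politics", "tech", "tech"]
    clusters = [0, 0, 1, 1, 2, 1]
    print(normalized_mutual_information(reference, clusters))
```

The score depends only on how the two groupings overlap and not on the cluster names, so a clustering that matches the reference categories up to a renaming of the clusters receives the maximum score, while a clustering that is statistically independent of the categories scores zero.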

Reviewer: Donald H. Kraft	Review #: CR139188 (1201-0085)

Clustering (H.3.3 ... )

Other reviews under "Clustering":	Date

Concepts and effectiveness of the cover-coefficient-based clustering methodology for text databases Can F. (ed), Ozkarahan E. ACM Transactions on Database Systems 15(3): 483-517, 1990. Type: Article	Dec 1 1992

A parallel algorithm for record clustering Omiecinski E., Scheuermann P. ACM Transactions on Database Systems 15(3): 599-624, 1990. Type: Article	Nov 1 1992

Organization of clustered files for consecutive retrieval Deogun J., Raghavan V., Tsou T. ACM Transactions on Database Systems 9(4): 646-671, 1984. Type: Article	Jun 1 1985

Reproduction in whole or in part without permission is prohibited. Copyright 1999-2024 ThinkLoud®