Computing Reviews
Scalable density-based clustering with quality guarantees using random projections
Schneider J., Vlachos M. Data Mining and Knowledge Discovery 31(4): 972-1005, 2017. Type: Article
Date Reviewed: Oct 30 2017

Knowledge discovery in large databases requires efficient clustering techniques, and researchers have developed many clustering algorithms to meet that need.

This research paper considers the class of density-based clustering algorithms and presents the theoretical and technical details of SOPTICS (speedy OPTICS), a random-projection-based version of the popular OPTICS algorithm. The authors extend their previous work [1] to provide theoretical arguments about the performance of SOPTICS.

In nine sections and an appendix, the reader will find a valuable description of basic density-based clustering algorithms; the technique of random projections; the steps of the proposed algorithm (including pseudocode); and theoretical results on the speed of the algorithms used in its different steps, such as partitioning, neighborhood identification, and density estimation. Starting with section 7, SOPTICS is analyzed in depth, and twelve theorems are developed to prove different aspects of the proposed strategy. Section 8 describes an empirical evaluation of both runtime and clustering quality on ten datasets. SOPTICS showed clear performance advantages when compared against basic OPTICS, OPTICS with locality-sensitive hashing, and DeLi-Clu.
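
To make the partitioning and neighborhood identification steps more concrete, here is a minimal Python sketch of the general idea of random-projection partitioning. It is an illustration written for this review, not the authors' SOPTICS pseudocode or their Java code: the function names (`partition`, `candidate_neighbors`), the parameters (`min_size`, `repetitions`, `seed`), and the choice of a uniformly random cut position are all assumptions. Points are projected onto a random direction and split recursively until each set is small; points that end up in the same small set are treated as candidate neighbors.

```python
import random


def dot(u, v):
    """Inner product of two equal-length coordinate sequences."""
    return sum(a * b for a, b in zip(u, v))


def partition(points, indices, min_size, rng):
    """Recursively split `indices` using random projections.

    Hypothetical helper: returns a list of index groups, each holding
    at most max(min_size, 1) indices.
    """
    # Stop once the set is small enough (never try to split one point).
    if len(indices) <= max(min_size, 1):
        return [indices]
    dim = len(points[0])
    # Draw a random direction and order the points by their projection.
    direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    ordered = sorted(indices, key=lambda i: dot(points[i], direction))
    # Assumed splitting rule: a random cut that leaves both sides non-empty.
    cut = rng.randint(1, len(ordered) - 1)
    return (partition(points, ordered[:cut], min_size, rng)
            + partition(points, ordered[cut:], min_size, rng))


def candidate_neighbors(points, min_size=3, repetitions=5, seed=0):
    """Collect candidate neighbors over several independent partitionings."""
    rng = random.Random(seed)
    neighbors = {i: set() for i in range(len(points))}
    for _ in range(repetitions):
        groups = partition(points, list(range(len(points))), min_size, rng)
        for group in groups:
            for i in group:
                # Points sharing a final group become mutual candidates.
                neighbors[i].update(j for j in group if j != i)
    return neighbors
```

For example, `candidate_neighbors([(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)], min_size=2)` returns a dictionary mapping each point index to a set of candidate neighbor indices. Repeating the partitioning with fresh random directions is what lets nearby points accumulate as candidates; a density-based method could then estimate densities from these small neighborhoods instead of running full range queries.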

The Java implementation of SOPTICS is available on the second author’s website [2], and the previous version is included in the ELKI project [3].

Although the paper is long, the authors deserve credit for including only the background and references needed to understand the context, the proofs, and SOPTICS itself.

I highly recommend this contribution to data scientists, researchers in data mining, and master’s and PhD students doing research in the knowledge discovery field.

Reviewer:  G. Albeanu Review #: CR145627 (1801-0025)
1) Schneider, J.; Vlachos, M. Fast parameterless density-based clustering via random projections. In Proc. of the 22nd ACM International Conference on Information & Knowledge Management (San Francisco, CA), ACM, New York, NY, 2013, 861–866.
2) Schneider, J.; Vlachos, M. Scalable density-based clustering with quality guarantees using random projections, source code. 2013. http://alumni.cs.ucr.edu/~mvlachos/erc/projects/density-based/src.zip. Accessed 10/03/2017.
3) ELKI project. https://elki-project.github.io/releases/. Accessed 10/03/2017.
Clustering (I.5.3)
 