Computing Reviews
Scalable density-based clustering with quality guarantees using random projections
Schneider J., Vlachos M.  Data Mining and Knowledge Discovery 31 (4): 972-1005, 2017. Type: Article
Date Reviewed: Oct 30 2017

Knowledge discovery in large databases requires efficient clustering techniques, and researchers have developed many clustering algorithms to meet that need.

This research paper considers the class of density-based clustering algorithms and presents the theoretical and technical details of SOPTICS (speedy OPTICS), a random-projection-based version of the popular OPTICS algorithm. The authors extend their previous work [1] to provide theoretical arguments for the performance of SOPTICS.

In nine sections and an appendix, the reader will find a valuable description of basic density-based clustering algorithms; the technique of random projections; the steps of the proposed algorithm (including pseudocode); and theoretical results on the speed of the algorithms used in individual steps, such as partitioning, neighborhood identification, and density estimation. Section 7 begins an in-depth analysis of SOPTICS, in which twelve theorems establish different aspects of the proposed strategy. Section 8 describes an empirical evaluation of both runtime and clustering quality on ten datasets. SOPTICS showed clear performance advantages over basic OPTICS, OPTICS with locality-sensitive hashing, and DeLi-Clu.
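As a rough illustration of the steps named above (partitioning by random projections, neighborhood identification, and density estimation), the following is a minimal sketch of the general idea only. It is not the authors' pseudocode or their Java implementation; every function name, parameter, and default value here is hypothetical, and the splitting rule is a simplified assumption.

```python
import math
import random


def random_projection_partition(points, indices, min_size, rng):
    """Recursively split `indices` along random directions.

    Returns a list of leaf sets (lists of indices), each of size <= min_size.
    Assumes min_size >= 1. This is an illustrative simplification.
    """
    if len(indices) <= min_size:
        return [indices]
    dim = len(points[0])
    # Draw a random direction with Gaussian components (assumed choice).
    direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    proj = {i: sum(d * x for d, x in zip(direction, points[i])) for i in indices}
    # Order points by projected value and cut at a random position,
    # so both sides are non-empty and the recursion always terminates.
    order = sorted(indices, key=lambda i: proj[i])
    cut = rng.randint(1, len(order) - 1)
    return (random_projection_partition(points, order[:cut], min_size, rng)
            + random_projection_partition(points, order[cut:], min_size, rng))


def candidate_neighbors(points, min_size=4, repetitions=8, seed=0):
    """Treat points that share a leaf set in any repetition as neighbors."""
    rng = random.Random(seed)
    neighbors = [set() for _ in points]
    for _ in range(repetitions):
        leaves = random_projection_partition(
            points, list(range(len(points))), min_size, rng)
        for leaf in leaves:
            for i in leaf:
                neighbors[i].update(j for j in leaf if j != i)
    return neighbors


def density_estimate(points, neighbors):
    """Average distance to candidate neighbors (smaller means denser)."""
    estimates = []
    for i, nbrs in enumerate(neighbors):
        if not nbrs:
            estimates.append(math.inf)  # no neighbors found for this point
        else:
            total = sum(math.dist(points[i], points[j]) for j in nbrs)
            estimates.append(total / len(nbrs))
    return estimates
```

Calling `candidate_neighbors(points)` followed by `density_estimate(points, neighbors)` yields one value per point. A full OPTICS-style method would then use such estimates to order the points, which this sketch does not attempt.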

A Java implementation of SOPTICS is available on the second author’s website [2], and the previous version is included in the ELKI project [3].

Although the paper is long, the authors deserve credit for including only the background and references needed to understand the context, the proofs, and SOPTICS itself.

I highly recommend this contribution to data scientists, researchers in data mining, and master’s or PhD students doing research in the knowledge discovery field.

Reviewer:  G. Albeanu Review #: CR145627 (1801-0025)
1) Schneider, J.; Vlachos, M. Fast parameterless density-based clustering via random projections. In Proc. of the 22nd ACM International Conference on Information & Knowledge Management (San Francisco, CA), ACM, New York, NY, 2013, 861–866.
2) Schneider, J.; Vlachos, M. Scalable density-based clustering with quality guarantees using random projections, source code. 2013. http://alumni.cs.ucr.edu/~mvlachos/erc/projects/density-based/src.zip. Accessed 10/03/2017.
3) ELKI project. https://elki-project.github.io/releases/. Accessed 10/03/2017.
Clustering (I.5.3)
