Computing Reviews
Principles of data mining (2nd ed.)
Bramer M., Springer Publishing Company, Incorporated, London, UK, 2013. 454 pp. Type: Book (978-1-447148-83-8)
Date Reviewed: Sep 23 2013

Data mining is one of the most popular and effective tools for knowledge discovery. It involves analyzing and summarizing data from different perspectives and automatically extracting useful information. Data mining reveals trends, patterns, and other information hidden within huge volumes of data. Today, it is used in commercial, medical, scientific, geographical, meteorological, and other fields that generate volumes of data too large to be of real use without automatic processing methods.

This book introduces the concept of data mining and explains the various techniques involved. The author starts with an introduction to data mining and its importance, then succinctly explains the fundamental concepts and the principal techniques for classification, association rule mining, and clustering.

Classification is a data mining technique that assigns items in a collection to target categories or classes. The book introduces the various classification techniques (naive Bayes, nearest neighbor, decision trees), and explains the top-down induction of decision trees (TDIDT) algorithm and the various criteria for attribute selection (entropy, Gini index of diversity, chi-square statistic, gain ratio). This is followed by discussions about related topics, including classifier predictive accuracy estimation, classifier performance measurement, classifier comparison, conversion of continuous attributes to categorical ones (discretization), overfitting reduction of decision trees, modular rules for classification, dealing with large volumes of data, and ensemble classification (use of a set of classifiers instead of a single one to classify unseen data).
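To make the attribute selection criteria mentioned above concrete, here is a minimal Python sketch of entropy, the Gini index, and entropy-based information gain. It is not taken from the book: the function names, the toy dataset, and the attribute names (outlook, windy, play) are hypothetical, chosen only for illustration. A TDIDT-style algorithm would split on whichever attribute scores best under the chosen criterion.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gini(labels):
    """Gini index of diversity: 1 minus the sum of squared class proportions."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in entropy obtained by splitting rows (dicts) on one attribute."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


# Hypothetical toy dataset (an assumption, not data from the book).
rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rain", "windy": "no", "play": "yes"},
    {"outlook": "rain", "windy": "yes", "play": "yes"},
]

# Pick the attribute with the highest information gain for the first split.
best = max(["outlook", "windy"], key=lambda a: information_gain(rows, a, "play"))
```

In this toy data, splitting on outlook separates the classes completely while windy tells us nothing, so an entropy-based criterion would prefer outlook.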

Association rules are if/then statements that help uncover relationships between seemingly unrelated data in an information repository. The book covers the basic concepts of association rule mining, along with the various algorithms and criteria for selecting the best algorithms. There is also a comprehensive discussion of market basket analysis and of association rule mining algorithms such as Apriori and frequent pattern growth (FP-growth).
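As a rough illustration of the if/then rules described above, the following Python sketch computes the support and confidence of a rule and performs a level-wise, Apriori-style search for frequent itemsets. It is a simplified sketch under stated assumptions, not the book's presentation: the function names and the shopping-basket transactions are hypothetical.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(lhs, rhs, transactions):
    """Confidence of the rule 'if lhs then rhs'."""
    return support(set(lhs) | set(rhs), transactions) / support(lhs, transactions)


def frequent_itemsets(transactions, min_support):
    """Level-wise search: build size-k candidates only from frequent (k-1)-itemsets."""
    level = [frozenset([i]) for i in {i for t in transactions for i in t}]
    frequent = {}
    k = 1
    while level:
        scored = {c: support(c, transactions) for c in level}
        kept = {c: s for c, s in scored.items() if s >= min_support}
        frequent.update(kept)
        k += 1
        # Join step: unions of two frequent itemsets that have exactly k items.
        candidates = {a | b for a in kept for b in kept if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        level = [
            c for c in candidates
            if all(frozenset(s) in kept for s in combinations(c, k - 1))
        ]
    return frequent


# Hypothetical market-basket transactions (an assumption for illustration).
transactions = [frozenset(t) for t in (
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
)]

result = frequent_itemsets(transactions, min_support=0.5)
rule_conf = confidence({"butter"}, {"bread"}, transactions)
```

Here every basket containing butter also contains bread, so the rule "if butter then bread" has confidence 1.0, while the pair milk and butter falls below the 0.5 support threshold and is dropped.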

The author presents a detailed exploration of the two most popular data clustering methods, k-means clustering and hierarchical clustering, followed by a discussion of text mining, a type of classification where the objects are text documents. Other chapters examine the bag-of-words representation for document classification, automatic classification of web pages (hypertext categorization), and the difference between hypertext and standard text classification.
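For readers unfamiliar with k-means clustering, the basic loop can be sketched in a few lines of Python. This is a minimal illustration rather than the book's algorithm as printed: the function name, the sample points, and the choice to seed the centroids with the first k points (instead of a random selection) are assumptions made to keep the example deterministic. Hierarchical clustering is not shown.

```python
def kmeans(points, k, iterations=100):
    """Cluster tuples of floats into k groups using squared Euclidean distance."""
    # Assumption: seed centroids with the first k points for reproducibility.
    centroids = [points[i] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centroids[j]
            for j, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # Assignments have stabilized.
        centroids = new_centroids
    return centroids, clusters


# Hypothetical two-dimensional sample points (an assumption for illustration).
points = [(1.0, 1.0), (8.0, 8.0), (1.0, 2.0), (9.0, 8.0), (2.0, 1.0), (8.0, 9.0)]
centroids, clusters = kmeans(points, 2)
```

With these points, the three values near the origin end up in one cluster and the three near (8, 8) in the other.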

Each topic discussion begins with the basics, and the book assumes that the reader has no prior knowledge of data mining. All explanations are clear and supported with detailed illustrations, examples, and solved problems. The focus on algorithms helps those who do not have a strong mathematical background to better understand the concepts, and the learning process is enhanced with self-assessment exercises and a list of references at the end of each chapter.

The book has five appendices that add value. The first explains the mathematical notation and techniques used in the book and would especially help those with limited mathematical exposure. The second gives basic information about the different datasets used in the book. The third lists sources for further reading. The fourth is a comprehensive glossary of data mining terms and mathematical notation, and the last provides solutions to the self-assessment exercises.

This book is written primarily as a text for a course on data mining. The rich pedagogical features, including illustrations, examples, solved problems, exercises and solutions, a glossary, and references, make it an ideal choice for that purpose. It would be very useful for any reader who wants to gain a good understanding of data mining concepts and techniques.


Reviewer: Alexis Leon
Review #: CR141581 (1312-1073)
Data Mining (H.2.8 ... )
 
Other reviews under "Data Mining": Date
Feature selection and effective classifiers
Deogun J. (ed), Choubey S., Raghavan V. (ed), Sever H. (ed) Journal of the American Society for Information Science 49(5): 423-434, 1998. Type: Article
May 1 1999
Rule induction with extension matrices
Wu X. (ed) Journal of the American Society for Information Science 49(5): 435-454, 1998. Type: Article
Jul 1 1998
Predictive data mining
Weiss S., Indurkhya N., Morgan Kaufmann Publishers Inc., San Francisco, CA, 1998. Type: Book (9781558604032)
Feb 1 1999
Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®