Computing Reviews

Exploiting the power of group differences: using patterns to solve data analysis problems
Dong G., Morgan & Claypool Publishers, San Rafael, CA, 2019. 148 pp. Type: Book
Date Reviewed: 01/10/20

Emerging patterns (EPs) can be used as features, as simple classifiers, and as subpopulation signatures or characterizations. This book presents methods for investigating group differences based on EPs for a variety of data analysis problems. These methods can also be used when data is limited. The book is not difficult to read, but it requires a bit of effort from those with no formal or technical background.

Chapter 1 introduces the book, summarizing the content of each chapter and motivating the need for EPs. Chapter 2 provides the formal notions needed to follow the rest of the book, such as the definitions of attributes, features, variables, data instances and datasets, patterns, supports, equivalence classes, minimal generators, and borders.

Chapter 3 introduces the basics of EPs and presents a simple algorithm for mining them, which is also useful with small datasets. Chapters 4 and 5 focus on classification by aggregating multiple matching EPs, a discriminating pattern aggregation approach. They describe the algorithm and provide an example on a small dataset, showing how it can outperform other classification algorithms.
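The review does not spell out what an EP is, so a small illustration may help. The following is a minimal sketch, assuming the commonly used definition of an emerging pattern as an itemset whose support grows sharply from one group to another. It is a brute-force enumeration for toy data, not the mining algorithm presented in the book, and every name in it (support, growth_rate, mine_emerging_patterns, the min_growth and max_len parameters, and the healthy/sick toy groups) is hypothetical.

```python
from itertools import combinations


def support(pattern, dataset):
    """Fraction of instances in `dataset` that contain every item of `pattern`."""
    if not dataset:
        return 0.0
    return sum(1 for row in dataset if pattern <= row) / len(dataset)


def growth_rate(pattern, background, target):
    """Ratio of the pattern's support in `target` to its support in `background`."""
    s_bg = support(pattern, background)
    s_tg = support(pattern, target)
    if s_bg == 0:
        # Absent from the background group: infinite growth if it occurs in
        # the target group, zero if it occurs in neither.
        return float("inf") if s_tg > 0 else 0.0
    return s_tg / s_bg


def mine_emerging_patterns(background, target, min_growth=2.0, max_len=2):
    """Brute-force search (illustrative only) over itemsets of up to `max_len` items."""
    items = sorted(set().union(*background, *target))
    found = []
    for k in range(1, max_len + 1):
        for combo in combinations(items, k):
            pattern = frozenset(combo)
            rate = growth_rate(pattern, background, target)
            if rate >= min_growth:
                found.append((pattern, rate))
    return found


# Hypothetical toy groups: each instance is a set of items (attribute values).
healthy = [frozenset("ab"), frozenset("ac"), frozenset("bc"), frozenset("a")]
sick = [frozenset("ad"), frozenset("bd"), frozenset("abd"), frozenset("c")]

for pattern, rate in mine_emerging_patterns(healthy, sick):
    print(sorted(pattern), rate)
```

Under these assumptions, patterns containing the item "d" stand out because they occur only in the second group, which is the kind of group difference the book builds on. Exhaustive enumeration grows exponentially with pattern length, so it is only practical for small examples like this one.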

Chapter 6 presents an EP-based method for intrusion detection: one-class classification using the length of EPs. The method is evaluated empirically on two public datasets widely used to evaluate intrusion detection systems. Chapter 7 introduces the contrast pattern-based clustering-quality evaluation method.

Chapter 8 describes a pattern-based clustering algorithm that does not use distance metrics. It produces interpretable results that associate each cluster with a set of contrast patterns characterizing it. Chapter 9 considers the challenging problem of gene ranking in the context of complex diseases with thousands of features, and introduces an EP-based method called interaction-based importance of genes.

Chapter 10 describes pattern-aided prediction models and presents an approach that uses contrast patterns to construct such models for classification and regression tasks. Finally, chapter 11 briefly outlines potential applications of EP-based approaches.

The book is not extremely technical, but it does require some background in formalisms. In the absence of images, diagrams, and other visual aids, some further effort is required to understand these formalisms. Overall, the book successfully introduces EPs as an alternative to traditional classification and regression approaches for spotting differences among groups, even those with tiny sample sizes.

Reviewer:  Luca Longo Review #: CR146836 (2006-0120)

Reproduction in whole or in part without permission is prohibited.   Copyright 2024 ComputingReviews.com™