Computing Reviews

An approach of support approximation to discover frequent patterns from concept-drifting data streams based on concept learning
Li C., Jea K. Knowledge and Information Systems 40(3): 639-671, 2014. Type: Article
Date Reviewed: 04/29/15

Data stream mining is a variation of data mining with additional requirements: instead of looking for the proverbial needle in the haystack, the data miner is expected to inspect a never-ending stream of hay arriving on a conveyor belt. The notion of concept drift means that the attributes and distribution of the data items may change over time, which will affect the quality of the mining results unless methods are used that can deal with concept drift. A real-world example of data stream mining would be a department store, where transactions (purchases) are the data items to be inspected, and purchasing patterns are the concepts representing the results of data mining.
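To make the department store example concrete, the following minimal sketch counts how often pairs of items are bought together within a sliding window of recent transactions. It is a generic illustration of windowed support counting, not the method described in the paper; the function name `frequent_pairs`, the parameters `window_size` and `min_support`, and the purchase data are all hypothetical.

```python
from collections import Counter, deque
from itertools import combinations


def frequent_pairs(stream, window_size, min_support):
    """After each transaction, yield the item pairs whose support within
    the most recent `window_size` transactions is at least `min_support`."""
    window = deque()    # pairs of each transaction still inside the window
    counts = Counter()  # pair -> number of window transactions containing it
    for transaction in stream:
        pairs = list(combinations(sorted(set(transaction)), 2))
        window.append(pairs)
        counts.update(pairs)
        if len(window) > window_size:
            # Forget the oldest transaction so that stale patterns fade out.
            for pair in window.popleft():
                counts[pair] -= 1
                if counts[pair] == 0:
                    del counts[pair]
        yield {p for p, c in counts.items() if c / len(window) >= min_support}


# Hypothetical purchases: the popular pairing drifts from bread+butter to bread+jam.
purchases = [["bread", "butter"]] * 4 + [["bread", "jam"]] * 4
snapshots = list(frequent_pairs(purchases, window_size=4, min_support=0.75))
```

With this data, the pair bread and butter is frequent at first, no pair reaches the threshold while the window holds a mix of old and new purchases, and bread and jam becomes frequent once the old transactions have left the window. That lag is one simple way to see why concept drift affects the quality of mining results.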

The authors present an analysis of the concept drift problem in data stream mining, including a categorization of different types of concept drift. The main distinctions are between separated drift (where changes are swift, separated, non-periodic, and non-recurrent), circular drift (instant, intersected, periodic, and circular), and gradual drift (gradual, intersected, non-periodic, and non-recurrent). The last of these reflects the ongoing changes in the merchandise of a department store: items are sold and replaced with identical or slightly modified ones, and occasionally new ones are introduced. The authors then describe a method, SA-Miner, short for support-approximation-based data stream frequent-pattern miner. In comparison with related work, this method is capable of maintaining high-quality results in the presence of concept drift, while also keeping computation time and memory space requirements fairly low.

The paper can be used as a very good introduction to the notion of concept drift in data stream mining. It is very well written, cogently organized, and shows a good balance between conceptual considerations, formal specifications, and specific implementation aspects.

Reviewer: Franz Kurfess
Review #: CR143400 (1507-0608)

Reproduction in whole or in part without permission is prohibited.   Copyright 2024 ComputingReviews.com™