Computing Reviews
Data mining in software engineering
Halkidi M., Spinellis D., Tsatsaronis G., Vazirgiannis M. Intelligent Data Analysis 15(3): 413-441, 2011. Type: Article
Date Reviewed: Apr 20 2012

Today, huge volumes of data about software development are available from a variety of sources, including organizational databases, open-source software project metadata, and other software engineering repositories, mailing lists, discussion forums, and newsletters. Data mining provides the capability to analyze this data and transform it into valuable information.

This excellent paper deals with the mining of data produced during the software development life cycle and stored in software repositories. The authors introduce the concepts, approaches, tasks, and techniques of data mining, and the challenges of mining software repositories. The data mining approaches described are clustering, classification, frequent pattern mining and association rules, data characterization and summarization, change and deviation detection, and text mining.

The paper classifies, describes, and explains the different types of software engineering data. These include documentation, software configuration management data, source code, compiled code, execution traces, problem tracking and bug reports, and mailing lists.

The authors map the various data mining approaches and techniques to the software engineering tasks for which they are helpful. For software development, testing, debugging, maintenance, and reuse, they discuss and summarize the appropriate mining approaches, the input data, and the data analysis results. The authors conclude by discussing the challenges in mining software engineering repositories, which in their opinion require further research.

This paper will be useful to anyone connected with the development, testing, debugging, maintenance, and reuse of software products, from programmers to managers. The knowledge obtained from mining software repositories and other related sources will help such an audience better understand the development process, and thereby refine it. It will also help them make software life cycle processes more efficient.

Reviewer: Alexis Leon
Review #: CR140076 (1209-0945)
Data Mining (H.2.8 ...)
General (D.2.0)
 
