Computing Reviews
Analyzing Data Sets with Missing Data: An Empirical Evaluation of Imputation Methods and Likelihood-Based Methods
Myrtveit I., Stensrud E., Olsson U. IEEE Transactions on Software Engineering 27(11): 999-1013, 2001. Type: Article
Date Reviewed: Jul 2 2002

The authors evaluate four statistical methods for analyzing data sets with missing data, in the context of software engineering and with the specific goal of building effort prediction models.

The four methods are listwise deletion, mean imputation, similar response pattern imputation, and full information maximum likelihood. Data from the International Software Benchmarking Standards Group (ISBSG) database is used to derive an effort prediction model with each method, and these models are evaluated as well. The evaluation examines several aspects of the methods: robustness to non-random missing data, introduction of bias, prevention of information loss, and appropriateness. The generated models are assessed for correctness and accuracy.
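To make the three simpler methods concrete, here is a minimal Python sketch on a toy data set. It is not the authors' implementation: the project records, the variable names (`size`, `team`, `effort`), and the unstandardized squared-distance matching used for similar response pattern imputation are all assumptions made for illustration. Full information maximum likelihood is omitted because it requires fitting a statistical model rather than filling in values.

```python
from statistics import mean

# Toy project records (hypothetical values); None marks a missing observation.
projects = [
    {"size": 100.0, "team": 5.0, "effort": 1200.0},
    {"size": 250.0, "team": None, "effort": 3100.0},
    {"size": None, "team": 8.0, "effort": 2000.0},
    {"size": 400.0, "team": 12.0, "effort": 5200.0},
]


def listwise_deletion(rows):
    """Keep only the rows in which every variable is observed."""
    return [dict(r) for r in rows if all(v is not None for v in r.values())]


def mean_imputation(rows):
    """Replace each missing value with the mean of that variable's observed values."""
    columns = list(rows[0].keys())
    col_means = {
        c: mean(r[c] for r in rows if r[c] is not None) for c in columns
    }
    return [
        {c: (r[c] if r[c] is not None else col_means[c]) for c in columns}
        for r in rows
    ]


def similar_pattern_imputation(rows):
    """Fill missing values from the most similar complete row (the donor).

    Simplifying assumption: similarity is the raw squared distance over the
    variables observed in the incomplete row, with no standardization.
    At least one complete row must exist to act as a donor.
    """
    donors = listwise_deletion(rows)
    filled = []
    for r in rows:
        observed = [c for c, v in r.items() if v is not None]
        donor = min(
            donors, key=lambda d: sum((d[c] - r[c]) ** 2 for c in observed)
        )
        filled.append(
            {c: (v if v is not None else donor[c]) for c, v in r.items()}
        )
    return filled


if __name__ == "__main__":
    print(len(listwise_deletion(projects)), "complete rows kept")
    print(mean_imputation(projects)[2])
    print(similar_pattern_imputation(projects)[1])
```

On this toy data, listwise deletion discards half of the records, which illustrates the information loss the evaluation is concerned with, while the two imputation methods keep every record at the cost of inserting values that were never observed.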

The paper concludes by explaining the conditions under which each method is applicable. The model generated by full information maximum likelihood is the best, but that method is also the most restrictive. The models are compared using statistical methods, and the authors present the underlying data, so readers can reach their own conclusions. Replicating the experiment in other contexts with other data would help generalize the paper's conclusions. The paper includes a good selection of references.

This paper's main contribution is in giving empirical software engineering researchers an idea of the statistical methods that can be used when working with incomplete data sets.

Reviewer: Sira Vegas
Review #: CR126233 (0208-0443)
Product Metrics (D.2.8 ...)
Management (D.2.9)
Software Management (K.6.3)
 
Other reviews under "Product Metrics": Date
Communication Metrics for Software Development
Dutoit A., Bruegge B. IEEE Transactions on Software Engineering 24(8): 615-628, 1998. Type: Article
Oct 1 1998
The Optimal Class Size for Object-Oriented Software
El Emam K., Benlarbi S., Goel N., Melo W., Lounis H., Rai S. IEEE Transactions on Software Engineering 28(5): 494-509, 2002. Type: Article
Jan 3 2003
Simulated annealing for improving software quality prediction
Bouktif S., Sahraoui H., Antoniol G. Genetic and evolutionary computation (Proceedings of the 8th Annual Conference on Genetic and Evolutionary Computation, Seattle, Washington, Jul 8-12, 2006), 1893-1900, 2006. Type: Proceedings
Nov 8 2006