The authors evaluate four statistical methods for analyzing data sets with missing data in the context of software engineering, with the specific goal of building effort prediction models.
The methods evaluated are listwise deletion, mean imputation, similar response pattern imputation, and full information maximum likelihood. Data from the International Software Benchmarking Standards Group (ISBSG) database is used to derive an effort prediction model for each method, and these models are evaluated as well. The evaluation examines several aspects of the methods, specifically robustness to non-random missing data, introduction of bias, prevention of information loss, and appropriateness; it also examines the correctness and accuracy of the generated models.
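To make the two simplest of these methods concrete, here is a minimal sketch of listwise deletion and mean imputation in Python with pandas. The toy project data (function-point size and effort in person-hours) is invented for illustration and is not drawn from the ISBSG database.

```python
import numpy as np
import pandas as pd

# Toy data with missing values (illustrative only; not the ISBSG data).
df = pd.DataFrame({
    "size_fp": [100.0, np.nan, 250.0, 400.0],    # function points
    "effort_h": [800.0, 1200.0, np.nan, 3100.0]  # person-hours
})

# Listwise deletion: discard every row that contains any missing value.
# Simple, but it loses information and can bias results when data are
# not missing completely at random.
listwise = df.dropna()

# Mean imputation: replace each missing value with its column mean.
# Retains all rows, but shrinks variance and can distort correlations.
mean_imputed = df.fillna(df.mean())

print(listwise)
print(mean_imputed)
```

Similar response pattern imputation and full information maximum likelihood are more involved (the latter estimates model parameters directly from the incomplete data rather than filling values in) and are not sketched here.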
The paper concludes with an explanation of the conditions under which each model is applicable, the one generated by the full information maximum likelihood method being the best, but also the most restrictive. The comparison of the models generated by each method is performed using statistical methods, so the authors present data from which readers can reach their own conclusions. However, the experiment should be replicated in other contexts, with other data, to generalize the paper's conclusions. The paper includes a good selection of references.
This paper's main contribution is to give empirical software engineering researchers an idea of the statistical methods that can be used when working with incomplete data sets.