Computing Reviews

An empirical evaluation of outlier deletion methods for analogy-based cost estimation
Tsunoda M., Kakimoto T., Monden A., Matsumoto K. Promise 2011 (Proceedings of the 7th International Conference on Predictive Models in Software Engineering, Banff, AB, Canada, Sep 20-21, 2011), 1-10, 2011. Type: Proceedings
Date Reviewed: 11/09/11

This paper presents a helpful and concise survey of the current techniques used to make a set of software engineering estimation data reasonably consistent. The authors show how to eliminate data points that simply do not fit within a collected set. I am concerned, though, that the paper tries to improve the accuracy of the estimates while so many vital parameters are ignored and the resulting errors are subsumed by the estimation techniques themselves.

The paper is worth a read for those seriously interested in software effort estimation, if only for its survey of the field and as a pointer to datasets for calibrating estimation methods.

Beware of results that ignore the nonlinear effects of size and the skill level of the program designers doing the work. The authors correctly point out that normal distributions for the estimates are not valid, and show readers how to deal with different distributions. My concern focuses on equation 2, which seems to discount the software development team’s skills, experience, and expertise.

I like the use of the Z-score computations in the authors’ solution, as well as the identification of datasets that can be used to check on estimation methods.
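To make the Z-score approach concrete, here is a minimal Python sketch of generic Z-score outlier deletion; it illustrates the technique in spirit, not the authors' exact procedure. The log transform, the sample effort values, and the threshold of 2.0 are assumptions for the example, not details taken from the paper.

    import math

    def zscore_outlier_filter(efforts, threshold=2.0):
        """Drop observations whose Z-score magnitude exceeds `threshold`.

        Effort data are typically right-skewed, so Z-scores are computed
        on log-transformed values (an assumed convention for this sketch;
        the paper's normalization may differ).
        """
        logs = [math.log(e) for e in efforts]
        n = len(logs)
        mean = sum(logs) / n
        # Sample standard deviation of the log-effort values.
        std = math.sqrt(sum((x - mean) ** 2 for x in logs) / (n - 1))
        return [e for e, x in zip(efforts, logs)
                if abs(x - mean) / std <= threshold]

    # Example: the implausibly large effort value (4000) is removed.
    print(zscore_outlier_filter([120, 95, 110, 4000, 130, 105]))

Log-transforming first reflects the authors' point that the estimates are not normally distributed; on raw, right-skewed values, a single very large project inflates the standard deviation and can mask smaller outliers.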

However, correcting for errors of 10 to 20 percent is far beyond the accuracy of the estimation methods themselves, and harks back to an engineering principle taught to me at Bell Labs: “When the precision of a calculation exceeds the accuracy of the data, the results engender mistrust.”

Reviewer: Larry Bernstein | Review #: CR139574 (1205-0493)
