The authors propose an approach for estimating software size from the conceptual data models of information systems. Three independent variables characterize the conceptual data model: C, “the total number of classes”; R, “the total number of unidirectional relationship types”; and Ā, “the average number of attributes per class.” Drawing data from actual development projects in industry and open-source repositories, the authors build multiple linear regression models for size estimation in different system environments, such as industry Visual Basic systems, open-source PHP systems, industry Java systems, and open-source Java systems. The derived regression models are validated and shown to predict system size, measured in lines of code, within an acceptable range of performance.
Furthermore, the paper demonstrates that the new approach, which predicts software size as an input to cost estimation, performs comparably to the use of function points. The key advantages of using conceptual data models for cost estimation are parsimony, since only three parameters are needed, and the fact that such conceptual models are “more readily available in the early stage of software development” than many of the function point parameters. Thus, this cost estimation approach should appeal to managers of information system development projects with well-defined conceptual data models.
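
To make the regression setup concrete, the following is a minimal sketch of fitting a multiple linear regression of size on the three predictors (number of classes, number of relationship types, and average attributes per class). It is not the authors' implementation. The intercept term, the function names, and every number in it are assumptions made for illustration: the project data are synthetic, generated from arbitrarily chosen coefficients, and say nothing about the coefficients reported in the paper.

```python
# Hypothetical sketch: ordinary least squares for
#   size = b0 + b_c * C + b_r * R + b_a * a_bar
# The intercept b0 and all values below are illustrative assumptions.

def fit_linear(rows, targets):
    """Fit least-squares coefficients via the normal equations (X^T X) b = X^T y."""
    # Prepend a constant 1.0 so the first coefficient is the intercept.
    X = [[1.0] + [float(v) for v in row] for row in rows]
    n = len(X[0])
    # Build the augmented matrix [X^T X | X^T y].
    aug = [
        [sum(x[i] * x[j] for x in X) for j in range(n)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; model is not identifiable")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def predict(coeffs, c, r, a_bar):
    """Predicted size (lines of code) for one conceptual data model."""
    b0, b_c, b_r, b_a = coeffs
    return b0 + b_c * c + b_r * r + b_a * a_bar


# Synthetic projects as (C, R, a_bar). These are made-up values, not real data.
projects = [
    (10, 12, 5.0), (20, 25, 6.5), (15, 30, 4.0),
    (40, 38, 8.0), (25, 20, 7.5), (8, 5, 3.0),
]
# Arbitrary coefficients used only to generate noise-free synthetic sizes.
ARBITRARY_COEFFS = (100.0, 50.0, 30.0, 200.0)
sizes = [predict(ARBITRARY_COEFFS, *p) for p in projects]

coeffs = fit_linear(projects, sizes)
estimate = predict(coeffs, 30, 28, 6.0)  # size estimate for a new, hypothetical model
```

Because the synthetic sizes contain no noise, the fit simply recovers the generating coefficients. With real project data one would fit a separate model per system environment, as the review describes, and judge it on held-out projects.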