Computing Reviews
Recognizing authors: an examination of the consistent programmer hypothesis
Hayes J., Offutt J. Software Testing, Verification & Reliability 20(4): 329-356, 2010. Type: Article
Date Reviewed: Jul 14 2011

At 29 pages, this paper provides a more detailed “what we did” report than is usual for research papers. What the researchers attempted was to empirically assess the validity of the consistent programmer hypothesis. A common basis for that hypothesis is the observation that programmers, like authors in general, exhibit their own idiosyncratic patterns of language use. (What distinguishes your favorite fiction author from other authors?)

More specifically, the researchers set out to distinguish C source code authorship by examining the “style facets” found in written C programs. Since the paper was published in a testing journal, it also frequently notes the relevance of the research to software testing.

The researchers recruited five experienced professional C programmers as volunteer subjects. Working independently from three written specifications, each of the five wrote three C programs. The researchers analyzed those 15 programs by manually tallying features (such as counts of semicolons used), by subjecting them to static analysis with the lint tool and the ATAC (automatic test analysis for C) tool, and by subjecting them to dynamic analysis with the PISCES test analysis tool. Additionally, the researchers acquired 60 student programs (each of 15 students in a networks class wrote four C programs) and analyzed them by the same processes used for the 15 professional programs. In this paper, however, the researchers emphasize their findings from the professional C programs.
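
To make the facet tallying concrete, the following is a minimal Python sketch; the facets counted here (semicolon totals, opening braces on their own line, mean identifier length) are hypothetical stand-ins, not the specific style facets the paper measures, and the input file name is invented for illustration.

    import re
    from collections import Counter
    from pathlib import Path

    def tally_style_facets(c_source: str) -> Counter:
        # Illustrative facets only; not the paper's actual facet set.
        facets = Counter()
        facets["semicolons"] = c_source.count(";")
        # Count opening braces that sit alone on a line (a layout habit).
        facets["braces_on_own_line"] = sum(
            1 for line in c_source.splitlines() if line.strip() == "{"
        )
        # Crude identifier scan; it also matches keywords, which is
        # tolerable for a relative, per-author comparison.
        names = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", c_source)
        if names:
            facets["mean_identifier_length"] = round(
                sum(map(len, names)) / len(names)
            )
        return facets

    # Hypothetical usage: one tally per subject program.
    print(tally_style_facets(Path("subject_program.c").read_text()))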

Using the data from their analyses, the researchers calculated some probabilities, separately for the professional and the student programs, and flagged as significant any probabilities of 0.05 or less. However, most of these seem irrelevant to the statistical testing (covered in Section 5.4) of the researchers’ nine hypotheses (listed in Section 3). I could find no coverage of any testing of the paper’s title hypothesis. Section 5.2.1 presents a “discriminator for identifying programmers,” built in a manner like that of the Welker and Oman Maintainability Index, but does not present any statistical significance testing of the discriminator.
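
For readers unfamiliar with the Welker and Oman Maintainability Index, it combines averaged code metrics into a single score through a fitted weighted formula; a discriminator built “in a manner like that” would presumably do the same over style-facet averages. The Python sketch below shows only that general form; the facet names, weights, and intercept are hypothetical, not the coefficients derived in the paper.

    # Hypothetical coefficients that echo the fitted-weights form of the
    # Maintainability Index (one published form is
    # MI = 171 - 5.2*ln(aveVol) - 0.23*aveV(g) - 16.2*ln(aveLOC));
    # the paper's actual discriminator uses its own facets and weights.
    INTERCEPT = 100.0
    WEIGHTS = {
        "semicolons_per_line": -12.0,
        "mean_identifier_length": 3.5,
    }

    def discriminator(facet_averages: dict) -> float:
        # Weighted linear combination of per-program style-facet averages.
        return INTERCEPT + sum(
            weight * facet_averages.get(name, 0.0)
            for name, weight in WEIGHTS.items()
        )

    # Hypothetical usage: score two programs; similar scores would
    # suggest (not prove) a common author.
    print(discriminator({"semicolons_per_line": 0.8, "mean_identifier_length": 6.2}))
    print(discriminator({"semicolons_per_line": 1.4, "mean_identifier_length": 4.1}))

Whether such scores separate authors better than chance is precisely what a significance test at the 0.05 level would establish, which is the gap noted above for Section 5.2.1.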

Overall, this paper has an interesting research design, good coverage of related work by others, and an impressive choice of references. The researchers have nicely enhanced the presentation of the software testing aspects of their work (to meet the needs of the journal) while still keeping what was probably the core of their original research report. Regrettably, the relatively small number of programmers involved limited the statistical power of the research findings.

Reviewer: Ned Chapin. Review #: CR139241 (1112-1292)
Testing And Debugging (D.2.5); Metrics (D.2.8)
Other reviews under "Testing And Debugging":

Software defect removal. Dunn R., McGraw-Hill, Inc., New York, NY, 1984. Type: Book (9789780070183131). Reviewed: Mar 1 1985
On the optimum checkpoint selection problem. Toueg S., Babaoglu O. SIAM Journal on Computing 13(3): 630-649, 1984. Type: Article. Reviewed: Mar 1 1985
Software testing management. Royer T., Prentice-Hall, Inc., Upper Saddle River, NJ, 1993. Type: Book (9780135329870). Reviewed: Mar 1 1994
