Open-source software licenses determine the conditions under which code can be used or changed. The Open Source Initiative lists dozens of licenses to choose from. This paper summarizes recent research on software licensing and describes the requirements of an auditing system.
Ninka, with a reported precision of 96 percent, is said to be the best tool for identifying licenses in source code. Previous research found systems that had changed license type and license version, as well as systems composed of components with incompatible licenses. A survey of developers found that personal bias and morals influenced the choice of initial license and that licenses were changed when code was to be used commercially. A study of several thousand Java projects on GitHub found “a lack of traceability between the license changes and commit messages and issue tracker discussions.” A qualitative analysis of the messages and discussions identified 25 categories of license change. Some commit messages simply stated that the license was updated.
The authors propose an auditing system that can analyze license compliance, describe the reasons for a failed audit, and propose fixes for any detected incompatibilities. They also propose the ability to determine software provenance, making use of forges such as GitHub and SourceForge. However, as the authors note, meeting this requirement depends on finding an approach that is not too computationally expensive.
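To make the three audit requirements concrete (check compliance, explain a failure, suggest a fix), here is a minimal sketch in Python. It is not the authors' design. The compatibility table `ALLOWED_INBOUND`, the `Finding` record, and the `audit` function are hypothetical names introduced for illustration, and the table is a deliberately simplified assumption rather than legal guidance.

```python
from dataclasses import dataclass, field

# ASSUMPTION: a tiny, simplified table mapping a project license to the
# component licenses treated as acceptable. Illustrative only.
ALLOWED_INBOUND = {
    "MIT": {"MIT"},
    "Apache-2.0": {"MIT", "Apache-2.0"},
    "GPL-3.0-only": {"MIT", "Apache-2.0", "GPL-3.0-only"},
}


@dataclass
class Finding:
    """One failed check: which component, why, and what might fix it."""
    component: str
    license: str
    reason: str
    suggested_fixes: list = field(default_factory=list)


def audit(project_license, components):
    """Return a list of findings; an empty list means the audit passed.

    `components` maps a component name to its license identifier.
    """
    allowed = ALLOWED_INBOUND.get(project_license)
    if allowed is None:
        raise ValueError(f"no compatibility data for {project_license!r}")

    # Project licenses in the table that would accept every component as-is.
    relicense_options = sorted(
        candidate
        for candidate, accepted in ALLOWED_INBOUND.items()
        if all(lic in accepted for lic in components.values())
    )

    findings = []
    for name, lic in sorted(components.items()):
        if lic in allowed:
            continue
        fixes = [f"replace {name} with a component under one of: "
                 + ", ".join(sorted(allowed))]
        fixes += [f"relicense the project under {opt}"
                  for opt in relicense_options]
        findings.append(Finding(
            component=name,
            license=lic,
            reason=f"{lic} is not in the set accepted by {project_license}",
            suggested_fixes=fixes,
        ))
    return findings
```

Calling `audit("MIT", {"parser": "MIT", "crypto": "GPL-3.0-only"})` under these assumptions yields a single finding for `crypto`, with a stated reason and two candidate fixes. A real auditing system would also need license detection (for example with a tool such as Ninka) and provenance data, which this sketch leaves out.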
This short paper provides many useful insights into the problems of open-source software licensing and is strongly recommended to the software engineering community.