Computing Reviews

Towards the use of machine learning algorithms to enhance the effectiveness of search strings in secondary studies
Cairo L., Monteiro M., de F. Carneiro G., Brito e Abreu F. SBES 2019 (Proceedings of the XXXIII Brazilian Symposium on Software Engineering, Salvador, Brazil, Sep 23-27, 2019), 22-26, 2019. Type: Proceedings
Date Reviewed: 03/26/21

The authors propose using text mining to improve the construction of search strings for so-called secondary studies, that is, survey articles. The primary measures of success in this endeavor should be recall (that is, retrieving a high proportion of relevant articles) and “workload” (that is, reducing a researcher’s work while improving retrieval). The paper specifies, in detail, the complete criteria used.
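To make the recall measure concrete, here is a minimal sketch in Python. The function name and the article identifiers are hypothetical and are not taken from the paper; recall is simply the fraction of known relevant articles that a search string retrieves.

```python
def recall(retrieved_ids, relevant_ids):
    """Fraction of the relevant articles that the search actually retrieved."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("recall is undefined when there are no relevant articles")
    found = relevant & set(retrieved_ids)
    return len(found) / len(relevant)


# Hypothetical identifiers: the search finds two of the four relevant articles.
retrieved = ["a1", "a2", "a9"]
relevant = ["a1", "a2", "a3", "a4"]
score = recall(retrieved, relevant)
```

A search string that retrieves every article already included in an existing systematic review would score 1.0 against that review.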

The authors use the Scopus document retrieval site as the repository for potential articles. The main reason for choosing Scopus is that it offers an application programming interface (API) to the repository. Extending the ideas to multiple repositories is left as future work.

After retrieving a set of articles, the authors use a specially written tool to obtain a list of words that characterizes the retrieved documents. To illustrate the ideas, three specific systematic literature reviews are used as input for the system. The sought-after search string should retrieve the articles in the systematic reviews while also finding further relevant articles. The algorithms used in the analysis of retrieved documents include term frequency-inverse document frequency (TF-IDF), continuous bag-of-words (CBOW), and skip-gram models.
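The term-ranking idea can be illustrated with a small Python sketch. This is not the authors' tool: the function names, the smoothed IDF formula, the summing of scores across documents, and the sample texts are all assumptions made for illustration, and the output is a generic Boolean OR string rather than real Scopus query syntax.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def top_terms(documents, k=3, stopwords=frozenset()):
    """Rank terms by their TF-IDF score summed over all documents."""
    tokenized = [[t for t in tokenize(d) if t not in stopwords] for d in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = Counter()
    for tokens in tokenized:
        if not tokens:
            continue
        for term, count in Counter(tokens).items():
            tf = count / len(tokens)
            # Smoothed IDF (an assumption; other variants are common).
            idf = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1
            scores[term] += tf * idf

    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


def build_search_string(terms):
    """Join the terms into a generic Boolean OR query."""
    return " OR ".join(f'"{t}"' for t in terms)


# Hypothetical abstracts standing in for retrieved documents.
docs = ["code smell detection", "code smell refactoring", "code review"]
query = build_search_string(top_terms(docs, k=2))
```

CBOW and skip-gram models would play a different role, suggesting semantically related terms rather than ranking terms by frequency; they are omitted here because they require a trained embedding model.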

The paper describes results for the three survey papers and presents the recommended search strings. In a welcome move, the tool for implementing the approach is available on a public website. A minor but annoying nit: the body of the paper cites references by number, but the bibliography entries are not numbered. The paper is clearly (if densely) written and should be of interest to researchers looking to create a secondary study.

Reviewer:  J. P. E. Hodgson Review #: CR147225 (2107-0181)

Reproduction in whole or in part without permission is prohibited.   Copyright 2024 ComputingReviews.com™