Computing Reviews

From word embeddings to document similarities for improved information retrieval in software engineering
Ye X., Shen H., Ma X., Bunescu R., Liu C. ICSE 2016 (Proceedings of the 38th International Conference on Software Engineering, Austin, TX, May 14-22, 2016), 404-415, 2016. Type: Proceedings
Date Reviewed: 05/10/17

Writing comments in code is one of the core practices of software engineering. The authors map natural language text, such as comments, to the code it describes. They do a very good job of approaching natural language processing and semantic analysis in a different way: they represent natural language statements and code snippets as vectors in a shared space in order to capture their meaning. Usually, Web Ontology Language (OWL) ontologies are used in these situations; these are documents that the semantic web community uses to map words to a data dictionary. Even so, it remains difficult to parse such statements and extract the meaning of the words and the context in which they are used.
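To make the idea of comparing text and code through vectors concrete, here is a minimal sketch in Python. It is not the authors' method: it simply averages word vectors per document and compares the averages with cosine similarity, a common simplification. The `EMBEDDINGS` table and the function names are hypothetical, and the three-dimensional vectors are toy values chosen by hand rather than learned from any corpus.

```python
import math

# Hypothetical toy embeddings, hand-picked for illustration only.
# In practice these would be learned from a large text corpus.
EMBEDDINGS = {
    "open":   [0.9, 0.1, 0.0],
    "read":   [0.8, 0.2, 0.1],
    "file":   [0.1, 0.9, 0.0],
    "stream": [0.2, 0.8, 0.1],
    "color":  [0.0, 0.1, 0.9],
}


def doc_vector(tokens):
    """Average the vectors of the known tokens; None if no token is known."""
    known = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not known:
        return None
    dims = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dims)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity(doc_a, doc_b):
    """Similarity of two token lists via their averaged word vectors."""
    vec_a, vec_b = doc_vector(doc_a), doc_vector(doc_b)
    if vec_a is None or vec_b is None:
        return 0.0
    return cosine(vec_a, vec_b)


if __name__ == "__main__":
    comment = ["open", "file"]        # tokens from a natural language comment
    snippet = ["read", "stream"]      # tokens from a code snippet
    unrelated = ["color"]
    print(similarity(comment, snippet) > similarity(comment, unrelated))
```

With these toy values, the comment and the snippet share no words yet still score as far more similar than the comment and the unrelated token, which is the kind of lexical gap that embedding-based similarity is meant to bridge.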

Investigating techniques such as latent semantic indexing (LSI) and latent Dirichlet allocation (LDA) for feature location and bug localization, the authors organize their results into four categories. Their research questions ask (1) whether adding word embeddings helps improve extraction, (2) how the word embeddings are trained, (3) whether training helps improve results, and (4) whether similarity can be predicted. The techniques described seem very similar to the clustering and prediction methods that machine learning uses to understand text. This presents a strong approach for natural language and semantic processing researchers, in which models can be trained on text to capture meaning. In this case, the paper applies the approach to finding software bugs, which is a very interesting case study. This is a new approach that should definitely be expanded on in future work.

Reviewer: Mariam Kiran
Review #: CR145262 (1707-0469)

Reproduction in whole or in part without permission is prohibited.   Copyright 2024 ComputingReviews.com™