Computing Reviews
From word embeddings to document similarities for improved information retrieval in software engineering
Ye X., Shen H., Ma X., Bunescu R., Liu C.  ICSE 2016 (Proceedings of the 38th International Conference on Software Engineering, Austin, TX,  May 14-22, 2016) 404-415. 2016. Type: Proceedings
Date Reviewed: May 10 2017

Writing comments in code is one of the core practices of software engineering. The authors map natural language text, such as comments, to the code it describes. They do a very good job of approaching natural language processing and semantic analysis in a different way: they represent natural language statements and code snippets as vectors in a shared representation space in order to capture their meaning. Usually, web ontology language (OWL) ontologies are used in these situations; these are documents that the semantic web community uses to map words to a data dictionary. Even so, it remains difficult to parse such statements and extract the meaning of the words and the context in which they are used.
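To make the vector-based idea concrete, here is a minimal sketch in Python of scoring the similarity between a natural language text and a code snippet from word vectors. It is an illustration only and not the formula used in the paper: the three-dimensional embeddings, the word lists, and the function names (cosine, directed_similarity, document_similarity) are all assumptions invented for this example, whereas real embeddings would be trained on a large corpus.

```python
import math

# Toy 3-dimensional word vectors, invented for illustration only.
# Real embeddings would be learned from a corpus and have far more dimensions.
EMBEDDINGS = {
    "open": [0.9, 0.1, 0.0],
    "read": [0.8, 0.2, 0.1],
    "file": [0.1, 0.9, 0.0],
    "stream": [0.2, 0.8, 0.1],
    "color": [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def directed_similarity(source_words, target_words, embeddings=EMBEDDINGS):
    """Average, over source words, of the best cosine match among target words."""
    # Words without a vector are skipped (a simplifying assumption).
    source = [w for w in source_words if w in embeddings]
    target = [w for w in target_words if w in embeddings]
    if not source or not target:
        return 0.0
    best = [
        max(cosine(embeddings[s], embeddings[t]) for t in target)
        for s in source
    ]
    return sum(best) / len(best)


def document_similarity(words_a, words_b, embeddings=EMBEDDINGS):
    """Symmetric score: the mean of the two directed similarities."""
    return 0.5 * (
        directed_similarity(words_a, words_b, embeddings)
        + directed_similarity(words_b, words_a, embeddings)
    )


bug_report = ["open", "file"]      # words from a hypothetical bug report
code_tokens = ["read", "stream"]   # tokens from a hypothetical source file
unrelated = ["color"]              # tokens from an unrelated source file

related_score = document_similarity(bug_report, code_tokens)
unrelated_score = document_similarity(bug_report, unrelated)
print(related_score > unrelated_score)
```

The two texts share no words, yet the related pair scores higher because their vectors point in similar directions. That is the kind of lexical gap between natural language and code that a shared vector representation is meant to bridge.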

Investigating techniques such as latent semantic indexing (LSI) and latent Dirichlet allocation (LDA) for feature location and bug localization, the authors organize their results into four categories, one per research question. They (1) add word embeddings to help improve extraction, (2) train the word embeddings, (3) investigate whether training helps improve results, and (4) ask whether similarity can be predicted. The techniques described seem very similar to the clustering and prediction methods that machine learning uses to understand text. This is a strong approach for natural language and semantic processing researchers, in which models can be trained on text to capture meaning. Here, the paper applies it to finding software bugs, which is a very interesting case study. This is a new approach that should definitely be expanded on in future work.

Reviewer:  Mariam Kiran Review #: CR145262 (1707-0469)
Testing And Debugging (D.2.5)
Information Search And Retrieval (H.3.3)
Natural Language Processing (I.2.7)
Software Engineering (D.2)

Reproduction in whole or in part without permission is prohibited.   Copyright © 2000-2022 ThinkLoud, Inc.