Zhang et al. present a novel approach to increasing the accuracy and efficiency of question-answering systems over a knowledge base (KB). As they explain, “[mapping] a question in a natural language into a fact triple or a collection of fact triples in the KB can be considered a problem of question answering over a KB (QA-KB).” Two approaches exist for tackling the problem. One is based on semantic parsing, which translates natural language questions into a logical structure that is used to query the KB. The other exploits neural network models built on the APA architecture.
The authors go on to present the three main components of an APA architecture: “an entity alignment component [that] seeks and locates the subject in the question, a path label prediction component [that] uses an exploration or prediction strategy to find the most likely path label, and an object answering component [that] locates the object(s) in the KB.” Of these three components, path label prediction is the main challenge.
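To make the three components concrete, here is a minimal Python sketch of an APA-style pipeline over a toy KB of fact triples. It is an illustration only, not the authors' system: the triples, the keyword cues, and the function names (align_entity, predict_path, answer) are all invented for this example, and simple keyword overlap stands in for the neural path label predictor.

```python
# Toy KB of (subject, path label, object) triples. All data is invented.
KB = [
    ("Barack Obama", "place_of_birth", "Honolulu"),
    ("Barack Obama", "spouse", "Michelle Obama"),
    ("Honolulu", "contained_by", "Hawaii"),
]

# Hypothetical keyword cues standing in for a learned path label predictor.
PATH_CUES = {
    "place_of_birth": {"born", "birthplace"},
    "spouse": {"married", "wife", "husband", "spouse"},
    "contained_by": {"located", "state"},
}


def align_entity(question):
    """Entity alignment: return the longest KB subject mentioned in the question."""
    subjects = {s for s, _, _ in KB}
    matches = [s for s in subjects if s.lower() in question.lower()]
    return max(matches, key=len) if matches else None


def predict_path(question, subject):
    """Path label prediction: pick the subject's path label with the most cue hits."""
    words = set(question.lower().replace("?", "").split())
    candidates = sorted({p for s, p, _ in KB if s == subject})
    if not candidates:
        return None
    return max(candidates, key=lambda p: len(words & PATH_CUES.get(p, set())))


def answer(question):
    """Object answering: look up the object(s) for the aligned subject and path."""
    subject = align_entity(question)
    path = predict_path(question, subject)
    return [o for s, p, o in KB if s == subject and p == path]


if __name__ == "__main__":
    print(answer("Where was Barack Obama born?"))
    print(answer("Who is Barack Obama married to?"))
```

Note that predict_path always returns some label for a known subject, even when no cue matches. Nothing in this sketch checks whether the chosen label is actually right, which is why path label prediction is the weak point of such a pipeline.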
The authors propose APVA, “a novel modeling framework” that adds a verification mechanism to the APA model to check the correctness of a predicted relation or path label. They show how the verification mechanism is in some ways a “negative sampling scheme” that “uses the noise examples to better shape the model’s predictive distribution.”
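The general idea of training a verifier with negative samples can be sketched in a few lines of Python. This is not the APVA model itself: the training questions, the label set, the bag-of-words features, and the logistic update are assumptions chosen to keep the example small. Each question is paired once with its correct path label (a positive example) and once with a randomly drawn wrong label (a noise example), and the resulting scorer is then used to accept or reject a predicted label.

```python
import math
import random

# Hypothetical path labels and invented training questions.
LABELS = ["place_of_birth", "spouse", "profession"]
TRAIN = [
    ("where was obama born", "place_of_birth"),
    ("what is the birthplace of einstein", "place_of_birth"),
    ("who is obama married to", "spouse"),
    ("who is the wife of einstein", "spouse"),
    ("what does obama do for a living", "profession"),
    ("what is the job of einstein", "profession"),
]


def features(question, label):
    """One indicator feature per (word, label) pair."""
    return [(word, label) for word in question.split()]


def score(weights, question, label):
    """Estimated probability that `label` is the correct path for `question`."""
    z = sum(weights.get(f, 0.0) for f in features(question, label))
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, epochs=50, lr=0.5, seed=0):
    """Logistic updates on each gold label plus one sampled noise label."""
    rng = random.Random(seed)
    weights = {}
    for _ in range(epochs):
        for question, gold in examples:
            noise = rng.choice([l for l in LABELS if l != gold])
            for label, target in ((gold, 1.0), (noise, 0.0)):
                error = target - score(weights, question, label)
                for f in features(question, label):
                    weights[f] = weights.get(f, 0.0) + lr * error
    return weights


def verify(weights, question, label, threshold=0.5):
    """Accept a predicted path label only if the verifier scores it above threshold."""
    return score(weights, question, label) > threshold


if __name__ == "__main__":
    w = train(TRAIN)
    print(verify(w, "where was obama born", "place_of_birth"))
    print(verify(w, "where was obama born", "spouse"))
```

In a pipeline, a rejected label would trigger a fallback, such as trying the next most likely label, instead of passing a doubtful prediction on to the answering step.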
The authors test their APVA model against two large datasets, SimpleQuestions (SQ) and WebQuestions (WQ). SQ contains “108,442 questions written by human annotators ... with ground-truth answers” and WQ consists of “5,810 questions, generated automatically using the Google Suggest [application programming interface, API] and are associated with Freebase facts.” The experimental results show that APVA performs better, and in some cases much better, than a number of other models. The comparison covers nine other models on WQ and five on SQ.
While question answering in general remains challenging and existing models do not perform as well as humans, the proposed APVA model provides a novel mechanism that verifies a prediction before the information is propagated further. This feedback mechanism improves the performance of such systems.