Computing Reviews
Natural language queries over heterogeneous linked data graphs: a distributional-compositional semantics approach
Freitas A., Curry E. IUI 2014 (Proceedings of the 19th International Conference on Intelligent User Interfaces, Haifa, Israel, Feb 24-27, 2014), 279-288. 2014. Type: Proceedings
Date Reviewed: May 1 2014

Using natural language to query large amounts of heterogeneous linked data is a difficult problem. This paper proposes a distributional-compositional semantics approach to answering queries over linked data. Rather than relying heavily on ontologies, the approach is data-centered and data-driven. Researchers in natural language processing (NLP), linked data, the semantic web, and query answering will want to study this work.

The proposed approach first constructs a distributional semantic vector space, consisting of a term vector space and a concept vector space, for a given data collection. The term vector space is built from all terms available in the given dataset. The concept vector space is built by extracting the co-occurrence patterns of each term from the dataset. The resource description framework (RDF) graph data is then rewritten using the term and concept vector spaces.
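As a rough illustration of what "co-occurrence patterns of each term" can mean, the sketch below builds sparse co-occurrence vectors from a toy corpus and compares them with cosine similarity. The corpus, the window size, and the function names `cooccurrence_vectors` and `cosine` are assumptions made for this example; the paper's actual vector space construction may well differ.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(documents, window=2):
    """Map each term to a Counter of terms seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for doc in documents:
        tokens = doc.lower().split()
        for i, term in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[term][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(key, 0) for key, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy corpus (hypothetical); a real system would use a large text collection.
corpus = [
    "barack obama married michelle obama",
    "the spouse of barack obama is michelle obama",
    "paris is the capital of france",
]
vectors = cooccurrence_vectors(corpus)

# Terms used in similar contexts end up with more similar vectors.
related = cosine(vectors["married"], vectors["spouse"])
unrelated = cosine(vectors["married"], vectors["capital"])
```

In this toy setting, "married" and "spouse" share a context word ("barack"), so their similarity is positive, while "married" and "capital" share none.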

Each RDF predicate is represented as a weighted concept vector, and each RDF instance is represented as a weighted term vector. Given a natural language query, the proposed system first extracts a set of query features and resolves the query into an RDF-like format. After the query is analyzed, the system generates a query processing plan that maps the extracted query features and the semi-structured query representation into a set of search, navigation, and transformation operations over the queried data graph. Finally, the operations of the query processing plan are executed over the distributional-compositional semantic vector space to identify the results of the query.
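The search and navigation steps can be pictured with a small sketch: locate a pivot instance, then rank its predicates by vector similarity to a term from the query. Everything here is hypothetical, including the triples, the hand-picked weights in `PREDICATE_VECTORS` and `QUERY_TERM_VECTORS`, the `answer` function, and the 0.5 threshold; it is a simplified stand-in for the idea, not the system described in the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(key, 0.0) for key, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy graph of (subject, predicate, object) triples.
TRIPLES = [
    ("Barack_Obama", "spouse", "Michelle_Obama"),
    ("Barack_Obama", "birthPlace", "Honolulu"),
    ("France", "capital", "Paris"),
]

# Hypothetical weighted concept vectors for predicates (weights are made up).
PREDICATE_VECTORS = {
    "spouse": {"marriage": 0.9, "family": 0.6},
    "birthPlace": {"geography": 0.8, "birth": 0.9},
    "capital": {"geography": 0.9, "government": 0.7},
}

# Hypothetical concept vectors for terms that may appear in a query.
QUERY_TERM_VECTORS = {
    "wife": {"marriage": 0.8, "family": 0.7},
    "born": {"birth": 0.9, "geography": 0.3},
}


def answer(pivot, relation_term, threshold=0.5):
    """Navigate from `pivot` along predicates similar to `relation_term`."""
    query_vec = QUERY_TERM_VECTORS[relation_term]
    scored = [
        (cosine(query_vec, PREDICATE_VECTORS[pred]), pred, obj)
        for subj, pred, obj in TRIPLES
        if subj == pivot  # search step: restrict to the pivot instance
    ]
    scored.sort(reverse=True)  # best-matching predicate first
    return [(pred, obj) for score, pred, obj in scored if score >= threshold]


wife_result = answer("Barack_Obama", "wife")
born_result = answer("Barack_Obama", "born")
```

Note that the query term "wife" never appears in the graph; it reaches the `spouse` predicate only through vector similarity, which is the kind of vocabulary gap a distributional approach is meant to bridge.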

The experiments evaluated the proposed approach on the Question Answering over Linked Data 2011 (QALD-1) test collection. The results show that the proposed system achieves better recall, precision, and reciprocal rank than other systems, including PowerAqua and FREyA. However, the paper would have been more comprehensive if the authors had evaluated the proposed system on more than one dataset.
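For readers less familiar with the three measures named above, the sketch below computes them for a single query using their standard definitions. The function names and the example answers are assumptions for illustration; the exact evaluation protocol used in the paper or by QALD-1 is not reproduced here.

```python
def precision_recall(returned, relevant):
    """Precision and recall of a returned answer set against the gold answers."""
    returned, relevant = set(returned), set(relevant)
    hits = len(returned & relevant)
    precision = hits / len(returned) if returned else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first correct answer in a ranked list, or 0.0 if none."""
    relevant = set(relevant)
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


# Hypothetical query: four answers returned, three gold answers.
precision, recall = precision_recall(["a", "b", "c", "d"], ["a", "c", "e"])
rr = reciprocal_rank(["x", "a", "c"], ["a", "c"])
```

Here two of the four returned answers are correct (precision 0.5), two of the three gold answers are found (recall 2/3), and the first correct answer sits at rank 2 (reciprocal rank 0.5). Averaging the last value over all queries gives the mean reciprocal rank.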

Reviewer: Yingjie Li
Review #: CR142239 (1408-0671)
Heterogeneous Databases (H.2.5)
Content Analysis And Indexing (H.3.1)
Information Search And Retrieval (H.3.3)
Natural Language Processing (I.2.7)


Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®