ComputingReviews.com

Learning the semantics of structured data sources
Taheriyan M., Knoblock C., Szekely P., Ambite J. Journal of Web Semantics 37(C): 152-169, 2016. Type: Article

Date Reviewed: 09/07/16

The semantic web promises to make web resources machine-understandable by linking them to semantic models that describe individual entities and their relationships. Although large knowledge graphs have already been created, it remains a challenge to integrate data from sources that do not provide this information, such as relational databases and spreadsheets.

The paper addresses this problem by using machine learning techniques that leverage previously constructed models to automatically derive new models for data sources in the same application domain. The process starts by learning candidates for the semantic types of a source’s data attributes, and proceeds by creating a graph that links the already-known semantic models to these types. In the core step, the source attributes are mapped to the graph nodes by a scalable heuristic search algorithm that avoids combinatorial explosion. Finally, a ranked set of candidate models is generated from the most promising candidate mappings.
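
To give a flavor of how a bounded search can sidestep combinatorial explosion, here is a minimal sketch in Python. It is not the algorithm from the paper: it swaps in a plain beam search over attribute-to-type assignments and leaves out the graph of known models entirely. The function name, the attribute names, the semantic type labels, and the confidence scores are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical input: for each source attribute, a list of candidate
# semantic types with a learned confidence score (illustrative values).
Candidates = Dict[str, List[Tuple[str, float]]]
Scored = Tuple[Dict[str, str], float]


def rank_mappings(candidates: Candidates, beam_width: int = 2) -> List[Scored]:
    """Beam search over attribute-to-type mappings (illustrative only).

    After each attribute, only the `beam_width` best partial mappings are
    kept, so the number of states stays bounded instead of growing as the
    product of all candidate-list lengths.
    """
    beam: List[Scored] = [({}, 1.0)]
    for attribute, options in candidates.items():
        # Extend every surviving partial mapping with every candidate type.
        expanded = [
            ({**mapping, attribute: sem_type}, score * confidence)
            for mapping, score in beam
            for sem_type, confidence in options
        ]
        # Keep the highest-scoring extensions and discard the rest.
        expanded.sort(key=lambda item: item[1], reverse=True)
        beam = expanded[:beam_width]
    return beam


# Hypothetical museum-style source with two attributes.
candidates = {
    "artist": [("Person.name", 0.9), ("Artwork.title", 0.1)],
    "title": [("Artwork.title", 0.7), ("Person.name", 0.3)],
}
ranked = rank_mappings(candidates, beam_width=2)
for mapping, score in ranked:
    print(f"{score:.2f}", mapping)
```

The result is a ranked list of candidate mappings, best first, which loosely mirrors the ranked set of candidate models that a human expert would then inspect and refine.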

This approach is thoroughly described and clearly demonstrated by a running example; it has been implemented in the data framework Karma and evaluated on datasets of artwork provided by various museums. This evaluation demonstrates that, even if only a few models are already known, newly generated models are 80 percent accurate compared to manually created ones. A human expert can thus interactively evaluate the suggested models, select the most promising one, and further modify it to obtain the desired result. This is much easier than constructing such models from scratch and provides a pragmatic road toward the semantic web.

Reviewer: Wolfgang Schreiner

Review #: CR144736 (1612-0909)

Reproduction in whole or in part without permission is prohibited. Copyright 2024 ComputingReviews.com™