Computing Reviews
Database and information-retrieval methods for knowledge discovery
Weikum G., Kasneci G., Ramanath M., Suchanek F. Communications of the ACM 52(4): 56-64, 2009. Type: Article
Date Reviewed: May 8 2009

This excellent article calls for the unification of the information retrieval (IR) and database (DB) communities. It provides context and background for their separate, yet converging, paths over the past 40 years of computer science (CS) research. Furthermore, the article frames the dovetailing of IR and DB research as a grand challenge, driven by increasing data volumes, cheap storage, and high computational capacity, as well as by demanding user expectations for flexible and efficient search across large data corpora. In order to evaluate and provide meaningful answers to queries, such as “find young patients in central Europe, who have been reported in the past two weeks to have symptoms of tropical virus diseases and an indication of anomalies,” the IR and DB communities must cross-fertilize, according to the authors.

Weikum et al. list several low-hanging fruit areas. The first is approximate matching and record linkage: for example, recognizing when queries for William J. Clinton are really the same as queries for Bill Clinton. The second is schema relaxation and heterogeneity: dealing with unstructured or semi-structured data transparently and accessing the data in a variety of organizational structures, such as an entity-relationship (E-R) model, tuples, or the resource description framework (RDF). The third is information extraction and uncertain data, that is, extracting structured information from text and assessing it with a measure of confidence. The last is entity search and ranking: ensuring that the top N results returned by a search are relevant to the users asking.
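To make the first of these areas concrete, here is a minimal Python sketch of approximate name matching. It is not taken from the article: the nickname table, the `normalize` and `same_entity` helpers, and the 0.85 threshold are all assumptions chosen for illustration, and a real record-linkage system would rely on far richer resources and scoring.

```python
from difflib import SequenceMatcher

# Hypothetical nickname table (an assumption for this sketch);
# a real system would use a much larger resource.
NICKNAMES = {"bill": "william", "will": "william", "bob": "robert"}


def normalize(name):
    """Lowercase tokens, strip punctuation, drop initials, expand nicknames."""
    tokens = [t.strip(".,").lower() for t in name.split()]
    # Single-letter tokens are treated as middle initials and ignored.
    return [NICKNAMES.get(t, t) for t in tokens if len(t) > 1]


def same_entity(a, b, threshold=0.85):
    """Guess whether two name strings refer to the same person.

    The threshold is an illustrative assumption, not a tuned value.
    """
    left = " ".join(normalize(a))
    right = " ".join(normalize(b))
    return SequenceMatcher(None, left, right).ratio() >= threshold
```

With these assumptions, `same_entity("William J. Clinton", "Bill Clinton")` treats the two strings as the same person, because both normalize to the same token sequence, while clearly different names fall below the threshold.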

After laying out the grand vision and the areas for collaboration, the authors illustrate recent successes in collaborations between the IR and DB communities, drawing on their own experience with the Yet Another Great Ontology (YAGO) project. YAGO is a growing ontology, represented in RDF and Web Ontology Language (OWL) Lite. It has been used to engender several recent search systems, including Microsoft’s Libra scholarly database. These sections show how some of the low-hanging fruit areas for DB and IR collaboration are represented in YAGO-driven projects, including the already-mentioned Libra, as well as Cimple/DBLife and KnowItAll/TextRunner. At the same time, this part of the article feels a tad self-referential and grandiose; so read on, but your mileage may vary.

I found the first part of the article, and the article as a whole, fascinating. It challenges both the IR and DB communities to come together for the greater good of search, information representation, and retrieval in the modern world. Half of the article is a great overview of the challenge. While the other half is a bit wordy and technical, it does provide some context for how these challenges are being addressed today.

Reviewer:  Chris Mattmann Review #: CR136803 (1001-0072)
Database Management (H.2)
Digital Libraries (H.3.7)
Information Search And Retrieval (H.3.3)
Online Information Services (H.3.5)