Computing Reviews
Integrating Web query results: holistic schema matching
Chuang S., Chang K. CIKM 2008 (Proceedings of the 17th ACM Conference on Information and Knowledge Management, Napa Valley, CA, Oct 26-30, 2008), 33-42, 2008. Type: Proceedings
Date Reviewed: Jun 2 2009

Schema matching is one of the challenging problems in handling multiple data sources, and it becomes even harder when only sample instances are available. Chuang and Chang’s work attempts to address this challenge.

The authors explain the pairwise schema matching techniques that other researchers have applied to instance-based schema matching. They claim that holistic schema matching is the same as domain schema discovery. The major contribution seems to be the extension of pairwise schema matching to what could be termed weighted multi-pair integrated matching. Chuang and Chang verify the effectiveness of their algorithm using case studies from four domains: airfares, books, cars, and CDs. In these studies, the proposed holistic matching algorithm provides the best matching performance compared with a few other algorithms, such as cluster and chain matching.

While their claim may be valid for the sample set used in the comparison, it would be very difficult to generalize it without a much deeper analysis. First, the data sources selected in the chosen domains have relatively comparable schemas, for example, expedia.com and travelocity.com. Whether the algorithm would perform equally well on diverse schemas within the same domain therefore remains an open question. Second, the authors do not address what would happen if the domains were changed and, particularly, if the number of fields increased significantly. Surprisingly, the authors claim to have carried out extensive experiments, although they used only about 300 to 400 records, drawn from 30 to 40 sample pages, in each domain. Nevertheless, the paper attempts to address a very important challenge in schema matching.

Reviewer:  Sithu D. Sudarsan Review #: CR136895 (1010-1046)
Miscellaneous (H.2.m)
Heterogeneous Databases (H.2.5)
