Computing Reviews
Managing heterogeneous information systems through discovery and retrieval of generic concepts
Srinivasan U., Ngu A., Gedeon T. Journal of the American Society for Information Science 51(8): 707-723, 2000. Type: Article
Date Reviewed: Jun 1 2000

This paper focuses on data integration problems that result from a lack of semantic schema information. Analyzing these problems requires an understanding of the myriad meanings and uses of data that different applications and users derive from different contextual interpretations. According to the authors, the paper “suggests a solution to the problem of identifying semantically equivalent data in heterogeneous databases through the idea of information reengineering, where information is customized to reflect the domain knowledge of distinct user groups.”

This well-written, well-referenced, and easy-to-read paper adds substantially to the literature on heterogeneous information system technology. The key contribution is the application of heterogeneous system technology to address data integration, semantics, querying, and data clustering. Concepts are explained and validated with an empirical analysis of a medical information system. This suggests that the authors’ approach to categorizing and retrieving data from heterogeneous information systems can be generalized for productive use and is not just another academic example for an academic paper. The authors either describe or cite all the key heterogeneous database literature, presenting the history of each approach and the problems encountered when using it.

The authors’ two main contributions are a concept discovery approach called ConceptDISH (concept discovery from heterogeneous databases) and an information reengineering framework called ConceptVIEW.

There are two problems with searching for common concepts in heterogeneous databases: mismatches between data semantics and domains, and data confidentiality. This work addresses the confidentiality problem by using metadata rather than actual data. The authors use a clinical information system domain as the basis for their assertion that there is sufficient empirical evidence that users of heterogeneous databases have common knowledge and query needs, even though their databases may differ. This suggests that users have both similar domain knowledge and a cognitive load that indicates background knowledge. Building on these assumptions, the authors suggest a “wrapping service” that generates a conceptual layer above the legacy database system.

ConceptDISH exploits similarities in database structure and usage patterns to automate the discovery of a set of well-understood, domain-dependent generic concepts that coincide with the user’s vocabulary in the application domain. These concepts are the basis for accessing information from heterogeneous databases. This discovery method uses the data dictionary as a main source of application domain knowledge.

ConceptDISH’s discovery algorithm can be classified as an observation learning paradigm in which legacy system database objects are classified into groups that can be described by a concept from a predefined concept class, which is well understood within the application domain. The ConceptDISH algorithm uses the values of the variables to partition these objects into clusters. The authors make a clear case that it is impractical to force disparate users to use identical sets of objects to retrieve information. Instead, the paper proposes and describes the ConceptDISH approach to concept discovery.
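
To make the idea of grouping legacy database objects under predefined concepts more concrete, here is a minimal sketch. It is not the ConceptDISH algorithm: it swaps in a simple Jaccard overlap between data dictionary tokens and concept keywords, and the concept classes, keyword sets, object names, and threshold are all hypothetical. Like the approach reviewed here, it works only on metadata (names and dictionary descriptions), never on the stored data.

```python
from collections import defaultdict

# Hypothetical concept classes, each described by a few dictionary keywords.
CONCEPTS = {
    "patient": {"patient", "name", "birth", "mrn"},
    "medication": {"drug", "dose", "medication", "rx"},
}


def tokens(text):
    """Lower-case a name or description and split it into a set of tokens."""
    return set(text.lower().replace("_", " ").split())


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def assign_concepts(dictionary, concepts=CONCEPTS, threshold=0.1):
    """Group metadata entries under the most similar concept.

    Entries whose best similarity falls below the (assumed) threshold
    are collected under 'unassigned'.
    """
    clusters = defaultdict(list)
    for obj_name, description in dictionary.items():
        toks = tokens(obj_name) | tokens(description)
        best, score = max(
            ((c, jaccard(toks, kw)) for c, kw in concepts.items()),
            key=lambda pair: pair[1],
        )
        clusters[best if score >= threshold else "unassigned"].append(obj_name)
    return dict(clusters)


if __name__ == "__main__":
    # Hypothetical data dictionary entries from two legacy databases.
    data_dictionary = {
        "db1.PAT_NAME": "patient name",
        "db2.drug_dose": "prescribed medication dose",
        "db2.audit_ts": "row audit timestamp",
    }
    print(assign_concepts(data_dictionary))
```

A real system would presumably use richer structural and usage features than token overlap; the sketch only illustrates classification of metadata into a predefined concept class.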

While ConceptDISH uses similarities in the static and dynamic properties of databases to discover concepts that they have in common, ConceptVIEW models these concepts as application objects that reflect the vocabulary of different user groups within the application domain.
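
As a rough illustration of how a discovered concept might be exposed as an application object, the following sketch pairs a group data object with per-source extractor functions. The class name, method names, and sample rows are assumptions made for this example and do not come from the paper; the only ideas taken from it are that a concept is presented in a user group's vocabulary and that an extractor queries one local data source.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, object]
Extractor = Callable[[], List[Row]]


@dataclass
class GroupDataObject:
    """A generic concept exposed under one user group's vocabulary (hypothetical)."""

    concept: str
    extractors: Dict[str, Extractor] = field(default_factory=dict)

    def register(self, source: str, extractor: Extractor) -> None:
        """Attach the extractor that knows how to read one local source."""
        self.extractors[source] = extractor

    def query(self, source: str) -> List[Row]:
        """Run the extractor for one local source and tag each row with it."""
        return [{"source": source, **row} for row in self.extractors[source]()]


if __name__ == "__main__":
    # Hypothetical in-memory stand-in for a legacy table.
    legacy_patients = [{"PAT_NAME": "Ada"}]

    patient = GroupDataObject("patient")
    # The extractor renames the legacy column to the user group's term.
    patient.register(
        "db1", lambda: [{"name": r["PAT_NAME"]} for r in legacy_patients]
    )
    print(patient.query("db1"))
```

Keeping each extractor as a plain callable means the legacy schema stays hidden behind the concept, which is the point of a conceptual layer above the legacy system.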

This paper provides a laconic exegesis of the ConceptDISH algorithm; the ConceptVIEW layer, which uses a group data object and associated “extractors” to query one local data source; and experimental results based on a clinical information database. I recommend it to anyone working with problems of heterogeneous databases or information systems. The ultimate achievement for these authors will be to produce similar results with other information systems. If they succeed in this endeavor, their work will have both academic and practical value.

Reviewer: Felipe Cariño Jr. Review #: CR123005
Object-Oriented Databases (H.2.4 ... )
 
 
Data Models (H.2.1 ... )
 

Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®