Computing Reviews
Profiling relational data: a survey
Abedjan Z., Golab L., Naumann F. The VLDB Journal: The International Journal on Very Large Data Bases 24(4): 557-581, 2015. Type: Article
Date Reviewed: Mar 31 2016

This survey paper addresses the question of how to obtain metadata for a dataset. Much of the analysis concerns the automation needed to discover appropriate metadata and the complexities of such discovery.

Traditionally, metadata is something that one would be able to see while processing data. In this survey paper, the authors describe the difficulties in identifying metadata that arises from relationships among the data. Since metadata is data about data, one can appreciate the analysis needed to discover appropriate metadata when the volume of data is very large or the data arrive extremely rapidly, both of which are characteristics of big data. The authors address this aspect as well, and include more than 100 references for further exploration.

The paper classifies data profiling tasks and provides an extensive review of each class. One of its main contributions is a new taxonomy of profiling tasks. In particular, the survey looks at the analysis of individual columns, the analysis of multiple columns, and dependency detection between columns. The authors also discuss the tools used for these tasks.
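To make the three task classes concrete, here is a minimal sketch in Python. It is not taken from the paper: the sample rows, the column names, and the helper functions `profile_column`, `is_unique`, and `holds_fd` are all hypothetical, and nulls are treated as ordinary values for simplicity. It illustrates a single-column statistic, a uniqueness check over a column combination, and a naive test of whether a functional dependency holds.

```python
# Hypothetical sample relation; the data and column names are illustrative only.
rows = [
    {"id": 1, "zip": "10115", "city": "Berlin"},
    {"id": 2, "zip": "10115", "city": "Berlin"},
    {"id": 3, "zip": "14467", "city": "Potsdam"},
    {"id": 4, "zip": None, "city": "Potsdam"},
]


def profile_column(rows, col):
    """Single-column profiling: count nulls and distinct non-null values."""
    values = [r[col] for r in rows]
    non_null = [v for v in values if v is not None]
    return {"nulls": len(values) - len(non_null), "distinct": len(set(non_null))}


def is_unique(rows, cols):
    """Multi-column analysis: does the column combination identify each row?"""
    seen = set()
    for r in rows:
        key = tuple(r[c] for c in cols)
        if key in seen:
            return False
        seen.add(key)
    return True


def holds_fd(rows, lhs, rhs):
    """Dependency detection: check whether the columns in lhs determine rhs.

    Assumption: None is compared like any other value.
    """
    mapping = {}
    for r in rows:
        key = tuple(r[c] for c in lhs)
        # Remember the first rhs value seen for this key; any later mismatch
        # violates the dependency.
        if mapping.setdefault(key, r[rhs]) != r[rhs]:
            return False
    return True
```

On this sample, `holds_fd(rows, ["zip"], "city")` holds while `holds_fd(rows, ["city"], "zip")` does not, because the city "Potsdam" appears with two different zip values. Checking one candidate at a time like this is only workable for tiny inputs; the number of candidate column combinations grows exponentially with the number of columns, which is part of the discovery complexity the review mentions.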

In my opinion, the authors have done an excellent job of covering data profiling from many angles, especially those involving metadata. This paper makes a significant contribution to the literature by showing the relevance of data profiling to database research. The paper is easy to read and presents the material with several examples for the reader to follow.

Reviewer:  S. Srinivasan Review #: CR144278 (1606-0415)
Database Management (H.2)
Information Storage And Retrieval (H.3)
