Computing Reviews
Studying the effect of multi-query functionality on a correlation-aware SQL-to-MapReduce translator
Amirthalingam T., Springer J. RIIT 2015 (Proceedings of the 4th Annual ACM Conference on Research in Information Technology, Chicago, IL, Sep 30-Oct 3, 2015), 55-60, 2015. Type: Proceedings
Date Reviewed: Feb 17 2016

Real-time operational tools for the mining of massive datasets should provide fault tolerance, scalability, and support for complex query processing in relational database management systems. Data mining analytical tools that operate in a clustered environment such as Hadoop can preserve data integrity during a breakdown and support the scalability of query processing in nonrelational database systems. But how should the multifaceted semantics of structured query language (SQL) queries be accurately translated for optimal processing in nonrelational systems? Amirthalingam and Springer investigate this nontrivial question by exploring the effectiveness of a “correlation-aware SQL-to-MapReduce” translator. They design experiments to recognize and optimize the relationships among multiple SQL queries processed in environments that support parallel and distributed transaction processing.
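To make the idea of exploiting relationships among queries concrete, the following is a minimal sketch in Python and not the translator studied in the paper. It assumes a toy in-memory table and a hypothetical `run_job` helper that simulates a single MapReduce job; the table, the two queries, and all function names are illustrative assumptions. It shows two GROUP BY queries over the same table translated first as two separate jobs (two scans) and then merged into one job that scans the input once.

```python
from collections import defaultdict

# Hypothetical toy table "sales": (region, product, amount).
ROWS = [
    ("east", "pen", 4),
    ("west", "pen", 6),
    ("east", "ink", 10),
    ("west", "ink", 1),
]


def run_job(rows, mapper, reducer):
    """Simulate one MapReduce job in memory: map, shuffle by key, reduce."""
    groups = defaultdict(list)
    for row in rows:  # a single scan of the input
        for key, value in mapper(row):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}


# Q1: SELECT region, SUM(amount) FROM sales GROUP BY region
def q1_map(row):
    region, _product, amount = row
    yield region, amount


# Q2: SELECT product, SUM(amount) FROM sales GROUP BY product
def q2_map(row):
    _region, product, amount = row
    yield product, amount


def sum_reduce(_key, values):
    return sum(values)


# Naive translation: one job per query, so the table is scanned twice.
q1_naive = run_job(ROWS, q1_map, sum_reduce)
q2_naive = run_job(ROWS, q2_map, sum_reduce)


# Merged translation: both queries read the same table, so one job can
# serve both by tagging each emitted key with the query it belongs to.
def merged_map(row):
    for key, value in q1_map(row):
        yield ("q1", key), value
    for key, value in q2_map(row):
        yield ("q2", key), value


merged = run_job(ROWS, merged_map, sum_reduce)
q1_merged = {key: total for (tag, key), total in merged.items() if tag == "q1"}
q2_merged = {key: total for (tag, key), total in merged.items() if tag == "q2"}
```

Both translations return the same answers; the merged form simply trades two input scans for one, which is the kind of saving a translator might look for when it recognizes that several queries share an input.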

The authors used a Hadoop cluster of concurrent hardware mappers and reducers, together with software translators, to investigate experimentally how multiple-query optimization and processing carry over from relational to nonrelational data processing environments. The experiments investigated (1) the relationship between the translation and execution times of simple SQL queries for various data sizes, with results showing a linear increase in MapReduce query execution time as the data size grew; (2) the capability of the MapReduce translator to enhance complex query processing over datasets of different sizes from reliable databases recommended by the Transaction Processing Performance Council, with results showing that the processing time of complex queries increased with dataset size; and (3) the sensitivity of the SQL-to-MapReduce translator to the relationships among multiple queries, which revealed a positive correlation between the execution times and the number of nontrivial translated queries.

I strongly recommend that all big data analysis professionals read the valuable and practical ideas in this paper. The translation of queries in the first normal form (1NF) relational model for processing in distributed environments such as Hadoop might be easy, but how should queries for datasets organized in higher normal forms be translated to capitalize on the current research results? Without a doubt, the authors have opened up new practical research ideas in the area of data mining algorithms.

Reviewer: Amos Olagunju
Review #: CR144174 (1607-0530)
Data Translation (H.2.5 ... )
Query Processing (H.2.4 ... )
 
