Computing Reviews, the leading online review service for computing literature.


Studying the effect of multi-query functionality on a correlation-aware SQL-to-MapReduce translator
Amirthalingam T., Springer J. RIIT 2015 (Proceedings of the 4th Annual ACM Conference on Research in Information Technology, Chicago, IL, Sep 30-Oct 3, 2015), 55-60, 2015. Type: Proceedings

Date Reviewed: Feb 17 2016

Real-time operational tools for the mining of massive datasets should provide fault tolerance, scalability, and support for complex query processing in relational database management systems. Data mining analytical tools that operate in a clustered environment such as Hadoop can preserve data integrity during a breakdown and support the scalability of query processing in nonrelational database systems. But how should the multifaceted semantics of queries in structured query language (SQL) be accurately translated for optimal processing in nonrelational systems? Amirthalingam and Springer investigate this nontrivial question by exploring the effectiveness of a “correlation-aware SQL-to-MapReduce” translator. They design experiments for recognizing and optimizing the relationships among multiple SQL queries processed in environments that support parallel and distributed transaction processing. The authors used a Hadoop cluster of concurrent hardware mappers and reducers with software translators to experimentally investigate the effects of translating multiple query optimizations and processing from relational to nonrelational data processing environments. The experiments investigated (1) the relationships between the translation and execution times of simple SQL queries for various data sizes, finding that the time MapReduce took to execute queries increased linearly as the data size grew; (2) the capability of the MapReduce translator to enhance complex query processing on datasets of different sizes from reliable databases recommended by the Transaction Processing Performance Council (TPC), with the experimental results showing that the processing time of complex queries increased with the dataset size; and (3) the sensitivity of the SQL-to-MapReduce translator to the relationships among multiple queries, which revealed a positive correlation between the execution times and the number of nontrivial translated queries.
I strongly recommend that all big data analysis professionals read the valuable and practical ideas in this paper. The translation of queries in the first normal form (1NF) relational model for processing in distributed environments such as Hadoop might be easy, but how should queries for datasets organized in higher normal forms be translated to capitalize on the current research results? Without a doubt, the authors have opened up new practical research ideas in the area of data mining algorithms.
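To make the translation question concrete, here is a minimal sketch of how two SQL aggregates might map onto map and reduce functions, and how two queries that scan the same table and group by the same key could be merged into a single job. It is an illustration only, not the translator studied in the paper: the in-memory `run_mapreduce` helper, the `EMP` table, and all function names are hypothetical, and no Hadoop API is involved.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, Iterable, List, Tuple

Row = Dict[str, Any]
Mapper = Callable[[Row], Iterable[Tuple[Any, Any]]]
Reducer = Callable[[Any, List[Any]], Any]


def run_mapreduce(rows: Iterable[Row], mapper: Mapper, reducer: Reducer) -> Dict[Any, Any]:
    """Hypothetical in-memory stand-in for one MapReduce job: map, shuffle by key, reduce."""
    groups: Dict[Any, List[Any]] = defaultdict(list)
    for row in rows:
        for key, value in mapper(row):
            groups[key].append(value)  # shuffle: collect values per key
    return {key: reducer(key, values) for key, values in groups.items()}


# Query A: SELECT dept, SUM(salary) FROM emp GROUP BY dept
def map_sum(row: Row) -> Iterable[Tuple[Any, Any]]:
    yield row["dept"], row["salary"]


def reduce_sum(_key: Any, values: List[Any]) -> Any:
    return sum(values)


# Query B: SELECT dept, COUNT(*) FROM emp GROUP BY dept
# Queries A and B read the same table and group by the same key, so
# (by assumption) one merged job can answer both in a single pass.
def map_merged(row: Row) -> Iterable[Tuple[Any, Any]]:
    yield row["dept"], (row["salary"], 1)


def reduce_merged(_key: Any, values: List[Any]) -> Any:
    return {
        "sum_salary": sum(salary for salary, _ in values),
        "count": sum(one for _, one in values),
    }


# Made-up sample table, for illustration only.
EMP: List[Row] = [
    {"dept": "eng", "salary": 100},
    {"dept": "eng", "salary": 120},
    {"dept": "ops", "salary": 90},
]

sum_by_dept = run_mapreduce(EMP, map_sum, reduce_sum)
merged = run_mapreduce(EMP, map_merged, reduce_merged)
```

Running Query A and Query B separately would scan `EMP` twice; the merged job scans it once, which is the kind of saving a translator that recognizes relationships among queries would aim for.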

Reviewer: Amos Olagunju	Review #: CR144174 (1607-0530)

Data Translation (H.2.5 ... )

Query Processing (H.2.4 ... )


