Computing Reviews

Cost effective storage space for data cubes
Łatuszko M. Journal of Intelligent Information Systems 48(2): 243-261, 2017. Type: Article
Date Reviewed: 07/05/17

Optimizing materialized views for data cubes poses many obstacles, and a big data environment adds further issues to consider. Łatuszko offers a new approach. In contrast to research on view materialization design that seeks better view selection methods to optimize the running time or the cost of the solution, this research focuses on optimizing the space (memory) allocated for view materialization while still ensuring good query performance and acceptable cost reduction.

Starting with online analytical processing (OLAP), a well-known database technique, the author provides a comprehensive review of view selection problems and of choosing the right structures for speeding up query execution. To frame the research, he clearly defines the real problem as “how much storage space is needed to ensure expected benefit,” supposing we could find an exact “set of views [that] minimizes the storage space.” The reader will then find a description of the experiments performed on the Transaction Processing Performance Council’s TPC-H database, a decision support benchmark with business-oriented ad hoc queries and concurrent data modifications. Łatuszko includes a new performance set related to the memory size needed for optimized view materialization. Using such datasets allows for better optimization of data cubes with a huge set of views, relating the corresponding data cube size to the expected maximum benefit. This is an important factor in view materialization: queries run on the cube will be very fast if the whole cube is precomputed, but precomputing it requires a lot of memory.
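To make the question of “how much storage space is needed to ensure expected benefit” concrete, here is a minimal Python sketch. It is not the method from the paper. It assumes a toy three-dimensional cube with made-up view sizes, a linear cost model in which a query costs as many rows as the smallest materialized view that can answer it, and a simple greedy rule that picks the view with the best marginal benefit per unit of space. The names SIZES, query_cost, benefit, and space_for_target are hypothetical.

```python
# Toy cube over dimensions p, s, c. Keys are group-by sets and values are
# illustrative (made-up) row counts. "" is the fully aggregated view.
SIZES = {
    "psc": 6000, "ps": 800, "pc": 6000, "sc": 6000,
    "p": 200, "s": 10, "c": 100, "": 1,
}
VIEWS = {frozenset(k): v for k, v in SIZES.items()}
BASE = frozenset("psc")  # the base view is assumed to be always available


def query_cost(materialized):
    """Cost of one query per view, each answered from the smallest
    materialized view whose dimensions include the query's dimensions."""
    return sum(
        min(VIEWS[m] for m in materialized if q <= m)
        for q in VIEWS
    )


def benefit(materialized):
    """Cost saved relative to answering every query from the base view."""
    return query_cost({BASE}) - query_cost(set(materialized) | {BASE})


def space_for_target(target):
    """Greedily add views until `target` (a fraction between 0 and 1) of
    the maximum benefit is reached; return the views and the extra space."""
    max_benefit = benefit(set(VIEWS))  # benefit of the fully materialized cube
    chosen, space = set(), 0
    while benefit(chosen) < target * max_benefit:
        candidates = [v for v in VIEWS if v != BASE and v not in chosen]
        # Pick the view with the highest marginal benefit per row stored.
        best = max(
            candidates,
            key=lambda v: (benefit(chosen | {v}) - benefit(chosen)) / VIEWS[v],
        )
        chosen.add(best)
        space += VIEWS[best]
    return chosen, space


if __name__ == "__main__":
    chosen, space = space_for_target(0.99)
    full_cube = sum(VIEWS.values())
    print(sorted("".join(sorted(v)) for v in chosen), space, full_cube)
```

In this toy lattice, the views pc and sc are as large as the base view, so materializing them saves nothing. Any space-aware selection therefore leaves them out, which is the kind of effect that keeps the required space well below the full cube size.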

Łatuszko clearly presents the premises found throughout this research: “the large dataset used in the numerical experiments minimizes the influence of atypical cases” and “the allocation of large space for views materialization is not justified,” meaning that “the space limit was less than 50 percent of the fully materialized data cube size even when the goal was set to 99 percent of the maximum benefit.” This finding opens new ground for further research in view materialization, which plays a vital role in the current big data world. It is reinforced by Łatuszko’s note on the research problem “related to optimizing query performance by making the best use of different hardware resources,” including “storage devices with different performance characteristics” and different storage limitations, as well as different memory locations in a distributed or cloud environment.

Reviewer:  F. J. Ruzic Review #: CR145404 (1709-0625)

Reproduction in whole or in part without permission is prohibited.   Copyright 2024 ComputingReviews.com™