Database systems are central to any complex online transaction processing (OLTP) information system. Recent advances in data warehousing, knowledge management, and business intelligence (BI) concern the meaningful use of the data these systems hold to support decisions by middle and top-level management. This paper addresses how to estimate the size of a database for BI applications.
Databases for BI applications have different characteristics from typical OLTP databases. The workload of a BI application is hard to estimate accurately, which makes the sizing process relatively difficult. The authors base their extrapolation on the performance of queries from the Transaction Processing Performance Council's TPC-H benchmark, and they explain their analysis using preaudited DB2 UDB TPC-H benchmark runs.
The normalized TPC-H results are partitioned into clusters using singular value decomposition (SVD) and semidiscrete decomposition (SDD), computed in MATLAB. From this analysis, the authors identify four types of clusters, each with a different pattern of resource usage. Artificial queries are then generated to support the model.
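To make the clustering step concrete, the following is a minimal sketch of how normalized per-query measurements might be partitioned with an SVD. It is an illustration only, not the authors' procedure: the measurement values, the three resource columns, and the function names are hypothetical, the grouping rule (sign patterns of the leading left singular vectors) is a simple stand-in, and the SDD step is omitted. Python with NumPy is used here in place of MATLAB.

```python
import numpy as np


def normalize_columns(x):
    """Scale each resource column to zero mean and unit variance."""
    x = np.asarray(x, dtype=float)
    std = x.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns unscaled
    return (x - x.mean(axis=0)) / std


def svd_sign_clusters(x, k=2):
    """Label each row by the sign pattern of its first k left singular
    vector entries, giving at most 2**k groups. This is an assumed,
    simplified grouping rule, not the one used in the paper."""
    u, s, _ = np.linalg.svd(normalize_columns(x), full_matrices=False)
    signs = (u[:, :k] >= 0).astype(int)
    labels = signs @ (2 ** np.arange(k))  # encode sign pattern as an integer
    return labels, s


# Hypothetical measurements, one row per query.
# Columns: CPU seconds, I/O volume, memory use (made-up values).
usage = np.array([
    [90.0, 10.0, 5.0],
    [85.0, 12.0, 6.0],
    [10.0, 95.0, 5.0],
    [12.0, 90.0, 7.0],
    [11.0, 9.0, 80.0],
    [9.0, 11.0, 85.0],
    [50.0, 50.0, 45.0],
    [48.0, 52.0, 47.0],
])

labels, singular_values = svd_sign_clusters(usage, k=2)
print("cluster labels:", labels)
print("singular values:", singular_values)
```

Because the sign of each singular vector is arbitrary, the label numbers themselves carry no meaning; only the grouping of rows does. With k=2 the rule yields at most four groups, which is a property of this sketch and not evidence for the four clusters reported in the paper.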
The paper describes only an approach to sizing a BI database; further analysis of realistic queries from real applications is still required. It may nevertheless help researchers involved in database design and sizing for BI applications, and its references may provide further leads in this research area.