This chapter addresses how multidimensional data is handled by statistical databases, online analytical processing (OLAP) systems, and scientific databases, and the concepts these three share. The two main structural concepts are the cross-product space of the dimensions and the classification hierarchy associated with each dimension.
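For readers who prefer a concrete picture, the two structural concepts can be sketched in a few lines of Python. This is only an illustration and is not taken from the chapter: the dimension names, the categories, the `parent` mapping, and the helper `ancestors` are all hypothetical.

```python
from itertools import product

# Hypothetical dimensions and their categories (invented for illustration).
dimensions = {
    "city": ["Austin", "Dallas", "Boston"],
    "year": [2001, 2002],
}

# Cross-product space: one cell per combination of one category from each dimension.
cells = list(product(*dimensions.values()))

# Classification hierarchy for the "city" dimension, stored as child -> parent links
# (city -> state -> country). The hierarchy itself is an assumption.
parent = {
    "Austin": "Texas",
    "Dallas": "Texas",
    "Boston": "Massachusetts",
    "Texas": "USA",
    "Massachusetts": "USA",
}

def ancestors(category):
    """Return the chain of higher-level categories above `category`, nearest first."""
    chain = []
    while category in parent:
        category = parent[category]
        chain.append(category)
    return chain

print(len(cells))           # number of cells in the cross-product space
print(ancestors("Austin"))  # levels above a leaf category
```

Each cell of `cells` is a point in the multidimensional space at which a summary measure could be stored, and `ancestors` walks one dimension's hierarchy upward from a leaf category.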
Presenting data in multidimensional space (the cross product of dimension categories) makes certain features of the data, such as clusters, outliers, and patterns, easier to see. It also helps summarize the data, the task for which statistical databases and OLAP were specifically designed (a generic term for both is summary databases). However, while statistical databases concentrate on conceptual modeling and formalization of operators, OLAP emphasizes data structures and operational efficiency.
The graphical notation used for the conceptual modeling of summary databases is discussed. The text also illustrates the star-schema representation and the inverted graphical representation using Unified Modeling Language (UML) notation. Secondary data can be represented in the summary database notation in two ways: the summary model can be extended to include additional structures that support associations and classification, or links can be added between the object schema and the summary data schema. Once the summary database is created, an aggregate (summary) operator needs to be defined to calculate the summary measure. The operator can be applied over an entire dimension (consolidation), or over a dimension up to a higher level of its category hierarchy (roll-up). Other operations on a summary database include slice (select a single value from a dimension) and dice (select a range of values from a dimension).
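The four operations named above can be illustrated with a minimal Python sketch. It is not the chapter's formalization: the data, the dictionary-based `cube` layout, the function names, the `city_to_state` hierarchy, and the choice of summation as the aggregate operator are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical summary data: (city, year) -> sales. All values are invented.
cube = {
    ("Austin", 2001): 10, ("Austin", 2002): 12,
    ("Dallas", 2001): 7,  ("Dallas", 2002): 9,
    ("Boston", 2001): 5,  ("Boston", 2002): 6,
}
CITY, YEAR = 0, 1  # positions of the two dimensions in each key

# Assumed one-level category hierarchy for the city dimension.
city_to_state = {"Austin": "Texas", "Dallas": "Texas", "Boston": "Massachusetts"}

def consolidate(cube, dim):
    """Aggregate (here: sum) over an entire dimension, removing it from the keys."""
    out = defaultdict(int)
    for key, value in cube.items():
        out[key[:dim] + key[dim + 1:]] += value
    return dict(out)

def roll_up(cube, dim, parent):
    """Aggregate one dimension to the next higher level of its category hierarchy."""
    out = defaultdict(int)
    for key, value in cube.items():
        out[key[:dim] + (parent[key[dim]],) + key[dim + 1:]] += value
    return dict(out)

def slice_(cube, dim, category):
    """Keep only the cells with a single chosen value on one dimension."""
    return {k: v for k, v in cube.items() if k[dim] == category}

def dice(cube, dim, categories):
    """Keep only the cells whose value on one dimension is in the chosen set."""
    return {k: v for k, v in cube.items() if k[dim] in categories}

print(consolidate(cube, CITY))             # totals per year
print(roll_up(cube, CITY, city_to_state))  # totals per (state, year)
print(slice_(cube, YEAR, 2001))            # cells for 2001 only
print(dice(cube, CITY, {"Austin", "Dallas"}))
```

Summation is used only because it is the simplest aggregate; an average or a count would need the same traversal with a different accumulation step.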
Scientific databases deal with four main aspects of multidimensional data: space-time representation, clustering, indexing, and classification structures. Specialized file formats have been developed to address the space-time representation of data. Cluster analysis of high-dimensional data requires either dimension reduction or a reduction in data set size. A number of indexing methods are reviewed. Normalized and fully normalized versions of the category hierarchy in tabular form are shown as a means of supporting classification structures in scientific databases. The chapter concludes with a look at future trends in these databases.
There are a few typographical errors in the text. Otherwise, the figures are very descriptive, and enhance the understanding of the material. The material is easy to read, and well organized.