The growing acceptance of database systems makes their performance increasingly more important. One way to gain performance is to off-load some of the functions of the database system to a back-end computer. The problem is what functions should be off-loaded to maximize the benefits of distributed processing.
--From the Authors’ Abstract
The authors address this problem experimentally. Their testbed is the well-known INGRES relational system, running on an architecture of two processors connected by a fast communication channel. Since a database system is typically a layered software system, its software can be partitioned along the boundary between any two layers, and the resulting parts can be placed on the two processors. INGRES was therefore set up in six different configurations, each successive one placing more functionality in the back-end processor. Each configuration was tested under several query streams, and several measurements were made: total elapsed time, total I/O time, CPU and I/O times for each processor, and network communication time. After describing the problems encountered in implementing each configuration, the authors present and analyze the results.
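The partitioning idea can be illustrated with a minimal sketch. This is not the authors' code: the layer names are hypothetical placeholders rather than the actual INGRES layers, and the choice of five layers is an assumption made only because cutting five layers at every boundary, including the two extremes, gives six configurations.

```python
from typing import List, Tuple

# Hypothetical layer names, ordered from the user-facing top to the
# storage-facing bottom. They are placeholders, not the INGRES layers.
LAYERS = ["layer_1", "layer_2", "layer_3", "layer_4", "layer_5"]


def split_configurations(layers: List[str]) -> List[Tuple[List[str], List[str]]]:
    """Return every (front_end, back_end) partition made by cutting between layers.

    The cut moves upward one layer at a time, so each successive
    configuration places one more layer in the back-end processor.
    """
    configs = []
    for n_back in range(len(layers) + 1):
        cut = len(layers) - n_back
        configs.append((layers[:cut], layers[cut:]))
    return configs


for number, (front, back) in enumerate(split_configurations(LAYERS), start=1):
    print(f"configuration {number}: front-end={front} back-end={back}")
```

In this sketch the first configuration keeps everything in the front-end and the last moves everything to the back-end; whether the paper's six configurations include both extremes is not stated in this review.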
Although the approach taken in this paper is unique, the results are, in general, disappointing. Almost a third of the paper is devoted to explaining discrepancies in the measurements that arise from particulars of the INGRES or UNIX implementations. If one’s implementation of a relational system is very close to INGRES, some of these comments are very useful. Otherwise, perhaps the only general conclusion one can draw from the results is that a configuration with most of the functionality in one processor (i.e., either the back-end or the front-end) is far superior to one with the functionality divided more or less evenly between the two processors.
The authors themselves raise some questions about the experimental methodology used in this paper: “Are the conclusions valid for at least all relational database systems? Are the conclusions valid for all databases and query streams . . . ? Are they valid over the whole spectrum of hardware configurations . . . ?” The authors believe that the results are general, but cannot demonstrate it. The applicability of these results to other systems and configurations is therefore questionable.
In spite of these reservations, I’d like to commend the authors for trying to analyze and comprehend a working system, which is always more complex than a paper design. The paper should therefore be read by people interested in database performance and database architecture. I hope it will lead to more general research on some of the issues it raises.