Most computer science research papers in the systems area focus on the design and sometimes the implementation of the system in question. Rarely is the performance of the resulting system measured (assuming the system is actually built), and more rarely still are the performance figures analyzed to gain insight into what is going on. This paper, about the DEC Firefly system, is a wonderful exception: the authors not only give performance measurements but also examine them under a microscope to see where the time is being spent.
The Firefly is a multiprocessor consisting of five VAX CPUs. Multiple Fireflies are connected by an Ethernet. All interprocess communication in the system uses remote procedure call (RPC). As a consequence, the performance of the RPC mechanism is critical to the total system performance, since everything else is built upon it.
The heart of the paper is a step-by-step walkthrough of an RPC, showing how many microseconds are needed to build the header, calculate the checksum, trap to the kernel, queue the packet for transmission, and complete each of the other steps. Based on these measurements, the authors describe ways their RPC (and, by implication, other people’s) could be sped up. These include redesigning the header format, omitting certain checksums, and redesigning the RPC protocol.
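As a rough illustration of this kind of stage-by-stage accounting, and not the authors' own instrumentation, the following Python sketch times a few user-level steps of a toy RPC send path and reports the mean cost of each. Everything in it is an assumption made for illustration: the function names, the 12-byte header layout, and the use of CRC-32 as a stand-in checksum are hypothetical, and kernel traps and packet transmission are not modeled at all.

```python
import struct
import time
import zlib


def build_header(call_id: int, proc_id: int, length: int) -> bytes:
    # Hypothetical 12-byte header: call id, procedure id, payload length.
    return struct.pack("!III", call_id, proc_id, length)


def checksum(data: bytes) -> int:
    # CRC-32 is only a stand-in; it is not the checksum the paper measures.
    return zlib.crc32(data)


def profile_call(payload: bytes, repeats: int = 1000) -> dict:
    """Return the mean time in microseconds spent in each stage."""
    totals = {"build_header": 0, "checksum": 0, "assemble": 0}
    for call_id in range(repeats):
        t0 = time.perf_counter_ns()
        header = build_header(call_id, 7, len(payload))
        t1 = time.perf_counter_ns()
        crc = checksum(header + payload)
        t2 = time.perf_counter_ns()
        packet = header + payload + struct.pack("!I", crc)
        t3 = time.perf_counter_ns()
        totals["build_header"] += t1 - t0
        totals["checksum"] += t2 - t1
        totals["assemble"] += t3 - t2
    # Convert accumulated nanoseconds to mean microseconds per call.
    return {name: ns / repeats / 1000.0 for name, ns in totals.items()}


def shares(breakdown: dict) -> dict:
    """Express each stage as a fraction of the total measured time."""
    total = sum(breakdown.values())
    if total == 0:
        return {name: 0.0 for name in breakdown}
    return {name: value / total for name, value in breakdown.items()}


if __name__ == "__main__":
    result = profile_call(b"x" * 1024)
    for stage, fraction in shares(result).items():
        print(f"{stage:14s} {result[stage]:8.3f} us  {fraction:6.1%}")
```

Even a toy breakdown like this shows which stage dominates, and that is the kind of evidence behind proposals such as omitting a checksum or simplifying the header.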
The authors conclude with an interesting comparison of the Firefly RPC with those of other systems (Cedar, Amoeba, V, and Sprite). This paper should be required reading for all operating system designers.