Parallel and distributed discrete-event simulation has been a fruitful field of research over the last 15 years. Most of the work has focused on protocols that synchronize simulation processes so that the results are logically correct, that is, equivalent to some serial run of the same system. A number of such protocols have been proposed, but most distributed simulation software systems employ only one of them. In this paper, the authors outline a system capable of using any of the protocols discovered to date. They present the conceptual model of such a multi-protocol system, followed by an algorithm for executing the conservative and optimistic protocols and for switching between them. Proofs of correctness of the algorithm are given for the major system properties (fidelity, progress, safety, and termination). The authors discuss how the system can be configured to mimic each of the major protocols. Finally, they discuss how the system will perform as a function of the major determinants of parallel simulation performance: communication topology, process lookahead, network topology, and the presence or absence of dynamic process creation.
The paper’s focus is theoretical; no performance measurements or simulation results are presented. The authors state that they have developed a software simulator to test these ideas, and that a distributed system using this design is currently being implemented. The key theoretical observation is that local process control (which event to process next) can be distinguished from global process control (how far ahead the suite of processes can proceed, and how much memory can be reclaimed from prior activity). By designing conservative and optimistic algorithms separately for local and for global control, the system can mix and match them to suit the particular requirements of the simulation.
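To make the local/global distinction concrete, the following Python sketch separates the two layers. It is an illustration under simplifying assumptions, not the authors' algorithm: every name in it (Process, conservative_local, optimistic_local, global_bound, reclaim) is hypothetical, messages in transit are ignored, and rollback is not implemented. The local policies decide which event a process may handle next; the global functions compute how far the whole suite may proceed and how much saved history can be discarded.

```python
import heapq
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

Event = Tuple[float, str]  # (timestamp, label); hypothetical event shape


@dataclass
class Process:
    name: str
    pending: List[Event] = field(default_factory=list)  # min-heap ordered by timestamp
    clock: float = 0.0
    history: List[Event] = field(default_factory=list)  # processed events kept for a possible rollback

    def schedule(self, event: Event) -> None:
        heapq.heappush(self.pending, event)


# --- Local control: which event does this process handle next? ---

def conservative_local(p: Process, safe_time: float) -> Optional[Event]:
    """Release the earliest event only if its timestamp is within the safe bound."""
    if p.pending and p.pending[0][0] <= safe_time:
        return heapq.heappop(p.pending)
    return None  # block until the bound advances


def optimistic_local(p: Process, safe_time: float) -> Optional[Event]:
    """Release the earliest event regardless of the bound (rollback omitted here)."""
    if p.pending:
        return heapq.heappop(p.pending)
    return None


LocalPolicy = Callable[[Process, float], Optional[Event]]


def step(p: Process, policy: LocalPolicy, safe_time: float) -> Optional[Event]:
    """Advance one process by at most one event under the chosen local policy."""
    event = policy(p, safe_time)
    if event is not None:
        p.clock = event[0]
        p.history.append(event)
    return event


# --- Global control: how far may the suite proceed, and what can be reclaimed? ---

def global_bound(processes: List[Process]) -> float:
    """Simplified lower bound on future event times: the earliest pending timestamp.

    Assumption: no messages are in transit, so pending queues are the whole picture.
    """
    times = [p.pending[0][0] for p in processes if p.pending]
    return min(times) if times else float("inf")


def reclaim(p: Process, bound: float) -> int:
    """Discard history entries older than the bound; return how many were freed."""
    kept = [e for e in p.history if e[0] >= bound]
    freed = len(p.history) - len(kept)
    p.history = kept
    return freed
```

Because step takes the local policy as a parameter, a process could in principle run conservatively at one moment and optimistically at the next while the same global functions serve both modes. That is the mix-and-match idea in miniature; the conditions under which a switch is worthwhile are exactly what the paper addresses theoretically.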
The paper is well written and well presented, though it is aimed at specialists in parallel and distributed simulation. Its main message is that both conservative and optimistic protocols are useful and that a system employing both simultaneously can be developed. That message raises another set of issues in the already complicated world of distributed simulation: when is one protocol useful, and under what conditions should a process switch from one protocol to another? The paper attempts to answer these questions theoretically, but without performance data on simulations of interest to industry and government, it is impossible to assess the validity of the answers.