Nonintrusive monitoring of execution behavior is an important means of studying performance and investigating errors in complex systems. The target system in this paper is a local network of seven MC68000-based nodes supporting a distributed operating system and distributed applications structured as communicating objects. The monitor described is a hybrid: it comprises both dedicated monitoring hardware and associated software for analyzing the information gathered.
The principal hardware components are a set of test and measurement processors (TMPs), one incorporated into each node. Each TMP monitors events in its node via the system bus; the events are generated by special instructions implanted in the target software. Monitoring is thus almost completely nonintrusive with respect to speed (degradation is quoted as less than 0.1 percent), but it is intrusive in that the monitored program code must be modified. This requirement appears to limit the system’s effectiveness as a flexible debugging tool. The examples given in the paper suggest that its main use has been for fairly coarse-grained monitoring, although the authors claim that it can process more than 13,000 events per second per node.
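To put the quoted throughput in perspective, the short calculation below derives the time available per event and the combined event rate across the network. It is only a back-of-the-envelope sketch: the function names are hypothetical, and the aggregate figure assumes that all seven nodes generate events at the quoted rate simultaneously, which the paper is not reported to claim.

```python
# Figures quoted in the review; everything else is illustrative.
EVENTS_PER_SECOND_PER_NODE = 13_000  # claimed lower bound per node
NODE_COUNT = 7                       # nodes in the target network


def per_event_budget_us(events_per_second: float) -> float:
    """Average time available to handle one event, in microseconds."""
    return 1_000_000 / events_per_second


def aggregate_event_rate(events_per_second_per_node: int, node_count: int) -> int:
    """Combined event rate if every node ran at the quoted rate (an assumption)."""
    return events_per_second_per_node * node_count


if __name__ == "__main__":
    budget = per_event_budget_us(EVENTS_PER_SECOND_PER_NODE)
    total = aggregate_event_rate(EVENTS_PER_SECOND_PER_NODE, NODE_COUNT)
    print(f"Per-event budget: about {budget:.1f} microseconds")
    print(f"Aggregate rate across {NODE_COUNT} nodes: {total} events per second")
```

Under these assumptions, each TMP has roughly 77 microseconds per event, and the seven TMPs together would handle about 91,000 events per second. That total is spread across the local processors and is not a load on any single component.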
Information gathered is buffered and partially processed in the TMP, which reports periodic summaries and system status changes to a Central Station Monitor. The Central Station performs global data processing and presents information as graphical displays. Apart from a section discussing how the Central Station maintains a consistent global state and synchronizes local times, little in the paper specifically addresses the problems of monitoring distributed systems; indeed, the system described is in most respects similar to bus-monitoring systems that have been used in a single-processor context. This paper is a useful description of a well-organized general-purpose monitoring environment.
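The reporting path summarized in this review (local buffering and partial processing, followed by periodic summaries sent to a central station) can be illustrated with a minimal sketch. It does not reproduce the paper's design: the class names `LocalMonitor` and `CentralStation`, the `summary_interval` parameter, and the summary layout are all hypothetical, and per-type event counting merely stands in for whatever partial processing the TMPs actually perform.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class LocalMonitor:
    """Hypothetical stand-in for one TMP: buffers events, emits summaries."""
    node_id: int
    summary_interval: float  # local-time length of one reporting window (assumed)
    _counts: Counter = field(default_factory=Counter, init=False)
    _window_start: float = field(default=0.0, init=False)

    def record(self, event_type: str, timestamp: float) -> list:
        """Buffer one event; return the summaries for any windows that closed."""
        summaries = []
        # Close every reporting window that ended before this event arrived.
        while timestamp >= self._window_start + self.summary_interval:
            summaries.append(self._flush())
        self._counts[event_type] += 1
        return summaries

    def _flush(self) -> dict:
        """Build a summary for the current window and start the next one."""
        summary = {
            "node": self.node_id,
            "window_start": self._window_start,
            "counts": dict(self._counts),
        }
        self._counts.clear()
        self._window_start += self.summary_interval
        return summary


class CentralStation:
    """Hypothetical stand-in for the central monitor: merges node summaries."""

    def __init__(self) -> None:
        self.totals = Counter()

    def receive(self, summary: dict) -> None:
        """Fold one node's summary into the global event totals."""
        self.totals.update(summary["counts"])


if __name__ == "__main__":
    station = CentralStation()
    monitor = LocalMonitor(node_id=1, summary_interval=10)
    for event_type, timestamp in [("send", 1), ("send", 2), ("recv", 12), ("send", 35)]:
        for summary in monitor.record(event_type, timestamp):
            station.receive(summary)
    print(dict(station.totals))
```

In this sketch a window with no events still produces an empty summary, which could double as a liveness signal. Whether the real system reports idle periods is not stated in the review.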