…we present a solution to the causal reliable multicast problem. User processes generate separate sequences of messages and specify the causal relation among them according to some application need; the algorithm ensures that the messages within the same sequence are delivered to all active, i.e., both correct and faulty, processes in the group, or to none of them, and are processed according to their causal order. Messages belonging to different sequences can be concurrently processed. This problem has few solutions presented in literature; in common with a part of them, this algorithm uses a centralized approach and history buffers to recover from omission failures.…Further, it allows implementation using the most general interpretation of causality and it does not require any particular service to the underlying transport protocol. (From the authors’ abstract).
The authors have developed a new algorithm for real-time distributed control environments, in which multimedia spaces for simultaneous multiuser collaboration and conferencing require a form of communication that reflects the causal relation among messages. The algorithm, called uniform reliable causal group communications (URCGC), uses embedded mechanisms to handle both the normal processing of messages and the recovery actions required when failures occur. The authors claim that, under multiprocessing failure conditions, the new algorithm performs better than currently used algorithms in network overhead, throughput, and tolerance of general omission failures, while providing comparable network performance under reliable conditions.
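To make the ordering requirement concrete, the following is a minimal sketch of explicit causal delivery at a single receiver: each message names the messages it causally depends on, and it is processed only after all of them have been processed. This is not the URCGC algorithm; it omits the centralized coordination, history buffers, and failure recovery described in the abstract. The class name `CausalDeliveryBuffer`, its methods, and the message identifiers are hypothetical and chosen only for illustration.

```python
class CausalDeliveryBuffer:
    """Illustrative only: holds received messages until every message
    they causally depend on has been processed at this receiver."""

    def __init__(self):
        self.processed = set()   # ids of messages already processed
        self.waiting = {}        # msg_id -> (dependency ids, payload)
        self.log = []            # ids in the order they were processed

    def receive(self, msg_id, deps, payload=None):
        """Accept a message that declares the ids it causally depends on."""
        if msg_id in self.processed or msg_id in self.waiting:
            return  # ignore duplicates
        self.waiting[msg_id] = (frozenset(deps), payload)
        self._drain()

    def _drain(self):
        """Process every waiting message whose dependencies are satisfied,
        repeating until no further message can be released."""
        progressed = True
        while progressed:
            progressed = False
            for msg_id, (deps, _payload) in list(self.waiting.items()):
                if deps <= self.processed:
                    del self.waiting[msg_id]
                    self.processed.add(msg_id)
                    self.log.append(msg_id)
                    progressed = True


# Two independent sequences: a1 -> a2, and b1 with no dependencies.
buf = CausalDeliveryBuffer()
buf.receive("a2", deps=["a1"])  # arrives early, must wait for a1
buf.receive("b1", deps=[])      # other sequence, processed at once
buf.receive("a1", deps=[])      # releases a1, then a2
print(buf.log)
```

In this sketch, a message from one sequence never blocks a message from another, which mirrors the statement that messages belonging to different sequences can be processed concurrently.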
The URCGC algorithm is outlined in the context of a system model and a defined protocol architecture. An analysis is presented as the basis for the network performance improvements attributed to the algorithm. The algorithm has apparently yet to be implemented on an actual multiprocessor network, so judgment of its ultimate efficacy in improving network throughput and error tolerance is best left to such an implementation and subsequent performance measurements.