Failures in distributed software systems are frequently caused by configuration errors. Diagnosing such errors is difficult, and fixes are usually needed urgently. The key contribution of this paper is a preventative approach based on version checking.
The approach consists of three steps.
1. Prepare a package containing all new or updated components.
2. Prepare an inventory of existing and new or modified components.
3. When an application is executed, version changes are detected and the new inventory is compared with the previous one. The new configuration is monitored in a phase called pre-run monitoring, and any problems are logged.
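The inventory comparison in step 3 can be pictured with a minimal sketch. It is not taken from the paper: the representation of an inventory as a mapping from component name to version string, and the name `diff_inventories`, are assumptions made here for illustration only.

```python
from typing import Dict, List

# Hypothetical representation: component name -> version string.
Inventory = Dict[str, str]


def diff_inventories(previous: Inventory, current: Inventory) -> List[str]:
    """Describe the version changes between two inventories, in name order."""
    changes: List[str] = []
    for name in sorted(previous.keys() | current.keys()):
        old, new = previous.get(name), current.get(name)
        if old is None:
            changes.append(f"added {name} {new}")
        elif new is None:
            changes.append(f"removed {name} {old}")
        elif old != new:
            changes.append(f"changed {name} {old} -> {new}")
    return changes
```

In a scheme like this, a non-empty result would be the trigger for the monitoring and logging that the paper calls pre-run monitoring; how the paper itself stores and compares inventories may well differ.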
The approach is grounded in some formal definitions, presented in section 2 of the paper, and is evaluated by means of a case study that concerns a financial services contact center.
The definitions include component integration consistency, which holds if every reference made by one component to another refers to the expected version of the other component, and application integration consistency, which holds if every component in the application satisfies component integration consistency. The formal definition of component integration consistency is awkward to read because of the way it uses universal quantifiers (“forall”).
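Read informally, the two definitions amount to a pair of nested universal checks. The sketch below restates them as executable predicates; it is an interpretation of the prose summary above, not the paper's formalism. The `Component` class, its `references` field, and the choice to treat a reference to an undeployed component as inconsistent are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Component:
    name: str
    version: str
    # Hypothetical field: expected version of each referenced component, keyed by name.
    references: Dict[str, str] = field(default_factory=dict)


def component_consistent(component: Component, deployed: Dict[str, Component]) -> bool:
    """True if every reference resolves to the expected version.

    Assumption: a reference to a component that is not deployed counts as inconsistent.
    """
    return all(
        target in deployed and deployed[target].version == expected
        for target, expected in component.references.items()
    )


def application_consistent(deployed: Dict[str, Component]) -> bool:
    """True if every deployed component is component-consistent."""
    return all(component_consistent(c, deployed) for c in deployed.values())
```

Each `all(...)` plays the role of one universal quantifier, which may be an easier way to read the definitions than the quantified formulas.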
The case study is substantial and illustrates the approach well.
In all, this paper is likely to be useful to anyone involved in creating and managing distributed systems, including distributed systems of smart things for the Internet of Things.