Software reliability is one of the most extensively studied software quality attributes. Dozens of models exist to predict and assess it, some dating back to the 1970s. The measurement goal established in this paper is to monitor and control software reliability. The authors describe semantic metrics as being based on the properties of a software product’s functional quality attributes. The properties of interest to them emphasize failure avoidance, in addition to fault avoidance and fault removal. They propose four semantic metrics that, when used together, should indicate a program’s ability to tolerate faults and avoid failure. The four semantic metrics are:
1. a measure of state redundancy expressed in Shannon bits,
2. a measure of functional redundancy expressed as an abstract number,
3. a measure of maskability expressed in Shannon bits, and
4. a measure of recoverability expressed in Shannon bits.
In information theory, a Shannon bit is a unit of information, or entropy: it measures the uncertainty and unpredictability of outcomes. One Shannon bit is the entropy present in a choice between two equally probable outcomes, that is, in a binary random variable whose two states have equal probability. Equivalently, it is the information gained when the value of such a variable becomes known.
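The unit can be made concrete with a short calculation. The following Python sketch computes the standard Shannon entropy of a discrete distribution in bits. It is not taken from the paper under review: the function name `shannon_entropy_bits` and the sample distributions are assumptions chosen for illustration, and the code does not compute any of the four proposed metrics.

```python
import math


def shannon_entropy_bits(probabilities):
    """Return the Shannon entropy, in bits, of a discrete distribution.

    Hypothetical helper for illustration only; it is not part of the
    paper under review.
    """
    if any(p < 0 for p in probabilities):
        raise ValueError("probabilities must be non-negative")
    if not math.isclose(sum(probabilities), 1.0):
        raise ValueError("probabilities must sum to 1")
    # Outcomes with zero probability contribute nothing to the entropy,
    # so they are skipped to avoid evaluating log2(0).
    return sum(-p * math.log2(p) for p in probabilities if p > 0)


# Two equally probable outcomes carry exactly one Shannon bit.
print(shannon_entropy_bits([0.5, 0.5]))  # prints 1.0

# A biased binary variable is more predictable, so it carries less than one bit.
print(shannon_entropy_bits([0.9, 0.1]))
```

A variable with four equally probable states yields 2 bits, and a variable whose outcome is certain yields 0 bits, which is consistent with reading the unit as a measure of uncertainty.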
The usefulness of the proposed metrics might be evaluated in the context of three techniques that have been applied in ultra-high-reliability research: design diversity, testability, and program self-checking. Research on design diversity for fault tolerance has shown that independently developed programs do not fail independently and therefore will not increase reliability beyond that attained by single versions of the same functionality. Testability research assumes that software will fail if it is faulty. More testable software allows more faults to be detected and removed, but it may be more prone to failure in the presence of undetected latent faults. Program self-checking involves showing a program to be correct with a probability close to 1. Thus far, it applies only to a very restricted domain of mathematics-oriented problems.
The correlation between functional quality attributes, such as reliability and fault tolerance, and the four proposed semantic metrics has not yet been empirically validated. Software reliability depends on multiple factors, such as target hardware, system users, operational profiles, software characteristics, and the physical environment of operation. Empirically validating that the proposed semantic metrics can be used to monitor and control software reliability may prove to be a nontrivial task.