When a software system fails at a user site, information is often sent automatically to the developer for diagnosis. Usually, only a system log is included, with all confidential data, such as identities and financial information, stripped out. Such log messages have been found insufficient for debugging. In this paper, the authors set out to enhance the log information, using causal relationships to narrow down the additional information required for diagnosis. They propose the LogEnhancer tool to automate this approach. The tool performs a delayed collection, which captures all the causally related data still live at the time of the failure, and an in-time collection, which includes all the relevant historical data. The paper describes the proposed techniques and tool in excellent detail. An empirical study shows that the approach is feasible. The work offers useful insight and information for system vendors and debuggers who rely on automatic bug reports from their users to diagnose software faults.
In general, however, the root cause of a failure may be very subtle and may not be captured by the causal relationships conceived by the developer. Thus, even the enhanced log messages may not be sufficient for diagnosis. Rather than relying on a simple controlled experiment to verify whether the enhanced log messages are truly effective, it may be useful to conduct mutation analysis to study which types of faults are more easily discovered and which are more problematic. To address the latter kinds of faults, it would be helpful to complement the proposed tool with statistical fault-localization techniques that do not rely on causal relationships.
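
To make the last suggestion concrete, the following is a minimal sketch of one statistical fault-localization technique, assuming the Ochiai suspiciousness metric over per-run statement coverage. This is an illustrative choice only; it is not part of LogEnhancer or the paper, and the function names (`ochiai_scores`, `rank`) and the input format are hypothetical.

```python
import math


def ochiai_scores(coverage, outcomes):
    """Score each statement by how strongly its execution correlates with failure.

    coverage: list of sets, one per run, holding the statement ids executed
              (hypothetical input format, assumed for this sketch).
    outcomes: list of bools, one per run, True if that run failed.
    """
    total_failed = sum(outcomes)
    statements = set().union(*coverage)
    scores = {}
    for s in statements:
        # Count failing and passing runs that executed statement s.
        failed_cov = sum(1 for cov, failed in zip(coverage, outcomes)
                         if failed and s in cov)
        passed_cov = sum(1 for cov, failed in zip(coverage, outcomes)
                         if not failed and s in cov)
        # Ochiai: failed_cov / sqrt(total_failed * (failed_cov + passed_cov)).
        denom = math.sqrt(total_failed * (failed_cov + passed_cov))
        scores[s] = failed_cov / denom if denom else 0.0
    return scores


def rank(scores):
    """Return statement ids ordered from most to least suspicious."""
    return sorted(scores, key=lambda s: (-scores[s], s))


if __name__ == "__main__":
    # Three runs over statements 1..3; runs 0 and 2 failed.
    coverage = [{1, 2, 3}, {1, 2}, {1, 3}]
    outcomes = [True, False, True]
    print(rank(ochiai_scores(coverage, outcomes)))
```

Such a ranking uses only the correlation between executed statements and observed failures, so it requires no causal model from the developer; in exchange, it needs many runs, both passing and failing, to be informative.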