This paper describes changes to the UNIX file system that enable it to survive disk crashes without losing data. Information is replicated on a pair of disks (the pair is called a “stable disk”), and the kernel is modified to handle writes to the disk properly. A special crash recovery routine is invoked often enough to ensure that both disks contain the same information; in this scheme, crash recovery simply copies the information on one disk to the other. In single user mode, disk transfers involving stable disks take 10 to 25 percent longer than those not involving stable disks. Unfortunately, in multiuser mode the degradation is much worse: the crash-resistant file system is 3 to 5 times slower than the non-crash-resistant UNIX file system.
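The stable-disk idea summarized above can be illustrated with a toy model. This is only a minimal sketch and not the paper's kernel code: the class `StableDisk`, its methods, the block size, the primary-first write order, and the choice of the primary disk as the source for recovery are all assumptions made for illustration.

```python
BLOCK_SIZE = 512  # hypothetical block size for this sketch


class StableDisk:
    """Toy model of a mirrored pair of disks (all names are hypothetical)."""

    def __init__(self, num_blocks):
        # Each "disk" is a list of fixed-size, zero-filled blocks.
        self.primary = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]
        self.secondary = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]

    def write(self, block_no, data, crash_between=False):
        """Write a block to both disks, primary first (assumed order).

        crash_between simulates a crash after the first write completes.
        """
        padded = data.ljust(BLOCK_SIZE, b"\0")[:BLOCK_SIZE]
        self.primary[block_no] = padded
        if crash_between:
            return  # the secondary copy is now stale
        self.secondary[block_no] = padded

    def read(self, block_no):
        """Read a block from the primary disk."""
        return self.primary[block_no]

    def recover(self):
        """Copy every differing block from primary to secondary.

        Returns the number of blocks that had to be repaired.
        """
        repaired = 0
        for i, block in enumerate(self.primary):
            if self.secondary[i] != block:
                self.secondary[i] = block
                repaired += 1
        return repaired


disk = StableDisk(num_blocks=4)
disk.write(0, b"inode table")
disk.write(1, b"half-finished update", crash_between=True)
repaired = disk.recover()
```

In this sketch every logical write costs two physical writes, which gives some intuition for why transfers to a stable disk take longer than transfers to an ordinary one. It says nothing about the measured multiuser slowdown the paper reports.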
The mechanism described in this paper will protect file systems against disk failures unless both disks of a stable disk suffer hardware failures. The scheme’s major weakness is the degradation in a multiuser environment, and the paper leaves the reader with no sense of how disk transfers degrade. In particular, a comparison or chart relating load averages to the degradation in multiuser mode would have been very helpful in determining how acceptable this crash-resistant scheme is in practice. Otherwise, the paper is well written and succinct, conveys the nature of the work done, and is useful and interesting to UNIX implementors and operating systems researchers.