Disco [1] is a virtual machine monitor that partitions a shared-memory multiprocessor into multiple virtual machines, each running an unmodified commercial operating system. Disco can exploit hardware features, such as non-uniform memory access (NUMA), that the commodity operating systems do not themselves support. However, because Disco lacks support for hardware partitioning, any hardware fault causes total system failure. Cellular Disco adds fault containment and scalable resource management to address these limitations.
Cellular Disco is structured as a virtual cluster of semi-independent cells. Each cell must comprise one or more fault containment units with fail-stop behavior. This limits Cellular Disco’s generality, since few shared-memory systems exhibit fail-stop behavior. Fault recovery is further simplified by assuming that the monitor’s code is correct, which allows cells to trust each other. The authors justify this assumption by the small size of the code (50K lines).
Good fault containment requires many small cells, while good resource management favors a few cells with many resources each. CPU and memory management support was added to address this tradeoff, including process migration for load balancing and memory borrowing between cells. Global shared memory was also added to support large applications split across virtual machines. Experiments on a 32-processor SGI Origin system using database, I/O-intensive, CPU-intensive, and kernel-intensive workloads showed the worst-case virtualization cost of Cellular Disco to be 9% in the uniprocessor case and 20% for a 32-processor system executing a single virtual machine. Comparing 1-cell to 8-cell systems showed that the cost of fault containment was only 1%.
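The tradeoff between cell count and resource availability can be illustrated with a toy model. The sketch below is not the paper's algorithm: the `Cell` class, the `partition`, `allocate`, and `fraction_lost_on_fault` helpers, and the page counts are all hypothetical, chosen only to show that more cells shrink the share of the machine lost to one fault, and that borrowing lets a request exceed what a single small cell holds.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """Hypothetical cell: a group of resources that fails as a unit."""
    free_pages: int


def partition(total_pages: int, num_cells: int) -> list[Cell]:
    """Split memory evenly across cells (toy model, assumes even division)."""
    return [Cell(total_pages // num_cells) for _ in range(num_cells)]


def fraction_lost_on_fault(num_cells: int) -> float:
    """Share of the machine lost when exactly one cell fails."""
    return 1 / num_cells


def allocate(cells: list[Cell], home: int, pages: int, allow_borrowing: bool) -> bool:
    """Satisfy a request from the home cell, optionally borrowing from others.

    Returns True and deducts the pages on success; leaves all cells
    untouched and returns False if the request cannot be met.
    """
    donors = [cells[home]]
    if allow_borrowing:
        donors += [c for i, c in enumerate(cells) if i != home]
    if sum(c.free_pages for c in donors) < pages:
        return False
    for cell in donors:
        take = min(cell.free_pages, pages)
        cell.free_pages -= take
        pages -= take
        if pages == 0:
            break
    return True


if __name__ == "__main__":
    cells = partition(total_pages=800, num_cells=8)
    print(fraction_lost_on_fault(8))                         # share lost to one fault
    print(allocate(cells, 0, 150, allow_borrowing=False))    # home cell alone is too small
    print(allocate(cells, 0, 150, allow_borrowing=True))     # succeeds by borrowing
```

In this model, moving from 1 cell to 8 cuts the share of the machine lost to a single fault from all of it to one eighth, but a 150-page request no longer fits in one 100-page cell unless borrowing is allowed.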
The paper includes valuable sections discussing other approaches and related work. The work is interesting, well reported, and recommended to anyone with an interest in operating systems or multiprocessors.
[1] Bugnion, E., Devine, S., Govil, K., and Rosenblum, M. 1997. Disco: Running commodity operating systems on scalable multiprocessors. ACM Transactions on Computer Systems, 15, 4 (November), 412-447.