Computing Reviews
Cellular disco: resource management using virtual clusters on shared-memory multiprocessors
Govil K., Teodosiu D., Huang Y., Rosenblum M. ACM Transactions on Computer Systems 18(3): 229-262, 2000. Type: Article
Date Reviewed: Nov 1 2001

Disco [1] is a virtual machine monitor that turns a shared-memory multiprocessor into multiple virtual machines, each running an unmodified commercial operating system. Disco can take advantage of technology, such as non-uniform memory access (NUMA), that commodity operating systems do not support. Because Disco lacks support for hardware partitioning, however, any hardware fault causes total system failure. Cellular Disco adds support for fault containment and scalable resource management to address these limitations.

Cellular Disco is structured as a virtual cluster of semi-independent cells. These cells must be divisible into one or more fault containment units with fail-stop behavior. This limits Cellular Disco’s generality, since few shared-memory systems exhibit fail-stop behavior. Fault recovery is further simplified by assuming the correctness of Disco’s code, allowing cells to trust each other. The authors justify this assumption by the small amount of code (50K lines).

Good fault containment requires many cells, while good resource management needs few cells with many resources. CPU and memory management support was added to address this tradeoff, including process migration for load balancing and memory borrowing between cells. Global shared memory was also added to support large applications split across virtual machines. Experiments on a 32-processor SGI Origin system using database-, I/O-, CPU-, and kernel-intensive workloads showed the worst-case virtualization cost of Cellular Disco to be 9% in the uniprocessor case, and 20% for a 32-processor system executing a single virtual machine. A comparison of 1-cell and 8-cell systems showed that the cost of fault containment was only 1%.
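The tension between fault containment and resource sharing can be illustrated with a toy model. The sketch below is not taken from the paper: the `Cell` and `VirtualMachine` classes, the "borrow from the cell with the most free memory" policy, and all page counts are hypothetical, chosen only to show that every borrowed page widens the set of cell failures a virtual machine is exposed to.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    """Toy fault-containment unit, assumed to fail as a whole (fail-stop)."""
    name: str
    free_pages: int
    alive: bool = True


@dataclass
class VirtualMachine:
    """Toy VM with a home cell and a record of pages held in each cell."""
    name: str
    home: Cell
    pages: dict = field(default_factory=dict)


def allocate(vm, cells, n_pages):
    """Use the home cell first, then borrow from the live cells with the
    most free memory (a hypothetical policy, not the paper's)."""
    others = sorted(
        (c for c in cells if c is not vm.home and c.alive),
        key=lambda c: c.free_pages,
        reverse=True,
    )
    donors = [vm.home] + others
    if sum(c.free_pages for c in donors) < n_pages:
        raise MemoryError("not enough free pages in any live cell")
    remaining = n_pages
    for cell in donors:
        take = min(cell.free_pages, remaining)
        if take > 0:
            cell.free_pages -= take
            vm.pages[cell.name] = vm.pages.get(cell.name, 0) + take
            remaining -= take
        if remaining == 0:
            break


def fail_cell(cell, vms):
    """Mark a cell as failed and return the names of the VMs that
    depended on it, either as home cell or through borrowed pages."""
    cell.alive = False
    cell.free_pages = 0
    return [
        vm.name
        for vm in vms
        if vm.home is cell or vm.pages.get(cell.name, 0) > 0
    ]


# Three cells; vm1 outgrows its home cell and borrows from cell b.
a, b, c = Cell("a", 100), Cell("b", 100), Cell("c", 50)
cells = [a, b, c]
vm1 = VirtualMachine("vm1", home=a)
vm2 = VirtualMachine("vm2", home=c)
allocate(vm1, cells, 150)
allocate(vm2, cells, 30)
affected = fail_cell(b, [vm1, vm2])
```

In this toy run, the failure of cell b takes down vm1, which borrowed from it, while vm2 is unaffected. With fewer, larger cells no borrowing would have been needed, but a single fault would then affect more of the machine.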

The paper includes valuable sections discussing other approaches and related work. The work is interesting, well reported, and recommended to all with an interest in operating systems or multiprocessors.

[1] Bugnion, E., Devine, S., Govil, K., and Rosenblum, M. 1997. Disco: Running commodity operating systems on scalable multi-processors. ACM Transactions on Computer Systems, 15, 4 (November), 412-447.
Reviewer: Andrew R. Huber
Review #: CR125507 (0111-0415)
Process Management (D.4.1)
Reliability (D.4.5)
Storage Management (D.4.2)
Processor Architectures (C.1)