Computing Reviews
Recovery management in QuickSilver
Haskin R., Malachi Y., Chan G. ACM Transactions on Computer Systems 6(1): 82-108, 1988. Type: Article
Date Reviewed: Aug 1 1988

QuickSilver is a network operating system for IBM workstations connected by a token ring. QuickSilver provides system services as user-level processes that maintain client state. Servers are resilient to external failures and can recover resources associated with failed clients. The commit protocol and log recovery primitives are available to applications, so servers can tailor recovery techniques to their requirements, trading off simplicity and efficiency against recoverability.

The authors have adopted a high-overhead transaction mechanism in QuickSilver, but with the policy of using it only when necessary. To this end, servers are divided into four types: those with volatile internal state that require only failure signaling, such as the window manager; those that manage replicated volatile state and use transaction commit for atomicity, like the name server; those that manage recoverable state and require the full panoply of recovery mechanisms, like the file server; and those that manipulate long-lived state and require the log service for checkpointing. Only servers that manage recoverable state are truly expensive in QuickSilver. Transaction overhead is further reduced by offering servers alternative commit protocols, so each server can choose how much to pay for recovery.
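The four-way taxonomy above can be made concrete with a small sketch. This is purely illustrative, not code from the paper: the class names, service names, and cost proxy are my own labels for the pay-only-for-what-you-use policy the authors describe.

```python
from enum import Enum, auto

class RecoveryClass(Enum):
    """Hypothetical names for the paper's four server types."""
    VOLATILE = auto()             # e.g., window manager: failure signaling only
    REPLICATED_VOLATILE = auto()  # e.g., name server: commit for atomicity
    RECOVERABLE = auto()          # e.g., file server: full recovery machinery
    LONG_LIVED = auto()           # long-lived state: log-based checkpointing

# Which recovery services each class pays for (illustrative mapping).
SERVICES = {
    RecoveryClass.VOLATILE: {"failure_signal"},
    RecoveryClass.REPLICATED_VOLATILE: {"failure_signal", "commit"},
    RecoveryClass.RECOVERABLE: {"failure_signal", "commit", "log_recovery"},
    RecoveryClass.LONG_LIVED: {"failure_signal", "log_checkpoint"},
}

def recovery_cost(cls: RecoveryClass) -> int:
    """Rough cost proxy: the more services subscribed, the more overhead."""
    return len(SERVICES[cls])
```

Under this sketch, only the recoverable class carries the full set of services, matching the reviewer's observation that it alone is truly expensive.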

Interprocess communication (IPC) addresses in QuickSilver are evidently site-dependent (contrary to the authors' statement in section 2.1), so IPC is location sensitive. Thus services (except for transaction management) are bound to nodes, migration is expensive, and load balancing (usually a fundamental rationale for a network operating system) is probably impractical. The QuickSilver IPC mechanism is heavily loaded, with responsibility for guaranteeing delivery and message ordering, for enforcing security constraints, and for maintaining transaction connectivity graphs. These overheads slow down processes that do not need the benefits provided, and to some extent defeat the authors' goal of paying optional overhead only for optional services.
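Why site-dependent addressing makes migration expensive can be shown in a few lines. The address structure and `migrate` function below are hypothetical, invented for illustration; the point is simply that when a node identifier is baked into the address, moving a service invalidates every client's binding.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IPCAddress:
    """Hypothetical site-dependent address: meaningful only at one node."""
    node_id: int
    local_port: int

def migrate(addr: IPCAddress, new_node: int) -> IPCAddress:
    # Moving the service yields a different address, so every client
    # holding the old one must rebind -- the cost the reviewer notes.
    return IPCAddress(new_node, addr.local_port)
```

A location-transparent scheme would instead name the service independently of its node, letting clients keep their bindings across migration.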

The paper contains a comprehensive review of possible approaches and a wide-ranging survey of the distributed operating system literature.

Reviewer: Jason Gait | Review #: CR112497
Distributed File Systems (D.4.3)
Checkpoint/Restart (D.4.5)
Distributed Databases (H.2.4)
Fault-Tolerance (D.4.5)
File Organization (D.4.3)
Maintenance (D.4.3)
Other reviews under "Distributed File Systems": Date
Distributed file systems: concepts and examples
Levy E., Silberschatz A. ACM Computing Surveys 22(4): 321-374, 1990. Type: Article
Nov 1 1991
Scale and performance in a distributed file system
Howard J., Kazar M., Menees S., Nichols D., Satyanarayanan M., Sidebotham R., West M. ACM Transactions on Computer Systems 6(1): 51-81, 1988. Type: Article
Jul 1 1988
Technique for redundancy control in a distributed hierarchical filestore
Lunn K., Bennett K. Information Technology Research Development Applications 3(3): 157-161, 1984. Type: Article
Dec 1 1985

Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®