Computing Reviews
In search of clusters (2nd ed.)
Pfister G., Prentice-Hall, Inc., Upper Saddle River, NJ, 1998. Type: Book (9780138997090)
Date Reviewed: Nov 1 1998

A true cluster is built around a single system image, a cluster-wide realization of services that are necessary to make the cluster an application platform, such as a clock, a file system, high-performance intermachine communication, interprocessor synchronization, caching, work queues for load balancing, a unified name space for all resources, the ability to administer the cluster as though it were a single machine, and transaction logging. A true cluster is a bunch of sheep decked out so the flock looks and acts like a wolf, and thinks it is one. The author describes one way of realizing a single system image, as used by IBM, that designates a particular machine as an initial “floating master” responsible for maintaining the single system image, and fails over the master functionality to another machine in the cluster when necessary.

It is hard to find cluster-ready applications. The only application I know of that is really cluster-friendly is database or transaction processing. The highly organized database partitions naturally, so it is easy to execute queries in parallel, with each machine in the cluster working on a piece of the database. It is as though clusters and databases were made for each other.
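The partitioning argument can be made concrete with a small scatter-gather sketch. It is an illustration only, not anything taken from the book: the sample rows, the names `PARTITIONS`, `local_query`, and `cluster_query`, and the use of threads to stand in for separate machines are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each inner list stands in for the rows held by one machine.
PARTITIONS = [
    [("alice", 120), ("bob", 75)],
    [("carol", 200), ("dave", 30)],
    [("erin", 95)],
]


def local_query(rows, threshold):
    """Run the query on one partition: sum the amounts above the threshold."""
    return sum(amount for _, amount in rows if amount > threshold)


def cluster_query(partitions, threshold):
    """Scatter the query to every partition, then gather and add the partial sums."""
    # Threads are an assumption standing in for the machines of a cluster.
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partials = list(
            pool.map(lambda rows: local_query(rows, threshold), partitions)
        )
    return sum(partials)


if __name__ == "__main__":
    print(cluster_query(PARTITIONS, 50))
```

Each worker touches only its own partition, and the only shared step is the final addition. That is the sense in which a partitioned database suits a cluster.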

With the current proliferation of symmetric multiprocessing (SMP) machines, together with the relative rarity of clusters, the proponents of clusters are becoming defensive. The author sets out to prove that there are reasons why the present favors clusters over SMPs and that the sooner we all recognize these reasons, the sooner clusters will drive SMPs to oblivion. The first reason advanced is that faster processors and relatively slower off-the-shelf memory lead to SMP architectures that do not scale across more than two processors. I do not buy this because any machine architect upgrading to faster processors is going to find a way to upgrade to faster memory at the same time: that is what the NUMA architecture is all about. The second reason is that high-speed interconnects have become common and cheap, making clusters easier to design and cheaper. I do not give this much weight either. Serious clusters have always sported high-speed interconnects whose cost was commensurate with the cost of the cluster. The cost of a cluster is only a small part of the cost of computing. The third reason is that tools for distributed computing have become ubiquitous, a positive development for clusters. One of the examples advanced by the author of a new tool for distributed computing is TCP, which has been around since the mid-1970s. Fourth, the market needs high availability to support rapidly growing markets in database warehousing and Web service. Here is where the cluster shines--high availability is the design center. The author makes a good case here. Businesses are recentralizing resources to regain control, the Internet is a 24-hour-a-day, seven-day-a-week window to the world, and service must be provided continuously.

A critical aspect of cluster behavior is graceful failover when a machine leaves the cluster and equally graceful task redistribution when a machine is added to the cluster. The chapter on high availability via failover is the high point of the book. The tricky part about failover is dealing with false alarms, when the cluster thinks a machine is dead but it is not. After failover, there are two machines beating against one another to perform the same task, perhaps disastrously. One fix is to convert a potential false alarm to the real thing by disconnecting an apparently failed machine from the cluster.
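The fix described above, turning a suspected failure into a real one before anyone takes over the work, can be sketched roughly as follows. This is a toy model under stated assumptions, not the mechanism of any product the book covers: the `Cluster` class, its heartbeat timeout, and the round-robin redistribution are all hypothetical.

```python
class Cluster:
    """Toy failover model: a silent node is fenced before its tasks are moved."""

    def __init__(self, nodes, timeout):
        self.timeout = timeout                        # seconds of silence tolerated
        self.last_heartbeat = {n: 0.0 for n in nodes}
        self.tasks = {n: [] for n in nodes}           # only unfenced nodes appear here
        self.fenced = set()

    def assign(self, node, task):
        self.tasks[node].append(task)

    def heartbeat(self, node, now):
        """Record a heartbeat; a fenced node is ignored even if it is still alive."""
        if node in self.fenced:
            return False
        self.last_heartbeat[node] = now
        return True

    def check(self, now):
        """Fence every overdue node, then hand its tasks to the survivors."""
        suspects = [
            n for n in sorted(self.tasks)
            if now - self.last_heartbeat[n] > self.timeout
        ]
        for node in suspects:
            # Fence first: a false alarm becomes a real departure, so two
            # machines can never end up performing the same task.
            self.fenced.add(node)
            orphaned = self.tasks.pop(node)
            survivors = sorted(self.tasks)
            if not survivors:
                raise RuntimeError("no live nodes left to take over")
            for i, task in enumerate(orphaned):
                self.tasks[survivors[i % len(survivors)]].append(task)
        return suspects


if __name__ == "__main__":
    cluster = Cluster(["a", "b", "c"], timeout=5.0)
    cluster.assign("b", "t1")
    cluster.assign("b", "t2")
    cluster.heartbeat("a", 10.0)
    cluster.heartbeat("c", 10.0)
    print(cluster.check(10.0), cluster.tasks)
    print(cluster.heartbeat("b", 11.0))
```

In the usage at the bottom, node "b" may only have been slow. Because it is fenced before its tasks are reassigned, its late heartbeat is rejected and it cannot compete with the machines that took over.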

In contrast to a generally lively and authoritative, if long-winded, description of clusters, Pfister provides a simplistic and uninteresting overview of SMP and NUMA machines--he does not like them, and it shows. The book’s flaw is bad editing. The author is exuberant and was allowed free rein for more than 500 pages. A good editor would have kept the book down to 200-odd pages, making it much better. In the end, the book is fun to read but is a weak technical contribution.

Reviewer: Jason Gait
Review #: CR121786 (9811-0858)
 
Distributed Architectures (C.1.4 ...)
Parallelism And Concurrency (F.1.2 ...)
Distributed Systems (C.2.4)
Special-Purpose And Application-Based Systems (C.3)
 

Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®