Computing Reviews
Linear hashing with overflow-handling by linear probing
Larson P. ACM Transactions on Database Systems 10(1): 75-89, 1985. Type: Article
Date Reviewed: Dec 1 1985

This paper presents a linear hashing scheme for files that grow and shrink dynamically. It is written in a concise and clear style and describes a nice new algorithm.

The author first reminds us of existing linear hashing algorithms. In linear hashing schemes, pages are split in a predetermined sequence. To avoid uneven distributions of the records in the file, one can use Linear Hashing with Partial Expansions (LHPE) schemes. The cost of such expansions depends strongly on the cost of locating all the records addressed to a home page, which should therefore be minimized.
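As a reminder of how addresses are typically computed in such schemes, here is a minimal sketch of standard linear-hashing addressing; it is illustrative only, not the author's notation, and the names n_initial, level, and split_ptr are assumptions.

```python
# Illustrative sketch of standard linear-hashing addressing (not the paper's
# exact scheme): the file starts with n_initial pages; during expansion `level`,
# pages 0 .. split_ptr-1 have already been split and use the next-level function.
def home_page(key, n_initial, level, split_ptr):
    addr = hash(key) % (n_initial * 2 ** level)
    if addr < split_ptr:
        # this page was already split in the current expansion, so the
        # record may actually belong to the newly created page
        addr = hash(key) % (n_initial * 2 ** (level + 1))
    return addr
```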

The author suggests an open-addressing method to handle overflow and describes a predetermined split sequence that shortens probe sequences. Pages are numbered, and the overflow records of a full page are directed to the next higher-numbered page that is not full. To avoid clustering of full pages, pages are not split in increasing order; instead, five groups are created and split in turn. These groups consist of buckets evenly distributed over the whole file (i.e., with page numbers increasing by five). Moreover, as records tend to cluster on pages with higher numbers, the buckets within a group are split in decreasing order. Retrieval, insertion, and deletion are done as in linear probing. A specific problem appears when a bucket is split: one locates all the records in the associated probe sequence and, where appropriate, moves them to the new bucket. This may leave holes in the probe sequences of the following buckets, so those records are copied into a “record pool,” reorganized, and some overflow records may be moved closer to their home bucket.
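For illustration only, the following is a minimal page-level sketch of overflow handling by linear probing; the in-memory page list, the capacity parameter, and the function names are assumptions, not the paper's data structures.

```python
# Hedged sketch: pages is a list of lists of records, capacity is the page size.
def insert(pages, capacity, key, home):
    n = len(pages)
    for step in range(n):
        p = (home + step) % n          # probe pages in increasing number, wrapping
        if len(pages[p]) < capacity:   # the first non-full page receives the record
            pages[p].append(key)
            return p
    raise RuntimeError("file is full")

def search(pages, capacity, key, home):
    n = len(pages)
    for step in range(n):
        p = (home + step) % n
        if key in pages[p]:
            return p
        if len(pages[p]) < capacity:   # a non-full page ends the probe sequence
            return None
    return None
```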

Finally, performance figures obtained by simulation are discussed. They are typical of both linear hashing and linear probing schemes: in particular, performance is cyclic and deteriorates rapidly when the load factor exceeds some threshold, say 0.80-0.85. The introduction of five groups and the backward split order makes this algorithm better than previous ones, and the operations are greatly sped up by using more buffer space.

Reviewer: M. Regnier    Review #: CR109508
Categories: Threads (D.4.1); Decision Tables (D.2.2); Hash-Table Representations (E.2); Organization/Structure (E.5); Design Tools and Techniques (D.2.2); Data Storage Representations (E.2)