Computing Reviews, the leading online review service for computing literature.

Search

Linear hashing with overflow-handling by linear probing
Larson P. ACM Transactions on Database Systems10 (1):75-89,1985.Type:Article

Date Reviewed: Dec 1 1985

This paper presents a linear hashing schema for files that grow and shrink dynamically. It is written in a concise and clear style and describes a nice new algorithm. The author first reminds us of existing linear hashing algorithms. In linear hashing schemes, pages are split in a predetermined sequence. To avoid uneven distributions of the records in the file, one can use Linear Hashing with Partial Expansions (LHPE) schemes. The cost of such expansions strongly depends on the cost of searching all the records addressed to a home page, which should then be minimized. The author suggests an open-addressing method to handle overflow and describes a possible predetermined sequence that decreases the length of probe sequences. The pages being numbered, the overflow records of a full page are directed to the next page of higher number that is not full. To avoid clustering of full pages, one does not split pages in increasing order but five groups are created, which are split in turn. These groups are formed of buckets evenly distributed over the whole file (i.e., with numbering increasing by five). Moreover, as records tend to cluster on pages with higher numbers, the buckets in the groups are split in decreasing order. Retrieval, insection, and deletion are done as in linear probling. A specific problem appears when one splits a bucket. One locates all the records of the associated probe sequence and eventually moves them to the new bucket. This may leave holes in the probe sequences of the next buckets. Thus, they are copied in some “record pool,” reorganized, and eventually some overflow records are moved closer to their home bucket. Finally, performances obtained by simulations are discussed. They are typical of both linear hashing and linear probing schemes. In particular, performances are cyclic and deteriorate rapidly when the load factor is greater than some threshold, say 0.80-0.85. It appears that the introduction of five groups and the backwards split order makes this algorithm better than the previous ones. And the operatons are greatly speeded up by using more buffer space.

Reviewer: M. Regnier	Review #: CR109508

Threads (D.4.1 ... )

Decision Tables (D.2.2 ... )

Hash-Table Representations (E.2 ... )

Organization/ Structure (E.5 ... )

Design Tools and Techniques (D.2.2 )

Data Storage Representations (E.2 )

Would you recommend this review?

yes

Other reviews under "Threads":	Date

An analysis of operating system behavior on a simultaneous multithreaded architecture Redstone J., Eggers S., Levy H. Architectural support for programming languages and operating systems (Ninth international conference, Cambridge, Massachusetts, United States, Nov 12-15, 2000)245-256, 2000. Type: Proceedings	Sep 25 2003

Scheduling multithreaded computations by work stealing Blumofe R., Leiserson C. Journal of the ACM 46(5): 720-748, 1999. Type: Article	Jun 1 2000

Design of multithreaded software: the entity-life modeling approach Sandén B., Wiley-IEEE Computer Society Pr, Hoboken, NJ, 2011. 320, Type: Book (978-0-470876-59-6)	Sep 20 2011

Reproduction in whole or in part without permission is prohibited. Copyright 1999-2024 ThinkLoud^®
Terms of Use | Privacy Policy