Linear hashing (LH) is a directoryless, dynamic hashing technique developed by Litwin. LH* is a generalization of LH that allows for hashing in a distributed environment.
A file stored at different sites can be shared by different clients. The client’s image of the file can differ from the file; in particular, the local pointer n′, indicating the next bucket to be split, may differ from the actual pointer n. Thus, the address of a key is calculated by a client and then by a server. This may lead to forwarding the key to another server, after which the client’s file image is adjusted.
A search needs between two and four messages, and insertion needs between one and three messages, not counting messages needed to manage a split, which can be performed asynchronously. Extensive simulations indicate that, for a system using buckets of at least 250 keys, the average number of messages per search is 2.01 and the average number of messages per insert is below 1.05, that is, almost ideal. The number of addressing errors never exceeds log2 (number of buckets), and less active clients--more prone to making addressing errors--make these errors only about 10 percent more often than others. Moreover, the average load factor is 65 to 70 percent, and after the load control is used, the factor increases to between 80 and 95 percent.
The only centralized component of the system is a split coordinator that manages splits and merges of buckets, but the coordinator is not necessary. The authors discuss a variant of LH* without a split coordinator. In that version, the splits are accomplished by cascading them. Another variant concerns concurrent splits, in which a key component is a committed split pointer to indicate that a split is finished and thus can be committed.
The paper is well and clearly written, and it includes helpful examples and diagrams.