3.3. Hashing types

Hashing is a very frequently applied method for storing and retrieving information (in main memory) as fast as possible.

The descriptions of hashing reduce the problem of storing and retrieving arbitrary information to the problem of storing and retrieving integer numbers. This simplification is justified by the universality of binary coding: Each information can be represented as a sequence of zeros and ones, that is, as a number in the binary system.

A conclusion from this universality - not really to be taken seriously - is the instruction how one can go for a walk with the complete knowledge of mankind: One go to all libraries of the world, write all letters of all books down, one after another, code the string obtained in this way with the ASCII coding, and understand the bit sequence just created as one single, gigantic number W. From this number, one calculate the number w=1/W, make a notch in the distance of w centimeters from the lower end of a walking stick, and walk off leaned on the collected knowledge of mankind.

Of course this instruction suffers from the fact that the number W is so huge, and that therefore the reciprocal value w is so small, that there is certainly no molecule “sitting in the distance of w centimeters from the lower end of the walking stick” that one could use for the notching. Likewise, the descriptions of hashing suffer mostly from the fact that the size of the numbers to be stored, that is the amount of information, is not taken into consideration and that one pretends instead that one could write an arbitrarily large number into every memory location of the main memory.

In concrete implementations one therefore restricts oneself to store and to retrieve numbers from a certain range; mostly, this is the interval that is covered by the type int, that is the range [-231..231-1]. Arbitrary information can be stored by mapping it to a value from this interval, the so-called hash value. If one understands this information as a key, one can associate further information to it, thus creating an associative container type. Data structures that employ hashing methods to store keys (and information possibly associated to it) are denoted as hashing types.

This section describes the algorithmic basics and the interface of LEDA's hashing types h_array (hashing array) and map.