'dbm' hash table software

Thu Jul 17 14:14:57 AEST 1986

In article <385 at nike.UUCP> jordan at nike.UUCP (Jordan Hayes) writes:
>dbm was a hack to get 4.2 out the door ...

Not quite: dbm was in V7, and was apparently written by Ken Thompson
in some connection with a chess program.

>4.3 has a new implementation that does the right thing (i.e.,
>dbm_open() returns a DBM *) and the size restriction was raised to
>4096 bytes ...

Why is this `the right thing'?

My `mdbm' implementation allows both the data and the bitmap block
size to be specified at database creation.  A larger data block
allows storing larger items and/or more that hash to the same value,
but also decreases `store' performance.

In any case, to answer the original question, when store() runs
out of room in a data block, it splits it by using one more hash
bit.  This normally results in two nearly-half-full blocks, and
leaves enough room for the new item.  In pathological cases, dbm
may have to split blocks some number of times before discovering
that there is no way to fit the new item into its eventual destination.
This happens only when several large items hash to the same value,
which is rather rare, and in this case the new item is left out of
the database and an error is returned.  The database might then
appear extremely large (as much as 2^32 bytes), but should still
be properly formed internally.

Of course, if the operating system rejects any particular transfer,
for whatever reason, the database can become corrupted.  In such
cases store should also return an error.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 1516)
UUCP:	seismo!umcp-cs!chris
CSNet:	chris at umcp-cs		ARPA:	chris at mimsy.umd.edu