1cdb(5)                        File Formats Manual                       cdb(5)
2
3
4

NAME

6       cdb - Constant DataBase file format
7
8

DESCRIPTION

10       A  cdb database is a single file used to map `keys' to `values', having
11       records of (key,value) pairs.  File consists of 3 parts: toc (table  of
12       contents), data and index (hash tables).
13
14       Toc  has  fixed  length  of 2048 bytes, containing 256 pointers to hash
15       tables inside index sections.  Every pointer consists of position of  a
16       hash  table in bytes from the beginning of a file, and a size of a hash
17       table in entries, both are 4-bytes (32 bits) unsigned integers in  lit‐
18       tle-endian  form.  Hash table length may have zero length, meaning that
19       corresponding hash table is empty.
20
21       Right after toc section, data section follows  without  any  alingment.
22       It  consists  of  series of records, each is a key length, value (data)
23       length, key and value.  Again, key and value length are 4-byte unsigned
24       integers.  Each next record follows previous without any special align‐
25       ment.
26
27       After data section, index (hash tables) section follows.  It should  be
28       looked  to  in conjunction with toc section, where each of max 256 hash
29       tables are defined.  Index section consists of series of  hash  tables,
30       with  starting  position and length defined in toc section.  Every hash
31       table is a sequence of records each holds two numbers: key's hash value
32       and  record position inside data section (bytes from the beginning of a
33       file to first byte of key length  starting  data  record).   If  record
34       position  is  zero,  then  this is an empty hash table slot, pointed to
35       nowhere.
36
37       CDB hash function is
38         hv = ((hv << 5) + hv) ^ c
39       for every single c byte of a key, starting with hv = 5381.
40
41       Toc section indexed by (hv % 256), i.e. hash value modulo  256  (number
42       of entries in toc section).
43
44       In  order  to  find a record, one should: first, compute the hash value
45       (hv) of a key.  Second, look to hash table number hv modulo 256.  If it
46       is  empty,  then there is no such key exists.  If it is not empty, then
47       third, loop by slots inside that hash table, starting  from  slot  with
48       number  hv divided by 256 modulo length of that table, or ((hv / 256) %
49       htlen), searching for this hv in hash table.  Stop search on empty slot
50       (if  record position is zero) or when all slots was probed (note cyclic
51       search, jumping from end to beginning of a table).  When hash value  in
52       question  is  found in hash table, look to key of corresponding record,
53       comparing it with key in question.  If them  of  the  same  length  and
54       equals  to each other, then record is found, overwise, repeat with next
55       hash table slot.  Note that there may be several records with the  same
56       key.
57
58

SEE ALSO

60       cdb(1), cdb(3).
61
62

AUTHOR

64       The  tinycdb  package written by Michael Tokarev <mjt@corpit.ru>, based
65       on ideas and shares file format with original cdb library by Dan  Bern‐
66       stein.
67
68

LICENSE

70       Public domain.
71
72
73
74                                   Apr, 2005                            cdb(5)
Impressum