DBM::Deep::Internals(3pm)

1DBM::Deep::Internals(3pUms)er Contributed Perl DocumentatDiBoMn::Deep::Internals(3pm)
2
3
4

NAME

6       DBM::Deep::Internals - Out of date documentation on DBM::Deep internals
7

OUT OF DATE

9       This document is out-of-date. It describes an intermediate file format
10       used during the development from 0.983 to 1.0000. It will be rewritten
11       soon.
12
13       So far, the description of the header format has been updated.
14

DESCRIPTION

16       This is a document describing the internal workings of DBM::Deep. It is
17       not necessary to read this document if you only intend to be a user.
18       This document is intended for people who either want a deeper
19       understanding of specifics of how DBM::Deep works or who wish to help
20       program DBM::Deep.
21

CLASS LAYOUT

23       DBM::Deep is broken up into five classes in three inheritance
24       hierarchies.
25
26       •   DBM::Deep is the parent of DBM::Deep::Array and DBM::Deep::Hash.
27           These classes form the immediate interface to the outside world.
28           They are the classes that provide the TIE mechanisms as well as the
29           OO methods.
30
31       •   DBM::Deep::Engine is the layer that deals with the mechanics of
32           reading and writing to the file. This is where the logic of the
33           file layout is handled.
34
35       •   DBM::Deep::File is the layer that deals with the physical file. As
36           a singleton that every other object has a reference to, it also
37           provides a place to handle datastructure-wide items, such as
38           transactions.
39

FILE LAYOUT

41       This describes the 1.0003 and 2.0000 formats, which internally are
42       numbered 3 and 4, respectively. The internal numbers are used in this
43       section. These two formats are almost identical.
44
45       DBM::Deep uses a tagged file layout. Every section has a tag, a size,
46       and then the data.
47
48   File header
49       The file header consists of two parts. The first part is a fixed length
50       of 13 bytes:
51
52         DPDB h VVVV SSSS
53         \  / |    \   \
54          \/  '---. \   '--- size of the second part of the header
55         file      \ '--- version
56        signature  tag
57
58       •   File Signature
59
60           The first four bytes are 'DPDB' in network byte order, signifying
61           that this is a DBM::Deep file.
62
63       •   File tag
64
65           A literal ASCII 'h', indicating that this is the header. The file
66           used by versions prior to 1.00 had a different fifth byte, allowing
67           the difference to be determined.
68
69       •   Version
70
71           This is four bytes containing the file version. This lets the file
72           format change over time.
73
74           It is packed in network order, so version 4 is stored as
75           "\0\0\0\cD".
76
77       •   Header size
78
79           The size of the second part of the header, in bytes. This number is
80           also packed in network order.
81
82       The second part of the header is as follows:
83
84         S B S T T(TTTTTTTTT...) (SS SS SS SS ...)  (continued...)
85         | | | |              \       |
86         | | | '----------.    \  staleness counters
87         | | '--------.    \  txn bitfield
88         | '------.    \  number of transactions
89        byte size  \  data sector size
90                 max buckets
91
92        (continuation...)
93         BB(BBBBBB) DD(DDDDDD) II(IIIIII)
94             |        |            |
95             |    free data        |
96         free blist           free index
97
98       •   Constants
99
100           These are the file-wide constants that determine how the file is
101           laid out.  They can only be set upon file creation.
102
103           The byte size is the number of bytes used to point to an offset
104           elsewhere in the file. This corresponds to the "pack_size" option.
105           This and the next three values are stored as packed 8-bit integers
106           (chars), so 2 is represented by "\cB".
107
108           "max_buckets" and "data_sector_size" are documented in the main
109           DBM::Deep man page. The number stored is actually one less than
110           what is passed to the constructor, to allow for a range of 1-256.
111
112           The number of transactions corresponds to the "num_txns" value
113           passed to the constructor.
114
115       •   Transaction information
116
117           The transaction bitfield consists of one bit for every available
118           transaction ID. It is therefore anywhere from 1 byte to 32 bytes
119           long.
120
121           The staleness counters each take two bytes (packed 32-bit
122           integers), one for each transaction, not including the so-called
123           HEAD (the main transaction that all processes share before calling
124           "begin_work"). So these take up 0 to 508 bytes.
125
126           Staleness is explained in DBM::Deep::Engine.
127
128       •   Freespace information
129
130           Pointers into the first free sectors of the various sector sizes
131           (Index, Bucketlist, and Data) are stored here. These are called
132           chains internally, as each free sector points to the next one.
133
134           The number of bytes is determined by the byte size, ranging from 2
135           to 8.
136
137   Index
138       The Index parts can be tagged either as Hash, Array, or Index. The
139       latter is if there was a reindexing due to a bucketlist growing too
140       large. The others are the root index for their respective datatypes.
141       The index consists of a tag, a size, and then 256 sections containing
142       file locations. Each section corresponds to each value representable in
143       a byte.
144
145       The index is used as follows - whenever a hashed key is being looked
146       up, the first byte is used to determine which location to go to from
147       the root index.  Then, if that's also an index, the second byte is
148       used, and so forth until a bucketlist is found.
149
150   Bucketlist
151       This is the part that contains the link to the data section. A
152       bucketlist defaults to being 16 buckets long (modifiable by the
153       max_buckets parameter used when creating a new file). Each bucket
154       contains an MD5 and a location of the appropriate key section.
155
156   Key area
157       This is the part that handles transactional awareness. There are
158       max_buckets sections. Each section contains the location to the data
159       section, a transaction ID, and whether that transaction considers this
160       key to be deleted or not.
161
162   Data area
163       This is the part that actual stores the key, value, and class (if
164       appropriate). The layout is:
165
166       •   tag
167
168       •   length of the value
169
170       •   the actual value
171
172       •   keylength
173
174       •   the actual key
175
176       •   a byte indicating if this value has a classname
177
178       •   the classname (if one is there)
179
180       The key is stored after the value because the value is requested more
181       often than the key.
182

PERFORMANCE

184       DBM::Deep is written completely in Perl. It also is a multi-process DBM
185       that uses the datafile as a method of synchronizing between multiple
186       processes. This is unlike most RDBMSes like MySQL and Oracle.
187       Furthermore, unlike all RDBMSes, DBM::Deep stores both the data and the
188       structure of that data as it would appear in a Perl program.
189
190   CPU
191       DBM::Deep attempts to be CPU-light. As it stores all the data on disk,
192       DBM::Deep is I/O-bound, not CPU-bound.
193
194   RAM
195       DBM::Deep uses extremely little RAM relative to the amount of data you
196       can access. You can iterate through a million keys (using each())
197       without increasing your memory usage at all.
198
199   DISK
200       DBM::Deep is I/O-bound, pure and simple. The faster your disk, the
201       faster DBM::Deep will be. Currently, when performing "my $x =
202       $db->{foo}", there are a minimum of 4 seeks and 1332 + N bytes read
203       (where N is the length of your data). (All values assume a medium
204       filesize.) The actions taken are:
205
206       1 Lock the file
207       1 Perform a stat() to determine if the inode has changed
208       1 Go to the primary index for the $db (1 seek)
209       1 Read the tag/size of the primary index (5 bytes)
210       1 Read the body of the primary index (1024 bytes)
211       1 Go to the bucketlist for this MD5 (1 seek)
212       1 Read the tag/size of the bucketlist (5 bytes)
213       1 Read the body of the bucketlist (144 bytes)
214       1 Go to the keys location for this MD5 (1 seek)
215       1 Read the tag/size of the keys section (5 bytes)
216       1 Read the body of the keys location (144 bytes)
217       1 Go to the data section that corresponds to this transaction ID. (1
218       seek)
219       1 Read the tag/size of the data section (5 bytes)
220       1 Read the value for this data (N bytes)
221       1 Unlock the file
222
223       Every additional level of indexing (if there are enough keys) requires
224       an additional seek and the reading of 1029 additional bytes. If the
225       value is blessed, an additional 1 seek and 9 + M bytes are read (where
226       M is the length of the classname).
227
228       Arrays are (currently) even worse because they're considered "funny
229       hashes" with the length stored as just another key. This means that if
230       you do any sort of lookup with a negative index, this entire process is
231       performed twice - once for the length and once for the value.
232

ACTUAL TESTS

234   SPEED
235       Obviously, DBM::Deep isn't going to be as fast as some C-based DBMs,
236       such as the almighty BerkeleyDB.  But it makes up for it in features
237       like true multi-level hash/array support, and cross-platform FTPable
238       files.  Even so, DBM::Deep is still pretty fast, and the speed stays
239       fairly consistent, even with huge databases.  Here is some test data:
240
241           Adding 1,000,000 keys to new DB file...
242
243           At 100 keys, avg. speed is 2,703 keys/sec
244           At 200 keys, avg. speed is 2,642 keys/sec
245           At 300 keys, avg. speed is 2,598 keys/sec
246           At 400 keys, avg. speed is 2,578 keys/sec
247           At 500 keys, avg. speed is 2,722 keys/sec
248           At 600 keys, avg. speed is 2,628 keys/sec
249           At 700 keys, avg. speed is 2,700 keys/sec
250           At 800 keys, avg. speed is 2,607 keys/sec
251           At 900 keys, avg. speed is 2,190 keys/sec
252           At 1,000 keys, avg. speed is 2,570 keys/sec
253           At 2,000 keys, avg. speed is 2,417 keys/sec
254           At 3,000 keys, avg. speed is 1,982 keys/sec
255           At 4,000 keys, avg. speed is 1,568 keys/sec
256           At 5,000 keys, avg. speed is 1,533 keys/sec
257           At 6,000 keys, avg. speed is 1,787 keys/sec
258           At 7,000 keys, avg. speed is 1,977 keys/sec
259           At 8,000 keys, avg. speed is 2,028 keys/sec
260           At 9,000 keys, avg. speed is 2,077 keys/sec
261           At 10,000 keys, avg. speed is 2,031 keys/sec
262           At 20,000 keys, avg. speed is 1,970 keys/sec
263           At 30,000 keys, avg. speed is 2,050 keys/sec
264           At 40,000 keys, avg. speed is 2,073 keys/sec
265           At 50,000 keys, avg. speed is 1,973 keys/sec
266           At 60,000 keys, avg. speed is 1,914 keys/sec
267           At 70,000 keys, avg. speed is 2,091 keys/sec
268           At 80,000 keys, avg. speed is 2,103 keys/sec
269           At 90,000 keys, avg. speed is 1,886 keys/sec
270           At 100,000 keys, avg. speed is 1,970 keys/sec
271           At 200,000 keys, avg. speed is 2,053 keys/sec
272           At 300,000 keys, avg. speed is 1,697 keys/sec
273           At 400,000 keys, avg. speed is 1,838 keys/sec
274           At 500,000 keys, avg. speed is 1,941 keys/sec
275           At 600,000 keys, avg. speed is 1,930 keys/sec
276           At 700,000 keys, avg. speed is 1,735 keys/sec
277           At 800,000 keys, avg. speed is 1,795 keys/sec
278           At 900,000 keys, avg. speed is 1,221 keys/sec
279           At 1,000,000 keys, avg. speed is 1,077 keys/sec
280
281       This test was performed on a PowerMac G4 1gHz running Mac OS X 10.3.2 &
282       Perl 5.8.1, with an 80GB Ultra ATA/100 HD spinning at 7200RPM.  The
283       hash keys and values were between 6 - 12 chars in length.  The DB file
284       ended up at 210MB.  Run time was 12 min 3 sec.
285
286   MEMORY USAGE
287       One of the great things about DBM::Deep is that it uses very little
288       memory.  Even with huge databases (1,000,000+ keys) you will not see
289       much increased memory on your process.  DBM::Deep relies solely on the
290       filesystem for storing and fetching data.  Here is output from top
291       before even opening a database handle:
292
293           PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
294         22831 root      11   0  2716 2716  1296 R     0.0  0.2   0:07 perl
295
296       Basically the process is taking 2,716K of memory.  And here is the same
297       process after storing and fetching 1,000,000 keys:
298
299           PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
300         22831 root      14   0  2772 2772  1328 R     0.0  0.2  13:32 perl
301
302       Notice the memory usage increased by only 56K.  Test was performed on a
303       700mHz x86 box running Linux RedHat 7.2 & Perl 5.6.1.
304
305
306
307perl v5.38.0                      2023-09-03         DBM::Deep::Internals(3pm)