DBM::Deep(3pm)

DBM::Deep(3)          User Contributed Perl Documentation         DBM::Deep(3)

NAME

6       DBM::Deep - A pure perl multi-level hash/array DBM
7

SYNOPSIS

9         use DBM::Deep;
10         my $db = DBM::Deep->new( "foo.db" );
11
12         $db->{key} = 'value'; # tie() style
13         print $db->{key};
14
15         $db->put('key' => 'value'); # OO style
16         print $db->get('key');
17
18         # true multi-level support
19         $db->{my_complex} = [
20               'hello', { perl => 'rules' },
21               42, 99,
22         ];
23

DESCRIPTION

25       A unique flat-file database module, written in pure perl.  True multi-
26       level hash/array support (unlike MLDBM, which is faked), hybrid OO /
27       tie() interface, cross-platform FTPable files, and quite fast.  Can
28       handle millions of keys and unlimited hash levels without significant
29       slow-down.  Written from the ground-up in pure perl -- this is NOT a
30       wrapper around a C-based DBM.  Out-of-the-box compatibility with Unix,
31       Mac OS X and Windows.
32

INSTALLATION

34       Hopefully you are using Perl's excellent CPAN module, which will
35       download and install the module for you.  If not, get the tarball, and
36       run these commands:
37
38               tar zxf DBM-Deep-*
39               cd DBM-Deep-*
40               perl Makefile.PL
41               make
42               make test
43               make install
44

SETUP

46       Construction can be done OO-style (which is the recommended way), or
47       using Perl's tie() function.  Both are examined here.
48
49   OO CONSTRUCTION
50       The recommended way to construct a DBM::Deep object is to use the new()
51       method, which gets you a blessed, tied hash or array reference.
52
53               my $db = DBM::Deep->new( "foo.db" );
54
55       This opens a new database handle, mapped to the file "foo.db".  If this
56       file does not exist, it will automatically be created.  DB files are
57       opened in "r+" (read/write) mode, and the type of object returned is a
58       hash, unless otherwise specified (see OPTIONS below).
59
60       You can pass a number of options to the constructor to specify things
61       like locking, autoflush, etc.  This is done by passing an inline hash:
62
63               my $db = DBM::Deep->new(
64                       file => "foo.db",
65                       locking => 1,
66                       autoflush => 1
67               );
68
69       Notice that the filename is now specified inside the hash with the
70       "file" parameter, as opposed to being the sole argument to the
71       constructor.  This is required if any options are specified.  See
72       OPTIONS below for the complete list.
73
74       You can also start with an array instead of a hash.  For this, you must
75       specify the "type" parameter:
76
77               my $db = DBM::Deep->new(
78                       file => "foo.db",
79                       type => DBM::Deep->TYPE_ARRAY
80               );
81
       Note: Specifying the "type" parameter only takes effect when beginning a
       new DB file.  If you create a DBM::Deep object with an existing file,
       the "type" will be loaded from the file header, and an error will be
       thrown if the wrong type is passed in.
86
87   TIE CONSTRUCTION
88       Alternately, you can create a DBM::Deep handle by using Perl's built-in
89       tie() function.  The object returned from tie() can be used to call
90       methods, such as lock() and unlock(), but cannot be used to assign to
91       the DBM::Deep file (as expected with most tie'd objects).
92
93               my %hash;
94               my $db = tie %hash, "DBM::Deep", "foo.db";
95
96               my @array;
97               my $db = tie @array, "DBM::Deep", "bar.db";
98
99       As with the OO constructor, you can replace the DB filename parameter
100       with a hash containing one or more options (see OPTIONS just below for
101       the complete list).
102
103               tie %hash, "DBM::Deep", {
104                       file => "foo.db",
105                       locking => 1,
106                       autoflush => 1
107               };
108
109   OPTIONS
110       There are a number of options that can be passed in when constructing
111       your DBM::Deep objects.  These apply to both the OO- and tie- based
112       approaches.
113
114       ·   file
115
116           Filename of the DB file to link the handle to.  You can pass a full
117           absolute filesystem path, partial path, or a plain filename if the
118           file is in the current working directory.  This is a required
119           parameter (though q.v. fh).
120
121       ·   fh
122
123           If you want, you can pass in the fh instead of the file. This is
124           most useful for doing something like:
125
126             my $db = DBM::Deep->new( { fh => \*DATA } );
127
128           You are responsible for making sure that the fh has been opened
129           appropriately for your needs. If you open it read-only and attempt
130           to write, an exception will be thrown. If you open it write-only or
131           append-only, an exception will be thrown immediately as DBM::Deep
132           needs to read from the fh.
133
134       ·   file_offset
135
136           This is the offset within the file that the DBM::Deep db starts.
137           Most of the time, you will not need to set this. However, it's
138           there if you want it.
139
140           If you pass in fh and do not set this, it will be set
141           appropriately.
142
143       ·   type
144
145           This parameter specifies what type of object to create, a hash or
146           array.  Use one of these two constants: "DBM::Deep->TYPE_HASH" or
147           "DBM::Deep->TYPE_ARRAY".  This only takes effect when beginning a
148           new file.  This is an optional parameter, and defaults to
149           "DBM::Deep->TYPE_HASH".
150
151       ·   locking
152
           Specifies whether locking is to be enabled.  DBM::Deep uses Perl's
           flock() function (with the lock constants from the Fcntl module) to
           lock the database in exclusive mode for writes, and shared mode for
           reads.  Pass any true value to enable.  This affects the base DB
           handle and any child hashes or arrays that use the same DB file.
           This is an optional parameter, and defaults to 0 (disabled).  See
           LOCKING below for more.
159
160       ·   autoflush
161
           Specifies whether autoflush is to be enabled on the underlying
           filehandle.  This slows down write operations, but is required if
           you may have multiple processes accessing the same DB file (also
           consider enabling locking).  Pass any true value to enable.  This
           is an optional parameter, and defaults to 0 (disabled).
168
169       ·   autobless
170
171           If autobless mode is enabled, DBM::Deep will preserve blessed
172           hashes, and restore them when fetched.  This is an experimental
173           feature, and does have side-effects.  Basically, when hashes are
174           re-blessed into their original classes, they are no longer blessed
175           into the DBM::Deep class!  So you won't be able to call any
176           DBM::Deep methods on them.  You have been warned.  This is an
177           optional parameter, and defaults to 0 (disabled).
178
179       ·   filter_*
180
181           See FILTERS below.
182
183       ·   debug
184
185           Setting debug mode will make all errors non-fatal, dump them out to
186           STDERR, and continue on.  This is for debugging purposes only, and
187           probably not what you want.  This is an optional parameter, and
188           defaults to 0 (disabled).
189
190           NOTE: This parameter is considered deprecated and should not be
191           used anymore.
192

TIE INTERFACE

194       With DBM::Deep you can access your databases using Perl's standard
195       hash/array syntax.  Because all DBM::Deep objects are tied to hashes or
196       arrays, you can treat them as such.  DBM::Deep will intercept all
197       reads/writes and direct them to the right place -- the DB file.  This
198       has nothing to do with the "TIE CONSTRUCTION" section above.  This
199       simply tells you how to use DBM::Deep using regular hashes and arrays,
200       rather than calling functions like "get()" and "put()" (although those
       work too).  It is entirely up to you how you want to access your
       databases.
203
204   HASHES
205       You can treat any DBM::Deep object like a normal Perl hash reference.
206       Add keys, or even nested hashes (or arrays) using standard Perl syntax:
207
208               my $db = DBM::Deep->new( "foo.db" );
209
210               $db->{mykey} = "myvalue";
211               $db->{myhash} = {};
212               $db->{myhash}->{subkey} = "subvalue";
213
214               print $db->{myhash}->{subkey} . "\n";
215
216       You can even step through hash keys using the normal Perl "keys()"
217       function:
218
219               foreach my $key (keys %$db) {
220                       print "$key: " . $db->{$key} . "\n";
221               }
222
223       Remember that Perl's "keys()" function extracts every key from the hash
224       and pushes them onto an array, all before the loop even begins.  If you
225       have an extra large hash, this may exhaust Perl's memory.  Instead,
226       consider using Perl's "each()" function, which pulls keys/values one at
227       a time, using very little memory:
228
229               while (my ($key, $value) = each %$db) {
230                       print "$key: $value\n";
231               }
232
233       Please note that when using "each()", you should always pass a direct
234       hash reference, not a lookup.  Meaning, you should never do this:
235
236               # NEVER DO THIS
237               while (my ($key, $value) = each %{$db->{foo}}) { # BAD
238
       This causes an infinite loop, because for each iteration, Perl is
       calling FETCH() on the $db handle, resulting in a "new" hash for foo
       every time, so it effectively keeps returning the first key over and
       over again.  Instead, assign "$db->{foo}" to a temporary variable, then
       pass that to each().
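
       The temporary-variable pattern described above can be sketched as
       follows.  This is a minimal illustration, not taken from the module's
       own examples: the temporary directory, the file name and the %copy
       hash are assumptions made so the snippet can run on its own.

```perl
use strict;
use warnings;
use DBM::Deep;
use File::Temp qw(tempdir);

# Assumption: a throwaway directory so the example does not touch real data.
my $dir = tempdir( CLEANUP => 1 );
my $db  = DBM::Deep->new( "$dir/foo.db" );

$db->{foo} = { a => 1, b => 2 };

# Fetch the child hash ONCE, then iterate over that single handle.
my $foo = $db->{foo};

my %copy;
while ( my ( $key, $value ) = each %$foo ) {
    $copy{$key} = $value;
    print "$key: $value\n";
}
```

       Because $foo is fetched only once, each() keeps its iterator state
       between calls and the loop ends after the last key.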
244
245   ARRAYS
246       As with hashes, you can treat any DBM::Deep object like a normal Perl
247       array reference.  This includes inserting, removing and manipulating
248       elements, and the "push()", "pop()", "shift()", "unshift()" and
249       "splice()" functions.  The object must have first been created using
250       type "DBM::Deep->TYPE_ARRAY", or simply be a nested array reference
251       inside a hash.  Example:
252
253               my $db = DBM::Deep->new(
254                       file => "foo-array.db",
255                       type => DBM::Deep->TYPE_ARRAY
256               );
257
258               $db->[0] = "foo";
259               push @$db, "bar", "baz";
260               unshift @$db, "bah";
261
262               my $last_elem = pop @$db; # baz
263               my $first_elem = shift @$db; # bah
264               my $second_elem = $db->[1]; # bar
265
266               my $num_elements = scalar @$db;
267

OO INTERFACE

269       In addition to the tie() interface, you can also use a standard OO
270       interface to manipulate all aspects of DBM::Deep databases.  Each type
271       of object (hash or array) has its own methods, but both types share the
272       following common methods: "put()", "get()", "exists()", "delete()" and
273       "clear()".
274
275       ·   new() / clone()
276
277           These are the constructor and copy-functions.
278
279       ·   put() / store()
280
281           Stores a new hash key/value pair, or sets an array element value.
282           Takes two arguments, the hash key or array index, and the new
283           value.  The value can be a scalar, hash ref or array ref.  Returns
284           true on success, false on failure.
285
286                   $db->put("foo", "bar"); # for hashes
287                   $db->put(1, "bar"); # for arrays
288
289       ·   get() / fetch()
290
291           Fetches the value of a hash key or array element.  Takes one
292           argument: the hash key or array index.  Returns a scalar, hash ref
293           or array ref, depending on the data type stored.
294
295                   my $value = $db->get("foo"); # for hashes
296                   my $value = $db->get(1); # for arrays
297
298       ·   exists()
299
300           Checks if a hash key or array index exists.  Takes one argument:
301           the hash key or array index.  Returns true if it exists, false if
302           not.
303
304                   if ($db->exists("foo")) { print "yay!\n"; } # for hashes
305                   if ($db->exists(1)) { print "yay!\n"; } # for arrays
306
307       ·   delete()
308
309           Deletes one hash key/value pair or array element.  Takes one
310           argument: the hash key or array index.  Returns true on success,
311           false if not found.  For arrays, the remaining elements located
312           after the deleted element are NOT moved over.  The deleted element
313           is essentially just undefined, which is exactly how Perl's internal
314           arrays work.  Please note that the space occupied by the deleted
315           key/value or element is not reused again -- see "UNUSED SPACE
316           RECOVERY" below for details and workarounds.
317
318                   $db->delete("foo"); # for hashes
319                   $db->delete(1); # for arrays
320
321       ·   clear()
322
323           Deletes all hash keys or array elements.  Takes no arguments.  No
324           return value.  Please note that the space occupied by the deleted
325           keys/values or elements is not reused again -- see "UNUSED SPACE
326           RECOVERY" below for details and workarounds.
327
328                   $db->clear(); # hashes or arrays
329
330       ·   lock() / unlock()
331
332           q.v. Locking.
333
334       ·   optimize()
335
336           Recover lost disk space.
337
338       ·   import() / export()
339
340           Data going in and out.
341
342       ·   set_digest() / set_pack() / set_filter()
343
           q.v. adjusting the internal parameters.
345
346       ·   error() / clear_error()
347
           Error handling methods.  These are deprecated and will be removed in
           1.00.
350
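       The note under delete() says that deleting an array element leaves the
       later elements in place, as with Perl's own arrays.  The plain-Perl
       sketch below (no DBM::Deep involved; the variable names are
       illustrative) shows that behaviour next to splice(), which does close
       the gap.

```perl
use strict;
use warnings;

# delete() on an element leaves a hole; nothing after it is moved.
my @list = ( "foo", "bar", "baz" );
delete $list[1];
my $length_after_delete = scalar @list;
my $slot_exists         = exists $list[1] ? 1 : 0;

# splice() removes the element and shifts the rest down.
my @spliced = ( "foo", "bar", "baz" );
splice( @spliced, 1, 1 );
my $length_after_splice = scalar @spliced;

print "after delete: length $length_after_delete, slot exists: $slot_exists\n";
print "after splice: length $length_after_splice\n";
```

       If the remaining elements need to move up, use splice() instead of
       delete(), keeping in mind the caveat about large arrays.
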
351   HASHES
352       For hashes, DBM::Deep supports all the common methods described above,
353       and the following additional methods: "first_key()" and "next_key()".
354
355       ·   first_key()
356
357           Returns the "first" key in the hash.  As with built-in Perl hashes,
358           keys are fetched in an undefined order (which appears random).
359           Takes no arguments, returns the key as a scalar value.
360
361                   my $key = $db->first_key();
362
363       ·   next_key()
364
365           Returns the "next" key in the hash, given the previous one as the
366           sole argument.  Returns undef if there are no more keys to be
367           fetched.
368
369                   $key = $db->next_key($key);
370
371       Here are some examples of using hashes:
372
               my $db = DBM::Deep->new( "foo.db" );

               $db->put("foo", "bar");
               print "foo: " . $db->get("foo") . "\n";

               $db->put("baz", {}); # new child hash ref
               $db->get("baz")->put("buz", "biz");
               print "buz: " . $db->get("baz")->get("buz") . "\n";

               # Test with defined(): a key such as "0" is false but still valid
               my $key = $db->first_key();
               while (defined $key) {
                       print "$key: " . $db->get($key) . "\n";
                       $key = $db->next_key($key);
               }

               if ($db->exists("foo")) { $db->delete("foo"); }

389
390   ARRAYS
391       For arrays, DBM::Deep supports all the common methods described above,
392       and the following additional methods: "length()", "push()", "pop()",
393       "shift()", "unshift()" and "splice()".
394
395       ·   length()
396
397           Returns the number of elements in the array.  Takes no arguments.
398
399                   my $len = $db->length();
400
401       ·   push()
402
403           Adds one or more elements onto the end of the array.  Accepts
404           scalars, hash refs or array refs.  No return value.
405
406                   $db->push("foo", "bar", {});
407
408       ·   pop()
409
           Fetches the last element in the array, and deletes it.  Takes no
           arguments.  Returns the element value, or undef if the array is
           empty.
413
414                   my $elem = $db->pop();
415
416       ·   shift()
417
418           Fetches the first element in the array, deletes it, then shifts all
419           the remaining elements over to take up the space.  Returns the
420           element value.  This method is not recommended with large arrays --
421           see "LARGE ARRAYS" below for details.
422
423                   my $elem = $db->shift();
424
425       ·   unshift()
426
           Inserts one or more elements onto the beginning of the array,
           shifting all existing elements over to make room.  Accepts scalars,
           hash refs or array refs.  No return value.  This method is not
           recommended with large arrays -- see "LARGE ARRAYS" below for
           details.
432
433                   $db->unshift("foo", "bar", {});
434
435       ·   splice()
436
437           Performs exactly like Perl's built-in function of the same name.
438           See "perldoc -f splice" for usage -- it is too complicated to
439           document here.  This method is not recommended with large arrays --
440           see "LARGE ARRAYS" below for details.
441
442       Here are some examples of using arrays:
443
444               my $db = DBM::Deep->new(
445                       file => "foo.db",
446                       type => DBM::Deep->TYPE_ARRAY
447               );
448
449               $db->push("bar", "baz");
450               $db->unshift("foo");
451               $db->put(3, "buz");
452
453               my $len = $db->length();
454               print "length: $len\n"; # 4
455
456               for (my $k=0; $k<$len; $k++) {
457                       print "$k: " . $db->get($k) . "\n";
458               }
459
460               $db->splice(1, 2, "biz", "baf");
461
462               while (my $elem = shift @$db) {
463                       print "shifted: $elem\n";
464               }
465

LOCKING

467       Enable automatic file locking by passing a true value to the "locking"
468       parameter when constructing your DBM::Deep object (see SETUP above).
469
470               my $db = DBM::Deep->new(
471                       file => "foo.db",
472                       locking => 1
473               );
474
475       This causes DBM::Deep to "flock()" the underlying filehandle with
476       exclusive mode for writes, and shared mode for reads.  This is required
477       if you have multiple processes accessing the same database file, to
478       avoid file corruption.  Please note that "flock()" does NOT work for
479       files over NFS.  See "DB OVER NFS" below for more.
480
481   EXPLICIT LOCKING
482       You can explicitly lock a database, so it remains locked for multiple
483       transactions.  This is done by calling the "lock()" method, and passing
484       an optional lock mode argument (defaults to exclusive mode).  This is
485       particularly useful for things like counters, where the current value
486       needs to be fetched, then incremented, then stored again.
487
488               $db->lock();
489               my $counter = $db->get("counter");
490               $counter++;
491               $db->put("counter", $counter);
492               $db->unlock();
493
494               # or...
495
496               $db->lock();
497               $db->{counter}++;
498               $db->unlock();
499
500       You can pass "lock()" an optional argument, which specifies which mode
501       to use (exclusive or shared).  Use one of these two constants:
502       "DBM::Deep->LOCK_EX" or "DBM::Deep->LOCK_SH".  These are passed
503       directly to "flock()", and are the same as the constants defined in
504       Perl's "Fcntl" module.
505
506               $db->lock( DBM::Deep->LOCK_SH );
507               # something here
508               $db->unlock();
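
       If the code between lock() and unlock() can die, the lock should still
       be released.  One way to arrange that is sketched below.  The
       increment_counter() helper, the temporary directory and the file name
       are assumptions for illustration only; lock(), unlock() and the
       "locking" option are used as described above.

```perl
use strict;
use warnings;
use DBM::Deep;
use File::Temp qw(tempdir);

# Assumption: a throwaway directory so the example does not touch real data.
my $dir = tempdir( CLEANUP => 1 );
my $db  = DBM::Deep->new( file => "$dir/counter.db", locking => 1 );

# Hypothetical helper: increments a counter under an exclusive lock and
# always releases the lock, even if the update dies.
sub increment_counter {
    my ($handle) = @_;

    $handle->lock();
    my $new_value;
    my $ok = eval {
        $new_value = ( $handle->get("counter") || 0 ) + 1;
        $handle->put( "counter", $new_value );
        1;
    };
    my $error = $@;
    $handle->unlock();

    die $error unless $ok;    # re-throw after the lock is released
    return $new_value;
}

my $value;
$value = increment_counter($db) for 1 .. 3;
print "counter: $value\n";
```

       The new value is captured while the lock is still held, so the caller
       never reads the counter after another process may have changed it.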
509

IMPORTING/EXPORTING

511       You can import existing complex structures by calling the "import()"
512       method, and export an entire database into an in-memory structure using
513       the "export()" method.  Both are examined here.
514
515   IMPORTING
516       Say you have an existing hash with nested hashes/arrays inside it.
517       Instead of walking the structure and adding keys/elements to the
518       database as you go, simply pass a reference to the "import()" method.
519       This recursively adds everything to an existing DBM::Deep object for
520       you.  Here is an example:
521
522               my $struct = {
523                       key1 => "value1",
524                       key2 => "value2",
525                       array1 => [ "elem0", "elem1", "elem2" ],
526                       hash1 => {
527                               subkey1 => "subvalue1",
528                               subkey2 => "subvalue2"
529                       }
530               };
531
532               my $db = DBM::Deep->new( "foo.db" );
533               $db->import( $struct );
534
535               print $db->{key1} . "\n"; # prints "value1"
536
537       This recursively imports the entire $struct object into $db, including
538       all nested hashes and arrays.  If the DBM::Deep object contains
539       exsiting data, keys are merged with the existing ones, replacing if
540       they already exist.  The "import()" method can be called on any
541       database level (not just the base level), and works with both hash and
542       array DB types.
543
544       Note: Make sure your existing structure has no circular references in
545       it.  These will cause an infinite loop when importing.
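
       One way to guard against the infinite loop mentioned above is to check
       a structure for circular references before passing it to "import()".
       The sketch below uses only core Perl; has_cycle() is a hypothetical
       helper, not part of DBM::Deep, and it looks at plain hash and array
       references only.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr reftype);

# Hypothetical helper: returns 1 if $node contains a reference back to one
# of its own ancestors, 0 otherwise.  A reference that merely appears twice
# in different branches is not treated as a cycle.
sub has_cycle {
    my ( $node, $on_path ) = @_;
    $on_path ||= {};

    my $type = reftype($node) || '';
    return 0 unless $type eq 'HASH' || $type eq 'ARRAY';

    my $addr = refaddr($node);
    return 1 if $on_path->{$addr};    # already on the current path

    $on_path->{$addr} = 1;
    my @children = $type eq 'HASH' ? values %$node : @$node;
    for my $child (@children) {
        return 1 if has_cycle( $child, $on_path );
    }
    delete $on_path->{$addr};         # leaving this branch
    return 0;
}

my $tree = { key1 => "value1", array1 => [ "elem0", { subkey1 => "x" } ] };

my $loop = { name => "parent" };
$loop->{self} = $loop;                # points back at itself

print "tree: ", ( has_cycle($tree) ? "cycle" : "no cycle" ), "\n";
print "loop: ", ( has_cycle($loop) ? "cycle" : "no cycle" ), "\n";
```

       A caller could run such a check first and refuse to import when it
       returns true.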
546
547   EXPORTING
548       Calling the "export()" method on an existing DBM::Deep object will
549       return a reference to a new in-memory copy of the database.  The export
550       is done recursively, so all nested hashes/arrays are all exported to
551       standard Perl objects.  Here is an example:
552
553               my $db = DBM::Deep->new( "foo.db" );
554
555               $db->{key1} = "value1";
556               $db->{key2} = "value2";
557               $db->{hash1} = {};
558               $db->{hash1}->{subkey1} = "subvalue1";
559               $db->{hash1}->{subkey2} = "subvalue2";
560
561               my $struct = $db->export();
562
563               print $struct->{key1} . "\n"; # prints "value1"
564
565       This makes a complete copy of the database in memory, and returns a
566       reference to it.  The "export()" method can be called on any database
567       level (not just the base level), and works with both hash and array DB
568       types.  Be careful of large databases -- you can store a lot more data
569       in a DBM::Deep object than an in-memory Perl structure.
570
571       Note: Make sure your database has no circular references in it.  These
572       will cause an infinite loop when exporting.
573

FILTERS

575       DBM::Deep has a number of hooks where you can specify your own Perl
576       function to perform filtering on incoming or outgoing data.  This is a
577       perfect way to extend the engine, and implement things like real-time
578       compression or encryption.  Filtering applies to the base DB level, and
579       all child hashes / arrays.  Filter hooks can be specified when your
580       DBM::Deep object is first constructed, or by calling the "set_filter()"
581       method at any time.  There are four available filter hooks, described
582       below:
583
584       ·   filter_store_key
585
586           This filter is called whenever a hash key is stored.  It is passed
587           the incoming key, and expected to return a transformed key.
588
589       ·   filter_store_value
590
591           This filter is called whenever a hash key or array element is
592           stored.  It is passed the incoming value, and expected to return a
593           transformed value.
594
595       ·   filter_fetch_key
596
597           This filter is called whenever a hash key is fetched (i.e. via
598           "first_key()" or "next_key()").  It is passed the transformed key,
599           and expected to return the plain key.
600
601       ·   filter_fetch_value
602
603           This filter is called whenever a hash key or array element is
604           fetched.  It is passed the transformed value, and expected to
605           return the plain value.
606
       Here are the two ways to set up a filter hook:
608
609               my $db = DBM::Deep->new(
610                       file => "foo.db",
611                       filter_store_value => \&my_filter_store,
612                       filter_fetch_value => \&my_filter_fetch
613               );
614
615               # or...
616
617               $db->set_filter( "filter_store_value", \&my_filter_store );
618               $db->set_filter( "filter_fetch_value", \&my_filter_fetch );
619
620       Your filter function will be called only when dealing with SCALAR keys
621       or values.  When nested hashes and arrays are being stored/fetched,
622       filtering is bypassed.  Filters are called as static functions, passed
623       a single SCALAR argument, and expected to return a single SCALAR value.
624       If you want to remove a filter, set the function reference to "undef":
625
626               $db->set_filter( "filter_store_value", undef );
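
       A store filter and its fetch filter have to undo each other exactly,
       otherwise values come back changed.  The sketch below checks that
       property for a simple pair without opening a database.  The choice of
       Base64 (from the core MIME::Base64 module) is only an illustrative
       assumption; the function names follow the examples above.

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64 decode_base64);

# Store filter: takes one scalar, returns the transformed scalar.
sub my_filter_store {
    return encode_base64( $_[0], "" );    # "" suppresses the trailing newline
}

# Fetch filter: must return exactly what was passed to the store filter.
sub my_filter_fetch {
    return decode_base64( $_[0] );
}

# Round-trip check on a few sample values.
for my $plain ( "value1", "two words", "" ) {
    my $stored = my_filter_store($plain);
    my $back   = my_filter_fetch($stored);
    die "filter pair does not round-trip '$plain'" unless $back eq $plain;
}
print "filters round-trip\n";

# The pair would then be attached as described above:
#   $db->set_filter( "filter_store_value", \&my_filter_store );
#   $db->set_filter( "filter_fetch_value", \&my_filter_fetch );
```
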
627
628   REAL-TIME ENCRYPTION EXAMPLE
629       Here is a working example that uses the Crypt::Blowfish module to do
630       real-time encryption / decryption of keys & values with DBM::Deep
631       Filters.  Please visit
632       <http://search.cpan.org/search?module=Crypt::Blowfish> for more on
633       Crypt::Blowfish.  You'll also need the Crypt::CBC module.
634
635               use DBM::Deep;
636               use Crypt::Blowfish;
637               use Crypt::CBC;
638
639               my $cipher = Crypt::CBC->new({
640                       'key'             => 'my secret key',
641                       'cipher'          => 'Blowfish',
642                       'iv'              => '$KJh#(}q',
643                       'regenerate_key'  => 0,
644                       'padding'         => 'space',
645                       'prepend_iv'      => 0
646               });
647
648               my $db = DBM::Deep->new(
649                       file => "foo-encrypt.db",
650                       filter_store_key => \&my_encrypt,
651                       filter_store_value => \&my_encrypt,
652                       filter_fetch_key => \&my_decrypt,
653                       filter_fetch_value => \&my_decrypt,
654               );
655
656               $db->{key1} = "value1";
657               $db->{key2} = "value2";
658               print "key1: " . $db->{key1} . "\n";
659               print "key2: " . $db->{key2} . "\n";
660
661               undef $db;
662               exit;
663
664               sub my_encrypt {
665                       return $cipher->encrypt( $_[0] );
666               }
667               sub my_decrypt {
668                       return $cipher->decrypt( $_[0] );
669               }
670
671   REAL-TIME COMPRESSION EXAMPLE
672       Here is a working example that uses the Compress::Zlib module to do
673       real-time compression / decompression of keys & values with DBM::Deep
674       Filters.  Please visit
675       <http://search.cpan.org/search?module=Compress::Zlib> for more on
676       Compress::Zlib.
677
678               use DBM::Deep;
679               use Compress::Zlib;
680
681               my $db = DBM::Deep->new(
682                       file => "foo-compress.db",
683                       filter_store_key => \&my_compress,
684                       filter_store_value => \&my_compress,
685                       filter_fetch_key => \&my_decompress,
686                       filter_fetch_value => \&my_decompress,
687               );
688
689               $db->{key1} = "value1";
690               $db->{key2} = "value2";
691               print "key1: " . $db->{key1} . "\n";
692               print "key2: " . $db->{key2} . "\n";
693
694               undef $db;
695               exit;
696
697               sub my_compress {
698                       return Compress::Zlib::memGzip( $_[0] ) ;
699               }
700               sub my_decompress {
701                       return Compress::Zlib::memGunzip( $_[0] ) ;
702               }
703
704       Note: Filtering of keys only applies to hashes.  Array "keys" are
705       actually numerical index numbers, and are not filtered.
706

ERROR HANDLING

708       Most DBM::Deep methods return a true value for success, and call die()
709       on failure.  You can wrap calls in an eval block to catch the die.
710       Also, the actual error message is stored in an internal scalar, which
711       can be fetched by calling the "error()" method.
712
               my $db = DBM::Deep->new( "foo.db" ); # create hash
               eval { $db->push("foo"); }; # ILLEGAL -- push is array-only call

               print $@;           # prints error message
               print $db->error(); # prints error message
718
719       You can then call "clear_error()" to clear the current error state.
720
721               $db->clear_error();
722
723       If you set the "debug" option to true when creating your DBM::Deep
724       object, all errors are considered NON-FATAL, and dumped to STDERR.
725       This should only be used for debugging purposes and not production
726       work. DBM::Deep expects errors to be thrown, not propagated back up the
727       stack.
728
       NOTE: error() and clear_error() are considered deprecated and will be
       removed in 1.00.  Please don't use them.  Instead, wrap all your
       function calls in eval blocks.
732

LARGEFILE SUPPORT

734       If you have a 64-bit system, and your Perl is compiled with both
735       LARGEFILE and 64-bit support, you may be able to create databases
736       larger than 2 GB.  DBM::Deep by default uses 32-bit file offset tags,
737       but these can be changed by calling the static "set_pack()" method
738       before you do anything else.
739
740               DBM::Deep::set_pack(8, 'Q');
741
       This tells DBM::Deep to pack all file offsets with 8-byte (64-bit) quad
       words instead of 32-bit longs.  After setting these values your DB
       files have a theoretical maximum size of 16 EB (exabytes).
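
       The two pack templates named in this section can be inspected directly
       with Perl's built-in pack().  The sketch below is an illustration in
       core Perl only and does not call DBM::Deep; it shows the byte width of
       each template and checks whether the running perl supports 'Q' at all.

```perl
use strict;
use warnings;

# 'N' is the 32-bit big-endian template that set_pack(4, 'N') refers to.
my $long_size        = length pack( 'N', 0 );
my $max_32bit_offset = 2**32 - 1;
print "N: $long_size bytes, largest offset $max_32bit_offset\n";

# 'Q' is only available on perls built with 64-bit integer support;
# pack() dies otherwise, so the call is wrapped in eval.
my $quad = eval { pack( 'Q', 0 ) };
if ( defined $quad ) {
    print "Q: ", length($quad), " bytes\n";
}
else {
    print "This perl cannot pack 'Q'; stay with set_pack(4, 'N')\n";
}
```

       Running a check like this before calling "set_pack(8, 'Q')" avoids
       discovering the missing 64-bit support only when the first write fails.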
745
746       Note: Changing these values will NOT work for existing database files.
747       Only change this for new files, and make sure it stays set consistently
748       throughout the file's life.  If you do set these values, you can no
749       longer access 32-bit DB files.  You can, however, call "set_pack(4,
750       'N')" to change back to 32-bit mode.
751
752       Note: I have not personally tested files > 2 GB -- all my systems have
753       only a 32-bit Perl.  However, I have received user reports that this
754       does indeed work!
755

LOW-LEVEL ACCESS

757       If you require low-level access to the underlying filehandle that
758       DBM::Deep uses, you can call the "_fh()" method, which returns the
759       handle:
760
761               my $fh = $db->_fh();
762
       This method can be called on the root level of the database, or any
       child hashes or arrays.  All levels share a root structure, which
       contains things like the filehandle, a reference counter, and all the
       options specified when you created the object.  You can get access to
       this root structure by calling the "_root()" method.
768
769               my $root = $db->_root();
770
771       This is useful for changing options after the object has already been
772       created, such as enabling/disabling locking, or debug modes.  You can
773       also store your own temporary user data in this structure (be wary of
774       name collision), which is then accessible from any child hash or array.
775

CUSTOM DIGEST ALGORITHM

777       DBM::Deep by default uses the Message Digest 5 (MD5) algorithm for
778       hashing keys.  However you can override this, and use another algorithm
779       (such as SHA-256) or even write your own.  But please note that
780       DBM::Deep currently expects zero collisions, so your algorithm has to
781       be perfect, so to speak.  Collision detection may be introduced in a
782       later version.
783
784       You can specify a custom digest algorithm by calling the static
785       "set_digest()" function, passing a reference to a subroutine, and the
786       length of the algorithm's hashes (in bytes).  This is a global static
787       function, which affects ALL DBM::Deep objects.  Here is a working
788       example that uses a 256-bit hash from the Digest::SHA256 module.
789       Please see <http://search.cpan.org/search?module=Digest::SHA256> for
790       more.
791
792               use DBM::Deep;
793               use Digest::SHA256;
794
795               my $context = Digest::SHA256::new(256);
796
797               DBM::Deep::set_digest( \&my_digest, 32 );
798
799               my $db = DBM::Deep->new( "foo-sha.db" );
800
801               $db->{key1} = "value1";
802               $db->{key2} = "value2";
803               print "key1: " . $db->{key1} . "\n";
804               print "key2: " . $db->{key2} . "\n";
805
806               undef $db;
807               exit;
808
809               sub my_digest {
810                       return substr( $context->hash($_[0]), 0, 32 );
811               }
812
813       Note: Your returned digest strings must be EXACTLY the number of bytes
814       you specify in the "set_digest()" function (in this case 32).
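
       As an alternative sketch of a digest callback, the block below
       assumes the Digest::SHA module is available (it is not mentioned
       elsewhere in this document).  Its sha256() function returns a raw
       32-byte digest, so no substr() is needed to meet the length rule.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256);

# sha256() returns the raw 32-byte (256-bit) binary digest of its input.
sub my_digest {
    return sha256( $_[0] );
}

my $digest_length = length( my_digest('key1') );
print "digest length: $digest_length bytes\n";

# With DBM::Deep loaded, the callback would be registered as shown in the
# example above:
#   DBM::Deep::set_digest( \&my_digest, $digest_length );
```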
815

CIRCULAR REFERENCES

       DBM::Deep has experimental support for circular references, meaning a
       nested hash key or array element can point to a parent object.  This
       relationship is stored in the DB file, and is preserved between
       sessions.  Here is an example:
821
822               my $db = DBM::Deep->new( "foo.db" );
823
824               $db->{foo} = "bar";
825               $db->{circle} = $db; # ref to self
826
               print $db->{foo} . "\n"; # prints "bar"
               print $db->{circle}->{foo} . "\n"; # prints "bar" again
829
830       One catch is, passing the object to a function that recursively walks
831       the object tree (such as Data::Dumper or even the built-in "optimize()"
832       or "export()" methods) will result in an infinite loop.  The other
833       catch is, if you fetch the key of a circular reference (i.e. using the
834       "first_key()" or "next_key()" methods), you will get the target
835       object's key, not the ref's key.  This gets even more interesting with
836       the above example, where the circle key points to the base DB object,
837       which technically doesn't have a key.  So I made DBM::Deep return
838       "[base]" as the key name in that special case.
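
       One way to walk a structure that may contain circular references is
       to cap the recursion depth.  The sketch below is an assumption-laden
       illustration: walk_limited() is a made-up name, and it is exercised
       on plain Perl data rather than on a DBM::Deep handle.

```perl
use strict;
use warnings;
use Scalar::Util qw(reftype);

# Visit keys and indexes at depths 0 through $max_depth, then stop, so a
# circular reference cannot cause endless recursion.
sub walk_limited {
    my ( $node, $max_depth, $visit, $depth ) = @_;
    $depth ||= 0;
    return if $depth > $max_depth;

    my $type = reftype($node) || '';
    if ( $type eq 'HASH' ) {
        for my $key ( sort keys %$node ) {
            $visit->( $depth, $key );
            walk_limited( $node->{$key}, $max_depth, $visit, $depth + 1 );
        }
    }
    elsif ( $type eq 'ARRAY' ) {
        for my $i ( 0 .. $#$node ) {
            $visit->( $depth, $i );
            walk_limited( $node->[$i], $max_depth, $visit, $depth + 1 );
        }
    }
    return;
}

# Plain Perl data standing in for the database in the example above.
my %data = ( foo => 'bar' );
$data{circle} = \%data;    # ref to self

walk_limited( \%data, 2, sub { print "  " x $_[0], $_[1], "\n" } );
```

       A depth cap is used instead of remembering visited references because
       it does not depend on how each nested object is represented.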
839

CAVEATS / ISSUES / BUGS

       This section describes all the known issues with DBM::Deep.  If you
       have found something that is not listed here, please send e-mail to
       jhuckaby@cpan.org.
844
845   UNUSED SPACE RECOVERY
846       One major caveat with DBM::Deep is that space occupied by existing keys
847       and values is not recovered when they are deleted.  Meaning if you keep
848       deleting and adding new keys, your file will continuously grow.  I am
849       working on this, but in the meantime you can call the built-in
850       "optimize()" method from time to time (perhaps in a crontab or
851       something) to recover all your unused space.
852
853               $db->optimize(); # returns true on success
854
855       This rebuilds the ENTIRE database into a new file, then moves it on top
856       of the original.  The new file will have no unused space, thus it will
857       take up as little disk space as possible.  Please note that this
858       operation can take a long time for large files, and you need enough
859       disk space to temporarily hold 2 copies of your DB file.  The temporary
860       file is created in the same directory as the original, named with a
861       ".tmp" extension, and is deleted when the operation completes.  Oh, and
862       if locking is enabled, the DB is automatically locked for the entire
863       duration of the copy.
864
865       WARNING: Only call optimize() on the top-level node of the database,
866       and make sure there are no child references lying around.  DBM::Deep
867       keeps a reference counter, and if it is greater than 1, optimize() will
868       abort and return undef.
869
870   FILE CORRUPTION
871       The current level of error handling in DBM::Deep is minimal.  Files are
872       checked for a 32-bit signature when opened, but other corruption in
873       files can cause segmentation faults.  DBM::Deep may try to seek() past
874       the end of a file, or get stuck in an infinite loop depending on the
875       level of corruption.  File write operations are not checked for failure
876       (for speed), so if you happen to run out of disk space, DBM::Deep will
877       probably fail in a bad way.  These things will be addressed in a later
878       version of DBM::Deep.
879
880   DB OVER NFS
881       Beware of using DB files over NFS.  DBM::Deep uses flock(), which works
882       well on local filesystems, but will NOT protect you from file
883       corruption over NFS.  I've heard about setting up your NFS server with
884       a locking daemon, then using lockf() to lock your files, but your
885       mileage may vary there as well.  From what I understand, there is no
886       real way to do it.  However, if you need access to the underlying
887       filehandle in DBM::Deep for using some other kind of locking scheme
888       like lockf(), see the "LOW-LEVEL ACCESS" section above.
889
890   COPYING OBJECTS
891       Beware of copying tied objects in Perl.  Very strange things can
892       happen.  Instead, use DBM::Deep's "clone()" method which safely copies
893       the object and returns a new, blessed, tied hash or array to the same
894       level in the DB.
895
896               my $copy = $db->clone();
897
898       Note: Since clone() here is cloning the object, not the database
899       location, any modifications to either $db or $copy will be visible in
900       both.
901
902   LARGE ARRAYS
903       Beware of using "shift()", "unshift()" or "splice()" with large arrays.
904       These functions cause every element in the array to move, which can be
905       murder on DBM::Deep, as every element has to be fetched from disk, then
906       stored again in a different location.  This will be addressed in the
907       forthcoming version 1.00.
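
       One possible workaround for queue-like data, sketched here on a plain
       hash reference, is to keep head and tail counters instead of calling
       shift(), so that no existing element has to move.  The function names
       are hypothetical, and whether this helps a given DBM::Deep workload
       has not been measured here.

```perl
use strict;
use warnings;

# A FIFO queue kept in a hash with two counters, so removing the oldest
# item never renumbers the remaining ones.  $q may be any hash reference.
sub enqueue {
    my ( $q, $item ) = @_;
    $q->{tail} = 0 unless defined $q->{tail};
    $q->{items}{ $q->{tail}++ } = $item;
    return;
}

sub dequeue {
    my ($q) = @_;
    $q->{head} = 0 unless defined $q->{head};
    return if $q->{head} >= ( $q->{tail} || 0 );    # queue is empty
    return delete $q->{items}{ $q->{head}++ };
}

my $queue = {};
enqueue( $queue, $_ ) for qw(a b c);
print dequeue($queue), "\n";    # oldest item first
```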
908
909   WRITEONLY FILES
910       If you pass in a filehandle to new(), you may have opened it in either
911       a readonly or writeonly mode. STORE will verify that the filehandle is
912       writable. However, there doesn't seem to be a good way to determine if
913       a filehandle is readable. And, if the filehandle isn't readable, it's
914       not clear what will happen. So, don't do that.
915

PERFORMANCE

917       This section discusses DBM::Deep's speed and memory usage.
918
919   SPEED
920       Obviously, DBM::Deep isn't going to be as fast as some C-based DBMs,
921       such as the almighty BerkeleyDB.  But it makes up for it in features
922       like true multi-level hash/array support, and cross-platform FTPable
923       files.  Even so, DBM::Deep is still pretty fast, and the speed stays
924       fairly consistent, even with huge databases.  Here is some test data:
925
926               Adding 1,000,000 keys to new DB file...
927
928               At 100 keys, avg. speed is 2,703 keys/sec
929               At 200 keys, avg. speed is 2,642 keys/sec
930               At 300 keys, avg. speed is 2,598 keys/sec
931               At 400 keys, avg. speed is 2,578 keys/sec
932               At 500 keys, avg. speed is 2,722 keys/sec
933               At 600 keys, avg. speed is 2,628 keys/sec
934               At 700 keys, avg. speed is 2,700 keys/sec
935               At 800 keys, avg. speed is 2,607 keys/sec
936               At 900 keys, avg. speed is 2,190 keys/sec
937               At 1,000 keys, avg. speed is 2,570 keys/sec
938               At 2,000 keys, avg. speed is 2,417 keys/sec
939               At 3,000 keys, avg. speed is 1,982 keys/sec
940               At 4,000 keys, avg. speed is 1,568 keys/sec
941               At 5,000 keys, avg. speed is 1,533 keys/sec
942               At 6,000 keys, avg. speed is 1,787 keys/sec
943               At 7,000 keys, avg. speed is 1,977 keys/sec
944               At 8,000 keys, avg. speed is 2,028 keys/sec
945               At 9,000 keys, avg. speed is 2,077 keys/sec
946               At 10,000 keys, avg. speed is 2,031 keys/sec
947               At 20,000 keys, avg. speed is 1,970 keys/sec
948               At 30,000 keys, avg. speed is 2,050 keys/sec
949               At 40,000 keys, avg. speed is 2,073 keys/sec
950               At 50,000 keys, avg. speed is 1,973 keys/sec
951               At 60,000 keys, avg. speed is 1,914 keys/sec
952               At 70,000 keys, avg. speed is 2,091 keys/sec
953               At 80,000 keys, avg. speed is 2,103 keys/sec
954               At 90,000 keys, avg. speed is 1,886 keys/sec
955               At 100,000 keys, avg. speed is 1,970 keys/sec
956               At 200,000 keys, avg. speed is 2,053 keys/sec
957               At 300,000 keys, avg. speed is 1,697 keys/sec
958               At 400,000 keys, avg. speed is 1,838 keys/sec
959               At 500,000 keys, avg. speed is 1,941 keys/sec
960               At 600,000 keys, avg. speed is 1,930 keys/sec
961               At 700,000 keys, avg. speed is 1,735 keys/sec
962               At 800,000 keys, avg. speed is 1,795 keys/sec
963               At 900,000 keys, avg. speed is 1,221 keys/sec
964               At 1,000,000 keys, avg. speed is 1,077 keys/sec
965
       This test was performed on a PowerMac G4 1 GHz running Mac OS X 10.3.2
       and Perl 5.8.1, with an 80 GB Ultra ATA/100 HD spinning at 7200 RPM.
       The hash keys and values were between 6 and 12 chars in length.  The
       DB file ended up at 210 MB.  Run time was 12 min 3 sec.
970
971   MEMORY USAGE
972       One of the great things about DBM::Deep is that it uses very little
973       memory.  Even with huge databases (1,000,000+ keys) you will not see
974       much increased memory on your process.  DBM::Deep relies solely on the
975       filesystem for storing and fetching data.  Here is output from
976       /usr/bin/top before even opening a database handle:
977
978                 PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
979               22831 root      11   0  2716 2716  1296 R     0.0  0.2   0:07 perl
980
981       Basically the process is taking 2,716K of memory.  And here is the same
982       process after storing and fetching 1,000,000 keys:
983
984                 PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
985               22831 root      14   0  2772 2772  1328 R     0.0  0.2  13:32 perl
986
       Notice the memory usage increased by only 56K.  Test was performed on a
       700 MHz x86 box running Red Hat Linux 7.2 and Perl 5.6.1.
989

DB FILE FORMAT

       In case you are interested in the underlying DB file format, it is
       documented in this section.  You don't need to know this to use the
       module; it's just included for reference.
994
995   SIGNATURE
996       DBM::Deep files always start with a 32-bit signature to identify the
997       file type.  This is at offset 0.  The signature is "DPDB" in network
998       byte order.  This is checked for when the file is opened and an error
999       will be thrown if it's not found.
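
       The signature check described above can be sketched as follows.  This
       is an illustration based only on the format description in this
       section, not the module's own code, and has_dbm_deep_signature() is a
       made-up name.

```perl
use strict;
use warnings;

# Return 1 if the file at $path begins with the 4-byte signature "DPDB",
# and 0 otherwise (including when the file cannot be opened).
sub has_dbm_deep_signature {
    my ($path) = @_;
    open my $fh, '<', $path or return 0;
    binmode $fh;
    my $sig = '';
    my $got = read( $fh, $sig, 4 );
    close $fh;
    return ( defined $got && $got == 4 && $sig eq 'DPDB' ) ? 1 : 0;
}

# Optional command-line use: pass a file name as the first argument.
my $file = shift @ARGV;
if ( defined $file ) {
    print has_dbm_deep_signature($file)
        ? "signature found\n"
        : "signature not found\n";
}
```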
1000
1001   TAG
1002       The DBM::Deep file is in a tagged format, meaning each section of the
1003       file has a standard header containing the type of data, the length of
1004       data, and then the data itself.  The type is a single character (1
1005       byte), the length is a 32-bit unsigned long in network byte order, and
1006       the data is, well, the data.  Here is how it unfolds:
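
       Before the individual record types, the tag header itself can be
       sketched with pack() and unpack().  This assumes 32-bit mode and the
       layout just described; pack_tag() and unpack_tag() are illustrative
       names, not functions provided by DBM::Deep.

```perl
use strict;
use warnings;

# Build one tagged section: 1-byte type, 32-bit big-endian length, data.
sub pack_tag {
    my ( $type, $data ) = @_;
    return pack( 'a N', $type, length($data) ) . $data;
}

# Split a tagged section back into its type and data.
sub unpack_tag {
    my ($buffer) = @_;
    my ( $type, $length ) = unpack( 'a N', $buffer );
    return ( $type, substr( $buffer, 5, $length ) );    # header is 5 bytes
}

my $tag = pack_tag( 'D', 'hello' );
printf "tag is %d bytes long\n", length($tag);    # 1 + 4 + 5 = 10
```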
1007
1008   MASTER INDEX
1009       Immediately after the 32-bit file signature is the Master Index record.
1010       This is a standard tag header followed by 1024 bytes (in 32-bit mode)
1011       or 2048 bytes (in 64-bit mode) of data.  The type is H for hash or A
1012       for array, depending on how the DBM::Deep object was constructed.
1013
       The index works by looking at an MD5 hash of the hash key (or array
       index number).  The first 8-bit char of the MD5 signature is the offset
       into the index, multiplied by 4 in 32-bit mode, or 8 in 64-bit mode.
       The value of the index element is a file offset of the next tag for the
       key/element in question, which is usually a Bucket List tag (see
       below).
1020
1021       The next tag could be another index, depending on how many
1022       keys/elements exist.  See RE-INDEXING below for details.
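
       The offset calculation can be sketched with the core Digest::MD5
       module.  The function name is hypothetical, and the result is assumed
       to be relative to the start of the index data.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5);

# Byte offset into an index's data area for $key.  $depth selects which
# MD5 char to use (0 for the Master Index); $offset_size is 4 or 8.
sub index_slot_offset {
    my ( $key, $offset_size, $depth ) = @_;
    $depth ||= 0;
    my $byte = ord( substr( md5($key), $depth, 1 ) );
    return $byte * $offset_size;
}

# md5('abc') begins with the byte 0x90 (144 decimal).
printf "32-bit mode: %d\n", index_slot_offset( 'abc', 4 );    # 144 * 4 = 576
printf "64-bit mode: %d\n", index_slot_offset( 'abc', 8 );    # 144 * 8 = 1152
```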
1023
1024   BUCKET LIST
1025       A Bucket List is a collection of 16 MD5 hashes for keys/elements, plus
1026       file offsets to where the actual data is stored.  It starts with a
1027       standard tag header, with type B, and a data size of 320 bytes in
1028       32-bit mode, or 384 bytes in 64-bit mode.  Each MD5 hash is stored in
1029       full (16 bytes), plus the 32-bit or 64-bit file offset for the Bucket
1030       containing the actual data.  When the list fills up, a Re-Index
1031       operation is performed (See RE-INDEXING below).
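
       The record sizes quoted above follow from simple arithmetic.  In the
       sketch below the figure of 256 index slots is inferred from the
       stated sizes (1024 / 4 and 2048 / 8) rather than taken from the
       module's source.

```perl
use strict;
use warnings;

my $INDEX_SLOTS = 256;    # inferred: one slot per value of an 8-bit char
my $LIST_SIZE   = 16;     # entries per Bucket List
my $MD5_BYTES   = 16;     # a full MD5 hash

sub index_size {
    my ($offset_size) = @_;
    return $INDEX_SLOTS * $offset_size;
}

sub bucket_list_size {
    my ($offset_size) = @_;
    return $LIST_SIZE * ( $MD5_BYTES + $offset_size );
}

for my $offset_size ( 4, 8 ) {
    printf "%d-bit mode: index %d bytes, bucket list %d bytes\n",
        $offset_size * 8, index_size($offset_size),
        bucket_list_size($offset_size);
}
```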
1032
1033   BUCKET
       A Bucket is a tag containing a key/value pair (in hash mode), or an
       index/value pair (in array mode).  It starts with a standard tag header
       with type D for scalar data (string, binary, etc.), or it could be a
       nested hash (type H) or array (type A).  The value comes just after the
       tag header.  The size reported in the tag header is only for the value,
       but then, just after the value is another size (32-bit unsigned long)
       and then the plain key itself.  Since the value is likely to be fetched
       more often than the plain key, I figured it would be slightly faster to
       store the value first.
1043
1044       If the type is H (hash) or A (array), the value is another Master Index
1045       record for the nested structure, where the process begins all over
1046       again.
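
       A scalar Bucket record, as described above, can be sketched like
       this.  The layout is an assumption drawn from this section (32-bit
       mode, type D only), and the function names are illustrative.

```perl
use strict;
use warnings;

# Layout assumed from the description above (32-bit mode):
#   type (1 byte) | value length (N) | value | key length (N) | key
sub pack_bucket {
    my ( $key, $value ) = @_;
    return pack( 'a N', 'D', length($value) ) . $value
        . pack( 'N', length($key) ) . $key;
}

sub unpack_bucket {
    my ($record) = @_;
    my ( $type, $value_len ) = unpack( 'a N', $record );
    my $value   = substr( $record, 5, $value_len );
    my $key_len = unpack( 'N', substr( $record, 5 + $value_len, 4 ) );
    my $key     = substr( $record, 9 + $value_len, $key_len );
    return ( $type, $key, $value );
}

my ( $type, $key, $value ) = unpack_bucket( pack_bucket( 'perl', 'rules' ) );
print "$type: $key => $value\n";
```

       Because the value length sits in the tag header, a reader that only
       wants the value can stop after the first two fields.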
1047
1048   RE-INDEXING
1049       After a Bucket List grows to 16 records, its allocated space in the
1050       file is exhausted.  Then, when another key/element comes in, the list
1051       is converted to a new index record.  However, this index will look at
1052       the next char in the MD5 hash, and arrange new Bucket List pointers
1053       accordingly.  This process is called Re-Indexing.  Basically, a new
1054       index tag is created at the file EOF, and all 17 (16 + new one)
1055       keys/elements are removed from the old Bucket List and inserted into
1056       the new index.  Several new Bucket Lists are created in the process, as
1057       a new MD5 char from the key is being examined (it is unlikely that the
1058       keys will all share the same next char of their MD5s).
1059
       Because of the way the MD5 algorithm works, it is impossible to tell
       exactly when the Bucket Lists will turn into indexes, but the first
       round tends to happen right around 4,000 keys.  You will see a slight
       decrease in performance here, but it picks back up pretty quickly (see
       SPEED above).  Then it takes a lot more keys to exhaust the next level
       of Bucket Lists: right around 900,000 keys.  This process can continue
       nearly indefinitely -- right up until the point the MD5 signatures
       start colliding with each other, and this is EXTREMELY rare -- like
       winning the lottery 5 times in a row AND getting struck by lightning
       while you are walking to cash in your tickets.  Theoretically, since
       MD5 hashes are 128-bit values, you could have up to about
       340,282,366,921,000,000,000,000,000,000,000,000,000 keys/elements (I
       believe this is 340 undecillion, but don't quote me).
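
       One possible reading of where the figures above come from, assuming
       256 index slots per level, 16 entries per Bucket List and perfectly
       even key distribution (real lists fill unevenly, so these are upper
       bounds rather than exact thresholds):

```perl
use strict;
use warnings;
use Math::BigInt;

# Keys that fit before every Bucket List under $levels index levels is
# full, given 256 slots per index and 16 entries per Bucket List.
sub level_capacity {
    my ($levels) = @_;
    return ( 256**$levels ) * 16;
}

print "one index level:  ", level_capacity(1), " keys\n";    # 4096
print "two index levels: ", level_capacity(2), " keys\n";    # 1048576

# Number of distinct 128-bit MD5 values, computed exactly.
my $md5_space = Math::BigInt->new(2)->bpow(128);
print "MD5 values:       $md5_space\n";
```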
1073
1074   STORING
       When a new key/element is stored, the key (or index number) is first
       run through Digest::MD5 to get a 128-bit signature (example, in hex:
       b05783b0773d894396d475ced9d2f4f6).  Then, the Master Index record is
       checked for the first char of the signature (in this case b0).  If it
       does not exist, a new Bucket List is created for our key (and the next
       15 future keys that happen to also have b0 as their first MD5 char).
       The entire MD5 is written to the Bucket List along with the offset of
       the new Bucket record (EOF at this point, unless we are replacing an
       existing Bucket), where the actual data will be stored.
1084
1085   FETCHING
1086       Fetching an existing key/element involves getting a Digest::MD5 of the
1087       key (or index number), then walking along the indexes.  If there are
1088       enough keys/elements in this DB level, there might be nested indexes,
1089       each linked to a particular char of the MD5.  Finally, a Bucket List is
1090       pointed to, which contains up to 16 full MD5 hashes.  Each is checked
1091       for equality to the key in question.  If we found a match, the Bucket
1092       tag is loaded, where the value and plain key are stored.
1093
1094       Fetching the plain key occurs when calling the first_key() and
1095       next_key() methods.  In this process the indexes are walked
1096       systematically, and each key fetched in increasing MD5 order (which is
1097       why it appears random).   Once the Bucket is found, the value is
1098       skipped and the plain key returned instead.  Note: Do not count on keys
1099       being fetched as if the MD5 hashes were alphabetically sorted.  This
1100       only happens on an index-level -- as soon as the Bucket Lists are hit,
1101       the keys will come out in the order they went in -- so it's pretty much
1102       undefined how the keys will come out -- just like Perl's built-in
1103       hashes.
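
       The storing, re-indexing and fetching steps can be modelled in
       memory.  The block below is a toy sketch under the assumptions of
       this section (256-way indexes, 16-entry Bucket Lists); it performs no
       file I/O, is not the module's actual code, and all names are made up.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5);

my $LIST_MAX = 16;    # entries per Bucket List

sub new_index { return { type => 'index', slots => [] } }

sub store {
    my ( $index, $key, $value, $depth ) = @_;
    $depth ||= 0;
    my $md5  = md5($key);
    my $slot = ord( substr( $md5, $depth, 1 ) );
    my $node = $index->{slots}[$slot] ||= { type => 'list', entries => [] };

    # A nested index: descend using the next MD5 char.
    return store( $node, $key, $value, $depth + 1 )
        if $node->{type} eq 'index';

    for my $entry ( @{ $node->{entries} } ) {
        next unless $entry->{md5} eq $md5;
        $entry->{value} = $value;    # replace an existing key
        return;
    }
    if ( @{ $node->{entries} } < $LIST_MAX ) {
        push @{ $node->{entries} },
            { md5 => $md5, key => $key, value => $value };
        return;
    }

    # List is full: re-index its entries plus the new key one level down.
    my $nested = $index->{slots}[$slot] = new_index();
    store( $nested, $_->{key}, $_->{value}, $depth + 1 )
        for @{ $node->{entries} };
    store( $nested, $key, $value, $depth + 1 );
    return;
}

sub fetch {
    my ( $index, $key ) = @_;
    my $md5   = md5($key);
    my $node  = $index;
    my $depth = 0;

    # Walk nested indexes, one MD5 char per level, down to a Bucket List.
    while ( $node && $node->{type} eq 'index' ) {
        $node = $node->{slots}[ ord( substr( $md5, $depth++, 1 ) ) ];
    }
    return unless $node;
    for my $entry ( @{ $node->{entries} } ) {
        return $entry->{value} if $entry->{md5} eq $md5;
    }
    return;
}

my $db = new_index();
store( $db, "key$_", "value$_" ) for 1 .. 10_000;
print fetch( $db, 'key42' ), "\n";    # prints "value42"
```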
1104

CODE COVERAGE

       We use Devel::Cover to test the code coverage of our tests.  Below is
       the Devel::Cover report on this module's test suite.
1108
1109         ---------------------------- ------ ------ ------ ------ ------ ------ ------
1110         File                           stmt   bran   cond    sub    pod   time  total
1111         ---------------------------- ------ ------ ------ ------ ------ ------ ------
1112         blib/lib/DBM/Deep.pm           95.4   84.6   69.1   98.2  100.0   60.3   91.0
1113         blib/lib/DBM/Deep/Array.pm    100.0   91.1  100.0  100.0    n/a   26.4   98.0
1114         blib/lib/DBM/Deep/Hash.pm      95.3   80.0  100.0  100.0    n/a   13.3   92.4
1115         Total                          96.4   85.4   73.1   98.8  100.0  100.0   92.4
1116         ---------------------------- ------ ------ ------ ------ ------ ------ ------
1117

MORE INFORMATION

       Check out the DBM::Deep Google Group at
       <http://groups.google.com/group/DBM-Deep> or send email to
       DBM-Deep@googlegroups.com.
1123

AUTHORS

1125       Joseph Huckaby, jhuckaby@cpan.org
1126
1127       Rob Kinyon, rkinyon@cpan.org
1128
1129       Special thanks to Adam Sah and Rich Gaushell!  You know why :-)
1130

LICENSE

1136       Copyright (c) 2002-2006 Joseph Huckaby.  All Rights Reserved.  This is
1137       free software, you may use it and distribute it under the same terms as
1138       Perl itself.
1139

1149perl v5.12.0                      2010-04-30                      DBM::Deep(3)