DBM::Deep(3pm)         User Contributed Perl Documentation        DBM::Deep(3pm)
2
3
4
NAME
       DBM::Deep - A pure perl multi-level hash/array DBM that supports
7 transactions
8
SYNOPSIS
       use DBM::Deep;
11 my $db = DBM::Deep->new( "foo.db" );
12
13 $db->{key} = 'value';
14 print $db->{key};
15
16 $db->put('key' => 'value');
17 print $db->get('key');
18
19 # true multi-level support
20 $db->{my_complex} = [
21 'hello', { perl => 'rules' },
22 42, 99,
23 ];
24
25 $db->begin_work;
26
27 # Do stuff here
28
29 $db->rollback;
30 $db->commit;
31
32 tie my %db, 'DBM::Deep', 'foo.db';
33 $db{key} = 'value';
34 print $db{key};
35
36 tied(%db)->put('key' => 'value');
37 print tied(%db)->get('key');
38
DESCRIPTION
       A unique flat-file database module, written in pure perl. True multi-
41 level hash/array support (unlike MLDBM, which is faked), hybrid OO /
42 tie() interface, cross-platform FTPable files, ACID transactions, and
43 is quite fast. Can handle millions of keys and unlimited levels
44 without significant slow-down. Written from the ground-up in pure perl
45 -- this is NOT a wrapper around a C-based DBM. Out-of-the-box
46 compatibility with Unix, Mac OS X and Windows.
47
49 NOTE: 2.0000 introduces Unicode support in the File back end. This
50 necessitates a change in the file format. The version 1.0003 format is
51 still supported, though, so we have added a db_version() method. If you
52 are using a database in the old format, you will have to upgrade it to
53 get Unicode support.
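
       As an illustration, here is a minimal sketch (the filename is an
       assumption) of checking what an existing file supports before relying
       on Unicode:

           use DBM::Deep;

           my $db = DBM::Deep->new( "legacy.db" );
           print "Format version: ", $db->db_version(), "\n";
           print "Unicode is supported\n" if $db->supports('unicode');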
54
55 NOTE: 1.0020 introduces different engines which are backed by different
56 types of storage. There is the original storage (called 'File') and a
57 database storage (called 'DBI'). q.v. "PLUGINS" for more information.
58
59 NOTE: 1.0000 has significant file format differences from prior
60 versions. There is a backwards-compatibility layer at
61 "utils/upgrade_db.pl". Files created by 1.0000 or higher are NOT
62 compatible with scripts using prior versions.
63
PLUGINS
       DBM::Deep is a wrapper around different storage engines. These are:
66
67 File
68 This is the traditional storage engine, storing the data to a custom
69 file format. The parameters accepted are:
70
71 • file
72
73 Filename of the DB file to link the handle to. You can pass a full
74 absolute filesystem path, partial path, or a plain filename if the
75 file is in the current working directory. This is a required
76 parameter (though q.v. fh).
77
78 • fh
79
80 If you want, you can pass in the fh instead of the file. This is
81 most useful for doing something like:
82
83 my $db = DBM::Deep->new( { fh => \*DATA } );
84
85 You are responsible for making sure that the fh has been opened
86 appropriately for your needs. If you open it read-only and attempt
87 to write, an exception will be thrown. If you open it write-only or
88 append-only, an exception will be thrown immediately as DBM::Deep
89 needs to read from the fh.
90
91 • file_offset
92
93 This is the offset within the file that the DBM::Deep db starts.
94 Most of the time, you will not need to set this. However, it's
95 there if you want it.
96
97 If you pass in fh and do not set this, it will be set
98 appropriately.
99
100 • locking
101
102 Specifies whether locking is to be enabled. DBM::Deep uses Perl's
103 flock() function to lock the database in exclusive mode for writes,
104 and shared mode for reads. Pass any true value to enable. This
105 affects the base DB handle and any child hashes or arrays that use
106 the same DB file. This is an optional parameter, and defaults to 1
107 (enabled). See "LOCKING" below for more.
108
109 When you open an existing database file, the version of the database
110 format will stay the same. But if you are creating a new file, it will
111 be in the latest format.
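
       As an illustration of "file_offset", here is a hedged sketch of
       opening a database that is embedded partway into a larger container
       file; the filename and offset are assumptions for the example:

           my $db = DBM::Deep->new(
               file        => "container.bin",
               file_offset => 1024,
           );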
112
113 DBI
114 This is a storage engine that stores the data in a relational database.
115 Funnily enough, this engine doesn't work with transactions (yet) as
116 InnoDB doesn't do what DBM::Deep needs it to do.
117
118 The parameters accepted are:
119
120 • dbh
121
122 This is a DBH that's already been opened with "connect" in DBI.
123
124 • dbi
125
126 This is a hashref containing:
127
128 • dsn
129
130 • username
131
132 • password
133
134 • connect_args
135
136 These correspond to the 4 parameters "connect" in DBI takes.
137
138 NOTE: This has only been tested with MySQL and SQLite (with
139 disappointing results). I plan on extending this to work with
140 PostgreSQL in the near future. Oracle, Sybase, and other engines will
141 come later.
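
       For illustration only, here is a hedged sketch of both ways to hand
       DBM::Deep a database connection. The SQLite DSN and connect
       attributes are assumptions, and the sketch assumes the tables
       DBM::Deep expects have already been created in that database:

           use DBI;
           use DBM::Deep;

           # Pass an already-opened handle ...
           my $dbh = DBI->connect( "dbi:SQLite:dbname=foo.sqlite", "", "" );
           my $db  = DBM::Deep->new( dbh => $dbh );

           # ... or let DBM::Deep connect for you.
           my $db2 = DBM::Deep->new(
               dbi => {
                   dsn          => "dbi:SQLite:dbname=foo.sqlite",
                   username     => "",
                   password     => "",
                   connect_args => { RaiseError => 1 },
               },
           );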
142
143 Planned engines
144 There are plans to extend this functionality to (at least) the
145 following:
146
147 • BDB (and other hash engines like memcached)
148
149 • NoSQL engines (such as Tokyo Cabinet)
150
151 • DBIx::Class (and other ORMs)
152
SETUP
       Construction can be done OO-style (which is the recommended way), or
155 using Perl's tie() function. Both are examined here.
156
157 OO Construction
158 The recommended way to construct a DBM::Deep object is to use the new()
159 method, which gets you a blessed and tied hash (or array) reference.
160
161 my $db = DBM::Deep->new( "foo.db" );
162
163 This opens a new database handle, mapped to the file "foo.db". If this
164 file does not exist, it will automatically be created. DB files are
165 opened in "r+" (read/write) mode, and the type of object returned is a
166 hash, unless otherwise specified (see "Options" below).
167
168 You can pass a number of options to the constructor to specify things
169 like locking, autoflush, etc. This is done by passing an inline hash
170 (or hashref):
171
172 my $db = DBM::Deep->new(
173 file => "foo.db",
174 locking => 1,
175 autoflush => 1
176 );
177
178 Notice that the filename is now specified inside the hash with the
179 "file" parameter, as opposed to being the sole argument to the
180 constructor. This is required if any options are specified. See
181 "Options" below for the complete list.
182
183 You can also start with an array instead of a hash. For this, you must
184 specify the "type" parameter:
185
186 my $db = DBM::Deep->new(
187 file => "foo.db",
188 type => DBM::Deep->TYPE_ARRAY
189 );
190
191 Note: Specifying the "type" parameter only takes effect when beginning
192 a new DB file. If you create a DBM::Deep object with an existing file,
193 the "type" will be loaded from the file header, and an error will be
194 thrown if the wrong type is passed in.
195
196 Tie Construction
197 Alternately, you can create a DBM::Deep handle by using Perl's built-in
198 tie() function. The object returned from tie() can be used to call
199 methods, such as lock() and unlock(). (That object can be retrieved
200 from the tied variable at any time using tied() - please see perltie
201 for more info.)
202
203 my %hash;
204 my $db = tie %hash, "DBM::Deep", "foo.db";
205
206 my @array;
207 my $db = tie @array, "DBM::Deep", "bar.db";
208
209 As with the OO constructor, you can replace the DB filename parameter
210 with a hash containing one or more options (see "Options" just below
211 for the complete list).
212
213 tie %hash, "DBM::Deep", {
214 file => "foo.db",
215 locking => 1,
216 autoflush => 1
217 };
218
219 Options
220 There are a number of options that can be passed in when constructing
221 your DBM::Deep objects. These apply to both the OO- and tie- based
222 approaches.
223
224 • type
225
226 This parameter specifies what type of object to create, a hash or
227 array. Use one of these two constants:
228
229 • "DBM::Deep->TYPE_HASH"
230
231 • "DBM::Deep->TYPE_ARRAY"
232
233 This only takes effect when beginning a new file. This is an
234 optional parameter, and defaults to "DBM::Deep->TYPE_HASH".
235
236 • autoflush
237
238 Specifies whether autoflush is to be enabled on the underlying
239 filehandle. This obviously slows down write operations, but is
240 required if you may have multiple processes accessing the same DB
       file (also consider enabling locking). Pass any true value to
242 enable. This is an optional parameter, and defaults to 1 (enabled).
243
244 • filter_*
245
246 See "FILTERS" below.
247
248 The following parameters may be specified in the constructor the first
249 time the datafile is created. However, they will be stored in the
250 header of the file and cannot be overridden by subsequent openings of
251 the file - the values will be set from the values stored in the
252 datafile's header.
253
254 • num_txns
255
256 This is the number of transactions that can be running at one time.
257 The default is one - the HEAD. The minimum is one and the maximum
258 is 255. The more transactions, the larger and quicker the datafile
259 grows.
260
261 Simple access to a database, regardless of how many processes are
262 doing it, already counts as one transaction (the HEAD). So, if you
263 want, say, 5 processes to be able to call begin_work at the same
264 time, "num_txns" must be at least 6.
265
266 See "TRANSACTIONS" below.
267
268 • max_buckets
269
270 This is the number of entries that can be added before a
271 reindexing. The larger this number is made, the larger a file gets,
272 but the better performance you will have. The default and minimum
273 number this can be is 16. The maximum is 256, but more than 64
274 isn't recommended.
275
276 • data_sector_size
277
278 This is the size in bytes of a given data sector. Data sectors will
279 chain, so a value of any size can be stored. However, chaining is
280 expensive in terms of time. Setting this value to something close
281 to the expected common length of your scalars will improve your
282 performance. If it is too small, your file will have a lot of
283 chaining. If it is too large, your file will have a lot of dead
284 space in it.
285
286 The default for this is 64 bytes. The minimum value is 32 and the
287 maximum is 256 bytes.
288
289 Note: There are between 6 and 10 bytes taken up in each data sector
290 for bookkeeping. (It's 4 + the number of bytes in your
291 "pack_size".) This is included within the data_sector_size, thus
292 the effective value is 6-10 bytes less than what you specified.
293
294 Another note: If your strings contain any characters beyond the
295 byte range, they will be encoded as UTF-8 before being stored in
296 the file. This will make all non-ASCII characters take up more than
297 one byte each.
298
299 • pack_size
300
301 This is the size of the file pointer used throughout the file. The
302 valid values are:
303
304 • small
305
306 This uses 2-byte offsets, allowing for a maximum file size of
307 65 KB.
308
309 • medium (default)
310
311 This uses 4-byte offsets, allowing for a maximum file size of 4
312 GB.
313
314 • large
315
316 This uses 8-byte offsets, allowing for a maximum file size of
             16 EB (exabytes). This can only be enabled if your Perl is
318 compiled for 64-bit.
319
320 See "LARGEFILE SUPPORT" for more information.
321
322 • external_refs
323
324 This is a boolean option. When enabled, it allows external
325 references to database entries to hold on to those entries, even
326 when they are deleted.
327
328 To illustrate, if you retrieve a hash (or array) reference from the
329 database,
330
331 $foo_hash = $db->{foo};
332
333 the hash reference is still tied to the database. So if you
334
335 delete $db->{foo};
336
337 $foo_hash will point to a location in the DB that is no longer
338 valid (we call this a stale reference). So if you try to retrieve
339 the data from $foo_hash,
340
341 for(keys %$foo_hash) {
342
343 you will get an error.
344
345 The "external_refs" option causes $foo_hash to 'hang on' to the DB
346 entry, so it will not be deleted from the database if there is
347 still a reference to it in a running program. It will be deleted,
348 instead, when the $foo_hash variable no longer exists, or is
349 overwritten.
350
351 This has the potential to cause database bloat if your program
352 crashes, so it is not enabled by default. (See also the "export"
353 method for an alternative workaround.)
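
       Putting several of the header-stored options above together, here is
       a hedged constructor sketch; the particular values are illustrative
       assumptions, not recommendations:

           my $db = DBM::Deep->new(
               file             => "foo.db",
               locking          => 1,
               autoflush        => 1,
               num_txns         => 6,    # the HEAD plus 5 transactions
               max_buckets      => 32,
               data_sector_size => 128,  # near the typical scalar length
               pack_size        => 'medium',
               external_refs    => 0,
           );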
354
TIE INTERFACE
       With DBM::Deep you can access your databases using Perl's standard
357 hash/array syntax. Because all DBM::Deep objects are tied to hashes or
358 arrays, you can treat them as such (but see "external_refs", above, and
359 "Stale References", below). DBM::Deep will intercept all reads/writes
360 and direct them to the right place -- the DB file. This has nothing to
361 do with the "Tie Construction" section above. This simply tells you how
362 to use DBM::Deep using regular hashes and arrays, rather than calling
363 functions like get() and put() (although those work too). It is
       entirely up to you how you want to access your databases.
365
366 Hashes
367 You can treat any DBM::Deep object like a normal Perl hash reference.
368 Add keys, or even nested hashes (or arrays) using standard Perl syntax:
369
370 my $db = DBM::Deep->new( "foo.db" );
371
372 $db->{mykey} = "myvalue";
373 $db->{myhash} = {};
374 $db->{myhash}->{subkey} = "subvalue";
375
376 print $db->{myhash}->{subkey} . "\n";
377
378 You can even step through hash keys using the normal Perl keys()
379 function:
380
381 foreach my $key (keys %$db) {
382 print "$key: " . $db->{$key} . "\n";
383 }
384
385 Remember that Perl's keys() function extracts every key from the hash
386 and pushes them onto an array, all before the loop even begins. If you
387 have an extremely large hash, this may exhaust Perl's memory. Instead,
388 consider using Perl's each() function, which pulls keys/values one at a
389 time, using very little memory:
390
391 while (my ($key, $value) = each %$db) {
392 print "$key: $value\n";
393 }
394
395 Please note that when using each(), you should always pass a direct
396 hash reference, not a lookup. Meaning, you should never do this:
397
398 # NEVER DO THIS
399 while (my ($key, $value) = each %{$db->{foo}}) { # BAD
400
401 This causes an infinite loop, because for each iteration, Perl is
402 calling FETCH() on the $db handle, resulting in a "new" hash for foo
403 every time, so it effectively keeps returning the first key over and
404 over again. Instead, assign a temporary variable to "$db->{foo}", then
405 pass that to each().
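
       In other words, something along these lines (a minimal sketch):

           my $foo = $db->{foo};   # grab the nested reference once
           while (my ($key, $value) = each %$foo) {
               print "$key: $value\n";
           }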
406
407 Arrays
408 As with hashes, you can treat any DBM::Deep object like a normal Perl
409 array reference. This includes inserting, removing and manipulating
410 elements, and the push(), pop(), shift(), unshift() and splice()
411 functions. The object must have first been created using type
412 "DBM::Deep->TYPE_ARRAY", or simply be a nested array reference inside a
413 hash. Example:
414
415 my $db = DBM::Deep->new(
416 file => "foo-array.db",
417 type => DBM::Deep->TYPE_ARRAY
418 );
419
420 $db->[0] = "foo";
421 push @$db, "bar", "baz";
422 unshift @$db, "bah";
423
424 my $last_elem = pop @$db; # baz
425 my $first_elem = shift @$db; # bah
426 my $second_elem = $db->[1]; # bar
427
428 my $num_elements = scalar @$db;
429
OO INTERFACE
       In addition to the tie() interface, you can also use a standard OO
432 interface to manipulate all aspects of DBM::Deep databases. Each type
433 of object (hash or array) has its own methods, but both types share the
434 following common methods: put(), get(), exists(), delete() and clear().
       store() and fetch() are aliases to put() and get(), respectively.
436
437 • new() / clone()
438
439 These are the constructor and copy-functions.
440
441 • put() / store()
442
443 Stores a new hash key/value pair, or sets an array element value.
444 Takes two arguments, the hash key or array index, and the new
445 value. The value can be a scalar, hash ref or array ref. Returns
446 true on success, false on failure.
447
448 $db->put("foo", "bar"); # for hashes
449 $db->put(1, "bar"); # for arrays
450
451 • get() / fetch()
452
453 Fetches the value of a hash key or array element. Takes one
454 argument: the hash key or array index. Returns a scalar, hash ref
455 or array ref, depending on the data type stored.
456
457 my $value = $db->get("foo"); # for hashes
458 my $value = $db->get(1); # for arrays
459
460 • exists()
461
462 Checks if a hash key or array index exists. Takes one argument: the
463 hash key or array index. Returns true if it exists, false if not.
464
465 if ($db->exists("foo")) { print "yay!\n"; } # for hashes
466 if ($db->exists(1)) { print "yay!\n"; } # for arrays
467
468 • delete()
469
470 Deletes one hash key/value pair or array element. Takes one
471 argument: the hash key or array index. Returns the data that the
472 element used to contain (just like Perl's "delete" function), which
473 is "undef" if it did not exist. For arrays, the remaining elements
474 located after the deleted element are NOT moved over. The deleted
475 element is essentially just undefined, which is exactly how Perl's
476 internal arrays work.
477
478 $db->delete("foo"); # for hashes
479 $db->delete(1); # for arrays
480
481 • clear()
482
483 Deletes all hash keys or array elements. Takes no arguments. No
484 return value.
485
486 $db->clear(); # hashes or arrays
487
488 • lock() / unlock() / lock_exclusive() / lock_shared()
489
490 q.v. "LOCKING" for more info.
491
492 • optimize()
493
494 This will compress the datafile so that it takes up as little space
495 as possible. There is a freespace manager so that when space is
496 freed up, it is used before extending the size of the datafile.
497 But, that freespace just sits in the datafile unless optimize() is
498 called.
499
500 "optimize" basically copies everything into a new database, so, if
501 it is in version 1.0003 format, it will be upgraded.
502
503 • import()
504
505 Unlike simple assignment, import() does not tie the right-hand
506 side. Instead, a copy of your data is put into the DB. import()
507 takes either an arrayref (if your DB is an array) or a hashref (if
508 your DB is a hash). import() will die if anything else is passed
509 in.
510
511 • export()
512
513 This returns a complete copy of the data structure at the point you
514 do the export. This copy is in RAM, not on disk like the DB is.
515
516 • begin_work() / commit() / rollback()
517
         These are the transactional functions. See "TRANSACTIONS" for more
519 information.
520
521 • supports( $option )
522
523 This returns a boolean indicating whether this instance of
524 DBM::Deep supports that feature. $option can be one of:
525
526 • transactions
527
528 • unicode
529
530 • db_version()
531
532 This returns the version of the database format that the current
533 database is in. This is specified as the earliest version of
534 DBM::Deep that supports it.
535
536 For the File back end, this will be 1.0003 or 2.
537
538 For the DBI back end, it is currently always 1.0020.
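
       As a hedged sketch, you can probe for optional features before using
       them rather than letting the call die:

           if ( $db->supports('transactions') ) {
               $db->begin_work;
               $db->{counter}++;
               $db->commit;
           }
           else {
               $db->{counter}++;   # no transactional protection available
           }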
539
540 Hashes
541 For hashes, DBM::Deep supports all the common methods described above,
542 and the following additional methods: first_key() and next_key().
543
544 • first_key()
545
546 Returns the "first" key in the hash. As with built-in Perl hashes,
547 keys are fetched in an undefined order (which appears random).
548 Takes no arguments, returns the key as a scalar value.
549
550 my $key = $db->first_key();
551
552 • next_key()
553
554 Returns the "next" key in the hash, given the previous one as the
555 sole argument. Returns undef if there are no more keys to be
556 fetched.
557
558 $key = $db->next_key($key);
559
560 Here are some examples of using hashes:
561
562 my $db = DBM::Deep->new( "foo.db" );
563
564 $db->put("foo", "bar");
565 print "foo: " . $db->get("foo") . "\n";
566
567 $db->put("baz", {}); # new child hash ref
568 $db->get("baz")->put("buz", "biz");
569 print "buz: " . $db->get("baz")->get("buz") . "\n";
570
571 my $key = $db->first_key();
572 while ($key) {
573 print "$key: " . $db->get($key) . "\n";
574 $key = $db->next_key($key);
575 }
576
577 if ($db->exists("foo")) { $db->delete("foo"); }
578
579 Arrays
580 For arrays, DBM::Deep supports all the common methods described above,
581 and the following additional methods: length(), push(), pop(), shift(),
582 unshift() and splice().
583
584 • length()
585
586 Returns the number of elements in the array. Takes no arguments.
587
588 my $len = $db->length();
589
590 • push()
591
592 Adds one or more elements onto the end of the array. Accepts
593 scalars, hash refs or array refs. No return value.
594
595 $db->push("foo", "bar", {});
596
597 • pop()
598
599 Fetches the last element in the array, and deletes it. Takes no
600 arguments. Returns undef if array is empty. Returns the element
601 value.
602
603 my $elem = $db->pop();
604
605 • shift()
606
607 Fetches the first element in the array, deletes it, then shifts all
608 the remaining elements over to take up the space. Returns the
609 element value. This method is not recommended with large arrays --
610 see "Large Arrays" below for details.
611
612 my $elem = $db->shift();
613
614 • unshift()
615
616 Inserts one or more elements onto the beginning of the array,
617 shifting all existing elements over to make room. Accepts scalars,
618 hash refs or array refs. No return value. This method is not
         recommended with large arrays -- see "Large Arrays" below for
620 details.
621
622 $db->unshift("foo", "bar", {});
623
624 • splice()
625
626 Performs exactly like Perl's built-in function of the same name.
627 See "splice" in perlfunc for usage -- it is too complicated to
628 document here. This method is not recommended with large arrays --
629 see "Large Arrays" below for details.
630
631 Here are some examples of using arrays:
632
633 my $db = DBM::Deep->new(
634 file => "foo.db",
635 type => DBM::Deep->TYPE_ARRAY
636 );
637
638 $db->push("bar", "baz");
639 $db->unshift("foo");
640 $db->put(3, "buz");
641
642 my $len = $db->length();
643 print "length: $len\n"; # 4
644
645 for (my $k=0; $k<$len; $k++) {
646 print "$k: " . $db->get($k) . "\n";
647 }
648
649 $db->splice(1, 2, "biz", "baf");
650
651 while (my $elem = shift @$db) {
652 print "shifted: $elem\n";
653 }
654
LOCKING
       Enable or disable automatic file locking by passing a boolean value to
657 the "locking" parameter when constructing your DBM::Deep object (see
658 "SETUP" above).
659
660 my $db = DBM::Deep->new(
661 file => "foo.db",
662 locking => 1
663 );
664
665 This causes DBM::Deep to flock() the underlying filehandle with
666 exclusive mode for writes, and shared mode for reads. This is required
667 if you have multiple processes accessing the same database file, to
668 avoid file corruption. Please note that flock() does NOT work for
669 files over NFS. See "DB over NFS" below for more.
670
671 Explicit Locking
672 You can explicitly lock a database, so it remains locked for multiple
673 actions. This is done by calling the lock_exclusive() method (for when
674 you want to write) or the lock_shared() method (for when you want to
675 read). This is particularly useful for things like counters, where the
676 current value needs to be fetched, then incremented, then stored again.
677
678 $db->lock_exclusive();
679 my $counter = $db->get("counter");
680 $counter++;
681 $db->put("counter", $counter);
682 $db->unlock();
683
684 # or...
685
686 $db->lock_exclusive();
687 $db->{counter}++;
688 $db->unlock();
689
690 Win32/Cygwin
691 Due to Win32 actually enforcing the read-only status of a shared lock,
692 all locks on Win32 and cygwin are exclusive. This is because of how
693 autovivification currently works. Hopefully, this will go away in a
694 future release.
695
697 You can import existing complex structures by calling the import()
698 method, and export an entire database into an in-memory structure using
699 the export() method. Both are examined here.
700
701 Importing
702 Say you have an existing hash with nested hashes/arrays inside it.
703 Instead of walking the structure and adding keys/elements to the
704 database as you go, simply pass a reference to the import() method.
705 This recursively adds everything to an existing DBM::Deep object for
706 you. Here is an example:
707
708 my $struct = {
709 key1 => "value1",
710 key2 => "value2",
711 array1 => [ "elem0", "elem1", "elem2" ],
712 hash1 => {
713 subkey1 => "subvalue1",
714 subkey2 => "subvalue2"
715 }
716 };
717
718 my $db = DBM::Deep->new( "foo.db" );
719 $db->import( $struct );
720
721 print $db->{key1} . "\n"; # prints "value1"
722
723 This recursively imports the entire $struct object into $db, including
724 all nested hashes and arrays. If the DBM::Deep object contains existing
725 data, keys are merged with the existing ones, replacing if they already
726 exist. The import() method can be called on any database level (not
727 just the base level), and works with both hash and array DB types.
728
729 Note: Make sure your existing structure has no circular references in
730 it. These will cause an infinite loop when importing. There are plans
731 to fix this in a later release.
732
733 Exporting
734 Calling the export() method on an existing DBM::Deep object will return
735 a reference to a new in-memory copy of the database. The export is done
       recursively, so all nested hashes/arrays are exported to standard
737 Perl objects. Here is an example:
738
739 my $db = DBM::Deep->new( "foo.db" );
740
741 $db->{key1} = "value1";
742 $db->{key2} = "value2";
743 $db->{hash1} = {};
744 $db->{hash1}->{subkey1} = "subvalue1";
745 $db->{hash1}->{subkey2} = "subvalue2";
746
747 my $struct = $db->export();
748
749 print $struct->{key1} . "\n"; # prints "value1"
750
751 This makes a complete copy of the database in memory, and returns a
752 reference to it. The export() method can be called on any database
753 level (not just the base level), and works with both hash and array DB
754 types. Be careful of large databases -- you can store a lot more data
755 in a DBM::Deep object than an in-memory Perl structure.
756
757 Note: Make sure your database has no circular references in it. These
758 will cause an infinite loop when exporting. There are plans to fix this
759 in a later release.
760
FILTERS
       DBM::Deep has a number of hooks where you can specify your own Perl
763 function to perform filtering on incoming or outgoing data. This is a
764 perfect way to extend the engine, and implement things like real-time
765 compression or encryption. Filtering applies to the base DB level, and
766 all child hashes / arrays. Filter hooks can be specified when your
767 DBM::Deep object is first constructed, or by calling the set_filter()
768 method at any time. There are four available filter hooks.
769
770 set_filter()
771 This method takes two parameters - the filter type and the filter
772 subreference. The four types are:
773
774 • filter_store_key
775
776 This filter is called whenever a hash key is stored. It is passed
777 the incoming key, and expected to return a transformed key.
778
779 • filter_store_value
780
781 This filter is called whenever a hash key or array element is
782 stored. It is passed the incoming value, and expected to return a
783 transformed value.
784
785 • filter_fetch_key
786
787 This filter is called whenever a hash key is fetched (i.e. via
788 first_key() or next_key()). It is passed the transformed key, and
789 expected to return the plain key.
790
791 • filter_fetch_value
792
793 This filter is called whenever a hash key or array element is
794 fetched. It is passed the transformed value, and expected to
795 return the plain value.
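
       For example, here is a hedged sketch of a matched pair of value
       filters, built on the core MIME::Base64 module, that could back the
       my_filter_store() and my_filter_fetch() names used just below:

           use MIME::Base64 qw( encode_base64 decode_base64 );

           sub my_filter_store { return encode_base64( $_[0], '' ) }
           sub my_filter_fetch { return decode_base64( $_[0] ) }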
796
       Here are the two ways to set up a filter hook:
798
799 my $db = DBM::Deep->new(
800 file => "foo.db",
801 filter_store_value => \&my_filter_store,
802 filter_fetch_value => \&my_filter_fetch
803 );
804
805 # or...
806
807 $db->set_filter( "store_value", \&my_filter_store );
808 $db->set_filter( "fetch_value", \&my_filter_fetch );
809
810 Your filter function will be called only when dealing with SCALAR keys
811 or values. When nested hashes and arrays are being stored/fetched,
812 filtering is bypassed. Filters are called as static functions, passed a
813 single SCALAR argument, and expected to return a single SCALAR value.
814 If you want to remove a filter, set the function reference to "undef":
815
816 $db->set_filter( "store_value", undef );
817
818 Examples
819 Please read DBM::Deep::Cookbook for examples of filters.
820
822 Most DBM::Deep methods return a true value for success, and call die()
823 on failure. You can wrap calls in an eval block to catch the die.
824
825 my $db = DBM::Deep->new( "foo.db" ); # create hash
826 eval { $db->push("foo"); }; # ILLEGAL -- push is array-only call
827
828 print $@; # prints error message
829
LARGEFILE SUPPORT
       If you have a 64-bit system, and your Perl is compiled with both
832 LARGEFILE and 64-bit support, you may be able to create databases
833 larger than 4 GB. DBM::Deep by default uses 32-bit file offset tags,
834 but these can be changed by specifying the 'pack_size' parameter when
835 constructing the file.
836
837 DBM::Deep->new(
838 file => $filename,
839 pack_size => 'large',
840 );
841
842 This tells DBM::Deep to pack all file offsets with 8-byte (64-bit) quad
843 words instead of 32-bit longs. After setting these values your DB files
       have a theoretical maximum size of 16 EB (exabytes).
845
846 You can also use "pack_size => 'small'" in order to use 16-bit file
847 offsets.
848
849 Note: Changing these values will NOT work for existing database files.
850 Only change this for new files. Once the value has been set, it is
851 stored in the file's header and cannot be changed for the life of the
852 file. These parameters are per-file, meaning you can access 32-bit and
853 64-bit files, as you choose.
854
855 Note: We have not personally tested files larger than 4 GB -- all our
856 systems have only a 32-bit Perl. However, we have received user reports
857 that this does indeed work.
858
LOW-LEVEL ACCESS
       If you require low-level access to the underlying filehandle that
861 DBM::Deep uses, you can call the _fh() method, which returns the
862 handle:
863
864 my $fh = $db->_fh();
865
866 This method can be called on the root level of the database, or any
867 child hashes or arrays. All levels share a root structure, which
868 contains things like the filehandle, a reference counter, and all the
869 options specified when you created the object. You can get access to
870 this file object by calling the _storage() method.
871
872 my $file_obj = $db->_storage();
873
874 This is useful for changing options after the object has already been
875 created, such as enabling/disabling locking. You can also store your
876 own temporary user data in this structure (be wary of name collision),
877 which is then accessible from any child hash or array.
878
880 DBM::Deep has full support for circular references. Meaning you can
881 have a nested hash key or array element that points to a parent object.
882 This relationship is stored in the DB file, and is preserved between
883 sessions. Here is an example:
884
885 my $db = DBM::Deep->new( "foo.db" );
886
887 $db->{foo} = "bar";
888 $db->{circle} = $db; # ref to self
889
890 print $db->{foo} . "\n"; # prints "bar"
891 print $db->{circle}->{foo} . "\n"; # prints "bar" again
892
893 This also works as expected with array and hash references. So, the
894 following works as expected:
895
896 $db->{foo} = [ 1 .. 3 ];
897 $db->{bar} = $db->{foo};
898
899 push @{$db->{foo}}, 42;
900 is( $db->{bar}[-1], 42 ); # Passes
901
902 This, however, does not extend to assignments from one DB file to
903 another. So, the following will throw an error:
904
905 my $db1 = DBM::Deep->new( "foo.db" );
906 my $db2 = DBM::Deep->new( "bar.db" );
907
908 $db1->{foo} = [];
909 $db2->{foo} = $db1->{foo}; # dies
910
911 Note: Passing the object to a function that recursively walks the
912 object tree (such as Data::Dumper or even the built-in optimize() or
913 export() methods) will result in an infinite loop. This will be fixed
914 in a future release by adding singleton support.
915
TRANSACTIONS
       As of 1.0000, DBM::Deep has ACID transactions. Every DBM::Deep object
918 is completely transaction-ready - it is not an option you have to turn
919 on. You do have to specify how many transactions may run simultaneously
920 (q.v. "num_txns").
921
922 Three new methods have been added to support them. They are:
923
924 • begin_work()
925
926 This starts a transaction.
927
928 • commit()
929
930 This applies the changes done within the transaction to the
931 mainline and ends the transaction.
932
933 • rollback()
934
935 This discards the changes done within the transaction to the
936 mainline and ends the transaction.
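
       Put together, here is a hedged sketch of the workflow; remember that
       the file must have been created with "num_txns" of at least 2 for one
       transaction to be available:

           my $db = DBM::Deep->new( file => "foo.db", num_txns => 2 );

           $db->{balance} = 100;

           $db->begin_work;
           $db->{balance} -= 150;

           if ( $db->{balance} < 0 ) {
               $db->rollback;               # discard the change
           }
           else {
               $db->commit;                 # publish it to the HEAD
           }

           print $db->{balance}, "\n";      # prints 100: we rolled back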
937
938 Transactions in DBM::Deep are done using a variant of the MVCC method,
939 the same method used by the InnoDB MySQL engine.
940
942 As of 1.0000, the file format has changed. To aid in upgrades, a
943 migration script is provided within the CPAN distribution, called
944 utils/upgrade_db.pl.
945
946 NOTE: This script is not installed onto your system because it carries
947 a copy of every version prior to the current version.
948
949 As of version 2.0000, databases created by old versions back to 1.0003
950 can be read, but new features may not be available unless the database
951 is upgraded first.
952
954 The following are items that are planned to be added in future
955 releases. These are separate from the "CAVEATS, ISSUES & BUGS" below.
956
957 Sub-Transactions
958 Right now, you cannot run a transaction within a transaction. Removing
959 this restriction is technically straightforward, but the combinatorial
960 explosion of possible usecases hurts my head. If this is something you
961 want to see immediately, please submit many testcases.
962
963 Caching
964 If a client is willing to assert upon opening the file that this
965 process will be the only consumer of that datafile, then there are a
966 number of caching possibilities that can be taken advantage of. This
967 does, however, mean that DBM::Deep is more vulnerable to losing data
968 due to unflushed changes. It also means a much larger in-memory
969 footprint. As such, it's not clear exactly how this should be done.
970 Suggestions are welcome.
971
972 Ram-only
973 The techniques used in DBM::Deep simply require a seekable contiguous
974 datastore. This could just as easily be a large string as a file. By
975 using substr, the STM capabilities of DBM::Deep could be used within a
976 single-process. I have no idea how I'd specify this, though.
977 Suggestions are welcome.
978
979 Different contention resolution mechanisms
980 Currently, the only contention resolution mechanism is last-write-wins.
981 This is the mechanism used by most RDBMSes and should be good enough
982 for most uses. For advanced uses of STM, other contention mechanisms
983 will be needed. If you have an idea of how you'd like to see contention
984 resolution in DBM::Deep, please let me know.
985
CAVEATS, ISSUES & BUGS
       This section describes all the known issues with DBM::Deep. These are
       issues that are either intractable or depend on some feature within
       Perl working exactly right. If you have found something that is not
990 listed below, please send an e-mail to bug-DBM-Deep@rt.cpan.org
991 <mailto:bug-DBM-Deep@rt.cpan.org>. Likewise, if you think you know of
992 a way around one of these issues, please let me know.
993
994 References
995 (The following assumes a high level of Perl understanding, specifically
996 of references. Most users can safely skip this section.)
997
998 Currently, the only references supported are HASH and ARRAY. The other
999 reference types (SCALAR, CODE, GLOB, and REF) cannot be supported for
1000 various reasons.
1001
1002 • GLOB
1003
1004 These are things like filehandles and other sockets. They can't be
1005 supported because it's completely unclear how DBM::Deep should
1006 serialize them.
1007
1008 • SCALAR / REF
1009
1010 The discussion here refers to the following type of example:
1011
1012 my $x = 25;
1013 $db->{key1} = \$x;
1014
1015 $x = 50;
1016
1017 # In some other process ...
1018
1019 my $val = ${ $db->{key1} };
1020
1021 is( $val, 50, "What actually gets stored in the DB file?" );
1022
1023 The problem is one of synchronization. When the variable being
1024 referred to changes value, the reference isn't notified, which is
1025 kind of the point of references. This means that the new value
1026 won't be stored in the datafile for other processes to read. There
1027 is no TIEREF.
1028
1029 It is theoretically possible to store references to values already
1030 within a DBM::Deep object because everything already is
1031 synchronized, but the change to the internals would be quite large.
1032 Specifically, DBM::Deep would have to tie every single value that
1033 is stored. This would bloat the RAM footprint of DBM::Deep at least
1034 twofold (if not more) and be a significant performance drain, all
1035 to support a feature that has never been requested.
1036
1037 • CODE
1038
1039 Data::Dump::Streamer provides a mechanism for serializing coderefs,
1040 including saving off all closure state. This would allow for
1041 DBM::Deep to store the code for a subroutine. Then, whenever the
1042 subroutine is read, the code could be eval()'ed into being.
1043 However, just as for SCALAR and REF, that closure state may change
1044 without notifying the DBM::Deep object storing the reference.
1045 Again, this would generally be considered a feature.
1046
1047 External references and transactions
1048 If you do "my $x = $db->{foo};", then start a transaction, $x will be
1049 referencing the database from outside the transaction. A fix for this
       (and other issues with how external references into the database are handled) is
1051 being looked into. This is the skipped set of tests in
1052 t/39_singletons.t and a related issue is the focus of
1053 t/37_delete_edge_cases.t
1054
1055 File corruption
1056 The current level of error handling in DBM::Deep is minimal. Files are
1057 checked for a 32-bit signature when opened, but any other form of
1058 corruption in the datafile can cause segmentation faults. DBM::Deep may
1059 try to seek() past the end of a file, or get stuck in an infinite loop
1060 depending on the level and type of corruption. File write operations
1061 are not checked for failure (for speed), so if you happen to run out of
1062 disk space, DBM::Deep will probably fail in a bad way. These things
1063 will be addressed in a later version of DBM::Deep.
1064
1065 DB over NFS
1066 Beware of using DBM::Deep files over NFS. DBM::Deep uses flock(), which
1067 works well on local filesystems, but will NOT protect you from file
1068 corruption over NFS. I've heard about setting up your NFS server with a
1069 locking daemon, then using lockf() to lock your files, but your mileage
1070 may vary there as well. From what I understand, there is no real way
1071 to do it. However, if you need access to the underlying filehandle in
1072 DBM::Deep for using some other kind of locking scheme like lockf(), see
1073 the "LOW-LEVEL ACCESS" section above.
1074
1075 Copying Objects
1076 Beware of copying tied objects in Perl. Very strange things can happen.
1077 Instead, use DBM::Deep's clone() method which safely copies the object
1078 and returns a new, blessed and tied hash or array to the same level in
1079 the DB.
1080
1081 my $copy = $db->clone();
1082
1083 Note: Since clone() here is cloning the object, not the database
1084 location, any modifications to either $db or $copy will be visible to
1085 both.
1086
1087 Stale References
1088 If you take a reference to an array or hash from the database, it is
1089 tied to the database itself. This means that if the datum in question
1090 is subsequently deleted from the database, the reference to it will
1091 point to an invalid location and unpredictable things will happen if
1092 you try to use it.
1093
1094 So a seemingly innocuous piece of code like this:
1095
1096 my %hash = %{ $db->{some_hash} };
1097
1098 can fail if another process deletes or clobbers "$db->{some_hash}"
1099 while the data are being extracted, since "%{ ... }" is not atomic.
1100 (This actually happened.) The solution is to lock the database before
1101 reading the data:
1102
1103 $db->lock_exclusive;
1104 my %hash = %{ $db->{some_hash} };
1105 $db->unlock;
1106
1107 As of version 1.0024, if you assign a stale reference to a location in
1108 the database, DBM::Deep will warn, if you have uninitialized warnings
1109 enabled, and treat the stale reference as "undef". An attempt to use a
1110 stale reference as an array or hash reference will cause an error.
1111
1112 Large Arrays
1113 Beware of using shift(), unshift() or splice() with large arrays.
1114 These functions cause every element in the array to move, which can be
1115 murder on DBM::Deep, as every element has to be fetched from disk, then
1116 stored again in a different location. This will be addressed in a
1117 future version.
1118
1119 This has been somewhat addressed so that the cost is constant,
1120 regardless of what is stored at those locations. So, small arrays with
1121 huge data structures in them are faster. But, large arrays are still
1122 large.
1123
1124 Writeonly Files
1125 If you pass in a filehandle to new(), you may have opened it in either
1126 a readonly or writeonly mode. STORE will verify that the filehandle is
1127 writable. However, there doesn't seem to be a good way to determine if
1128 a filehandle is readable. And, if the filehandle isn't readable, it's
1129 not clear what will happen. So, don't do that.
1130
1131 Assignments Within Transactions
1132 The following will not work as one might expect:
1133
1134 my $x = { a => 1 };
1135
1136 $db->begin_work;
1137 $db->{foo} = $x;
1138 $db->rollback;
1139
1140 is( $x->{a}, 1 ); # This will fail!
1141
       The problem is that the moment a reference is used as the rvalue to a
1143 DBM::Deep object's lvalue, it becomes tied itself. This is so that
1144 future changes to $x can be tracked within the DBM::Deep file and is
1145 considered to be a feature. By the time the rollback occurs, there is
1146 no knowledge that there had been an $x or what memory location to
1147 assign an export() to.
1148
1149 NOTE: This does not affect importing because imports do a walk over the
1150 reference to be imported in order to explicitly leave it untied.
1151
1153 Devel::Cover is used to test the code coverage of the tests. Below is
1154 the Devel::Cover report on this distribution's test suite.
1155
1156 ---------------------------- ------ ------ ------ ------ ------ ------ ------
1157 File stmt bran cond sub pod time total
1158 ---------------------------- ------ ------ ------ ------ ------ ------ ------
1159 blib/lib/DBM/Deep.pm 100.0 89.1 82.9 100.0 100.0 32.5 98.1
1160 blib/lib/DBM/Deep/Array.pm 100.0 94.4 100.0 100.0 100.0 5.2 98.8
1161 blib/lib/DBM/Deep/Engine.pm 100.0 92.9 100.0 100.0 100.0 7.4 100.0
1162 ...ib/DBM/Deep/Engine/DBI.pm 95.0 73.1 100.0 100.0 100.0 1.5 90.4
1163 ...b/DBM/Deep/Engine/File.pm 92.3 78.5 88.9 100.0 100.0 4.9 90.3
1164 blib/lib/DBM/Deep/Hash.pm 100.0 100.0 100.0 100.0 100.0 3.8 100.0
1165 .../lib/DBM/Deep/Iterator.pm 100.0 n/a n/a 100.0 100.0 0.0 100.0
1166 .../DBM/Deep/Iterator/DBI.pm 100.0 100.0 n/a 100.0 100.0 1.2 100.0
1167 ...DBM/Deep/Iterator/File.pm 92.5 84.6 n/a 100.0 66.7 0.6 90.0
1168 ...erator/File/BucketList.pm 100.0 75.0 n/a 100.0 66.7 0.4 93.8
1169 ...ep/Iterator/File/Index.pm 100.0 100.0 n/a 100.0 100.0 0.2 100.0
1170 blib/lib/DBM/Deep/Null.pm 87.5 n/a n/a 75.0 n/a 0.0 83.3
1171 blib/lib/DBM/Deep/Sector.pm 91.7 n/a n/a 83.3 0.0 6.7 74.4
1172 ...ib/DBM/Deep/Sector/DBI.pm 96.8 83.3 n/a 100.0 0.0 1.0 89.8
1173 ...p/Sector/DBI/Reference.pm 100.0 95.5 100.0 100.0 0.0 2.2 91.2
1174 ...Deep/Sector/DBI/Scalar.pm 100.0 100.0 n/a 100.0 0.0 1.1 92.9
1175 ...b/DBM/Deep/Sector/File.pm 96.0 87.5 100.0 92.3 25.0 2.2 91.0
1176 ...Sector/File/BucketList.pm 98.2 85.7 83.3 100.0 0.0 3.3 89.4
1177 .../Deep/Sector/File/Data.pm 100.0 n/a n/a 100.0 0.0 0.1 90.9
1178 ...Deep/Sector/File/Index.pm 100.0 80.0 33.3 100.0 0.0 0.8 83.1
1179 .../Deep/Sector/File/Null.pm 100.0 100.0 n/a 100.0 0.0 0.0 91.7
1180 .../Sector/File/Reference.pm 100.0 90.0 80.0 100.0 0.0 1.4 91.5
1181 ...eep/Sector/File/Scalar.pm 98.4 87.5 n/a 100.0 0.0 0.8 91.9
1182 blib/lib/DBM/Deep/Storage.pm 100.0 n/a n/a 100.0 100.0 0.0 100.0
1183 ...b/DBM/Deep/Storage/DBI.pm 97.3 70.8 n/a 100.0 38.5 6.7 87.0
1184 .../DBM/Deep/Storage/File.pm 96.6 77.1 80.0 95.7 100.0 16.0 91.8
1185 Total 99.3 85.2 84.9 99.8 63.3 100.0 97.6
1186 ---------------------------- ------ ------ ------ ------ ------ ------ ------
1187
1189 The source code repository is at
1190 <http://github.com/DrHyde/perl-modules-DBM-Deep>
1191
1193 Currently maintained by David Cantrell dcantrell@cpan.org
1194 <mailto:dcantrell@cpan.org>.
1195
1196 Originally written by Joseph Huckaby, jhuckaby@cpan.org
1197 <mailto:jhuckaby@cpan.org> with significant additions by Rob Kinyon,
1198 rkinyon@cpan.org <mailto:rkinyon@cpan.org>
1199
1201 Stonehenge Consulting (<http://www.stonehenge.com/>) sponsored the
1202 development of transactions and freespace management, leading to the
1203 1.0000 release. A great debt of gratitude goes out to them for their
1204 continuing leadership in and support of the Perl community.
1205
1207 The following have contributed greatly to make DBM::Deep what it is
1208 today:
1209
1210 • Adam Sah and Rich Gaushell for innumerable contributions early on.
1211
1212 • Dan Golden and others at YAPC::NA 2006 for helping me design
1213 through transactions.
1214
1215 • James Stanley for bug fix
1216
1217 • David Steinbrunner for fixing typos and adding repository cpan
1218 metadata
1219
1220 • H. Merijn Brandt for fixing the POD escapes.
1221
1222 • Breno G. de Oliveira for minor packaging tweaks
1223
SEE ALSO
       DBM::Deep::Cookbook(3)
1226
1227 perltie(1), Tie::Hash(3), Fcntl(3), flock(2), lockf(3), nfs(5)
1228
1230 Copyright (c) 2007-23 Rob Kinyon and others. All Rights Reserved. This
1231 is free software, you may use it and distribute it under the same terms
1232 as Perl itself.
1233
1234
1235
perl v5.38.0                      2023-09-03                     DBM::Deep(3pm)