DB_File(3)         User Contributed Perl Documentation        DB_File(3)
2
3
4
NAME
6 DB_File - Perl5 access to Berkeley DB version 1.x
7
SYNOPSIS
9 use DB_File;
10
11 [$X =] tie %hash, 'DB_File', [$filename, $flags, $mode, $DB_HASH] ;
12 [$X =] tie %hash, 'DB_File', $filename, $flags, $mode, $DB_BTREE ;
13 [$X =] tie @array, 'DB_File', $filename, $flags, $mode, $DB_RECNO ;
14
15 $status = $X->del($key [, $flags]) ;
16 $status = $X->put($key, $value [, $flags]) ;
17 $status = $X->get($key, $value [, $flags]) ;
18 $status = $X->seq($key, $value, $flags) ;
19 $status = $X->sync([$flags]) ;
20 $status = $X->fd ;
21
22 # BTREE only
23 $count = $X->get_dup($key) ;
24 @list = $X->get_dup($key) ;
25 %list = $X->get_dup($key, 1) ;
26 $status = $X->find_dup($key, $value) ;
27 $status = $X->del_dup($key, $value) ;
28
29 # RECNO only
30 $a = $X->length;
31 $a = $X->pop ;
32 $X->push(list);
33 $a = $X->shift;
34 $X->unshift(list);
35 @r = $X->splice(offset, length, elements);
36
37 # DBM Filters
38 $old_filter = $db->filter_store_key ( sub { ... } ) ;
39 $old_filter = $db->filter_store_value( sub { ... } ) ;
40 $old_filter = $db->filter_fetch_key ( sub { ... } ) ;
41 $old_filter = $db->filter_fetch_value( sub { ... } ) ;
42
43 untie %hash ;
44 untie @array ;
45
DESCRIPTION
47 DB_File is a module which allows Perl programs to make use of the
48 facilities provided by Berkeley DB version 1.x (if you have a newer
49 version of DB, see "Using DB_File with Berkeley DB version 2 or
50 greater"). It is assumed that you have a copy of the Berkeley DB
51 manual pages at hand when reading this documentation. The interface
52 defined here mirrors the Berkeley DB interface closely.
53
54 Berkeley DB is a C library which provides a consistent interface to a
55 number of database formats. DB_File provides an interface to all three
56 of the database types currently supported by Berkeley DB.
57
58 The file types are:
59
60 DB_HASH
61 This database type allows arbitrary key/value pairs to be stored
62 in data files. This is equivalent to the functionality provided by
63 other hashing packages like DBM, NDBM, ODBM, GDBM, and SDBM.
64 Remember though, the files created using DB_HASH are not
65 compatible with any of the other packages mentioned.
66
67 A default hashing algorithm, which will be adequate for most
68 applications, is built into Berkeley DB. If you do need to use
69 your own hashing algorithm it is possible to write your own in
70 Perl and have DB_File use it instead.
71
72 DB_BTREE
73 The btree format allows arbitrary key/value pairs to be stored in
74 a sorted, balanced binary tree.
75
76 As with the DB_HASH format, it is possible to provide a user
77 defined Perl routine to perform the comparison of keys. By
78 default, though, the keys are stored in lexical order.
79
80 DB_RECNO
81 DB_RECNO allows both fixed-length and variable-length flat text
82 files to be manipulated using the same key/value pair interface as
83 in DB_HASH and DB_BTREE. In this case the key will consist of a
84 record (line) number.
85
86 Using DB_File with Berkeley DB version 2 or greater
87 Although DB_File is intended to be used with Berkeley DB version 1, it
88 can also be used with version 2, 3 or 4. In this case the interface is
89 limited to the functionality provided by Berkeley DB 1.x. Anywhere the
90 version 2 or greater interface differs, DB_File arranges for it to work
91 like version 1. This feature allows DB_File scripts that were built
92 with version 1 to be migrated to version 2 or greater without any
93 changes.
94
95 If you want to make use of the new features available in Berkeley DB
96 2.x or greater, use the Perl module BerkeleyDB
97 <https://metacpan.org/pod/BerkeleyDB> instead.
98
99 Note: The database file format has changed multiple times in Berkeley
100 DB version 2, 3 and 4. If you cannot recreate your databases, you must
101 dump any existing databases with either the "db_dump" or the
102 "db_dump185" utility that comes with Berkeley DB. Once you have
103 rebuilt DB_File to use Berkeley DB version 2 or greater, your databases
104 can be recreated using "db_load". Refer to the Berkeley DB
105 documentation for further details.
106
107 Please read "COPYRIGHT" before using version 2.x or greater of Berkeley
108 DB with DB_File.
109
110 Interface to Berkeley DB
111 DB_File allows access to Berkeley DB files using the tie() mechanism in
112 Perl 5 (for full details, see "tie()" in perlfunc). This facility
113 allows DB_File to access Berkeley DB files using either an associative
114 array (for DB_HASH & DB_BTREE file types) or an ordinary array (for the
115 DB_RECNO file type).
116
117 In addition to the tie() interface, it is also possible to access most
118 of the functions provided in the Berkeley DB API directly. See "THE
119 API INTERFACE".
120
121 Opening a Berkeley DB Database File
122 Berkeley DB uses the function dbopen() to open or create a database.
123 Here is the C prototype for dbopen():
124
125 DB*
126 dbopen (const char * file, int flags, int mode,
127 DBTYPE type, const void * openinfo)
128
The parameter "type" is an enumeration which specifies which of the 3
interface methods (DB_HASH, DB_BTREE or DB_RECNO) is to be used.
Depending on which of these is actually chosen, the final parameter,
openinfo, points to a data structure which allows tailoring of the
specific interface method.
134
135 This interface is handled slightly differently in DB_File. Here is an
136 equivalent call using DB_File:
137
138 tie %array, 'DB_File', $filename, $flags, $mode, $DB_HASH ;
139
140 The "filename", "flags" and "mode" parameters are the direct equivalent
141 of their dbopen() counterparts. The final parameter $DB_HASH performs
142 the function of both the "type" and "openinfo" parameters in dbopen().
143
144 In the example above $DB_HASH is actually a pre-defined reference to a
145 hash object. DB_File has three of these pre-defined references. Apart
146 from $DB_HASH, there is also $DB_BTREE and $DB_RECNO.
147
The keys allowed in each of these pre-defined references are limited to
the names used in the equivalent C structure. So, for example, the
$DB_HASH reference will only allow keys called "bsize", "cachesize",
"ffactor", "hash", "lorder" and "nelem".
152
153 To change one of these elements, just assign to it like this:
154
155 $DB_HASH->{'cachesize'} = 10000 ;
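
As a small illustration of the restriction on key names, the sketch
below tries to set an element that is not one of the permitted names.
It relies on DB_File raising a fatal error for the unknown key, which is
trapped here with eval. The key name "bogus" and the variable names are
made up for this example.

```perl
use strict ;
use warnings ;
use DB_File ;

# Use a private HASHINFO object so the shared $DB_HASH is left untouched
my $info = DB_File::HASHINFO->new() ;

# "cachesize" is one of the permitted keys, so this assignment is accepted
$info->{'cachesize'} = 10000 ;

# "bogus" is a made-up name that is not in the list of permitted keys
my $error = "" ;
eval { $info->{'bogus'} = 1 ; 1 } or $error = $@ ;

print "cachesize is $info->{'cachesize'}\n" ;
print "bogus was rejected\n" if $error ;
```

Trapping the error with eval is only needed if the key name comes from
outside the program; a misspelt literal key is best left to fail loudly.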
156
157 The three predefined variables $DB_HASH, $DB_BTREE and $DB_RECNO are
158 usually adequate for most applications. If you do need to create extra
159 instances of these objects, constructors are available for each file
160 type.
161
162 Here are examples of the constructors and the valid options available
163 for DB_HASH, DB_BTREE and DB_RECNO respectively.
164
165 $a = DB_File::HASHINFO->new();
166 $a->{'bsize'} ;
167 $a->{'cachesize'} ;
168 $a->{'ffactor'};
169 $a->{'hash'} ;
170 $a->{'lorder'} ;
171 $a->{'nelem'} ;
172
173 $b = DB_File::BTREEINFO->new();
174 $b->{'flags'} ;
175 $b->{'cachesize'} ;
176 $b->{'maxkeypage'} ;
177 $b->{'minkeypage'} ;
178 $b->{'psize'} ;
179 $b->{'compare'} ;
180 $b->{'prefix'} ;
181 $b->{'lorder'} ;
182
183 $c = DB_File::RECNOINFO->new();
184 $c->{'bval'} ;
185 $c->{'cachesize'} ;
186 $c->{'psize'} ;
187 $c->{'flags'} ;
188 $c->{'lorder'} ;
189 $c->{'reclen'} ;
190 $c->{'bfname'} ;
191
The values stored in the hashes above are mostly the direct equivalent
of their C counterparts. Like their C counterparts, all are set to
default values, which means you don't have to set all of the values
when you only want to change one. Here is an example:
196
197 $a = DB_File::HASHINFO->new();
198 $a->{'cachesize'} = 12345 ;
199 tie %y, 'DB_File', "filename", $flags, 0777, $a ;
200
201 A few of the options need extra discussion here. When used, the C
202 equivalent of the keys "hash", "compare" and "prefix" store pointers to
203 C functions. In DB_File these keys are used to store references to Perl
204 subs. Below are templates for each of the subs:
205
206 sub hash
207 {
208 my ($data) = @_ ;
209 ...
210 # return the hash value for $data
211 return $hash ;
212 }
213
    sub compare
    {
        my ($key1, $key2) = @_ ;
        ...
        # return  0 if $key1 eq $key2
        #        -1 if $key1 lt $key2
        #         1 if $key1 gt $key2
        return (-1 , 0 or 1) ;
    }
223
    sub prefix
    {
        my ($key1, $key2) = @_ ;
        ...
        # return number of bytes of $key2 which are
        # necessary to determine that it is greater than $key1
        return $bytes ;
    }
232
233 See "Changing the BTREE sort order" for an example of using the
234 "compare" template.
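
For the "hash" template, a minimal sketch might look like the one
below. The hashing scheme (multiply by 33, add each byte, reduce modulo
a prime) is only an illustration chosen for this example and is not the
algorithm built into Berkeley DB. The counter $calls is a hypothetical
helper that shows the Perl sub really is being invoked.

```perl
use strict ;
use warnings ;
use DB_File ;

# A private HASHINFO object, so the shared $DB_HASH is left untouched
my $info = DB_File::HASHINFO->new() ;

my $calls = 0 ;

# Illustrative hash function: multiply-and-add over the bytes of the key.
# The modulus keeps the result comfortably inside a 32-bit integer.
$info->{'hash'} = sub {
    my ($data) = @_ ;
    $calls++ ;
    my $hash = 0 ;
    $hash = ($hash * 33 + $_) % 1_000_003 for unpack "C*", $data ;
    return $hash ;
} ;

# undef in place of a filename gives an in-memory database
my %h ;
tie %h, "DB_File", undef, O_RDWR|O_CREAT, 0666, $info
    or die "Cannot create in-memory database: $!\n" ;

$h{"apple"} = "red" ;
my $colour = $h{"apple"} ;
print "apple -> $colour (hash sub called $calls times)\n" ;

untie %h ;
```

A database created with a user-defined hash function should be opened
with the same function every time, for the same reason that a BTREE
compare function must not change.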
235
236 If you are using the DB_RECNO interface and you intend making use of
237 "bval", you should check out "The 'bval' Option".
238
239 Default Parameters
240 It is possible to omit some or all of the final 4 parameters in the
241 call to "tie" and let them take default values. As DB_HASH is the most
242 common file format used, the call:
243
244 tie %A, "DB_File", "filename" ;
245
246 is equivalent to:
247
248 tie %A, "DB_File", "filename", O_CREAT|O_RDWR, 0666, $DB_HASH ;
249
It is possible to omit the filename parameter as well, so the
call:
252
253 tie %A, "DB_File" ;
254
255 is equivalent to:
256
257 tie %A, "DB_File", undef, O_CREAT|O_RDWR, 0666, $DB_HASH ;
258
259 See "In Memory Databases" for a discussion on the use of "undef" in
260 place of a filename.
261
262 In Memory Databases
263 Berkeley DB allows the creation of in-memory databases by using NULL
264 (that is, a "(char *)0" in C) in place of the filename. DB_File uses
265 "undef" instead of NULL to provide this functionality.
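
A minimal sketch of an in-memory database follows. The variable names
are chosen for this example only. Nothing is written to disk, so the
data is lost once the hash is untied.

```perl
use strict ;
use warnings ;
use DB_File ;

# Passing undef as the filename asks for an in-memory database
my %cache ;
tie %cache, "DB_File", undef, O_RDWR|O_CREAT, 0666, $DB_HASH
    or die "Cannot create in-memory database: $!\n" ;

$cache{"apple"}  = "red" ;
$cache{"banana"} = "yellow" ;

my $count = scalar keys %cache ;
print "The in-memory database holds $count keys\n" ;

untie %cache ;
```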
266
DB_HASH
268 The DB_HASH file format is probably the most commonly used of the three
269 file formats that DB_File supports. It is also very straightforward to
270 use.
271
272 A Simple Example
This example shows how to create a database, add key/value pairs to the
database, delete key/value pairs and finally how to enumerate the
contents of the database.
276
277 use warnings ;
278 use strict ;
279 use DB_File ;
280 our (%h, $k, $v) ;
281
282 unlink "fruit" ;
283 tie %h, "DB_File", "fruit", O_RDWR|O_CREAT, 0666, $DB_HASH
284 or die "Cannot open file 'fruit': $!\n";
285
286 # Add a few key/value pairs to the file
287 $h{"apple"} = "red" ;
288 $h{"orange"} = "orange" ;
289 $h{"banana"} = "yellow" ;
290 $h{"tomato"} = "red" ;
291
292 # Check for existence of a key
    print "Banana Exists\n\n" if exists $h{"banana"} ;
294
295 # Delete a key/value pair.
296 delete $h{"apple"} ;
297
298 # print the contents of the file
299 while (($k, $v) = each %h)
300 { print "$k -> $v\n" }
301
302 untie %h ;
303
Here is the output:
305
306 Banana Exists
307
308 orange -> orange
309 tomato -> red
310 banana -> yellow
311
Note that, as with ordinary associative arrays, the keys are retrieved
in an apparently random order.
314
DB_BTREE
316 The DB_BTREE format is useful when you want to store data in a given
317 order. By default the keys will be stored in lexical order, but as you
318 will see from the example shown in the next section, it is very easy to
319 define your own sorting function.
320
321 Changing the BTREE sort order
322 This script shows how to override the default sorting algorithm that
323 BTREE uses. Instead of using the normal lexical ordering, a case
324 insensitive compare function will be used.
325
326 use warnings ;
327 use strict ;
328 use DB_File ;
329
330 my %h ;
331
332 sub Compare
333 {
334 my ($key1, $key2) = @_ ;
335 "\L$key1" cmp "\L$key2" ;
336 }
337
338 # specify the Perl sub that will do the comparison
339 $DB_BTREE->{'compare'} = \&Compare ;
340
341 unlink "tree" ;
342 tie %h, "DB_File", "tree", O_RDWR|O_CREAT, 0666, $DB_BTREE
343 or die "Cannot open file 'tree': $!\n" ;
344
    # Add some key/value pairs to the file
346 $h{'Wall'} = 'Larry' ;
347 $h{'Smith'} = 'John' ;
348 $h{'mouse'} = 'mickey' ;
349 $h{'duck'} = 'donald' ;
350
351 # Delete
352 delete $h{"duck"} ;
353
354 # Cycle through the keys printing them in order.
355 # Note it is not necessary to sort the keys as
356 # the btree will have kept them in order automatically.
357 foreach (keys %h)
358 { print "$_\n" }
359
360 untie %h ;
361
362 Here is the output from the code above.
363
364 mouse
365 Smith
366 Wall
367
There are a few points to bear in mind if you want to change the
ordering in a BTREE database:
370
371 1. The new compare function must be specified when you create the
372 database.
373
374 2. You cannot change the ordering once the database has been created.
375 Thus you must use the same compare function every time you access
376 the database.
377
3. Duplicate keys are entirely defined by the comparison function.
   In the case-insensitive example above, the keys 'KEY' and 'key'
   would be considered duplicates, and assigning to the second one
   would overwrite the first. If duplicates are allowed (with the
   R_DUP flag discussed below), only a single copy of duplicate keys
   is stored in the database. So, again with the example above,
   assigning three values to the keys 'KEY', 'Key', and 'key' would
   leave just the first key, 'KEY', in the database with three values.
   For some situations this results in information loss, so care
   should be taken to provide fully qualified comparison functions
   when necessary. For example, the above comparison routine could
   be modified to additionally compare case-sensitively if two keys
   are equal in the case-insensitive comparison:
391
392 sub compare {
393 my($key1, $key2) = @_;
394 lc $key1 cmp lc $key2 ||
395 $key1 cmp $key2;
396 }
397
398 And now you will only have duplicates when the keys themselves are
399 truly the same. (note: in versions of the db library prior to
400 about November 1996, such duplicate keys were retained so it was
401 possible to recover the original keys in sets of keys that
402 compared as equal).
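
As a further sketch of a user-defined ordering, the script below keeps
its keys in numeric rather than lexical order. It uses a private
BTREEINFO object and an in-memory database so that nothing else is
affected; the keys and values are made up for this example.

```perl
use strict ;
use warnings ;
use DB_File ;

# A private BTREEINFO object, so the shared $DB_BTREE is left untouched
my $info = DB_File::BTREEINFO->new() ;

# Compare the keys as numbers instead of as strings
$info->{'compare'} = sub {
    my ($key1, $key2) = @_ ;
    return $key1 <=> $key2 ;
} ;

my %h ;
tie %h, "DB_File", undef, O_RDWR|O_CREAT, 0666, $info
    or die "Cannot create in-memory database: $!\n" ;

$h{$_} = "value $_" for (10, 9, 100, 1) ;

# The btree hands the keys back in the order defined by the compare sub
my @order = keys %h ;
print "@order\n" ;

untie %h ;
```

This prints "1 9 10 100". With the default lexical ordering the same
keys would come back as "1 10 100 9".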
403
404 Handling Duplicate Keys
405 The BTREE file type optionally allows a single key to be associated
406 with an arbitrary number of values. This option is enabled by setting
407 the flags element of $DB_BTREE to R_DUP when creating the database.
408
409 There are some difficulties in using the tied hash interface if you
410 want to manipulate a BTREE database with duplicate keys. Consider this
411 code:
412
413 use warnings ;
414 use strict ;
415 use DB_File ;
416
417 my ($filename, %h) ;
418
419 $filename = "tree" ;
420 unlink $filename ;
421
422 # Enable duplicate records
423 $DB_BTREE->{'flags'} = R_DUP ;
424
425 tie %h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_BTREE
426 or die "Cannot open $filename: $!\n";
427
428 # Add some key/value pairs to the file
429 $h{'Wall'} = 'Larry' ;
430 $h{'Wall'} = 'Brick' ; # Note the duplicate key
431 $h{'Wall'} = 'Brick' ; # Note the duplicate key and value
432 $h{'Smith'} = 'John' ;
433 $h{'mouse'} = 'mickey' ;
434
435 # iterate through the associative array
436 # and print each key/value pair.
437 foreach (sort keys %h)
438 { print "$_ -> $h{$_}\n" }
439
440 untie %h ;
441
442 Here is the output:
443
444 Smith -> John
445 Wall -> Larry
446 Wall -> Larry
447 Wall -> Larry
448 mouse -> mickey
449
As you can see, 3 records have been successfully created with key
"Wall". The only problem is that when they are retrieved from the
database they all seem to have the same value, namely "Larry". This is
caused by the way that the associative array interface works: when it
is used to fetch the value associated with a given key, it will only
ever retrieve the first value.
456
457 Although it may not be immediately obvious from the code above, the
458 associative array interface can be used to write values with duplicate
459 keys, but it cannot be used to read them back from the database.
460
461 The way to get around this problem is to use the Berkeley DB API method
462 called "seq". This method allows sequential access to key/value pairs.
463 See "THE API INTERFACE" for details of both the "seq" method and the
464 API in general.
465
466 Here is the script above rewritten using the "seq" API method.
467
468 use warnings ;
469 use strict ;
470 use DB_File ;
471
472 my ($filename, $x, %h, $status, $key, $value) ;
473
474 $filename = "tree" ;
475 unlink $filename ;
476
477 # Enable duplicate records
478 $DB_BTREE->{'flags'} = R_DUP ;
479
480 $x = tie %h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_BTREE
481 or die "Cannot open $filename: $!\n";
482
483 # Add some key/value pairs to the file
484 $h{'Wall'} = 'Larry' ;
485 $h{'Wall'} = 'Brick' ; # Note the duplicate key
486 $h{'Wall'} = 'Brick' ; # Note the duplicate key and value
487 $h{'Smith'} = 'John' ;
488 $h{'mouse'} = 'mickey' ;
489
490 # iterate through the btree using seq
491 # and print each key/value pair.
492 $key = $value = 0 ;
493 for ($status = $x->seq($key, $value, R_FIRST) ;
494 $status == 0 ;
495 $status = $x->seq($key, $value, R_NEXT) )
496 { print "$key -> $value\n" }
497
498 undef $x ;
499 untie %h ;
500
501 that prints:
502
503 Smith -> John
504 Wall -> Brick
505 Wall -> Brick
506 Wall -> Larry
507 mouse -> mickey
508
509 This time we have got all the key/value pairs, including the multiple
510 values associated with the key "Wall".
511
512 To make life easier when dealing with duplicate keys, DB_File comes
513 with a few utility methods.
514
515 The get_dup() Method
516 The "get_dup" method assists in reading duplicate values from BTREE
517 databases. The method can take the following forms:
518
519 $count = $x->get_dup($key) ;
520 @list = $x->get_dup($key) ;
521 %list = $x->get_dup($key, 1) ;
522
523 In a scalar context the method returns the number of values associated
524 with the key, $key.
525
526 In list context, it returns all the values which match $key. Note that
527 the values will be returned in an apparently random order.
528
529 In list context, if the second parameter is present and evaluates TRUE,
530 the method returns an associative array. The keys of the associative
531 array correspond to the values that matched in the BTREE and the values
532 of the array are a count of the number of times that particular value
533 occurred in the BTREE.
534
535 So assuming the database created above, we can use "get_dup" like this:
536
537 use warnings ;
538 use strict ;
539 use DB_File ;
540
541 my ($filename, $x, %h) ;
542
543 $filename = "tree" ;
544
545 # Enable duplicate records
546 $DB_BTREE->{'flags'} = R_DUP ;
547
548 $x = tie %h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_BTREE
549 or die "Cannot open $filename: $!\n";
550
551 my $cnt = $x->get_dup("Wall") ;
552 print "Wall occurred $cnt times\n" ;
553
554 my %hash = $x->get_dup("Wall", 1) ;
555 print "Larry is there\n" if $hash{'Larry'} ;
556 print "There are $hash{'Brick'} Brick Walls\n" ;
557
558 my @list = sort $x->get_dup("Wall") ;
559 print "Wall => [@list]\n" ;
560
561 @list = $x->get_dup("Smith") ;
562 print "Smith => [@list]\n" ;
563
564 @list = $x->get_dup("Dog") ;
565 print "Dog => [@list]\n" ;
566
567 and it will print:
568
569 Wall occurred 3 times
570 Larry is there
571 There are 2 Brick Walls
572 Wall => [Brick Brick Larry]
573 Smith => [John]
574 Dog => []
575
576 The find_dup() Method
577 $status = $X->find_dup($key, $value) ;
578
579 This method checks for the existence of a specific key/value pair. If
580 the pair exists, the cursor is left pointing to the pair and the method
581 returns 0. Otherwise the method returns a non-zero value.
582
583 Assuming the database from the previous example:
584
585 use warnings ;
586 use strict ;
587 use DB_File ;
588
589 my ($filename, $x, %h, $found) ;
590
591 $filename = "tree" ;
592
593 # Enable duplicate records
594 $DB_BTREE->{'flags'} = R_DUP ;
595
596 $x = tie %h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_BTREE
597 or die "Cannot open $filename: $!\n";
598
599 $found = ( $x->find_dup("Wall", "Larry") == 0 ? "" : "not") ;
600 print "Larry Wall is $found there\n" ;
601
602 $found = ( $x->find_dup("Wall", "Harry") == 0 ? "" : "not") ;
603 print "Harry Wall is $found there\n" ;
604
605 undef $x ;
606 untie %h ;
607
608 prints this
609
610 Larry Wall is there
611 Harry Wall is not there
612
613 The del_dup() Method
614 $status = $X->del_dup($key, $value) ;
615
616 This method deletes a specific key/value pair. It returns 0 if they
617 exist and have been deleted successfully. Otherwise the method returns
618 a non-zero value.
619
Again assuming the existence of the "tree" database:
621
622 use warnings ;
623 use strict ;
624 use DB_File ;
625
626 my ($filename, $x, %h, $found) ;
627
628 $filename = "tree" ;
629
630 # Enable duplicate records
631 $DB_BTREE->{'flags'} = R_DUP ;
632
633 $x = tie %h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_BTREE
634 or die "Cannot open $filename: $!\n";
635
636 $x->del_dup("Wall", "Larry") ;
637
638 $found = ( $x->find_dup("Wall", "Larry") == 0 ? "" : "not") ;
639 print "Larry Wall is $found there\n" ;
640
641 undef $x ;
642 untie %h ;
643
644 prints this
645
646 Larry Wall is not there
647
648 Matching Partial Keys
649 The BTREE interface has a feature which allows partial keys to be
650 matched. This functionality is only available when the "seq" method is
651 used along with the R_CURSOR flag.
652
653 $x->seq($key, $value, R_CURSOR) ;
654
655 Here is the relevant quote from the dbopen man page where it defines
656 the use of the R_CURSOR flag with seq:
657
658 Note, for the DB_BTREE access method, the returned key is not
659 necessarily an exact match for the specified key. The returned key
660 is the smallest key greater than or equal to the specified key,
661 permitting partial key matches and range searches.
662
663 In the example script below, the "match" sub uses this feature to find
664 and print the first matching key/value pair given a partial key.
665
666 use warnings ;
667 use strict ;
668 use DB_File ;
669 use Fcntl ;
670
671 my ($filename, $x, %h, $st, $key, $value) ;
672
673 sub match
674 {
675 my $key = shift ;
676 my $value = 0;
677 my $orig_key = $key ;
678 $x->seq($key, $value, R_CURSOR) ;
679 print "$orig_key\t-> $key\t-> $value\n" ;
680 }
681
682 $filename = "tree" ;
683 unlink $filename ;
684
685 $x = tie %h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_BTREE
686 or die "Cannot open $filename: $!\n";
687
688 # Add some key/value pairs to the file
689 $h{'mouse'} = 'mickey' ;
690 $h{'Wall'} = 'Larry' ;
691 $h{'Walls'} = 'Brick' ;
692 $h{'Smith'} = 'John' ;
693
694
695 $key = $value = 0 ;
696 print "IN ORDER\n" ;
697 for ($st = $x->seq($key, $value, R_FIRST) ;
698 $st == 0 ;
699 $st = $x->seq($key, $value, R_NEXT) )
700
701 { print "$key -> $value\n" }
702
703 print "\nPARTIAL MATCH\n" ;
704
705 match "Wa" ;
706 match "A" ;
707 match "a" ;
708
709 undef $x ;
710 untie %h ;
711
712 Here is the output:
713
714 IN ORDER
715 Smith -> John
716 Wall -> Larry
717 Walls -> Brick
718 mouse -> mickey
719
720 PARTIAL MATCH
721 Wa -> Wall -> Larry
722 A -> Smith -> John
723 a -> mouse -> mickey
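
The same feature can be used for the range searches mentioned in the
quote. The sketch below positions the cursor with R_CURSOR and then
walks forward with R_NEXT for as long as the keys still begin with the
given prefix. The prefix test using index() and the variable names are
choices made for this example; they are not part of DB_File.

```perl
use strict ;
use warnings ;
use DB_File ;

my %h ;
my $x = tie %h, "DB_File", undef, O_RDWR|O_CREAT, 0666, $DB_BTREE
    or die "Cannot create in-memory database: $!\n" ;

$h{'mouse'} = 'mickey' ;
$h{'Wall'}  = 'Larry' ;
$h{'Walls'} = 'Brick' ;
$h{'Smith'} = 'John' ;

my $prefix = "Wa" ;
my @matches ;

# R_CURSOR finds the smallest key greater than or equal to $prefix.
# R_NEXT then steps through the following keys in sorted order.
my ($key, $value) = ($prefix, "") ;
for (my $st = $x->seq($key, $value, R_CURSOR) ;
     $st == 0 && index($key, $prefix) == 0 ;
     $st = $x->seq($key, $value, R_NEXT) )
{
    push @matches, "$key -> $value" ;
}

print "$_\n" for @matches ;

undef $x ;
untie %h ;
```

With the data above this prints the records for "Wall" and "Walls" and
stops when the cursor reaches "mouse".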
724
DB_RECNO
726 DB_RECNO provides an interface to flat text files. Both variable and
727 fixed length records are supported.
728
729 In order to make RECNO more compatible with Perl, the array offset for
730 all RECNO arrays begins at 0 rather than 1 as in Berkeley DB.
731
732 As with normal Perl arrays, a RECNO array can be accessed using
733 negative indexes. The index -1 refers to the last element of the array,
734 -2 the second last, and so on. Attempting to access an element before
735 the start of the array will raise a fatal run-time error.
736
737 The 'bval' Option
738 The operation of the bval option warrants some discussion. Here is the
739 definition of bval from the Berkeley DB 1.85 recno manual page:
740
741 The delimiting byte to be used to mark the end of a
742 record for variable-length records, and the pad charac-
743 ter for fixed-length records. If no value is speci-
744 fied, newlines (``\n'') are used to mark the end of
745 variable-length records and fixed-length records are
746 padded with spaces.
747
748 The second sentence is wrong. In actual fact bval will only default to
749 "\n" when the openinfo parameter in dbopen is NULL. If a non-NULL
750 openinfo parameter is used at all, the value that happens to be in bval
751 will be used. That means you always have to specify bval when making
752 use of any of the options in the openinfo parameter. This documentation
753 error will be fixed in the next release of Berkeley DB.
754
That clarifies the situation with regard to Berkeley DB itself. What
about DB_File? Well, the behavior defined in the quote above is quite
useful, so DB_File conforms to it.
758
759 That means that you can specify other options (e.g. cachesize) and
760 still have bval default to "\n" for variable length records, and space
761 for fixed length records.
762
763 Also note that the bval option only allows you to specify a single byte
764 as a delimiter.
765
766 A Simple Example
767 Here is a simple example that uses RECNO (if you are using a version of
768 Perl earlier than 5.004_57 this example won't work -- see "Extra RECNO
769 Methods" for a workaround).
770
771 use warnings ;
772 use strict ;
773 use DB_File ;
774
775 my $filename = "text" ;
776 unlink $filename ;
777
778 my @h ;
779 tie @h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_RECNO
780 or die "Cannot open file 'text': $!\n" ;
781
782 # Add a few key/value pairs to the file
783 $h[0] = "orange" ;
784 $h[1] = "blue" ;
785 $h[2] = "yellow" ;
786
787 push @h, "green", "black" ;
788
789 my $elements = scalar @h ;
790 print "The array contains $elements entries\n" ;
791
792 my $last = pop @h ;
793 print "popped $last\n" ;
794
795 unshift @h, "white" ;
796 my $first = shift @h ;
797 print "shifted $first\n" ;
798
799 # Check for existence of a key
800 print "Element 1 Exists with value $h[1]\n" if $h[1] ;
801
802 # use a negative index
803 print "The last element is $h[-1]\n" ;
804 print "The 2nd last element is $h[-2]\n" ;
805
806 untie @h ;
807
808 Here is the output from the script:
809
810 The array contains 5 entries
811 popped black
812 shifted white
813 Element 1 Exists with value blue
814 The last element is green
815 The 2nd last element is yellow
816
817 Extra RECNO Methods
818 If you are using a version of Perl earlier than 5.004_57, the tied
819 array interface is quite limited. In the example script above "push",
820 "pop", "shift", "unshift" or determining the array length will not work
821 with a tied array.
822
823 To make the interface more useful for older versions of Perl, a number
824 of methods are supplied with DB_File to simulate the missing array
825 operations. All these methods are accessed via the object returned from
826 the tie call.
827
828 Here are the methods:
829
830 $X->push(list) ;
831 Pushes the elements of "list" to the end of the array.
832
833 $value = $X->pop ;
834 Removes and returns the last element of the array.
835
836 $X->shift
837 Removes and returns the first element of the array.
838
839 $X->unshift(list) ;
840 Pushes the elements of "list" to the start of the array.
841
842 $X->length
843 Returns the number of elements in the array.
844
845 $X->splice(offset, length, elements);
846 Returns a splice of the array.
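
A short sketch of these methods in use is shown below. The temporary
directory from File::Temp and the sample data are assumptions made for
this example. The splice call is expected to behave like Perl's
built-in splice and return the elements it removed.

```perl
use strict ;
use warnings ;
use DB_File ;
use File::Temp qw(tempdir) ;

# Keep the example file in a scratch directory that is removed at exit
my $dir = tempdir(CLEANUP => 1) ;
my $filename = "$dir/text" ;

my @h ;
my $X = tie @h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_RECNO
    or die "Cannot open $filename: $!\n" ;

$X->push("a", "b", "c", "d") ;
my $length = $X->length ;

# Replace two elements starting at offset 1 with a single new element
my @removed = $X->splice(1, 2, "X") ;

my $first = $X->shift ;
my $last  = $X->pop ;

# Collect whatever is left, using the length method for the upper bound
my @remaining = map { $h[$_] } 0 .. $X->length - 1 ;

print "length was $length\n" ;
print "removed [@removed]\n" ;
print "first [$first] last [$last]\n" ;
print "remaining [@remaining]\n" ;

undef $X ;
untie @h ;
```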
847
848 Another Example
849 Here is a more complete example that makes use of some of the methods
850 described above. It also makes use of the API interface directly (see
851 "THE API INTERFACE").
852
853 use warnings ;
854 use strict ;
855 my (@h, $H, $file, $i) ;
856 use DB_File ;
857 use Fcntl ;
858
859 $file = "text" ;
860
861 unlink $file ;
862
863 $H = tie @h, "DB_File", $file, O_RDWR|O_CREAT, 0666, $DB_RECNO
864 or die "Cannot open file $file: $!\n" ;
865
866 # first create a text file to play with
867 $h[0] = "zero" ;
868 $h[1] = "one" ;
869 $h[2] = "two" ;
870 $h[3] = "three" ;
871 $h[4] = "four" ;
872
873
874 # Print the records in order.
875 #
876 # The length method is needed here because evaluating a tied
877 # array in a scalar context does not return the number of
878 # elements in the array.
879
880 print "\nORIGINAL\n" ;
881 foreach $i (0 .. $H->length - 1) {
882 print "$i: $h[$i]\n" ;
883 }
884
885 # use the push & pop methods
886 $a = $H->pop ;
887 $H->push("last") ;
888 print "\nThe last record was [$a]\n" ;
889
890 # and the shift & unshift methods
891 $a = $H->shift ;
892 $H->unshift("first") ;
893 print "The first record was [$a]\n" ;
894
895 # Use the API to add a new record after record 2.
896 $i = 2 ;
897 $H->put($i, "Newbie", R_IAFTER) ;
898
899 # and a new record before record 1.
900 $i = 1 ;
901 $H->put($i, "New One", R_IBEFORE) ;
902
903 # delete record 3
904 $H->del(3) ;
905
906 # now print the records in reverse order
907 print "\nREVERSE\n" ;
908 for ($i = $H->length - 1 ; $i >= 0 ; -- $i)
909 { print "$i: $h[$i]\n" }
910
911 # same again, but use the API functions instead
912 print "\nREVERSE again\n" ;
913 my ($s, $k, $v) = (0, 0, 0) ;
914 for ($s = $H->seq($k, $v, R_LAST) ;
915 $s == 0 ;
916 $s = $H->seq($k, $v, R_PREV))
917 { print "$k: $v\n" }
918
919 undef $H ;
920 untie @h ;
921
922 and this is what it outputs:
923
924 ORIGINAL
925 0: zero
926 1: one
927 2: two
928 3: three
929 4: four
930
931 The last record was [four]
932 The first record was [zero]
933
934 REVERSE
935 5: last
936 4: three
937 3: Newbie
938 2: one
939 1: New One
940 0: first
941
942 REVERSE again
943 5: last
944 4: three
945 3: Newbie
946 2: one
947 1: New One
948 0: first
949
950 Notes:
951
1. Rather than iterating through the array, @h, like this:
953
954 foreach $i (@h)
955
956 it is necessary to use either this:
957
958 foreach $i (0 .. $H->length - 1)
959
960 or this:
961
        for ($a = $H->seq($k, $v, R_FIRST) ;
             $a == 0 ;
             $a = $H->seq($k, $v, R_NEXT) )
965
966 2. Notice that both times the "put" method was used the record index
967 was specified using a variable, $i, rather than the literal value
968 itself. This is because "put" will return the record number of the
969 inserted line via that parameter.
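
The point made in note 2 can be seen in the sketch below, which prints
the index variable after the call to "put". The temporary directory
from File::Temp and the sample records are assumptions made for this
example.

```perl
use strict ;
use warnings ;
use DB_File ;
use File::Temp qw(tempdir) ;

# Keep the example file in a scratch directory that is removed at exit
my $dir = tempdir(CLEANUP => 1) ;
my $file = "$dir/text" ;

my @h ;
my $H = tie @h, "DB_File", $file, O_RDWR|O_CREAT, 0666, $DB_RECNO
    or die "Cannot open file $file: $!\n" ;

$h[0] = "zero" ;
$h[1] = "one" ;
$h[2] = "two" ;

# Insert a new record after record 1. A variable is used for the
# index so that put can hand back the position of the new record.
my $i = 1 ;
my $status = $H->put($i, "Newbie", R_IAFTER) ;

my $inserted = $h[$i] ;
print "put returned $status, new record is at index $i: $inserted\n" ;

undef $H ;
untie @h ;
```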
970
THE API INTERFACE
972 As well as accessing Berkeley DB using a tied hash or array, it is also
973 possible to make direct use of most of the API functions defined in the
974 Berkeley DB documentation.
975
976 To do this you need to store a copy of the object returned from the
977 tie.
978
979 $db = tie %hash, "DB_File", "filename" ;
980
981 Once you have done that, you can access the Berkeley DB API functions
982 as DB_File methods directly like this:
983
984 $db->put($key, $value, R_NOOVERWRITE) ;
985
986 Important: If you have saved a copy of the object returned from "tie",
987 the underlying database file will not be closed until both the tied
988 variable is untied and all copies of the saved object are destroyed.
989
990 use DB_File ;
991 $db = tie %hash, "DB_File", "filename"
992 or die "Cannot tie filename: $!" ;
993 ...
994 undef $db ;
995 untie %hash ;
996
997 See "The untie() Gotcha" for more details.
998
All the functions defined in dbopen are available except for close()
and dbopen() itself. The DB_File method interface to the supported
functions has been implemented to mirror the way Berkeley DB works
whenever possible. In particular note that:
1003
1004 • The methods return a status value. All return 0 on success. All
1005 return -1 to signify an error and set $! to the exact error code.
1006 The return code 1 generally (but not always) means that the key
1007 specified did not exist in the database.
1008
1009 Other return codes are defined. See below and in the Berkeley DB
1010 documentation for details. The Berkeley DB documentation should be
1011 used as the definitive source.
1012
1013 • Whenever a Berkeley DB function returns data via one of its
1014 parameters, the equivalent DB_File method does exactly the same.
1015
1016 • If you are careful, it is possible to mix API calls with the tied
1017 hash/array interface in the same piece of code. Although only a
1018 few of the methods used to implement the tied interface currently
1019 make use of the cursor, you should always assume that the cursor
1020 has been changed any time the tied hash/array interface is used.
1021 As an example, this code will probably not do what you expect:
1022
1023 $X = tie %x, 'DB_File', $filename, O_RDWR|O_CREAT, 0777, $DB_BTREE
1024 or die "Cannot tie $filename: $!" ;
1025
1026 # Get the first key/value pair and set the cursor
1027 $X->seq($key, $value, R_FIRST) ;
1028
1029 # this line will modify the cursor
1030 $count = scalar keys %x ;
1031
1032 # Get the second key/value pair.
1033 # oops, it didn't, it got the last key/value pair!
1034 $X->seq($key, $value, R_NEXT) ;
1035
1036 The code above can be rearranged to get around the problem, like
1037 this:
1038
1039 $X = tie %x, 'DB_File', $filename, O_RDWR|O_CREAT, 0777, $DB_BTREE
1040 or die "Cannot tie $filename: $!" ;
1041
1042 # this line will modify the cursor
1043 $count = scalar keys %x ;
1044
1045 # Get the first key/value pair and set the cursor
1046 $X->seq($key, $value, R_FIRST) ;
1047
1048 # Get the second key/value pair.
1049 # worked this time.
1050 $X->seq($key, $value, R_NEXT) ;
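The status conventions described above can be illustrated with a minimal sketch. The file name is hypothetical and a DB_HASH database is assumed; the point is only that put and get report success, a missing or pre-existing key, and returned data in the way the list describes.

```perl
use strict;
use warnings;
use DB_File;

my %h;
my $file = "api_status_demo.db";   # hypothetical file name
unlink $file;

my $db = tie %h, 'DB_File', $file, O_CREAT|O_RDWR, 0666, $DB_HASH
    or die "Cannot tie $file: $!";

# The key is new, so put succeeds and returns 0.
my $first = $db->put("fruit", "apple", R_NOOVERWRITE);

# The key now exists, so with R_NOOVERWRITE put returns 1
# and the stored value is left alone.
my $second = $db->put("fruit", "pear", R_NOOVERWRITE);

# get hands the data back through its second parameter.
my $value;
my $hit = $db->get("fruit", $value);

# A key that is not in the database gives a status of 1.
my $unused;
my $miss = $db->get("no-such-key", $unused);

print "first=$first second=$second hit=$hit value=$value miss=$miss\n";

undef $db;
untie %h;
```

Checking the status against 0, 1 and -1 separately, rather than treating it as a plain boolean, keeps "key not found" distinct from a real error.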
1051
1052 All the constants defined in dbopen for use in the flags parameters in
1053 the methods defined below are also available. Refer to the Berkeley DB
1054 documentation for the precise meaning of the flags values.
1055
1056 Below is a list of the methods available.
1057
1058 $status = $X->get($key, $value [, $flags]) ;
1059 Given a key ($key) this method reads the value associated with it
1060 from the database. The value read from the database is returned in
1061 the $value parameter.
1062
1063 If the key does not exist the method returns 1.
1064
1065 No flags are currently defined for this method.
1066
1067 $status = $X->put($key, $value [, $flags]) ;
1068 Stores the key/value pair in the database.
1069
1070 If you use either the R_IAFTER or R_IBEFORE flags, the $key
1071 parameter will have the record number of the inserted key/value
1072 pair set.
1073
1074 Valid flags are R_CURSOR, R_IAFTER, R_IBEFORE, R_NOOVERWRITE and
1075 R_SETCURSOR.
1076
1077 $status = $X->del($key [, $flags]) ;
1078 Removes all key/value pairs with key $key from the database.
1079
1080 A return code of 1 means that the requested key was not in the
1081 database.
1082
1083 R_CURSOR is the only valid flag at present.
1084
1085 $status = $X->fd ;
1086 Returns the file descriptor for the underlying database.
1087
1088 See "Locking: The Trouble with fd" for an explanation of why you
1089 should not use "fd" to lock your database.
1090
1091 $status = $X->seq($key, $value, $flags) ;
1092 This interface allows sequential retrieval from the database. See
1093 dbopen for full details.
1094
1095 Both the $key and $value parameters will be set to the key/value
1096 pair read from the database.
1097
1098 The flags parameter is mandatory. The valid flag values are
1099 R_CURSOR, R_FIRST, R_LAST, R_NEXT and R_PREV.
1100
1101 $status = $X->sync([$flags]) ;
1102 Flushes any cached buffers to disk.
1103
1104 R_RECNOSYNC is the only valid flag at present.
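As a rough illustration of how seq, del and sync fit together, here is a short sketch. It assumes a DB_BTREE database with the default key ordering; the file name and the sample data are made up for the example.

```perl
use strict;
use warnings;
use DB_File;

my %h;
my $file = "api_seq_demo.db";      # hypothetical file name
unlink $file;

my $db = tie %h, 'DB_File', $file, O_CREAT|O_RDWR, 0666, $DB_BTREE
    or die "Cannot tie $file: $!";

my %stock = (pear => 3, apple => 7, fig => 1);
for my $name (keys %stock) {
    $db->put($name, $stock{$name}) == 0
        or die "put failed for $name: $!";
}

# Walk the database from the first record; seq fills in $key and
# $value on every call and returns non-zero when it runs out.
my ($key, $value) = ("", "");
my @seen;
for (my $status = $db->seq($key, $value, R_FIRST);
     $status == 0;
     $status = $db->seq($key, $value, R_NEXT)) {
    push @seen, "$key=$value";
}
print join(", ", @seen), "\n";

# del returns 0 when it removed something and 1 when the key was absent.
my $gone  = $db->del("fig");
my $again = $db->del("fig");

# Flush cached buffers to disk.
$db->sync == 0 or die "sync failed: $!";

undef $db;
untie %h;
```

Because the cursor is shared with the tied interface, the loop deliberately avoids touching %h while it is running.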
1105
1107 A DBM Filter is a piece of code that is used when you always want to
1108 make the same transformation to all keys and/or values in a DBM
1109 database. An example is when you need to encode your data in UTF-8
1110 before writing to the database and then decode the UTF-8 when reading
1111 from the database file.
1112
1113 There are two ways to use a DBM Filter.
1114
1115 1. Using the low-level API defined below.
1116
1117 2. Using the DBM_Filter module. This module hides the complexity of
1118 the API defined below and comes with a number of "canned" filters
1119 that cover some of the common use-cases.
1120
1121 Use of the DBM_Filter module is recommended.
1122
1123 DBM Filter Low-level API
1124 There are four methods associated with DBM Filters. All work
1125 identically, and each is used to install (or uninstall) a single DBM
1126 Filter. Each expects a single parameter, namely a reference to a sub.
1127 The only difference between them is the place that the filter is
1128 installed.
1129
1130 To summarise:
1131
1132 filter_store_key
1133 If a filter has been installed with this method, it will be
1134 invoked every time you write a key to a DBM database.
1135
1136 filter_store_value
1137 If a filter has been installed with this method, it will be
1138 invoked every time you write a value to a DBM database.
1139
1140 filter_fetch_key
1141 If a filter has been installed with this method, it will be
1142 invoked every time you read a key from a DBM database.
1143
1144 filter_fetch_value
1145 If a filter has been installed with this method, it will be
1146 invoked every time you read a value from a DBM database.
1147
1148 You can use any combination of the methods, from none, to all four.
1149
1150 All filter methods return the existing filter, if present, or "undef"
1151 if not.
1152
1153 To delete a filter pass "undef" to it.
1154
1155 The Filter
1156 When each filter is called by Perl, a local copy of $_ will contain the
1157 key or value to be filtered. Filtering is achieved by modifying the
1158 contents of $_. The return code from the filter is ignored.
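The following sketch shows one way the points above might look in practice: a filter that edits $_ in place, the value returned when a filter is installed, and removal by passing "undef". The file name and the lower-casing filter are illustrative assumptions only.

```perl
use strict;
use warnings;
use DB_File;

my %h;
my $file = "filter_swap_demo.db";  # hypothetical file name
unlink $file;

my $db = tie %h, 'DB_File', $file, O_CREAT|O_RDWR, 0666, $DB_HASH
    or die "Cannot tie $file: $!";

# Install a store-key filter. Nothing was installed before,
# so the method returns undef.
my $old = $db->filter_store_key(sub { $_ = lc $_ });

$h{"ABC"} = 1;                     # the key is written as "abc"

# Passing undef removes the filter and returns the one just removed.
my $prev = $db->filter_store_key(undef);

$h{"XYZ"} = 2;                     # the key is written unchanged

# No fetch filter is installed, so these are the keys as stored.
my @keys = sort keys %h;
print "stored keys: @keys\n";

undef $db;
untie %h;
```

Keeping the returned filter around makes it possible to reinstall it later by passing it back to the same method.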
1159
1160 An Example -- the NULL termination problem.
1161 Consider the following scenario. You have a DBM database that you need
1162 to share with a third-party C application. The C application assumes
1163 that all keys and values are NULL terminated. Unfortunately when Perl
1164 writes to DBM databases it doesn't use NULL termination, so your Perl
1165 application will have to manage NULL termination itself. When you write
1166 to the database you will have to use something like this:
1167
1168 $hash{"$key\0"} = "$value\0" ;
1169
1170 Similarly the NULL needs to be taken into account when you are
1171 considering the length of existing keys/values.
1172
1173 It would be much better if you could ignore the NULL termination issue
1174 in the main application code and have a mechanism that automatically
1175 added the terminating NULL to all keys and values whenever you write to
1176 the database and have them removed when you read from the database. As
1177 I'm sure you have already guessed, this is a problem that DBM Filters
1178 can fix very easily.
1179
1180 use warnings ;
1181 use strict ;
1182 use DB_File ;
1183
1184 my %hash ;
1185 my $filename = "filt" ;
1186 unlink $filename ;
1187
1188 my $db = tie %hash, 'DB_File', $filename, O_CREAT|O_RDWR, 0666, $DB_HASH
1189 or die "Cannot open $filename: $!\n" ;
1190
1191 # Install DBM Filters
1192 $db->filter_fetch_key ( sub { s/\0$// } ) ;
1193 $db->filter_store_key ( sub { $_ .= "\0" } ) ;
1194 $db->filter_fetch_value( sub { s/\0$// } ) ;
1195 $db->filter_store_value( sub { $_ .= "\0" } ) ;
1196
1197 $hash{"abc"} = "def" ;
1198 my $a = $hash{"ABC"} ;
1199 # ...
1200 undef $db ;
1201 untie %hash ;
1202
1203 Hopefully the contents of each of the filters should be self-
1204 explanatory. Both "fetch" filters remove the terminating NULL, and both
1205 "store" filters add a terminating NULL.
1206
1207 Another Example -- Key is a C int.
1208 Here is another real-life example. By default, whenever Perl writes to
1209 a DBM database it always writes the key and value as strings. So when
1210 you use this:
1211
1212 $hash{12345} = "something" ;
1213
1214 the key 12345 will get stored in the DBM database as the 5 byte string
1215 "12345". If you actually want the key to be stored in the DBM database
1216 as a C int, you will have to use "pack" when writing, and "unpack" when
1217 reading.
1218
1219 Here is a DBM Filter that does it:
1220
1221 use warnings ;
1222 use strict ;
1223 use DB_File ;
1224 my %hash ;
1225 my $filename = "filt" ;
1226 unlink $filename ;
1227
1228
1229 my $db = tie %hash, 'DB_File', $filename, O_CREAT|O_RDWR, 0666, $DB_HASH
1230 or die "Cannot open $filename: $!\n" ;
1231
1232 $db->filter_fetch_key ( sub { $_ = unpack("i", $_) } ) ;
1233 $db->filter_store_key ( sub { $_ = pack ("i", $_) } ) ;
1234 $hash{123} = "def" ;
1235 # ...
1236 undef $db ;
1237 untie %hash ;
1238
1239 This time only two filters have been used -- we only need to manipulate
1240 the contents of the key, so it wasn't necessary to install any value
1241 filters.
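To see what the two key filters do to a single key, the transformation can be tried outside the database. This is only a sketch of the "pack"/"unpack" round trip; the variable names are made up for the example.

```perl
use strict;
use warnings;

my $key = 12345;

my $as_string = "$key";            # what Perl stores by default
my $as_int    = pack("i", $key);   # native signed int, as a C program sees it

my $string_len = length $as_string;       # number of digits in the key
my $int_len    = length $as_int;          # sizeof(int) on this platform
my $round_trip = unpack("i", $as_int);    # what the fetch filter hands back

print "string key: $string_len bytes, packed key: $int_len bytes, ",
      "round trip: $round_trip\n";
```

Note that the "i" template uses the native integer size and byte order, so a database written this way is tied to the architecture that wrote it.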
1242
1244 Locking: The Trouble with fd
1245 Until version 1.72 of this module, the recommended technique for
1246 locking DB_File databases was to flock a filehandle duplicated from the
1247 file descriptor returned by the "fd" method. Unfortunately this technique
1248 has been shown to be fundamentally flawed (Kudos to David Harris for
1249 tracking this down). Use it at your own peril!
1250
1251 The locking technique went like this.
1252
1253 $db = tie(%db, 'DB_File', 'foo.db', O_CREAT|O_RDWR, 0644)
1254 || die "dbcreat foo.db $!";
1255 $fd = $db->fd;
1256 open(DB_FH, "+<&=$fd") || die "dup $!";
1257 flock (DB_FH, LOCK_EX) || die "flock: $!";
1258 ...
1259 $db{"Tom"} = "Jerry" ;
1260 ...
1261 flock(DB_FH, LOCK_UN);
1262 undef $db;
1263 untie %db;
1264 close(DB_FH);
1265
1266 In simple terms, this is what happens:
1267
1268 1. Use "tie" to open the database.
1269
1270 2. Lock the database with fd & flock.
1271
1272 3. Read & Write to the database.
1273
1274 4. Unlock and close the database.
1275
1276 Here is the crux of the problem. A side-effect of opening the DB_File
1277 database in step 1 is that an initial block from the database will get
1278 read from disk and cached in memory.
1279
1280 To see why this is a problem, consider what can happen when two
1281 processes, say "A" and "B", both want to update the same DB_File
1282 database using the locking steps outlined above. Assume process "A" has
1283 already opened the database and has a write lock, but it hasn't
1284 actually updated the database yet (it has finished step 2, but not
1285 started step 3 yet). Now process "B" tries to open the same database -
1286 step 1 will succeed, but it will block on step 2 until process "A"
1287 releases the lock. The important thing to notice here is that at this
1288 point in time both processes will have cached identical initial blocks
1289 from the database.
1290
1291 Now process "A" updates the database and happens to change some of the
1292 data held in the initial buffer. Process "A" terminates, flushing all
1293 cached data to disk and releasing the database lock. At this point the
1294 database on disk will correctly reflect the changes made by process
1295 "A".
1296
1297 With the lock released, process "B" can now continue. It also updates
1298 the database and unfortunately it too modifies the data that was in its
1299 initial buffer. Once that data gets flushed to disk it will overwrite
1300 some/all of the changes process "A" made to the database.
1301
1302 The result of this scenario is at best a database that doesn't contain
1303 what you expect. At worst the database will corrupt.
1304
1305 The above won't happen every time competing processes update the same
1306 DB_File database, but it does illustrate why the technique should not
1307 be used.
1308
1309 Safe ways to lock a database
1310 Starting with version 2.x, Berkeley DB has internal support for
1311 locking. The companion module to this one, BerkeleyDB
1312 <https://metacpan.org/pod/BerkeleyDB>, provides an interface to this
1313 locking functionality. If you are serious about locking Berkeley DB
1314 databases, I strongly recommend using BerkeleyDB
1315 <https://metacpan.org/pod/BerkeleyDB>.
1316
1317 If using BerkeleyDB <https://metacpan.org/pod/BerkeleyDB> isn't an
1318 option, there are a number of modules available on CPAN that can be
1319 used to implement locking. Each one implements locking differently and
1320 has different goals in mind. It is therefore worth knowing the
1321 difference, so that you can pick the right one for your application.
1322 Here are the three locking wrappers:
1323
1324 Tie::DB_Lock
1325 A DB_File wrapper which creates copies of the database file for
1326 read access, so that you have a kind of a multiversioning
1327 concurrent read system. However, updates are still serial. Use for
1328 databases where reads may be lengthy and consistency problems may
1329 occur.
1330
1331 Tie::DB_LockFile
1332 A DB_File wrapper that has the ability to lock and unlock the
1333 database while it is being used. Avoids the tie-before-flock
1334 problem by simply re-tie-ing the database when you get or drop a
1335 lock. Because of the flexibility in dropping and re-acquiring the
1336 lock in the middle of a session, this can be massaged into a
1337 system that will work with long updates and/or reads if the
1338 application follows the hints in the POD documentation.
1339
1340 DB_File::Lock
1341 An extremely lightweight DB_File wrapper that simply flocks a
1342 lockfile before tie-ing the database and drops the lock after the
1343 untie. Allows one to use the same lockfile for multiple databases
1344 to avoid deadlock problems, if desired. Use for databases where
1345 updates and reads are quick and simple flock locking semantics are
1346 enough.
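The lockfile approach described for the last wrapper can be sketched by hand, without any of the wrapper modules. This is not the code of DB_File::Lock; it is a minimal illustration of the same ordering (lock, tie, work, untie, unlock), and the file names are hypothetical.

```perl
use strict;
use warnings;
use DB_File;
use Fcntl qw(:flock);

my $dbfile   = "lock_demo.db";     # hypothetical file names
my $lockfile = "$dbfile.lock";

# Take the lock on a separate file BEFORE the database is opened, so no
# block of the database is cached until the lock is held.
open(my $lock_fh, '>>', $lockfile)
    or die "Cannot open $lockfile: $!";
flock($lock_fh, LOCK_EX)
    or die "Cannot lock $lockfile: $!";

my %db;
tie %db, 'DB_File', $dbfile, O_CREAT|O_RDWR, 0644, $DB_HASH
    or die "Cannot tie $dbfile: $!";

$db{"Tom"} = "Jerry";
my $seen = $db{"Tom"};

# Close the database first so everything is flushed, then drop the lock.
untie %db;
flock($lock_fh, LOCK_UN);
close($lock_fh);

print "Tom => $seen\n";
```

Every process that touches the database has to follow the same protocol; flock is advisory, so a process that skips the lockfile is not held back.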
1347
1348 Sharing Databases With C Applications
1349 There is no technical reason why a Berkeley DB database cannot be
1350 shared by both a Perl and a C application.
1351
1352 The vast majority of problems that are reported in this area boil down
1353 to the fact that C strings are NULL terminated, whilst Perl strings are
1354 not. See "DBM FILTERS" for a generic way to work around this problem.
1355
1356 Here is a real example. Netscape 2.0 keeps a record of the locations
1357 you visit along with the time you last visited them in a DB_HASH
1358 database. This is usually stored in the file ~/.netscape/history.db.
1359 The key field in the database is the location string and the value
1360 field is the time the location was last visited stored as a 4 byte
1361 binary value.
1362
1363 If you haven't already guessed, the location string is stored with a
1364 terminating NULL. This means you need to be careful when accessing the
1365 database.
1366
1367 Here is a snippet of code that is loosely based on Tom Christiansen's
1368 ggh script (available from your nearest CPAN archive in
1369 authors/id/TOMC/scripts/nshist.gz).
1370
1371 use warnings ;
1372 use strict ;
1373 use DB_File ;
1374 use Fcntl ;
1375
1376 my ($dotdir, $HISTORY, %hist_db, $href, $binary_time, $date) ;
1377 $dotdir = $ENV{HOME} || $ENV{LOGNAME};
1378
1379 $HISTORY = "$dotdir/.netscape/history.db";
1380
1381 tie %hist_db, 'DB_File', $HISTORY
1382 or die "Cannot open $HISTORY: $!\n" ;
1383
1384 # Dump the complete database
1385 while ( ($href, $binary_time) = each %hist_db ) {
1386
1387 # remove the terminating NULL
1388 $href =~ s/\x00$// ;
1389
1390 # convert the binary time into a user friendly string
1391 $date = localtime unpack("V", $binary_time);
1392 print "$date $href\n" ;
1393 }
1394
1395 # check for the existence of a specific key
1396 # remember to add the NULL
1397 if ( $binary_time = $hist_db{"http://mox.perl.com/\x00"} ) {
1398 $date = localtime unpack("V", $binary_time) ;
1399 print "Last visited mox.perl.com on $date\n" ;
1400 }
1401 else {
1402 print "Never visited mox.perl.com\n"
1403 }
1404
1405 untie %hist_db ;
1406
1407 The untie() Gotcha
1408 If you make use of the Berkeley DB API, it is very strongly recommended
1409 that you read "The untie Gotcha" in perltie.
1410
1411 Even if you don't currently make use of the API interface, it is still
1412 worth reading it.
1413
1414 Here is an example which illustrates the problem from a DB_File
1415 perspective:
1416
1417 use DB_File ;
1418 use Fcntl ;
1419
1420 my %x ;
1421 my $X ;
1422
1423 $X = tie %x, 'DB_File', 'tst.fil' , O_RDWR|O_TRUNC
1424 or die "Cannot tie first time: $!" ;
1425
1426 $x{123} = 456 ;
1427
1428 untie %x ;
1429
1430 tie %x, 'DB_File', 'tst.fil' , O_RDWR|O_CREAT
1431 or die "Cannot tie second time: $!" ;
1432
1433 untie %x ;
1434
1435 When run, the script will produce this error message:
1436
1437 Cannot tie second time: Invalid argument at bad.file line 14.
1438
1439 Although the error message above refers to the second tie() statement
1440 in the script, the source of the problem is really with the untie()
1441 statement that precedes it.
1442
1443 Having read perltie you will probably have already guessed that the
1444 error is caused by the extra copy of the tied object stored in $X. If
1445 you haven't, then the problem boils down to the fact that the DB_File
1446 destructor, DESTROY, will not be called until all references to the
1447 tied object are destroyed. Both the tied variable, %x, and $X above
1448 hold a reference to the object. The call to untie() will destroy the
1449 first, but $X still holds a valid reference, so the destructor will not
1450 get called and the database file tst.fil will remain open. The fact
1451 that Berkeley DB then reports the attempt to open a database that is
1452 already open via the catch-all "Invalid argument" doesn't help.
1453
1454 If you run the script with the "-w" flag the error message becomes:
1455
1456 untie attempted while 1 inner references still exist at bad.file line 12.
1457 Cannot tie second time: Invalid argument at bad.file line 14.
1458
1459 which pinpoints the real problem. Finally the script can now be
1460 modified to fix the original problem by destroying the API object
1461 before the untie:
1462
1463 ...
1464 $x{123} = 456 ;
1465
1466 undef $X ;
1467 untie %x ;
1468
1469 $X = tie %x, 'DB_File', 'tst.fil' , O_RDWR|O_CREAT
1470 ...
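Another way to avoid the stray reference, sketched here as one possibility, is to keep the API object in a lexical variable that goes out of scope before the untie. The file name is hypothetical, and O_CREAT is used on both ties so that the sketch also runs when the file does not exist yet.

```perl
use strict;
use warnings;
use DB_File;

my %x;
my $file = "untie_scope_demo.db";  # hypothetical file name
unlink $file;

{
    # $X exists only inside this block.
    my $X = tie %x, 'DB_File', $file, O_RDWR|O_CREAT, 0666, $DB_HASH
        or die "Cannot tie first time: $!";

    $X->put("123", "456") == 0
        or die "put failed: $!";
}

# No saved copy of the object is left, so untie can close the file.
untie %x;

tie %x, 'DB_File', $file, O_RDWR|O_CREAT, 0666, $DB_HASH
    or die "Cannot tie second time: $!";

my $value = $x{123};
untie %x;

print "123 => $value\n";
```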
1471
1473 Why is there Perl source in my database?
1474 If you look at the contents of a database file created by DB_File,
1475 there can sometimes be part of a Perl script included in it.
1476
1477 This happens because Berkeley DB uses dynamic memory to allocate
1478 buffers which will subsequently be written to the database file. Being
1479 dynamic, the memory could have been used for anything before DB
1480 malloced it. As Berkeley DB doesn't clear the memory once it has been
1481 allocated, the unused portions will contain random junk. In the case
1482 where a Perl script gets written to the database, the random junk will
1483 correspond to an area of dynamic memory that happened to be used during
1484 the compilation of the script.
1485
1486 Unless you don't like the possibility of there being part of your Perl
1487 scripts embedded in a database file, this is nothing to worry about.
1488
1489 How do I store complex data structures with DB_File?
1490 Although DB_File cannot do this directly, there is a module which can
1491 layer transparently over DB_File to accomplish this feat.
1492
1493 Check out the MLDBM module, available on CPAN in the directory
1494 modules/by-module/MLDBM.
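For small cases, a similar effect can be approximated without MLDBM by serialising values in a pair of DBM Filters. The sketch below does not show how MLDBM works; it swaps in the core Storable module, assumes every value stored is a reference, and uses a hypothetical file name.

```perl
use strict;
use warnings;
use DB_File;
use Storable qw(freeze thaw);

my %h;
my $file = "complex_demo.db";      # hypothetical file name
unlink $file;

my $db = tie %h, 'DB_File', $file, O_CREAT|O_RDWR, 0666, $DB_HASH
    or die "Cannot tie $file: $!";

# Serialise on the way in, rebuild the structure on the way out.
# freeze expects a reference, so plain scalars cannot be stored here.
$db->filter_store_value(sub { $_ = freeze($_) });
$db->filter_fetch_value(sub { $_ = thaw($_) });

$h{user} = { name => "Tom", friends => ["Jerry"] };

# Each fetch returns a fresh copy of the stored structure.
my $rec = $h{user};
print "$rec->{name} knows $rec->{friends}[0]\n";

undef $db;
untie %h;
```

Changing a nested element of a fetched copy does not update the database; the whole structure has to be assigned back to its key.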
1495
1496 What does "wide character in subroutine entry" mean?
1497 You will usually get this message if you are working with UTF-8 data
1498 and want to read/write it from/to a Berkeley DB database file.
1499
1500 The easiest way to deal with this issue is to use the pre-defined "utf8"
1501 DBM_Filter (see DBM_Filter) that was designed to deal with this
1502 situation.
1503
1504 The example below shows what you need if both the key and value are
1505 expected to be in UTF-8.
1506
1507 use DB_File;
1508 use DBM_Filter;
1509
1510 my $db = tie %h, 'DB_File', '/tmp/try.db', O_CREAT|O_RDWR, 0666, $DB_BTREE;
1511 $db->Filter_Key_Push('utf8');
1512 $db->Filter_Value_Push('utf8');
1513
1514 my $key = "\N{LATIN SMALL LETTER A WITH ACUTE}";
1515 my $value = "\N{LATIN SMALL LETTER E WITH ACUTE}";
1516 $h{ $key } = $value;
1517
1518 What does "Invalid Argument" mean?
1519 You will get this error message when one of the parameters in the "tie"
1520 call is wrong. Unfortunately there are quite a few parameters to get
1521 wrong, so it can be difficult to figure out which one it is.
1522
1523 Here are a couple of possibilities:
1524
1525 1. Attempting to reopen a database without closing it.
1526
1527 2. Using the O_WRONLY flag.
1528
1529 What does "Bareword 'DB_File' not allowed" mean?
1530 You will encounter this particular error message when you have the
1531 "strict 'subs'" pragma (or the full strict pragma) in your script.
1532 Consider this script:
1533
1534 use warnings ;
1535 use strict ;
1536 use DB_File ;
1537 my %x ;
1538 tie %x, DB_File, "filename" ;
1539
1540 Running it produces the error in question:
1541
1542 Bareword "DB_File" not allowed while "strict subs" in use
1543
1544 To get around the error, place the word "DB_File" in either single or
1545 double quotes, like this:
1546
1547 tie %x, "DB_File", "filename" ;
1548
1549 Although it might seem like a real pain, it is really worth the effort
1550 of having a "use strict" in all your scripts.
1551
1553 Articles that are either about DB_File or make use of it.
1554
1555 1. Full-Text Searching in Perl, Tim Kientzle (tkientzle@ddj.com), Dr.
1556 Dobb's Journal, Issue 295, January 1999, pp 34-41
1557
1559 Moved to the Changes file.
1560
1562 Some older versions of Berkeley DB had problems with fixed length
1563 records using the RECNO file format. This problem has been fixed since
1564 version 1.85 of Berkeley DB.
1565
1566 I am sure there are bugs in the code. If you do find any, or can
1567 suggest any enhancements, I would welcome your comments.
1568
1570 General feedback/questions/bug reports should be sent to
1571 <https://github.com/pmqs/DB_File/issues> (preferred) or
1572 <https://rt.cpan.org/Public/Dist/Display.html?Name=DB_File>.
1573
1575 DB_File comes with the standard Perl source distribution. Look in the
1576 directory ext/DB_File. Because of the time between releases of Perl,
1577 the version that ships with Perl is quite likely to be out of date.
1578 The most recent version can always be found on CPAN (see
1579 "CPAN" in perlmodlib for details), in the directory
1580 modules/by-module/DB_File.
1581
1582 DB_File is designed to work with any version of Berkeley DB, but is
1583 limited to the functionality provided by version 1. If you want to make
1584 use of the new features available in Berkeley DB 2.x, or greater, use
1585 the Perl module BerkeleyDB <https://metacpan.org/pod/BerkeleyDB>
1586 instead.
1587
1588 The official web site for Berkeley DB is
1589 <http://www.oracle.com/technology/products/berkeley-db/db/index.html>.
1590 All versions of Berkeley DB are available there.
1591
1592 Alternatively, Berkeley DB version 1 is available at your nearest CPAN
1593 archive in src/misc/db.1.85.tar.gz.
1594
1596 Copyright (c) 1995-2023 Paul Marquess. All rights reserved. This
1597 program is free software; you can redistribute it and/or modify it
1598 under the same terms as Perl itself.
1599
1600 Although DB_File is covered by the Perl license, the library it makes
1601 use of, namely Berkeley DB, is not. Berkeley DB has its own copyright
1602 and its own license. See AGPL
1603 <https://www.oracle.com/downloads/licenses/berkeleydb-oslicense.html>
1604 for more details. Please take the time to read the Berkeley DB license
1605 and decide how it impacts your use of this Perl module.
1606
1608 perl, dbopen(3), hash(3), recno(3), btree(3), perldbmfilter, DBM_Filter
1609
1611 The DB_File interface was written by Paul Marquess <pmqs@cpan.org>.
1612
1613
1614
1615perl v5.38.0 2023-08-23 DB_File(3)