DB_File(3)            User Contributed Perl Documentation            DB_File(3)

NAME
DB_File - Perl5 access to Berkeley DB version 1.x

SYNOPSIS
use DB_File;
10
11 [$X =] tie %hash, 'DB_File', [$filename, $flags, $mode, $DB_HASH] ;
12 [$X =] tie %hash, 'DB_File', $filename, $flags, $mode, $DB_BTREE ;
13 [$X =] tie @array, 'DB_File', $filename, $flags, $mode, $DB_RECNO ;
14
15 $status = $X->del($key [, $flags]) ;
16 $status = $X->put($key, $value [, $flags]) ;
17 $status = $X->get($key, $value [, $flags]) ;
18 $status = $X->seq($key, $value, $flags) ;
19 $status = $X->sync([$flags]) ;
20 $status = $X->fd ;
21
22 # BTREE only
23 $count = $X->get_dup($key) ;
24 @list = $X->get_dup($key) ;
25 %list = $X->get_dup($key, 1) ;
26 $status = $X->find_dup($key, $value) ;
27 $status = $X->del_dup($key, $value) ;
28
29 # RECNO only
30 $a = $X->length;
31 $a = $X->pop ;
32 $X->push(list);
33 $a = $X->shift;
34 $X->unshift(list);
35 @r = $X->splice(offset, length, elements);
36
37 # DBM Filters
38 $old_filter = $db->filter_store_key ( sub { ... } ) ;
39 $old_filter = $db->filter_store_value( sub { ... } ) ;
40 $old_filter = $db->filter_fetch_key ( sub { ... } ) ;
41 $old_filter = $db->filter_fetch_value( sub { ... } ) ;
42
43 untie %hash ;
44 untie @array ;
45
DESCRIPTION
DB_File is a module which allows Perl programs to make use of the
48 facilities provided by Berkeley DB version 1.x (if you have a newer
49 version of DB, see "Using DB_File with Berkeley DB version 2 or
50 greater"). It is assumed that you have a copy of the Berkeley DB
51 manual pages at hand when reading this documentation. The interface
52 defined here mirrors the Berkeley DB interface closely.
53
54 Berkeley DB is a C library which provides a consistent interface to a
55 number of database formats. DB_File provides an interface to all three
56 of the database types currently supported by Berkeley DB.
57
58 The file types are:
59
60 DB_HASH
61 This database type allows arbitrary key/value pairs to be stored
62 in data files. This is equivalent to the functionality provided by
63 other hashing packages like DBM, NDBM, ODBM, GDBM, and SDBM.
64 Remember though, the files created using DB_HASH are not
65 compatible with any of the other packages mentioned.
66
67 A default hashing algorithm, which will be adequate for most
68 applications, is built into Berkeley DB. If you do need to use
69 your own hashing algorithm it is possible to write your own in
70 Perl and have DB_File use it instead.
71
72 DB_BTREE
The btree format allows arbitrary key/value pairs to be stored in
a sorted, balanced tree (a B-tree).
75
76 As with the DB_HASH format, it is possible to provide a user
77 defined Perl routine to perform the comparison of keys. By
78 default, though, the keys are stored in lexical order.
79
80 DB_RECNO
81 DB_RECNO allows both fixed-length and variable-length flat text
82 files to be manipulated using the same key/value pair interface as
83 in DB_HASH and DB_BTREE. In this case the key will consist of a
84 record (line) number.
85
86 Using DB_File with Berkeley DB version 2 or greater
87 Although DB_File is intended to be used with Berkeley DB version 1, it
88 can also be used with version 2, 3 or 4. In this case the interface is
89 limited to the functionality provided by Berkeley DB 1.x. Anywhere the
90 version 2 or greater interface differs, DB_File arranges for it to work
91 like version 1. This feature allows DB_File scripts that were built
92 with version 1 to be migrated to version 2 or greater without any
93 changes.
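
Because the same script can run against different library versions, it can be useful to report which one is in use. The sketch below assumes the package variable $DB_File::db_version, which is not described in the text above; it is set by DB_File when the module is loaded and holds the version of the Berkeley DB library the module was built against.

```perl
use strict;
use warnings;
use DB_File;

# Assumption: $DB_File::db_version is populated when DB_File loads,
# for example "1" for Berkeley DB 1.x or "5.3" for a later release.
my $version = $DB_File::db_version;
print "DB_File was built against Berkeley DB version $version\n";

if ($version >= 2) {
    # Only the version 1 interface is available through DB_File.
    print "The interface is limited to Berkeley DB 1.x functionality\n";
}
```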
94
95 If you want to make use of the new features available in Berkeley DB
96 2.x or greater, use the Perl module BerkeleyDB instead.
97
98 Note: The database file format has changed multiple times in Berkeley
99 DB version 2, 3 and 4. If you cannot recreate your databases, you must
100 dump any existing databases with either the "db_dump" or the
101 "db_dump185" utility that comes with Berkeley DB. Once you have
102 rebuilt DB_File to use Berkeley DB version 2 or greater, your databases
103 can be recreated using "db_load". Refer to the Berkeley DB
104 documentation for further details.
105
106 Please read "COPYRIGHT" before using version 2.x or greater of Berkeley
107 DB with DB_File.
108
109 Interface to Berkeley DB
110 DB_File allows access to Berkeley DB files using the tie() mechanism in
111 Perl 5 (for full details, see "tie()" in perlfunc). This facility
112 allows DB_File to access Berkeley DB files using either an associative
113 array (for DB_HASH & DB_BTREE file types) or an ordinary array (for the
114 DB_RECNO file type).
115
116 In addition to the tie() interface, it is also possible to access most
117 of the functions provided in the Berkeley DB API directly. See "THE
118 API INTERFACE".
119
120 Opening a Berkeley DB Database File
121 Berkeley DB uses the function dbopen() to open or create a database.
122 Here is the C prototype for dbopen():
123
124 DB*
125 dbopen (const char * file, int flags, int mode,
126 DBTYPE type, const void * openinfo)
127
The parameter "type" is an enumeration which specifies which of the 3
interface methods (DB_HASH, DB_BTREE or DB_RECNO) is to be used.
Depending on which of these is chosen, the final parameter,
"openinfo", points to a data structure which allows tailoring of the
specific interface method.
133
134 This interface is handled slightly differently in DB_File. Here is an
135 equivalent call using DB_File:
136
137 tie %array, 'DB_File', $filename, $flags, $mode, $DB_HASH ;
138
139 The "filename", "flags" and "mode" parameters are the direct equivalent
140 of their dbopen() counterparts. The final parameter $DB_HASH performs
141 the function of both the "type" and "openinfo" parameters in dbopen().
142
143 In the example above $DB_HASH is actually a pre-defined reference to a
144 hash object. DB_File has three of these pre-defined references. Apart
145 from $DB_HASH, there is also $DB_BTREE and $DB_RECNO.
146
The keys allowed in each of these pre-defined references are limited to
the names used in the equivalent C structure. So, for example, the
$DB_HASH reference will only allow keys called "bsize", "cachesize",
"ffactor", "hash", "lorder" and "nelem".
151
152 To change one of these elements, just assign to it like this:
153
154 $DB_HASH->{'cachesize'} = 10000 ;
155
156 The three predefined variables $DB_HASH, $DB_BTREE and $DB_RECNO are
157 usually adequate for most applications. If you do need to create extra
158 instances of these objects, constructors are available for each file
159 type.
160
161 Here are examples of the constructors and the valid options available
162 for DB_HASH, DB_BTREE and DB_RECNO respectively.
163
164 $a = new DB_File::HASHINFO ;
165 $a->{'bsize'} ;
166 $a->{'cachesize'} ;
167 $a->{'ffactor'};
168 $a->{'hash'} ;
169 $a->{'lorder'} ;
170 $a->{'nelem'} ;
171
172 $b = new DB_File::BTREEINFO ;
173 $b->{'flags'} ;
174 $b->{'cachesize'} ;
175 $b->{'maxkeypage'} ;
176 $b->{'minkeypage'} ;
177 $b->{'psize'} ;
178 $b->{'compare'} ;
179 $b->{'prefix'} ;
180 $b->{'lorder'} ;
181
182 $c = new DB_File::RECNOINFO ;
183 $c->{'bval'} ;
184 $c->{'cachesize'} ;
185 $c->{'psize'} ;
186 $c->{'flags'} ;
187 $c->{'lorder'} ;
188 $c->{'reclen'} ;
189 $c->{'bfname'} ;
190
The values stored in the hashes above are mostly the direct equivalents
of their C counterparts. Like their C counterparts, all are set to
default values, which means you don't have to set all of the values
when you only want to change one. Here is an example:
195
196 $a = new DB_File::HASHINFO ;
197 $a->{'cachesize'} = 12345 ;
198 tie %y, 'DB_File', "filename", $flags, 0777, $a ;
199
200 A few of the options need extra discussion here. When used, the C
201 equivalent of the keys "hash", "compare" and "prefix" store pointers to
202 C functions. In DB_File these keys are used to store references to Perl
203 subs. Below are templates for each of the subs:
204
205 sub hash
206 {
207 my ($data) = @_ ;
208 ...
209 # return the hash value for $data
210 return $hash ;
211 }
212
    sub compare
    {
        my ($key1, $key2) = @_ ;
        ...
        # return  0 if $key1 eq $key2
        #        -1 if $key1 lt $key2
        #         1 if $key1 gt $key2
        return $result ;    # one of -1, 0 or 1
    }

    sub prefix
    {
        my ($key1, $key2) = @_ ;
        ...
        # return number of bytes of $key2 which are
        # necessary to determine that it is greater than $key1
        return $bytes ;
    }
231
232 See "Changing the BTREE sort order" for an example of using the
233 "compare" template.
234
235 If you are using the DB_RECNO interface and you intend making use of
236 "bval", you should check out "The 'bval' Option".
237
238 Default Parameters
239 It is possible to omit some or all of the final 4 parameters in the
240 call to "tie" and let them take default values. As DB_HASH is the most
241 common file format used, the call:
242
243 tie %A, "DB_File", "filename" ;
244
245 is equivalent to:
246
247 tie %A, "DB_File", "filename", O_CREAT|O_RDWR, 0666, $DB_HASH ;
248
249 It is also possible to omit the filename parameter as well, so the
250 call:
251
252 tie %A, "DB_File" ;
253
254 is equivalent to:
255
256 tie %A, "DB_File", undef, O_CREAT|O_RDWR, 0666, $DB_HASH ;
257
258 See "In Memory Databases" for a discussion on the use of "undef" in
259 place of a filename.
260
261 In Memory Databases
262 Berkeley DB allows the creation of in-memory databases by using NULL
263 (that is, a "(char *)0" in C) in place of the filename. DB_File uses
264 "undef" instead of NULL to provide this functionality.
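
As an illustration of the paragraph above, here is a minimal sketch of an in-memory DB_HASH database. The variable names (%cache, $count) are chosen for this example only; no file is created, so the data is lost once the hash is untied.

```perl
use strict;
use warnings;
use DB_File;

# Passing undef in place of the filename requests an in-memory
# database; nothing is written to disk.
my %cache;
tie %cache, 'DB_File', undef, O_RDWR|O_CREAT, 0666, $DB_HASH
    or die "Cannot create in-memory database: $!\n";

$cache{apple}  = "red";
$cache{banana} = "yellow";

# Count the entries held in the tied hash.
my $count = scalar keys %cache;
print "The in-memory database holds $count entries\n";

untie %cache;
```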
265
DB_HASH
The DB_HASH file format is probably the most commonly used of the three
268 file formats that DB_File supports. It is also very straightforward to
269 use.
270
271 A Simple Example
272 This example shows how to create a database, add key/value pairs to the
273 database, delete keys/value pairs and finally how to enumerate the
274 contents of the database.
275
276 use warnings ;
277 use strict ;
278 use DB_File ;
279 our (%h, $k, $v) ;
280
281 unlink "fruit" ;
282 tie %h, "DB_File", "fruit", O_RDWR|O_CREAT, 0666, $DB_HASH
283 or die "Cannot open file 'fruit': $!\n";
284
285 # Add a few key/value pairs to the file
286 $h{"apple"} = "red" ;
287 $h{"orange"} = "orange" ;
288 $h{"banana"} = "yellow" ;
289 $h{"tomato"} = "red" ;
290
291 # Check for existence of a key
    print "Banana Exists\n\n" if exists $h{"banana"} ;
293
294 # Delete a key/value pair.
295 delete $h{"apple"} ;
296
297 # print the contents of the file
298 while (($k, $v) = each %h)
299 { print "$k -> $v\n" }
300
301 untie %h ;
302
Here is the output:
304
305 Banana Exists
306
307 orange -> orange
308 tomato -> red
309 banana -> yellow
310
Note that, as with ordinary associative arrays, the keys are retrieved
in an apparently random order.
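
If a predictable order is needed from a DB_HASH database, one option is to sort the keys in Perl before printing them. The sketch below is only an illustration: it uses an in-memory database (undef filename) and the same fruit data as the example above, and the @lines array is introduced just for this example.

```perl
use strict;
use warnings;
use DB_File;

my %h;
tie %h, 'DB_File', undef, O_RDWR|O_CREAT, 0666, $DB_HASH
    or die "Cannot create in-memory database: $!\n";

$h{"orange"} = "orange";
$h{"banana"} = "yellow";
$h{"tomato"} = "red";

# DB_HASH returns keys in an apparently random order,
# so sort them explicitly before building the report.
my @lines = map { "$_ -> $h{$_}" } sort keys %h;
print "$_\n" for @lines;

untie %h;
```

For large databases, where sorting every key on each pass is too costly, the DB_BTREE format keeps the keys in order for you.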
313
DB_BTREE
The DB_BTREE format is useful when you want to store data in a given
316 order. By default the keys will be stored in lexical order, but as you
317 will see from the example shown in the next section, it is very easy to
318 define your own sorting function.
319
320 Changing the BTREE sort order
321 This script shows how to override the default sorting algorithm that
322 BTREE uses. Instead of using the normal lexical ordering, a case
323 insensitive compare function will be used.
324
325 use warnings ;
326 use strict ;
327 use DB_File ;
328
329 my %h ;
330
331 sub Compare
332 {
333 my ($key1, $key2) = @_ ;
334 "\L$key1" cmp "\L$key2" ;
335 }
336
337 # specify the Perl sub that will do the comparison
338 $DB_BTREE->{'compare'} = \&Compare ;
339
340 unlink "tree" ;
341 tie %h, "DB_File", "tree", O_RDWR|O_CREAT, 0666, $DB_BTREE
342 or die "Cannot open file 'tree': $!\n" ;
343
344 # Add a key/value pair to the file
345 $h{'Wall'} = 'Larry' ;
346 $h{'Smith'} = 'John' ;
347 $h{'mouse'} = 'mickey' ;
348 $h{'duck'} = 'donald' ;
349
350 # Delete
351 delete $h{"duck"} ;
352
353 # Cycle through the keys printing them in order.
354 # Note it is not necessary to sort the keys as
355 # the btree will have kept them in order automatically.
356 foreach (keys %h)
357 { print "$_\n" }
358
359 untie %h ;
360
361 Here is the output from the code above.
362
363 mouse
364 Smith
365 Wall
366
There are a few points to bear in mind if you want to change the
ordering in a BTREE database:
369
370 1. The new compare function must be specified when you create the
371 database.
372
373 2. You cannot change the ordering once the database has been created.
374 Thus you must use the same compare function every time you access
375 the database.
376
377 3. Duplicate keys are entirely defined by the comparison function.
378 In the case-insensitive example above, the keys: 'KEY' and 'key'
379 would be considered duplicates, and assigning to the second one
   would overwrite the first. If duplicates are allowed (with the
   R_DUP flag discussed below), only a single copy of duplicate keys
   is stored in the database. So, again with the example above,
   assigning three values to the keys 'KEY', 'Key' and 'key' would
   leave just the first key, 'KEY', in the database with three values.
385 For some situations this results in information loss, so care
386 should be taken to provide fully qualified comparison functions
387 when necessary. For example, the above comparison routine could
388 be modified to additionally compare case-sensitively if two keys
389 are equal in the case insensitive comparison:
390
391 sub compare {
392 my($key1, $key2) = @_;
393 lc $key1 cmp lc $key2 ||
394 $key1 cmp $key2;
395 }
396
397 And now you will only have duplicates when the keys themselves are
398 truly the same. (note: in versions of the db library prior to
399 about November 1996, such duplicate keys were retained so it was
400 possible to recover the original keys in sets of keys that
401 compared as equal).
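
The same mechanism can be used for orderings other than case folding. The sketch below is an illustration only: it assumes all keys are numbers, creates a separate DB_File::BTREEINFO object rather than changing the shared $DB_BTREE, and uses an in-memory database so that no file is left behind.

```perl
use strict;
use warnings;
use DB_File;

# A private BTREEINFO object, so $DB_BTREE keeps its defaults.
my $btree = DB_File::BTREEINFO->new;

# Compare keys as numbers instead of as strings.
$btree->{'compare'} = sub {
    my ($key1, $key2) = @_;
    return $key1 <=> $key2;
};

my %h;
tie %h, 'DB_File', undef, O_RDWR|O_CREAT, 0666, $btree
    or die "Cannot create in-memory BTREE: $!\n";

$h{$_} = "value $_" for (10, 9, 100, 1);

# The btree hands the keys back in numeric order.
my @ordered = keys %h;
print "@ordered\n";

untie %h;
```

With the default lexical ordering the same keys would come back as 1, 10, 100, 9.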
402
403 Handling Duplicate Keys
404 The BTREE file type optionally allows a single key to be associated
405 with an arbitrary number of values. This option is enabled by setting
406 the flags element of $DB_BTREE to R_DUP when creating the database.
407
408 There are some difficulties in using the tied hash interface if you
409 want to manipulate a BTREE database with duplicate keys. Consider this
410 code:
411
412 use warnings ;
413 use strict ;
414 use DB_File ;
415
416 my ($filename, %h) ;
417
418 $filename = "tree" ;
419 unlink $filename ;
420
421 # Enable duplicate records
422 $DB_BTREE->{'flags'} = R_DUP ;
423
424 tie %h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_BTREE
425 or die "Cannot open $filename: $!\n";
426
427 # Add some key/value pairs to the file
428 $h{'Wall'} = 'Larry' ;
429 $h{'Wall'} = 'Brick' ; # Note the duplicate key
430 $h{'Wall'} = 'Brick' ; # Note the duplicate key and value
431 $h{'Smith'} = 'John' ;
432 $h{'mouse'} = 'mickey' ;
433
434 # iterate through the associative array
435 # and print each key/value pair.
436 foreach (sort keys %h)
437 { print "$_ -> $h{$_}\n" }
438
439 untie %h ;
440
441 Here is the output:
442
443 Smith -> John
444 Wall -> Larry
445 Wall -> Larry
446 Wall -> Larry
447 mouse -> mickey
448
449 As you can see 3 records have been successfully created with key "Wall"
450 - the only thing is, when they are retrieved from the database they
451 seem to have the same value, namely "Larry". The problem is caused by
452 the way that the associative array interface works. Basically, when the
453 associative array interface is used to fetch the value associated with
454 a given key, it will only ever retrieve the first value.
455
456 Although it may not be immediately obvious from the code above, the
457 associative array interface can be used to write values with duplicate
458 keys, but it cannot be used to read them back from the database.
459
460 The way to get around this problem is to use the Berkeley DB API method
461 called "seq". This method allows sequential access to key/value pairs.
462 See "THE API INTERFACE" for details of both the "seq" method and the
463 API in general.
464
465 Here is the script above rewritten using the "seq" API method.
466
467 use warnings ;
468 use strict ;
469 use DB_File ;
470
471 my ($filename, $x, %h, $status, $key, $value) ;
472
473 $filename = "tree" ;
474 unlink $filename ;
475
476 # Enable duplicate records
477 $DB_BTREE->{'flags'} = R_DUP ;
478
479 $x = tie %h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_BTREE
480 or die "Cannot open $filename: $!\n";
481
482 # Add some key/value pairs to the file
483 $h{'Wall'} = 'Larry' ;
484 $h{'Wall'} = 'Brick' ; # Note the duplicate key
485 $h{'Wall'} = 'Brick' ; # Note the duplicate key and value
486 $h{'Smith'} = 'John' ;
487 $h{'mouse'} = 'mickey' ;
488
489 # iterate through the btree using seq
490 # and print each key/value pair.
491 $key = $value = 0 ;
492 for ($status = $x->seq($key, $value, R_FIRST) ;
493 $status == 0 ;
494 $status = $x->seq($key, $value, R_NEXT) )
495 { print "$key -> $value\n" }
496
497 undef $x ;
498 untie %h ;
499
That prints:
501
502 Smith -> John
503 Wall -> Brick
504 Wall -> Brick
505 Wall -> Larry
506 mouse -> mickey
507
508 This time we have got all the key/value pairs, including the multiple
509 values associated with the key "Wall".
510
511 To make life easier when dealing with duplicate keys, DB_File comes
512 with a few utility methods.
513
514 The get_dup() Method
515 The "get_dup" method assists in reading duplicate values from BTREE
516 databases. The method can take the following forms:
517
518 $count = $x->get_dup($key) ;
519 @list = $x->get_dup($key) ;
520 %list = $x->get_dup($key, 1) ;
521
522 In a scalar context the method returns the number of values associated
523 with the key, $key.
524
525 In list context, it returns all the values which match $key. Note that
526 the values will be returned in an apparently random order.
527
528 In list context, if the second parameter is present and evaluates TRUE,
529 the method returns an associative array. The keys of the associative
530 array correspond to the values that matched in the BTREE and the values
531 of the array are a count of the number of times that particular value
532 occurred in the BTREE.
533
534 So assuming the database created above, we can use "get_dup" like this:
535
536 use warnings ;
537 use strict ;
538 use DB_File ;
539
540 my ($filename, $x, %h) ;
541
542 $filename = "tree" ;
543
544 # Enable duplicate records
545 $DB_BTREE->{'flags'} = R_DUP ;
546
547 $x = tie %h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_BTREE
548 or die "Cannot open $filename: $!\n";
549
550 my $cnt = $x->get_dup("Wall") ;
551 print "Wall occurred $cnt times\n" ;
552
553 my %hash = $x->get_dup("Wall", 1) ;
554 print "Larry is there\n" if $hash{'Larry'} ;
555 print "There are $hash{'Brick'} Brick Walls\n" ;
556
557 my @list = sort $x->get_dup("Wall") ;
558 print "Wall => [@list]\n" ;
559
560 @list = $x->get_dup("Smith") ;
561 print "Smith => [@list]\n" ;
562
563 @list = $x->get_dup("Dog") ;
564 print "Dog => [@list]\n" ;
565
566 and it will print:
567
568 Wall occurred 3 times
569 Larry is there
570 There are 2 Brick Walls
571 Wall => [Brick Brick Larry]
572 Smith => [John]
573 Dog => []
574
575 The find_dup() Method
576 $status = $X->find_dup($key, $value) ;
577
578 This method checks for the existence of a specific key/value pair. If
579 the pair exists, the cursor is left pointing to the pair and the method
580 returns 0. Otherwise the method returns a non-zero value.
581
582 Assuming the database from the previous example:
583
584 use warnings ;
585 use strict ;
586 use DB_File ;
587
588 my ($filename, $x, %h, $found) ;
589
590 $filename = "tree" ;
591
592 # Enable duplicate records
593 $DB_BTREE->{'flags'} = R_DUP ;
594
595 $x = tie %h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_BTREE
596 or die "Cannot open $filename: $!\n";
597
598 $found = ( $x->find_dup("Wall", "Larry") == 0 ? "" : "not") ;
599 print "Larry Wall is $found there\n" ;
600
601 $found = ( $x->find_dup("Wall", "Harry") == 0 ? "" : "not") ;
602 print "Harry Wall is $found there\n" ;
603
604 undef $x ;
605 untie %h ;
606
607 prints this
608
609 Larry Wall is there
610 Harry Wall is not there
611
612 The del_dup() Method
613 $status = $X->del_dup($key, $value) ;
614
615 This method deletes a specific key/value pair. It returns 0 if they
616 exist and have been deleted successfully. Otherwise the method returns
617 a non-zero value.
618
Again assuming the existence of the "tree" database:
620
621 use warnings ;
622 use strict ;
623 use DB_File ;
624
625 my ($filename, $x, %h, $found) ;
626
627 $filename = "tree" ;
628
629 # Enable duplicate records
630 $DB_BTREE->{'flags'} = R_DUP ;
631
632 $x = tie %h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_BTREE
633 or die "Cannot open $filename: $!\n";
634
635 $x->del_dup("Wall", "Larry") ;
636
637 $found = ( $x->find_dup("Wall", "Larry") == 0 ? "" : "not") ;
638 print "Larry Wall is $found there\n" ;
639
640 undef $x ;
641 untie %h ;
642
643 prints this
644
645 Larry Wall is not there
646
647 Matching Partial Keys
648 The BTREE interface has a feature which allows partial keys to be
649 matched. This functionality is only available when the "seq" method is
650 used along with the R_CURSOR flag.
651
652 $x->seq($key, $value, R_CURSOR) ;
653
654 Here is the relevant quote from the dbopen man page where it defines
655 the use of the R_CURSOR flag with seq:
656
657 Note, for the DB_BTREE access method, the returned key is not
658 necessarily an exact match for the specified key. The returned key
659 is the smallest key greater than or equal to the specified key,
660 permitting partial key matches and range searches.
661
662 In the example script below, the "match" sub uses this feature to find
663 and print the first matching key/value pair given a partial key.
664
665 use warnings ;
666 use strict ;
667 use DB_File ;
668 use Fcntl ;
669
670 my ($filename, $x, %h, $st, $key, $value) ;
671
672 sub match
673 {
674 my $key = shift ;
675 my $value = 0;
676 my $orig_key = $key ;
677 $x->seq($key, $value, R_CURSOR) ;
678 print "$orig_key\t-> $key\t-> $value\n" ;
679 }
680
681 $filename = "tree" ;
682 unlink $filename ;
683
684 $x = tie %h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_BTREE
685 or die "Cannot open $filename: $!\n";
686
687 # Add some key/value pairs to the file
688 $h{'mouse'} = 'mickey' ;
689 $h{'Wall'} = 'Larry' ;
690 $h{'Walls'} = 'Brick' ;
691 $h{'Smith'} = 'John' ;
692
693
694 $key = $value = 0 ;
695 print "IN ORDER\n" ;
696 for ($st = $x->seq($key, $value, R_FIRST) ;
697 $st == 0 ;
698 $st = $x->seq($key, $value, R_NEXT) )
699
700 { print "$key -> $value\n" }
701
702 print "\nPARTIAL MATCH\n" ;
703
704 match "Wa" ;
705 match "A" ;
706 match "a" ;
707
708 undef $x ;
709 untie %h ;
710
711 Here is the output:
712
713 IN ORDER
714 Smith -> John
715 Wall -> Larry
716 Walls -> Brick
717 mouse -> mickey
718
719 PARTIAL MATCH
720 Wa -> Wall -> Larry
721 A -> Smith -> John
722 a -> mouse -> mickey
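
The "range searches" mentioned in the quote can be sketched by positioning the cursor with R_CURSOR and then stepping forward with R_NEXT until the keys stop matching. The helper keys_with_prefix below is hypothetical (it is not part of DB_File), and the approach assumes the default lexical key ordering, under which all keys sharing a prefix are adjacent.

```perl
use strict;
use warnings;
use DB_File;

# Hypothetical helper: return every [key, value] pair whose key
# starts with $prefix. Assumes the default lexical ordering.
sub keys_with_prefix
{
    my ($db, $prefix) = @_;
    my @found;
    my ($key, $value) = ($prefix, "");

    # R_CURSOR positions on the smallest key >= $prefix;
    # R_NEXT then walks forward in sort order.
    for (my $st = $db->seq($key, $value, R_CURSOR);
         $st == 0 && index($key, $prefix) == 0;
         $st = $db->seq($key, $value, R_NEXT))
    {
        push @found, [$key, $value];
    }
    return @found;
}

my $btree = DB_File::BTREEINFO->new;
my %h;
my $x = tie %h, 'DB_File', undef, O_RDWR|O_CREAT, 0666, $btree
    or die "Cannot create in-memory BTREE: $!\n";

$h{'mouse'} = 'mickey';
$h{'Wall'}  = 'Larry';
$h{'Walls'} = 'Brick';
$h{'Smith'} = 'John';

my @matches = keys_with_prefix($x, "Wa");
print "$_->[0] -> $_->[1]\n" for @matches;

undef $x;
untie %h;
```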
723
DB_RECNO
DB_RECNO provides an interface to flat text files. Both variable and
726 fixed length records are supported.
727
728 In order to make RECNO more compatible with Perl, the array offset for
729 all RECNO arrays begins at 0 rather than 1 as in Berkeley DB.
730
731 As with normal Perl arrays, a RECNO array can be accessed using
732 negative indexes. The index -1 refers to the last element of the array,
733 -2 the second last, and so on. Attempting to access an element before
734 the start of the array will raise a fatal run-time error.
735
736 The 'bval' Option
737 The operation of the bval option warrants some discussion. Here is the
738 definition of bval from the Berkeley DB 1.85 recno manual page:
739
    The delimiting byte to be used to mark the end of a
    record for variable-length records, and the pad character
    for fixed-length records. If no value is specified,
    newlines (``\n'') are used to mark the end of
    variable-length records and fixed-length records are
    padded with spaces.
746
747 The second sentence is wrong. In actual fact bval will only default to
748 "\n" when the openinfo parameter in dbopen is NULL. If a non-NULL
749 openinfo parameter is used at all, the value that happens to be in bval
750 will be used. That means you always have to specify bval when making
751 use of any of the options in the openinfo parameter. This documentation
752 error will be fixed in the next release of Berkeley DB.
753
That clarifies the situation with regard to Berkeley DB itself. What
755 about DB_File? Well, the behavior defined in the quote above is quite
756 useful, so DB_File conforms to it.
757
758 That means that you can specify other options (e.g. cachesize) and
759 still have bval default to "\n" for variable length records, and space
760 for fixed length records.
761
762 Also note that the bval option only allows you to specify a single byte
763 as a delimiter.
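
To illustrate the option, here is a minimal sketch that uses "-" instead of a newline as the record delimiter. The file name "records.txt" and the variable names are assumptions made for this example; a separate DB_File::RECNOINFO object is used so that $DB_RECNO keeps its defaults.

```perl
use strict;
use warnings;
use DB_File;

my $file = "records.txt";    # hypothetical file name
unlink $file;

# A private RECNOINFO object with a single-byte delimiter.
my $info = DB_File::RECNOINFO->new;
$info->{'bval'} = "-";

my @h;
tie @h, 'DB_File', $file, O_RDWR|O_CREAT, 0666, $info
    or die "Cannot open $file: $!\n";

$h[0] = "abc";
$h[1] = "def";
$h[2] = "ghi";

# The backing text file is written out when the array is untied.
untie @h;

# Read the flat file back to see how the records were delimited.
open my $fh, '<', $file or die "Cannot read $file: $!\n";
my $contents = do { local $/; <$fh> };
close $fh;
print "File contents: $contents\n";

unlink $file;
```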
764
765 A Simple Example
766 Here is a simple example that uses RECNO (if you are using a version of
767 Perl earlier than 5.004_57 this example won't work -- see "Extra RECNO
768 Methods" for a workaround).
769
770 use warnings ;
771 use strict ;
772 use DB_File ;
773
774 my $filename = "text" ;
775 unlink $filename ;
776
777 my @h ;
778 tie @h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_RECNO
779 or die "Cannot open file 'text': $!\n" ;
780
781 # Add a few key/value pairs to the file
782 $h[0] = "orange" ;
783 $h[1] = "blue" ;
784 $h[2] = "yellow" ;
785
786 push @h, "green", "black" ;
787
788 my $elements = scalar @h ;
789 print "The array contains $elements entries\n" ;
790
791 my $last = pop @h ;
792 print "popped $last\n" ;
793
794 unshift @h, "white" ;
795 my $first = shift @h ;
796 print "shifted $first\n" ;
797
798 # Check for existence of a key
799 print "Element 1 Exists with value $h[1]\n" if $h[1] ;
800
801 # use a negative index
802 print "The last element is $h[-1]\n" ;
803 print "The 2nd last element is $h[-2]\n" ;
804
805 untie @h ;
806
807 Here is the output from the script:
808
809 The array contains 5 entries
810 popped black
811 shifted white
812 Element 1 Exists with value blue
813 The last element is green
814 The 2nd last element is yellow
815
816 Extra RECNO Methods
817 If you are using a version of Perl earlier than 5.004_57, the tied
818 array interface is quite limited. In the example script above "push",
819 "pop", "shift", "unshift" or determining the array length will not work
820 with a tied array.
821
822 To make the interface more useful for older versions of Perl, a number
823 of methods are supplied with DB_File to simulate the missing array
824 operations. All these methods are accessed via the object returned from
825 the tie call.
826
827 Here are the methods:
828
829 $X->push(list) ;
830 Pushes the elements of "list" to the end of the array.
831
832 $value = $X->pop ;
833 Removes and returns the last element of the array.
834
$value = $X->shift ;
    Removes and returns the first element of the array.

$X->unshift(list) ;
    Pushes the elements of "list" to the start of the array.

$count = $X->length ;
    Returns the number of elements in the array.

@removed = $X->splice(offset, length, elements) ;
    Removes "length" elements starting at "offset", replaces them with
    "elements" (if given) and returns the removed elements, in the same
    way as Perl's built-in splice.
846
847 Another Example
848 Here is a more complete example that makes use of some of the methods
849 described above. It also makes use of the API interface directly (see
850 "THE API INTERFACE").
851
852 use warnings ;
853 use strict ;
854 my (@h, $H, $file, $i) ;
855 use DB_File ;
856 use Fcntl ;
857
858 $file = "text" ;
859
860 unlink $file ;
861
862 $H = tie @h, "DB_File", $file, O_RDWR|O_CREAT, 0666, $DB_RECNO
863 or die "Cannot open file $file: $!\n" ;
864
865 # first create a text file to play with
866 $h[0] = "zero" ;
867 $h[1] = "one" ;
868 $h[2] = "two" ;
869 $h[3] = "three" ;
870 $h[4] = "four" ;
871
872
873 # Print the records in order.
874 #
875 # The length method is needed here because evaluating a tied
876 # array in a scalar context does not return the number of
877 # elements in the array.
878
879 print "\nORIGINAL\n" ;
880 foreach $i (0 .. $H->length - 1) {
881 print "$i: $h[$i]\n" ;
882 }
883
884 # use the push & pop methods
885 $a = $H->pop ;
886 $H->push("last") ;
887 print "\nThe last record was [$a]\n" ;
888
889 # and the shift & unshift methods
890 $a = $H->shift ;
891 $H->unshift("first") ;
892 print "The first record was [$a]\n" ;
893
894 # Use the API to add a new record after record 2.
895 $i = 2 ;
896 $H->put($i, "Newbie", R_IAFTER) ;
897
898 # and a new record before record 1.
899 $i = 1 ;
900 $H->put($i, "New One", R_IBEFORE) ;
901
902 # delete record 3
903 $H->del(3) ;
904
905 # now print the records in reverse order
906 print "\nREVERSE\n" ;
907 for ($i = $H->length - 1 ; $i >= 0 ; -- $i)
908 { print "$i: $h[$i]\n" }
909
910 # same again, but use the API functions instead
911 print "\nREVERSE again\n" ;
912 my ($s, $k, $v) = (0, 0, 0) ;
913 for ($s = $H->seq($k, $v, R_LAST) ;
914 $s == 0 ;
915 $s = $H->seq($k, $v, R_PREV))
916 { print "$k: $v\n" }
917
918 undef $H ;
919 untie @h ;
920
921 and this is what it outputs:
922
923 ORIGINAL
924 0: zero
925 1: one
926 2: two
927 3: three
928 4: four
929
930 The last record was [four]
931 The first record was [zero]
932
933 REVERSE
934 5: last
935 4: three
936 3: Newbie
937 2: one
938 1: New One
939 0: first
940
941 REVERSE again
942 5: last
943 4: three
944 3: Newbie
945 2: one
946 1: New One
947 0: first
948
949 Notes:
950
1. Rather than iterating through the array @h like this:
952
953 foreach $i (@h)
954
955 it is necessary to use either this:
956
957 foreach $i (0 .. $H->length - 1)
958
959 or this:
960
    for ($a = $H->seq($k, $v, R_FIRST) ;
         $a == 0 ;
         $a = $H->seq($k, $v, R_NEXT) )
964
965 2. Notice that both times the "put" method was used the record index
966 was specified using a variable, $i, rather than the literal value
967 itself. This is because "put" will return the record number of the
968 inserted line via that parameter.
969
THE API INTERFACE
As well as accessing Berkeley DB using a tied hash or array, it is also
972 possible to make direct use of most of the API functions defined in the
973 Berkeley DB documentation.
974
975 To do this you need to store a copy of the object returned from the
976 tie.
977
978 $db = tie %hash, "DB_File", "filename" ;
979
980 Once you have done that, you can access the Berkeley DB API functions
981 as DB_File methods directly like this:
982
983 $db->put($key, $value, R_NOOVERWRITE) ;
984
985 Important: If you have saved a copy of the object returned from "tie",
986 the underlying database file will not be closed until both the tied
987 variable is untied and all copies of the saved object are destroyed.
988
989 use DB_File ;
990 $db = tie %hash, "DB_File", "filename"
991 or die "Cannot tie filename: $!" ;
992 ...
993 undef $db ;
994 untie %hash ;
995
996 See "The untie() Gotcha" for more details.
997
998 All the functions defined in dbopen are available except for close()
and dbopen() itself. The DB_File method interface to the supported
functions has been implemented to mirror the way Berkeley DB works
1001 whenever possible. In particular note that:
1002
1003 · The methods return a status value. All return 0 on success. All
1004 return -1 to signify an error and set $! to the exact error code.
1005 The return code 1 generally (but not always) means that the key
1006 specified did not exist in the database.
1007
1008 Other return codes are defined. See below and in the Berkeley DB
1009 documentation for details. The Berkeley DB documentation should be
1010 used as the definitive source.
1011
1012 · Whenever a Berkeley DB function returns data via one of its
1013 parameters, the equivalent DB_File method does exactly the same.
1014
1015 · If you are careful, it is possible to mix API calls with the tied
1016 hash/array interface in the same piece of code. Although only a
1017 few of the methods used to implement the tied interface currently
1018 make use of the cursor, you should always assume that the cursor
1019 has been changed any time the tied hash/array interface is used.
1020 As an example, this code will probably not do what you expect:
1021
1022 $X = tie %x, 'DB_File', $filename, O_RDWR|O_CREAT, 0777, $DB_BTREE
1023 or die "Cannot tie $filename: $!" ;
1024
1025 # Get the first key/value pair and set the cursor
1026 $X->seq($key, $value, R_FIRST) ;
1027
1028 # this line will modify the cursor
1029 $count = scalar keys %x ;
1030
1031 # Get the second key/value pair.
1032 # oops, it didn't, it got the last key/value pair!
1033 $X->seq($key, $value, R_NEXT) ;
1034
1035 The code above can be rearranged to get around the problem, like
1036 this:
1037
1038 $X = tie %x, 'DB_File', $filename, O_RDWR|O_CREAT, 0777, $DB_BTREE
1039 or die "Cannot tie $filename: $!" ;
1040
1041 # this line will modify the cursor
1042 $count = scalar keys %x ;
1043
1044 # Get the first key/value pair and set the cursor
1045 $X->seq($key, $value, R_FIRST) ;
1046
1047 # Get the second key/value pair.
1048 # worked this time.
1049 $X->seq($key, $value, R_NEXT) ;
1050
1051 All the constants defined in dbopen for use in the flags parameters in
1052 the methods defined below are also available. Refer to the Berkeley DB
1053 documentation for the precise meaning of the flags values.
1054
1055 Below is a list of the methods available.
1056
1057 $status = $X->get($key, $value [, $flags]) ;
1058 Given a key ($key) this method reads the value associated with it
1059 from the database. The value read from the database is returned in
1060 the $value parameter.
1061
1062 If the key does not exist the method returns 1.
1063
1064 No flags are currently defined for this method.
1065
1066 $status = $X->put($key, $value [, $flags]) ;
1067 Stores the key/value pair in the database.
1068
1069 If you use either the R_IAFTER or R_IBEFORE flags, the $key
1070 parameter will have the record number of the inserted key/value
1071 pair set.
1072
1073 Valid flags are R_CURSOR, R_IAFTER, R_IBEFORE, R_NOOVERWRITE and
1074 R_SETCURSOR.
1075
1076 $status = $X->del($key [, $flags]) ;
1077 Removes all key/value pairs with key $key from the database.
1078
1079 A return code of 1 means that the requested key was not in the
1080 database.
1081
1082 R_CURSOR is the only valid flag at present.
1083
1084 $status = $X->fd ;
1085 Returns the file descriptor for the underlying database.
1086
1087 See "Locking: The Trouble with fd" for an explanation of why you
1088 should not use "fd" to lock your database.
1089
1090 $status = $X->seq($key, $value, $flags) ;
1091 This interface allows sequential retrieval from the database. See
1092 dbopen for full details.
1093
1094 Both the $key and $value parameters will be set to the key/value
1095 pair read from the database.
1096
1097 The flags parameter is mandatory. The valid flag values are
1098 R_CURSOR, R_FIRST, R_LAST, R_NEXT and R_PREV.
1099
1100 $status = $X->sync([$flags]) ;
1101 Flushes any cached buffers to disk.
1102
1103 R_RECNOSYNC is the only valid flag at present.
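
The status codes described above can be seen in a short session. This is
only a sketch: the file name, keys and values are made up for the
illustration, and the database lives in a temporary directory created with
File::Temp so that nothing is left behind.

```perl
use strict;
use warnings;
use DB_File;
use File::Temp qw(tempdir);

# Hypothetical scratch location for the demonstration database.
my $dir      = tempdir(CLEANUP => 1);
my $filename = "$dir/api-demo.db";

my %h;
my $db = tie %h, 'DB_File', $filename, O_CREAT|O_RDWR, 0666, $DB_BTREE
    or die "Cannot tie $filename: $!";

# put returns 0 when the pair is stored.
my $put_new = $db->put("apple", "red");

# With R_NOOVERWRITE an existing key is left alone and put returns 1.
my $put_dup = $db->put("apple", "green", R_NOOVERWRITE);

$db->put("banana", "yellow") == 0
    or die "put failed: $!";

# get hands the value back through its second parameter.
my ($found, $missing);
my $get_hit  = $db->get("apple",  $found);
my $get_miss = $db->get("cherry", $missing);   # key is absent

# seq walks the BTREE in key order; the flags parameter is mandatory.
my @pairs;
my ($k, $v) = ("", "");
for (my $st = $db->seq($k, $v, R_FIRST);
     $st == 0;
     $st = $db->seq($k, $v, R_NEXT)) {
    push @pairs, "$k=$v";
}

# del reports a missing key through its return value.
my $del_hit  = $db->del("banana");
my $del_miss = $db->del("banana");

# Flush cached buffers to disk before closing.
$db->sync == 0
    or die "sync failed: $!";

# Drop the saved object before the untie so the file is really closed.
undef $db;
untie %h;
```

Checking every status value, rather than only testing for -1, is what
lets the caller tell "key not found" or "key already present" apart from a
genuine error.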
1104
1106 A DBM Filter is a piece of code that is used when you always want to
1107 make the same transformation to all keys and/or values in a DBM
1108 database. An example is when you need to encode your data in UTF-8
1109 before writing to the database and then decode the UTF-8 when reading
1110 from the database file.
1111
1112 There are two ways to use a DBM Filter.
1113
1114 1. Using the low-level API defined below.
1115
1116 2. Using the DBM_Filter module. This module hides the complexity of
1117 the API defined below and comes with a number of "canned" filters
1118 that cover some of the common use-cases.
1119
1120 Use of the DBM_Filter module is recommended.
1121
1122 DBM Filter Low-level API
1123 There are four methods associated with DBM Filters. All work
1124 identically, and each is used to install (or uninstall) a single DBM
1125 Filter. Each expects a single parameter, namely a reference to a sub.
1126 The only difference between them is the place that the filter is
1127 installed.
1128
1129 To summarise:
1130
1131 filter_store_key
1132 If a filter has been installed with this method, it will be
1133 invoked every time you write a key to a DBM database.
1134
1135 filter_store_value
1136 If a filter has been installed with this method, it will be
1137 invoked every time you write a value to a DBM database.
1138
1139 filter_fetch_key
1140 If a filter has been installed with this method, it will be
1141 invoked every time you read a key from a DBM database.
1142
1143 filter_fetch_value
1144 If a filter has been installed with this method, it will be
1145 invoked every time you read a value from a DBM database.
1146
1147 You can use any combination of the methods, from none, to all four.
1148
1149 All filter methods return the existing filter, if present, or "undef"
1150 if not.
1151
1152 To delete a filter pass "undef" to it.
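
The install, replace and delete behaviour can be tried out as follows. This
is a minimal sketch: the upper-casing filter and the keys are invented for
the illustration, and passing undef as the filename asks DB_File for an
in-memory database so no file is created.

```perl
use strict;
use warnings;
use DB_File;

my %h;
my $db = tie %h, 'DB_File', undef, O_CREAT|O_RDWR, 0666, $DB_HASH
    or die "Cannot tie in-memory database: $!";

# No filter has been installed yet, so the previous filter is undef.
my $previous   = $db->filter_store_value(sub { $_ = uc $_ });
my $had_filter = defined $previous ? 1 : 0;

$h{greeting} = "hello";     # the store filter upper-cases the value

# Passing undef deletes the filter and returns the one that was installed.
my $removed = $db->filter_store_value(undef);

$h{farewell} = "bye";       # stored unchanged, no filter is active now

my ($first, $second) = @h{qw(greeting farewell)};

undef $db;
untie %h;
```

Keeping the returned code reference makes it possible to reinstall the
earlier filter later by passing it back to the same method.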
1153
1154 The Filter
1155 When each filter is called by Perl, a local copy of $_ will contain the
1156 key or value to be filtered. Filtering is achieved by modifying the
1157 contents of $_. The return code from the filter is ignored.
1158
1159 An Example -- the NULL termination problem.
1160 Consider the following scenario. You have a DBM database that you need
1161 to share with a third-party C application. The C application assumes
1162 that all keys and values are NULL terminated. Unfortunately when Perl
1163 writes to DBM databases it doesn't use NULL termination, so your Perl
1164 application will have to manage NULL termination itself. When you write
1165 to the database you will have to use something like this:
1166
1167 $hash{"$key\0"} = "$value\0" ;
1168
1169 Similarly the NULL needs to be taken into account when you are
1170 considering the length of existing keys/values.
1171
1172 It would be much better if you could ignore the NULL terminations issue
1173 in the main application code and have a mechanism that automatically
1174 added the terminating NULL to all keys and values whenever you write to
1175 the database and have them removed when you read from the database. As
1176 I'm sure you have already guessed, this is a problem that DBM Filters
1177 can fix very easily.
1178
1179 use warnings ;
1180 use strict ;
1181 use DB_File ;
1182
1183 my %hash ;
1184 my $filename = "filt" ;
1185 unlink $filename ;
1186
1187 my $db = tie %hash, 'DB_File', $filename, O_CREAT|O_RDWR, 0666, $DB_HASH
1188 or die "Cannot open $filename: $!\n" ;
1189
1190 # Install DBM Filters
1191 $db->filter_fetch_key ( sub { s/\0$// } ) ;
1192 $db->filter_store_key ( sub { $_ .= "\0" } ) ;
1193 $db->filter_fetch_value( sub { s/\0$// } ) ;
1194 $db->filter_store_value( sub { $_ .= "\0" } ) ;
1195
1196 $hash{"abc"} = "def" ;
1197 my $a = $hash{"abc"} ;
1198 # ...
1199 undef $db ;
1200 untie %hash ;
1201
1202 Hopefully the contents of each of the filters should be self-
1203 explanatory. Both "fetch" filters remove the terminating NULL, and both
1204 "store" filters add a terminating NULL.
1205
1206 Another Example -- Key is a C int.
1207 Here is another real-life example. By default, whenever Perl writes to
1208 a DBM database it always writes the key and value as strings. So when
1209 you use this:
1210
1211 $hash{12345} = "something" ;
1212
1213 the key 12345 will get stored in the DBM database as the 5 byte string
1214 "12345". If you actually want the key to be stored in the DBM database
1215 as a C int, you will have to use "pack" when writing, and "unpack" when
1216 reading.
1217
1218 Here is a DBM Filter that does it:
1219
1220 use warnings ;
1221 use strict ;
1222 use DB_File ;
1223 my %hash ;
1224 my $filename = "filt" ;
1225 unlink $filename ;
1226
1227
1228 my $db = tie %hash, 'DB_File', $filename, O_CREAT|O_RDWR, 0666, $DB_HASH
1229 or die "Cannot open $filename: $!\n" ;
1230
1231 $db->filter_fetch_key ( sub { $_ = unpack("i", $_) } ) ;
1232 $db->filter_store_key ( sub { $_ = pack ("i", $_) } ) ;
1233 $hash{123} = "def" ;
1234 # ...
1235 undef $db ;
1236 untie %hash ;
1237
1238 This time only two filters have been used -- we only need to manipulate
1239 the contents of the key, so it wasn't necessary to install any value
1240 filters.
1241
1243 Locking: The Trouble with fd
1244 Until version 1.72 of this module, the recommended technique for
1245 locking DB_File databases was to flock a filehandle opened on the
1246 file descriptor returned by the "fd" method. Unfortunately this
1247 technique has been shown to be fundamentally flawed (Kudos to David
1248 Harris for tracking this down). Use it at your own peril!
1249
1250 The locking technique went like this.
1251
1252 $db = tie(%db, 'DB_File', 'foo.db', O_CREAT|O_RDWR, 0644)
1253 || die "dbcreat foo.db $!";
1254 $fd = $db->fd;
1255 open(DB_FH, "+<&=$fd") || die "dup $!";
1256 flock (DB_FH, LOCK_EX) || die "flock: $!";
1257 ...
1258 $db{"Tom"} = "Jerry" ;
1259 ...
1260 flock(DB_FH, LOCK_UN);
1261 undef $db;
1262 untie %db;
1263 close(DB_FH);
1264
1265 In simple terms, this is what happens:
1266
1267 1. Use "tie" to open the database.
1268
1269 2. Lock the database with fd & flock.
1270
1271 3. Read & Write to the database.
1272
1273 4. Unlock and close the database.
1274
1275 Here is the crux of the problem. A side-effect of opening the DB_File
1276 database in step 1 is that an initial block from the database will get
1277 read from disk and cached in memory.
1278
1279 To see why this is a problem, consider what can happen when two
1280 processes, say "A" and "B", both want to update the same DB_File
1281 database using the locking steps outlined above. Assume process "A" has
1282 already opened the database and has a write lock, but it hasn't
1283 actually updated the database yet (it has finished step 2, but not
1284 started step 3 yet). Now process "B" tries to open the same database -
1285 step 1 will succeed, but it will block on step 2 until process "A"
1286 releases the lock. The important thing to notice here is that at this
1287 point in time both processes will have cached identical initial blocks
1288 from the database.
1289
1290 Now process "A" updates the database and happens to change some of the
1291 data held in the initial buffer. Process "A" terminates, flushing all
1292 cached data to disk and releasing the database lock. At this point the
1293 database on disk will correctly reflect the changes made by process
1294 "A".
1295
1296 With the lock released, process "B" can now continue. It also updates
1297 the database and unfortunately it too modifies the data that was in its
1298 initial buffer. Once that data gets flushed to disk it will overwrite
1299 some/all of the changes process "A" made to the database.
1300
1301 The result of this scenario is at best a database that doesn't contain
1302 what you expect. At worst the database will corrupt.
1303
1304 The above won't happen every time competing processes update the same
1305 DB_File database, but it does illustrate why the technique should not
1306 be used.
1307
1308 Safe ways to lock a database
1309 Starting with version 2.x, Berkeley DB has internal support for
1310 locking. The companion module to this one, BerkeleyDB, provides an
1311 interface to this locking functionality. If you are serious about
1312 locking Berkeley DB databases, I strongly recommend using BerkeleyDB.
1313
1314 If using BerkeleyDB isn't an option, there are a number of modules
1315 available on CPAN that can be used to implement locking. Each one
1316 implements locking differently and has different goals in mind. It is
1317 therefore worth knowing the difference, so that you can pick the right
1318 one for your application. Here are the three locking wrappers:
1319
1320 Tie::DB_Lock
1321 A DB_File wrapper which creates copies of the database file for
1322 read access, so that you have a kind of a multiversioning
1323 concurrent read system. However, updates are still serial. Use for
1324 databases where reads may be lengthy and consistency problems may
1325 occur.
1326
1327 Tie::DB_LockFile
1328 A DB_File wrapper that has the ability to lock and unlock the
1329 database while it is being used. Avoids the tie-before-flock
1330 problem by simply re-tie-ing the database when you get or drop a
1331 lock. Because of the flexibility in dropping and re-acquiring the
1332 lock in the middle of a session, this can be massaged into a
1333 system that will work with long updates and/or reads if the
1334 application follows the hints in the POD documentation.
1335
1336 DB_File::Lock
1337 An extremely lightweight DB_File wrapper that simply flocks a
1338 lockfile before tie-ing the database and drops the lock after the
1339 untie. Allows one to use the same lockfile for multiple databases
1340 to avoid deadlock problems, if desired. Use for databases where
1341 updates and reads are quick and simple flock locking semantics are
1342 enough.
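
The ordering that matters, taking the lock before the tie and releasing it
only after the untie, can also be written by hand with a separate lockfile.
The sketch below is not the code of any of the modules listed above; the
helper name with_locked_db, the file names and the lockfile convention are
assumptions made for the illustration.

```perl
use strict;
use warnings;
use DB_File;
use Fcntl qw(:flock);
use File::Temp qw(tempdir);

# Hypothetical locations; a real application would use fixed paths.
my $dir      = tempdir(CLEANUP => 1);
my $dbfile   = "$dir/demo.db";
my $lockfile = "$dir/demo.db.lock";

# Run $code against the tied hash while holding an exclusive lock.
sub with_locked_db {
    my ($code) = @_;

    # 1. Lock a separate file BEFORE the database is opened, so no
    #    block of the database can be cached ahead of the lock.
    open(my $lock_fh, '>>', $lockfile)
        or die "Cannot open $lockfile: $!";
    flock($lock_fh, LOCK_EX)
        or die "Cannot lock $lockfile: $!";

    # 2. Only now open the database.
    my %db;
    tie %db, 'DB_File', $dbfile, O_CREAT|O_RDWR, 0644, $DB_HASH
        or die "Cannot tie $dbfile: $!";

    my $result = $code->(\%db);

    # 3. Close the database (flushing it) before the lock is dropped.
    untie %db;
    flock($lock_fh, LOCK_UN);
    close($lock_fh);

    return $result;
}

with_locked_db(sub { $_[0]{Tom} = "Jerry" });
my $partner = with_locked_db(sub { $_[0]{Tom} });
```

Because the tie object is never saved in a variable here, the untie is
enough to close the database file. Every process that touches the database
has to follow the same convention for the lock to mean anything.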
1343
1344 Sharing Databases With C Applications
1345 There is no technical reason why a Berkeley DB database cannot be
1346 shared by both a Perl and a C application.
1347
1348 The vast majority of problems that are reported in this area boil down
1349 to the fact that C strings are NULL terminated, whilst Perl strings are
1350 not. See "DBM FILTERS" for a generic way to work around this problem.
1351
1352 Here is a real example. Netscape 2.0 keeps a record of the locations
1353 you visit along with the time you last visited them in a DB_HASH
1354 database. This is usually stored in the file ~/.netscape/history.db.
1355 The key field in the database is the location string and the value
1356 field is the time the location was last visited stored as a 4 byte
1357 binary value.
1358
1359 If you haven't already guessed, the location string is stored with a
1360 terminating NULL. This means you need to be careful when accessing the
1361 database.
1362
1363 Here is a snippet of code that is loosely based on Tom Christiansen's
1364 ggh script (available from your nearest CPAN archive in
1365 authors/id/TOMC/scripts/nshist.gz).
1366
1367 use warnings ;
1368 use strict ;
1369 use DB_File ;
1370 use Fcntl ;
1371
1372 my ($dotdir, $HISTORY, %hist_db, $href, $binary_time, $date) ;
1373 $dotdir = $ENV{HOME} || $ENV{LOGNAME};
1374
1375 $HISTORY = "$dotdir/.netscape/history.db";
1376
1377 tie %hist_db, 'DB_File', $HISTORY
1378 or die "Cannot open $HISTORY: $!\n" ;
1379
1380 # Dump the complete database
1381 while ( ($href, $binary_time) = each %hist_db ) {
1382
1383 # remove the terminating NULL
1384 $href =~ s/\x00$// ;
1385
1386 # convert the binary time into a user friendly string
1387 $date = localtime unpack("V", $binary_time);
1388 print "$date $href\n" ;
1389 }
1390
1391 # check for the existence of a specific key
1392 # remember to add the NULL
1393 if ( $binary_time = $hist_db{"http://mox.perl.com/\x00"} ) {
1394 $date = localtime unpack("V", $binary_time) ;
1395 print "Last visited mox.perl.com on $date\n" ;
1396 }
1397 else {
1398 print "Never visited mox.perl.com\n"
1399 }
1400
1401 untie %hist_db ;
1402
1403 The untie() Gotcha
1404 If you make use of the Berkeley DB API, it is very strongly recommended
1405 that you read "The untie Gotcha" in perltie.
1406
1407 Even if you don't currently make use of the API interface, it is still
1408 worth reading it.
1409
1410 Here is an example which illustrates the problem from a DB_File
1411 perspective:
1412
1413 use DB_File ;
1414 use Fcntl ;
1415
1416 my %x ;
1417 my $X ;
1418
1419 $X = tie %x, 'DB_File', 'tst.fil' , O_RDWR|O_TRUNC
1420 or die "Cannot tie first time: $!" ;
1421
1422 $x{123} = 456 ;
1423
1424 untie %x ;
1425
1426 tie %x, 'DB_File', 'tst.fil' , O_RDWR|O_CREAT
1427 or die "Cannot tie second time: $!" ;
1428
1429 untie %x ;
1430
1431 When run, the script will produce this error message:
1432
1433 Cannot tie second time: Invalid argument at bad.file line 14.
1434
1435 Although the error message above refers to the second tie() statement
1436 in the script, the source of the problem is really with the untie()
1437 statement that precedes it.
1438
1439 Having read perltie you will probably have already guessed that the
1440 error is caused by the extra copy of the tied object stored in $X. If
1441 you haven't, then the problem boils down to the fact that the DB_File
1442 destructor, DESTROY, will not be called until all references to the
1443 tied object are destroyed. Both the tied variable, %x, and $X above
1444 hold a reference to the object. The call to untie() will destroy the
1445 first, but $X still holds a valid reference, so the destructor will not
1446 get called and the database file tst.fil will remain open. The fact
1447 that Berkeley DB then reports the attempt to open a database that is
1448 already open via the catch-all "Invalid argument" doesn't help.
1449
1450 If you run the script with the "-w" flag the error message becomes:
1451
1452 untie attempted while 1 inner references still exist at bad.file line 12.
1453 Cannot tie second time: Invalid argument at bad.file line 14.
1454
1455 which pinpoints the real problem. Finally the script can now be
1456 modified to fix the original problem by destroying the API object
1457 before the untie:
1458
1459 ...
1460 $x{123} = 456 ;
1461
1462 undef $X ;
1463 untie %x ;
1464
1465 $X = tie %x, 'DB_File', 'tst.fil' , O_RDWR|O_CREAT
1466 ...
1467
1469 Why is there Perl source in my database?
1470 If you look at the contents of a database file created by DB_File,
1471 there can sometimes be part of a Perl script included in it.
1472
1473 This happens because Berkeley DB uses dynamic memory to allocate
1474 buffers which will subsequently be written to the database file. Being
1475 dynamic, the memory could have been used for anything before DB
1476 malloced it. As Berkeley DB doesn't clear the memory once it has been
1477 allocated, the unused portions will contain random junk. In the case
1478 where a Perl script gets written to the database, the random junk will
1479 correspond to an area of dynamic memory that happened to be used during
1480 the compilation of the script.
1481
1482 Unless you don't like the possibility of there being part of your Perl
1483 scripts embedded in a database file, this is nothing to worry about.
1484
1485 How do I store complex data structures with DB_File?
1486 Although DB_File cannot do this directly, there is a module which can
1487 layer transparently over DB_File to accomplish this feat.
1488
1489 Check out the MLDBM module, available on CPAN in the directory
1490 modules/by-module/MLDBM.
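
If installing MLDBM is not possible, a structure can be serialised by hand
before it is stored. The sketch below uses the core Storable module rather
than MLDBM, so it is a substitute technique and not a description of how
MLDBM works; the record contents are invented, and undef as the filename
asks for an in-memory database.

```perl
use strict;
use warnings;
use DB_File;
use Storable qw(nfreeze thaw);

my %h;
tie %h, 'DB_File', undef, O_CREAT|O_RDWR, 0666, $DB_HASH
    or die "Cannot tie in-memory database: $!";

# A nested structure that cannot be stored directly as a DBM value.
my %record = (name => "Tom", friends => ["Jerry", "Spike"]);

# Serialise to a plain string on the way in ...
$h{tom} = nfreeze(\%record);

# ... and rebuild the structure on the way out.
my $copy          = thaw($h{tom});
my $second_friend = $copy->{friends}[1];

untie %h;
```

The thawed structure is a copy, so a change made to it only reaches the
database once it has been frozen and stored again.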
1491
1492 What does "wide character in subroutine entry" mean?
1493 You will usually get this message if you are working with UTF-8 data
1494 and want to read/write it from/to a Berkeley DB database file.
1495
1496 The easiest way to deal with this issue is to use the pre-defined "utf8"
1497 DBM_Filter (see DBM_Filter) that was designed to deal with this
1498 situation.
1499
1500 The example below shows what you need if both the key and value are
1501 expected to be in UTF-8.
1502
1503 use DB_File;
1504 use DBM_Filter;
1505
1506 my $db = tie %h, 'DB_File', '/tmp/try.db', O_CREAT|O_RDWR, 0666, $DB_BTREE;
1507 $db->Filter_Key_Push('utf8');
1508 $db->Filter_Value_Push('utf8');
1509
1510 my $key = "\N{LATIN SMALL LETTER A WITH ACUTE}";
1511 my $value = "\N{LATIN SMALL LETTER E WITH ACUTE}";
1512 $h{ $key } = $value;
1513
1514 What does "Invalid Argument" mean?
1515 You will get this error message when one of the parameters in the "tie"
1516 call is wrong. Unfortunately there are quite a few parameters to get
1517 wrong, so it can be difficult to figure out which one it is.
1518
1519 Here are a couple of possibilities:
1520
1521 1. Attempting to reopen a database without closing it.
1522
1523 2. Using the O_WRONLY flag.
1524
1525 What does "Bareword 'DB_File' not allowed" mean?
1526 You will encounter this particular error message when you have the
1527 "strict 'subs'" pragma (or the full strict pragma) in your script.
1528 Consider this script:
1529
1530 use warnings ;
1531 use strict ;
1532 use DB_File ;
1533 my %x ;
1534 tie %x, DB_File, "filename" ;
1535
1536 Running it produces the error in question:
1537
1538 Bareword "DB_File" not allowed while "strict subs" in use
1539
1540 To get around the error, place the word "DB_File" in either single or
1541 double quotes, like this:
1542
1543 tie %x, "DB_File", "filename" ;
1544
1545 Although it might seem like a real pain, it is really worth the effort
1546 of having a "use strict" in all your scripts.
1547
1549 Articles that are either about DB_File or make use of it.
1550
1551 1. Full-Text Searching in Perl, Tim Kientzle (tkientzle@ddj.com), Dr.
1552 Dobb's Journal, Issue 295, January 1999, pp 34-41
1553
1555 Moved to the Changes file.
1556
1558 Some older versions of Berkeley DB had problems with fixed length
1559 records using the RECNO file format. This problem has been fixed since
1560 version 1.85 of Berkeley DB.
1561
1562 I am sure there are bugs in the code. If you do find any, or can
1563 suggest any enhancements, I would welcome your comments.
1564
1566 DB_File comes with the standard Perl source distribution. Look in the
1567 directory ext/DB_File. Given the amount of time between releases of
1568 Perl the version that ships with Perl is quite likely to be out of
1569 date, so the most recent version can always be found on CPAN (see
1570 "CPAN" in perlmodlib for details), in the directory
1571 modules/by-module/DB_File.
1572
1573 This version of DB_File will work with either version 1.x, 2.x or 3.x
1574 of Berkeley DB, but is limited to the functionality provided by version
1575 1.
1576
1577 The official web site for Berkeley DB is
1578 http://www.oracle.com/technology/products/berkeley-db/db/index.html.
1579 All versions of Berkeley DB are available there.
1580
1581 Alternatively, Berkeley DB version 1 is available at your nearest CPAN
1582 archive in src/misc/db.1.85.tar.gz.
1583
1585 Copyright (c) 1995-2016 Paul Marquess. All rights reserved. This
1586 program is free software; you can redistribute it and/or modify it
1587 under the same terms as Perl itself.
1588
1589 Although DB_File is covered by the Perl license, the library it makes
1590 use of, namely Berkeley DB, is not. Berkeley DB has its own copyright
1591 and its own license. Please take the time to read it.
1592
1593 Here are a few words taken from the Berkeley DB FAQ (at
1594 http://www.oracle.com/technology/products/berkeley-db/db/index.html)
1595 regarding the license:
1596
1597 Do I have to license DB to use it in Perl scripts?
1598
1599 No. The Berkeley DB license requires that software that uses
1600 Berkeley DB be freely redistributable. In the case of Perl, that
1601 software is Perl, and not your scripts. Any Perl scripts that you
1602 write are your property, including scripts that make use of
1603 Berkeley DB. Neither the Perl license nor the Berkeley DB license
1604 place any restriction on what you may do with them.
1605
1606 If you are in any doubt about the license situation, contact either the
1607 Berkeley DB authors or the author of DB_File. See "AUTHOR" for details.
1608
1610 perl, dbopen(3), hash(3), recno(3), btree(3), perldbmfilter, DBM_Filter
1611
1613 The DB_File interface was written by Paul Marquess <pmqs@cpan.org>.
1614
1615
1616
1617perl v5.26.3 2019-05-14 DB_File(3)