DB_File(3pm)        Perl Programmers Reference Guide        DB_File(3pm)

NAME
    DB_File - Perl5 access to Berkeley DB version 1.x

SYNOPSIS
    use DB_File;

11 [$X =] tie %hash, 'DB_File', [$filename, $flags, $mode, $DB_HASH] ;
12 [$X =] tie %hash, 'DB_File', $filename, $flags, $mode, $DB_BTREE ;
13 [$X =] tie @array, 'DB_File', $filename, $flags, $mode, $DB_RECNO ;
14
15 $status = $X->del($key [, $flags]) ;
16 $status = $X->put($key, $value [, $flags]) ;
17 $status = $X->get($key, $value [, $flags]) ;
18 $status = $X->seq($key, $value, $flags) ;
19 $status = $X->sync([$flags]) ;
20 $status = $X->fd ;
21
22 # BTREE only
23 $count = $X->get_dup($key) ;
24 @list = $X->get_dup($key) ;
25 %list = $X->get_dup($key, 1) ;
26 $status = $X->find_dup($key, $value) ;
27 $status = $X->del_dup($key, $value) ;
28
29 # RECNO only
30 $a = $X->length;
31 $a = $X->pop ;
32 $X->push(list);
33 $a = $X->shift;
34 $X->unshift(list);
35 @r = $X->splice(offset, length, elements);
36
37 # DBM Filters
38 $old_filter = $db->filter_store_key ( sub { ... } ) ;
39 $old_filter = $db->filter_store_value( sub { ... } ) ;
40 $old_filter = $db->filter_fetch_key ( sub { ... } ) ;
41 $old_filter = $db->filter_fetch_value( sub { ... } ) ;
42
43 untie %hash ;
44 untie @array ;
45
DESCRIPTION
DB_File is a module which allows Perl programs to make use of the
facilities provided by Berkeley DB version 1.x (if you have a newer
version of DB, see "Using DB_File with Berkeley DB version 2 or
greater"). It is assumed that you have a copy of the Berkeley DB manual
pages at hand when reading this documentation. The interface defined
here mirrors the Berkeley DB interface closely.
53
54 Berkeley DB is a C library which provides a consistent interface to a
55 number of database formats. DB_File provides an interface to all three
56 of the database types currently supported by Berkeley DB.
57
58 The file types are:
59
60 DB_HASH
This database type allows arbitrary key/value pairs to be stored
in data files. This is equivalent to the functionality provided by
other hashing packages like DBM, NDBM, ODBM, GDBM, and SDBM.
Remember though, the files created using DB_HASH are not
compatible with any of the other packages mentioned.
66
67 A default hashing algorithm, which will be adequate for most
68 applications, is built into Berkeley DB. If you do need to use
69 your own hashing algorithm it is possible to write your own in
70 Perl and have DB_File use it instead.
71
72 DB_BTREE
The btree format allows arbitrary key/value pairs to be stored in
a sorted, balanced tree (a B-tree).
75
76 As with the DB_HASH format, it is possible to provide a user
77 defined Perl routine to perform the comparison of keys. By
78 default, though, the keys are stored in lexical order.
79
80 DB_RECNO
81 DB_RECNO allows both fixed-length and variable-length flat text
82 files to be manipulated using the same key/value pair interface as
83 in DB_HASH and DB_BTREE. In this case the key will consist of a
84 record (line) number.
85
86 Using DB_File with Berkeley DB version 2 or greater
87
88 Although DB_File is intended to be used with Berkeley DB version 1, it
89 can also be used with version 2, 3 or 4. In this case the interface is
90 limited to the functionality provided by Berkeley DB 1.x. Anywhere the
91 version 2 or greater interface differs, DB_File arranges for it to work
92 like version 1. This feature allows DB_File scripts that were built
93 with version 1 to be migrated to version 2 or greater without any
94 changes.
95
96 If you want to make use of the new features available in Berkeley DB
97 2.x or greater, use the Perl module BerkeleyDB instead.
98
99 Note: The database file format has changed multiple times in Berkeley
100 DB version 2, 3 and 4. If you cannot recreate your databases, you must
101 dump any existing databases with either the "db_dump" or the
102 "db_dump185" utility that comes with Berkeley DB. Once you have
103 rebuilt DB_File to use Berkeley DB version 2 or greater, your databases
104 can be recreated using "db_load". Refer to the Berkeley DB documenta‐
105 tion for further details.
106
107 Please read "COPYRIGHT" before using version 2.x or greater of Berkeley
108 DB with DB_File.
109
110 Interface to Berkeley DB
111
112 DB_File allows access to Berkeley DB files using the tie() mechanism in
113 Perl 5 (for full details, see "tie()" in perlfunc). This facility
114 allows DB_File to access Berkeley DB files using either an associative
115 array (for DB_HASH & DB_BTREE file types) or an ordinary array (for the
116 DB_RECNO file type).
117
118 In addition to the tie() interface, it is also possible to access most
119 of the functions provided in the Berkeley DB API directly. See "THE
120 API INTERFACE".
121
122 Opening a Berkeley DB Database File
123
124 Berkeley DB uses the function dbopen() to open or create a database.
125 Here is the C prototype for dbopen():
126
127 DB*
128 dbopen (const char * file, int flags, int mode,
129 DBTYPE type, const void * openinfo)
130
131 The parameter "type" is an enumeration which specifies which of the 3
132 interface methods (DB_HASH, DB_BTREE or DB_RECNO) is to be used.
Depending on which of these is actually chosen, the final parameter,
openinfo, points to a data structure which allows tailoring of the
specific interface method.
136
137 This interface is handled slightly differently in DB_File. Here is an
138 equivalent call using DB_File:
139
140 tie %array, 'DB_File', $filename, $flags, $mode, $DB_HASH ;
141
142 The "filename", "flags" and "mode" parameters are the direct equivalent
143 of their dbopen() counterparts. The final parameter $DB_HASH performs
144 the function of both the "type" and "openinfo" parameters in dbopen().
145
146 In the example above $DB_HASH is actually a pre-defined reference to a
147 hash object. DB_File has three of these pre-defined references. Apart
148 from $DB_HASH, there is also $DB_BTREE and $DB_RECNO.
149
The keys allowed in each of these pre-defined references are limited to
the names used in the equivalent C structure. So, for example, the
$DB_HASH reference will only allow keys called "bsize", "cachesize",
"ffactor", "hash", "lorder" and "nelem".
154
155 To change one of these elements, just assign to it like this:
156
157 $DB_HASH->{'cachesize'} = 10000 ;
158
159 The three predefined variables $DB_HASH, $DB_BTREE and $DB_RECNO are
160 usually adequate for most applications. If you do need to create extra
161 instances of these objects, constructors are available for each file
162 type.
163
164 Here are examples of the constructors and the valid options available
165 for DB_HASH, DB_BTREE and DB_RECNO respectively.
166
167 $a = new DB_File::HASHINFO ;
168 $a->{'bsize'} ;
169 $a->{'cachesize'} ;
170 $a->{'ffactor'};
171 $a->{'hash'} ;
172 $a->{'lorder'} ;
173 $a->{'nelem'} ;
174
175 $b = new DB_File::BTREEINFO ;
176 $b->{'flags'} ;
177 $b->{'cachesize'} ;
178 $b->{'maxkeypage'} ;
179 $b->{'minkeypage'} ;
180 $b->{'psize'} ;
181 $b->{'compare'} ;
182 $b->{'prefix'} ;
183 $b->{'lorder'} ;
184
185 $c = new DB_File::RECNOINFO ;
186 $c->{'bval'} ;
187 $c->{'cachesize'} ;
188 $c->{'psize'} ;
189 $c->{'flags'} ;
190 $c->{'lorder'} ;
191 $c->{'reclen'} ;
192 $c->{'bfname'} ;
193
The values stored in the hashes above are mostly the direct equivalent
of their C counterparts. Like their C counterparts, all are set to
default values, which means you don't have to set all of the values
when you only want to change one. Here is an example:
198
199 $a = new DB_File::HASHINFO ;
200 $a->{'cachesize'} = 12345 ;
201 tie %y, 'DB_File', "filename", $flags, 0777, $a ;
202
203 A few of the options need extra discussion here. When used, the C
204 equivalent of the keys "hash", "compare" and "prefix" store pointers to
205 C functions. In DB_File these keys are used to store references to Perl
206 subs. Below are templates for each of the subs:
207
208 sub hash
209 {
210 my ($data) = @_ ;
211 ...
212 # return the hash value for $data
213 return $hash ;
214 }
215
    sub compare
    {
        my ($key1, $key2) = @_ ;
        ...
        # return  0 if $key1 eq $key2
        #        -1 if $key1 lt $key2
        #         1 if $key1 gt $key2
        return $result ;    # one of -1, 0 or 1
    }

    sub prefix
    {
        my ($key1, $key2) = @_ ;
        ...
        # return number of bytes of $key2 which are
        # necessary to determine that it is greater than $key1
        return $bytes ;
    }
234
See "Changing the BTREE sort order" for an example of using the
"compare" template.
237
238 If you are using the DB_RECNO interface and you intend making use of
239 "bval", you should check out "The 'bval' Option".
240
241 Default Parameters
242
243 It is possible to omit some or all of the final 4 parameters in the
244 call to "tie" and let them take default values. As DB_HASH is the most
245 common file format used, the call:
246
247 tie %A, "DB_File", "filename" ;
248
249 is equivalent to:
250
    tie %A, "DB_File", "filename", O_CREAT|O_RDWR, 0666, $DB_HASH ;
252
253 It is also possible to omit the filename parameter as well, so the
254 call:
255
256 tie %A, "DB_File" ;
257
258 is equivalent to:
259
    tie %A, "DB_File", undef, O_CREAT|O_RDWR, 0666, $DB_HASH ;
261
262 See "In Memory Databases" for a discussion on the use of "undef" in
263 place of a filename.
264
265 In Memory Databases
266
267 Berkeley DB allows the creation of in-memory databases by using NULL
268 (that is, a "(char *)0" in C) in place of the filename. DB_File uses
269 "undef" instead of NULL to provide this functionality.
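
As a minimal sketch of the above (the variable names are illustrative and not part of DB_File), passing "undef" as the filename gives a tied hash that lives only in memory and is discarded when it is untied:

```perl
use strict ;
use warnings ;
use DB_File ;

# undef in place of the filename requests an in-memory database;
# nothing is written to disk.
my %cache ;
tie %cache, 'DB_File', undef, O_RDWR|O_CREAT, 0666, $DB_HASH
    or die "Cannot create in-memory database: $!\n" ;

$cache{"apple"}  = "red" ;
$cache{"banana"} = "yellow" ;

# Read everything we need before untie throws the data away.
my $count = scalar keys %cache ;
my $apple = $cache{"apple"} ;
print "in-memory database holds $count keys; apple is $apple\n" ;

untie %cache ;
```

An in-memory database of this kind may suit scratch data that does not need to survive the process.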
270
DB_HASH
272 The DB_HASH file format is probably the most commonly used of the three
273 file formats that DB_File supports. It is also very straightforward to
274 use.
275
276 A Simple Example
277
278 This example shows how to create a database, add key/value pairs to the
279 database, delete keys/value pairs and finally how to enumerate the con‐
280 tents of the database.
281
    use warnings ;
    use strict ;
    use DB_File ;
    our (%h, $k, $v) ;

    unlink "fruit" ;
    tie %h, "DB_File", "fruit", O_RDWR|O_CREAT, 0666, $DB_HASH
        or die "Cannot open file 'fruit': $!\n";

    # Add a few key/value pairs to the file
    $h{"apple"} = "red" ;
    $h{"orange"} = "orange" ;
    $h{"banana"} = "yellow" ;
    $h{"tomato"} = "red" ;

    # Check for existence of a key
    print "Banana Exists\n\n" if exists $h{"banana"} ;

    # Delete a key/value pair.
    delete $h{"apple"} ;

    # print the contents of the file
    while (($k, $v) = each %h)
      { print "$k -> $v\n" }

    untie %h ;
308
Here is the output:
310
311 Banana Exists
312
313 orange -> orange
314 tomato -> red
315 banana -> yellow
316
Note that, as with ordinary associative arrays, the keys are retrieved
in an apparently random order.
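
When a stable order is needed from a DB_HASH database, one option is to sort the keys in Perl before iterating. The following is a small sketch using an in-memory database; the data and variable names are only illustrative:

```perl
use strict ;
use warnings ;
use DB_File ;

my %h ;
tie %h, 'DB_File', undef, O_RDWR|O_CREAT, 0666, $DB_HASH
    or die "Cannot create in-memory database: $!\n" ;

$h{"orange"} = "orange" ;
$h{"banana"} = "yellow" ;
$h{"tomato"} = "red" ;

# DB_HASH gives no ordering guarantee, so sort the keys explicitly.
my @ordered = sort keys %h ;
print "$_ -> $h{$_}\n" for @ordered ;

untie %h ;
```

If the data must always be kept in order, the DB_BTREE format avoids the explicit sort.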
319
DB_BTREE
321 The DB_BTREE format is useful when you want to store data in a given
322 order. By default the keys will be stored in lexical order, but as you
323 will see from the example shown in the next section, it is very easy to
324 define your own sorting function.
325
326 Changing the BTREE sort order
327
328 This script shows how to override the default sorting algorithm that
329 BTREE uses. Instead of using the normal lexical ordering, a case insen‐
330 sitive compare function will be used.
331
332 use warnings ;
333 use strict ;
334 use DB_File ;
335
336 my %h ;
337
338 sub Compare
339 {
340 my ($key1, $key2) = @_ ;
341 "\L$key1" cmp "\L$key2" ;
342 }
343
344 # specify the Perl sub that will do the comparison
345 $DB_BTREE->{'compare'} = \&Compare ;
346
347 unlink "tree" ;
    tie %h, "DB_File", "tree", O_RDWR|O_CREAT, 0666, $DB_BTREE
        or die "Cannot open file 'tree': $!\n" ;
350
351 # Add a key/value pair to the file
352 $h{'Wall'} = 'Larry' ;
353 $h{'Smith'} = 'John' ;
354 $h{'mouse'} = 'mickey' ;
355 $h{'duck'} = 'donald' ;
356
357 # Delete
358 delete $h{"duck"} ;
359
360 # Cycle through the keys printing them in order.
361 # Note it is not necessary to sort the keys as
362 # the btree will have kept them in order automatically.
363 foreach (keys %h)
364 { print "$_\n" }
365
366 untie %h ;
367
368 Here is the output from the code above.
369
370 mouse
371 Smith
372 Wall
373
There are a few points to bear in mind if you want to change the
ordering in a BTREE database:
376
377 1. The new compare function must be specified when you create the
378 database.
379
380 2. You cannot change the ordering once the database has been created.
381 Thus you must use the same compare function every time you access
382 the database.
383
3.   Duplicate keys are entirely defined by the comparison function.
     In the case-insensitive example above, the keys 'KEY' and 'key'
     would be considered duplicates, and assigning to the second one
     would overwrite the first. If duplicates are allowed (with the
     R_DUP flag discussed below), only a single copy of duplicate keys
     is stored in the database, so (again with the example above)
     assigning three values to the keys 'KEY', 'Key', and 'key' would
     leave just the first key, 'KEY', in the database with three values.
     For some situations this results in information loss, so care
     should be taken to provide fully qualified comparison functions
     when necessary. For example, the above comparison routine could
     be modified to additionally compare case-sensitively if two keys
     are equal in the case-insensitive comparison:

         sub compare {
             my($key1, $key2) = @_;
             lc $key1 cmp lc $key2 ||
             $key1 cmp $key2;
         }

     And now you will only have duplicates when the keys themselves are
     truly the same. (Note: in versions of the db library prior to
     about November 1996, such duplicate keys were retained so it was
     possible to recover the original keys in sets of keys that
     compared as equal.)
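
As a further sketch of a user-defined ordering, the script below sorts keys numerically rather than lexically. It assumes a private DB_File::BTREEINFO object (so the shared $DB_BTREE is not modified) and an in-memory database; the variable names are illustrative only.

```perl
use strict ;
use warnings ;
use DB_File ;

# A private BTREEINFO object, so the shared $DB_BTREE is left untouched.
my $numeric = DB_File::BTREEINFO->new ;
$numeric->{'compare'} = sub { $_[0] <=> $_[1] } ;

my %h ;
tie %h, 'DB_File', undef, O_RDWR|O_CREAT, 0666, $numeric
    or die "Cannot create in-memory BTREE: $!\n" ;

$h{$_} = "value $_" for (10, 9, 100, 1) ;

# The btree hands the keys back in the order defined by the compare sub.
my @keys = keys %h ;
print "@keys\n" ;

untie %h ;
```

With the default lexical ordering the same keys would come back as 1, 10, 100, 9. As point 2 notes, a database created this way has to be reopened with the same compare function.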
409
410 Handling Duplicate Keys
411
412 The BTREE file type optionally allows a single key to be associated
413 with an arbitrary number of values. This option is enabled by setting
414 the flags element of $DB_BTREE to R_DUP when creating the database.
415
416 There are some difficulties in using the tied hash interface if you
417 want to manipulate a BTREE database with duplicate keys. Consider this
418 code:
419
420 use warnings ;
421 use strict ;
422 use DB_File ;
423
424 my ($filename, %h) ;
425
426 $filename = "tree" ;
427 unlink $filename ;
428
429 # Enable duplicate records
430 $DB_BTREE->{'flags'} = R_DUP ;
431
    tie %h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_BTREE
        or die "Cannot open $filename: $!\n";
434
435 # Add some key/value pairs to the file
436 $h{'Wall'} = 'Larry' ;
437 $h{'Wall'} = 'Brick' ; # Note the duplicate key
438 $h{'Wall'} = 'Brick' ; # Note the duplicate key and value
439 $h{'Smith'} = 'John' ;
440 $h{'mouse'} = 'mickey' ;
441
442 # iterate through the associative array
443 # and print each key/value pair.
444 foreach (sort keys %h)
445 { print "$_ -> $h{$_}\n" }
446
447 untie %h ;
448
449 Here is the output:
450
451 Smith -> John
452 Wall -> Larry
453 Wall -> Larry
454 Wall -> Larry
455 mouse -> mickey
456
457 As you can see 3 records have been successfully created with key "Wall"
458 - the only thing is, when they are retrieved from the database they
459 seem to have the same value, namely "Larry". The problem is caused by
460 the way that the associative array interface works. Basically, when the
461 associative array interface is used to fetch the value associated with
462 a given key, it will only ever retrieve the first value.
463
464 Although it may not be immediately obvious from the code above, the
465 associative array interface can be used to write values with duplicate
466 keys, but it cannot be used to read them back from the database.
467
468 The way to get around this problem is to use the Berkeley DB API method
469 called "seq". This method allows sequential access to key/value pairs.
470 See "THE API INTERFACE" for details of both the "seq" method and the
471 API in general.
472
473 Here is the script above rewritten using the "seq" API method.
474
475 use warnings ;
476 use strict ;
477 use DB_File ;
478
479 my ($filename, $x, %h, $status, $key, $value) ;
480
481 $filename = "tree" ;
482 unlink $filename ;
483
484 # Enable duplicate records
485 $DB_BTREE->{'flags'} = R_DUP ;
486
    $x = tie %h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_BTREE
        or die "Cannot open $filename: $!\n";
489
490 # Add some key/value pairs to the file
491 $h{'Wall'} = 'Larry' ;
492 $h{'Wall'} = 'Brick' ; # Note the duplicate key
493 $h{'Wall'} = 'Brick' ; # Note the duplicate key and value
494 $h{'Smith'} = 'John' ;
495 $h{'mouse'} = 'mickey' ;
496
497 # iterate through the btree using seq
498 # and print each key/value pair.
499 $key = $value = 0 ;
500 for ($status = $x->seq($key, $value, R_FIRST) ;
501 $status == 0 ;
502 $status = $x->seq($key, $value, R_NEXT) )
503 { print "$key -> $value\n" }
504
505 undef $x ;
506 untie %h ;
507
508 that prints:
509
510 Smith -> John
511 Wall -> Brick
512 Wall -> Brick
513 Wall -> Larry
514 mouse -> mickey
515
516 This time we have got all the key/value pairs, including the multiple
517 values associated with the key "Wall".
518
519 To make life easier when dealing with duplicate keys, DB_File comes
520 with a few utility methods.
521
522 The get_dup() Method
523
524 The "get_dup" method assists in reading duplicate values from BTREE
525 databases. The method can take the following forms:
526
527 $count = $x->get_dup($key) ;
528 @list = $x->get_dup($key) ;
529 %list = $x->get_dup($key, 1) ;
530
531 In a scalar context the method returns the number of values associated
532 with the key, $key.
533
534 In list context, it returns all the values which match $key. Note that
535 the values will be returned in an apparently random order.
536
537 In list context, if the second parameter is present and evaluates TRUE,
538 the method returns an associative array. The keys of the associative
539 array correspond to the values that matched in the BTREE and the values
540 of the array are a count of the number of times that particular value
541 occurred in the BTREE.
542
543 So assuming the database created above, we can use "get_dup" like this:
544
545 use warnings ;
546 use strict ;
547 use DB_File ;
548
549 my ($filename, $x, %h) ;
550
551 $filename = "tree" ;
552
553 # Enable duplicate records
554 $DB_BTREE->{'flags'} = R_DUP ;
555
    $x = tie %h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_BTREE
        or die "Cannot open $filename: $!\n";
558
559 my $cnt = $x->get_dup("Wall") ;
560 print "Wall occurred $cnt times\n" ;
561
562 my %hash = $x->get_dup("Wall", 1) ;
563 print "Larry is there\n" if $hash{'Larry'} ;
564 print "There are $hash{'Brick'} Brick Walls\n" ;
565
566 my @list = sort $x->get_dup("Wall") ;
567 print "Wall => [@list]\n" ;
568
569 @list = $x->get_dup("Smith") ;
570 print "Smith => [@list]\n" ;
571
572 @list = $x->get_dup("Dog") ;
573 print "Dog => [@list]\n" ;
574
575 and it will print:
576
577 Wall occurred 3 times
578 Larry is there
579 There are 2 Brick Walls
580 Wall => [Brick Brick Larry]
581 Smith => [John]
582 Dog => []
583
584 The find_dup() Method
585
586 $status = $X->find_dup($key, $value) ;
587
588 This method checks for the existence of a specific key/value pair. If
589 the pair exists, the cursor is left pointing to the pair and the method
590 returns 0. Otherwise the method returns a non-zero value.
591
592 Assuming the database from the previous example:
593
594 use warnings ;
595 use strict ;
596 use DB_File ;
597
598 my ($filename, $x, %h, $found) ;
599
600 $filename = "tree" ;
601
602 # Enable duplicate records
603 $DB_BTREE->{'flags'} = R_DUP ;
604
    $x = tie %h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_BTREE
        or die "Cannot open $filename: $!\n";
607
608 $found = ( $x->find_dup("Wall", "Larry") == 0 ? "" : "not") ;
609 print "Larry Wall is $found there\n" ;
610
611 $found = ( $x->find_dup("Wall", "Harry") == 0 ? "" : "not") ;
612 print "Harry Wall is $found there\n" ;
613
614 undef $x ;
615 untie %h ;
616
617 prints this
618
619 Larry Wall is there
620 Harry Wall is not there
621
622 The del_dup() Method
623
624 $status = $X->del_dup($key, $value) ;
625
626 This method deletes a specific key/value pair. It returns 0 if they
627 exist and have been deleted successfully. Otherwise the method returns
628 a non-zero value.
629
Again assuming the existence of the "tree" database:
631
632 use warnings ;
633 use strict ;
634 use DB_File ;
635
636 my ($filename, $x, %h, $found) ;
637
638 $filename = "tree" ;
639
640 # Enable duplicate records
641 $DB_BTREE->{'flags'} = R_DUP ;
642
    $x = tie %h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_BTREE
        or die "Cannot open $filename: $!\n";
645
646 $x->del_dup("Wall", "Larry") ;
647
648 $found = ( $x->find_dup("Wall", "Larry") == 0 ? "" : "not") ;
649 print "Larry Wall is $found there\n" ;
650
651 undef $x ;
652 untie %h ;
653
654 prints this
655
656 Larry Wall is not there
657
658 Matching Partial Keys
659
660 The BTREE interface has a feature which allows partial keys to be
661 matched. This functionality is only available when the "seq" method is
662 used along with the R_CURSOR flag.
663
664 $x->seq($key, $value, R_CURSOR) ;
665
666 Here is the relevant quote from the dbopen man page where it defines
667 the use of the R_CURSOR flag with seq:
668
669 Note, for the DB_BTREE access method, the returned key is not
670 necessarily an exact match for the specified key. The returned key
671 is the smallest key greater than or equal to the specified key,
672 permitting partial key matches and range searches.
673
674 In the example script below, the "match" sub uses this feature to find
675 and print the first matching key/value pair given a partial key.
676
677 use warnings ;
678 use strict ;
679 use DB_File ;
680 use Fcntl ;
681
682 my ($filename, $x, %h, $st, $key, $value) ;
683
684 sub match
685 {
686 my $key = shift ;
687 my $value = 0;
688 my $orig_key = $key ;
689 $x->seq($key, $value, R_CURSOR) ;
690 print "$orig_key\t-> $key\t-> $value\n" ;
691 }
692
693 $filename = "tree" ;
694 unlink $filename ;
695
    $x = tie %h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_BTREE
        or die "Cannot open $filename: $!\n";
698
699 # Add some key/value pairs to the file
700 $h{'mouse'} = 'mickey' ;
701 $h{'Wall'} = 'Larry' ;
702 $h{'Walls'} = 'Brick' ;
703 $h{'Smith'} = 'John' ;
704
705 $key = $value = 0 ;
706 print "IN ORDER\n" ;
707 for ($st = $x->seq($key, $value, R_FIRST) ;
708 $st == 0 ;
709 $st = $x->seq($key, $value, R_NEXT) )
710
711 { print "$key -> $value\n" }
712
713 print "\nPARTIAL MATCH\n" ;
714
715 match "Wa" ;
716 match "A" ;
717 match "a" ;
718
719 undef $x ;
720 untie %h ;
721
722 Here is the output:
723
724 IN ORDER
725 Smith -> John
726 Wall -> Larry
727 Walls -> Brick
728 mouse -> mickey
729
730 PARTIAL MATCH
731 Wa -> Wall -> Larry
732 A -> Smith -> John
733 a -> mouse -> mickey
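
The same R_CURSOR behavior can be used for a simple range search. The sketch below positions the cursor at the first key greater than or equal to a prefix, then steps forward with R_NEXT while the keys still start with that prefix. The data, the $prefix variable and the in-memory database are assumptions made for illustration.

```perl
use strict ;
use warnings ;
use DB_File ;

my %h ;
my $x = tie %h, 'DB_File', undef, O_RDWR|O_CREAT, 0666,
            DB_File::BTREEINFO->new
    or die "Cannot create in-memory BTREE: $!\n" ;

$h{$_} = length $_ for qw(Wall Walls Walton Smith mouse) ;

my $prefix = "Wal" ;
my @matches ;

# seq overwrites $key with the key it found, so start from a copy
# of the prefix and test each returned key against the original.
my ($key, $value) = ($prefix, "") ;
for (my $st = $x->seq($key, $value, R_CURSOR) ;
     $st == 0 && index($key, $prefix) == 0 ;
     $st = $x->seq($key, $value, R_NEXT) )
  { push @matches, $key }

print "keys starting with '$prefix': @matches\n" ;

undef $x ;
untie %h ;
```

The loop stops either when seq reports a non-zero status (end of the database) or when the first key outside the prefix range is reached.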
734
DB_RECNO
736 DB_RECNO provides an interface to flat text files. Both variable and
737 fixed length records are supported.
738
739 In order to make RECNO more compatible with Perl, the array offset for
740 all RECNO arrays begins at 0 rather than 1 as in Berkeley DB.
741
742 As with normal Perl arrays, a RECNO array can be accessed using nega‐
743 tive indexes. The index -1 refers to the last element of the array, -2
744 the second last, and so on. Attempting to access an element before the
745 start of the array will raise a fatal run-time error.
746
747 The 'bval' Option
748
749 The operation of the bval option warrants some discussion. Here is the
750 definition of bval from the Berkeley DB 1.85 recno manual page:
751
    The delimiting byte to be used to mark the end of a
    record for variable-length records, and the pad character
    for fixed-length records. If no value is specified,
    newlines (``\n'') are used to mark the end of
    variable-length records and fixed-length records are
    padded with spaces.
758
The second sentence is wrong. In actual fact bval will only default to
"\n" when the openinfo parameter in dbopen is NULL. If a non-NULL
openinfo parameter is used at all, the value that happens to be in bval
will be used. That means you always have to specify bval when making
use of any of the options in the openinfo parameter. This documentation
error will be fixed in the next release of Berkeley DB.

That clarifies the situation with regard to Berkeley DB itself. What
about DB_File? Well, the behavior defined in the quote above is quite
useful, so DB_File conforms to it.
769
770 That means that you can specify other options (e.g. cachesize) and
771 still have bval default to "\n" for variable length records, and space
772 for fixed length records.
773
774 Also note that the bval option only allows you to specify a single byte
775 as a delimiter.
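
To illustrate, here is a hedged sketch that sets bval to ";" for variable-length records and then reopens the file with the same setting. The scratch file comes from File::Temp, and the variable names are assumptions for the example rather than anything DB_File requires.

```perl
use strict ;
use warnings ;
use DB_File ;
use File::Temp qw(tempfile) ;

# Scratch file for the demonstration; File::Temp picks the name.
my ($fh, $filename) = tempfile(UNLINK => 1) ;
close $fh ;

# Private RECNOINFO object with ";" as the record delimiter.
my $info = DB_File::RECNOINFO->new ;
$info->{'bval'} = ";" ;

my @rec ;
tie @rec, 'DB_File', $filename, O_RDWR|O_CREAT, 0666, $info
    or die "Cannot open $filename: $!\n" ;
$rec[0] = "red" ;
$rec[1] = "green" ;
$rec[2] = "blue" ;
untie @rec ;

# Reopen with the same delimiter and read the records back.
my @again ;
tie @again, 'DB_File', $filename, O_RDWR, 0666, $info
    or die "Cannot reopen $filename: $!\n" ;
my @copy = @again ;
untie @again ;

print "record $_: $copy[$_]\n" for 0 .. $#copy ;
```

The file has to be reopened with the same bval; otherwise the default newline delimiter would be assumed and the records would not be split as intended.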
776
777 A Simple Example
778
779 Here is a simple example that uses RECNO (if you are using a version of
780 Perl earlier than 5.004_57 this example won't work -- see "Extra RECNO
781 Methods" for a workaround).
782
783 use warnings ;
784 use strict ;
785 use DB_File ;
786
787 my $filename = "text" ;
788 unlink $filename ;
789
790 my @h ;
    tie @h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_RECNO
        or die "Cannot open file 'text': $!\n" ;
793
794 # Add a few key/value pairs to the file
795 $h[0] = "orange" ;
796 $h[1] = "blue" ;
797 $h[2] = "yellow" ;
798
799 push @h, "green", "black" ;
800
801 my $elements = scalar @h ;
802 print "The array contains $elements entries\n" ;
803
804 my $last = pop @h ;
805 print "popped $last\n" ;
806
807 unshift @h, "white" ;
808 my $first = shift @h ;
809 print "shifted $first\n" ;
810
811 # Check for existence of a key
812 print "Element 1 Exists with value $h[1]\n" if $h[1] ;
813
814 # use a negative index
815 print "The last element is $h[-1]\n" ;
816 print "The 2nd last element is $h[-2]\n" ;
817
818 untie @h ;
819
820 Here is the output from the script:
821
822 The array contains 5 entries
823 popped black
824 shifted white
825 Element 1 Exists with value blue
826 The last element is green
827 The 2nd last element is yellow
828
829 Extra RECNO Methods
830
831 If you are using a version of Perl earlier than 5.004_57, the tied
832 array interface is quite limited. In the example script above "push",
833 "pop", "shift", "unshift" or determining the array length will not work
834 with a tied array.
835
836 To make the interface more useful for older versions of Perl, a number
837 of methods are supplied with DB_File to simulate the missing array
838 operations. All these methods are accessed via the object returned from
839 the tie call.
840
841 Here are the methods:
842
843 $X->push(list) ;
844 Pushes the elements of "list" to the end of the array.
845
846 $value = $X->pop ;
847 Removes and returns the last element of the array.
848
849 $X->shift
850 Removes and returns the first element of the array.
851
852 $X->unshift(list) ;
853 Pushes the elements of "list" to the start of the array.
854
855 $X->length
856 Returns the number of elements in the array.
857
858 $X->splice(offset, length, elements);
859 Returns a splice of the array.
860
861 Another Example
862
863 Here is a more complete example that makes use of some of the methods
864 described above. It also makes use of the API interface directly (see
865 "THE API INTERFACE").
866
867 use warnings ;
868 use strict ;
869 my (@h, $H, $file, $i) ;
870 use DB_File ;
871 use Fcntl ;
872
873 $file = "text" ;
874
875 unlink $file ;
876
    $H = tie @h, "DB_File", $file, O_RDWR|O_CREAT, 0666, $DB_RECNO
        or die "Cannot open file $file: $!\n" ;
879
880 # first create a text file to play with
881 $h[0] = "zero" ;
882 $h[1] = "one" ;
883 $h[2] = "two" ;
884 $h[3] = "three" ;
885 $h[4] = "four" ;
886
887 # Print the records in order.
888 #
889 # The length method is needed here because evaluating a tied
890 # array in a scalar context does not return the number of
891 # elements in the array.
892
893 print "\nORIGINAL\n" ;
894 foreach $i (0 .. $H->length - 1) {
895 print "$i: $h[$i]\n" ;
896 }
897
898 # use the push & pop methods
899 $a = $H->pop ;
900 $H->push("last") ;
901 print "\nThe last record was [$a]\n" ;
902
903 # and the shift & unshift methods
904 $a = $H->shift ;
905 $H->unshift("first") ;
906 print "The first record was [$a]\n" ;
907
908 # Use the API to add a new record after record 2.
909 $i = 2 ;
910 $H->put($i, "Newbie", R_IAFTER) ;
911
912 # and a new record before record 1.
913 $i = 1 ;
914 $H->put($i, "New One", R_IBEFORE) ;
915
916 # delete record 3
917 $H->del(3) ;
918
919 # now print the records in reverse order
920 print "\nREVERSE\n" ;
921 for ($i = $H->length - 1 ; $i >= 0 ; -- $i)
922 { print "$i: $h[$i]\n" }
923
924 # same again, but use the API functions instead
925 print "\nREVERSE again\n" ;
926 my ($s, $k, $v) = (0, 0, 0) ;
927 for ($s = $H->seq($k, $v, R_LAST) ;
928 $s == 0 ;
929 $s = $H->seq($k, $v, R_PREV))
930 { print "$k: $v\n" }
931
932 undef $H ;
933 untie @h ;
934
935 and this is what it outputs:
936
937 ORIGINAL
938 0: zero
939 1: one
940 2: two
941 3: three
942 4: four
943
944 The last record was [four]
945 The first record was [zero]
946
947 REVERSE
948 5: last
949 4: three
950 3: Newbie
951 2: one
952 1: New One
953 0: first
954
955 REVERSE again
956 5: last
957 4: three
958 3: Newbie
959 2: one
960 1: New One
961 0: first
962
963 Notes:
964
965 1. Rather than iterating through the array, @h like this:
966
967 foreach $i (@h)
968
969 it is necessary to use either this:
970
971 foreach $i (0 .. $H->length - 1)
972
973 or this:
974
         for ($a = $H->seq($k, $v, R_FIRST) ;
              $a == 0 ;
              $a = $H->seq($k, $v, R_NEXT) )
978
979 2. Notice that both times the "put" method was used the record index
980 was specified using a variable, $i, rather than the literal value
981 itself. This is because "put" will return the record number of the
982 inserted line via that parameter.
983
THE API INTERFACE
985 As well as accessing Berkeley DB using a tied hash or array, it is also
986 possible to make direct use of most of the API functions defined in the
987 Berkeley DB documentation.
988
989 To do this you need to store a copy of the object returned from the
990 tie.
991
992 $db = tie %hash, "DB_File", "filename" ;
993
994 Once you have done that, you can access the Berkeley DB API functions
995 as DB_File methods directly like this:
996
997 $db->put($key, $value, R_NOOVERWRITE) ;
998
999 Important: If you have saved a copy of the object returned from "tie",
1000 the underlying database file will not be closed until both the tied
1001 variable is untied and all copies of the saved object are destroyed.
1002
1003 use DB_File ;
1004 $db = tie %hash, "DB_File", "filename"
1005 or die "Cannot tie filename: $!" ;
1006 ...
1007 undef $db ;
1008 untie %hash ;
1009
1010 See "The untie() Gotcha" for more details.
1011
All the functions defined in dbopen are available except for close()
and dbopen() itself. The DB_File method interface to the supported
functions has been implemented to mirror the way Berkeley DB works
whenever possible. In particular note that:
1016
1017 · The methods return a status value. All return 0 on success. All
1018 return -1 to signify an error and set $! to the exact error code.
1019 The return code 1 generally (but not always) means that the key
1020 specified did not exist in the database.
1021
1022 Other return codes are defined. See below and in the Berkeley DB
1023 documentation for details. The Berkeley DB documentation should be
1024 used as the definitive source.
1025
1026 · Whenever a Berkeley DB function returns data via one of its param‐
1027 eters, the equivalent DB_File method does exactly the same.
1028
1029 · If you are careful, it is possible to mix API calls with the tied
1030 hash/array interface in the same piece of code. Although only a
1031 few of the methods used to implement the tied interface currently
1032 make use of the cursor, you should always assume that the cursor
1033 has been changed any time the tied hash/array interface is used.
1034 As an example, this code will probably not do what you expect:
1035
    $X = tie %x, 'DB_File', $filename, O_RDWR|O_CREAT, 0777, $DB_BTREE
        or die "Cannot tie $filename: $!" ;

    # Get the first key/value pair and set the cursor
    $X->seq($key, $value, R_FIRST) ;

    # this line will modify the cursor
    $count = scalar keys %x ;

    # Get the second key/value pair.
    # oops, it didn't, it got the last key/value pair!
    $X->seq($key, $value, R_NEXT) ;
1048
1049 The code above can be rearranged to get around the problem, like
1050 this:
1051
    $X = tie %x, 'DB_File', $filename, O_RDWR|O_CREAT, 0777, $DB_BTREE
        or die "Cannot tie $filename: $!" ;

    # this line will modify the cursor
    $count = scalar keys %x ;

    # Get the first key/value pair and set the cursor
    $X->seq($key, $value, R_FIRST) ;

    # Get the second key/value pair.
    # worked this time.
    $X->seq($key, $value, R_NEXT) ;
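
The first two points above (status values, and data returned through a
parameter) can be sketched as follows. This is a minimal illustration
rather than a definitive recipe; the file name api_demo.db is a
placeholder chosen for the example.

```perl
use strict ;
use warnings ;
use DB_File ;
use Fcntl ;

# "api_demo.db" is a scratch file name assumed for this sketch
my $filename = "api_demo.db" ;
unlink $filename ;

my %h ;
my $db = tie %h, 'DB_File', $filename, O_RDWR|O_CREAT, 0666, $DB_BTREE
    or die "Cannot tie $filename: $!" ;

# put returns 0 on success
my $first = $db->put("apple", "red") ;

# with R_NOOVERWRITE, put returns 1 if the key already exists
my $second = $db->put("apple", "green", R_NOOVERWRITE) ;

# get hands the data back through its second parameter
my $value ;
my $found = $db->get("apple", $value) ;

# get returns 1 when the key is not in the database
my $unused ;
my $missing = $db->get("pear", $unused) ;

print "put=$first dup=$second get=$found ($value) missing=$missing\n" ;

# destroy the saved object before untie so the file is really closed
undef $db ;
untie %h ;
unlink $filename ;
```

Checking each status explicitly, instead of relying on the tied hash,
is what lets the caller tell "key already present" apart from a real
error.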
1064
1065 All the constants defined in dbopen for use in the flags parameters in
1066 the methods defined below are also available. Refer to the Berkeley DB
1067 documentation for the precise meaning of the flags values.
1068
1069 Below is a list of the methods available.
1070
1071 $status = $X->get($key, $value [, $flags]) ;
1072 Given a key ($key) this method reads the value associated with it
1073 from the database. The value read from the database is returned in
1074 the $value parameter.
1075
1076 If the key does not exist the method returns 1.
1077
1078 No flags are currently defined for this method.
1079
1080 $status = $X->put($key, $value [, $flags]) ;
1081 Stores the key/value pair in the database.
1082
If you use either the R_IAFTER or R_IBEFORE flags, the $key parameter
will be set to the record number of the inserted key/value pair.
1086
1087 Valid flags are R_CURSOR, R_IAFTER, R_IBEFORE, R_NOOVERWRITE and
1088 R_SETCURSOR.
1089
1090 $status = $X->del($key [, $flags]) ;
1091 Removes all key/value pairs with key $key from the database.
1092
1093 A return code of 1 means that the requested key was not in the
1094 database.
1095
1096 R_CURSOR is the only valid flag at present.
1097
1098 $status = $X->fd ;
1099 Returns the file descriptor for the underlying database.
1100
1101 See "Locking: The Trouble with fd" for an explanation for why you
1102 should not use "fd" to lock your database.
1103
1104 $status = $X->seq($key, $value, $flags) ;
1105 This interface allows sequential retrieval from the database. See
1106 dbopen for full details.
1107
1108 Both the $key and $value parameters will be set to the key/value
1109 pair read from the database.
1110
The flags parameter is mandatory. The valid flag values are R_CURSOR,
R_FIRST, R_LAST, R_NEXT and R_PREV.
1113
1114 $status = $X->sync([$flags]) ;
1115 Flushes any cached buffers to disk.
1116
1117 R_RECNOSYNC is the only valid flag at present.
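
As a hedged sketch of how these methods fit together, the script below
walks a BTREE database with seq, then uses del and sync. The file name
seq_demo.db and the sample keys are made up for the example.

```perl
use strict ;
use warnings ;
use DB_File ;
use Fcntl ;

# "seq_demo.db" is a scratch file name assumed for this sketch
my $filename = "seq_demo.db" ;
unlink $filename ;

my %h ;
my $db = tie %h, 'DB_File', $filename, O_RDWR|O_CREAT, 0666, $DB_BTREE
    or die "Cannot tie $filename: $!" ;

for my $k (qw(pear apple fig)) {
    $db->put($k, length $k) == 0
        or die "Cannot store $k: $!" ;
}

# Walk the BTREE in key order: R_FIRST positions the cursor,
# R_NEXT advances it; seq returns non-zero once no record is left
my ($key, $value) = ("", "") ;
my @seen ;
for (my $status = $db->seq($key, $value, R_FIRST) ;
     $status == 0 ;
     $status = $db->seq($key, $value, R_NEXT))
{
    push @seen, "$key=$value" ;
}

# del returns 0 when it removed the key and 1 when the key was absent
my $gone    = $db->del("fig") ;
my $missing = $db->del("fig") ;

# flush any cached buffers to disk
my $synced = $db->sync ;

print "@seen\n" ;
print "del=$gone again=$missing sync=$synced\n" ;

undef $db ;
untie %h ;
unlink $filename ;
```

The loop uses only API calls, so the cursor is not disturbed by the
tied hash interface while the walk is in progress.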
1118
DBM FILTERS
A DBM Filter is a piece of code that is used when you always want to
make the same transformation to all keys and/or values in a DBM
database.
1123
1124 There are four methods associated with DBM Filters. All work identi‐
1125 cally, and each is used to install (or uninstall) a single DBM Filter.
1126 Each expects a single parameter, namely a reference to a sub. The only
1127 difference between them is the place that the filter is installed.
1128
1129 To summarise:
1130
1131 filter_store_key
1132 If a filter has been installed with this method, it will be
1133 invoked every time you write a key to a DBM database.
1134
1135 filter_store_value
1136 If a filter has been installed with this method, it will be
1137 invoked every time you write a value to a DBM database.
1138
1139 filter_fetch_key
1140 If a filter has been installed with this method, it will be
1141 invoked every time you read a key from a DBM database.
1142
1143 filter_fetch_value
1144 If a filter has been installed with this method, it will be
1145 invoked every time you read a value from a DBM database.
1146
1147 You can use any combination of the methods, from none, to all four.
1148
All filter methods return the existing filter, if present, or "undef"
if not.
1151
1152 To delete a filter pass "undef" to it.
1153
1154 The Filter
1155
1156 When each filter is called by Perl, a local copy of $_ will contain the
1157 key or value to be filtered. Filtering is achieved by modifying the
1158 contents of $_. The return code from the filter is ignored.
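
The points above can be sketched in a few lines: a filter works by
modifying $_, installing a filter returns the previous one, and passing
"undef" removes it. The file name filter_demo.db and the lower-casing
filter are illustrative assumptions, not part of DB_File.

```perl
use strict ;
use warnings ;
use DB_File ;
use Fcntl ;

# "filter_demo.db" is a scratch file name assumed for this sketch
my $filename = "filter_demo.db" ;
unlink $filename ;

my %h ;
my $db = tie %h, 'DB_File', $filename, O_RDWR|O_CREAT, 0666, $DB_HASH
    or die "Cannot tie $filename: $!" ;

# No filter is installed yet, so the returned "old" filter is undef.
# The filter itself changes the key by assigning to $_
my $previous = $db->filter_store_key( sub { $_ = lc $_ } ) ;

$h{"ABC"} = 1 ;     # stored under the key "abc"

# Passing undef removes the filter and returns the one that was installed
my $removed = $db->filter_store_key(undef) ;

$h{"XYZ"} = 2 ;     # no filter any more, stored unchanged

my @keys = sort keys %h ;
print "keys: @keys\n" ;

undef $db ;
untie %h ;
unlink $filename ;
```

Keeping the value returned by the install call makes it possible to put
the earlier filter back later by passing it to the same method.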
1159
1160 An Example -- the NULL termination problem.
1161
1162 Consider the following scenario. You have a DBM database that you need
1163 to share with a third-party C application. The C application assumes
1164 that all keys and values are NULL terminated. Unfortunately when Perl
1165 writes to DBM databases it doesn't use NULL termination, so your Perl
1166 application will have to manage NULL termination itself. When you write
1167 to the database you will have to use something like this:
1168
1169 $hash{"$key\0"} = "$value\0" ;
1170
1171 Similarly the NULL needs to be taken into account when you are consid‐
1172 ering the length of existing keys/values.
1173
It would be much better if you could ignore the NULL termination issue
in the main application code and have a mechanism that automatically
added the terminating NULL to all keys and values whenever you write to
the database and removed it when you read from the database. As I'm
sure you have already guessed, this is a problem that DBM Filters can
fix very easily.
1180
    use warnings ;
    use strict ;
    use DB_File ;

    my %hash ;
    my $filename = "filt" ;
    unlink $filename ;

    my $db = tie %hash, 'DB_File', $filename, O_CREAT|O_RDWR, 0666, $DB_HASH
        or die "Cannot open $filename: $!\n" ;

    # Install DBM Filters
    $db->filter_fetch_key  ( sub { s/\0$//    } ) ;
    $db->filter_store_key  ( sub { $_ .= "\0" } ) ;
    $db->filter_fetch_value( sub { s/\0$//    } ) ;
    $db->filter_store_value( sub { $_ .= "\0" } ) ;

    $hash{"abc"} = "def" ;
    # fetch the key just stored; the filters hide the terminating NULL
    my $value = $hash{"abc"} ;
    # ...
    undef $db ;
    untie %hash ;
1203
The contents of each filter should be self-explanatory. Both "fetch"
filters remove the terminating NULL, and both "store" filters add a
terminating NULL.
1207
1208 Another Example -- Key is a C int.
1209
1210 Here is another real-life example. By default, whenever Perl writes to
1211 a DBM database it always writes the key and value as strings. So when
1212 you use this:
1213
1214 $hash{12345} = "something" ;
1215
1216 the key 12345 will get stored in the DBM database as the 5 byte string
1217 "12345". If you actually want the key to be stored in the DBM database
1218 as a C int, you will have to use "pack" when writing, and "unpack" when
1219 reading.
1220
1221 Here is a DBM Filter that does it:
1222
    use warnings ;
    use strict ;
    use DB_File ;
    my %hash ;
    my $filename = "filt" ;
    unlink $filename ;

    my $db = tie %hash, 'DB_File', $filename, O_CREAT|O_RDWR, 0666, $DB_HASH
        or die "Cannot open $filename: $!\n" ;

    $db->filter_fetch_key  ( sub { $_ = unpack("i", $_) } ) ;
    $db->filter_store_key  ( sub { $_ = pack ("i", $_) } ) ;
    $hash{123} = "def" ;
    # ...
    undef $db ;
    untie %hash ;
1239
1240 This time only two filters have been used -- we only need to manipulate
1241 the contents of the key, so it wasn't necessary to install any value
1242 filters.
1243
1245 Locking: The Trouble with fd
1246
Until version 1.72 of this module, the recommended technique for
locking DB_File databases was to flock a filehandle opened on the file
descriptor returned by the "fd" method. Unfortunately this technique
has been shown to be fundamentally flawed (kudos to David Harris for
tracking this down). Use it at your own peril!
1252
1253 The locking technique went like this.
1254
    $db = tie(%db, 'DB_File', 'foo.db', O_CREAT|O_RDWR, 0644)
        || die "dbcreat foo.db $!";
    $fd = $db->fd;
    open(DB_FH, "+<&=$fd") || die "dup $!";
    flock (DB_FH, LOCK_EX) || die "flock: $!";
    ...
    $db{"Tom"} = "Jerry" ;
    ...
    flock(DB_FH, LOCK_UN);
    undef $db;
    untie %db;
    close(DB_FH);
1267
1268 In simple terms, this is what happens:
1269
1270 1. Use "tie" to open the database.
1271
1272 2. Lock the database with fd & flock.
1273
1274 3. Read & Write to the database.
1275
1276 4. Unlock and close the database.
1277
Here is the crux of the problem. A side-effect of opening the DB_File
database in step 1 is that an initial block from the database will get
read from disk and cached in memory, before the lock is taken in step
2.
1281
1282 To see why this is a problem, consider what can happen when two pro‐
1283 cesses, say "A" and "B", both want to update the same DB_File database
1284 using the locking steps outlined above. Assume process "A" has already
1285 opened the database and has a write lock, but it hasn't actually
1286 updated the database yet (it has finished step 2, but not started step
1287 3 yet). Now process "B" tries to open the same database - step 1 will
1288 succeed, but it will block on step 2 until process "A" releases the
1289 lock. The important thing to notice here is that at this point in time
1290 both processes will have cached identical initial blocks from the data‐
1291 base.
1292
1293 Now process "A" updates the database and happens to change some of the
1294 data held in the initial buffer. Process "A" terminates, flushing all
1295 cached data to disk and releasing the database lock. At this point the
1296 database on disk will correctly reflect the changes made by process
1297 "A".
1298
1299 With the lock released, process "B" can now continue. It also updates
1300 the database and unfortunately it too modifies the data that was in its
1301 initial buffer. Once that data gets flushed to disk it will overwrite
1302 some/all of the changes process "A" made to the database.
1303
The result of this scenario is at best a database that doesn't contain
what you expect. At worst the database will be corrupted.

The above won't happen every time competing processes update the same
DB_File database, but it does illustrate why the technique should not
be used.
1310
1311 Safe ways to lock a database
1312
1313 Starting with version 2.x, Berkeley DB has internal support for lock‐
1314 ing. The companion module to this one, BerkeleyDB, provides an inter‐
1315 face to this locking functionality. If you are serious about locking
1316 Berkeley DB databases, I strongly recommend using BerkeleyDB.
1317
1318 If using BerkeleyDB isn't an option, there are a number of modules
1319 available on CPAN that can be used to implement locking. Each one
1320 implements locking differently and has different goals in mind. It is
1321 therefore worth knowing the difference, so that you can pick the right
1322 one for your application. Here are the three locking wrappers:
1323
1324 Tie::DB_Lock
1325 A DB_File wrapper which creates copies of the database file for
1326 read access, so that you have a kind of a multiversioning concur‐
1327 rent read system. However, updates are still serial. Use for data‐
1328 bases where reads may be lengthy and consistency problems may
1329 occur.
1330
1331 Tie::DB_LockFile
1332 A DB_File wrapper that has the ability to lock and unlock the
1333 database while it is being used. Avoids the tie-before-flock prob‐
1334 lem by simply re-tie-ing the database when you get or drop a lock.
1335 Because of the flexibility in dropping and re-acquiring the lock
1336 in the middle of a session, this can be massaged into a system
1337 that will work with long updates and/or reads if the application
1338 follows the hints in the POD documentation.
1339
1340 DB_File::Lock
An extremely lightweight DB_File wrapper that simply flocks a
lockfile before tie-ing the database and drops the lock after the
untie. Allows one to use the same lockfile for multiple databases
to avoid deadlock problems, if desired. Use for databases where
updates and reads are quick and simple flock locking semantics are
enough.
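
The lockfile approach described for DB_File::Lock can also be sketched
by hand with core modules only. This is an illustration of the ordering
(lock first, tie second, untie before unlocking), not the
implementation used by any of the modules above; both file names are
placeholders.

```perl
use strict ;
use warnings ;
use DB_File ;
use Fcntl qw(:DEFAULT :flock) ;

# Both file names are placeholders assumed for this sketch
my $dbfile   = "locked_demo.db" ;
my $lockfile = "locked_demo.db.lock" ;

# 1. Take the lock on a separate file BEFORE the database is opened,
#    so nothing is cached until this process owns the lock
open(my $lock_fh, '>>', $lockfile) or die "Cannot open $lockfile: $!" ;
flock($lock_fh, LOCK_EX)           or die "Cannot lock $lockfile: $!" ;

# 2. Only now open the database. The object returned by tie is not
#    saved, so untie alone is enough to close the file later
my %db ;
tie %db, 'DB_File', $dbfile, O_CREAT|O_RDWR, 0644, $DB_HASH
    or die "Cannot tie $dbfile: $!" ;

# 3. Read and write
$db{"Tom"} = "Jerry" ;
my $value = $db{"Tom"} ;

# 4. Close the database, which flushes its buffers, then drop the lock
untie %db ;
flock($lock_fh, LOCK_UN) ;
close($lock_fh) ;

print "Tom => $value\n" ;
```

Because the database is opened only while the lock is held, a second
process cannot cache a stale initial block the way it can with the
"fd" technique.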
1347
1348 Sharing Databases With C Applications
1349
1350 There is no technical reason why a Berkeley DB database cannot be
1351 shared by both a Perl and a C application.
1352
1353 The vast majority of problems that are reported in this area boil down
1354 to the fact that C strings are NULL terminated, whilst Perl strings are
1355 not. See "DBM FILTERS" for a generic way to work around this problem.
1356
1357 Here is a real example. Netscape 2.0 keeps a record of the locations
1358 you visit along with the time you last visited them in a DB_HASH data‐
1359 base. This is usually stored in the file ~/.netscape/history.db. The
1360 key field in the database is the location string and the value field is
1361 the time the location was last visited stored as a 4 byte binary value.
1362
1363 If you haven't already guessed, the location string is stored with a
1364 terminating NULL. This means you need to be careful when accessing the
1365 database.
1366
1367 Here is a snippet of code that is loosely based on Tom Christiansen's
1368 ggh script (available from your nearest CPAN archive in
1369 authors/id/TOMC/scripts/nshist.gz).
1370
    use warnings ;
    use strict ;
    use DB_File ;
    use Fcntl ;

    my ($dotdir, $HISTORY, %hist_db, $href, $binary_time, $date) ;
    $dotdir = $ENV{HOME} || $ENV{LOGNAME};

    $HISTORY = "$dotdir/.netscape/history.db";

    tie %hist_db, 'DB_File', $HISTORY
        or die "Cannot open $HISTORY: $!\n" ;

    # Dump the complete database
    while ( ($href, $binary_time) = each %hist_db ) {

        # remove the terminating NULL
        $href =~ s/\x00$// ;

        # convert the binary time into a user friendly string
        $date = localtime unpack("V", $binary_time);
        print "$date $href\n" ;
    }

    # check for the existence of a specific key
    # remember to add the NULL
    if ( $binary_time = $hist_db{"http://mox.perl.com/\x00"} ) {
        $date = localtime unpack("V", $binary_time) ;
        print "Last visited mox.perl.com on $date\n" ;
    }
    else {
        print "Never visited mox.perl.com\n" ;
    }

    untie %hist_db ;
1406
1407 The untie() Gotcha
1408
1409 If you make use of the Berkeley DB API, it is very strongly recommended
1410 that you read "The untie Gotcha" in perltie.
1411
1412 Even if you don't currently make use of the API interface, it is still
1413 worth reading it.
1414
1415 Here is an example which illustrates the problem from a DB_File per‐
1416 spective:
1417
    use DB_File ;
    use Fcntl ;

    my %x ;
    my $X ;

    $X = tie %x, 'DB_File', 'tst.fil' , O_RDWR|O_TRUNC
        or die "Cannot tie first time: $!" ;

    $x{123} = 456 ;

    untie %x ;

    tie %x, 'DB_File', 'tst.fil' , O_RDWR|O_CREAT
        or die "Cannot tie second time: $!" ;

    untie %x ;
1435
1436 When run, the script will produce this error message:
1437
1438 Cannot tie second time: Invalid argument at bad.file line 14.
1439
1440 Although the error message above refers to the second tie() statement
1441 in the script, the source of the problem is really with the untie()
1442 statement that precedes it.
1443
1444 Having read perltie you will probably have already guessed that the
1445 error is caused by the extra copy of the tied object stored in $X. If
1446 you haven't, then the problem boils down to the fact that the DB_File
1447 destructor, DESTROY, will not be called until all references to the
1448 tied object are destroyed. Both the tied variable, %x, and $X above
1449 hold a reference to the object. The call to untie() will destroy the
1450 first, but $X still holds a valid reference, so the destructor will not
1451 get called and the database file tst.fil will remain open. The fact
1452 that Berkeley DB then reports the attempt to open a database that is
1453 already open via the catch-all "Invalid argument" doesn't help.
1454
1455 If you run the script with the "-w" flag the error message becomes:
1456
1457 untie attempted while 1 inner references still exist at bad.file line 12.
1458 Cannot tie second time: Invalid argument at bad.file line 14.
1459
which pinpoints the real problem. The script can now be fixed by
destroying the API object before the untie:
1463
    ...
    $x{123} = 456 ;

    undef $X ;
    untie %x ;

    $X = tie %x, 'DB_File', 'tst.fil' , O_RDWR|O_CREAT
    ...
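
For reference, here is one way the complete corrected script might
look. It is a sketch under two assumptions that are not in the script
above: O_CREAT is added to the first tie so that the example also runs
when tst.fil does not exist yet, and the mode and $DB_HASH arguments
are spelled out.

```perl
use strict ;
use warnings ;
use DB_File ;
use Fcntl ;

my %x ;
my $X = tie %x, 'DB_File', 'tst.fil', O_RDWR|O_CREAT|O_TRUNC, 0666, $DB_HASH
    or die "Cannot tie first time: $!" ;

$x{123} = 456 ;

# Drop the saved object first, so that untie really closes the file
undef $X ;
untie %x ;

# The file is now closed, so it can be opened a second time
$X = tie %x, 'DB_File', 'tst.fil', O_RDWR|O_CREAT, 0666, $DB_HASH
    or die "Cannot tie second time: $!" ;

my $value = $x{123} ;
print "123 => $value\n" ;

undef $X ;
untie %x ;
```

An alternative with the same effect is to keep $X in a smaller lexical
scope, so that it is destroyed before the untie is reached.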
1472
1474 Why is there Perl source in my database?
1475
1476 If you look at the contents of a database file created by DB_File,
1477 there can sometimes be part of a Perl script included in it.
1478
1479 This happens because Berkeley DB uses dynamic memory to allocate buf‐
1480 fers which will subsequently be written to the database file. Being
1481 dynamic, the memory could have been used for anything before DB mal‐
1482 loced it. As Berkeley DB doesn't clear the memory once it has been
1483 allocated, the unused portions will contain random junk. In the case
1484 where a Perl script gets written to the database, the random junk will
1485 correspond to an area of dynamic memory that happened to be used during
1486 the compilation of the script.
1487
1488 Unless you don't like the possibility of there being part of your Perl
1489 scripts embedded in a database file, this is nothing to worry about.
1490
1491 How do I store complex data structures with DB_File?
1492
1493 Although DB_File cannot do this directly, there is a module which can
1494 layer transparently over DB_File to accomplish this feat.
1495
Check out the MLDBM module, available on CPAN in the directory
modules/by-module/MLDBM.
1498
1499 What does "Invalid Argument" mean?
1500
1501 You will get this error message when one of the parameters in the "tie"
1502 call is wrong. Unfortunately there are quite a few parameters to get
1503 wrong, so it can be difficult to figure out which one it is.
1504
1505 Here are a couple of possibilities:
1506
1507 1. Attempting to reopen a database without closing it.
1508
1509 2. Using the O_WRONLY flag.
1510
1511 What does "Bareword 'DB_File' not allowed" mean?
1512
1513 You will encounter this particular error message when you have the
1514 "strict 'subs'" pragma (or the full strict pragma) in your script.
1515 Consider this script:
1516
1517 use warnings ;
1518 use strict ;
1519 use DB_File ;
1520 my %x ;
1521 tie %x, DB_File, "filename" ;
1522
1523 Running it produces the error in question:
1524
1525 Bareword "DB_File" not allowed while "strict subs" in use
1526
1527 To get around the error, place the word "DB_File" in either single or
1528 double quotes, like this:
1529
1530 tie %x, "DB_File", "filename" ;
1531
1532 Although it might seem like a real pain, it is really worth the effort
1533 of having a "use strict" in all your scripts.
1534
1536 Articles that are either about DB_File or make use of it.
1537
1538 1. Full-Text Searching in Perl, Tim Kientzle (tkientzle@ddj.com), Dr.
1539 Dobb's Journal, Issue 295, January 1999, pp 34-41
1540
The change history for DB_File has moved to the Changes file.
1543
1545 Some older versions of Berkeley DB had problems with fixed length
1546 records using the RECNO file format. This problem has been fixed since
1547 version 1.85 of Berkeley DB.
1548
1549 I am sure there are bugs in the code. If you do find any, or can sug‐
1550 gest any enhancements, I would welcome your comments.
1551
DB_File comes with the standard Perl source distribution. Look in the
directory ext/DB_File. Given the amount of time between releases of
Perl, the version that ships with Perl is quite likely to be out of
date. The most recent version can always be found on CPAN (see "CPAN"
in perlmodlib for details), in the directory
modules/by-module/DB_File.
1559
This version of DB_File will work with version 1.x, 2.x or 3.x of
Berkeley DB, but is limited to the functionality provided by version
1.
1563
1564 The official web site for Berkeley DB is http://www.sleepycat.com. All
1565 versions of Berkeley DB are available there.
1566
1567 Alternatively, Berkeley DB version 1 is available at your nearest CPAN
1568 archive in src/misc/db.1.85.tar.gz.
1569
1570 If you are running IRIX, then get Berkeley DB version 1 from
1571 http://reality.sgi.com/ariel. It has the patches necessary to compile
1572 properly on IRIX 5.3.
1573
1575 Copyright (c) 1995-2005 Paul Marquess. All rights reserved. This pro‐
1576 gram is free software; you can redistribute it and/or modify it under
1577 the same terms as Perl itself.
1578
1579 Although DB_File is covered by the Perl license, the library it makes
1580 use of, namely Berkeley DB, is not. Berkeley DB has its own copyright
1581 and its own license. Please take the time to read it.
1582
Here are a few words taken from the Berkeley DB FAQ (at
http://www.sleepycat.com) regarding the license:
1585
1586 Do I have to license DB to use it in Perl scripts?
1587
1588 No. The Berkeley DB license requires that software that uses
1589 Berkeley DB be freely redistributable. In the case of Perl, that
1590 software is Perl, and not your scripts. Any Perl scripts that you
1591 write are your property, including scripts that make use of
1592 Berkeley DB. Neither the Perl license nor the Berkeley DB license
1593 place any restriction on what you may do with them.
1594
1595 If you are in any doubt about the license situation, contact either the
1596 Berkeley DB authors or the author of DB_File. See "AUTHOR" for details.
1597
See also: perl, dbopen(3), hash(3), recno(3), btree(3), perldbmfilter
1600
AUTHOR
The DB_File interface was written by Paul Marquess <pmqs@cpan.org>.
Questions about the DB system itself may be addressed to
<db@sleepycat.com>.
1605
1606
1607
1608perl v5.8.8 2001-09-21 DB_File(3pm)