DB_File(3pm)      Perl Programmers Reference Guide      DB_File(3pm)

NAME
DB_File - Perl5 access to Berkeley DB version 1.x
7
SYNOPSIS
use DB_File;
10
11 [$X =] tie %hash, 'DB_File', [$filename, $flags, $mode, $DB_HASH] ;
12 [$X =] tie %hash, 'DB_File', $filename, $flags, $mode, $DB_BTREE ;
13 [$X =] tie @array, 'DB_File', $filename, $flags, $mode, $DB_RECNO ;
14
15 $status = $X->del($key [, $flags]) ;
16 $status = $X->put($key, $value [, $flags]) ;
17 $status = $X->get($key, $value [, $flags]) ;
18 $status = $X->seq($key, $value, $flags) ;
19 $status = $X->sync([$flags]) ;
20 $status = $X->fd ;
21
22 # BTREE only
23 $count = $X->get_dup($key) ;
24 @list = $X->get_dup($key) ;
25 %list = $X->get_dup($key, 1) ;
26 $status = $X->find_dup($key, $value) ;
27 $status = $X->del_dup($key, $value) ;
28
29 # RECNO only
30 $a = $X->length;
31 $a = $X->pop ;
32 $X->push(list);
33 $a = $X->shift;
34 $X->unshift(list);
35 @r = $X->splice(offset, length, elements);
36
37 # DBM Filters
38 $old_filter = $db->filter_store_key ( sub { ... } ) ;
39 $old_filter = $db->filter_store_value( sub { ... } ) ;
40 $old_filter = $db->filter_fetch_key ( sub { ... } ) ;
41 $old_filter = $db->filter_fetch_value( sub { ... } ) ;
42
43 untie %hash ;
44 untie @array ;
45
DESCRIPTION
DB_File is a module which allows Perl programs to make use of the
facilities provided by Berkeley DB version 1.x (if you have a newer
version of DB, see "Using DB_File with Berkeley DB version 2 or
greater"). This documentation assumes that you have a copy of the
Berkeley DB manual pages at hand. The interface defined here mirrors
the Berkeley DB interface closely.
53
54 Berkeley DB is a C library which provides a consistent interface to a
55 number of database formats. DB_File provides an interface to all three
56 of the database types currently supported by Berkeley DB.
57
58 The file types are:
59
60 DB_HASH
61 This database type allows arbitrary key/value pairs to be stored
62 in data files. This is equivalent to the functionality provided by
63 other hashing packages like DBM, NDBM, ODBM, GDBM, and SDBM.
64 Remember though, the files created using DB_HASH are not
65 compatible with any of the other packages mentioned.
66
67 A default hashing algorithm, which will be adequate for most
68 applications, is built into Berkeley DB. If you do need to use
69 your own hashing algorithm it is possible to write your own in
70 Perl and have DB_File use it instead.
71
72 DB_BTREE
The btree format allows arbitrary key/value pairs to be stored in
a sorted, balanced tree (a B-tree).
75
76 As with the DB_HASH format, it is possible to provide a user
77 defined Perl routine to perform the comparison of keys. By
78 default, though, the keys are stored in lexical order.
79
80 DB_RECNO
81 DB_RECNO allows both fixed-length and variable-length flat text
82 files to be manipulated using the same key/value pair interface as
83 in DB_HASH and DB_BTREE. In this case the key will consist of a
84 record (line) number.
85
86 Using DB_File with Berkeley DB version 2 or greater
87 Although DB_File is intended to be used with Berkeley DB version 1, it
88 can also be used with version 2, 3 or 4. In this case the interface is
89 limited to the functionality provided by Berkeley DB 1.x. Anywhere the
90 version 2 or greater interface differs, DB_File arranges for it to work
91 like version 1. This feature allows DB_File scripts that were built
92 with version 1 to be migrated to version 2 or greater without any
93 changes.
94
95 If you want to make use of the new features available in Berkeley DB
96 2.x or greater, use the Perl module BerkeleyDB instead.
97
98 Note: The database file format has changed multiple times in Berkeley
99 DB version 2, 3 and 4. If you cannot recreate your databases, you must
100 dump any existing databases with either the "db_dump" or the
101 "db_dump185" utility that comes with Berkeley DB. Once you have
102 rebuilt DB_File to use Berkeley DB version 2 or greater, your databases
103 can be recreated using "db_load". Refer to the Berkeley DB
104 documentation for further details.
105
106 Please read "COPYRIGHT" before using version 2.x or greater of Berkeley
107 DB with DB_File.
108
109 Interface to Berkeley DB
110 DB_File allows access to Berkeley DB files using the tie() mechanism in
111 Perl 5 (for full details, see "tie()" in perlfunc). This facility
112 allows DB_File to access Berkeley DB files using either an associative
113 array (for DB_HASH & DB_BTREE file types) or an ordinary array (for the
114 DB_RECNO file type).
115
116 In addition to the tie() interface, it is also possible to access most
117 of the functions provided in the Berkeley DB API directly. See "THE
118 API INTERFACE".
119
120 Opening a Berkeley DB Database File
121 Berkeley DB uses the function dbopen() to open or create a database.
122 Here is the C prototype for dbopen():
123
124 DB*
125 dbopen (const char * file, int flags, int mode,
126 DBTYPE type, const void * openinfo)
127
128 The parameter "type" is an enumeration which specifies which of the 3
129 interface methods (DB_HASH, DB_BTREE or DB_RECNO) is to be used.
130 Depending on which of these is actually chosen, the final parameter,
131 openinfo points to a data structure which allows tailoring of the
132 specific interface method.
133
134 This interface is handled slightly differently in DB_File. Here is an
135 equivalent call using DB_File:
136
137 tie %array, 'DB_File', $filename, $flags, $mode, $DB_HASH ;
138
139 The "filename", "flags" and "mode" parameters are the direct equivalent
140 of their dbopen() counterparts. The final parameter $DB_HASH performs
141 the function of both the "type" and "openinfo" parameters in dbopen().
142
143 In the example above $DB_HASH is actually a pre-defined reference to a
144 hash object. DB_File has three of these pre-defined references. Apart
145 from $DB_HASH, there is also $DB_BTREE and $DB_RECNO.
146
The keys allowed in each of these pre-defined references are limited to
the names used in the equivalent C structure. So, for example, the
$DB_HASH reference will only allow keys called "bsize", "cachesize",
"ffactor", "hash", "lorder" and "nelem".
151
152 To change one of these elements, just assign to it like this:
153
154 $DB_HASH->{'cachesize'} = 10000 ;
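
The restriction on key names can be seen in a short sketch. It assumes
DB_File is installed, uses a private HASHINFO object so that $DB_HASH
itself is not altered, and uses a made-up key name, "bogus", purely for
illustration. The exact wording of the error message is not something
to rely on.

```perl
use strict ;
use warnings ;
use DB_File ;

# A private HASHINFO object, so the shared $DB_HASH is left untouched.
my $info = DB_File::HASHINFO->new ;

# "cachesize" is one of the permitted keys.
$info->{'cachesize'} = 10000 ;

# "bogus" is a hypothetical key that is not in the permitted list;
# the assignment is expected to die, so trap it with eval.
my $ok    = eval { $info->{'bogus'} = 1 ; 1 } ;
my $error = $ok ? '' : $@ ;

print "cachesize accepted\n" ;
print "bogus rejected: $error" if $error ;
```

Trapping the failure with eval is only needed if the key name comes
from outside the program; a typo in a literal key is better left to
die.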
155
156 The three predefined variables $DB_HASH, $DB_BTREE and $DB_RECNO are
157 usually adequate for most applications. If you do need to create extra
158 instances of these objects, constructors are available for each file
159 type.
160
161 Here are examples of the constructors and the valid options available
162 for DB_HASH, DB_BTREE and DB_RECNO respectively.
163
164 $a = new DB_File::HASHINFO ;
165 $a->{'bsize'} ;
166 $a->{'cachesize'} ;
167 $a->{'ffactor'};
168 $a->{'hash'} ;
169 $a->{'lorder'} ;
170 $a->{'nelem'} ;
171
172 $b = new DB_File::BTREEINFO ;
173 $b->{'flags'} ;
174 $b->{'cachesize'} ;
175 $b->{'maxkeypage'} ;
176 $b->{'minkeypage'} ;
177 $b->{'psize'} ;
178 $b->{'compare'} ;
179 $b->{'prefix'} ;
180 $b->{'lorder'} ;
181
182 $c = new DB_File::RECNOINFO ;
183 $c->{'bval'} ;
184 $c->{'cachesize'} ;
185 $c->{'psize'} ;
186 $c->{'flags'} ;
187 $c->{'lorder'} ;
188 $c->{'reclen'} ;
189 $c->{'bfname'} ;
190
The values stored in the hashes above are mostly the direct equivalents
of their C counterparts. Like their C counterparts, all are set to
default values, which means you don't have to set all of the values
when you only want to change one. Here is an example:
195
196 $a = new DB_File::HASHINFO ;
197 $a->{'cachesize'} = 12345 ;
198 tie %y, 'DB_File', "filename", $flags, 0777, $a ;
199
200 A few of the options need extra discussion here. When used, the C
201 equivalent of the keys "hash", "compare" and "prefix" store pointers to
202 C functions. In DB_File these keys are used to store references to Perl
203 subs. Below are templates for each of the subs:
204
    sub hash
    {
        my ($data) = @_ ;
        ...
        # return the hash value for $data
        return $hash ;
    }

    sub compare
    {
        my ($key1, $key2) = @_ ;
        ...
        # return  0 if $key1 eq $key2
        #        -1 if $key1 lt $key2
        #         1 if $key1 gt $key2
        return $result ;    # one of -1, 0 or 1
    }

    sub prefix
    {
        my ($key1, $key2) = @_ ;
        ...
        # return number of bytes of $key2 which are
        # necessary to determine that it is greater than $key1
        return $bytes ;
    }
231
232 See "Changing the BTREE sort order" for an example of using the
233 "compare" template.
234
235 If you are using the DB_RECNO interface and you intend making use of
236 "bval", you should check out "The 'bval' Option".
237
238 Default Parameters
239 It is possible to omit some or all of the final 4 parameters in the
240 call to "tie" and let them take default values. As DB_HASH is the most
241 common file format used, the call:
242
243 tie %A, "DB_File", "filename" ;
244
245 is equivalent to:
246
247 tie %A, "DB_File", "filename", O_CREAT|O_RDWR, 0666, $DB_HASH ;
248
249 It is also possible to omit the filename parameter as well, so the
250 call:
251
252 tie %A, "DB_File" ;
253
254 is equivalent to:
255
256 tie %A, "DB_File", undef, O_CREAT|O_RDWR, 0666, $DB_HASH ;
257
258 See "In Memory Databases" for a discussion on the use of "undef" in
259 place of a filename.
260
261 In Memory Databases
262 Berkeley DB allows the creation of in-memory databases by using NULL
263 (that is, a "(char *)0" in C) in place of the filename. DB_File uses
264 "undef" instead of NULL to provide this functionality.
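
As a minimal sketch of this, assuming DB_File is installed, the script
below ties a hash to an in-memory DB_HASH database. The variable names
are illustrative only.

```perl
use strict ;
use warnings ;
use DB_File ;

# undef in place of a filename asks for an in-memory database.
my %cache ;
tie %cache, 'DB_File', undef, O_RDWR|O_CREAT, 0666, $DB_HASH
    or die "Cannot create in-memory database: $!\n" ;

$cache{'apple'}  = 'red' ;
$cache{'banana'} = 'yellow' ;

my $count = scalar keys %cache ;
my $apple = $cache{'apple'} ;
print "Stored $count keys\n" ;
print "apple is $apple\n" ;

# The data is discarded once the hash is untied.
untie %cache ;
```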
265
DB_HASH
The DB_HASH file format is probably the most commonly used of the three
file formats that DB_File supports. It is also very straightforward to
use.
270
271 A Simple Example
272 This example shows how to create a database, add key/value pairs to the
273 database, delete keys/value pairs and finally how to enumerate the
274 contents of the database.
275
    use warnings ;
    use strict ;
    use DB_File ;
    our (%h, $k, $v) ;

    unlink "fruit" ;
    tie %h, "DB_File", "fruit", O_RDWR|O_CREAT, 0666, $DB_HASH
        or die "Cannot open file 'fruit': $!\n";

    # Add a few key/value pairs to the file
    $h{"apple"} = "red" ;
    $h{"orange"} = "orange" ;
    $h{"banana"} = "yellow" ;
    $h{"tomato"} = "red" ;

    # Check for existence of a key
    print "Banana Exists\n\n" if exists $h{"banana"} ;

    # Delete a key/value pair.
    delete $h{"apple"} ;

    # print the contents of the file
    while (($k, $v) = each %h)
      { print "$k -> $v\n" }

    untie %h ;
302
Here is the output:
304
305 Banana Exists
306
307 orange -> orange
308 tomato -> red
309 banana -> yellow
310
Note that, as with ordinary associative arrays, the keys are retrieved
in an apparently random order.
313
DB_BTREE
The DB_BTREE format is useful when you want to store data in a given
order. By default the keys will be stored in lexical order, but as you
will see from the example shown in the next section, it is very easy to
define your own sorting function.
319
320 Changing the BTREE sort order
321 This script shows how to override the default sorting algorithm that
322 BTREE uses. Instead of using the normal lexical ordering, a case
323 insensitive compare function will be used.
324
325 use warnings ;
326 use strict ;
327 use DB_File ;
328
329 my %h ;
330
331 sub Compare
332 {
333 my ($key1, $key2) = @_ ;
334 "\L$key1" cmp "\L$key2" ;
335 }
336
337 # specify the Perl sub that will do the comparison
338 $DB_BTREE->{'compare'} = \&Compare ;
339
340 unlink "tree" ;
341 tie %h, "DB_File", "tree", O_RDWR|O_CREAT, 0666, $DB_BTREE
342 or die "Cannot open file 'tree': $!\n" ;
343
344 # Add a key/value pair to the file
345 $h{'Wall'} = 'Larry' ;
346 $h{'Smith'} = 'John' ;
347 $h{'mouse'} = 'mickey' ;
348 $h{'duck'} = 'donald' ;
349
350 # Delete
351 delete $h{"duck"} ;
352
353 # Cycle through the keys printing them in order.
354 # Note it is not necessary to sort the keys as
355 # the btree will have kept them in order automatically.
356 foreach (keys %h)
357 { print "$_\n" }
358
359 untie %h ;
360
361 Here is the output from the code above.
362
363 mouse
364 Smith
365 Wall
366
There are a few points to bear in mind if you want to change the
ordering in a BTREE database:
369
370 1. The new compare function must be specified when you create the
371 database.
372
373 2. You cannot change the ordering once the database has been created.
374 Thus you must use the same compare function every time you access
375 the database.
376
3. Duplicate keys are entirely defined by the comparison function.
   In the case-insensitive example above, the keys 'KEY' and 'key'
   would be considered duplicates, and assigning to the second one
   would overwrite the first. If duplicates are allowed (with the
   R_DUP flag discussed below), only a single copy of the duplicate
   key is stored in the database. So, again with the example above,
   assigning three values to the keys 'KEY', 'Key' and 'key' would
   leave just the first key, 'KEY', in the database with three
   values. In some situations this results in information loss, so
   take care to provide a comparison function that distinguishes
   every key you need to keep distinct. For example, the comparison
   routine above could be modified to fall back on a case-sensitive
   comparison when two keys are equal in the case-insensitive
   comparison:

       sub compare {
           my($key1, $key2) = @_;
           lc $key1 cmp lc $key2 ||
           $key1 cmp $key2;
       }

   Now you will only have duplicates when the keys themselves are
   truly the same. (Note: in versions of the db library prior to
   about November 1996, such duplicate keys were retained, so it was
   possible to recover the original keys in sets of keys that
   compared as equal.)
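
The effect of a comparison function with a case-sensitive tie-break
can be tried out with a sketch like the one below. It assumes DB_File
is installed and uses an in-memory database; the sub name
"compare_with_tiebreak" and the sample keys are made up for the
illustration.

```perl
use strict ;
use warnings ;
use DB_File ;

# Case-insensitive order first, exact string order as the tie-break.
sub compare_with_tiebreak
{
    my ($key1, $key2) = @_ ;
    lc $key1 cmp lc $key2 ||
       $key1 cmp $key2 ;
}

# A private BTREEINFO object keeps $DB_BTREE unchanged.
my $info = DB_File::BTREEINFO->new ;
$info->{'compare'} = \&compare_with_tiebreak ;

# undef as the filename gives an in-memory database.
my %h ;
tie %h, 'DB_File', undef, O_RDWR|O_CREAT, 0666, $info
    or die "Cannot create in-memory BTREE: $!\n" ;

$h{'key'} = 'lower' ;
$h{'KEY'} = 'upper' ;
$h{'Key'} = 'mixed' ;

# No two of these keys compare as equal, so all three are kept.
my @keys = keys %h ;
print "@keys\n" ;

untie %h ;
```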
402
403 Handling Duplicate Keys
404 The BTREE file type optionally allows a single key to be associated
405 with an arbitrary number of values. This option is enabled by setting
406 the flags element of $DB_BTREE to R_DUP when creating the database.
407
408 There are some difficulties in using the tied hash interface if you
409 want to manipulate a BTREE database with duplicate keys. Consider this
410 code:
411
412 use warnings ;
413 use strict ;
414 use DB_File ;
415
416 my ($filename, %h) ;
417
418 $filename = "tree" ;
419 unlink $filename ;
420
421 # Enable duplicate records
422 $DB_BTREE->{'flags'} = R_DUP ;
423
424 tie %h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_BTREE
425 or die "Cannot open $filename: $!\n";
426
427 # Add some key/value pairs to the file
428 $h{'Wall'} = 'Larry' ;
429 $h{'Wall'} = 'Brick' ; # Note the duplicate key
430 $h{'Wall'} = 'Brick' ; # Note the duplicate key and value
431 $h{'Smith'} = 'John' ;
432 $h{'mouse'} = 'mickey' ;
433
434 # iterate through the associative array
435 # and print each key/value pair.
436 foreach (sort keys %h)
437 { print "$_ -> $h{$_}\n" }
438
439 untie %h ;
440
441 Here is the output:
442
443 Smith -> John
444 Wall -> Larry
445 Wall -> Larry
446 Wall -> Larry
447 mouse -> mickey
448
As you can see, 3 records have been successfully created with the key
"Wall". However, when they are retrieved from the database they all
appear to have the same value, namely "Larry". The problem is caused by
the way the associative array interface works: when it is used to fetch
the value associated with a given key, it only ever retrieves the first
value.
455
456 Although it may not be immediately obvious from the code above, the
457 associative array interface can be used to write values with duplicate
458 keys, but it cannot be used to read them back from the database.
459
460 The way to get around this problem is to use the Berkeley DB API method
461 called "seq". This method allows sequential access to key/value pairs.
462 See "THE API INTERFACE" for details of both the "seq" method and the
463 API in general.
464
465 Here is the script above rewritten using the "seq" API method.
466
467 use warnings ;
468 use strict ;
469 use DB_File ;
470
471 my ($filename, $x, %h, $status, $key, $value) ;
472
473 $filename = "tree" ;
474 unlink $filename ;
475
476 # Enable duplicate records
477 $DB_BTREE->{'flags'} = R_DUP ;
478
479 $x = tie %h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_BTREE
480 or die "Cannot open $filename: $!\n";
481
482 # Add some key/value pairs to the file
483 $h{'Wall'} = 'Larry' ;
484 $h{'Wall'} = 'Brick' ; # Note the duplicate key
485 $h{'Wall'} = 'Brick' ; # Note the duplicate key and value
486 $h{'Smith'} = 'John' ;
487 $h{'mouse'} = 'mickey' ;
488
489 # iterate through the btree using seq
490 # and print each key/value pair.
491 $key = $value = 0 ;
492 for ($status = $x->seq($key, $value, R_FIRST) ;
493 $status == 0 ;
494 $status = $x->seq($key, $value, R_NEXT) )
495 { print "$key -> $value\n" }
496
497 undef $x ;
498 untie %h ;
499
That prints:
501
502 Smith -> John
503 Wall -> Brick
504 Wall -> Brick
505 Wall -> Larry
506 mouse -> mickey
507
508 This time we have got all the key/value pairs, including the multiple
509 values associated with the key "Wall".
510
511 To make life easier when dealing with duplicate keys, DB_File comes
512 with a few utility methods.
513
514 The get_dup() Method
515 The "get_dup" method assists in reading duplicate values from BTREE
516 databases. The method can take the following forms:
517
518 $count = $x->get_dup($key) ;
519 @list = $x->get_dup($key) ;
520 %list = $x->get_dup($key, 1) ;
521
522 In a scalar context the method returns the number of values associated
523 with the key, $key.
524
525 In list context, it returns all the values which match $key. Note that
526 the values will be returned in an apparently random order.
527
528 In list context, if the second parameter is present and evaluates TRUE,
529 the method returns an associative array. The keys of the associative
530 array correspond to the values that matched in the BTREE and the values
531 of the array are a count of the number of times that particular value
532 occurred in the BTREE.
533
534 So assuming the database created above, we can use "get_dup" like this:
535
536 use warnings ;
537 use strict ;
538 use DB_File ;
539
540 my ($filename, $x, %h) ;
541
542 $filename = "tree" ;
543
544 # Enable duplicate records
545 $DB_BTREE->{'flags'} = R_DUP ;
546
547 $x = tie %h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_BTREE
548 or die "Cannot open $filename: $!\n";
549
550 my $cnt = $x->get_dup("Wall") ;
551 print "Wall occurred $cnt times\n" ;
552
553 my %hash = $x->get_dup("Wall", 1) ;
554 print "Larry is there\n" if $hash{'Larry'} ;
555 print "There are $hash{'Brick'} Brick Walls\n" ;
556
557 my @list = sort $x->get_dup("Wall") ;
558 print "Wall => [@list]\n" ;
559
560 @list = $x->get_dup("Smith") ;
561 print "Smith => [@list]\n" ;
562
563 @list = $x->get_dup("Dog") ;
564 print "Dog => [@list]\n" ;
565
566 and it will print:
567
568 Wall occurred 3 times
569 Larry is there
570 There are 2 Brick Walls
571 Wall => [Brick Brick Larry]
572 Smith => [John]
573 Dog => []
574
575 The find_dup() Method
576 $status = $X->find_dup($key, $value) ;
577
578 This method checks for the existence of a specific key/value pair. If
579 the pair exists, the cursor is left pointing to the pair and the method
580 returns 0. Otherwise the method returns a non-zero value.
581
582 Assuming the database from the previous example:
583
584 use warnings ;
585 use strict ;
586 use DB_File ;
587
588 my ($filename, $x, %h, $found) ;
589
590 $filename = "tree" ;
591
592 # Enable duplicate records
593 $DB_BTREE->{'flags'} = R_DUP ;
594
595 $x = tie %h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_BTREE
596 or die "Cannot open $filename: $!\n";
597
598 $found = ( $x->find_dup("Wall", "Larry") == 0 ? "" : "not") ;
599 print "Larry Wall is $found there\n" ;
600
601 $found = ( $x->find_dup("Wall", "Harry") == 0 ? "" : "not") ;
602 print "Harry Wall is $found there\n" ;
603
604 undef $x ;
605 untie %h ;
606
This prints:
608
609 Larry Wall is there
610 Harry Wall is not there
611
612 The del_dup() Method
613 $status = $X->del_dup($key, $value) ;
614
615 This method deletes a specific key/value pair. It returns 0 if they
616 exist and have been deleted successfully. Otherwise the method returns
617 a non-zero value.
618
619 Again assuming the existence of the "tree" database
620
621 use warnings ;
622 use strict ;
623 use DB_File ;
624
625 my ($filename, $x, %h, $found) ;
626
627 $filename = "tree" ;
628
629 # Enable duplicate records
630 $DB_BTREE->{'flags'} = R_DUP ;
631
632 $x = tie %h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_BTREE
633 or die "Cannot open $filename: $!\n";
634
635 $x->del_dup("Wall", "Larry") ;
636
637 $found = ( $x->find_dup("Wall", "Larry") == 0 ? "" : "not") ;
638 print "Larry Wall is $found there\n" ;
639
640 undef $x ;
641 untie %h ;
642
This prints:
644
645 Larry Wall is not there
646
647 Matching Partial Keys
648 The BTREE interface has a feature which allows partial keys to be
649 matched. This functionality is only available when the "seq" method is
650 used along with the R_CURSOR flag.
651
652 $x->seq($key, $value, R_CURSOR) ;
653
654 Here is the relevant quote from the dbopen man page where it defines
655 the use of the R_CURSOR flag with seq:
656
657 Note, for the DB_BTREE access method, the returned key is not
658 necessarily an exact match for the specified key. The returned key
659 is the smallest key greater than or equal to the specified key,
660 permitting partial key matches and range searches.
661
662 In the example script below, the "match" sub uses this feature to find
663 and print the first matching key/value pair given a partial key.
664
665 use warnings ;
666 use strict ;
667 use DB_File ;
668 use Fcntl ;
669
670 my ($filename, $x, %h, $st, $key, $value) ;
671
672 sub match
673 {
674 my $key = shift ;
675 my $value = 0;
676 my $orig_key = $key ;
677 $x->seq($key, $value, R_CURSOR) ;
678 print "$orig_key\t-> $key\t-> $value\n" ;
679 }
680
681 $filename = "tree" ;
682 unlink $filename ;
683
684 $x = tie %h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_BTREE
685 or die "Cannot open $filename: $!\n";
686
687 # Add some key/value pairs to the file
688 $h{'mouse'} = 'mickey' ;
689 $h{'Wall'} = 'Larry' ;
690 $h{'Walls'} = 'Brick' ;
691 $h{'Smith'} = 'John' ;
692
693
694 $key = $value = 0 ;
695 print "IN ORDER\n" ;
696 for ($st = $x->seq($key, $value, R_FIRST) ;
697 $st == 0 ;
698 $st = $x->seq($key, $value, R_NEXT) )
699
700 { print "$key -> $value\n" }
701
702 print "\nPARTIAL MATCH\n" ;
703
704 match "Wa" ;
705 match "A" ;
706 match "a" ;
707
708 undef $x ;
709 untie %h ;
710
711 Here is the output:
712
713 IN ORDER
714 Smith -> John
715 Wall -> Larry
716 Walls -> Brick
717 mouse -> mickey
718
719 PARTIAL MATCH
720 Wa -> Wall -> Larry
721 A -> Smith -> John
722 a -> mouse -> mickey
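
The same R_CURSOR behaviour can be used for a range search rather than
a single lookup. The sketch below is one possible way to do it, not
part of DB_File itself: the helper "keys_with_prefix" is a hypothetical
name, and the script assumes the default lexical ordering and an
in-memory database.

```perl
use strict ;
use warnings ;
use DB_File ;

my %h ;
my $x = tie %h, 'DB_File', undef, O_RDWR|O_CREAT, 0666, $DB_BTREE
    or die "Cannot create in-memory BTREE: $!\n" ;

$h{'mouse'} = 'mickey' ;
$h{'Wall'}  = 'Larry' ;
$h{'Walls'} = 'Brick' ;
$h{'Smith'} = 'John' ;

# Collect every key/value pair whose key starts with $prefix.
sub keys_with_prefix
{
    my ($db, $prefix) = @_ ;
    my @found ;
    my ($key, $value) = ($prefix, '') ;

    # R_CURSOR positions on the smallest key >= $prefix;
    # R_NEXT then walks forward in sort order until the
    # prefix no longer matches or the database is exhausted.
    for (my $st = $db->seq($key, $value, R_CURSOR) ;
         $st == 0 && index($key, $prefix) == 0 ;
         $st = $db->seq($key, $value, R_NEXT))
    {
        push @found, "$key -> $value" ;
    }
    return @found ;
}

my @matches = keys_with_prefix($x, 'Wa') ;
print "$_\n" for @matches ;

undef $x ;
untie %h ;
```

Stopping as soon as the prefix test fails is what keeps the scan from
running on to "mouse"; this relies on keys with a common prefix being
adjacent under the default ordering.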
723
DB_RECNO
DB_RECNO provides an interface to flat text files. Both variable and
fixed length records are supported.
727
728 In order to make RECNO more compatible with Perl, the array offset for
729 all RECNO arrays begins at 0 rather than 1 as in Berkeley DB.
730
731 As with normal Perl arrays, a RECNO array can be accessed using
732 negative indexes. The index -1 refers to the last element of the array,
733 -2 the second last, and so on. Attempting to access an element before
734 the start of the array will raise a fatal run-time error.
735
736 The 'bval' Option
737 The operation of the bval option warrants some discussion. Here is the
738 definition of bval from the Berkeley DB 1.85 recno manual page:
739
    The delimiting byte to be used to mark the end of a record for
    variable-length records, and the pad character for fixed-length
    records. If no value is specified, newlines (``\n'') are used to
    mark the end of variable-length records and fixed-length records
    are padded with spaces.
746
747 The second sentence is wrong. In actual fact bval will only default to
748 "\n" when the openinfo parameter in dbopen is NULL. If a non-NULL
749 openinfo parameter is used at all, the value that happens to be in bval
750 will be used. That means you always have to specify bval when making
751 use of any of the options in the openinfo parameter. This documentation
752 error will be fixed in the next release of Berkeley DB.
753
That clarifies the situation with regard to Berkeley DB itself. What
about DB_File? Well, the behavior defined in the quote above is quite
useful, so DB_File conforms to it.
757
758 That means that you can specify other options (e.g. cachesize) and
759 still have bval default to "\n" for variable length records, and space
760 for fixed length records.
761
762 Also note that the bval option only allows you to specify a single byte
763 as a delimiter.
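
Here is a small sketch of bval in use for variable-length records. It
assumes DB_File is installed and that the current directory is
writable; the scratch file name "bval_demo.txt" and the choice of "|"
as delimiter are arbitrary.

```perl
use strict ;
use warnings ;
use DB_File ;

my $file = "bval_demo.txt" ;    # hypothetical scratch file
unlink $file ;

# A private RECNOINFO object; only bval is changed.
my $info = DB_File::RECNOINFO->new ;
$info->{'bval'} = "|" ;         # must be a single byte

my @h ;
tie @h, 'DB_File', $file, O_RDWR|O_CREAT, 0666, $info
    or die "Cannot open $file: $!\n" ;

$h[0] = "red" ;
$h[1] = "green" ;
$h[2] = "blue" ;

untie @h ;

# Read the raw file back to see the delimiter that was written.
open my $fh, '<', $file or die "Cannot read $file: $!\n" ;
my $raw = do { local $/ ; <$fh> } ;
close $fh ;
unlink $file ;

my @fields = split /\|/, $raw ;
print "raw file: $raw\n" ;
print "records: @fields\n" ;
```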
764
765 A Simple Example
766 Here is a simple example that uses RECNO (if you are using a version of
767 Perl earlier than 5.004_57 this example won't work -- see "Extra RECNO
768 Methods" for a workaround).
769
770 use warnings ;
771 use strict ;
772 use DB_File ;
773
774 my $filename = "text" ;
775 unlink $filename ;
776
777 my @h ;
778 tie @h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_RECNO
779 or die "Cannot open file 'text': $!\n" ;
780
781 # Add a few key/value pairs to the file
782 $h[0] = "orange" ;
783 $h[1] = "blue" ;
784 $h[2] = "yellow" ;
785
786 push @h, "green", "black" ;
787
788 my $elements = scalar @h ;
789 print "The array contains $elements entries\n" ;
790
791 my $last = pop @h ;
792 print "popped $last\n" ;
793
794 unshift @h, "white" ;
795 my $first = shift @h ;
796 print "shifted $first\n" ;
797
798 # Check for existence of a key
799 print "Element 1 Exists with value $h[1]\n" if $h[1] ;
800
801 # use a negative index
802 print "The last element is $h[-1]\n" ;
803 print "The 2nd last element is $h[-2]\n" ;
804
805 untie @h ;
806
807 Here is the output from the script:
808
809 The array contains 5 entries
810 popped black
811 shifted white
812 Element 1 Exists with value blue
813 The last element is green
814 The 2nd last element is yellow
815
816 Extra RECNO Methods
817 If you are using a version of Perl earlier than 5.004_57, the tied
818 array interface is quite limited. In the example script above "push",
819 "pop", "shift", "unshift" or determining the array length will not work
820 with a tied array.
821
822 To make the interface more useful for older versions of Perl, a number
823 of methods are supplied with DB_File to simulate the missing array
824 operations. All these methods are accessed via the object returned from
825 the tie call.
826
827 Here are the methods:
828
829 $X->push(list) ;
830 Pushes the elements of "list" to the end of the array.
831
832 $value = $X->pop ;
833 Removes and returns the last element of the array.
834
$value = $X->shift ;
    Removes and returns the first element of the array.

$X->unshift(list) ;
    Pushes the elements of "list" to the start of the array.

$length = $X->length ;
    Returns the number of elements in the array.

@removed = $X->splice(offset, length, elements) ;
    Removes "length" elements starting at "offset", inserts
    "elements" in their place, and returns the removed elements,
    in the same way as Perl's built-in splice.
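
The methods above can be exercised together in a short sketch. It
assumes DB_File is installed and that the current directory is
writable; the scratch file name "splice_demo.txt" and the sample
records are made up.

```perl
use strict ;
use warnings ;
use DB_File ;

my $file = "splice_demo.txt" ;  # hypothetical scratch file
unlink $file ;

my @h ;
my $X = tie @h, 'DB_File', $file, O_RDWR|O_CREAT, 0666, $DB_RECNO
    or die "Cannot open $file: $!\n" ;

$X->push("zero", "one", "two", "three", "four") ;

# Remove two records starting at index 1 and put two others there.
my @removed = $X->splice(1, 2, "uno", "dos") ;

# Copy the records out, using the length method for the bounds.
my @now = map { $h[$_] } 0 .. $X->length - 1 ;

print "removed: @removed\n" ;
print "now:     @now\n" ;

undef $X ;
untie @h ;
unlink $file ;
```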
846
847 Another Example
848 Here is a more complete example that makes use of some of the methods
849 described above. It also makes use of the API interface directly (see
850 "THE API INTERFACE").
851
852 use warnings ;
853 use strict ;
854 my (@h, $H, $file, $i) ;
855 use DB_File ;
856 use Fcntl ;
857
858 $file = "text" ;
859
860 unlink $file ;
861
862 $H = tie @h, "DB_File", $file, O_RDWR|O_CREAT, 0666, $DB_RECNO
863 or die "Cannot open file $file: $!\n" ;
864
865 # first create a text file to play with
866 $h[0] = "zero" ;
867 $h[1] = "one" ;
868 $h[2] = "two" ;
869 $h[3] = "three" ;
870 $h[4] = "four" ;
871
872
873 # Print the records in order.
874 #
875 # The length method is needed here because evaluating a tied
876 # array in a scalar context does not return the number of
877 # elements in the array.
878
879 print "\nORIGINAL\n" ;
880 foreach $i (0 .. $H->length - 1) {
881 print "$i: $h[$i]\n" ;
882 }
883
884 # use the push & pop methods
885 $a = $H->pop ;
886 $H->push("last") ;
887 print "\nThe last record was [$a]\n" ;
888
889 # and the shift & unshift methods
890 $a = $H->shift ;
891 $H->unshift("first") ;
892 print "The first record was [$a]\n" ;
893
894 # Use the API to add a new record after record 2.
895 $i = 2 ;
896 $H->put($i, "Newbie", R_IAFTER) ;
897
898 # and a new record before record 1.
899 $i = 1 ;
900 $H->put($i, "New One", R_IBEFORE) ;
901
902 # delete record 3
903 $H->del(3) ;
904
905 # now print the records in reverse order
906 print "\nREVERSE\n" ;
907 for ($i = $H->length - 1 ; $i >= 0 ; -- $i)
908 { print "$i: $h[$i]\n" }
909
910 # same again, but use the API functions instead
911 print "\nREVERSE again\n" ;
912 my ($s, $k, $v) = (0, 0, 0) ;
913 for ($s = $H->seq($k, $v, R_LAST) ;
914 $s == 0 ;
915 $s = $H->seq($k, $v, R_PREV))
916 { print "$k: $v\n" }
917
918 undef $H ;
919 untie @h ;
920
921 and this is what it outputs:
922
923 ORIGINAL
924 0: zero
925 1: one
926 2: two
927 3: three
928 4: four
929
930 The last record was [four]
931 The first record was [zero]
932
933 REVERSE
934 5: last
935 4: three
936 3: Newbie
937 2: one
938 1: New One
939 0: first
940
941 REVERSE again
942 5: last
943 4: three
944 3: Newbie
945 2: one
946 1: New One
947 0: first
948
949 Notes:
950
951 1. Rather than iterating through the array, @h like this:
952
953 foreach $i (@h)
954
955 it is necessary to use either this:
956
957 foreach $i (0 .. $H->length - 1)
958
959 or this:
960
       for ($a = $H->seq($k, $v, R_FIRST) ;
            $a == 0 ;
            $a = $H->seq($k, $v, R_NEXT) )
964
965 2. Notice that both times the "put" method was used the record index
966 was specified using a variable, $i, rather than the literal value
967 itself. This is because "put" will return the record number of the
968 inserted line via that parameter.
969
THE API INTERFACE
As well as accessing Berkeley DB using a tied hash or array, it is also
possible to make direct use of most of the API functions defined in the
Berkeley DB documentation.
974
975 To do this you need to store a copy of the object returned from the
976 tie.
977
978 $db = tie %hash, "DB_File", "filename" ;
979
980 Once you have done that, you can access the Berkeley DB API functions
981 as DB_File methods directly like this:
982
983 $db->put($key, $value, R_NOOVERWRITE) ;
984
985 Important: If you have saved a copy of the object returned from "tie",
986 the underlying database file will not be closed until both the tied
987 variable is untied and all copies of the saved object are destroyed.
988
989 use DB_File ;
990 $db = tie %hash, "DB_File", "filename"
991 or die "Cannot tie filename: $!" ;
992 ...
993 undef $db ;
994 untie %hash ;
995
996 See "The untie() Gotcha" for more details.
997
All the functions defined in dbopen are available except for close()
and dbopen() itself. The DB_File method interface to the supported
functions has been implemented to mirror the way Berkeley DB works
whenever possible. In particular note that:
1002
1003 · The methods return a status value. All return 0 on success. All
1004 return -1 to signify an error and set $! to the exact error code.
1005 The return code 1 generally (but not always) means that the key
1006 specified did not exist in the database.
1007
1008 Other return codes are defined. See below and in the Berkeley DB
1009 documentation for details. The Berkeley DB documentation should be
1010 used as the definitive source.
1011
1012 · Whenever a Berkeley DB function returns data via one of its
1013 parameters, the equivalent DB_File method does exactly the same.
1014
1015 · If you are careful, it is possible to mix API calls with the tied
1016 hash/array interface in the same piece of code. Although only a
1017 few of the methods used to implement the tied interface currently
1018 make use of the cursor, you should always assume that the cursor
1019 has been changed any time the tied hash/array interface is used.
1020 As an example, this code will probably not do what you expect:
1021
1022 $X = tie %x, 'DB_File', $filename, O_RDWR|O_CREAT, 0777, $DB_BTREE
1023 or die "Cannot tie $filename: $!" ;
1024
1025 # Get the first key/value pair and set the cursor
1026 $X->seq($key, $value, R_FIRST) ;
1027
1028 # this line will modify the cursor
1029 $count = scalar keys %x ;
1030
1031 # Get the second key/value pair.
1032 # oops, it didn't, it got the last key/value pair!
1033 $X->seq($key, $value, R_NEXT) ;
1034
1035 The code above can be rearranged to get around the problem, like
1036 this:
1037
1038 $X = tie %x, 'DB_File', $filename, O_RDWR|O_CREAT, 0777, $DB_BTREE
1039 or die "Cannot tie $filename: $!" ;
1040
1041 # this line will modify the cursor
1042 $count = scalar keys %x ;
1043
1044 # Get the first key/value pair and set the cursor
1045 $X->seq($key, $value, R_FIRST) ;
1046
1047 # Get the second key/value pair.
1048 # worked this time.
1049 $X->seq($key, $value, R_NEXT) ;
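The status conventions described in the first bullet above can be seen in a short session. This is only a sketch: the scratch directory, the file name status.db and the keys are made up for illustration.

```perl
use strict ;
use warnings ;
use DB_File ;
use File::Temp qw(tempdir) ;

# Hypothetical scratch location, chosen only for this sketch
my $dir      = tempdir(CLEANUP => 1) ;
my $filename = "$dir/status.db" ;

my %h ;
my $X = tie %h, 'DB_File', $filename, O_RDWR|O_CREAT, 0666, $DB_HASH
    or die "Cannot tie $filename: $!" ;

# 0 means success
my $first = $X->put("fruit", "apple") ;

# R_NOOVERWRITE refuses to replace an existing key and returns 1
my $second = $X->put("fruit", "pear", R_NOOVERWRITE) ;

# get hands its data back through the second parameter
my $value ;
my $found = $X->get("fruit", $value) ;

# 1 from get means the key is not in the database
my $unused ;
my $missing = $X->get("vegetable", $unused) ;

# 1 from del likewise means there was nothing to delete
my $deleted = $X->del("vegetable") ;

print "put=$first put(no overwrite)=$second get=$found ($value) ",
      "missing=$missing del=$deleted\n" ;

undef $X ;
untie %h ;
```

A status of -1 is not exercised here; in that case $! holds the error code.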
1050
1051 All the constants defined in dbopen for use in the flags parameters in
1052 the methods defined below are also available. Refer to the Berkeley DB
1053 documentation for the precise meaning of the flags values.
1054
1055 Below is a list of the methods available.
1056
1057 $status = $X->get($key, $value [, $flags]) ;
1058 Given a key ($key) this method reads the value associated with it
1059 from the database. The value read from the database is returned in
1060 the $value parameter.
1061
1062 If the key does not exist the method returns 1.
1063
1064 No flags are currently defined for this method.
1065
1066 $status = $X->put($key, $value [, $flags]) ;
1067 Stores the key/value pair in the database.
1068
1069 If you use either the R_IAFTER or R_IBEFORE flags, the $key
1070 parameter will be set to the record number of the inserted
1071 key/value pair.
1072
1073 Valid flags are R_CURSOR, R_IAFTER, R_IBEFORE, R_NOOVERWRITE and
1074 R_SETCURSOR.
1075
1076 $status = $X->del($key [, $flags]) ;
1077 Removes all key/value pairs with key $key from the database.
1078
1079 A return code of 1 means that the requested key was not in the
1080 database.
1081
1082 R_CURSOR is the only valid flag at present.
1083
1084 $status = $X->fd ;
1085 Returns the file descriptor for the underlying database.
1086
1087 See "Locking: The Trouble with fd" for an explanation for why you
1088 should not use "fd" to lock your database.
1089
1090 $status = $X->seq($key, $value, $flags) ;
1091 This interface allows sequential retrieval from the database. See
1092 dbopen for full details.
1093
1094 Both the $key and $value parameters will be set to the key/value
1095 pair read from the database.
1096
1097 The flags parameter is mandatory. The valid flag values are
1098 R_CURSOR, R_FIRST, R_LAST, R_NEXT and R_PREV.
1099
1100 $status = $X->sync([$flags]) ;
1101 Flushes any cached buffers to disk.
1102
1103 R_RECNOSYNC is the only valid flag at present.
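As a rough illustration of the seq and sync methods listed above, the following sketch walks a small BTREE database from first record to last. The file name seq.db and the sample keys are assumptions made for the example.

```perl
use strict ;
use warnings ;
use DB_File ;
use File::Temp qw(tempdir) ;

my $dir      = tempdir(CLEANUP => 1) ;   # hypothetical scratch directory
my $filename = "$dir/seq.db" ;

my %h ;
my $X = tie %h, 'DB_File', $filename, O_RDWR|O_CREAT, 0666, $DB_BTREE
    or die "Cannot tie $filename: $!" ;

# Store each word with its length as the value
for my $fruit (qw(pear apple mango)) {
    $X->put($fruit, length $fruit) == 0
        or die "put failed for $fruit: $!" ;
}

# Flush cached buffers to disk; sync returns a status like the others
$X->sync == 0 or die "sync failed: $!" ;

# R_FIRST positions the cursor, R_NEXT advances it.
# seq returns a non-zero status once there are no more records.
my @seen ;
my ($key, $value) = ("", "") ;
for (my $status = $X->seq($key, $value, R_FIRST) ;
     $status == 0 ;
     $status = $X->seq($key, $value, R_NEXT))
{
    push @seen, "$key=$value" ;
}
print "@seen\n" ;

undef $X ;
untie %h ;
```

Because the database is a BTREE with the default comparison, the keys come back in lexical order rather than insertion order.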
1104
1106 A DBM Filter is a piece of code that is used when you always want to
1107 make the same transformation to all keys and/or values in a DBM
1108 database.
1109
1110 There are four methods associated with DBM Filters. All work
1111 identically, and each is used to install (or uninstall) a single DBM
1112 Filter. Each expects a single parameter, namely a reference to a sub.
1113 The only difference between them is the place that the filter is
1114 installed.
1115
1116 To summarise:
1117
1118 filter_store_key
1119 If a filter has been installed with this method, it will be
1120 invoked every time you write a key to a DBM database.
1121
1122 filter_store_value
1123 If a filter has been installed with this method, it will be
1124 invoked every time you write a value to a DBM database.
1125
1126 filter_fetch_key
1127 If a filter has been installed with this method, it will be
1128 invoked every time you read a key from a DBM database.
1129
1130 filter_fetch_value
1131 If a filter has been installed with this method, it will be
1132 invoked every time you read a value from a DBM database.
1133
1134 You can use any combination of the methods, from none, to all four.
1135
1136 All filter methods return the existing filter, if present, or "undef"
1137 if not.
1138
1139 To delete a filter, pass "undef" to the method that installed it.
1140
1141 The Filter
1142 When each filter is called by Perl, a local copy of $_ will contain the
1143 key or value to be filtered. Filtering is achieved by modifying the
1144 contents of $_. The return code from the filter is ignored.
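The points above (the filter edits $_, its return code is ignored, the install methods return the previous filter, and "undef" removes a filter) can be tried out with a sketch like this one. The lower-casing filter and the file name filter.db are invented for the example.

```perl
use strict ;
use warnings ;
use DB_File ;
use File::Temp qw(tempdir) ;

my $dir      = tempdir(CLEANUP => 1) ;   # hypothetical scratch directory
my $filename = "$dir/filter.db" ;

my %h ;
my $db = tie %h, 'DB_File', $filename, O_RDWR|O_CREAT, 0666, $DB_HASH
    or die "Cannot tie $filename: $!" ;

# First installation: no filter existed before, so undef comes back.
# The filter changes $_ in place; the value it returns is ignored.
my $previous = $db->filter_store_key( sub { $_ = lc $_ ; return "ignored" } ) ;

$h{"MiXeD"} = "one" ;

# Passing undef removes the filter and returns the one just removed
my $removed = $db->filter_store_key(undef) ;

# With no filter installed, look at what was really stored
my ($value, $unused) ;
my $lower_status = $db->get("mixed", $value) ;    # 0: key is there
my $mixed_status = $db->get("MiXeD", $unused) ;   # 1: key is not there

print "lower=$lower_status mixed=$mixed_status value=$value\n" ;

undef $db ;
untie %h ;
```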
1145
1146 An Example -- the NULL termination problem.
1147 Consider the following scenario. You have a DBM database that you need
1148 to share with a third-party C application. The C application assumes
1149 that all keys and values are NULL terminated. Unfortunately when Perl
1150 writes to DBM databases it doesn't use NULL termination, so your Perl
1151 application will have to manage NULL termination itself. When you write
1152 to the database you will have to use something like this:
1153
1154 $hash{"$key\0"} = "$value\0" ;
1155
1156 Similarly the NULL needs to be taken into account when you are
1157 considering the length of existing keys/values.
1158
1159 It would be much better if you could ignore the NULL termination issue
1160 in the main application code and have a mechanism that automatically
1161 added the terminating NULL to all keys and values whenever you write to
1162 the database and have them removed when you read from the database. As
1163 I'm sure you have already guessed, this is a problem that DBM Filters
1164 can fix very easily.
1165
1166 use warnings ;
1167 use strict ;
1168 use DB_File ;
1169
1170 my %hash ;
1171 my $filename = "filt" ;
1172 unlink $filename ;
1173
1174 my $db = tie %hash, 'DB_File', $filename, O_CREAT|O_RDWR, 0666, $DB_HASH
1175 or die "Cannot open $filename: $!\n" ;
1176
1177 # Install DBM Filters
1178 $db->filter_fetch_key ( sub { s/\0$// } ) ;
1179 $db->filter_store_key ( sub { $_ .= "\0" } ) ;
1180 $db->filter_fetch_value( sub { s/\0$// } ) ;
1181 $db->filter_store_value( sub { $_ .= "\0" } ) ;
1182
1183 $hash{"abc"} = "def" ;
1184 my $a = $hash{"abc"} ;
1185 # ...
1186 undef $db ;
1187 untie %hash ;
1188
1189 The contents of each of the filters should be self-explanatory. Both
1190 "fetch" filters remove the terminating NULL, and both "store" filters
1191 add a terminating NULL.
1192
1193 Another Example -- Key is a C int.
1194 Here is another real-life example. By default, whenever Perl writes to
1195 a DBM database it always writes the key and value as strings. So when
1196 you use this:
1197
1198 $hash{12345} = "something" ;
1199
1200 the key 12345 will get stored in the DBM database as the 5 byte string
1201 "12345". If you actually want the key to be stored in the DBM database
1202 as a C int, you will have to use "pack" when writing, and "unpack" when
1203 reading.
1204
1205 Here is a DBM Filter that does it:
1206
1207 use warnings ;
1208 use strict ;
1209 use DB_File ;
1210 my %hash ;
1211 my $filename = "filt" ;
1212 unlink $filename ;
1213
1214
1215 my $db = tie %hash, 'DB_File', $filename, O_CREAT|O_RDWR, 0666, $DB_HASH
1216 or die "Cannot open $filename: $!\n" ;
1217
1218 $db->filter_fetch_key ( sub { $_ = unpack("i", $_) } ) ;
1219 $db->filter_store_key ( sub { $_ = pack ("i", $_) } ) ;
1220 $hash{123} = "def" ;
1221 # ...
1222 undef $db ;
1223 untie %hash ;
1224
1225 This time only two filters have been used -- we only need to manipulate
1226 the contents of the key, so it wasn't necessary to install any value
1227 filters.
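One way to check that the key really is stored in packed form is to remove the fetch filter and look at the raw bytes. The sketch below does that; the scratch file name is an assumption, and the size of a C int is taken from "pack" on the running system rather than hard-coded.

```perl
use strict ;
use warnings ;
use DB_File ;
use File::Temp qw(tempdir) ;

my $dir      = tempdir(CLEANUP => 1) ;   # hypothetical scratch directory
my $filename = "$dir/filt" ;

my %hash ;
my $db = tie %hash, 'DB_File', $filename, O_CREAT|O_RDWR, 0666, $DB_HASH
    or die "Cannot open $filename: $!\n" ;

$db->filter_fetch_key ( sub { $_ = unpack("i", $_) } ) ;
$db->filter_store_key ( sub { $_ = pack ("i", $_) } ) ;
$hash{123} = "def" ;

# With both filters installed the application sees an ordinary number
my ($seen_key) = keys %hash ;

# Remove the fetch filter to see what is actually in the database
$db->filter_fetch_key(undef) ;
my ($raw_key) = keys %hash ;

printf "filtered key: %s, raw key: %d bytes\n", $seen_key, length $raw_key ;

undef $db ;
untie %hash ;
```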
1228
1230 Locking: The Trouble with fd
1231 Until version 1.72 of this module, the recommended technique for
1232 locking DB_File databases was to flock a filehandle opened on the file
1233 descriptor returned by the "fd" method. Unfortunately this technique has
1234 been shown to be fundamentally flawed (kudos to David Harris for tracking
1235 this down). Use it at your own peril!
1236
1237 The locking technique went like this.
1238
1239 $db = tie(%db, 'DB_File', 'foo.db', O_CREAT|O_RDWR, 0644)
1240 || die "dbcreat foo.db $!";
1241 $fd = $db->fd;
1242 open(DB_FH, "+<&=$fd") || die "dup $!";
1243 flock (DB_FH, LOCK_EX) || die "flock: $!";
1244 ...
1245 $db{"Tom"} = "Jerry" ;
1246 ...
1247 flock(DB_FH, LOCK_UN);
1248 undef $db;
1249 untie %db;
1250 close(DB_FH);
1251
1252 In simple terms, this is what happens:
1253
1254 1. Use "tie" to open the database.
1255
1256 2. Lock the database with fd & flock.
1257
1258 3. Read & Write to the database.
1259
1260 4. Unlock and close the database.
1261
1262 Here is the crux of the problem. A side-effect of opening the DB_File
1263 database in step 1 is that an initial block from the database will get
1264 read from disk and cached in memory.
1265
1266 To see why this is a problem, consider what can happen when two
1267 processes, say "A" and "B", both want to update the same DB_File
1268 database using the locking steps outlined above. Assume process "A" has
1269 already opened the database and has a write lock, but it hasn't
1270 actually updated the database yet (it has finished step 2, but not
1271 started step 3 yet). Now process "B" tries to open the same database -
1272 step 1 will succeed, but it will block on step 2 until process "A"
1273 releases the lock. The important thing to notice here is that at this
1274 point in time both processes will have cached identical initial blocks
1275 from the database.
1276
1277 Now process "A" updates the database and happens to change some of the
1278 data held in the initial buffer. Process "A" terminates, flushing all
1279 cached data to disk and releasing the database lock. At this point the
1280 database on disk will correctly reflect the changes made by process
1281 "A".
1282
1283 With the lock released, process "B" can now continue. It also updates
1284 the database and unfortunately it too modifies the data that was in its
1285 initial buffer. Once that data gets flushed to disk it will overwrite
1286 some/all of the changes process "A" made to the database.
1287
1288 The result of this scenario is at best a database that doesn't contain
1289 what you expect. At worst the database will corrupt.
1290
1291 The above won't happen every time competing processes update the same
1292 DB_File database, but it does illustrate why the technique should not
1293 be used.
1294
1295 Safe ways to lock a database
1296 Starting with version 2.x, Berkeley DB has internal support for
1297 locking. The companion module to this one, BerkeleyDB, provides an
1298 interface to this locking functionality. If you are serious about
1299 locking Berkeley DB databases, I strongly recommend using BerkeleyDB.
1300
1301 If using BerkeleyDB isn't an option, there are a number of modules
1302 available on CPAN that can be used to implement locking. Each one
1303 implements locking differently and has different goals in mind. It is
1304 therefore worth knowing the difference, so that you can pick the right
1305 one for your application. Here are the three locking wrappers:
1306
1307 Tie::DB_Lock
1308 A DB_File wrapper which creates copies of the database file for
1309 read access, so that you have a kind of a multiversioning
1310 concurrent read system. However, updates are still serial. Use for
1311 databases where reads may be lengthy and consistency problems may
1312 occur.
1313
1314 Tie::DB_LockFile
1315 A DB_File wrapper that has the ability to lock and unlock the
1316 database while it is being used. Avoids the tie-before-flock
1317 problem by simply re-tie-ing the database when you get or drop a
1318 lock. Because of the flexibility in dropping and re-acquiring the
1319 lock in the middle of a session, this can be massaged into a
1320 system that will work with long updates and/or reads if the
1321 application follows the hints in the POD documentation.
1322
1323 DB_File::Lock
1324 An extremely lightweight DB_File wrapper that simply flocks a
1325 lockfile before tie-ing the database and drops the lock after the
1326 untie. Allows one to use the same lockfile for multiple databases
1327 to avoid deadlock problems, if desired. Use for databases where
1328 updates and reads are quick and simple flock locking semantics are
1329 enough.
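The general idea behind the lockfile approach (lock a separate file first, tie afterwards, untie before unlocking) can be sketched with core modules only. This is not the code of any of the wrappers listed above; the helper name with_locked_db and the ".lock" naming convention are made up for the illustration.

```perl
use strict ;
use warnings ;
use DB_File ;
use Fcntl qw(:flock) ;
use File::Temp qw(tempdir) ;

my $dir      = tempdir(CLEANUP => 1) ;   # hypothetical scratch directory
my $filename = "$dir/data.db" ;
my $lockfile = "$filename.lock" ;        # hypothetical naming convention

# with_locked_db is an invented helper, not part of any module
sub with_locked_db {
    my ($code) = @_ ;

    # 1. Take the lock on a separate file BEFORE opening the database,
    #    so no block is cached until the lock is held.
    open(my $lock_fh, '>>', $lockfile) or die "Cannot open $lockfile: $!" ;
    flock($lock_fh, LOCK_EX)           or die "Cannot lock $lockfile: $!" ;

    # 2. Only now tie the database.
    my %db ;
    tie %db, 'DB_File', $filename, O_RDWR|O_CREAT, 0644, $DB_HASH
        or die "Cannot tie $filename: $!" ;

    # 3. Read and write.
    my $result = $code->(\%db) ;

    # 4. Untie first, so everything is flushed, then release the lock.
    untie %db ;
    close($lock_fh) ;   # closing the handle drops the lock

    return $result ;
}

with_locked_db(sub { $_[0]{Tom} = "Jerry" }) ;
my $value = with_locked_db(sub { $_[0]{Tom} }) ;
print "Tom => $value\n" ;
```

Every process that touches the database has to follow the same protocol for this to help, since flock is advisory.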
1330
1331 Sharing Databases With C Applications
1332 There is no technical reason why a Berkeley DB database cannot be
1333 shared by both a Perl and a C application.
1334
1335 The vast majority of problems that are reported in this area boil down
1336 to the fact that C strings are NULL terminated, whilst Perl strings are
1337 not. See "DBM FILTERS" for a generic way to work around this problem.
1338
1339 Here is a real example. Netscape 2.0 keeps a record of the locations
1340 you visit along with the time you last visited them in a DB_HASH
1341 database. This is usually stored in the file ~/.netscape/history.db.
1342 The key field in the database is the location string and the value
1343 field is the time the location was last visited stored as a 4 byte
1344 binary value.
1345
1346 If you haven't already guessed, the location string is stored with a
1347 terminating NULL. This means you need to be careful when accessing the
1348 database.
1349
1350 Here is a snippet of code that is loosely based on Tom Christiansen's
1351 ggh script (available from your nearest CPAN archive in
1352 authors/id/TOMC/scripts/nshist.gz).
1353
1354 use warnings ;
1355 use strict ;
1356 use DB_File ;
1357 use Fcntl ;
1358
1359 my ($dotdir, $HISTORY, %hist_db, $href, $binary_time, $date) ;
1360 $dotdir = $ENV{HOME} || $ENV{LOGNAME};
1361
1362 $HISTORY = "$dotdir/.netscape/history.db";
1363
1364 tie %hist_db, 'DB_File', $HISTORY
1365 or die "Cannot open $HISTORY: $!\n" ;
1366
1367 # Dump the complete database
1368 while ( ($href, $binary_time) = each %hist_db ) {
1369
1370 # remove the terminating NULL
1371 $href =~ s/\x00$// ;
1372
1373 # convert the binary time into a user friendly string
1374 $date = localtime unpack("V", $binary_time);
1375 print "$date $href\n" ;
1376 }
1377
1378 # check for the existence of a specific key
1379 # remember to add the NULL
1380 if ( $binary_time = $hist_db{"http://mox.perl.com/\x00"} ) {
1381 $date = localtime unpack("V", $binary_time) ;
1382 print "Last visited mox.perl.com on $date\n" ;
1383 }
1384 else {
1385 print "Never visited mox.perl.com\n"
1386 }
1387
1388 untie %hist_db ;
1389
1390 The untie() Gotcha
1391 If you make use of the Berkeley DB API, it is very strongly recommended
1392 that you read "The untie Gotcha" in perltie.
1393
1394 Even if you don't currently make use of the API interface, it is still
1395 worth reading it.
1396
1397 Here is an example which illustrates the problem from a DB_File
1398 perspective:
1399
1400 use DB_File ;
1401 use Fcntl ;
1402
1403 my %x ;
1404 my $X ;
1405
1406 $X = tie %x, 'DB_File', 'tst.fil' , O_RDWR|O_TRUNC
1407 or die "Cannot tie first time: $!" ;
1408
1409 $x{123} = 456 ;
1410
1411 untie %x ;
1412
1413 tie %x, 'DB_File', 'tst.fil' , O_RDWR|O_CREAT
1414 or die "Cannot tie second time: $!" ;
1415
1416 untie %x ;
1417
1418 When run, the script will produce this error message:
1419
1420 Cannot tie second time: Invalid argument at bad.file line 14.
1421
1422 Although the error message above refers to the second tie() statement
1423 in the script, the source of the problem is really with the untie()
1424 statement that precedes it.
1425
1426 Having read perltie you will probably have already guessed that the
1427 error is caused by the extra copy of the tied object stored in $X. If
1428 you haven't, then the problem boils down to the fact that the DB_File
1429 destructor, DESTROY, will not be called until all references to the
1430 tied object are destroyed. Both the tied variable, %x, and $X above
1431 hold a reference to the object. The call to untie() will destroy the
1432 first, but $X still holds a valid reference, so the destructor will not
1433 get called and the database file tst.fil will remain open. The fact
1434 that Berkeley DB then reports the attempt to open a database that is
1435 already open via the catch-all "Invalid argument" doesn't help.
1436
1437 If you run the script with the "-w" flag the error message becomes:
1438
1439 untie attempted while 1 inner references still exist at bad.file line 12.
1440 Cannot tie second time: Invalid argument at bad.file line 14.
1441
1442 which pinpoints the real problem. Finally the script can now be
1443 modified to fix the original problem by destroying the API object
1444 before the untie:
1445
1446 ...
1447 $x{123} = 456 ;
1448
1449 undef $X ;
1450 untie %x ;
1451
1452 $X = tie %x, 'DB_File', 'tst.fil' , O_RDWR|O_CREAT
1453 ...
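An alternative to the explicit "undef $X" is to keep the API object in a small lexical scope, so that the extra reference has already gone by the time untie is called. The following is a sketch under that assumption; the scratch directory is invented, and O_CREAT is used on the first tie so the script runs on a clean system.

```perl
use strict ;
use warnings ;
use DB_File ;
use File::Temp qw(tempdir) ;

my $dir  = tempdir(CLEANUP => 1) ;   # hypothetical scratch directory
my $file = "$dir/tst.fil" ;

my %x ;
{
    # Keep the API object in a small scope...
    my $X = tie %x, 'DB_File', $file, O_RDWR|O_CREAT, 0640, $DB_HASH
        or die "Cannot tie first time: $!" ;
    $X->put(123, 456) ;
}   # ...so the extra reference held by $X disappears here

# No inner references remain, so this really closes the file
untie %x ;

tie %x, 'DB_File', $file, O_RDWR, 0640, $DB_HASH
    or die "Cannot tie second time: $!" ;
my $value = $x{123} ;
untie %x ;

print "123 => $value\n" ;
```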
1454
1456 Why is there Perl source in my database?
1457 If you look at the contents of a database file created by DB_File,
1458 there can sometimes be part of a Perl script included in it.
1459
1460 This happens because Berkeley DB uses dynamic memory to allocate
1461 buffers which will subsequently be written to the database file. Being
1462 dynamic, the memory could have been used for anything before DB
1463 malloced it. As Berkeley DB doesn't clear the memory once it has been
1464 allocated, the unused portions will contain random junk. In the case
1465 where a Perl script gets written to the database, the random junk will
1466 correspond to an area of dynamic memory that happened to be used during
1467 the compilation of the script.
1468
1469 Unless you object to the possibility of part of your Perl scripts
1470 being embedded in a database file, this is nothing to worry about.
1471
1472 How do I store complex data structures with DB_File?
1473 Although DB_File cannot do this directly, there is a module which can
1474 layer transparently over DB_File to accomplish this feat.
1475
1476 Check out the MLDBM module, available on CPAN in the directory
1477 modules/by-module/MLDBM.
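If installing MLDBM is not an option, a manual alternative is to serialise each structure to a string yourself before storing it. The sketch below uses the core Storable module for that; it is not how MLDBM is implemented, and the record layout and file name are invented for the example.

```perl
use strict ;
use warnings ;
use DB_File ;
use Storable qw(freeze thaw) ;
use File::Temp qw(tempdir) ;

my $dir      = tempdir(CLEANUP => 1) ;   # hypothetical scratch directory
my $filename = "$dir/complex.db" ;

my %h ;
tie %h, 'DB_File', $filename, O_RDWR|O_CREAT, 0640, $DB_HASH
    or die "Cannot tie $filename: $!" ;

# Serialise the structure to a string before storing it
my %record = ( name => "Tom", friends => [ "Jerry", "Spike" ] ) ;
$h{tom} = freeze(\%record) ;

# Rebuild a copy of the structure when reading it back
my $copy = thaw($h{tom}) ;
print "$copy->{name} knows @{ $copy->{friends} }\n" ;

untie %h ;
```

Note that $copy is detached from the database: changes made to it are not written back unless the structure is frozen and stored again.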
1478
1479 What does "Invalid Argument" mean?
1480 You will get this error message when one of the parameters in the "tie"
1481 call is wrong. Unfortunately there are quite a few parameters to get
1482 wrong, so it can be difficult to figure out which one it is.
1483
1484 Here are a couple of possibilities:
1485
1486 1. Attempting to reopen a database without closing it.
1487
1488 2. Using the O_WRONLY flag.
1489
1490 What does "Bareword 'DB_File' not allowed" mean?
1491 You will encounter this particular error message when you have the
1492 "strict 'subs'" pragma (or the full strict pragma) in your script.
1493 Consider this script:
1494
1495 use warnings ;
1496 use strict ;
1497 use DB_File ;
1498 my %x ;
1499 tie %x, DB_File, "filename" ;
1500
1501 Running it produces the error in question:
1502
1503 Bareword "DB_File" not allowed while "strict subs" in use
1504
1505 To get around the error, place the word "DB_File" in either single or
1506 double quotes, like this:
1507
1508 tie %x, "DB_File", "filename" ;
1509
1510 Although it might seem like a real pain, it is really worth the effort
1511 of having a "use strict" in all your scripts.
1512
1514 Articles that are either about DB_File or make use of it.
1515
1516 1. Full-Text Searching in Perl, Tim Kientzle (tkientzle@ddj.com), Dr.
1517 Dobb's Journal, Issue 295, January 1999, pp 34-41
1518
1520 Moved to the Changes file.
1521
1523 Some older versions of Berkeley DB had problems with fixed length
1524 records using the RECNO file format. This problem has been fixed since
1525 version 1.85 of Berkeley DB.
1526
1527 I am sure there are bugs in the code. If you do find any, or can
1528 suggest any enhancements, I would welcome your comments.
1529
1531 DB_File comes with the standard Perl source distribution. Look in the
1532 directory ext/DB_File. Given the amount of time between releases of
1533 Perl, the version that ships with Perl is quite likely to be out of
1534 date; the most recent version can always be found on CPAN (see
1535 "CPAN" in perlmodlib for details), in the directory
1536 modules/by-module/DB_File.
1537
1538 This version of DB_File will work with version 1.x, 2.x or 3.x
1539 of Berkeley DB, but is limited to the functionality provided by version
1540 1.
1541
1542 The official web site for Berkeley DB is
1543 http://www.oracle.com/technology/products/berkeley-db/db/index.html.
1544 All versions of Berkeley DB are available there.
1545
1546 Alternatively, Berkeley DB version 1 is available at your nearest CPAN
1547 archive in src/misc/db.1.85.tar.gz.
1548
1549 If you are running IRIX, then get Berkeley DB version 1 from
1550 http://reality.sgi.com/ariel. It has the patches necessary to compile
1551 properly on IRIX 5.3.
1552
1554 Copyright (c) 1995-2007 Paul Marquess. All rights reserved. This
1555 program is free software; you can redistribute it and/or modify it
1556 under the same terms as Perl itself.
1557
1558 Although DB_File is covered by the Perl license, the library it makes
1559 use of, namely Berkeley DB, is not. Berkeley DB has its own copyright
1560 and its own license. Please take the time to read it.
1561
1562 Here are a few words taken from the Berkeley DB FAQ (at
1563 http://www.oracle.com/technology/products/berkeley-db/db/index.html)
1564 regarding the license:
1565
1566 Do I have to license DB to use it in Perl scripts?
1567
1568 No. The Berkeley DB license requires that software that uses
1569 Berkeley DB be freely redistributable. In the case of Perl, that
1570 software is Perl, and not your scripts. Any Perl scripts that you
1571 write are your property, including scripts that make use of
1572 Berkeley DB. Neither the Perl license nor the Berkeley DB license
1573 place any restriction on what you may do with them.
1574
1575 If you are in any doubt about the license situation, contact either the
1576 Berkeley DB authors or the author of DB_File. See "AUTHOR" for details.
1577
1579 perl, dbopen(3), hash(3), recno(3), btree(3), perldbmfilter
1580
1582 The DB_File interface was written by Paul Marquess <pmqs@cpan.org>.
1583
1584
1585
1586perl v5.12.4 2011-11-04 DB_File(3pm)