Fsdb(3pm)

Fsdb(3)               User Contributed Perl Documentation              Fsdb(3)

NAME

6       Fsdb - a flat-text database for shell scripting
7

SYNOPSIS

       Fsdb, the flatfile streaming database, is a package of commands for
       manipulating flat-ASCII databases from shell scripts.  Fsdb is useful
       to process medium amounts of data (with very little data you'd do it by
       hand, with megabytes you might want a real database).  Fsdb was known
       as Jdb from 1991 to Oct. 2008.
14
15       Fsdb is very good at doing things like:
16
17       •   extracting measurements from experimental output
18
19       •   examining data to address different hypotheses
20
21       •   joining data from different experiments
22
23       •   eliminating/detecting outliers
24
25       •   computing statistics on data (mean, confidence intervals,
26           correlations, histograms)
27
28       •   reformatting data for graphing programs
29
30       Fsdb is built around the idea of a flat text file as a database.  Fsdb
31       files (by convention, with the extension .fsdb), have a header
32       documenting the schema (what the columns mean), and then each line
33       represents a database record (or row).
34
35       For example:
36
37               #fsdb experiment duration
38               ufs_mab_sys 37.2
39               ufs_mab_sys 37.3
40               ufs_rcp_real 264.5
41               ufs_rcp_real 277.9
42
       This is a simple file with four experiments (the rows), each with a
       description and a run time in the first and second columns.
46
       Rather than hand-code scripts to do each special case, Fsdb provides
       higher-level functions.  Although it's often easy to throw together a
       custom script to do any single task, I believe that there are several
       advantages to using Fsdb:
51
52       •   these programs provide a higher level interface than plain Perl, so
53
54           **  Fewer lines of simpler code:
55
56                   dbrow '_experiment eq "ufs_mab_sys"' | dbcolstats duration
57
58               Picks out just one type of experiment and computes statistics
59               on it, rather than:
60
                   while (<>) { @F = split; $sum+=$F[1]; $ss+=$F[1]**2; $n++; }
                   $mean = $sum / $n; $std_dev = ...
63
64               in dozens of places.
65
66       •   the library uses names for columns, so
67
68           **  No more $F[1], use "_duration".
69
70           **  New or different order columns?  No changes to your scripts!
71
72           Thus if your experiment gets more complicated with a size
73           parameter, so your log changes to:
74
75                   #fsdb experiment size duration
76                   ufs_mab_sys 1024 37.2
77                   ufs_mab_sys 1024 37.3
78                   ufs_rcp_real 1024 264.5
79                   ufs_rcp_real 1024 277.9
80                   ufs_mab_sys 2048 45.3
81                   ufs_mab_sys 2048 44.2
82
83           Then the previous scripts still work, even though duration is now
84           the third column, not the second.
85
       •   A series of actions is self-documenting (the provenance of the
           processing done to produce each output is recorded in comments).
88
89           **  No more wondering what hacks were used to compute the final
90               data, just look at the comments at the end of the output.
91
92           For example, the commands
93
94               dbrow '_experiment eq "ufs_mab_sys"' | dbcolstats duration
95
           add to the end of the output the lines

               #    | dbrow _experiment eq "ufs_mab_sys"
               #    | dbcolstats duration
99
       •   The library is mature: it supports large datasets (more than 100GB),
           handles corner cases and errors, and is backed by an automated test
           suite.
102
103           **  No more puzzling about bad output because your custom script
104               skimped on error checking.
105
106           **  No more memory thrashing when you try to sort ten million
107               records.
108
109       •   Fsdb-2.x supports Perl scripting (in addition to shell scripting),
110           with libraries to do Fsdb input and output, and easy support for
111           pipelines.  The shell script
112
113               dbcol name test1 | dbroweval '_test1 += 5;'
114
115           can be written in perl as:
116
117               dbpipeline(dbcol(qw(name test1)), dbroweval('_test1 += 5;'));
118
119       (The disadvantage is that you need to learn what functions Fsdb
120       provides.)
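
       To make the named-column point concrete without Fsdb installed, here is
       a minimal sketch in plain shell and awk.  The helper mean_by_name is
       hypothetical and is not an Fsdb command; it only illustrates looking
       columns up through the "#fsdb" header, so the same call works whether
       duration is the second column or the third.

```shell
# mean_by_name COLUMN EXPERIMENT: hypothetical helper, not an Fsdb command.
# Reads a whitespace-separated table with a "#fsdb" header on stdin and
# prints the mean of COLUMN over rows whose "experiment" field matches.
mean_by_name() {
    awk -v col="$1" -v want="$2" '
        NR == 1 {
            # Header fields after "#fsdb" are column names; name i is field i-1.
            for (i = 2; i <= NF; i++) idx[$i] = i - 1
            next
        }
        /^#/ { next }                      # skip comment lines
        $(idx["experiment"]) == want {
            sum += $(idx[col]); n++
        }
        END { if (n > 0) printf "%.2f\n", sum / n }
    '
}

# Two layouts of the same measurements: duration moves from column 2 to 3.
two_col='#fsdb experiment duration
ufs_mab_sys 37.2
ufs_mab_sys 37.3
ufs_rcp_real 264.5'

three_col='#fsdb experiment size duration
ufs_mab_sys 1024 37.2
ufs_mab_sys 1024 37.3
ufs_rcp_real 1024 264.5'

printf '%s\n' "$two_col"   | mean_by_name duration ufs_mab_sys
printf '%s\n' "$three_col" | mean_by_name duration ufs_mab_sys
```

       Both calls print 37.25.  With Fsdb itself, the dbrow and dbcolstats
       pipeline shown earlier does this job and adds the confidence
       intervals, error checking, and audit trail that the sketch leaves out.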
121
122       Fsdb is built on flat-ASCII databases.  By storing data in simple text
123       files and processing it with pipelines it is easy to experiment (in the
124       shell) and look at the output.  To the best of my knowledge, the
125       original implementation of this idea was "/rdb", a commercial product
126       described in the book UNIX relational database management: application
127       development in the UNIX environment by Rod Manis, Evan Schaffer, and
128       Robert Jorgensen (1988 by Prentice Hall, and also at the web page
129       <http://www.rdb.com/>).  Fsdb is an incompatible re-implementation of
130       their idea without any accelerated indexing or forms support.  (But
131       it's free, and probably has better statistics!).
132
133       Fsdb-2.x will exploit multiple processors or cores, and provides Perl-
134       level support for input, output, and threaded-pipelines.  (As of
135       Fsdb-2.44 it no longer uses Perl threading, just processes, since they
136       are faster.)
137
138       Installation instructions follow at the end of this document.  Fsdb-2.x
139       requires Perl 5.8 to run.  All commands have manual pages and provide
140       usage with the "--help" option.  All commands are backed by an
141       automated test suite.
142
143       The most recent version of Fsdb is available on the web at
144       <http://www.isi.edu/~johnh/SOFTWARE/FSDB/index.html>.
145

WHAT'S NEW

   2.74, 2021-06-23 More IPv6.
       ENHANCEMENT
           The Fsdb::Support::IPv6 package includes ipv6_fullhex to rewrite
           printed IPv6 addresses as full, 128-bit hex values.
151

README CONTENTS

153       executive summary
154       what's new
155       README CONTENTS
156       installation
157       basic data format
158       basic data manipulation
159       list of commands
160       another example
161       a gradebook example
162       a password example
163       history
164       related work
165       release notes
166       copyright
167       comments
168

INSTALLATION

       Fsdb now uses the standard Perl build and installation from
       ExtUtils::MakeMaker(3), so the quick answer to installation is to type:
172
173           perl Makefile.PL
174           make
175           make test
176           make install
177
178       Or, if you want to install it somewhere else, change the first line to
179
180           perl Makefile.PL PREFIX=$HOME
181
       and it will go in your home directory's bin, etc.  (See
       ExtUtils::MakeMaker(3) for more details.)
184
185       Fsdb requires perl 5.8 or later.
186
       A test suite is available; run it with
188
189           make test
190
       In the past, ports existed for FreeBSD and MacOS.  If someone running
       one of those OSes wants to contribute a new port, please let me know.
194

BASIC DATA FORMAT

       These programs are based on the idea of storing data in simple ASCII
       files.  A database is a file with one header line and then data or
       comment lines.  For example:
199
200               #fsdb account passwd uid gid fullname homedir shell
201               johnh * 2274 134 John_Heidemann /home/johnh /bin/bash
202               greg * 2275 134 Greg_Johnson /home/greg /bin/bash
203               root * 0 0 Root /root /bin/bash
204               # this is a simple database
205
       The header line must be first and begins with "#fsdb".  There are rows
       (records) and columns (fields), just like in a normal database.
       Comment lines begin with "#".  Column names are any string not
       containing spaces or single quotes (although it is prudent to keep them
       alphanumeric with underscore).
211
212       By default, columns are delimited by whitespace.  With this default
213       configuration, the contents of a field cannot contain whitespace.
214       However, this limitation can be relaxed by changing the field separator
215       as described below.
216
217       The big advantage of simple flat-text databases is that it is usually
218       easy to massage data into this format, and it's reasonably easy to take
219       data out of this format into other (text-based) programs, like gnuplot,
220       jgraph, and LaTeX.  Think Unix.  Think pipes.  (Or even output to Excel
221       and HTML if you prefer.)
222
       Since the no-whitespace rule was a problem for some applications,
       there's an option which relaxes it.  You can specify the field
       separator in the table header with "-F x" where "x" is a code for the
       new field separator.  A full list of codes is at dbfilealter(1), but
       two common special values are "-F t", which is a separator of a single
       tab character, and "-F S", a separator of two spaces.  Both allow
       (single) spaces in fields.  An example:
230
231               #fsdb -F S account passwd uid gid fullname homedir shell
232               johnh  *  2274  134  John Heidemann  /home/johnh  /bin/bash
233               greg  *  2275  134  Greg Johnson  /home/greg  /bin/bash
234               root  *  0  0  Root  /root  /bin/bash
235               # this is a simple database
236
237       See dbfilealter(1) for more details.  Regardless of what the column
238       separator is for the body of the data, it's always whitespace in the
239       header.
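
       As a rough illustration of how the "-F S" layout reads, here is a
       sketch in plain shell and awk (not Fsdb code).  The helper
       field_by_name is hypothetical, and the sketch assumes that header
       options come in "-X value" pairs before the column names and that data
       fields are separated by exactly two spaces.

```shell
# field_by_name COLUMN: hypothetical helper, not an Fsdb command.
# Assumes the header is "#fsdb -F S name1 name2 ..." and that data rows
# separate fields with exactly two spaces, as in the example above.
field_by_name() {
    awk -F '  ' -v col="$1" '
        NR == 1 {
            # The header is always whitespace-separated, whatever -F says.
            n = split($0, h, " ")
            first = 2
            while (first <= n && h[first] ~ /^-/) first += 2   # skip "-F S" style options
            for (i = first; i <= n; i++) idx[h[i]] = i - first + 1
            next
        }
        /^#/ { next }                      # skip comment lines
        { print $(idx[col]) }
    '
}

printf '%s\n' \
    '#fsdb -F S account passwd uid gid fullname homedir shell' \
    'johnh  *  2274  134  John Heidemann  /home/johnh  /bin/bash' \
    'greg  *  2275  134  Greg Johnson  /home/greg  /bin/bash' \
    '# this is a simple database' |
    field_by_name fullname
```

       This prints "John Heidemann" and "Greg Johnson", each on its own line:
       the single space inside a name survives because only a run of two
       spaces ends a field.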
240
       There's also a third format: a "list".  Because it's often hard to see
       which column is which past the first two, in list format each "column"
       is on a separate line.  The programs dblistize and dbcolize convert to
       and from this format, and all programs work with either format.  The
       command
245
246           dbfilealter -R C  < DATA/passwd.fsdb
247
248       outputs:
249
250               #fsdb -R C account passwd uid gid fullname homedir shell
251               account:  johnh
252               passwd:   *
253               uid:      2274
254               gid:      134
255               fullname: John_Heidemann
256               homedir:  /home/johnh
257               shell:    /bin/bash
258
259               account:  greg
260               passwd:   *
261               uid:      2275
262               gid:      134
263               fullname: Greg_Johnson
264               homedir:  /home/greg
265               shell:    /bin/bash
266
267               account:  root
268               passwd:   *
269               uid:      0
270               gid:      0
271               fullname: Root
272               homedir:  /root
273               shell:    /bin/bash
274
275               # this is a simple database
276               #  | dblistize
277
278       See dbfilealter(1) for more details.
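
       The idea behind list format can be sketched in a few lines of shell
       and awk.  The helper to_list is hypothetical and is not how Fsdb does
       it: it does not align the values, rewrite the header with "-R C", or
       carry comments through, all of which the real tools handle.

```shell
# to_list: hypothetical helper, not an Fsdb command. It prints each record
# as "name: value" lines followed by a blank line. It does not align values,
# emit a "-R C" header, or pass comments through.
to_list() {
    awk '
        NR == 1 {
            # Remember the column names from the "#fsdb" header.
            for (i = 2; i <= NF; i++) name[i - 1] = $i
            ncol = NF - 1
            next
        }
        /^#/ { next }                      # skip comment lines
        {
            for (i = 1; i <= ncol; i++) print name[i] ": " $i
            print ""
        }
    '
}

printf '%s\n' \
    '#fsdb account uid shell' \
    'johnh 2274 /bin/bash' \
    'root 0 /bin/bash' |
    to_list
```

       Each input row becomes three "name: value" lines, with a blank line
       between records.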
279

BASIC DATA MANIPULATION

281       A number of programs exist to manipulate databases.  Complex functions
282       can be made by stringing together commands with shell pipelines.  For
       example, to print the home directories of everyone with "John" in
       their names, you would do:
285
286               cat DATA/passwd | dbrow '_fullname =~ /John/' | dbcol homedir
287
288       The output might be:
289
290               #fsdb homedir
291               /home/johnh
292               /home/greg
293               # this is a simple database
294               #  | dbrow _fullname =~ /John/
295               #  | dbcol homedir
296
297       (Notice that comments are appended to the output listing each command,
298       providing an automatic audit log.)
299
300       In addition to typical database functions (select, join, etc.) there
301       are also a number of statistical functions.
302
303       The real power of Fsdb is that one can apply arbitrary code to rows to
304       do powerful things.
305
306               cat DATA/passwd | dbroweval '_fullname =~ s/(\w+)_(\w+)/$2,_$1/'
307
308       converts "John_Heidemann" into "Heidemann,_John".  Not too much more
309       work could split fullname into firstname and lastname fields.
310
       Or:

               cat DATA/passwd | dbcolcreate sort | dbroweval -b 'use Fsdb::Support' \
                       '_sort = _fullname; _sort =~ s/_/ /g; _sort = fullname_to_sort(_sort);'
315

TALKING ABOUT COLUMNS

317       An advantage of Fsdb is that you can talk about columns by name
318       (symbolically) rather than simply by their positions.  So in the above
319       example, "dbcol homedir" pulled out the home directory column, and
320       "dbrow '_fullname =~ /John/'" matched against column fullname.
321
322       In general, you can use the name of the column listed on the "#fsdb"
323       line to identify it in most programs, and _name to identify it in code.
324
325       Some alternatives for flexibility:
326
327       •   Numeric values identify columns positionally, numbering from 0.  So
328           0 or _0 is the first column, 1 is the second, etc.
329
       •   In code, _last_columnname gets the value from columnname's previous
           row.
332
333       See dbroweval(1) for more details about writing code.
334

LIST OF COMMANDS

       Enough said.  I'll summarize the commands, and then you can experiment.
       For a summary of each command, run it with the argument "--help" (or
       "-?" if you prefer).  Full manual pages can be found by running the
       command with the argument "--man", or by running the Unix command
       "man dbcol" (or whatever program you want).
341
342   TABLE CREATION
343       dbcolcreate
344           add columns to a database
345
346       dbcoldefine
347           set the column headings for a non-Fsdb file
348
349   TABLE MANIPULATION
350       dbcol
351           select columns from a table
352
353       dbrow
354           select rows from a table
355
356       dbsort
357           sort rows based on a set of columns
358
359       dbjoin
360           compute the natural join of two tables
361
362       dbcolrename
363           rename a column
364
365       dbcolmerge
366           merge two columns into one
367
368       dbcolsplittocols
369           split one column into two or more columns
370
371       dbcolsplittorows
372           split one column into multiple rows
373
374       dbfilepivot
375           "pivots" a file, converting multiple rows corresponding to the same
376           entity into a single row with multiple columns.
377
378       dbfilevalidate
379           check that db file doesn't have some common errors
380
381   COMPUTATION AND STATISTICS
382       dbcolstats
           compute statistics over a column (mean, etc., optionally median)
384
385       dbmultistats
386           group rows by some key value, then compute stats (mean, etc.) over
387           each group (equivalent to dbmapreduce with dbcolstats as the
388           reducer)
389
390       dbmapreduce
391           group rows (map) and then apply an arbitrary function to each group
392           (reduce)
393
394       dbrvstatdiff
           compare two sample distributions (mean/conf interval/T-test)
396
397       dbcolmovingstats
           compute moving statistics over a column of data
399
400       dbcolstatscores
401           compute Z-scores and T-scores over one column of data
402
403       dbcolpercentile
404           compute the rank or percentile of a column
405
406       dbcolhisto
407           compute histograms over a column of data
408
409       dbcolscorrelate
410           compute the coefficient of correlation over several columns
411
412       dbcolsregression
413           compute linear regression and correlation for two columns
414
415       dbrowaccumulate
416           compute a running sum over a column of data
417
418       dbrowcount
           count the number of rows (a subset of dbcolstats)
420
421       dbrowdiff
           compute row-by-row differences of a column in a table
423
424       dbrowenumerate
425           number each row
426
427       dbroweval
428           run arbitrary Perl code on each row
429
430       dbrowuniq
431           count/eliminate identical rows (like Unix uniq(1))
432
433       dbfilediff
434           compare fields on rows of a file (something like Unix diff(1))
435
436   OUTPUT CONTROL
437       dbcolneaten
438           pretty-print columns
439
440       dbfilealter
441           convert between column or list format, or change the column
442           separator
443
444       dbfilestripcomments
445           remove comments from a table
446
447       dbformmail
448           generate a script that sends form mail based on each row
449
450   CONVERSIONS
451       (These programs convert data into fsdb.  See their web pages for
452       details.)
453
454       cgi_to_db
455           <http://stein.cshl.org/boulder/>
456
457       combined_log_format_to_db
458           <http://httpd.apache.org/docs/2.0/logs.html>
459
460       html_table_to_db
461           HTML tables to fsdb (assuming they're reasonably formatted).
462
463       kitrace_to_db
464           <http://ficus-www.cs.ucla.edu/ficus-members/geoff/kitrace.html>
465
466       ns_to_db
467           <http://mash-www.cs.berkeley.edu/ns/>
468
469       sqlselect_to_db
470           the output of SQL SELECT tables to db
471
472       tabdelim_to_db
473           spreadsheet tab-delimited files to db
474
475       tcpdump_to_db
476           (see man tcpdump(8) on any reasonable system)
477
478       xml_to_db
479           XML input to fsdb, assuming they're very regular
480
481       (And out of fsdb:)
482
483       db_to_csv
484           Comma-separated-value format from fsdb.
485
486       db_to_html_table
487           simple conversion of Fsdb to html tables
488
489   STANDARD OPTIONS
490       Many programs have common options:
491
492       -? or --help
493           Show basic usage.
494
       -N or --new-name
496           When a command creates a new column like dbrowaccumulate's "accum",
497           this option lets one override the default name of that new column.
498
499       -T TmpDir
500           where to put tmp files.  Also uses environment variable TMPDIR, if
501           -T is not specified.  Default is /tmp.
502
505       -c FRACTION or --confidence FRACTION
506           Specify confidence interval FRACTION (dbcolstats, dbmultistats,
507           etc.)
508
509       -C S or "--element-separator S"
510           Specify column separator S (dbcolsplittocols, dbcolmerge).
511
512       -d or --debug
513           Enable debugging (may be repeated for greater effect in some
514           cases).
515
516       -a or --include-non-numeric
517           Compute stats over all data (treating non-numbers as zeros).  (By
518           default, things that can't be treated as numbers are ignored for
519           stats purposes)
520
521       -S or --pre-sorted
522           Assume the data is pre-sorted.  May be repeated to disable
523           verification (saving a small amount of work).
524
525       -e E or --empty E
526           give value E as the value for empty (null) records
527
528       -i I or --input I
529           Input data from file I.
530
531       -o O or --output O
532           Write data out to file O.
533
534       --header H
           Use H as the full Fsdb header, rather than reading a header from
           the input.  This option is particularly useful when using Fsdb
           under Hadoop, where split files don't have headers.
538
       --nolog
540           Skip logging the program in a trailing comment.
541
       When giving Perl code (in dbrow and dbroweval), column names can be
       embedded if preceded by underscores.  Look at dbrow(1) or dbroweval(1)
       for examples.
545
546       Most programs run in constant memory and use temporary files if
547       necessary.  Exceptions are dbcolneaten, dbcolpercentile, dbmapreduce,
548       dbmultistats, dbrowsplituniq.
549

ANOTHER EXAMPLE

       Take the raw data in "DATA/http_bandwidth", put a header on it
       ("dbcoldefine size bw"), take statistics of each category
       ("dbmultistats -k size bw"), pick out the relevant fields ("dbcol size
       mean stddev pct_rsd"), and you get:
555
556               #fsdb size mean stddev pct_rsd
557               1024    1.4962e+06      2.8497e+05      19.047
558               10240   5.0286e+06      6.0103e+05      11.952
559               102400  4.9216e+06      3.0939e+05      6.2863
560               #  | dbcoldefine size bw
561               #  | /home/johnh/BIN/DB/dbmultistats -k size bw
562               #  | /home/johnh/BIN/DB/dbcol size mean stddev pct_rsd
563
564       (The whole command was:
565
566               cat DATA/http_bandwidth |
               dbcoldefine size bw |
568               dbmultistats -k size bw |
569               dbcol size mean stddev pct_rsd
570
571       all on one line.)
572
573       Then post-process them to get rid of the exponential notation by adding
574       this to the end of the pipeline:
575
576           dbroweval '_mean = sprintf("%8.0f", _mean); _stddev = sprintf("%8.0f", _stddev);'
577
578       (Actually, this step is no longer required since dbcolstats now uses a
579       different default format.)
580
581       giving:
582
583               #fsdb      size    mean    stddev  pct_rsd
584               1024     1496200          284970        19.047
585               10240    5028600          601030        11.952
586               102400   4921600          309390        6.2863
587               #  | dbcoldefine size bw
588               #  | dbmultistats -k size bw
589               #  | dbcol size mean stddev pct_rsd
590               #  | dbroweval   { _mean = sprintf("%8.0f", _mean); _stddev = sprintf("%8.0f", _stddev); }
591
592       In a few lines, raw data is transformed to processed output.
593
594       Suppose you expect there is an odd distribution of results of one
595       datapoint.  Fsdb can easily produce a CDF (cumulative distribution
596       function) of the data, suitable for graphing:
597
598           cat DB/DATA/http_bandwidth | \
599               dbcoldefine size bw | \
600               dbrow '_size == 102400' | \
601               dbcol bw | \
602               dbsort -n bw | \
603               dbrowenumerate | \
604               dbcolpercentile count | \
605               dbcol bw percentile | \
606               xgraph
607
       The steps, roughly:

       1.  get the raw input data and turn it into fsdb format,

       2.  pick out just the relevant column (for efficiency) and sort it,

       3.  for each data point, assign a CDF percentage to it,

       4.  pick out the two columns to graph and show them.
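
       The core of the sort-and-rank step can be sketched without Fsdb as
       follows.  The helper cdf is hypothetical, and the rank / n definition
       it uses is my assumption for illustration; dbcolpercentile may compute
       percentiles differently, so see its manual page for the real
       behavior.

```shell
# cdf: hypothetical helper, not an Fsdb command. Reads one number per line
# and prints "value fraction", where fraction = rank / n after sorting.
# This rank / n definition is an assumption; dbcolpercentile may differ.
cdf() {
    sort -n | awk '
        { v[NR] = $1 }                     # remember the sorted values
        END {
            for (i = 1; i <= NR; i++) printf "%s %.2f\n", v[i], i / NR
        }
    '
}

printf '%s\n' 30 10 20 40 | cdf
```

       For the four sample values this prints 10 0.25, 20 0.50, 30 0.75 and
       40 1.00, one pair per line, which is the two-column shape a graphing
       program expects.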
612

A GRADEBOOK EXAMPLE

614       The first commercial program I wrote was a gradebook, so here's how to
615       do it with Fsdb.
616
617       Format your data like DATA/grades.
618
619               #fsdb name email id test1
620               a a@ucla.example.edu 1 80
621               b b@usc.example.edu 2 70
622               c c@isi.example.edu 3 65
623               d d@lmu.example.edu 4 90
624               e e@caltech.example.edu 5 70
625               f f@oxy.example.edu 6 90
626
627       Or if your students have spaces in their names, use "-F S" and two
628       spaces to separate each column:
629
630               #fsdb -F S name email id test1
631               alfred aho  a@ucla.example.edu  1  80
632               butler lampson  b@usc.example.edu  2  70
633               david clark  c@isi.example.edu  3  65
634               constantine drovolis  d@lmu.example.edu  4  90
635               debrorah estrin  e@caltech.example.edu  5  70
636               sally floyd  f@oxy.example.edu  6  90
637
638       To compute statistics on an exam, do
639
               cat DATA/grades | dbcolstats test1 | dblistize
641
642       giving
643
644               #fsdb -R C  ...
645               mean:        77.5
646               stddev:      10.84
647               pct_rsd:     13.987
648               conf_range:  11.377
649               conf_low:    66.123
650               conf_high:   88.877
651               conf_pct:    0.95
652               sum:         465
653               sum_squared: 36625
654               min:         65
655               max:         90
656               n:           6
657               ...
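
       Several of these figures are easy to check by hand.  The sketch below
       recomputes them from the six test1 scores with plain awk; col_stats is
       a hypothetical helper, not an Fsdb command, and it uses the textbook
       sum-of-squares formula rather than the numerically careful methods
       Fsdb aims for.

```shell
# col_stats: hypothetical helper, not an Fsdb command. Reads one number per
# line and prints n, sum, sum_squared, mean and the sample standard deviation.
col_stats() {
    awk '
        { n++; sum += $1; ss += $1 * $1 }
        END {
            if (n < 2) exit 1              # sample stddev needs two or more values
            mean = sum / n
            # Sample variance: (sum of squares - n * mean^2) / (n - 1).
            var = (ss - n * mean * mean) / (n - 1)
            printf "n: %d\nsum: %d\nsum_squared: %d\n", n, sum, ss
            printf "mean: %.1f\nstddev: %.2f\n", mean, sqrt(var)
        }
    '
}

# The test1 column from DATA/grades above.
printf '%s\n' 80 70 65 90 70 90 | col_stats
```

       The result agrees with the listing: n 6, sum 465, sum_squared 36625,
       mean 77.5, and stddev 10.84.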
658
659       To do a histogram:
660
661               cat DATA/grades | dbcolhisto -n 5 -g test1
662
663       giving
664
665               #fsdb low histogram
666               65      *
667               70      **
668               75
669               80      *
670               85
671               90      **
672               #  | /home/johnh/BIN/DB/dbhistogram -n 5 -g test1
673
674       Now you want to send out grades to the students by e-mail.  Create a
675       form-letter (in the file test1.txt):
676
677               To: _email (_name)
678               From: J. Random Professor <jrp@usc.example.edu>
679               Subject: test1 scores
680
681               _name, your score on test1 was _test1.
682               86+   A
683               75-85 B
684               70-74 C
685               0-69  F
686
687       Generate the shell script that will send the mail out:
688
689               cat DATA/grades | dbformmail test1.txt > test1.sh
690
691       And run it:
692
693               sh <test1.sh
694
695       The last two steps can be combined:
696
697               cat DATA/grades | dbformmail test1.txt | sh
698
699       but I like to keep a copy of exactly what I send.
700
701       At the end of the semester you'll want to compute grade totals and
702       assign letter grades.  Both fall out of dbroweval.  For example, to
703       compute weighted total grades with a 40% midterm/60% final where the
704       midterm is 84 possible points and the final 100:
705
706               dbcol -rv total |
707               dbcolcreate total - |
708               dbroweval '
709                       _total = .40 * _midterm/84.0 + .60 * _final/100.0;
710                       _total = sprintf("%4.2f", _total);
711                       if (_final eq "-" || ( _name =~ /^_/)) { _total = "-"; };' |
712               dbcolneaten
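
       The arithmetic inside that dbroweval can be tried on its own.  In this
       sketch, weighted_total is a hypothetical helper (not an Fsdb command)
       that takes a midterm and a final score as arguments; it leaves out the
       check on names beginning with an underscore.

```shell
# weighted_total MIDTERM FINAL: hypothetical helper, not an Fsdb command.
# Applies the 40%/60% weighting from the pipeline above, with the midterm
# out of 84 points and the final out of 100; a missing final ("-") stays "-".
weighted_total() {
    awk -v mid="$1" -v fin="$2" 'BEGIN {
        if (fin == "-") { print "-"; exit }
        printf "%4.2f\n", 0.40 * mid / 84.0 + 0.60 * fin / 100.0
    }'
}

weighted_total 42 80     # half the midterm points, 80 on the final
weighted_total 63 -      # final not taken yet
```

       The first call prints 0.68 (0.40 * 0.5 + 0.60 * 0.8), and the second
       prints "-".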
713
714       If you got the data originally from a spreadsheet, save it in "tab-
715       delimited" format and convert it with tabdelim_to_db (run
716       tabdelim_to_db -? for examples).
717

A PASSWORD EXAMPLE

719       To convert the Unix password file to db:
720
721               cat /etc/passwd | sed 's/:/  /g'| \
722                       dbcoldefine -F S login password uid gid gecos home shell \
723                       >passwd.fsdb
724
725       To convert the group file
726
727               cat /etc/group | sed 's/:/  /g' | \
728                       dbcoldefine -F S group password gid members \
729                       >group.fsdb
730
731       To show the names of the groups that div7-members are in (assuming DIV7
732       is in the gecos field):
733
734               cat passwd.fsdb | dbrow '_gecos =~ /DIV7/' | dbcol login gid | \
735                       dbjoin -i - -i group.fsdb gid | dbcol login group
736

SHORT EXAMPLES

738       Which Fsdb programs are the most complicated (based on number of test
739       cases)?
740
741               ls TEST/*.cmd | \
742                       dbcoldefine test | \
743                       dbroweval '_test =~ s@^TEST/([^_]+).*$@$1@' | \
744                       dbrowuniq -c | \
745                       dbsort -nr count | \
746                       dbcolneaten
747
748       (Answer: dbmapreduce, then dbcolstats, dbfilealter and dbjoin.)
749
750       Stats on an exam (in $FILE, where $COLUMN is the name of the exam)?
751
               cat $FILE | dbcolstats -q 4 $COLUMN | dblistize | dbfilestripcomments

               cat $FILE | dbcolhisto -g -n 20 $COLUMN | dbcolneaten | dbfilestripcomments
755
       Merging the hw1 column from file hw1.fsdb into grades.fsdb assuming
757       there's a common student id in column "id":
758
759               dbcol id hw1 <hw1.fsdb >t.fsdb
760
761               dbjoin -a -e - grades.fsdb t.fsdb id | \
762                   dbsort  name | \
763                   dbcolneaten >new_grades.fsdb
764
       Merging two fsdb files with the same columns:
766
767               cat file1.fsdb file2.fsdb >output.fsdb
768
769       or if you want to clean things up a bit
770
771               cat file1.fsdb file2.fsdb | dbstripextraheaders >output.fsdb
772
773       or if you want to know where the data came from
774
775               for i in 1 2
776               do
777                       dbcolcreate source $i < file$i.fsdb
778               done >output.fsdb
779
780       (assumes you're using a Bourne-shell compatible shell, not csh).
781

WARNINGS

783       As with any tool, one should (which means must) understand the limits
784       of the tool.
785
786       All Fsdb tools should run in constant memory.  In some cases (such as
787       dbcolstats with quartiles, where the whole input must be re-read),
788       programs will spool data to disk if necessary.
789
       Most tools buffer one or a few lines of data, so memory will scale with
       the size of each line.  (So lines with many columns, or columns with
       lots of data, may cause large memory consumption.)
793
794       All Fsdb tools should run in constant or at worst "n log n" time.
795
       All Fsdb tools use normal Perl math routines for computation.  I make
       every attempt to choose numerically stable algorithms (and I welcome
       feedback and suggestions for improvement), but normal rounding due to
       computer floating point approximations can result in inaccuracies when
       data spans a large range of precision.  (See for example the
       dbcolstats_extrema test cases.)
802
       Any requirements and limitations of each Fsdb tool are documented on
       its manual page.
805
806       If any Fsdb program violates these assumptions, that is a bug that
807       should be documented on the tool's manual page or ideally fixed.
808
809       Fsdb does depend on Perl's correctness, and Perl (and Fsdb) have some
810       bugs.  Fsdb should work on perl from version 5.10 onward.
811

HISTORY

813       There have been three versions of Fsdb; fsdb 1.0 is a complete re-write
814       of the pre-1995 versions, and was distributed from 1995 to 2007.  Fsdb
815       2.0 is a significant re-write of the 1.x versions for reasons described
816       below.
817
818       Fsdb (in its various forms) has been used extensively by its author
819       since 1991.  Since 1995 it's been used by two other researchers at UCLA
820       and several at ISI.  In February 1998 it was announced to the Internet.
821       Since then it has found a few users, some outside where I work.
822
823       Major changes:
824
825       1.0 1997-07-22: first public release.
826       2.0 2008-01-25: rewrite to use a common library, and starting to use
827       threads.
828       2.12 2008-10-16: completion of the rewrite, and first RPM package.
829       2.44 2013-10-02: abandoning threads for improved performance
830
831   Fsdb 2.0 Rationale
832       I've thought about fsdb-2.0 for many years, but it was started in
833       earnest in 2007.  Fsdb-2.0 has the following goals:
834
835       in-one-process processing
836           While fsdb is great on the Unix command line as a pipeline between
837           programs, it should also be possible to set it up to run in a
838           single process.  And if it does so, it should be able to avoid
839           serializing and deserializing (converting to and from text) data
840           between each module.  (Accomplished in fsdb-2.0: see dbpipeline,
841           although still needs tuning.)
842
843       clean IO API
844           Fsdb's roots go back to perl4 and 1991, so the fsdb-1.x library is
845           very, very crufty.  More than just being ugly (but it was that
846           too), this made tasks like reading one file format and writing
847           another the application's job, when it should be the library's.
848           (Accomplished in fsdb-1.15 and improved in 2.0: see Fsdb::IO.)
849
850       normalized module APIs
851           Because fsdb modules were added as needed over 10 years, sometimes
852           the module APIs became inconsistent.  (For example, the 1.x
853           "dbcolcreate" required an empty value following the name of the new
854           column, but other programs specify empty values with the "-e"
855           argument.)  We should smooth over these inconsistencies.
856           (Accomplished as each module was ported in 2.0 through 2.7.)
857
858       everyone handles all input formats
859           Given a clean IO API, the distinction between "colized" and
860           "listized" fsdb files should go away.  Any program should be able
861           to read and write files in any format.  (Accomplished in fsdb-2.1.)
862
863       Fsdb-2.0 preserves backwards compatibility where possible, but breaks
864       it where necessary to accomplish the above goals.  In August 2008,
865       Fsdb-2.7 was declared preferred over the 1.x versions.  Benchmarking in
866       2013 showed that threading performed much worse than just using pipes,
867       so Fsdb-2.44 uses threading "style", but implemented with processes
868       (via my "Freds" library).
869
870   Contributors
871       Fsdb includes code ported from Geoff Kuenning
872       ("Fsdb::Support::TDistribution").
873
874       Fsdb contributors: Ashvin Goel goel@cse.oge.edu, Geoff Kuenning
875       geoff@fmg.cs.ucla.edu, Vikram Visweswariah visweswa@isi.edu, Kannan
876       Varadahan kannan@isi.edu, Lars Eggert larse@isi.edu, Arkadi Gelfond
877       arkadig@dyna.com, David Graff graff@ldc.upenn.edu, Haobo Yu
878       haoboy@packetdesign.com, Pavlin Radoslavov pavlin@catarina.usc.edu,
879       Graham Phillips, Yuri Pradkin, Alefiya Hussain, Ya Xu, Michael
880       Schwendt, Fabio Silva fabio@isi.edu, Jerry Zhao zhaoy@isi.edu, Ning Xu
881       nxu@aludra.usc.edu, Martin Lukac mlukac@lecs.cs.ucla.edu, Xue Cai,
882       Michael McQuaid, Christopher Meng, Calvin Ardi, H. Merijn Brand, Lan
883       Wei, Hang Guo.
884
885       Fsdb includes datasets contributed from NIST (DATA/nist_zarr13.fsdb),
886       from
887       <http://www.itl.nist.gov/div898/handbook/eda/section4/eda4281.htm>, the
888       NIST/SEMATECH e-Handbook of Statistical Methods, section 1.4.2.8.1.
889       Background and Data.  The source is public domain, and reproduced with
890       permission.
891

RELATED WORK

893       As stated in the introduction, Fsdb is an incompatible reimplementation
894       of the ideas found in "/rdb".  By storing data in simple text files and
895       processing it with pipelines it is easy to experiment (in the shell)
896       and look at the output.  The original implementation of this idea was
897       /rdb, a commercial product described in the book UNIX relational
898       database management: application development in the UNIX environment by
899       Rod Manis, Evan Schaffer, and Robert Jorgensen (and also at the web
900       page <http://www.rdb.com/>).
901
902       While Fsdb is inspired by Rdb, it includes no code from it, and Fsdb
903       makes several different design choices.  In particular: rdb attempts to
904       be closer to a "real" database, with provision for locking and file
905       indexing.  Fsdb focuses on single user use and so eschews these
906       choices.  Rdb also has some support for interactive editing.  Fsdb
907       leaves editing to text editors like emacs or vi.
908
909       In August, 2002 I found out Carlo Strozzi extended RDB with his package
910       NoSQL <http://www.linux.it/~carlos/nosql/>.  According to Mr. Strozzi,
911       he implemented NoSQL in awk to avoid the Perl start-up of RDB.
912       Although I haven't found Perl startup overhead to be a big problem on
913       my platforms (from old Sparcstation IPCs to 2GHz Pentium-4s), you may
914       want to evaluate his system.  The Linux Journal has a description of
915       NoSQL at <http://www.linuxjournal.com/article/3294>.  It seems quite
916       similar to Fsdb.  Like /rdb, NoSQL supports indexing (not present in
917       Fsdb).  Fsdb appears to have richer support for statistics, and, as of
918       Fsdb-2.x, its support for Perl threading may support faster performance
919       (one-process, less serialization and deserialization).
920

RELEASE NOTES

922       Versions prior to 1.0 were released informally on my web page but were
923       not announced.
924
925   0.0 1991
926       started for my own research use
927
928   0.1 26-May-94
929       first check-in to RCS
930
931   0.2 15-Mar-95
932       parts now require perl5
933
934   1.0, 22-Jul-97
935       adds autoconf support and a test script.
936
937   1.1, 20-Jan-98
938       support for double space field separators, better tests
939
940   1.2, 11-Feb-98
941       minor changes and release on comp.lang.perl.announce
942
943   1.3, 17-Mar-98
944       •   adds median and quartile options to dbstats
945
946       •   adds dmalloc_to_db converter
947
948       •   fixes some warnings
949
950       •   dbjoin now can run on unsorted input
951
952       •   fixes a dbjoin bug
953
954       •   some more tests in the test suite
955
956   1.4, 27-Mar-98
957       •   improves error messages (all should now report the program that
958           makes the error)
959
960       •   fixed a bug in dbstats output when the mean is zero
961
962   1.5, 25-Jun-98
963       BUG FIX dbcolhisto, dbcolpercentile now handles non-numeric values like
964       dbstats
965       NEW dbcolstats computes zscores and tscores over a column
966       NEW dbcolscorrelate computes correlation coefficients between two
967       columns
968       INTERNAL ficus_getopt.pl has been replaced by DbGetopt.pm
969       BUG FIX all tests are now ``portable'' (previously some tests ran only
970       on my system)
971       BUG FIX you no longer need to have the db programs in your path (fix
972       arose from a discussion with Arkadi Gelfond)
973       BUG FIX installation no longer uses cp -f (to work on SunOS 4)
974
975   1.6, 24-May-99
976       NEW dbsort, dbstats, dbmultistats now run in constant memory (using tmp
977       files if necessary)
978       NEW dbcolmovingstats does moving means over a series of data
979       NEW dbcol has a -v option to get all columns except those listed
980       NEW dbmultistats does quartiles and medians
981       NEW dbstripextraheaders now also cleans up bogus comments before the
982       first header
983       BUG FIX dbcolneaten works better with double-space-separated data
984
985   1.7,  5-Jan-00
986       NEW dbcolize now detects and rejects lines that contain embedded copies
987       of the field separator
988       NEW configure tries harder to prevent people from improperly
989       configuring/installing fsdb
990       NEW tcpdump_to_db converter (incomplete)
991       NEW tabdelim_to_db converter:  from spreadsheet tab-delimited files to
992       db
993       NEW mailing lists for fsdb are "fsdb-announce@heidemann.la.ca.us" and
994       "fsdb-talk@heidemann.la.ca.us"
995           To subscribe to either, send mail to
996           "fsdb-announce-request@heidemann.la.ca.us" or
997           "fsdb-talk-request@heidemann.la.ca.us" with "subscribe" in the
998           BODY of the message.
999
1000       BUG FIX dbjoin used to produce incorrect output if there were extra,
1001       unmatched values in the 2nd table. Thanks to Graham Phillips for
1002       providing a test case.
1003       BUG FIX the sample commands in the usage strings now all should
1004       explicitly include the source of data (typically from "cat foo.fsdb
1005       |").  Thanks to Ya Xu for pointing out this documentation deficiency.
1006       BUG FIX (DOCUMENTATION) dbcolmovingstats had incorrect sample output.
1007
1008   1.8, 28-Jun-00
1009       BUG FIX header options are now preserved when writing with dblistize
1010       NEW dbrowuniq now optionally checks for uniqueness only on certain
1011       fields
1012       NEW dbrowsplituniq makes one pass through a file and splits it into
1013       separate files based on the given fields
1014       NEW converter for "crl" format network traces
1015       NEW anywhere you use arbitrary code (like dbroweval), _last_foo now
1016       maps to the last row's value for field _foo.
1017       OPTIMIZATION comment processing slightly changed so that dbmultistats
1018       now is much faster on files with lots of comments (for example, ~100k
1019       lines of comments and 700 lines of data!) (Thanks to Graham Phillips
1020       for pointing out this performance problem.)
1021       BUG FIX dbstats with median/quartiles now correctly handles singleton
1022       data points.
1023
1024   1.9,  6-Nov-00
1025       NEW dbfilesplit, split a single input file into multiple output files
1026       (based on code contributed by Pavlin Radoslavov).
1027       BUG FIX dbsort now works with perl-5.6
1028
1029   1.10, 10-Apr-01
1030       BUG FIX dbstats now handles the case where there are more n-tiles than
1031       data
1032       NEW dbstats now includes a -S option to optimize work on pre-sorted
1033       data (inspired by code contributed by Haobo Yu)
1034       BUG FIX dbsort now has a better estimate of memory usage when run on
1035       data with very short records (problem detected by Haobo Yu)
1036       BUG FIX cleanup of temporary files is slightly better
1037
1038   1.11,  2-Nov-01
1039       BUG FIX dbcolneaten now runs in constant memory
1040       NEW dbcolneaten now supports "field specifiers" that allow some control
1041       over how wide columns should be
1042       OPTIMIZATION dbsort now tries hard to be filesystem cache-friendly
1043       (inspired by "Information and Control in Gray-box Systems" by the
1044       Arpaci-Dusseaus at SOSP 2001)
1045       INTERNAL t_distr now ported to perl5 module DbTDistr
1046
1047   1.12,  30-Oct-02
1048       BUG FIX dbmultistats documentation typo fixed
1049       NEW dbcolmultiscale
1050       NEW dbcol has -r option for "relaxed error checking"
1051       NEW dbcolneaten has new -e option to strip end-of-line spaces
1052       NEW dbrow finally has a -v option to negate the test
1053       BUG FIX math bug in dbcoldiff fixed by Ashvin Goel (need to check
1054       Scheaffer test cases)
1055       BUG FIX some patches to run with Perl 5.8. Note: some programs
1056       (dbcolmultiscale, dbmultistats, dbrowsplituniq) generate warnings like:
1057       "Use of uninitialized value in concatenation (.) or string at
1058       /usr/lib/perl5/5.8.0/FileCache.pm line 98, <STDIN> line 2". Please
1059       ignore this until I figure out how to suppress it. (Thanks to Jerry
1060       Zhao for noticing perl-5.8 problems.)
1061       BUG FIX fixed an autoconf problem where configure would fail to find a
1062       reasonable prefix (thanks to Fabio Silva for reporting the problem)
1063       NEW db_to_html_table: simple conversion to html tables (NO fancy stuff)
1064       NEW dblib now has a function dblib_text2html() that will do simple
1065       conversion of iso-8859-1 to HTML
1066
1067   1.13,  4-Feb-04
1068       NEW fsdb added to the freebsd ports tree
1069       <http://www.freshports.org/databases/fsdb/>.  Maintainer:
1070       "larse@isi.edu"
1071       BUG FIX properly handle trailing spaces when data must be numeric (ex.
1072       dbstats with -FS, see test dbstats_trailing_spaces). Fix from Ning Xu
1073       "nxu@aludra.usc.edu".
1074       NEW dbcolize error message improved (bug report from Terrence Brannon),
1075       and list format documented in the README.
1076       NEW cgi_to_db converts CGI.pm-format storage to fsdb list format
1077       BUG FIX handle numeric synonyms for column names in dbcol properly
1078       ENHANCEMENT "talking about columns" section added to README. Lack of
1079       documentation pointed out by Lars Eggert.
1080       CHANGE dbformmail now defaults to using Mail ("Berkeley Mail") to send
1081       mail, rather than sendmail (sendmail is still an option, but mail
1082       doesn't require running as root)
1083       NEW on platforms that support it (i.e., with perl 5.8), fsdb works fine
1084       with unicode
1085       NEW dbfilevalidate: check a db file for some common errors
1086
1087   1.14,  24-Aug-06
1088       ENHANCEMENT README cleanup
1089       INCOMPATIBLE CHANGE dbcolsplit renamed dbcolsplittocols
1090       NEW dbcolsplittorows  split one column into multiple rows
1091       NEW dbcolsregression compute linear regression and correlation for two
1092       columns
1093       ENHANCEMENT csv_to_db: better error handling, normalize field names,
1094       skip blank lines
1095       ENHANCEMENT dbjoin now detects (and fails) if non-joined files have
1096       duplicate names
1097       BUG FIX minor bug fixed in calculation of Student t-distributions
1098       (doesn't change any test output, but may have caused small errors)
1099
1100   1.15, 12-Nov-07
1101       NEW fsdb-1.14 added to the MacOS Fink system
1102       <http://pdb.finkproject.org/pdb/package.php/fsdb>. (Thanks to Lars
1103       Eggert for maintaining this port.)
1104       NEW Fsdb::IO::Reader and Fsdb::IO::Writer now provide reasonably clean
1105       OO I/O interfaces to Fsdb files.  Highly recommended if you use fsdb
1106       directly from perl.  In the fullness of time I expect to reimplement
1107       the entire thing using these APIs to replace the current dblib.pl which
1108       is still hobbled by its roots in perl4.
1109       NEW dbmapreduce now implements a Google-style map/reduce abstraction,
1110       generalizing dbmultistats.
1111       ENHANCEMENT fsdb now uses the Perl build system (Makefile.PL, etc.),
1112       instead of autoconf.  This change paves the way to better perl-5-style
1113       modularization, proper manual pages, input of both listize and colize
1114       format for every program, and world peace.
1115       ENHANCEMENT dblib.pl is now moved to Fsdb::Old.pm.
1116       BUG FIX dbmultistats now propagates its format argument (-f). Bug and
1117       fix from Martin Lukac (thanks!).
1118       ENHANCEMENT dbformmail documentation now is clearer that it doesn't
1119       send the mail, you have to run the shell script it writes.  (Problem
1120       observed by Unkyu Park.)
1121       ENHANCEMENT adapted to autoconf-2.61 (and then these changes were
1122       discarded in favor of The Perl Way).
1123       BUG FIX dbmultistats memory usage corrected (O(# tags), not O(1))
1124       ENHANCEMENT dbmultistats can now optionally run with pre-grouped input
1125       in O(1) memory
1126       ENHANCEMENT dbroweval -N was finally implemented (eat comments)
1127
1128   2.0, 25-Jan-08
1129       2.0, 25-Jan-08 --- a quiet 2.0 release (gearing up towards complete)
1130
1131       ENHANCEMENT: shifting old programs to Perl modules, with the front-end
1132       program as just a wrapper. In the short-term, this change just means
1133       programs have real man pages. In the long-run, it will mean that one
1134       can run a pipeline in a single Perl program. So far: dbcol, dbroweval,
1135       the new dbrowcount, dbsort, the new dbmerge, the old "dbstats" (renamed
1136       dbcolstats), dbcolrename, dbcolcreate.
1137       NEW: Fsdb::Filter::dbpipeline is an internal-only module that lets one
1138       use fsdb commands from within perl (via threads).
1139           It also provides perl function aliases for the internal modules, so
1140           a string of fsdb commands in perl is nearly as terse as in the
1141           shell:
1142
1143               use Fsdb::Filter::dbpipeline qw(:all);
1144               dbpipeline(
1145                   dbrow(qw(name test1)),
1146                   dbroweval('_test1 += 5;')
1147               );
1148
1149       INCOMPATIBLE CHANGE: The old dbcolstats has been renamed
1150       dbcolstatscores. The new dbcolstats does the same thing as the old
1151       dbstats. This incompatibility is unfortunate but normalizes program
1152       names.
1153       CHANGE: The new dbcolstats program always outputs "-" (the default
1154       empty value) for statistics it cannot compute (for example, standard
1155       deviation if there is only one row), instead of the old mix of "-" and
1156       "na".
1157       INCOMPATIBLE CHANGE: The old dbcolstats program, now called
1158       dbcolstatscores, also has different arguments.  The "-t mean,stddev"
1159       option is now "--tmean mean --tstddev stddev".  See dbcolstatscores for
1160       details.
1161       INCOMPATIBLE CHANGE: dbcolcreate now assumes all new columns get the
1162       default value rather than requiring each column to have an initial
1163       constant value. To change the initial value, use the new "-e" option.
1164       NEW: dbrowcount counts rows, an almost-subset of dbcolstats's "n"
1165       output (except without differentiating numeric/non-numeric input), or
1166       the equivalent of "dbstripcomments | wc -l".
1167       NEW: dbmerge merges two sorted files. This functionality was previously
1168       embedded in dbsort.
1169       INCOMPATIBLE CHANGE: dbjoin's "-i" option to include non-matches is now
1170       renamed "-a", so as to not conflict with the new standard option "-i"
1171       for input file.
1172
1173   2.1,  6-Apr-08
1174       2.1,  6-Apr-08 --- another alpha 2.0, but now all converted programs
1175       understand both listize and colize format
1176
1177       ENHANCEMENT: shifting more old programs to Perl modules. New in 2.1:
1178       dbcolneaten, dbcoldefine, dbcolhisto, dblistize, dbcolize, dbrecolize
1179       ENHANCEMENT dbmerge now handles an arbitrary number of input files, not
1180       just exactly two.
1181       NEW dbmerge2 is an internal routine that handles merging exactly two
1182       files.
1183       INCOMPATIBLE CHANGE dbjoin now specifies inputs like dbmerge2, rather
1184       than assuming the first two arguments were tables (as in fsdb-1).
1185           The old dbjoin argument "-i" is now "-a" or "--type=outer".
1186
1187           A minor change: comments in the source files for dbjoin are now
1188           intermixed with output rather than being delayed until the end.
1189
1190       ENHANCEMENT dbsort now no longer produces warnings when null values are
1191       passed to numeric comparisons.
1192       BUG FIX dbroweval now once again works with code that lacks a trailing
1193       semicolon. (This bug fixes a regression from 1.15.)
1194       INCOMPATIBLE CHANGE dbcolneaten's old "-e" option (to avoid end-of-line
1195       spaces) is now "-E" to avoid conflicts with the standard empty field
1196       argument.
1197       INCOMPATIBLE CHANGE dbcolhisto's old "-e" option is now "-E" to avoid
1198       conflicts. And its "-n", "-s", and "-w" are now "-N", "-S", and "-W" to
1199       correspond.
1200       NEW dbfilealter replaces dbrecolize, dblistize, and dbcolize, but with
1201       different options.
1202       ENHANCEMENT The library routines "Fsdb::IO" now understand both list-
1203       format and column-format data, so all converted programs can now
1204       automatically read either format.  This capability was one of the
1205       milestone goals for 2.0, so yea!
1206
1207   2.2, 23-May-08
1208       Release 2.2 is another 2.x alpha release.  Now most of the commands are
1209       ported, but a few remain, and I plan one last incompatible change (to
1210       the file header) before 2.x final.
1211
1212       ENHANCEMENT
1213           shifting more old programs to Perl modules.  New in 2.2:
1214           dbrowaccumulate, dbformmail.  dbcolmovingstats.  dbrowuniq.
1215           dbrowdiff.  dbcolmerge.  dbcolsplittocols.  dbcolsplittorows.
1216           dbmapreduce.  dbmultistats.  dbrvstatdiff.  Also dbrowenumerate
1217           exists only as a front-end (command-line) program.
1218
1219       INCOMPATIBLE CHANGE
1220           The following programs have been dropped from fsdb-2.x:
1221           dbcoltighten, dbfilesplit, dbstripextraheaders,
1222           dbstripleadingspace.
1223
1224       NEW combined_log_format_to_db to convert Apache logfiles
1225
1226       INCOMPATIBLE CHANGE
1227           Options to dbrowdiff are now -B and -I, not -a and -i.
1228
1229       INCOMPATIBLE CHANGE
1230           dbstripcomments is now dbfilestripcomments.
1231
1232       BUG FIXES
1233           dbcolneaten better handles empty columns; dbcolhisto warning
1234           suppressed (actually a bug in high-bucket handling).
1235
1236       INCOMPATIBLE CHANGE
1237           dbmultistats now requires a "-k" option in front of the key (tag)
1238           field, or if none is given, it will group by the first field (both
1239           like dbmapreduce).
1240
1241       KNOWN BUG
1242           dbmultistats with quantile option doesn't work currently.
1243
1244       INCOMPATIBLE CHANGE
1245           dbcoldiff is renamed dbrvstatdiff.
1246
1247       BUG FIXES
1248           dbformmail was leaving its log message as a command, not a
1249           comment.  Oops.  No longer.
1250
1251   2.3, 27-May-08 (alpha)
1252       Another alpha release, this one just to fix the critical dbjoin bug
1253       listed below (that happens to have blocked my MP3 jukebox :-).
1254
1255       BUG FIX
1256           Dbsort no longer hangs if given an input file with no rows.
1257
1258       BUG FIX
1259           Dbjoin now works with unsorted input coming from a pipeline (like
1260           stdin).  Perl-5.8.8 has a bug (?) that was making this case
1261           fail---opening stdin in one thread, reading some, then reading more
1262           in a different thread caused an lseek which works on files, but
1263           fails on pipes like stdin.  Go figure.
1264
1265       BUG FIX / KNOWN BUG
1266           The dbjoin fix also fixed dbmultistats -q (it now gives the right
1267           answer).  However, a new bug appeared, with messages like:
1268           "Attempt to free unreferenced scalar: SV 0xa9dd0c4, Perl
1269           interpreter: 0xa8350b8 during global destruction."  So the
1270           dbmultistats_quartile test is still disabled.
1271
1272   2.4, 18-Jun-08
1273       Another alpha release, mostly to fix minor usability problems in
1274       dbmapreduce and client functions.
1275
1276       ENHANCEMENT
1277           dbrow now defaults to running user supplied code without warnings
1278           (as with fsdb-1.x).  Use "--warnings" or "-w" to turn them back on.
1279
1280       ENHANCEMENT
1281           dbroweval can now write different format output than the input,
1282           using the "-m" option.
1283
1284       KNOWN BUG
1285           dbmapreduce emits warnings on perl 5.10.0 about "Unbalanced string
1286           table refcount" and "Scalars leaked" when run with an external
1287           program as a reducer.
1288
1289           dbmultistats emits the warning "Attempt to free unreferenced
1290           scalar" when run with quartiles.
1291
1292           In each case the output is correct.  I believe these can be
1293           ignored.
1294
1295       CHANGE
1296           dbmapreduce no longer logs a line for each reducer that is invoked.
1297
1298   2.5, 24-Jun-08
1299       Another alpha release, fixing more minor bugs in "dbmapreduce" and
1300       lossage in "Fsdb::IO".
1301
1302       ENHANCEMENT
1303           dbmapreduce can now tolerate non-map-aware reducers that pass back
1304           the key column in output.  It also passes the current key as the last
1305           argument to external reducers.
1306
1307       BUG FIX
1308           Fsdb::IO::Reader, correctly handle "-header" option again.  (Broken
1309           since fsdb-2.3.)
1310
1311   2.6, 11-Jul-08
1312       Another alpha release, needed to fix DaGronk.  One new port, small bug
1313       fixes, and important fix to dbmapreduce.
1314
1315       ENHANCEMENT
1316           shifting more old programs to Perl modules.  New in 2.6:
1317           dbcolpercentile.
1318
1319       INCOMPATIBLE CHANGE and ENHANCEMENTS dbcolpercentile arguments changed,
1320       use "--rank" to require ranking instead of "-r". Also, "--ascending"
1321       and "--descending" can now be specified separately, both for
1322       "--percentile" and "--rank".
1323       BUG FIX
1324           Sigh, the sense of the --warnings option in dbrow was inverted.  No
1325           longer.
1326
1327       BUG FIX
1328           I found and fixed the string leaks (errors like "Unbalanced string
1329           table refcount" and "Scalars leaked") in dbmapreduce and
1330           dbmultistats.  (All "IO::Handle"s in threads must be manually
1331           destroyed.)
1332
1333       BUG FIX
1334           The "-C" option to specify the column separator in dbcolsplittorows
1335           now works again (broken since it was ported).
1336
1337   2.7, 30-Jul-08 beta
1338
1339       The beta release of fsdb-2.x.  Finally, all programs are ported.  As
1340       statistics, the number of lines of non-library code doubled from 7.5k
1341       to 15.5k.  The libraries are much more complete, going from 866 to 5164
1342       lines.  The overall number of programs is about the same, although 19
1343       were dropped and 11 were added.  The number of test cases has grown
1344       from 116 to 175.  All programs are now in perl-5, no more shell scripts
1345       or perl-4.  All programs now have manual pages.
1346
1347       Although this is a major step forward, I still expect to rename "jdb"
1348       to "fsdb".
1349
1350       ENHANCEMENT
1351           shifting more old programs to Perl modules.  New in 2.7:
1352           dbcolscorrelate, dbcolsregression, cgi_to_db, dbfilevalidate,
1353           db_to_csv, csv_to_db, db_to_html_table, kitrace_to_db,
1354           tcpdump_to_db, tabdelim_to_db, ns_to_db.
1355
1356       INCOMPATIBLE CHANGE
1357           The following programs have been dropped from fsdb-2.x: db2dcliff,
1358           dbcolmultiscale, crl_to_db, ipchain_logs_to_db.  They may come
1359           back, but seemed overly specialized.  The program dbrowsplituniq
1360           was dropped because it is superseded by dbmapreduce.
1361           dmalloc_to_db was dropped pending test cases and examples.
1362
1363       ENHANCEMENT
1364           dbfilevalidate now has a "-c" option to correct errors.
1365
1366       NEW html_table_to_db provides the inverse of db_to_html_table.
1367
1368   2.8,  5-Aug-08
1369       Change header format, preserving forwards compatibility.
1370
1371       BUG FIX
1372           Complete editing pass over the manual, making sure it aligns with
1373           fsdb-2.x.
1374
1375       SEMI-COMPATIBLE CHANGE
1376           The header of fsdb files has changed: it is now #fsdb, not #h (or
1377           #L), and parsing of -F and -R is also different.  See dbfilealter
1378           for the new specification.  The v1 file format will be read,
1379           compatibly, but not written.
1380
1381       BUG FIX
1382           dbmapreduce now tolerates comments that precede the first key,
1383           instead of failing with an error message.
1384
1385   2.9, 6-Aug-08
1386       Still in beta; just a quick bug-fix for dbmapreduce.
1387
1388       ENHANCEMENT
1389           dbmapreduce now generates plausible output when given no rows of
1390           input.
1391
1392   2.10, 23-Sep-08
1393       Still in beta, but picking up some bug fixes.
1394
1395       ENHANCEMENT
1396           dbmapreduce now generates plausible output when given no rows of
1397           input.
1398
1399       ENHANCEMENT
1400           dbroweval: the warnings option was backwards; now corrected.  As a
1401           result, warnings in user code now default off (like in fsdb-1.x).
1402
1403       BUG FIX
1404           dbcolpercentile now defaults to assuming the target column is
1405           numeric.  The new option "-N" allows selection of a non-numeric
1406           target.
1407
1408       BUG FIX
1409           dbcolscorrelate now includes "--sample" and "--nosample" options to
1410           compute the sample or full population correlation coefficients.
1411           Thanks to Xue Cai for finding this bug.
1412
1413   2.11, 14-Oct-08
1414       Still in beta, but picking up some bug fixes.
1415
1416       ENHANCEMENT
1417           html_table_to_db is now more aggressive about filling in empty
1418           cells with the official empty value, rather than leaving them blank
1419           or as whitespace.
1420
1421       ENHANCEMENT
1422           dbpipeline now catches failures during pipeline element setup and
1423           exits reasonably gracefully.
1424
1425       BUG FIX
1426           dbsubprocess now reaps child processes, thus avoiding running out
1427           of processes when used a lot.
1428
1429   2.12, 16-Oct-08
1430       Finally, a full (non-beta) 2.x release!
1431
1432       INCOMPATIBLE CHANGE
1433           Jdb has been renamed Fsdb, the flatfile-streaming database.  This
1434           change affects all internal Perl APIs, but no shell command-level
1435           APIs.  While Jdb served well for more than ten years, it is easily
1436           confused with the Java debugger (even though Jdb was there first!).
1437           It also is too generic to work well in web search engines.
1438           Finally, Jdb stands for ``John's database'', and we're a bit beyond
1439           that.  (However, some call me the ``file-system guy'', so one could
1440           argue it retains that meaning.)
1441
1442           If you just used the shell commands, this change should not affect
1443           you.  If you used the Perl-level libraries directly in your code,
1444           you should be able to rename "Jdb" to "Fsdb" to move to 2.12.
1445
1446           The jdb-announce list has not yet been renamed, but it will be shortly.
1447
1448           With this release I've accomplished everything I wanted to in
1449           fsdb-2.x.  I therefore expect to return to boring, bugfix releases.
1450
1451   2.13, 30-Oct-08
1452       BUG FIX
1453           dbrowaccumulate now treats non-numeric data as zero by default.
1454
1455       BUG FIX
1456           Fixed a perl-5.10ism in dbmapreduce that breaks that program under
1457           5.8.  Thanks to Martin Lukac for reporting the bug.
1458
1459   2.14, 26-Nov-08
1460       BUG FIX
1461           Improved documentation for dbmapreduce's "-f" option.
1462
1463       ENHANCEMENT
1464           dbcolmovingstats now computes a moving standard deviation in
1465           addition to a moving mean.
1466
1467   2.15, 13-Apr-09
1468       BUG FIX
1469           Fix a make install bug reported by Shalindra Fernando.
1470
1471   2.16, 14-Apr-09
1472       BUG FIX
1473           Another minor release bug: on some systems programize_module loses
1474           executable permissions.  Again reported by Shalindra Fernando.
1475
1476   2.17, 25-Jun-09
1477       TYPO FIXES
1478           Typo in the dbroweval manual fixed.
1479
1480       IMPROVEMENT
1481           There is no longer a comment line to label columns in dbcolneaten,
1482           instead the header line is tweaked to line up.  This change
1483           restores the Jdb-1.x behavior, and means that repeated runs of
1484           dbcolneaten no longer add comment lines each time.
1485
1486       BUG FIX
1487           It turns out dbcolneaten was not correctly handling trailing
1488           spaces when given the "-E" option to suppress them.  This
1489           regression is now fixed.
1490
1491       EXTENSION
1492           dbroweval(1) can now handle direct references to the last row via
1493           $lfref, a dubious but now documented feature.
1494
1495       BUG FIXES
1496           Separators set with "-C" in dbcolmerge and dbcolsplittocols were
1497           not properly setting the heading, and null fields were not
1498           recognized.  The first bug was reported by Martin Lukac.
1499
1500   2.18,  1-Jul-09  A minor release
1501       IMPROVEMENT
1502           Documentation for Fsdb::IO::Reader has been improved.
1503
1504       IMPROVEMENT
1505           The package should now be PGP-signed.
1506
1507   2.19,  10-Jul-09
1508       BUG FIX
1509           Internal improvements to debugging output and robustness of
1510           dbmapreduce and dbpipeline.  TEST/dbpipeline_first_fails.cmd re-
1511           enabled.
1512
1513   2.20, 30-Nov-09 (A collection of minor bugfixes, plus a build against
1514       Fedora 12.)
1515       BUG FIX
1516           Logging for dbmapreduce with code refs is now stable (it no longer
1517           includes a hex pointer to the code reference).
1518
1519       BUG FIX
1520           Better handling of mixed blank lines in Fsdb::IO::Reader (see test
1521           case dbcolize_blank_lines.cmd).
1522
1523       BUG FIX
1524           html_table_to_db now handles multi-line input better, and handles
1525           tables with COLSPAN.
1526
1527       BUG FIX
1528           dbpipeline now cleans up threads in an "eval" to prevent "cannot
1529           detach a joined thread" errors that popped up in perl-5.10.
1530           Hopefully this prevents a race condition that causes the test
1531           suites to hang about 20% of the time (in dbpipeline_first_fails).
1532
1533       IMPROVEMENT
1534           dbmapreduce now detects and correctly fails when the input and
1535           reducer have incompatible field separators.
1536
1537       IMPROVEMENT
1538           dbcolstats, dbcolhisto, dbcolscorrelate, dbcolsregression, and
1539           dbrowcount now all take an "-F" option to let one specify the
1540           output field separator (so they work better with dbmapreduce).
1541
1542       BUG FIX
1543           An omitted "-k" from the manual page of dbmultistats is now there.
1544           Bug reported by Unkyu Park.
1545
1546   2.21, 17-Apr-10 bug fix release
1547       BUG FIX
1548           Fsdb::IO::Writer now no longer fails with -outputheader => never
1549           (an obscure bug).
1550
1551       IMPROVEMENT
1552           Fsdb (in the warnings section) and dbcolstats now more carefully
1553           document how they handle (and do not handle) numerical precision
1554           problems, and other general limits.  Thanks to Yuri Pradkin for
1555           prompting this documentation.
1556
1557       IMPROVEMENT
1558           "Fsdb::Support::fullname_to_sortkey" is now restored from "Jdb".
1559
1560       IMPROVEMENT
1561           Documentation for multiple styles of input approaches (including
1562           performance description) added to Fsdb::IO.
1563
1564   2.22, 2010-10-31 One new tool dbcolcopylast and several bug fixes for Perl
1565       5.10.
1566       BUG FIX
1567           dbmerge now correctly handles n-way merges.  Bug reported by Yuri
1568           Pradkin.
1569
1570       INCOMPATIBLE CHANGE
1571           dbcolneaten now defaults to not padding the last column.
1572
1573       ADDITION
1574           dbrowenumerate now takes -N NewColumn to give the new column a name
1575           other than "count".  Feature requested by Mike Rouch in January
1576           2005.
1577
1578       ADDITION
1579           New program dbcolcopylast copies the last value of a column into a
1580           new column copylast_column of the next row.  New program requested
1581           by Fabio Silva; useful for converting dbmultistats output into
1582           dbrvstatdiff input.
1583
1584       BUG FIX
1585           Several tools (particularly dbmapreduce and dbmultistats) would
1586           report errors like "Unbalanced string table refcount: (1) for
1587           "STDOUT" during global destruction" on exit, at least on certain
1588           versions of Perl (for me on 5.10.1), but similar errors have been
1589           off-and-on for several Perl releases.  Although I think my code
1590           looked OK, I worked around this problem with a different way of
1591           handling standard IO redirection.
1592
1593   2.23, 2011-03-10 Several small portability bugfixes; improved dbcolstats
1594       for large datasets
1595       IMPROVEMENT
1596           Documentation to dbrvstatdiff was changed to use "sd" to refer to
1597           standard deviation, not "ss" (which might be confused with sum-of-
1598           squares).
1599
1600       BUG FIX
1601           This documentation about dbmultistats was missing the -k option in
1602           some cases.
1603
1604       BUG FIX
1605           dbmapreduce was failing on MacOS-10.6.3 for some tests with the
1606           error
1607
1608               dbmapreduce: cannot run external dbmapreduce reduce program (perl TEST/dbmapreduce_external_with_key.pl)
1609
1610           The problem seemed to be only in the error, not in operation.  On
1611           MacOS, the error is now suppressed.  Thanks to Alefiya Hussain for
1612           providing access to a Mac system that allowed debugging of this
1613           problem.
1614
1615       IMPROVEMENT
1616           The csv_to_db command requires an external Perl library
1617           (Text::CSV_XS).  On computers that lack this optional library,
1618           previously Fsdb would configure with a warning and then test cases
1619           would fail.  Now those test cases are skipped with an additional
1620           warning.
1621
1622       BUG FIX
1623           The test suite now supports alternative valid output, as a hack to
1624           account for last-digit floating point differences.  (Not very
1625           satisfying :-( )
1626
1627       BUG FIX
1628           dbcolstats output for confidence intervals on very large datasets
1629           has changed.  Previously it failed for more than 2^31-1 records,
1630           and handling of T-Distributions with thousands of rows was a bit
1631           dubious.  Now datasets with more than 10000 rows are considered
1632           infinitely large and hopefully correctly handled.
1633
1634   2.24, 2011-04-15 Improvements to fix an old bug in dbmapreduce with
1635       different field separators
1636       IMPROVEMENT
1637           The dbfilealter command had a "--correct" option to work around
1638           incompatible field separators, but it did nothing.  Now it
1639           does the correct but sad, data-losing thing.
1640
1641       IMPROVEMENT
1642           The dbmultistats command previously failed with an error message
1643           when invoked on input with a non-default field separator.  The root
1644           cause was the underlying dbmapreduce that did not handle the case
1645           of reducers that generated output with a different field separator
1646           than the input.  We now detect and repair incompatible field
1647           separators.  This change corrects a problem originally documented
1648           and detected in Fsdb-2.20.  Bug re-reported by Unkyu Park.
1649
1650   2.25, 2011-08-07 Two new tools, xml_to_db and dbfilepivot, and a bugfix for
1651       two people.
1652       IMPROVEMENT
1653           kitrace_to_db now supports a --utc option, which also fixes this
1654           test case for users outside of the Pacific time zone.  Bug reported
1655           by David Graff, and also by Peter Desnoyers (within a week of each
1656           other :-)
1657
1658       NEW xml_to_db can convert simple, very regular XML files into Fsdb.
1659
1660       NEW dbfilepivot "pivots" a file, converting multiple rows corresponding
1661           to the same entity into a single row with multiple columns.
1662
1663   2.26, 2011-12-12 Bug fixes, particularly for perl-5.14.2.
1664       BUG FIX
1665           Bugs fixed in Fsdb::IO::Reader(3) manual page.
1666
1667       BUG FIX
1668           Fixed problems where dbcolstats was truncating floating point
1669           numbers when sorting.  This strange behavior happens as of
1670           perl-5.14.2 and it seems like a Perl bug.  I've worked around it
1671           for the test suites, but I'm a bit nervous.
1672
1673   2.27, 2012-11-15 Accumulated bug fixes.
1674       IMPROVEMENT
1675           csv_to_db now reports errors in CSV input with real diagnostics.
1676
1677       IMPROVEMENT
1678           dbcolmovingstats can now compute median, when given the "-m"
1679           option.
1680
1681       BUG FIX
1682           dbcolmovingstats non-numeric handling (the "-a" option) now works
1683           properly.
1684
1685       DOCUMENTATION
1686           The internal t/test_command.t test framework is now documented.
1687
1688       BUG FIX
1689           dbrowuniq now correctly handles the case where there is no input
1690           (previously it output a blank line, which is a malformed fsdb
1691           file).  Thanks to Yuri Pradkin for reporting this bug.
1692
1693   2.28, 2012-11-15 A quick release to fix most rpmlint errors.
1694       BUG FIX
1695           Fixed a number of minor release problems (wrong permissions, old
1696           FSF address, etc.) found by rpmlint.
1697
1698   2.29, 2012-11-20 a quick release for CPAN testing
1699       IMPROVEMENT
1700           Tweaked the RPM spec.
1701
1702       IMPROVEMENT
1703           Modified Makefile.PL to fail gracefully on Perl installations that
1704           lack threads.  (Without this fix, I get massive failures in the
1705           non-ithreads test system.)
1706
1707   2.30, 2012-11-25 improvements to perl portability
1708       BUG FIX
1709           Removed unicode character in documentation of dbcolscorrelate so pod
1710           tests will pass.  (Sigh, that should work :-( )
1711
1712       BUG FIX
1713           Fixed test suite failures on 5 tests (dbcolcreate_double_creation
1714           was the first) due to Carp's addition of a period.  This problem
1715           was breaking Fsdb on perl-5.17.  Thanks to Michael McQuaid for
1716           helping diagnose this problem.
1717
1718       IMPROVEMENT
1719           The test suite now prints out the names of tests it tries.
1720
1721   2.31, 2012-11-28 A release with actual improvements to dbfilepivot and
1722       dbrowuniq.
1723       BUG FIX
1724           Documentation fixes: typos in dbcolscorrelate, bugs in
1725           dbfilepivot, clarification for comment handling in
1726           Fsdb::IO::Reader.
1727
1728       IMPROVEMENT
1729           Previously dbfilepivot assumed the input was grouped by keys and
1730           didn't verify that pre-condition.  Now there is no pre-condition (it
1731           will sort the input by default), and it checks if the invariant is
1732           violated.
1733
1734       BUG FIX
1735           Previously dbfilepivot failed if the input had comments (oops :-);
1736           no longer.
1737
1738       IMPROVEMENT
1739           Now dbrowuniq has the "-L" option to preserve the last unique row
1740           (instead of the first), a common idiom.
1741
1742   2.32, 2012-12-21 Test suites should now be more numerically robust.
1743       NEW New dbfilediff does fsdb-aware file differencing.  It does not do
1744           smart intuition of add/removes like Unix diff(1), but it does know
1745           about columns, and with "-E", it does numeric-aware differences.
1746
1747       IMPROVEMENT
1748           Test suites that are numeric now use dbfilediff to do numeric-aware
1749           comparisons, so the test suite should now be robust to slightly
1750           different computers and operating systems and compilers than
1751           exactly what I use.
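
           As an illustration of what a numeric-aware comparison means, the
           sketch below treats two values as equal when they differ by no
           more than a relative tolerance.  It is not dbfilediff's actual
           algorithm; the function name nearly_equal and its tolerance
           argument are hypothetical.

```shell
# Hypothetical helper: succeed (exit 0) when |a - b| <= tol * max(|a|, |b|).
# Usage: nearly_equal A B TOLERANCE
nearly_equal() {
    awk -v a="$1" -v b="$2" -v tol="$3" 'BEGIN {
        diff = a - b
        if (diff < 0) diff = -diff
        abs_a = (a < 0) ? -a : a
        abs_b = (b < 0) ? -b : b
        scale = (abs_a > abs_b) ? abs_a : abs_b
        # awk exit status: 0 means "equal enough", 1 means "different".
        exit !(diff <= tol * scale)
    }'
}

# Last-digit differences pass; real differences do not.
nearly_equal 37.2 37.2000001 0.000001 && echo "same within tolerance"
nearly_equal 37.2 37.3 0.000001 || echo "different"
```

           A string comparison would flag both pairs as different, which is
           why plain diff(1) makes numeric test suites fragile across
           platforms.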
1752
1753   2.33, 2012-12-23 Minor fixes to some test cases.
1754       IMPROVEMENT
1755           dbfilediff and dbrowuniq now support the "-N" option to give the
1756           new column a different name.  (And test cases where this
1757           duplication mattered have been fixed.)
1758
1759       IMPROVEMENT
1760           dbrvstatdiff now shows the t-test breakpoint with a reasonable
1761           number of floating point digits.
1762
1763       BUG FIX
1764           Fixed a numerical stability problem in the dbroweval_last test
1765           case.
1766

WHAT'S NEW

1768   2.34, 2013-02-10 Parallelism in dbmerge.
1769       IMPROVEMENT
1770           Documentation for dbjoin now includes resource requirements.
1771
1772       IMPROVEMENT
1773           Default memory usage for dbsort is now about 256MB.  (The world
1774           keeps moving forward.)
1775
1776       IMPROVEMENT
1777           dbmerge now does merging in parallel.  As a side-effect, dbsort
1778           should be faster when input overflows memory.  The level of
1779           parallelism can be limited with the "--parallelism" option.  (There
1780           is more work to do here, but we're off to a start.)
1781
1782   2.35, 2013-02-23 Improvements to dbmerge parallelism
1783       BUG FIX
1784           Fsdb temporary files are now created more securely (with
1785           File::Temp).
1786
1787       IMPROVEMENT
1788           Programs that sort or merge on fields (dbmerge2, dbmerge, dbsort,
1789           dbjoin) now report an error if no fields on which to join or merge
1790           are given.
1791
1792       IMPROVEMENT
1793           Parallelism in dbmerge should now be more consistent, with less
1794           starting and stopping.
1795
1796       IMPROVEMENT In dbmerge, the "--xargs" option lets one give input
1797       filenames on standard input, rather than the command line. This feature
1798       paves the way for faster dbsort for large inputs (by pipelining sorting
1799       and merging), expected in the next release.
1800
1801   2.36, 2013-02-25 dbsort pipelines with dbmerge
1802       IMPROVEMENT For large inputs, dbsort now pipelines sorting and merging,
1803       allowing earlier processing.
1804       BUG FIX Since 2.35, dbmerge delayed cleanup of intermediate files,
1805       thereby requiring extra disk space.  This problem is now fixed.
1806
1807   2.37, 2013-02-26 quick bugfix to support parallel sort and merge from
1808       recent releases
1809       BUG FIX Since 2.35, dbmerge delayed removal of input files given by
1810       "--xargs".  This problem is now fixed.
1811
1812   2.38, 2013-04-29 minor bug fixes
1813       CLARIFICATION
1814           Configure now rejects Windows since tests seem to hang on some
1815           versions of Windows.  (I would love help from a Windows developer
1816           to get this problem fixed, but I cannot do it.)  See
1817           https://rt.cpan.org/Ticket/Display.html?id=84201.
1818
1819       IMPROVEMENT
1820           All programs that use temporary files (dbcolpercentile,
1821           dbcolscorrelate, dbcolstats, dbcolstatscores) now take the "-T"
1822           option and set the temporary directory consistently.
1823
1824           In addition, error messages are better when the temporary directory
1825           has problems.  Problem reported by Liang Zhu.
1826
1827       BUG FIX
1828           dbmapreduce was failing with external, map-reduce aware reducers
1829           (when invoked with -M and an external program).  (Sigh, did this
1830           case ever work?)  This case should now work.  Thanks to Yuri
1831           Pradkin for reporting this bug (in 2011).
1832
1833       BUG FIX
1834           Fixed perl-5.10 problem with dbmerge.  Thanks to Yuri Pradkin for
1835           reporting this bug (in 2013).
1836
1837   2.39, 2013-05-31 quick release for the dbrowuniq extension
1838       BUG FIX
1839           Actually in 2.38, the Fedora .spec got cleaner dependencies.
1840           Suggestion from Christopher Meng via
1841           <https://bugzilla.redhat.com/show_bug.cgi?id=877096>.
1842
1843       ENHANCEMENT
1844           Fsdb files are now explicitly set into UTF-8 encoding, unless one
1845           specifies "-encoding" to "Fsdb::IO".
1846
1847       ENHANCEMENT
1848           dbrowuniq now supports "-I" for incremental counting.
1849
1850   2.40, 2013-07-13 small bug fixes
1851       BUG FIX
1852           dbsort now has more respect for a user-given temporary directory;
1853           it no longer is ignored for merging.
1854
1855       IMPROVEMENT
1856           dbrowuniq now has options to output the first, last, and both first
1857           and last rows of a run ("-F", "-L", and "-B").
1858
1859       BUG FIX
1860           dbrowuniq now correctly handles "-N".  Sigh, it didn't work before.
1861
1862   2.41, 2013-07-29 small bug and packaging fixes
1863       ENHANCEMENT
1864           Documentation to dbrvstatdiff improved (inspired by questions from
1865           Qian Kun).
1866
1867       BUG FIX
1868           dbrowuniq no longer duplicates singleton unique lines when
1869           outputting both (with "-B").
1870
1871       BUG FIX
1872           Add missing "XML::Simple" dependency to Makefile.PL.
1873
1874       ENHANCEMENT
1875           Tests now show the diff of the failing output if run with "make
1876           test TEST_VERBOSE=1".
1877
1878       ENHANCEMENT
1879           dbroweval now includes documentation for how to output extra rows.
1880           Suggestion from Yuri Pradkin.
1881
1882       BUG FIX
1883           Several improvements to the Fedora package from Michael Schwendt
1884           via <https://bugzilla.redhat.com/show_bug.cgi?id=877096>, and from
1885           the harsh master that is rpmlint.  (I am stymied at teaching it
1886           that "outliers" is spelled correctly.  Maybe I should send it
1887           Schneier's book.  And an unresolvable invalid-spec-name lurks in
1888           the SRPM.)
1889
1890   2.42, 2013-07-31 A bug fix and packaging release.
1891       ENHANCEMENT
1892           Documentation to dbjoin improved to better describe memory usage.
1893           (Based on problem report by Lin Quan.)
1894
1895       BUG FIX
1896           The .spec is now perl-Fsdb.spec to satisfy rpmlint.  Thanks to
1897           Christopher Meng for a specific bug report.
1898
1899       BUG FIX
1900           Test dbroweval_last.cmd no longer has a column that caused failures
1901           because of numerical instability.
1902
1903       BUG FIX
1904           Some tests now better handle bugs in old versions of perl (5.10,
1905           5.12).  Thanks to Calvin Ardi for help debugging this on a Mac with
1906           perl-5.12, but the fix should affect other platforms.
1907
1908   2.43, 2013-08-27 Adds in-file compression.
1909       BUG FIX
1910           Changed the sort on TEST/dbsort_merge.cmd to strings (from
1911           numerics) so we're less susceptible to false test-failures due to
1912           floating point IO differences.
1913
1914       EXPERIMENTAL ENHANCEMENT
1915           Yet more parallelism in dbmerge: new "endgame-mode" builds a merge
1916           tree of processes at the end of large merge tasks to get maximal
1917           parallelism.  Currently this feature is off by default because it
1918           can hang for some inputs.  Enable this experimental feature with
1919           "--endgame".
1920
1921       ENHANCEMENT
1922           "Fsdb::IO" now handles being given "IO::Pipe" objects (as exercised
1923           by dbmerge).
1924
1925       BUG FIX
1926           Handling of NamedTmpfiles now supports concurrency.  This fix will
1927           hopefully fix occasional "Use of uninitialized value $_ in string
1928           ne at ...NamedTmpfile.pm line 93."  errors.
1929
1930       BUG FIX
1931           Fsdb now requires perl 5.10.  This is a bug fix because some test
1932           cases used to require it, but this fact was not properly
1933           documented.  (Back-porting to 5.008 would require removing all "//"
1934           operators.)
1935
1936       ENHANCEMENT
1937           Fsdb now handles automatic compression of file contents.  Enable
1938           compression with "dbfilealter -Z xz" (or "gz" or "bz2").  All
1939           programs should operate on compressed files and leave the output
1940           with the same level of compression.  "xz" is recommended as fastest
1941           and most efficient.  "gz" produces unrepeatable output (and so has
1942           no output test); it seems to insist on adding a timestamp.
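
           The timestamp issue can be seen outside Fsdb with the gzip
           command-line tool, whose "-n" option omits the stored name and
           timestamp so that identical input gives byte-identical output.
           This sketch only illustrates the effect; it does not show how
           Fsdb compresses data internally, and the sample row is just
           example input.

```shell
# Compress the same bytes twice with -n (no name, no timestamp in the header)
# and compare checksums of the compressed streams.
first=$(printf 'ufs_mab_sys 37.2\n' | gzip -n | cksum)
second=$(printf 'ufs_mab_sys 37.2\n' | gzip -n | cksum)

if [ "$first" = "$second" ]; then
    echo "repeatable"
fi
```

           Without a timestamp-free mode, two runs at different times can
           differ in the header alone, which defeats byte-for-byte output
           tests.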
1943
1944   2.44, 2013-10-02 A major change--all threads are gone.
1945       ENHANCEMENT
1946           Fsdb is now thread free and only uses processes for parallelism.
1947           This change is a big change--the entire motivation for Fsdb-2 was
1948           to exploit parallelism via threading.  Parallelism--good, but perl
1949           threading--bad for performance.  Horribly bad for performance.
1950           About 20x worse than pipes on my box.  (See perl bug #119445 for
1951           the discussion.)
1952
1953       NEW "Fsdb::Support::Freds" provides a thread-like abstraction over
1954           forking, with some nice support for callbacks in the parent upon
1955           child termination.
1956
1957       ENHANCEMENT
1958           Details about removing threads: "dbpipeline" is thread free, and
1959           new tests verify each of its parts.  The easy cases are
1960           "dbcolpercentile", "dbcolstats", "dbfilepivot", "dbjoin", and
1961           "dbcolstatscores", each of which uses it in simple ways
1962           (2013-09-09).  "dbmerge" is now thread free (2013-09-13), but was a
1963           significant rewrite, which brought "dbsort" along.  "dbmapreduce"
1964           is partly thread free (2013-09-21), again as a rewrite, and it
1965           brings "dbmultistats" along.  Full "dbmapreduce" support took much
1966           longer (2013-10-02).
1967
1968       BUG FIX
1969           When running with user-only output ("-n"), dbroweval now resets the
1970           output vector $ofref after it has been output.
1971
1972       NEW dbcolcreate will create all columns at the head of each row with
1973           the "--first" option.
1974
1975       NEW dbfilecat will concatenate two files, verifying that they have the
1976           same schema.
1977
1978       ENHANCEMENT
1979           dbmapreduce now passes comments through, rather than eating them as
1980           before.
1981
1982           Also, dbmapreduce now supports a "--" option to prevent
1983           misinterpreting sub-program parameters as for dbmapreduce.
1984
1985       INCOMPATIBLE CHANGE
1986           dbmapreduce no longer figures out if it needs to add the key to the
1987           output.  For multi-key-aware reducers, it never does (and cannot).
1988           For non-multi-key-aware reducers, it defaults to add the key and
1989           will now fail if the reducer adds the key (with error "dbcolcreate:
1990           attempt to create pre-existing column...").  In such cases, one
1991           must disable adding the key with the new option "--no-prepend-key".
1992
1993       INCOMPATIBLE CHANGE
1994           dbmapreduce no longer copies the input field separator by default.
1995           For multi-key-aware reducers, it never does (and cannot).  For non-
1996           multi-key-aware reducers, it defaults to not copying the field
1997           separator, but it will copy it (the old default) with the
1998           "--copy-fs" option.
1999
2000   2.45, 2013-10-07 cleanup from de-thread-ification
2001       BUG FIX
2002           Corrected a fast busy-wait in dbmerge.
2003
2004       ENHANCEMENT
2005           Endgame mode enabled in dbmerge; it (and also large cases of
2006           dbsort) should now exploit greater parallelism.
2007
2008       BUG FIX
2009           Test case with "Fsdb::BoundedQueue" (gone since 2.44) now removed.
2010
2011   2.46, 2013-10-08 continuing cleanup of our no-threads version
2012       BUG FIX
2013           Fixed some packaging details.  (Really, threads are no longer
2014           required, missing tests in the MANIFEST.)
2015
2016       IMPROVEMENT
2017           dbsort now better communicates with the merge process to avoid
2018           bursty parallelism.
2019
2020           Fsdb::IO::Writer now can take "-autoflush => 1" for line-buffered
2021           IO.
2022
2023   2.47, 2013-10-12 test suite cleanup for non-threaded perls
2024       BUG FIX
2025           Removed some stray "use threads" in some test cases.  We didn't
2026           need them, and these were breaking non-threaded perls.
2027
2028       BUG FIX
2029           Better handling of Fred cleanup; should fix intermittent
2030           dbmapreduce failures on BSD.
2031
2032       ENHANCEMENT
2033           Improved test framework to show output when tests fail.  (This
2034           time, for real.)
2035
2036   2.48, 2014-01-03 small bugfixes and improved release engineering
2037       ENHANCEMENT
2038           Test suites now skip tests for libraries that are missing.  (Patch
2039           for missing "IO::Compress::Xz" contributed by Calvin Ardi.)
2040
2041       ENHANCEMENT
2042           Removed references to Jdb in the package specification.  Since the
2043           name was changed in 2008, there's no longer a huge need for
2044           backwards compatibility.  (Suggestion from Petr Šabata.)
2045
2046       ENHANCEMENT
2047           Test suites now invoke the perl using the path from
2048           $Config{perlpath}.  Hopefully this helps testing in environments
2049           where there are multiple installed perls and the default perl is
2050           not the same as the perl-under-test (as happens in
2051           cpantesters.org).
2052
2053       BUG FIX
2054           Added specific encoding to this manpage to account for Unicode.
2055           Required to build correctly against perl-5.18.
2056
2057   2.49, 2014-01-04 bugfix to unicode handling in Fsdb IO (plus minor
2058       packaging fixes)
2059       BUG FIX
2060           Restored a line in the .spec to chmod g-s.
2061
2062       BUG FIX
2063           Unicode decoding is now handled correctly for programs that read
2064           from standard input.  (Also: New test scripts cover unicode input
2065           and output.)
2066
2067       BUG FIX
2068           Fix to Fsdb documentation encoding line.  Addresses test failure in
2069           perl-5.16 and earlier.  (Who knew "encoding" had to be followed by
2070           a blank line.)
2071

WHAT'S NEW

2073   2.50, 2014-05-27 a quick release for spec tweaks
2074       ENHANCEMENT
2075           In dbroweval, the "-N" (no output, even comments) option now
2076           implies "-n", and it now suppresses the header and trailer.
2077
2078       BUG FIX
2079           A few more tweaks to the perl-Fsdb.spec from Petr Šabata.
2080
2081       BUG FIX
2082           Fixed 3 uses of "use v5.10" in test suites that were causing test
2083           failures (due to warnings, not real failures) on some platforms.
2084
2085   2.51, 2014-09-05 Feature enhancements to dbcolmovingstats, dbcolcreate,
2086       dbmapreduce, and new sqlselect_to_db
2087       ENHANCEMENT
2088           dbcolcreate now has a "--no-recreate-fatal" that causes it to
2089           ignore creation of existing columns (instead of failing).
2090
2091       ENHANCEMENT
2092           dbmapreduce once again is robust to reducers that output the key;
2093           "--no-prepend-key" is no longer mandatory.
2094
2095       ENHANCEMENT
2096           dbcolsplittorows can now enumerate the output rows with "-E".
2097
2098       BUG FIX
2099           dbcolmovingstats is more mathematically robust.  Previously for
2100           some inputs and some platforms, floating point rounding could
2101           sometimes cause square roots of negative numbers.
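
           The failure mode is easy to reproduce in general: a variance
           computed from a running sum and sum of squares can come out as a
           tiny negative number because of rounding, and taking its square
           root then fails.  The sketch below shows one common guard,
           clamping the variance at zero.  It is an illustration only, not
           the dbcolmovingstats implementation; it computes a whole-input
           (not moving) standard deviation, and the name safe_stddev is
           hypothetical.

```shell
# Hypothetical helper: sample standard deviation of one number per input line,
# computed from the sum and the sum of squares.  Needs at least two rows.
safe_stddev() {
    awk '{ n++; s += $1; ss += $1 * $1 }
         END {
             var = (ss - s * s / n) / (n - 1)
             # Rounding can leave a tiny negative residue; clamp it to zero
             # so sqrt() never sees a negative argument.
             if (var < 0) var = 0
             printf "%.6f\n", sqrt(var)
         }'
}

printf '0.1\n0.1\n0.1\n' | safe_stddev
printf '1\n2\n3\n' | safe_stddev
```

           Constant input is the classic trigger, since the two terms of
           the variance formula are nearly equal and their difference is
           dominated by rounding error.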
2102
2103       NEW sqlselect_to_db converts the output of the MySQL or MariaDB select
2104           command into fsdb format.
2105
2106       INCOMPATIBLE CHANGE
2107           dbfilediff now outputs the second row when doing sloppy numeric
2108           comparisons, to better support test suites.
2109
2110   2.52, 2014-11-03 Fixing the test suite for line number changes.
2111       ENHANCEMENT
2112           Test suites changed to be robust to exact line numbers of failures,
2113           since different Perl releases fail on different lines.
2114           <https://bugzilla.redhat.com/show_bug.cgi?id=1158380>
2115
2116   2.53, 2014-11-26 bug fixes and stability improvements to dbmapreduce
2117       ENHANCEMENT
2118           dbfilediff now supports a "--quiet" option.
2119
2120       ENHANCEMENT
2121           Better documentation of dbpipeline_filter.
2122
2123       BUGFIX
2124           Added groff-base and perl-podlators to the Fedora package spec.
2125           Fixes <https://bugzilla.redhat.com/show_bug.cgi?id=1163149>.  (Also
2126           in package 2.52-2.)
2127
2128       BUGFIX
2129           An important stability improvement to dbmapreduce.  It, plus
2130           dbmultistats, and dbcolstats now support controlled parallelism
2131           with the "--parallelism=N" option.  They default to run with the
2132           number of available CPUs.  dbmapreduce also moderates its level of
2133           parallelism.  Previously it would create reducers as needed,
2134           causing CPU thrashing if reducers ran much slower than data
2135           production.
2136
2137       BUGFIX
2138           The combination of dbmapreduce with dbrowenumerate now works as it
2139           should.  (The obscure bug was an interaction with dbcolcreate with
2140           non-multi-key reducers that output their own key.  dbmapreduce has
2141           too many useful corner cases.)
2142
2143   2.54, 2014-11-28 fix for the test suite to correct failing tests on not-my-
2144       platform
2145       BUGFIX
2146           Sigh, the test suite now has a test suite.  Because, yes, I broke
2147           it, causing many incorrect failures at cpantesters.  Now fixed.
2148
2149   2.55, 2015-01-05 many spelling fixes and dbcolmovingstats tests are more
2150       robust to different numeric precision
2151       ENHANCEMENT
2152           dbfilediff now can be extra quiet, as I continue to try to track
2153           down a numeric difference on FreeBSD AMD boxes.
2154
2155       ENHANCEMENT
2156           dbcolmovingstats gave different test output (just reflecting
2157           rounding error) when stddev approaches zero.  We now detect and
2158           handle this case.  See
2159           <https://rt.cpan.org/Public/Bug/Display.html?id=101220> and thanks
2160           to H. Merijn Brand for the bug report.
2161
2162       BUG FIX
2163           Many, many spelling bugs found by H. Merijn Brand; thanks for the
2164           bug report.
2165
2166       INCOMPATIBLE CHANGE
2167           A number of programs had misspelled "separator" in
2168           "--fieldseparator" and "--columnseparator" options as "seperator".
2169           These are now correctly spelled.
2170
2171   2.56, 2015-02-03 fix against Getopt::Long-2.43's stricter error checking
2172       BUG FIX
2173           Internal argument parsing uses Getopt::Long, but mixed pass-through
2174           and <>.  Bug reported by Petr Pisar at
2175           <https://bugzilla.redhat.com/show_bug.cgi?id=1188538>.
2176
2177       BUG FIX
2178           Added missing BuildRequires for "XML::Simple".
2179
2180   2.57, 2015-04-29 Minor changes, with better performance from dbmultistats.
2181       BUG FIX
2182           dbfilecat now honors "--remove-inputs" (previously it didn't).
2183           This omission meant that dbmapreduce (and dbmultistats) would
2184           accumulate files in /tmp when running.  Bad news for inputs with 4M
2185           keys.
2186
2187       ENHANCEMENT
2188           dbmultistats should be faster with lots of small keys.  dbcolstats
2189           now supports "-k" to get some of the functionality of dbmultistats
2190           (if data is pre-sorted and median/quartiles are not required).
2196
2197   2.58, 2015-04-30 Bugfix in dbmerge
2198       BUG FIX
2199           Fixed a case where dbmerge suffered mojibake in endgame mode.  This
2200           bug surfaced when dbsort was applied to large files (big enough to
2201           require merging) with Unicode in them; the symptom was something
2202           like:
2203             Wide character in print at /usr/lib64/perl5/IO/Handle.pm line
2204           420, <GEN12> line 111.
2205
2206   2.59, 2016-09-01 Collect a few small bug fixes and documentation
2207       improvements.
2208       BUG FIX
2209           More IO is explicitly marked UTF-8 to avoid Perl's tendency to
2210           mojibake on otherwise valid unicode input.  This change helps
2211           html_table_to_db.
2212
2213       ENHANCEMENT
2214           dbcolscorrelate now cross-references dbcolsregression.
2215
2216       ENHANCEMENT
2217           Documentation for dbrowdiff now clarifies that the default is
2218           baseline mode.
2219
2220       BUG FIX
2221           dbjoin now propagates "-T" into the sorting process (if it is
2222           required).  Thanks to Lan Wei for reporting this bug.
2223
2224   2.60, 2016-09-04 Adds support for hash joins.
2225       ENHANCEMENT
2226           dbjoin now supports hash joins with "-t lefthash" and "-t
2227           righthash".  Hash joins cache a table in memory, but do not require
2228           that the other table be sorted.  They are ideal when joining a
2229           large table against a small one.
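
The hash-join idea described above can be sketched in a few lines.  This is
an illustration in Python, not Fsdb's Perl implementation: the function name
hash_join, the dict-based row representation, and the "fs" column are all
assumptions made for the example.  The smaller table is cached in a hash
keyed on the join column, and the larger table is streamed past it in
whatever order it arrives.

```python
from collections import defaultdict


def hash_join(small_rows, large_rows, key):
    """Inner-join two iterables of dict rows on `key`.

    `small_rows` is cached in memory; `large_rows` is streamed once and
    does not need to be sorted.
    """
    index = defaultdict(list)
    for row in small_rows:
        index[row[key]].append(row)
    for row in large_rows:
        # .get() avoids creating empty entries for unmatched keys.
        for match in index.get(row[key], []):
            yield {**match, **row}


# Hypothetical data: a small lookup table and a larger, unsorted table.
experiments = [
    {"experiment": "ufs_mab_sys", "fs": "ufs"},
    {"experiment": "ufs_rcp_real", "fs": "ufs"},
]
durations = [
    {"experiment": "ufs_rcp_real", "duration": 264.5},
    {"experiment": "ufs_mab_sys", "duration": 37.2},
    {"experiment": "ufs_rcp_real", "duration": 277.9},
]
joined = list(hash_join(experiments, durations, "experiment"))
```

Memory use grows with the cached table only, which is why this style of
join suits a large table joined against a small one.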
2230
2231   2.61, 2016-09-05 Support left and right outer joins.
2232       ENHANCEMENT
2233           dbjoin now handles left and right outer joins with "-t left" and
2234           "-t right".
2235
2236       ENHANCEMENT
2237           dbjoin hash joins are now selected with "-m lefthash" and "-m
2238           righthash" (not the short-lived "-t righthash" option).
2239           (Technically this change is incompatible with Fsdb-2.60, but no one
2240           but me ever used that version.)
2241
2242   2.62, 2016-11-29 A new yaml_to_db and other minor improvements.
2243       ENHANCEMENT
2244           Documentation for xml_to_db now includes sample output.
2245
2246       NEW yaml_to_db converts a specific form of YAML to fsdb.
2247
2248       BUG FIX
2249           The test suite now uses "diff -c -b" rather than "diff -cb" to make
2250           OpenBSD-5.9 happier, I hope.
2251
2252       ENHANCEMENT
2253           Comments that log operations at the end of each file now do simple
2254           quoting of spaces.  (It is not guaranteed to be fully shell-
2255           compliant.)
2256
2257       ENHANCEMENT
2258           There is a new standard option, "--header", allowing one to specify
2259           an Fsdb header for inputs that lack it.  Currently it is supported
2260           by dbcoldefine, dbrowuniq, dbmapreduce, dbmultistats, dbsort,
2261           dbpipeline.
2262
2263       ENHANCEMENT
2264           dbfilepivot now allows the --possible-pivots option, and if it is
2265           provided processes the data in one pass.
2266
2267       ENHANCEMENT
2268           dbroweval logs are now quoted.
2269
2270   2.63, 2017-02-03 Re-add some features supposedly in 2.62 but not, and add
2271       more --header options.
2272       ENHANCEMENT
2273           The option -j is now a synonym for --parallelism.  (And several
2274           documentation bugs about this option are fixed.)
2275
2276       ENHANCEMENT
2277           Additional support for "--header" in dbcolmerge, dbcol, dbrow, and
2278           dbroweval.
2279
2280       BUG FIX
2281           Version 2.62 was supposed to have this improvement, but did not
2282           (and now does): dbfilepivot now allows the --possible-pivots
2283           option, and if it is provided processes the data in one pass.
2284
2285       BUG FIX
2286           Version 2.62 was supposed to have this improvement, but did not
2287           (and now does): dbroweval logs are now quoted.
2288
2289   2.64, 2017-11-20 several small bugfixes and enhancements
2290       BUG FIX
2291           In dbroweval, the "next row" option previously did not correctly
2292           set up "_last_fieldname".  It now does.
2293
2294       ENHANCEMENT
2295           The csv_to_db converter now has an optional "-F x" option to set
2296           the field separator.
2297
2298       ENHANCEMENT
2299           Finally dbcolsplittocols has a "--header" option, and a new "-N"
2300           option to give the list of resulting output columns.
2301
2302       INCOMPATIBLE CHANGE
2303           Now dbcolstats and dbmultistats produce no output (but a schema)
2304           when given no input but a schema.  Previously they gave a null row
2305           of output.  The "--output-on-no-input" and
2306           "--no-output-on-no-input" options can control this behavior.
2307
2308   2.65, 2018-02-16 Minor release, bug fix and -F option.
2309       ENHANCEMENT
2310           dbmultistats and dbmapreduce now both take a "-F x" option to set
2311           the field separator.
2312
2313       BUG FIX
2314           Fixed missing "use Carp" in dbcolstats.  Also went back and cleaned
2315           up all uses of "croak()".  Thanks to Zefram for the bug report.
2316
2317   2.66, 2018-12-20 Critical bug fix in dbjoin.
2318       BUG FIX
2319           Removed old tests from MANIFEST.  (Thanks to Hang Guo for reporting
2320           this bug.)
2321
2322       IMPROVEMENT
2323           Errors for non-existing input files now include the bad filename
2324           (before: "cannot setup filehandle", now: "cannot open input: cannot
2325           open TEST/bad_filename").
2326
2327       BUG FIX
2328           Hash joins with three identical rows were failing with the
2329           assertion failure "internal error: confused about overflow" due to
2330           a now-fixed bug.
2331
2332   2.67, 2019-07-10 add support for reading and writing hdfs
2333       IMPROVEMENT
2334           dbformmail now has an "mh" mechanism that writes messages to
2335           individual files (an mh-style mailbox).
2336
2337       BUG FIX
2338           dbrow failed to include the Carp library, leading to failures on
2339           croak.
2340
2341       BUG FIX
2342           The dbjoin error message for an unsorted right stream was
2343           incorrect (it said left); this is now fixed.
2344
2345       IMPROVEMENT
2346           All Fsdb programs can now read from and write to HDFS, when files
2347           that start with "hdfs:" are given to -i and -o options.
2348
2349   2.68, 2019-09-19 All programs now support automatic decompression based on
2350       file extension.
2351       IMPROVEMENT
2352           The omitted-possible-error test case for dbfilepivot now has an
2353           alternative output that I saw on some BSD-running systems (thanks
2354           to CPAN).
2355
2356       IMPROVEMENT
2357           dbmerge and dbmerge2 now support "--header".  dbmerge2 now gives
2358           better error messages when presented the wrong number of inputs.
2359
2360       BUG FIX
2361           dbsort now works with "--header" even when the file is big (due to
2362           fixes to dbmerge).
2363
2364       IMPROVEMENT
2365           csv_to_db now processes data with the "binary" option, allowing it
2366           to handle newlines embedded in quoted fields.
2367
2368       IMPROVEMENT
2369           All programs now transparently decompress input files when the
2370           filename given as an input argument ends with a standard
2371           extension (.gz, .bz2, or .xz).
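
Extension-based decompression of this kind can be sketched as follows.
This is a Python illustration of the general technique, not the Fsdb code;
the helper name open_maybe_compressed and the OPENERS table are assumptions
made for the example.  The file's extension selects a decompressing opener,
and anything else falls back to a plain open.

```python
import bz2
import gzip
import lzma
import os
import tempfile

# Map each standard extension to a module-level opener that decompresses.
OPENERS = {".gz": gzip.open, ".bz2": bz2.open, ".xz": lzma.open}


def open_maybe_compressed(path):
    """Open `path` for text reading, decompressing based on its extension."""
    _, ext = os.path.splitext(path)
    opener = OPENERS.get(ext, open)
    return opener(path, "rt", encoding="utf-8")


# Round-trip demonstration on a temporary gzip-compressed file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.fsdb.gz")
    with gzip.open(path, "wt", encoding="utf-8") as out:
        out.write("#fsdb experiment duration\nufs_mab_sys 37.2\n")
    with open_maybe_compressed(path) as f:
        lines = f.read().splitlines()
```

Dispatching on the extension alone keeps the check cheap, at the cost of
trusting that the filename reflects the actual file contents.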
2372
2373   2.69, 2019-11-22 a small bugfix in dbcolstats
2374       BUG FIX
2375           Filled in the test case for autodecompress, which was missing
2376           for the 2.68 release.
2377
2378       ENHANCEMENT
2379           The groff program is required for build, and the "Makefile.PL"
2380           fails if groff is missing at build time.  Thanks to Chris Williams
2381           for suggesting this check, and the CPAN auto-building system for
2382           trying many platforms.
2383
2384       BUG FIX
2385           The dbcolstats program had a numerical instability that sometimes
2386           resulted in failure with a square root of a negative number when
2387           many values varied right at the edge of floating-point precision.
2388           We now detect and report that case as 0 stddev.  Thanks to Hang Guo
2389           for providing a test case.
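
The kind of guard described above can be sketched as follows.  This is a
Python illustration under stated assumptions, not the dbcolstats code: it
assumes a one-pass sum and sum-of-squares formula, a common approach in
which rounding error can push the computed variance slightly below zero,
and the function name sample_stddev is hypothetical.

```python
import math


def sample_stddev(values):
    """Sample standard deviation from running sums, clamped at zero."""
    n = len(values)
    if n < 2:
        return 0.0
    total = sum(values)
    total_sq = sum(v * v for v in values)
    # With nearly identical values, the subtraction below can cancel to a
    # tiny negative number because of floating-point rounding.
    variance = (total_sq - total * total / n) / (n - 1)
    # Report 0 rather than failing on the square root of a negative number.
    return math.sqrt(max(variance, 0.0))


spread = sample_stddev([1.0, 2.0, 3.0, 4.0])
flat = sample_stddev([0.1] * 10)
```

Clamping treats a slightly negative variance as rounding noise and reports
it as 0 stddev.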
2390
2391   2.70, 2020-11-12 Some small quality-of-life enhancements and corner-case
2392       bugfixes.
2393       ENHANCEMENT
2394           dbcol can now take an option "-a" to include all columns, allowing
2395           reordering of certain columns while passing the rest through.
2396
2397       ENHANCEMENT
2398           dbrowuniq and dbmerge now buffer comments in a way that the last
2399           row of data output is no longer in the last block of comments.
2400           (The data is identical, but for humans looking at output, this
2401           change makes it less likely to lose the last row.)
2402
2403       BUG FIX
2404           dbmultistats and dbpipeline documentation now indicates that they
2405           support "--header" (something they have done since version 2.62
2406           of 2016-11-29, but which is only now documented).
2407
2408       ENHANCEMENT
2409           dbcolcreate now supports "--header".
2410
2411       BUG FIX
2412           Fixed several spelling errors in deprecated programs and removed
2413           information about the no-longer existing FreeBSD and MacOS ports.
2414           Thanks to Calvin Ardi for the patch.
2415
2416       BUG FIX
2417           dbmerge now handles --xargs when only one file is provided (and
2418           passes the file through unchanged).  It also throws a clean error
2419           with --xargs if zero files are provided.  (To support dbmerge,
2420           dbcol now has an internal "--saveoutput" option.)  Thanks to Yuri
2421           Pradkin for reporting the unhandled corner-case.
2422
2423   2.71, 2020-11-16 Fix a race condition breaking test suites.
2424       BUG FIX
2425           Suppress a race condition in dbcolmerge that was sometimes throwing
2426           the error "Fsdb::Support::Freds: ending, but running process:
2427           dbmerge:xargs" in the dbmerge_0_xargs test case, on exit.
2428
2429   2.72, 2020-12-01 A small bug and a packaging improvement.
2430       BUG FIX
2431           dbcolhisto now handles the degenerate case where everything has the
2432           same value (previously it would throw "illegal division by zero").
2433
2434       ENHANCEMENT
2435           The spec for Fedora now includes "make" as BuildRequires, something
2436           required for Fedora 34.
2437
2438   2.73, 2021-05-18 Updates dbcolpercentile with "--weighted", and with more
2439       ipv6.
2440       ENHANCEMENT
2441           dbcolpercentile now has a "--weighted" option.
2442
2443       ENHANCEMENT
2444           The new Fsdb::Support::IPv6 package includes ipv6_normalize,
2445           ipv6_zeroize to rewrite ipv6 print addresses in IPv6 normal form,
2446           with a 0 in each 4-nybble field.
2447

AUTHOR

2449       John Heidemann, "johnh@isi.edu"
2450
2451       See "Contributors" for the many people who have contributed bug reports
2452       and fixes.
2453

COPYRIGHT

2455       Fsdb is Copyright (C) 1991-2020 by John Heidemann <johnh@isi.edu>.
2456
2457       This program is free software; you can redistribute it and/or modify it
2458       under the terms of version 2 of the GNU General Public License as
2459       published by the Free Software Foundation.
2460
2461       This program is distributed in the hope that it will be useful, but
2462       WITHOUT ANY WARRANTY; without even the implied warranty of
2463       MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
2464       General Public License for more details.
2465
2466       You should have received a copy of the GNU General Public License along
2467       with this program; if not, write to the Free Software Foundation, Inc.,
2468       675 Mass Ave, Cambridge, MA 02139, USA.
2469
2470       A copy of the GNU General Public License can be found in the file
2471       ``COPYING''.
2472

COMMENTS and BUG REPORTS

2474       Any comments about these programs should be sent to John Heidemann
2475       "johnh@isi.edu".
2476
2477
2478
2479perl v5.34.0                      2021-07-22                           Fsdb(3)