Fsdb::Filter::dbsort(3)  User Contributed Perl Documentation  Fsdb::Filter::dbsort(3)

NAME
       dbsort - sort rows based on the specified columns

SYNOPSIS
       dbsort [-M MemLimit] [-T TemporaryDirectory] [-nNrR] column [column...]

DESCRIPTION
       Sort all input rows as specified by the numeric or lexical columns.

       Dbsort consumes a fixed amount of memory regardless of input size.  (It
       falls back to temporary files on disk if necessary, based on the -M and
       -T options.)

       The sort should be stable, but this has not yet been verified.
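
       To make the stability claim concrete, the following minimal Python
       sketch shows what a stable sort guarantees: rows with equal keys keep
       their input order.  It relies on Python's built-in sorted(), which is
       documented as stable.  It is an illustration only, not dbsort's code,
       and the row "13 os" is made up for the example.

```python
# Rows as (cid, cname); two rows share the key "os".
rows = [(12, "os"), (10, "pascal"), (13, "os"), (11, "numanal")]

# Sort on cname only. A stable sort keeps (12, "os") ahead of (13, "os")
# because that is the order in which they appeared in the input.
by_cname = sorted(rows, key=lambda row: row[1])

for cid, cname in by_cname:
    print(cid, cname)
```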

       For large inputs (those that spill to disk), dbsort will do some of the
       merging in parallel, if possible.  The --parallelism option can control
       the degree of parallelism, if desired.

OPTIONS
       General options:

       -M MaxMemBytes
           Specify an approximate limit on memory usage (in bytes).  Larger
           values allow faster sorting because more operations happen in
           memory, provided you have enough memory.

       -T TmpDir
           Where to put temporary files.  If -T is not specified, the
           environment variable TMPDIR is used, if set.  Default is /tmp.

       --parallelism N or -j N
           Allow up to N merges to happen in parallel.  Default is the number
           of CPUs in the machine.
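
       The effect of a parallelism limit can be sketched as follows.  This
       Python illustration merges pairs of already-sorted runs, with at most
       max_parallel merges in flight at once.  It is a sketch under stated
       assumptions, not dbsort's implementation: the names merge_pair and
       merge_runs are hypothetical, runs are in-memory lists rather than files
       on disk, and threads stand in for whatever mechanism dbsort uses.

```python
import heapq
import os
from concurrent.futures import ThreadPoolExecutor


def merge_pair(left, right):
    """Merge two sorted lists into one sorted list."""
    return list(heapq.merge(left, right))


def merge_runs(runs, max_parallel=None):
    """Merge sorted runs pairwise until one remains.

    max_parallel caps concurrent merges; falling back to the CPU count
    mirrors the documented -j default (an assumption of this sketch).
    """
    if max_parallel is None:
        max_parallel = os.cpu_count() or 1
    runs = [list(r) for r in runs]
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        while len(runs) > 1:
            pairs = [(runs[i], runs[i + 1]) for i in range(0, len(runs) - 1, 2)]
            leftover = [runs[-1]] if len(runs) % 2 else []
            # Each round merges every pair; the pool limits how many run at once.
            merged = list(pool.map(lambda p: merge_pair(*p), pairs))
            runs = merged + leftover
    return runs[0] if runs else []


print(merge_runs([[1, 5, 9], [2, 6], [3, 7, 8], [4]], max_parallel=2))
```

       Each round halves the number of runs, so more workers shorten a round
       but do not reduce the number of rounds.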

       Sort specification options (can be interspersed with column names):

       -r or --descending
           Sort in reverse order (high to low).

       -R or --ascending
           Sort in normal order (low to high).

       -n or --numeric
           Sort numerically.

       -N or --lexical
           Sort lexicographically.
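
       One way to read an interspersed sort specification is sketched below in
       Python.  The sketch assumes that a flag stays in effect for the columns
       that follow it until another flag overrides it, and that the default is
       lexical, ascending order; neither assumption is confirmed by this page.
       The names parse_sort_spec and make_comparator are hypothetical, and
       bundled flags such as "-nr" are not handled.

```python
from functools import cmp_to_key


def parse_sort_spec(args):
    """Turn e.g. ["-n", "cid", "-r", "cname"] into [(column, numeric, descending)].

    Assumption of this sketch: a flag stays in effect for later columns
    until another flag overrides it; the default is lexical, ascending.
    """
    numeric, descending = False, False
    keys = []
    for arg in args:
        if arg in ("-n", "--numeric"):
            numeric = True
        elif arg in ("-N", "--lexical"):
            numeric = False
        elif arg in ("-r", "--descending"):
            descending = True
        elif arg in ("-R", "--ascending"):
            descending = False
        else:
            keys.append((arg, numeric, descending))
    return keys


def make_comparator(keys):
    """Build a comparison function over dict rows from parsed keys."""
    def compare(a, b):
        for column, numeric, descending in keys:
            x, y = a[column], b[column]
            if numeric:
                x, y = float(x), float(y)
            result = (x > y) - (x < y)
            if result:
                return -result if descending else result
        return 0
    return compare


rows = [
    {"cid": "10", "cname": "pascal"},
    {"cid": "11", "cname": "numanal"},
    {"cid": "12", "cname": "os"},
]
keys = parse_sort_spec(["-n", "-r", "cid"])
ordered = sorted(rows, key=cmp_to_key(make_comparator(keys)))
for row in ordered:
    print(row["cid"], row["cname"])
```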

       This module also supports the standard fsdb options:

       -d  Enable debugging output.

       -i or --input InputSource
           Read from InputSource, typically a file name, or "-" for standard
           input, or (if in Perl) an IO::Handle, Fsdb::IO, or
           Fsdb::BoundedQueue object.

       -o or --output OutputDestination
           Write to OutputDestination, typically a file name, or "-" for
           standard output, or (if in Perl) an IO::Handle, Fsdb::IO, or
           Fsdb::BoundedQueue object.

       --autorun or --noautorun
           By default, programs process automatically, but Fsdb::Filter
           objects in Perl do not run until you invoke the run() method.  The
           "--(no)autorun" option controls that behavior within Perl.

       --header H
           Use H as the full Fsdb header, rather than reading a header from
           the input.

       --help
           Show help.

       --man
           Show full manual.

SAMPLE USAGE
   Input:
           #fsdb cid cname
           10 pascal
           11 numanal
           12 os

   Command:
           cat data.fsdb | dbsort cname

   Output:
           #fsdb cid cname
           11 numanal
           12 os
           10 pascal
           # | dbsort cname

SEE ALSO
       dbmerge(1), dbmapreduce(1), Fsdb(3)

CLASS FUNCTIONS
   new
           $filter = new Fsdb::Filter::dbsort(@arguments);

       Create a new object, taking command-line arguments.

   set_defaults
           $filter->set_defaults();

       Internal: set up defaults.

   parse_options
           $filter->parse_options(@ARGV);

       Internal: parse command-line arguments.

   setup
           $filter->setup();

       Internal: set up, parse headers.

   segment_start
           $self->segment_start(\@rows);

       Sorting happens internally, to handle large inputs in pieces if
       necessary.

       Call "$self->segment_start" to initialize things and to restart after
       an overflow, "$self->segment_overflow" to close one segment and start
       the next, and "$self->segment_merge_finish" to put them back together
       again.

       Note that we don't invoke the merge code unless the data exceeds some
       threshold size, so small sorts happen completely in memory.

       Once we give up on memory, all the merging happens by making passes
       over the disk files.
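
       The segment life cycle described above can be sketched in Python.  This
       is an illustrative external sort under stated assumptions, not dbsort's
       code: the name external_sort and the max_rows parameter are
       hypothetical (dbsort limits memory in bytes via -M, not in rows), rows
       are plain text lines, and the final merge is a single heapq.merge pass
       rather than a call to dbmerge.

```python
import heapq
import tempfile


def external_sort(lines, max_rows=1000, tmpdir=None):
    """Yield lines in sorted order while buffering at most max_rows of them."""
    segments = []   # temporary files, each holding one sorted segment
    buffer = []     # the current in-memory segment

    def overflow():
        # Close the current segment: sort it, spill it to disk, start afresh.
        buffer.sort()
        seg = tempfile.TemporaryFile(mode="w+", dir=tmpdir)
        seg.writelines(line + "\n" for line in buffer)
        seg.seek(0)
        segments.append(seg)
        buffer.clear()

    for line in lines:
        buffer.append(line.rstrip("\n"))
        if len(buffer) >= max_rows:
            overflow()

    if not segments:
        # Small input: never touched the disk, sort completely in memory.
        yield from sorted(buffer)
        return

    if buffer:
        overflow()
    # Merge pass: stream the sorted segments back together.
    streams = [(row.rstrip("\n") for row in seg) for seg in segments]
    yield from heapq.merge(*streams)
    for seg in segments:
        seg.close()


print(list(external_sort(["pascal", "numanal", "os", "ada"], max_rows=2)))
```

       With max_rows=2 the four input lines fill two segments on disk; with a
       larger limit the same call never creates a temporary file.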

   segment_next_output
           $out = $self->segment_next_output($input_finished);

       Internal: return a Fsdb::IO::Writer as $OUT that points either to our
       output or to a temporary file, depending on how things are going.

   segment_overflow
           $self->segment_overflow(\@rows, $input_finished);

       Called to sort @ROWS, writing them to the appropriate place.
       $INPUT_FINISHED is set if all input has been read.

   segment_merge_start
           $self->segment_merge_start($fn);

       Start merging on file $FN.  Fork off a merge thread, if necessary.

   segment_merge_finish
           $self->segment_merge_finish();

       Merge queued files, if any.  Just call dbmerge(1) to do all the real
       work.

   run
           $filter->run();

       Internal: run over each row.

AUTHOR and COPYRIGHT
       Copyright (C) 1991-2018 by John Heidemann <johnh@isi.edu>

       This program is distributed under terms of the GNU General Public
       License, version 2.  See the file COPYING with the distribution for
       details.

perl v5.28.1                      2018-10-08           Fsdb::Filter::dbsort(3)