Fsdb::Filter::dbsort(3)    User Contributed Perl Documentation    Fsdb::Filter::dbsort(3)

dbsort - sort rows based on the specified columns

dbsort [-M MemLimit] [-T TemporaryDirectory] [-nNrR] column [column...]

Sort all input rows as specified by the numeric or lexical columns.

Dbsort consumes a bounded amount of memory regardless of input size. (It
falls back to temporary files on disk if necessary, based on the -M and -T
options.)
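The bounded-memory behavior described above can be illustrated with a minimal external-sort sketch. It is written in Python for illustration only and is not dbsort's Perl implementation. The function name and the max_lines_in_memory parameter are hypothetical, and the limit is counted in lines rather than bytes to keep the example short.

```python
import heapq
import os
import tempfile


def external_sort(lines, max_lines_in_memory, tmp_dir=None):
    """Yield the input lines in sorted order, buffering at most
    max_lines_in_memory lines at a time (hypothetical stand-in for -M)."""
    run_paths = []
    buffer = []

    def flush():
        # Sort the buffered chunk and write it out as one sorted run.
        buffer.sort()
        fd, path = tempfile.mkstemp(dir=tmp_dir, suffix=".run")
        with os.fdopen(fd, "w") as run:
            run.writelines(line + "\n" for line in buffer)
        run_paths.append(path)
        buffer.clear()

    for line in lines:
        buffer.append(line)
        if len(buffer) >= max_lines_in_memory:
            flush()

    if not run_paths:
        # Everything fit in memory: no temporary files are needed.
        yield from sorted(buffer)
        return
    if buffer:
        flush()

    # Merge the sorted runs lazily; newlines are stripped before comparing.
    files = [open(path) for path in run_paths]
    try:
        streams = [(line.rstrip("\n") for line in f) for f in files]
        yield from heapq.merge(*streams)
    finally:
        for f in files:
            f.close()
        for path in run_paths:
            os.remove(path)
```

For example, list(external_sort(["pascal", "numanal", "os"], 2)) writes two temporary runs and merges them, while a limit of 100 sorts the same input entirely in memory. The sketch opens every run at once, which a real tool would have to limit for very large inputs.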

The sort should be stable, but this has not yet been verified.

For large inputs (those that spill to disk), dbsort will do some of the
merging in parallel, if possible. The --parallelism option can control
the degree of parallelism, if desired.
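As a rough illustration of merging in parallel, the following Python sketch merges neighbouring sorted runs pairwise, pass by pass, with a cap on concurrent merges. It is an assumption about the general technique, not a description of how dbsort schedules its merges. The names merge_pair and parallel_merge are hypothetical, and runs are held in lists rather than files.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def merge_pair(left, right):
    # Merge two already-sorted lists into one sorted list.
    return list(heapq.merge(left, right))


def parallel_merge(runs, parallelism):
    """Merge sorted runs pairwise, running up to `parallelism` merges at once."""
    runs = [list(run) for run in runs]
    if not runs:
        return []
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        while len(runs) > 1:
            # Pair up neighbours; an odd run out waits for the next pass.
            pairs = [(runs[i], runs[i + 1]) for i in range(0, len(runs) - 1, 2)]
            leftover = [runs[-1]] if len(runs) % 2 else []
            merged = list(pool.map(lambda pair: merge_pair(*pair), pairs))
            runs = merged + leftover
    return runs[0]
```

Python threads show the scheduling pattern only; they do not speed up CPU-bound merging the way separate processes would.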

General options:

-M MaxMemBytes
    Specify an approximate limit on memory usage (in bytes). Larger
    values allow faster sorting because more operations happen in
    memory, provided you have enough memory.

-T TmpDir
    Where to put temporary files. If -T is not specified, the TMPDIR
    environment variable is used; the default is /tmp.

--parallelism N or -j N
    Allow up to N merges to happen in parallel. Default is the number
    of CPUs in the machine.

Sort specification options (can be interspersed with column names):

-r or --descending
    Sort in reverse order (high to low).

-R or --ascending
    Sort in normal order (low to high).

-t or --type-inferred-sorting
    Sort fields by type (numeric or lexicographic), automatically.

-T or --no-type-inferred-sorting
    Sort fields only as specified based on "-n" or "-N".

-n or --numeric
    Sort numerically.

-N or --lexical
    Sort lexicographically.
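One way to read "interspersed with column names" is that each flag sets the mode for the columns that follow it until another flag overrides it. The Python sketch below works under that assumption, which the text above does not confirm. It handles only the short flags -n, -N, -r and -R, and the names parse_sort_spec and sort_rows are hypothetical.

```python
def parse_sort_spec(args):
    """Turn e.g. ['-r', 'cname', '-Rn', 'cid'] into a list of
    (column, numeric, descending) tuples."""
    numeric = False
    descending = False
    keys = []
    for arg in args:
        if arg.startswith("-") and len(arg) > 1:
            # Short flags may be bundled, as in "-nr".
            for flag in arg[1:]:
                if flag == "n":
                    numeric = True
                elif flag == "N":
                    numeric = False
                elif flag == "r":
                    descending = True
                elif flag == "R":
                    descending = False
                else:
                    raise ValueError(f"unknown flag: -{flag}")
        else:
            keys.append((arg, numeric, descending))
    return keys


def sort_rows(rows, keys):
    """Sort a list of dicts by the parsed keys, most significant key first."""
    result = list(rows)
    # Apply keys from least to most significant; each pass is stable.
    for column, numeric, descending in reversed(keys):
        convert = float if numeric else str
        result.sort(key=lambda row: convert(row[column]), reverse=descending)
    return result
```

With this reading, the specification "-r cname -Rn cid" sorts by cname from high to low and breaks ties by cid numerically from low to high.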

This module also supports the standard fsdb options:

-d  Enable debugging output.

-i or --input InputSource
    Read from InputSource, typically a file name, or "-" for standard
    input, or (if in Perl) an IO::Handle, Fsdb::IO or Fsdb::BoundedQueue
    object.

-o or --output OutputDestination
    Write to OutputDestination, typically a file name, or "-" for
    standard output, or (if in Perl) an IO::Handle, Fsdb::IO or
    Fsdb::BoundedQueue object.

--autorun or --noautorun
    By default, programs process automatically, but Fsdb::Filter
    objects in Perl do not run until you invoke the run() method. The
    "--(no)autorun" option controls that behavior within Perl.

--header H
    Use H as the full Fsdb header, rather than reading a header from
    the input.

--help
    Show help.

--man
    Show full manual.

Input:
    #fsdb cid cname
    10 pascal
    11 numanal
    12 os

Command:
    cat data.fsdb | dbsort cname

Output:
    #fsdb cid cname
    11 numanal
    12 os
    10 pascal
    # | dbsort cname
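The sample above can be reproduced with a small Python sketch that mimics the visible behavior: keep the header, sort the data rows lexically on one column, and append a trailing comment naming the command. This is not how dbsort is implemented. It assumes a header that holds only "#fsdb" followed by column names and whitespace-separated fields, and the function name sort_fsdb_text is hypothetical.

```python
def sort_fsdb_text(text, column):
    """Sort the data rows of a tiny fsdb-style table by one column (lexically)."""
    lines = text.strip().splitlines()
    header, body = lines[0], lines[1:]
    columns = header.split()[1:]  # drop the leading "#fsdb" token
    index = columns.index(column)
    # Comment lines are carried through after the data rows.
    comments = [line for line in body if line.startswith("#")]
    rows = [line.split() for line in body if not line.startswith("#")]
    rows.sort(key=lambda fields: fields[index])
    out = [header] + [" ".join(fields) for fields in rows] + comments
    out.append(f"# | dbsort {column}")
    return "\n".join(out)
```

Calling sort_fsdb_text on the sample input with column "cname" returns the sample output shown above, including the final "# | dbsort cname" line.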

dbmerge(1), dbmapreduce(1), Fsdb(3)

new
    $filter = new Fsdb::Filter::dbsort(@arguments);

Create a new object, taking command-line arguments.

set_defaults
    $filter->set_defaults();

Internal: set up defaults.

parse_options
    $filter->parse_options(@ARGV);

Internal: parse command-line arguments.

setup
    $filter->setup();

Internal: set up, parse headers.

segment_start
    $self->segment_start(\@rows);

Sorting happens internally, to handle large inputs in pieces if
necessary.

Call "$self->segment_start" to initialize things and to restart after an
overflow, "$self->segment_overflow" to close one segment and start the
next, and "$self->segment_merge_finish" to put them back together
again.

Note that we don't invoke the merge code unless the data exceeds some
threshold size, so small sorts happen completely in memory.

Once we give up on memory, all the merging happens by making passes
over the disk files.

segment_next_output
    $out = $self->segment_next_output($input_finished);

Internal: return a Fsdb::IO::Writer as $OUT that either points to our
output or a temporary file, depending on how things are going.

segment_overflow
    $self->segment_overflow(\@rows, $input_finished);

Called to sort @ROWS, writing them to the appropriate place.
$INPUT_FINISHED is set if all input has been read.

segment_merge_start
    $self->segment_merge_start($fn);

Start merging on file $FN. Fork off a merge thread, if necessary.

segment_merge_finish
    $self->segment_merge_finish();

Merge queued files, if any. Just call dbmerge(1) to do all the real
work.
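The control flow of the segment_* methods can be modeled with a toy Python class. It is only a sketch of the sequence described above, not the module's code. Segments are kept as in-memory lists where dbsort would use temporary files and dbmerge(1), and the class name SegmentedSorter, the max_rows threshold, and the add and finish helpers are hypothetical.

```python
import heapq


class SegmentedSorter:
    """Toy model of the segment_* control flow, with lists in place of files."""

    def __init__(self, max_rows):
        self.max_rows = max_rows  # hypothetical overflow threshold, in rows
        self.segments = []        # sorted segments waiting to be merged
        self.output = []          # final sorted rows
        self.segment_start()

    def segment_start(self):
        # Begin (or restart) collecting rows for a fresh segment.
        self.rows = []

    def add(self, row):
        self.rows.append(row)
        if len(self.rows) >= self.max_rows:
            self.segment_overflow(input_finished=False)
            self.segment_start()

    def segment_overflow(self, input_finished):
        # Sort the current segment and send it to the appropriate place.
        self.rows.sort()
        if input_finished and not self.segments:
            self.output = self.rows  # small input: the merge path is never used
        else:
            self.segments.append(self.rows)

    def segment_merge_finish(self):
        # Put queued segments back together, if there are any.
        if self.segments:
            self.output = list(heapq.merge(*self.segments))

    def finish(self):
        self.segment_overflow(input_finished=True)
        self.segment_merge_finish()
        return self.output
```

With max_rows set to 2, adding 3, 1, 2 closes one segment early and merges two segments at the end; with max_rows set to 10 the same input is sorted in a single in-memory segment and nothing is queued for merging.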

run
    $filter->run();

Internal: run over each row.

Copyright (C) 1991-2018 by John Heidemann <johnh@isi.edu>

This program is distributed under the terms of the GNU General Public
License, version 2. See the file COPYING with the distribution for
details.

perl v5.34.1                    2022-04-04                    Fsdb::Filter::dbsort(3)