CUT(P)                    POSIX Programmer's Manual                   CUT(P)

NAME
cut - cut out selected fields of each line of a file

SYNOPSIS
cut -b list [-n] [file ...]

cut -c list [file ...]

cut -f list [-d delim][-s][file ...]

DESCRIPTION
The cut utility shall cut out bytes (-b option), characters (-c
option), or character-delimited fields (-f option) from each line in
one or more files, concatenate them, and write them to standard output.

OPTIONS
The cut utility shall conform to the Base Definitions volume of
IEEE Std 1003.1-2001, Section 12.2, Utility Syntax Guidelines.

The application shall ensure that the option-argument list (see options
-b, -c, and -f below) is a comma-separated list or <blank>-separated
list of positive numbers and ranges. Ranges can be in three forms. The
first is two positive numbers separated by a hyphen (low-high), which
represents all fields from the first number to the second number. The
second is a positive number preceded by a hyphen (-high), which
represents all fields from field number 1 to that number. The third is a
positive number followed by a hyphen (low-), which represents that
number to the last field, inclusive. The elements in list can be
repeated, can overlap, and can be specified in any order, but the
bytes, characters, or fields selected shall be written in the order of
the input data. If an element appears in the selection list more than
once, it shall be written exactly once.
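
The three range forms and the ordering rule can be illustrated with a
minimal sketch. The sample line and the variable names are illustrative
assumptions, and the input is restricted to single-byte characters so
that the result does not depend on the locale.

```shell
# Sample input: one line of nine single-byte characters.
line='abcdefghi'

# Range forms: -high, low-high, and low-.
first_three=$(printf '%s\n' "$line" | cut -c -3)   # characters 1 to 3
middle=$(printf '%s\n' "$line" | cut -c 4-6)       # characters 4 to 6
tail_part=$(printf '%s\n' "$line" | cut -c 7-)     # character 7 to the end

# Overlapping, repeated, out-of-order elements: each selected character
# is written once, in the order of the input data.
mixed=$(printf '%s\n' "$line" | cut -c 5,1-2,2,5)

printf '%s %s %s %s\n' "$first_three" "$middle" "$tail_part" "$mixed"
# prints: abc def ghi abe
```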

The following options shall be supported:

-b list
    Cut based on a list of bytes. Each selected byte shall be output
    unless the -n option is also specified. It shall not be an error
    to select bytes not present in the input line.

-c list
    Cut based on a list of characters. Each selected character shall
    be output. It shall not be an error to select characters not
    present in the input line.

-d delim
    Set the field delimiter to the character delim. The default is
    the <tab>.

-f list
    Cut based on a list of fields, assumed to be separated in the
    file by a delimiter character (see -d). Each selected field
    shall be output. Output fields shall be separated by a single
    occurrence of the field delimiter character. Lines with no field
    delimiters shall be passed through intact, unless -s is
    specified. It shall not be an error to select fields not present
    in the input line.

-n  Do not split characters. When specified with the -b option, each
    element in list of the form low-high (hyphen-separated numbers)
    shall be modified as follows:

    * If the byte selected by low is not the first byte of a
      character, low shall be decremented to select the first byte of
      the character originally selected by low. If the byte
      selected by high is not the last byte of a character, high
      shall be decremented to select the last byte of the character
      prior to the character originally selected by high, or zero
      if there is no prior character. If the resulting range
      element has high equal to zero or low greater than high, the
      list element shall be dropped from list for that input line
      without causing an error.

    Each element in list of the form low- shall be treated as above
    with high set to the number of bytes in the current line, not
    including the terminating <newline>. Each element in list of the
    form -high shall be treated as above with low set to 1. Each
    element in list of the form num (a single number) shall be
    treated as above with low set to num and high set to num.

-s  Suppress lines with no delimiter characters, when used with the
    -f option. Unless specified, lines with no delimiters shall be
    passed through untouched.
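
The interaction of -d, -f, and -s described above can be sketched as
follows. The records are made-up sample data and the variable names are
assumptions chosen for this illustration.

```shell
# Hypothetical sample data: two colon-delimited records and one line
# that contains no delimiter at all.
data='alice:1001:/home/alice
no delimiter here
bob:1002:/home/bob'

# Without -s, the line with no delimiter is passed through intact.
with_passthrough=$(printf '%s\n' "$data" | cut -d : -f 1,3)

# With -s, lines that lack the delimiter are suppressed.
suppressed=$(printf '%s\n' "$data" | cut -d : -f 1,3 -s)

# Selecting a field that is not present is not an error; the line
# simply contributes an empty selection.
missing=$(printf 'a:b\n' | cut -d : -f 5)

printf '%s\n---\n%s\n---\n[%s]\n' "$with_passthrough" "$suppressed" "$missing"
```

Note that the selected fields are rejoined with the same delimiter that
was given to -d, so the first record comes out as alice:/home/alice.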

OPERANDS
The following operand shall be supported:

file
    A pathname of an input file. If no file operands are specified,
    or if a file operand is '-', the standard input shall be used.

STDIN
The standard input shall be used only if no file operands are
specified, or if a file operand is '-'. See the INPUT FILES section.
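
As an illustration of the '-' operand, the sketch below reads standard
input first and a regular file second. The file name input.tsv and the
use of mktemp (widely available, but not required by this volume) are
assumptions made for the example.

```shell
# Scratch directory for the example file (mktemp is an assumption).
tmpdir=$(mktemp -d)
printf 'from\tfile\n' > "$tmpdir/input.tsv"

# The '-' operand stands for standard input; operands are read in the
# order given, so the piped line is processed before the file.
combined=$(printf 'from\tstdin\n' | cut -f 2 - "$tmpdir/input.tsv")

printf '%s\n' "$combined"
rm -rf "$tmpdir"
```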

INPUT FILES
The input files shall be text files, except that line lengths shall be
unlimited.

ENVIRONMENT VARIABLES
The following environment variables shall affect the execution of cut:

LANG
    Provide a default value for the internationalization variables
    that are unset or null. (See the Base Definitions volume of
    IEEE Std 1003.1-2001, Section 8.2, Internationalization
    Variables for the precedence of internationalization variables
    used to determine the values of locale categories.)

LC_ALL
    If set to a non-empty string value, override the values of all
    the other internationalization variables.

LC_CTYPE
    Determine the locale for the interpretation of sequences of
    bytes of text data as characters (for example, single-byte as
    opposed to multi-byte characters in arguments and input files).

LC_MESSAGES
    Determine the locale that should be used to affect the format
    and contents of diagnostic messages written to standard error.

NLSPATH
    Determine the location of message catalogs for the processing of
    LC_MESSAGES.

ASYNCHRONOUS EVENTS
Default.

STDOUT
The cut utility output shall be a concatenation of the selected bytes,
characters, or fields (one of the following):

    "%s\n", <concatenation of bytes>

    "%s\n", <concatenation of characters>

    "%s\n", <concatenation of fields and field delimiters>

STDERR
The standard error shall be used only for diagnostic messages.

OUTPUT FILES
None.

EXTENDED DESCRIPTION
None.

EXIT STATUS
The following exit values shall be returned:

0   All input files were output successfully.

>0  An error occurred.

CONSEQUENCES OF ERRORS
Default.

The following sections are informative.

APPLICATION USAGE
Earlier versions of the cut utility worked in an environment where
bytes and characters were considered equivalent (modulo <backspace> and
<tab> processing in some implementations). In the extended world of
multi-byte characters, the new -b option has been added. The -n option
(used with -b) allows it to be used to act on bytes rounded to
character boundaries. The algorithm specified for -n guarantees that:

    cut -b 1-500 -n file > file1
    cut -b 501- -n file > file2

ends up with all the characters in file appearing exactly once in file1
or file2. (There is, however, a <newline> in both file1 and file2 for
each <newline> in file.)
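
To see why rounding to character boundaries matters, the following
sketch selects a single byte from a two-byte character without -n. It
assumes the input is the UTF-8 encoding of "é" (bytes 0xC3 0xA9) and
uses od only to make the raw output visible; the variable name is
illustrative.

```shell
# "é" in UTF-8 is the two bytes 0xC3 0xA9 (octal 303 251).
# Selecting byte 1 alone, without -n, splits the character.
first_byte_hex=$(printf '\303\251\n' | cut -b 1 | od -An -tx1 | tr -d ' \n')

printf '%s\n' "$first_byte_hex"
# prints: c30a  (the lone lead byte, then the <newline>)
```

Under the -n rules in the OPTIONS section, the element 1 would instead
be dropped for this line: byte 1 is not the last byte of a character
and there is no prior character, so high becomes zero. Some
implementations are documented as accepting -n without acting on it, so
it is worth checking the behavior of the cut in use.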

EXAMPLES
Examples of the option qualifier list:

1,4,7   Select the first, fourth, and seventh bytes, characters, or
        fields and field delimiters.

1-3,8   Equivalent to 1,2,3,8.

-5,10   Equivalent to 1,2,3,4,5,10.

3-      Equivalent to the third through the last, inclusive.

The low-high forms are not always equivalent when used with -b and -n
and multi-byte characters; see the description of -n.

The following command:

    cut -d : -f 1,6 /etc/passwd

reads the System V password file (user database) and produces lines of
the form:

    <user ID>:<home directory>
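
The same selection can be tried without touching the real user
database. The sketch below feeds two made-up records in the seven-field
password file layout; the account names and paths are hypothetical.

```shell
# Hypothetical records in the seven-field passwd layout; not real accounts.
sample='root:x:0:0:root:/root:/bin/sh
alice:x:1001:1001:Alice:/home/alice:/bin/sh'

# Fields 1 and 6 are the user name and the home directory.
user_and_home=$(printf '%s\n' "$sample" | cut -d : -f 1,6)

printf '%s\n' "$user_and_home"
# prints:
# root:/root
# alice:/home/alice
```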

Most utilities in this volume of IEEE Std 1003.1-2001 work on text
files. The cut utility can be used to turn files with arbitrary line
lengths into a set of text files containing the same data. The paste
utility can be used to create (or recreate) files with arbitrary line
lengths. For example, if file contains long lines:

    cut -b 1-500 -n file > file1
    cut -b 501- -n file > file2

creates file1 (a text file) with lines no longer than 500 bytes (plus
the <newline>) and file2 that contains the remainder of the data from
file. (Note that file2 is not a text file if there are lines in file
that are longer than 500 + {LINE_MAX} bytes.) The original file can be
recreated from file1 and file2 using the command:

    paste -d "\0" file1 file2 > file
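
The round trip can be checked on a small scale. This is a sketch under
stated assumptions: the split point is byte 5 rather than 500, the
sample lines are single-byte so -n is left out, the file names orig,
part1, part2, and rebuilt are invented for the example, and mktemp is
assumed to be available.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' 'abcdefghij' 'xyz' '' '0123456789ABCDEF' > "$tmpdir/orig"

# Split each line at byte 5: the head goes to part1, the rest to part2.
cut -b 1-5 "$tmpdir/orig" > "$tmpdir/part1"
cut -b 6-  "$tmpdir/orig" > "$tmpdir/part2"

# Rejoin corresponding lines; "\0" asks paste for an empty delimiter.
paste -d '\0' "$tmpdir/part1" "$tmpdir/part2" > "$tmpdir/rebuilt"

# Compare the rebuilt file with the original, byte for byte.
if cmp -s "$tmpdir/orig" "$tmpdir/rebuilt"; then
    roundtrip=ok
else
    roundtrip=differs
fi
printf '%s\n' "$roundtrip"
rm -rf "$tmpdir"
```

Writing to a separate file such as rebuilt, rather than back over the
original, keeps the source intact while the result is being verified.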

RATIONALE
Some historical implementations do not count <backspace>s in
determining character counts with the -c option. This may be useful for
using cut for processing nroff output. It was deliberately decided not
to have the -c option treat either <backspace>s or <tab>s in any special
fashion. The fold utility does treat these characters specially.

Unlike other utilities, some historical implementations of cut exit
after not finding an input file, rather than continuing to process the
remaining file operands. This behavior is prohibited by this volume of
IEEE Std 1003.1-2001, where only the exit status is affected by this
problem.
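
The required behavior can be observed with one missing and one existing
operand. In this sketch the file names missing.tsv and present.tsv are
hypothetical, mktemp is assumed to be available, and the exact non-zero
status is left unspecified because only ">0" is guaranteed.

```shell
tmpdir=$(mktemp -d)
printf 'kept\tdropped\n' > "$tmpdir/present.tsv"

# The first operand does not exist; the second does. The diagnostic on
# standard error is discarded so that only the selected data is captured.
if output=$(cut -f 1 "$tmpdir/missing.tsv" "$tmpdir/present.tsv" 2>/dev/null); then
    status=0
else
    status=$?
fi

printf 'output=%s status=%s\n' "$output" "$status"
rm -rf "$tmpdir"
```

On a conforming implementation the existing file is still processed, so
output holds "kept", while status is greater than zero.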

The behavior of cut when provided with either mutually-exclusive
options or options that do not work logically together has been
deliberately left unspecified in favor of global wording in Utility
Description Defaults.

The OPTIONS section was changed in response to IEEE PASC Interpretation
1003.2 #149. The change represents historical practice on all known
systems. The original standard was ambiguous on the nature of the
output.

The list option-arguments are historically used to select the portions
of the line to be written, but do not affect the order of the data. For
example:

    echo abcdefghi | cut -c6,2,4-7,1

yields "abdefg".

A proposal to enhance cut with the following option:

-o  Preserve the selected field order. When this option is
    specified, each byte, character, or field (or ranges of such)
    shall be written in the order specified by the list
    option-argument, even if this requires multiple outputs of the
    same bytes, characters, or fields.

was rejected because this type of enhancement is outside the scope of
the IEEE P1003.2b draft standard.
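
Because the list never reorders output, a script that needs fields in a
different order has to use another tool. The sketch below contrasts cut
with awk; awk is a stand-in chosen for this illustration, not something
the text above proposes, and the record is sample data.

```shell
record='one:two:three'

# cut writes fields in input order, whatever order the list uses.
cut_order=$(printf '%s\n' "$record" | cut -d : -f 3,1)

# awk prints fields in the order they are named in the print statement.
awk_order=$(printf '%s\n' "$record" | awk -F : -v OFS=: '{ print $3, $1 }')

printf '%s %s\n' "$cut_order" "$awk_order"
# prints: one:three three:one
```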

FUTURE DIRECTIONS
None.

SEE ALSO
grep, paste, Parameters and Variables

COPYRIGHT
Portions of this text are reprinted and reproduced in electronic form
from IEEE Std 1003.1, 2003 Edition, Standard for Information Technology
-- Portable Operating System Interface (POSIX), The Open Group Base
Specifications Issue 6, Copyright (C) 2001-2003 by the Institute of
Electrical and Electronics Engineers, Inc and The Open Group. In the
event of any discrepancy between this version and the original IEEE and
The Open Group Standard, the original IEEE and The Open Group Standard
is the referee document. The original Standard can be obtained online
at http://www.opengroup.org/unix/online.html.

IEEE/The Open Group                  2003                          CUT(P)