CUT(1P) POSIX Programmer's Manual CUT(1P)

PROLOG
This manual page is part of the POSIX Programmer's Manual. The Linux
implementation of this interface may differ (consult the corresponding
Linux manual page for details of Linux behavior), or the interface may
not be implemented on Linux.

NAME
cut - cut out selected fields of each line of a file

SYNOPSIS
cut -b list [-n] [file ...]

cut -c list [file ...]

cut -f list [-d delim] [-s] [file ...]

DESCRIPTION
The cut utility shall cut out bytes (-b option), characters (-c
option), or character-delimited fields (-f option) from each line in
one or more files, concatenate them, and write them to standard output.

OPTIONS
The cut utility shall conform to the Base Definitions volume of
IEEE Std 1003.1-2001, Section 12.2, Utility Syntax Guidelines.

The application shall ensure that the option-argument list (see options
-b, -c, and -f below) is a comma-separated list or <blank>-separated
list of positive numbers and ranges. Ranges can be in three forms. The
first is two positive numbers separated by a hyphen (low-high), which
represents all fields from the first number to the second number. The
second is a positive number preceded by a hyphen (-high), which
represents all fields from field number 1 to that number. The third is a
positive number followed by a hyphen (low-), which represents that
number to the last field, inclusive. The elements in list can be
repeated, can overlap, and can be specified in any order, but the
bytes, characters, or fields selected shall be written in the order of
the input data. If an element appears in the selection list more than
once, it shall be written exactly once.
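The range forms described above can be tried on a short line. The
following is a minimal sketch, not part of the standard text; the sample
string and the variable names r1 to r4 are chosen here purely for
illustration.

```shell
# Illustrative input: ten single-byte characters.
line='abcdefghij'

# low-high: characters 2 through 4.
r1=$(printf '%s\n' "$line" | cut -c 2-4)

# -high: characters 1 through 3.
r2=$(printf '%s\n' "$line" | cut -c -3)

# low-: character 8 through the end of the line.
r3=$(printf '%s\n' "$line" | cut -c 8-)

# Overlapping, out-of-order elements: output follows the input order
# and each selected character is written once.
r4=$(printf '%s\n' "$line" | cut -c 5,1-3,2)

printf '%s\n' "$r1" "$r2" "$r3" "$r4"
# prints: bcd, abc, hij, abce (one per line)
```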

The following options shall be supported:

-b list
    Cut based on a list of bytes. Each selected byte shall be output
    unless the -n option is also specified. It shall not be an error
    to select bytes not present in the input line.

-c list
    Cut based on a list of characters. Each selected character shall
    be output. It shall not be an error to select characters not
    present in the input line.

-d delim
    Set the field delimiter to the character delim. The default is
    the <tab>.

-f list
    Cut based on a list of fields, assumed to be separated in the
    file by a delimiter character (see -d). Each selected field
    shall be output. Output fields shall be separated by a single
    occurrence of the field delimiter character. Lines with no field
    delimiters shall be passed through intact, unless -s is
    specified. It shall not be an error to select fields not present
    in the input line.

-n  Do not split characters. When specified with the -b option, each
    element in list of the form low-high (hyphen-separated numbers)
    shall be modified as follows:

    *  If the byte selected by low is not the first byte of a
       character, low shall be decremented to select the first byte of
       the character originally selected by low. If the byte
       selected by high is not the last byte of a character, high
       shall be decremented to select the last byte of the character
       prior to the character originally selected by high, or zero
       if there is no prior character. If the resulting range
       element has high equal to zero or low greater than high, the
       list element shall be dropped from list for that input line
       without causing an error.

    Each element in list of the form low- shall be treated as above
    with high set to the number of bytes in the current line, not
    including the terminating <newline>. Each element in list of the
    form -high shall be treated as above with low set to 1. Each
    element in list of the form num (a single number) shall be
    treated as above with low set to num and high set to num.

-s  Suppress lines with no delimiter characters, when used with the
    -f option. Unless specified, lines with no delimiters shall be
    passed through untouched.
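The interaction of -f, -d, and -s can be seen on a few lines of input.
This is a small sketch under assumed sample data; the records and the
variable names are hypothetical and are not taken from the standard.

```shell
# Hypothetical sample: two colon-delimited lines and one line that
# contains no delimiter at all.
data=$(printf 'alice:x:1000\nno delimiter here\nbob:x:1001\n')

# Without -s, the delimiter-free line is passed through intact.
with_pass=$(printf '%s\n' "$data" | cut -d : -f 1,3)

# With -s, the delimiter-free line is suppressed.
suppressed=$(printf '%s\n' "$data" | cut -d : -f 1,3 -s)

# Selecting a field that is not present is not an error, so the exit
# status of cut is expected to be 0.
printf '%s\n' "$data" | cut -d : -f 9 -s > /dev/null
missing_status=$?

printf '%s\n' "$with_pass"
printf '%s\n' "$suppressed"
```

Note that the selected fields are rejoined with a single ':' in the
output, as described for the -f option.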

OPERANDS
The following operand shall be supported:

file    A pathname of an input file. If no file operands are specified,
        or if a file operand is '-', the standard input shall be used.

STDIN
The standard input shall be used only if no file operands are
specified, or if a file operand is '-'. See the INPUT FILES section.
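As an illustration of the '-' operand, the sketch below mixes standard
input with a named file. It assumes the mktemp utility is available
(it is common but is not specified by this volume of the standard); the
file contents are made up for the example.

```shell
# Assumption: mktemp exists on this system.
tmp=$(mktemp)
printf 'f1\tf2\n' > "$tmp"

# '-' names the standard input; operands are processed in order, so
# the line from the pipe comes before the line from the file.
out=$(printf 's1\ts2\n' | cut -f 2 - "$tmp")

rm -f "$tmp"
printf '%s\n' "$out"
# prints: s2, then f2
```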

INPUT FILES
The input files shall be text files, except that line lengths shall be
unlimited.

ENVIRONMENT VARIABLES
The following environment variables shall affect the execution of cut:

LANG
    Provide a default value for the internationalization variables
    that are unset or null. (See the Base Definitions volume of
    IEEE Std 1003.1-2001, Section 8.2, Internationalization
    Variables for the precedence of internationalization variables
    used to determine the values of locale categories.)

LC_ALL
    If set to a non-empty string value, override the values of all
    the other internationalization variables.

LC_CTYPE
    Determine the locale for the interpretation of sequences of
    bytes of text data as characters (for example, single-byte as
    opposed to multi-byte characters in arguments and input files).

LC_MESSAGES
    Determine the locale that should be used to affect the format
    and contents of diagnostic messages written to standard error.

NLSPATH
    Determine the location of message catalogs for the processing of
    LC_MESSAGES.

ASYNCHRONOUS EVENTS
Default.

STDOUT
The cut utility output shall be a concatenation of the selected bytes,
characters, or fields (one of the following):

    "%s\n", <concatenation of bytes>

    "%s\n", <concatenation of characters>

    "%s\n", <concatenation of fields and field delimiters>

STDERR
The standard error shall be used only for diagnostic messages.

OUTPUT FILES
None.

EXTENDED DESCRIPTION
None.

EXIT STATUS
The following exit values shall be returned:

 0  All input files were output successfully.

>0  An error occurred.

CONSEQUENCES OF ERRORS
Default.

The following sections are informative.

APPLICATION USAGE
Earlier versions of the cut utility worked in an environment where
bytes and characters were considered equivalent (modulo <backspace> and
<tab> processing in some implementations). In the extended world of
multi-byte characters, the new -b option has been added. The -n option
(used with -b) allows it to be used to act on bytes rounded to
character boundaries. The algorithm specified for -n guarantees that:

    cut -b 1-500 -n file > file1
    cut -b 501- -n file > file2

ends up with all the characters in file appearing exactly once in file1
or file2. (There is, however, a <newline> in both file1 and file2 for
each <newline> in file.)

EXAMPLES
Examples of the option qualifier list:

1,4,7   Select the first, fourth, and seventh bytes, characters, or
        fields and field delimiters.

1-3,8   Equivalent to 1,2,3,8.

-5,10   Equivalent to 1,2,3,4,5,10.

3-      Equivalent to the third through the last, inclusive.

The low-high forms are not always equivalent when used with -b and -n
and multi-byte characters; see the description of -n.

The following command:

    cut -d : -f 1,6 /etc/passwd

reads the System V password file (user database) and produces lines of
the form:

    <user ID>:<home directory>
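To try the same selection without reading the real user database, the
command can be run against inline text. The two entries below are
hypothetical and only assume the usual colon-separated, seven-field
layout of a password file.

```shell
# Hypothetical entries in the colon-separated password-file layout.
passwd_sample='root:x:0:0:root:/root:/bin/sh
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin'

# Fields 1 and 6 are kept and rejoined with the ':' delimiter.
pairs=$(printf '%s\n' "$passwd_sample" | cut -d : -f 1,6)

printf '%s\n' "$pairs"
# prints: root:/root, then daemon:/usr/sbin
```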

Most utilities in this volume of IEEE Std 1003.1-2001 work on text
files. The cut utility can be used to turn files with arbitrary line
lengths into a set of text files containing the same data. The paste
utility can be used to create (or recreate) files with arbitrary line
lengths. For example, if file contains long lines, the commands:

    cut -b 1-500 -n file > file1
    cut -b 501- -n file > file2

create file1 (a text file) with lines no longer than 500 bytes (plus
the <newline>) and file2, which contains the remainder of the data from
file. (Note that file2 is not a text file if there are lines in file
that are longer than 500 + {LINE_MAX} bytes.) The original file can be
recreated from file1 and file2 using the command:

    paste -d "\0" file1 file2 > file
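The split-and-rejoin technique can be checked on a small scale. The
sketch below uses a 10-byte split point instead of 500 and ASCII-only
sample text, so -n has no visible effect here (some implementations
accept -n but ignore it). It assumes mktemp -d is available, which is
not specified by this volume of the standard; all file names are
illustrative.

```shell
# Assumption: mktemp -d exists on this system.
dir=$(mktemp -d)
printf 'short\nexactly10!\na much longer line of text\n' > "$dir/file"

# Split each line at byte 10.
cut -b 1-10 -n "$dir/file" > "$dir/file1"
cut -b 11- -n "$dir/file" > "$dir/file2"

# Rejoin corresponding lines with an empty delimiter ("\0").
paste -d '\0' "$dir/file1" "$dir/file2" > "$dir/rebuilt"

# Compare the rebuilt file with the original.
if cmp -s "$dir/file" "$dir/rebuilt"; then
    roundtrip=same
else
    roundtrip=differs
fi
first_half=$(cat "$dir/file1")

rm -rf "$dir"
printf '%s\n' "$roundtrip"
```

Writing the result to a separate file (rebuilt) rather than back to
file keeps the original available for the comparison.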
237
239 Some historical implementations do not count <backspace>s in determin‐
240 ing character counts with the -c option. This may be useful for using
241 cut for processing nroff output. It was deliberately decided not to
242 have the -c option treat either <backspace>s or <tab>s in any special
243 fashion. The fold utility does treat these characters specially.
244
245 Unlike other utilities, some historical implementations of cut exit
246 after not finding an input file, rather than continuing to process the
247 remaining file operands. This behavior is prohibited by this volume of
248 IEEE Std 1003.1-2001, where only the exit status is affected by this
249 problem.
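The required behavior can be observed with one missing and one existing
operand. This is a sketch that assumes an implementation conforming to
the paragraph above and assumes mktemp is available; the missing path
is hypothetical and is simply presumed not to exist.

```shell
# Assumption: mktemp exists on this system.
tmp=$(mktemp)
printf 'a:b\n' > "$tmp"

# Hypothetical path that is presumed not to exist.
missing="$tmp.does-not-exist"

# The missing operand produces a diagnostic (discarded here); the
# remaining operand is still processed.
out=$(cut -d : -f 2 "$missing" "$tmp" 2>/dev/null)
status=$?

rm -f "$tmp"
printf 'output=%s status=%s\n' "$out" "$status"
```

On a conforming implementation the output is expected to contain the
field from the existing file, and the exit status is expected to be
greater than zero.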

The behavior of cut when provided with either mutually-exclusive
options or options that do not work logically together has been
deliberately left unspecified in favor of global wording in Utility
Description Defaults.

The OPTIONS section was changed in response to IEEE PASC Interpretation
1003.2 #149. The change represents historical practice on all known
systems. The original standard was ambiguous on the nature of the
output.

The list option-arguments are historically used to select the portions
of the line to be written, but do not affect the order of the data. For
example:

    echo abcdefghi | cut -c6,2,4-7,1

yields "abdefg".

A proposal to enhance cut with the following option:

-o  Preserve the selected field order. When this option is
    specified, each byte, character, or field (or ranges of such)
    shall be written in the order specified by the list
    option-argument, even if this requires multiple outputs of the
    same bytes, characters, or fields.

was rejected because this type of enhancement is outside the scope of
the IEEE P1003.2b draft standard.

FUTURE DIRECTIONS
None.

SEE ALSO
grep, paste, Parameters and Variables

COPYRIGHT
Portions of this text are reprinted and reproduced in electronic form
from IEEE Std 1003.1, 2003 Edition, Standard for Information Technology
-- Portable Operating System Interface (POSIX), The Open Group Base
Specifications Issue 6, Copyright (C) 2001-2003 by the Institute of
Electrical and Electronics Engineers, Inc and The Open Group. In the
event of any discrepancy between this version and the original IEEE and
The Open Group Standard, the original IEEE and The Open Group Standard
is the referee document. The original Standard can be obtained online
at http://www.opengroup.org/unix/online.html .

IEEE/The Open Group 2003 CUT(1P)