CUT(1P)                 POSIX Programmer's Manual                 CUT(1P)

PROLOG
This manual page is part of the POSIX Programmer's Manual. The Linux
implementation of this interface may differ (consult the corresponding
Linux manual page for details of Linux behavior), or the interface may
not be implemented on Linux.

NAME
cut - cut out selected fields of each line of a file

SYNOPSIS
cut −b list [−n] [file...]

cut −c list [file...]

cut −f list [−d delim] [−s] [file...]

DESCRIPTION
The cut utility shall cut out bytes (−b option), characters (−c
option), or character-delimited fields (−f option) from each line in
one or more files, concatenate them, and write them to standard output.

OPTIONS
The cut utility shall conform to the Base Definitions volume of
POSIX.1‐2008, Section 12.2, Utility Syntax Guidelines.

The application shall ensure that the option-argument list (see options
−b, −c, and −f below) is a <comma>-separated list or <blank>-separated
list of positive numbers and ranges. Ranges can be in three forms. The
first is two positive numbers separated by a <hyphen> (low−high), which
represents all fields from the first number to the second number. The
second is a positive number preceded by a <hyphen> (−high), which
represents all fields from field number 1 to that number. The third is
a positive number followed by a <hyphen> (low−), which represents that
number to the last field, inclusive. The elements in list can be
repeated, can overlap, and can be specified in any order, but the
bytes, characters, or fields selected shall be written in the order of
the input data. If an element appears in the selection list more than
once, it shall be written exactly once.

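As a rough illustration of the three range forms and of the write-once
rule described above, the following sketch runs cut on one line of ten
single-byte characters. The sample string and the variable names are
made up for this example, and the options are typed with the ASCII
hyphen.

```shell
#!/bin/sh
# Hypothetical sample input: one line of ten single-byte characters.
line='abcdefghij'

# low-high form: characters 2 through 4.
mid=$(printf '%s\n' "$line" | cut -c 2-4)

# -high form: character 1 up to and including character 3.
head=$(printf '%s\n' "$line" | cut -c -3)

# low- form: character 8 through the end of the line.
tail=$(printf '%s\n' "$line" | cut -c 8-)

# Overlapping elements given out of order: each selected character is
# written once, in input order.
merged=$(printf '%s\n' "$line" | cut -c 3-5,1,2-4)

printf '%s %s %s %s\n' "$mid" "$head" "$tail" "$merged"
# prints: bcd abc hij abcde
```
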
The following options shall be supported:

−b list   Cut based on a list of bytes. Each selected byte shall be
          output unless the −n option is also specified. It shall not
          be an error to select bytes not present in the input line.

−c list   Cut based on a list of characters. Each selected character
          shall be output. It shall not be an error to select
          characters not present in the input line.

−d delim  Set the field delimiter to the character delim. The default
          is the <tab>.

−f list   Cut based on a list of fields, assumed to be separated in the
          file by a delimiter character (see −d). Each selected field
          shall be output. Output fields shall be separated by a single
          occurrence of the field delimiter character. Lines with no
          field delimiters shall be passed through intact, unless −s is
          specified. It shall not be an error to select fields not
          present in the input line.

−n        Do not split characters. When specified with the −b option,
          each element in list of the form low−high (<hyphen>-separated
          numbers) shall be modified as follows:

           *  If the byte selected by low is not the first byte of a
              character, low shall be decremented to select the first
              byte of the character originally selected by low. If the
              byte selected by high is not the last byte of a
              character, high shall be decremented to select the last
              byte of the character prior to the character originally
              selected by high, or zero if there is no prior character.
              If the resulting range element has high equal to zero or
              low greater than high, the list element shall be dropped
              from list for that input line without causing an error.

          Each element in list of the form low− shall be treated as
          above with high set to the number of bytes in the current
          line, not including the terminating <newline>. Each element
          in list of the form −high shall be treated as above with low
          set to 1. Each element in list of the form num (a single
          number) shall be treated as above with low set to num and
          high set to num.

−s        Suppress lines with no delimiter characters, when used with
          the −f option. Unless specified, lines with no delimiters
          shall be passed through untouched.

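The interaction of −f, −d, and −s described in the option list can be
sketched as follows. The three-line input is invented for the example;
its second line deliberately contains no delimiter.

```shell
#!/bin/sh
# Hypothetical input: two colon-delimited lines and one line with no colon.
input='a:b:c
nodelim
x:y:z'

# Without -s the line lacking a delimiter is passed through intact.
with_passthrough=$(printf '%s\n' "$input" | cut -d : -f 1,3)

# With -s that line is suppressed.
suppressed=$(printf '%s\n' "$input" | cut -d : -f 1,3 -s)

printf '%s\n--\n%s\n' "$with_passthrough" "$suppressed"
```

The first result should hold the lines a:c, nodelim, and x:z; the
second should hold only a:c and x:z.
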
OPERANDS
The following operand shall be supported:

file      A pathname of an input file. If no file operands are
          specified, or if a file operand is '−', the standard input
          shall be used.

STDIN
The standard input shall be used only if no file operands are
specified, or if a file operand is '−'. See the INPUT FILES section.

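A minimal sketch of the two ways standard input is selected, assuming a
made-up input line; the variable names are illustrative only.

```shell
#!/bin/sh
# No file operand: cut reads standard input.
implicit=$(printf 'hello world\n' | cut -c 1-5)

# An explicit '-' operand also names standard input.
explicit=$(printf 'hello world\n' | cut -c 1-5 -)

printf '%s %s\n' "$implicit" "$explicit"
# prints: hello hello
```
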
INPUT FILES
The input files shall be text files, except that line lengths shall be
unlimited.

ENVIRONMENT VARIABLES
The following environment variables shall affect the execution of cut:

LANG      Provide a default value for the internationalization
          variables that are unset or null. (See the Base Definitions
          volume of POSIX.1‐2008, Section 8.2, Internationalization
          Variables for the precedence of internationalization
          variables used to determine the values of locale categories.)

LC_ALL    If set to a non-empty string value, override the values of
          all the other internationalization variables.

LC_CTYPE  Determine the locale for the interpretation of sequences of
          bytes of text data as characters (for example, single-byte as
          opposed to multi-byte characters in arguments and input
          files).

LC_MESSAGES
          Determine the locale that should be used to affect the format
          and contents of diagnostic messages written to standard
          error.

NLSPATH   Determine the location of message catalogs for the processing
          of LC_MESSAGES.

ASYNCHRONOUS EVENTS
Default.

STDOUT
The cut utility output shall be a concatenation of the selected bytes,
characters, or fields (one of the following):

    "%s\n", <concatenation of bytes>

    "%s\n", <concatenation of characters>

    "%s\n", <concatenation of fields and field delimiters>

STDERR
The standard error shall be used only for diagnostic messages.

OUTPUT FILES
None.

EXTENDED DESCRIPTION
None.

EXIT STATUS
The following exit values shall be returned:

 0    All input files were output successfully.

>0    An error occurred.

CONSEQUENCES OF ERRORS
Default.

The following sections are informative.

APPLICATION USAGE
The cut and fold utilities can be used to create text files out of
files with arbitrary line lengths. The cut utility should be used when
the number of lines (or records) needs to remain constant. The fold
utility should be used when the contents of long lines need to be kept
contiguous.

Earlier versions of the cut utility worked in an environment where
bytes and characters were considered equivalent (modulo <backspace> and
<tab> processing in some implementations). In the extended world of
multi-byte characters, the new −b option has been added. The −n option
(used with −b) allows it to be used to act on bytes rounded to
character boundaries. The algorithm specified for −n guarantees that:

    cut −b 1−500 −n file > file1
    cut −b 501− −n file > file2

ends up with all the characters in file appearing exactly once in file1
or file2. (There is, however, a <newline> in both file1 and file2 for
each <newline> in file.)

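To show why rounding to character boundaries matters, the sketch below
selects the first byte of a two-byte UTF-8 character with −b alone and
dumps the result in hexadecimal. The choice of character and the use of
od are assumptions made for the illustration. An implementation that
honours −n would round the same selection to a character boundary; some
implementations accept −n without acting on it, so the sketch leaves
the option out.

```shell
#!/bin/sh
# The character U+00E9 in UTF-8 is the two bytes 0xC3 0xA9, written
# here with octal escapes so the script itself stays ASCII.
# Selecting byte 1 without -n keeps only the first half of the character.
split_bytes=$(printf '\303\251\n' | cut -b 1 | od -An -tx1 | tr -d ' \n')

printf '%s\n' "$split_bytes"
# prints: c30a  (the lone lead byte followed by the <newline>)
```
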
EXAMPLES
Examples of the option qualifier list:

1,4,7     Select the first, fourth, and seventh bytes, characters, or
          fields and field delimiters.

1−3,8     Equivalent to 1,2,3,8.

−5,10     Equivalent to 1,2,3,4,5,10.

3−        Equivalent to the third through the last, inclusive.

The low−high forms are not always equivalent when used with −b and −n
and multi-byte characters; see the description of −n.

The following command:

    cut −d : −f 1,6 /etc/passwd

reads the System V password file (user database) and produces lines of
the form:

    <user ID>:<home directory>

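The same selection can be tried without touching the real user
database. The two records below are invented and merely follow the
seven-field colon-separated layout of a System V password file.

```shell
#!/bin/sh
# Hypothetical records in password-file layout:
# name:password:UID:GID:comment:home:shell
sample='root:x:0:0:root:/root:/bin/sh
alice:x:1000:1000:Alice:/home/alice:/bin/sh'

# Keep field 1 (user name) and field 6 (home directory).
logins=$(printf '%s\n' "$sample" | cut -d : -f 1,6)

printf '%s\n' "$logins"
# prints:
# root:/root
# alice:/home/alice
```
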
Most utilities in this volume of POSIX.1‐2008 work on text files. The
cut utility can be used to turn files with arbitrary line lengths into
a set of text files containing the same data. The paste utility can be
used to create (or recreate) files with arbitrary line lengths. For
example, if file contains long lines:

    cut −b 1−500 −n file > file1
    cut −b 501− −n file > file2

creates file1 (a text file) with lines no longer than 500 bytes (plus
the <newline>) and file2 that contains the remainder of the data from
file. (Note that file2 is not a text file if there are lines in file
that are longer than 500 + {LINE_MAX} bytes.) The original file can be
recreated from file1 and file2 using the command:

    paste −d "\0" file1 file2 > file

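The split-and-rejoin round trip can be checked on a small scale. This
is only a sketch: it splits after byte 5 instead of byte 500, uses
single-byte data so that −n makes no difference and is omitted, and
assumes that mktemp is available (it is widespread but not required by
this standard). All file names are illustrative.

```shell
#!/bin/sh
# Work in a scratch directory so no existing file is overwritten.
tmp=$(mktemp -d) || exit 1

printf 'abcdefghij\nxy\n' > "$tmp/file"

# Split every line after its fifth byte (the text above uses 500).
cut -b 1-5 "$tmp/file" > "$tmp/file1"
cut -b 6-  "$tmp/file" > "$tmp/file2"

# Join the halves line by line with an empty delimiter.
paste -d '\0' "$tmp/file1" "$tmp/file2" > "$tmp/rebuilt"

if cmp -s "$tmp/file" "$tmp/rebuilt"; then
    roundtrip=same
else
    roundtrip=different
fi
printf '%s\n' "$roundtrip"
# prints: same

rm -rf "$tmp"
```

The short second line shows that a line with no bytes in the selected
range still yields an empty line in file2, which keeps the two files
aligned for paste.
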
RATIONALE
Some historical implementations do not count <backspace> characters in
determining character counts with the −c option. This may be useful for
using cut for processing nroff output. It was deliberately decided not
to have the −c option treat either <backspace> or <tab> characters in
any special fashion. The fold utility does treat these characters
specially.

Unlike other utilities, some historical implementations of cut exit
after not finding an input file, rather than continuing to process the
remaining file operands. This behavior is prohibited by this volume of
POSIX.1‐2008, where only the exit status is affected by this problem.

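The required behavior can be observed with one missing and one existing
operand. The sketch assumes a conforming cut and the availability of
mktemp; the file names are made up.

```shell
#!/bin/sh
tmp=$(mktemp -d) || exit 1
printf 'alpha\n' > "$tmp/present"

# The first operand does not exist; the second does. A conforming cut
# reports the problem on standard error, still processes the second
# file, and exits with a status greater than zero.
status=0
out=$(cut -c 1-3 "$tmp/missing" "$tmp/present" 2>/dev/null) || status=$?

printf 'output=%s status=%s\n' "$out" "$status"

rm -rf "$tmp"
```
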
The behavior of cut when provided with either mutually-exclusive
options or options that do not work logically together has been
deliberately left unspecified in favor of global wording in Section
1.4, Utility Description Defaults.

The OPTIONS section was changed in response to IEEE PASC Interpretation
1003.2 #149. The change represents historical practice on all known
systems. The original standard was ambiguous on the nature of the
output.

The list option-arguments are historically used to select the portions
of the line to be written, but do not affect the order of the data. For
example:

    echo abcdefghi | cut −c6,2,4−7,1

yields "abdefg".

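The same holds for fields. In the sketch below, which uses an invented
three-field row and the default <tab> delimiter, asking for field 3
before field 1 gives the same output as asking for them in input order.

```shell
#!/bin/sh
# Three tab-separated fields; <tab> is the default delimiter for -f.
row=$(printf 'one\ttwo\tthree')

# Listing field 3 before field 1 does not reorder the output.
reversed=$(printf '%s\n' "$row" | cut -f 3,1)
natural=$(printf '%s\n' "$row" | cut -f 1,3)

if [ "$reversed" = "$natural" ]; then
    printf 'same output: %s\n' "$reversed"
fi
```
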
A proposal to enhance cut with the following option:

−o    Preserve the selected field order. When this option is specified,
      each byte, character, or field (or ranges of such) shall be
      written in the order specified by the list option-argument, even
      if this requires multiple outputs of the same bytes, characters,
      or fields.

was rejected because this type of enhancement is outside the scope of
the IEEE P1003.2b draft standard.

FUTURE DIRECTIONS
None.

SEE ALSO
Section 2.5, Parameters and Variables, fold, grep, paste

The Base Definitions volume of POSIX.1‐2008, Chapter 8, Environment
Variables, Section 12.2, Utility Syntax Guidelines

COPYRIGHT
Portions of this text are reprinted and reproduced in electronic form
from IEEE Std 1003.1, 2013 Edition, Standard for Information Technology
-- Portable Operating System Interface (POSIX), The Open Group Base
Specifications Issue 7, Copyright (C) 2013 by the Institute of
Electrical and Electronics Engineers, Inc and The Open Group. (This is
POSIX.1-2008 with the 2013 Technical Corrigendum 1 applied.) In the
event of any discrepancy between this version and the original IEEE and
The Open Group Standard, the original IEEE and The Open Group Standard
is the referee document. The original Standard can be obtained online
at http://www.unix.org/online.html .

Any typographical or formatting errors that appear in this page are
most likely to have been introduced during the conversion of the source
files to man page format. To report such errors, see
https://www.kernel.org/doc/man-pages/reporting_bugs.html .


IEEE/The Open Group                  2013                         CUT(1P)