CUT(1P)                    POSIX Programmer's Manual                   CUT(1P)

PROLOG

       This manual page is part of the POSIX Programmer's Manual. The Linux
       implementation of this interface may differ (consult the corresponding
       Linux manual page for details of Linux behavior), or the interface may
       not be implemented on Linux.

NAME

       cut - cut out selected fields of each line of a file

SYNOPSIS

       cut -b list [-n] [file...]

       cut -c list [file...]

       cut -f list [-d delim] [-s] [file...]

DESCRIPTION

       The cut utility shall cut out bytes (-b option), characters (-c
       option), or character-delimited fields (-f option) from each line in
       one or more files, concatenate them, and write them to standard output.
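       The three selection modes described above can be sketched as follows.
       This is a minimal illustration, not normative text: the sample record
       and the variable names (record, bytes, chars, fields) are assumptions
       introduced here, and the data is ASCII so that bytes and characters
       coincide.

```shell
# Hypothetical sample record: three <tab>-separated fields. The data is
# ASCII, so each character is exactly one byte.
record=$(printf 'alpha\tbeta\tgamma')

bytes=$(printf '%s\n' "$record" | cut -b 1-3)     # bytes 1 through 3
chars=$(printf '%s\n' "$record" | cut -c 1,3,5)   # characters 1, 3, and 5
fields=$(printf '%s\n' "$record" | cut -f 1,3)    # fields 1 and 3, joined by <tab>

printf '%s\n' "$bytes" "$chars" "$fields"
```

       With this input, bytes holds "alp", chars holds "apa", and fields
       holds "alpha" and "gamma" separated by a single <tab>.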

OPTIONS

       The cut utility shall conform to the Base Definitions volume of
       POSIX.1‐2017, Section 12.2, Utility Syntax Guidelines.

       The application shall ensure that the option-argument list (see options
       -b, -c, and -f below) is a <comma>-separated list or <blank>-separated
       list of positive numbers and ranges. Ranges can be in three forms. The
       first is two positive numbers separated by a <hyphen-minus> (low-high),
       which represents all fields from the first number to the second number.
       The second is a positive number preceded by a <hyphen-minus> (-high),
       which represents all fields from field number 1 to that number. The
       third is a positive number followed by a <hyphen-minus> (low-), which
       represents that number to the last field, inclusive. The elements in
       list can be repeated, can overlap, and can be specified in any order,
       but the bytes, characters, or fields selected shall be written in the
       order of the input data. If an element appears in the selection list
       more than once, it shall be written exactly once.

       The following options shall be supported:

       -b list   Cut based on a list of bytes. Each selected byte shall be
                 output unless the -n option is also specified. It shall not
                 be an error to select bytes not present in the input line.

       -c list   Cut based on a list of characters. Each selected character
                 shall be output. It shall not be an error to select
                 characters not present in the input line.

       -d delim  Set the field delimiter to the character delim. The default
                 is the <tab>.

       -f list   Cut based on a list of fields, assumed to be separated in the
                 file by a delimiter character (see -d). Each selected field
                 shall be output. Output fields shall be separated by a single
                 occurrence of the field delimiter character. Lines with no
                 field delimiters shall be passed through intact, unless -s is
                 specified. It shall not be an error to select fields not
                 present in the input line.

       -n        Do not split characters. When specified with the -b option,
                 each element in list of the form low-high
                 (<hyphen-minus>-separated numbers) shall be modified as
                 follows:

                  *  If the byte selected by low is not the first byte of a
                     character, low shall be decremented to select the first
                     byte of the character originally selected by low. If the
                     byte selected by high is not the last byte of a
                     character, high shall be decremented to select the last
                     byte of the character prior to the character originally
                     selected by high, or zero if there is no prior character.
                     If the resulting range element has high equal to zero or
                     low greater than high, the list element shall be dropped
                     from list for that input line without causing an error.

                 Each element in list of the form low- shall be treated as
                 above with high set to the number of bytes in the current
                 line, not including the terminating <newline>. Each element
                 in list of the form -high shall be treated as above with low
                 set to 1. Each element in list of the form num (a single
                 number) shall be treated as above with low set to num and
                 high set to num.

       -s        Suppress lines with no delimiter characters, when used with
                 the -f option. Unless specified, lines with no delimiters
                 shall be passed through untouched.
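       The three range forms and the interaction of -d, -f, and -s described
       in this section can be sketched as follows. The sample strings and
       variable names are assumptions made for illustration only. The -n
       option is left out because its effect shows only on multi-byte input
       in a suitable locale.

```shell
line='abcdefghi'   # arbitrary ASCII sample data

low_high=$(printf '%s\n' "$line" | cut -c 2-4)   # low-high: characters 2 through 4
to_high=$(printf '%s\n' "$line" | cut -c -3)     # -high: characters 1 through 3
from_low=$(printf '%s\n' "$line" | cut -c 7-)    # low-: character 7 to end of line

# Two hypothetical input lines; only the first contains the ':' delimiter.
input=$(printf 'a:b:c\nno delimiter here')

kept=$(printf '%s\n' "$input" | cut -d : -f 1,3)          # second line passes through intact
suppressed=$(printf '%s\n' "$input" | cut -d : -f 1,3 -s) # second line is dropped

printf '%s\n' "$low_high" "$to_high" "$from_low" "$kept" "$suppressed"
```

       Here low_high is "bcd", to_high is "abc", and from_low is "ghi". The
       kept result contains "a:c" followed by the undelimited line, while
       suppressed contains only "a:c".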

OPERANDS

       The following operand shall be supported:

       file      A pathname of an input file. If no file operands are
                 specified, or if a file operand is '-', the standard input
                 shall be used.
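       The two ways of reaching standard input described above can be
       sketched as follows. The scratch file and variable names are
       hypothetical, and mktemp is assumed to be available even though it is
       not part of this volume of POSIX.1‐2017.

```shell
# Hypothetical scratch file, created only for this illustration.
tmp=$(mktemp)
printf 'file:one\n' > "$tmp"

# A '-' operand stands for standard input; here the named file is listed
# first and standard input second.
mixed=$(printf 'stdin:two\n' | cut -d : -f 2 "$tmp" -)

# With no file operand at all, standard input is read.
stdin_only=$(printf 'stdin:three\n' | cut -d : -f 2)

rm -f "$tmp"
printf '%s\n' "$mixed" "$stdin_only"
```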

STDIN

       The standard input shall be used only if no file operands are
       specified, or if a file operand is '-'. See the INPUT FILES section.

INPUT FILES

       The input files shall be text files, except that line lengths shall be
       unlimited.

ENVIRONMENT VARIABLES

       The following environment variables shall affect the execution of cut:

       LANG      Provide a default value for the internationalization
                 variables that are unset or null. (See the Base Definitions
                 volume of POSIX.1‐2017, Section 8.2, Internationalization
                 Variables for the precedence of internationalization
                 variables used to determine the values of locale categories.)

       LC_ALL    If set to a non-empty string value, override the values of
                 all the other internationalization variables.

       LC_CTYPE  Determine the locale for the interpretation of sequences of
                 bytes of text data as characters (for example, single-byte as
                 opposed to multi-byte characters in arguments and input
                 files).

       LC_MESSAGES
                 Determine the locale that should be used to affect the format
                 and contents of diagnostic messages written to standard
                 error.

       NLSPATH   Determine the location of message catalogs for the processing
                 of LC_MESSAGES.

ASYNCHRONOUS EVENTS

       Default.

STDOUT

       The cut utility output shall be a concatenation of the selected bytes,
       characters, or fields (one of the following):

           "%s\n", <concatenation of bytes>

           "%s\n", <concatenation of characters>

           "%s\n", <concatenation of fields and field delimiters>

STDERR

       The standard error shall be used only for diagnostic messages.

OUTPUT FILES

       None.

EXTENDED DESCRIPTION

       None.

EXIT STATUS

       The following exit values shall be returned:

        0    All input files were output successfully.

       >0    An error occurred.
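       The two exit values can be observed with a sketch such as the one
       below. The path in missing is hypothetical and assumed not to exist.
       The second command also relies on the behavior noted in RATIONALE,
       namely that a conforming cut keeps processing the remaining operands
       after failing to open one.

```shell
# Successful run: every input is read and written.
printf 'a\tb\n' | cut -f 1 > /dev/null
ok_status=$?

# Hypothetical path, assumed not to exist on the system running this.
missing=/nonexistent/cut-example-input

# The missing file is reported on standard error (discarded here), the
# '-' operand is still processed, and the exit status is greater than zero.
err_status=0
survivor=$(printf 'x\ty\n' | cut -f 2 "$missing" - 2>/dev/null) || err_status=$?

printf 'ok=%s err=%s output=%s\n' "$ok_status" "$err_status" "$survivor"
```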

CONSEQUENCES OF ERRORS

       Default.

       The following sections are informative.

APPLICATION USAGE

       The cut and fold utilities can be used to create text files out of
       files with arbitrary line lengths. The cut utility should be used when
       the number of lines (or records) needs to remain constant. The fold
       utility should be used when the contents of long lines need to be kept
       contiguous.

       Earlier versions of the cut utility worked in an environment where
       bytes and characters were considered equivalent (modulo <backspace> and
       <tab> processing in some implementations). In the extended world of
       multi-byte characters, the new -b option has been added. The -n option
       (used with -b) allows it to be used to act on bytes rounded to
       character boundaries. The algorithm specified for -n guarantees that:

           cut -b 1-500 -n file > file1
           cut -b 501- -n file > file2

       ends up with all the characters in file appearing exactly once in file1
       or file2. (There is, however, a <newline> in both file1 and file2 for
       each <newline> in file.)
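       The difference between cut and fold noted at the start of this section
       can be sketched as follows. The sample line, the width of 4, and the
       variable names are assumptions chosen for illustration.

```shell
long_line='abcdefghij'   # hypothetical over-long line

# cut keeps one output line per input line and drops the excess.
cut_text=$(printf '%s\n' "$long_line" | cut -c 1-4)
cut_lines=$(printf '%s\n' "$long_line" | cut -c 1-4 | wc -l)

# fold keeps all of the data and spreads it over additional lines.
fold_lines=$(printf '%s\n' "$long_line" | fold -w 4 | wc -l)
fold_text=$(printf '%s\n' "$long_line" | fold -w 4 | tr -d '\n')

printf 'cut: %s line(s), text %s\n' "$cut_lines" "$cut_text"
printf 'fold: %s line(s), text %s\n' "$fold_lines" "$fold_text"
```

       With this input cut produces one line holding "abcd", while fold
       produces three lines that together still hold all ten characters.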

EXAMPLES

       Examples of the option qualifier list:

       1,4,7   Select the first, fourth, and seventh bytes, characters, or
               fields and field delimiters.

       1-3,8   Equivalent to 1,2,3,8.

       -5,10   Equivalent to 1,2,3,4,5,10.

       3-      Select the third through the last, inclusive.

       The low-high forms are not always equivalent when used with -b and -n
       and multi-byte characters; see the description of -n.

       The following command:

           cut -d : -f 1,6 /etc/passwd

       reads the System V password file (user database) and produces lines of
       the form:

           <user ID>:<home directory>

       Most utilities in this volume of POSIX.1‐2017 work on text files. The
       cut utility can be used to turn files with arbitrary line lengths into
       a set of text files containing the same data. The paste utility can be
       used to create (or recreate) files with arbitrary line lengths. For
       example, if file contains long lines:

           cut -b 1-500 -n file > file1
           cut -b 501- -n file > file2

       creates file1 (a text file) with lines no longer than 500 bytes (plus
       the <newline>) and file2 that contains the remainder of the data from
       file. (Note that file2 is not a text file if there are lines in file
       that are longer than 500 + {LINE_MAX} bytes.) The original file can be
       recreated from file1 and file2 using the command:

           paste -d "\0" file1 file2 > file
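       The split and recreate sequence above can be tried at a small scale
       with the sketch below. It splits after byte 5 instead of byte 500 and
       uses ASCII data. The scratch directory, the file name rebuilt, and the
       variable names are hypothetical, and mktemp is assumed to be
       available.

```shell
# Hypothetical scratch directory and ASCII sample data.
dir=$(mktemp -d)
printf 'abcdefghij\nklm\n' > "$dir/file"

# Split each line after byte 5. With ASCII input -n changes nothing,
# because every byte is already a complete character.
cut -b 1-5 -n "$dir/file" > "$dir/file1"
cut -b 6- -n "$dir/file" > "$dir/file2"

# Join corresponding lines with an empty delimiter to rebuild the data.
paste -d '\0' "$dir/file1" "$dir/file2" > "$dir/rebuilt"

if cmp -s "$dir/file" "$dir/rebuilt"; then
    roundtrip=same
else
    roundtrip=different
fi
first_half=$(cat "$dir/file1")

rm -rf "$dir"
printf 'round trip: %s\n' "$roundtrip"
```

       The short second line has no bytes past position 5, so file2 holds an
       empty line for it. This is why the line counts of file1 and file2
       still match and paste can pair them up.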

RATIONALE

       Some historical implementations do not count <backspace> characters in
       determining character counts with the -c option. This may be useful for
       using cut for processing nroff output. It was deliberately decided not
       to have the -c option treat either <backspace> or <tab> characters in
       any special fashion. The fold utility does treat these characters
       specially.

       Unlike other utilities, some historical implementations of cut exit
       after not finding an input file, rather than continuing to process the
       remaining file operands. This behavior is prohibited by this volume of
       POSIX.1‐2017, where only the exit status is affected by this problem.

       The behavior of cut when provided with either mutually-exclusive
       options or options that do not work logically together has been
       deliberately left unspecified in favor of global wording in Section 1.4,
       Utility Description Defaults.

       The OPTIONS section was changed in response to IEEE PASC Interpretation
       1003.2 #149. The change represents historical practice on all known
       systems. The original standard was ambiguous on the nature of the
       output.

       The list option-arguments are historically used to select the portions
       of the line to be written, but do not affect the order of the data. For
       example:

           echo abcdefghi | cut -c6,2,4-7,1

       yields "abdefg".

       A proposal to enhance cut with the following option:

       -o    Preserve the selected field order. When this option is specified,
             each byte, character, or field (or ranges of such) shall be
             written in the order specified by the list option-argument, even
             if this requires multiple outputs of the same bytes, characters,
             or fields.

       was rejected because this type of enhancement is outside the scope of
       the IEEE P1003.2b draft standard.
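       The ordering rule discussed in this section, under which the list
       selects data but never reorders or repeats it, can be checked with a
       sketch like the following. The variable names are assumptions made
       for illustration.

```shell
# The list from the example in this section, as written.
as_written=$(echo abcdefghi | cut -c6,2,4-7,1)

# The same selection with the list sorted and its overlap removed.
sorted_list=$(echo abcdefghi | cut -c1,2,4-7)

printf '%s\n' "$as_written" "$sorted_list"
```

       Both commands write "abdefg". Character 6 appears twice in the first
       list but only once in the output.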

FUTURE DIRECTIONS

       None.

SEE ALSO

       Section 2.5, Parameters and Variables, fold, grep, paste

       The Base Definitions volume of POSIX.1‐2017, Chapter 8, Environment
       Variables, Section 12.2, Utility Syntax Guidelines

COPYRIGHT

       Portions of this text are reprinted and reproduced in electronic form
       from IEEE Std 1003.1-2017, Standard for Information Technology --
       Portable Operating System Interface (POSIX), The Open Group Base
       Specifications Issue 7, 2018 Edition, Copyright (C) 2018 by the
       Institute of Electrical and Electronics Engineers, Inc and The Open
       Group. In the event of any discrepancy between this version and the
       original IEEE and The Open Group Standard, the original IEEE and The
       Open Group Standard is the referee document. The original Standard can
       be obtained online at http://www.opengroup.org/unix/online.html .

       Any typographical or formatting errors that appear in this page are
       most likely to have been introduced during the conversion of the source
       files to man page format. To report such errors, see
       https://www.kernel.org/doc/man-pages/reporting_bugs.html .

IEEE/The Open Group                  2017                              CUT(1P)