CSPLIT(1P)                 POSIX Programmer's Manual                CSPLIT(1P)

PROLOG
       This manual page is part of the POSIX Programmer's Manual. The Linux
       implementation of this interface may differ (consult the corresponding
       Linux manual page for details of Linux behavior), or the interface may
       not be implemented on Linux.

NAME
       csplit - split files based on context

SYNOPSIS
       csplit [-ks] [-f prefix] [-n number] file arg...

DESCRIPTION
       The csplit utility shall read the file named by the file operand, write
       all or part of that file into other files as directed by the arg
       operands, and write the sizes of the files.

OPTIONS
       The csplit utility shall conform to the Base Definitions volume of
       POSIX.1-2008, Section 12.2, Utility Syntax Guidelines.

       The following options shall be supported:

       -f prefix Name the created files prefix00, prefix01, ..., prefixn. The
                 default is xx00 ... xxn. If the prefix argument would create
                 a filename exceeding {NAME_MAX} bytes, an error shall result,
                 csplit shall exit with a diagnostic message, and no files
                 shall be created.

       -k        Leave previously created files intact. By default, csplit
                 shall remove created files if an error occurs.

       -n number Use number decimal digits to form filenames for the file
                 pieces. The default shall be 2.

       -s        Suppress the output of file size messages.
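       The options can be combined as in the following minimal sketch. The
       file name input.txt, the prefix part_, and the scratch directory are
       illustrative assumptions, not part of the specification.

```shell
# Work in a scratch directory so the demo does not clutter the current one.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Hypothetical input: ten numbered lines.
printf 'line %s\n' 1 2 3 4 5 6 7 8 9 10 > input.txt

# -f part_ sets the prefix, -n 3 asks for three-digit suffixes, and
# -s suppresses the size messages. The operand 6 splits before line 6.
csplit -s -f part_ -n 3 input.txt 6

# Lists input.txt plus the two pieces, part_000 and part_001.
ls
```

       Here part_000 receives lines 1 to 5 and part_001 receives the rest.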

OPERANDS
       The following operands shall be supported:

       file      The pathname of a text file to be split. If file is '-', the
                 standard input shall be used.

       Each arg operand can be one of the following:

       /rexp/[offset]
                 A file shall be created using the content of the lines from
                 the current line up to, but not including, the line that
                 results from the evaluation of the regular expression with
                 offset, if any, applied. The regular expression rexp shall
                 follow the rules for basic regular expressions described in
                 the Base Definitions volume of POSIX.1-2008, Section 9.3,
                 Basic Regular Expressions. The application shall use the
                 sequence "\/" to specify a <slash> character within the rexp.
                 The optional offset shall be a positive or negative integer
                 value representing a number of lines. A positive integer
                 value can be preceded by '+'. If the selection of lines from
                 an offset expression of this type would create a file with
                 zero lines, or one with greater than the number of lines left
                 in the input file, the results are unspecified. After the
                 section is created, the current line shall be set to the line
                 that results from the evaluation of the regular expression
                 with any offset applied. If the current line is the first
                 line in the file and a regular expression operation has not
                 yet been performed, the pattern match of rexp shall be
                 applied from the current line to the end of the file.
                 Otherwise, the pattern match of rexp shall be applied from
                 the line following the current line to the end of the file.

       %rexp%[offset]
                 Equivalent to /rexp/[offset], except that no file shall be
                 created for the selected section of the input file. The
                 application shall use the sequence "\%" to specify a
                 <percent-sign> character within the rexp.

       line_no   Create a file from the current line up to (but not including)
                 the line number line_no. Lines in the file shall be numbered
                 starting at one. The current line becomes line_no.

       {num}     Repeat operand. This operand can follow any of the operands
                 described previously. If it follows a rexp type operand, that
                 operand shall be applied num more times. If it follows a
                 line_no operand, the file shall be split every line_no lines,
                 num times, from that point.
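       The regular expression operands above can be exercised on a small
       file, as in this sketch. The file notes.txt, its contents, and the
       prefixes sec and skip are assumptions made for illustration. The
       first line deliberately does not match the pattern, which avoids the
       zero-line case that the text leaves unspecified.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Hypothetical eight-line input with three "# " heading lines.
cat > notes.txt <<'EOF'
intro
# one
a
b
# two
c
# three
d
EOF

# /rexp/ followed by a repeat: split before each line starting with "# ".
# {2} applies the pattern two more times, so four files result:
# sec00 (line 1), sec01 (lines 2-4), sec02 (lines 5-6), sec03 (lines 7-8).
csplit -s -f sec notes.txt '/^# /' '{2}'

# %rexp% advances without writing a file; /rexp/+1 moves the split point
# one line past the match. Lines 1-4 are not written anywhere, skip00
# holds lines 5-7, and skip01 holds line 8.
csplit -s -f skip notes.txt '%^# two%' '/^# three/+1'

ls
```

       Quoting each operand keeps the shell from interpreting the braces,
       slashes, and percent signs before csplit sees them.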

       An error shall be reported if an operand does not reference a line
       between the current position and the end of the file.

STDIN
       See the INPUT FILES section.

INPUT FILES
       The input file shall be a text file.

ENVIRONMENT VARIABLES
       The following environment variables shall affect the execution of
       csplit:

       LANG      Provide a default value for the internationalization
                 variables that are unset or null. (See the Base Definitions
                 volume of POSIX.1-2008, Section 8.2, Internationalization
                 Variables for the precedence of internationalization
                 variables used to determine the values of locale categories.)

       LC_ALL    If set to a non-empty string value, override the values of
                 all the other internationalization variables.

       LC_COLLATE
                 Determine the locale for the behavior of ranges, equivalence
                 classes, and multi-character collating elements within
                 regular expressions.

       LC_CTYPE  Determine the locale for the interpretation of sequences of
                 bytes of text data as characters (for example, single-byte as
                 opposed to multi-byte characters in arguments and input
                 files) and the behavior of character classes within regular
                 expressions.

       LC_MESSAGES
                 Determine the locale that should be used to affect the format
                 and contents of diagnostic messages written to standard
                 error.

       NLSPATH   Determine the location of message catalogs for the processing
                 of LC_MESSAGES.
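       Because LC_COLLATE and LC_CTYPE influence how ranges in rexp are
       interpreted, a script that relies on a range such as [A-Z] may want
       to pin the locale. The sketch below assumes a hypothetical file
       words.txt and prefix w, and sets LC_ALL=C for the one command only.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Hypothetical input: only the second line starts with an uppercase letter.
printf 'alpha\nBeta\ngamma\n' > words.txt

# LC_ALL overrides the other internationalization variables, so in the
# C (POSIX) locale the range [A-Z] covers the ASCII uppercase letters.
LC_ALL=C csplit -s -f w words.txt '/^[A-Z]/'

# w00 holds "alpha"; w01 holds "Beta" and "gamma".
ls
```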

ASYNCHRONOUS EVENTS
       If the -k option is specified, created files shall be retained.
       Otherwise, the default action occurs.

STDOUT
       Unless the -s option is used, the standard output shall consist of one
       line per file created, with a format as follows:

           "%d\n", <file size in bytes>
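       A small sketch of the size report follows; the file in.txt and the
       variable sizes are illustrative assumptions.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Three lines of three bytes each (two characters plus a newline).
printf 'aa\nbb\ncc\n' > in.txt

# Without -s, csplit writes one size per created file to standard output.
# Splitting before line 2 gives xx00 (3 bytes) and xx01 (6 bytes).
sizes=$(csplit in.txt 2)
printf '%s\n' "$sizes"
```

       Capturing standard output this way gives a script the byte counts in
       the same order as the files xx00, xx01, and so on.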

STDERR
       The standard error shall be used only for diagnostic messages.

OUTPUT FILES
       The output files shall contain portions of the original input file;
       otherwise, unchanged.

EXTENDED DESCRIPTION
       None.

EXIT STATUS
       The following exit values shall be returned:

        0    Successful completion.

       >0    An error occurred.

CONSEQUENCES OF ERRORS
       By default, created files shall be removed if an error occurs. When the
       -k option is specified, created files shall not be removed if an error
       occurs.
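       The difference can be observed by forcing an error with a pattern
       that never matches, as in this sketch. The file in.txt, the prefixes
       drop and keep, and the status variables are assumptions made for
       illustration.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
printf 'a\nb\nc\n' > in.txt

# /nomatch/ matches no line, so both runs fail after the first piece has
# been written. Diagnostics go to standard error and are discarded here.
plain_status=0
csplit -s -f drop in.txt 2 '/nomatch/' 2>/dev/null || plain_status=$?

keep_status=0
csplit -s -k -f keep in.txt 2 '/nomatch/' 2>/dev/null || keep_status=$?

# Report the exit status and how many pieces survived each run.
echo "without -k: exit $plain_status, files left: $(ls | grep -c '^drop')"
echo "with -k:    exit $keep_status, files left: $(ls | grep -c '^keep')"
```

       Both runs exit with a status greater than zero; only the run with -k
       leaves its pieces behind.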

       The following sections are informative.

APPLICATION USAGE
       None.

EXAMPLES
        1. This example creates four files, cobol00 ... cobol03:

               csplit -f cobol file '/procedure division/' /par5./ /par16./

           After editing the split files, they can be recombined as follows:

               cat cobol0[0-3] > file

           Note that this example overwrites the original file.

        2. This example would split the file after the first 99 lines, and
           every 100 lines thereafter, up to 9999 lines; this is because lines
           in the file are numbered from 1 rather than zero, for historical
           reasons:

               csplit -k file 100 {99}

        3. Assuming that prog.c follows the C-language coding convention of
           ending routines with a '}' at the beginning of the line, this
           example creates a file containing each separate C routine (up to
           21) in prog.c:

               csplit -k prog.c '%main(%' '/^}/+1' {20}
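       The line counting in example 2 can be checked at a smaller scale.
       The sketch below uses a hypothetical ten-line file input.txt and the
       prefix blk, with 4 {1} standing in for 100 {99}: the first piece
       gets line_no minus one lines, and each repeat adds line_no lines.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
printf 'line %s\n' 1 2 3 4 5 6 7 8 9 10 > input.txt

# Split before line 4, then once more 4 lines later (before line 8).
# blk00 holds lines 1-3, blk01 lines 4-7, and blk02 the remaining 8-10.
csplit -s -f blk input.txt 4 '{1}'

# Show how many lines each piece received.
for piece in blk00 blk01 blk02; do
    printf '%s: %s lines\n' "$piece" "$(wc -l < "$piece")"
done
```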

RATIONALE
       The -n option was added to extend the range of filenames that could be
       handled.

       Consideration was given to adding a -a flag to use the alphabetic
       filename generation used by the historical split utility, but the
       functionality added by the -n option was deemed to make alphabetic
       naming unnecessary.

FUTURE DIRECTIONS
       None.

SEE ALSO
       sed, split

       The Base Definitions volume of POSIX.1-2008, Chapter 8, Environment
       Variables, Section 9.3, Basic Regular Expressions, Section 12.2,
       Utility Syntax Guidelines

COPYRIGHT
       Portions of this text are reprinted and reproduced in electronic form
       from IEEE Std 1003.1, 2013 Edition, Standard for Information Technology
       -- Portable Operating System Interface (POSIX), The Open Group Base
       Specifications Issue 7, Copyright (C) 2013 by the Institute of
       Electrical and Electronics Engineers, Inc and The Open Group. (This is
       POSIX.1-2008 with the 2013 Technical Corrigendum 1 applied.) In the
       event of any discrepancy between this version and the original IEEE and
       The Open Group Standard, the original IEEE and The Open Group Standard
       is the referee document. The original Standard can be obtained online
       at http://www.unix.org/online.html .

       Any typographical or formatting errors that appear in this page are
       most likely to have been introduced during the conversion of the source
       files to man page format. To report such errors, see
       https://www.kernel.org/doc/man-pages/reporting_bugs.html .

IEEE/The Open Group                  2013                           CSPLIT(1P)