CSPLIT(P)                  POSIX Programmer's Manual                 CSPLIT(P)

NAME
       csplit - split files based on context

SYNOPSIS
       csplit [-ks][-f prefix][-n number] file arg1 ...argn

DESCRIPTION
       The csplit utility shall read the file named by the file operand,
       write all or part of that file into other files as directed by the
       arg operands, and write the sizes of the files.

OPTIONS
       The csplit utility shall conform to the Base Definitions volume of
       IEEE Std 1003.1-2001, Section 12.2, Utility Syntax Guidelines.

       The following options shall be supported:

       -f prefix
              Name the created files prefix00, prefix01, ..., prefixn. The
              default is xx00 ... xxn. If the prefix argument would create a
              filename exceeding {NAME_MAX} bytes, an error shall result,
              csplit shall exit with a diagnostic message, and no files shall
              be created.

       -k     Leave previously created files intact. By default, csplit shall
              remove created files if an error occurs.

       -n number
              Use number decimal digits to form filenames for the file pieces.
              The default shall be 2.

       -s     Suppress the output of file size messages.

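       As an illustration of how -f, -n, and -s combine, the following
       sketch splits a six-line file into three pieces. The file name
       sample.txt, the prefix part, and the scratch directory are
       assumptions made for the example, not part of the standard.

```shell
# Work in a scratch directory so the pieces are easy to inspect and remove.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample input: six numbered lines.
printf 'line %d\n' 1 2 3 4 5 6 > sample.txt

# -f part : name the pieces part000, part001, ... instead of xx00, xx01, ...
# -n 3    : use three decimal digits in each name
# -s      : do not print the size of each piece
csplit -s -f part -n 3 sample.txt 3 5

# Lists part000 (lines 1-2), part001 (lines 3-4), and part002 (lines 5-6).
ls part*
```

       With the default settings (no -f, no -n) the same split would produce
       xx00, xx01, and xx02.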
OPERANDS
       The following operands shall be supported:

       file   The pathname of a text file to be split. If file is '-', the
              standard input shall be used.

       The operands arg1 ... argn can be a combination of the following:

       /rexp/[offset]
              A file shall be created using the content of the lines from the
              current line up to, but not including, the line that results
              from the evaluation of the regular expression with offset, if
              any, applied. The regular expression rexp shall follow the rules
              for basic regular expressions described in the Base Definitions
              volume of IEEE Std 1003.1-2001, Section 9.3, Basic Regular
              Expressions. The application shall use the sequence "\/" to
              specify a slash character within the rexp. The optional offset
              shall be a positive or negative integer value representing a
              number of lines. A positive integer value can be preceded by
              '+'. If the selection of lines from an offset expression of this
              type would create a file with zero lines, or one with greater
              than the number of lines left in the input file, the results are
              unspecified. After the section is created, the current line
              shall be set to the line that results from the evaluation of the
              regular expression with any offset applied. If the current line
              is the first line in the file and a regular expression operation
              has not yet been performed, the pattern match of rexp shall be
              applied from the current line to the end of the file. Otherwise,
              the pattern match of rexp shall be applied from the line
              following the current line to the end of the file.

       %rexp%[offset]
              Equivalent to /rexp/[offset], except that no file shall be
              created for the selected section of the input file. The
              application shall use the sequence "\%" to specify a
              percent-sign character within the rexp.

       line_no
              Create a file from the current line up to (but not including)
              the line number line_no. Lines in the file shall be numbered
              starting at one. The current line becomes line_no.

       {num}  Repeat operand. This operand can follow any of the operands
              described previously. If it follows a rexp type operand, that
              operand shall be applied num more times. If it follows a line_no
              operand, the file shall be split every line_no lines, num times,
              from that point.

       An error shall be reported if an operand does not reference a line
       between the current position and the end of the file.

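       The regular-expression operands are easiest to follow on a small
       input. The sketch below is only an illustration: the file data.txt,
       its contents, and the prefixes sec and off are hypothetical. It shows
       a skipped leading section (%rexp%), a repeated /rexp/ operand, and a
       positive offset.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical input: an intro line followed by three "# "-headed sections.
printf '%s\n' 'intro' '# alpha' 'a1' 'a2' '# beta' 'b1' '# gamma' 'g1' > data.txt

# %^# alpha%  skip everything before the "# alpha" line (no file is written)
# /^# /       write lines up to, but not including, the next "# " heading
# {1}         apply the preceding /^# / operand one more time
# Whatever follows the last split point goes to a final piece.
csplit -s -f sec data.txt '%^# alpha%' '/^# /' '{1}'
# sec00: "# alpha", a1, a2    sec01: "# beta", b1    sec02: "# gamma", g1

# An offset moves the split point relative to the matching line:
# +1 ends the first piece one line after the "# beta" heading.
csplit -s -f off data.txt '/^# beta/+1'
# off00: lines 1-5 (through "# beta")    off01: lines 6-8

head sec00 sec01 sec02 off00 off01
```

       Note that the second and third matches of /^# / start searching on the
       line after the current line, which is why the heading that begins a
       piece is not matched again.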
STDIN
       See the INPUT FILES section.

INPUT FILES
       The input file shall be a text file.

ENVIRONMENT VARIABLES
       The following environment variables shall affect the execution of
       csplit:

       LANG   Provide a default value for the internationalization variables
              that are unset or null. (See the Base Definitions volume of
              IEEE Std 1003.1-2001, Section 8.2, Internationalization
              Variables for the precedence of internationalization variables
              used to determine the values of locale categories.)

       LC_ALL If set to a non-empty string value, override the values of all
              the other internationalization variables.

       LC_COLLATE
              Determine the locale for the behavior of ranges, equivalence
              classes, and multi-character collating elements within regular
              expressions.

       LC_CTYPE
              Determine the locale for the interpretation of sequences of
              bytes of text data as characters (for example, single-byte as
              opposed to multi-byte characters in arguments and input files)
              and the behavior of character classes within regular
              expressions.

       LC_MESSAGES
              Determine the locale that should be used to affect the format
              and contents of diagnostic messages written to standard error.

       NLSPATH
              Determine the location of message catalogs for the processing of
              LC_MESSAGES.

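       Because LC_COLLATE and LC_CTYPE influence how bracket expressions in
       rexp are matched, a script that needs predictable matching may choose
       to pin the locale for a single invocation. The following is a minimal
       sketch under that assumption; words.txt and its contents are
       hypothetical.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'Alpha' 'Beta' 'gamma' 'delta' > words.txt

# In the POSIX (C) locale, [a-z] covers only the lowercase ASCII letters,
# so the first matching line is "gamma".
LC_ALL=C csplit -s words.txt '/^[a-z]/'

cat xx00   # Alpha and Beta
cat xx01   # gamma and delta
```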
ASYNCHRONOUS EVENTS
       If the -k option is specified, created files shall be retained.
       Otherwise, the default action occurs.

STDOUT
       Unless the -s option is used, the standard output shall consist of one
       line per file created, with a format as follows:

              "%d\n", <file size in bytes>

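       To make the output format concrete, the sketch below captures the size
       lines for a tiny input. The file in.txt and the variable names sizes
       and total are assumptions made for the example.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical input: three lines of 3, 4, and 5 bytes (newline included).
printf 'aa\nbbb\ncccc\n' > in.txt

# Without -s, csplit writes one size per created file to standard output.
sizes=$(csplit in.txt 2 3)
printf '%s\n' "$sizes"          # 3, 4, and 5, one per line

# When no section is skipped, the sizes add up to the size of the input.
total=$(printf '%s\n' "$sizes" | awk '{ sum += $1 } END { print sum }')
echo "total bytes: $total"      # total bytes: 12
```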
STDERR
       The standard error shall be used only for diagnostic messages.

OUTPUT FILES
       The output files shall contain portions of the original input file;
       otherwise, unchanged.

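       One way to check that the pieces are unmodified portions of the input
       is to concatenate them and compare the result with the original. This
       is a sketch only; in.txt, its contents, and the variable roundtrip are
       hypothetical.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'header' 'BEGIN' 'body' 'END' 'trailer' > in.txt

# Pieces: xx00 (header), xx01 (BEGIN, body), xx02 (END, trailer).
csplit -s in.txt '/^BEGIN$/' '/^END$/'

# Concatenating the pieces in name order reproduces the input byte for byte.
if cat xx00 xx01 xx02 | cmp -s - in.txt; then
    roundtrip=identical
else
    roundtrip=different
fi
echo "round trip: $roundtrip"
```

       The comparison only holds when every section is written to a file; a
       %rexp% operand deliberately leaves part of the input out.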
EXTENDED DESCRIPTION
       None.

EXIT STATUS
       The following exit values shall be returned:

        0     Successful completion.

       >0     An error occurred.

CONSEQUENCES OF ERRORS
       By default, created files shall be removed if an error occurs. When the
       -k option is specified, created files shall not be removed if an error
       occurs.

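       The difference between the default cleanup and -k can be observed by
       forcing an error with a pattern that never matches. The sketch below
       is illustrative; the input file and the variable names are
       assumptions. Only the first piece is checked, because what an
       implementation does with the unfinished piece is not shown here.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'one' 'two' 'three' > in.txt

# The second operand never matches, so csplit fails after writing xx00.
status_default=0
csplit -s in.txt 2 '/no such text/' 2>/dev/null || status_default=$?
if [ -e xx00 ]; then default_result=kept; else default_result=removed; fi

# Same operands with -k: the piece written before the error survives.
status_keep=0
csplit -s -k in.txt 2 '/no such text/' 2>/dev/null || status_keep=$?
if [ -e xx00 ]; then keep_result=kept; else keep_result=removed; fi

echo "default: exit status $status_default, xx00 $default_result"
echo "with -k: exit status $status_keep, xx00 $keep_result"
```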
       The following sections are informative.

APPLICATION USAGE
       None.

EXAMPLES
        1. This example creates four files, cobol00 ... cobol03:

                  csplit -f cobol file '/procedure division/' /par5./ /par16./

           After editing the split files, they can be recombined as follows:

                  cat cobol0[0-3] > file

           Note that this example overwrites the original file.

        2. This example would split the file after the first 99 lines, and
           every 100 lines thereafter, up to 9999 lines; this is because lines
           in the file are numbered from 1 rather than zero, for historical
           reasons:

                  csplit -k file 100 {99}

        3. Assuming that prog.c follows the C-language coding convention of
           ending routines with a '}' at the beginning of the line, this
           example creates a file containing each separate C routine (up to
           21) in prog.c:

                  csplit -k prog.c '%main(%' '/^}/+1' {20}

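       The line-number arithmetic in example 2 can be tried on a smaller
       scale. In this sketch, nums.txt is a hypothetical ten-line file and
       the operands 4 {1} stand in for 100 {99}: the first piece is one line
       shorter than the later ones because the split happens before line 4.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 > nums.txt

# "4 {1}" splits before line 4, then once more 4 lines further on (line 8).
csplit -s -k nums.txt 4 '{1}'

# xx00 holds lines 1-3, xx01 holds lines 4-7, xx02 holds lines 8-10.
wc -l xx00 xx01 xx02
```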
RATIONALE
       The -n option was added to extend the range of filenames that could be
       handled.

       Consideration was given to adding a -a flag to use the alphabetic
       filename generation used by the historical split utility, but the
       functionality added by the -n option was deemed to make alphabetic
       naming unnecessary.

FUTURE DIRECTIONS
       None.

SEE ALSO
       sed, split

COPYRIGHT
       Portions of this text are reprinted and reproduced in electronic form
       from IEEE Std 1003.1, 2003 Edition, Standard for Information Technology
       -- Portable Operating System Interface (POSIX), The Open Group Base
       Specifications Issue 6, Copyright (C) 2001-2003 by the Institute of
       Electrical and Electronics Engineers, Inc and The Open Group. In the
       event of any discrepancy between this version and the original IEEE and
       The Open Group Standard, the original IEEE and The Open Group Standard
       is the referee document. The original Standard can be obtained online
       at http://www.opengroup.org/unix/online.html .

IEEE/The Open Group                  2003                            CSPLIT(P)