CSPLIT(1P)                POSIX Programmer's Manual               CSPLIT(1P)

PROLOG
       This manual page is part of the POSIX Programmer's Manual. The Linux
       implementation of this interface may differ (consult the corresponding
       Linux manual page for details of Linux behavior), or the interface may
       not be implemented on Linux.

NAME
       csplit - split files based on context

SYNOPSIS
       csplit [-ks][-f prefix][-n number] file arg1 ...argn

DESCRIPTION
       The csplit utility shall read the file named by the file operand,
       write all or part of that file into other files as directed by the arg
       operands, and write the size of each file created.

OPTIONS
       The csplit utility shall conform to the Base Definitions volume of
       IEEE Std 1003.1-2001, Section 12.2, Utility Syntax Guidelines.

       The following options shall be supported:

       -f prefix
              Name the created files prefix00, prefix01, ..., prefixn. The
              default is xx00 ... xxn. If the prefix argument would create a
              filename exceeding {NAME_MAX} bytes, an error shall result,
              csplit shall exit with a diagnostic message, and no files
              shall be created.

       -k     Leave previously created files intact. By default, csplit
              shall remove created files if an error occurs.

       -n number
              Use number decimal digits to form filenames for the file
              pieces. The default shall be 2.

       -s     Suppress the output of file size messages.
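The options above can be combined as in the following sketch. The file name input.txt, the prefix part-, and the use of mktemp to obtain a scratch directory are assumptions made for the illustration; they are not part of this specification.

```shell
# Work in a scratch directory (mktemp -d is assumed to be available).
workdir=$(mktemp -d) && cd "$workdir" || exit 1
printf '%s\n' one two three four five six > input.txt

# -s suppresses the size report, -f sets the prefix, -n sets the digit count.
# The operand 4 ends the first piece just before line 4.
csplit -s -f part- -n 3 input.txt 4

pieces=$(ls part-* | tr '\n' ' ')
first_piece_lines=$(wc -l < part-000 | tr -d ' ')
echo "created: $pieces"
echo "lines in first piece: $first_piece_lines"

cd / && rm -rf "$workdir"
```

With three digits requested, the pieces are numbered 000, 001, and so on, rather than the default 00, 01.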

OPERANDS
       The following operands shall be supported:

       file   The pathname of a text file to be split. If file is '-', the
              standard input shall be used.
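As a brief illustration of the '-' operand, the sketch below feeds four lines to csplit through a pipe. The input data and the scratch directory obtained from mktemp are assumptions chosen for the example.

```shell
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# '-' as the file operand makes csplit read standard input.
# The operand 3 ends the first piece just before line 3.
printf '%s\n' a b c d | csplit -s - 3

head_piece=$(cat xx00)
tail_piece=$(cat xx01)
printf 'xx00:\n%s\nxx01:\n%s\n' "$head_piece" "$tail_piece"

cd / && rm -rf "$workdir"
```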

       The operands arg1 ... argn can be a combination of the following:

       /rexp/[offset]
              A file shall be created using the content of the lines from
              the current line up to, but not including, the line that
              results from the evaluation of the regular expression with
              offset, if any, applied. The regular expression rexp shall
              follow the rules for basic regular expressions described in
              the Base Definitions volume of IEEE Std 1003.1-2001, Section
              9.3, Basic Regular Expressions. The application shall use the
              sequence "\/" to specify a slash character within the rexp.
              The optional offset shall be a positive or negative integer
              value representing a number of lines. A positive integer value
              can be preceded by '+'. If the selection of lines from an
              offset expression of this type would create a file with zero
              lines, or one with more lines than are left in the input file,
              the results are unspecified. After the section is created, the
              current line shall be set to the line that results from the
              evaluation of the regular expression with any offset applied.
              If the current line is the first line in the file and a
              regular expression operation has not yet been performed, the
              pattern match of rexp shall be applied from the current line
              to the end of the file. Otherwise, the pattern match of rexp
              shall be applied from the line following the current line to
              the end of the file.

       %rexp%[offset]
              Equivalent to /rexp/[offset], except that no file shall be
              created for the selected section of the input file. The
              application shall use the sequence "\%" to specify a
              percent-sign character within the rexp.

       line_no
              Create a file from the current line up to (but not including)
              the line number line_no. Lines in the file shall be numbered
              starting at one. The current line becomes line_no.

       {num}  Repeat operand. This operand can follow any of the operands
              described previously. If it follows a rexp type operand, that
              operand shall be applied num more times. If it follows a
              line_no operand, the file shall be split every line_no lines,
              num times, from that point.
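The operand forms described above can be sketched as follows. The input text, the prefixes sec and off, and the scratch directory obtained from mktemp are assumptions made for the illustration.

```shell
workdir=$(mktemp -d) && cd "$workdir" || exit 1
printf '%s\n' 'header' '# section one' 'alpha' 'beta' \
    '# section two' 'gamma' '# section three' 'delta' > input.txt

# %rexp% skips the preamble without writing it; /rexp/ then cuts at the
# next heading, and {1} applies that /rexp/ operand one more time.
csplit -s -f sec input.txt '%^# section one%' '/^# section/' '{1}'
section_count=$(ls sec* | wc -l | tr -d ' ')
first_heading=$(head -n 1 sec00)
last_heading=$(head -n 1 sec02)

# An offset moves the cut: +1 ends the first piece one line after the match.
csplit -s -f off input.txt '/^beta$/+1'
offset_piece_lines=$(wc -l < off00 | tr -d ' ')

echo "sections: $section_count ($first_heading ... $last_heading)"
echo "lines in off00: $offset_piece_lines"

cd / && rm -rf "$workdir"
```

In the first command each piece starts at a heading line, because the matched line is excluded from the piece that precedes it. The final piece holds whatever input remains after the last operand.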

       An error shall be reported if an operand does not reference a line
       between the current position and the end of the file.

STDIN
       See the INPUT FILES section.

INPUT FILES
       The input file shall be a text file.

ENVIRONMENT VARIABLES
       The following environment variables shall affect the execution of
       csplit:

       LANG   Provide a default value for the internationalization variables
              that are unset or null. (See the Base Definitions volume of
              IEEE Std 1003.1-2001, Section 8.2, Internationalization
              Variables for the precedence of internationalization variables
              used to determine the values of locale categories.)

       LC_ALL If set to a non-empty string value, override the values of all
              the other internationalization variables.

       LC_COLLATE
              Determine the locale for the behavior of ranges, equivalence
              classes, and multi-character collating elements within regular
              expressions.

       LC_CTYPE
              Determine the locale for the interpretation of sequences of
              bytes of text data as characters (for example, single-byte as
              opposed to multi-byte characters in arguments and input files)
              and the behavior of character classes within regular
              expressions.

       LC_MESSAGES
              Determine the locale that should be used to affect the format
              and contents of diagnostic messages written to standard error.

       NLSPATH
              Determine the location of message catalogs for the processing
              of LC_MESSAGES.
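Because LC_ALL overrides the other locale variables, a script that relies on a range expression such as [a-z] can pin the locale for a single invocation. The following is a minimal sketch under that assumption; the input data and the scratch directory obtained from mktemp are chosen only for the illustration.

```shell
workdir=$(mktemp -d) && cd "$workdir" || exit 1
printf '%s\n' '123' '456' 'abc' 'def' > input.txt

# Setting LC_ALL=C for this one command selects the POSIX locale, so the
# range [a-z] is evaluated with that locale's collation rules.
LC_ALL=C csplit -s input.txt '/^[a-z]/'

first_alpha_line=$(head -n 1 xx01)
echo "second piece starts with: $first_alpha_line"

cd / && rm -rf "$workdir"
```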

ASYNCHRONOUS EVENTS
       If the -k option is specified, created files shall be retained.
       Otherwise, the default action occurs.

STDOUT
       Unless the -s option is used, the standard output shall consist of
       one line per file created, with a format as follows:

              "%d\n", <file size in bytes>
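The size report can be observed with a small input whose byte counts are easy to check by hand. In this sketch, input.txt and the scratch directory obtained from mktemp are assumptions made for the illustration.

```shell
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Four lines of two characters plus a newline: 3 bytes per line.
printf '%s\n' aa bb cc dd > input.txt

# Without -s, csplit writes one size per created file to standard output.
# The first piece holds line 1 (3 bytes), the second lines 2-4 (9 bytes).
sizes=$(csplit input.txt 2)
echo "$sizes"

cd / && rm -rf "$workdir"
```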

STDERR
       The standard error shall be used only for diagnostic messages.

OUTPUT FILES
       The output files shall contain portions of the original input file,
       otherwise unchanged.
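One way to check that the pieces are unmodified portions of the input is to concatenate them in order and compare the result with the original. The sketch below assumes hypothetical file names (input.txt, rejoined.txt) and a scratch directory obtained from mktemp.

```shell
workdir=$(mktemp -d) && cd "$workdir" || exit 1
printf '%s\n' 'intro' 'BEGIN' 'body' 'END' 'outro' > input.txt

# Two regular expression operands yield three pieces: xx00, xx01, xx02.
csplit -s input.txt '/^BEGIN$/' '/^END$/'

# Rejoin the pieces in order and compare byte for byte with the input.
cat xx00 xx01 xx02 > rejoined.txt
if cmp -s input.txt rejoined.txt; then
    roundtrip=identical
else
    roundtrip=different
fi
echo "$roundtrip"

cd / && rm -rf "$workdir"
```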

EXTENDED DESCRIPTION
       None.

EXIT STATUS
       The following exit values shall be returned:

        0     Successful completion.

       >0     An error occurred.

CONSEQUENCES OF ERRORS
       By default, created files shall be removed if an error occurs. When
       the -k option is specified, created files shall not be removed if an
       error occurs.
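The difference between the default behavior and -k can be sketched by asking for a line number that lies beyond the end of the input. The prefixes drop and keep, the file input.txt, and the scratch directory obtained from mktemp are assumptions made for the illustration.

```shell
workdir=$(mktemp -d) && cd "$workdir" || exit 1
printf '%s\n' one two three four five > input.txt

# Line 50 does not exist, so csplit reports an error. By default the
# pieces it had already created are removed again.
if csplit -s -f drop input.txt 3 50 2>/dev/null; then
    default_status=0
else
    default_status=$?
fi
default_left=$(ls | grep -c '^drop')

# With -k, pieces created before the error are left in place.
if csplit -s -k -f keep input.txt 3 50 2>/dev/null; then
    keep_status=0
else
    keep_status=$?
fi
kept=$(ls | grep -c '^keep')

echo "default: status $default_status, files left $default_left"
echo "with -k: status $keep_status, files left $kept"

cd / && rm -rf "$workdir"
```

Both invocations are expected to exit with a status greater than zero; only the set of files left behind differs.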

       The following sections are informative.

APPLICATION USAGE
       None.

EXAMPLES
        1. This example creates four files, cobol00 ... cobol03:

                  csplit -f cobol file '/procedure division/' /par5./ /par16./

           After editing the split files, they can be recombined as follows:

                  cat cobol0[0-3] > file

           Note that this example overwrites the original file.

        2. This example would split the file after the first 99 lines, and
           every 100 lines thereafter, up to 9999 lines; this is because
           lines in the file are numbered from 1 rather than zero, for
           historical reasons:

                  csplit -k file 100 {99}

        3. Assuming that prog.c follows the C-language coding convention of
           ending routines with a '}' at the beginning of the line, this
           example creates a file containing each separate C routine (up to
           21) in prog.c:

                  csplit -k prog.c '%main(%' '/^}/+1' {20}
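The line arithmetic of example 2 can be checked at a smaller scale. The sketch below uses 10 {2} on a 35-line file in place of 100 {99}; the file input.txt and the scratch directory obtained from mktemp are assumptions made for the illustration.

```shell
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Generate 35 numbered lines as a stand-in for the larger file.
i=1
while [ "$i" -le 35 ]; do
    echo "line $i"
    i=$((i + 1))
done > input.txt

# Cut before line 10, then every 10 lines, two more times (before 20, 30).
csplit -s input.txt 10 '{2}'

# Collect the line count of each piece, in order.
piece_sizes=
for piece in xx00 xx01 xx02 xx03; do
    piece_sizes="$piece_sizes$(wc -l < "$piece" | tr -d ' ') "
done
echo "$piece_sizes"

cd / && rm -rf "$workdir"
```

The first piece is one line short of the interval because the cut falls before line 10, which mirrors the 99-line first file in example 2. The last piece holds the lines left over after the final cut.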

RATIONALE
       The -n option was added to extend the range of filenames that could
       be handled.

       Consideration was given to adding a -a flag to use the alphabetic
       filename generation used by the historical split utility, but the
       functionality added by the -n option was deemed to make alphabetic
       naming unnecessary.

FUTURE DIRECTIONS
       None.

SEE ALSO
       sed, split

COPYRIGHT
       Portions of this text are reprinted and reproduced in electronic form
       from IEEE Std 1003.1, 2003 Edition, Standard for Information
       Technology -- Portable Operating System Interface (POSIX), The Open
       Group Base Specifications Issue 6, Copyright (C) 2001-2003 by the
       Institute of Electrical and Electronics Engineers, Inc and The Open
       Group. In the event of any discrepancy between this version and the
       original IEEE and The Open Group Standard, the original IEEE and The
       Open Group Standard is the referee document. The original Standard
       can be obtained online at http://www.opengroup.org/unix/online.html.

IEEE/The Open Group                  2003                         CSPLIT(1P)