CSPLIT(1P)                 POSIX Programmer's Manual                CSPLIT(1P)

PROLOG

       This manual page is part of the POSIX Programmer's Manual. The Linux
       implementation of this interface may differ (consult the corresponding
       Linux manual page for details of Linux behavior), or the interface may
       not be implemented on Linux.

NAME

       csplit - split files based on context

SYNOPSIS

       csplit [-ks] [-f prefix] [-n number] file arg1 ... argn

DESCRIPTION

       The csplit utility shall read the file named by the file operand, write
       all or part of that file into other files as directed by the arg
       operands, and write the sizes of the created files.

OPTIONS

       The csplit utility shall conform to the Base Definitions volume of
       IEEE Std 1003.1-2001, Section 12.2, Utility Syntax Guidelines.

       The following options shall be supported:

       -f prefix
              Name the created files prefix00, prefix01, ..., prefixn. The
              default is xx00 ... xxn. If the prefix argument would create a
              filename exceeding {NAME_MAX} bytes, an error shall result,
              csplit shall exit with a diagnostic message, and no files shall
              be created.

       -k     Leave previously created files intact. By default, csplit shall
              remove created files if an error occurs.

       -n number
              Use number decimal digits to form filenames for the file pieces.
              The default shall be 2.

       -s     Suppress the output of file size messages.
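
       The naming options can be tried on a throwaway file. The following is
       a minimal sketch and not part of the standard text; the file name
       input.txt, the prefix part, and the use of mktemp (which POSIX does
       not specify) are assumptions made for illustration.

```shell
# Assumption: mktemp -d is available (it is not specified by POSIX).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical six-line input file.
printf '%s\n' one two three four five six > input.txt

# Split before line 4. -f sets the prefix, -n asks for three digits,
# and -s suppresses the size report on standard output.
csplit -s -f part -n 3 input.txt 4

ls part*
```

       With these options the two pieces are named part000 and part001
       instead of the default xx00 and xx01.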

OPERANDS

       The following operands shall be supported:

       file   The pathname of a text file to be split. If file is '-', the
              standard input shall be used.
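
       As a small illustration of the '-' form of the file operand, the
       sketch below feeds four lines through a pipe. The prefix piped and
       the use of mktemp are assumptions chosen for the example.

```shell
# Assumption: mktemp -d is available (it is not specified by POSIX).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# The '-' operand makes csplit read the text to split from standard input.
printf '%s\n' a b c d | csplit -s -f piped - 3

cat piped00   # the lines before line 3
cat piped01   # line 3 through the end of the input
```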
       The operands arg1 ... argn can be a combination of the following:

       /rexp/[offset]
              A file shall be created using the content of the lines from the
              current line up to, but not including, the line that results
              from the evaluation of the regular expression with offset, if
              any, applied. The regular expression rexp shall follow the rules
              for basic regular expressions described in the Base Definitions
              volume of IEEE Std 1003.1-2001, Section 9.3, Basic Regular
              Expressions. The application shall use the sequence "\/" to
              specify a slash character within the rexp. The optional offset
              shall be a positive or negative integer value representing a
              number of lines. A positive integer value can be preceded by
              '+'. If the selection of lines from an offset expression of this
              type would create a file with zero lines, or one with more lines
              than are left in the input file, the results are unspecified.
              After the section is created, the current line shall be set to
              the line that results from the evaluation of the regular
              expression with any offset applied. If the current line is the
              first line in the file and a regular expression operation has
              not yet been performed, the pattern match of rexp shall be
              applied from the current line to the end of the file. Otherwise,
              the pattern match of rexp shall be applied from the line
              following the current line to the end of the file.

       %rexp%[offset]
              Equivalent to /rexp/[offset], except that no file shall be
              created for the selected section of the input file. The
              application shall use the sequence "\%" to specify a
              percent-sign character within the rexp.

       line_no
              Create a file from the current line up to (but not including)
              the line number line_no. Lines in the file shall be numbered
              starting at one. The current line becomes line_no.

       {num}  Repeat operand. This operand can follow any of the operands
              described previously. If it follows a rexp type operand, that
              operand shall be applied num more times. If it follows a line_no
              operand, the file shall be split every line_no lines, num times,
              from that point.

       An error shall be reported if an operand does not reference a line
       between the current position and the end of the file.
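
       The operand forms described above can be compared on one small input.
       This is a sketch under stated assumptions: the file notes.txt, its
       contents, and the prefixes re, skip, and num are hypothetical, and
       mktemp is not specified by POSIX. The comments give the line ranges
       that the rules above imply for each piece.

```shell
# Assumption: mktemp -d is available (it is not specified by POSIX).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical seven-line input with three marked sections.
printf '%s\n' header 'BEGIN a' a1 'BEGIN b' b1 'BEGIN c' c1 > notes.txt

# /rexp/ with a repeat: split before the first two lines matching ^BEGIN.
# re00 = line 1, re01 = lines 2-3, re02 = lines 4-7 (the remainder).
csplit -s -f re notes.txt '/^BEGIN/' '{1}'

# %rexp%: skip to the first match without creating a file for line 1,
# then split before the next match. skip00 = lines 2-3, skip01 = lines 4-7.
csplit -s -f skip notes.txt '%^BEGIN%' '/^BEGIN/'

# line_no with a repeat: split before line 3, then before line 6.
# num00 = lines 1-2, num01 = lines 3-5, num02 = lines 6-7.
csplit -s -f num notes.txt 3 '{1}'

wc -l re0? skip0? num0?
```

       The repeat operand is quoted here only to keep the shell from treating
       the braces specially; csplit receives it unchanged.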

STDIN

       See the INPUT FILES section.

INPUT FILES

       The input file shall be a text file.

ENVIRONMENT VARIABLES

       The following environment variables shall affect the execution of
       csplit:

       LANG   Provide a default value for the internationalization variables
              that are unset or null. (See the Base Definitions volume of
              IEEE Std 1003.1-2001, Section 8.2, Internationalization
              Variables for the precedence of internationalization variables
              used to determine the values of locale categories.)

       LC_ALL If set to a non-empty string value, override the values of all
              the other internationalization variables.

       LC_COLLATE
              Determine the locale for the behavior of ranges, equivalence
              classes, and multi-character collating elements within regular
              expressions.

       LC_CTYPE
              Determine the locale for the interpretation of sequences of
              bytes of text data as characters (for example, single-byte as
              opposed to multi-byte characters in arguments and input files)
              and the behavior of character classes within regular
              expressions.

       LC_MESSAGES
              Determine the locale that should be used to affect the format
              and contents of diagnostic messages written to standard error.

       NLSPATH
              Determine the location of message catalogs for the processing of
              LC_MESSAGES.

ASYNCHRONOUS EVENTS

       If the -k option is specified, created files shall be retained.
       Otherwise, the default action occurs.

STDOUT

       Unless the -s option is used, the standard output shall consist of one
       line per file created, with a format as follows:

              "%d\n", <file size in bytes>
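
       The size report can be checked by hand on a small input. In this
       sketch the file words.txt and its contents are assumptions, as is the
       availability of mktemp; the byte counts follow from the line lengths.

```shell
# Assumption: mktemp -d is available (it is not specified by POSIX).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' alpha beta gamma delta > words.txt

# Without -s, csplit prints the byte count of each created file.
# xx00 holds "alpha\nbeta\n" (11 bytes), xx01 "gamma\ndelta\n" (12 bytes).
sizes=$(csplit words.txt 3)
printf '%s\n' "$sizes"
```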

STDERR

       The standard error shall be used only for diagnostic messages.

OUTPUT FILES

       The output files shall contain portions of the original input file,
       otherwise unchanged.

EXTENDED DESCRIPTION

       None.

EXIT STATUS

       The following exit values shall be returned:

        0     Successful completion.

       >0     An error occurred.

CONSEQUENCES OF ERRORS

       By default, created files shall be removed if an error occurs. When the
       -k option is specified, created files shall not be removed if an error
       occurs.
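
       The difference between the default cleanup and -k can be observed by
       forcing an error with a line number past the end of the input. This
       is an illustrative sketch; the file five.txt, the prefixes gone and
       kept, and the use of mktemp are assumptions.

```shell
# Assumption: mktemp -d is available (it is not specified by POSIX).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' a b c d e > five.txt

# Line 9 does not exist, so the second operand is an error.
status=0
csplit -s -f gone five.txt 3 9 2>/dev/null || status=$?
echo "exit status without -k: $status"
ls gone* 2>/dev/null || echo "no gone* files remain"

# With -k the pieces created before the error are kept.
csplit -s -k -f kept five.txt 3 9 2>/dev/null || true
cat kept00
```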
       The following sections are informative.

APPLICATION USAGE

       None.

EXAMPLES

        1. This example creates four files, cobol00 ... cobol03:

           csplit -f cobol file '/procedure division/' /par5./ /par16./

           After editing the split files, they can be recombined as follows:

           cat cobol0[0-3] > file

           Note that this example overwrites the original file.

        2. This example would split the file after the first 99 lines, and
           every 100 lines thereafter, up to 9999 lines; the first piece holds
           99 lines rather than 100 because lines in the file are numbered
           from 1 rather than zero, for historical reasons:

           csplit -k file 100 {99}

        3. Assuming that prog.c follows the C-language coding convention of
           ending routines with a '}' at the beginning of the line, this
           example creates a file containing each separate C routine (up to
           21) in prog.c:

           csplit -k prog.c '%main(%' '/^}/+1' {20}
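
       The third example can be tried on a tiny source file. The sketch
       below is an illustration only: the contents of prog.c, the prefix
       routine, and the use of mktemp are assumptions. Because the file has
       far fewer than 21 routines, csplit is expected to report a failed
       match once the routines run out; -k keeps the pieces written before
       that point, and any further files left behind may vary by
       implementation.

```shell
# Assumption: mktemp -d is available (it is not specified by POSIX).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical source file with two routines.
cat > prog.c <<'EOF'
#include <stdio.h>
int main(void)
{
    return 0;
}
int helper(void)
{
    return 1;
}
/* end of file */
EOF

# Skip everything before main(, then cut after each line starting with '}'.
# The diagnostic for the eventual failed match is discarded here.
csplit -s -k -f routine prog.c '%main(%' '/^}/+1' '{20}' 2>/dev/null || true

cat routine00   # int main(void) through its closing brace
cat routine01   # int helper(void) through its closing brace
```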

RATIONALE

       The -n option was added to extend the range of filenames that could be
       handled.

       Consideration was given to adding a -a flag to use the alphabetic
       filename generation used by the historical split utility, but the
       functionality added by the -n option was deemed to make alphabetic
       naming unnecessary.

FUTURE DIRECTIONS

       None.

SEE ALSO

       sed, split

COPYRIGHT

       Portions of this text are reprinted and reproduced in electronic form
       from IEEE Std 1003.1, 2003 Edition, Standard for Information Technology
       -- Portable Operating System Interface (POSIX), The Open Group Base
       Specifications Issue 6, Copyright (C) 2001-2003 by the Institute of
       Electrical and Electronics Engineers, Inc and The Open Group. In the
       event of any discrepancy between this version and the original IEEE and
       The Open Group Standard, the original IEEE and The Open Group Standard
       is the referee document. The original Standard can be obtained online
       at http://www.opengroup.org/unix/online.html .

IEEE/The Open Group                  2003                           CSPLIT(1P)