S2P(1) Perl Programmers Reference Guide S2P(1)

NAME
psed - a stream editor

SYNOPSIS
psed [-an] script [file ...]
psed [-an] [-e script] [-f script-file] [file ...]

s2p [-an] [-e script] [-f script-file]

DESCRIPTION
A stream editor reads the input stream consisting of the specified
files (or standard input, if none are given), processes it line by line
by applying a script consisting of edit commands, and writes the
resulting lines to standard output. The filename "-" may be used to
read standard input.

The edit script is composed of the arguments of -e options and the
contents of script-files, in the given order. A single script argument
may be specified as the first parameter.
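As a minimal sketch of how the script is assembled, the following uses the system sed as a stand-in, assuming psed treats these options the same way. SED is a hypothetical shell variable introduced only for this illustration; set SED=psed to try psed itself.

```shell
# SED is a hypothetical variable for this sketch; it defaults to the system sed.
SED=${SED:-sed}

# Two -e fragments are appended in the given order, so the second
# command operates on the result of the first.
combined=$(printf 'hello world\n' | "$SED" -e 's/hello/goodbye/' -e 's/world/moon/')
printf '%s\n' "$combined"    # goodbye moon

# A single script may instead be given as the first parameter, without -e.
single=$(printf 'hello world\n' | "$SED" 's/world/moon/')
printf '%s\n' "$single"      # hello moon
```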
24
If this program is invoked with the name s2p, it will act as a sed-to-
Perl translator. See "sed Script Translation".

sed returns an exit code of 0 on success or >0 if an error occurred.

OPTIONS
-a  A file specified as argument to the w edit command is by default
    opened before input processing starts. Using -a, opening of such
    files is delayed until the first line is actually written to the
    file.

-e script
    The editing commands defined by script are appended to the script.
    Multiple commands must be separated by newlines.

-f script-file
    Editing commands from the specified script-file are read and
    appended to the script.

-n  By default, a line is written to standard output after the editing
    script has been applied to it. The -n option suppresses automatic
    printing.
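The effect of -n is easiest to see together with the p function. This is only a sketch using the system sed as a stand-in (SED is a hypothetical variable, not part of psed), assuming psed prints the same lines:

```shell
# SED is a hypothetical variable for this sketch; it defaults to the system sed.
SED=${SED:-sed}

# Without -n, line 2 is printed by p and again by the automatic print.
doubled=$(printf 'one\ntwo\n' | "$SED" '2p')
printf '%s\n' "$doubled"     # one, two, two (one per line)

# With -n, only explicitly printed lines appear.
only=$(printf 'one\ntwo\n' | "$SED" -n '2p')
printf '%s\n' "$only"        # two
```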
47
COMMANDS
sed command syntax is defined as

    [address[,address]][!]function[argument]

with whitespace being permitted before or after addresses, and between
the function character and the argument. The addresses and the address
inverter ("!") are used to restrict the application of a command to the
selected line(s) of input.

Each command must be on a line of its own, except where noted in the
synopses below.

The edit cycle performed on each input line consists of reading the
line (without its trailing newline character) into the pattern space,
applying the applicable commands of the edit script, and writing the
final contents of the pattern space and a newline to the standard
output. A hold space is provided for saving the contents of the pattern
space for later use.
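To illustrate how the pattern space and the hold space interact across cycles, here is a small sketch that prints its input in reverse line order. It uses the system sed as a stand-in (SED is a hypothetical variable), assuming psed follows the same edit cycle; each command is on its own line, as required above.

```shell
# SED is a hypothetical variable for this sketch; it defaults to the system sed.
SED=${SED:-sed}

# Cycle by cycle:
#   1!G  on every line but the first, append the hold space to the pattern space
#   h    save the accumulated (reversed) text back into the hold space
#   $p   on the last line, print the pattern space (-n suppresses autoprint)
reversed=$(printf 'a\nb\nc\n' | "$SED" -n '1!G
h
$p')
printf '%s\n' "$reversed"    # c, b, a (one per line)
```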
67
Addresses

A sed address is either a line number or a pattern, which may be
combined arbitrarily to construct ranges. Lines are numbered across all
input files.

Any address may be followed by an exclamation mark ("!"), selecting
all lines not matching that address.

number
    The line with the given number is selected.

$   A dollar sign ("$") is the line number of the last line of the
    input stream.

/regular expression/
    A pattern address is a basic regular expression (see "Basic Regular
    Expressions"), between the delimiting character "/". Any other
    character except "\" or newline may be used to delimit a pattern
    address when the initial delimiter is prefixed with a backslash
    ("\").

If no address is given, the command selects every line.

If one address is given, it selects the line (or lines) matching the
address.

Two addresses select a range that begins whenever the first address
matches, and ends (including that line) when the second address
matches. If the first address is a matching pattern, the second
address is not applied to the very same line to determine the end
of the range. Likewise, if the second address is a matching pattern,
the first address is not applied to the very same line to determine the
beginning of another range. If both addresses are line numbers, and the
second line number is less than the first line number, then only the
first line is selected.
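The address forms described in this section can be sketched as follows. The commands use the system sed as a stand-in (SED is a hypothetical variable), assuming psed selects the same lines; the sample input is made up for illustration.

```shell
# SED is a hypothetical variable for this sketch; it defaults to the system sed.
SED=${SED:-sed}
numbers=$(printf '%s\n' 1 2 3 4 5)
letters=$(printf '%s\n' a b c d e)

# Line-number range: lines 2 through 4.
range=$(printf '%s\n' "$numbers" | "$SED" -n '2,4p')
# Second number less than the first: only the first line (4) is selected.
backwards=$(printf '%s\n' "$numbers" | "$SED" -n '4,2p')
# $ addresses the last line of the input stream.
last=$(printf '%s\n' "$numbers" | "$SED" -n '$p')
# ! inverts the selection: everything outside lines 2 through 4.
inverted=$(printf '%s\n' "$numbers" | "$SED" -n '2,4!p')
# Pattern range: from the first line matching /b/ through the next matching /d/.
pattern=$(printf '%s\n' "$letters" | "$SED" -n '/b/,/d/p')
# Alternative delimiter: \, makes "," the delimiter, so "/" needs no escaping.
custom=$(printf '/usr/bin/perl\n/bin/sh\n' | "$SED" -n '\,/usr/bin,p')

printf '%s\n' "$range" "$backwards" "$last" "$inverted" "$pattern" "$custom"
```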
104
Functions

The maximum permitted number of addresses is indicated with each
function synopsis below.

The argument text consists of one or more lines following the command.
Embedded newlines in text must be preceded by a backslash. Other
backslashes in text are deleted and the following character is taken
literally.
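A text argument with an embedded newline might look like the sketch below, which uses the a function to append two lines after the first input line. The system sed stands in for psed here (SED is a hypothetical variable), on the assumption that both handle text arguments alike.

```shell
# SED is a hypothetical variable for this sketch; it defaults to the system sed.
SED=${SED:-sed}

# The text starts on the line after "a\"; the backslash at the end of
# "first added line" protects the embedded newline.
appended=$(printf 'one\ntwo\n' | "$SED" '1a\
first added line\
second added line')
printf '%s\n' "$appended"
# one
# first added line
# second added line
# two
```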
114
[1addr]a\ text
    Write text (which must start on the line following the command) to
    standard output immediately before reading the next line of input,
    either by executing the N function or by beginning a new cycle.

[2addr]b [label]
    Branch to the : function with the specified label. If no label is
    given, branch to the end of the script.

[2addr]c\ text
    The line, or range of lines, selected by the address is deleted.
    The text (which must start on the line following the command) is
    written to standard output. With an address range, this occurs at
    the end of the range.

[2addr]d
    Delete the pattern space and start the next cycle.

[2addr]D
    Delete the pattern space through the first embedded newline or to
    the end. If the pattern space becomes empty, a new cycle is
    started, otherwise execution of the script is restarted.

[2addr]g
    Replace the contents of the pattern space with the hold space.

[2addr]G
    Append a newline and the contents of the hold space to the pattern
    space.

[2addr]h
    Replace the contents of the hold space with the pattern space.

[2addr]H
    Append a newline and the contents of the pattern space to the hold
    space.

[1addr]i\ text
    Write the text (which must start on the line following the command)
    to standard output.

[2addr]l
    Print the contents of the pattern space: non-printable characters
    are shown in C-style escaped form; long lines are split and have a
    trailing "\" at the point of the split; the true end of a line is
    marked with a "$". Escapes are: `\a', `\t', `\n', `\f', `\r',
    `\e' for BEL, HT, LF, FF, CR, ESC, respectively, and `\' followed
    by a three-digit octal number for all other non-printable
    characters.

[2addr]n
    If automatic printing is enabled, write the pattern space to the
    standard output. Replace the pattern space with the next line of
    input. If there is no more input, processing is terminated.

[2addr]N
    Append a newline and the next line of input to the pattern space.
    If there is no more input, processing is terminated.

[2addr]p
    Print the pattern space to the standard output. (Use the -n option
    to suppress automatic printing at the end of a cycle if you want to
    avoid double printing of lines.)

[2addr]P
    Print the pattern space through the first embedded newline or to
    the end.

[1addr]q
    Branch to the end of the script and quit without starting a new
    cycle.

[1addr]r file
    Copy the contents of the file to standard output immediately before
    the next attempt to read a line of input. Any error encountered
    while reading file is silently ignored.
191
[2addr]s/regular expression/replacement/flags
    Substitute the replacement string for the first substring in the
    pattern space that matches the regular expression. Any character
    other than backslash or newline can be used instead of a slash to
    delimit the regular expression and the replacement. To use the
    delimiter as a literal character within the regular expression and
    the replacement, precede the character by a backslash ("\").

    Literal newlines may be embedded in the replacement string by
    preceding a newline with a backslash.

    Within the replacement, an ampersand ("&") is replaced by the
    string matching the regular expression. The strings "\1" through
    "\9" are replaced by the corresponding subpattern (see "Basic
    Regular Expressions"). To get a literal "&" or "\" in the
    replacement text, precede it by a backslash.

    The following flags modify the behaviour of the s command:

    g       The replacement is performed for all matching,
            non-overlapping substrings of the pattern space.

    1..9    Replace only the n-th matching substring of the pattern
            space.

    p       If the substitution was made, print the new value of the
            pattern space.

    w file  If the substitution was made, write the new value of the
            pattern space to the specified file.

[2addr]t [label]
    Branch to the : function with the specified label if any s
    substitutions have been made since the most recent reading of an
    input line or execution of a t function. If no label is given,
    branch to the end of the script.

[2addr]w file
    The contents of the pattern space are written to the file.

[2addr]x
    Swap the contents of the pattern space and the hold space.

[1addr]=
    Print the current line number on the standard output.

[0addr]: [label]
    The command specifies the position of the label. It has no other
    effect.

[2addr]{ [command]
[0addr]}
    These two commands begin and end a command list. The first command
    may be given on the same line as the opening { command. The
    commands within the list are jointly selected by the address(es)
    given on the { command (but may still have individual addresses).

[0addr]# [comment]
    The entire line is ignored (treated as a comment). If, however, the
    first two characters in the script are "#n", automatic printing
    of output is suppressed, as if the -n option were given on the
    command line.
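A few of the functions above, sketched on made-up input. The system sed is used as a stand-in (SED is a hypothetical variable), assuming psed produces the same results for these POSIX features; multi-command scripts keep one command per line.

```shell
# SED is a hypothetical variable for this sketch; it defaults to the system sed.
SED=${SED:-sed}

# s: first match only, all matches (g), and only the 2nd match.
first=$(printf 'banana\n' | "$SED" 's/a/X/')              # bXnana
all=$(printf 'banana\n' | "$SED" 's/a/X/g')               # bXnXnX
second=$(printf 'banana\n' | "$SED" 's/a/X/2')            # banXna
# & is the whole match; \1 and \2 are the subpatterns.
amp=$(printf 'banana\n' | "$SED" 's/an/[&]/g')            # b[an][an]a
swap=$(printf 'banana\n' | "$SED" 's/\(b\)\(a\)/\2\1/')   # abnana

# N: join each pair of lines (an even number of input lines is used here).
pairs=$(printf 'a\nb\nc\nd\n' | "$SED" 'N
s/\n/ /')

# : and t: repeat a substitution until it no longer applies,
# turning each leading space into a dot.
dots=$(printf '   x\n' | "$SED" ':loop
s/^\(\.*\) /\1./
t loop')

# q: stop after the second line.
head2=$(printf '1\n2\n3\n' | "$SED" '2q')

printf '%s\n' "$first" "$all" "$second" "$amp" "$swap" "$pairs" "$dots" "$head2"
```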
254
BASIC REGULAR EXPRESSIONS
A Basic Regular Expression (BRE), as defined in POSIX 1003.2, consists
of atoms, for matching parts of a string, and bounds, specifying
repetitions of a preceding atom.

Atoms

The possible atoms of a BRE are: ., matching any single character; ^
and $, matching the null string at the beginning or end of a string,
respectively; a bracket expression, enclosed in [ and ] (see below);
and any single character with no other significance (matching that
character). A \ before one of ., ^, $, [, *, or \ matches the character
after the backslash. A sequence of atoms enclosed in \( and \) becomes
an atom and establishes the target for a backreference, consisting of
the substring that actually matches the enclosed atoms. Finally, \
followed by one of the digits 1 through 9 is a backreference.

A ^ that is not first, or a $ that is not last, does not have a special
significance and need not be preceded by a backslash to become literal.
The same is true for a ] that does not terminate a bracket expression.

An unescaped backslash cannot be last in a BRE.

Bounds

The BRE bounds are: *, specifying 0 or more matches of the preceding
atom; \{count\}, specifying that many repetitions; \{minimum,\}, giving
a lower limit; and \{minimum,maximum\}, giving both a lower and an
upper bound.

A bound appearing as the first item in a BRE is taken literally.
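The bounds can be tried out with a sketch like the following, which filters made-up lines of repeated "a" characters. The system sed is a stand-in (SED is a hypothetical variable), assuming psed's BRE translation matches POSIX behaviour here.

```shell
# SED is a hypothetical variable for this sketch; it defaults to the system sed.
SED=${SED:-sed}
runs=$(printf '%s\n' a aa aaa aaaa)

# \{count\}: exactly two repetitions.
exact=$(printf '%s\n' "$runs" | "$SED" -n '/^a\{2\}$/p')
# \{minimum,\}: three or more.
atleast=$(printf '%s\n' "$runs" | "$SED" -n '/^a\{3,\}$/p')
# \{minimum,maximum\}: two or three.
between=$(printf '%s\n' "$runs" | "$SED" -n '/^a\{2,3\}$/p')
# *: zero or more of the preceding atom.
star=$(printf 'ac\nabc\nabbc\n' | "$SED" -n '/^ab*c$/p')
# A bound that is the first item in the BRE is an ordinary character.
literal=$(printf '*x\nx\n' | "$SED" -n '/*x/p')

printf '%s\n' "$exact" "$atleast" "$between" "$star" "$literal"
```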
286
Bracket Expressions

A bracket expression is a list of characters, character ranges and
character classes enclosed in [ and ] and matches any single character
from the represented set of characters.

A character range is written as two characters separated by - and
represents all characters (according to the character collating
sequence) that are not less than the first and not greater than the
second. (Ranges are very collating-sequence-dependent, and portable
programs should avoid relying on them.)

A character class is one of the class names

    alnum    digit    punct
    alpha    graph    space
    blank    lower    upper
    cntrl    print    xdigit

enclosed in [: and :] and represents the set of characters as defined
in ctype(3).

If the first character after [ is ^, the sense of matching is inverted.

To include a literal "^", place it anywhere else but first. To
include a literal "]", place it first or immediately after an initial
^. To include a literal "-", make it the first (or second after ^) or
last character, or the second endpoint of a range.
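The placement rules for literal characters are easy to get wrong, so here is a short sketch on made-up input. It uses the system sed as a stand-in (SED is a hypothetical variable), assuming psed handles bracket expressions as POSIX describes.

```shell
# SED is a hypothetical variable for this sketch; it defaults to the system sed.
SED=${SED:-sed}

# A character class inside a bracket expression.
digits=$(printf 'abc123\n' | "$SED" 's/[[:digit:]]/#/g')     # abc###
# A leading ^ inverts the set: delete everything that is not alphabetic.
negated=$(printf 'abc123\n' | "$SED" 's/[^[:alpha:]]//g')    # abc
# Literal "]" placed first, literal "^" not first, literal "-" last.
literals=$(printf 'a]b-c^d\n' | "$SED" 's/[]^-]/_/g')        # a_b_c_d

printf '%s\n' "$digits" "$negated" "$literals"
```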
315
The special bracket expression constructs "[[:<:]]" and "[[:>:]]" match
the null string at the beginning and end of a word respectively. (Note
that neither is identical to Perl's `\b' atom.)

Additional Atoms

Since some sed implementations provide additional regular expression
atoms (not defined in POSIX 1003.2), psed is capable of translating the
following backslash escapes:

\<  This is the same as "[[:<:]]".
\>  This is the same as "[[:>:]]".
\w  This is an abbreviation for "[[:alnum:]_]".
\W  This is an abbreviation for "[^[:alnum:]_]".
\y  Match the empty string at a word boundary.
\B  Match the empty string between two word characters or between two
    non-word characters.

To enable this feature, the environment variable PSEDEXTBRE must be set
to a string containing the requested characters, e.g.:
"PSEDEXTBRE='<>wW'".

ENVIRONMENT
The environment variable "PSEDEXTBRE" may be set to extend BREs. See
"Additional Atoms".
341
DIAGNOSTICS
ambiguous translation for character `%s' in `y' command
    The indicated character appears twice, with different translations.

`[' cannot be last in pattern
    A `[' in a BRE indicates the beginning of a bracket expression.

`\' cannot be last in pattern
    A `\' in a BRE is used to make the subsequent character literal.

`\' cannot be last in substitution
    A `\' in a substitution string is used to make the subsequent
    character literal.

conflicting flags `%s'
    In an s command, either the `g' flag and an n-th occurrence flag,
    or multiple n-th occurrence flags are specified. Note that only the
    digits `1' through `9' are permitted.

duplicate label %s (first defined at %s)
excess address(es)
    The command has more than the permitted number of addresses.

extra characters after command (%s)
illegal option `%s'
improper delimiter in s command
    The BRE and substitution may not be delimited with `\' or newline.

invalid address after `,'
invalid backreference (%s)
    The specified backreference number exceeds the number of
    backreferences in the BRE.

invalid repeat clause `\{%s\}'
    The repeat clause does not contain a valid integer value, or pair
    of values.

malformed regex, 1st address
malformed regex, 2nd address
malformed regular expression
malformed substitution expression
malformed `y' command argument
    The first or second string of a y command is syntactically
    incorrect.

maximum less than minimum in `\{%s\}'
no script command given
    There must be at least one -e or one -f option specifying a script
    or script file.

`\' not valid as delimiter in `y' command
option -e requires an argument
option -f requires an argument
`s' command requires argument
start of unterminated `{'
string lengths in `y' command differ
    The translation table strings in a y command must have equal
    lengths.

undefined label `%s'
unexpected `}'
    A } command without a preceding { command was encountered.

unexpected end of script
    The end of the script was reached although a text line after an a,
    c or i command indicated another line.

unknown command `%s'
unterminated `['
    A BRE contains an unterminated bracket expression.

unterminated `\('
    A BRE contains an unterminated backreference group.

`\{' without closing `\}'
    A BRE contains an unterminated bounds specification.

`\)' without preceding `\('
`y' command requires argument
421
EXAMPLE
The basic material for the preceding section was generated by running
the sed script

    #no autoprint
    s/^.*Warn( *"\([^"]*\)".*$/\1/
    t process
    b
    :process
    s/$!/%s/g
    s/$[_[:alnum:]]\{1,\}/%s/g
    s/\\\\/\\/g
    s/^/=item /
    p

on the program's own text, and piping the output into "sort -u".

SED SCRIPT TRANSLATION
If this program is invoked with the name s2p it will act as a sed-to-
Perl translator. After option processing (all other arguments are
ignored), a Perl program is printed on standard output, which will
process the input stream (as read from all arguments) in the way
defined by the sed script and the option setting used for the
translation.

SEE ALSO
perl(1), re_format(7)

BUGS
The l command will show escape characters (ESC) as "\e", but a
vertical tab (VT) in octal.

Trailing spaces are truncated from labels in :, t and b commands.

The meaning of an empty regular expression ("//"), as defined by sed,
is "the last pattern used, at run time". This deviates from the Perl
interpretation, which will re-use the "last successfully executed
regular expression". Since keeping track of pattern usage would create
terribly cluttered code, and differences would only appear in obscure
context (where other sed implementations appear to deviate, too), the
Perl semantics were adopted. Note that common usage of this feature,
such as in "/abc/s//xyz/", will work as expected.
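The common case mentioned above can be sketched like this, with the system sed as a stand-in (SED is a hypothetical variable). In this usage the sed and Perl interpretations of the empty pattern agree, so psed is assumed to give the same result.

```shell
# SED is a hypothetical variable for this sketch; it defaults to the system sed.
SED=${SED:-sed}

# The empty pattern in s//xyz/ re-uses /abc/ from the address, so only
# the line containing "abc" is changed.
reused=$(printf 'abc\nxbc\n' | "$SED" '/abc/s//xyz/')
printf '%s\n' "$reused"      # xyz, xbc (one per line)
```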
464
Collating elements (of bracket expressions in BREs) are not
implemented.

STANDARDS
This sed implementation conforms to the IEEE Std 1003.2-1992 ("POSIX.2")
definition of sed, and is compatible with the OpenBSD implementation,
except where otherwise noted (see "BUGS").

AUTHOR
This Perl implementation of sed was written by Wolfgang Laun,
Wolfgang.Laun@alcatel.at.

COPYRIGHT and LICENSE
This program is free and open software. You may use, modify,
distribute, and sell this program (and any modified variants) in any
way you wish, provided you do not restrict others from doing the same.

perl v5.8.8 2008-05-05 S2P(1)