S2P(1)                 Perl Programmers Reference Guide                 S2P(1)

NAME

       psed - a stream editor

SYNOPSIS

          psed [-an] script [file ...]
          psed [-an] [-e script] [-f script-file] [file ...]

          s2p  [-an] [-e script] [-f script-file]

DESCRIPTION

       A stream editor reads the input stream consisting of the specified
       files (or standard input, if none are given), processes it line by line
       by applying a script consisting of edit commands, and writes the
       resulting lines to standard output. The filename "-" may be used to
       read standard input.

       The edit script is composed from arguments of -e options and
       script-files, in the given order. A single script argument may be
       specified as the first parameter.

       If this program is invoked with the name s2p, it will act as a
       sed-to-Perl translator. See "sed Script Translation".

       sed returns an exit code of 0 on success or >0 if an error occurred.

OPTIONS

       -a  A file specified as argument to the w edit command is by default
           opened before input processing starts. Using -a, opening of such
           files is delayed until the first line is actually written to the
           file.

       -e script
           The editing commands defined by script are appended to the script.
           Multiple commands must be separated by newlines.

       -f script-file
           Editing commands from the specified script-file are read and
           appended to the script.

       -n  By default, a line is written to standard output after the editing
           script has been applied to it. The -n option suppresses automatic
           printing.
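
       The effect of -n and of repeated -e options can be tried with a short
       shell sketch. It assumes a POSIX sed is on the PATH; the variable SED
       is an illustrative name, not part of psed, and can be set to psed to
       exercise this program instead.

```shell
# Assumption: SED names a POSIX sed; set SED=psed to try psed itself.
SED=${SED:-sed}

# Without -n every line is printed automatically, so p prints line 2 twice.
with_autoprint=$(printf 'one\ntwo\nthree\n' | "$SED" -e '2p')

# With -n only the explicit p command produces output.
without_autoprint=$(printf 'one\ntwo\nthree\n' | "$SED" -n -e '2p')

# Two -e options are appended, in order, to form a single script.
two_scripts=$(printf 'one\ntwo\nthree\n' | "$SED" -n -e '1p' -e '3p')

printf '%s\n' "$with_autoprint" "$without_autoprint" "$two_scripts"
```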

COMMANDS

       sed command syntax is defined as

          [address[,address]][!]function[argument]

       with whitespace being permitted before or after addresses, and between
       the function character and the argument. The addresses and the address
       inverter ("!") are used to restrict the application of a command to the
       selected line(s) of input.

       Each command must be on a line of its own, except where noted in the
       synopses below.

       The edit cycle performed on each input line consists of reading the
       line (without its trailing newline character) into the pattern space,
       applying the applicable commands of the edit script, and writing the
       final contents of the pattern space and a newline to the standard
       output. A hold space is provided for saving the contents of the
       pattern space for later use.

       Addresses

       A sed address is either a line number or a pattern, which may be
       combined arbitrarily to construct ranges. Lines are numbered across
       all input files.

       Any address may be followed by an exclamation mark ("!"), selecting
       all lines not matching that address.

       number
           The line with the given number is selected.

       $   A dollar sign ("$") is the line number of the last line of the
           input stream.

       /regular expression/
           A pattern address is a basic regular expression (see "Basic Regular
           Expressions"), between the delimiting character "/".  Any other
           character except "\" or newline may be used to delimit a pattern
           address when the initial delimiter is prefixed with a backslash
           ("\").

       If no address is given, the command selects every line.

       If one address is given, it selects the line (or lines) matching the
       address.
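
       A minimal sketch of single addresses follows: the last-line address,
       the "!" inverter, and a pattern address with a delimiter other than
       "/". The helper sample and the variable SED are illustrative names;
       a POSIX sed is assumed, and psed is expected to accept the same
       scripts.

```shell
# Assumption: SED names a POSIX sed; set SED=psed to try psed itself.
SED=${SED:-sed}

# Hypothetical three-line input used by every command below.
sample() { printf 'alpha\n/usr/bin\nomega\n'; }

# "$" addresses the last line of the input stream.
last=$(sample | "$SED" -n '$p')

# "!" after the address selects every line that does not match it.
not_last=$(sample | "$SED" -n '$!p')

# "\," makes the comma the delimiter, so "/" needs no escaping in the pattern.
custom_delim=$(sample | "$SED" -n '\,/usr/,p')

printf '%s\n' "$last" "$not_last" "$custom_delim"
```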

       Two addresses select a range that begins whenever the first address
       matches, and ends (including that line) when the second address
       matches.  If the first address is a matching pattern, the second
       address is not applied to the very same line to determine the end
       of the range. Likewise, if the second address is a matching pattern,
       the first address is not applied to the very same line to determine the
       beginning of another range. If both addresses are line numbers, and the
       second line number is less than the first line number, then only the
       first line is selected.
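
       The range rules can be checked with a small sketch. It assumes a POSIX
       sed; sample and SED are illustrative names only.

```shell
# Assumption: SED names a POSIX sed; set SED=psed to try psed itself.
SED=${SED:-sed}

# Hypothetical five-line input.
sample() { printf 'a\nstart\nb\nend\nc\n'; }

# A range of two line numbers selects lines 1 through 2.
by_number=$(sample | "$SED" -n '1,2p')

# A range of two patterns runs from the first match through the second.
by_pattern=$(sample | "$SED" -n '/start/,/end/p')

# Second line number smaller than the first: only the first line is selected.
reversed=$(sample | "$SED" -n '4,2p')

printf '%s\n' "$by_number" "$by_pattern" "$reversed"
```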

       Functions

       The maximum permitted number of addresses is indicated with each
       function synopsis below.

       The argument text consists of one or more lines following the command.
       Embedded newlines in text must be preceded with a backslash.  Other
       backslashes in text are deleted and the following character is taken
       literally.
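
       As a sketch of a multi-line text argument, the script below uses the
       a function to add two lines after the last input line; the first text
       line ends in a backslash so that the newline becomes part of the text.
       A POSIX sed is assumed and SED is an illustrative variable name.

```shell
# Assumption: SED names a POSIX sed; set SED=psed to try psed itself.
SED=${SED:-sed}

# The text starts on the line after "a\"; a trailing backslash continues it.
appended=$(printf 'first\nlast\n' | "$SED" '$a\
line one\
line two')

printf '%s\n' "$appended"
```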

       [1addr]a\ text
           Write text (which must start on the line following the command) to
           standard output immediately before reading the next line of input,
           either by executing the N function or by beginning a new cycle.

       [2addr]b [label]
           Branch to the : function with the specified label. If no label is
           given, branch to the end of the script.

       [2addr]c\ text
           The line, or range of lines, selected by the address is deleted.
           The text (which must start on the line following the command) is
           written to standard output. With an address range, this occurs at
           the end of the range.

       [2addr]d
           Deletes the pattern space and starts the next cycle.

       [2addr]D
           Deletes the pattern space through the first embedded newline or to
           the end.  If the pattern space becomes empty, a new cycle is
           started, otherwise execution of the script is restarted.

       [2addr]g
           Replace the contents of the pattern space with the hold space.

       [2addr]G
           Append a newline and the contents of the hold space to the pattern
           space.

       [2addr]h
           Replace the contents of the hold space with the pattern space.

       [2addr]H
           Append a newline and the contents of the pattern space to the hold
           space.
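
       The hold space functions can be combined to reverse the order of the
       input lines, as in the following sketch. It assumes a POSIX sed; SED
       is an illustrative variable name. Each command is passed in its own -e
       option because every command must be on a line of its own.

```shell
# Assumption: SED names a POSIX sed; set SED=psed to try psed itself.
SED=${SED:-sed}

# 1!G  on every line but the first, append the hold space (the lines seen so
#      far, already reversed) to the pattern space
# h    save the new pattern space in the hold space
# $p   on the last line, print the accumulated result
reversed=$(printf 'a\nb\nc\n' | "$SED" -n -e '1!G' -e 'h' -e '$p')

printf '%s\n' "$reversed"
```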

       [1addr]i\ text
           Write the text (which must start on the line following the command)
           to standard output.

       [2addr]l
           Print the contents of the pattern space: non-printable characters
           are shown in C-style escaped form; long lines are split and have a
           trailing "\" at the point of the split; the true end of a line is
           marked with a "$". Escapes are: `\a', `\t', `\n', `\f', `\r',
           `\e' for BEL, HT, LF, FF, CR, ESC, respectively, and `\' followed
           by a three-digit octal number for all other non-printable
           characters.

       [2addr]n
           If automatic printing is enabled, write the pattern space to the
           standard output. Replace the pattern space with the next line of
           input. If there is no more input, processing is terminated.

       [2addr]N
           Append a newline and the next line of input to the pattern space.
           If there is no more input, processing is terminated.

       [2addr]p
           Print the pattern space to the standard output. (Use the -n option
           to suppress automatic printing at the end of a cycle if you want to
           avoid double printing of lines.)

       [2addr]P
           Prints the pattern space through the first embedded newline or to
           the end.

       [1addr]q
           Branch to the end of the script and quit without starting a new
           cycle.

       [1addr]r file
           Copy the contents of the file to standard output immediately before
           the next attempt to read a line of input. Any error encountered
           while reading file is silently ignored.

       [2addr]s/regular expression/replacement/flags
           Substitute the replacement string for the first substring in the
           pattern space that matches the regular expression.  Any character
           other than backslash or newline can be used instead of a slash to
           delimit the regular expression and the replacement.  To use the
           delimiter as a literal character within the regular expression and
           the replacement, precede the character by a backslash ("\").

           Literal newlines may be embedded in the replacement string by
           preceding a newline with a backslash.

           Within the replacement, an ampersand ("&") is replaced by the
           string matching the regular expression. The strings "\1" through
           "\9" are replaced by the corresponding subpattern (see "Basic
           Regular Expressions").  To get a literal "&" or "\" in the
           replacement text, precede it by a backslash.

           The following flags modify the behaviour of the s command:

           g       The replacement is performed for all matching,
                   non-overlapping substrings of the pattern space.

           1..9    Replace only the n-th matching substring of the pattern
                   space.

           p       If the substitution was made, print the new value of the
                   pattern space.

           w file  If the substitution was made, write the new value of the
                   pattern space to the specified file.
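
       A short sketch of the s function: the "&" replacement, the g and n-th
       occurrence flags, subpattern references, and an alternative
       delimiter. A POSIX sed is assumed; SED is an illustrative variable
       name.

```shell
# Assumption: SED names a POSIX sed; set SED=psed to try psed itself.
SED=${SED:-sed}

# "&" stands for the matched text; only the first match is replaced.
first=$(printf 'hello world\n' | "$SED" 's/o/[&]/')

# The g flag replaces every non-overlapping match.
every=$(printf 'hello world\n' | "$SED" 's/o/[&]/g')

# A digit flag replaces only the n-th match, here the second.
second=$(printf 'hello world\n' | "$SED" 's/o/[&]/2')

# \1 and \2 refer to the subpatterns enclosed in \( and \).
swapped=$(printf 'hello world\n' | "$SED" 's/\(hello\) \(world\)/\2 \1/')

# With "," as delimiter, slashes in the pattern need no backslash.
repathed=$(printf '/usr/bin\n' | "$SED" 's,/usr,/opt,')

printf '%s\n' "$first" "$every" "$second" "$swapped" "$repathed"
```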

       [2addr]t [label]
           Branch to the : function with the specified label if any s
           substitutions have been made since the most recent reading of an
           input line or execution of a t function. If no label is given,
           branch to the end of the script.

       [2addr]w file
           The contents of the pattern space are written to the file.

       [2addr]x
           Swap the contents of the pattern space and the hold space.

       [1addr]=
           Prints the current line number on the standard output.

       [0addr]: [label]
           The command specifies the position of the label. It has no other
           effect.

       [2addr]{ [command]
       [0addr]}
           These two commands begin and end a command list. The first command
           may be given on the same line as the opening { command. The
           commands within the list are jointly selected by the address(es)
           given on the { command (but may still have individual addresses).

       [0addr]# [comment]
           The entire line is ignored (treated as a comment). If, however, the
           first two characters in the script are "#n", automatic printing
           of output is suppressed, as if the -n option were given on the
           command line.
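
       The following sketch shows a command list whose commands are jointly
       restricted by a range on the { command, while one of them carries an
       address of its own. A POSIX sed is assumed; SED is an illustrative
       variable name and the input is made up for the example.

```shell
# Assumption: SED names a POSIX sed; set SED=psed to try psed itself.
SED=${SED:-sed}

# Inside the begin..end range: drop comment lines, rewrite foo to bar.
# Lines outside the range pass through unchanged.
grouped=$(printf '# top\nbegin\n# inner\nfoo\nend\nfoo\n' | "$SED" '/begin/,/end/{
/^#/d
s/foo/bar/
}')

printf '%s\n' "$grouped"
```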

BASIC REGULAR EXPRESSIONS

       A Basic Regular Expression (BRE), as defined in POSIX 1003.2, consists
       of atoms, for matching parts of a string, and bounds, specifying
       repetitions of a preceding atom.

       Atoms

       The possible atoms of a BRE are: ., matching any single character; ^
       and $, matching the null string at the beginning or end of a string,
       respectively; a bracket expression, enclosed in [ and ] (see below);
       and any single character with no other significance (matching that
       character). A \ before one of ., ^, $, [, *, \ matches the character
       after the backslash. A sequence of atoms enclosed in \( and \) becomes
       an atom and establishes the target for a backreference, consisting of
       the substring that actually matches the enclosed atoms.  Finally, \
       followed by one of the digits 0 through 9 is a backreference.

       A ^ that is not first, or a $ that is not last, does not have a special
       significance and need not be preceded by a backslash to become literal.
       The same is true for a ] that does not terminate a bracket expression.

       An unescaped backslash cannot be last in a BRE.

       Bounds

       The BRE bounds are: *, specifying 0 or more matches of the preceding
       atom; \{count\}, specifying that many repetitions; \{minimum,\}, giving
       a lower limit; and \{minimum,maximum\}, defining a lower and an
       upper bound.
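
       The bounds can be compared side by side with a sketch like the one
       below, which assumes a POSIX sed (SED and sample are illustrative
       names).

```shell
# Assumption: SED names a POSIX sed; set SED=psed to try psed itself.
SED=${SED:-sed}

# Hypothetical input: lines of one to four "a" characters.
sample() { printf 'a\naa\naaa\naaaa\n'; }

# \{4\} demands exactly four repetitions.
exactly=$(sample | "$SED" -n '/^a\{4\}$/p')

# \{3,\} sets only a lower limit.
at_least=$(sample | "$SED" -n '/^a\{3,\}$/p')

# \{2,3\} sets a lower and an upper bound.
between=$(sample | "$SED" -n '/^a\{2,3\}$/p')

printf '%s\n' "$exactly" "$at_least" "$between"
```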

       A bound appearing as the first item in a BRE is taken literally.

       Bracket Expressions

       A bracket expression is a list of characters, character ranges and
       character classes enclosed in [ and ] and matches any single character
       from the represented set of characters.

       A character range is written as two characters separated by - and
       represents all characters (according to the character collating
       sequence) that are not less than the first and not greater than the
       second. (Ranges are very collating-sequence-dependent, and portable
       programs should avoid relying on them.)

       A character class is one of the class names

          alnum     digit     punct
          alpha     graph     space
          blank     lower     upper
          cntrl     print     xdigit

       enclosed in [: and :] and represents the set of characters as defined
       in ctype(3).

       If the first character after [ is ^, the sense of matching is inverted.

       To include a literal "^", place it anywhere else but first. To
       include a literal "]", place it first or immediately after an initial
       ^. To include a literal "-", make it the first (or second after ^) or
       last character, or the second endpoint of a range.
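
       A few bracket expressions, sketched under the assumption of a POSIX
       sed (SED is an illustrative variable name): a character class, an
       inverted class, and the placement rules for literal "]", "-" and "^".

```shell
# Assumption: SED names a POSIX sed; set SED=psed to try psed itself.
SED=${SED:-sed}

# A character class inside a bracket expression.
digits=$(printf 'abc123\n' | "$SED" 's/[[:digit:]]/#/g')

# A leading ^ inverts the sense of matching.
non_alpha=$(printf 'a1 b2\n' | "$SED" 's/[^[:alpha:]]/_/g')

# A literal ] is placed first in the list.
bracket=$(printf 'x]y\n' | "$SED" 's/[]]/X/')

# A literal - is placed last in the list.
dash=$(printf 'a-b\n' | "$SED" 's/[a-]/X/g')

# A literal ^ is placed anywhere but first.
caret=$(printf 'a^b\n' | "$SED" 's/[a^]/X/g')

printf '%s\n' "$digits" "$non_alpha" "$bracket" "$dash" "$caret"
```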

       The special bracket expression constructs "[[:<:]]" and "[[:>:]]" match
       the null string at the beginning and end of a word respectively.  (Note
       that neither is identical to Perl's `\b' atom.)

       Additional Atoms

       Since some sed implementations provide additional regular expression
       atoms (not defined in POSIX 1003.2), psed is capable of translating the
       following backslash escapes:

       \< This is the same as "[[:<:]]".
       \> This is the same as "[[:>:]]".
       \w This is an abbreviation for "[[:alnum:]_]".
       \W This is an abbreviation for "[^[:alnum:]_]".
       \y Match the empty string at a word boundary.
       \B Match the empty string between two word characters or between two
          non-word characters.

       To enable this feature, the environment variable PSEDEXTBRE must be set
       to a string containing the requested characters, e.g.:
       "PSEDEXTBRE='<>wW'".

ENVIRONMENT

       The environment variable "PSEDEXTBRE" may be set to extend BREs.  See
       "Additional Atoms".

DIAGNOSTICS

       ambiguous translation for character `%s' in `y' command
           The indicated character appears twice, with different translations.

       `[' cannot be last in pattern
           A `[' in a BRE indicates the beginning of a bracket expression.

       `\' cannot be last in pattern
           A `\' in a BRE is used to make the subsequent character literal.

       `\' cannot be last in substitution
           A `\' in a substitution string is used to make the subsequent
           character literal.

       conflicting flags `%s'
           In an s command, either the `g' flag and an n-th occurrence flag,
           or multiple n-th occurrence flags are specified. Note that only the
           digits `1' through `9' are permitted.

       duplicate label %s (first defined at %s)

       excess address(es)
           The command has more than the permitted number of addresses.

       extra characters after command (%s)

       illegal option `%s'

       improper delimiter in s command
           The BRE and substitution may not be delimited with `\' or newline.

       invalid address after `,'

       invalid backreference (%s)
           The specified backreference number exceeds the number of
           backreferences in the BRE.

       invalid repeat clause `\{%s\}'
           The repeat clause does not contain a valid integer value, or pair
           of values.

       malformed regex, 1st address

       malformed regex, 2nd address

       malformed regular expression

       malformed substitution expression

       malformed `y' command argument
           The first or second string of a y command is syntactically
           incorrect.

       maximum less than minimum in `\{%s\}'

       no script command given
           There must be at least one -e or one -f option specifying a script
           or script file.

       `\' not valid as delimiter in `y' command

       option -e requires an argument

       option -f requires an argument

       `s' command requires argument

       start of unterminated `{'

       string lengths in `y' command differ
           The translation table strings in a y command must have equal
           lengths.

       undefined label `%s'

       unexpected `}'
           A } command without a preceding { command was encountered.

       unexpected end of script
           The end of the script was reached although a text line after an a,
           c or i command indicated another line.

       unknown command `%s'

       unterminated `['
           A BRE contains an unterminated bracket expression.

       unterminated `\('
           A BRE contains an unterminated backreference.

       `\{' without closing `\}'
           A BRE contains an unterminated bounds specification.

       `\)' without preceding `\('

       `y' command requires argument

EXAMPLE

       The basic material for the preceding section was generated by running
       the sed script

          #no autoprint
          s/^.*Warn( *"\([^"]*\)".*$/\1/
          t process
          b
          :process
          s/$!/%s/g
          s/$[_[:alnum:]]\{1,\}/%s/g
          s/\\\\/\\/g
          s/^/=item /
          p

       on the program's own text, and piping the output into "sort -u".
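
       To see what the script does, it can be run on a small made-up input,
       as in the sketch below. The file names, the SED variable and the two
       input lines are assumptions for illustration; the real input was the
       program's own source. The -n option is given explicitly because some
       sed implementations honour "#n" only when nothing else follows it on
       the first line.

```shell
# Assumption: SED names a POSIX sed; set SED=psed to try psed itself.
SED=${SED:-sed}
workdir=$(mktemp -d)

# The script from the EXAMPLE section, stored verbatim in a script file.
cat > "$workdir/diag.sed" <<'EOF'
#no autoprint
s/^.*Warn( *"\([^"]*\)".*$/\1/
t process
b
:process
s/$!/%s/g
s/$[_[:alnum:]]\{1,\}/%s/g
s/\\\\/\\/g
s/^/=item /
p
EOF

# Hypothetical input: one line that calls Warn() and one that does not.
cat > "$workdir/input.pl" <<'EOF'
    Warn( "undefined label `$lab'", $fl );
    my $unrelated = 1;
EOF

# Only the Warn() line survives; its variable name is replaced by %s.
items=$("$SED" -n -f "$workdir/diag.sed" "$workdir/input.pl")
printf '%s\n' "$items"
rm -rf "$workdir"
```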

SED SCRIPT TRANSLATION

       If this program is invoked with the name s2p it will act as a
       sed-to-Perl translator. After option processing (all other arguments
       are ignored), a Perl program is printed on standard output, which will
       process the input stream (as read from all arguments) in the way
       defined by the sed script and the option setting used for the
       translation.

SEE ALSO

       perl(1), re_format(7)

BUGS

       The l command will show escape characters (ESC) as "\e", but a
       vertical tab (VT) in octal.

       Trailing spaces are truncated from labels in :, t and b commands.

       The meaning of an empty regular expression ("//"), as defined by sed,
       is "the last pattern used, at run time". This deviates from the Perl
       interpretation, which will re-use the "last successfully executed
       regular expression". Since keeping track of pattern usage would create
       terribly cluttered code, and differences would only appear in obscure
       context (where other sed implementations appear to deviate, too), the
       Perl semantics were adopted. Note that common usage of this feature,
       such as in "/abc/s//xyz/", will work as expected.
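
       The common usage mentioned above can be sketched as follows, assuming
       a POSIX sed (SED is an illustrative variable name): the empty regular
       expression in the s command re-uses the address pattern.

```shell
# Assumption: SED names a POSIX sed; set SED=psed to try psed itself.
SED=${SED:-sed}

# On lines matching abc, "//" stands for abc again; the first match is replaced.
reused=$(printf 'abc abc\ndef\n' | "$SED" '/abc/s//xyz/')

printf '%s\n' "$reused"
```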

       Collating elements (of bracket expressions in BREs) are not
       implemented.

STANDARDS

       This sed implementation conforms to the IEEE Std 1003.2-1992
       ("POSIX.2") definition of sed, and is compatible with the OpenBSD
       implementation, except where otherwise noted (see "BUGS").

AUTHOR

       This Perl implementation of sed was written by Wolfgang Laun,
       Wolfgang.Laun@alcatel.at.

       This program is free and open software. You may use, modify,
       distribute, and sell this program (and any modified variants) in any
       way you wish, provided you do not restrict others from doing the same.

perl v5.8.8                       2008-05-05                            S2P(1)