SED(1P)                    POSIX Programmer's Manual                   SED(1P)

PROLOG

       This manual page is part of the POSIX Programmer's Manual. The Linux
       implementation of this interface may differ (consult the corresponding
       Linux manual page for details of Linux behavior), or the interface may
       not be implemented on Linux.

NAME

       sed - stream editor

SYNOPSIS

       sed [-n] script [file...]

       sed [-n] [-e script]... [-f script_file]... [file...]

DESCRIPTION

       The sed utility is a stream editor that shall read one or more text
       files, make editing changes according to a script of editing commands,
       and write the results to standard output. The script shall be obtained
       from either the script operand string or a combination of the
       option-arguments from the -e script and -f script_file options.

OPTIONS

       The sed utility shall conform to the Base Definitions volume of
       IEEE Std 1003.1-2001, Section 12.2, Utility Syntax Guidelines, except
       that the order of presentation of the -e and -f options is significant.

       The following options shall be supported:

       -e script
              Add the editing commands specified by the script option-argument
              to the end of the script of editing commands. The script
              option-argument shall have the same properties as the script
              operand, described in the OPERANDS section.

       -f script_file
              Add the editing commands in the file script_file to the end of
              the script.

       -n     Suppress the default output (in which each line, after it is
              examined for editing, is written to standard output). Only lines
              explicitly selected for output are written.

       Multiple -e and -f options may be specified. All commands shall be
       added to the script in the order specified, regardless of their origin.
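
       As a minimal sketch of the ordering rule above, the following commands
       combine an -e script with a -f script file in both orders. The scratch
       file name and the sample input are assumptions made only for this
       illustration.

```shell
# Hypothetical scratch file holding a single editing command for -f.
script_file="${TMPDIR:-/tmp}/sed_order_demo.$$"
printf '%s\n' 's/b/c/' > "$script_file"

# Commands run in the order given: -e turns "a" into "b", then the
# command read from the -f file turns that "b" into "c".
out=$(printf 'a\n' | sed -e 's/a/b/' -f "$script_file")
printf '%s\n' "$out"

# With the options reversed, s/b/c/ runs first and finds nothing to
# replace in "a"; s/a/b/ then produces "b".
out_reversed=$(printf 'a\n' | sed -f "$script_file" -e 's/a/b/')
printf '%s\n' "$out_reversed"

rm -f "$script_file"
```

       The two runs differ only in option order, which is why the order of
       -e and -f is significant.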

OPERANDS

       The following operands shall be supported:

       file   A pathname of a file whose contents are read and edited. If
              multiple file operands are specified, the named files shall be
              read in the order specified and the concatenation shall be
              edited. If no file operands are specified, the standard input
              shall be used.

       script A string to be used as the script of editing commands. The
              application shall not present a script that violates the
              restrictions of a text file except that the final character need
              not be a <newline>.

STDIN

       The standard input shall be used only if no file operands are
       specified. See the INPUT FILES section.

INPUT FILES

       The input files shall be text files. The script_files named by the -f
       option shall consist of editing commands.

ENVIRONMENT VARIABLES

       The following environment variables shall affect the execution of sed:

       LANG   Provide a default value for the internationalization variables
              that are unset or null. (See the Base Definitions volume of
              IEEE Std 1003.1-2001, Section 8.2, Internationalization
              Variables for the precedence of internationalization variables
              used to determine the values of locale categories.)

       LC_ALL If set to a non-empty string value, override the values of all
              the other internationalization variables.

       LC_COLLATE
              Determine the locale for the behavior of ranges, equivalence
              classes, and multi-character collating elements within regular
              expressions.

       LC_CTYPE
              Determine the locale for the interpretation of sequences of
              bytes of text data as characters (for example, single-byte as
              opposed to multi-byte characters in arguments and input files),
              and the behavior of character classes within regular
              expressions.

       LC_MESSAGES
              Determine the locale that should be used to affect the format
              and contents of diagnostic messages written to standard error.

       NLSPATH
              Determine the location of message catalogs for the processing of
              LC_MESSAGES.

ASYNCHRONOUS EVENTS

       Default.

STDOUT

       The input files shall be written to standard output, with the editing
       commands specified in the script applied. If the -n option is
       specified, only those input lines selected by the script shall be
       written to standard output.
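
       The effect of -n on standard output can be sketched as follows. The
       three-line sample input is an assumption chosen for illustration.

```shell
# Without -n every pattern space is written at the end of its cycle,
# so the line selected by "2p" appears twice.
out_default=$(printf 'one\ntwo\nthree\n' | sed '2p')

# With -n only the explicit p command writes anything.
out_quiet=$(printf 'one\ntwo\nthree\n' | sed -n '2p')

printf '%s\n---\n%s\n' "$out_default" "$out_quiet"
```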

STDERR

       The standard error shall be used only for diagnostic messages.

OUTPUT FILES

       The output files shall be text files whose formats are dependent on the
       editing commands given.

EXTENDED DESCRIPTION

       The script shall consist of editing commands of the following form:

              [address[,address]]function

       where function represents a single-character command verb from the list
       in Editing Commands in sed, followed by any applicable arguments.

       The command can be preceded by <blank>s and/or semicolons. The function
       can be preceded by <blank>s. These optional characters shall have no
       effect.

       In default operation, sed cyclically shall append a line of input, less
       its terminating <newline>, into the pattern space. Normally the pattern
       space will be empty, unless a D command terminated the last cycle. The
       sed utility shall then apply in sequence all commands whose addresses
       select that pattern space, and at the end of the script copy the
       pattern space to standard output (except when -n is specified) and
       delete the pattern space. Whenever the pattern space is written to
       standard output or a named file, sed shall immediately follow it with a
       <newline>.

       Some of the editing commands use a hold space to save all or part of
       the pattern space for subsequent retrieval. The pattern and hold spaces
       shall each be able to hold at least 8192 bytes.

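       The interplay of the cycle, the pattern space, and the hold space can
       be sketched with a short script that writes its input in reverse
       order. The script and the sample input are illustrative assumptions,
       not part of the standard text.

```shell
# 1!G  on every line except the first, append the hold space to the
#      pattern space (separated by an embedded <newline>)
# h    copy the pattern space to the hold space
# $p   on the last line, write the accumulated pattern space
# -n suppresses the default output at the end of each cycle.
out=$(printf 'a\nb\nc\n' | sed -n '1!G;h;$p')
printf '%s\n' "$out"
```

       Because the whole input accumulates in both spaces, such a script
       depends on the buffer sizes the implementation provides beyond the
       8192-byte minimum.
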
   Addresses in sed
       An address is either a decimal number that counts input lines
       cumulatively across files, a '$' character that addresses the last line
       of input, or a context address (which consists of a BRE, as described
       in Regular Expressions in sed, preceded and followed by a delimiter,
       usually a slash).

       An editing command with no addresses shall select every pattern space.

       An editing command with one address shall select each pattern space
       that matches the address.

       An editing command with two addresses shall select the inclusive range
       from the first pattern space that matches the first address through the
       next pattern space that matches the second. (If the second address is a
       number less than or equal to the line number first selected, only one
       line shall be selected.) Starting at the first line following the
       selected range, sed shall look again for the first address. Thereafter,
       the process shall be repeated. Omitting either or both of the address
       components in the following form produces undefined results:

              [address[,address]]

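       The address forms described above can be sketched as follows. The
       sample input is an assumption made for illustration.

```shell
input=$(printf 'alpha\nbegin\nbeta\nend\ngamma')

# A decimal number selects that input line.
by_number=$(printf '%s\n' "$input" | sed -n '2p')

# '$' selects the last line of input.
last=$(printf '%s\n' "$input" | sed -n '$p')

# Two context addresses select an inclusive range.
by_range=$(printf '%s\n' "$input" | sed -n '/begin/,/end/p')

# A numeric second address that is not greater than the line first
# selected yields a one-line range.
one_line=$(printf '%s\n' "$input" | sed -n '3,1p')

printf '%s\n' "$by_number" "$last" "$by_range" "$one_line"
```
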
   Regular Expressions in sed
       The sed utility shall support the BREs described in the Base
       Definitions volume of IEEE Std 1003.1-2001, Section 9.3, Basic Regular
       Expressions, with the following additions:

        * In a context address, the construction "\cBREc", where c is any
          character other than backslash or <newline>, shall be identical to
          "/BRE/". If the character designated by c appears following a
          backslash, then it shall be considered to be that literal character,
          which shall not terminate the BRE. For example, in the context
          address "\xabc\xdefx", the second x stands for itself, so that the
          BRE is "abcxdef".

        * The escape sequence '\n' shall match a <newline> embedded in the
          pattern space. A literal <newline> shall not be used in the BRE of a
          context address or in the substitute function.

        * If an RE is empty (that is, no pattern is specified) sed shall
          behave as if the last RE used in the last command applied (either as
          an address or as part of a substitute command) was specified.

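       Two of the additions listed above, the alternative delimiter for a
       context address and the empty RE, can be sketched as follows. The
       sample inputs are assumptions made for illustration.

```shell
# "\,BRE," uses a comma as the delimiter, so the slashes inside the
# BRE need no escaping.
with_comma=$(printf '/usr/bin/sed\n/opt/tool\n' | sed -n '\,^/usr/,p')

# The empty RE in "s//hue/" reuses the last RE applied, which is the
# context address /colou*r/.
reused=$(printf 'color\ncolour\nshade\n' | sed '/colou*r/s//hue/')

printf '%s\n' "$with_comma" "$reused"
```
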
   Editing Commands in sed
       In the following list of editing commands, the maximum number of
       permissible addresses for each function is indicated by [0addr],
       [1addr], or [2addr], representing zero, one, or two addresses.

       The argument text shall consist of one or more lines. Each embedded
       <newline> in the text shall be preceded by a backslash. Other
       backslashes in text shall be removed, and the following character shall
       be treated literally.

       The r and w command verbs, and the w flag to the s command, take an
       optional rfile (or wfile) parameter, separated from the command verb
       letter or flag by one or more <blank>s; implementations may allow zero
       separation as an extension.

       The argument rfile or the argument wfile shall terminate the editing
       command. Each wfile shall be created before processing begins.
       Implementations shall support at least ten wfile arguments in the
       script; the actual number (greater than or equal to 10) that is
       supported by the implementation is unspecified. The use of the wfile
       parameter shall cause that file to be initially created, if it does not
       exist, or shall replace the contents of an existing file.

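       The wfile behavior described above can be sketched as follows. The
       scratch file name and the sample input are assumptions made for this
       illustration.

```shell
# Hypothetical scratch file used as the wfile.
wfile="${TMPDIR:-/tmp}/sed_wfile_demo.$$"

# The wfile argument ends the command, so nothing follows it on the
# line. Double quotes let the shell expand $wfile.
out=$(printf 'ok\nerror 1\nok\nerror 2\n' | sed -n "/error/w $wfile")
saved=$(cat "$wfile")
printf '%s\n' "$saved"

# The file is created (and its old contents replaced) before any input
# is processed, even when no line is ever written to it.
printf 'ok\n' | sed -n "/error/w $wfile"
size=$(wc -c < "$wfile" | tr -d ' ')
printf '%s\n' "$size"

rm -f "$wfile"
```
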
       The b, r, s, t, w, y, and : command verbs shall accept additional
       arguments. The following synopses indicate which arguments shall be
       separated from the command verbs by a single <space>.

       The a and r commands schedule text for later output. The text specified
       for the a command, and the contents of the file specified for the r
       command, shall be written to standard output just before the next
       attempt to fetch a line of input when executing the N or n commands, or
       when reaching the end of the script. If written when reaching the end
       of the script, and the -n option was not specified, the text shall be
       written after copying the pattern space to standard output. The
       contents of the file specified for the r command shall be as of the
       time the output is written, not the time the r command is applied. The
       text shall be output in the order in which the a and r commands were
       applied to the input.

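       The scheduling of appended text can be sketched with two a commands
       applied to the same line. The sample input and the appended strings
       are assumptions made for illustration.

```shell
# Both a commands apply to line 1. Their text is written after the
# pattern space for that line, in the order the commands were applied.
out=$(printf 'one\ntwo\n' | sed '1a\
first appended line
1a\
second appended line')
printf '%s\n' "$out"
```
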
       Command verbs other than {, a, b, c, i, r, t, w, :, and # can be
       followed by a semicolon, optional <blank>s, and another command verb.
       However, when the s command verb is used with the w flag, following it
       with another command in this manner produces undefined results.

       A function can be preceded by one or more '!' characters, in which case
       the function shall be applied if the addresses do not select the
       pattern space. Zero or more <blank>s shall be accepted before the first
       '!' character. It is unspecified whether <blank>s can follow a '!'
       character, and conforming applications shall not follow a '!' character
       with <blank>s.

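       Negating an address with '!' can be sketched as follows. The sample
       input is an assumption made for illustration.

```shell
# Write only the lines that the context address /^#/ does NOT select.
# No <blank> follows the '!', as conforming applications require.
out=$(printf '# comment\nvalue=1\n# note\nvalue=2\n' | sed -n '/^#/!p')
printf '%s\n' "$out"
```
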
       [2addr] {function
       function
       ...
       }
              Execute a list of sed functions only when the pattern space is
              selected. The list of sed functions shall be surrounded by
              braces and separated by <newline>s, and conform to the following
              rules. The braces can be preceded or followed by <blank>s. The
              functions can be preceded by <blank>s, but shall not be followed
              by <blank>s. The <right-brace> shall be preceded by a <newline>
              and can be preceded or followed by <blank>s.

       [1addr]a\
       text
              Write text to standard output as described previously.

       [2addr]b [label]
              Branch to the : function bearing the label. If label is not
              specified, branch to the end of the script. The implementation
              shall support labels recognized as unique up to at least 8
              characters; the actual length (greater than or equal to 8) that
              shall be supported by the implementation is unspecified. It is
              unspecified whether exceeding a label length causes an error or
              a silent truncation.

       [2addr]c\
       text
              Delete the pattern space. With a 0 or 1 address or at the end of
              a 2-address range, place text on the output and start the next
              cycle.

       [2addr]d
              Delete the pattern space and start the next cycle.

       [2addr]D
              Delete the initial segment of the pattern space through the
              first <newline> and start the next cycle.

       [2addr]g
              Replace the contents of the pattern space by the contents of the
              hold space.

       [2addr]G
              Append to the pattern space a <newline> followed by the contents
              of the hold space.

       [2addr]h
              Replace the contents of the hold space with the contents of the
              pattern space.

       [2addr]H
              Append to the hold space a <newline> followed by the contents of
              the pattern space.

       [1addr]i\
       text
              Write text to standard output.

       [2addr]l
              (The letter ell.) Write the pattern space to standard output in
              a visually unambiguous form. The characters listed in the Base
              Definitions volume of IEEE Std 1003.1-2001, Table 5-1, Escape
              Sequences and Associated Actions ('\\', '\a', '\b', '\f', '\r',
              '\t', '\v') shall be written as the corresponding escape
              sequence; the '\n' in that table is not applicable.
              Non-printable characters not in that table shall be written as
              one three-digit octal number (with a preceding backslash) for
              each byte in the character (most significant byte first). If the
              size of a byte on the system is greater than 9 bits, the format
              used for non-printable characters is implementation-defined.

              Long lines shall be folded, with the point of folding indicated
              by writing a backslash followed by a <newline>; the length at
              which folding occurs is unspecified, but should be appropriate
              for the output device. The end of each line shall be marked with
              a '$'.

       [2addr]n
              Write the pattern space to standard output if the default output
              has not been suppressed, and replace the pattern space with the
              next line of input, less its terminating <newline>.

              If no next line of input is available, the n command verb shall
              branch to the end of the script and quit without starting a new
              cycle.

       [2addr]N
              Append the next line of input, less its terminating <newline>,
              to the pattern space, using an embedded <newline> to separate
              the appended material from the original material. Note that the
              current line number changes.

              If no next line of input is available, the N command verb shall
              branch to the end of the script and quit without starting a new
              cycle or copying the pattern space to standard output.

       [2addr]p
              Write the pattern space to standard output.

       [2addr]P
              Write the pattern space, up to the first <newline>, to standard
              output.

       [1addr]q
              Branch to the end of the script and quit without starting a new
              cycle.

       [1addr]r rfile
              Copy the contents of rfile to standard output as described
              previously. If rfile does not exist or cannot be read, it shall
              be treated as if it were an empty file, causing no error
              condition.

       [2addr]s/BRE/replacement/flags
              Substitute the replacement string for instances of the BRE in
              the pattern space. Any character other than backslash or
              <newline> can be used instead of a slash to delimit the BRE and
              the replacement. Within the BRE and the replacement, the BRE
              delimiter itself can be used as a literal character if it is
              preceded by a backslash.

              The replacement string shall be scanned from beginning to end.
              An ampersand ('&') appearing in the replacement shall be
              replaced by the string matching the BRE. The special meaning of
              '&' in this context can be suppressed by preceding it by a
              backslash. The characters "\n", where n is a digit, shall be
              replaced by the text matched by the corresponding backreference
              expression. The special meaning of "\n" where n is a digit in
              this context, can be suppressed by preceding it by a backslash.
              For each other backslash ('\') encountered, the following
              character shall lose its special meaning (if any). The meaning
              of a '\' immediately followed by any character other than '&',
              '\', a digit, or the delimiter character used for this command,
              is unspecified.

              A line can be split by substituting a <newline> into it. The
              application shall escape the <newline> in the replacement by
              preceding it by a backslash. A substitution shall be considered
              to have been performed even if the replacement string is
              identical to the string that it replaces. Any backslash used to
              alter the default meaning of a subsequent character shall be
              discarded from the BRE or the replacement before evaluating the
              BRE or using the replacement.

              The value of flags shall be zero or more of:

              n      Substitute for the nth occurrence only of the BRE found
                     within the pattern space.

              g      Globally substitute for all non-overlapping instances of
                     the BRE rather than just the first one. If both g and n
                     are specified, the results are unspecified.

              p      Write the pattern space to standard output if a
                     replacement was made.

              w wfile
                     Write. Append the pattern space to wfile if a replacement
                     was made. A conforming application shall precede the
                     wfile argument with one or more <blank>s. If the w flag
                     is not the last flag value given in a concatenation of
                     multiple flag values, the results are undefined.

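       The replacement rules and flags of the s command can be sketched as
       follows. The sample inputs are assumptions made for illustration.

```shell
# \1 and \2 refer to the first and second \( \) subexpressions.
swapped=$(printf 'hello world\n' | sed 's/\(hello\) \(world\)/\2 \1/')

# An unescaped & stands for the whole matched string.
wrapped=$(printf 'id 42\n' | sed 's/[0-9][0-9]*/<&>/')

# A backslash before & makes it a literal ampersand.
literal=$(printf 'a+b\n' | sed 's/+/ \& /')

# A numeric flag replaces only that occurrence; g replaces all of them.
second=$(printf 'a-a-a\n' | sed 's/a/A/2')
all=$(printf 'a-a-a\n' | sed 's/a/A/g')

# Any character other than backslash or <newline> can be the delimiter.
path=$(printf '/usr/lib\n' | sed 's,/usr,/opt,')

printf '%s\n' "$swapped" "$wrapped" "$literal" "$second" "$all" "$path"
```
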
       [2addr]t [label]
              Test. Branch to the : command verb bearing the label if any
              substitutions have been made since the most recent reading of an
              input line or execution of a t. If label is not specified,
              branch to the end of the script.

       [2addr]w wfile
              Append (write) the pattern space to wfile.

       [2addr]x
              Exchange the contents of the pattern and hold spaces.

       [2addr]y/string1/string2/
              Replace all occurrences of characters in string1 with the
              corresponding characters in string2. If a backslash followed by
              an 'n' appear in string1 or string2, the two characters shall be
              handled as a single <newline>. If the number of characters in
              string1 and string2 are not equal, or if any of the characters
              in string1 appear more than once, the results are undefined. Any
              character other than backslash or <newline> can be used instead
              of slash to delimit the strings. If the delimiter is not n,
              within string1 and string2, the delimiter itself can be used as
              a literal character if it is preceded by a backslash. If a
              backslash character is immediately followed by a backslash
              character in string1 or string2, the two backslash characters
              shall be counted as a single literal backslash character. The
              meaning of a backslash followed by any character that is not
              'n', a backslash, or the delimiter character is undefined.

       [0addr]:label
              Do nothing. This command bears a label to which the b and t
              commands branch.

       [1addr]=
              Write the following to standard output:

              "%d\n", <current line number>

       [0addr]
              Ignore this empty command.

       [0addr]#
              Ignore the '#' and the remainder of the line (treat them as a
              comment), with the single exception that if the first two
              characters in the script are "#n", the default output shall be
              suppressed; this shall be the equivalent of specifying -n on the
              command line.
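
       Two of the commands listed above, y and =, can be sketched as
       follows. The sample inputs are assumptions made for illustration.

```shell
# y maps each character of string1 to the character at the same
# position in string2; the two strings have equal length.
upper=$(printf 'abc cab\n' | sed 'y/abc/ABC/')

# "$=" writes the current line number on the last line only, which
# amounts to counting the input lines.
count=$(printf 'x\ny\nz\n' | sed -n '$=')

printf '%s\n' "$upper" "$count"
```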

EXIT STATUS

       The following exit values shall be returned:

        0     Successful completion.

       >0     An error occurred.

CONSEQUENCES OF ERRORS

       Default.

       The following sections are informative.

APPLICATION USAGE

       Regular expressions match entire strings, not just individual lines,
       but a <newline> is matched by '\n' in a sed RE; a <newline> is not
       allowed by the general definition of regular expression in
       IEEE Std 1003.1-2001. Also note that '\n' cannot be used to match a
       <newline> at the end of an arbitrary input line; <newline>s appear in
       the pattern space as a result of the N editing command.
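
       The point about '\n' can be sketched as follows. The two-line sample
       input is an assumption made for illustration.

```shell
# N appends the next input line, so the pattern space contains an
# embedded <newline> that '\n' can match.
joined=$(printf 'first\nsecond\n' | sed 'N;s/\n/ /')

# Without N the pattern space never holds a <newline>, so the same
# substitution changes nothing.
unchanged=$(printf 'first\nsecond\n' | sed 's/\n/ /')

printf '%s\n---\n%s\n' "$joined" "$unchanged"
```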

EXAMPLES

       This sed script simulates the BSD cat -s command, squeezing excess
       blank lines from standard input.

              sed -n '
              # Write non-empty lines.
              /./ {
                  p
                  d
                  }
              # Write a single empty line, then look for more empty lines.
              /^$/    p
              # Get next line, discard the held <newline> (empty line),
              # and look for more empty lines.
              :Empty
              /^$/    {
                  N
                  s/.//
                  b Empty
                  }
              # Write the non-empty line before going back to search
              # for the first in a set of empty lines.
                  p
              '

RATIONALE

       This volume of IEEE Std 1003.1-2001 requires implementations to support
       at least ten distinct wfiles, matching historical practice on many
       implementations. Implementations are encouraged to support more, but
       conforming applications should not exceed this limit.

       The exit status codes specified here are different from those in System
       V. System V returns 2 for garbled sed commands, but returns zero with
       its usage message or if the input file could not be opened. The
       standard developers considered this to be a bug.

       The manner in which the l command writes non-printable characters was
       changed to avoid the historical backspace-overstrike method, and other
       requirements to achieve unambiguous output were added. See the
       RATIONALE for ed for details of the format chosen, which is the same as
       that chosen for sed.

       This volume of IEEE Std 1003.1-2001 requires implementations to provide
       pattern and hold spaces of at least 8192 bytes, larger than the 4000
       bytes spaces used by some historical implementations, but less than the
       20480 bytes limit used in an early proposal. Implementations are
       encouraged to allocate dynamically larger pattern and hold spaces as
       needed.

       The requirements for acceptance of <blank>s and <space>s in command
       lines have been made more explicit than in early proposals to describe
       clearly the historical practice and to remove confusion about the
       phrase "protect initial blanks [sic] and tabs from the stripping that
       is done on every script line" that appears in much of the historical
       documentation of the sed utility description of text. (Not all
       implementations are known to have stripped <blank>s from text lines,
       although they all have allowed leading <blank>s preceding the address
       on a command line.)

       The treatment of '#' comments differs from the SVID which only allows a
       comment as the first line of the script, but matches BSD-derived
       implementations. The comment character is treated as a command, and it
       has the same properties in terms of being accepted with leading
       <blank>s; the BSD implementation has historically supported this.

       Early proposals required that a script_file have at least one
       non-comment line. Some historical implementations have behaved in
       unexpected ways if this were not the case. The standard developers
       considered that this was incorrect behavior and that application
       developers should not have to avoid this feature. A correct
       implementation of this volume of IEEE Std 1003.1-2001 shall permit
       script_files that consist only of comment lines.

       Early proposals indicated that if -e and -f options were intermixed,
       all -e options were processed before any -f options. This has been
       changed to process them in the order presented because it matches
       historical practice and is more intuitive.

       The treatment of the p flag to the s command differs between System V
       and BSD-based systems when the default output is suppressed. In the two
       examples:

              echo a | sed    's/a/A/p'
              echo a | sed -n 's/a/A/p'

       this volume of IEEE Std 1003.1-2001, BSD, System V documentation, and
       the SVID indicate that the first example should write two lines with A,
       whereas the second should write one. Some System V systems write the A
       only once in both examples because the p flag is ignored if the -n
       option is not specified.

       This is a case of a diametrical difference between systems that could
       not be reconciled through the compromise of declaring the behavior to
       be unspecified. The SVID/BSD/System V documentation behavior was
       adopted for this volume of IEEE Std 1003.1-2001 because:

        * No known documentation for any historic system describes the
          interaction between the p flag and the -n option.

        * The selected behavior is more correct as there is no technical
          justification for any interaction between the p flag and the -n
          option. A relationship between -n and the p flag might imply that
          they are only used together, but this ignores valid scripts that
          interrupt the cyclical nature of the processing through the use of
          the D, d, q, or branching commands. Such scripts rely on the p
          suffix to write the pattern space because they do not make use of
          the default output at the "bottom" of the script.

        * Because the -n option makes the p flag unnecessary, any interaction
          would only be useful if sed scripts were written to run both with
          and without the -n option. This is believed to be unlikely. It is
          even more unlikely that programmers have coded the p flag expecting
          it to be unnecessary. Because the interaction was not documented,
          the likelihood of a programmer discovering the interaction and
          depending on it is further decreased.

        * Finally, scripts that break under the specified behavior produce too
          much output instead of too little, which is easier to diagnose and
          correct.

       The form of the substitute command that uses the n suffix was limited
       to the first 512 matches in an early proposal. This limit has been
       removed because there is no reason an editor processing lines of
       {LINE_MAX} length should have this restriction. The command s/a/A/2047
       should be able to substitute the 2047th occurrence of a on a line.

       The b, t, and : commands are documented to ignore leading white space,
       but no mention is made of trailing white space. Historical
       implementations of sed assigned different locations to the labels 'x'
       and "x ". This is not useful, and leads to subtle programming errors,
       but it is historical practice, and changing it could theoretically
       break working scripts. Implementors are encouraged to provide warning
       messages about labels that are never used or jumps to labels that do
       not exist.

       Historically, the sed ! and } editing commands did not permit multiple
       commands on a single line using a semicolon as a command delimiter.
       Implementations are permitted, but not required, to support this
       extension.

FUTURE DIRECTIONS

       None.

SEE ALSO

       awk, ed, grep

COPYRIGHT

       Portions of this text are reprinted and reproduced in electronic form
       from IEEE Std 1003.1, 2003 Edition, Standard for Information Technology
       -- Portable Operating System Interface (POSIX), The Open Group Base
       Specifications Issue 6, Copyright (C) 2001-2003 by the Institute of
       Electrical and Electronics Engineers, Inc and The Open Group. In the
       event of any discrepancy between this version and the original IEEE and
       The Open Group Standard, the original IEEE and The Open Group Standard
       is the referee document. The original Standard can be obtained online
       at http://www.opengroup.org/unix/online.html .



IEEE/The Open Group                  2003                              SED(1P)