SED(P)                     POSIX Programmer's Manual                    SED(P)

NAME

       sed - stream editor

SYNOPSIS

       sed [-n] script [file...]

       sed [-n] [-e script]... [-f script_file]... [file...]

DESCRIPTION

       The sed utility is a stream editor that shall read one or more text
       files, make editing changes according to a script of editing commands,
       and write the results to standard output. The script shall be obtained
       from either the script operand string or a combination of the
       option-arguments from the -e script and -f script_file options.

OPTIONS

       The sed utility shall conform to the Base Definitions volume of
       IEEE Std 1003.1-2001, Section 12.2, Utility Syntax Guidelines, except
       that the order of presentation of the -e and -f options is significant.

       The following options shall be supported:

       -e  script
              Add the editing commands specified by the script option-argument
              to the end of the script of editing commands. The script
              option-argument shall have the same properties as the script
              operand, described in the OPERANDS section.

       -f  script_file
              Add the editing commands in the file script_file to the end of
              the script.

       -n     Suppress the default output (in which each line, after it is
              examined for editing, is written to standard output). Only lines
              explicitly selected for output are written.

       Multiple -e and -f options may be specified. All commands shall be
       added to the script in the order specified, regardless of their origin.
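       As an informal illustration of why the order of -e and -f matters, the
       sketch below applies the same two substitutions in both orders. The
       scratch file created with mktemp and the variable names are assumptions
       made for the example; mktemp is a common utility but is not required by
       this volume of IEEE Std 1003.1-2001.

```shell
# Assumption: a scratch script file; its name is illustrative only.
script_file=$(mktemp)
printf '%s\n' 's/b/c/' > "$script_file"

# -e comes first: a becomes b, then the -f script turns b into c.
first=$(printf 'a\n' | sed -e 's/a/b/' -f "$script_file")

# -f comes first: it finds no b yet, then -e turns a into b.
second=$(printf 'a\n' | sed -f "$script_file" -e 's/a/b/')

printf '%s %s\n' "$first" "$second"
rm -f "$script_file"
```

       The first invocation yields c and the second yields b, because the
       commands are added to the script in the order the options appear.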

OPERANDS

       The following operands shall be supported:

       file   A pathname of a file whose contents are read and edited. If
              multiple file operands are specified, the named files shall be
              read in the order specified and the concatenation shall be
              edited. If no file operands are specified, the standard input
              shall be used.

       script A string to be used as the script of editing commands. The
              application shall not present a script that violates the
              restrictions of a text file except that the final character need
              not be a <newline>.

STDIN

       The standard input shall be used only if no file operands are
       specified. See the INPUT FILES section.

INPUT FILES

       The input files shall be text files. The script_files named by the -f
       option shall consist of editing commands.

ENVIRONMENT VARIABLES

       The following environment variables shall affect the execution of sed:

       LANG   Provide a default value for the internationalization variables
              that are unset or null. (See the Base Definitions volume of
              IEEE Std 1003.1-2001, Section 8.2, Internationalization
              Variables for the precedence of internationalization variables
              used to determine the values of locale categories.)

       LC_ALL If set to a non-empty string value, override the values of all
              the other internationalization variables.

       LC_COLLATE
              Determine the locale for the behavior of ranges, equivalence
              classes, and multi-character collating elements within regular
              expressions.

       LC_CTYPE
              Determine the locale for the interpretation of sequences of
              bytes of text data as characters (for example, single-byte as
              opposed to multi-byte characters in arguments and input files),
              and the behavior of character classes within regular
              expressions.

       LC_MESSAGES
              Determine the locale that should be used to affect the format
              and contents of diagnostic messages written to standard error.

       NLSPATH
              Determine the location of message catalogs for the processing of
              LC_MESSAGES.

ASYNCHRONOUS EVENTS

       Default.

STDOUT

       The input files shall be written to standard output, with the editing
       commands specified in the script applied. If the -n option is
       specified, only those input lines selected by the script shall be
       written to standard output.

STDERR

       The standard error shall be used only for diagnostic messages.

OUTPUT FILES

       The output files shall be text files whose formats are dependent on the
       editing commands given.

EXTENDED DESCRIPTION

       The script shall consist of editing commands of the following form:

              [address[,address]]function

       where function represents a single-character command verb from the list
       in Editing Commands in sed, followed by any applicable arguments.

       The command can be preceded by <blank>s and/or semicolons. The function
       can be preceded by <blank>s. These optional characters shall have no
       effect.

       In default operation, sed cyclically shall append a line of input, less
       its terminating <newline>, into the pattern space. Normally the pattern
       space will be empty, unless a D command terminated the last cycle. The
       sed utility shall then apply in sequence all commands whose addresses
       select that pattern space, and at the end of the script copy the
       pattern space to standard output (except when -n is specified) and
       delete the pattern space. Whenever the pattern space is written to
       standard output or a named file, sed shall immediately follow it with a
       <newline>.

       Some of the editing commands use a hold space to save all or part of
       the pattern space for subsequent retrieval. The pattern and hold spaces
       shall each be able to hold at least 8192 bytes.
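       The cycle described above can be observed with a small sketch. The
       input text and variable names are assumptions chosen for illustration.

```shell
# Default output: every line is copied at the end of its cycle, and the
# explicit p writes the selected line a second time.
with_default=$(printf 'one\ntwo\n' | sed '/two/p')

# With -n the end-of-cycle copy is suppressed; only the explicit p writes.
without_default=$(printf 'one\ntwo\n' | sed -n '/two/p')

printf '%s\n' "$with_default"
printf '%s\n' "$without_default"
```

       The first command writes one, two, two; the second writes only two.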

   Addresses in sed
       An address is either a decimal number that counts input lines
       cumulatively across files, a '$' character that addresses the last
       line of input, or a context address (which consists of a BRE, as
       described in Regular Expressions in sed, preceded and followed by a
       delimiter, usually a slash).

       An editing command with no addresses shall select every pattern space.

       An editing command with one address shall select each pattern space
       that matches the address.

       An editing command with two addresses shall select the inclusive range
       from the first pattern space that matches the first address through the
       next pattern space that matches the second. (If the second address is a
       number less than or equal to the line number first selected, only one
       line shall be selected.) Starting at the first line following the
       selected range, sed shall look again for the first address. Thereafter,
       the process shall be repeated. Omitting either or both of the address
       components in the following form produces undefined results:

              [address[,address]]
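       A minimal sketch of the address forms described above, assuming a
       five-line input; the variable names are illustrative only.

```shell
# Five-line sample input (an assumption for this example).
input='a
b
c
d
e'

by_number=$(printf '%s\n' "$input" | sed -n '2p')
last_line=$(printf '%s\n' "$input" | sed -n '$p')
by_context=$(printf '%s\n' "$input" | sed -n '/c/p')
range=$(printf '%s\n' "$input" | sed -n '/b/,/d/p')
# The second address (2) is not greater than the first selected line (4),
# so only line 4 is selected.
one_line=$(printf '%s\n' "$input" | sed -n '4,2p')

printf '%s\n' "$by_number" "$last_line" "$by_context" "$range" "$one_line"
```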

   Regular Expressions in sed
       The sed utility shall support the BREs described in the Base
       Definitions volume of IEEE Std 1003.1-2001, Section 9.3, Basic Regular
       Expressions, with the following additions:

        * In a context address, the construction "\cBREc", where c is any
          character other than backslash or <newline>, shall be identical to
          "/BRE/". If the character designated by c appears following a
          backslash, then it shall be considered to be that literal character,
          which shall not terminate the BRE. For example, in the context
          address "\xabc\xdefx", the second x stands for itself, so that the
          BRE is "abcxdef".

        * The escape sequence '\n' shall match a <newline> embedded in the
          pattern space. A literal <newline> shall not be used in the BRE of a
          context address or in the substitute function.

        * If an RE is empty (that is, no pattern is specified) sed shall
          behave as if the last RE used in the last command applied (either as
          an address or as part of a substitute command) was specified.
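       The three additions listed above might be exercised as follows. This
       is only a sketch; the sample inputs and variable names are assumptions.

```shell
# \cBREc with c chosen as a comma, so the slashes in the BRE need no escape.
custom=$(printf '/usr/bin\n/opt\n' | sed -n '\,^/usr/,p')

# \n matches the <newline> that the N command embeds in the pattern space.
joined=$(printf 'a\nb\n' | sed 'N;s/\n/+/')

# An empty RE reuses the last RE used, here the context address /b/.
reused=$(printf 'abc\n' | sed '/b/s//X/')

printf '%s\n' "$custom" "$joined" "$reused"
```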

   Editing Commands in sed
       In the following list of editing commands, the maximum number of
       permissible addresses for each function is indicated by [0addr],
       [1addr], or [2addr], representing zero, one, or two addresses.

       The argument text shall consist of one or more lines. Each embedded
       <newline> in the text shall be preceded by a backslash. Other
       backslashes in text shall be removed, and the following character shall
       be treated literally.

       The r and w command verbs, and the w flag to the s command, take an
       optional rfile (or wfile) parameter, separated from the command verb
       letter or flag by one or more <blank>s; implementations may allow zero
       separation as an extension.

       The argument rfile or the argument wfile shall terminate the editing
       command. Each wfile shall be created before processing begins.
       Implementations shall support at least ten wfile arguments in the
       script; the actual number (greater than or equal to 10) that is
       supported by the implementation is unspecified. The use of the wfile
       parameter shall cause that file to be initially created, if it does not
       exist, or shall replace the contents of an existing file.
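       As a sketch of the wfile behavior, the example below writes selected
       lines to a file that already has contents. The scratch file obtained
       from mktemp and the variable names are assumptions for illustration;
       mktemp is not required by this volume of IEEE Std 1003.1-2001.

```shell
# Assumption: a scratch file stands in for the wfile.
wfile=$(mktemp)
printf 'stale contents\n' > "$wfile"

# The file name terminates the w command, so nothing follows it on the line.
printf 'ok 1\nerr 2\nok 3\n' | sed -n "/^err/w $wfile"

saved=$(cat "$wfile")
printf '%s\n' "$saved"
rm -f "$wfile"
```

       The previous contents are replaced, so only the line err 2 remains in
       the file.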

       The b, r, s, t, w, y, and : command verbs shall accept additional
       arguments. The following synopses indicate which arguments shall be
       separated from the command verbs by a single <space>.

       The a and r commands schedule text for later output. The text specified
       for the a command, and the contents of the file specified for the r
       command, shall be written to standard output just before the next
       attempt to fetch a line of input when executing the N or n commands, or
       when reaching the end of the script. If written when reaching the end
       of the script, and the -n option was not specified, the text shall be
       written after copying the pattern space to standard output. The
       contents of the file specified for the r command shall be as of the
       time the output is written, not the time the r command is applied. The
       text shall be output in the order in which the a and r commands were
       applied to the input.
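       The scheduling of appended text can be sketched with two a commands
       applied to the same line; the text and variable name are assumptions
       for illustration.

```shell
# Both a commands select line 1. Their text is queued and written, in the
# order applied, after the pattern space has been copied to standard output.
appended=$(printf 'one\ntwo\n' | sed -e '1a\
first' -e '1a\
second')

printf '%s\n' "$appended"
```

       The output is one, first, second, two, each on its own line.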

       Command verbs other than {, a, b, c, i, r, t, w, :, and # can be
       followed by a semicolon, optional <blank>s, and another command verb.
       However, when the s command verb is used with the w flag, following it
       with another command in this manner produces undefined results.

       A function can be preceded by one or more '!' characters, in which case
       the function shall be applied if the addresses do not select the
       pattern space. Zero or more <blank>s shall be accepted before the first
       '!' character. It is unspecified whether <blank>s can follow a '!'
       character, and conforming applications shall not follow a '!'
       character with <blank>s.
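       A short sketch of address negation, with no <blank> after the '!'; the
       input and variable name are assumptions.

```shell
# Delete every line that the address /keep/ does NOT select.
kept=$(printf 'keep 1\ndrop\nkeep 2\n' | sed '/keep/!d')

printf '%s\n' "$kept"
```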

       [2addr] {function
       function
       ...
       }      Execute a list of sed functions only when the pattern space is
              selected. The list of sed functions shall be surrounded by
              braces and separated by <newline>s, and conform to the following
              rules. The braces can be preceded or followed by <blank>s. The
              functions can be preceded by <blank>s, but shall not be followed
              by <blank>s. The <right-brace> shall be preceded by a <newline>
              and can be preceded or followed by <blank>s.
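       A grouped command list might look like the following sketch, which
       keeps each function on its own line and the <right-brace> after a
       <newline>. The input and variable name are assumptions.

```shell
# Both functions run only for lines selected by /^x=/.
grouped=$(printf 'x=1\ny=1\n' | sed -n '/^x=/ {
    s/1/one/
    p
}')

printf '%s\n' "$grouped"
```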

       [1addr]a\
       text   Write text to standard output as described previously.

       [2addr]b [label]
              Branch to the : function bearing the label. If label is not
              specified, branch to the end of the script. The implementation
              shall support labels recognized as unique up to at least 8
              characters; the actual length (greater than or equal to 8) that
              shall be supported by the implementation is unspecified. It is
              unspecified whether exceeding a label length causes an error or
              a silent truncation.

       [2addr]c\
       text   Delete the pattern space. With a 0 or 1 address or at the end of
              a 2-address range, place text on the output and start the next
              cycle.

       [2addr]d
              Delete the pattern space and start the next cycle.

       [2addr]D
              Delete the initial segment of the pattern space through the
              first <newline> and start the next cycle.

       [2addr]g
              Replace the contents of the pattern space by the contents of the
              hold space.

       [2addr]G
              Append to the pattern space a <newline> followed by the contents
              of the hold space.

       [2addr]h
              Replace the contents of the hold space with the contents of the
              pattern space.

       [2addr]H
              Append to the hold space a <newline> followed by the contents of
              the pattern space.
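       The hold-space commands can be combined to reverse the order of input
       lines. This is a well-known idiom shown here only as a sketch; the
       input and variable name are assumptions.

```shell
reversed=$(printf 'a\nb\nc\n' | sed -n '
# On every line but the first, append the held lines after the current one.
1!G
# Save the accumulated text back into the hold space.
h
# On the last line the pattern space holds the input in reverse order.
$p
')

printf '%s\n' "$reversed"
```

       For the three-line input above the output is c, b, a.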

       [1addr]i\
       text   Write text to standard output.

       [2addr]l
              (The letter ell.) Write the pattern space to standard output in
              a visually unambiguous form. The characters listed in the Base
              Definitions volume of IEEE Std 1003.1-2001, Table 5-1, Escape
              Sequences and Associated Actions ('\\', '\a', '\b', '\f', '\r',
              '\t', '\v') shall be written as the corresponding escape
              sequence; the '\n' in that table is not applicable.
              Non-printable characters not in that table shall be written as
              one three-digit octal number (with a preceding backslash) for
              each byte in the character (most significant byte first). If the
              size of a byte on the system is greater than 9 bits, the format
              used for non-printable characters is implementation-defined.

              Long lines shall be folded, with the point of folding indicated
              by writing a backslash followed by a <newline>; the length at
              which folding occurs is unspecified, but should be appropriate
              for the output device. The end of each line shall be marked with
              a '$'.
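       A small sketch of the l command applied to a line containing a <tab>;
       the input and variable name are assumptions.

```shell
# printf expands \t to a real <tab>; l writes it back as the two-character
# escape sequence and marks the end of the line with $.
visible=$(printf 'a\tb\n' | sed -n 'l')

printf '%s\n' "$visible"
```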

       [2addr]n
              Write the pattern space to standard output if the default output
              has not been suppressed, and replace the pattern space with the
              next line of input, less its terminating <newline>.

              If no next line of input is available, the n command verb shall
              branch to the end of the script and quit without starting a new
              cycle.

       [2addr]N
              Append the next line of input, less its terminating <newline>,
              to the pattern space, using an embedded <newline> to separate
              the appended material from the original material. Note that the
              current line number changes.

              If no next line of input is available, the N command verb shall
              branch to the end of the script and quit without starting a new
              cycle or copying the pattern space to standard output.
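       The difference between n and N can be sketched as follows. The input
       deliberately has an even number of lines, so neither command is
       executed when no next line is available; the variable names are
       assumptions.

```shell
# n: write the current line, load the next one, which d then deletes.
odd_lines=$(printf '1\n2\n3\n4\n' | sed 'n;d')

# N: append the next line, then replace the embedded <newline>.
pairs=$(printf '1\n2\n3\n4\n' | sed 'N;s/\n/,/')

printf '%s\n' "$odd_lines" "$pairs"
```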

       [2addr]p
              Write the pattern space to standard output.

       [2addr]P
              Write the pattern space, up to the first <newline>, to standard
              output.

       [1addr]q
              Branch to the end of the script and quit without starting a new
              cycle.
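       As a brief sketch, q can stop processing after a given line, and P
       writes only the first line of a multi-line pattern space. The inputs
       and variable names are assumptions.

```shell
# q at line 2: the pattern space is still copied by the default output
# before sed quits, so lines 1 and 2 are written.
head_two=$(printf '1\n2\n3\n' | sed '2q')

# After N the pattern space is "a<newline>b"; P writes only "a".
first_part=$(printf 'a\nb\n' | sed -n 'N;P')

printf '%s\n' "$head_two" "$first_part"
```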

       [1addr]r  rfile
              Copy the contents of rfile to standard output as described
              previously. If rfile does not exist or cannot be read, it shall
              be treated as if it were an empty file, causing no error
              condition.

       [2addr]s/BRE/replacement/flags
              Substitute the replacement string for instances of the BRE in
              the pattern space. Any character other than backslash or
              <newline> can be used instead of a slash to delimit the BRE and
              the replacement. Within the BRE and the replacement, the BRE
              delimiter itself can be used as a literal character if it is
              preceded by a backslash.

              The replacement string shall be scanned from beginning to end.
              An ampersand ('&') appearing in the replacement shall be
              replaced by the string matching the BRE. The special meaning of
              '&' in this context can be suppressed by preceding it by a
              backslash. The characters "\n", where n is a digit, shall be
              replaced by the text matched by the corresponding backreference
              expression. The special meaning of "\n" where n is a digit in
              this context, can be suppressed by preceding it by a backslash.
              For each other backslash ('\') encountered, the following
              character shall lose its special meaning (if any). The meaning
              of a '\' immediately followed by any character other than '&',
              '\', a digit, or the delimiter character used for this command,
              is unspecified.

              A line can be split by substituting a <newline> into it. The
              application shall escape the <newline> in the replacement by
              preceding it by a backslash. A substitution shall be considered
              to have been performed even if the replacement string is
              identical to the string that it replaces. Any backslash used to
              alter the default meaning of a subsequent character shall be
              discarded from the BRE or the replacement before evaluating the
              BRE or using the replacement.

              The value of flags shall be zero or more of:

              n      Substitute for the nth occurrence only of the BRE found
                     within the pattern space.

              g      Globally substitute for all non-overlapping instances of
                     the BRE rather than just the first one. If both g and n
                     are specified, the results are unspecified.

              p      Write the pattern space to standard output if a
                     replacement was made.

              w  wfile
                     Write. Append the pattern space to wfile if a replacement
                     was made. A conforming application shall precede the
                     wfile argument with one or more <blank>s. If the w flag
                     is not the last flag value given in a concatenation of
                     multiple flag values, the results are undefined.
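       The replacement rules and flags above can be sketched with a few
       one-line substitutions. The inputs and variable names are assumptions
       chosen for illustration.

```shell
# & stands for the matched string; \& is a literal ampersand.
wrapped=$(printf 'cat\n' | sed 's/cat/[&]/')
literal=$(printf 'cat\n' | sed 's/cat/\&/')

# \1 and \2 refer to the first and second backreference expressions.
swapped=$(printf 'key=value\n' | sed 's/\(.*\)=\(.*\)/\2=\1/')

# A numeric flag selects one occurrence; g selects all of them.
second=$(printf 'a-a-a\n' | sed 's/a/A/2')
global=$(printf 'a-a-a\n' | sed 's/a/A/g')

# A comma as delimiter avoids escaping the slashes in a pathname.
moved=$(printf '/usr/lib\n' | sed 's,/usr,/opt,')

# An escaped <newline> in the replacement splits the line.
split=$(printf 'a,b\n' | sed 's/,/\
/')

printf '%s\n' "$wrapped" "$literal" "$swapped" "$second" "$global" \
    "$moved" "$split"
```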

       [2addr]t [label]
              Test. Branch to the : command verb bearing the label if any
              substitutions have been made since the most recent reading of an
              input line or execution of a t. If label is not specified,
              branch to the end of the script.
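       A loop built from a label, s, and t might look like the following
       sketch, which repeatedly removes the innermost parenthesized group
       until no substitution is made. The label name, input, and variable
       name are assumptions.

```shell
flattened=$(printf 'a(b(c)d)e\n' | sed '
:again
s/([^()]*)//
t again
')

printf '%s\n' "$flattened"
```

       The first pass removes (c), the second removes (bd), and the third
       makes no substitution, so t falls through and ae is written.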

       [2addr]w  wfile
              Append (write) the pattern space to wfile.

       [2addr]x
              Exchange the contents of the pattern and hold spaces.

       [2addr]y/string1/string2/
              Replace all occurrences of characters in string1 with the
              corresponding characters in string2. If a backslash followed by
              an 'n' appear in string1 or string2, the two characters shall be
              handled as a single <newline>. If the number of characters in
              string1 and string2 are not equal, or if any of the characters
              in string1 appear more than once, the results are undefined. Any
              character other than backslash or <newline> can be used instead
              of slash to delimit the strings. If the delimiter is not n,
              within string1 and string2, the delimiter itself can be used as
              a literal character if it is preceded by a backslash. If a
              backslash character is immediately followed by a backslash
              character in string1 or string2, the two backslash characters
              shall be counted as a single literal backslash character. The
              meaning of a backslash followed by any character that is not
              'n', a backslash, or the delimiter character is undefined.
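       A sketch of the y command with string1 and string2 of equal length and
       no repeated characters in string1; the input and variable name are
       assumptions.

```shell
# Each character of string1 maps to the character at the same position in
# string2; characters not listed (here the hyphen) are left alone.
upper=$(printf 'abc-xyz\n' | sed 'y/abcxyz/ABCXYZ/')

printf '%s\n' "$upper"
```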

       [0addr]:label
              Do nothing. This command bears a label to which the b and t
              commands branch.

       [1addr]=
              Write the following to standard output:

              "%d\n", <current line number>
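       As a sketch, = can report the number of a matching line or, addressed
       with '$', the total number of input lines. The input and variable
       names are assumptions.

```shell
# Line number of the line matching /b/.
match_line=$(printf 'a\nb\nc\n' | sed -n '/b/=')

# Line number of the last line, that is, the line count.
line_count=$(printf 'a\nb\nc\n' | sed -n '$=')

printf '%s %s\n' "$match_line" "$line_count"
```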

       [0addr]
              Ignore this empty command.

       [0addr]#
              Ignore the '#' and the remainder of the line (treat them as a
              comment), with the single exception that if the first two
              characters in the script are "#n", the default output shall be
              suppressed; this shall be the equivalent of specifying -n on the
              command line.
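       The "#n" exception can be sketched as follows, assuming the script is
       passed as the script operand and "#n" is alone on its first line. The
       input and variable name are assumptions.

```shell
# "#n" as the first two characters of the script acts like the -n option,
# so only the explicit p writes output.
selected=$(printf 'a\nb\n' | sed '#n
/b/p')

printf '%s\n' "$selected"
```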

EXIT STATUS

       The following exit values shall be returned:

        0     Successful completion.

       >0     An error occurred.

CONSEQUENCES OF ERRORS

       Default.

       The following sections are informative.

APPLICATION USAGE

       Regular expressions match entire strings, not just individual lines,
       but a <newline> is matched by '\n' in a sed RE; a <newline> is not
       allowed by the general definition of regular expression in
       IEEE Std 1003.1-2001. Also note that '\n' cannot be used to match a
       <newline> at the end of an arbitrary input line; <newline>s appear in
       the pattern space as a result of the N editing command.
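       The point about '\n' can be sketched by comparing a substitution with
       and without N. The label name, input, and variable names are
       assumptions.

```shell
# Without N the pattern space never contains a <newline>, so the
# substitution matches nothing and the input is written unchanged.
unchanged=$(printf 'a\nb\nc\n' | sed 's/\n/ /')

# Append every remaining line with N first, then replace the embedded
# <newline>s once the last line has been read.
joined=$(printf 'a\nb\nc\n' | sed '
:join
$!{
N
b join
}
s/\n/ /g
')

printf '%s\n' "$unchanged" "$joined"
```

       The loop runs N only while the current line is not the last one, so N
       is never executed when no next line is available.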

EXAMPLES

       This sed script simulates the BSD cat -s command, squeezing excess
       blank lines from standard input.

              sed -n '
              # Write non-empty lines.
              /./ {
                  p
                  d
                  }
              # Write a single empty line, then look for more empty lines.
              /^$/    p
              # Get next line, discard the held <newline> (empty line),
              # and look for more empty lines.
              :Empty
              /^$/    {
                  N
                  s/.//
                  b Empty
                  }
              # Write the non-empty line before going back to search
              # for the first in a set of empty lines.
                  p
              '

RATIONALE

       This volume of IEEE Std 1003.1-2001 requires implementations to support
       at least ten distinct wfiles, matching historical practice on many
       implementations. Implementations are encouraged to support more, but
       conforming applications should not exceed this limit.

       The exit status codes specified here are different from those in System
       V. System V returns 2 for garbled sed commands, but returns zero with
       its usage message or if the input file could not be opened. The
       standard developers considered this to be a bug.

       The manner in which the l command writes non-printable characters was
       changed to avoid the historical backspace-overstrike method, and other
       requirements to achieve unambiguous output were added. See the
       RATIONALE for ed for details of the format chosen, which is the same as
       that chosen for sed.

       This volume of IEEE Std 1003.1-2001 requires implementations to provide
       pattern and hold spaces of at least 8192 bytes, larger than the
       4000-byte spaces used by some historical implementations, but less than
       the 20480-byte limit used in an early proposal. Implementations are
       encouraged to allocate dynamically larger pattern and hold spaces as
       needed.

       The requirements for acceptance of <blank>s and <space>s in command
       lines have been made more explicit than in early proposals to describe
       clearly the historical practice and to remove confusion about the
       phrase "protect initial blanks [sic] and tabs from the stripping that
       is done on every script line" that appears in much of the historical
       documentation of the sed utility description of text. (Not all
       implementations are known to have stripped <blank>s from text lines,
       although they all have allowed leading <blank>s preceding the address
       on a command line.)

       The treatment of '#' comments differs from the SVID, which only allows
       a comment as the first line of the script, but matches BSD-derived
       implementations. The comment character is treated as a command, and it
       has the same properties in terms of being accepted with leading
       <blank>s; the BSD implementation has historically supported this.

       Early proposals required that a script_file have at least one
       non-comment line. Some historical implementations have behaved in
       unexpected ways if this were not the case. The standard developers
       considered that this was incorrect behavior and that application
       developers should not have to avoid this feature. A correct
       implementation of this volume of IEEE Std 1003.1-2001 shall permit
       script_files that consist only of comment lines.

       Early proposals indicated that if -e and -f options were intermixed,
       all -e options were processed before any -f options. This has been
       changed to process them in the order presented because it matches
       historical practice and is more intuitive.

       The treatment of the p flag to the s command differs between System V
       and BSD-based systems when the default output is suppressed. In the two
       examples:

              echo a | sed    's/a/A/p'
              echo a | sed -n 's/a/A/p'

       this volume of IEEE Std 1003.1-2001, BSD, System V documentation, and
       the SVID indicate that the first example should write two lines with A,
       whereas the second should write one. Some System V systems write the A
       only once in both examples because the p flag is ignored if the -n
       option is not specified.

       This is a case of a diametrical difference between systems that could
       not be reconciled through the compromise of declaring the behavior to
       be unspecified. The SVID/BSD/System V documentation behavior was
       adopted for this volume of IEEE Std 1003.1-2001 because:

        * No known documentation for any historic system describes the
          interaction between the p flag and the -n option.

        * The selected behavior is more correct as there is no technical
          justification for any interaction between the p flag and the -n
          option. A relationship between -n and the p flag might imply that
          they are only used together, but this ignores valid scripts that
          interrupt the cyclical nature of the processing through the use of
          the D, d, q, or branching commands. Such scripts rely on the p
          suffix to write the pattern space because they do not make use of
          the default output at the "bottom" of the script.

        * Because the -n option makes the p flag unnecessary, any interaction
          would only be useful if sed scripts were written to run both with
          and without the -n option. This is believed to be unlikely. It is
          even more unlikely that programmers have coded the p flag expecting
          it to be unnecessary. Because the interaction was not documented,
          the likelihood of a programmer discovering the interaction and
          depending on it is further decreased.

        * Finally, scripts that break under the specified behavior produce too
          much output instead of too little, which is easier to diagnose and
          correct.

       The form of the substitute command that uses the n suffix was limited
       to the first 512 matches in an early proposal. This limit has been
       removed because there is no reason an editor processing lines of
       {LINE_MAX} length should have this restriction. The command s/a/A/2047
       should be able to substitute the 2047th occurrence of a on a line.

       The b, t, and : commands are documented to ignore leading white space,
       but no mention is made of trailing white space. Historical
       implementations of sed assigned different locations to the labels 'x'
       and "x ". This is not useful, and leads to subtle programming errors,
       but it is historical practice, and changing it could theoretically
       break working scripts. Implementors are encouraged to provide warning
       messages about labels that are never used or jumps to labels that do
       not exist.

       Historically, the sed ! and } editing commands did not permit multiple
       commands on a single line using a semicolon as a command delimiter.
       Implementations are permitted, but not required, to support this
       extension.

FUTURE DIRECTIONS

       None.

SEE ALSO

       awk, ed, grep

COPYRIGHT

       Portions of this text are reprinted and reproduced in electronic form
       from IEEE Std 1003.1, 2003 Edition, Standard for Information Technology
       -- Portable Operating System Interface (POSIX), The Open Group Base
       Specifications Issue 6, Copyright (C) 2001-2003 by the Institute of
       Electrical and Electronics Engineers, Inc and The Open Group. In the
       event of any discrepancy between this version and the original IEEE and
       The Open Group Standard, the original IEEE and The Open Group Standard
       is the referee document. The original Standard can be obtained online
       at http://www.opengroup.org/unix/online.html .

IEEE/The Open Group                  2003                               SED(P)