SED(1P)                    POSIX Programmer's Manual                   SED(1P)

PROLOG
       This manual page is part of the POSIX Programmer's Manual. The Linux
       implementation of this interface may differ (consult the corresponding
       Linux manual page for details of Linux behavior), or the interface may
       not be implemented on Linux.

NAME
       sed - stream editor

SYNOPSIS
       sed [-n] script [file...]

       sed [-n] [-e script]... [-f script_file]... [file...]

DESCRIPTION
       The sed utility is a stream editor that shall read one or more text
       files, make editing changes according to a script of editing commands,
       and write the results to standard output. The script shall be obtained
       from either the script operand string or a combination of the
       option-arguments from the -e script and -f script_file options.

OPTIONS
       The sed utility shall conform to the Base Definitions volume of
       IEEE Std 1003.1-2001, Section 12.2, Utility Syntax Guidelines, except
       that the order of presentation of the -e and -f options is significant.

       The following options shall be supported:

       -e script
              Add the editing commands specified by the script option-argument
              to the end of the script of editing commands. The script
              option-argument shall have the same properties as the script
              operand, described in the OPERANDS section.

       -f script_file
              Add the editing commands in the file script_file to the end of
              the script.

       -n     Suppress the default output (in which each line, after it is
              examined for editing, is written to standard output). Only lines
              explicitly selected for output are written.

       Multiple -e and -f options may be specified. All commands shall be
       added to the script in the order specified, regardless of their origin.
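
       As an illustration of why the order of -e options matters, the
       following sketch runs the same two substitutions in both orders. The
       variable names first and second are chosen here for the example only.

```shell
# Fragments are appended to the script in command-line order.
# a -> b, then b -> c: the second command sees the result of the first.
first=$(printf 'a\n' | sed -e 's/a/b/' -e 's/b/c/')
# b -> c finds nothing to change, then a -> b.
second=$(printf 'a\n' | sed -e 's/b/c/' -e 's/a/b/')
printf '%s\n%s\n' "$first" "$second"
```

       The first invocation should write c and the second b, because each
       command operates on the pattern space left by the commands before it.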

OPERANDS
       The following operands shall be supported:

       file   A pathname of a file whose contents are read and edited. If
              multiple file operands are specified, the named files shall be
              read in the order specified and the concatenation shall be
              edited. If no file operands are specified, the standard input
              shall be used.

       script A string to be used as the script of editing commands. The
              application shall not present a script that violates the
              restrictions of a text file except that the final character need
              not be a <newline>.

STDIN
       The standard input shall be used only if no file operands are
       specified. See the INPUT FILES section.

INPUT FILES
       The input files shall be text files. The script_files named by the -f
       option shall consist of editing commands.

ENVIRONMENT VARIABLES
       The following environment variables shall affect the execution of sed:

       LANG   Provide a default value for the internationalization variables
              that are unset or null. (See the Base Definitions volume of
              IEEE Std 1003.1-2001, Section 8.2, Internationalization
              Variables for the precedence of internationalization variables
              used to determine the values of locale categories.)

       LC_ALL If set to a non-empty string value, override the values of all
              the other internationalization variables.

       LC_COLLATE
              Determine the locale for the behavior of ranges, equivalence
              classes, and multi-character collating elements within regular
              expressions.

       LC_CTYPE
              Determine the locale for the interpretation of sequences of
              bytes of text data as characters (for example, single-byte as
              opposed to multi-byte characters in arguments and input files),
              and the behavior of character classes within regular
              expressions.

       LC_MESSAGES
              Determine the locale that should be used to affect the format
              and contents of diagnostic messages written to standard error.

       NLSPATH
              Determine the location of message catalogs for the processing of
              LC_MESSAGES.

ASYNCHRONOUS EVENTS
       Default.

STDOUT
       The input files shall be written to standard output, with the editing
       commands specified in the script applied. If the -n option is
       specified, only those input lines selected by the script shall be
       written to standard output.
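
       The difference between default output and -n can be sketched as
       follows. The sample input and the variable names all and selected are
       illustrative assumptions, not part of the standard.

```shell
# Without -n every line is written after editing.
all=$(printf 'one\ntwo\nthree\n' | sed 's/two/TWO/')
# With -n only lines explicitly written (here by the p flag) appear.
selected=$(printf 'one\ntwo\nthree\n' | sed -n 's/two/TWO/p')
printf '%s\n---\n%s\n' "$all" "$selected"
```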

STDERR
       The standard error shall be used only for diagnostic messages.

OUTPUT FILES
       The output files shall be text files whose formats are dependent on the
       editing commands given.

EXTENDED DESCRIPTION
       The script shall consist of editing commands of the following form:

              [address[,address]]function

       where function represents a single-character command verb from the list
       in Editing Commands in sed, followed by any applicable arguments.

       The command can be preceded by <blank>s and/or semicolons. The function
       can be preceded by <blank>s. These optional characters shall have no
       effect.

       In default operation, sed cyclically shall append a line of input, less
       its terminating <newline>, into the pattern space. Normally the pattern
       space will be empty, unless a D command terminated the last cycle. The
       sed utility shall then apply in sequence all commands whose addresses
       select that pattern space, and at the end of the script copy the
       pattern space to standard output (except when -n is specified) and
       delete the pattern space. Whenever the pattern space is written to
       standard output or a named file, sed shall immediately follow it with a
       <newline>.

       Some of the editing commands use a hold space to save all or part of
       the pattern space for subsequent retrieval. The pattern and hold spaces
       shall each be able to hold at least 8192 bytes.
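
       A minimal sketch of how the hold space carries data from one cycle to
       the next: the script below reverses the order of its input lines using
       the G, h, and p commands described under Editing Commands in sed. The
       three-line input and the variable name reversed are assumptions made
       for the example.

```shell
# 1!G  on every line but the first, append the hold space to the pattern space
# h    save the accumulated (reversed) text back into the hold space
# $p   on the last line, write the pattern space
reversed=$(printf 'a\nb\nc\n' | sed -n '1!G;h;$p')
printf '%s\n' "$reversed"
```

       Because the whole input accumulates in both spaces, a script of this
       kind is only portable for inputs that fit within the 8192-byte minimum.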

   Addresses in sed
       An address is either a decimal number that counts input lines
       cumulatively across files, a '$' character that addresses the last line
       of input, or a context address (which consists of a BRE, as described
       in Regular Expressions in sed, preceded and followed by a delimiter,
       usually a slash).

       An editing command with no addresses shall select every pattern space.

       An editing command with one address shall select each pattern space
       that matches the address.

       An editing command with two addresses shall select the inclusive range
       from the first pattern space that matches the first address through the
       next pattern space that matches the second. (If the second address is a
       number less than or equal to the line number first selected, only one
       line shall be selected.) Starting at the first line following the
       selected range, sed shall look again for the first address. Thereafter,
       the process shall be repeated. Omitting either or both of the address
       components in the following form produces undefined results:

              [address[,address]]
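
       The address rules above can be exercised with a short sketch. The
       temporary directory created with mktemp -d (a widely available utility
       that is not itself required by this volume), the file names a and b,
       and all variable names are assumptions for illustration.

```shell
# Inclusive numeric range: lines 2 through 4.
range=$(printf '1\n2\n3\n4\n5\n' | sed -n '2,4p')
# The second address (2) is not greater than the line selected by /c/
# (line 3), so only that one line is selected.
single=$(printf 'a\nb\nc\nd\n' | sed -n '/c/,2p')

# Line numbers count cumulatively across file operands, and '$' is the
# last line of the last file.
dir=$(mktemp -d)
printf 'a1\na2\n' > "$dir/a"
printf 'b1\nb2\n' > "$dir/b"
third=$(sed -n '3p' "$dir/a" "$dir/b")
last=$(sed -n '$p' "$dir/a" "$dir/b")
rm -rf "$dir"
printf '%s\n' "$range" "$single" "$third" "$last"
```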

   Regular Expressions in sed
       The sed utility shall support the BREs described in the Base
       Definitions volume of IEEE Std 1003.1-2001, Section 9.3, Basic Regular
       Expressions, with the following additions:

        * In a context address, the construction "\cBREc", where c is any
          character other than backslash or <newline>, shall be identical to
          "/BRE/". If the character designated by c appears following a
          backslash, then it shall be considered to be that literal character,
          which shall not terminate the BRE. For example, in the context
          address "\xabc\xdefx", the second x stands for itself, so that the
          BRE is "abcxdef".

        * The escape sequence '\n' shall match a <newline> embedded in the
          pattern space. A literal <newline> shall not be used in the BRE of a
          context address or in the substitute function.

        * If an RE is empty (that is, no pattern is specified) sed shall
          behave as if the last RE used in the last command applied (either as
          an address or as part of a substitute command) was specified.
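
       Two of these additions are easy to show in isolation. In the sketch
       below, the comma used as the alternative delimiter, the sample input,
       and the variable names alt and reuse are illustrative choices.

```shell
# "\,BRE," is equivalent to "/BRE/" but avoids escaping the slashes in a path.
alt=$(printf '/usr/bin\n/etc\n' | sed -n '\,^/usr,p')
# The empty RE in s//bar/ reuses the last RE applied, here the address /foo/.
reuse=$(printf 'foo baz\nqux\n' | sed -n '/foo/s//bar/p')
printf '%s\n%s\n' "$alt" "$reuse"
```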

   Editing Commands in sed
       In the following list of editing commands, the maximum number of
       permissible addresses for each function is indicated by [0addr],
       [1addr], or [2addr], representing zero, one, or two addresses.

       The argument text shall consist of one or more lines. Each embedded
       <newline> in the text shall be preceded by a backslash. Other
       backslashes in text shall be removed, and the following character shall
       be treated literally.

       The r and w command verbs, and the w flag to the s command, take an
       optional rfile (or wfile) parameter, separated from the command verb
       letter or flag by one or more <blank>s; implementations may allow zero
       separation as an extension.

       The argument rfile or the argument wfile shall terminate the editing
       command. Each wfile shall be created before processing begins.
       Implementations shall support at least ten wfile arguments in the
       script; the actual number (greater than or equal to 10) that is
       supported by the implementation is unspecified. The use of the wfile
       parameter shall cause that file to be initially created, if it does not
       exist, or shall replace the contents of an existing file.

       The b, r, s, t, w, y, and : command verbs shall accept additional
       arguments. The following synopses indicate which arguments shall be
       separated from the command verbs by a single <space>.

       The a and r commands schedule text for later output. The text specified
       for the a command, and the contents of the file specified for the r
       command, shall be written to standard output just before the next
       attempt to fetch a line of input when executing the N or n commands, or
       when reaching the end of the script. If written when reaching the end
       of the script, and the -n option was not specified, the text shall be
       written after copying the pattern space to standard output. The
       contents of the file specified for the r command shall be as of the
       time the output is written, not the time the r command is applied. The
       text shall be output in the order in which the a and r commands were
       applied to the input.
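
       The scheduling of a and r output can be sketched as follows. The
       temporary directory from mktemp -d (not required by this volume), the
       file name extra, and the variable name out are assumptions for the
       example; the directory path is assumed to contain no <blank>s.

```shell
dir=$(mktemp -d)
printf 'from rfile\n' > "$dir/extra"
# Both commands apply to line 1. Their output is queued and written after
# the pattern space, in the order the commands were applied: r first, then a.
out=$(printf 'one\ntwo\n' | sed -e "1r $dir/extra" -e '1a\
appended')
rm -rf "$dir"
printf '%s\n' "$out"
```

       The r command is given its own -e fragment because the rfile argument
       terminates the editing command.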

       Command verbs other than {, a, b, c, i, r, t, w, :, and # can be
       followed by a semicolon, optional <blank>s, and another command verb.
       However, when the s command verb is used with the w flag, following it
       with another command in this manner produces undefined results.
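
       A sketch of the distinction: s commands may be chained with
       semicolons, while the :, b, and t verbs are each given their own -e
       fragment so that the label ends at the end of the fragment. The label
       again and the variable names semi and joined are illustrative.

```shell
# Semicolons are portable between s commands (no w flag involved).
semi=$(printf 'abc\n' | sed 's/a/A/; s/b/B/; s/c/C/')
# Join all input lines with spaces. Each label-bearing command sits in its
# own -e fragment rather than being followed by a semicolon.
joined=$(printf 'a\nb\nc\n' |
    sed -e ':again' -e '$!N' -e 's/\n/ /' -e 't again')
printf '%s\n%s\n' "$semi" "$joined"
```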

       A function can be preceded by one or more '!' characters, in which case
       the function shall be applied if the addresses do not select the
       pattern space. Zero or more <blank>s shall be accepted before the first
       '!' character. It is unspecified whether <blank>s can follow a '!'
       character, and conforming applications shall not follow a '!' character
       with <blank>s.
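
       For instance, negation can be used to act on every line that an
       address does not select. The inputs and the variable names kept and
       trimmed below are assumptions for illustration; note that no <blank>
       follows the '!'.

```shell
# Write every line that does not match /b/.
kept=$(printf 'a\nb\nc\n' | sed -n '/b/!p')
# Delete every line except the last.
trimmed=$(printf 'a\nb\nc\n' | sed '$!d')
printf '%s\n---\n%s\n' "$kept" "$trimmed"
```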

       [2addr] {function
       function
       ...
       }
              Execute a list of sed functions only when the pattern space is
              selected. The list of sed functions shall be surrounded by
              braces and separated by <newline>s, and conform to the following
              rules. The braces can be preceded or followed by <blank>s. The
              functions can be preceded by <blank>s, but shall not be followed
              by <blank>s. The <right-brace> shall be preceded by a <newline>
              and can be preceded or followed by <blank>s.

       [1addr]a\
       text   Write text to standard output as described previously.

       [2addr]b [label]
              Branch to the : function bearing the label. If label is not
              specified, branch to the end of the script. The implementation
              shall support labels recognized as unique up to at least 8
              characters; the actual length (greater than or equal to 8) that
              shall be supported by the implementation is unspecified. It is
              unspecified whether exceeding a label length causes an error or
              a silent truncation.

       [2addr]c\
       text   Delete the pattern space. With a 0 or 1 address or at the end of
              a 2-address range, place text on the output and start the next
              cycle.

       [2addr]d
              Delete the pattern space and start the next cycle.

       [2addr]D
              Delete the initial segment of the pattern space through the
              first <newline> and start the next cycle.

       [2addr]g
              Replace the contents of the pattern space by the contents of the
              hold space.

       [2addr]G
              Append to the pattern space a <newline> followed by the contents
              of the hold space.

       [2addr]h
              Replace the contents of the hold space with the contents of the
              pattern space.

       [2addr]H
              Append to the hold space a <newline> followed by the contents of
              the pattern space.

       [1addr]i\
       text   Write text to standard output.

       [2addr]l
              (The letter ell.) Write the pattern space to standard output in
              a visually unambiguous form. The characters listed in the Base
              Definitions volume of IEEE Std 1003.1-2001, Table 5-1, Escape
              Sequences and Associated Actions ( '\\', '\a', '\b', '\f', '\r',
              '\t', '\v' ) shall be written as the corresponding escape
              sequence; the '\n' in that table is not applicable.
              Non-printable characters not in that table shall be written as
              one three-digit octal number (with a preceding backslash) for
              each byte in the character (most significant byte first). If the
              size of a byte on the system is greater than 9 bits, the format
              used for non-printable characters is implementation-defined.

              Long lines shall be folded, with the point of folding indicated
              by writing a backslash followed by a <newline>; the length at
              which folding occurs is unspecified, but should be appropriate
              for the output device. The end of each line shall be marked with
              a '$'.

       [2addr]n
              Write the pattern space to standard output if the default output
              has not been suppressed, and replace the pattern space with the
              next line of input, less its terminating <newline>.

              If no next line of input is available, the n command verb shall
              branch to the end of the script and quit without starting a new
              cycle.

       [2addr]N
              Append the next line of input, less its terminating <newline>,
              to the pattern space, using an embedded <newline> to separate
              the appended material from the original material. Note that the
              current line number changes.

              If no next line of input is available, the N command verb shall
              branch to the end of the script and quit without starting a new
              cycle or copying the pattern space to standard output.

       [2addr]p
              Write the pattern space to standard output.

       [2addr]P
              Write the pattern space, up to the first <newline>, to standard
              output.

       [1addr]q
              Branch to the end of the script and quit without starting a new
              cycle.

       [1addr]r rfile
              Copy the contents of rfile to standard output as described
              previously. If rfile does not exist or cannot be read, it shall
              be treated as if it were an empty file, causing no error
              condition.

       [2addr]s/BRE/replacement/flags
              Substitute the replacement string for instances of the BRE in
              the pattern space. Any character other than backslash or
              <newline> can be used instead of a slash to delimit the BRE and
              the replacement. Within the BRE and the replacement, the BRE
              delimiter itself can be used as a literal character if it is
              preceded by a backslash.

              The replacement string shall be scanned from beginning to end.
              An ampersand ( '&' ) appearing in the replacement shall be
              replaced by the string matching the BRE. The special meaning of
              '&' in this context can be suppressed by preceding it by a
              backslash. The characters "\n", where n is a digit, shall be
              replaced by the text matched by the corresponding backreference
              expression. The special meaning of "\n" where n is a digit in
              this context, can be suppressed by preceding it by a backslash.
              For each other backslash ( '\' ) encountered, the following
              character shall lose its special meaning (if any). The meaning
              of a '\' immediately followed by any character other than '&',
              '\', a digit, or the delimiter character used for this command,
              is unspecified.

              A line can be split by substituting a <newline> into it. The
              application shall escape the <newline> in the replacement by
              preceding it by a backslash. A substitution shall be considered
              to have been performed even if the replacement string is
              identical to the string that it replaces. Any backslash used to
              alter the default meaning of a subsequent character shall be
              discarded from the BRE or the replacement before evaluating the
              BRE or using the replacement.

              The value of flags shall be zero or more of:

              n      Substitute for the nth occurrence only of the BRE found
                     within the pattern space.

              g      Globally substitute for all non-overlapping instances of
                     the BRE rather than just the first one. If both g and n
                     are specified, the results are unspecified.

              p      Write the pattern space to standard output if a
                     replacement was made.

              w wfile
                     Write. Append the pattern space to wfile if a replacement
                     was made. A conforming application shall precede the
                     wfile argument with one or more <blank>s. If the w flag
                     is not the last flag value given in a concatenation of
                     multiple flag values, the results are undefined.

       [2addr]t [label]
              Test. Branch to the : command verb bearing the label if any
              substitutions have been made since the most recent reading of an
              input line or execution of a t. If label is not specified,
              branch to the end of the script.

       [2addr]w wfile
              Append (write) the pattern space to wfile.

       [2addr]x
              Exchange the contents of the pattern and hold spaces.

       [2addr]y/string1/string2/
              Replace all occurrences of characters in string1 with the
              corresponding characters in string2. If a backslash followed by
              an 'n' appear in string1 or string2, the two characters shall be
              handled as a single <newline>. If the number of characters in
              string1 and string2 are not equal, or if any of the characters
              in string1 appear more than once, the results are undefined. Any
              character other than backslash or <newline> can be used instead
              of slash to delimit the strings. If the delimiter is not n,
              within string1 and string2, the delimiter itself can be used as
              a literal character if it is preceded by a backslash. If a
              backslash character is immediately followed by a backslash
              character in string1 or string2, the two backslash characters
              shall be counted as a single literal backslash character. The
              meaning of a backslash followed by any character that is not
              'n', a backslash, or the delimiter character is undefined.

       [0addr]:label
              Do nothing. This command bears a label to which the b and t
              commands branch.

       [1addr]=
              Write the following to standard output:

                     "%d\n", <current line number>

       [0addr]
              Ignore this empty command.

       [0addr]#
              Ignore the '#' and the remainder of the line (treat them as a
              comment), with the single exception that if the first two
              characters in the script are "#n", the default output shall be
              suppressed; this shall be the equivalent of specifying -n on the
              command line.
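
       The replacement rules of the s command and the character mapping of
       the y command are summarized in the sketch below. The sample string
       and every variable name are assumptions made for the example.

```shell
line='hello world'
# \1 and \2 refer to the first and second \( \) subexpressions.
swap=$(printf '%s\n' "$line" | sed 's/\(hello\) \(world\)/\2 \1/')
# & stands for the whole matched string.
wrap=$(printf '%s\n' "$line" | sed 's/[a-z][a-z]*/<&>/')
# A numeric flag picks one occurrence; g replaces all of them.
second=$(printf '%s\n' "$line" | sed 's/o/0/2')
every=$(printf '%s\n' "$line" | sed 's/o/0/g')
# Any delimiter other than backslash or <newline> may replace the slash.
path=$(printf '/usr/local\n' | sed 's,/usr,/opt,')
# y maps characters one for one; string1 and string2 have equal length.
upper=$(printf '%s\n' "$line" | sed 'y/helo/HELO/')
printf '%s\n' "$swap" "$wrap" "$second" "$every" "$path" "$upper"
```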

EXIT STATUS
       The following exit values shall be returned:

        0     Successful completion.

       >0     An error occurred.
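
       A calling script can only rely on the distinction between zero and
       non-zero. The sketch below assumes that the implementation rejects k as
       an unknown command verb, which is typical but is an assumption here;
       the variable names ok and bad are likewise illustrative.

```shell
# A valid script on empty input should succeed.
ok=0
sed -n 'p' </dev/null >/dev/null 2>&1 || ok=$?
# An unrecognized command verb is expected to yield a non-zero status.
bad=0
sed 'k' </dev/null >/dev/null 2>&1 || bad=$?
printf 'valid script: %s\ninvalid script: %s\n' "$ok" "$bad"
```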

CONSEQUENCES OF ERRORS
       Default.

       The following sections are informative.

APPLICATION USAGE
       Regular expressions match entire strings, not just individual lines,
       but a <newline> is matched by '\n' in a sed RE; a <newline> is not
       allowed by the general definition of regular expression in
       IEEE Std 1003.1-2001. Also note that '\n' cannot be used to match a
       <newline> at the end of an arbitrary input line; <newline>s appear in
       the pattern space as a result of the N editing command.
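
       The point about '\n' can be checked with a two-line input. The
       replacement character + and the variable names alone and paired are
       arbitrary choices for the illustration.

```shell
# Each line enters the pattern space without its <newline>, so \n never
# matches and the input is written unchanged.
alone=$(printf 'a\nb\n' | sed 's/\n/+/')
# After N the pattern space holds "a<newline>b", and \n matches.
paired=$(printf 'a\nb\n' | sed 'N;s/\n/+/')
printf '%s\n---\n%s\n' "$alone" "$paired"
```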

EXAMPLES
       This sed script simulates the BSD cat -s command, squeezing excess
       blank lines from standard input.

           sed -n '
           # Write non-empty lines.
           /./ {
               p
               d
               }
           # Write a single empty line, then look for more empty lines.
           /^$/    p
           # Get next line, discard the held <newline> (empty line),
           # and look for more empty lines.
           :Empty
           /^$/    {
               N
               s/.//
               b Empty
               }
           # Write the non-empty line before going back to search
           # for the first in a set of empty lines.
               p
           '

RATIONALE
       This volume of IEEE Std 1003.1-2001 requires implementations to support
       at least ten distinct wfiles, matching historical practice on many
       implementations. Implementations are encouraged to support more, but
       conforming applications should not exceed this limit.

       The exit status codes specified here are different from those in System
       V. System V returns 2 for garbled sed commands, but returns zero with
       its usage message or if the input file could not be opened. The
       standard developers considered this to be a bug.

       The manner in which the l command writes non-printable characters was
       changed to avoid the historical backspace-overstrike method, and other
       requirements to achieve unambiguous output were added. See the
       RATIONALE for ed for details of the format chosen, which is the same as
       that chosen for sed.

       This volume of IEEE Std 1003.1-2001 requires implementations to provide
       pattern and hold spaces of at least 8192 bytes, larger than the
       4000-byte spaces used by some historical implementations, but less than
       the 20480-byte limit used in an early proposal. Implementations are
       encouraged to allocate dynamically larger pattern and hold spaces as
       needed.

       The requirements for acceptance of <blank>s and <space>s in command
       lines have been made more explicit than in early proposals to describe
       clearly the historical practice and to remove confusion about the
       phrase "protect initial blanks [sic] and tabs from the stripping that
       is done on every script line" that appears in much of the historical
       documentation of the sed utility description of text. (Not all
       implementations are known to have stripped <blank>s from text lines,
       although they all have allowed leading <blank>s preceding the address
       on a command line.)

       The treatment of '#' comments differs from the SVID, which only allows
       a comment as the first line of the script, but matches BSD-derived
       implementations. The comment character is treated as a command, and it
       has the same properties in terms of being accepted with leading
       <blank>s; the BSD implementation has historically supported this.

       Early proposals required that a script_file have at least one
       non-comment line. Some historical implementations have behaved in
       unexpected ways if this were not the case. The standard developers
       considered that this was incorrect behavior and that application
       developers should not have to avoid this feature. A correct
       implementation of this volume of IEEE Std 1003.1-2001 shall permit
       script_files that consist only of comment lines.

       Early proposals indicated that if -e and -f options were intermixed,
       all -e options were processed before any -f options. This has been
       changed to process them in the order presented because it matches
       historical practice and is more intuitive.

       The treatment of the p flag to the s command differs between System V
       and BSD-based systems when the default output is suppressed. In the two
       examples:

           echo a | sed 's/a/A/p'
           echo a | sed -n 's/a/A/p'

       this volume of IEEE Std 1003.1-2001, BSD, System V documentation, and
       the SVID indicate that the first example should write two lines with A,
       whereas the second should write one. Some System V systems write the A
       only once in both examples because the p flag is ignored if the -n
       option is not specified.

       This is a case of a diametrical difference between systems that could
       not be reconciled through the compromise of declaring the behavior to
       be unspecified. The SVID/BSD/System V documentation behavior was
       adopted for this volume of IEEE Std 1003.1-2001 because:

        * No known documentation for any historic system describes the
          interaction between the p flag and the -n option.

        * The selected behavior is more correct as there is no technical
          justification for any interaction between the p flag and the -n
          option. A relationship between -n and the p flag might imply that
          they are only used together, but this ignores valid scripts that
          interrupt the cyclical nature of the processing through the use of
          the D, d, q, or branching commands. Such scripts rely on the p
          suffix to write the pattern space because they do not make use of
          the default output at the "bottom" of the script.

        * Because the -n option makes the p flag unnecessary, any interaction
          would only be useful if sed scripts were written to run both with
          and without the -n option. This is believed to be unlikely. It is
          even more unlikely that programmers have coded the p flag expecting
          it to be unnecessary. Because the interaction was not documented,
          the likelihood of a programmer discovering the interaction and
          depending on it is further decreased.

        * Finally, scripts that break under the specified behavior produce too
          much output instead of too little, which is easier to diagnose and
          correct.

       The form of the substitute command that uses the n suffix was limited
       to the first 512 matches in an early proposal. This limit has been
       removed because there is no reason an editor processing lines of
       {LINE_MAX} length should have this restriction. The command s/a/A/2047
       should be able to substitute the 2047th occurrence of a on a line.

       The b, t, and : commands are documented to ignore leading white space,
       but no mention is made of trailing white space. Historical
       implementations of sed assigned different locations to the labels 'x'
       and "x ". This is not useful, and leads to subtle programming errors,
       but it is historical practice, and changing it could theoretically
       break working scripts. Implementors are encouraged to provide warning
       messages about labels that are never used or jumps to labels that do
       not exist.

       Historically, the sed ! and } editing commands did not permit multiple
       commands on a single line using a semicolon as a command delimiter.
       Implementations are permitted, but not required, to support this
       extension.

FUTURE DIRECTIONS
       None.

SEE ALSO
       awk, ed, grep

COPYRIGHT
       Portions of this text are reprinted and reproduced in electronic form
       from IEEE Std 1003.1, 2003 Edition, Standard for Information Technology
       -- Portable Operating System Interface (POSIX), The Open Group Base
       Specifications Issue 6, Copyright (C) 2001-2003 by the Institute of
       Electrical and Electronics Engineers, Inc and The Open Group. In the
       event of any discrepancy between this version and the original IEEE and
       The Open Group Standard, the original IEEE and The Open Group Standard
       is the referee document. The original Standard can be obtained online
       at http://www.opengroup.org/unix/online.html .

IEEE/The Open Group                  2003                              SED(1P)