SED(1P)                    POSIX Programmer's Manual                   SED(1P)

PROLOG
       This manual page is part of the POSIX Programmer's Manual. The Linux
       implementation of this interface may differ (consult the corresponding
       Linux manual page for details of Linux behavior), or the interface may
       not be implemented on Linux.

NAME
       sed - stream editor

SYNOPSIS
       sed [-n] script [file...]

       sed [-n] -e script [-e script]... [-f script_file]... [file...]

       sed [-n] [-e script]... -f script_file [-f script_file]... [file...]

DESCRIPTION
       The sed utility is a stream editor that shall read one or more text
       files, make editing changes according to a script of editing commands,
       and write the results to standard output. The script shall be obtained
       from either the script operand string or a combination of the
       option-arguments from the -e script and -f script_file options.

OPTIONS
       The sed utility shall conform to the Base Definitions volume of
       POSIX.1-2008, Section 12.2, Utility Syntax Guidelines, except that the
       order of presentation of the -e and -f options is significant.

       The following options shall be supported:

       -e script Add the editing commands specified by the script
                 option-argument to the end of the script of editing commands.

       -f script_file
                 Add the editing commands in the file script_file to the end
                 of the script of editing commands.

       -n        Suppress the default output (in which each line, after it is
                 examined for editing, is written to standard output). Only
                 lines explicitly selected for output are written.

       If any -e or -f options are specified, the script of editing commands
       shall initially be empty. The commands specified by each -e or -f
       option shall be added to the script in the order specified. When each
       addition is made, if the previous addition (if any) was from a -e
       option, a <newline> shall be inserted before the new addition. The
       resulting script shall have the same properties as the script operand,
       described in the OPERANDS section.

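       As a minimal sketch of how the script is assembled from options (the
       sample input and the variable names joined and single are illustrative
       only), the two invocations below should behave identically, because a
       <newline> is inserted between consecutive -e additions:

```shell
# Two -e fragments are joined with a <newline> between them ...
joined=$(printf 'alpha\nbeta\n' | sed -e 's/alpha/A/' -e 's/beta/B/')

# ... so they act like one script operand holding both commands,
# one per line.
single=$(printf 'alpha\nbeta\n' | sed 's/alpha/A/
s/beta/B/')

printf '%s\n' "$joined"
```

       Both variables are expected to hold the two lines A and B.
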
OPERANDS
       The following operands shall be supported:

       file      A pathname of a file whose contents are read and edited. If
                 multiple file operands are specified, the named files shall
                 be read in the order specified and the concatenation shall be
                 edited. If no file operands are specified, the standard input
                 shall be used.

       script    A string to be used as the script of editing commands. The
                 application shall not present a script that violates the
                 restrictions of a text file except that the final character
                 need not be a <newline>.

STDIN
       The standard input shall be used if no file operands are specified, and
       shall be used if a file operand is '-' and the implementation treats
       the '-' as meaning standard input. Otherwise, the standard input shall
       not be used. See the INPUT FILES section.

INPUT FILES
       The input files shall be text files. The script_files named by the -f
       option shall consist of editing commands.

ENVIRONMENT VARIABLES
       The following environment variables shall affect the execution of sed:

       LANG      Provide a default value for the internationalization
                 variables that are unset or null. (See the Base Definitions
                 volume of POSIX.1-2008, Section 8.2, Internationalization
                 Variables for the precedence of internationalization
                 variables used to determine the values of locale categories.)

       LC_ALL    If set to a non-empty string value, override the values of
                 all the other internationalization variables.

       LC_COLLATE
                 Determine the locale for the behavior of ranges, equivalence
                 classes, and multi-character collating elements within
                 regular expressions.

       LC_CTYPE  Determine the locale for the interpretation of sequences of
                 bytes of text data as characters (for example, single-byte as
                 opposed to multi-byte characters in arguments and input
                 files), and the behavior of character classes within regular
                 expressions.

       LC_MESSAGES
                 Determine the locale that should be used to affect the format
                 and contents of diagnostic messages written to standard
                 error.

       NLSPATH   Determine the location of message catalogs for the processing
                 of LC_MESSAGES.

ASYNCHRONOUS EVENTS
       Default.

STDOUT
       The input files shall be written to standard output, with the editing
       commands specified in the script applied. If the -n option is
       specified, only those input lines selected by the script shall be
       written to standard output.

STDERR
       The standard error shall be used only for diagnostic messages.

OUTPUT FILES
       The output files shall be text files whose formats are dependent on the
       editing commands given.

EXTENDED DESCRIPTION
       The script shall consist of editing commands of the following form:

           [address[,address]]function

       where function represents a single-character command verb from the list
       in Editing Commands in sed, followed by any applicable arguments.

       The command can be preceded by <blank> characters and/or <semicolon>
       characters. The function can be preceded by <blank> characters. These
       optional characters shall have no effect.

       In default operation, sed cyclically shall append a line of input, less
       its terminating <newline> character, into the pattern space. Reading
       from input shall be skipped if a <newline> was in the pattern space
       prior to a D command ending the previous cycle. The sed utility shall
       then apply in sequence all commands whose addresses select that pattern
       space, until a command starts the next cycle or quits. If no commands
       explicitly started a new cycle, then at the end of the script the
       pattern space shall be copied to standard output (except when -n is
       specified) and the pattern space shall be deleted. Whenever the pattern
       space is written to standard output or a named file, sed shall
       immediately follow it with a <newline>.

       Some of the editing commands use a hold space to save all or part of
       the pattern space for subsequent retrieval. The pattern and hold spaces
       shall each be able to hold at least 8192 bytes.

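       The interplay of the cycle, the pattern space, and the hold space can
       be sketched with a small script that writes its input in reverse line
       order. The sample input and the variable name reversed are
       illustrative; the script uses only the G, h, and p commands described
       under Editing Commands in sed:

```shell
# Print the input lines in reverse order.
#   1!G  on every line but the first, append the hold space (the lines
#        seen so far, newest first) to the pattern space
#   h    copy the pattern space into the hold space for the next cycle
#   $p   on the last line, write the accumulated pattern space
reversed=$(printf 'one\ntwo\nthree\n' | sed -n '1!G;h;$p')

printf '%s\n' "$reversed"
```

       Because the whole input accumulates in both spaces, a portable script
       of this kind should only be relied on for inputs that fit within the
       8192-byte minimum stated above.
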
   Addresses in sed
       An address is either a decimal number that counts input lines
       cumulatively across files, a '$' character that addresses the last line
       of input, or a context address (which consists of a BRE, as described
       in Regular Expressions in sed, preceded and followed by a delimiter,
       usually a <slash>).

       An editing command with no addresses shall select every pattern space.

       An editing command with one address shall select each pattern space
       that matches the address.

       An editing command with two addresses shall select the inclusive range
       from the first pattern space that matches the first address through the
       next pattern space that matches the second. (If the second address is a
       number less than or equal to the line number first selected, only one
       line shall be selected.) Starting at the first line following the
       selected range, sed shall look again for the first address. Thereafter,
       the process shall be repeated. Omitting either or both of the address
       components in the following form produces undefined results:

           [address[,address]]

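       The address forms above can be sketched as follows. The five-line
       sample input and all variable names are illustrative only:

```shell
# Five input lines: a, b, c, d, e.
input=$(printf 'a\nb\nc\nd\ne\n')

# Line-number range: lines 2 through 3.
by_number=$(printf '%s\n' "$input" | sed -n '2,3p')

# Context-address range: from the first line matching /b/ through the
# next line matching /d/, inclusive.
by_context=$(printf '%s\n' "$input" | sed -n '/b/,/d/p')

# The second address is a number not greater than the first selected
# line (/c/ matches on line 3, and 2 <= 3), so only one line is selected.
one_line=$(printf '%s\n' "$input" | sed -n '/c/,2p')

# '$' addresses the last line of input.
last=$(printf '%s\n' "$input" | sed -n '$p')

printf '%s\n' "$by_number" "$by_context" "$one_line" "$last"
```
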
   Regular Expressions in sed
       The sed utility shall support the BREs described in the Base
       Definitions volume of POSIX.1-2008, Section 9.3, Basic Regular
       Expressions, with the following additions:

        *  In a context address, the construction "\cBREc", where c is any
           character other than <backslash> or <newline>, shall be identical
           to "/BRE/". If the character designated by c appears following a
           <backslash>, then it shall be considered to be that literal
           character, which shall not terminate the BRE. For example, in the
           context address "\xabc\xdefx", the second x stands for itself, so
           that the BRE is "abcxdef".

        *  The escape sequence '\n' shall match a <newline> embedded in the
           pattern space. A literal <newline> shall not be used in the BRE of
           a context address or in the substitute function.

        *  If an RE is empty (that is, no pattern is specified) sed shall
           behave as if the last RE used in the last command applied (either
           as an address or as part of a substitute command) was specified.

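       Each of the three additions listed above can be exercised with a
       one-line script. This is a sketch only; the sample inputs and the
       variable names custom, joined, and reused are made up for
       illustration:

```shell
# \cBREc: use ',' as the delimiter so that the slashes in a pathname
# need no escaping.
custom=$(printf '/usr/bin/sed\n/opt/tool\n' | sed -n '\,^/usr/,p')

# \n matches a <newline> that the N command embedded in the pattern space.
joined=$(printf 'key\nvalue\n' | sed 'N;s/\n/=/')

# An empty RE reuses the last RE applied, here the one from the address.
reused=$(printf 'foo baz\nqux\n' | sed '/foo/s//bar/')

printf '%s\n' "$custom" "$joined" "$reused"
```
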
   Editing Commands in sed
       In the following list of editing commands, the maximum number of
       permissible addresses for each function is indicated by [0addr],
       [1addr], or [2addr], representing zero, one, or two addresses.

       The argument text shall consist of one or more lines. Each embedded
       <newline> in the text shall be preceded by a <backslash>. Other
       <backslash> characters in text shall be removed, and the following
       character shall be treated literally.

       The r and w command verbs, and the w flag to the s command, take an
       rfile (or wfile) parameter, separated from the command verb letter or
       flag by one or more <blank> characters; implementations may allow zero
       separation as an extension.

       The argument rfile or the argument wfile shall terminate the editing
       command. Each wfile shall be created before processing begins.
       Implementations shall support at least ten wfile arguments in the
       script; the actual number (greater than or equal to 10) that is
       supported by the implementation is unspecified. The use of the wfile
       parameter shall cause that file to be initially created, if it does not
       exist, or shall replace the contents of an existing file.

       The b, r, s, t, w, y, and : command verbs shall accept additional
       arguments. The following synopses indicate which arguments shall be
       separated from the command verbs by a single <space>.

       The a and r commands schedule text for later output. The text specified
       for the a command, and the contents of the file specified for the r
       command, shall be written to standard output just before the next
       attempt to fetch a line of input when executing the N or n commands, or
       when reaching the end of the script. If written when reaching the end
       of the script, and the -n option was not specified, the text shall be
       written after copying the pattern space to standard output. The
       contents of the file specified for the r command shall be as of the
       time the output is written, not the time the r command is applied. The
       text shall be output in the order in which the a and r commands were
       applied to the input.

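       The text argument rules and the scheduling of a output can be sketched
       as follows. The sample input and the variable names appended and quiet
       are illustrative only. Each <newline> embedded in the text is preceded
       by a <backslash>, and the scheduled text is written after the pattern
       space:

```shell
# Append two lines of text after line 1. The first text line ends with
# a <backslash> because another text line follows it.
appended=$(printf 'one\ntwo\n' | sed '1a\
appended A\
appended B')

# With -n the pattern space is not copied to standard output, but the
# text scheduled by the a command is still written.
quiet=$(printf 'one\ntwo\n' | sed -n '1a\
appended')

printf '%s\n' "$appended" "$quiet"
```
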
       Command verbs other than {, a, b, c, i, r, t, w, :, and # can be
       followed by a <semicolon>, optional <blank> characters, and another
       command verb. However, when the s command verb is used with the w flag,
       following it with another command in this manner produces undefined
       results.

       A function can be preceded by one or more '!' characters, in which
       case the function shall be applied if the addresses do not select the
       pattern space. Zero or more <blank> characters shall be accepted before
       the first '!' character. It is unspecified whether <blank> characters
       can follow a '!' character, and conforming applications shall not
       follow a '!' character with <blank> characters.

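       A short sketch of address negation with '!' follows; the sample input
       and the variable names are illustrative. Note that no <blank> follows
       the '!' in either script, as required of conforming applications:

```shell
# Print every line except line 2.
not_second=$(printf 'a\nb\nc\n' | sed -n '2!p')

# Apply the substitution only to lines that do NOT start with '#'.
edited=$(printf '# keep x\nset x\n' | sed '/^#/!s/x/y/')

printf '%s\n' "$not_second" "$edited"
```
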
       [2addr] {editing command

       editing command

       ...

       }         Execute a list of sed editing commands only when the pattern
                 space is selected. The list of sed editing commands shall be
                 surrounded by braces and separated by <newline> characters,
                 and conform to the following rules. The braces can be
                 preceded or followed by <blank> characters. The editing
                 commands can be preceded by <blank> characters, but shall not
                 be followed by <blank> characters. The <right-brace> shall be
                 preceded by a <newline> and can be preceded or followed by
                 <blank> characters.

       [1addr]a\

       text      Write text to standard output as described previously.

       [2addr]b [label]
                 Branch to the : function bearing the label. If label is not
                 specified, branch to the end of the script. The
                 implementation shall support labels recognized as unique up
                 to at least 8 characters; the actual length (greater than or
                 equal to 8) that shall be supported by the implementation is
                 unspecified. It is unspecified whether exceeding a label
                 length causes an error or a silent truncation.

       [2addr]c\

       text      Delete the pattern space. With a 0 or 1 address or at the end
                 of a 2-address range, place text on the output and start the
                 next cycle.

       [2addr]d  Delete the pattern space and start the next cycle.

       [2addr]D  If the pattern space contains no <newline>, delete the
                 pattern space and start a normal new cycle as if the d
                 command was issued. Otherwise, delete the initial segment of
                 the pattern space through the first <newline>, and start the
                 next cycle with the resultant pattern space and without
                 reading any new input.

       [2addr]g  Replace the contents of the pattern space by the contents of
                 the hold space.

       [2addr]G  Append to the pattern space a <newline> followed by the
                 contents of the hold space.

       [2addr]h  Replace the contents of the hold space with the contents of
                 the pattern space.

       [2addr]H  Append to the hold space a <newline> followed by the contents
                 of the pattern space.

       [1addr]i\

       text      Write text to standard output.

       [2addr]l  (The letter ell.) Write the pattern space to standard output
                 in a visually unambiguous form. The characters listed in the
                 Base Definitions volume of POSIX.1-2008, Table 5-1, Escape
                 Sequences and Associated Actions ('\\', '\a', '\b', '\f',
                 '\r', '\t', '\v') shall be written as the corresponding
                 escape sequence; the '\n' in that table is not applicable.
                 Non-printable characters not in that table shall be written
                 as one three-digit octal number (with a preceding
                 <backslash>) for each byte in the character (most significant
                 byte first).

                 Long lines shall be folded, with the point of folding
                 indicated by writing a <backslash> followed by a <newline>;
                 the length at which folding occurs is unspecified, but should
                 be appropriate for the output device. The end of each line
                 shall be marked with a '$'.

       [2addr]n  Write the pattern space to standard output if the default
                 output has not been suppressed, and replace the pattern space
                 with the next line of input, less its terminating <newline>.

                 If no next line of input is available, the n command verb
                 shall branch to the end of the script and quit without
                 starting a new cycle.

       [2addr]N  Append the next line of input, less its terminating
                 <newline>, to the pattern space, using an embedded <newline>
                 to separate the appended material from the original material.
                 Note that the current line number changes.

                 If no next line of input is available, the N command verb
                 shall branch to the end of the script and quit without
                 starting a new cycle or copying the pattern space to standard
                 output.

       [2addr]p  Write the pattern space to standard output.

       [2addr]P  Write the pattern space, up to the first <newline>, to
                 standard output.

       [1addr]q  Branch to the end of the script and quit without starting a
                 new cycle.

       [1addr]r rfile
                 Copy the contents of rfile to standard output as described
                 previously. If rfile does not exist or cannot be read, it
                 shall be treated as if it were an empty file, causing no
                 error condition.

       [2addr]s/BRE/replacement/flags
                 Substitute the replacement string for instances of the BRE in
                 the pattern space. Any character other than <backslash> or
                 <newline> can be used instead of a <slash> to delimit the BRE
                 and the replacement. Within the BRE and the replacement, the
                 BRE delimiter itself can be used as a literal character if it
                 is preceded by a <backslash>.

                 The replacement string shall be scanned from beginning to
                 end. An <ampersand> ('&') appearing in the replacement shall
                 be replaced by the string matching the BRE. The special
                 meaning of '&' in this context can be suppressed by preceding
                 it by a <backslash>. The characters "\n", where n is a digit,
                 shall be replaced by the text matched by the corresponding
                 back-reference expression. If the corresponding
                 back-reference expression does not match, then the characters
                 "\n" shall be replaced by the empty string. The special
                 meaning of "\n" where n is a digit in this context, can be
                 suppressed by preceding it by a <backslash>. For each other
                 <backslash> encountered, the following character shall lose
                 its special meaning (if any). The meaning of a <backslash>
                 immediately followed by any character other than '&',
                 <backslash>, a digit, or the delimiter character used for
                 this command, is unspecified.

                 A line can be split by substituting a <newline> into it. The
                 application shall escape the <newline> in the replacement by
                 preceding it by a <backslash>. A substitution shall be
                 considered to have been performed even if the replacement
                 string is identical to the string that it replaces. Any
                 <backslash> used to alter the default meaning of a subsequent
                 character shall be discarded from the BRE or the replacement
                 before evaluating the BRE or using the replacement.

                 The value of flags shall be zero or more of:

                 n       Substitute for the nth occurrence only of the BRE
                         found within the pattern space.

                 g       Globally substitute for all non-overlapping
                         instances of the BRE rather than just the first
                         one. If both g and n are specified, the results are
                         unspecified.

                 p       Write the pattern space to standard output if a
                         replacement was made.

                 w wfile Write. Append the pattern space to wfile if a
                         replacement was made. A conforming application
                         shall precede the wfile argument with one or more
                         <blank> characters. If the w flag is not the last
                         flag value given in a concatenation of multiple
                         flag values, the results are undefined.

       [2addr]t [label]
                 Test. Branch to the : command verb bearing the label if any
                 substitutions have been made since the most recent reading of
                 an input line or execution of a t. If label is not
                 specified, branch to the end of the script.

       [2addr]w wfile
                 Append (write) the pattern space to wfile.

       [2addr]x  Exchange the contents of the pattern and hold spaces.

       [2addr]y/string1/string2/
                 Replace all occurrences of characters in string1 with the
                 corresponding characters in string2. If a <backslash>
                 followed by an 'n' appear in string1 or string2, the two
                 characters shall be handled as a single <newline>. If the
                 number of characters in string1 and string2 are not equal, or
                 if any of the characters in string1 appear more than once,
                 the results are undefined. Any character other than
                 <backslash> or <newline> can be used instead of <slash> to
                 delimit the strings. If the delimiter is not 'n', within
                 string1 and string2, the delimiter itself can be used as a
                 literal character if it is preceded by a <backslash>. If a
                 <backslash> character is immediately followed by a
                 <backslash> character in string1 or string2, the two
                 <backslash> characters shall be counted as a single literal
                 <backslash> character. The meaning of a <backslash> followed
                 by any character that is not 'n', a <backslash>, or the
                 delimiter character is undefined.

       [0addr]:label
                 Do nothing. This command bears a label to which the b and t
                 commands branch.

       [1addr]=  Write the following to standard output:

                     "%d\n", <current line number>

       [0addr]   Ignore this empty command.

       [0addr]#  Ignore the '#' and the remainder of the line (treat them as a
                 comment), with the single exception that if the first two
                 characters in the script are "#n", the default output shall
                 be suppressed; this shall be the equivalent of specifying -n
                 on the command line.

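       Several of the commands listed above can be sketched together. The
       sample inputs, the label name again, and the shell variable names are
       all illustrative. Each script keeps labels and braces on their own
       lines, since those verbs are not among the ones that may be followed
       by a <semicolon>:

```shell
# { ... }: run a list of commands only for selected lines. The commands
# are separated by <newline> characters and the closing brace follows
# a <newline>.
grouped=$(printf 'x=1\ny=2\n' | sed -n '/^y=/{
s/=/ is /
p
}')

# : and t: repeat a substitution until it no longer applies, which
# removes nested parentheses from the inside out.
flattened=$(printf 'a(b(c)d)e\n' | sed ':again
s/([^()]*)//
t again')

# N, P, D: keep a two-line window to drop adjacent duplicate lines.
# $!N avoids calling N when no next line of input is available.
deduped=$(printf 'a\na\nb\n' | sed '$!N
/^\(.*\)\n\1$/!P
D')

# y: transliterate characters one for one.
upper=$(printf 'abc\n' | sed 'y/abc/ABC/')

# =: write the current line number; on the last line this counts lines.
count=$(printf 'a\nb\nc\n' | sed -n '$=')

printf '%s\n' "$grouped" "$flattened" "$deduped" "$upper" "$count"
```
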
EXIT STATUS
       The following exit values shall be returned:

        0    Successful completion.

       >0    An error occurred.

CONSEQUENCES OF ERRORS
       Default.

       The following sections are informative.

APPLICATION USAGE
       Regular expressions match entire strings, not just individual lines,
       but a <newline> is matched by '\n' in a sed RE; a <newline> is not
       allowed by the general definition of regular expression in
       POSIX.1-2008. Also note that '\n' cannot be used to match a <newline>
       at the end of an arbitrary input line; <newline> characters appear in
       the pattern space as a result of the N editing command.

EXAMPLES
       This sed script simulates the BSD cat -s command, squeezing excess
       empty lines from standard input.

           sed -n '
           # Write non-empty lines.
           /./ {
               p
               d
               }
           # Write a single empty line, then look for more empty lines.
           /^$/    p
           # Get next line, discard the held <newline> (empty line),
           # and look for more empty lines.
           :Empty
           /^$/    {
               N
               s/.//
               b Empty
               }
           # Write the non-empty line before going back to search
           # for the first in a set of empty lines.
               p
           '

       The following sed command is a much simpler method of squeezing empty
       lines, although it is not quite the same as cat -s since it removes any
       initial empty lines:

           sed -n '/./,/^$/p'

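       As a quick check of the simpler command, the sketch below feeds it a
       sample input (chosen here for illustration) that starts with two empty
       lines and has three empty lines between a and b. The variable name
       squeezed is likewise illustrative:

```shell
# Input lines: "", "", "a", "", "", "", "b"
squeezed=$(printf '\n\na\n\n\n\nb\n' | sed -n '/./,/^$/p')

printf '%s\n' "$squeezed"
```

       The initial empty lines are removed and the run of three empty lines
       is squeezed to one, leaving a, an empty line, and b.
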
RATIONALE
       This volume of POSIX.1-2008 requires implementations to support at
       least ten distinct wfiles, matching historical practice on many
       implementations. Implementations are encouraged to support more, but
       conforming applications should not exceed this limit.

       The exit status codes specified here are different from those in System
       V. System V returns 2 for garbled sed commands, but returns zero with
       its usage message or if the input file could not be opened. The
       standard developers considered this to be a bug.

       The manner in which the l command writes non-printable characters was
       changed to avoid the historical backspace-overstrike method, and other
       requirements to achieve unambiguous output were added. See the
       RATIONALE for ed for details of the format chosen, which is the same as
       that chosen for sed.

       This volume of POSIX.1-2008 requires implementations to provide pattern
       and hold spaces of at least 8192 bytes, larger than the 4000-byte
       spaces used by some historical implementations, but less than the
       20480-byte limit used in an early proposal. Implementations are
       encouraged to allocate dynamically larger pattern and hold spaces as
       needed.

       The requirements for acceptance of <blank> and <space> characters in
       command lines have been made more explicit than in early proposals to
       describe clearly the historical practice and to remove confusion about
       the phrase ``protect initial blanks [sic] and tabs from the stripping
       that is done on every script line'' that appears in much of the
       historical documentation of the sed utility description of text. (Not
       all implementations are known to have stripped <blank> characters from
       text lines, although they all have allowed leading <blank> characters
       preceding the address on a command line.)

       The treatment of '#' comments differs from the SVID which only allows a
       comment as the first line of the script, but matches BSD-derived
       implementations. The comment character is treated as a command, and it
       has the same properties in terms of being accepted with leading <blank>
       characters; the BSD implementation has historically supported this.

       Early proposals required that a script_file have at least one
       non-comment line. Some historical implementations have behaved in
       unexpected ways if this were not the case. The standard developers
       considered that this was incorrect behavior and that application
       developers should not have to avoid this feature. A correct
       implementation of this volume of POSIX.1-2008 shall permit script_files
       that consist only of comment lines.

       Early proposals indicated that if -e and -f options were intermixed,
       all -e options were processed before any -f options. This has been
       changed to process them in the order presented because it matches
       historical practice and is more intuitive.

       The treatment of the p flag to the s command differs between System V
       and BSD-based systems when the default output is suppressed. In the two
       examples:

           echo a | sed    's/a/A/p'
           echo a | sed -n 's/a/A/p'

       this volume of POSIX.1-2008, BSD, System V documentation, and the SVID
       indicate that the first example should write two lines with A, whereas
       the second should write one. Some System V systems write the A only
       once in both examples because the p flag is ignored if the -n option is
       not specified.

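       On an implementation that follows the behavior adopted by this volume
       of POSIX.1-2008, the two examples can be checked with a sketch such as
       the following (the variable names twice and once are illustrative):

```shell
# Without -n: the p flag writes the pattern space once, and the default
# output at the end of the script writes it again.
twice=$(echo a | sed 's/a/A/p')

# With -n: only the p flag writes the pattern space.
once=$(echo a | sed -n 's/a/A/p')

printf '%s\n' "$twice" "$once"
```
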
       This is a case of a diametrical difference between systems that could
       not be reconciled through the compromise of declaring the behavior to
       be unspecified. The SVID/BSD/System V documentation behavior was
       adopted for this volume of POSIX.1-2008 because:

        *  No known documentation for any historic system describes the
           interaction between the p flag and the -n option.

        *  The selected behavior is more correct as there is no technical
           justification for any interaction between the p flag and the -n
           option. A relationship between -n and the p flag might imply that
           they are only used together, but this ignores valid scripts that
           interrupt the cyclical nature of the processing through the use of
           the D, d, q, or branching commands. Such scripts rely on the p
           suffix to write the pattern space because they do not make use of
           the default output at the ``bottom'' of the script.

        *  Because the -n option makes the p flag unnecessary, any interaction
           would only be useful if sed scripts were written to run both with
           and without the -n option. This is believed to be unlikely. It is
           even more unlikely that programmers have coded the p flag expecting
           it to be unnecessary. Because the interaction was not documented,
           the likelihood of a programmer discovering the interaction and
           depending on it is further decreased.

        *  Finally, scripts that break under the specified behavior produce
           too much output instead of too little, which is easier to diagnose
           and correct.

       The form of the substitute command that uses the n suffix was limited
       to the first 512 matches in an early proposal. This limit has been
       removed because there is no reason an editor processing lines of
       {LINE_MAX} length should have this restriction. The command s/a/A/2047
       should be able to substitute the 2047th occurrence of a on a line.

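       A small-scale sketch of the n suffix, next to the g flag for
       comparison, follows. The sample input and the variable names are
       illustrative only:

```shell
# A numeric flag substitutes only that occurrence of the BRE.
second=$(echo 'a a a' | sed 's/a/A/2')

# The g flag substitutes every non-overlapping occurrence.
all=$(echo 'a a a' | sed 's/a/A/g')

printf '%s\n' "$second" "$all"
```

       The first command should yield "a A a" and the second "A A A".
       Combining g with a numeric flag is left out because the results are
       unspecified.
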
       The b, t, and : commands are documented to ignore leading white space,
       but no mention is made of trailing white space. Historical
       implementations of sed assigned different locations to the labels 'x'
       and "x ". This is not useful, and leads to subtle programming errors,
       but it is historical practice, and changing it could theoretically
       break working scripts. Implementors are encouraged to provide warning
       messages about labels that are never used or jumps to labels that do
       not exist.

       Historically, the sed ! and } editing commands did not permit multiple
       commands on a single line using a <semicolon> as a command delimiter.
       Implementations are permitted, but not required, to support this
       extension.

       Earlier versions of this standard allowed for implementations with
       bytes other than eight bits, but this has been modified in this
       version.

FUTURE DIRECTIONS
       None.

SEE ALSO
       awk, ed, grep

       The Base Definitions volume of POSIX.1-2008, Table 5-1, Escape
       Sequences and Associated Actions, Chapter 8, Environment Variables,
       Section 9.3, Basic Regular Expressions, Section 12.2, Utility Syntax
       Guidelines

COPYRIGHT
       Portions of this text are reprinted and reproduced in electronic form
       from IEEE Std 1003.1, 2013 Edition, Standard for Information Technology
       -- Portable Operating System Interface (POSIX), The Open Group Base
       Specifications Issue 7, Copyright (C) 2013 by the Institute of
       Electrical and Electronics Engineers, Inc and The Open Group. (This is
       POSIX.1-2008 with the 2013 Technical Corrigendum 1 applied.) In the
       event of any discrepancy between this version and the original IEEE and
       The Open Group Standard, the original IEEE and The Open Group Standard
       is the referee document. The original Standard can be obtained online
       at http://www.unix.org/online.html .

       Any typographical or formatting errors that appear in this page are
       most likely to have been introduced during the conversion of the source
       files to man page format. To report such errors, see
       https://www.kernel.org/doc/man-pages/reporting_bugs.html .

IEEE/The Open Group                  2013                              SED(1P)