SPAX(1L) Schily's USER COMMANDS SPAX(1L)

NAME
pax - portable archive interchange

SYNOPSIS
spax [other options] [-cdnv] [-H|-L] [-f archive]
[-o options]... [-s replstr]... [pattern...]

spax -r [other options] [-cdiknuv] [-H|-L] [-f archive]
[-o options]... [-p string]... [-s replstr]... [pattern...]

spax -w [other options] [-dituvX] [-H|-L] [-b blocksize] [-a]
[-f archive] [-o options]... [-s replstr]... [-x format]
[file...]

spax -r -w [other options] [-diklntuvX] [-H|-L] [-o options]...
[-p string]... [-s replstr]... [file...] directory

DESCRIPTION
26 The pax utility shall read, write, and write lists of the members of
27 archive files and copy directory hierarchies. A variety of archive for‐
28 mats shall be supported; see the -x format option.
29
30 The action to be taken depends on the presence of the -r and -w
31 options. The four combinations of -r and -w are referred to as the four
32 modes of operation: list, read, write, and copy modes, corresponding
33 respectively to the four forms shown in the SYNOPSIS section.
34
35 list In list mode (when neither -r nor -w are specified), pax shall
36 write the names of the members of the archive file read from the
37 standard input, with pathnames matching the specified patterns,
38 to standard output. If a named file is of type directory, the
39 file hierarchy rooted at that file shall be listed as well.
40
41 read In read mode (when -r is specified, but -w is not), pax shall
42 extract the members of the archive file read from the standard
43 input, with pathnames matching the specified patterns. If an
44 extracted file is of type directory, the file hierarchy rooted
45 at that file shall be extracted as well. The extracted files
46 shall be created performing pathname resolution with the direc‐
47 tory in which pax was invoked as the current working directory.
48
49 If an attempt is made to extract a directory when the directory
50 already exists, this shall not be considered an error. If an
51 attempt is made to extract a FIFO when the FIFO already exists,
52 this shall not be considered an error.
53
54 The ownership, access, and modification times, and file mode of
55 the restored files are discussed under the -p option.
56
57 write In write mode (when -w is specified, but -r is not), pax shall
58 write the contents of the file operands to the standard output
59 in an archive format. If no file operands are specified, a list
60 of files to copy, one per line, shall be read from the standard
61 input. A file of type directory shall include all of the files
62 in the file hierarchy rooted at the file.
63
64 copy In copy mode (when both -r and -w are specified), pax shall copy
65 the file operands to the destination directory.
66
67 If no file operands are specified, a list of files to copy, one
68 per line, shall be read from the standard input. A file of type
69 directory shall include all of the files in the file hierarchy
70 rooted at the file.
71
72 The effect of the copy shall be as if the copied files were
73 written to an archive file and then subsequently extracted,
74 except that there may be hard links between the original and the
75 copied files. If the destination directory is a subdirectory of
76 one of the files to be copied, the results are unspecified. If
77 the destination directory is a file of a type not defined by the
78 System Interfaces volume of IEEE Std 1003.1-2001, the results
79 are implementation-defined; otherwise, it shall be an error for
80 the file named by the directory operand not to exist, not be
81 writable by the user, or not be a file of type directory.
82
83 In read or copy modes, if intermediate directories are necessary to
84 extract an archive member, pax shall perform actions equivalent to the
85 mkdir() function defined in the System Interfaces volume of IEEE Std
86 1003.1-2001, called with the following arguments:
87
88 · The intermediate directory used as the path argument.
89
90 · The value of the bitwise-inclusive OR of S_IRWXU, S_IRWXG, and
91 S_IRWXO as the mode argument.
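
The mode argument described above can be sketched as follows. Python is used only for illustration and is not part of pax; the function name intermediate_dir_mode is hypothetical.

```python
import stat


def intermediate_dir_mode():
    """Return the mode argument used when creating intermediate directories.

    This is the bitwise-inclusive OR of S_IRWXU, S_IRWXG, and S_IRWXO.
    """
    return stat.S_IRWXU | stat.S_IRWXG | stat.S_IRWXO


if __name__ == "__main__":
    # Read, write, and search permission for user, group, and others.
    print(oct(intermediate_dir_mode()))
```

As with mkdir() itself, the permissions actually given to the new directory are this value with the bits in the file mode creation mask of the process cleared.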
92
93 If any specified pattern or file operands are not matched by at least
94 one file or archive member, pax shall write a diagnostic message to
95 standard error for each one that did not match and exit with a non-zero
96 exit status.
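
The rule above for unmatched operands can be sketched as follows. This is an illustration only: Python's fnmatch is an assumed stand-in for the pattern matching notation (it lets '*' match '/', which may differ from pax), and the function name and diagnostic wording are hypothetical.

```python
import fnmatch
import sys


def report_unmatched(patterns, members):
    """Return 0 if every pattern matched at least one member, else 1.

    One diagnostic line is written to standard error for each pattern
    that matched nothing.
    """
    unmatched = [
        p for p in patterns
        if not any(fnmatch.fnmatchcase(m, p) for m in members)
    ]
    for p in unmatched:
        # The wording of the message is illustrative, not specified.
        print(f"pax: {p}: no match", file=sys.stderr)
    return 1 if unmatched else 0
```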
97
98 The archive formats described in the EXTENDED DESCRIPTION section shall
99 be automatically detected on input. The default output archive format
100 shall be implementation-defined.
101
102 The spax implementation defaults to -x ustar.
103
104 A single archive can span multiple files. The pax utility shall deter‐
105 mine, in an implementation-defined manner, what file to read or write
106 as the next file.
107
108 If the selected archive format supports the specification of linked
109 files, it shall be an error if these files cannot be linked when the
110 archive is extracted, except that if the files to be linked are sym‐
111 bolic links and the system is not capable of making hard links to sym‐
112 bolic links, then separate copies of the symbolic link shall be created
113 instead. For archive formats that do not store file contents with each
114 name that causes a hard link, if the file that contains the data is not
115 extracted during this pax session, either the data shall be restored
116 from the original file, or a diagnostic message shall be displayed with
117 the name of a file that can be used to extract the data. In traversing
118 directories, pax shall detect infinite loops; that is, entering a pre‐
119 viously visited directory that is an ancestor of the last file visited.
120 When it detects an infinite loop, pax shall write a diagnostic message
121 to standard error and shall terminate.
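
One possible way to detect the infinite loops described above is to remember the device and inode numbers of the directories on the path from the root to the current file. The following Python sketch assumes that symbolic links are followed (as with -L); the function name is hypothetical, and raising an exception stands in for writing the diagnostic and terminating.

```python
import os


def walk_detecting_loops(path, ancestors=()):
    """Yield path and everything below it, following symbolic links.

    ancestors holds the (st_dev, st_ino) pairs of the directories that
    lead to path. Entering one of them again means the traversal would
    never end, so RuntimeError is raised.
    """
    st = os.stat(path)  # follows symbolic links
    key = (st.st_dev, st.st_ino)
    if key in ancestors:
        raise RuntimeError(f"directory loop detected at {path}")
    yield path
    if os.path.isdir(path):
        for name in sorted(os.listdir(path)):
            child = os.path.join(path, name)
            yield from walk_detecting_loops(child, ancestors + (key,))
```

Only ancestors are tracked, so a directory that is reachable twice through different links is visited twice rather than reported as a loop.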

OPTIONS
125 The pax utility shall conform to the Base Definitions volume of IEEE
126 Std 1003.1-2001, Section 12.2, Utility Syntax Guidelines, except that
127 the order of presentation of the -o, -p, and -s options is significant.
128 See also the OTHER OPTIONS section.
129
130 The following options shall be supported:
131
132 -r Read an archive file from standard input.
133
134 -w Write files to the standard output in the specified archive for‐
135 mat.
136
137 -a Append files to the end of the archive. It is implementation-
138 defined which devices on the system support appending. Addi‐
139 tional file formats unspecified by this volume of IEEE Std
140 1003.1-2001 may impose restrictions on appending.
141
142 -b blocksize
143 Block the output at a positive decimal integer number of bytes
144 per write to the archive file. Devices and archive formats may
145 impose restrictions on blocking. Blocking shall be automatically
146 determined on input. Conforming applications shall not specify a
147 blocksize value larger than 32256. Default blocking when creat‐
148 ing archives depends on the archive format. (See the -x option
149 below.)
150
151 -c Match all file or archive members except those specified by the
152 pattern or file operands.
153
154 -d Cause files of type directory being copied or archived or ar‐
155 chive members of type directory being extracted or listed to
156 match only the file or archive member itself and not the file
157 hierarchy rooted at the file.
158
159 -f archive
160 Specify the pathname of the input or output archive, overriding
161 the default standard input (in list or read modes) or standard
162 output (write mode).
163
164 -H If a symbolic link referencing a file of type directory is spec‐
165 ified on the command line, pax shall archive the file hierarchy
166 rooted in the file referenced by the link, using the name of the
167 link as the root of the file hierarchy. Otherwise, if a sym‐
168 bolic link referencing a file of any other file type which pax
169 can normally archive is specified on the command line, then pax
170 shall archive the file referenced by the link, using the name of
171 the link. The default behavior shall be to archive the symbolic
172 link itself.
173
174 -i Interactively rename files or archive members. For each archive
175 member matching a pattern operand or file matching a file oper‐
176 and, a prompt shall be written to the file /dev/tty. The prompt
177 shall contain the name of the file or archive member, but the
178 format is otherwise unspecified. A line shall then be read from
179 /dev/tty. If this line is blank, the file or archive member
180 shall be skipped. If this line consists of a single period, the
181 file or archive member shall be processed with no modification
182 to its name. Otherwise, its name shall be replaced with the con‐
183 tents of the line. The pax utility shall immediately exit with a
184 non-zero exit status if end-of-file is encountered when reading
185 a response or if /dev/tty cannot be opened for reading and writ‐
186 ing.
187
188 The results of extracting a hard link to a file that has been
189 renamed during extraction are unspecified.
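
The three possible outcomes of an -i response can be sketched as follows. This Python fragment is an illustration only; the function name is hypothetical, and treating a line of only white space as blank is an assumption.

```python
def apply_interactive_response(name, line):
    """Map one line read from /dev/tty to the resulting member name.

    Returns None when the member is to be skipped, the original name
    when the response is a single period, and the response otherwise.
    """
    response = line.rstrip("\n")
    if response.strip() == "":
        return None          # blank line: skip this member
    if response == ".":
        return name          # single period: keep the name unchanged
    return response          # anything else: the new name
```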
190
191 -k Prevent the overwriting of existing files.
192
193 -l (The letter ell.) In copy mode, hard links shall be made between
194 the source and destination file hierarchies whenever possible.
195 If specified in conjunction with -H or -L, when a symbolic link
196 is encountered, the hard link created in the destination file
197 hierarchy shall be to the file referenced by the symbolic link.
198 If specified when neither -H nor -L is specified, when a sym‐
199 bolic link is encountered, the implementation shall create a
200 hard link to the symbolic link in the source file hierarchy or
201 copy the symbolic link to the destination.
202
203 -L If a symbolic link referencing a file of type directory is spec‐
204 ified on the command line or encountered during the traversal of
205 a file hierarchy, pax shall archive the file hierarchy rooted in
206 the file referenced by the link, using the name of the link as
207 the root of the file hierarchy. Otherwise, if a symbolic link
208 referencing a file of any other file type which pax can normally
209 archive is specified on the command line or encountered during
210 the traversal of a file hierarchy, pax shall archive the file
211 referenced by the link, using the name of the link. The default
212 behavior shall be to archive the symbolic link itself.
213
214 -n Select the first archive member that matches each pattern oper‐
215 and. No more than one archive member shall be matched for each
216 pattern (although members of type directory shall still match
217 the file hierarchy rooted at that file).
218
219 -o options
220 Provide information to the implementation to modify the algo‐
221 rithm for extracting or writing files. The value of options
222 shall consist of one or more comma-separated keywords of the
223 form:
224
225 keyword[[:]=value][,keyword[[:]=value],...]
226
227 Some keywords apply only to certain file formats, as indicated
228 with each description. Use of keywords that are inapplicable to
229 the file format being processed produces undefined results.
230
231 Keywords in the options argument shall be a string that would be
232 a valid portable filename as described in the Base Definitions
233 volume of IEEE Std 1003.1-2001, Section 3.276, Portable Filename
234 Character Set.
235
236 Note: Keywords are not expected to be filenames, merely to fol‐
237 low the same character composition rules as portable
238 filenames.
239
240 Keywords can be preceded with white space. The value field shall
241 consist of zero or more characters; within value, the applica‐
242 tion shall precede any literal comma with a backslash, which
243 shall be ignored, but preserves the comma as part of value. A
244 comma as the final character, or a comma followed solely by
245 white space as the final characters, in options shall be
246 ignored. Multiple -o options can be specified; if keywords given
247 to these multiple -o options conflict, the keywords and values
248 appearing later in command line sequence shall take precedence
249 and the earlier shall be silently ignored. The following keyword
250 values of options shall be supported for the file formats as
251 indicated:
252
253 delete=pattern
254 (Applicable only to the -x pax format.) When used in
255 write or copy mode, pax shall omit from extended header
256 records that it produces any keywords matching the string
257 pattern. When used in read or list mode, pax shall ignore
258 any keywords matching the string pattern in the extended
259 header records. In both cases, matching shall be per‐
260 formed using the pattern matching notation described in
261 Patterns Matching a Single Character and Patterns Match‐
262 ing Multiple Characters. For example:
263
264 -o delete=security.*
265
266 would suppress security-related information. See pax
267 Extended Header for extended header record keyword usage.
268
269 When multiple -o delete=pattern options are specified,
270 the patterns shall be additive; all keywords matching the
271 specified string patterns shall be omitted from extended
272 header records that pax produces.
273
274 exthdr.name=string
275 (Applicable only to the -x pax format.) This keyword
276 allows user control over the name that is written into
277 the ustar header blocks for the extended header produced
278 under the circumstances described in pax Header Block.
279 The name shall be the contents of string, after the fol‐
280 lowing character substitutions have been made:
281
282 ┌─────────────────┬─────────────────────────────────────────────┐
283 │string Includes: │ Replaced By: │
284 ├─────────────────┼─────────────────────────────────────────────┤
285 │%d │ The directory name of the file, equivalent │
286 │ │ to the result of the dirname utility on the │
287 │ │ translated pathname. │
288 ├─────────────────┼─────────────────────────────────────────────┤
289 │%f │ The filename of the file, equivalent to the │
290 │ │ result of the basename utility on the │
291 │ │ translated pathname. │
292 ├─────────────────┼─────────────────────────────────────────────┤
293 │%p │ The process ID of the pax process. │
294 ├─────────────────┼─────────────────────────────────────────────┤
295 │%% │ A '%' character. │
296 └─────────────────┴─────────────────────────────────────────────┘
297 Any other '%' characters in string produce undefined
298 results.
299
If no -o exthdr.name=string is specified, pax shall use
the following default value:
302
303 %d/PaxHeaders.%p/%f
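
The substitutions for exthdr.name can be sketched as follows. This Python fragment is an illustration, not the spax implementation: the function name is hypothetical, posixpath.dirname and posixpath.basename only approximate the dirname and basename utilities (for example with trailing slashes), and raising an error is just one choice for the undefined case of any other '%' sequence.

```python
import posixpath


def expand_exthdr_name(template, pathname, pid):
    """Expand %d, %f, %p, and %% in an exthdr.name string."""
    subs = {
        "d": posixpath.dirname(pathname) or ".",  # dirname prints "." here
        "f": posixpath.basename(pathname),
        "p": str(pid),
        "%": "%",
    }
    out = []
    i = 0
    while i < len(template):
        ch = template[i]
        if ch != "%":
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(template) or template[i + 1] not in subs:
            # Undefined by the text above; this sketch rejects it.
            raise ValueError(f"unsupported % sequence in {template!r}")
        out.append(subs[template[i + 1]])
        i += 2
    return "".join(out)


if __name__ == "__main__":
    print(expand_exthdr_name("%d/PaxHeaders.%p/%f", "usr/lib/libc.so", 1234))
```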
304
305 globexthdr.name=string
306 (Applicable only to the -x pax format.) When used in
307 write or copy mode with the appropriate options, pax
308 shall create global extended header records with ustar
309 header blocks that will be treated as regular files by
310 previous versions of pax. This keyword allows user con‐
311 trol over the name that is written into the ustar header
312 blocks for global extended header records. The name shall
313 be the contents of string, after the following character
314 substitutions have been made:
315
316 ┌─────────────────┬─────────────────────────────────────────────┐
317 │string Includes: │ Replaced By: │
318 ├─────────────────┼─────────────────────────────────────────────┤
319 │%n │ An integer that represents the sequence │
320 │ │ number of the global extended header record │
321 │ │ in the archive, starting at 1. │
322 ├─────────────────┼─────────────────────────────────────────────┤
323 │%p │ The process ID of the pax process. │
324 ├─────────────────┼─────────────────────────────────────────────┤
325 │%% │ A '%' character. │
326 └─────────────────┴─────────────────────────────────────────────┘
327 Any other '%' characters in string produce undefined
328 results.
329
330 If no -o globexthdr.name=string is specified, pax shall
331 use the following default value:
332
333 $TMPDIR/GlobalHead.%p.%n
334
335 where $TMPDIR represents the value of the TMPDIR environ‐
336 ment variable. If TMPDIR is not set, pax shall use /tmp.
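
The default name for global extended headers, including the fallback to /tmp, can be sketched as follows. This is a Python illustration with a hypothetical function name; the behavior when TMPDIR is set but empty is not stated above and is left as is.

```python
import os


def default_globexthdr_name(sequence, pid, environ=None):
    """Build the default $TMPDIR/GlobalHead.%p.%n name.

    sequence is the number of the global extended header record in the
    archive, starting at 1. environ defaults to the process environment.
    """
    env = os.environ if environ is None else environ
    tmpdir = env.get("TMPDIR", "/tmp")  # /tmp when TMPDIR is not set
    return f"{tmpdir}/GlobalHead.{pid}.{sequence}"
```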
337
338 invalid=action
339 (Applicable only to the -x pax format.) This keyword
340 allows user control over the action pax takes upon
341 encountering values in an extended header record that, in
342 read or copy mode, are invalid in the destination hierar‐
343 chy or, in list mode, cannot be written in the codeset
344 and current locale of the implementation. The following
345 are invalid values that shall be recognized by pax:
346
347 + In read or copy mode, a filename or link name that
348 contains character encodings invalid in the desti‐
349 nation hierarchy. (For example, the name may con‐
350 tain embedded NULs.)
351
352 + In read or copy mode, a filename or link name that
353 is longer than the maximum allowed in the destina‐
354 tion hierarchy (for either a pathname component or
355 the entire pathname).
356
357 + In list mode, any character string value (file‐
358 name, link name, user name, and so on) that cannot
359 be written in the codeset and current locale of
360 the implementation.
361
362 The following mutually-exclusive values of the action
363 argument are supported:
364
365 bypass In read or copy mode, pax shall bypass the file,
366 causing no change to the destination hierarchy. In
367 list mode, pax shall write all requested valid
368 values for the file, but its method for writing
369 invalid values is unspecified.
370
371 rename In read or copy mode, pax shall act as if the -i
372 option were in effect for each file with invalid
373 filename or link name values, allowing the user to
374 provide a replacement name interactively. In list
375 mode, pax shall behave identically to the bypass
376 action.
377
378 UTF-8 When used in read, copy, or list mode and a file‐
379 name, link name, owner name, or any other field in
380 an extended header record cannot be translated
381 from the pax UTF-8 codeset format to the codeset
382 and current locale of the implementation, pax
383 shall use the actual UTF-8 encoding for the name.
384
385 write In read or copy mode, pax shall write the file,
386 translating the name, regardless of whether this
387 may overwrite an existing file with a valid name.
388 In list mode, pax shall behave identically to the
389 bypass action.
390
If no -o invalid= option is specified, pax shall act as if
-o invalid=bypass were specified. Any overwriting of
existing files that may be allowed by the -o invalid=
actions shall be subject to permission (-p) and modification
time (-u) restrictions, and shall be suppressed if
the -k option is also specified.
397
398 linkdata
399 (Applicable only to the -x pax format.) In write mode,
400 pax shall write the contents of a file to the archive
401 even when that file is merely a hard link to a file whose
402 contents have already been written to the archive.
403
404 listopt=format
405 This keyword specifies the output format of the table of
406 contents produced when the -v option is specified in list
mode. See List Mode Format Specifications. To avoid ambiguity,
the listopt=format shall be the only or final
keyword=value pair in a -o option-argument; all characters
in the remainder of the option-argument shall be
considered part of the format string. When multiple
-o listopt=format options are specified, the format strings
shall be considered a single, concatenated string, evaluated
in command line order.
415
416 times (Applicable only to the -x pax format.) When used in
417 write or copy mode, pax shall include atime and mtime
418 extended header records for each file. See pax Extended
419 Header File Times.
420
421 In addition to these keywords, if the -x pax format is speci‐
422 fied, any of the keywords and values defined in pax Extended
423 Header, including implementation extensions, can be used in -o
424 option-arguments, in either of two modes:
425
426 keyword=value
427 When used in write or copy mode, these keyword/value
428 pairs shall be included at the beginning of the archive
429 as typeflag g global extended header records. When used
430 in read or list mode, these keyword/value pairs shall act
431 as if they had been at the beginning of the archive as
432 typeflag g global extended header records.
433
434 keyword:=value
435 When used in write or copy mode, these keyword/value
436 pairs shall be included as records at the beginning of a
437 typeflag x extended header for each file. (This shall be
438 equivalent to the equal-sign form except that it creates
439 no typeflag g global extended header records.) When used
440 in read or list mode, these keyword/value pairs shall act
441 as if they were included as records at the end of each
442 extended header; thus, they shall override any global or
443 file-specific extended header record keywords of the same
444 names. For example, in the command:
445
446 pax -r -o "gname:=mygroup," <archive
447
448 the group name will be forced to a new value for all
449 files read from the archive.
450
451 The precedence of -o keywords over various fields in the archive
452 is described in pax Extended Header Keyword Precedence.
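
The syntax rules for -o option-arguments (comma separation, backslash-protected commas, ignored trailing commas, later keywords overriding earlier ones, and additive delete patterns) can be sketched as follows. This Python fragment is an illustration only, with hypothetical function names. It does not implement the special listopt rule, does not validate keywords against the portable filename character set, and also skips empty fields in the middle of an argument, which the text above does not address.

```python
def split_keyword(text):
    """Split 'keyword', 'keyword=value', or 'keyword:=value'."""
    keyword, sep, value = text.partition("=")
    if not sep:
        return text, "", ""              # bare keyword, e.g. linkdata
    if keyword.endswith(":"):
        return keyword[:-1], ":=", value
    return keyword, "=", value


def parse_o_options(argument):
    """Return a list of (keyword, operator, value) for one -o argument."""
    result = []
    pos, n = 0, len(argument)
    while pos < n:
        field = []
        while pos < n and argument[pos] != ",":
            if argument[pos] == "\\" and pos + 1 < n and argument[pos + 1] == ",":
                field.append(",")        # backslash dropped, comma kept
                pos += 2
            else:
                field.append(argument[pos])
                pos += 1
        pos += 1                          # step over the separating comma
        text = "".join(field).lstrip()   # keywords may follow white space
        if text:
            result.append(split_keyword(text))
    return result


def merge_o_options(arguments):
    """Combine several -o arguments in command line order."""
    merged, deletes = {}, []
    for argument in arguments:
        for keyword, op, value in parse_o_options(argument):
            if keyword == "delete":
                deletes.append(value)    # delete patterns are additive
            else:
                merged[keyword] = (op, value)  # later settings win
    return merged, deletes
```

For the example above, parse_o_options("gname:=mygroup,") yields a single entry for gname with the ":=" form, and the trailing comma is ignored.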
453
454 -p string
455 Specify one or more file characteristic options (privileges).
456 The string option-argument shall be a string specifying file
457 characteristics to be retained or discarded on extraction. The
string shall consist of the specification characters a, e, m,
o, and p. Other implementation-defined characters can be
included. Multiple characteristics can be concatenated within
the same string and multiple -p options can be specified. The
meanings of the specification characters are as follows:
463
464 a Do not preserve file access times.
465
466 e Preserve the user ID, group ID, file mode bits (see the
467 Base Definitions volume of IEEE Std 1003.1-2001, Section
468 3.168, File Mode Bits), access time, modification time,
469 and any other implementation-defined file characteris‐
470 tics.
471
m Do not preserve file modification times.
475
476 o Preserve the user ID and group ID.
477
478 p Preserve the file mode bits. Other implementation-defined
479 file mode attributes may be preserved.
480
481 In the preceding list, "preserve" indicates that an attribute
482 stored in the archive shall be given to the extracted file, sub‐
483 ject to the permissions of the invoking process. The access and
484 modification times of the file shall be preserved unless other‐
485 wise specified with the -p option or not stored in the archive.
486 All attributes that are not preserved shall be determined as
487 part of the normal file creation action (see File Read, Write,
488 and Creation).
489
490 If neither the e nor the o specification character is specified,
491 or the user ID and group ID are not preserved for any reason,
492 pax shall not set the S_ISUID and S_ISGID bits of the file mode.
493
494 If the preservation of any of these items fails for any reason,
495 pax shall write a diagnostic message to standard error. Failure
496 to preserve these items shall affect the final exit status, but
497 shall not cause the extracted file to be deleted.
498
499 If file characteristic letters in any of the string option-argu‐
500 ments are duplicated or conflict with each other, the ones given
501 last shall take precedence. For example, if -p eme is specified,
502 file modification times are preserved.
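
The precedence rule for -p specification characters can be sketched as follows. This Python fragment is a simplified illustration with a hypothetical function name: it assumes that times are preserved by default while owner and mode are not, and it ignores implementation-defined characters and the S_ISUID and S_ISGID rule.

```python
def resolve_p_strings(strings):
    """Fold all -p option-arguments into one set of decisions.

    Characters are processed left to right across all strings, so the
    one given last takes precedence.
    """
    state = {"atime": True, "mtime": True, "owner": False, "mode": False}
    for ch in "".join(strings):
        if ch == "a":
            state["atime"] = False
        elif ch == "m":
            state["mtime"] = False
        elif ch == "o":
            state["owner"] = True
        elif ch == "p":
            state["mode"] = True
        elif ch == "e":
            state = dict.fromkeys(state, True)  # preserve everything
        else:
            raise ValueError(f"unsupported specification character {ch!r}")
    return state
```

With the string eme, the first e preserves everything, m then discards the modification time, and the final e preserves it again, which matches the example in the text.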
503
504 -s replstr
505 Modify file or archive member names named by pattern or file op‐
506 erands according to the substitution expression replstr, using
507 the syntax of the ed utility. The concepts of "address" and
508 "line" are meaningless in the context of the pax utility, and
509 shall not be supplied. The format shall be:
510
511 -s /old/new/[gp]
512
where, as in ed, old is a basic regular expression and new can
contain an ampersand, '\n' (where n is a digit) backreferences,
or subexpression matching. The old string shall also be permitted
to contain <newline>s.

Any non-null character can be used as a delimiter ('/' shown
here). Multiple -s expressions can be specified; the expressions
520 shall be applied in the order specified, terminating with the
521 first successful substitution. The optional trailing 'g' is as
522 defined in the ed utility. The optional trailing 'p' shall cause
523 successful substitutions to be written to standard error. File
524 or archive member names that substitute to the empty string
525 shall be ignored when reading and writing archives.
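
The way several -s expressions are applied can be sketched as follows. This Python fragment is an illustration only. It uses Python's re syntax for old, which is not the basic regular expression syntax of ed (for example, groups are written ( ) rather than \( \)). It also assumes that the delimiter does not occur inside old or new. The function names and the format of the 'p' output are hypothetical.

```python
import re
import sys


def parse_replstr(replstr):
    """Split '/old/new/[gp]' into (old, new, flags)."""
    delim = replstr[0]                    # any character may delimit
    parts = replstr[1:].split(delim)
    if len(parts) != 3 or set(parts[2]) - set("gp"):
        raise ValueError(f"malformed replstr: {replstr!r}")
    return parts[0], parts[1], parts[2]


def expand_replacement(match, new):
    """Expand '&' and backslash-digit references in new."""
    out, i = [], 0
    while i < len(new):
        ch = new[i]
        if ch == "\\" and i + 1 < len(new):
            nxt = new[i + 1]
            if nxt in "0123456789":
                out.append(match.group(int(nxt)) or "")
            else:
                out.append(nxt)           # escaped literal, e.g. \&
            i += 2
        elif ch == "&":
            out.append(match.group(0))    # the whole matched text
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def rename_member(name, replstrs):
    """Apply -s expressions in order, stopping at the first that matches.

    Returns None when the name substitutes to the empty string, meaning
    the member is ignored.
    """
    for replstr in replstrs:
        old, new, flags = parse_replstr(replstr)
        pattern = re.compile(old)
        if pattern.search(name) is None:
            continue
        count = 0 if "g" in flags else 1  # 0 replaces every match
        result = pattern.sub(lambda m: expand_replacement(m, new), name, count=count)
        if "p" in flags:
            print(f"{name} >> {result}", file=sys.stderr)  # illustrative format
        return result or None
    return name
```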
526
527 -t When reading files from the file system, and if the user has the
528 permissions required by utime() to do so, set the access time of
529 each file read to the access time that it had before being read
530 by pax.
531
532 -u Ignore files that are older (having a less recent file modifica‐
533 tion time) than a pre-existing file or archive member with the
534 same name. In read mode, an archive member with the same name as
535 a file in the file system shall be extracted if the archive mem‐
536 ber is newer than the file. In write mode, an archive file mem‐
537 ber with the same name as a file in the file system shall be
538 superseded if the file is newer than the archive member. If -a
539 is also specified, this is accomplished by appending to the ar‐
540 chive; otherwise, it is unspecified whether this is accomplished
541 by actual replacement in the archive or by appending to the ar‐
542 chive. In copy mode, the file in the destination hierarchy shall
543 be replaced by the file in the source hierarchy or by a link to
544 the file in the source hierarchy if the file in the source hier‐
545 archy is newer.
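
The read mode comparison made by -u can be sketched as follows. This Python fragment is an illustration with a hypothetical function name; it assumes that a member whose modification time equals that of the existing file is not newer, and that a missing file is always extracted.

```python
import os


def should_extract(member_mtime, path):
    """Decide whether an archive member replaces the file at path.

    member_mtime is the modification time stored in the archive, in
    seconds since the Epoch.
    """
    try:
        existing_mtime = os.stat(path).st_mtime
    except FileNotFoundError:
        return True                       # nothing to compare against
    return member_mtime > existing_mtime  # extract only if strictly newer
```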
546
547 -v In list mode, produce a verbose table of contents (see the STD‐
548 OUT section). Otherwise, write archive member pathnames to stan‐
549 dard error (see the STDERR section).
550
551 -x format
552 Specify the output archive format. The pax utility shall support
553 the following formats:
554
555 cpio The cpio interchange format; see the EXTENDED DESCRIPTION
556 section. The default blocksize for this format for char‐
557 acter special archive files shall be 5120. Implementa‐
558 tions shall support all blocksize values less than or
559 equal to 32256 that are multiples of 512.
560
561 pax The pax interchange format; see the EXTENDED DESCRIPTION
562 section. The default blocksize for this format for char‐
563 acter special archive files shall be 5120. Implementa‐
564 tions shall support all blocksize values less than or
565 equal to 32256 that are multiples of 512.
566
567 ustar The tar interchange format; see the EXTENDED DESCRIPTION
568 section. The default blocksize for this format for char‐
569 acter special archive files shall be 10240. Implementa‐
570 tions shall support all blocksize values less than or
571 equal to 32256 that are multiples of 512.
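
The block size rules for the three formats above can be sketched as follows. This Python fragment is an illustration with hypothetical names. It accepts only the values every implementation is required to support, although a particular implementation may accept others, and the defaults shown apply to character special archive files.

```python
DEFAULT_BLOCKSIZE = {"cpio": 5120, "pax": 5120, "ustar": 10240}
MAX_PORTABLE_BLOCKSIZE = 32256  # 63 * 512


def portable_blocksize(fmt, blocksize=None):
    """Return the block size to use for fmt, or raise ValueError.

    Without an explicit blocksize the format default is returned.
    """
    if blocksize is None:
        return DEFAULT_BLOCKSIZE[fmt]
    if blocksize <= 0:
        raise ValueError("blocksize must be a positive integer")
    if blocksize > MAX_PORTABLE_BLOCKSIZE or blocksize % 512 != 0:
        raise ValueError(f"blocksize {blocksize} is not guaranteed to be supported")
    return blocksize
```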
572
573 Implementation-defined formats shall specify a default block
574 size as well as any other block sizes supported for character
575 special archive files.
576
577 Any attempt to append to an archive file in a format different
578 from the existing archive format shall cause pax to exit immedi‐
579 ately with a non-zero exit status.
580
581 In copy mode, if no -x format is specified, pax shall behave as
582 if -x pax were specified.
583
584 -X When traversing the file hierarchy specified by a pathname, pax
585 shall not descend into directories that have a different device
ID (st_dev; see the System Interfaces volume of IEEE Std
587 1003.1-2001, stat()).
588
589 Specifying more than one of the mutually-exclusive options -H and -L
590 shall not be considered an error and the last option specified shall
591 determine the behavior of the utility.
592
593 The options that operate on the names of files or archive members (-c,
594 -i, -n, -s, -u, and -v) shall interact as follows. In read mode, the
595 archive members shall be selected based on the user-specified pattern
596 operands as modified by the -c, -n, and -u options. Then, any -s and -i
597 options shall modify, in that order, the names of the selected files.
598 The -v option shall write names resulting from these modifications.
599
600 In write mode, the files shall be selected based on the user-specified
601 pathnames as modified by the -n and -u options. Then, any -s and -i
602 options shall modify, in that order, the names of these selected files.
603 The -v option shall write names resulting from these modifications.
604
605 If both the -u and -n options are specified, pax shall not consider a
606 file selected unless it is newer than the file to which it is compared.
607
608
609 List Mode Format Specifications
610 The manual page for spax is not yet ready. The following text is a
611 quotation from the POSIX.1-2001 standard.
612
613 In list mode with the -o listopt=format option, the format argument
614 shall be applied for each selected file. The pax utility shall append a
615 <newline> to the listopt output for each selected file. The format
616 argument shall be used as the format string described in the Base Defi‐
617 nitions volume of IEEE Std 1003.1-2001, Chapter 5, File Format Nota‐
618 tion, with the exceptions 1. through 5. defined in the EXTENDED
619 DESCRIPTION section of printf(3), plus the following exceptions:
620
621 6. The sequence (keyword) can occur before a format conversion
622 specifier. The conversion argument is defined by the value of
623 keyword. The implementation shall support the following key‐
624 words:
625
626 · Any of the Field Name entries in ustar Header Block and
627 Octet-Oriented cpio Archive Entry. The implementation may
628 support the cpio keywords without the leading c_ in addi‐
629 tion to the form required by Values for cpio c_mode
630 Field.
631
632 · Any keyword defined for the extended header in pax
633 Extended Header.
634
635 · Any keyword provided as an implementation-defined exten‐
636 sion within the extended header defined in pax Extended
637 Header.
638
639 For example, the sequence "%(charset)s" is the string value of
640 the name of the character set in the extended header.
641
642 The result of the keyword conversion argument shall be the value
643 from the applicable header field or extended header, without any
644 trailing NULs.
645
646 All keyword values used as conversion arguments shall be trans‐
647 lated from the UTF-8 encoding to the character set appropriate
648 for the local file system, user database, and so on, as applica‐
649 ble.
650
651 7. An additional conversion specifier character, T, shall be used
652 to specify time formats. The T conversion specifier character
653 can be preceded by the sequence (keyword=subformat), where sub‐
654 format is a date format as defined by date operands. The default
655 keyword shall be mtime and the default subformat shall be:
656
657 %b %e %H:%M %Y
658
659 8. An additional conversion specifier character, M, shall be used
660 to specify the file mode string as defined in ls(1) Standard
661 Output. If (keyword) is omitted, the mode keyword shall be used.
662 For example, %.1M writes the single character corresponding to
663 the <entry type> field of the ls -l command.
664
665 9. An additional conversion specifier character, D, shall be used
666 to specify the device for block or special files, if applicable,
667 in an implementation-defined format. If not applicable, and
668 (keyword) is specified, then this conversion shall be equivalent
669 to %(keyword)u. If not applicable, and (keyword) is omitted,
670 then this conversion shall be equivalent to <space>.
671
672 10. An additional conversion specifier character, F, shall be used
673 to specify a pathname. The F conversion character can be pre‐
674 ceded by a sequence of comma-separated keywords:
675
676 (keyword[,keyword] ... )
677 The values for all the keywords that are non-null shall be con‐
678 catenated together, each separated by a '/'. The default shall
679 be (path) if the keyword path is defined; otherwise, the default
680 shall be (prefix, name).
681
682 11. An additional conversion specifier character, L, shall be used
to specify a symbolic link expansion. If the current file is a
684 symbolic link, then %L shall expand to:
685
686 "%s -> %s", <value of keyword>, <contents of link>
687
688 Otherwise, the %L conversion specification shall be the equivalent of
689 %F.
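
The %F and %L rules in items 10 and 11 can be sketched in Python as follows. This is only an illustration: the function names and the dictionary that stands in for an archive member are assumptions made for the example, not part of pax.

```python
def expand_F(member, keywords=None):
    """Sketch of %F: join the non-null keyword values with '/'."""
    if keywords is None:
        # Default is (path) when path is defined, otherwise (prefix,name).
        keywords = ["path"] if member.get("path") else ["prefix", "name"]
    return "/".join(member[k] for k in keywords if member.get(k))


def expand_L(member, keywords=None):
    """Sketch of %L: '<value> -> <link contents>' for symbolic links."""
    value = expand_F(member, keywords)
    if member.get("typeflag") == "2":  # typeflag 2 marks a symbolic link
        target = member.get("linkpath") or member.get("linkname", "")
        return "%s -> %s" % (value, target)
    return value  # any other file type behaves like %F
```

A member with prefix "usr/lib" and name "libc.so" would be written as "usr/lib/libc.so"; an empty prefix is skipped rather than producing a leading slash.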
690
691
692 OPERANDS
693 The following operands shall be supported:
694
695 directory
696 The destination directory pathname for copy mode.
697
698 file A pathname of a file to be copied or archived.
699
700 pattern
701 A pattern matching one or more pathnames of archive members. A
702 pattern must be given in the name-generating notation of the
703 pattern matching notation in Pattern Matching Notation, includ‐
704 ing the filename expansion rules in Patterns Used for Filename
705 Expansion. The default, if no pattern is specified, is to select
706 all members in the archive.
707
708
709 STDIN
710 In write mode, the standard input shall be used only if no file oper‐
711 ands are specified. It shall be a text file containing a list of path‐
712 names, one per line, without leading or trailing <blank>s.
713
714 In list and read modes, if -f is not specified, the standard input
715 shall be an archive file.
716
717 Otherwise, the standard input shall not be used.
718
719
720 INPUT FILES
721 The input file named by the archive option-argument, or standard input
722 when the archive is read from there, shall be a file formatted accord‐
723 ing to one of the specifications in the EXTENDED DESCRIPTION section or
724 some other implementation-defined format.
725
726 The file /dev/tty shall be used to write prompts and read responses.
727
728
729 ENVIRONMENT VARIABLES
730 The following environment variables shall affect the execution of pax:
731
732 LANG Provide a default value for the internationalization variables
733 that are unset or null. (See the Base Definitions volume of IEEE
734 Std 1003.1-2001, Section 8.2, Internationalization Variables for
735 the precedence of internationalization variables used to deter‐
736 mine the values of locale categories.)
737
738 LC_ALL If set to a non-empty string value, override the values of all
739 the other internationalization variables.
740
741 LC_COLLATE
742 Determine the locale for the behavior of ranges, equivalence
743 classes, and multi-character collating elements used in the pat‐
744 tern matching expressions for the pattern operand, the basic
745 regular expression for the -s option, and the extended regular
746 expression defined for the yesexpr locale keyword in the LC_MES‐
747 SAGES category.
748
749 LC_CTYPE
750 Determine the locale for the interpretation of sequences of
751 bytes of text data as characters (for example, single-byte as
752 opposed to multi-byte characters in arguments and input files),
753 the behavior of character classes used in the extended regular
754 expression defined for the yesexpr locale keyword in the LC_MES‐
755 SAGES category, and pattern matching.
756
757 LC_MESSAGES
758 Determine the locale for the processing of affirmative responses,
759 and the locale used to affect the format and contents of
760 diagnostic messages written to standard error.
761
762 LC_TIME
763 Determine the format and contents of date and time strings when
764 the -v option is specified.
765
766 NLSPATH
767 [XSI] Determine the location of message catalogs for the
768 processing of LC_MESSAGES.
769
770 TMPDIR Determine the pathname that provides part of the default global
771 extended header record file, as described for the -o globexthdr=
772 keyword in the OPTIONS section.
773
774 TZ Determine the timezone used to calculate date and time strings
775 when the -v option is specified. If TZ is unset or null, an
776 unspecified default timezone shall be used.
777
778
779 ASYNCHRONOUS EVENTS
780 Default.
781
782
783 STDOUT
784 In write mode, if -f is not specified, the standard output shall be the
785 archive formatted according to one of the specifications in the
786 EXTENDED DESCRIPTION section, or some other implementation-defined for‐
787 mat (see -x format).
788
789 In list mode, when the -o listopt= format has been specified, the
790 selected archive members shall be written to standard output using the
791 format described under List Mode Format Specifications. In list mode
792 without the -o listopt= format option, the table of contents of the
793 selected archive members shall be written to standard output using the
794 following format:
795
796 "%s\n", <pathname>
797
798 If the -v option is specified in list mode, the table of contents of
799 the selected archive members shall be written to standard output using
800 the following formats.
801
802 For pathnames representing hard links to previous members of the ar‐
803 chive:
804
805 "%s == %s\n", <ls -l listing>, <linkname>
806
807 For all other pathnames:
808
809 "%s\n", <ls -l listing>
810
811 where <ls -l listing> shall be the format specified by the ls(1) util‐
812 ity with the -l option. When writing pathnames in this format, it is
813 unspecified what is written for fields for which the underlying archive
814 format does not have the correct information, although the correct num‐
815 ber of <blank>-separated fields shall be written.
816
817 In list mode, standard output shall not be buffered more than a line at
818 a time.
819
820
821 STDERR
822 If -v is specified in read, write, or copy modes, pax shall write the
823 pathnames it processes to the standard error output using the following
824 format:
825
826 "%s\n", <pathname>
827
828 These pathnames shall be written as soon as processing is begun on the
829 file or archive member, and shall be flushed to standard error. The
830 trailing <newline>, which shall not be buffered, is written when the
831 file has been read or written.
832
833 If the -s option is specified, and the replacement string has a trail‐
834 ing 'p', substitutions shall be written to standard error in the fol‐
835 lowing format:
836
837 "%s >> %s\n", <original pathname>, <new pathname>
838
839 In all operating modes of pax, optional messages of unspecified format
840 concerning the input archive format and volume number, the number of
841 files, blocks, volumes, and media parts as well as other diagnostic
842 messages may be written to standard error.
843
844 In all formats, for both standard output and standard error, it is
845 unspecified how non-printable characters in pathnames or link names are
846 written.
847
848 When pax is in read mode or list mode, using the -x pax archive format,
849 and a filename, link name, owner name, or any other field in an
850 extended header record cannot be translated from the pax UTF-8 codeset
851 format to the codeset and current locale of the implementation, pax
852 shall write a diagnostic message to standard error, shall process the
853 file as described for the -o invalid= option, and then shall process
854 the next file in the archive.
855
856
857 OUTPUT FILES
858 In read mode, the extracted output files shall be of the archived file
859 type. In copy mode, the copied output files shall be the type of the
860 file being copied. In either mode, existing files in the destination
861 hierarchy shall be overwritten only when all permission (-p), modifica‐
862 tion time (-u), and invalid-value (-o invalid=) tests allow it.
863
864 In write mode, the output file named by the -f option-argument shall be
865 a file formatted according to one of the specifications in the EXTENDED
866 DESCRIPTION section, or some other implementation-defined format.
867
868
869 EXTENDED DESCRIPTION
870 pax Interchange Format
871 A pax archive tape or file produced in the -x pax format shall contain
872 a series of blocks. The physical layout of the archive shall be identi‐
873 cal to the ustar format described in ustar Interchange Format. Each
874 file archived shall be represented by the following sequence:
875
876 · An optional header block with extended header records.
877 This header block is of the form described in pax Header
878 Block, with a typeflag value of x or g. The extended
879 header records, described in pax Extended Header, shall
880 be included as the data for this header block.
881
882 · A header block that describes the file. Any fields in the
883 preceding optional extended header shall override the
884 associated fields in this header block for this file.
885
886 · Zero or more blocks that contain the contents of the
887 file.
888
889 At the end of the archive file there shall be two 512-byte blocks
890 filled with binary zeros, interpreted as an end-of-archive indicator.
891
892 A schematic of an example archive with global extended header records
893 and two actual files is shown in pax Format Archive Example. In the
894 example, the second file in the archive has no extended header preced‐
895 ing it, presumably because it has no need for extended attributes.
896
897 Figure: pax Format Archive Example
898
899 ┌──────────────────────────────┬─────────────────────────────────────────────┐
900 │ustar Header [typeflag = 'g'] │ │
901 ├──────────────────────────────┤ Global Extended header │
902 │Global Extended Header Data │ │
903 ├──────────────────────────────┼─────────────────────────────────────────────┤
904 │ustar Header [typeflag = 'x'] │ │
905 ├──────────────────────────────┤ │
906 │Extended Header Data │ │
907 ├──────────────────────────────┤ File 1: Extended Header data is included │
908 │ustar Header [typeflag = '0'] │ │
909 ├──────────────────────────────┤ │
910 │Data for File 1 │ │
911 ├──────────────────────────────┼─────────────────────────────────────────────┤
912 │ustar Header [typeflag = '0'] │ │
913 ├──────────────────────────────┤ File 2: No Extended Header data is included │
914 │Data for File 2 │ │
915 ├──────────────────────────────┼─────────────────────────────────────────────┤
916 │Block of binary Zeroes │ │
917 ├──────────────────────────────┤ End of Archive Indicator │
918 │Block of binary Zeroes │ │
919 └──────────────────────────────┴─────────────────────────────────────────────┘
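
The layout in the figure can be inspected with a short Python sketch. It uses the standard tarfile module only as a convenient way to produce a pax format archive; the helper names typeflag_sequence and build_example are assumptions made for this example.

```python
import io
import tarfile

BLOCK = 512


def typeflag_sequence(archive: bytes):
    """Return the typeflag of each header block, stopping at a zero block."""
    flags = []
    offset = 0
    while offset + BLOCK <= len(archive):
        header = archive[offset:offset + BLOCK]
        if header == bytes(BLOCK):  # start of the end-of-archive indicator
            break
        size = int(header[124:136].rstrip(b" \0") or b"0", 8)  # octal size
        flags.append(header[156:157].decode("ascii"))
        # Skip the header plus its data, rounded up to whole blocks.
        offset += BLOCK + (size + BLOCK - 1) // BLOCK * BLOCK
    return flags


def build_example() -> bytes:
    """Build an archive shaped like the figure (illustrative input only)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT,
                      pax_headers={"comment": "example"}) as tar:
        # The non-ASCII name forces an extended header for the first file.
        for name in ("f\u00efle1", "file2"):
            data = b"hello\n"
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


print(typeflag_sequence(build_example()))
```

The printed sequence should mirror the figure: a global header, an extended header with its file, and a second file with no extended header.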
920
921 pax Header Block
922 The pax header block shall be identical to the ustar header block
923 described in ustar Interchange Format, except that two additional type‐
924 flag values are defined:
925
926 x Represents extended header records for the following file in the
927 archive (which shall have its own ustar header block). The for‐
928 mat of these extended header records shall be as described in
929 pax Extended Header.
930
931 g Represents global extended header records for the following
932 files in the archive. The format of these extended header
933 records shall be as described in pax Extended Header. Each
934 value shall affect all subsequent files that do not override
935 that value in their own extended header record and until another
936 global extended header record is reached that provides another
937 value for the same field. The typeflag g global headers should
938 not be used with interchange media that could suffer partial
939 data loss in transporting the archive.
940
941 For both of these types, the size field shall be the size of the
942 extended header records in octets. The other fields in the header block
943 are not meaningful to this version of the pax utility. However, if
944 this archive is read by a pax utility conforming to the ISO
945 POSIX-2:1993 standard, the header block fields are used to create a
946 regular file that contains the extended header records as data. There‐
947 fore, header block field values should be selected to provide reason‐
948 able file access to this regular file.
949
950 A further difference from the ustar header block is that data blocks
951 for files of typeflag 1 (the digit one) (hard link) may be included,
952 which means that the size field may be greater than zero. Archives cre‐
953 ated by pax -o linkdata shall include these data blocks with the hard
954 links.
955
956
957 pax Extended Header
958 A pax extended header contains values that are inappropriate for the
959 ustar header block because of limitations in that format: fields
960 requiring a character encoding other than that described in the ISO/IEC
961 646:1991 standard, fields representing file attributes not described in
962 the ustar header, and fields whose format or length do not fit the
963 requirements of the ustar header. The values in an extended header add
964 attributes to the following file (or files; see the description of the
965 typeflag g header block) or override values in the following header
966 block(s), as indicated in the following list of keywords.
967
968 An extended header shall consist of one or more records, each con‐
969 structed as follows:
970
971 "%d %s=%s\n", <length>, <keyword>, <value>
972
973 The extended header records shall be encoded according to the ISO/IEC
974 10646-1:2000 standard (UTF-8). The <length> field, <blank>, equals
975 sign, and <newline> shown shall be limited to the portable character
976 set, as encoded in UTF-8. The <keyword> and <value> fields can be any
977 UTF-8 characters. The <length> field shall be the decimal length of the
978 extended header record in octets, including the trailing <newline>.
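
Because <length> counts its own digits, a writer cannot simply prepend the length of the rest of the record. The following Python sketch shows one way to compute it; the function name pax_record is an assumption made for the example.

```python
def pax_record(keyword: str, value: str) -> bytes:
    """Build one "%d %s=%s\\n" record whose length covers the whole record."""
    body = b" " + keyword.encode("utf-8") + b"=" + value.encode("utf-8") + b"\n"
    length = len(body)
    # The length field includes its own decimal digits, so repeat the
    # calculation until adding the digits no longer changes the total.
    while True:
        total = len(str(length)) + len(body)
        if total == length:
            break
        length = total
    return str(length).encode("ascii") + body
```

For instance, the keyword path with the value /etc/hosts needs 17 octets after the length field, so the record starts with 19, not 17.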
979
980 The <keyword> field shall be one of the entries from the following list
981 or a keyword provided as an implementation extension. Keywords con‐
982 sisting entirely of lowercase letters, digits, and periods are reserved
983 for future standardization. A keyword shall not include an equals sign.
984 (In the following list, the notation "file(s)" or "block(s)" is used
985 to acknowledge that a keyword affects the following single file after a
986 typeflag x extended header, but possibly multiple files after typeflag
987 g. Any requirements in the list for pax to include a record when in
988 write or copy mode shall apply only when such a record has not already
989 been provided through the use of the -o option. When used in copy mode,
990 pax shall behave as if an archive had been created with applicable
991 extended header records and then extracted.)
992
993 atime The file access time for the following file(s), equivalent to
994 the value of the st_atime member of the stat structure for a
995 file, as described by the stat(2) function. The access time
996 shall be restored if the process has the appropriate privilege
997 required to do so. The format of the <value> shall be as
998 described in pax Extended Header File Times.
999
1000 charset
1001 The name of the character set used to encode the data in the
1002 following file(s). The entries in the following table are
1003 defined to refer to known standards; additional names may be
1004 agreed on between the originator and recipient.
1005
1006 ┌────────────────────────┬───────────────────────────────┐
1007 │ <value> │ Formal Standard │
1008 ├────────────────────────┼───────────────────────────────┤
1009 │ISO-IR 646 1990 │ ISO/IEC 646:1990 │
1010 │ISO-IR 8859 1 1998 │ ISO/IEC 8859-1:1998 │
1011 │ISO-IR 8859 2 1999 │ ISO/IEC 8859-2:1999 │
1012 │ISO-IR 8859 3 1999 │ ISO/IEC 8859-3:1999 │
1013 │ISO-IR 8859 4 1998 │ ISO/IEC 8859-4:1998 │
1014 │ISO-IR 8859 5 1999 │ ISO/IEC 8859-5:1999 │
1015 │ISO-IR 8859 6 1999 │ ISO/IEC 8859-6:1999 │
1016 │ISO-IR 8859 7 1987 │ ISO/IEC 8859-7:1987 │
1017 │ISO-IR 8859 8 1999 │ ISO/IEC 8859-8:1999 │
1018 │ISO-IR 8859 9 1999 │ ISO/IEC 8859-9:1999 │
1019 │ISO-IR 8859 10 1998 │ ISO/IEC 8859-10:1998 │
1020 │ISO-IR 8859 13 1998 │ ISO/IEC 8859-13:1998 │
1021 │ISO-IR 8859 14 1998 │ ISO/IEC 8859-14:1998 │
1022 │ISO-IR 8859 15 1999 │ ISO/IEC 8859-15:1999 │
1023 │ISO-IR 10646 2000 │ ISO/IEC 10646:2000 │
1024 │ISO-IR 10646 2000 UTF-8 │ ISO/IEC 10646, UTF-8 encoding │
1025 │BINARY │ None │
1026 └────────────────────────┴───────────────────────────────┘
1027 The encoding is included in an extended header for information only;
1028 when pax is used as described in IEEE Std 1003.1-2001, it shall not
1029 translate the file data into any other encoding. The BINARY entry indi‐
1030 cates unencoded binary data.
1031
1032 When used in write or copy mode, it is implementation-defined whether
1033 pax includes a charset extended header record for a file.
1034
1035 comment
1036 A series of characters used as a comment. All characters in the
1037 <value> field shall be ignored by pax.
1038
1039 gid The group ID of the group that owns the file, expressed as a
1040 decimal number using digits from the ISO/IEC 646:1991 standard.
1041 This record shall override the gid field in the following header
1042 block(s). When used in write or copy mode, pax shall include a
1043 gid extended header record for each file whose group ID is
1044 greater than 2097151 (octal 7777777).
1045
1046 gname The group of the file(s), formatted as a group name in the group
1047 database. This record shall override the gid and gname fields in
1048 the following header block(s), and any gid extended header
1049 record. When used in read, copy, or list mode, pax shall trans‐
1050 late the name from the UTF-8 encoding in the header record to
1051 the character set appropriate for the group database on the
1052 receiving system. If any of the UTF-8 characters cannot be
1053 translated, and if the -o invalid=UTF-8 option is not specified,
1054 the results are implementation-defined. When used in write or
1055 copy mode, pax shall include a gname extended header record for
1056 each file whose group name cannot be represented entirely with
1057 the letters and digits of the portable character set.
1058
1059 linkpath
1060 The pathname of a link being created to another file, of any
1061 type, previously archived. This record shall override the
1062 linkname field in the following ustar header block(s). The fol‐
1063 lowing ustar header block shall determine the type of link cre‐
1064 ated. If typeflag of the following header block is 1, it shall
1065 be a hard link. If typeflag is 2, it shall be a symbolic link
1066 and the linkpath value shall be the contents of the symbolic
1067 link. The pax utility shall translate the name of the link (con‐
1068 tents of the symbolic link) from the UTF-8 encoding to the char‐
1069 acter set appropriate for the local file system. When used in
1070 write or copy mode, pax shall include a linkpath extended header
1071 record for each link whose pathname cannot be represented
1072 entirely with the members of the portable character set other
1073 than NUL.
1074
1075 mtime The file modification time of the following file(s), equivalent
1076 to the value of the st_mtime member of the stat structure for a
1077 file, as described in the stat(2) function. This record shall
1078 override the mtime field in the following header block(s). The
1079 modification time shall be restored if the process has the
1080 appropriate privilege required to do so. The format of the
1081 <value> shall be as described in pax Extended Header File Times.
1082
1083 path The pathname of the following file(s). This record shall over‐
1084 ride the name and prefix fields in the following header
1085 block(s). The pax utility shall translate the pathname of the
1086 file from the UTF-8 encoding to the character set appropriate
1087 for the local file system.
1088
1089 When used in write or copy mode, pax shall include a path
1090 extended header record for each file whose pathname cannot be
1091 represented entirely with the members of the portable character
1092 set other than NUL.
1093
1094 realtime.any
1095 The keywords prefixed by "realtime." are reserved for future
1096 standardization.
1097
1098 security.any
1099 The keywords prefixed by "security." are reserved for future
1100 standardization.
1101
1102 size The size of the file in octets, expressed as a decimal number
1103 using digits from the ISO/IEC 646:1991 standard. This record
1104 shall override the size field in the following header block(s).
1105 When used in write or copy mode, pax shall include a size
1106 extended header record for each file with a size value greater
1107 than 8589934591 (octal 77777777777).
1108
1109 uid The user ID of the file owner, expressed as a decimal number
1110 using digits from the ISO/IEC 646:1991 standard. This record
1111 shall override the uid field in the following header block(s).
1112 When used in write or copy mode, pax shall include a uid
1113 extended header record for each file whose owner ID is greater
1114 than 2097151 (octal 7777777).
1115
1116 uname The owner of the following file(s), formatted as a user name in
1117 the user database. This record shall override the uid and uname
1118 fields in the following header block(s), and any uid extended
1119 header record. When used in read, copy, or list mode, pax shall
1120 translate the name from the UTF-8 encoding in the header record
1121 to the character set appropriate for the user database on the
1122 receiving system. If any of the UTF-8 characters cannot be
1123 translated, and if the -o invalid=UTF-8 option is not specified,
1124 the results are implementation-defined. When used in write or
1125 copy mode, pax shall include a uname extended header record for
1126 each file whose user name cannot be represented entirely with
1127 the letters and digits of the portable character set.
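
The numeric and name conditions in the list above can be collected into one check. The Python sketch below covers only the uid, gid, size, uname, and gname rules; the function name is an assumption, and the "letters and digits of the portable character set" are approximated here by the ASCII letters and digits.

```python
import string

# Approximation of the portable letters and digits (an assumption).
LETTERS_AND_DIGITS = set(string.ascii_letters + string.digits)


def required_records(uid, gid, size, uname, gname):
    """Return the keywords for which a writer must emit an extended record."""
    needed = []
    if uid > 0o7777777:            # 2097151: limit of the uid header field
        needed.append("uid")
    if gid > 0o7777777:            # 2097151: limit of the gid header field
        needed.append("gid")
    if size > 0o77777777777:       # 8589934591: limit of the size header field
        needed.append("size")
    if not set(uname) <= LETTERS_AND_DIGITS:
        needed.append("uname")
    if not set(gname) <= LETTERS_AND_DIGITS:
        needed.append("gname")
    return needed
```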
1128
1129 If the <value> field is zero length, it shall delete any header block
1130 field, previously entered extended header value, or global extended
1131 header value of the same name.
1132
1133 If a keyword in an extended header record (or in a -o option-argument)
1134 overrides or deletes a corresponding field in the ustar header block,
1135 pax shall ignore the contents of that header block field.
1136
1137 Unlike the ustar header block fields, NULs shall not delimit <value>s;
1138 all characters within the <value> field shall be considered data for
1139 the field. None of the length limitations of the ustar header block
1140 fields in ustar Header Block shall apply to the extended header
1141 records.
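
A reader of extended header data might be sketched in Python as follows. The function name is an assumption, and the input is taken to be exactly the number of octets given by the size field of the x or g header block, with no block padding.

```python
def parse_pax_records(data: bytes) -> dict:
    """Parse extended header data into {keyword: value}; '' marks a deletion."""
    records = {}
    pos = 0
    while pos < len(data):
        space = data.index(b" ", pos)
        length = int(data[pos:space])       # decimal length of the whole record
        record = data[space + 1:pos + length]
        if not record.endswith(b"\n"):
            raise ValueError("record is not newline-terminated")
        keyword, _, value = record[:-1].partition(b"=")
        # NULs are ordinary data inside a value, so nothing is trimmed.
        # A later record replaces an earlier one with the same keyword.
        records[keyword.decode("utf-8")] = value.decode("utf-8")
        pos += length
    return records
```

A zero-length value is kept as an empty string so that the caller can treat it as a request to delete the attribute.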
1142
1143
1144 pax Extended Header Keyword Precedence
1145 This section describes the precedence in which the various header
1146 records and fields and command line options are selected to apply to a
1147 file in the archive. When pax is used in read or list modes, it shall
1148 determine a file attribute in the following sequence:
1149
1150 1. If -o delete=keyword-prefix is used, the affected
1151 attributes shall be determined from step 7., if applica‐
1152 ble, or ignored otherwise.
1153
1154 2. If -o keyword:= is used, the affected attributes shall be
1155 ignored.
1156
1157 3. If -o keyword:=value is used, the affected attribute
1158 shall be assigned the value.
1159
1160 4. If there is a typeflag x extended header record, the
1161 affected attribute shall be assigned the <value>. When
1162 extended header records conflict, the last one given in
1163 the header shall take precedence.
1164
1165 5. If -o keyword=value is used, the affected attribute shall
1166 be assigned the value.
1167
1168 6. If there is a typeflag g global extended header record,
1169 the affected attribute shall be assigned the <value>.
1170 When global extended header records conflict, the last
1171 one given in the global header shall take precedence.
1172
1173 7. Otherwise, the attribute shall be determined from the
1174 ustar header block.
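
The seven steps above can be read as a lookup chain. The Python sketch below is one interpretation under stated assumptions: the function and parameter names are invented for the example, each header is a dictionary, the -o delete= argument is matched as a plain prefix, and a zero-length value means the attribute is ignored.

```python
def resolve_attribute(keyword, ustar, global_hdr=None, extended_hdr=None,
                      opt_assign=None, opt_override=None, opt_delete=()):
    """Return the effective value of one attribute, or None if ignored."""
    global_hdr = global_hdr or {}        # typeflag g records
    extended_hdr = extended_hdr or {}    # typeflag x records
    opt_assign = opt_assign or {}        # -o keyword=value
    opt_override = opt_override or {}    # -o keyword:=value ('' is keyword:=)

    # Step 1: deleted keywords fall back to the ustar header block (step 7).
    if any(keyword.startswith(prefix) for prefix in opt_delete):
        return ustar.get(keyword)
    # Steps 2 and 3: -o keyword:= ignores, -o keyword:=value assigns.
    if keyword in opt_override:
        return opt_override[keyword] or None
    # Steps 4, 5 and 6, in that order of precedence.
    for source in (extended_hdr, opt_assign, global_hdr):
        if keyword in source:
            return source[keyword] or None
    # Step 7: the ustar header block.
    return ustar.get(keyword)
```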
1175
1176
1177 pax Extended Header File Times
1178 The pax utility shall write an mtime record for each file in write or
1179 copy modes if the file's modification time cannot be represented
1180 exactly in the ustar header logical record described in ustar Inter‐
1181 change Format. This can occur if the time is out of ustar range, or if
1182 the file system of the underlying implementation supports non-integer
1183 time granularities and the time is not an integer. All of these time
1184 records shall be formatted as a decimal representation of the time in
1185 seconds since the Epoch. If a period ('.') decimal point character is
1186 present, the digits to the right of the point shall represent the units
1187 of a subsecond timing granularity, where the first digit is tenths of a
1188 second and each subsequent digit is a tenth of the previous digit. In
1189 read or copy mode, the pax utility shall truncate the time of a file to
1190 the greatest representable value not greater than the input header file time.
1191 In write or copy mode, the pax utility shall output a time exactly if
1192 it can be represented exactly as a decimal number, and otherwise shall
1193 generate only enough digits so that the same time shall be recovered if
1194 the file is extracted on a system whose underlying implementation sup‐
1195 ports the same time granularity.
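
The two directions described above, writing a time and restoring it on a system with coarser granularity, might look like this in Python. The function names are assumptions, format_time assumes a non-negative number of seconds, and the microsecond default is only an example granularity.

```python
from decimal import Decimal, ROUND_FLOOR


def format_time(seconds: int, nanoseconds: int) -> str:
    """Write seconds since the Epoch with only as many digits as needed."""
    if nanoseconds == 0:
        return str(seconds)
    # Nine fractional digits give nanoseconds; trailing zeros add nothing.
    return ("%d.%09d" % (seconds, nanoseconds)).rstrip("0")


def restore_time(value: str, granularity: str = "0.000001") -> Decimal:
    """Floor a header time to the granularity the local system supports."""
    # ROUND_FLOOR yields the greatest value not greater than the input.
    return Decimal(value).quantize(Decimal(granularity), rounding=ROUND_FLOOR)
```

Using Decimal rather than float avoids losing digits when a header carries nanosecond precision.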
1196
1197
1198 ustar Interchange Format
1199 A ustar archive tape or file shall contain a series of logical records.
1200 Each logical record shall be a fixed-size logical record of 512 octets
1201 (see below). Although this format may be thought of as being stored on
1202 9-track industry-standard 12.7 mm (0.5 in) magnetic tape, other types
1203 of transportable media are not excluded. Each file archived shall be
1204 represented by a header logical record that describes the file, fol‐
1205 lowed by zero or more logical records that give the contents of the
1206 file. At the end of the archive file there shall be two 512-octet logi‐
1207 cal records filled with binary zeros, interpreted as an end-of-archive
1208 indicator.
1209
1210 The logical records may be grouped for physical I/O operations, as
1211 described under the -b blocksize and -x ustar options. Each group of
1212 logical records may be written with a single operation equivalent to
1213 the write(2) function. On magnetic tape, the result of this write shall
1214 be a single tape physical block. The last physical block shall always
1215 be the full size, so logical records after the two zero logical records
1216 may contain undefined data.
1217
1218 The header logical record shall be structured as shown in the following
1219 table. All lengths and offsets are in decimal.
1220
1221 Table: ustar Header Block
1222
1223 ┌───────────┬──────────────┬────────────────────┐
1224 │Field Name │ Octet Offset │ Length (in Octets) │
1225 ├───────────┼──────────────┼────────────────────┤
1226 │name │ 0 │ 100 │
1227 │mode │ 100 │ 8 │
1228 │uid │ 108 │ 8 │
1229 │gid │ 116 │ 8 │
1230 │size │ 124 │ 12 │
1231 │mtime │ 136 │ 12 │
1232 │chksum │ 148 │ 8 │
1233 │typeflag │ 156 │ 1 │
1234 │linkname │ 157 │ 100 │
1235 │magic │ 257 │ 6 │
1236 │version │ 263 │ 2 │
1237 │uname │ 265 │ 32 │
1238 │gname │ 297 │ 32 │
1239 │devmajor │ 329 │ 8 │
1240 │devminor │ 337 │ 8 │
1241 │prefix │ 345 │ 155 │
1242 └───────────┴──────────────┴────────────────────┘
1243 All characters in the header logical record shall be represented in the
1244 coded character set of the ISO/IEC 646:1991 standard. For maximum
1245 portability between implementations, names should be selected from
1246 characters represented by the portable filename character set as octets
1247 with the most significant bit zero. If an implementation supports the
1248 use of characters outside of slash and the portable filename character
1249 set in names for files, users, and groups, one or more implementation-
1250 defined encodings of these characters shall be provided for interchange
1251 purposes.
1252
1253 However, the pax utility shall never create filenames on the local sys‐
1254 tem that cannot be accessed via the procedures described in IEEE Std
1255 1003.1-2001. If a filename is found on the medium that would create an
1256 invalid filename, it is implementation-defined whether the data from
1257 the file is stored on the file hierarchy and under what name it is
1258 stored. The pax utility may choose to ignore these files as long as it
1259 produces an error indicating that the file is being ignored.
1260
1261 Each field within the header logical record is contiguous; that is,
1262 there is no padding used. Each character on the archive medium shall be
1263 stored contiguously.
1264
1265 The fields magic, uname, and gname are character strings, each
1266 terminated by a NUL character. The fields name, linkname, and prefix
1267 are NUL-terminated character strings, except that the terminating NUL
1268 is omitted when the string fills the entire field. The version field
1269 is two octets containing the characters "00" (zero-zero).
1270 The typeflag contains a single character. All other fields are leading
1271 zero-filled octal numbers using digits from the ISO/IEC 646:1991 stan‐
1272 dard IRV. Each numeric field is terminated by one or more <space> or
1273 NUL characters.
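
The table and the field encoding rules above translate into a small decoder. The Python sketch below is illustrative only; the names FIELDS, NUMERIC, and parse_header are assumptions made for the example.

```python
# (field name, octet offset, length) taken from the ustar Header Block table.
FIELDS = [
    ("name", 0, 100), ("mode", 100, 8), ("uid", 108, 8), ("gid", 116, 8),
    ("size", 124, 12), ("mtime", 136, 12), ("chksum", 148, 8),
    ("typeflag", 156, 1), ("linkname", 157, 100), ("magic", 257, 6),
    ("version", 263, 2), ("uname", 265, 32), ("gname", 297, 32),
    ("devmajor", 329, 8), ("devminor", 337, 8), ("prefix", 345, 155),
]
NUMERIC = {"mode", "uid", "gid", "size", "mtime", "chksum",
           "devmajor", "devminor"}


def parse_header(block: bytes) -> dict:
    """Decode one 512-octet header logical record into a dictionary."""
    if len(block) != 512:
        raise ValueError("a header logical record is 512 octets")
    out = {}
    for name, offset, length in FIELDS:
        raw = block[offset:offset + length]
        if name in NUMERIC:
            # Octal digits terminated by one or more <space> or NUL.
            digits = raw.rstrip(b" \0")
            out[name] = int(digits, 8) if digits else 0
        elif name in ("typeflag", "version"):
            out[name] = raw.decode("ascii")   # fixed width, no terminator
        else:
            # NUL-terminated unless the string fills the whole field.
            out[name] = raw.split(b"\0", 1)[0].decode("ascii")
    return out
```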
1274
1275 The name and the prefix fields shall produce the pathname of the file.
1276 A new pathname shall be formed, if prefix is not an empty string (its
1277 first character is not NUL), by concatenating prefix (up to the first
1278 NUL character), a slash character, and name; otherwise, name is used
1279 alone. In either case, name is terminated at the first NUL character.
1280 If prefix begins with a NUL character, it shall be ignored. In this
1281 manner, pathnames of at most 256 characters can be supported. If a
1282 pathname does not fit in the space provided, pax shall notify the user
1283 of the error, and shall not store any part of the file (header or
1284 data) on the medium.
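
The 256-character limit follows from 155 octets of prefix, one slash, and 100 octets of name. A Python sketch of both directions is given below; the function names are assumptions, and split_path simply takes the first slash that satisfies both field limits.

```python
def join_path(prefix: bytes, name: bytes) -> bytes:
    """Form the pathname from the prefix and name header fields."""
    name = name.split(b"\0", 1)[0]       # name ends at the first NUL
    prefix = prefix.split(b"\0", 1)[0]   # empty if it begins with NUL
    return prefix + b"/" + name if prefix else name


def split_path(path: bytes):
    """Split a pathname into (prefix, name) that fit the header fields."""
    if len(path) <= 100:
        return b"", path
    # The slash at index i is dropped; prefix is path[:i], name the rest.
    for i in range(1, len(path)):
        if path[i:i + 1] == b"/" and i <= 155 and 0 < len(path) - i - 1 <= 100:
            return path[:i], path[i + 1:]
    raise ValueError("pathname does not fit in the ustar header")
```

When split_path raises, a writer would report the error and store nothing for that file, as required above.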
1285
1286 The linkname field, described below, shall not use the prefix to pro‐
1287 duce a pathname. As such, a linkname is limited to 100 characters. If
1288 the name does not fit in the space provided, pax shall notify the user
1289 of the error, and shall not attempt to store the link on the medium.
1290
1291 The mode field provides 12 bits encoded in the ISO/IEC 646:1991 stan‐
1292 dard octal digit representation. The encoded bits shall represent the
1293 following values:
1294
1295 Table: ustar mode Field
1296
1297 ┌──────┬─────────────────┬─────────────────────────────────────────────────┐
1298 │ Bit │ IEEE Std │ Description │
1299 │Value │ 1003.1-2001 Bit │ │
1300 ├──────┼─────────────────┼─────────────────────────────────────────────────┤
1301 │04000 │ S_ISUID │ Set UID on execution. │
1302 │02000 │ S_ISGID │ Set GID on execution. │
1303 │01000 │ <reserved> │ Reserved for future standardization. │
1304 │00400 │ S_IRUSR │ Read permission for file owner class. │
1305 │00200 │ S_IWUSR │ Write permission for file owner class. │
1306 │00100 │ S_IXUSR │ Execute/search permission for file owner class. │
1307 │00040 │ S_IRGRP │ Read permission for file group class. │
1308 │00020 │ S_IWGRP │ Write permission for file group class. │
1309 │00010 │ S_IXGRP │ Execute/search permission for file group class. │
1310 │00004 │ S_IROTH │ Read permission for file other class. │
1311 │00002 │ S_IWOTH │ Write permission for file other class. │
1312 │00001 │ S_IXOTH │ Execute/search permission for file other class. │
1313 └──────┴─────────────────┴─────────────────────────────────────────────────┘
1314 When appropriate privilege is required to set one of these mode bits,
1315 and the user restoring the files from the archive does not have the
1316 appropriate privilege, the mode bits for which the user does not have
1317 appropriate privilege shall be ignored. Some of the mode bits in the
1318 archive format are not mentioned elsewhere in this volume of IEEE Std
1319 1003.1-2001. If the implementation does not support those bits, they
1320 may be ignored.
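
As an illustration, the mode field can be decoded against the table above with a few lines of Python. The names MODE_BITS and decode_mode are assumptions made for the example.

```python
# Bit values and names copied from the ustar mode Field table.
MODE_BITS = [
    (0o4000, "S_ISUID"), (0o2000, "S_ISGID"), (0o1000, "<reserved>"),
    (0o0400, "S_IRUSR"), (0o0200, "S_IWUSR"), (0o0100, "S_IXUSR"),
    (0o0040, "S_IRGRP"), (0o0020, "S_IWGRP"), (0o0010, "S_IXGRP"),
    (0o0004, "S_IROTH"), (0o0002, "S_IWOTH"), (0o0001, "S_IXOTH"),
]


def decode_mode(field: bytes) -> list:
    """Return the names of the bits set in an 8-octet octal mode field."""
    value = int(field.rstrip(b" \0") or b"0", 8)
    return [name for bit, name in MODE_BITS if value & bit]
```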
1321
1322 The uid and gid fields are the user and group ID of the owner and group
1323 of the file, respectively.
1324
1325 The size field is the size of the file in octets. If the typeflag field
1326 is set to specify a file to be of type 1 (a link) or 2 (a symbolic
1327 link), the size field shall be specified as zero. If the typeflag field
1328 is set to specify a file of type 5 (directory), the size field shall be
1329 interpreted as described under the definition of that record type. No
1330 data logical records are stored for types 1, 2, or 5. If the typeflag
1331 field is set to 3 (character special file), 4 (block special file), or
1332 6 (FIFO), the meaning of the size field is unspecified by this volume
1333 of IEEE Std 1003.1-2001, and no data logical records shall be stored on
1334 the medium. Additionally, for type 6, the size field shall be ignored
1335 when reading. If the typeflag field is set to any other value, the num‐
1336 ber of logical records written following the header shall be
1337 (size+511)/512, ignoring any fraction in the result of the division.
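
The record-count rule above can be sketched in a few lines. This is only an illustration of the arithmetic; the function name data_records is hypothetical and not part of pax.

```python
def data_records(size: int, typeflag: str) -> int:
    """Number of 512-octet logical records that follow a ustar header."""
    # Links, symbolic links, device files, directories and FIFOs
    # (types 1 to 6) store no data logical records on the medium.
    if typeflag in {"1", "2", "3", "4", "5", "6"}:
        return 0
    # Integer division discards the fraction, as the text requires.
    return (size + 511) // 512


print(data_records(0, "0"))     # 0
print(data_records(1, "0"))     # 1
print(data_records(512, "0"))   # 1
print(data_records(513, "0"))   # 2
print(data_records(4096, "5"))  # 0
```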
1338
1339 The mtime field shall be the modification time of the file at the time
1340 it was archived. It is the ISO/IEC 646:1991 standard representation of
1341 the octal value of the modification time obtained from the stat(2)
1342 function.
1343
1344 The chksum field shall be the ISO/IEC 646:1991 standard IRV representa‐
1345 tion of the octal value of the simple sum of all octets in the header
1346 logical record. Each octet in the header shall be treated as an
1347 unsigned value. These values shall be added to an unsigned integer,
1348 initialized to zero, the precision of which is not less than 17 bits.
1349 When calculating the checksum, the chksum field is treated as if it
1350 were all spaces.
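
A minimal sketch of the checksum rule follows. It assumes the usual ustar header layout, in which chksum occupies 8 octets starting at octet offset 148 of the 512-octet header; check these two constants against the header table earlier in this manual before relying on them. The function name is hypothetical.

```python
BLOCK = 512
CHKSUM_OFFSET = 148  # assumption: position of chksum in the ustar header
CHKSUM_LEN = 8       # assumption: length of the chksum field


def ustar_checksum(header: bytes) -> int:
    """Sum all header octets as unsigned values, with chksum as spaces."""
    if len(header) != BLOCK:
        raise ValueError("a ustar header is one 512-octet logical record")
    end = CHKSUM_OFFSET + CHKSUM_LEN
    blanked = header[:CHKSUM_OFFSET] + b" " * CHKSUM_LEN + header[end:]
    # Iterating over bytes yields integers 0..255, that is, unsigned octets.
    return sum(blanked)


# An all-zero header sums to eight spaces: 8 * 32.
print(ustar_checksum(bytes(BLOCK)))  # 256
# The largest possible sum fits in 17 bits, matching the stated precision.
print(BLOCK * 255 < 2**17)           # True
```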
1351
1352 The typeflag field specifies the type of file archived. If a particular
1353 implementation does not recognize the type, or the user does not have
1354 appropriate privilege to create that type, the file shall be extracted
1355 as if it were a regular file if the file type is defined to have a
1356 meaning for the size field that could cause data logical records to be
1357 written on the medium (see the previous description for size). If con‐
1358 version to a regular file occurs, the pax utility shall produce an
1359 error indicating that the conversion took place. All of the typeflag
1360 fields shall be coded in the ISO/IEC 646:1991 standard IRV:
1361
1362 0 Represents a regular file. For backwards-compatibility, a type‐
1363 flag value of binary zero ('\0') should be recognized as meaning
1364 a regular file when extracting files from the archive. Archives
1365              written with this version of the archive file format create
1366              regular files with a typeflag value of the ISO/IEC 646:1991
1367              standard IRV '0'.
1368
1369 1 Represents a file linked to another file, of any type, previ‐
1370 ously archived. Such files are identified by having the same
1371 device and file serial numbers, and pathnames that refer to dif‐
1372 ferent directory entries. All such files shall be archived as
1373 linked files. The linked-to name is specified in the linkname
1374 field with a NUL-character terminator if it is less than 100
1375 octets in length.
1376
1377 2 Represents a symbolic link. The contents of the symbolic link
1378 shall be stored in the linkname field.
1379
1380 3,4 Represent character special files and block special files
1381 respectively. In this case the devmajor and devminor fields
1382 shall contain information defining the device, the format of
1383 which is unspecified by this volume of IEEE Std 1003.1-2001.
1384 Implementations may map the device specifications to their own
1385 local specification or may ignore the entry.
1386
1387 5 Specifies a directory or subdirectory. On systems where disk
1388 allocation is performed on a directory basis, the size field
1389 shall contain the maximum number of octets (which may be rounded
1390 to the nearest disk block allocation unit) that the directory
1391 may hold. A size field of zero indicates no such limiting. Sys‐
1392 tems that do not support limiting in this manner should ignore
1393 the size field.
1394
1395 6 Specifies a FIFO special file. Note that the archiving of a FIFO
1396 file archives the existence of this file and not its contents.
1397
1398 7 Reserved to represent a file to which an implementation has
1399 associated some high-performance attribute. Implementations
1400 without such extensions should treat this file as a regular file
1401 (type 0).
1402
1403 A-Z The letters 'A' to 'Z', inclusive, are reserved for custom
1404 implementations. All other values are reserved for future ver‐
1405 sions of IEEE Std 1003.1-2001.
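
The typeflag values listed above can be summarized as a lookup table. This is a sketch for reading the list, not the way any particular pax implementation dispatches on the field; the names TYPEFLAGS and describe_typeflag are hypothetical.

```python
TYPEFLAGS = {
    "0": "regular file",
    "\0": "regular file",  # binary zero, accepted for backwards compatibility
    "1": "link to a previously archived file",
    "2": "symbolic link",
    "3": "character special file",
    "4": "block special file",
    "5": "directory",
    "6": "FIFO special file",
    "7": "reserved (high-performance attribute); treat as regular file",
}


def describe_typeflag(flag: str) -> str:
    """Return a description for a one-character typeflag value."""
    if flag in TYPEFLAGS:
        return TYPEFLAGS[flag]
    if len(flag) == 1 and "A" <= flag <= "Z":
        return "reserved for custom implementations"
    return "reserved for future versions of the standard"


print(describe_typeflag("0"))
print(describe_typeflag("\0"))
print(describe_typeflag("Q"))
print(describe_typeflag("8"))
```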
1406
1407 It is unspecified whether files with pathnames that refer to the same
1408 directory entry are archived as linked files or as separate files. If
1409 they are archived as linked files, this means that attempting to
1410 extract both pathnames from the resulting archive will always cause an
1411 error (unless the -u option is used) because the link cannot be cre‐
1412 ated.
1413
1414 It is unspecified whether files with the same device and file serial
1415 numbers being appended to an archive are treated as linked files to
1416 members that were in the archive before the append.
1417
1418 Attempts to archive a socket using ustar interchange format shall pro‐
1419 duce a diagnostic message. Handling of other file types is implementa‐
1420 tion-defined.
1421
1422 The magic field is the specification that this archive was output in
1423 this archive format. If this field contains ustar (the five characters
1424 from the ISO/IEC 646:1991 standard IRV shown followed by NUL), the
1425 uname and gname fields shall contain the ISO/IEC 646:1991 standard IRV
1426 representation of the owner and group of the file, respectively (trun‐
1427 cated to fit, if necessary). When the file is restored by a privi‐
1428 leged, protection-preserving version of the utility, the user and group
1429 databases shall be scanned for these names. If found, the user and
1430 group IDs contained within these files shall be used rather than the
1431 values contained within the uid and gid fields.
1432
1433
1434 cpio Interchange Format
1435 The octet-oriented cpio archive format shall be a series of entries,
1436 each comprising a header that describes the file, the name of the file,
1437 and then the contents of the file.
1438
1439 An archive may be recorded as a series of fixed-size blocks of octets.
1440 This blocking shall be used only to make physical I/O more efficient.
1441 The last group of blocks shall always be at the full size.
1442
1443 For the octet-oriented cpio archive format, the individual entry infor‐
1444 mation shall be in the order indicated and described by the following
1445 table; see also the <cpio.h> header.
1446
1447 Table: Octet-Oriented cpio Archive Entry
1448
1449 ┌─────────────────────┬────────────────────┬─────────────────┐
1450 │ Header Field Name │ Length (in Octets) │ Interpreted as │
1451 ├─────────────────────┼────────────────────┼─────────────────┤
1452 │c_magic │ 6 │ Octal number │
1453 │c_dev │ 6 │ Octal number │
1454 │c_ino │ 6 │ Octal number │
1455 │c_mode │ 6 │ Octal number │
1456 │c_uid │ 6 │ Octal number │
1457 │c_gid │ 6 │ Octal number │
1458 │c_nlink │ 6 │ Octal number │
1459 │c_rdev │ 6 │ Octal number │
1460 │c_mtime │ 11 │ Octal number │
1461 │c_namesize │ 6 │ Octal number │
1462 │c_filesize │ 11 │ Octal number │
1463 │ │ │ │
1464 │Filename Field Name │ Length │ Interpreted as │
1465 │c_name │ c_namesize │ Pathname string │
1466 │ │ │ │
1467 │File Data Field Name │ Length │ Interpreted as │
1468 │c_filedata │ c_filesize │ Data │
1469 └─────────────────────┴────────────────────┴─────────────────┘
1470 cpio Header
1471 For each file in the archive, a header as defined previously shall be
1472 written. The information in the header fields is written as streams of
1473 the ISO/IEC 646:1991 standard characters interpreted as octal numbers.
1474 The octal numbers shall be extended to the necessary length by append‐
1475 ing the ISO/IEC 646:1991 standard IRV zeros at the most-significant-
1476 digit end of the number; the result is written to the most-significant
1477 digit of the stream of octets first. The fields shall be interpreted as
1478 follows:
1479
1480 c_magic
1481 Identify the archive as being a transportable archive by con‐
1482 taining the identifying value "070707".
1483
1484 c_dev, c_ino
1485 Contains values that uniquely identify the file within the ar‐
1486 chive (that is, no files contain the same pair of c_dev and
1487 c_ino values unless they are links to the same file). The values
1488 shall be determined in an unspecified manner.
1489
1490 c_mode Contains the file type and access permissions as defined in the
1491 following table.
1492
1493 Table: Values for cpio c_mode Field
1494
1495 ┌──────────────────────┬─────────┬────────────────────────┐
1496 │File Permissions Name │ Value │ Indicates │
1497 ├──────────────────────┼─────────┼────────────────────────┤
1498 │C_IRUSR │ 000400 │ Read by owner │
1499 │C_IWUSR │ 000200 │ Write by owner │
1500 │C_IXUSR │ 000100 │ Execute by owner │
1501 │C_IRGRP │ 000040 │ Read by group │
1502 │C_IWGRP │ 000020 │ Write by group │
1503 │C_IXGRP │ 000010 │ Execute by group │
1504 │C_IROTH │ 000004 │ Read by others │
1505 │C_IWOTH │ 000002 │ Write by others │
1506 │C_IXOTH │ 000001 │ Execute by others │
1507 │C_ISUID │ 004000 │ Set uid │
1508 │C_ISGID │ 002000 │ Set gid │
1509 │C_ISVTX │ 001000 │ Reserved │
1510 ├──────────────────────┼─────────┼────────────────────────┤
1511 │File Type Name │ Value │ Indicates │
1512 ├──────────────────────┼─────────┼────────────────────────┤
1513 │C_ISDIR │ 0040000 │ Directory │
1514 │C_ISFIFO │ 0010000 │ FIFO │
1515 │C_ISREG │ 0100000 │ Regular file │
1516 │C_ISLNK │ 0120000 │ Symbolic link │
1517 │C_ISBLK │ 0060000 │ Block special file │
1518 │C_ISCHR │ 0020000 │ Character special file │
1519 │C_ISSOCK │ 0140000 │ Socket │
1520 │C_ISCTG │ 0110000 │ Reserved │
1521 └──────────────────────┴─────────┴────────────────────────┘
1522 Directories, FIFOs, symbolic links, and regular files shall be
1523 supported on a system conforming to this volume of IEEE Std
1524 1003.1-2001; additional values defined previously are reserved
1525 for compatibility with existing systems. Additional file types
1526 may be supported; however, such files should not be written to
1527 archives intended to be transported to other systems.
1528
1529 c_uid Contains the user ID of the owner.
1530
1531 c_gid Contains the group ID of the group.
1532
1533 c_nlink
1534 Contains a number greater than or equal to the number of links
1535 in the archive referencing the file. If the -a option is used to
1536 append to a cpio archive, then the pax utility need not account
1537 for the files in the existing part of the archive when calculat‐
1538 ing the c_nlink values for the appended part of the archive, and
1539 need not alter the c_nlink values in the existing part of the
1540 archive if additional files with the same c_dev and c_ino values
1541 are appended to the archive.
1542
1543 c_rdev Contains implementation-defined information for character or
1544 block special files.
1545
1546 c_mtime
1547 Contains the latest time of modification of the file at the time
1548 the archive was created.
1549
1550 c_namesize
1551 Contains the length of the pathname, including the terminating
1552 NUL character.
1553
1554 c_filesize
1555 Contains the length of the file in octets. This shall be the
1556 length of the data section following the header structure.
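
As an illustration of the header layout and the zero-padded octal encoding described above, the following sketch builds one octet-oriented cpio entry. The field names and widths come from the table; the function name pack_entry and its default values are hypothetical, and names are restricted to ASCII for simplicity.

```python
FIELDS = [
    ("c_magic", 6), ("c_dev", 6), ("c_ino", 6), ("c_mode", 6),
    ("c_uid", 6), ("c_gid", 6), ("c_nlink", 6), ("c_rdev", 6),
    ("c_mtime", 11), ("c_namesize", 6), ("c_filesize", 11),
]


def pack_entry(name: str, data: bytes, mode: int, mtime: int = 0,
               dev: int = 0, ino: int = 0, uid: int = 0, gid: int = 0,
               nlink: int = 1, rdev: int = 0) -> bytes:
    """Return header + NUL-terminated name + data for one cpio entry."""
    name_bytes = name.encode("ascii") + b"\0"
    values = {
        "c_magic": 0o070707, "c_dev": dev, "c_ino": ino, "c_mode": mode,
        "c_uid": uid, "c_gid": gid, "c_nlink": nlink, "c_rdev": rdev,
        "c_mtime": mtime,
        "c_namesize": len(name_bytes),  # includes the terminating NUL
        "c_filesize": len(data),
    }
    header = ""
    for field, width in FIELDS:
        value = values[field]
        if value < 0:
            raise ValueError(f"{field} must not be negative")
        # Pad with '0' at the most-significant end to the field width.
        text = f"{value:0{width}o}"
        if len(text) > width:
            raise ValueError(f"{field} does not fit in {width} octal digits")
        header += text
    return header.encode("ascii") + name_bytes + data


entry = pack_entry("hello.txt", b"hi\n", mode=0o100644)
print(entry[:76].decode("ascii"))
```

The header is always 76 octets long (eight 6-digit fields, two 11-digit fields and one more 6-digit field), so c_name begins at offset 76 of each entry.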
1557
1558
1559 cpio Filename
1560 The c_name field shall contain the pathname of the file. The length of
1561 this field in octets is the value of c_namesize.
1562
1563 If a filename is found on the medium that would create an invalid path‐
1564 name, it is implementation-defined whether the data from the file is
1565 stored on the file hierarchy and under what name it is stored.
1566
1567 All characters shall be represented in the ISO/IEC 646:1991 standard
1568 IRV. For maximum portability between implementations, names should be
1569 selected from characters represented by the portable filename character
1570 set as octets with the most significant bit zero. If an implementation
1571 supports the use of characters outside the portable filename character
1572 set in names for files, users, and groups, one or more implementation-
1573 defined encodings of these characters shall be provided for interchange
1574 purposes. However, the pax utility shall never create filenames on the
1575 local system that cannot be accessed via the procedures described pre‐
1576 viously in this volume of IEEE Std 1003.1-2001. If a filename is found
1577 on the medium that would create an invalid filename, it is implementa‐
1578 tion-defined whether the data from the file is stored on the local file
1579 system and under what name it is stored. The pax utility may choose to
1580 ignore these files as long as it produces an error indicating that the
1581 file is being ignored.
1582
1583
1584 cpio File Data
1585 Following c_name, there shall be c_filesize octets of data. Interpre‐
1586 tation of such data occurs in a manner dependent on the file. If
1587 c_filesize is zero, no data shall be contained in c_filedata.
1588
1589 When restoring from an archive:
1590
1591 · If the user does not have the appropriate privilege to create a
1592 file of the specified type, pax shall ignore the entry and write
1593 an error message to standard error.
1594
1595 · Only regular files have data to be restored. Presuming a regular
1596 file meets any selection criteria that might be imposed on the
1597 format-reading utility by the user, such data shall be restored.
1598
1599 · If a user does not have appropriate privilege to set a particu‐
1600 lar mode flag, the flag shall be ignored. Some of the mode flags
1601 in the archive format are not mentioned elsewhere in this volume
1602 of IEEE Std 1003.1-2001. If the implementation does not support
1603 those flags, they may be ignored.
1604
1605
1606 cpio Special Entries
1607 FIFO special files, directories, and the trailer shall be recorded with
1608 c_filesize equal to zero. For other special files, c_filesize is
1609 unspecified by this volume of IEEE Std 1003.1-2001. The header for the
1610 next file entry in the archive shall be written directly after the last
1611 octet of the file entry preceding it. A header denoting the filename
1612 TRAILER!!! shall indicate the end of the archive; the contents of
1613 octets in the last block of the archive following such a header are
1614 undefined.
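
A reader for this format can walk the entries back to back until it meets the TRAILER!!! name. The sketch below assumes an unblocked archive held in memory and ASCII names; the helper make_entry exists only to build test input, and both function names are hypothetical.

```python
HEADER_LEN = 76  # 8 * 6 + 11 + 6 + 11 octets, from the header table


def make_entry(name: str, data: bytes = b"", mode: int = 0o100644) -> bytes:
    """Build a minimal cpio entry (test helper, most fields zero)."""
    name_bytes = name.encode("ascii") + b"\0"
    header = ("070707" + "000000" * 2 + f"{mode:06o}" + "000000" * 2
              + "000001" + "000000" + f"{0:011o}"
              + f"{len(name_bytes):06o}" + f"{len(data):011o}")
    return header.encode("ascii") + name_bytes + data


def iter_entries(archive: bytes):
    """Yield (name, mode, data) for each entry before the trailer."""
    pos = 0
    while pos + HEADER_LEN <= len(archive):
        header = archive[pos:pos + HEADER_LEN].decode("ascii")
        if header[:6] != "070707":
            raise ValueError("bad c_magic")
        mode = int(header[18:24], 8)
        namesize = int(header[59:65], 8)
        filesize = int(header[65:76], 8)
        name_start = pos + HEADER_LEN
        # c_namesize counts the terminating NUL, which is dropped here.
        name = archive[name_start:name_start + namesize - 1].decode("ascii")
        data_start = name_start + namesize
        if name == "TRAILER!!!":
            return  # octets after the trailer are undefined
        yield name, mode, archive[data_start:data_start + filesize]
        # The next header follows the last octet of this entry directly.
        pos = data_start + filesize


archive = (make_entry("a.txt", b"abc")
           + make_entry("dir", mode=0o040755)
           + make_entry("TRAILER!!!", mode=0)
           + b"\0" * 20)
for name, mode, data in iter_entries(archive):
    print(name, oct(mode), data)
```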
1615
1616
1618 The following exit values shall be returned:
1619
1620 0 All files were processed successfully.
1621
1622 >0 An error occurred.
1623
1624
1626 If pax cannot create a file or a link when reading an archive or cannot
1627 find a file when writing an archive, or cannot preserve the user ID,
1628 group ID, or file mode when the -p option is specified, a diagnostic
1629 message shall be written to standard error and a non-zero exit status
1630 shall be returned, but processing shall continue. In the case where pax
1631 cannot create a link to a file, pax shall not, by default, create a
1632 second copy of the file.
1633
1634 If the extraction of a file from an archive is prematurely terminated
1635 by a signal or error, pax may have only partially extracted the file or
1636 (if the -n option was not specified) may have extracted a file of the
1637 same name as that specified by the user, but which is not the file the
1638 user wanted. Additionally, the file modes of extracted directories may
1639 have additional bits from the S_IRWXU mask set as well as incorrect
1640 modification and access times.
1641
1642
1643_________________________________________________________________
1645
1646
1648 Caution is advised when using the -a option to append to a cpio format
1649 archive. If any of the files being appended happen to be given the same
1650 c_dev and c_ino values as a file in the existing part of the archive,
1651 then they may be treated as links to that file on extraction. Thus, it
1652 is risky to use -a with cpio format except when it is done on the same
1653 system that the original archive was created on, and with the same pax
1654 utility, and in the knowledge that there has been little or no file
1655 system activity since the original archive was created that could lead
1656 to any of the files appended being given the same c_dev and c_ino val‐
1657 ues as an unrelated file in the existing part of the archive. Also,
1658 when (intentionally) appending additional links to a file in the exist‐
1659 ing part of the archive, the c_nlink values in the modified archive can
1660 be smaller than the number of links to the file in the archive, which
1661 may mean that the links are not preserved on extraction.
1662
1663 The -p (privileges) option was invented to reconcile differences
1664 between historical tar and cpio implementations. In particular, the two
1665 utilities use -m in diametrically opposed ways. The -p option also pro‐
1666 vides a consistent means of extending the ways in which future file
1667 attributes can be addressed, such as for enhanced security systems or
1668 high-performance files. Although it may seem complex, there are really
1669 two modes that are most commonly used:
1670
1671       -p e   "Preserve everything". This would be used by the historical
1672 superuser, someone with all the appropriate privileges, to pre‐
1673 serve all aspects of the files as they are recorded in the ar‐
1674 chive. The e flag is the sum of o and p, and other implementa‐
1675 tion-defined attributes.
1676
1677       -p p   "Preserve" the file mode bits. This would be used by the user
1678 with regular privileges who wished to preserve aspects of the
1679 file other than the ownership. The file times are preserved by
1680 default, but two other flags are offered to disable these and
1681 use the time of extraction.
1682
1683 The one pathname per line format of standard input precludes pathnames
1684 containing <newline>s. Although such pathnames violate the portable
1685 filename guidelines, they may exist and their presence may inhibit
1686 usage of pax within shell scripts. This problem is inherited from his‐
1687 torical archive programs. The problem can be avoided by listing file‐
1688 name arguments on the command line instead of on standard input.
1689
1690 It is almost certain that appropriate privileges are required for pax
1691 to accomplish parts of this volume of IEEE Std 1003.1-2001. Specifi‐
1692 cally, creating files of type block special or character special,
1693 restoring file access times unless the files are owned by the user (the
1694 -t option), or preserving file owner, group, and mode (the -p option)
1695 all probably require appropriate privileges.
1696
1697 In read mode, implementations are permitted to overwrite files when the
1698 archive has multiple members with the same name. This may fail if per‐
1699 missions on the first version of the file do not permit it to be over‐
1700 written.
1701
1702 The cpio and ustar formats can only support files up to 8589934592
1703 bytes (8 * 2^30) in size.
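
The figure above follows from the width of the size fields. The cpio c_filesize field holds 11 octal digits (see the header table); the sketch assumes the ustar size field likewise carries 11 octal digits. Note that the largest value 11 octal digits can actually encode is one less than the quoted limit.

```python
DIGITS = 11              # octal digits available for the file size
limit = 8 ** DIGITS      # number of distinct values, 0 .. limit - 1

print(limit)                   # 8589934592
print(limit == 8 * 2**30)      # True
print(format(limit - 1, "o"))  # 77777777777
```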
1704
1705
1707 The following command:
1708
1709 pax -w -f /dev/rmt/1m .
1710
1711 copies the contents of the current directory to tape drive 1, medium
1712       density (assuming historical System V device naming procedures; the
1713       historical BSD device name would be /dev/rmt9).
1714
1715 The following commands:
1716
1717              mkdir newdir
                  pax -rw olddir newdir
1718
1719 copy the olddir directory hierarchy to newdir.
1720
1721 pax -r -s ',^//*usr//*,,' -f a.pax
1722
1723 reads the archive a.pax, with all files rooted in /usr in the archive
1724 extracted relative to the current directory.
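
To see what the substitution in that example does to member names, the same expression can be tried outside pax. Python's re module is not the ed-style regular expression engine that pax uses; this particular pattern happens to read the same way in both, so the sketch is only a way to experiment with it.

```python
import re

PATTERN = r"^//*usr//*"  # one or more '/', then "usr", then one or more '/'


def strip_usr(path: str) -> str:
    """Apply the substitution once, as -s does without the g flag."""
    return re.sub(PATTERN, "", path, count=1)


print(strip_usr("/usr/foo/bar"))   # foo/bar
print(strip_usr("//usr//lib/x"))   # lib/x
print(strip_usr("etc/passwd"))     # etc/passwd
print(strip_usr("/usrlocal/x"))    # /usrlocal/x
```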
1725
1726 Using the option:
1727
1728 -o listopt="%M %(atime)T %(size)D %(name)s"
1729
1730 overrides the default output description in Standard Output and instead
1731 writes:
1732
1733 -rw-rw--- Jan 12 15:53 1492 /usr/foo/bar
1734
1735 Using the options:
1736
1737 -o listopt='%L\t%(size)D\n%.7' \
1738 -o listopt='(name)s\n%(atime)T\n%T'
1739
1740 overrides the default output description in Standard Output and instead
1741 writes:
1742
1743 /usr/foo/bar -> /tmp 1492
1744 /usr/fo
1745 Jan 12 1991
1746 Jan 31 15:53
1747
1748
1750 The pax utility was new for the ISO POSIX-2:1993 standard. It repre‐
1751 sents a peaceful compromise between advocates of the historical tar and
1752 cpio utilities.
1753
1754 A fundamental difference between cpio and tar was in the way directo‐
1755 ries were treated. The cpio utility did not treat directories differ‐
1756 ently from other files, and to select a directory and its contents
1757 required that each file in the hierarchy be explicitly specified. For
1758 tar, a directory matched every file in the file hierarchy it rooted.
1759
1760 The pax utility offers both interfaces; by default, directories map
1761 into the file hierarchy they root. The -d option causes pax to skip any
1762       file not explicitly referenced, as cpio historically did. The
1763       tar-style behavior was chosen as the default because it was believed
1764       that this was the more common usage and because tar is the more
1765       commonly available interface, as it was historically provided on both
1766       System V and BSD implementations.
1767
1768 The data interchange format specification in this volume of IEEE Std
1769 1003.1-2001 requires that processes with "appropriate privileges" shall
1770 always restore the ownership and permissions of extracted files exactly
1771 as archived. If viewed from the historic equivalence between superuser
1772 and "appropriate privileges", there are two problems with this require‐
1773 ment. First, users running as superusers may unknowingly set dangerous
1774 permissions on extracted files. Second, it is needlessly limiting, in
1775 that superusers cannot extract files and own them as superuser unless
1776 the archive was created by the superuser. (It should be noted that
1777 restoration of ownerships and permissions for the superuser, by
1778 default, is historical practice in cpio, but not in tar.) In order to
1779 avoid these two problems, the pax specification has an additional
1780 "privilege" mechanism, the -p option. Only a pax invocation with the
1781 privileges needed, and which has the -p option set using the e specifi‐
1782 cation character, has the "appropriate privilege" to restore full own‐
1783 ership and permission information.
1784
1785 Note also that this volume of IEEE Std 1003.1-2001 requires that the
1786 file ownership and access permissions shall be set, on extraction, in
1787 the same fashion as the creat(2) function when provided with the mode
1788 stored in the archive. This means that the file creation mask of the
1789 user is applied to the file permissions.
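
The effect of the file creation mask on an archived mode can be worked through as follows. This is a sketch of the arithmetic only, limited to the nine permission bits; the function name is hypothetical.

```python
def effective_mode(archived_mode: int, umask: int) -> int:
    """Permission bits left after the file creation mask is applied."""
    return archived_mode & ~umask & 0o777


print(oct(effective_mode(0o666, 0o022)))  # 0o644
print(oct(effective_mode(0o777, 0o077)))  # 0o700
print(oct(effective_mode(0o640, 0o000)))  # 0o640
```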
1790
1791 Users should note that directories may be created by pax while extract‐
1792 ing files with permissions that are different from those that existed
1793 at the time the archive was created. When extracting sensitive informa‐
1794 tion into a directory hierarchy that no longer exists, users are
1795 encouraged to set their file creation mask appropriately to protect
1796 these files during extraction.
1797
1798 The table of contents output is written to standard output to facili‐
1799 tate pipeline processing.
1800
1801 An early proposal had hard links displaying for all pathnames. This
1802 was removed because it complicates the output of the case where -v is
1803 not specified and does not match historical cpio usage. The hard-link
1804 information is available in the -v display.
1805
1806 The description of the -l option allows implementations to make hard
1807 links to symbolic links. IEEE Std 1003.1-2001 does not specify any way
1808 to create a hard link to a symbolic link, but many implementations pro‐
1809 vide this capability as an extension. If there are hard links to sym‐
1810 bolic links when an archive is created, the implementation is required
1811 to archive the hard link in the archive (unless -H or -L is specified).
1812 When in read mode and in copy mode, implementations supporting hard
1813 links to symbolic links should use them when appropriate.
1814
1815 The archive formats inherited from the POSIX.1-1990 standard have cer‐
1816 tain restrictions that have been brought along from historical usage.
1817 For example, there are restrictions on the length of pathnames stored
1818 in the archive. When pax is used in copy (-rw) mode (copying directory
1819 hierarchies), the ability to use extensions from the -x pax format
1820 overcomes these restrictions.
1821
1822 The default blocksize value of 5120 bytes for cpio was selected because
1823 it is one of the standard block-size values for cpio, set when the -B
1824 option is specified. (The other default block-size value for cpio is
1825 512 bytes, and this was considered to be too small.) The default block
1826 value of 10240 bytes for tar was selected because that is the standard
1827 block-size value for BSD tar. The maximum block size of 32256 bytes
1828 (2^15-512 bytes) is the largest multiple of 512 bytes that fits into a
1829 signed 16-bit tape controller transfer register. There are known limi‐
1830 tations in some historical systems that would prevent larger blocks
1831 from being accepted. Historical values were chosen to improve compati‐
1832 bility with historical scripts using dd(1) or similar utilities to
1833 manipulate archives. Also, default block sizes for any file type other
1834       than character special file have been deleted from this volume of IEEE
1835 Std 1003.1-2001 as unimportant and not likely to affect the structure
1836 of the resulting archive.
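
The block-size figures in this paragraph can be checked with a little arithmetic. The variable names below are chosen for illustration only.

```python
RECORD = 512
SIGNED_16_MAX = 2**15 - 1  # largest value of a signed 16-bit register

# Largest multiple of 512 that does not exceed the register maximum.
max_block = SIGNED_16_MAX // RECORD * RECORD

print(max_block)                    # 32256
print(max_block == 2**15 - RECORD)  # True
print(5120 // RECORD)               # 10 records per default cpio block
print(10240 // RECORD)              # 20 records per default tar block
```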
1837
1838 Implementations are permitted to modify the block-size value based on
1839 the archive format or the device to which the archive is being written.
1840 This is to provide implementations with the opportunity to take advan‐
1841 tage of special types of devices, and it should not be used without a
1842 great deal of consideration as it almost certainly decreases archive
1843 portability.
1844
1845 The intended use of the -n option was to permit extraction of one or
1846 more files from the archive without processing the entire archive. This
1847 was viewed by the standard developers as offering significant perfor‐
1848 mance advantages over historical implementations. The -n option in
1849 early proposals had three effects; the first was to cause special char‐
1850 acters in patterns to not be treated specially. The second was to cause
1851 only the first file that matched a pattern to be extracted. The third
1852 was to cause pax to write a diagnostic message to standard error when
1853 no file was found matching a specified pattern. Only the second behav‐
1854 ior is retained by this volume of IEEE Std 1003.1-2001, for many rea‐
1855 sons. First, it is in general not acceptable for a single option to
1856 have multiple effects. Second, the ability to make pattern matching
1857 characters act as normal characters is useful for parts of pax other
1858 than file extraction. Third, a finer degree of control over the special
1859 characters is useful because users may wish to normalize only a single
1860 special character in a single filename. Fourth, given a more general
1861 escape mechanism, the previous behavior of the -n option can be easily
1862 obtained using the -s option or a sed script. Finally, writing a diag‐
1863 nostic message when a pattern specified by the user is unmatched by any
1864 file is useful behavior in all cases.
1865
1866 In this version, the -n was removed from the copy mode synopsis of pax;
1867 it is inapplicable because there are no pattern operands specified in
1868 this mode.
1869
1870       Besides pax, IEEE Std 1003.1-2001 describes another method for copying
1871       subtrees as part of the cp(1) utility. Both methods are
1872 historical practice: cp(1) provides a simpler, more intuitive inter‐
1873 face, while pax offers a finer granularity of control. Each provides
1874 additional functionality to the other; in particular, pax maintains the
1875 hard-link structure of the hierarchy while cp(1) does not. It is the
1876 intention of the standard developers that the results be similar (using
1877 appropriate option combinations in both utilities). The results are not
1878 required to be identical; there seemed insufficient gain to applica‐
1879 tions to balance the difficulty of implementations having to guarantee
1880 that the results would be exactly identical.
1881
1882 A single archive may span more than one file. It is suggested that
1883 implementations provide informative messages to the user on standard
1884 error whenever the archive file is changed.
1885
1886 The -d option (do not create intermediate directories not listed in the
1887 archive) found in early proposals was originally provided as a comple‐
1888 ment to the historic -d option of cpio. It has been deleted.
1889
1890 The -s option in early proposals specified a subset of the substitution
1891 command from the ed utility. As there was no reason for only a subset
1892 to be supported, the -s option is now compatible with the current ed
1893 specification. Since the delimiter can be any non-null character, the
1894 following usage with single spaces is valid:
1895
1896 pax -s " foo bar " ...
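
The role of the delimiter can be illustrated by splitting a replstr on its first character. The sketch assumes the general form of a delimiter, the old pattern, the delimiter, the new text, the delimiter, then optional flags, and it deliberately ignores backslash-escaped delimiters, which a real implementation has to handle. The function name is hypothetical.

```python
def split_replstr(replstr: str):
    """Split a -s argument into (old, new, flags) on its first character."""
    if not replstr:
        raise ValueError("empty replstr")
    delim = replstr[0]
    parts = replstr[1:].split(delim)
    if len(parts) != 3:
        raise ValueError("expected <d>old<d>new<d>[flags]")
    old, new, flags = parts
    return old, new, flags


print(split_replstr(" foo bar "))       # ('foo', 'bar', '')
print(split_replstr(",^//*usr//*,,"))   # ('^//*usr//*', '', '')
print(split_replstr("/a/b/g"))          # ('a', 'b', 'g')
```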
1897
1898 The -t description is worded so as to note that this may cause the
1899 access time update caused by some other activity (which occurs while
1900 the file is being read) to be overwritten.
1901
1902 The default behavior of pax with regard to file modification times is
1903 the same as historical implementations of tar. It is not the histori‐
1904 cal behavior of cpio.
1905
1906 Because the -i option uses /dev/tty, utilities without a controlling
1907 terminal are not able to use this option.
1908
1909 The -y option, found in early proposals, has been deleted because a
1910 line containing a single period for the -i option has equivalent func‐
1911 tionality. The special lines for the -i option (a single period and the
1912 empty line) are historical practice in cpio.
1913
1914 In early drafts, a -e charmap option was included to increase portabil‐
1915 ity of files between systems using different coded character sets. This
1916 option was omitted because it was apparent that consensus could not be
1917 formed for it. In this version, the use of UTF-8 should be an adequate
1918 substitute.
1919
1920 The -k option was added to address international concerns about the
1921 dangers involved in the character set transformations of -e (if the
1922 target character set were different from the source, the filenames
1923 might be transformed into names matching existing files) and also was
1924 made more general to protect files transferred between file systems
1925 with different {NAME_MAX} values (truncating a filename on a smaller
1926 system might also inadvertently overwrite existing files). As stated,
1927 it prevents any overwriting, even if the target file is older than the
1928 source. This version adds more granularity of options to solve this
1929       problem by introducing the -o invalid=option, specifically the UTF-8
1930 action. (Note that an existing file that is named with a UTF-8 encoding
1931 is still subject to overwriting in this case. The -k option closes that
1932 loophole.)
1933
1934 Some of the file characteristics referenced in this volume of IEEE Std
1935 1003.1-2001 might not be supported by some archive formats. For exam‐
1936 ple, neither the tar nor cpio formats contain the file access time. For
1937 this reason, the e specification character has been provided, intended
1938 to cause all file characteristics specified in the archive to be
1939 retained.
1940
1941 It is required that extracted directories, by default, have their
1942 access and modification times and permissions set to the values speci‐
1943 fied in the archive. This has obvious problems in that the directories
1944 are almost certainly modified after being extracted and that directory
1945 permissions may not permit file creation. One possible solution is to
1946 create directories with the mode specified in the archive, as modified
1947 by the umask of the user, with sufficient permissions to allow file
1948 creation. After all files have been extracted, pax would then reset the
1949 access and modification times and permissions as necessary.
1950
1951 The list-mode formatting description borrows heavily from the one
1952 defined by the printf(1) utility. However, since there is no separate
1953 operand list to get conversion arguments, the format was extended to
1954 allow specifying the name of the conversion argument as part of the
1955 conversion specification.
1956
1957 The T conversion specifier allows time fields to be displayed in any of
1958 the date formats. Unlike the ls(1) utility, pax does not adjust the
1959 format when the date is less than six months in the past. This makes
1960 parsing the output more predictable.
1961
1962 The D conversion specifier handles the ability to display the
1963 major/minor or file size, as with ls(1), by using %-8(size)D.
1964
1965 The L conversion specifier handles the ls display for symbolic links.
1966
1967 Conversion specifiers were added to generate existing known types used
1968 for ls(1).
1969
1970
1971 pax Interchange Format
1972 The new POSIX data interchange format was developed primarily to sat‐
1973 isfy international concerns that the ustar and cpio formats did not
1974 provide for file, user, and group names encoded in characters outside a
1975 subset of the ISO/IEC 646:1991 standard. The standard developers real‐
1976 ized that this new POSIX data interchange format should be very exten‐
1977 sible because there were other requirements they foresaw in the near
1978 future:
1979
1980 · Support international character encodings and locale information
1981
1982 · Support security information (ACLs, and so on)
1983
1984 · Support future file types, such as realtime or contiguous files
1985
1986 · Include data areas for implementation use
1987
1988 · Support systems with words larger than 32 bits and timers with
1989 subsecond granularity
1990
1991 The following were not goals for this format because these are better
1992 handled by separate utilities or are inappropriate for a portable for‐
1993 mat:
1994
1995 · Encryption
1996
1997 · Compression
1998
1999 · Data translation between locales and codesets
2000
2001 · inode storage
2002
2003 The format chosen to support the goals is an extension of the ustar
2004 format. Of the two formats previously available, only the ustar format
2005 was selected for extensions because:
2006
2007 · It was easier to extend in an upwards-compatible way. It offered
2008 version flags and header block type fields with room for future
2009 standardization. The cpio format, while possessing a more flexi‐
2010 ble file naming methodology, could not be extended without
2011 breaking some theoretical implementation or using a dummy file‐
2012 name that could be a legitimate filename.
2013
2014 · Industry experience since the original "tar wars" fought in
2015 developing the ISO POSIX-1 standard has clearly been in favor of
2016 the ustar format, which is generally the default output format
2017 selected for pax implementations on new systems.
2018
2019 The new format was designed with one additional goal in mind: reason‐
2020 able behavior when an older tar or pax utility happened to read an ar‐
2021 chive. Since the POSIX.1-1990 standard mandated that a "format-reading
2022 utility" had to treat unrecognized typeflag values as regular files,
2023 this allowed the format to include all the extended information in a
2024 pseudo-regular file that preceded each real file. An option is given
2025 that allows the archive creator to set up reasonable names for these
2026 files on the older systems. Also, the normative text suggests that
2027 reasonable file access values be used for this ustar header block. Mak‐
2028 ing these header files inaccessible for convenient reading and deleting
2029 would not be reasonable. File permissions of 600 or 700 are suggested.
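
The layout described above, an extended header stored as a pseudo-file ahead of the real file, can be observed with a short Python sketch. It uses the standard tarfile module to write a pax-format archive in memory and then walks the 512-byte header blocks by hand. The function name header_typeflags is hypothetical; the offsets used (name at 0, size at 124, typeflag at 156) are those of the ustar header block.

```python
import io
import tarfile

def header_typeflags(archive_bytes):
    """Return (typeflag, name) for every header block in a tar stream."""
    found = []
    pos = 0
    while pos + 512 <= len(archive_bytes):
        block = archive_bytes[pos:pos + 512]
        if block == bytes(512):            # an all-zero block ends the archive
            break
        name = block[0:100].rstrip(b"\0").decode("utf-8", "replace")
        size_field = block[124:136].rstrip(b"\0 ").decode("ascii") or "0"
        size = int(size_field, 8)          # the size field is octal text
        found.append((block[156:157].decode("ascii"), name))
        # Skip the header plus the data, rounded up to whole blocks.
        pos += 512 + ((size + 511) // 512) * 512
    return found

buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tar:
    info = tarfile.TarInfo("d\u00e9mo.txt")   # non-ASCII name forces an extended header
    data = b"hello\n"
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))

for typeflag, name in header_typeflags(buf.getvalue()):
    print(typeflag, name)
```

The walker itself never interprets the typeflag: it steps over an x entry exactly as it would over a regular file, which is the behavior an older format-reading utility is expected to show.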
2030
2031 The ustar typeflag field was used to accommodate the additional func‐
2032 tionality of the new format rather than magic or version because the
2033 POSIX.1-1990 standard (and, by reference, the previous version of pax),
2034 mandated the behavior of the format-reading utility when it encountered
2035 an unknown typeflag, but was silent about the other two fields.
2036
2037 Early proposals of the first revision to IEEE Std 1003.1-2001 contained
2038 a proposed archive format that was based on compatibility with the
2039 standard for tape files (ISO 1001, similar to the format used histori‐
2040 cally on many mainframes and minicomputers). This format was overly
2041 complex and required considerable overhead in volume and header
2042 records. Furthermore, the standard developers felt that it would not be
2043 acceptable to the community of POSIX developers, so it was later
2044 changed to be a format more closely related to historical practice on
2045 POSIX systems.
2046
2047 The prefix and name split of pathnames in ustar was replaced by the
2048 single path extended header record for simplicity.
2049
2050 The concept of a global extended header (typeflag g) was controversial.
2051 If this were applied to an archive being recorded on magnetic tape, a
2052 few unreadable blocks at the beginning of the tape could be a serious
2053 problem; a utility attempting to extract as many files as possible from
2054 a damaged archive could lose a large percentage of file header informa‐
2055 tion in this case. However, if the archive were on a reliable medium,
2056 such as a CD-ROM, the global extended header offers considerable poten‐
2057 tial size reductions by eliminating redundant information. Thus, the
2058 text warns against using the global method for unreliable media and
2059 provides a method for implanting global information in the extended
2060 header for each file, rather than in the typeflag g records.
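
The trade-off between the two placements can be illustrated with Python's tarfile module, which accepts a pax_headers dictionary both for the archive as a whole (written as a typeflag g record) and for each member. This is only a sketch of the size argument; the member names and the comment value are made up for the example.

```python
import io
import tarfile

MEMBERS = ["a.txt", "b.txt", "c.txt"]
SHARED = {"comment": "nightly build"}      # illustrative keyword/value pair

def build(global_header):
    """Write an in-memory pax archive with SHARED stored globally or per file."""
    buf = io.BytesIO()
    kwargs = {"pax_headers": SHARED} if global_header else {}
    with tarfile.open(fileobj=buf, mode="w",
                      format=tarfile.PAX_FORMAT, **kwargs) as tar:
        for name in MEMBERS:
            info = tarfile.TarInfo(name)
            if not global_header:
                info.pax_headers = dict(SHARED)   # repeat the record for each file
            tar.addfile(info)
    return buf.getvalue()

once = build(global_header=True).count(b"comment=nightly build")
repeated = build(global_header=False).count(b"comment=nightly build")
print(once, repeated)
```

With the global header the record is stored once, so losing that one block loses it for every member; the per-file variant repeats it and pays for the redundancy with one extra header and data block per member.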
2061
2062 No facility for data translation or filtering on a per-file basis is
2063 included because the standard developers could not invent an interface
2064 that would allow this in an efficient manner. If a filter, such as
2065 encryption or compression, is to be applied to all the files, it is
2066 more efficient to apply the filter to the entire archive as a single
2067 file. The standard developers considered interfaces that would invoke a
2068 shell script for each file going into or out of the archive, but the
2069 system overhead in this approach was considered to be too high.
2070
2071 One such approach would be to have filter= records that give a pathname
2072 for an executable. When the program is invoked, the file and archive
2073 would be open for standard input/output and all the header fields would
2074 be available as environment variables or command-line arguments. The
2075 standard developers did discuss such schemes, but they were omitted
2076 from IEEE Std 1003.1-2001 due to concerns about excessive overhead.
2077 Also, the program itself would need to be in the archive if it were to
2078 be used portably.
2079
2080 There is currently no portable means of identifying the character
2081 set(s) used for a file in the file system. Therefore, pax has not been
2082 given a mechanism to generate charset records automatically. The only
2083 portable means of doing this is for the user to write the archive using
2084 the -o charset=string command line option. This assumes that all of the
2085 files in the archive use the same encoding. The "implementation-
2086 defined" text is included to allow for a system that can identify the
2087 encodings used for each of its files.
2088
2089 The table of standards that accompanies the charset record description
2090 is acknowledged to be very limited. Only a limited number of character
2091 set standards is reasonable for maximal interchange. Any character set
2092 is, of course, possible by prior agreement. It was suggested that
2093 EBCDIC be listed, but it was omitted because it is not defined by a
2094 formal standard. Formal standards, and then only those with reasonably
2095 large followings, can be included here, simply as a matter of practi‐
2096 cality. The <value>s represent names of officially registered character
2097 sets in the format required by the ISO 2375:1985 standard.
2098
2099 The normal comma or <blank>-separated list rules are not followed in
2100 the case of keyword options to allow ease of argument parsing for
2101 getopts.
2102
2103 Further information on character encodings is in pax Archive Character
2104 Set Encoding/Decoding.
2105
2106 The standard developers have reserved keyword name space for vendor
2107 extensions. It is suggested that the format to be used is:
2108
2109 VENDOR.keyword
2110
2111 where VENDOR is the name of the vendor or organization in all uppercase
2112 letters. It is further suggested that the keyword following the period
2113 be named differently than any of the standard keywords so that it could
2114 be used for future standardization, if appropriate, by omitting the
2115 VENDOR prefix.
2116
2117 The <length> field in the extended header record was included to make
2118 it simpler to step through the records, even if a record contains an
2119 unknown format (to a particular pax) with complex interactions of spe‐
2120 cial characters. It also provides a minor integrity checkpoint within
2121 the records to aid a program attempting to recover files from a damaged
2122 archive.
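
A short Python sketch shows why the <length> field makes stepping through the records simple. It assumes the record layout given in the extended header description, "<length> <keyword>=<value>" followed by a newline, where the decimal length counts the whole record including its own digits. The function names are hypothetical.

```python
def make_record(keyword, value):
    """Build one extended header record as bytes."""
    body = (b" " + keyword.encode("utf-8") + b"=" +
            value.encode("utf-8") + b"\n")
    length = len(body) + 1
    # The length counts its own digits, so adjust until it is self-consistent.
    while len(str(length)) + len(body) != length:
        length = len(str(length)) + len(body)
    return str(length).encode("ascii") + body

def iter_records(data):
    """Yield (keyword, raw value) pairs, stepping by <length> only."""
    pos = 0
    while pos < len(data):
        space = data.index(b" ", pos)
        length = int(data[pos:space].decode("ascii"))
        record = data[space + 1:pos + length - 1]   # drop "<length> " and newline
        keyword, _, value = record.partition(b"=")
        yield keyword.decode("utf-8"), value
        pos += length          # works even if the value is not understood

blob = (make_record("path", "dir/file") +
        make_record("VENDOR.note", "a=b\nc") +   # value with '=' and a newline
        make_record("mtime", "1.5"))
print(list(iter_records(blob)))
```

Because the reader advances by the stated length rather than searching for a newline, a value that itself contains newlines or equals signs does not disturb the records that follow it.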
2123
2124 There are no extended header versions of the devmajor and devminor
2125 fields because the unspecified format ustar header field should be suf‐
2126 ficient. If they are not, vendor-specific extended keywords (such as
2127 VENDOR.devmajor) should be used.
2128
2129 Device and i-number labeling of files was not adopted from cpio; files
2130 are interchanged strictly on a symbolic name basis, as in ustar.
2131
2132 Just as with the ustar format descriptions, the new format makes no
2133 special arrangements for multi-volume archives. Each of the pax archive
2134 types is assumed to be inside a single POSIX file and splitting that
2135 file over multiple volumes (diskettes, tape cartridges, and so on),
2136 processing their labels, and mounting each in the proper sequence are
2137 considered to be implementation details that cannot be described
2138 portably.
2139
2140 The pax format is intended for interchange, not only for backup on a
2141 single (family of) systems. It is not as densely packed as might be
2142 possible for backup:
2143
2144 · It contains information as coded characters that could be coded
2145 in binary.
2146
2147 · It identifies extended records with name fields that could be
2148 omitted in favor of a fixed-field layout.
2149
2150 · It translates names into a portable character set and identifies
2151 locale-related information, both of which are probably unneces‐
2152 sary for backup.
2153
2154 The requirements on restoring from an archive are slightly different
2155 from the historical wording, allowing for non-monolithic privilege to
2156 bring forward as much as possible. In particular, attributes such as
2157 "high performance file" might be broadly but not universally granted
2158 while set-user-ID or chown(2) might be much more restricted. There is
2159 no implication in IEEE Std 1003.1-2001 that the security information be
2160 honored after it is restored to the file hierarchy, in spite of what
2161 might be improperly inferred by the silence on that topic. That is a
2162 topic for another standard.
2163
2164 Links are recorded in the fashion described here because a link can be
2165 to any file type. It is desirable in general to be able to restore part
2166 of an archive selectively and restore all of those files completely. If
2167 the data is not associated with each link, it is not possible to do
2168 this. However, the data associated with a file can be large, and when
2169 selective restoration is not needed, this can be a significant burden.
2170 The archive is structured so that files that have no associated data
2171 can always be restored by the name of any link, and
2172 the user may choose whether data is recorded with each instance of a
2173 file that contains data. The format permits mixing of both types of
2174 links in a single archive; this can be done for special needs, and pax
2175 is expected to interpret such archives on input properly, despite the
2176 fact that there is no pax option that would force this mixed case on
2177 output. (When -o linkdata is used, the output must contain the dupli‐
2178 cate data, but the implementation is free to include it or omit it when
2179 -o linkdata is not used.)
2180
2181 The time values are included as extended header records for those
2182 implementations needing more than the eleven octal digits allowed by
2183 the ustar format. Portable file timestamps cannot be negative. If pax
2184 encounters a file with a negative timestamp in copy or write mode, it
2185 can reject the file, substitute a non-negative timestamp, or generate a
2186 non-portable timestamp with a leading '-'. Even though some implementa‐
2187 tions can support finer file-time granularities than seconds, the nor‐
2188 mative text requires support only for seconds since the Epoch because
2189 the ISO POSIX-1 standard states them that way. The ustar format
2190 includes only mtime; the new format adds atime and ctime for symmetry.
2191 The atime access time restored to the file system will be affected by
2192 the -p a and -p e options. The ctime creation time (actually inode mod‐
2193 ification time) is described with "appropriate privilege" so that it
2194 can be ignored when writing to the file system. POSIX does not provide
2195 a portable means to change file creation time. Nothing is intended to
2196 prevent a non-portable implementation of pax from restoring the value.
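
The limit imposed by eleven octal digits is easy to work out. The following Python sketch, with hypothetical names, decides whether a timestamp can be stored exactly in the ustar field or needs an extended header record instead.

```python
USTAR_MTIME_DIGITS = 11                          # octal digits, as stated above
USTAR_MTIME_MAX = 8 ** USTAR_MTIME_DIGITS - 1    # largest storable value

def mtime_needs_extended_record(mtime):
    """True when mtime cannot be stored exactly in the ustar header field."""
    if mtime < 0:
        return True        # negative: not portable, the policy is up to pax
    if mtime != int(mtime):
        return True        # the sub-second part would be lost
    return mtime > USTAR_MTIME_MAX

print(USTAR_MTIME_MAX)
print(mtime_needs_extended_record(1700000000))      # whole seconds, in range
print(mtime_needs_extended_record(1700000000.25))   # sub-second granularity
```

The function only reports that the plain field is insufficient; which of the three permitted reactions to a negative timestamp is chosen remains an implementation decision.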
2197
2198 The gid, size, and uid extended header records were included to allow
2199 expansion beyond the sizes specified in the regular tar header. New
2200 file system architectures are emerging that will exhaust the 12-digit
2201 size field. There are probably not many systems requiring more than 8
2202 digits for user and group IDs, but the extended header values were
2203 included for completeness, allowing overrides for all of the decimal
2204 values in the tar header.
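
The override mechanism can be seen at work in Python's tarfile module, used here purely as an example of an independent pax-format writer; its exact threshold for the uid field is a property of that library and not something this manual specifies. The helper name header_bytes is hypothetical.

```python
import tarfile

def header_bytes(uid):
    """Return the raw header block(s) tarfile emits for one empty file."""
    info = tarfile.TarInfo("file.txt")
    info.uid = uid
    return info.tobuf(format=tarfile.PAX_FORMAT)

small = header_bytes(1000)        # fits the fixed-width field
large = header_bytes(3000000)     # too large for the field in this library

print(len(small), len(large))
print(b" uid=3000000\n" in large)  # the value moved into an extended record
```

For the large value the writer emits an extended header entry (one header block and one data block) in front of the ordinary header block, and the extended record carries the decimal uid.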
2205
2206 The standard developers intended to describe the effective results of
2207 pax with regard to file ownerships and permissions; implementations are
2208 not restricted in timing or sequencing the restoration of such, pro‐
2209 vided the results are as specified.
2210
2211 Much of the text describing the extended headers refers to use in
2212 "write or copy modes". The copy mode references are due to the norma‐
2213 tive text: "The effect of the copy shall be as if the copied files were
2214 written to an archive file and then subsequently extracted ...". There
2215 is certainly no way to test whether pax is actually generating the
2216 extended headers in copy mode, but the effects must be as if it had.
2217
2218
2219 pax Archive Character Set Encoding/Decoding
2220 There is a need to exchange archives of files between systems of dif‐
2221 ferent native codesets. Filenames, group names, and user names must be
2222 preserved to the fullest extent possible when an archive is read on the
2223 receiving platform. Translation of the contents of files is not within
2224 the scope of the pax utility.
2225
2226 There will also be the need to represent characters that are not avail‐
2227 able on the receiving platform. These unsupported characters cannot be
2228 automatically folded to the local set of characters due to the chance
2229 of collisions. This could result in overwriting previous extracted
2230 files from the archive or pre-existing files on the system.
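
The collision risk can be made concrete with a small Python sketch. The folding rule used here (decompose, then drop every non-ASCII character) is a deliberately naive assumption chosen for illustration; it is not a rule taken from the standard.

```python
import unicodedata

def fold_to_ascii(name):
    """Naive folding: decompose, then drop anything outside ASCII."""
    decomposed = unicodedata.normalize("NFKD", name)
    return decomposed.encode("ascii", "ignore").decode("ascii")

names = ["r\u00e9sum\u00e9.txt", "resume.txt",     # accented and plain spelling
         "\u65e5\u672c.txt", "\u4e2d\u6587.txt"]   # two different CJK names

seen = {}
for name in names:
    seen.setdefault(fold_to_ascii(name), []).append(name)

# Every folded name claimed by more than one original is a collision.
collisions = {folded: group for folded, group in seen.items() if len(group) > 1}
print(collisions)
```

Four distinct archive members collapse to two target names, so extracting them in sequence would silently overwrite two of the files.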
2231
2232 For these reasons, the codeset used to represent characters within the
2233 extended header records of the pax archive must be sufficiently rich to
2234 handle all commonly used character sets. The fields requiring transla‐
2235 tion include, at a minimum, filenames, user names, group names, and
2236 link pathnames. Implementations may wish to have localized extended
2237 keywords that use non-portable characters.
2238
2239 The standard developers considered the following options:
2240
2241 · The archive creator specifies the well-defined name of the
2242 source codeset. The receiver must then recognize the codeset
2243 name and perform the appropriate translations to the destination
2244 codeset.
2245
2246 · The archive creator includes within the archive the character
2247 mapping table for the source codeset used to encode extended
2248 header records. The receiver must then read the character map‐
2249 ping table and perform the appropriate translations to the des‐
2250 tination codeset.
2251
2252 · The archive creator translates the extended header records in
2253 the source codeset into a canonical form. The receiver must then
2254 perform the appropriate translations to the destination codeset.
2255
2256 The approach that incorporates the name of the source codeset poses the
2257 problem of codeset name registration, and makes the archive useless to
2258 pax archive decoders that do not recognize that codeset.
2259
2260 Because parts of an archive may be corrupted, the standard developers
2261 felt that including the character map of the source codeset was too
2262 fragile. The loss of this one key component could result in making the
2263 entire archive useless. (The difference between this and the global
2264 extended header decision was that the latter has a workaround, namely
2265 duplicating extended header records on unreliable media, but this would be too
2266 burdensome for large character set maps.)
2267
2268 Both of the above approaches also put an undue burden on the pax ar‐
2269 chive receiver to handle the cross-product of all source and destina‐
2270 tion codesets.
2271
2272 To simplify the translation from the source codeset to the canonical
2273 form and from the canonical form to the destination codeset, the stan‐
2274 dard developers decided that the internal representation should be a
2275 stateless encoding. A stateless encoding is one where each codepoint
2276 has the same meaning, without regard to the decoder being in a specific
2277 state. An example of a stateful encoding would be the Japanese Shift-
2278 JIS; an example of a stateless encoding would be the ISO/IEC 646:1991
2279 standard (equivalent to 7-bit ASCII).
2280
2281 For these reasons, the standard developers decided to adopt a canonical
2282 format for the representation of file information strings. The obvious,
2283 well-endorsed candidate is the ISO/IEC 10646-1:2000 standard (based in
2284 part on Unicode), which can be used to represent the characters of vir‐
2285 tually all standardized character sets. The standard developers ini‐
2286 tially agreed upon using UCS2 (16-bit Unicode) as the internal repre‐
2287 sentation. This repertoire of characters provides a sufficiently rich
2288 set to represent all commonly-used codesets.
2289
2290 However, the standard developers found that the 16-bit Unicode repre‐
2291 sentation had some problems. It forced the issue of standardizing byte
2292 ordering. The 2-byte length of each character made the extended header
2293 records twice as long for the case of strings coded entirely from his‐
2294 torical 7-bit ASCII. For these reasons, the standard developers chose
2295 the UTF-8 defined in the ISO/IEC 10646-1:2000 standard. This multi-byte
2296 representation encodes UCS2 or UCS4 characters reliably and determinis‐
2297 tically, eliminating the need for a canonical byte ordering. In addi‐
2298 tion, NUL octets and other characters possibly confusing to POSIX file
2299 systems do not appear, except to represent themselves. It was realized
2300 that certain national codesets take up more space after the encoding,
2301 due to their placement within the UCS range; it was felt that the use‐
2302 fulness of the encoding of the names outweighs the disadvantage of size
2303 increase for file, user, and group names.
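
The points made above about the 16-bit representation can be checked directly. The sketch below uses Python's utf-16-be and utf-16-le codecs as a stand-in for UCS2, which is an assumption that holds for the characters shown here.

```python
name = "README.txt"                        # a plain 7-bit ASCII name
utf16_be = name.encode("utf-16-be")
utf16_le = name.encode("utf-16-le")
utf8 = name.encode("utf-8")

print(len(utf8), len(utf16_be))            # the 16-bit form is twice as long
print(utf16_be == utf16_le)                # byte order changes the octets
print(b"\0" in utf16_be, b"\0" in utf8)    # NUL octets appear only in the 16-bit form

cjk = "\u65e5\u672c"                       # two characters from the CJK range
print(len(cjk.encode("utf-8")), len(cjk.encode("utf-16-be")))
```

The last line shows the size penalty the text mentions: these two characters take six octets in UTF-8 but four in the 16-bit form.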
2304
2305 The encoding of UTF-8 is as follows:
2306
2307 UCS4 Hex Encoding UTF-8 Binary Encoding
2308 00000000-0000007F 0xxxxxxx
2309 00000080-000007FF 110xxxxx 10xxxxxx
2310 00000800-0000FFFF 1110xxxx 10xxxxxx 10xxxxxx
2311 00010000-001FFFFF 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
2312 00200000-03FFFFFF 111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
2313 04000000-7FFFFFFF 1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
2314
2315 where each 'x' represents a bit value from the character being trans‐
2316 lated.
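
The table translates directly into code. The following Python sketch encodes a single UCS4 value exactly as tabulated, including the five- and six-octet forms; note that the current definition of UTF-8 (RFC 3629) stops at U+10FFFF and four octets, so those longer forms are shown only because the table above lists them. The function name is hypothetical.

```python
def utf8_encode(cp):
    """Encode one UCS4 value (0..0x7FFFFFFF) following the table above."""
    if not 0 <= cp <= 0x7FFFFFFF:
        raise ValueError("value outside the UCS4 range of the table")
    if cp < 0x80:
        return bytes([cp])                  # 0xxxxxxx
    # (exclusive upper bound, number of octets, lead-octet prefix) per row
    rows = ((0x800, 2, 0xC0), (0x10000, 3, 0xE0), (0x200000, 4, 0xF0),
            (0x4000000, 5, 0xF8), (0x80000000, 6, 0xFC))
    for limit, count, prefix in rows:
        if cp < limit:
            break
    out = []
    for _ in range(count - 1):
        out.append(0x80 | (cp & 0x3F))      # continuation octet: 10xxxxxx
        cp >>= 6
    out.append(prefix | cp)                 # lead octet carries the remaining bits
    return bytes(reversed(out))

for value in (0x41, 0xE9, 0x20AC, 0x1F600):
    print(hex(value), utf8_encode(value).hex())
```

For values up to U+10FFFF (surrogates aside) the result matches what Python's own UTF-8 codec produces, which gives a convenient cross-check.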
2317
2318
2319 ustar Interchange Format
2320 The description of the ustar format reflects numerous enhancements over
2321 pre-1988 versions of the historical tar utility. The goal of these
2322 changes was not only to provide the functional enhancements desired,
2323 but also to retain compatibility between new and old versions. This
2324 compatibility has been retained. Archives written using the old archive
2325 format are compatible with the new format.
2326
2327 Implementors should be aware that the previous file format did not
2328 include a mechanism to archive directory type files. For this reason,
2329 the convention of using a filename ending with slash was adopted to
2330 specify a directory on the archive.
2331
2332 The total size of the name and prefix fields has been set to meet the
2333 minimum requirements for {PATH_MAX}. If a pathname will fit within the
2334 name field, it is recommended that the pathname be stored there without
2335 the use of the prefix field. Although the name field is known to be too
2336 small to contain {PATH_MAX} characters, the value was not changed in
2337 this version of the archive file format to retain backwards-compatibil‐
2338 ity, and instead the prefix was introduced. Also, because of the ear‐
2339 lier version of the format, there is no way to remove the restriction
2340 on the linkname field being limited in size to just that of the name
2341 field.
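
The recommendation above can be sketched in Python. The field sizes (100 octets for name, 155 for prefix) are those of the ustar header block; the function name and the choice to try the rightmost slash first are assumptions made for this illustration.

```python
NAME_SIZE = 100      # octets in the ustar name field
PREFIX_SIZE = 155    # octets in the ustar prefix field

def split_ustar_path(path):
    """Return (prefix, name) as bytes, or None if the path cannot be stored."""
    raw = path.encode("utf-8")
    if len(raw) <= NAME_SIZE:
        return b"", raw                     # preferred: no prefix at all
    # Try each slash, rightmost first; the slash itself is not stored.
    for i in range(len(raw) - 1, 0, -1):
        if raw[i:i + 1] != b"/":
            continue
        prefix, name = raw[:i], raw[i + 1:]
        if len(prefix) <= PREFIX_SIZE and 0 < len(name) <= NAME_SIZE:
            return prefix, name
    return None                             # needs an extended header instead

print(split_ustar_path("short/path"))
print(split_ustar_path("a" * 150 + "/" + "b" * 90) is not None)
print(split_ustar_path("c" * 101))          # one long component: no split point
```

A pathname with a single component longer than the name field has no usable split point, which is one of the cases the path extended header record of the pax format addresses.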
2342
2343 The size field is required to be meaningful in all implementation
2344 extensions, although it could be zero. This is required so that the
2345 data blocks can always be properly counted.
2346
2347 It is suggested that if device special files need to be represented
2348 that cannot be represented in the standard format, that one of the
2349 extension types (A-Z) be used, and that the additional information for
2350 the special file be represented as data and be reflected in the size
2351 field.
2352
2353 Attempting to restore a special file type, where it is converted to
2354 ordinary data and conflicts with an existing filename, need not be spe‐
2355 cially detected by the utility. If run as an ordinary user, pax should
2356 not be able to overwrite the entries in, for example, /dev in any case
2357 (whether the file is converted to another type or not). If run as a
2358 privileged user, it should be able to do so, and it would be considered
2359 a bug if it did not. The same is true of ordinary data files and simi‐
2360 larly named special files; it is impossible to anticipate the needs of
2361 the user (who could really intend to overwrite the file), so the behav‐
2362 ior should be predictable (and thus regular) and rely on the protection
2363 system as required.
2364
2365 The value 7 in the typeflag field is intended to define how contiguous
2366 files can be stored in a ustar archive. IEEE Std 1003.1-2001 does not
2367 require the contiguous file extension, but does define a standard way
2368 of archiving such files so that all conforming systems can interpret
2369 these file types in a meaningful and consistent manner. On a system
2370 that does not support extended file types, the pax utility should do
2371 the best it can with the file and go on to the next.
2372
2373 The file protection modes are those conventionally used by the ls(1)
2374 utility. This is extended beyond the usage in the ISO POSIX-2 standard
2375 to support the "shared text" or "sticky" bit. It is intended that the
2376 conformance document should not document anything beyond the existence
2377 of and support of such a mode. Further extensions are expected to
2378 these bits, particularly with overloading the set-user-ID and set-
2379 group-ID flags.
2380
2381
2382 cpio Interchange Format
2383 The reference to appropriate privilege in the cpio format refers to an
2384 error on standard output; the ustar format does not make comparable
2385 statements.
2386
2387 The model for this format was the historical System V cpio -c data
2388 interchange format. This model documents the portable version of the
2389 cpio format and not the binary version. It has the flexibility to
2390 transfer data of any type described within IEEE Std 1003.1-2001, yet is
2391 extensible to transfer data types specific to extensions beyond IEEE
2392 Std 1003.1-2001 (for example, contiguous files). Because it describes
2393 existing practice, there is no question of maintaining upwards-compati‐
2394 bility.
2395
2396
2397 cpio Header
2398 There has been some concern that the size of the c_ino field of the
2399 header is too small to handle those systems that have very large inode
2400 numbers. However, the c_ino field in the header is used strictly as a
2401 hard-link resolution mechanism for archives. It is not necessarily the
2402 same value as the inode number of the file in the location from which
2403 that file is extracted.
2404
2405 The name c_magic is based on historical usage.
2406
2407
2408 cpio Filename
2409 For most historical implementations of the cpio utility, {PATH_MAX}
2410 octets can be used to describe the pathname without the addition of any
2411 other header fields (the NUL character would be included in this
2412 count). {PATH_MAX} is the minimum value for pathname size, documented
2413 as 256 bytes. However, an implementation may use c_namesize to deter‐
2414 mine the exact length of the pathname. With the current description of
2415 the <cpio.h> header, this pathname size can be as large as a number
2416 that is described in six octal digits.
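
As an illustration of how c_namesize governs the pathname length, the following Python sketch parses one header of the portable (character-based) cpio format. The field widths follow the octet-oriented cpio header layout; the sample header is hand-built for the example and the function name is hypothetical.

```python
CPIO_FIELDS = (("c_magic", 6), ("c_dev", 6), ("c_ino", 6), ("c_mode", 6),
               ("c_uid", 6), ("c_gid", 6), ("c_nlink", 6), ("c_rdev", 6),
               ("c_mtime", 11), ("c_namesize", 6), ("c_filesize", 11))

MAX_NAMESIZE = 0o777777       # largest value six octal digits can express

def parse_cpio_header(data):
    """Parse the fixed octal fields, then read c_namesize octets of pathname."""
    fields, pos = {}, 0
    for name, width in CPIO_FIELDS:
        fields[name] = int(data[pos:pos + width].decode("ascii"), 8)
        pos += width
    raw_name = data[pos:pos + fields["c_namesize"]]   # count includes the NUL
    fields["name"] = raw_name.rstrip(b"\0").decode("utf-8", "replace")
    return fields

sample = (b"070707" b"000001" b"000042" b"100644" b"001750" b"001750"
          b"000001" b"000000" b"00000000000" b"000006" b"00000000003"
          b"hello\0" b"hi\n")

header = parse_cpio_header(sample)
print(header["name"], header["c_namesize"], MAX_NAMESIZE)
```

The reader needs no pathname limit of its own: it takes exactly c_namesize octets, so the only bound imposed by the format is the six-digit field width.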
2417
2418 Two values are documented under the c_mode field values to provide for
2419 extensibility for known file types:
2420
2421 0110 000
2422 Reserved for contiguous files. The implementation may treat the
2423 rest of the information for this archive like a regular file. If
2424 this file type is undefined, the implementation may create the
2425 file as a regular file.
2426
2427 This provides for extensibility of the cpio format while allowing for
2428 the ability to read old archives. Files of an unknown type may be read
2429 as "regular files" on some implementations. On a system that does not
2430 support extended file types, the pax utility should do the best it can
2431 with the file and go on to the next.
2432
2433
2435 None.
2436
2437
2439_________________________________________________________________
2440
2441
2443 Shell Command Language, cp(1), ed(1), getopts(1), ls(1), printf(3), the
2444 Base Definitions volume of IEEE Std 1003.1-2001, <cpio.h>, the System
2445 Interfaces volume of IEEE Std 1003.1-2001, chown(2), creat(2),
2446 mkdir(2), mkfifo(3), stat(2), utime(2), write(2).
2447
2448
2450 First released in Issue 4.
2451
2452
2453 Issue 5
2454 A note is added to the APPLICATION USAGE indicating that the cpio and
2455 tar formats can only support files up to 8 gigabytes in size.
2456
2457
2458 Issue 6
2459 The pax utility is aligned with the IEEE P1003.2b draft standard:
2460
2461 · Support has been added for symbolic links in the options and
2462 interchange formats.
2463
2464 · A new format has been devised, based on extensions to ustar.
2465
2466 · References to the "extended" tar and cpio formats derived from
2467 the POSIX.1-1990 standard have been changed to remove the
2468 "extended" adjective because this could cause confusion with the
2469 extended tar header added in this revision. (All references to
2470 tar are actually to ustar.)
2471
2472 The TZ entry is added to the ENVIRONMENT VARIABLES section.
2473
2474 IEEE PASC Interpretation 1003.2 #168 is applied, clarifying that
2475 mkdir(2) and mkfifo(3) calls can ignore an [EEXIST] error when extract‐
2476 ing an archive.
2477
2478 IEEE PASC Interpretation 1003.2 #180 is applied, clarifying how
2479 extracted files are created when in read mode.
2480
2481 IEEE PASC Interpretation 1003.2 #181 is applied, clarifying the
2482 description of the -t option.
2483
2484 IEEE PASC Interpretation 1003.2 #195 is applied.
2485
2486 IEEE PASC Interpretation 1003.2 #206 is applied, clarifying the han‐
2487 dling of links for the -H, -L, and -l options.
2488
2489 IEEE Std 1003.1-2001/Cor 1-2002, item XCU/TC1/D6/35 is applied, adding
2490 the process ID of the pax process into certain fields. This change pro‐
2491 vides a method for the implementation to ensure that different
2492 instances of pax extracting a file named /a/b/foo will not collide when
2493 processing the extended header information associated with foo.
2494
2495 IEEE Std 1003.1-2001/Cor 1-2002, item XCU/TC1/D6/36 is applied, chang‐
2496 ing -x B to -x pax in the OPTIONS section.
2497
2498 IEEE Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/20 is applied, updat‐
2499 ing the SYNOPSIS to be consistent with the normative text.
2500
2501 IEEE Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/21 is applied, updat‐
2502 ing the DESCRIPTION to describe the behavior when files to be linked
2503 are symbolic links and the system is not capable of making hard links
2504 to symbolic links.
2505
2506 IEEE Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/22 is applied, updat‐
2507 ing the OPTIONS section to describe the behavior for how multiple
2508 options are to be handled.
2509
2510 IEEE Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/23 is applied, updat‐
2511 ing the write option within the OPTIONS section.
2512
2513 IEEE Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/24 is applied, adding
2514 a paragraph into the OPTIONS section that states that specifying more
2515 than one of the mutually-exclusive options (-H and -L) is not consid‐
2516 ered an error and that the last option specified will determine the
2517 behavior of the utility.
2518
2519 IEEE Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/25 is applied, remov‐
2520 ing the ctime paragraph within the EXTENDED DESCRIPTION. There is a
2521 contradiction in the definition of the ctime keyword for the pax
2522 extended header, in that the st_ctime member of the stat structure does
2523 not refer to a file creation time. No field in the standard stat struc‐
2524 ture from <sys/stat.h> includes a file creation time.
2525
2526 IEEE Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/26 is applied, making
2527 it clear that typeflag 1 (ustar Interchange Format) applies not
2528 only to files that are hard-linked, but also to files that are aliased
2529 via symlinks.
2530
2531 IEEE Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/27 is applied, clari‐
2532 fying the cpio c_nlink field.
2533
2534 End of quoted text from the POSIX.1-2001 standard.
2535
2537 The following other options are implemented as extensions to the POSIX
2538 standard. Note that some other non-POSIX options are mentioned in
2539 -help and -xhelp output - these are also supported in spax(1) and are
2540 described in the star(1) manual page.
2541
2542 -help Prints a summary of the most important options for spax(1) and
2543 exits.
2544
2545 -do-statistics
2546 Prints statistics messages at the end of a spax(1) run.
2547
2548 -xhelp Prints a summary of the less important options for spax(1) and
2549 exits.
2550
2551 -version
2552 Prints the spax version number string and exits.
2553
2554
2561 The Institute of Electrical and Electronics Engineers and The Open
2562 Group have given us permission to reprint portions of their
2563 documentation. In the following statement, the phrase ``this text'' refers to
2564 portions of the system documentation.
2565
2566 Portions of this text are reprinted and reproduced in electronic form
2567 in the spax manual, from IEEE Std 1003.1, 2004 Edition, Standard for
2568 Information Technology -- Portable Operating System Interface (POSIX),
2569 The Open Group Base Specifications Issue 6, Copyright (C) 2001-2004 by
2570 the Institute of Electrical and Electronics Engineers, Inc and The Open
2571 Group. In the event of any discrepancy between these versions and the
2572 original IEEE and The Open Group Standard, the original IEEE and The
2573 Open Group Standard is the referee document. The original Standard can
2574 be obtained online at http://www.opengroup.org/unix/online.html.
2575
2578 Joerg Schilling
2579 Seestr. 110
2580 D-13353 Berlin
2581 Germany
2582
2583 Mail bugs and suggestions to:
2584
2585 schilling@fokus.fraunhofer.de or js@cs.tu-berlin.de or
2586 joerg@schily.isdn.cs.tu-berlin.de
2587
2588
2589
2590Joerg Schilling 13/04/16 SPAX(1L)