SPAX(1L) Schily's USER COMMANDS SPAX(1L)
2
3
4
NAME
6 pax - portable archive interchange
7
SYNOPSIS
9 spax [other options] [-cdnv] [-H|-L] [-f archive]
10 [-o options]... [-s replstr]... [pattern...]
11
12
13 spax -r [other options] [-cdiknuv] [-H|-L] [-f archive]
14 [-o options]... [-p string]... [-s replstr]... [pattern...]
15
16
17 spax -w [other options] [-dituvX] [-H|-L] [-b blocksize] [-a]
18 [-f archive] [-o options]... [-s replstr]... [-x format]
19 [file...]
20
21
spax -r -w [other options] [-diklntuvX] [-H|-L] [-o options]...
23 [-p string]... [-s replstr]... [file...] directory
24
DESCRIPTION
26 The pax utility shall read, write, and write lists of the members of
27 archive files and copy directory hierarchies. A variety of archive for‐
28 mats shall be supported; see the -x format option.
29
30 The action to be taken depends on the presence of the -r and -w
31 options. The four combinations of -r and -w are referred to as the four
32 modes of operation: list, read, write, and copy modes, corresponding
33 respectively to the four forms shown in the SYNOPSIS section.
34
35 list In list mode (when neither -r nor -w are specified), pax shall
36 write the names of the members of the archive file read from the
37 standard input, with pathnames matching the specified patterns,
38 to standard output. If a named file is of type directory, the
39 file hierarchy rooted at that file shall be listed as well.
40
41 read In read mode (when -r is specified, but -w is not), pax shall
42 extract the members of the archive file read from the standard
43 input, with pathnames matching the specified patterns. If an
44 extracted file is of type directory, the file hierarchy rooted
45 at that file shall be extracted as well. The extracted files
46 shall be created performing pathname resolution with the direc‐
47 tory in which pax was invoked as the current working directory.
48
49 If an attempt is made to extract a directory when the directory
50 already exists, this shall not be considered an error. If an
51 attempt is made to extract a FIFO when the FIFO already exists,
52 this shall not be considered an error.
53
54 The ownership, access, and modification times, and file mode of
55 the restored files are discussed under the -p option.
56
57 write In write mode (when -w is specified, but -r is not), pax shall
58 write the contents of the file operands to the standard output
59 in an archive format. If no file operands are specified, a list
60 of files to copy, one per line, shall be read from the standard
61 input. A file of type directory shall include all of the files
62 in the file hierarchy rooted at the file.
63
64 copy In copy mode (when both -r and -w are specified), pax shall copy
65 the file operands to the destination directory.
66
67 If no file operands are specified, a list of files to copy, one
68 per line, shall be read from the standard input. A file of type
69 directory shall include all of the files in the file hierarchy
70 rooted at the file.
71
72 The effect of the copy shall be as if the copied files were
73 written to an archive file and then subsequently extracted,
74 except that there may be hard links between the original and the
75 copied files. If the destination directory is a subdirectory of
76 one of the files to be copied, the results are unspecified. If
77 the destination directory is a file of a type not defined by the
78 System Interfaces volume of IEEE Std 1003.1-2001, the results
79 are implementation-defined; otherwise, it shall be an error for
80 the file named by the directory operand not to exist, not be
81 writable by the user, or not be a file of type directory.
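
The way the -r and -w options select one of the four modes can be written down in a few lines. The following Python sketch is only an illustration of the rule described above; the function name pax_mode is invented for this example and is not part of spax.

```python
def pax_mode(read_flag: bool, write_flag: bool) -> str:
    """Return the mode selected by the presence of -r and -w."""
    if read_flag and write_flag:
        return "copy"   # both -r and -w
    if read_flag:
        return "read"   # -r only
    if write_flag:
        return "write"  # -w only
    return "list"       # neither option


if __name__ == "__main__":
    for r in (False, True):
        for w in (False, True):
            print(f"-r={r!s:5} -w={w!s:5} -> {pax_mode(r, w)} mode")
```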
82
83 In read or copy modes, if intermediate directories are necessary to
84 extract an archive member, pax shall perform actions equivalent to the
85 mkdir() function defined in the System Interfaces volume of IEEE Std
86 1003.1-2001, called with the following arguments:
87
88 · The intermediate directory used as the path argument.
89
90 · The value of the bitwise-inclusive OR of S_IRWXU, S_IRWXG, and
91 S_IRWXO as the mode argument.
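
As a small illustration of the mode argument listed above, the Python sketch below computes the bitwise-inclusive OR of the three constants and shows how the file mode creation mask, which mkdir() applies as usual, reduces it. The helper name effective_dir_mode is an assumption made for this example only.

```python
import stat

# Mode argument used for intermediate directories: rwx for user, group, other.
INTERMEDIATE_DIR_MODE = stat.S_IRWXU | stat.S_IRWXG | stat.S_IRWXO


def effective_dir_mode(umask: int) -> int:
    """Permission bits left after mkdir() applies the given umask."""
    return INTERMEDIATE_DIR_MODE & ~umask


if __name__ == "__main__":
    print(oct(INTERMEDIATE_DIR_MODE))
    print(oct(effective_dir_mode(0o022)))
```

With a typical umask of 022, an intermediate directory therefore ends up with mode 755.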
92
93 If any specified pattern or file operands are not matched by at least
94 one file or archive member, pax shall write a diagnostic message to
95 standard error for each one that did not match and exit with a non-zero
96 exit status.
97
98 The archive formats described in the EXTENDED DESCRIPTION section shall
99 be automatically detected on input. The default output archive format
100 shall be implementation-defined.
101
102 The spax implementation defaults to -x ustar.
103
104 A single archive can span multiple files. The pax utility shall deter‐
105 mine, in an implementation-defined manner, what file to read or write
106 as the next file.
107
108 If the selected archive format supports the specification of linked
109 files, it shall be an error if these files cannot be linked when the
110 archive is extracted, except that if the files to be linked are sym‐
111 bolic links and the system is not capable of making hard links to sym‐
112 bolic links, then separate copies of the symbolic link shall be created
113 instead. For archive formats that do not store file contents with each
114 name that causes a hard link, if the file that contains the data is not
115 extracted during this pax session, either the data shall be restored
116 from the original file, or a diagnostic message shall be displayed with
117 the name of a file that can be used to extract the data. In traversing
118 directories, pax shall detect infinite loops; that is, entering a pre‐
119 viously visited directory that is an ancestor of the last file visited.
120 When it detects an infinite loop, pax shall write a diagnostic message
121 to standard error and shall terminate.
122
123
OPTIONS
125 The pax utility shall conform to the Base Definitions volume of IEEE
126 Std 1003.1-2001, Section 12.2, Utility Syntax Guidelines, except that
127 the order of presentation of the -o, -p, and -s options is significant.
128
129 The following options shall be supported:
130
131 -r Read an archive file from standard input.
132
133 -w Write files to the standard output in the specified archive for‐
134 mat.
135
136 -a Append files to the end of the archive. It is implementation-
137 defined which devices on the system support appending. Addi‐
138 tional file formats unspecified by this volume of IEEE Std
139 1003.1-2001 may impose restrictions on appending.
140
141 -b blocksize
142 Block the output at a positive decimal integer number of bytes
143 per write to the archive file. Devices and archive formats may
144 impose restrictions on blocking. Blocking shall be automatically
145 determined on input. Conforming applications shall not specify a
146 blocksize value larger than 32256. Default blocking when creat‐
147 ing archives depends on the archive format. (See the -x option
148 below.)
149
150 -c Match all file or archive members except those specified by the
151 pattern or file operands.
152
153 -d Cause files of type directory being copied or archived or ar‐
154 chive members of type directory being extracted or listed to
155 match only the file or archive member itself and not the file
156 hierarchy rooted at the file.
157
158 -f archive
159 Specify the pathname of the input or output archive, overriding
160 the default standard input (in list or read modes) or standard
161 output (write mode).
162
163 -H If a symbolic link referencing a file of type directory is spec‐
164 ified on the command line, pax shall archive the file hierarchy
165 rooted in the file referenced by the link, using the name of the
166 link as the root of the file hierarchy. Otherwise, if a sym‐
167 bolic link referencing a file of any other file type which pax
168 can normally archive is specified on the command line, then pax
169 shall archive the file referenced by the link, using the name of
170 the link. The default behavior shall be to archive the symbolic
171 link itself.
172
173 -i Interactively rename files or archive members. For each archive
174 member matching a pattern operand or file matching a file oper‐
175 and, a prompt shall be written to the file /dev/tty. The prompt
176 shall contain the name of the file or archive member, but the
177 format is otherwise unspecified. A line shall then be read from
178 /dev/tty. If this line is blank, the file or archive member
179 shall be skipped. If this line consists of a single period, the
180 file or archive member shall be processed with no modification
181 to its name. Otherwise, its name shall be replaced with the con‐
182 tents of the line. The pax utility shall immediately exit with a
183 non-zero exit status if end-of-file is encountered when reading
184 a response or if /dev/tty cannot be opened for reading and writ‐
185 ing.
186
187 The results of extracting a hard link to a file that has been
188 renamed during extraction are unspecified.
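
The handling of a response line under -i can be sketched as a small decision function. This is a hedged Python illustration, not the spax code: the function name, the returned action labels, and the choice to treat a whitespace-only line as blank are assumptions of this example.

```python
def interpret_rename_response(line, current_name):
    """Map one line read from /dev/tty to an (action, name) pair.

    line is None when end-of-file was encountered; otherwise it is the
    text of the response, with or without its trailing newline.
    """
    if line is None:
        # End-of-file: pax exits immediately with a non-zero status.
        return ("abort", None)
    text = line.rstrip("\n")
    if not text.strip():
        # Blank line: the file or archive member is skipped.
        # (Treating whitespace-only input as blank is an assumption.)
        return ("skip", None)
    if text == ".":
        # A single period: process the member under its current name.
        return ("keep", current_name)
    # Anything else replaces the name.
    return ("rename", text)
```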
189
190 -k Prevent the overwriting of existing files.
191
192 -l (The letter ell.) In copy mode, hard links shall be made between
193 the source and destination file hierarchies whenever possible.
194 If specified in conjunction with -H or -L, when a symbolic link
195 is encountered, the hard link created in the destination file
196 hierarchy shall be to the file referenced by the symbolic link.
197 If specified when neither -H nor -L is specified, when a sym‐
198 bolic link is encountered, the implementation shall create a
199 hard link to the symbolic link in the source file hierarchy or
200 copy the symbolic link to the destination.
201
202 -L If a symbolic link referencing a file of type directory is spec‐
203 ified on the command line or encountered during the traversal of
204 a file hierarchy, pax shall archive the file hierarchy rooted in
205 the file referenced by the link, using the name of the link as
206 the root of the file hierarchy. Otherwise, if a symbolic link
207 referencing a file of any other file type which pax can normally
208 archive is specified on the command line or encountered during
209 the traversal of a file hierarchy, pax shall archive the file
210 referenced by the link, using the name of the link. The default
211 behavior shall be to archive the symbolic link itself.
212
213 -n Select the first archive member that matches each pattern oper‐
214 and. No more than one archive member shall be matched for each
215 pattern (although members of type directory shall still match
216 the file hierarchy rooted at that file).
217
218 -o options
219 Provide information to the implementation to modify the algo‐
220 rithm for extracting or writing files. The value of options
221 shall consist of one or more comma-separated keywords of the
222 form:
223
224 keyword[[:]=value][,keyword[[:]=value],...]
225
226 Some keywords apply only to certain file formats, as indicated
227 with each description. Use of keywords that are inapplicable to
228 the file format being processed produces undefined results.
229
230 Keywords in the options argument shall be a string that would be
231 a valid portable filename as described in the Base Definitions
232 volume of IEEE Std 1003.1-2001, Section 3.276, Portable Filename
233 Character Set.
234
235 Note: Keywords are not expected to be filenames, merely to fol‐
236 low the same character composition rules as portable
237 filenames.
238
239 Keywords can be preceded with white space. The value field shall
240 consist of zero or more characters; within value, the applica‐
241 tion shall precede any literal comma with a backslash, which
242 shall be ignored, but preserves the comma as part of value. A
243 comma as the final character, or a comma followed solely by
244 white space as the final characters, in options shall be
245 ignored. Multiple -o options can be specified; if keywords given
246 to these multiple -o options conflict, the keywords and values
247 appearing later in command line sequence shall take precedence
248 and the earlier shall be silently ignored. The following keyword
249 values of options shall be supported for the file formats as
250 indicated:
251
252 delete=pattern
253 (Applicable only to the -x pax format.) When used in
254 write or copy mode, pax shall omit from extended header
255 records that it produces any keywords matching the string
256 pattern. When used in read or list mode, pax shall ignore
257 any keywords matching the string pattern in the extended
258 header records. In both cases, matching shall be per‐
259 formed using the pattern matching notation described in
260 Patterns Matching a Single Character and Patterns Match‐
261 ing Multiple Characters. For example:
262
263 -o delete=security.*
264
265 would suppress security-related information. See pax
266 Extended Header for extended header record keyword usage.
267
268 When multiple -o delete=pattern options are specified,
269 the patterns shall be additive; all keywords matching the
270 specified string patterns shall be omitted from extended
271 header records that pax produces.
272
273 exthdr.name=string
274 (Applicable only to the -x pax format.) This keyword
275 allows user control over the name that is written into
276 the ustar header blocks for the extended header produced
277 under the circumstances described in pax Header Block.
278 The name shall be the contents of string, after the fol‐
279 lowing character substitutions have been made:
280
281 ┌─────────────────┬─────────────────────────────────────────────┐
282 │string Includes: │ Replaced By: │
283 ├─────────────────┼─────────────────────────────────────────────┤
284 │%d │ The directory name of the file, equivalent │
285 │ │ to the result of the dirname utility on the │
286 │ │ translated pathname. │
287 ├─────────────────┼─────────────────────────────────────────────┤
288 │%f │ The filename of the file, equivalent to the │
289 │ │ result of the basename utility on the │
290 │ │ translated pathname. │
291 ├─────────────────┼─────────────────────────────────────────────┤
292 │%p │ The process ID of the pax process. │
293 ├─────────────────┼─────────────────────────────────────────────┤
294 │%% │ A '%' character. │
295 └─────────────────┴─────────────────────────────────────────────┘
296 Any other '%' characters in string produce undefined
297 results.
298
If no -o exthdr.name=string is specified, pax shall use
300 the following default value:
301
302 %d/PaxHeaders.%p/%f
303
304 globexthdr.name=string
305 (Applicable only to the -x pax format.) When used in
306 write or copy mode with the appropriate options, pax
307 shall create global extended header records with ustar
308 header blocks that will be treated as regular files by
309 previous versions of pax. This keyword allows user con‐
310 trol over the name that is written into the ustar header
311 blocks for global extended header records. The name shall
312 be the contents of string, after the following character
313 substitutions have been made:
314
315 ┌─────────────────┬─────────────────────────────────────────────┐
316 │string Includes: │ Replaced By: │
317 ├─────────────────┼─────────────────────────────────────────────┤
318 │%n │ An integer that represents the sequence │
319 │ │ number of the global extended header record │
320 │ │ in the archive, starting at 1. │
321 ├─────────────────┼─────────────────────────────────────────────┤
322 │%p │ The process ID of the pax process. │
323 ├─────────────────┼─────────────────────────────────────────────┤
324 │%% │ A '%' character. │
325 └─────────────────┴─────────────────────────────────────────────┘
326 Any other '%' characters in string produce undefined
327 results.
328
329 If no -o globexthdr.name=string is specified, pax shall
330 use the following default value:
331
332 $TMPDIR/GlobalHead.%p.%n
333
334 where $TMPDIR represents the value of the TMPDIR environ‐
335 ment variable. If TMPDIR is not set, pax shall use /tmp.
336
337 invalid=action
338 (Applicable only to the -x pax format.) This keyword
339 allows user control over the action pax takes upon
340 encountering values in an extended header record that, in
341 read or copy mode, are invalid in the destination hierar‐
342 chy or, in list mode, cannot be written in the codeset
343 and current locale of the implementation. The following
344 are invalid values that shall be recognized by pax:
345
346 + In read or copy mode, a filename or link name that
347 contains character encodings invalid in the desti‐
348 nation hierarchy. (For example, the name may con‐
349 tain embedded NULs.)
350
351 + In read or copy mode, a filename or link name that
352 is longer than the maximum allowed in the destina‐
353 tion hierarchy (for either a pathname component or
354 the entire pathname).
355
356 + In list mode, any character string value (file‐
357 name, link name, user name, and so on) that cannot
358 be written in the codeset and current locale of
359 the implementation.
360
361 The following mutually-exclusive values of the action
362 argument are supported:
363
364 bypass In read or copy mode, pax shall bypass the file,
365 causing no change to the destination hierarchy. In
366 list mode, pax shall write all requested valid
367 values for the file, but its method for writing
368 invalid values is unspecified.
369
370 rename In read or copy mode, pax shall act as if the -i
371 option were in effect for each file with invalid
372 filename or link name values, allowing the user to
373 provide a replacement name interactively. In list
374 mode, pax shall behave identically to the bypass
375 action.
376
377 UTF-8 When used in read, copy, or list mode and a file‐
378 name, link name, owner name, or any other field in
379 an extended header record cannot be translated
380 from the pax UTF-8 codeset format to the codeset
381 and current locale of the implementation, pax
382 shall use the actual UTF-8 encoding for the name.
383
384 write In read or copy mode, pax shall write the file,
385 translating the name, regardless of whether this
386 may overwrite an existing file with a valid name.
387 In list mode, pax shall behave identically to the
388 bypass action.
389
If no -o invalid=action option is specified, pax shall act as
if -o invalid=bypass were specified. Any overwriting of
existing files that may be allowed by the -o invalid=
actions shall be subject to permission (-p) and modification
time (-u) restrictions, and shall be suppressed if
the -k option is also specified.
396
397 linkdata
398 (Applicable only to the -x pax format.) In write mode,
399 pax shall write the contents of a file to the archive
400 even when that file is merely a hard link to a file whose
401 contents have already been written to the archive.
402
403 listopt=format
404 This keyword specifies the output format of the table of
405 contents produced when the -v option is specified in list
mode. See List Mode Format Specifications. To avoid ambiguity,
the listopt=format shall be the only or final
keyword=value pair in a -o option-argument; all characters
in the remainder of the option-argument shall be
considered part of the format string. When multiple -o
listopt=format options are specified, the format strings
shall be considered a single, concatenated string, evaluated
in command line order.
414
415 times (Applicable only to the -x pax format.) When used in
416 write or copy mode, pax shall include atime and mtime
417 extended header records for each file. See pax Extended
418 Header File Times.
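
The character substitutions described for the exthdr.name and globexthdr.name keywords can be illustrated with a short Python sketch. It is an approximation under stated assumptions: the helper names are invented, the dirname and basename utilities are imitated with posixpath, and '%' sequences that the text calls undefined are simply left unchanged.

```python
import os
import posixpath
import re


def _dirname(path):
    """Approximate the dirname utility (an assumption of this sketch)."""
    stripped = path.rstrip("/")
    if not stripped:
        return "/" if path.startswith("/") else "."
    return posixpath.dirname(stripped) or "."


def _basename(path):
    """Approximate the basename utility (an assumption of this sketch)."""
    stripped = path.rstrip("/")
    if not stripped:
        return "/" if path.startswith("/") else ""
    return posixpath.basename(stripped)


def _expand(template, values):
    # Sequences not in the table are undefined; this sketch keeps them as-is.
    return re.sub(r"%(.)", lambda m: values.get(m.group(1), m.group(0)), template)


def exthdr_name(path, pid, template="%d/PaxHeaders.%p/%f"):
    """Name written into the ustar header of a per-file extended header."""
    values = {"d": _dirname(path), "f": _basename(path), "p": str(pid), "%": "%"}
    return _expand(template, values)


def globexthdr_name(seq, pid, template=None, environ=None):
    """Name written into the ustar header of a global extended header."""
    values = {"n": str(seq), "p": str(pid), "%": "%"}
    if template is not None:
        return _expand(template, values)
    env = os.environ if environ is None else environ
    tmpdir = env.get("TMPDIR", "/tmp")  # /tmp when TMPDIR is not set
    return tmpdir + "/" + _expand("GlobalHead.%p.%n", values)
```

For example, exthdr_name("src/lib/util.c", 1234) yields src/lib/PaxHeaders.1234/util.c with the default template.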
419
420 In addition to these keywords, if the -x pax format is speci‐
421 fied, any of the keywords and values defined in pax Extended
422 Header, including implementation extensions, can be used in -o
423 option-arguments, in either of two modes:
424
425 keyword=value
426 When used in write or copy mode, these keyword/value
427 pairs shall be included at the beginning of the archive
428 as typeflag g global extended header records. When used
429 in read or list mode, these keyword/value pairs shall act
430 as if they had been at the beginning of the archive as
431 typeflag g global extended header records.
432
433 keyword:=value
434 When used in write or copy mode, these keyword/value
435 pairs shall be included as records at the beginning of a
436 typeflag x extended header for each file. (This shall be
437 equivalent to the equal-sign form except that it creates
438 no typeflag g global extended header records.) When used
439 in read or list mode, these keyword/value pairs shall act
440 as if they were included as records at the end of each
441 extended header; thus, they shall override any global or
442 file-specific extended header record keywords of the same
443 names. For example, in the command:
444
445 pax -r -o "gname:=mygroup," <archive
446
447 the group name will be forced to a new value for all
448 files read from the archive.
449
450 The precedence of -o keywords over various fields in the archive
451 is described in pax Extended Header Keyword Precedence.
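
The syntax rules for the -o option-argument (comma-separated keyword[[:]=value] items, backslash-escaped commas inside values, an ignored trailing comma, and later settings overriding earlier ones) can be sketched as a small parser. This Python example is illustrative only: parse_pax_options is a hypothetical name, and the special rule that listopt=format swallows the rest of the option-argument is deliberately not modeled.

```python
def parse_pax_options(option_args):
    """Parse a list of -o option-arguments, in command line order.

    Returns {keyword: (value, per_file)}, where per_file is True for the
    keyword:=value form and value is None for a bare keyword.
    """
    result = {}
    for arg in option_args:
        items, current, i = [], [], 0
        while i < len(arg):
            ch = arg[i]
            if ch == "\\" and i + 1 < len(arg) and arg[i + 1] == ",":
                # The backslash is dropped; the comma stays part of the value.
                current.append(",")
                i += 2
                continue
            if ch == ",":
                items.append("".join(current))
                current = []
            else:
                current.append(ch)
            i += 1
        items.append("".join(current))
        for item in items:
            item = item.lstrip()  # keywords may be preceded by white space
            if not item:
                continue  # covers a trailing comma, optionally followed by blanks
            keyword, sep, value = item.partition("=")
            per_file = keyword.endswith(":")
            if per_file:
                keyword = keyword[:-1]
            # Later occurrences silently replace earlier ones.
            result[keyword] = (value if sep else None, per_file)
    return result
```

Applied to the option-argument from the example above, parse_pax_options(["gname:=mygroup,"]) returns {"gname": ("mygroup", True)}.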
452
453 -p string
454 Specify one or more file characteristic options (privileges).
455 The string option-argument shall be a string specifying file
456 characteristics to be retained or discarded on extraction. The
string shall consist of the specification characters a, e, m,
458 o, and p. Other implementation-defined characters can be
459 included. Multiple characteristics can be concatenated within
460 the same string and multiple -p options can be specified. The
meanings of the specification characters are as follows:
462
463 a Do not preserve file access times.
464
465 e Preserve the user ID, group ID, file mode bits (see the
466 Base Definitions volume of IEEE Std 1003.1-2001, Section
467 3.168, File Mode Bits), access time, modification time,
468 and any other implementation-defined file characteris‐
469 tics.
470
m Do not preserve file modification times.
474
475 o Preserve the user ID and group ID.
476
477 p Preserve the file mode bits. Other implementation-defined
478 file mode attributes may be preserved.
479
480 In the preceding list, "preserve" indicates that an attribute
481 stored in the archive shall be given to the extracted file, sub‐
482 ject to the permissions of the invoking process. The access and
483 modification times of the file shall be preserved unless other‐
484 wise specified with the -p option or not stored in the archive.
485 All attributes that are not preserved shall be determined as
486 part of the normal file creation action (see File Read, Write,
487 and Creation).
488
489 If neither the e nor the o specification character is specified,
490 or the user ID and group ID are not preserved for any reason,
491 pax shall not set the S_ISUID and S_ISGID bits of the file mode.
492
493 If the preservation of any of these items fails for any reason,
494 pax shall write a diagnostic message to standard error. Failure
495 to preserve these items shall affect the final exit status, but
496 shall not cause the extracted file to be deleted.
497
498 If file characteristic letters in any of the string option-argu‐
499 ments are duplicated or conflict with each other, the ones given
500 last shall take precedence. For example, if -p eme is specified,
501 file modification times are preserved.
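
The "last one wins" rule for -p specification characters, including the -p eme example, can be traced with a short Python sketch. The function name and the dictionary keys are assumptions of this example; implementation-defined characters are rejected here only to keep the sketch small.

```python
def preserved_attributes(p_strings):
    """Evaluate all -p string option-arguments in command line order."""
    # Defaults: times are preserved, ownership and mode bits are not.
    state = {"atime": True, "mtime": True, "owner": False, "mode": False}
    for string in p_strings:
        for ch in string:
            if ch == "a":
                state["atime"] = False
            elif ch == "m":
                state["mtime"] = False
            elif ch == "o":
                state["owner"] = True
            elif ch == "p":
                state["mode"] = True
            elif ch == "e":
                state = dict.fromkeys(state, True)  # preserve everything
            else:
                raise ValueError("unsupported -p character: " + ch)
    return state


if __name__ == "__main__":
    # e turns everything on, m turns mtime off, the final e turns it on again.
    print(preserved_attributes(["eme"]))
```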
502
503 -s replstr
504 Modify file or archive member names named by pattern or file op‐
505 erands according to the substitution expression replstr, using
506 the syntax of the ed utility. The concepts of "address" and
507 "line" are meaningless in the context of the pax utility, and
508 shall not be supplied. The format shall be:
509
510 -s /old/new/[gp]
511
512 where as in ed, old is a basic regular expression and new can
513 contain an ampersand, '\n' (where n is a digit) backreferences,
514 or subexpression matching. The old string shall also be permit‐
515 ted to contain <newline>s.
516
Any non-null character can be used as a delimiter ('/' shown
518 here). Multiple -s expressions can be specified; the expressions
519 shall be applied in the order specified, terminating with the
520 first successful substitution. The optional trailing 'g' is as
521 defined in the ed utility. The optional trailing 'p' shall cause
522 successful substitutions to be written to standard error. File
523 or archive member names that substitute to the empty string
524 shall be ignored when reading and writing archives.
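
The processing of -s expressions (arbitrary delimiter, optional g and p flags, expressions tried in order until the first successful substitution) can be sketched in Python. Note the loud assumption: Python's re syntax is used as a stand-in for the basic regular expressions of ed, so groups are written ( ) rather than \( \). The function names and the format of the message printed for the p flag are also invented for this example.

```python
import re
import sys


def parse_replstr(replstr):
    """Split '/old/new/[gp]' into (old, new, global_flag, print_flag)."""
    if not replstr:
        raise ValueError("empty replstr")
    delim = replstr[0]  # any non-null character may serve as the delimiter
    parts, current, i = [], [], 1
    while i < len(replstr) and len(parts) < 2:
        ch = replstr[i]
        if ch == "\\" and i + 1 < len(replstr):
            nxt = replstr[i + 1]
            # An escaped delimiter stands for itself; other escapes are kept.
            current.append(nxt if nxt == delim else ch + nxt)
            i += 2
            continue
        if ch == delim:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    flags = replstr[i:]
    if len(parts) != 2 or set(flags) - set("gp"):
        raise ValueError("expected /old/new/[gp]: " + replstr)
    return parts[0], parts[1], "g" in flags, "p" in flags


def _expand_new(new, match):
    """Expand '&' and backslash-digit backreferences in the replacement."""
    out, i = [], 0
    while i < len(new):
        ch = new[i]
        if ch == "\\" and i + 1 < len(new):
            nxt = new[i + 1]
            if nxt in "0123456789":
                out.append(match.group(int(nxt)) or "")
            else:
                out.append(nxt)  # e.g. an escaped ampersand is a literal '&'
            i += 2
        elif ch == "&":
            out.append(match.group(0))
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def apply_substitutions(name, replstrs):
    """Apply -s expressions in order, stopping at the first that succeeds."""
    for replstr in replstrs:
        old, new, global_flag, print_flag = parse_replstr(replstr)
        result, count = re.compile(old).subn(
            lambda m: _expand_new(new, m), name, count=0 if global_flag else 1
        )
        if count:
            if print_flag:
                # The message format is an assumption of this sketch.
                print(name + " >> " + result, file=sys.stderr)
            return result  # an empty result means the member is ignored
    return name
```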
525
526 -t When reading files from the file system, and if the user has the
527 permissions required by utime() to do so, set the access time of
528 each file read to the access time that it had before being read
529 by pax.
530
531 -u Ignore files that are older (having a less recent file modifica‐
532 tion time) than a pre-existing file or archive member with the
533 same name. In read mode, an archive member with the same name as
534 a file in the file system shall be extracted if the archive mem‐
535 ber is newer than the file. In write mode, an archive file mem‐
536 ber with the same name as a file in the file system shall be
537 superseded if the file is newer than the archive member. If -a
538 is also specified, this is accomplished by appending to the ar‐
539 chive; otherwise, it is unspecified whether this is accomplished
540 by actual replacement in the archive or by appending to the ar‐
541 chive. In copy mode, the file in the destination hierarchy shall
542 be replaced by the file in the source hierarchy or by a link to
543 the file in the source hierarchy if the file in the source hier‐
544 archy is newer.
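
The comparison behind -u reduces to a single test on modification times. The Python sketch below is a hedged illustration: the function name is invented, and treating equal times as "not newer" is an assumption, since the text above only covers the older and newer cases.

```python
def should_replace(candidate_mtime, existing_mtime):
    """Decide whether a candidate supersedes a pre-existing copy under -u.

    existing_mtime is None when no file or archive member with the same
    name exists yet.
    """
    if existing_mtime is None:
        return True
    # Assumption: a candidate with an identical time is not considered newer.
    return candidate_mtime > existing_mtime
```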
545
546 -v In list mode, produce a verbose table of contents (see the STD‐
547 OUT section). Otherwise, write archive member pathnames to stan‐
548 dard error (see the STDERR section).
549
550 -x format
551 Specify the output archive format. The pax utility shall support
552 the following formats:
553
554 cpio The cpio interchange format; see the EXTENDED DESCRIPTION
555 section. The default blocksize for this format for char‐
556 acter special archive files shall be 5120. Implementa‐
557 tions shall support all blocksize values less than or
558 equal to 32256 that are multiples of 512.
559
560 pax The pax interchange format; see the EXTENDED DESCRIPTION
561 section. The default blocksize for this format for char‐
562 acter special archive files shall be 5120. Implementa‐
563 tions shall support all blocksize values less than or
564 equal to 32256 that are multiples of 512.
565
566 ustar The tar interchange format; see the EXTENDED DESCRIPTION
567 section. The default blocksize for this format for char‐
568 acter special archive files shall be 10240. Implementa‐
569 tions shall support all blocksize values less than or
570 equal to 32256 that are multiples of 512.
571
572 Implementation-defined formats shall specify a default block
573 size as well as any other block sizes supported for character
574 special archive files.
575
576 Any attempt to append to an archive file in a format different
577 from the existing archive format shall cause pax to exit immedi‐
578 ately with a non-zero exit status.
579
580 In copy mode, if no -x format is specified, pax shall behave as
581 if -x pax were specified.
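
The default block sizes and the range of values that every implementation has to accept can be collected in a small lookup. This Python sketch only restates the numbers given above; the names are invented, and a real implementation may accept further values that this check rejects.

```python
# Default block sizes for character special archive files, per format.
DEFAULT_BLOCKSIZE = {"cpio": 5120, "pax": 5120, "ustar": 10240}

# Largest value a conforming application may pass to -b.
MAX_PORTABLE_BLOCKSIZE = 32256


def portable_blocksize(fmt, requested=None):
    """Return the block size to use, or raise if it is not portable."""
    if requested is None:
        return DEFAULT_BLOCKSIZE[fmt]
    if requested <= 0:
        raise ValueError("blocksize must be a positive integer")
    if requested > MAX_PORTABLE_BLOCKSIZE or requested % 512 != 0:
        raise ValueError("blocksize not guaranteed to be supported")
    return requested
```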
582
583 -X When traversing the file hierarchy specified by a pathname, pax
584 shall not descend into directories that have a different device
ID (st_dev; see the System Interfaces volume of IEEE Std
586 1003.1-2001, stat()).
587
588 Specifying more than one of the mutually-exclusive options -H and -L
589 shall not be considered an error and the last option specified shall
590 determine the behavior of the utility.
591
592 The options that operate on the names of files or archive members (-c,
-i, -n, -s, -u, and -v) shall interact as follows. In read mode, the archive
594 members shall be selected based on the user-specified pattern operands
595 as modified by the -c, -n, and -u options. Then, any -s and -i options
596 shall modify, in that order, the names of the selected files. The -v
597 option shall write names resulting from these modifications.
598
599 In write mode, the files shall be selected based on the user-specified
600 pathnames as modified by the -n and -u options. Then, any -s and -i
601 options shall modify, in that order, the names of these selected files.
602 The -v option shall write names resulting from these modifications.
603
604 If both the -u and -n options are specified, pax shall not consider a
605 file selected unless it is newer than the file to which it is compared.
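
The selection step that precedes renaming can be sketched as follows. This Python example is a simplification under several assumptions: fnmatchcase stands in for the pattern matching notation (it lets '*' match '/', which the real notation may not), directory hierarchies are not expanded, -u is left out, and the combination of -c with -n is not modeled faithfully. The function name is invented.

```python
from fnmatch import fnmatchcase


def select_members(names, patterns, invert=False, first_only=False):
    """Select archive member names by pattern.

    invert models -c, first_only models -n.
    """
    if not patterns:
        return list(names)  # no pattern selects every member
    used = set()
    selected = []
    for name in names:
        matching = [p for p in patterns if fnmatchcase(name, p)]
        if first_only:
            # Each pattern may select only the first member it matches.
            matching = [p for p in matching if p not in used]
            used.update(matching)
        if bool(matching) != invert:
            selected.append(name)
    return selected
```

The names returned by such a selection would then be passed through the -s expressions and, after that, through the -i prompt.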
606
607
608 List Mode Format Specifications
609 The manual page for spax is not yet ready. The following text is a
610 quotation from the POSIX.1-2001 standard.
611
612 In list mode with the -o listopt=format option, the format argument
613 shall be applied for each selected file. The pax utility shall append a
614 <newline> to the listopt output for each selected file. The format
615 argument shall be used as the format string described in the Base Defi‐
616 nitions volume of IEEE Std 1003.1-2001, Chapter 5, File Format Nota‐
617 tion, with the exceptions 1. through 5. defined in the EXTENDED
618 DESCRIPTION section of printf(3), plus the following exceptions:
619
620 6. The sequence (keyword) can occur before a format conversion
621 specifier. The conversion argument is defined by the value of
622 keyword. The implementation shall support the following key‐
623 words:
624
625 · Any of the Field Name entries in ustar Header Block and
626 Octet-Oriented cpio Archive Entry. The implementation may
627 support the cpio keywords without the leading c_ in addi‐
628 tion to the form required by Values for cpio c_mode
629 Field.
630
631 · Any keyword defined for the extended header in pax
632 Extended Header.
633
634 · Any keyword provided as an implementation-defined exten‐
635 sion within the extended header defined in pax Extended
636 Header.
637
638 For example, the sequence "%(charset)s" is the string value of
639 the name of the character set in the extended header.
640
641 The result of the keyword conversion argument shall be the value
642 from the applicable header field or extended header, without any
643 trailing NULs.
644
645 All keyword values used as conversion arguments shall be trans‐
646 lated from the UTF-8 encoding to the character set appropriate
647 for the local file system, user database, and so on, as applica‐
648 ble.
649
650 7. An additional conversion specifier character, T, shall be used
651 to specify time formats. The T conversion specifier character
652 can be preceded by the sequence (keyword=subformat), where sub‐
653 format is a date format as defined by date operands. The default
654 keyword shall be mtime and the default subformat shall be:
655
656 %b %e %H:%M %Y
657
658 8. An additional conversion specifier character, M, shall be used
659 to specify the file mode string as defined in ls(1) Standard
660 Output. If (keyword) is omitted, the mode keyword shall be used.
661 For example, %.1M writes the single character corresponding to
662 the <entry type> field of the ls -l command.
663
664 9. An additional conversion specifier character, D, shall be used
665 to specify the device for block or special files, if applicable,
666 in an implementation-defined format. If not applicable, and
667 (keyword) is specified, then this conversion shall be equivalent
668 to %(keyword)u. If not applicable, and (keyword) is omitted,
669 then this conversion shall be equivalent to <space>.
670
671 10. An additional conversion specifier character, F, shall be used
672 to specify a pathname. The F conversion character can be pre‐
673 ceded by a sequence of comma-separated keywords:
674
675 (keyword[,keyword] ... )
676 The values for all the keywords that are non-null shall be con‐
677 catenated together, each separated by a '/'. The default shall
678 be (path) if the keyword path is defined; otherwise, the default
679 shall be (prefix, name).
680
681 11. An additional conversion specifier character, L, shall be used
to specify a symbolic link expansion. If the current file is a
683 symbolic link, then %L shall expand to:
684
685 "%s -> %s", <value of keyword>, <contents of link>
686
687 Otherwise, the %L conversion specification shall be the equivalent of
688 %F.
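The %F and %L rules above can be illustrated with a minimal sketch. The function names expand_F and expand_L, and the use of a dictionary to hold keyword values, are assumptions made for this example only; they are not part of pax. An empty or missing value is treated here as "not defined".

```python
def expand_F(values, keywords=None):
    """Join the non-null values of the selected keywords with '/'."""
    if keywords is None:
        # Default: (path) when path is defined, otherwise (prefix, name).
        keywords = ["path"] if values.get("path") else ["prefix", "name"]
    return "/".join(values[k] for k in keywords if values.get(k))


def expand_L(values, link_target=None):
    """Return '<pathname> -> <target>' for a symbolic link, else the %F result."""
    pathname = expand_F(values)
    if link_target is not None:
        return "%s -> %s" % (pathname, link_target)
    return pathname
```

Passing link_target=None stands for "the current file is not a symbolic link", in which case the result is the same as for %F.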
689
690
692 The following operands shall be supported:
693
694 directory
695 The destination directory pathname for copy mode.
696
697 file A pathname of a file to be copied or archived.
698
699 pattern
700 A pattern matching one or more pathnames of archive members. A
701 pattern must be given in the name-generating notation of the
701 pattern matching notation in Pattern Matching Notation, includ‐
702 ing the filename expansion rules in Patterns Used for Filename
704 Expansion. The default, if no pattern is specified, is to select
705 all members in the archive.
706
707
709 In write mode, the standard input shall be used only if no file oper‐
710 ands are specified. It shall be a text file containing a list of path‐
711 names, one per line, without leading or trailing <blank>s.
712
713 In list and read modes, if -f is not specified, the standard input
714 shall be an archive file.
715
716 Otherwise, the standard input shall not be used.
717
718
720 The input file named by the archive option-argument, or standard input
721 when the archive is read from there, shall be a file formatted accord‐
722 ing to one of the specifications in the EXTENDED DESCRIPTION section or
723 some other implementation-defined format.
724
725 The file /dev/tty shall be used to write prompts and read responses.
726
727
729 The following environment variables shall affect the execution of pax:
730
731 LANG Provide a default value for the internationalization variables
732 that are unset or null. (See the Base Definitions volume of IEEE
733 Std 1003.1-2001, Section 8.2, Internationalization Variables for
734 the precedence of internationalization variables used to deter‐
735 mine the values of locale categories.)
736
737 LC_ALL If set to a non-empty string value, override the values of all
738 the other internationalization variables.
739
740 LC_COLLATE
741 Determine the locale for the behavior of ranges, equivalence
742 classes, and multi-character collating elements used in the pat‐
743 tern matching expressions for the pattern operand, the basic
744 regular expression for the -s option, and the extended regular
745 expression defined for the yesexpr locale keyword in the LC_MES‐
746 SAGES category.
747
748 LC_CTYPE
749 Determine the locale for the interpretation of sequences of
750 bytes of text data as characters (for example, single-byte as
751 opposed to multi-byte characters in arguments and input files),
752 the behavior of character classes used in the extended regular
753 expression defined for the yesexpr locale keyword in the LC_MES‐
754 SAGES category, and pattern matching.
755
756 LC_MESSAGES
757 Determine the locale used to process affirmative responses and
758 the locale used to affect the format and contents of diag‐
759 nostic messages written to standard error.
760
761 LC_TIME
762 Determine the format and contents of date and time strings when
763 the -v option is specified.
764
765 NLSPATH
766 [XSI] Determine the location of message catalogs
767 for the processing of LC_MESSAGES.
768
769 TMPDIR Determine the pathname that provides part of the default global
770 extended header record file, as described for the -o globexthdr=
771 keyword in the OPTIONS section.
772
773 TZ Determine the timezone used to calculate date and time strings
774 when the -v option is specified. If TZ is unset or null, an
775 unspecified default timezone shall be used.
776
777
779 Default.
780
781
783 In write mode, if -f is not specified, the standard output shall be the
784 archive formatted according to one of the specifications in the
785 EXTENDED DESCRIPTION section, or some other implementation-defined for‐
786 mat (see -x format).
787
788 In list mode, when the -o listopt= format has been specified, the
789 selected archive members shall be written to standard output using the
790 format described under List Mode Format Specifications. In list mode
791 without the -o listopt= format option, the table of contents of the
792 selected archive members shall be written to standard output using the
793 following format:
794
795 "%s\n", <pathname>
796
797 If the -v option is specified in list mode, the table of contents of
798 the selected archive members shall be written to standard output using
799 the following formats.
800
801 For pathnames representing hard links to previous members of the ar‐
802 chive:
803
804 "%s == %s\n", <ls -l listing>, <linkname>
805
806 For all other pathnames:
807
808 "%s\n", <ls -l listing>
809
810 where <ls -l listing> shall be the format specified by the ls(1) util‐
811 ity with the -l option. When writing pathnames in this format, it is
812 unspecified what is written for fields for which the underlying archive
813 format does not have the correct information, although the correct num‐
814 ber of <blank>-separated fields shall be written.
815
816 In list mode, standard output shall not be buffered more than a line at
817 a time.
818
819
821 If -v is specified in read, write, or copy modes, pax shall write the
822 pathnames it processes to the standard error output using the following
823 format:
824
825 "%s\n", <pathname>
826
827 These pathnames shall be written as soon as processing is begun on the
828 file or archive member, and shall be flushed to standard error. The
829 trailing <newline>, which shall not be buffered, is written when the
830 file has been read or written.
831
832 If the -s option is specified, and the replacement string has a trail‐
833 ing 'p', substitutions shall be written to standard error in the fol‐
834 lowing format:
835
836 "%s >> %s\n", <original pathname>, <new pathname>
837
838 In all operating modes of pax, optional messages of unspecified format
839 concerning the input archive format and volume number, the number of
840 files, blocks, volumes, and media parts as well as other diagnostic
841 messages may be written to standard error.
842
843 In all formats, for both standard output and standard error, it is
844 unspecified how non-printable characters in pathnames or link names are
845 written.
846
847 When pax is in read mode or list mode, using the -x pax archive format,
848 and a filename, link name, owner name, or any other field in an
849 extended header record cannot be translated from the pax UTF-8 codeset
850 format to the codeset and current locale of the implementation, pax
851 shall write a diagnostic message to standard error, shall process the
852 file as described for the -o invalid= option, and then shall process
853 the next file in the archive.
854
855
857 In read mode, the extracted output files shall be of the archived file
858 type. In copy mode, the copied output files shall be the type of the
859 file being copied. In either mode, existing files in the destination
860 hierarchy shall be overwritten only when all permission (-p), modifica‐
861 tion time (-u), and invalid-value (-o invalid=) tests allow it.
862
863 In write mode, the output file named by the -f option-argument shall be
864 a file formatted according to one of the specifications in the EXTENDED
865 DESCRIPTION section, or some other implementation-defined format.
866
867
869 pax Interchange Format
870 A pax archive tape or file produced in the -x pax format shall contain
871 a series of blocks. The physical layout of the archive shall be identi‐
872 cal to the ustar format described in ustar Interchange Format. Each
873 file archived shall be represented by the following sequence:
874
875 · An optional header block with extended header records.
876 This header block is of the form described in pax Header
877 Block, with a typeflag value of x or g. The extended
878 header records, described in pax Extended Header, shall
879 be included as the data for this header block.
880
881 · A header block that describes the file. Any fields in the
882 preceding optional extended header shall override the
883 associated fields in this header block for this file.
884
885 · Zero or more blocks that contain the contents of the
886 file.
887
888 At the end of the archive file there shall be two 512-byte blocks
889 filled with binary zeros, interpreted as an end-of-archive indicator.
890
891 A schematic of an example archive with global extended header records
892 and two actual files is shown in pax Format Archive Example. In the
893 example, the second file in the archive has no extended header preced‐
894 ing it, presumably because it has no need for extended attributes.
895
896 Figure: pax Format Archive Example
897
898 ┌──────────────────────────────┬─────────────────────────────────────────────┐
899 │ustar Header [typeflag = 'g'] │ │
900 ├──────────────────────────────┤ Global Extended header │
901 │Global Extended Header Data │ │
902 ├──────────────────────────────┼─────────────────────────────────────────────┤
903 │ustar Header [typeflag = 'x'] │ │
904 ├──────────────────────────────┤ │
905 │Extended Header Data │ │
906 ├──────────────────────────────┤ File 1: Extended Header data is included │
907 │ustar Header [typeflag = '0'] │ │
908 ├──────────────────────────────┤ │
909 │Data for File 1 │ │
910 ├──────────────────────────────┼─────────────────────────────────────────────┤
911 │ustar Header [typeflag = '0'] │ │
912 ├──────────────────────────────┤ File 2: No Extended Header data is included │
913 │Data for File 2 │ │
914 ├──────────────────────────────┼─────────────────────────────────────────────┤
915 │Block of binary Zeroes │ │
916 ├──────────────────────────────┤ End of Archive Indicator │
917 │Block of binary Zeroes │ │
918 └──────────────────────────────┴─────────────────────────────────────────────┘
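The block sequence in the figure can be observed with a short sketch that walks an archive header by header. The helper name walk_typeflags is hypothetical, and the sample archive is produced with Python's standard tarfile module rather than with pax; it is assumed that tarfile emits a typeflag x header only for the member whose name is not plain ASCII.

```python
import io
import tarfile

BLOCK = 512


def walk_typeflags(archive: bytes):
    """Yield the typeflag of each header block until a zero-filled block is found."""
    pos = 0
    while pos + BLOCK <= len(archive):
        header = archive[pos:pos + BLOCK]
        if header == bytes(BLOCK):      # first block of the end-of-archive indicator
            break
        # The size field is an octal number at offset 124, length 12.
        size = int(header[124:136].rstrip(b" \0") or b"0", 8)
        yield chr(header[156])
        # Skip the header and its data, rounded up to whole 512-octet blocks.
        pos += BLOCK + (size + BLOCK - 1) // BLOCK * BLOCK


# Build a small sample archive in memory (tarfile is used only for illustration).
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tar:
    for name, data in (("caf\u00e9.txt", b"one"), ("plain.txt", b"two")):
        info = tarfile.TarInfo(name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))

flags = list(walk_typeflags(buf.getvalue()))
```

Under the stated assumption, the first member is preceded by an x header and the second is not, which mirrors File 1 and File 2 in the figure (without the global header).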
919
920 pax Header Block
921 The pax header block shall be identical to the ustar header block
922 described in ustar Interchange Format, except that two additional type‐
923 flag values are defined:
924
925 x Represents extended header records for the following file in the
926 archive (which shall have its own ustar header block). The for‐
927 mat of these extended header records shall be as described in
928 pax Extended Header.
929
930 g Represents global extended header records for the following
931 files in the archive. The format of these extended header
932 records shall be as described in pax Extended Header. Each
933 value shall affect all subsequent files that do not override
934 that value in their own extended header record and until another
935 global extended header record is reached that provides another
936 value for the same field. The typeflag g global headers should
937 not be used with interchange media that could suffer partial
938 data loss in transporting the archive.
939
940 For both of these types, the size field shall be the size of the
941 extended header records in octets. The other fields in the header block
942 are not meaningful to this version of the pax utility. However, if
943 this archive is read by a pax utility conforming to the ISO
944 POSIX-2:1993 standard, the header block fields are used to create a
945 regular file that contains the extended header records as data. There‐
946 fore, header block field values should be selected to provide reason‐
947 able file access to this regular file.
948
949 A further difference from the ustar header block is that data blocks
950 for files of typeflag 1 (the digit one) (hard link) may be included,
951 which means that the size field may be greater than zero. Archives cre‐
952 ated by pax -o linkdata shall include these data blocks with the hard
953 links.
954
955
956 pax Extended Header
957 A pax extended header contains values that are inappropriate for the
958 ustar header block because of limitations in that format: fields
959 requiring a character encoding other than that described in the ISO/IEC
960 646:1991 standard, fields representing file attributes not described in
961 the ustar header, and fields whose format or length do not fit the
962 requirements of the ustar header. The values in an extended header add
963 attributes to the following file (or files; see the description of the
964 typeflag g header block) or override values in the following header
965 block(s), as indicated in the following list of keywords.
966
967 An extended header shall consist of one or more records, each con‐
968 structed as follows:
969
970 "%d %s=%s\n", <length>, <keyword>, <value>
971
972 The extended header records shall be encoded according to the ISO/IEC
973 10646-1:2000 standard (UTF-8). The <length> field, <blank>, equals
974 sign, and <newline> shown shall be limited to the portable character
975 set, as encoded in UTF-8. The <keyword> and <value> fields can be any
976 UTF-8 characters. The <length> field shall be the decimal length of the
977 extended header record in octets, including the trailing <newline>.
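Because <length> counts its own digits, it cannot be computed in a single step. The following sketch shows one way to build a record; make_record is a name chosen for this example.

```python
def make_record(keyword: str, value: str) -> bytes:
    """Build one '<length> <keyword>=<value>\\n' record with a self-counting length."""
    body = b" " + keyword.encode("utf-8") + b"=" + value.encode("utf-8") + b"\n"
    length = len(body)
    # The length field includes its own decimal digits, so repeat until stable.
    while len(str(length)) + len(body) != length:
        length = len(str(length)) + len(body)
    return str(length).encode("ascii") + body
```

For example, the record for path=/tmp/x has a 13-octet body (blank, keyword, equals sign, value, newline) and a two-digit length, giving 15 octets in total.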
978
979 The <keyword> field shall be one of the entries from the following list
980 or a keyword provided as an implementation extension. Keywords con‐
981 sisting entirely of lowercase letters, digits, and periods are reserved
982 for future standardization. A keyword shall not include an equals sign.
983 (In the following list, the notations "file(s)" and "block(s)" are used
984 to acknowledge that a keyword affects the following single file after a
985 typeflag x extended header, but possibly multiple files after typeflag
986 g. Any requirements in the list for pax to include a record when in
987 write or copy mode shall apply only when such a record has not already
988 been provided through the use of the -o option. When used in copy mode,
989 pax shall behave as if an archive had been created with applicable
990 extended header records and then extracted.)
991
992 atime The file access time for the following file(s), equivalent to
993 the value of the st_atime member of the stat structure for a
994 file, as described by the stat(2) function. The access time
995 shall be restored if the process has the appropriate privilege
996 required to do so. The format of the <value> shall be as
997 described in pax Extended Header File Times.
998
999 charset
1000 The name of the character set used to encode the data in the
1001 following file(s). The entries in the following table are
1002 defined to refer to known standards; additional names may be
1003 agreed on between the originator and recipient.
1004
1005 ┌────────────────────────┬───────────────────────────────┐
1006 │ <value> │ Formal Standard │
1007 ├────────────────────────┼───────────────────────────────┤
1008 │ISO-IR 646 1990 │ ISO/IEC 646:1990 │
1009 │ISO-IR 8859 1 1998 │ ISO/IEC 8859-1:1998 │
1010 │ISO-IR 8859 2 1999 │ ISO/IEC 8859-2:1999 │
1011 │ISO-IR 8859 3 1999 │ ISO/IEC 8859-3:1999 │
1012 │ISO-IR 8859 4 1998 │ ISO/IEC 8859-4:1998 │
1013 │ISO-IR 8859 5 1999 │ ISO/IEC 8859-5:1999 │
1014 │ISO-IR 8859 6 1999 │ ISO/IEC 8859-6:1999 │
1015 │ISO-IR 8859 7 1987 │ ISO/IEC 8859-7:1987 │
1016 │ISO-IR 8859 8 1999 │ ISO/IEC 8859-8:1999 │
1017 │ISO-IR 8859 9 1999 │ ISO/IEC 8859-9:1999 │
1018 │ISO-IR 8859 10 1998 │ ISO/IEC 8859-10:1998 │
1019 │ISO-IR 8859 13 1998 │ ISO/IEC 8859-13:1998 │
1020 │ISO-IR 8859 14 1998 │ ISO/IEC 8859-14:1998 │
1021 │ISO-IR 8859 15 1999 │ ISO/IEC 8859-15:1999 │
1022 │ISO-IR 10646 2000 │ ISO/IEC 10646:2000 │
1023 │ISO-IR 10646 2000 UTF-8 │ ISO/IEC 10646, UTF-8 encoding │
1024 │BINARY │ None │
1025 └────────────────────────┴───────────────────────────────┘
1026 The encoding is included in an extended header for information only;
1027 when pax is used as described in IEEE Std 1003.1-2001, it shall not
1028 translate the file data into any other encoding. The BINARY entry indi‐
1029 cates unencoded binary data.
1030
1031 When used in write or copy mode, it is implementation-defined whether
1032 pax includes a charset extended header record for a file.
1033
1034 comment
1035 A series of characters used as a comment. All characters in the
1036 <value> field shall be ignored by pax.
1037
1038 gid The group ID of the group that owns the file, expressed as a
1039 decimal number using digits from the ISO/IEC 646:1991 standard.
1040 This record shall override the gid field in the following header
1041 block(s). When used in write or copy mode, pax shall include a
1042 gid extended header record for each file whose group ID is
1043 greater than 2097151 (octal 7777777).
1044
1045 gname The group of the file(s), formatted as a group name in the group
1046 database. This record shall override the gid and gname fields in
1047 the following header block(s), and any gid extended header
1048 record. When used in read, copy, or list mode, pax shall trans‐
1049 late the name from the UTF-8 encoding in the header record to
1050 the character set appropriate for the group database on the
1051 receiving system. If any of the UTF-8 characters cannot be
1052 translated, and if the -o invalid=UTF-8 option is not specified,
1053 the results are implementation-defined. When used in write or
1054 copy mode, pax shall include a gname extended header record for
1055 each file whose group name cannot be represented entirely with
1056 the letters and digits of the portable character set.
1057
1058 linkpath
1059 The pathname of a link being created to another file, of any
1060 type, previously archived. This record shall override the
1061 linkname field in the following ustar header block(s). The fol‐
1062 lowing ustar header block shall determine the type of link cre‐
1063 ated. If typeflag of the following header block is 1, it shall
1064 be a hard link. If typeflag is 2, it shall be a symbolic link
1065 and the linkpath value shall be the contents of the symbolic
1066 link. The pax utility shall translate the name of the link (con‐
1067 tents of the symbolic link) from the UTF-8 encoding to the char‐
1068 acter set appropriate for the local file system. When used in
1069 write or copy mode, pax shall include a linkpath extended header
1070 record for each link whose pathname cannot be represented
1071 entirely with the members of the portable character set other
1072 than NUL.
1073
1074 mtime The file modification time of the following file(s), equivalent
1075 to the value of the st_mtime member of the stat structure for a
1076 file, as described in the stat(2) function. This record shall
1077 override the mtime field in the following header block(s). The
1078 modification time shall be restored if the process has the
1079 appropriate privilege required to do so. The format of the
1080 <value> shall be as described in pax Extended Header File Times.
1081
1082 path The pathname of the following file(s). This record shall over‐
1083 ride the name and prefix fields in the following header
1084 block(s). The pax utility shall translate the pathname of the
1085 file from the UTF-8 encoding to the character set appropriate
1086 for the local file system.
1087
1088 When used in write or copy mode, pax shall include a path
1089 extended header record for each file whose pathname cannot be
1090 represented entirely with the members of the portable character
1091 set other than NUL.
1092
1093 realtime.any
1094 The keywords prefixed by "realtime." are reserved for future
1095 standardization.
1096
1097 security.any
1098 The keywords prefixed by "security." are reserved for future
1099 standardization.
1100
1101 size The size of the file in octets, expressed as a decimal number
1102 using digits from the ISO/IEC 646:1991 standard. This record
1103 shall override the size field in the following header block(s).
1104 When used in write or copy mode, pax shall include a size
1105 extended header record for each file with a size value greater
1106 than 8589934591 (octal 77777777777).
1107
1108 uid The user ID of the file owner, expressed as a decimal number
1109 using digits from the ISO/IEC 646:1991 standard. This record
1110 shall override the uid field in the following header block(s).
1111 When used in write or copy mode, pax shall include a uid
1112 extended header record for each file whose owner ID is greater
1113 than 2097151 (octal 7777777).
1114
1115 uname The owner of the following file(s), formatted as a user name in
1116 the user database. This record shall override the uid and uname
1117 fields in the following header block(s), and any uid extended
1118 header record. When used in read, copy, or list mode, pax shall
1119 translate the name from the UTF-8 encoding in the header record
1120 to the character set appropriate for the user database on the
1121 receiving system. If any of the UTF-8 characters cannot be
1122 translated, and if the -o invalid=UTF-8 option is not specified,
1123 the results are implementation-defined. When used in write or
1124 copy mode, pax shall include a uname extended header record for
1125 each file whose user name cannot be represented entirely with
1126 the letters and digits of the portable character set.
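The write-mode requirements for the uid, gid, size, and path keywords can be summarized in a small sketch. The function required_records is hypothetical, and the test for "members of the portable character set" is approximated here by a plain ASCII check, which is an assumption of this example.

```python
def required_records(path: str, uid: int, gid: int, size: int) -> dict:
    """Return the extended header records that must accompany a file."""
    records = {}
    if uid > 0o7777777:             # 2097151, largest value of the 8-octet uid field
        records["uid"] = str(uid)
    if gid > 0o7777777:             # same limit for the gid field
        records["gid"] = str(gid)
    if size > 0o77777777777:        # 8589934591, largest value of the 12-octet size field
        records["size"] = str(size)
    if not path.isascii():          # approximation of the portable character set test
        records["path"] = path
    return records
```

Values are returned as decimal strings because the extended header stores them in decimal, unlike the octal fields of the ustar header block.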
1127
1128 If the <value> field is zero length, it shall delete any header block
1129 field, previously entered extended header value, or global extended
1130 header value of the same name.
1131
1132 If a keyword in an extended header record (or in a -o option-argument)
1133 overrides or deletes a corresponding field in the ustar header block,
1134 pax shall ignore the contents of that header block field.
1135
1136 Unlike the ustar header block fields, NULs shall not delimit <value>s;
1137 all characters within the <value> field shall be considered data for
1138 the field. None of the length limitations of the ustar header block
1139 fields in ustar Header Block shall apply to the extended header
1140 records.
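A reader of extended header data might proceed as in the following sketch. The names parse_records and apply_records are illustrative only. Note that the value is taken verbatim up to the trailing <newline>, so embedded NULs or equals signs remain part of the value, and a zero-length value removes the attribute.

```python
def parse_records(data: bytes) -> dict:
    """Parse extended header data into a keyword -> value mapping."""
    values = {}
    pos = 0
    while pos < len(data):
        space = data.index(b" ", pos)
        length = int(data[pos:space])           # covers the whole record
        record = data[space + 1:pos + length]
        if not record.endswith(b"\n"):
            raise ValueError("record does not end with <newline>")
        # A keyword cannot contain '=', so split at the first one only.
        keyword, _, value = record[:-1].partition(b"=")
        values[keyword.decode("utf-8")] = value.decode("utf-8")
        pos += length
    return values


def apply_records(attrs: dict, records: dict) -> dict:
    """Overlay records on attrs; a zero-length value deletes the attribute."""
    result = dict(attrs)
    for keyword, value in records.items():
        if value == "":
            result.pop(keyword, None)
        else:
            result[keyword] = value
    return result
```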
1141
1142
1143 pax Extended Header Keyword Precedence
1144 This section describes the precedence in which the various header
1145 records and fields and command line options are selected to apply to a
1146 file in the archive. When pax is used in read or list modes, it shall
1147 determine a file attribute in the following sequence:
1148
1149 1. If -o delete=keyword-prefix is used, the affected
1150 attributes shall be determined from step 7, if applica‐
1151 ble, or ignored otherwise.
1152
1153 2. If -o keyword:= is used, the affected attributes shall be
1154 ignored.
1155
1156 3. If -o keyword:=value is used, the affected attribute
1157 shall be assigned the value.
1158
1159 4. If there is a typeflag x extended header record, the
1160 affected attribute shall be assigned the <value>. When
1161 extended header records conflict, the last one given in
1162 the header shall take precedence.
1163
1164 5. If -o keyword=value is used, the affected attribute shall
1165 be assigned the value.
1166
1167 6. If there is a typeflag g global extended header record,
1168 the affected attribute shall be assigned the <value>.
1169 When global extended header records conflict, the last
1170 one given in the global header shall take precedence.
1171
1172 7. Otherwise, the attribute shall be determined from the
1173 ustar header block.
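The seven steps can be read as a lookup chain. The sketch below is one possible rendering; the function resolve and its parameter names are invented for this example, each source is modeled as a dictionary, the -o delete= argument is modeled as a simple prefix match, and None stands for "ignored".

```python
def resolve(keyword, ustar, x_hdr=None, g_hdr=None,
            assign=None, override=None, delete=()):
    """Pick the value of one attribute following the precedence steps 1 to 7.

    override: values from -o keyword:=value (empty string models -o keyword:=)
    assign:   values from -o keyword=value
    delete:   prefixes given with -o delete=
    """
    x_hdr, g_hdr = x_hdr or {}, g_hdr or {}
    assign, override = assign or {}, override or {}
    if any(keyword.startswith(p) for p in delete):    # step 1
        return ustar.get(keyword)
    if keyword in override:                           # steps 2 and 3
        return override[keyword] or None
    if keyword in x_hdr:                              # step 4
        return x_hdr[keyword]
    if keyword in assign:                             # step 5
        return assign[keyword]
    if keyword in g_hdr:                              # step 6
        return g_hdr[keyword]
    return ustar.get(keyword)                         # step 7
```

The ordering shows why -o keyword=value overrides a global extended header but not a per-file one, while -o keyword:=value overrides both.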
1174
1175
1176 pax Extended Header File Times
1177 The pax utility shall write an mtime record for each file in write or
1178 copy modes if the file's modification time cannot be represented
1179 exactly in the ustar header logical record described in ustar Inter‐
1180 change Format. This can occur if the time is out of ustar range, or if
1181 the file system of the underlying implementation supports non-integer
1182 time granularities and the time is not an integer. All of these time
1183 records shall be formatted as a decimal representation of the time in
1184 seconds since the Epoch. If a period ('.') decimal point character is
1185 present, the digits to the right of the point shall represent the units
1186 of a subsecond timing granularity, where the first digit is tenths of a
1187 second and each subsequent digit is a tenth of the previous digit. In
1188 read or copy mode, the pax utility shall truncate the time of a file to
1189 the greatest value that is not greater than the input header file time.
1190 In write or copy mode, the pax utility shall output a time exactly if
1191 it can be represented exactly as a decimal number, and otherwise shall
1192 generate only enough digits so that the same time shall be recovered if
1193 the file is extracted on a system whose underlying implementation sup‐
1194 ports the same time granularity.
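As a sketch of the two directions described above: parse_time truncates a header time to a given granularity (microseconds by default, an arbitrary choice for this example), and format_time writes a time with only as many fractional digits as needed. Both names are hypothetical, and format_time assumes a non-negative seconds value with a nanosecond remainder.

```python
from decimal import Decimal, ROUND_FLOOR


def parse_time(value: str, resolution: Decimal = Decimal("0.000001")) -> Decimal:
    """Return the greatest representable time not greater than the header time."""
    return Decimal(value).quantize(resolution, rounding=ROUND_FLOOR)


def format_time(seconds: int, nanoseconds: int) -> str:
    """Format seconds since the Epoch, dropping trailing zero digits."""
    if nanoseconds == 0:
        return str(seconds)
    return "%d.%s" % (seconds, ("%09d" % nanoseconds).rstrip("0"))
```

Rounding toward negative infinity (rather than toward zero) is what keeps the result "not greater than" the input for times before the Epoch.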
1195
1196
1197 ustar Interchange Format
1198 A ustar archive tape or file shall contain a series of logical records.
1199 Each logical record shall be a fixed-size logical record of 512 octets
1200 (see below). Although this format may be thought of as being stored on
1201 9-track industry-standard 12.7 mm (0.5 in) magnetic tape, other types
1202 of transportable media are not excluded. Each file archived shall be
1203 represented by a header logical record that describes the file, fol‐
1204 lowed by zero or more logical records that give the contents of the
1205 file. At the end of the archive file there shall be two 512-octet logi‐
1206 cal records filled with binary zeros, interpreted as an end-of-archive
1207 indicator.
1208
1209 The logical records may be grouped for physical I/O operations, as
1210 described under the -b blocksize and -x ustar options. Each group of
1211 logical records may be written with a single operation equivalent to
1212 the write(2) function. On magnetic tape, the result of this write shall
1213 be a single tape physical block. The last physical block shall always
1214 be the full size, so logical records after the two zero logical records
1215 may contain undefined data.
1216
1217 The header logical record shall be structured as shown in the following
1218 table. All lengths and offsets are in decimal.
1219
1220 Table: ustar Header Block
1221
1222 ┌───────────┬──────────────┬────────────────────┐
1223 │Field Name │ Octet Offset │ Length (in Octets) │
1224 ├───────────┼──────────────┼────────────────────┤
1225 │name │ 0 │ 100 │
1226 │mode │ 100 │ 8 │
1227 │uid │ 108 │ 8 │
1228 │gid │ 116 │ 8 │
1229 │size │ 124 │ 12 │
1230 │mtime │ 136 │ 12 │
1231 │chksum │ 148 │ 8 │
1232 │typeflag │ 156 │ 1 │
1233 │linkname │ 157 │ 100 │
1234 │magic │ 257 │ 6 │
1235 │version │ 263 │ 2 │
1236 │uname │ 265 │ 32 │
1237 │gname │ 297 │ 32 │
1238 │devmajor │ 329 │ 8 │
1239 │devminor │ 337 │ 8 │
1240 │prefix │ 345 │ 155 │
1241 └───────────┴──────────────┴────────────────────┘
1242 All characters in the header logical record shall be represented in the
1243 coded character set of the ISO/IEC 646:1991 standard. For maximum
1244 portability between implementations, names should be selected from
1245 characters represented by the portable filename character set as octets
1246 with the most significant bit zero. If an implementation supports the
1247 use of characters outside of slash and the portable filename character
1248 set in names for files, users, and groups, one or more implementation-
1249 defined encodings of these characters shall be provided for interchange
1250 purposes.
1251
1252 However, the pax utility shall never create filenames on the local sys‐
1253 tem that cannot be accessed via the procedures described in IEEE Std
1254 1003.1-2001. If a filename is found on the medium that would create an
1255 invalid filename, it is implementation-defined whether the data from
1256 the file is stored on the file hierarchy and under what name it is
1257 stored. The pax utility may choose to ignore these files as long as it
1258 produces an error indicating that the file is being ignored.
1259
1260 Each field within the header logical record is contiguous; that is,
1261 there is no padding used. Each character on the archive medium shall be
1262 stored contiguously.
1263
1264 The fields magic, uname, and gname are character strings each termi‐
1265 nated by a NUL character. The fields name, linkname, and prefix are
1266 NUL-terminated character strings except when every octet in the
1267 array, including the last one, is non-NUL. The version field is
1268 two octets containing the characters "00" (zero-zero).
1269 The typeflag contains a single character. All other fields are leading
1270 zero-filled octal numbers using digits from the ISO/IEC 646:1991 stan‐
1271 dard IRV. Each numeric field is terminated by one or more <space> or
1272 NUL characters.
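The header layout and field encodings can be expressed with a fixed-size structure. The following sketch uses Python's struct module; the names USTAR, NAMES, NUMERIC, and parse_header are chosen for this example. The 16 fields occupy 500 octets, so 12 pad octets complete the 512-octet logical record.

```python
import struct

# Field lengths taken from the ustar Header Block table, plus 12 pad octets.
USTAR = struct.Struct("=100s8s8s8s12s12s8sc100s6s2s32s32s8s8s155s12x")
NAMES = ("name", "mode", "uid", "gid", "size", "mtime", "chksum", "typeflag",
         "linkname", "magic", "version", "uname", "gname", "devmajor",
         "devminor", "prefix")
NUMERIC = {"mode", "uid", "gid", "size", "mtime", "chksum", "devmajor", "devminor"}


def parse_header(block: bytes) -> dict:
    """Decode one 512-octet header logical record into a dictionary."""
    fields = {}
    for key, raw in zip(NAMES, USTAR.unpack(block)):
        if key in NUMERIC:
            # Octal digits terminated by one or more <space> or NUL characters.
            digits = raw.rstrip(b" \0")
            fields[key] = int(digits, 8) if digits else 0
        elif key in ("typeflag", "version"):
            fields[key] = raw.decode("ascii")
        else:
            # Character string, cut at the first NUL if there is one.
            fields[key] = raw.split(b"\0", 1)[0].decode("ascii")
    return fields
```

An empty numeric field is decoded as zero here; that choice is a convenience of the sketch, not something the format requires.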
1273
1274 The name and the prefix fields shall produce the pathname of the file.
1275 A new pathname shall be formed, if prefix is not an empty string (its
1276 first character is not NUL), by concatenating prefix (up to the first
1277 NUL character), a slash character, and name; otherwise, name is used
1278 alone. In either case, name is terminated at the first NUL character.
1279 If prefix begins with a NUL character, it shall be ignored. In this
1280 manner, pathnames of at most 256 characters can be supported. If a
1281 pathname does not fit in the space provided, pax shall notify the user
1282 of the error, and shall not store any part of the file (header or data)
1283 on the medium.
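The joining rule, and the corresponding split needed when writing, can be sketched as follows. Both function names are hypothetical. The split shown picks the rightmost slash that keeps the prefix within 155 octets; other split points may also be valid.

```python
def ustar_pathname(name: bytes, prefix: bytes) -> bytes:
    """Form the pathname from the raw name and prefix header fields."""
    name = name.split(b"\0", 1)[0]
    prefix = prefix.split(b"\0", 1)[0]
    return prefix + b"/" + name if prefix else name


def split_pathname(path: bytes):
    """Return (prefix, name) fitting 155 and 100 octets, or raise ValueError."""
    if len(path) <= 100:
        return b"", path
    # Rightmost slash that leaves a prefix of at most 155 octets and a
    # non-empty name; moving further left would only lengthen the name.
    cut = path.rfind(b"/", 0, min(156, len(path) - 1))
    if cut > 0 and len(path) - cut - 1 <= 100:
        return path[:cut], path[cut + 1:]
    raise ValueError("pathname does not fit in the ustar header")
```

The maximum of 256 characters arises from 155 octets of prefix, the implied slash, and 100 octets of name.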
1284
1285 The linkname field, described below, shall not use the prefix to pro‐
1286 duce a pathname. As such, a linkname is limited to 100 characters. If
1287 the name does not fit in the space provided, pax shall notify the user
1288 of the error, and shall not attempt to store the link on the medium.
1289
1290 The mode field provides 12 bits encoded in the ISO/IEC 646:1991 stan‐
1291 dard octal digit representation. The encoded bits shall represent the
1292 following values:
1293
1294 Table: ustar mode Field
1295
1296 ┌──────┬─────────────────┬─────────────────────────────────────────────────┐
1297 │ Bit │ IEEE Std │ Description │
1298 │Value │ 1003.1-2001 Bit │ │
1299 ├──────┼─────────────────┼─────────────────────────────────────────────────┤
1300 │04000 │ S_ISUID │ Set UID on execution. │
1301 │02000 │ S_ISGID │ Set GID on execution. │
1302 │01000 │ <reserved> │ Reserved for future standardization. │
1303 │00400 │ S_IRUSR │ Read permission for file owner class. │
1304 │00200 │ S_IWUSR │ Write permission for file owner class. │
1305 │00100 │ S_IXUSR │ Execute/search permission for file owner class. │
1306 │00040 │ S_IRGRP │ Read permission for file group class. │
1307 │00020 │ S_IWGRP │ Write permission for file group class. │
1308 │00010 │ S_IXGRP │ Execute/search permission for file group class. │
1309 │00004 │ S_IROTH │ Read permission for file other class. │
1310 │00002 │ S_IWOTH │ Write permission for file other class. │
1311 │00001 │ S_IXOTH │ Execute/search permission for file other class. │
1312 └──────┴─────────────────┴─────────────────────────────────────────────────┘
1313 When appropriate privilege is required to set one of these mode bits,
1314 and the user restoring the files from the archive does not have the
1315 appropriate privilege, the mode bits for which the user does not have
1316 appropriate privilege shall be ignored. Some of the mode bits in the
1317 archive format are not mentioned elsewhere in this volume of IEEE Std
1318 1003.1-2001. If the implementation does not support those bits, they
1319 may be ignored.
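For illustration, the mode field can be decoded into the symbolic names of the table as shown below. MODE_BITS and decode_mode are names used only in this sketch; the reserved bit 01000 is deliberately left out.

```python
MODE_BITS = [
    (0o4000, "S_ISUID"), (0o2000, "S_ISGID"),
    (0o0400, "S_IRUSR"), (0o0200, "S_IWUSR"), (0o0100, "S_IXUSR"),
    (0o0040, "S_IRGRP"), (0o0020, "S_IWGRP"), (0o0010, "S_IXGRP"),
    (0o0004, "S_IROTH"), (0o0002, "S_IWOTH"), (0o0001, "S_IXOTH"),
]


def decode_mode(field: bytes) -> list:
    """Return the names of the bits set in an octal mode header field."""
    mode = int(field.rstrip(b" \0"), 8)
    return [name for bit, name in MODE_BITS if mode & bit]
```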
1320
1321 The uid and gid fields are the user and group ID of the owner and group
1322 of the file, respectively.
1323
1324 The size field is the size of the file in octets. If the typeflag field
1325 is set to specify a file to be of type 1 (a link) or 2 (a symbolic
1326 link), the size field shall be specified as zero. If the typeflag field
1327 is set to specify a file of type 5 (directory), the size field shall be
1328 interpreted as described under the definition of that record type. No
1329 data logical records are stored for types 1, 2, or 5. If the typeflag
1330 field is set to 3 (character special file), 4 (block special file), or
1331 6 (FIFO), the meaning of the size field is unspecified by this volume
1332 of IEEE Std 1003.1-2001, and no data logical records shall be stored on
1333 the medium. Additionally, for type 6, the size field shall be ignored
1334 when reading. If the typeflag field is set to any other value, the num‐
1335 ber of logical records written following the header shall be
1336 (size+511)/512, ignoring any fraction in the result of the division.
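
The record-count rule above can be illustrated with a short Python sketch. The helper name data_records and the constant names are hypothetical and not part of pax; the set of typeflag values that carry no data is taken from the paragraph above.

```python
# Typeflags for which no data logical records are stored, per the text:
# 1 (link), 2 (symbolic link), 3 (character special), 4 (block special),
# 5 (directory), 6 (FIFO).
NO_DATA_TYPEFLAGS = {"1", "2", "3", "4", "5", "6"}

RECORD_SIZE = 512  # octets per logical record


def data_records(size, typeflag):
    """Return the number of data logical records that follow the header."""
    if typeflag in NO_DATA_TYPEFLAGS:
        return 0
    # (size + 511) / 512, ignoring any fraction in the result of the division.
    return (size + RECORD_SIZE - 1) // RECORD_SIZE


print(data_records(0, "0"))     # 0
print(data_records(1, "0"))     # 1
print(data_records(512, "0"))   # 1
print(data_records(513, "0"))   # 2
print(data_records(4096, "5"))  # 0
```

The integer division rounds the file size up to the next whole 512-octet record, so a one-octet file still occupies a full record.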
1337
1338 The mtime field shall be the modification time of the file at the time
1339 it was archived. It is the ISO/IEC 646:1991 standard representation of
1340 the octal value of the modification time obtained from the stat(2)
1341 function.
1342
1343 The chksum field shall be the ISO/IEC 646:1991 standard IRV representa‐
1344 tion of the octal value of the simple sum of all octets in the header
1345 logical record. Each octet in the header shall be treated as an
1346 unsigned value. These values shall be added to an unsigned integer,
1347 initialized to zero, the precision of which is not less than 17 bits.
1348 When calculating the checksum, the chksum field is treated as if it
1349 were all spaces.
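
A minimal Python sketch of the checksum rule follows. It assumes the standard ustar header layout, in which the chksum field occupies 8 octets starting at offset 148 of the 512-octet header; the function name ustar_checksum is hypothetical.

```python
HEADER_SIZE = 512
CHKSUM_OFFSET = 148  # assumed position of chksum in the ustar header layout
CHKSUM_LENGTH = 8


def ustar_checksum(header):
    """Sum all header octets as unsigned values, with chksum read as spaces."""
    if len(header) != HEADER_SIZE:
        raise ValueError("a ustar header is exactly 512 octets")
    blanked = (
        bytes(header[:CHKSUM_OFFSET])
        + b" " * CHKSUM_LENGTH
        + bytes(header[CHKSUM_OFFSET + CHKSUM_LENGTH:])
    )
    return sum(blanked)


header = bytearray(HEADER_SIZE)
header[0:5] = b"hello"         # a name field starting with "hello"
print(ustar_checksum(header))  # 788

# Whatever is already stored in the chksum field does not affect the result.
header[CHKSUM_OFFSET:CHKSUM_OFFSET + CHKSUM_LENGTH] = b"1234567\0"
print(ustar_checksum(header))  # 788
```

The largest possible sum is 512 * 255 = 130560, which is below 2^17 = 131072. This is why an unsigned integer of at least 17 bits is sufficient.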
1350
1351 The typeflag field specifies the type of file archived. If a particular
1352 implementation does not recognize the type, or the user does not have
1353 appropriate privilege to create that type, the file shall be extracted
1354 as if it were a regular file if the file type is defined to have a
1355 meaning for the size field that could cause data logical records to be
1356 written on the medium (see the previous description for size). If con‐
1357 version to a regular file occurs, the pax utility shall produce an
1358 error indicating that the conversion took place. All of the typeflag
1359 fields shall be coded in the ISO/IEC 646:1991 standard IRV:
1360
1361 0 Represents a regular file. For backwards-compatibility, a type‐
1362 flag value of binary zero ('\0') should be recognized as meaning
1363 a regular file when extracting files from the archive. Archives
1364 written with this version of the archive file format create
1365 regular files with a typeflag value of the ISO/IEC 646:1991 standard
1366 IRV '0'.
1367
1368 1 Represents a file linked to another file, of any type, previ‐
1369 ously archived. Such files are identified by having the same
1370 device and file serial numbers, and pathnames that refer to dif‐
1371 ferent directory entries. All such files shall be archived as
1372 linked files. The linked-to name is specified in the linkname
1373 field with a NUL-character terminator if it is less than 100
1374 octets in length.
1375
1376 2 Represents a symbolic link. The contents of the symbolic link
1377 shall be stored in the linkname field.
1378
1379 3,4 Represent character special files and block special files
1380 respectively. In this case the devmajor and devminor fields
1381 shall contain information defining the device, the format of
1382 which is unspecified by this volume of IEEE Std 1003.1-2001.
1383 Implementations may map the device specifications to their own
1384 local specification or may ignore the entry.
1385
1386 5 Specifies a directory or subdirectory. On systems where disk
1387 allocation is performed on a directory basis, the size field
1388 shall contain the maximum number of octets (which may be rounded
1389 to the nearest disk block allocation unit) that the directory
1390 may hold. A size field of zero indicates no such limiting. Sys‐
1391 tems that do not support limiting in this manner should ignore
1392 the size field.
1393
1394 6 Specifies a FIFO special file. Note that the archiving of a FIFO
1395 file archives the existence of this file and not its contents.
1396
1397 7 Reserved to represent a file to which an implementation has
1398 associated some high-performance attribute. Implementations
1399 without such extensions should treat this file as a regular file
1400 (type 0).
1401
1402 A-Z The letters 'A' to 'Z', inclusive, are reserved for custom
1403 implementations. All other values are reserved for future ver‐
1404 sions of IEEE Std 1003.1-2001.
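
The typeflag values listed above can be summarized in a small lookup, shown here as a Python sketch. The function name describe_typeflag and the description strings are illustrative only. The sketch assumes an implementation without high-performance file extensions, so type 7 is reported as a regular file.

```python
TYPEFLAGS = {
    "0": "regular file",
    "1": "link",
    "2": "symbolic link",
    "3": "character special file",
    "4": "block special file",
    "5": "directory",
    "6": "FIFO",
    "7": "regular file",  # assumed: no high-performance extension available
}


def describe_typeflag(flag):
    """Classify a one-character typeflag value."""
    if flag == "\0":
        # Binary zero is accepted as a regular file for backwards-compatibility.
        return TYPEFLAGS["0"]
    if flag in TYPEFLAGS:
        return TYPEFLAGS[flag]
    if len(flag) == 1 and "A" <= flag <= "Z":
        return "custom implementation"
    return "reserved"


print(describe_typeflag("0"))   # regular file
print(describe_typeflag("\0"))  # regular file
print(describe_typeflag("5"))   # directory
print(describe_typeflag("K"))   # custom implementation
print(describe_typeflag("9"))   # reserved
```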
1405
1406 It is unspecified whether files with pathnames that refer to the same
1407 directory entry are archived as linked files or as separate files. If
1408 they are archived as linked files, this means that attempting to
1409 extract both pathnames from the resulting archive will always cause an
1410 error (unless the -u option is used) because the link cannot be cre‐
1411 ated.
1412
1413 It is unspecified whether files with the same device and file serial
1414 numbers being appended to an archive are treated as linked files to
1415 members that were in the archive before the append.
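
As an illustration of how linked files are identified by device and file serial numbers, the following Python sketch marks every later pathname with an already seen (device, serial) pair as a type 1 entry. The function name mark_links is hypothetical, and the sketch assumes for simplicity that every first occurrence is a regular file (type 0).

```python
def mark_links(entries):
    """Map (pathname, device, serial) tuples to (pathname, typeflag, linkname)."""
    first_seen = {}
    result = []
    for path, dev, ino in entries:
        key = (dev, ino)
        if key in first_seen:
            # Same device and file serial number as an earlier member:
            # archive as a link that names the earlier pathname.
            result.append((path, "1", first_seen[key]))
        else:
            first_seen[key] = path
            result.append((path, "0", ""))
    return result


members = [("a", 1, 10), ("b", 1, 11), ("c", 1, 10)]
for member in mark_links(members):
    print(member)
# ('a', '0', '')
# ('b', '0', '')
# ('c', '1', 'a')
```

When appending to an existing archive, an implementation may start with an empty table, which corresponds to the unspecified behavior described above.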
1416
1417 Attempts to archive a socket using ustar interchange format shall pro‐
1418 duce a diagnostic message. Handling of other file types is implementa‐
1419 tion-defined.
1420
1421 The magic field is the specification that this archive was output in
1422 this archive format. If this field contains ustar (the five characters
1423 from the ISO/IEC 646:1991 standard IRV shown followed by NUL), the
1424 uname and gname fields shall contain the ISO/IEC 646:1991 standard IRV
1425 representation of the owner and group of the file, respectively (trun‐
1426 cated to fit, if necessary). When the file is restored by a privi‐
1427 leged, protection-preserving version of the utility, the user and group
1428 databases shall be scanned for these names. If found, the user and
1429 group IDs contained within these files shall be used rather than the
1430 values contained within the uid and gid fields.
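
The preference for names over numeric IDs can be sketched in Python as follows. The function resolve_id and its lookup parameter are hypothetical; the lookup is passed in so that the example does not depend on the local user database.

```python
def resolve_id(name, numeric_id, lookup):
    """Return the ID for name if the database knows it, else numeric_id.

    lookup is a callable that maps a name to an ID and raises KeyError
    for unknown names.
    """
    if name:
        try:
            return lookup(name)
        except KeyError:
            pass
    # Name missing or not found: fall back to the uid or gid field.
    return numeric_id


# A stand-in user database for the example.
users = {"alice": 1001}

print(resolve_id("alice", 500, users.__getitem__))  # 1001
print(resolve_id("bob", 500, users.__getitem__))    # 500
print(resolve_id("", 500, users.__getitem__))       # 500
```

On a POSIX system the lookup could be something like lambda n: pwd.getpwnam(n).pw_uid, since pwd.getpwnam raises KeyError for an unknown name.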
1431
1432
1433 cpio Interchange Format
1434 The octet-oriented cpio archive format shall be a series of entries,
1435 each comprising a header that describes the file, the name of the file,
1436 and then the contents of the file.
1437
1438 An archive may be recorded as a series of fixed-size blocks of octets.
1439 This blocking shall be used only to make physical I/O more efficient.
1440 The last group of blocks shall always be at the full size.
1441
1442 For the octet-oriented cpio archive format, the individual entry infor‐
1443 mation shall be in the order indicated and described by the following
1444 table; see also the <cpio.h> header.
1445
1446 Table: Octet-Oriented cpio Archive Entry
1447
1448 ┌─────────────────────┬────────────────────┬─────────────────┐
1449 │ Header Field Name │ Length (in Octets) │ Interpreted as │
1450 ├─────────────────────┼────────────────────┼─────────────────┤
1451 │c_magic │ 6 │ Octal number │
1452 │c_dev │ 6 │ Octal number │
1453 │c_ino │ 6 │ Octal number │
1454 │c_mode │ 6 │ Octal number │
1455 │c_uid │ 6 │ Octal number │
1456 │c_gid │ 6 │ Octal number │
1457 │c_nlink │ 6 │ Octal number │
1458 │c_rdev │ 6 │ Octal number │
1459 │c_mtime │ 11 │ Octal number │
1460 │c_namesize │ 6 │ Octal number │
1461 │c_filesize │ 11 │ Octal number │
1462 │ │ │ │
1463 │Filename Field Name │ Length │ Interpreted as │
1464 │c_name │ c_namesize │ Pathname string │
1465 │ │ │ │
1466 │File Data Field Name │ Length │ Interpreted as │
1467 │c_filedata │ c_filesize │ Data │
1468 └─────────────────────┴────────────────────┴─────────────────┘
1469 cpio Header
1470 For each file in the archive, a header as defined previously shall be
1471 written. The information in the header fields is written as streams of
1472 the ISO/IEC 646:1991 standard characters interpreted as octal numbers.
1473 The octal numbers shall be extended to the necessary length by append‐
1474 ing the ISO/IEC 646:1991 standard IRV zeros at the most-significant-
1475 digit end of the number; the result is written to the most-significant
1476 digit of the stream of octets first. The fields shall be interpreted as
1477 follows:
1478
1479 c_magic
1480 Identify the archive as being a transportable archive by con‐
1481 taining the identifying value "070707".
1482
1483 c_dev, c_ino
1484 Contains values that uniquely identify the file within the ar‐
1485 chive (that is, no files contain the same pair of c_dev and
1486 c_ino values unless they are links to the same file). The values
1487 shall be determined in an unspecified manner.
1488
1489 c_mode Contains the file type and access permissions as defined in the
1490 following table.
1491
1492 Table: Values for cpio c_mode Field
1493
1494 ┌──────────────────────┬─────────┬────────────────────────┐
1495 │File Permissions Name │ Value │ Indicates │
1496 ├──────────────────────┼─────────┼────────────────────────┤
1497 │C_IRUSR │ 000400 │ Read by owner │
1498 │C_IWUSR │ 000200 │ Write by owner │
1499 │C_IXUSR │ 000100 │ Execute by owner │
1500 │C_IRGRP │ 000040 │ Read by group │
1501 │C_IWGRP │ 000020 │ Write by group │
1502 │C_IXGRP │ 000010 │ Execute by group │
1503 │C_IROTH │ 000004 │ Read by others │
1504 │C_IWOTH │ 000002 │ Write by others │
1505 │C_IXOTH │ 000001 │ Execute by others │
1506 │C_ISUID │ 004000 │ Set uid │
1507 │C_ISGID │ 002000 │ Set gid │
1508 │C_ISVTX │ 001000 │ Reserved │
1509 ├──────────────────────┼─────────┼────────────────────────┤
1510 │File Type Name │ Value │ Indicates │
1511 ├──────────────────────┼─────────┼────────────────────────┤
1512 │C_ISDIR │ 0040000 │ Directory │
1513 │C_ISFIFO │ 0010000 │ FIFO │
1514 │C_ISREG │ 0100000 │ Regular file │
1515 │C_ISLNK │ 0120000 │ Symbolic link │
1516 │C_ISBLK │ 0060000 │ Block special file │
1517 │C_ISCHR │ 0020000 │ Character special file │
1518 │C_ISSOCK │ 0140000 │ Socket │
1519 │C_ISCTG │ 0110000 │ Reserved │
1520 └──────────────────────┴─────────┴────────────────────────┘
1521 Directories, FIFOs, symbolic links, and regular files shall be
1522 supported on a system conforming to this volume of IEEE Std
1523 1003.1-2001; additional values defined previously are reserved
1524 for compatibility with existing systems. Additional file types
1525 may be supported; however, such files should not be written to
1526 archives intended to be transported to other systems.
1527
1528 c_uid Contains the user ID of the owner.
1529
1530 c_gid Contains the group ID of the group.
1531
1532 c_nlink
1533 Contains a number greater than or equal to the number of links
1534 in the archive referencing the file. If the -a option is used to
1535 append to a cpio archive, then the pax utility need not account
1536 for the files in the existing part of the archive when calculat‐
1537 ing the c_nlink values for the appended part of the archive, and
1538 need not alter the c_nlink values in the existing part of the
1539 archive if additional files with the same c_dev and c_ino values
1540 are appended to the archive.
1541
1542 c_rdev Contains implementation-defined information for character or
1543 block special files.
1544
1545 c_mtime
1546 Contains the latest time of modification of the file at the time
1547 the archive was created.
1548
1549 c_namesize
1550 Contains the length of the pathname, including the terminating
1551 NUL character.
1552
1553 c_filesize
1554 Contains the length of the file in octets. This shall be the
1555 length of the data section following the header structure.
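
The header layout and the c_mode file type values above can be exercised with a short Python sketch. The function names and the TYPE_MASK constant are assumptions made for illustration; the mask is chosen only because it covers every file type value in the c_mode table and is not defined by the text.

```python
CPIO_FIELDS = [
    ("c_magic", 6), ("c_dev", 6), ("c_ino", 6), ("c_mode", 6),
    ("c_uid", 6), ("c_gid", 6), ("c_nlink", 6), ("c_rdev", 6),
    ("c_mtime", 11), ("c_namesize", 6), ("c_filesize", 11),
]
HEADER_LENGTH = sum(width for _, width in CPIO_FIELDS)  # 76 octets

TYPE_MASK = 0o170000  # assumed mask covering all file type values
FILE_TYPES = {
    0o040000: "directory", 0o010000: "FIFO", 0o100000: "regular file",
    0o120000: "symbolic link", 0o060000: "block special file",
    0o020000: "character special file", 0o140000: "socket",
}


def pack_header(values):
    """Write each field as zero-padded octal digits, most significant first."""
    parts = []
    for name, width in CPIO_FIELDS:
        text = format(values[name], "o").rjust(width, "0")
        if len(text) > width:
            raise ValueError(name + " does not fit in its field")
        parts.append(text)
    return "".join(parts).encode("ascii")


def parse_header(raw):
    """Split a 76-octet header into integer fields and check the magic."""
    if len(raw) != HEADER_LENGTH:
        raise ValueError("a cpio header is exactly 76 octets")
    fields, pos = {}, 0
    for name, width in CPIO_FIELDS:
        fields[name] = int(raw[pos:pos + width].decode("ascii"), 8)
        pos += width
    if fields["c_magic"] != 0o070707:
        raise ValueError("not a transportable cpio archive")
    return fields


def file_type(mode):
    return FILE_TYPES.get(mode & TYPE_MASK, "unknown")


values = {"c_magic": 0o070707, "c_dev": 1, "c_ino": 42, "c_mode": 0o100644,
          "c_uid": 1000, "c_gid": 1000, "c_nlink": 1, "c_rdev": 0,
          "c_mtime": 0, "c_namesize": 4, "c_filesize": 5}
raw = pack_header(values)
print(len(raw))                              # 76
print(raw[:24])                              # b'070707000001000052100644'
print(file_type(parse_header(raw)["c_mode"]))  # regular file
```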
1556
1557
1558 cpio Filename
1559 The c_name field shall contain the pathname of the file. The length of
1560 this field in octets is the value of c_namesize.
1561
1562 If a filename is found on the medium that would create an invalid path‐
1563 name, it is implementation-defined whether the data from the file is
1564 stored on the file hierarchy and under what name it is stored.
1565
1566 All characters shall be represented in the ISO/IEC 646:1991 standard
1567 IRV. For maximum portability between implementations, names should be
1568 selected from characters represented by the portable filename character
1569 set as octets with the most significant bit zero. If an implementation
1570 supports the use of characters outside the portable filename character
1571 set in names for files, users, and groups, one or more implementation-
1572 defined encodings of these characters shall be provided for interchange
1573 purposes. However, the pax utility shall never create filenames on the
1574 local system that cannot be accessed via the procedures described pre‐
1575 viously in this volume of IEEE Std 1003.1-2001. If a filename is found
1576 on the medium that would create an invalid filename, it is implementa‐
1577 tion-defined whether the data from the file is stored on the local file
1578 system and under what name it is stored. The pax utility may choose to
1579 ignore these files as long as it produces an error indicating that the
1580 file is being ignored.
1581
1582
1583 cpio File Data
1584 Following c_name, there shall be c_filesize octets of data. Interpre‐
1585 tation of such data occurs in a manner dependent on the file. If
1586 c_filesize is zero, no data shall be contained in c_filedata.
1587
1588 When restoring from an archive:
1589
1590 · If the user does not have the appropriate privilege to create a
1591 file of the specified type, pax shall ignore the entry and write
1592 an error message to standard error.
1593
1594 · Only regular files have data to be restored. Presuming a regular
1595 file meets any selection criteria that might be imposed on the
1596 format-reading utility by the user, such data shall be restored.
1597
1598 · If a user does not have appropriate privilege to set a particu‐
1599 lar mode flag, the flag shall be ignored. Some of the mode flags
1600 in the archive format are not mentioned elsewhere in this volume
1601 of IEEE Std 1003.1-2001. If the implementation does not support
1602 those flags, they may be ignored.
1603
1604
1605 cpio Special Entries
1606 FIFO special files, directories, and the trailer shall be recorded with
1607 c_filesize equal to zero. For other special files, c_filesize is
1608 unspecified by this volume of IEEE Std 1003.1-2001. The header for the
1609 next file entry in the archive shall be written directly after the last
1610 octet of the file entry preceding it. A header denoting the filename
1611 TRAILER!!! shall indicate the end of the archive; the contents of
1612 octets in the last block of the archive following such a header are
1613 undefined.
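
The following Python sketch shows one way a reader might walk the entries of an octet-oriented cpio archive until it reaches the TRAILER!!! entry. The helper names make_entry and iter_entries are hypothetical, and the sketch ignores blocking, privileges, and error recovery.

```python
WIDTHS = [6, 6, 6, 6, 6, 6, 6, 6, 11, 6, 11]  # field widths in header order
HEADER_LENGTH = sum(WIDTHS)                   # 76 octets
NAMESIZE_INDEX, FILESIZE_INDEX = 9, 10


def make_entry(name, data=b"", mode=0o100644, ino=0):
    """Build one entry: header, NUL-terminated name, then file data."""
    name_bytes = name.encode("ascii") + b"\0"
    values = [0o070707, 0, ino, mode, 0, 0, 1, 0, 0, len(name_bytes), len(data)]
    header = "".join(format(v, "o").rjust(w, "0") for v, w in zip(values, WIDTHS))
    return header.encode("ascii") + name_bytes + data


def iter_entries(archive):
    """Yield (name, data) pairs and stop at the TRAILER!!! entry."""
    pos = 0
    while True:
        header = archive[pos:pos + HEADER_LENGTH]
        if len(header) < HEADER_LENGTH:
            raise ValueError("archive ended before the trailer")
        fields, offset = [], 0
        for width in WIDTHS:
            fields.append(int(header[offset:offset + width].decode("ascii"), 8))
            offset += width
        pos += HEADER_LENGTH
        namesize, filesize = fields[NAMESIZE_INDEX], fields[FILESIZE_INDEX]
        name = archive[pos:pos + namesize - 1].decode("ascii")  # drop the NUL
        pos += namesize
        if name == "TRAILER!!!":
            return  # octets after the trailer are undefined and are ignored
        yield name, archive[pos:pos + filesize]
        pos += filesize  # the next header follows directly


archive = (
    make_entry("a.txt", b"hello", ino=1)
    + make_entry("dir", mode=0o040755, ino=2)
    + make_entry("TRAILER!!!", mode=0)
    + b"\0" * 20  # padding in the last block
)
print(list(iter_entries(archive)))  # [('a.txt', b'hello'), ('dir', b'')]
```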
1614
1615
1617 The following exit values shall be returned:
1618
1619 0 All files were processed successfully.
1620
1621 >0 An error occurred.
1622
1623
1625 If pax cannot create a file or a link when reading an archive or cannot
1626 find a file when writing an archive, or cannot preserve the user ID,
1627 group ID, or file mode when the -p option is specified, a diagnostic
1628 message shall be written to standard error and a non-zero exit status
1629 shall be returned, but processing shall continue. In the case where pax
1630 cannot create a link to a file, pax shall not, by default, create a
1631 second copy of the file.
1632
1633 If the extraction of a file from an archive is prematurely terminated
1634 by a signal or error, pax may have only partially extracted the file or
1635 (if the -n option was not specified) may have extracted a file of the
1636 same name as that specified by the user, but which is not the file the
1637 user wanted. Additionally, the file modes of extracted directories may
1638 have additional bits from the S_IRWXU mask set as well as incorrect
1639 modification and access times.
1640
1641
1642_________________________________________________________________
1644
1645
1647 Caution is advised when using the -a option to append to a cpio format
1648 archive. If any of the files being appended happen to be given the same
1649 c_dev and c_ino values as a file in the existing part of the archive,
1650 then they may be treated as links to that file on extraction. Thus, it
1651 is risky to use -a with cpio format except when it is done on the same
1652 system that the original archive was created on, and with the same pax
1653 utility, and in the knowledge that there has been little or no file
1654 system activity since the original archive was created that could lead
1655 to any of the files appended being given the same c_dev and c_ino val‐
1656 ues as an unrelated file in the existing part of the archive. Also,
1657 when (intentionally) appending additional links to a file in the exist‐
1658 ing part of the archive, the c_nlink values in the modified archive can
1659 be smaller than the number of links to the file in the archive, which
1660 may mean that the links are not preserved on extraction.
1661
1662 The -p (privileges) option was invented to reconcile differences
1663 between historical tar and cpio implementations. In particular, the two
1664 utilities use -m in diametrically opposed ways. The -p option also pro‐
1665 vides a consistent means of extending the ways in which future file
1666 attributes can be addressed, such as for enhanced security systems or
1667 high-performance files. Although it may seem complex, there are really
1668 two modes that are most commonly used:
1669
1670 -p e "Preserve everything". This would be used by the historical
1671 superuser, someone with all the appropriate privileges, to pre‐
1672 serve all aspects of the files as they are recorded in the ar‐
1673 chive. The e flag is the sum of o and p, and other implementa‐
1674 tion-defined attributes.
1675
1676 -p p "Preserve" the file mode bits. This would be used by the user
1677 with regular privileges who wished to preserve aspects of the
1678 file other than the ownership. The file times are preserved by
1679 default, but two other flags are offered to disable these and
1680 use the time of extraction.
1681
1682 The one pathname per line format of standard input precludes pathnames
1683 containing <newline>s. Although such pathnames violate the portable
1684 filename guidelines, they may exist and their presence may inhibit
1685 usage of pax within shell scripts. This problem is inherited from his‐
1686 torical archive programs. The problem can be avoided by listing file‐
1687 name arguments on the command line instead of on standard input.
1688
1689 It is almost certain that appropriate privileges are required for pax
1690 to accomplish parts of this volume of IEEE Std 1003.1-2001. Specifi‐
1691 cally, creating files of type block special or character special,
1692 restoring file access times unless the files are owned by the user (the
1693 -t option), or preserving file owner, group, and mode (the -p option)
1694 all probably require appropriate privileges.
1695
1696 In read mode, implementations are permitted to overwrite files when the
1697 archive has multiple members with the same name. This may fail if per‐
1698 missions on the first version of the file do not permit it to be over‐
1699 written.
1700
1701 The cpio and ustar formats can only support files up to 8589934592
1702 bytes (8 * 2^30) in size.
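
The size limit follows from the width of the size fields. The arithmetic below is a sketch that assumes a field holding 11 octal digits, as c_filesize does in the cpio table above and as the ustar size field does once its terminator is excluded.

```python
OCTAL_DIGITS = 11  # digits available in c_filesize (and assumed for ustar size)

limit = 8 ** OCTAL_DIGITS
print(limit)                   # 8589934592
print(limit == 8 * 2 ** 30)    # True
print(format(limit - 1, "o"))  # 77777777777
```

The largest value that 11 octal digits can represent is therefore limit - 1, one octet less than the figure quoted above.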
1703
1704
1706 The following command:
1707
1708 pax -w -f /dev/rmt/1m .
1709
1710 copies the contents of the current directory to tape drive 1, medium
1711 density (assuming historical System V device naming procedures; the
1712 historical BSD device name would be /dev/rmt9).
1713
1714 The following commands:
1715
1716 mkdir newdir
     pax -rw olddir newdir
1717
1718 copy the olddir directory hierarchy to newdir.
1719
1720 pax -r -s ',^//*usr//*,,' -f a.pax
1721
1722 reads the archive a.pax, with all files rooted in /usr in the archive
1723 extracted relative to the current directory.
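
The effect of the substitution ',^//*usr//*,,' on member pathnames can be previewed with Python's re module. This is only an approximation: pax applies ed-style basic regular expressions, but this particular pattern has the same meaning in Python's syntax. The function name strip_usr_prefix is hypothetical.

```python
import re


def strip_usr_prefix(path):
    """Remove a leading /usr/ (with any number of repeated slashes)."""
    # One or more slashes, "usr", then one or more slashes, anchored at the start.
    return re.sub(r"^//*usr//*", "", path, count=1)


print(strip_usr_prefix("/usr/foo/bar"))    # foo/bar
print(strip_usr_prefix("//usr///lib/x"))   # lib/x
print(strip_usr_prefix("/etc/passwd"))     # /etc/passwd
print(strip_usr_prefix("/usrlocal/file"))  # /usrlocal/file
```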
1724
1725 Using the option:
1726
1727 -o listopt="%M %(atime)T %(size)D %(name)s"
1728
1729 overrides the default output description in Standard Output and instead
1730 writes:
1731
1732 -rw-rw--- Jan 12 15:53 1492 /usr/foo/bar
1733
1734 Using the options:
1735
1736 -o listopt='%L\t%(size)D\n%.7' \
1737 -o listopt='(name)s\n%(atime)T\n%T'
1738
1739 overrides the default output description in Standard Output and instead
1740 writes:
1741
1742 /usr/foo/bar -> /tmp 1492
1743 /usr/fo
1744 Jan 12 1991
1745 Jan 31 15:53
1746
1747
1749 The pax utility was new for the ISO POSIX-2:1993 standard. It repre‐
1750 sents a peaceful compromise between advocates of the historical tar and
1751 cpio utilities.
1752
1753 A fundamental difference between cpio and tar was in the way directo‐
1754 ries were treated. The cpio utility did not treat directories differ‐
1755 ently from other files, and to select a directory and its contents
1756 required that each file in the hierarchy be explicitly specified. For
1757 tar, a directory matched every file in the file hierarchy it rooted.
1758
1759 The pax utility offers both interfaces; by default, directories map
1760 into the file hierarchy they root. The -d option causes pax to skip any
1761 file not explicitly referenced, as cpio historically did. The
1762 tar-style behavior was chosen as the default because it was believed that
1763 this was the more common usage and because tar is the more commonly
1764 available interface, as it was historically provided on both System V
1765 and BSD implementations.
1766
1767 The data interchange format specification in this volume of IEEE Std
1768 1003.1-2001 requires that processes with "appropriate privileges" shall
1769 always restore the ownership and permissions of extracted files exactly
1770 as archived. If viewed from the historic equivalence between superuser
1771 and "appropriate privileges", there are two problems with this require‐
1772 ment. First, users running as superusers may unknowingly set dangerous
1773 permissions on extracted files. Second, it is needlessly limiting, in
1774 that superusers cannot extract files and own them as superuser unless
1775 the archive was created by the superuser. (It should be noted that
1776 restoration of ownerships and permissions for the superuser, by
1777 default, is historical practice in cpio, but not in tar.) In order to
1778 avoid these two problems, the pax specification has an additional
1779 "privilege" mechanism, the -p option. Only a pax invocation with the
1780 privileges needed, and which has the -p option set using the e specifi‐
1781 cation character, has the "appropriate privilege" to restore full own‐
1782 ership and permission information.
1783
1784 Note also that this volume of IEEE Std 1003.1-2001 requires that the
1785 file ownership and access permissions shall be set, on extraction, in
1786 the same fashion as the creat(2) function when provided with the mode
1787 stored in the archive. This means that the file creation mask of the
1788 user is applied to the file permissions.
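
The interaction between the archived mode and the file creation mask can be shown with a small Python calculation. The function name created_mode is hypothetical, and the sketch covers only the nine permission bits.

```python
def created_mode(archived_mode, umask):
    """Permission bits that result when the creation mask is applied."""
    return archived_mode & ~umask & 0o777


print(oct(created_mode(0o666, 0o022)))  # 0o644
print(oct(created_mode(0o777, 0o027)))  # 0o750
print(oct(created_mode(0o600, 0o077)))  # 0o600
```

A restrictive mask such as 077 therefore removes group and other access from every extracted file, regardless of the mode stored in the archive.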
1789
1790 Users should note that directories may be created by pax while extract‐
1791 ing files with permissions that are different from those that existed
1792 at the time the archive was created. When extracting sensitive informa‐
1793 tion into a directory hierarchy that no longer exists, users are
1794 encouraged to set their file creation mask appropriately to protect
1795 these files during extraction.
1796
1797 The table of contents output is written to standard output to facili‐
1798 tate pipeline processing.
1799
1800 An early proposal had hard links displaying for all pathnames. This
1801 was removed because it complicates the output of the case where -v is
1802 not specified and does not match historical cpio usage. The hard-link
1803 information is available in the -v display.
1804
1805 The description of the -l option allows implementations to make hard
1806 links to symbolic links. IEEE Std 1003.1-2001 does not specify any way
1807 to create a hard link to a symbolic link, but many implementations pro‐
1808 vide this capability as an extension. If there are hard links to sym‐
1809 bolic links when an archive is created, the implementation is required
1810 to archive the hard link in the archive (unless -H or -L is specified).
1811 When in read mode and in copy mode, implementations supporting hard
1812 links to symbolic links should use them when appropriate.
1813
1814 The archive formats inherited from the POSIX.1-1990 standard have cer‐
1815 tain restrictions that have been brought along from historical usage.
1816 For example, there are restrictions on the length of pathnames stored
1817 in the archive. When pax is used in copy (-rw) mode (copying directory
1818 hierarchies), the ability to use extensions from the -x pax format
1819 overcomes these restrictions.
1820
1821 The default blocksize value of 5120 bytes for cpio was selected because
1822 it is one of the standard block-size values for cpio, set when the -B
1823 option is specified. (The other default block-size value for cpio is
1824 512 bytes, and this was considered to be too small.) The default block
1825 value of 10240 bytes for tar was selected because that is the standard
1826 block-size value for BSD tar. The maximum block size of 32256 bytes
1827 (2^15-512 bytes) is the largest multiple of 512 bytes that fits into a
1828 signed 16-bit tape controller transfer register. There are known limi‐
1829 tations in some historical systems that would prevent larger blocks
1830 from being accepted. Historical values were chosen to improve compati‐
1831 bility with historical scripts using dd(1) or similar utilities to
1832 manipulate archives. Also, default block sizes for any file type other
1833 than character special file have been deleted from this volume of IEEE
1834 Std 1003.1-2001 as unimportant and not likely to affect the structure
1835 of the resulting archive.
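
The block-size figures quoted above can be checked with a few lines of Python. The constant names are chosen for illustration only.

```python
RECORD = 512
MAX_BLOCKSIZE = 2 ** 15 - RECORD
SIGNED_16_BIT_MAX = 2 ** 15 - 1

print(MAX_BLOCKSIZE)                               # 32256
print(MAX_BLOCKSIZE % RECORD)                      # 0
print(MAX_BLOCKSIZE + RECORD > SIGNED_16_BIT_MAX)  # True
print(5120 // RECORD, 10240 // RECORD)             # 10 20
```

The next multiple of 512 after 32256 is 32768, which no longer fits in a signed 16-bit register. The cpio and tar defaults correspond to 10 and 20 records of 512 bytes.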
1836
1837 Implementations are permitted to modify the block-size value based on
1838 the archive format or the device to which the archive is being written.
1839 This is to provide implementations with the opportunity to take advan‐
1840 tage of special types of devices, and it should not be used without a
1841 great deal of consideration as it almost certainly decreases archive
1842 portability.
1843
1844 The intended use of the -n option was to permit extraction of one or
1845 more files from the archive without processing the entire archive. This
1846 was viewed by the standard developers as offering significant perfor‐
1847 mance advantages over historical implementations. The -n option in
1848 early proposals had three effects; the first was to cause special char‐
1849 acters in patterns to not be treated specially. The second was to cause
1850 only the first file that matched a pattern to be extracted. The third
1851 was to cause pax to write a diagnostic message to standard error when
1852 no file was found matching a specified pattern. Only the second behav‐
1853 ior is retained by this volume of IEEE Std 1003.1-2001, for many rea‐
1854 sons. First, it is in general not acceptable for a single option to
1855 have multiple effects. Second, the ability to make pattern matching
1856 characters act as normal characters is useful for parts of pax other
1857 than file extraction. Third, a finer degree of control over the special
1858 characters is useful because users may wish to normalize only a single
1859 special character in a single filename. Fourth, given a more general
1860 escape mechanism, the previous behavior of the -n option can be easily
1861 obtained using the -s option or a sed script. Finally, writing a diag‐
1862 nostic message when a pattern specified by the user is unmatched by any
1863 file is useful behavior in all cases.
1864
1865 In this version, the -n option was removed from the copy mode synopsis of pax;
1866 it is inapplicable because there are no pattern operands specified in
1867 this mode.
1868
1869 IEEE Std 1003.1-2001 describes another method for copying subtrees
1870 besides pax, as part of the cp(1) utility. Both methods are
1871 historical practice: cp(1) provides a simpler, more intuitive
1872 interface, while pax offers a finer granularity of control. Each provides
1873 functionality that the other lacks; in particular, pax maintains the
1874 hard-link structure of the hierarchy while cp(1) does not. It is the
1875 intention of the standard developers that the results be similar (using
1876 appropriate option combinations in both utilities). The results are not
1877 required to be identical; there seemed insufficient gain to applica‐
1878 tions to balance the difficulty of implementations having to guarantee
1879 that the results would be exactly identical.
1880
1881 A single archive may span more than one file. It is suggested that
1882 implementations provide informative messages to the user on standard
1883 error whenever the archive file is changed.
1884
1885 The -d option (do not create intermediate directories not listed in the
1886 archive) found in early proposals was originally provided as a comple‐
1887 ment to the historic -d option of cpio. It has been deleted.
1888
1889 The -s option in early proposals specified a subset of the substitution
1890 command from the ed utility. As there was no reason for only a subset
1891 to be supported, the -s option is now compatible with the current ed
1892 specification. Since the delimiter can be any non-null character, the
1893 following usage with single spaces is valid:
1894
1895 pax -s " foo bar " ...
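
To illustrate why a space works as a delimiter, the following Python sketch splits a replstr of the form <d>old<d>new<d>[gp] on its first character. The function name split_replstr is hypothetical, and the sketch deliberately does not handle a delimiter escaped with a backslash inside the expression.

```python
def split_replstr(replstr):
    """Split a -s argument into (old, new, flags) using its first character."""
    if len(replstr) < 2:
        raise ValueError("replstr is too short")
    delimiter = replstr[0]
    parts = replstr[1:].split(delimiter)
    if len(parts) != 3:
        raise ValueError("expected <d>old<d>new<d>[gp]")
    old, new, flags = parts
    return old, new, flags


print(split_replstr(" foo bar "))      # ('foo', 'bar', '')
print(split_replstr(",^//*usr//*,,"))  # ('^//*usr//*', '', '')
print(split_replstr("/a/b/g"))         # ('a', 'b', 'g')
```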
1896
1897 The -t description is worded so as to note that this may cause the
1898 access time update caused by some other activity (which occurs while
1899 the file is being read) to be overwritten.
1900
1901 The default behavior of pax with regard to file modification times is
1902 the same as historical implementations of tar. It is not the histori‐
1903 cal behavior of cpio.
1904
1905 Because the -i option uses /dev/tty, utilities without a controlling
1906 terminal are not able to use this option.
1907
1908 The -y option, found in early proposals, has been deleted because a
1909 line containing a single period for the -i option has equivalent func‐
1910 tionality. The special lines for the -i option (a single period and the
1911 empty line) are historical practice in cpio.
1912
1913 In early drafts, a -e charmap option was included to increase portabil‐
1914 ity of files between systems using different coded character sets. This
1915 option was omitted because it was apparent that consensus could not be
1916 formed for it. In this version, the use of UTF-8 should be an adequate
1917 substitute.
1918
1919 The -k option was added to address international concerns about the
1920 dangers involved in the character set transformations of -e (if the
1921 target character set were different from the source, the filenames
1922 might be transformed into names matching existing files) and also was
1923 made more general to protect files transferred between file systems
1924 with different {NAME_MAX} values (truncating a filename on a smaller
1925 system might also inadvertently overwrite existing files). As stated,
1926 it prevents any overwriting, even if the target file is older than the
1927 source. This version adds more granularity of options to solve this
1928 problem by introducing the -o invalid= option, specifically the UTF-8
1929 action. (Note that an existing file that is named with a UTF-8 encoding
1930 is still subject to overwriting in this case. The -k option closes that
1931 loophole.)
1932
1933 Some of the file characteristics referenced in this volume of IEEE Std
1934 1003.1-2001 might not be supported by some archive formats. For exam‐
1935 ple, neither the tar nor cpio formats contain the file access time. For
1936 this reason, the e specification character has been provided, intended
1937 to cause all file characteristics specified in the archive to be
1938 retained.
1939
1940 It is required that extracted directories, by default, have their
1941 access and modification times and permissions set to the values speci‐
1942 fied in the archive. This has obvious problems in that the directories
1943 are almost certainly modified after being extracted and that directory
1944 permissions may not permit file creation. One possible solution is to
1945 create directories with the mode specified in the archive, as modified
1946 by the umask of the user, with sufficient permissions to allow file
1947 creation. After all files have been extracted, pax would then reset the
1948 access and modification times and permissions as necessary.
1949
1950 The list-mode formatting description borrows heavily from the one
1951 defined by the printf(1) utility. However, since there is no separate
1952 operand list to get conversion arguments, the format was extended to
1953 allow specifying the name of the conversion argument as part of the
1954 conversion specification.
1955
1956 The T conversion specifier allows time fields to be displayed in any of
1957 the date formats. Unlike the ls(1) utility, pax does not adjust the
1958 format when the date is less than six months in the past. This makes
1959 parsing the output more predictable.
1960
1961 The D conversion specifier handles the ability to display the
1962 major/minor or file size, as with ls(1), by using %-8(size)D.
1963
1964 The L conversion specifier handles the ls display for symbolic links.
1965
1966 Conversion specifiers were added to generate existing known types used
1967 for ls(1).
1968
1969
1970 pax Interchange Format
1971 The new POSIX data interchange format was developed primarily to sat‐
1972 isfy international concerns that the ustar and cpio formats did not
1973 provide for file, user, and group names encoded in characters outside a
1974 subset of the ISO/IEC 646:1991 standard. The standard developers real‐
1975 ized that this new POSIX data interchange format should be very exten‐
1976 sible because there were other requirements they foresaw in the near
1977 future:
1978
1979 · Support international character encodings and locale information
1980
1981 · Support security information (ACLs, and so on)
1982
1983 · Support future file types, such as realtime or contiguous files
1984
1985 · Include data areas for implementation use
1986
1987 · Support systems with words larger than 32 bits and timers with
1988 subsecond granularity
1989
1990 The following were not goals for this format because these are better
1991 handled by separate utilities or are inappropriate for a portable for‐
1992 mat:
1993
1994 · Encryption
1995
1996 · Compression
1997
1998 · Data translation between locales and codesets
1999
2000 · inode storage
2001
2002 The format chosen to support the goals is an extension of the ustar
2003 format. Of the two formats previously available, only the ustar format
2004 was selected for extensions because:
2005
2006 · It was easier to extend in an upwards-compatible way. It offered
2007 version flags and header block type fields with room for future
2008 standardization. The cpio format, while possessing a more flexi‐
2009 ble file naming methodology, could not be extended without
2010 breaking some theoretical implementation or using a dummy file‐
2011 name that could be a legitimate filename.
2012
2013 · Industry experience since the original "tar wars" fought in
2014 developing the ISO POSIX-1 standard has clearly been in favor of
2015 the ustar format, which is generally the default output format
2016 selected for pax implementations on new systems.
2017
2018 The new format was designed with one additional goal in mind: reason‐
2019 able behavior when an older tar or pax utility happened to read an ar‐
2020 chive. Since the POSIX.1-1990 standard mandated that a "format-reading
2021 utility" had to treat unrecognized typeflag values as regular files,
2022 this allowed the format to include all the extended information in a
2023 pseudo-regular file that preceded each real file. An option is given
2024 that allows the archive creator to set up reasonable names for these
2025 files on the older systems. Also, the normative text suggests that
2026 reasonable file access values be used for this ustar header block. Mak‐
2027 ing these header files inaccessible for convenient reading and deleting
2028 would not be reasonable. File permissions of 600 or 700 are suggested.
2029
2030 The ustar typeflag field was used to accommodate the additional func‐
2031 tionality of the new format rather than magic or version because the
2032 POSIX.1-1990 standard (and, by reference, the previous version of pax),
2033 mandated the behavior of the format-reading utility when it encountered
2034 an unknown typeflag, but was silent about the other two fields.
2035
2036 Early proposals of the first revision to IEEE Std 1003.1-2001 contained
2037 a proposed archive format that was based on compatibility with the
2038 standard for tape files (ISO 1001, similar to the format used histori‐
2039 cally on many mainframes and minicomputers). This format was overly
2040 complex and required considerable overhead in volume and header
2041 records. Furthermore, the standard developers felt that it would not be
2042 acceptable to the community of POSIX developers, so it was later
2043 changed to be a format more closely related to historical practice on
2044 POSIX systems.
2045
2046 The prefix and name split of pathnames in ustar was replaced by the
2047 single path extended header record for simplicity.
2048
2049 The concept of a global extended header (typeflag g) was controversial.
2050 If this were applied to an archive being recorded on magnetic tape, a
2051 few unreadable blocks at the beginning of the tape could be a serious
2052 problem; a utility attempting to extract as many files as possible from
2053 a damaged archive could lose a large percentage of file header informa‐
2054 tion in this case. However, if the archive were on a reliable medium,
2055 such as a CD-ROM, the global extended header offers considerable poten‐
2056 tial size reductions by eliminating redundant information. Thus, the
2057 text warns against using the global method for unreliable media and
2058 provides a method for implanting global information in the extended
2059 header for each file, rather than in the typeflag g records.
2060
2061 No facility for data translation or filtering on a per-file basis is
2062 included because the standard developers could not invent an interface
2063 that would allow this in an efficient manner. If a filter, such as
2064 encryption or compression, is to be applied to all the files, it is
2065 more efficient to apply the filter to the entire archive as a single
2066 file. The standard developers considered interfaces that would invoke a
2067 shell script for each file going into or out of the archive, but the
2068 system overhead in this approach was considered to be too high.
2069
2070 One such approach would be to have filter= records that give a pathname
2071 for an executable. When the program is invoked, the file and archive
2072 would be open for standard input/output and all the header fields would
2073 be available as environment variables or command-line arguments. The
2074 standard developers did discuss such schemes, but they were omitted
2075 from IEEE Std 1003.1-2001 due to concerns about excessive overhead.
2076 Also, the program itself would need to be in the archive if it were to
2077 be used portably.
2078
2079 There is currently no portable means of identifying the character
2080 set(s) used for a file in the file system. Therefore, pax has not been
2081 given a mechanism to generate charset records automatically. The only
2082 portable means of doing this is for the user to write the archive using
2083 the -o charset=string command line option. This assumes that all of the
2084 files in the archive use the same encoding. The "implementation-
2085 defined" text is included to allow for a system that can identify the
2086 encodings used for each of its files.
2087
2088 The table of standards that accompanies the charset record description
2089 is acknowledged to be very limited. Only a limited number of character
2090 set standards is reasonable for maximal interchange. Any character set
2091 is, of course, possible by prior agreement. It was suggested that
2092 EBCDIC be listed, but it was omitted because it is not defined by a
2093 formal standard. Formal standards, and then only those with reasonably
2094 large followings, can be included here, simply as a matter of practi‐
2095 cality. The <value>s represent names of officially registered character
2096 sets in the format required by the ISO 2375:1985 standard.
2097
2098 The normal comma or <blank>-separated list rules are not followed in
2099 the case of keyword options to allow ease of argument parsing for
2100 getopts.
2101
2102 Further information on character encodings is in pax Archive Character
2103 Set Encoding/Decoding.
2104
2105 The standard developers have reserved keyword name space for vendor
2106 extensions. It is suggested that the format to be used is:
2107
2108 VENDOR.keyword
2109
2110 where VENDOR is the name of the vendor or organization in all uppercase
2111 letters. It is further suggested that the keyword following the period
2112 be named differently than any of the standard keywords so that it could
2113 be used for future standardization, if appropriate, by omitting the
2114 VENDOR prefix.
2115
2116 The <length> field in the extended header record was included to make
2117 it simpler to step through the records, even if a record contains an
2118 unknown format (to a particular pax) with complex interactions of spe‐
2119 cial characters. It also provides a minor integrity checkpoint within
2120 the records to aid a program attempting to recover files from a damaged
2121 archive.
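
How the <length> field lets a reader step over records it does not understand
can be sketched as below. The sketch assumes the record layout given in the
normative text ("<length> <keyword>=<value>" followed by a newline, where
<length> counts every octet of the record including its own digits); the
function names are hypothetical.

```python
def make_record(keyword, value):
    """Build one extended header record as bytes."""
    body = b" " + keyword.encode("utf-8") + b"=" + value.encode("utf-8") + b"\n"
    length = len(body) + 1
    # Counting the digits can itself add a digit (9 -> 10), so iterate.
    while len(str(length)) + len(body) != length:
        length = len(str(length)) + len(body)
    return str(length).encode("ascii") + body

def iter_records(data):
    """Yield (keyword, value) pairs, relying only on <length> to find record ends."""
    pos = 0
    while pos < len(data):
        space = data.index(b" ", pos)
        length = int(data[pos:space])
        record = data[pos:pos + length]
        # The length and trailing newline double as a small integrity check.
        if length <= space - pos + 1 or len(record) != length or not record.endswith(b"\n"):
            raise ValueError("damaged extended header record at offset %d" % pos)
        keyword, _, value = record[space - pos + 1:-1].partition(b"=")
        yield keyword.decode("utf-8"), value
        pos += length
```

Because the reader never scans the value for delimiters, a value containing
spaces, equals signs, or newlines does not disturb the records that follow.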
2122
2123 There are no extended header versions of the devmajor and devminor
2124 fields because the unspecified format ustar header field should be suf‐
2125 ficient. If they are not, vendor-specific extended keywords (such as
2126 VENDOR.devmajor) should be used.
2127
2128 Device and i-number labeling of files was not adopted from cpio; files
2129 are interchanged strictly on a symbolic name basis, as in ustar.
2130
2131 Just as with the ustar format descriptions, the new format makes no
2132 special arrangements for multi-volume archives. Each of the pax archive
2133 types is assumed to be inside a single POSIX file and splitting that
2134 file over multiple volumes (diskettes, tape cartridges, and so on),
2135 processing their labels, and mounting each in the proper sequence are
2136 considered to be implementation details that cannot be described
2137 portably.
2138
2139 The pax format is intended for interchange, not only for backup on a
2140 single (family of) systems. It is not as densely packed as might be
2141 possible for backup:
2142
2143 · It contains information as coded characters that could be coded
2144 in binary.
2145
2146 · It identifies extended records with name fields that could be
2147 omitted in favor of a fixed-field layout.
2148
2149 · It translates names into a portable character set and identifies
2150 locale-related information, both of which are probably unneces‐
2151 sary for backup.
2152
2153 The requirements on restoring from an archive are slightly different
2154 from the historical wording, allowing for non-monolithic privilege to
2155 bring forward as much as possible. In particular, attributes such as
2156 "high performance file" might be broadly but not universally granted
2157 while set-user-ID or chown(2) might be much more restricted. There is
2158 no implication in IEEE Std 1003.1-2001 that the security information be
2159 honored after it is restored to the file hierarchy, in spite of what
2160 might be improperly inferred by the silence on that topic. That is a
2161 topic for another standard.
2162
2163 Links are recorded in the fashion described here because a link can be
2164 to any file type. It is desirable in general to be able to restore part
2165 of an archive selectively and restore all of those files completely. If
2166 the data is not associated with each link, it is not possible to do
2167 this. However, the data associated with a file can be large, and when
2168 selective restoration is not needed, this can be a significant burden.
The archive is structured so that files that have no associated data
can always be restored by the name of any link, and
2171 the user may choose whether data is recorded with each instance of a
2172 file that contains data. The format permits mixing of both types of
2173 links in a single archive; this can be done for special needs, and pax
2174 is expected to interpret such archives on input properly, despite the
2175 fact that there is no pax option that would force this mixed case on
2176 output. (When -o linkdata is used, the output must contain the dupli‐
2177 cate data, but the implementation is free to include it or omit it when
2178 -o linkdata is not used.)
2179
2180 The time values are included as extended header records for those
2181 implementations needing more than the eleven octal digits allowed by
2182 the ustar format. Portable file timestamps cannot be negative. If pax
2183 encounters a file with a negative timestamp in copy or write mode, it
can reject the file, substitute a non-negative timestamp, or generate a
non-portable timestamp with a leading '-'. Even though some
implementations can support finer file-time granularities than seconds, the
normative text requires support only for seconds since the Epoch
2187 because the ISO POSIX-1 standard states them that way. The ustar format
2188 includes only mtime; the new format adds atime and ctime for symmetry.
2189 The atime access time restored to the file system will be affected by
2190 the -p a and -p e options. The ctime creation time (actually inode mod‐
2191 ification time) is described with "appropriate privilege" so that it
2192 can be ignored when writing to the file system. POSIX does not provide
2193 a portable means to change file creation time. Nothing is intended to
2194 prevent a non-portable implementation of pax from restoring the value.
2195
2196 The gid, size, and uid extended header records were included to allow
2197 expansion beyond the sizes specified in the regular tar header. New
2198 file system architectures are emerging that will exhaust the 12-digit
2199 size field. There are probably not many systems requiring more than 8
2200 digits for user and group IDs, but the extended header values were
2201 included for completeness, allowing overrides for all of the decimal
2202 values in the tar header.
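
The limits behind this paragraph can be worked out directly. The sketch below
assumes that each ustar numeric field gives up one octet to a terminator, so
the 12-octet size field holds 11 octal digits and the 8-octet uid and gid
fields hold 7; the helper name is hypothetical.

```python
# Largest values representable in the fixed-width octal ustar header fields.
USTAR_MAX_SIZE = 8 ** 11 - 1   # 12-octet field: 11 octal digits plus terminator
USTAR_MAX_ID = 8 ** 7 - 1      # 8-octet uid/gid fields: 7 octal digits plus terminator

def overflow_records(size, uid, gid):
    """Return the extended header records needed for values that do not fit."""
    records = {}
    if size > USTAR_MAX_SIZE:
        records["size"] = str(size)
    if uid > USTAR_MAX_ID:
        records["uid"] = str(uid)
    if gid > USTAR_MAX_ID:
        records["gid"] = str(gid)
    return records
```

Under that assumption the size limit is one octet short of 8 gigabytes, which
is consistent with the file size limit noted for the tar format in the Issue 5
change history.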
2203
2204 The standard developers intended to describe the effective results of
2205 pax with regard to file ownerships and permissions; implementations are
2206 not restricted in timing or sequencing the restoration of such, pro‐
2207 vided the results are as specified.
2208
2209 Much of the text describing the extended headers refers to use in
2210 "write or copy modes". The copy mode references are due to the norma‐
2211 tive text: "The effect of the copy shall be as if the copied files were
2212 written to an archive file and then subsequently extracted ...". There
2213 is certainly no way to test whether pax is actually generating the
2214 extended headers in copy mode, but the effects must be as if it had.
2215
2216
2217 pax Archive Character Set Encoding/Decoding
2218 There is a need to exchange archives of files between systems of dif‐
2219 ferent native codesets. Filenames, group names, and user names must be
2220 preserved to the fullest extent possible when an archive is read on the
2221 receiving platform. Translation of the contents of files is not within
2222 the scope of the pax utility.
2223
2224 There will also be the need to represent characters that are not avail‐
2225 able on the receiving platform. These unsupported characters cannot be
2226 automatically folded to the local set of characters due to the chance
2227 of collisions. This could result in overwriting previous extracted
2228 files from the archive or pre-existing files on the system.
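
The collision risk can be made concrete with a small sketch. The folding
function below is a deliberately naive, hypothetical scheme (strip accents,
replace anything else with '?'); it is not something pax does, and it is shown
only to illustrate why automatic folding was rejected.

```python
import unicodedata

def fold_to_ascii(name):
    """Naive folding: drop combining marks, replace other non-ASCII with '?'."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.encode("ascii", "replace").decode("ascii")

def find_collisions(names):
    """Map each folded name to the distinct original names that produce it."""
    seen = {}
    for name in names:
        seen.setdefault(fold_to_ascii(name), []).append(name)
    return {folded: group for folded, group in seen.items() if len(group) > 1}
```

Any group returned by find_collisions is a set of archive members that would
overwrite one another if extracted under their folded names.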
2229
2230 For these reasons, the codeset used to represent characters within the
2231 extended header records of the pax archive must be sufficiently rich to
2232 handle all commonly used character sets. The fields requiring transla‐
2233 tion include, at a minimum, filenames, user names, group names, and
2234 link pathnames. Implementations may wish to have localized extended
2235 keywords that use non-portable characters.
2236
2237 The standard developers considered the following options:
2238
2239 · The archive creator specifies the well-defined name of the
2240 source codeset. The receiver must then recognize the codeset
2241 name and perform the appropriate translations to the destination
2242 codeset.
2243
2244 · The archive creator includes within the archive the character
2245 mapping table for the source codeset used to encode extended
2246 header records. The receiver must then read the character map‐
2247 ping table and perform the appropriate translations to the des‐
2248 tination codeset.
2249
2250 · The archive creator translates the extended header records in
2251 the source codeset into a canonical form. The receiver must then
2252 perform the appropriate translations to the destination codeset.
2253
2254 The approach that incorporates the name of the source codeset poses the
2255 problem of codeset name registration, and makes the archive useless to
2256 pax archive decoders that do not recognize that codeset.
2257
2258 Because parts of an archive may be corrupted, the standard developers
2259 felt that including the character map of the source codeset was too
2260 fragile. The loss of this one key component could result in making the
entire archive useless. (The difference between this and the global
extended header decision was that the latter has a workaround, namely
duplicating extended header records on unreliable media, but this would be
too burdensome for large character set maps.)
2265
2266 Both of the above approaches also put an undue burden on the pax ar‐
2267 chive receiver to handle the cross-product of all source and destina‐
2268 tion codesets.
2269
2270 To simplify the translation from the source codeset to the canonical
2271 form and from the canonical form to the destination codeset, the stan‐
2272 dard developers decided that the internal representation should be a
2273 stateless encoding. A stateless encoding is one where each codepoint
2274 has the same meaning, without regard to the decoder being in a specific
2275 state. An example of a stateful encoding would be the Japanese Shift-
2276 JIS; an example of a stateless encoding would be the ISO/IEC 646:1991
2277 standard (equivalent to 7-bit ASCII).
2278
2279 For these reasons, the standard developers decided to adopt a canonical
2280 format for the representation of file information strings. The obvious,
2281 well-endorsed candidate is the ISO/IEC 10646-1:2000 standard (based in
2282 part on Unicode), which can be used to represent the characters of vir‐
2283 tually all standardized character sets. The standard developers ini‐
2284 tially agreed upon using UCS2 (16-bit Unicode) as the internal repre‐
2285 sentation. This repertoire of characters provides a sufficiently rich
2286 set to represent all commonly-used codesets.
2287
2288 However, the standard developers found that the 16-bit Unicode repre‐
2289 sentation had some problems. It forced the issue of standardizing byte
2290 ordering. The 2-byte length of each character made the extended header
2291 records twice as long for the case of strings coded entirely from his‐
2292 torical 7-bit ASCII. For these reasons, the standard developers chose
2293 the UTF-8 defined in the ISO/IEC 10646-1:2000 standard. This multi-byte
2294 representation encodes UCS2 or UCS4 characters reliably and determinis‐
2295 tically, eliminating the need for a canonical byte ordering. In addi‐
2296 tion, NUL octets and other characters possibly confusing to POSIX file
2297 systems do not appear, except to represent themselves. It was realized
2298 that certain national codesets take up more space after the encoding,
2299 due to their placement within the UCS range; it was felt that the use‐
2300 fulness of the encoding of the names outweighs the disadvantage of size
2301 increase for file, user, and group names.
2302
2303 The encoding of UTF-8 is as follows:
2304
2305 UCS4 Hex Encoding UTF-8 Binary Encoding
2306 00000000-0000007F 0xxxxxxx
2307 00000080-000007FF 110xxxxx 10xxxxxx
2308 00000800-0000FFFF 1110xxxx 10xxxxxx 10xxxxxx
2309 00010000-001FFFFF 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
2310 00200000-03FFFFFF 111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
2311 04000000-7FFFFFFF 1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
2312
2313 where each 'x' represents a bit value from the character being trans‐
2314 lated.
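
The table can be turned into code almost line for line. The following is a
minimal sketch with a hypothetical function name; it follows the six-row table
above, whereas Python's built-in codec (like current UTF-8 definitions) stops
at U+10FFFF and so never produces the 5-octet and 6-octet forms.

```python
def utf8_encode(cp):
    """Encode one UCS code point using the table above (1 to 6 octets)."""
    if not 0 <= cp <= 0x7FFFFFFF:
        raise ValueError("code point out of range")
    if cp < 0x80:
        return bytes([cp])
    # (upper bound of the row, lead-octet marker, number of continuation octets)
    for limit, lead, extra in ((0x7FF, 0xC0, 1), (0xFFFF, 0xE0, 2),
                               (0x1FFFFF, 0xF0, 3), (0x3FFFFFF, 0xF8, 4),
                               (0x7FFFFFFF, 0xFC, 5)):
        if cp <= limit:
            break
    # The lead octet carries the high-order bits that remain after the
    # continuation octets have each taken six bits.
    out = [lead | (cp >> (6 * extra))]
    for shift in range(6 * (extra - 1), -1, -6):
        out.append(0x80 | ((cp >> shift) & 0x3F))
    return bytes(out)
```

Every octet of a multi-octet sequence has its high bit set, which is why NUL
and '/' can only ever appear as themselves.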
2315
2316
2317 ustar Interchange Format
2318 The description of the ustar format reflects numerous enhancements over
2319 pre-1988 versions of the historical tar utility. The goal of these
2320 changes was not only to provide the functional enhancements desired,
2321 but also to retain compatibility between new and old versions. This
2322 compatibility has been retained. Archives written using the old archive
2323 format are compatible with the new format.
2324
2325 Implementors should be aware that the previous file format did not
2326 include a mechanism to archive directory type files. For this reason,
2327 the convention of using a filename ending with slash was adopted to
2328 specify a directory on the archive.
2329
The total size of the name and prefix fields has been set to meet the
minimum requirements for {PATH_MAX}. If a pathname will fit within the
2332 name field, it is recommended that the pathname be stored there without
2333 the use of the prefix field. Although the name field is known to be too
2334 small to contain {PATH_MAX} characters, the value was not changed in
2335 this version of the archive file format to retain backwards-compatibil‐
2336 ity, and instead the prefix was introduced. Also, because of the ear‐
2337 lier version of the format, there is no way to remove the restriction
2338 on the linkname field being limited in size to just that of the name
2339 field.
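
One way to apply the recommendation above when writing a header is sketched
below. It assumes the ustar field sizes from the normative text (100 octets
for name, 155 for prefix) and that the slash joining the two parts is not
stored; the function name is hypothetical.

```python
NAME_SIZE = 100    # octets in the ustar name field
PREFIX_SIZE = 155  # octets in the ustar prefix field

def split_ustar_path(path):
    """Return (prefix, name) for a ustar header, or None if the path cannot fit."""
    raw = path.encode("utf-8")
    if len(raw) <= NAME_SIZE:
        return b"", raw          # recommended: leave the prefix field unused
    # Try each slash as the split point, starting from the right.
    pos = raw.rfind(b"/")
    while pos > 0:
        prefix, name = raw[:pos], raw[pos + 1:]
        if len(name) > NAME_SIZE:
            break                # moving further left only makes name longer
        if name and len(prefix) <= PREFIX_SIZE:
            return prefix, name
        pos = raw.rfind(b"/", 0, pos)
    return None
```

A None result marks a pathname that cannot be stored in a plain ustar header
at all, which is the case the path extended header record was added for.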
2340
2341 The size field is required to be meaningful in all implementation
2342 extensions, although it could be zero. This is required so that the
2343 data blocks can always be properly counted.
2344
It is suggested that if device special files that cannot be represented
in the standard format need to be archived, one of the extension types
(A-Z) be used, and that the additional information for the special file
be represented as data and be reflected in the size field.
2350
2351 Attempting to restore a special file type, where it is converted to
2352 ordinary data and conflicts with an existing filename, need not be spe‐
2353 cially detected by the utility. If run as an ordinary user, pax should
2354 not be able to overwrite the entries in, for example, /dev in any case
2355 (whether the file is converted to another type or not). If run as a
2356 privileged user, it should be able to do so, and it would be considered
2357 a bug if it did not. The same is true of ordinary data files and simi‐
2358 larly named special files; it is impossible to anticipate the needs of
2359 the user (who could really intend to overwrite the file), so the behav‐
2360 ior should be predictable (and thus regular) and rely on the protection
2361 system as required.
2362
2363 The value 7 in the typeflag field is intended to define how contiguous
2364 files can be stored in a ustar archive. IEEE Std 1003.1-2001 does not
2365 require the contiguous file extension, but does define a standard way
2366 of archiving such files so that all conforming systems can interpret
2367 these file types in a meaningful and consistent manner. On a system
2368 that does not support extended file types, the pax utility should do
2369 the best it can with the file and go on to the next.
2370
2371 The file protection modes are those conventionally used by the ls(1)
2372 utility. This is extended beyond the usage in the ISO POSIX-2 standard
2373 to support the "shared text" or "sticky" bit. It is intended that the
2374 conformance document should not document anything beyond the existence
2375 of and support of such a mode. Further extensions are expected to
2376 these bits, particularly with overloading the set-user-ID and set-
2377 group-ID flags.
2378
2379
2380 cpio Interchange Format
2381 The reference to appropriate privilege in the cpio format refers to an
2382 error on standard output; the ustar format does not make comparable
2383 statements.
2384
2385 The model for this format was the historical System V cpio -c data
2386 interchange format. This model documents the portable version of the
2387 cpio format and not the binary version. It has the flexibility to
2388 transfer data of any type described within IEEE Std 1003.1-2001, yet is
2389 extensible to transfer data types specific to extensions beyond IEEE
2390 Std 1003.1-2001 (for example, contiguous files). Because it describes
2391 existing practice, there is no question of maintaining upwards-compati‐
2392 bility.
2393
2394
2395 cpio Header
2396 There has been some concern that the size of the c_ino field of the
2397 header is too small to handle those systems that have very large inode
2398 numbers. However, the c_ino field in the header is used strictly as a
2399 hard-link resolution mechanism for archives. It is not necessarily the
2400 same value as the inode number of the file in the location from which
2401 that file is extracted.
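
Using c_ino purely as a link-resolution key might look like the sketch below.
Pairing c_ino with c_dev and skipping members whose c_nlink is 1 are
assumptions about how a reader would use the header; the function name and the
tuple layout are hypothetical.

```python
from collections import defaultdict

def group_hard_links(members):
    """Group member names that share the same (c_dev, c_ino) pair.

    members is an iterable of (name, c_dev, c_ino, c_nlink) tuples in
    archive order. Only the pairing matters; the numbers need not match
    any inode on the source or target file system.
    """
    groups = defaultdict(list)
    for name, c_dev, c_ino, c_nlink in members:
        if c_nlink > 1:
            groups[(c_dev, c_ino)].append(name)
    return [names for names in groups.values() if len(names) > 1]
```

Since the values are only compared with each other inside one archive, a
writer is free to renumber files to fit the field.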
2402
2403 The name c_magic is based on historical usage.
2404
2405
2406 cpio Filename
2407 For most historical implementations of the cpio utility, {PATH_MAX}
2408 octets can be used to describe the pathname without the addition of any
2409 other header fields (the NUL character would be included in this
2410 count). {PATH_MAX} is the minimum value for pathname size, documented
2411 as 256 bytes. However, an implementation may use c_namesize to deter‐
2412 mine the exact length of the pathname. With the current description of
2413 the <cpio.h> header, this pathname size can be as large as a number
2414 that is described in six octal digits.
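
The upper bound implied by six octal digits is easy to compute. In this
sketch the function name is hypothetical and the pathname is assumed to be
encoded as UTF-8 before it is counted.

```python
MAX_NAMESIZE = 0o777777  # largest value expressible in six octal digits

def encode_namesize(pathname):
    """Return c_namesize: the pathname length including the NUL, as six octal digits."""
    size = len(pathname.encode("utf-8")) + 1
    if size > MAX_NAMESIZE:
        raise ValueError("pathname too long for c_namesize")
    return b"%06o" % size
```

The limit is 262143 octets, far above the 256 bytes cited for {PATH_MAX}.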
2415
2416 Two values are documented under the c_mode field values to provide for
2417 extensibility for known file types:
2418
2419 0110 000
2420 Reserved for contiguous files. The implementation may treat the
2421 rest of the information for this archive like a regular file. If
2422 this file type is undefined, the implementation may create the
2423 file as a regular file.
2424
2425 This provides for extensibility of the cpio format while allowing for
2426 the ability to read old archives. Files of an unknown type may be read
2427 as "regular files" on some implementations. On a system that does not
2428 support extended file types, the pax utility should do the best it can
2429 with the file and go on to the next.
2430
2431
FUTURE DIRECTIONS
None.
2434
2435
2437_________________________________________________________________
2438
2439
SEE ALSO
Shell Command Language, cp(1), ed(1), getopts(1), ls(1), printf(3), the
2442 Base Definitions volume of IEEE Std 1003.1-2001, <cpio.h>, the System
2443 Interfaces volume of IEEE Std 1003.1-2001, chown(2), creat(2),
2444 mkdir(2), mkfifo(3), stat(2), utime(2), write(2).
2445
2446
CHANGE HISTORY
First released in Issue 4.
2449
2450
2451 Issue 5
2452 A note is added to the APPLICATION USAGE indicating that the cpio and
2453 tar formats can only support files up to 8 gigabytes in size.
2454
2455
2456 Issue 6
2457 The pax utility is aligned with the IEEE P1003.2b draft standard:
2458
2459 · Support has been added for symbolic links in the options and
2460 interchange formats.
2461
2462 · A new format has been devised, based on extensions to ustar.
2463
2464 · References to the "extended" tar and cpio formats derived from
2465 the POSIX.1-1990 standard have been changed to remove the
2466 "extended" adjective because this could cause confusion with the
2467 extended tar header added in this revision. (All references to
2468 tar are actually to ustar.)
2469
2470 The TZ entry is added to the ENVIRONMENT VARIABLES section.
2471
2472 IEEE PASC Interpretation 1003.2 #168 is applied, clarifying that
2473 mkdir(2) and mkfifo(3) calls can ignore an [EEXIST] error when extract‐
2474 ing an archive.
2475
2476 IEEE PASC Interpretation 1003.2 #180 is applied, clarifying how
2477 extracted files are created when in read mode.
2478
2479 IEEE PASC Interpretation 1003.2 #181 is applied, clarifying the
2480 description of the -t option.
2481
2482 IEEE PASC Interpretation 1003.2 #195 is applied.
2483
2484 IEEE PASC Interpretation 1003.2 #206 is applied, clarifying the han‐
2485 dling of links for the -H, -L, and -l options.
2486
2487 IEEE Std 1003.1-2001/Cor 1-2002, item XCU/TC1/D6/35 is applied, adding
2488 the process ID of the pax process into certain fields. This change pro‐
2489 vides a method for the implementation to ensure that different
2490 instances of pax extracting a file named /a/b/foo will not collide when
2491 processing the extended header information associated with foo.
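
The effect of including the process ID can be sketched as follows. The sketch
assumes the default template "%d/PaxHeaders.%p/%f" from the normative
description of -o exthdr.name, where %d is the directory part, %p the process
ID, and %f the final component; the function name is hypothetical.

```python
import os
import posixpath

def default_exthdr_name(path, pid=None):
    """Expand the assumed template '%d/PaxHeaders.%p/%f' for one archive member."""
    pid = os.getpid() if pid is None else pid
    directory = posixpath.dirname(path) or "."
    return posixpath.join(directory, "PaxHeaders.%d" % pid,
                          posixpath.basename(path))
```

Two pax processes handling /a/b/foo get different pseudo-file names because
their process IDs differ.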
2492
2493 IEEE Std 1003.1-2001/Cor 1-2002, item XCU/TC1/D6/36 is applied, chang‐
2494 ing -x B to -x pax in the OPTIONS section.
2495
2496 IEEE Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/20 is applied, updat‐
2497 ing the SYNOPSIS to be consistent with the normative text.
2498
2499 IEEE Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/21 is applied, updat‐
2500 ing the DESCRIPTION to describe the behavior when files to be linked
2501 are symbolic links and the system is not capable of making hard links
2502 to symbolic links.
2503
2504 IEEE Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/22 is applied, updat‐
2505 ing the OPTIONS section to describe the behavior for how multiple
2506 options are to be handled.
2507
2508 IEEE Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/23 is applied, updat‐
2509 ing the write option within the OPTIONS section.
2510
2511 IEEE Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/24 is applied, adding
2512 a paragraph into the OPTIONS section that states that specifying more
2513 than one of the mutually-exclusive options (-H and -L) is not consid‐
2514 ered an error and that the last option specified will determine the
2515 behavior of the utility.
2516
2517 IEEE Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/25 is applied, remov‐
2518 ing the ctime paragraph within the EXTENDED DESCRIPTION. There is a
2519 contradiction in the definition of the ctime keyword for the pax
2520 extended header, in that the st_ctime member of the stat structure does
2521 not refer to a file creation time. No field in the standard stat struc‐
2522 ture from <sys/stat.h> includes a file creation time.
2523
2524 IEEE Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/26 is applied, making
it clear that typeflag 1 (ustar Interchange Format) applies not
2526 only to files that are hard-linked, but also to files that are aliased
2527 via symlinks.
2528
2529 IEEE Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/27 is applied, clari‐
2530 fying the cpio c_nlink field.
2531
2532 End of quoted text from the POSIX.1-2001 standard.
2533
The following other options are implemented as extensions to the POSIX
2536 standard:
2537
2538 -help Prints a summary of the most important options for spax(1) and
2539 exits.
2540
2541 -xhelp Prints a summary of the less important options for spax(1) and
2542 exits.
2543
2544 -version
Prints the spax version number string and exits.
2546
2547
The Institute of Electrical and Electronics Engineers and The Open
Group have given us permission to reprint portions of their documentation.
In the following statement, the phrase ``this text'' refers to
2557 portions of the system documentation.
2558
2559 Portions of this text are reprinted and reproduced in electronic form
in the spax manual, from IEEE Std 1003.1, 2004 Edition, Standard for
2561 Information Technology -- Portable Operating System Interface (POSIX),
2562 The Open Group Base Specifications Issue 6, Copyright (C) 2001-2004 by
2563 the Institute of Electrical and Electronics Engineers, Inc and The Open
2564 Group. In the event of any discrepancy between these versions and the
2565 original IEEE and The Open Group Standard, the original IEEE and The
2566 Open Group Standard is the referee document. The original Standard can
2567 be obtained online at http://www.opengroup.org/unix/online.html.
2568
2571 Joerg Schilling
2572 Seestr. 110
2573 D-13353 Berlin
2574 Germany
2575
2576 Mail bugs and suggestions to:
2577
2578 schilling@fokus.fraunhofer.de or js@cs.tu-berlin.de or
2579 joerg@schily.isdn.cs.tu-berlin.de
2580
2581
2582
2583Joerg Schilling 09/04/10 SPAX(1L)