1SPAX(1L) Schily´s USER COMMANDS SPAX(1L)
2
3
4
5 NAME
6 pax - portable archive interchange
7
8 SYNOPSIS
9 spax [other options] [-cdnv] [-H|-L] [-f archive]
10 [-o options]... [-s replstr]... [pattern...]
11
12
13 spax -r [other options] [-cdiknuv] [-H|-L] [-f archive]
14 [-o options]... [-p string]... [-s replstr]... [pattern...]
15
16
17 spax -w [other options] [-dituvX] [-H|-L] [-b blocksize] [-a]
18 [-f archive] [-o options]... [-s replstr]... [-x format]
19 [file...]
20
21
22 spax -r -w [other options] [-diklntuvX] [-H|-L] [-o options]...
23 [-p string]... [-s replstr]... [file...] directory
24
25 DESCRIPTION
26 The pax utility shall read, write, and write lists of the members of
27 archive files and copy directory hierarchies. A variety of archive for‐
28 mats shall be supported; see the -x format option.
29
30 The action to be taken depends on the presence of the -r and -w
31 options. The four combinations of -r and -w are referred to as the four
32 modes of operation: list, read, write, and copy modes, corresponding
33 respectively to the four forms shown in the SYNOPSIS section.
34
35 list In list mode (when neither -r nor -w are specified), pax shall
36 write the names of the members of the archive file read from the
37 standard input, with pathnames matching the specified patterns,
38 to standard output. If a named file is of type directory, the
39 file hierarchy rooted at that file shall be listed as well.
40
41 read In read mode (when -r is specified, but -w is not), pax shall
42 extract the members of the archive file read from the standard
43 input, with pathnames matching the specified patterns. If an
44 extracted file is of type directory, the file hierarchy rooted
45 at that file shall be extracted as well. The extracted files
46 shall be created performing pathname resolution with the direc‐
47 tory in which pax was invoked as the current working directory.
48
49 If an attempt is made to extract a directory when the directory
50 already exists, this shall not be considered an error. If an
51 attempt is made to extract a FIFO when the FIFO already exists,
52 this shall not be considered an error.
53
54 The ownership, access, and modification times, and file mode of
55 the restored files are discussed under the -p option.
56
57 write In write mode (when -w is specified, but -r is not), pax shall
58 write the contents of the file operands to the standard output
59 in an archive format. If no file operands are specified, a list
60 of files to copy, one per line, shall be read from the standard
61 input. A file of type directory shall include all of the files
62 in the file hierarchy rooted at the file.
63
64 copy In copy mode (when both -r and -w are specified), pax shall copy
65 the file operands to the destination directory.
66
67 If no file operands are specified, a list of files to copy, one
68 per line, shall be read from the standard input. A file of type
69 directory shall include all of the files in the file hierarchy
70 rooted at the file.
71
72 The effect of the copy shall be as if the copied files were
73 written to an archive file and then subsequently extracted,
74 except that there may be hard links between the original and the
75 copied files. If the destination directory is a subdirectory of
76 one of the files to be copied, the results are unspecified. If
77 the destination directory is a file of a type not defined by the
78 System Interfaces volume of IEEE Std 1003.1-2001, the results
79 are implementation-defined; otherwise, it shall be an error for
80 the file named by the directory operand not to exist, not be
81 writable by the user, or not be a file of type directory.
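
The mapping from the -r and -w options to the four modes can be summarized in a few lines. This is only an illustrative sketch; the function name operating_mode is hypothetical and not part of spax.

```python
def operating_mode(r_given: bool, w_given: bool) -> str:
    """Return the pax mode implied by the presence of -r and -w."""
    if r_given and w_given:
        return "copy"    # both -r and -w
    if r_given:
        return "read"    # -r only
    if w_given:
        return "write"   # -w only
    return "list"        # neither option


if __name__ == "__main__":
    for r in (False, True):
        for w in (False, True):
            print("-r" if r else "  ", "-w" if w else "  ", operating_mode(r, w))
```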
82
83 In read or copy modes, if intermediate directories are necessary to
84 extract an archive member, pax shall perform actions equivalent to the
85 mkdir() function defined in the System Interfaces volume of IEEE Std
86 1003.1-2001, called with the following arguments:
87
88 · The intermediate directory used as the path argument.
89
90 · The value of the bitwise-inclusive OR of S_IRWXU, S_IRWXG, and
91 S_IRWXO as the mode argument.
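
As a rough illustration of the two arguments listed above, the following sketch creates the missing parent directories of an archive member with a mode of S_IRWXU|S_IRWXG|S_IRWXO (octal 0777). The helper name make_intermediate_dirs and its root parameter are assumptions made for this example; the process umask still applies, as it does for mkdir().

```python
import os
import stat

# Bitwise-inclusive OR of the three permission masks; equals 0o777.
INTERMEDIATE_DIR_MODE = stat.S_IRWXU | stat.S_IRWXG | stat.S_IRWXO


def make_intermediate_dirs(member_path, root="."):
    """Create the missing parent directories of member_path below root.

    Returns the list of directories that were actually created.
    """
    created = []
    current = root
    for part in os.path.dirname(member_path).split("/"):
        if not part:
            continue
        current = os.path.join(current, part)
        try:
            os.mkdir(current, INTERMEDIATE_DIR_MODE)  # subject to the umask
            created.append(current)
        except FileExistsError:
            pass  # an existing directory is not an error
    return created
```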
92
93 If any specified pattern or file operands are not matched by at least
94 one file or archive member, pax shall write a diagnostic message to
95 standard error for each one that did not match and exit with a non-zero
96 exit status.
97
98 The archive formats described in the EXTENDED DESCRIPTION section shall
99 be automatically detected on input. The default output archive format
100 shall be implementation-defined.
101
102 The spax implementation defaults to -x ustar.
103
104 A single archive can span multiple files. The pax utility shall deter‐
105 mine, in an implementation-defined manner, what file to read or write
106 as the next file.
107
108 If the selected archive format supports the specification of linked
109 files, it shall be an error if these files cannot be linked when the
110 archive is extracted, except that if the files to be linked are sym‐
111 bolic links and the system is not capable of making hard links to sym‐
112 bolic links, then separate copies of the symbolic link shall be created
113 instead. For archive formats that do not store file contents with each
114 name that causes a hard link, if the file that contains the data is not
115 extracted during this pax session, either the data shall be restored
116 from the original file, or a diagnostic message shall be displayed with
117 the name of a file that can be used to extract the data. In traversing
118 directories, pax shall detect infinite loops; that is, entering a pre‐
119 viously visited directory that is an ancestor of the last file visited.
120 When it detects an infinite loop, pax shall write a diagnostic message
121 to standard error and shall terminate.
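
One common way to detect such a loop is to remember the (st_dev, st_ino) pair of every directory on the path from the root to the current file. The sketch below follows symbolic links (as a traversal with -L would) and raises an error when it re-enters an ancestor. It is not taken from the spax sources; the names walk_following_links and DirectoryLoopError are hypothetical.

```python
import os
import stat


class DirectoryLoopError(RuntimeError):
    """Raised when the traversal re-enters an ancestor directory."""


def walk_following_links(root):
    """Yield every path below root, following symbolic links."""

    def visit(path, ancestors):
        st = os.stat(path)                  # follows symbolic links
        key = (st.st_dev, st.st_ino)
        if key in ancestors:                # already on the current path
            raise DirectoryLoopError("loop detected at %s" % path)
        yield path
        if stat.S_ISDIR(st.st_mode):
            for name in sorted(os.listdir(path)):
                yield from visit(os.path.join(path, name), ancestors | {key})

    yield from visit(root, frozenset())
```

Only directories on the current path are tracked, so a directory reached twice through unrelated links is visited twice but is not reported as a loop.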
122
123
124 OPTIONS
125 The pax utility shall conform to the Base Definitions volume of IEEE
126 Std 1003.1-2001, Section 12.2, Utility Syntax Guidelines, except that
127 the order of presentation of the -o, -p, and -s options is significant.
128
129 See also the "OTHER OPTIONS" section.
130
131
132 The following options shall be supported:
133
134 -r Read an archive file from standard input.
135
136 -w Write files to the standard output in the specified archive for‐
137 mat.
138
139 -a Append files to the end of the archive. It is implementation-
140 defined which devices on the system support appending. Addi‐
141 tional file formats unspecified by this volume of IEEE Std
142 1003.1-2001 may impose restrictions on appending.
143
144 -b blocksize
145 Block the output at a positive decimal integer number of bytes
146 per write to the archive file. Devices and archive formats may
147 impose restrictions on blocking. Blocking shall be automatically
148 determined on input. Conforming applications shall not specify a
149 blocksize value larger than 32256. Default blocking when creat‐
150 ing archives depends on the archive format. (See the -x option
151 below.)
152
153 -c Match all file or archive members except those specified by the
154 pattern or file operands.
155
156 -d Cause files of type directory being copied or archived or ar‐
157 chive members of type directory being extracted or listed to
158 match only the file or archive member itself and not the file
159 hierarchy rooted at the file.
160
161 -f archive
162 Specify the pathname of the input or output archive, overriding
163 the default standard input (in list or read modes) or standard
164 output (write mode).
165
166 -H If a symbolic link referencing a file of type directory is spec‐
167 ified on the command line, pax shall archive the file hierarchy
168 rooted in the file referenced by the link, using the name of the
169 link as the root of the file hierarchy. Otherwise, if a sym‐
170 bolic link referencing a file of any other file type which pax
171 can normally archive is specified on the command line, then pax
172 shall archive the file referenced by the link, using the name of
173 the link. The default behavior shall be to archive the symbolic
174 link itself.
175
176 -i Interactively rename files or archive members. For each archive
177 member matching a pattern operand or file matching a file oper‐
178 and, a prompt shall be written to the file /dev/tty. The prompt
179 shall contain the name of the file or archive member, but the
180 format is otherwise unspecified. A line shall then be read from
181 /dev/tty. If this line is blank, the file or archive member
182 shall be skipped. If this line consists of a single period, the
183 file or archive member shall be processed with no modification
184 to its name. Otherwise, its name shall be replaced with the con‐
185 tents of the line. The pax utility shall immediately exit with a
186 non-zero exit status if end-of-file is encountered when reading
187 a response or if /dev/tty cannot be opened for reading and writ‐
188 ing.
189
190 The results of extracting a hard link to a file that has been
191 renamed during extraction are unspecified.
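
The three possible responses to the -i prompt can be modeled as a small decision function. This is a sketch under assumptions: the function name is hypothetical, a "blank" line is taken to mean empty or whitespace only, and end-of-file is represented by passing None.

```python
def interpret_rename_response(name, line):
    """Map one line read from /dev/tty to the resulting member name.

    Returns None when the member is to be skipped, otherwise the
    name under which it is processed.
    """
    if line is None:
        # End-of-file: pax exits immediately with a non-zero status.
        raise EOFError("end-of-file while reading rename response")
    text = line.rstrip("\n")
    if text.strip() == "":
        return None      # blank line: skip this member
    if text == ".":
        return name      # single period: keep the name unchanged
    return text          # anything else: the replacement name
```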
192
193 -k Prevent the overwriting of existing files.
194
195 -l (The letter ell.) In copy mode, hard links shall be made between
196 the source and destination file hierarchies whenever possible.
197 If specified in conjunction with -H or -L, when a symbolic link
198 is encountered, the hard link created in the destination file
199 hierarchy shall be to the file referenced by the symbolic link.
200 If specified when neither -H nor -L is specified, when a sym‐
201 bolic link is encountered, the implementation shall create a
202 hard link to the symbolic link in the source file hierarchy or
203 copy the symbolic link to the destination.
204
205 -L If a symbolic link referencing a file of type directory is spec‐
206 ified on the command line or encountered during the traversal of
207 a file hierarchy, pax shall archive the file hierarchy rooted in
208 the file referenced by the link, using the name of the link as
209 the root of the file hierarchy. Otherwise, if a symbolic link
210 referencing a file of any other file type which pax can normally
211 archive is specified on the command line or encountered during
212 the traversal of a file hierarchy, pax shall archive the file
213 referenced by the link, using the name of the link. The default
214 behavior shall be to archive the symbolic link itself.
215
216 -n Select the first archive member that matches each pattern oper‐
217 and. No more than one archive member shall be matched for each
218 pattern (although members of type directory shall still match
219 the file hierarchy rooted at that file).
220
221 -o options
222 Provide information to the implementation to modify the algo‐
223 rithm for extracting or writing files. The value of options
224 shall consist of one or more comma-separated keywords of the
225 form:
226
227 keyword[[:]=value][,keyword[[:]=value],...]
228
229 Some keywords apply only to certain file formats, as indicated
230 with each description. Use of keywords that are inapplicable to
231 the file format being processed produces undefined results.
232
233 Keywords in the options argument shall be a string that would be
234 a valid portable filename as described in the Base Definitions
235 volume of IEEE Std 1003.1-2001, Section 3.276, Portable Filename
236 Character Set.
237
238 Note: Keywords are not expected to be filenames, merely to fol‐
239 low the same character composition rules as portable
240 filenames.
241
242 Keywords can be preceded with white space. The value field shall
243 consist of zero or more characters; within value, the applica‐
244 tion shall precede any literal comma with a backslash, which
245 shall be ignored, but preserves the comma as part of value. A
246 comma as the final character, or a comma followed solely by
247 white space as the final characters, in options shall be
248 ignored. Multiple -o options can be specified; if keywords given
249 to these multiple -o options conflict, the keywords and values
250 appearing later in command line sequence shall take precedence
251 and the earlier shall be silently ignored. The following keyword
252 values of options shall be supported for the file formats as
253 indicated:
254
255 delete=pattern
256 (Applicable only to the -x pax format.) When used in
257 write or copy mode, pax shall omit from extended header
258 records that it produces any keywords matching the string
259 pattern. When used in read or list mode, pax shall ignore
260 any keywords matching the string pattern in the extended
261 header records. In both cases, matching shall be per‐
262 formed using the pattern matching notation described in
263 Patterns Matching a Single Character and Patterns Match‐
264 ing Multiple Characters. For example:
265
266 -o delete=security.*
267
268 would suppress security-related information. See pax
269 Extended Header for extended header record keyword usage.
270
271 When multiple -o delete=pattern options are specified,
272 the patterns shall be additive; all keywords matching the
273 specified string patterns shall be omitted from extended
274 header records that pax produces.
275
276 exthdr.name=string
277 (Applicable only to the -x pax format.) This keyword
278 allows user control over the name that is written into
279 the ustar header blocks for the extended header produced
280 under the circumstances described in pax Header Block.
281 The name shall be the contents of string, after the fol‐
282 lowing character substitutions have been made:
283
284 ┌─────────────────┬─────────────────────────────────────────────┐
285 │string Includes: │ Replaced By: │
286 ├─────────────────┼─────────────────────────────────────────────┤
287 │%d │ The directory name of the file, equivalent │
288 │ │ to the result of the dirname utility on the │
289 │ │ translated pathname. │
290 ├─────────────────┼─────────────────────────────────────────────┤
291 │%f │ The filename of the file, equivalent to the │
292 │ │ result of the basename utility on the │
293 │ │ translated pathname. │
294 ├─────────────────┼─────────────────────────────────────────────┤
295 │%p │ The process ID of the pax process. │
296 ├─────────────────┼─────────────────────────────────────────────┤
297 │%% │ A '%' character. │
298 └─────────────────┴─────────────────────────────────────────────┘
299 Any other '%' characters in string produce undefined
300 results.
301
302 If no -o exthdr.name=string is specified, pax shall use
303 the following default value:
304
305 %d/PaxHeaders.%p/%f
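
The substitutions in the table above could be implemented roughly as follows. This is a sketch, not spax code: the function name expand_exthdr_name is hypothetical, posixpath is used as an approximation of the dirname and basename utilities, and an unknown '%' sequence (undefined behavior per the text) is reported as an error here.

```python
import posixpath
import re


def expand_exthdr_name(template, pathname, pid):
    """Expand %d, %f, %p and %% in an exthdr.name template."""

    def repl(match):
        code = match.group(1)
        if code == "d":
            # The dirname utility prints "." for a name without a slash.
            return posixpath.dirname(pathname) or "."
        if code == "f":
            return posixpath.basename(pathname)
        if code == "p":
            return str(pid)
        if code == "%":
            return "%"
        raise ValueError("undefined substitution %%%s" % code)

    return re.sub(r"%(.?)", repl, template)


if __name__ == "__main__":
    print(expand_exthdr_name("%d/PaxHeaders.%p/%f", "usr/lib/libc.so", 1234))
```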
306
307 globexthdr.name=string
308 (Applicable only to the -x pax format.) When used in
309 write or copy mode with the appropriate options, pax
310 shall create global extended header records with ustar
311 header blocks that will be treated as regular files by
312 previous versions of pax. This keyword allows user con‐
313 trol over the name that is written into the ustar header
314 blocks for global extended header records. The name shall
315 be the contents of string, after the following character
316 substitutions have been made:
317
318 ┌─────────────────┬─────────────────────────────────────────────┐
319 │string Includes: │ Replaced By: │
320 ├─────────────────┼─────────────────────────────────────────────┤
321 │%n │ An integer that represents the sequence │
322 │ │ number of the global extended header record │
323 │ │ in the archive, starting at 1. │
324 ├─────────────────┼─────────────────────────────────────────────┤
325 │%p │ The process ID of the pax process. │
326 ├─────────────────┼─────────────────────────────────────────────┤
327 │%% │ A '%' character. │
328 └─────────────────┴─────────────────────────────────────────────┘
329 Any other '%' characters in string produce undefined
330 results.
331
332 If no -o globexthdr.name=string is specified, pax shall
333 use the following default value:
334
335 $TMPDIR/GlobalHead.%p.%n
336
337 where $TMPDIR represents the value of the TMPDIR environ‐
338 ment variable. If TMPDIR is not set, pax shall use /tmp.
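
The default name and the fallback to /tmp can be sketched as shown below. The helper names are hypothetical, and the environment is passed in as a plain mapping so that the example does not depend on the caller's real TMPDIR.

```python
import re


def default_globexthdr_template(environ):
    """Return $TMPDIR/GlobalHead.%p.%n, using /tmp if TMPDIR is unset."""
    return environ.get("TMPDIR", "/tmp") + "/GlobalHead.%p.%n"


def expand_globexthdr_name(template, sequence, pid):
    """Expand %n (sequence number, starting at 1), %p and %%."""
    values = {"n": str(sequence), "p": str(pid), "%": "%"}

    def repl(match):
        code = match.group(1)
        if code not in values:
            # Undefined per the text; treated as an error in this sketch.
            raise ValueError("undefined substitution %%%s" % code)
        return values[code]

    return re.sub(r"%(.?)", repl, template)
```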
339
340 invalid=action
341 (Applicable only to the -x pax format.) This keyword
342 allows user control over the action pax takes upon
343 encountering values in an extended header record that, in
344 read or copy mode, are invalid in the destination hierar‐
345 chy or, in list mode, cannot be written in the codeset
346 and current locale of the implementation. The following
347 are invalid values that shall be recognized by pax:
348
349 + In read or copy mode, a filename or link name that
350 contains character encodings invalid in the desti‐
351 nation hierarchy. (For example, the name may con‐
352 tain embedded NULs.)
353
354 + In read or copy mode, a filename or link name that
355 is longer than the maximum allowed in the destina‐
356 tion hierarchy (for either a pathname component or
357 the entire pathname).
358
359 + In list mode, any character string value (file‐
360 name, link name, user name, and so on) that cannot
361 be written in the codeset and current locale of
362 the implementation.
363
364 The following mutually-exclusive values of the action
365 argument are supported:
366
367 bypass In read or copy mode, pax shall bypass the file,
368 causing no change to the destination hierarchy. In
369 list mode, pax shall write all requested valid
370 values for the file, but its method for writing
371 invalid values is unspecified.
372
373 rename In read or copy mode, pax shall act as if the -i
374 option were in effect for each file with invalid
375 filename or link name values, allowing the user to
376 provide a replacement name interactively. In list
377 mode, pax shall behave identically to the bypass
378 action.
379
380 UTF-8 When used in read, copy, or list mode and a file‐
381 name, link name, owner name, or any other field in
382 an extended header record cannot be translated
383 from the pax UTF-8 codeset format to the codeset
384 and current locale of the implementation, pax
385 shall use the actual UTF-8 encoding for the name.
386
387 write In read or copy mode, pax shall write the file,
388 translating the name, regardless of whether this
389 may overwrite an existing file with a valid name.
390 In list mode, pax shall behave identically to the
391 bypass action.
392
393 If no -o invalid=action is specified, pax shall act as if
394 -o invalid=bypass were specified. Any overwriting of
395 existing files that may be allowed by the -o invalid=
396 actions shall be subject to permission (-p) and modifica‐
397 tion time (-u) restrictions, and shall be suppressed if
398 the -k option is also specified.
399
400 linkdata
401 (Applicable only to the -x pax format.) In write mode,
402 pax shall write the contents of a file to the archive
403 even when that file is merely a hard link to a file whose
404 contents have already been written to the archive.
405
406 listopt=format
407 This keyword specifies the output format of the table of
408 contents produced when the -v option is specified in list
409 mode. See List Mode Format Specifications. To avoid ambi‐
410 guity, the listopt=format shall be the only or final
411 keyword=value pair in a -o option-argument; all charac‐
412 ters in the remainder of the option-argument shall be
413 considered part of the format string. When multiple -o
414 listopt=format options are specified, the format strings
415 shall be considered a single, concatenated string, evalu‐
416 ated in command line order.
417
418 times (Applicable only to the -x pax format.) When used in
419 write or copy mode, pax shall include atime and mtime
420 extended header records for each file. See pax Extended
421 Header File Times.
422
423 In addition to these keywords, if the -x pax format is speci‐
424 fied, any of the keywords and values defined in pax Extended
425 Header, including implementation extensions, can be used in -o
426 option-arguments, in either of two modes:
427
428 keyword=value
429 When used in write or copy mode, these keyword/value
430 pairs shall be included at the beginning of the archive
431 as typeflag g global extended header records. When used
432 in read or list mode, these keyword/value pairs shall act
433 as if they had been at the beginning of the archive as
434 typeflag g global extended header records.
435
436 keyword:=value
437 When used in write or copy mode, these keyword/value
438 pairs shall be included as records at the beginning of a
439 typeflag x extended header for each file. (This shall be
440 equivalent to the equal-sign form except that it creates
441 no typeflag g global extended header records.) When used
442 in read or list mode, these keyword/value pairs shall act
443 as if they were included as records at the end of each
444 extended header; thus, they shall override any global or
445 file-specific extended header record keywords of the same
446 names. For example, in the command:
447
448 pax -r -o "gname:=mygroup," <archive
449
450 the group name will be forced to a new value for all
451 files read from the archive.
452
453 The precedence of -o keywords over various fields in the archive
454 is described in pax Extended Header Keyword Precedence.
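
The syntax rules for the -o option-argument (comma-separated items, a backslash protecting a literal comma, optional white space before a keyword, an ignored trailing comma, and the "=" versus ":=" forms) can be sketched as a small parser. This is an illustration only: the function name parse_o_options is hypothetical, the special "rest of the argument" rule for listopt= is not modeled, and empty items in the middle of the list are silently skipped as an assumption.

```python
def parse_o_options(argument):
    """Split a -o option-argument into (keyword, value, per_file) tuples.

    value is None for a bare keyword; per_file is True for the
    keyword:=value form.
    """
    items, buf, i = [], [], 0
    while i < len(argument):
        ch = argument[i]
        if ch == "\\" and argument[i + 1:i + 2] == ",":
            buf.append(",")          # backslash dropped, comma kept
            i += 2
        elif ch == ",":
            items.append("".join(buf))
            buf = []
            i += 1
        else:
            buf.append(ch)
            i += 1
    items.append("".join(buf))

    parsed = []
    for item in items:
        item = item.lstrip()         # keywords may be preceded by white space
        if not item:
            continue                 # e.g. a trailing comma
        keyword, sep, value = item.partition("=")
        if not sep:
            parsed.append((keyword, None, False))
        elif keyword.endswith(":"):
            parsed.append((keyword[:-1], value, True))
        else:
            parsed.append((keyword, value, False))
    return parsed


if __name__ == "__main__":
    print(parse_o_options("gname:=mygroup,"))
```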
455
456 -p string
457 Specify one or more file characteristic options (privileges).
458 The string option-argument shall be a string specifying file
459 characteristics to be retained or discarded on extraction. The
460 string shall consist of the specification characters a, e, m,
461 o, and p. Other implementation-defined characters can be
462 included. Multiple characteristics can be concatenated within
463 the same string and multiple -p options can be specified. The
464 meanings of the specification characters are as follows:
465
466 a Do not preserve file access times.
467
468 e Preserve the user ID, group ID, file mode bits (see the
469 Base Definitions volume of IEEE Std 1003.1-2001, Section
470 3.168, File Mode Bits), access time, modification time,
471 and any other implementation-defined file characteris‐
472 tics.
473
474 m Do not preserve file modification times.
477
478 o Preserve the user ID and group ID.
479
480 p Preserve the file mode bits. Other implementation-defined
481 file mode attributes may be preserved.
482
483 In the preceding list, "preserve" indicates that an attribute
484 stored in the archive shall be given to the extracted file, sub‐
485 ject to the permissions of the invoking process. The access and
486 modification times of the file shall be preserved unless other‐
487 wise specified with the -p option or not stored in the archive.
488 All attributes that are not preserved shall be determined as
489 part of the normal file creation action (see File Read, Write,
490 and Creation).
491
492 If neither the e nor the o specification character is specified,
493 or the user ID and group ID are not preserved for any reason,
494 pax shall not set the S_ISUID and S_ISGID bits of the file mode.
495
496 If the preservation of any of these items fails for any reason,
497 pax shall write a diagnostic message to standard error. Failure
498 to preserve these items shall affect the final exit status, but
499 shall not cause the extracted file to be deleted.
500
501 If file characteristic letters in any of the string option-argu‐
502 ments are duplicated or conflict with each other, the ones given
503 last shall take precedence. For example, if -p eme is specified,
504 file modification times are preserved.
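
The "last one wins" rule can be modeled by processing the specification characters from left to right. The sketch below reflects one reading of the text: access and modification times are preserved by default, while ownership and mode bits are not. The names resolve_p_strings and may_set_setid_bits are hypothetical.

```python
def resolve_p_strings(strings):
    """Combine all -p option-arguments into a set of preservation flags."""
    state = {"atime": True, "mtime": True, "owner": False, "mode": False}
    for ch in "".join(strings):
        if ch == "a":
            state["atime"] = False
        elif ch == "m":
            state["mtime"] = False
        elif ch == "o":
            state["owner"] = True
        elif ch == "p":
            state["mode"] = True
        elif ch == "e":
            state = {key: True for key in state}   # preserve everything
        else:
            raise ValueError("unsupported specification character %r" % ch)
    return state


def may_set_setid_bits(state):
    """S_ISUID and S_ISGID are set only if user and group ID are preserved."""
    return state["owner"]
```

With -p eme, the trailing e overrides the earlier m, so modification times are preserved, which matches the example in the text.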
505
506 -s replstr
507 Modify file or archive member names named by pattern or file op‐
508 erands according to the substitution expression replstr, using
509 the syntax of the ed utility. The concepts of "address" and
510 "line" are meaningless in the context of the pax utility, and
511 shall not be supplied. The format shall be:
512
513 -s /old/new/[gp]
514
515 where as in ed, old is a basic regular expression and new can
516 contain an ampersand, '\n' (where n is a digit) backreferences,
517 or subexpression matching. The old string shall also be permit‐
518 ted to contain <newline>s.
519
520 Any non-null character can be used as a delimiter ('/' shown
521 here). Multiple -s expressions can be specified; the expressions
522 shall be applied in the order specified, terminating with the
523 first successful substitution. The optional trailing 'g' is as
524 defined in the ed utility. The optional trailing 'p' shall cause
525 successful substitutions to be written to standard error. File
526 or archive member names that substitute to the empty string
527 shall be ignored when reading and writing archives.
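
The way multiple -s expressions interact (applied in order, stopping at the first successful substitution, with an empty result causing the member to be ignored) can be sketched as follows. Note the assumptions: Python regular expressions stand in for POSIX basic regular expressions, so grouping is written ( ) rather than \( \); the delimiter may not appear escaped inside the expression; and the function names are hypothetical.

```python
import re
import sys


def parse_replstr(replstr):
    """Split '/old/new/[gp]' (any delimiter) into (regex, new, flags)."""
    if not replstr:
        raise ValueError("empty replstr")
    parts = replstr[1:].split(replstr[0])
    if len(parts) != 3 or set(parts[2]) - set("gp"):
        raise ValueError("malformed replstr: %r" % replstr)
    return re.compile(parts[0]), parts[1], parts[2]


def _expand(new, match):
    """Expand '&' and backslash-digit backreferences in the replacement."""
    out, i = [], 0
    while i < len(new):
        ch = new[i]
        if ch == "\\" and i + 1 < len(new):
            nxt = new[i + 1]
            out.append((match.group(int(nxt)) or "") if nxt.isdigit() else nxt)
            i += 2
        else:
            out.append(match.group(0) if ch == "&" else ch)
            i += 1
    return "".join(out)


def rename_member(name, replstrs):
    """Return the new name, or None if the member is to be ignored."""
    for replstr in replstrs:
        regex, new, flags = parse_replstr(replstr)
        count = 0 if "g" in flags else 1        # 0 means "replace all"
        result, hits = regex.subn(lambda m: _expand(new, m), name, count=count)
        if hits:
            if "p" in flags:
                print("%s >> %s" % (name, result), file=sys.stderr)
            return result or None               # empty name: ignore member
    return name                                 # no expression matched
```

For example, rename_member("src/main.c", ["/^src/lib/", "/main/prog/"]) stops after the first expression, so the second one is never applied to that name.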
528
529 -t When reading files from the file system, and if the user has the
530 permissions required by utime() to do so, set the access time of
531 each file read to the access time that it had before being read
532 by pax.
533
534 -u Ignore files that are older (having a less recent file modifica‐
535 tion time) than a pre-existing file or archive member with the
536 same name. In read mode, an archive member with the same name as
537 a file in the file system shall be extracted if the archive mem‐
538 ber is newer than the file. In write mode, an archive file mem‐
539 ber with the same name as a file in the file system shall be
540 superseded if the file is newer than the archive member. If -a
541 is also specified, this is accomplished by appending to the ar‐
542 chive; otherwise, it is unspecified whether this is accomplished
543 by actual replacement in the archive or by appending to the ar‐
544 chive. In copy mode, the file in the destination hierarchy shall
545 be replaced by the file in the source hierarchy or by a link to
546 the file in the source hierarchy if the file in the source hier‐
547 archy is newer.
548
549 -v In list mode, produce a verbose table of contents (see the STD‐
550 OUT section). Otherwise, write archive member pathnames to stan‐
551 dard error (see the STDERR section).
552
553 -x format
554 Specify the output archive format. The pax utility shall support
555 the following formats:
556
557 cpio The cpio interchange format; see the EXTENDED DESCRIPTION
558 section. The default blocksize for this format for char‐
559 acter special archive files shall be 5120. Implementa‐
560 tions shall support all blocksize values less than or
561 equal to 32256 that are multiples of 512.
562
563 pax The pax interchange format; see the EXTENDED DESCRIPTION
564 section. The default blocksize for this format for char‐
565 acter special archive files shall be 5120. Implementa‐
566 tions shall support all blocksize values less than or
567 equal to 32256 that are multiples of 512.
568
569 ustar The tar interchange format; see the EXTENDED DESCRIPTION
570 section. The default blocksize for this format for char‐
571 acter special archive files shall be 10240. Implementa‐
572 tions shall support all blocksize values less than or
573 equal to 32256 that are multiples of 512.
574
575 Implementation-defined formats shall specify a default block
576 size as well as any other block sizes supported for character
577 special archive files.
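
The default block sizes and the range that every implementation must accept can be captured in a short table and predicate. This is a sketch; the names DEFAULT_BLOCKSIZE and is_guaranteed_blocksize are hypothetical. Values outside the guaranteed set are not necessarily rejected by an implementation, they are simply not portable.

```python
# Default blocksize in bytes for character special archive files.
DEFAULT_BLOCKSIZE = {"cpio": 5120, "pax": 5120, "ustar": 10240}

# Largest blocksize a conforming application may specify with -b.
MAX_PORTABLE_BLOCKSIZE = 32256


def is_guaranteed_blocksize(blocksize):
    """True if every implementation must support this -b value."""
    return (
        0 < blocksize <= MAX_PORTABLE_BLOCKSIZE
        and blocksize % 512 == 0
    )
```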
578
579 Any attempt to append to an archive file in a format different
580 from the existing archive format shall cause pax to exit immedi‐
581 ately with a non-zero exit status.
582
583 In copy mode, if no -x format is specified, pax shall behave as
584 if -x pax were specified.
585
586 -X When traversing the file hierarchy specified by a pathname, pax
587 shall not descend into directories that have a different device
588 ID (st_dev; see the System Interfaces volume of IEEE Std
589 1003.1-2001, stat()).
590
591 Specifying more than one of the mutually-exclusive options -H and -L
592 shall not be considered an error and the last option specified shall
593 determine the behavior of the utility.
594
595 The options that operate on the names of files or archive members (-c,
596 -i, -n, -s, -u, and -v) shall interact as follows. In read mode, the
597 archive members shall be selected based on the user-specified pattern
598 operands as modified by the -c, -n, and -u options. Then, any -s and -i
599 options shall modify, in that order, the names of the selected files.
600 The -v option shall write names resulting from these modifications.
601
602 In write mode, the files shall be selected based on the user-specified
603 pathnames as modified by the -n and -u options. Then, any -s and -i
604 options shall modify, in that order, the names of these selected files.
605 The -v option shall write names resulting from these modifications.
606
607 If both the -u and -n options are specified, pax shall not consider a
608 file selected unless it is newer than the file to which it is compared.
609
610
611 List Mode Format Specifications
612 The manual page for spax is not yet ready. The following text is a
613 quotation from the POSIX.1-2001 standard.
614
615 In list mode with the -o listopt=format option, the format argument
616 shall be applied for each selected file. The pax utility shall append a
617 <newline> to the listopt output for each selected file. The format
618 argument shall be used as the format string described in the Base Defi‐
619 nitions volume of IEEE Std 1003.1-2001, Chapter 5, File Format Nota‐
620 tion, with the exceptions 1. through 5. defined in the EXTENDED
621 DESCRIPTION section of printf(3), plus the following exceptions:
622
623 6. The sequence (keyword) can occur before a format conversion
624 specifier. The conversion argument is defined by the value of
625 keyword. The implementation shall support the following key‐
626 words:
627
628 · Any of the Field Name entries in ustar Header Block and
629 Octet-Oriented cpio Archive Entry. The implementation may
630 support the cpio keywords without the leading c_ in addi‐
631 tion to the form required by Values for cpio c_mode
632 Field.
633
634 · Any keyword defined for the extended header in pax
635 Extended Header.
636
637 · Any keyword provided as an implementation-defined exten‐
638 sion within the extended header defined in pax Extended
639 Header.
640
641 For example, the sequence "%(charset)s" is the string value of
642 the name of the character set in the extended header.
643
644 The result of the keyword conversion argument shall be the value
645 from the applicable header field or extended header, without any
646 trailing NULs.
647
648 All keyword values used as conversion arguments shall be trans‐
649 lated from the UTF-8 encoding to the character set appropriate
650 for the local file system, user database, and so on, as applica‐
651 ble.
652
653 7. An additional conversion specifier character, T, shall be used
654 to specify time formats. The T conversion specifier character
655 can be preceded by the sequence (keyword=subformat), where sub‐
656 format is a date format as defined by date operands. The default
657 keyword shall be mtime and the default subformat shall be:
658
659 %b %e %H:%M %Y
660
661 8. An additional conversion specifier character, M, shall be used
662 to specify the file mode string as defined in ls(1) Standard
663 Output. If (keyword) is omitted, the mode keyword shall be used.
664 For example, %.1M writes the single character corresponding to
665 the <entry type> field of the ls -l command.
666
667 9. An additional conversion specifier character, D, shall be used
668 to specify the device for block or special files, if applicable,
669 in an implementation-defined format. If not applicable, and
670 (keyword) is specified, then this conversion shall be equivalent
671 to %(keyword)u. If not applicable, and (keyword) is omitted,
672 then this conversion shall be equivalent to <space>.
673
674 10. An additional conversion specifier character, F, shall be used
675 to specify a pathname. The F conversion character can be pre‐
676 ceded by a sequence of comma-separated keywords:
677
678 (keyword[,keyword] ... )
679 The values for all the keywords that are non-null shall be con‐
680 catenated together, each separated by a '/'. The default shall
681 be (path) if the keyword path is defined; otherwise, the default
682 shall be (prefix, name).
683
684 11. An additional conversion specifier character, L, shall be used
685 to specify a symbolic link expansion. If the current file is a
686 symbolic link, then %L shall expand to:
687
688 "%s -> %s", <value of keyword>, <contents of link>
689
690 Otherwise, the %L conversion specification shall be the equivalent of
691 %F.
692
693
694 OPERANDS
695 The following operands shall be supported:
696
697 directory
698 The destination directory pathname for copy mode.
699
700 file A pathname of a file to be copied or archived.
701
702 pattern
703 A pattern matching one or more pathnames of archive members. A
704 pattern must be given in the name-generating notation of the
705 pattern matching notation in Pattern Matching Notation, including
706 the filename expansion rules in Patterns Used for Filename
707 Expansion. The default, if no pattern is specified, is to select
708 all members in the archive.
709
710
711 STDIN
712 In write mode, the standard input shall be used only if no file oper‐
713 ands are specified. It shall be a text file containing a list of path‐
714 names, one per line, without leading or trailing <blank>s.
715
716 In list and read modes, if -f is not specified, the standard input
717 shall be an archive file.
718
719 Otherwise, the standard input shall not be used.
720
721
722 INPUT FILES
723 The input file named by the archive option-argument, or standard input
724 when the archive is read from there, shall be a file formatted accord‐
725 ing to one of the specifications in the EXTENDED DESCRIPTION section or
726 some other implementation-defined format.
727
728 The file /dev/tty shall be used to write prompts and read responses.
729
730
731 ENVIRONMENT VARIABLES
732 The following environment variables shall affect the execution of pax:
733
734 LANG Provide a default value for the internationalization variables
735 that are unset or null. (See the Base Definitions volume of IEEE
736 Std 1003.1-2001, Section 8.2, Internationalization Variables for
737 the precedence of internationalization variables used to deter‐
738 mine the values of locale categories.)
739
740 LC_ALL If set to a non-empty string value, override the values of all
741 the other internationalization variables.
742
743 LC_COLLATE
744 Determine the locale for the behavior of ranges, equivalence
745 classes, and multi-character collating elements used in the pat‐
746 tern matching expressions for the pattern operand, the basic
747 regular expression for the -s option, and the extended regular
748 expression defined for the yesexpr locale keyword in the LC_MES‐
749 SAGES category.
750
751 LC_CTYPE
752 Determine the locale for the interpretation of sequences of
753 bytes of text data as characters (for example, single-byte as
754 opposed to multi-byte characters in arguments and input files),
755 the behavior of character classes used in the extended regular
756 expression defined for the yesexpr locale keyword in the LC_MES‐
757 SAGES category, and pattern matching.
758
759 LC_MESSAGES
760 Determine the locale used to process affirmative responses, and
761 the locale used to affect the format and contents of diagnostic
762 messages written to standard error.
763
764 LC_TIME
765 Determine the format and contents of date and time strings when
766 the -v option is specified.
767
768 NLSPATH
769 Determine the location of message catalogs for the processing
770 of LC_MESSAGES.
771
772 TMPDIR Determine the pathname that provides part of the default global
773 extended header record file, as described for the -o globexthdr=
774 keyword in the OPTIONS section.
775
776 TZ Determine the timezone used to calculate date and time strings
777 when the -v option is specified. If TZ is unset or null, an
778 unspecified default timezone shall be used.
779
780
781 ASYNCHRONOUS EVENTS
782 Default.
783
784
785 STDOUT
786 In write mode, if -f is not specified, the standard output shall be the
787 archive formatted according to one of the specifications in the
788 EXTENDED DESCRIPTION section, or some other implementation-defined for‐
789 mat (see -x format).
790
791 In list mode, when the -o listopt= format has been specified, the
792 selected archive members shall be written to standard output using the
793 format described under List Mode Format Specifications. In list mode
794 without the -o listopt= format option, the table of contents of the
795 selected archive members shall be written to standard output using the
796 following format:
797
798 "%s\n", <pathname>
799
800 If the -v option is specified in list mode, the table of contents of
801 the selected archive members shall be written to standard output using
802 the following formats.
803
804 For pathnames representing hard links to previous members of the ar‐
805 chive:
806
807 "%s == %s\n", <ls -l listing>, <linkname>
808
809 For all other pathnames:
810
811 "%s\n", <ls -l listing>
812
813 where <ls -l listing> shall be the format specified by the ls(1) util‐
814 ity with the -l option. When writing pathnames in this format, it is
815 unspecified what is written for fields for which the underlying archive
816 format does not have the correct information, although the correct num‐
817 ber of <blank>-separated fields shall be written.
818
819 In list mode, standard output shall not be buffered more than a line at
820 a time.
821
822
823 STDERR
824 If -v is specified in read, write, or copy modes, pax shall write the
825 pathnames it processes to the standard error output using the following
826 format:
827
828 "%s\n", <pathname>
829
830 These pathnames shall be written as soon as processing is begun on the
831 file or archive member, and shall be flushed to standard error. The
832 trailing <newline>, which shall not be buffered, is written when the
833 file has been read or written.
834
835 If the -s option is specified, and the replacement string has a trail‐
836 ing 'p', substitutions shall be written to standard error in the fol‐
837 lowing format:
838
839 "%s >> %s\n", <original pathname>, <new pathname>
840
841 In all operating modes of pax, optional messages of unspecified format
842 concerning the input archive format and volume number, the number of
843 files, blocks, volumes, and media parts as well as other diagnostic
844 messages may be written to standard error.
845
846 In all formats, for both standard output and standard error, it is
847 unspecified how non-printable characters in pathnames or link names are
848 written.
849
850 When pax is in read mode or list mode, using the -x pax archive format,
851 and a filename, link name, owner name, or any other field in an
852 extended header record cannot be translated from the pax UTF-8 codeset
853 format to the codeset and current locale of the implementation, pax
854 shall write a diagnostic message to standard error, shall process the
855 file as described for the -o invalid= option, and then shall process
856 the next file in the archive.
857
858
859 OUTPUT FILES
860 In read mode, the extracted output files shall be of the archived file
861 type. In copy mode, the copied output files shall be the type of the
862 file being copied. In either mode, existing files in the destination
863 hierarchy shall be overwritten only when all permission (-p), modifica‐
864 tion time (-u), and invalid-value (-o invalid=) tests allow it.
865
866 In write mode, the output file named by the -f option-argument shall be
867 a file formatted according to one of the specifications in the EXTENDED
868 DESCRIPTION section, or some other implementation-defined format.
869
870
871 EXTENDED DESCRIPTION
872 pax Interchange Format
873 A pax archive tape or file produced in the -x pax format shall contain
874 a series of blocks. The physical layout of the archive shall be identi‐
875 cal to the ustar format described in ustar Interchange Format. Each
876 file archived shall be represented by the following sequence:
877
878 · An optional header block with extended header records.
879 This header block is of the form described in pax Header
880 Block, with a typeflag value of x or g. The extended
881 header records, described in pax Extended Header, shall
882 be included as the data for this header block.
883
884 · A header block that describes the file. Any fields in the
885 preceding optional extended header shall override the
886 associated fields in this header block for this file.
887
888 · Zero or more blocks that contain the contents of the
889 file.
890
891 At the end of the archive file there shall be two 512-byte blocks
892 filled with binary zeros, interpreted as an end-of-archive indicator.
893
894 A schematic of an example archive with global extended header records
895 and two actual files is shown in pax Format Archive Example. In the
896 example, the second file in the archive has no extended header preced‐
897 ing it, presumably because it has no need for extended attributes.
898
899 Figure: pax Format Archive Example
900
901 ┌──────────────────────────────┬─────────────────────────────────────────────┐
902 │ustar Header [typeflag = 'g'] │ │
903 ├──────────────────────────────┤ Global Extended header │
904 │Global Extended Header Data │ │
905 ├──────────────────────────────┼─────────────────────────────────────────────┤
906 │ustar Header [typeflag = 'x'] │ │
907 ├──────────────────────────────┤ │
908 │Extended Header Data │ │
909 ├──────────────────────────────┤ File 1: Extended Header data is included │
910 │ustar Header [typeflag = '0'] │ │
911 ├──────────────────────────────┤ │
912 │Data for File 1 │ │
913 ├──────────────────────────────┼─────────────────────────────────────────────┤
914 │ustar Header [typeflag = '0'] │ │
915 ├──────────────────────────────┤ File 2: No Extended Header data is included │
916 │Data for File 2 │ │
917 ├──────────────────────────────┼─────────────────────────────────────────────┤
918 │Block of binary Zeroes │ │
919 ├──────────────────────────────┤ End of Archive Indicator │
920 │Block of binary Zeroes │ │
921 └──────────────────────────────┴─────────────────────────────────────────────┘
922
923 pax Header Block
924 The pax header block shall be identical to the ustar header block
925 described in ustar Interchange Format, except that two additional type‐
926 flag values are defined:
927
928 x Represents extended header records for the following file in the
929 archive (which shall have its own ustar header block). The for‐
930 mat of these extended header records shall be as described in
931 pax Extended Header.
932
933 g Represents global extended header records for the following
934 files in the archive. The format of these extended header
935 records shall be as described in pax Extended Header. Each
936 value shall affect all subsequent files that do not override
937 that value in their own extended header record and until another
938 global extended header record is reached that provides another
939 value for the same field. The typeflag g global headers should
940 not be used with interchange media that could suffer partial
941 data loss in transporting the archive.
942
943 For both of these types, the size field shall be the size of the
944 extended header records in octets. The other fields in the header block
945 are not meaningful to this version of the pax utility. However, if
946 this archive is read by a pax utility conforming to the ISO
947 POSIX-2:1993 standard, the header block fields are used to create a
948 regular file that contains the extended header records as data. There‐
949 fore, header block field values should be selected to provide reason‐
950 able file access to this regular file.
951
952 A further difference from the ustar header block is that data blocks
953 for files of typeflag 1 (the digit one) (hard link) may be included,
954 which means that the size field may be greater than zero. Archives cre‐
955 ated by pax -o linkdata shall include these data blocks with the hard
956 links.
957
958
959 pax Extended Header
960 A pax extended header contains values that are inappropriate for the
961 ustar header block because of limitations in that format: fields
962 requiring a character encoding other than that described in the ISO/IEC
963 646:1991 standard, fields representing file attributes not described in
964 the ustar header, and fields whose format or length do not fit the
965 requirements of the ustar header. The values in an extended header add
966 attributes to the following file (or files; see the description of the
967 typeflag g header block) or override values in the following header
968 block(s), as indicated in the following list of keywords.
969
970 An extended header shall consist of one or more records, each con‐
971 structed as follows:
972
973 "%d %s=%s\n", <length>, <keyword>, <value>
974
975 The extended header records shall be encoded according to the ISO/IEC
976 10646-1:2000 standard (UTF-8). The <length> field, <blank>, equals
977 sign, and <newline> shown shall be limited to the portable character
978 set, as encoded in UTF-8. The <keyword> and <value> fields can be any
979 UTF-8 characters. The <length> field shall be the decimal length of the
980 extended header record in octets, including the trailing <newline>.
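
The self-referential <length> field is the part of this record format that is easiest to get wrong, because the count includes its own digits. The following Python sketch is illustrative only and is not taken from the spax sources; the function names pax_record and parse_pax_records are hypothetical. It builds a record by iterating until the length is stable, and parses a run of records back into keyword/value pairs.

```python
def pax_record(keyword, value):
    """Build one '%d %s=%s\\n' record whose <length> counts the whole record."""
    body = " {}={}\n".format(keyword, value).encode("utf-8")
    # <length> includes its own decimal digits, so iterate to a fixed point.
    length = len(body)
    while True:
        total = len(str(length)) + len(body)
        if total == length:
            break
        length = total
    return str(length).encode("ascii") + body


def parse_pax_records(data):
    """Split extended header data into (keyword, value) pairs."""
    records = []
    pos = 0
    while pos < len(data):
        space = data.index(b" ", pos)
        length = int(data[pos:space])
        record = data[pos:pos + length]
        if len(record) != length or not record.endswith(b"\n"):
            raise ValueError("malformed extended header record")
        # A keyword never contains '=', so split on the first one only.
        keyword, _, value = record[space - pos + 1:-1].partition(b"=")
        records.append((keyword.decode("utf-8"), value.decode("utf-8")))
        pos += length
    return records
```

For example, pax_record("path", "a") yields the 9-octet record "9 path=a" followed by a newline.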
981
982 The <keyword> field shall be one of the entries from the following list
983 or a keyword provided as an implementation extension. Keywords con‐
984 sisting entirely of lowercase letters, digits, and periods are reserved
985 for future standardization. A keyword shall not include an equals sign.
986 (In the following list, the notation "file(s)" or "block(s)" is used
987 to acknowledge that a keyword affects the following single file after a
988 typeflag x extended header, but possibly multiple files after typeflag
989 g. Any requirements in the list for pax to include a record when in
990 write or copy mode shall apply only when such a record has not already
991 been provided through the use of the -o option. When used in copy mode,
992 pax shall behave as if an archive had been created with applicable
993 extended header records and then extracted.)
994
995 atime The file access time for the following file(s), equivalent to
996 the value of the st_atime member of the stat structure for a
997 file, as described by the stat(2) function. The access time
998 shall be restored if the process has the appropriate privilege
999 required to do so. The format of the <value> shall be as
1000 described in pax Extended Header File Times.
1001
1002 charset
1003 The name of the character set used to encode the data in the
1004 following file(s). The entries in the following table are
1005 defined to refer to known standards; additional names may be
1006 agreed on between the originator and recipient.
1007
1008 ┌────────────────────────┬───────────────────────────────┐
1009 │ <value> │ Formal Standard │
1010 ├────────────────────────┼───────────────────────────────┤
1011 │ISO-IR 646 1990 │ ISO/IEC 646:1990 │
1012 │ISO-IR 8859 1 1998 │ ISO/IEC 8859-1:1998 │
1013 │ISO-IR 8859 2 1999 │ ISO/IEC 8859-2:1999 │
1014 │ISO-IR 8859 3 1999 │ ISO/IEC 8859-3:1999 │
1015 │ISO-IR 8859 4 1998 │ ISO/IEC 8859-4:1998 │
1016 │ISO-IR 8859 5 1999 │ ISO/IEC 8859-5:1999 │
1017 │ISO-IR 8859 6 1999 │ ISO/IEC 8859-6:1999 │
1018 │ISO-IR 8859 7 1987 │ ISO/IEC 8859-7:1987 │
1019 │ISO-IR 8859 8 1999 │ ISO/IEC 8859-8:1999 │
1020 │ISO-IR 8859 9 1999 │ ISO/IEC 8859-9:1999 │
1021 │ISO-IR 8859 10 1998 │ ISO/IEC 8859-10:1998 │
1022 │ISO-IR 8859 13 1998 │ ISO/IEC 8859-13:1998 │
1023 │ISO-IR 8859 14 1998 │ ISO/IEC 8859-14:1998 │
1024 │ISO-IR 8859 15 1999 │ ISO/IEC 8859-15:1999 │
1025 │ISO-IR 10646 2000 │ ISO/IEC 10646:2000 │
1026 │ISO-IR 10646 2000 UTF-8 │ ISO/IEC 10646, UTF-8 encoding │
1027 │BINARY │ None │
1028 └────────────────────────┴───────────────────────────────┘
1029 The encoding is included in an extended header for information only;
1030 when pax is used as described in IEEE Std 1003.1-2001, it shall not
1031 translate the file data into any other encoding. The BINARY entry indi‐
1032 cates unencoded binary data.
1033
1034 When used in write or copy mode, it is implementation-defined whether
1035 pax includes a charset extended header record for a file.
1036
1037 comment
1038 A series of characters used as a comment. All characters in the
1039 <value> field shall be ignored by pax.
1040
1041 gid The group ID of the group that owns the file, expressed as a
1042 decimal number using digits from the ISO/IEC 646:1991 standard.
1043 This record shall override the gid field in the following header
1044 block(s). When used in write or copy mode, pax shall include a
1045 gid extended header record for each file whose group ID is
1046 greater than 2097151 (octal 7777777).
1047
1048 gname The group of the file(s), formatted as a group name in the group
1049 database. This record shall override the gid and gname fields in
1050 the following header block(s), and any gid extended header
1051 record. When used in read, copy, or list mode, pax shall trans‐
1052 late the name from the UTF-8 encoding in the header record to
1053 the character set appropriate for the group database on the
1054 receiving system. If any of the UTF-8 characters cannot be
1055 translated, and if the -o invalid=UTF-8 option is not specified,
1056 the results are implementation-defined. When used in write or
1057 copy mode, pax shall include a gname extended header record for
1058 each file whose group name cannot be represented entirely with
1059 the letters and digits of the portable character set.
1060
1061 linkpath
1062 The pathname of a link being created to another file, of any
1063 type, previously archived. This record shall override the
1064 linkname field in the following ustar header block(s). The fol‐
1065 lowing ustar header block shall determine the type of link cre‐
1066 ated. If typeflag of the following header block is 1, it shall
1067 be a hard link. If typeflag is 2, it shall be a symbolic link
1068 and the linkpath value shall be the contents of the symbolic
1069 link. The pax utility shall translate the name of the link (con‐
1070 tents of the symbolic link) from the UTF-8 encoding to the char‐
1071 acter set appropriate for the local file system. When used in
1072 write or copy mode, pax shall include a linkpath extended header
1073 record for each link whose pathname cannot be represented
1074 entirely with the members of the portable character set other
1075 than NUL.
1076
1077 mtime The file modification time of the following file(s), equivalent
1078 to the value of the st_mtime member of the stat structure for a
1079 file, as described in the stat(2) function. This record shall
1080 override the mtime field in the following header block(s). The
1081 modification time shall be restored if the process has the
1082 appropriate privilege required to do so. The format of the
1083 <value> shall be as described in pax Extended Header File Times.
1084
1085 path The pathname of the following file(s). This record shall over‐
1086 ride the name and prefix fields in the following header
1087 block(s). The pax utility shall translate the pathname of the
1088 file from the UTF-8 encoding to the character set appropriate
1089 for the local file system.
1090
1091 When used in write or copy mode, pax shall include a path
1092 extended header record for each file whose pathname cannot be
1093 represented entirely with the members of the portable character
1094 set other than NUL.
1095
1096 realtime.any
1097 The keywords prefixed by "realtime." are reserved for future
1098 standardization.
1099
1100 security.any
1101 The keywords prefixed by "security." are reserved for future
1102 standardization.
1103
1104 size The size of the file in octets, expressed as a decimal number
1105 using digits from the ISO/IEC 646:1991 standard. This record
1106 shall override the size field in the following header block(s).
1107 When used in write or copy mode, pax shall include a size
1108 extended header record for each file with a size value greater
1109 than 8589934591 (octal 77777777777).
1110
1111 uid The user ID of the file owner, expressed as a decimal number
1112 using digits from the ISO/IEC 646:1991 standard. This record
1113 shall override the uid field in the following header block(s).
1114 When used in write or copy mode, pax shall include a uid
1115 extended header record for each file whose owner ID is greater
1116 than 2097151 (octal 7777777).
1117
1118 uname The owner of the following file(s), formatted as a user name in
1119 the user database. This record shall override the uid and uname
1120 fields in the following header block(s), and any uid extended
1121 header record. When used in read, copy, or list mode, pax shall
1122 translate the name from the UTF-8 encoding in the header record
1123 to the character set appropriate for the user database on the
1124 receiving system. If any of the UTF-8 characters cannot be
1125 translated, and if the -o invalid=UTF-8 option is not specified,
1126 the results are implementation-defined. When used in write or
1127 copy mode, pax shall include a uname extended header record for
1128 each file whose user name cannot be represented entirely with
1129 the letters and digits of the portable character set.
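
The numeric thresholds quoted for the uid, gid, and size keywords are the largest values that fit in the corresponding octal ustar fields (seven and eleven octal digits respectively). As a rough illustration, the following Python sketch decides which of these three records a writer would have to emit; the function name required_numeric_records is hypothetical and the sketch ignores records supplied through -o.

```python
USTAR_MAX_ID = 0o7777777          # 2097151, limit for the uid and gid fields
USTAR_MAX_SIZE = 0o77777777777    # 8589934591, limit for the size field


def required_numeric_records(uid, gid, size):
    """Return the extended header records needed for out-of-range numbers."""
    records = {}
    if uid > USTAR_MAX_ID:
        records["uid"] = str(uid)
    if gid > USTAR_MAX_ID:
        records["gid"] = str(gid)
    if size > USTAR_MAX_SIZE:
        records["size"] = str(size)
    return records
```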
1130
1131 If the <value> field is zero length, it shall delete any header block
1132 field, previously entered extended header value, or global extended
1133 header value of the same name.
1134
1135 If a keyword in an extended header record (or in a -o option-argument)
1136 overrides or deletes a corresponding field in the ustar header block,
1137 pax shall ignore the contents of that header block field.
1138
1139 Unlike the ustar header block fields, NULs shall not delimit <value>s;
1140 all characters within the <value> field shall be considered data for
1141 the field. None of the length limitations of the ustar header block
1142 fields in ustar Header Block shall apply to the extended header
1143 records.
1144
1145
1146 pax Extended Header Keyword Precedence
1147 This section describes the precedence in which the various header
1148 records and fields and command line options are selected to apply to a
1149 file in the archive. When pax is used in read or list modes, it shall
1150 determine a file attribute in the following sequence:
1151
1152 1. If -o delete=keyword-prefix is used, the affected
1153 attributes shall be determined from step 7., if applica‐
1154 ble, or ignored otherwise.
1155
1156 2. If -o keyword:= is used, the affected attributes shall be
1157 ignored.
1158
1159 3. If -o keyword:=value is used, the affected attribute
1160 shall be assigned the value.
1161
1162 4. If there is a typeflag x extended header record, the
1163 affected attribute shall be assigned the <value>. When
1164 extended header records conflict, the last one given in
1165 the header shall take precedence.
1166
1167 5. If -o keyword=value is used, the affected attribute shall
1168 be assigned the value.
1169
1170 6. If there is a typeflag g global extended header record,
1171 the affected attribute shall be assigned the <value>.
1172 When global extended header records conflict, the last
1173 one given in the global header shall take precedence.
1174
1175 7. Otherwise, the attribute shall be determined from the
1176 ustar header block.
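
The seven steps above amount to a fixed lookup order. The following Python sketch shows one way to express that order; it is an illustration under stated assumptions, not the spax implementation. The function name and parameters are hypothetical: opt_override holds -o keyword:=value settings (with an empty string standing for -o keyword:=), opt_default holds -o keyword=value settings, and delete_prefixes is matched with a simple prefix test. Deletion by a zero-length <value> inside an extended header is not modeled.

```python
def resolve_attribute(keyword, ustar, x_header=None, g_header=None,
                      opt_override=None, opt_default=None,
                      delete_prefixes=()):
    """Return the effective value for keyword, or None if it is ignored."""
    x_header = x_header or {}
    g_header = g_header or {}
    opt_override = opt_override or {}
    opt_default = opt_default or {}

    # Step 1: -o delete=... skips every extended source and uses step 7.
    if any(keyword.startswith(prefix) for prefix in delete_prefixes):
        return ustar.get(keyword)
    # Steps 2 and 3: -o keyword:= ignores the attribute, := value assigns it.
    if keyword in opt_override:
        return opt_override[keyword] or None
    # Step 4: typeflag x extended header record.
    if keyword in x_header:
        return x_header[keyword]
    # Step 5: -o keyword=value.
    if keyword in opt_default:
        return opt_default[keyword]
    # Step 6: typeflag g global extended header record.
    if keyword in g_header:
        return g_header[keyword]
    # Step 7: fall back to the ustar header block.
    return ustar.get(keyword)
```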
1177
1178
1179 pax Extended Header File Times
1180 The pax utility shall write an mtime record for each file in write or
1181 copy modes if the file's modification time cannot be represented
1182 exactly in the ustar header logical record described in ustar Inter‐
1183 change Format. This can occur if the time is out of ustar range, or if
1184 the file system of the underlying implementation supports non-integer
1185 time granularities and the time is not an integer. All of these time
1186 records shall be formatted as a decimal representation of the time in
1187 seconds since the Epoch. If a period ('.') decimal point character is
1188 present, the digits to the right of the point shall represent the units
1189 of a subsecond timing granularity, where the first digit is tenths of a
1190 second and each subsequent digit is a tenth of the previous digit. In
1191 read or copy mode, the pax utility shall truncate the time of a file to
1192 the greatest value that is not greater than the input header file time.
1193 In write or copy mode, the pax utility shall output a time exactly if
1194 it can be represented exactly as a decimal number, and otherwise shall
1195 generate only enough digits so that the same time shall be recovered if
1196 the file is extracted on a system whose underlying implementation sup‐
1197 ports the same time granularity.
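
As an illustration of the two rules above, the following Python sketch truncates an extended header time toward the greatest representable value not greater than the input, and writes a time with only as many fractional digits as are needed. The function names are hypothetical, the default granularity of one nanosecond is an assumption about the local file system, and format_pax_time handles only non-negative seconds.

```python
from decimal import Decimal, ROUND_FLOOR


def parse_pax_time(value, resolution="0.000000001"):
    """Truncate a decimal time string to the local timestamp granularity."""
    # ROUND_FLOOR gives the greatest value that is not greater than the input.
    return Decimal(value).quantize(Decimal(resolution), rounding=ROUND_FLOOR)


def format_pax_time(seconds, nanoseconds):
    """Format seconds and nanoseconds using as few digits as possible."""
    if seconds < 0 or not 0 <= nanoseconds < 1000000000:
        raise ValueError("sketch only handles non-negative times")
    if nanoseconds == 0:
        return str(seconds)
    # Trailing zeros carry no information, so drop them.
    return "{}.{:09d}".format(seconds, nanoseconds).rstrip("0")
```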
1198
1199
1200 ustar Interchange Format
1201 A ustar archive tape or file shall contain a series of logical records.
1202 Each logical record shall be a fixed-size logical record of 512 octets
1203 (see below). Although this format may be thought of as being stored on
1204 9-track industry-standard 12.7 mm (0.5 in) magnetic tape, other types
1205 of transportable media are not excluded. Each file archived shall be
1206 represented by a header logical record that describes the file, fol‐
1207 lowed by zero or more logical records that give the contents of the
1208 file. At the end of the archive file there shall be two 512-octet logi‐
1209 cal records filled with binary zeros, interpreted as an end-of-archive
1210 indicator.
1211
1212 The logical records may be grouped for physical I/O operations, as
1213 described under the -b blocksize and -x ustar options. Each group of
1214 logical records may be written with a single operation equivalent to
1215 the write(2) function. On magnetic tape, the result of this write shall
1216 be a single tape physical block. The last physical block shall always
1217 be the full size, so logical records after the two zero logical records
1218 may contain undefined data.
1219
1220 The header logical record shall be structured as shown in the following
1221 table. All lengths and offsets are in decimal.
1222
1223 Table: ustar Header Block
1224
1225 ┌───────────┬──────────────┬────────────────────┐
1226 │Field Name │ Octet Offset │ Length (in Octets) │
1227 ├───────────┼──────────────┼────────────────────┤
1228 │name │ 0 │ 100 │
1229 │mode │ 100 │ 8 │
1230 │uid │ 108 │ 8 │
1231 │gid │ 116 │ 8 │
1232 │size │ 124 │ 12 │
1233 │mtime │ 136 │ 12 │
1234 │chksum │ 148 │ 8 │
1235 │typeflag │ 156 │ 1 │
1236 │linkname │ 157 │ 100 │
1237 │magic │ 257 │ 6 │
1238 │version │ 263 │ 2 │
1239 │uname │ 265 │ 32 │
1240 │gname │ 297 │ 32 │
1241 │devmajor │ 329 │ 8 │
1242 │devminor │ 337 │ 8 │
1243 │prefix │ 345 │ 155 │
1244 └───────────┴──────────────┴────────────────────┘
1245 All characters in the header logical record shall be represented in the
1246 coded character set of the ISO/IEC 646:1991 standard. For maximum
1247 portability between implementations, names should be selected from
1248 characters represented by the portable filename character set as octets
1249 with the most significant bit zero. If an implementation supports the
1250 use of characters outside of slash and the portable filename character
1251 set in names for files, users, and groups, one or more implementation-
1252 defined encodings of these characters shall be provided for interchange
1253 purposes.
1254
1255 However, the pax utility shall never create filenames on the local sys‐
1256 tem that cannot be accessed via the procedures described in IEEE Std
1257 1003.1-2001. If a filename is found on the medium that would create an
1258 invalid filename, it is implementation-defined whether the data from
1259 the file is stored on the file hierarchy and under what name it is
1260 stored. The pax utility may choose to ignore these files as long as it
1261 produces an error indicating that the file is being ignored.
1262
1263 Each field within the header logical record is contiguous; that is,
1264 there is no padding used. Each character on the archive medium shall be
1265 stored contiguously.
1266
1267 The fields magic, uname, and gname are character strings, each
1268 terminated by a NUL character. The fields name, linkname, and prefix
1269 are NUL-terminated character strings, unless every octet of the array
1270 (including the last) is non-NUL, in which case no NUL is stored. The
1271 version field is two octets containing the characters "00" (zero-zero).
1272 The typeflag contains a single character. All other fields are leading
1273 zero-filled octal numbers using digits from the ISO/IEC 646:1991 stan‐
1274 dard IRV. Each numeric field is terminated by one or more <space> or
1275 NUL characters.
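
The table and field rules above map directly onto a fixed-layout record. The following Python sketch unpacks one 512-octet header logical record using the offsets and lengths from the table; the 12 octets after the prefix field are skipped as padding. The function name parse_ustar_header is hypothetical, and the sketch is more tolerant than the text requires (for example, an all-NUL numeric field is read as zero).

```python
import struct

# Field lengths in table order; 12 trailing pad octets complete the record.
USTAR = struct.Struct("=100s8s8s8s12s12s8sc100s6s2s32s32s8s8s155s12x")
FIELDS = ("name", "mode", "uid", "gid", "size", "mtime", "chksum",
          "typeflag", "linkname", "magic", "version", "uname", "gname",
          "devmajor", "devminor", "prefix")
NUMERIC = {"mode", "uid", "gid", "size", "mtime", "chksum",
           "devmajor", "devminor"}


def parse_ustar_header(block):
    """Decode one header logical record into a dict of field values."""
    if len(block) != USTAR.size:
        raise ValueError("a ustar header is exactly 512 octets")
    header = {}
    for field, raw in zip(FIELDS, USTAR.unpack(block)):
        if field in NUMERIC:
            # Octal digits terminated by one or more <space> or NUL.
            digits = raw.strip(b" \0").decode("ascii")
            header[field] = int(digits, 8) if digits else 0
        elif field in ("typeflag", "version"):
            header[field] = raw.decode("ascii", "replace")
        else:
            # Strings end at the first NUL, or fill the whole array.
            header[field] = raw.split(b"\0", 1)[0].decode("ascii", "replace")
    return header
```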
1276
1277 The name and the prefix fields shall produce the pathname of the file.
1278 A new pathname shall be formed, if prefix is not an empty string (its
1279 first character is not NUL), by concatenating prefix (up to the first
1280 NUL character), a slash character, and name; otherwise, name is used
1281 alone. In either case, name is terminated at the first NUL character.
1282 If prefix begins with a NUL character, it shall be ignored. In this
1283 manner, pathnames of at most 256 characters can be supported. If a
1284 pathname does not fit in the space provided, pax shall notify the user
1285 of the error, and shall not store any part of the file, header or data,
1286 on the medium.
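
The 256-character limit follows from the field sizes: 155 octets of prefix, one slash, and 100 octets of name. The following Python sketch joins the two fields when reading and looks for a usable split point when writing. Both function names are hypothetical, and choosing the first slash that satisfies both length limits is an assumption; the text does not say which slash an implementation should prefer.

```python
def ustar_pathname(name, prefix):
    """Rebuild the pathname from the raw name and prefix fields."""
    name = name.split(b"\0", 1)[0]
    prefix = prefix.split(b"\0", 1)[0]
    return prefix + b"/" + name if prefix else name


def split_ustar_path(path):
    """Return (prefix, name) for a pathname, or raise if it cannot fit."""
    if len(path) <= 100:
        return b"", path
    for i in range(len(path)):
        # Split at a slash so prefix is 1..155 octets and name 1..100 octets.
        if (path[i:i + 1] == b"/" and 0 < i <= 155
                and 0 < len(path) - i - 1 <= 100):
            return path[:i], path[i + 1:]
    raise ValueError("pathname does not fit in the ustar name and prefix fields")
```

A path that raises here is one that a pax writer would either reject or carry in a path extended header record.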
1287
1288 The linkname field, described below, shall not use the prefix to pro‐
1289 duce a pathname. As such, a linkname is limited to 100 characters. If
1290 the name does not fit in the space provided, pax shall notify the user
1291 of the error, and shall not attempt to store the link on the medium.
1292
1293 The mode field provides 12 bits encoded in the ISO/IEC 646:1991 stan‐
1294 dard octal digit representation. The encoded bits shall represent the
1295 following values:
1296
1297 Table: ustar mode Field
1298
1299 ┌──────┬─────────────────┬─────────────────────────────────────────────────┐
1300 │ Bit │ IEEE Std │ Description │
1301 │Value │ 1003.1-2001 Bit │ │
1302 ├──────┼─────────────────┼─────────────────────────────────────────────────┤
1303 │04000 │ S_ISUID │ Set UID on execution. │
1304 │02000 │ S_ISGID │ Set GID on execution. │
1305 │01000 │ <reserved> │ Reserved for future standardization. │
1306 │00400 │ S_IRUSR │ Read permission for file owner class. │
1307 │00200 │ S_IWUSR │ Write permission for file owner class. │
1308 │00100 │ S_IXUSR │ Execute/search permission for file owner class. │
1309 │00040 │ S_IRGRP │ Read permission for file group class. │
1310 │00020 │ S_IWGRP │ Write permission for file group class. │
1311 │00010 │ S_IXGRP │ Execute/search permission for file group class. │
1312 │00004 │ S_IROTH │ Read permission for file other class. │
1313 │00002 │ S_IWOTH │ Write permission for file other class. │
1314 │00001 │ S_IXOTH │ Execute/search permission for file other class. │
1315 └──────┴─────────────────┴─────────────────────────────────────────────────┘
1316 When appropriate privilege is required to set one of these mode bits,
1317 and the user restoring the files from the archive does not have the
1318 appropriate privilege, the mode bits for which the user does not have
1319 appropriate privilege shall be ignored. Some of the mode bits in the
1320 archive format are not mentioned elsewhere in this volume of IEEE Std
1321 1003.1-2001. If the implementation does not support those bits, they
1322 may be ignored.
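
To make the bit table concrete, the following Python sketch decodes a raw mode field into the symbolic names listed above. The function name mode_flags is hypothetical; the reserved bit 01000 is deliberately left out of the result.

```python
MODE_BITS = (
    (0o4000, "S_ISUID"), (0o2000, "S_ISGID"),
    (0o0400, "S_IRUSR"), (0o0200, "S_IWUSR"), (0o0100, "S_IXUSR"),
    (0o0040, "S_IRGRP"), (0o0020, "S_IWGRP"), (0o0010, "S_IXGRP"),
    (0o0004, "S_IROTH"), (0o0002, "S_IWOTH"), (0o0001, "S_IXOTH"),
)


def mode_flags(mode_field):
    """Return the names of the mode bits set in an octal mode field."""
    value = int(mode_field.strip(b" \0").decode("ascii"), 8)
    return [name for bit, name in MODE_BITS if value & bit]
```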
1323
1324 The uid and gid fields are the user and group ID of the owner and group
1325 of the file, respectively.
1326
1327 The size field is the size of the file in octets. If the typeflag field
1328 is set to specify a file to be of type 1 (a link) or 2 (a symbolic
1329 link), the size field shall be specified as zero. If the typeflag field
1330 is set to specify a file of type 5 (directory), the size field shall be
1331 interpreted as described under the definition of that record type. No
1332 data logical records are stored for types 1, 2, or 5. If the typeflag
1333 field is set to 3 (character special file), 4 (block special file), or
1334 6 (FIFO), the meaning of the size field is unspecified by this volume
1335 of IEEE Std 1003.1-2001, and no data logical records shall be stored on
1336 the medium. Additionally, for type 6, the size field shall be ignored
1337 when reading. If the typeflag field is set to any other value, the num‐
1338 ber of logical records written following the header shall be
1339 (size+511)/512, ignoring any fraction in the result of the division.
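
The record count described above is an integer division that rounds up. A minimal sketch follows, assuming the typeflag is passed as a one-character string; the function name data_records is hypothetical.

```python
def data_records(size, typeflag):
    """Number of 512-octet logical records that follow a ustar header."""
    # Links, symbolic links, special files, directories, and FIFOs
    # (types 1 through 6) store no data logical records.
    if typeflag in {"1", "2", "3", "4", "5", "6"}:
        return 0
    # (size+511)/512, ignoring any fraction in the result.
    return (size + 511) // 512


for size in (0, 1, 512, 513):
    print(size, data_records(size, "0"))
```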
1340
1341 The mtime field shall be the modification time of the file at the time
1342 it was archived. It is the ISO/IEC 646:1991 standard representation of
1343 the octal value of the modification time obtained from the stat(2)
1344 function.
1345
1346 The chksum field shall be the ISO/IEC 646:1991 standard IRV representa‐
1347 tion of the octal value of the simple sum of all octets in the header
1348 logical record. Each octet in the header shall be treated as an
1349 unsigned value. These values shall be added to an unsigned integer,
1350 initialized to zero, the precision of which is not less than 17 bits.
1351 When calculating the checksum, the chksum field is treated as if it
1352 were all spaces.
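
The checksum rule can be sketched as follows. The position of the chksum field (offset 148, length 8 within the 512-octet header) is taken from the usual ustar header layout and should be treated as an assumption here; the function name ustar_checksum is hypothetical.

```python
CHKSUM_OFFSET = 148  # assumed offset of chksum in the 512-octet header
CHKSUM_LEN = 8       # assumed length of the chksum field


def ustar_checksum(header):
    """Simple sum of all header octets, with chksum treated as spaces."""
    if len(header) != 512:
        raise ValueError("a ustar header logical record is 512 octets")
    blanked = (
        header[:CHKSUM_OFFSET]
        + b" " * CHKSUM_LEN
        + header[CHKSUM_OFFSET + CHKSUM_LEN:]
    )
    # Iterating over bytes yields unsigned values in the range 0..255.
    return sum(blanked)


print(ustar_checksum(bytes(512)))
```

The 17-bit minimum precision follows from the worst case: 512 octets of value 255 sum to 130560, which is below 2^17 (131072) but above 2^16.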
1353
1354 The typeflag field specifies the type of file archived. If a particular
1355 implementation does not recognize the type, or the user does not have
1356 appropriate privilege to create that type, the file shall be extracted
1357 as if it were a regular file if the file type is defined to have a
1358 meaning for the size field that could cause data logical records to be
1359 written on the medium (see the previous description for size). If con‐
1360 version to a regular file occurs, the pax utility shall produce an
1361 error indicating that the conversion took place. All of the typeflag
1362 fields shall be coded in the ISO/IEC 646:1991 standard IRV:
1363
1364 0 Represents a regular file. For backwards-compatibility, a type‐
1365 flag value of binary zero ('\0') should be recognized as meaning
1366 a regular file when extracting files from the archive. Archives
1367 written with this version of the archive file format create
1368 regular files with a typeflag value of the ISO/IEC 646:1991 standard
1369 IRV '0'.
1370
1371 1 Represents a file linked to another file, of any type, previ‐
1372 ously archived. Such files are identified by having the same
1373 device and file serial numbers, and pathnames that refer to dif‐
1374 ferent directory entries. All such files shall be archived as
1375 linked files. The linked-to name is specified in the linkname
1376 field with a NUL-character terminator if it is less than 100
1377 octets in length.
1378
1379 2 Represents a symbolic link. The contents of the symbolic link
1380 shall be stored in the linkname field.
1381
1382 3,4 Represent character special files and block special files
1383 respectively. In this case the devmajor and devminor fields
1384 shall contain information defining the device, the format of
1385 which is unspecified by this volume of IEEE Std 1003.1-2001.
1386 Implementations may map the device specifications to their own
1387 local specification or may ignore the entry.
1388
1389 5 Specifies a directory or subdirectory. On systems where disk
1390 allocation is performed on a directory basis, the size field
1391 shall contain the maximum number of octets (which may be rounded
1392 to the nearest disk block allocation unit) that the directory
1393 may hold. A size field of zero indicates no such limiting. Sys‐
1394 tems that do not support limiting in this manner should ignore
1395 the size field.
1396
1397 6 Specifies a FIFO special file. Note that the archiving of a FIFO
1398 file archives the existence of this file and not its contents.
1399
1400 7 Reserved to represent a file to which an implementation has
1401 associated some high-performance attribute. Implementations
1402 without such extensions should treat this file as a regular file
1403 (type 0).
1404
1405 A-Z The letters 'A' to 'Z', inclusive, are reserved for custom
1406 implementations. All other values are reserved for future ver‐
1407 sions of IEEE Std 1003.1-2001.
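
The typeflag values listed above map naturally onto a lookup table. This is a minimal sketch for illustration; the names TYPEFLAGS and classify_typeflag and the description strings are hypothetical.

```python
# typeflag values as described in the list above.
TYPEFLAGS = {
    "0": "regular file",
    "\0": "regular file",  # binary zero, accepted for backwards compatibility
    "1": "link to a previously archived file",
    "2": "symbolic link",
    "3": "character special file",
    "4": "block special file",
    "5": "directory",
    "6": "FIFO special file",
    "7": "reserved (high-performance); treat as regular file",
}


def classify_typeflag(flag):
    """Describe a single-character typeflag value."""
    if flag in TYPEFLAGS:
        return TYPEFLAGS[flag]
    if len(flag) == 1 and "A" <= flag <= "Z":
        return "reserved for custom implementations"
    return "reserved for future versions"


print(classify_typeflag("5"))
```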
1408
1409 It is unspecified whether files with pathnames that refer to the same
1410 directory entry are archived as linked files or as separate files. If
1411 they are archived as linked files, this means that attempting to
1412 extract both pathnames from the resulting archive will always cause an
1413 error (unless the -u option is used) because the link cannot be cre‐
1414 ated.
1415
1416 It is unspecified whether files with the same device and file serial
1417 numbers being appended to an archive are treated as linked files to
1418 members that were in the archive before the append.
1419
1420 Attempts to archive a socket using ustar interchange format shall pro‐
1421 duce a diagnostic message. Handling of other file types is implementa‐
1422 tion-defined.
1423
1424 The magic field is the specification that this archive was output in
1425 this archive format. If this field contains ustar (the five characters
1426 from the ISO/IEC 646:1991 standard IRV shown followed by NUL), the
1427 uname and gname fields shall contain the ISO/IEC 646:1991 standard IRV
1428 representation of the owner and group of the file, respectively (trun‐
1429 cated to fit, if necessary). When the file is restored by a privi‐
1430 leged, protection-preserving version of the utility, the user and group
1431 databases shall be scanned for these names. If found, the user and
1432 group IDs contained within these files shall be used rather than the
1433 values contained within the uid and gid fields.
1434
1435
1436 cpio Interchange Format
1437 The octet-oriented cpio archive format shall be a series of entries,
1438 each comprising a header that describes the file, the name of the file,
1439 and then the contents of the file.
1440
1441 An archive may be recorded as a series of fixed-size blocks of octets.
1442 This blocking shall be used only to make physical I/O more efficient.
1443 The last group of blocks shall always be at the full size.
1444
1445 For the octet-oriented cpio archive format, the individual entry infor‐
1446 mation shall be in the order indicated and described by the following
1447 table; see also the <cpio.h> header.
1448
1449 Table: Octet-Oriented cpio Archive Entry
1450
1451 ┌─────────────────────┬────────────────────┬─────────────────┐
1452 │ Header Field Name │ Length (in Octets) │ Interpreted as │
1453 ├─────────────────────┼────────────────────┼─────────────────┤
1454 │c_magic │ 6 │ Octal number │
1455 │c_dev │ 6 │ Octal number │
1456 │c_ino │ 6 │ Octal number │
1457 │c_mode │ 6 │ Octal number │
1458 │c_uid │ 6 │ Octal number │
1459 │c_gid │ 6 │ Octal number │
1460 │c_nlink │ 6 │ Octal number │
1461 │c_rdev │ 6 │ Octal number │
1462 │c_mtime │ 11 │ Octal number │
1463 │c_namesize │ 6 │ Octal number │
1464 │c_filesize │ 11 │ Octal number │
1465 │ │ │ │
1466 │Filename Field Name │ Length │ Interpreted as │
1467 │c_name │ c_namesize │ Pathname string │
1468 │ │ │ │
1469 │File Data Field Name │ Length │ Interpreted as │
1470 │c_filedata │ c_filesize │ Data │
1471 └─────────────────────┴────────────────────┴─────────────────┘
1472 cpio Header
1473 For each file in the archive, a header as defined previously shall be
1474 written. The information in the header fields is written as streams of
1475 the ISO/IEC 646:1991 standard characters interpreted as octal numbers.
1476 The octal numbers shall be extended to the necessary length by append‐
1477 ing the ISO/IEC 646:1991 standard IRV zeros at the most-significant-
1478 digit end of the number; the result is written to the most-significant
1479 digit of the stream of octets first. The fields shall be interpreted as
1480 follows:
1481
1482 c_magic
1483 Identify the archive as being a transportable archive by con‐
1484 taining the identifying value "070707".
1485
1486 c_dev, c_ino
1487 Contains values that uniquely identify the file within the ar‐
1488 chive (that is, no files contain the same pair of c_dev and
1489 c_ino values unless they are links to the same file). The values
1490 shall be determined in an unspecified manner.
1491
1492 c_mode Contains the file type and access permissions as defined in the
1493 following table.
1494
1495 Table: Values for cpio c_mode Field
1496
1497 ┌──────────────────────┬─────────┬────────────────────────┐
1498 │File Permissions Name │ Value │ Indicates │
1499 ├──────────────────────┼─────────┼────────────────────────┤
1500 │C_IRUSR │ 000400 │ Read by owner │
1501 │C_IWUSR │ 000200 │ Write by owner │
1502 │C_IXUSR │ 000100 │ Execute by owner │
1503 │C_IRGRP │ 000040 │ Read by group │
1504 │C_IWGRP │ 000020 │ Write by group │
1505 │C_IXGRP │ 000010 │ Execute by group │
1506 │C_IROTH │ 000004 │ Read by others │
1507 │C_IWOTH │ 000002 │ Write by others │
1508 │C_IXOTH │ 000001 │ Execute by others │
1509 │C_ISUID │ 004000 │ Set uid │
1510 │C_ISGID │ 002000 │ Set gid │
1511 │C_ISVTX │ 001000 │ Reserved │
1512 ├──────────────────────┼─────────┼────────────────────────┤
1513 │File Type Name │ Value │ Indicates │
1514 ├──────────────────────┼─────────┼────────────────────────┤
1515 │C_ISDIR │ 0040000 │ Directory │
1516 │C_ISFIFO │ 0010000 │ FIFO │
1517 │C_ISREG │ 0100000 │ Regular file │
1518 │C_ISLNK │ 0120000 │ Symbolic link │
1519 │C_ISBLK │ 0060000 │ Block special file │
1520 │C_ISCHR │ 0020000 │ Character special file │
1521 │C_ISSOCK │ 0140000 │ Socket │
1522 │C_ISCTG │ 0110000 │ Reserved │
1523 └──────────────────────┴─────────┴────────────────────────┘
1524 Directories, FIFOs, symbolic links, and regular files shall be
1525 supported on a system conforming to this volume of IEEE Std
1526 1003.1-2001; additional values defined previously are reserved
1527 for compatibility with existing systems. Additional file types
1528 may be supported; however, such files should not be written to
1529 archives intended to be transported to other systems.
1530
1531 c_uid Contains the user ID of the owner.
1532
1533 c_gid Contains the group ID of the group.
1534
1535 c_nlink
1536 Contains a number greater than or equal to the number of links
1537 in the archive referencing the file. If the -a option is used to
1538 append to a cpio archive, then the pax utility need not account
1539 for the files in the existing part of the archive when calculat‐
1540 ing the c_nlink values for the appended part of the archive, and
1541 need not alter the c_nlink values in the existing part of the
1542 archive if additional files with the same c_dev and c_ino values
1543 are appended to the archive.
1544
1545 c_rdev Contains implementation-defined information for character or
1546 block special files.
1547
1548 c_mtime
1549 Contains the latest time of modification of the file at the time
1550 the archive was created.
1551
1552 c_namesize
1553 Contains the length of the pathname, including the terminating
1554 NUL character.
1555
1556 c_filesize
1557 Contains the length of the file in octets. This shall be the
1558 length of the data section following the header structure.
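
The header layout in the table above can be packed and parsed with fixed-width octal fields. The following is a minimal sketch under that reading, not the spax implementation; pack_header and parse_header are hypothetical names.

```python
# Field names and widths, in order, from the cpio archive entry table.
CPIO_FIELDS = [
    ("c_magic", 6), ("c_dev", 6), ("c_ino", 6), ("c_mode", 6),
    ("c_uid", 6), ("c_gid", 6), ("c_nlink", 6), ("c_rdev", 6),
    ("c_mtime", 11), ("c_namesize", 6), ("c_filesize", 11),
]
HEADER_LEN = sum(width for _, width in CPIO_FIELDS)  # 76 octets


def pack_header(values):
    """Encode a dict of integers as zero-padded octal digit strings."""
    parts = []
    for name, width in CPIO_FIELDS:
        # Zeros are added at the most-significant-digit end.
        digits = format(values[name], "o").rjust(width, "0")
        if len(digits) > width:
            raise ValueError(name + " does not fit in its field")
        parts.append(digits)
    return "".join(parts).encode("ascii")


def parse_header(raw):
    """Decode the fixed-width octal fields of one cpio header."""
    if len(raw) < HEADER_LEN:
        raise ValueError("truncated cpio header")
    fields, pos = {}, 0
    for name, width in CPIO_FIELDS:
        fields[name] = int(raw[pos:pos + width].decode("ascii"), 8)
        pos += width
    if fields["c_magic"] != 0o070707:
        raise ValueError("not an octet-oriented cpio header")
    return fields
```

The pathname (c_namesize octets, including the terminating NUL) and then the file data (c_filesize octets) follow directly after these 76 octets.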
1559
1560
1561 cpio Filename
1562 The c_name field shall contain the pathname of the file. The length of
1563 this field in octets is the value of c_namesize.
1564
1565 If a filename is found on the medium that would create an invalid path‐
1566 name, it is implementation-defined whether the data from the file is
1567 stored on the file hierarchy and under what name it is stored.
1568
1569 All characters shall be represented in the ISO/IEC 646:1991 standard
1570 IRV. For maximum portability between implementations, names should be
1571 selected from characters represented by the portable filename character
1572 set as octets with the most significant bit zero. If an implementation
1573 supports the use of characters outside the portable filename character
1574 set in names for files, users, and groups, one or more implementation-
1575 defined encodings of these characters shall be provided for interchange
1576 purposes. However, the pax utility shall never create filenames on the
1577 local system that cannot be accessed via the procedures described pre‐
1578 viously in this volume of IEEE Std 1003.1-2001. If a filename is found
1579 on the medium that would create an invalid filename, it is implementa‐
1580 tion-defined whether the data from the file is stored on the local file
1581 system and under what name it is stored. The pax utility may choose to
1582 ignore these files as long as it produces an error indicating that the
1583 file is being ignored.
1584
1585
1586 cpio File Data
1587 Following c_name, there shall be c_filesize octets of data. Interpre‐
1588 tation of such data occurs in a manner dependent on the file. If
1589 c_filesize is zero, no data shall be contained in c_filedata.
1590
1591 When restoring from an archive:
1592
1593 · If the user does not have the appropriate privilege to create a
1594 file of the specified type, pax shall ignore the entry and write
1595 an error message to standard error.
1596
1597 · Only regular files have data to be restored. Presuming a regular
1598 file meets any selection criteria that might be imposed on the
1599 format-reading utility by the user, such data shall be restored.
1600
1601 · If a user does not have appropriate privilege to set a particu‐
1602 lar mode flag, the flag shall be ignored. Some of the mode flags
1603 in the archive format are not mentioned elsewhere in this volume
1604 of IEEE Std 1003.1-2001. If the implementation does not support
1605 those flags, they may be ignored.
1606
1607
1608 cpio Special Entries
1609 FIFO special files, directories, and the trailer shall be recorded with
1610 c_filesize equal to zero. For other special files, c_filesize is
1611 unspecified by this volume of IEEE Std 1003.1-2001. The header for the
1612 next file entry in the archive shall be written directly after the last
1613 octet of the file entry preceding it. A header denoting the filename
1614 TRAILER!!! shall indicate the end of the archive; the contents of
1615 octets in the last block of the archive following such a header are
1616 undefined.
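
As an illustration of how entries and the trailer follow one another without padding, here is a minimal sketch that builds a two-entry archive in memory. The function name cpio_entry is hypothetical, and the values chosen for fields the text leaves open (device, owner, times, and the trailer's mode and serial number) are assumptions.

```python
# Widths of the eleven header fields, in order (see the entry table).
WIDTHS = (6, 6, 6, 6, 6, 6, 6, 6, 11, 6, 11)


def cpio_entry(name, data=b"", mode=0o100644, ino=1, nlink=1):
    """Build one entry: header, NUL-terminated name, then file data."""
    name_bytes = name.encode("ascii") + b"\0"
    numbers = (
        0o070707,         # c_magic
        0,                # c_dev   (assumed)
        ino,              # c_ino
        mode,             # c_mode
        0,                # c_uid   (assumed)
        0,                # c_gid   (assumed)
        nlink,            # c_nlink
        0,                # c_rdev
        0,                # c_mtime (assumed)
        len(name_bytes),  # c_namesize, including the terminating NUL
        len(data),        # c_filesize
    )
    header = "".join(
        format(n, "o").rjust(w, "0") for n, w in zip(numbers, WIDTHS)
    )
    return header.encode("ascii") + name_bytes + data


# The trailer is an ordinary entry named TRAILER!!! with c_filesize zero.
archive = cpio_entry("hello.txt", b"hi\n") + cpio_entry("TRAILER!!!", mode=0, ino=0)
print(len(archive))
```

Any blocking applied when the archive is written to a device would pad the final block after the trailer; that padding is not modeled here.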
1617
1618
1620 The following exit values shall be returned:
1621
1622 0 All files were processed successfully.
1623
1624 >0 An error occurred.
1625
1626
1628 If pax cannot create a file or a link when reading an archive or cannot
1629 find a file when writing an archive, or cannot preserve the user ID,
1630 group ID, or file mode when the -p option is specified, a diagnostic
1631 message shall be written to standard error and a non-zero exit status
1632 shall be returned, but processing shall continue. In the case where pax
1633 cannot create a link to a file, pax shall not, by default, create a
1634 second copy of the file.
1635
1636 If the extraction of a file from an archive is prematurely terminated
1637 by a signal or error, pax may have only partially extracted the file or
1638 (if the -n option was not specified) may have extracted a file of the
1639 same name as that specified by the user, but which is not the file the
1640 user wanted. Additionally, the file modes of extracted directories may
1641 have additional bits from the S_IRWXU mask set as well as incorrect
1642 modification and access times.
1643
1644
1645_________________________________________________________________
1647
1648
1650 Caution is advised when using the -a option to append to a cpio format
1651 archive. If any of the files being appended happen to be given the same
1652 c_dev and c_ino values as a file in the existing part of the archive,
1653 then they may be treated as links to that file on extraction. Thus, it
1654 is risky to use -a with cpio format except when it is done on the same
1655 system that the original archive was created on, and with the same pax
1656 utility, and in the knowledge that there has been little or no file
1657 system activity since the original archive was created that could lead
1658 to any of the files appended being given the same c_dev and c_ino val‐
1659 ues as an unrelated file in the existing part of the archive. Also,
1660 when (intentionally) appending additional links to a file in the exist‐
1661 ing part of the archive, the c_nlink values in the modified archive can
1662 be smaller than the number of links to the file in the archive, which
1663 may mean that the links are not preserved on extraction.
1664
1665 The -p (privileges) option was invented to reconcile differences
1666 between historical tar and cpio implementations. In particular, the two
1667 utilities use -m in diametrically opposed ways. The -p option also pro‐
1668 vides a consistent means of extending the ways in which future file
1669 attributes can be addressed, such as for enhanced security systems or
1670 high-performance files. Although it may seem complex, there are really
1671 two modes that are most commonly used:
1672
1673 -p e "Preserve everything". This would be used by the historical
1674 superuser, someone with all the appropriate privileges, to pre‐
1675 serve all aspects of the files as they are recorded in the ar‐
1676 chive. The e flag is the sum of o and p, and other implementa‐
1677 tion-defined attributes.
1678
1679 -p p "Preserve" the file mode bits. This would be used by the user
1680 with regular privileges who wished to preserve aspects of the
1681 file other than the ownership. The file times are preserved by
1682 default, but two other flags are offered to disable these and
1683 use the time of extraction.
1684
1685 The one pathname per line format of standard input precludes pathnames
1686 containing <newline>s. Although such pathnames violate the portable
1687 filename guidelines, they may exist and their presence may inhibit
1688 usage of pax within shell scripts. This problem is inherited from his‐
1689 torical archive programs. The problem can be avoided by listing file‐
1690 name arguments on the command line instead of on standard input.
1691
1692 It is almost certain that appropriate privileges are required for pax
1693 to accomplish parts of this volume of IEEE Std 1003.1-2001. Specifi‐
1694 cally, creating files of type block special or character special,
1695 restoring file access times unless the files are owned by the user (the
1696 -t option), or preserving file owner, group, and mode (the -p option)
1697 all probably require appropriate privileges.
1698
1699 In read mode, implementations are permitted to overwrite files when the
1700 archive has multiple members with the same name. This may fail if per‐
1701 missions on the first version of the file do not permit it to be over‐
1702 written.
1703
1704 The cpio and ustar formats can only support files up to 8589934592
1705 bytes (8 * 2^30) in size.
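
The figure above follows from the width of the size fields: c_filesize holds 11 octal digits, and the same digit count is assumed here for the ustar size field. A short sketch of the arithmetic:

```python
DIGITS = 11  # octal digits available for the file size (assumed for ustar)

limit = 8 ** DIGITS              # number of distinct values the field can hold
largest = int("7" * DIGITS, 8)   # largest value that can actually be written

print(limit, largest)
```

Under that assumption the bound of 8 * 2^30 is exclusive: the largest size an 11-digit octal field can record is one octet less.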
1706
1707
1709 The following command:
1710
1711 pax -w -f /dev/rmt/1m .
1712
1713 copies the contents of the current directory to tape drive 1, medium
1714 density (assuming historical System V device naming procedures; the
1715 historical BSD device name would be /dev/rmt9).
1716
1717 The following commands:
1718
1719 mkdir newdir; pax -rw olddir newdir
1720
1721 copy the olddir directory hierarchy to newdir.
1722
1723 pax -r -s ',^//*usr//*,,' -f a.pax
1724
1725 reads the archive a.pax, with all files rooted in /usr in the archive
1726 extracted relative to the current directory.
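
To see what the substitution in the -s example does to individual pathnames, the same expression can be tried in Python, whose regular expression syntax happens to accept this particular pattern unchanged. This is only an illustration of the pattern, not of how pax applies it; the function name strip_usr is hypothetical.

```python
import re

# Old-string part of the -s expression ',^//*usr//*,,' (the new string is empty).
PATTERN = r"^//*usr//*"


def strip_usr(path):
    """Remove a leading /usr/ component, tolerating repeated slashes."""
    return re.sub(PATTERN, "", path, count=1)


for path in ("/usr/foo/bar", "//usr//lib/x", "/etc/passwd"):
    print(path, "->", strip_usr(path))
```

Pathnames that do not begin with /usr/ are left untouched and would therefore still be extracted to their absolute locations.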
1727
1728 Using the option:
1729
1730 -o listopt="%M %(atime)T %(size)D %(name)s"
1731
1732 overrides the default output description in Standard Output and instead
1733 writes:
1734
1735 -rw-rw--- Jan 12 15:53 1492 /usr/foo/bar
1736
1737 Using the options:
1738
1739 -o listopt='%L\t%(size)D\n%.7' \
1740 -o listopt='(name)s\n%(atime)T\n%T'
1741
1742 overrides the default output description in Standard Output and instead
1743 writes:
1744
1745 /usr/foo/bar -> /tmp 1492
1746 /usr/fo
1747 Jan 12 1991
1748 Jan 31 15:53
1749
1750
1752 The pax utility was new for the ISO POSIX-2:1993 standard. It repre‐
1753 sents a peaceful compromise between advocates of the historical tar and
1754 cpio utilities.
1755
1756 A fundamental difference between cpio and tar was in the way directo‐
1757 ries were treated. The cpio utility did not treat directories differ‐
1758 ently from other files, and to select a directory and its contents
1759 required that each file in the hierarchy be explicitly specified. For
1760 tar, a directory matched every file in the file hierarchy it rooted.
1761
1762 The pax utility offers both interfaces; by default, directories map
1763 into the file hierarchy they root. The -d option causes pax to skip any
1764 file not explicitly referenced, as cpio historically did. The
1765 tar-style behavior was chosen as the default because it was believed that
1766 this was the more common usage and because tar is the more commonly
1767 available interface, as it was historically provided on both System V
1768 and BSD implementations.
1769
1770 The data interchange format specification in this volume of IEEE Std
1771 1003.1-2001 requires that processes with "appropriate privileges" shall
1772 always restore the ownership and permissions of extracted files exactly
1773 as archived. If viewed from the historic equivalence between superuser
1774 and "appropriate privileges", there are two problems with this require‐
1775 ment. First, users running as superusers may unknowingly set dangerous
1776 permissions on extracted files. Second, it is needlessly limiting, in
1777 that superusers cannot extract files and own them as superuser unless
1778 the archive was created by the superuser. (It should be noted that
1779 restoration of ownerships and permissions for the superuser, by
1780 default, is historical practice in cpio, but not in tar.) In order to
1781 avoid these two problems, the pax specification has an additional
1782 "privilege" mechanism, the -p option. Only a pax invocation with the
1783 privileges needed, and which has the -p option set using the e specifi‐
1784 cation character, has the "appropriate privilege" to restore full own‐
1785 ership and permission information.
1786
1787 Note also that this volume of IEEE Std 1003.1-2001 requires that the
1788 file ownership and access permissions shall be set, on extraction, in
1789 the same fashion as the creat(2) function when provided with the mode
1790 stored in the archive. This means that the file creation mask of the
1791 user is applied to the file permissions.
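
The effect of the file creation mask on an archived mode can be sketched as a bit operation. This is a simplified model covering only the nine permission bits; the function name creation_mode is hypothetical.

```python
def creation_mode(archived_mode, umask):
    """Permission bits a file receives when created with the archived mode."""
    # Bits set in the file creation mask are cleared from the requested mode.
    return archived_mode & 0o777 & ~umask


print(oct(creation_mode(0o666, 0o022)))
print(oct(creation_mode(0o777, 0o077)))
```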
1792
1793 Users should note that directories may be created by pax while extract‐
1794 ing files with permissions that are different from those that existed
1795 at the time the archive was created. When extracting sensitive informa‐
1796 tion into a directory hierarchy that no longer exists, users are
1797 encouraged to set their file creation mask appropriately to protect
1798 these files during extraction.
1799
1800 The table of contents output is written to standard output to facili‐
1801 tate pipeline processing.
1802
1803 An early proposal had hard links displaying for all pathnames. This
1804 was removed because it complicates the output of the case where -v is
1805 not specified and does not match historical cpio usage. The hard-link
1806 information is available in the -v display.
1807
1808 The description of the -l option allows implementations to make hard
1809 links to symbolic links. IEEE Std 1003.1-2001 does not specify any way
1810 to create a hard link to a symbolic link, but many implementations pro‐
1811 vide this capability as an extension. If there are hard links to sym‐
1812 bolic links when an archive is created, the implementation is required
1813 to archive the hard link in the archive (unless -H or -L is specified).
1814 When in read mode and in copy mode, implementations supporting hard
1815 links to symbolic links should use them when appropriate.
1816
1817 The archive formats inherited from the POSIX.1-1990 standard have cer‐
1818 tain restrictions that have been brought along from historical usage.
1819 For example, there are restrictions on the length of pathnames stored
1820 in the archive. When pax is used in copy (-rw) mode (copying directory
1821 hierarchies), the ability to use extensions from the -x pax format
1822 overcomes these restrictions.
1823
1824 The default blocksize value of 5120 bytes for cpio was selected because
1825 it is one of the standard block-size values for cpio, set when the -B
1826 option is specified. (The other default block-size value for cpio is
1827 512 bytes, and this was considered to be too small.) The default block
1828 value of 10240 bytes for tar was selected because that is the standard
1829 block-size value for BSD tar. The maximum block size of 32256 bytes
1830 (2^15-512 bytes) is the largest multiple of 512 bytes that fits into a
1831 signed 16-bit tape controller transfer register. There are known limi‐
1832 tations in some historical systems that would prevent larger blocks
1833 from being accepted. Historical values were chosen to improve compati‐
1834 bility with historical scripts using dd(1) or similar utilities to
1835 manipulate archives. Also, default block sizes for any file type other
1836 than character special file have been deleted from this volume of IEEE
1837 Std 1003.1-2001 as unimportant and not likely to affect the structure
1838 of the resulting archive.
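
The block-size figures quoted above can be checked with a few lines of arithmetic. A minimal sketch, with hypothetical variable names:

```python
RECORD = 512                   # all block sizes are multiples of 512 bytes
SIGNED_16_MAX = 2 ** 15 - 1    # largest value of a signed 16-bit register

# Largest multiple of 512 that still fits in the register.
max_block = (SIGNED_16_MAX // RECORD) * RECORD

cpio_default = 10 * RECORD     # 5120 bytes
tar_default = 20 * RECORD      # 10240 bytes

print(max_block, cpio_default, tar_default)
```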
1839
1840 Implementations are permitted to modify the block-size value based on
1841 the archive format or the device to which the archive is being written.
1842 This is to provide implementations with the opportunity to take advan‐
1843 tage of special types of devices, and it should not be used without a
1844 great deal of consideration as it almost certainly decreases archive
1845 portability.
1846
1847 The intended use of the -n option was to permit extraction of one or
1848 more files from the archive without processing the entire archive. This
1849 was viewed by the standard developers as offering significant perfor‐
1850 mance advantages over historical implementations. The -n option in
1851 early proposals had three effects; the first was to cause special char‐
1852 acters in patterns to not be treated specially. The second was to cause
1853 only the first file that matched a pattern to be extracted. The third
1854 was to cause pax to write a diagnostic message to standard error when
1855 no file was found matching a specified pattern. Only the second behav‐
1856 ior is retained by this volume of IEEE Std 1003.1-2001, for many rea‐
1857 sons. First, it is in general not acceptable for a single option to
1858 have multiple effects. Second, the ability to make pattern matching
1859 characters act as normal characters is useful for parts of pax other
1860 than file extraction. Third, a finer degree of control over the special
1861 characters is useful because users may wish to normalize only a single
1862 special character in a single filename. Fourth, given a more general
1863 escape mechanism, the previous behavior of the -n option can be easily
1864 obtained using the -s option or a sed script. Finally, writing a diag‐
1865 nostic message when a pattern specified by the user is unmatched by any
1866 file is useful behavior in all cases.
1867
1868 In this version, the -n option was removed from the copy mode synopsis of pax;
1869 it is inapplicable because there are no pattern operands specified in
1870 this mode.
1871
1872 Besides pax, IEEE Std 1003.1-2001 describes another method for copying
1873 subtrees as part of the cp(1) utility. Both methods are
1874 historical practice: cp(1) provides a simpler, more intuitive inter‐
1875 face, while pax offers a finer granularity of control. Each provides
1876 additional functionality to the other; in particular, pax maintains the
1877 hard-link structure of the hierarchy while cp(1) does not. It is the
1878 intention of the standard developers that the results be similar (using
1879 appropriate option combinations in both utilities). The results are not
1880 required to be identical; there seemed insufficient gain to applica‐
1881 tions to balance the difficulty of implementations having to guarantee
1882 that the results would be exactly identical.
1883
1884 A single archive may span more than one file. It is suggested that
1885 implementations provide informative messages to the user on standard
1886 error whenever the archive file is changed.
1887
1888 The -d option (do not create intermediate directories not listed in the
1889 archive) found in early proposals was originally provided as a comple‐
1890 ment to the historic -d option of cpio. It has been deleted.
1891
1892 The -s option in early proposals specified a subset of the substitution
1893 command from the ed utility. As there was no reason for only a subset
1894 to be supported, the -s option is now compatible with the current ed
1895 specification. Since the delimiter can be any non-null character, the
1896 following usage with single spaces is valid:
1897
1898 pax -s " foo bar " ...
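
The role of the delimiter can be illustrated by splitting a replstr on its first character. This is a simplified sketch that does not handle a delimiter escaped inside the old or new string; the function name parse_replstr is hypothetical.

```python
def parse_replstr(replstr):
    """Split an ed-style substitution into (old, new, flags).

    The first character of the string is taken as the delimiter.
    Escaped delimiters are not supported in this sketch.
    """
    if not replstr:
        raise ValueError("empty substitution expression")
    delim = replstr[0]
    parts = replstr[1:].split(delim)
    if len(parts) != 3:
        raise ValueError("expected <d>old<d>new<d>[flags]")
    old, new, flags = parts
    return old, new, flags


print(parse_replstr(" foo bar "))
print(parse_replstr(",^//*usr//*,,"))
```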
1899
1900 The -t description is worded so as to note that this may cause the
1901 access time update caused by some other activity (which occurs while
1902 the file is being read) to be overwritten.
1903
1904 The default behavior of pax with regard to file modification times is
1905 the same as historical implementations of tar. It is not the histori‐
1906 cal behavior of cpio.
1907
1908 Because the -i option uses /dev/tty, utilities without a controlling
1909 terminal are not able to use this option.
1910
1911 The -y option, found in early proposals, has been deleted because a
1912 line containing a single period for the -i option has equivalent func‐
1913 tionality. The special lines for the -i option (a single period and the
1914 empty line) are historical practice in cpio.
1915
1916 In early drafts, a -e charmap option was included to increase portabil‐
1917 ity of files between systems using different coded character sets. This
1918 option was omitted because it was apparent that consensus could not be
1919 formed for it. In this version, the use of UTF-8 should be an adequate
1920 substitute.
1921
1922 The -k option was added to address international concerns about the
1923 dangers involved in the character set transformations of -e (if the
1924 target character set were different from the source, the filenames
1925 might be transformed into names matching existing files) and also was
1926 made more general to protect files transferred between file systems
1927 with different {NAME_MAX} values (truncating a filename on a smaller
1928 system might also inadvertently overwrite existing files). As stated,
1929 it prevents any overwriting, even if the target file is older than the
1930 source. This version adds more granularity of options to solve this
1931 problem by introducing the -o invalid=option (specifically the UTF-8
1932 action). (Note that an existing file that is named with a UTF-8 encoding
1933 is still subject to overwriting in this case. The -k option closes that
1934 loophole.)
1935
1936 Some of the file characteristics referenced in this volume of IEEE Std
1937 1003.1-2001 might not be supported by some archive formats. For exam‐
1938 ple, neither the tar nor cpio formats contain the file access time. For
1939 this reason, the e specification character has been provided, intended
1940 to cause all file characteristics specified in the archive to be
1941 retained.
1942
1943 It is required that extracted directories, by default, have their
1944 access and modification times and permissions set to the values speci‐
1945 fied in the archive. This has obvious problems in that the directories
1946 are almost certainly modified after being extracted and that directory
1947 permissions may not permit file creation. One possible solution is to
1948 create directories with the mode specified in the archive, as modified
1949 by the umask of the user, with sufficient permissions to allow file
1950 creation. After all files have been extracted, pax would then reset the
1951 access and modification times and permissions as necessary.
1952
1953 The list-mode formatting description borrows heavily from the one
1954 defined by the printf(1) utility. However, since there is no separate
1955 operand list to get conversion arguments, the format was extended to
1956 allow specifying the name of the conversion argument as part of the
1957 conversion specification.
1958
1959 The T conversion specifier allows time fields to be displayed in any of
1960 the date formats. Unlike the ls(1) utility, pax does not adjust the
1961 format when the date is less than six months in the past. This makes
1962 parsing the output more predictable.
1963
1964 The D conversion specifier handles the ability to display the
1965 major/minor or file size, as with ls(1), by using %-8(size)D.
1966
1967 The L conversion specifier handles the ls display for symbolic links.
1968
1969 Conversion specifiers were added to generate existing known types used
1970 for ls(1).
1971
1972
1973 pax Interchange Format
1974 The new POSIX data interchange format was developed primarily to sat‐
1975 isfy international concerns that the ustar and cpio formats did not
1976 provide for file, user, and group names encoded in characters outside a
1977 subset of the ISO/IEC 646:1991 standard. The standard developers real‐
1978 ized that this new POSIX data interchange format should be very exten‐
1979 sible because there were other requirements they foresaw in the near
1980 future:
1981
1982 · Support international character encodings and locale information
1983
1984 · Support security information (ACLs, and so on)
1985
1986 · Support future file types, such as realtime or contiguous files
1987
1988 · Include data areas for implementation use
1989
1990 · Support systems with words larger than 32 bits and timers with
1991 subsecond granularity
1992
1993 The following were not goals for this format because these are better
1994 handled by separate utilities or are inappropriate for a portable for‐
1995 mat:
1996
1997 · Encryption
1998
1999 · Compression
2000
2001 · Data translation between locales and codesets
2002
2003 · inode storage
2004
2005 The format chosen to support the goals is an extension of the ustar
2006 format. Of the two formats previously available, only the ustar format
2007 was selected for extensions because:
2008
2009 · It was easier to extend in an upwards-compatible way. It offered
2010 version flags and header block type fields with room for future
2011 standardization. The cpio format, while possessing a more flexi‐
2012 ble file naming methodology, could not be extended without
2013 breaking some theoretical implementation or using a dummy file‐
2014 name that could be a legitimate filename.
2015
2016 · Industry experience since the original "tar wars" fought in
2017 developing the ISO POSIX-1 standard has clearly been in favor of
2018 the ustar format, which is generally the default output format
2019 selected for pax implementations on new systems.
2020
2021 The new format was designed with one additional goal in mind: reason‐
2022 able behavior when an older tar or pax utility happened to read an ar‐
2023 chive. Since the POSIX.1-1990 standard mandated that a "format-reading
2024 utility" had to treat unrecognized typeflag values as regular files,
2025 this allowed the format to include all the extended information in a
2026 pseudo-regular file that preceded each real file. An option is given
2027 that allows the archive creator to set up reasonable names for these
2028 files on the older systems. Also, the normative text suggests that
2029 reasonable file access values be used for this ustar header block. Mak‐
2030 ing these header files inaccessible for convenient reading and deleting
2031 would not be reasonable. File permissions of 600 or 700 are suggested.
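
The pseudo-regular file that precedes a real file can be observed with a short sketch. It uses Python's standard tarfile module (not spax) to write a pax-format archive and then walks the 512-byte header blocks. The offsets used (typeflag at byte 156, size at bytes 124-135) are those of the ustar header layout; the helper name list_typeflags is hypothetical.

```python
import io
import tarfile

BLOCK = 512  # ustar header and data block size

def list_typeflags(archive_bytes):
    """Return (typeflag, size) for each header block in a ustar/pax archive."""
    found = []
    offset = 0
    while offset + BLOCK <= len(archive_bytes):
        header = archive_bytes[offset:offset + BLOCK]
        if header == b"\0" * BLOCK:          # end-of-archive marker
            break
        typeflag = header[156:157]
        size = int(header[124:136].rstrip(b"\0 ") or b"0", 8)
        found.append((typeflag, size))
        # Skip the header plus the data, rounded up to whole blocks.
        offset += BLOCK + (size + BLOCK - 1) // BLOCK * BLOCK
    return found

# A non-ASCII member name cannot be stored in the ustar header alone,
# so the writer emits an extended header (typeflag x) in front of it.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tar:
    data = b"hello\n"
    info = tarfile.TarInfo(name="gr\u00fc\u00dfe.txt")
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))

flags = list_typeflags(buf.getvalue())
```

An older utility that does not know typeflag x would extract the first entry as an ordinary file containing the extended header records, which is the behavior the paragraph above relies on.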
2032
2033 The ustar typeflag field was used to accommodate the additional func‐
2034 tionality of the new format rather than magic or version because the
2035 POSIX.1-1990 standard (and, by reference, the previous version of pax),
2036 mandated the behavior of the format-reading utility when it encountered
2037 an unknown typeflag, but was silent about the other two fields.
2038
2039 Early proposals of the first revision to IEEE Std 1003.1-2001 contained
2040 a proposed archive format that was based on compatibility with the
2041 standard for tape files (ISO 1001, similar to the format used histori‐
2042 cally on many mainframes and minicomputers). This format was overly
2043 complex and required considerable overhead in volume and header
2044 records. Furthermore, the standard developers felt that it would not be
2045 acceptable to the community of POSIX developers, so it was later
2046 changed to be a format more closely related to historical practice on
2047 POSIX systems.
2048
2049 The prefix and name split of pathnames in ustar was replaced by the
2050 single path extended header record for simplicity.
2051
2052 The concept of a global extended header (typeflag g) was controversial.
2053 If this were applied to an archive being recorded on magnetic tape, a
2054 few unreadable blocks at the beginning of the tape could be a serious
2055 problem; a utility attempting to extract as many files as possible from
2056 a damaged archive could lose a large percentage of file header informa‐
2057 tion in this case. However, if the archive were on a reliable medium,
2058 such as a CD-ROM, the global extended header offers considerable poten‐
2059 tial size reductions by eliminating redundant information. Thus, the
2060 text warns against using the global method for unreliable media and
2061 provides a method for implanting global information in the extended
2062 header for each file, rather than in the typeflag g records.
2063
2064 No facility for data translation or filtering on a per-file basis is
2065 included because the standard developers could not invent an interface
2066 that would allow this in an efficient manner. If a filter, such as
2067 encryption or compression, is to be applied to all the files, it is
2068 more efficient to apply the filter to the entire archive as a single
2069 file. The standard developers considered interfaces that would invoke a
2070 shell script for each file going into or out of the archive, but the
2071 system overhead in this approach was considered to be too high.
2072
2073 One such approach would be to have filter= records that give a pathname
2074 for an executable. When the program is invoked, the file and archive
2075 would be open for standard input/output and all the header fields would
2076 be available as environment variables or command-line arguments. The
2077 standard developers did discuss such schemes, but they were omitted
2078 from IEEE Std 1003.1-2001 due to concerns about excessive overhead.
2079 Also, the program itself would need to be in the archive if it were to
2080 be used portably.
2081
2082 There is currently no portable means of identifying the character
2083 set(s) used for a file in the file system. Therefore, pax has not been
2084 given a mechanism to generate charset records automatically. The only
2085 portable means of doing this is for the user to write the archive using
2086 the -o charset=string command line option. This assumes that all of the
2087 files in the archive use the same encoding. The "implementation-
2088 defined" text is included to allow for a system that can identify the
2089 encodings used for each of its files.
2090
2091 The table of standards that accompanies the charset record description
2092 is acknowledged to be very limited. Only a limited number of character
2093 set standards is reasonable for maximal interchange. Any character set
2094 is, of course, possible by prior agreement. It was suggested that
2095 EBCDIC be listed, but it was omitted because it is not defined by a
2096 formal standard. Formal standards, and then only those with reasonably
2097 large followings, can be included here, simply as a matter of practi‐
2098 cality. The <value>s represent names of officially registered character
2099 sets in the format required by the ISO 2375:1985 standard.
2100
2101 The normal comma or <blank>-separated list rules are not followed in
2102 the case of keyword options to allow ease of argument parsing for
2103 getopts.
2104
2105 Further information on character encodings is in pax Archive Character
2106 Set Encoding/Decoding.
2107
2108 The standard developers have reserved keyword name space for vendor
2109 extensions. It is suggested that the format to be used is:
2110
2111 VENDOR.keyword
2112
2113 where VENDOR is the name of the vendor or organization in all uppercase
2114 letters. It is further suggested that the keyword following the period
2115 be named differently than any of the standard keywords so that it could
2116 be used for future standardization, if appropriate, by omitting the
2117 VENDOR prefix.
2118
2119 The <length> field in the extended header record was included to make
2120 it simpler to step through the records, even if a record contains an
2121 unknown format (to a particular pax) with complex interactions of spe‐
2122 cial characters. It also provides a minor integrity checkpoint within
2123 the records to aid a program attempting to recover files from a damaged
2124 archive.
2125
2126 There are no extended header versions of the devmajor and devminor
2127 fields because the unspecified format ustar header field should be suf‐
2128 ficient. If they are not, vendor-specific extended keywords (such as
2129 VENDOR.devmajor) should be used.
2130
2131 Device and i-number labeling of files was not adopted from cpio; files
2132 are interchanged strictly on a symbolic name basis, as in ustar.
2133
2134 Just as with the ustar format descriptions, the new format makes no
2135 special arrangements for multi-volume archives. Each of the pax archive
2136 types is assumed to be inside a single POSIX file and splitting that
2137 file over multiple volumes (diskettes, tape cartridges, and so on),
2138 processing their labels, and mounting each in the proper sequence are
2139 considered to be implementation details that cannot be described
2140 portably.
2141
2142 The pax format is intended for interchange, not only for backup on a
2143 single (family of) systems. It is not as densely packed as might be
2144 possible for backup:
2145
2146 · It contains information as coded characters that could be coded
2147 in binary.
2148
2149 · It identifies extended records with name fields that could be
2150 omitted in favor of a fixed-field layout.
2151
2152 · It translates names into a portable character set and identifies
2153 locale-related information, both of which are probably unneces‐
2154 sary for backup.
2155
2156 The requirements on restoring from an archive are slightly different
2157 from the historical wording, allowing for non-monolithic privilege to
2158 bring forward as much as possible. In particular, attributes such as
2159 "high performance file" might be broadly but not universally granted
2160 while set-user-ID or chown(2) might be much more restricted. There is
2161 no implication in IEEE Std 1003.1-2001 that the security information be
2162 honored after it is restored to the file hierarchy, in spite of what
2163 might be improperly inferred by the silence on that topic. That is a
2164 topic for another standard.
2165
2166 Links are recorded in the fashion described here because a link can be
2167 to any file type. It is desirable in general to be able to restore part
2168 of an archive selectively and restore all of those files completely. If
2169 the data is not associated with each link, it is not possible to do
2170 this. However, the data associated with a file can be large, and when
2171 selective restoration is not needed, this can be a significant burden.
2172 The archive is structured so that files that have no associated data
2173 can always be restored by the name of any link, and
2174 the user may choose whether data is recorded with each instance of a
2175 file that contains data. The format permits mixing of both types of
2176 links in a single archive; this can be done for special needs, and pax
2177 is expected to interpret such archives on input properly, despite the
2178 fact that there is no pax option that would force this mixed case on
2179 output. (When -o linkdata is used, the output must contain the dupli‐
2180 cate data, but the implementation is free to include it or omit it when
2181 -o linkdata is not used.)
2182
2183 The time values are included as extended header records for those
2184 implementations needing more than the eleven octal digits allowed by
2185 the ustar format. Portable file timestamps cannot be negative. If pax
2186 encounters a file with a negative timestamp in copy or write mode, it
2187 can reject the file, substitute a non-negative timestamp, or generate a
2188 non-portable timestamp with a leading '-'. Even though some implementa‐
2189 tions can support finer file-time granularities than seconds, the nor‐
2190 mative text requires support only for seconds since the Epoch because
2191 the ISO POSIX-1 standard states them that way. The ustar format
2192 includes only mtime; the new format adds atime and ctime for symmetry.
2193 The atime access time restored to the file system will be affected by
2194 the -p a and -p e options. The ctime creation time (actually inode mod‐
2195 ification time) is described with "appropriate privilege" so that it
2196 can be ignored when writing to the file system. POSIX does not provide
2197 a portable means to change file creation time. Nothing is intended to
2198 prevent a non-portable implementation of pax from restoring the value.
2199
2200 The gid, size, and uid extended header records were included to allow
2201 expansion beyond the sizes specified in the regular tar header. New
2202 file system architectures are emerging that will exhaust the 12-digit
2203 size field. There are probably not many systems requiring more than 8
2204 digits for user and group IDs, but the extended header values were
2205 included for completeness, allowing overrides for all of the decimal
2206 values in the tar header.
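
The following sketch shows how a writer might decide which values need an extended header record. The digit counts are an assumption: the text speaks of 8-digit and 12-digit fields, and the sketch assumes one position of each field is taken by a terminator, which matches the eleven octal digits mentioned for time values and the 8 gigabyte limit noted under Issue 5. All names are hypothetical.

```python
# Assumed number of usable octal digits per ustar header field.
USTAR_DIGITS = {"size": 11, "mtime": 11, "uid": 7, "gid": 7}

def fits_octal(value, digits):
    """True if value is non-negative and fits in the given octal digits."""
    return 0 <= value < 8 ** digits

def records_needed(attrs):
    """Return the sorted keywords whose values overflow the ustar header."""
    return sorted(
        key for key, value in attrs.items()
        if key in USTAR_DIGITS and not fits_octal(value, USTAR_DIGITS[key])
    )
```

With eleven octal digits the largest representable size is 8**11 - 1 bytes, one byte short of 8 gigabytes; a negative timestamp never fits, which is why the text treats it as non-portable.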
2207
2208 The standard developers intended to describe the effective results of
2209 pax with regard to file ownerships and permissions; implementations are
2210 not restricted in timing or sequencing the restoration of such, pro‐
2211 vided the results are as specified.
2212
2213 Much of the text describing the extended headers refers to use in
2214 "write or copy modes". The copy mode references are due to the norma‐
2215 tive text: "The effect of the copy shall be as if the copied files were
2216 written to an archive file and then subsequently extracted ...". There
2217 is certainly no way to test whether pax is actually generating the
2218 extended headers in copy mode, but the effects must be as if it had.
2219
2220
2221 pax Archive Character Set Encoding/Decoding
2222 There is a need to exchange archives of files between systems of dif‐
2223 ferent native codesets. Filenames, group names, and user names must be
2224 preserved to the fullest extent possible when an archive is read on the
2225 receiving platform. Translation of the contents of files is not within
2226 the scope of the pax utility.
2227
2228 There will also be the need to represent characters that are not avail‐
2229 able on the receiving platform. These unsupported characters cannot be
2230 automatically folded to the local set of characters due to the chance
2231 of collisions. This could result in overwriting previous extracted
2232 files from the archive or pre-existing files on the system.
2233
2234 For these reasons, the codeset used to represent characters within the
2235 extended header records of the pax archive must be sufficiently rich to
2236 handle all commonly used character sets. The fields requiring transla‐
2237 tion include, at a minimum, filenames, user names, group names, and
2238 link pathnames. Implementations may wish to have localized extended
2239 keywords that use non-portable characters.
2240
2241 The standard developers considered the following options:
2242
2243 · The archive creator specifies the well-defined name of the
2244 source codeset. The receiver must then recognize the codeset
2245 name and perform the appropriate translations to the destination
2246 codeset.
2247
2248 · The archive creator includes within the archive the character
2249 mapping table for the source codeset used to encode extended
2250 header records. The receiver must then read the character map‐
2251 ping table and perform the appropriate translations to the des‐
2252 tination codeset.
2253
2254 · The archive creator translates the extended header records in
2255 the source codeset into a canonical form. The receiver must then
2256 perform the appropriate translations to the destination codeset.
2257
2258 The approach that incorporates the name of the source codeset poses the
2259 problem of codeset name registration, and makes the archive useless to
2260 pax archive decoders that do not recognize that codeset.
2261
2262 Because parts of an archive may be corrupted, the standard developers
2263 felt that including the character map of the source codeset was too
2264 fragile. The loss of this one key component could result in making the
2265 entire archive useless. (The difference between this and the global
2266 extended header decision was that the latter has a workaround,
2267 duplicating extended header records on unreliable media, but this
2268 would be too burdensome for large character set maps.)
2269
2270 Both of the above approaches also put an undue burden on the pax ar‐
2271 chive receiver to handle the cross-product of all source and destina‐
2272 tion codesets.
2273
2274 To simplify the translation from the source codeset to the canonical
2275 form and from the canonical form to the destination codeset, the stan‐
2276 dard developers decided that the internal representation should be a
2277 stateless encoding. A stateless encoding is one where each codepoint
2278 has the same meaning, without regard to the decoder being in a specific
2279 state. An example of a stateful encoding would be the Japanese Shift-
2280 JIS; an example of a stateless encoding would be the ISO/IEC 646:1991
2281 standard (equivalent to 7-bit ASCII).
2282
2283 For these reasons, the standard developers decided to adopt a canonical
2284 format for the representation of file information strings. The obvious,
2285 well-endorsed candidate is the ISO/IEC 10646-1:2000 standard (based in
2286 part on Unicode), which can be used to represent the characters of vir‐
2287 tually all standardized character sets. The standard developers ini‐
2288 tially agreed upon using UCS2 (16-bit Unicode) as the internal repre‐
2289 sentation. This repertoire of characters provides a sufficiently rich
2290 set to represent all commonly-used codesets.
2291
2292 However, the standard developers found that the 16-bit Unicode repre‐
2293 sentation had some problems. It forced the issue of standardizing byte
2294 ordering. The 2-byte length of each character made the extended header
2295 records twice as long for the case of strings coded entirely from his‐
2296 torical 7-bit ASCII. For these reasons, the standard developers chose
2297 the UTF-8 defined in the ISO/IEC 10646-1:2000 standard. This multi-byte
2298 representation encodes UCS2 or UCS4 characters reliably and determinis‐
2299 tically, eliminating the need for a canonical byte ordering. In addi‐
2300 tion, NUL octets and other characters possibly confusing to POSIX file
2301 systems do not appear, except to represent themselves. It was realized
2302 that certain national codesets take up more space after the encoding,
2303 due to their placement within the UCS range; it was felt that the use‐
2304 fulness of the encoding of the names outweighs the disadvantage of size
2305 increase for file, user, and group names.
2306
2307 The encoding of UTF-8 is as follows:
2308
2309 UCS4 Hex Encoding UTF-8 Binary Encoding
2310 00000000-0000007F 0xxxxxxx
2311 00000080-000007FF 110xxxxx 10xxxxxx
2312 00000800-0000FFFF 1110xxxx 10xxxxxx 10xxxxxx
2313 00010000-001FFFFF 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
2314 00200000-03FFFFFF 111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
2315 04000000-7FFFFFFF 1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
2316
2317 where each 'x' represents a bit value from the character being trans‐
2318 lated.
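
The table can be turned into a small encoder. This is a sketch of the bit layout shown above, including the historical 5- and 6-octet forms; current UTF-8 definitions restrict the range to four octets, so production code should normally use the platform's own encoder. The function name is hypothetical.

```python
# (largest code point, lead-octet marker, number of continuation octets)
_RANGES = (
    (0x7FF, 0xC0, 1),
    (0xFFFF, 0xE0, 2),
    (0x1FFFFF, 0xF0, 3),
    (0x3FFFFFF, 0xF8, 4),
    (0x7FFFFFFF, 0xFC, 5),
)

def utf8_encode(code_point):
    """Encode one UCS4 value according to the table above."""
    if not 0 <= code_point <= 0x7FFFFFFF:
        raise ValueError("code point out of range")
    if code_point <= 0x7F:
        return bytes([code_point])           # 0xxxxxxx
    for limit, lead, ntrail in _RANGES:
        if code_point <= limit:
            # The lead octet carries the most significant bits.
            out = [lead | (code_point >> (6 * ntrail))]
            # Each continuation octet is 10xxxxxx with the next six bits.
            for i in range(ntrail - 1, -1, -1):
                out.append(0x80 | ((code_point >> (6 * i)) & 0x3F))
            return bytes(out)
```

For values up to U+10FFFF (surrogates excluded) the result matches Python's built-in str.encode("utf-8").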
2319
2320
2321 ustar Interchange Format
2322 The description of the ustar format reflects numerous enhancements over
2323 pre-1988 versions of the historical tar utility. The goal of these
2324 changes was not only to provide the functional enhancements desired,
2325 but also to retain compatibility between new and old versions. This
2326 compatibility has been retained. Archives written using the old archive
2327 format are compatible with the new format.
2328
2329 Implementors should be aware that the previous file format did not
2330 include a mechanism to archive directory type files. For this reason,
2331 the convention of using a filename ending with slash was adopted to
2332 specify a directory on the archive.
2333
2334 The total size of the name and prefix fields has been set to meet the
2335 minimum requirements for {PATH_MAX}. If a pathname will fit within the
2336 name field, it is recommended that the pathname be stored there without
2337 the use of the prefix field. Although the name field is known to be too
2338 small to contain {PATH_MAX} characters, the value was not changed in
2339 this version of the archive file format to retain backwards-compatibil‐
2340 ity, and instead the prefix was introduced. Also, because of the ear‐
2341 lier version of the format, there is no way to remove the restriction
2342 on the linkname field being limited in size to just that of the name
2343 field.
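
A sketch of the name/prefix split follows, assuming the ustar field sizes of 100 octets for name and 155 octets for prefix, with the pathname reassembled as prefix, slash, name. The function name is hypothetical, and the choice of the first usable slash is arbitrary.

```python
NAME_SIZE = 100     # assumed size of the ustar name field
PREFIX_SIZE = 155   # assumed size of the ustar prefix field

def split_ustar_path(path):
    """Split a pathname (bytes) into (prefix, name), or return None.

    A pathname that fits in the name field is stored there whole, as the
    text recommends. Otherwise it is split at a slash; the slash itself
    is not stored in either field.
    """
    if len(path) <= NAME_SIZE:
        return b"", path
    lowest = len(path) - NAME_SIZE - 1       # name part must fit
    highest = min(PREFIX_SIZE, len(path) - 2)  # prefix must fit, name non-empty
    for i in range(max(lowest, 1), highest + 1):
        if path[i:i + 1] == b"/":
            return path[:i], path[i + 1:]
    return None  # not representable without an extended header record
```

A None result corresponds to the case where only the pax path extended header record can carry the pathname.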
2344
2345 The size field is required to be meaningful in all implementation
2346 extensions, although it could be zero. This is required so that the
2347 data blocks can always be properly counted.
2348
2349 It is suggested that if device special files need to be represented
2350 that cannot be represented in the standard format, that one of the
2351 extension types (A-Z) be used, and that the additional information for
2352 the special file be represented as data and be reflected in the size
2353 field.
2354
2355 Attempting to restore a special file type, where it is converted to
2356 ordinary data and conflicts with an existing filename, need not be spe‐
2357 cially detected by the utility. If run as an ordinary user, pax should
2358 not be able to overwrite the entries in, for example, /dev in any case
2359 (whether the file is converted to another type or not). If run as a
2360 privileged user, it should be able to do so, and it would be considered
2361 a bug if it did not. The same is true of ordinary data files and simi‐
2362 larly named special files; it is impossible to anticipate the needs of
2363 the user (who could really intend to overwrite the file), so the behav‐
2364 ior should be predictable (and thus regular) and rely on the protection
2365 system as required.
2366
2367 The value 7 in the typeflag field is intended to define how contiguous
2368 files can be stored in a ustar archive. IEEE Std 1003.1-2001 does not
2369 require the contiguous file extension, but does define a standard way
2370 of archiving such files so that all conforming systems can interpret
2371 these file types in a meaningful and consistent manner. On a system
2372 that does not support extended file types, the pax utility should do
2373 the best it can with the file and go on to the next.
2374
2375 The file protection modes are those conventionally used by the ls(1)
2376 utility. This is extended beyond the usage in the ISO POSIX-2 standard
2377 to support the "shared text" or "sticky" bit. It is intended that the
2378 conformance document should not document anything beyond the existence
2379 of and support of such a mode. Further extensions are expected to
2380 these bits, particularly with overloading the set-user-ID and set-
2381 group-ID flags.
2382
2383
2384 cpio Interchange Format
2385 The reference to appropriate privilege in the cpio format refers to an
2386 error on standard output; the ustar format does not make comparable
2387 statements.
2388
2389 The model for this format was the historical System V cpio -c data
2390 interchange format. This model documents the portable version of the
2391 cpio format and not the binary version. It has the flexibility to
2392 transfer data of any type described within IEEE Std 1003.1-2001, yet is
2393 extensible to transfer data types specific to extensions beyond IEEE
2394 Std 1003.1-2001 (for example, contiguous files). Because it describes
2395 existing practice, there is no question of maintaining upwards-compati‐
2396 bility.
2397
2398
2399 cpio Header
2400 There has been some concern that the size of the c_ino field of the
2401 header is too small to handle those systems that have very large inode
2402 numbers. However, the c_ino field in the header is used strictly as a
2403 hard-link resolution mechanism for archives. It is not necessarily the
2404 same value as the inode number of the file in the location from which
2405 that file is extracted.
2406
2407 The name c_magic is based on historical usage.
2408
2409
2410 cpio Filename
2411 For most historical implementations of the cpio utility, {PATH_MAX}
2412 octets can be used to describe the pathname without the addition of any
2413 other header fields (the NUL character would be included in this
2414 count). {PATH_MAX} is the minimum value for pathname size, documented
2415 as 256 bytes. However, an implementation may use c_namesize to deter‐
2416 mine the exact length of the pathname. With the current description of
2417 the <cpio.h> header, this pathname size can be as large as a number
2418 that is described in six octal digits.
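
The role of c_namesize can be illustrated with a sketch parser for the portable (character) cpio header. The field order and widths below are those of the historical cpio -c header as commonly documented (eleven fields, 76 octets in total, magic 070707); treat them as an assumption to be checked against <cpio.h> and the cpio format description. The function name is hypothetical.

```python
# (field name, width in octal digits), in header order
FIELDS = (
    ("c_magic", 6), ("c_dev", 6), ("c_ino", 6), ("c_mode", 6),
    ("c_uid", 6), ("c_gid", 6), ("c_nlink", 6), ("c_rdev", 6),
    ("c_mtime", 11), ("c_namesize", 6), ("c_filesize", 11),
)
HEADER_SIZE = sum(width for _, width in FIELDS)
MAX_NAMESIZE = 0o777777   # largest value six octal digits can hold

def parse_cpio_header(data):
    """Return (fields, pathname) for one header at the start of data."""
    fields = {}
    pos = 0
    for name, width in FIELDS:
        fields[name] = int(data[pos:pos + width], 8)
        pos += width
    if fields["c_magic"] != 0o070707:
        raise ValueError("not a portable cpio header")
    # c_namesize counts the terminating NUL, which is dropped here.
    pathname = data[pos:pos + fields["c_namesize"] - 1]
    return fields, pathname
```

The pathname length comes solely from c_namesize, so it is bounded by six octal digits rather than by {PATH_MAX}.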
2419
2420 Two values are documented under the c_mode field values to provide for
2421 extensibility for known file types:
2422
2423 0110 000
2424 Reserved for contiguous files. The implementation may treat the
2425 rest of the information for this archive like a regular file. If
2426 this file type is undefined, the implementation may create the
2427 file as a regular file.
2428
2429 This provides for extensibility of the cpio format while allowing for
2430 the ability to read old archives. Files of an unknown type may be read
2431 as "regular files" on some implementations. On a system that does not
2432 support extended file types, the pax utility should do the best it can
2433 with the file and go on to the next.
2434
2435
2437 None.
2438
2439
2441_________________________________________________________________
2442
2443
2445 Shell Command Language, cp(1), ed(1), getopts(1), ls(1), printf(3), the
2446 Base Definitions volume of IEEE Std 1003.1-2001, <cpio.h>, the System
2447 Interfaces volume of IEEE Std 1003.1-2001, chown(2), creat(2),
2448 mkdir(2), mkfifo(3), stat(2), utime(2), write(2).
2449
2450
2452 First released in Issue 4.
2453
2454
2455 Issue 5
2456 A note is added to the APPLICATION USAGE indicating that the cpio and
2457 tar formats can only support files up to 8 gigabytes in size.
2458
2459
2460 Issue 6
2461 The pax utility is aligned with the IEEE P1003.2b draft standard:
2462
2463 · Support has been added for symbolic links in the options and
2464 interchange formats.
2465
2466 · A new format has been devised, based on extensions to ustar.
2467
2468 · References to the "extended" tar and cpio formats derived from
2469 the POSIX.1-1990 standard have been changed to remove the
2470 "extended" adjective because this could cause confusion with the
2471 extended tar header added in this revision. (All references to
2472 tar are actually to ustar.)
2473
2474 The TZ entry is added to the ENVIRONMENT VARIABLES section.
2475
2476 IEEE PASC Interpretation 1003.2 #168 is applied, clarifying that
2477 mkdir(2) and mkfifo(3) calls can ignore an [EEXIST] error when extract‐
2478 ing an archive.
2479
2480 IEEE PASC Interpretation 1003.2 #180 is applied, clarifying how
2481 extracted files are created when in read mode.
2482
2483 IEEE PASC Interpretation 1003.2 #181 is applied, clarifying the
2484 description of the -t option.
2485
2486 IEEE PASC Interpretation 1003.2 #195 is applied.
2487
2488 IEEE PASC Interpretation 1003.2 #206 is applied, clarifying the han‐
2489 dling of links for the -H, -L, and -l options.
2490
2491 IEEE Std 1003.1-2001/Cor 1-2002, item XCU/TC1/D6/35 is applied, adding
2492 the process ID of the pax process into certain fields. This change pro‐
2493 vides a method for the implementation to ensure that different
2494 instances of pax extracting a file named /a/b/foo will not collide when
2495 processing the extended header information associated with foo.
2496
2497 IEEE Std 1003.1-2001/Cor 1-2002, item XCU/TC1/D6/36 is applied, chang‐
2498 ing -x B to -x pax in the OPTIONS section.
2499
2500 IEEE Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/20 is applied, updat‐
2501 ing the SYNOPSIS to be consistent with the normative text.
2502
2503 IEEE Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/21 is applied, updat‐
2504 ing the DESCRIPTION to describe the behavior when files to be linked
2505 are symbolic links and the system is not capable of making hard links
2506 to symbolic links.
2507
2508 IEEE Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/22 is applied, updat‐
2509 ing the OPTIONS section to describe the behavior for how multiple
2510 options are to be handled.
2511
2512 IEEE Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/23 is applied, updat‐
2513 ing the write option within the OPTIONS section.
2514
2515 IEEE Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/24 is applied, adding
2516 a paragraph into the OPTIONS section that states that specifying more
2517 than one of the mutually-exclusive options (-H and -L) is not consid‐
2518 ered an error and that the last option specified will determine the
2519 behavior of the utility.
2520
2521 IEEE Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/25 is applied, remov‐
2522 ing the ctime paragraph within the EXTENDED DESCRIPTION. There is a
2523 contradiction in the definition of the ctime keyword for the pax
2524 extended header, in that the st_ctime member of the stat structure does
2525 not refer to a file creation time. No field in the standard stat struc‐
2526 ture from <sys/stat.h> includes a file creation time.
2527
2528 IEEE Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/26 is applied, making
2529 it clear that typeflag 1 (ustar Interchange Format) applies not
2530 only to files that are hard-linked, but also to files that are aliased
2531 via symlinks.
2532
2533 IEEE Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/27 is applied, clari‐
2534 fying the cpio c_nlink field.
2535
2536 End of quoted text from the POSIX.1-2001 standard.
2537
2539 The following other options are implemented as extensions to the POSIX
2540 standard. Note that some other non-POSIX options are mentioned in
2541 -help and -xhelp output; these are also supported in spax(1) and are
2542 well described in the star(1) manual page.
2543
2544 -help Prints a summary of the most important options for spax(1) and
2545 exits.
2546
2547 -xhelp Prints a summary of the less important options for spax(1) and
2548 exits.
2549
2550 -version
2551 Prints the spax version number string and exits.
2552
2553 -do-statistics
2554 Prints statistics messages at the end of a spax(1) run.
2555
2556
2563 The Institute of Electrical and Electronics Engineers and The Open
2564 Group have given us permission to reprint portions of their
2565 documentation. In the following statement, the phrase ``this text''
2566 refers to portions of the system documentation.
2567
2568 Portions of this text are reprinted and reproduced in electronic form
2569 in the spax manual, from IEEE Std 1003.1, 2004 Edition, Standard for
2570 Information Technology -- Portable Operating System Interface (POSIX),
2571 The Open Group Base Specifications Issue 6, Copyright (C) 2001-2004 by
2572 the Institute of Electrical and Electronics Engineers, Inc and The Open
2573 Group. In the event of any discrepancy between these versions and the
2574 original IEEE and The Open Group Standard, the original IEEE and The
2575 Open Group Standard is the referee document. The original Standard can
2576 be obtained online at http://www.opengroup.org/unix/online.html.
2577
2580 Joerg Schilling
2581 Seestr. 110
2582 D-13353 Berlin
2583 Germany
2584
2585 Mail bugs and suggestions to:
2586
2587 schilling@fokus.fraunhofer.de or js@cs.tu-berlin.de or
2588 joerg@schily.isdn.cs.tu-berlin.de
2589
2590
2591
2592Joerg Schilling 10/08/01 SPAX(1L)