PAX(P)                     POSIX Programmer's Manual                    PAX(P)

NAME
       pax - portable archive interchange

SYNOPSIS

       pax [-cdnv][-H|-L][-f archive][-s replstr]...[pattern...]

       pax -r[-cdiknuv][-H|-L][-f archive][-o options]...[-p string]...
              [-s replstr]...[pattern...]

       pax -w[-dituvX][-H|-L][-b blocksize][[-a][-f archive]][-o options]...
              [-s replstr]...[-x format][file...]

       pax -r -w[-diklntuvX][-H|-L][-p string]...[-s replstr]...
              [file...] directory

DESCRIPTION

       The  pax  utility  shall read, write, and write lists of the members of
       archive files and copy directory hierarchies. A variety of archive for‐
       mats shall be supported; see the -x format option.

       The  action  to  be  taken  depends  on  the  presence of the -r and -w
       options. The four combinations of -r and -w are referred to as the four
       modes  of  operation:  list, read, write, and copy modes, corresponding
       respectively to the four forms shown in the SYNOPSIS section.

       list   In list mode (when neither -r nor -w is  specified),  pax  shall
              write the names of the members of the archive file read from the
              standard input, with pathnames matching the specified  patterns,
              to  standard  output.  If a named file is of type directory, the
              file hierarchy rooted at that file shall be listed as well.

       read   In read mode (when -r is specified, but -w is  not),  pax  shall
              extract  the  members of the archive file read from the standard
              input, with pathnames matching the  specified  patterns.  If  an
              extracted  file  is of type directory, the file hierarchy rooted
              at that file shall be extracted as  well.  The  extracted  files
              shall  be created performing pathname resolution with the direc‐
              tory in which pax was invoked as the current working directory.

       If an attempt is made to extract a directory when the directory already
       exists, this shall not be considered an error. If an attempt is made to
       extract a FIFO when the FIFO already exists, this shall not be  consid‐
       ered an error.

       The  ownership,  access,  and  modification times, and file mode of the
       restored files are discussed under the -p option.

       write  In write mode (when -w is specified, but -r is not),  pax  shall
              write  the  contents of the file operands to the standard output
              in an archive format. If no file operands are specified, a  list
              of  files to copy, one per line, shall be read from the standard
              input. A file of type directory shall include all of  the  files
              in the file hierarchy rooted at the file.

       copy   In copy mode (when both -r and -w are specified), pax shall copy
              the file operands to the destination directory.

       If no file operands are specified, a list of files  to  copy,  one  per
       line,  shall  be read from the standard input. A file of type directory
       shall include all of the files in the  file  hierarchy  rooted  at  the
       file.

       The  effect of the copy shall be as if the copied files were written to
       an archive file and then subsequently extracted, except that there  may
       be  hard links between the original and the copied files. If the desti‐
       nation directory is a subdirectory of one of the files  to  be  copied,
       the  results are unspecified. If the destination directory is a file of
       a   type   not   defined   by   the   System   Interfaces   volume   of
       IEEE Std 1003.1-2001,  the  results  are implementation-defined; other‐
       wise, it shall be an error for the file named by the directory  operand
       not  to  exist,  not  be writable by the user, or not be a file of type
       directory.
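
       The four modes described above can be exercised end to end with a short
       script. The following is a minimal sketch, not part of the standard
       text: the function name pax_modes_demo, the file names, and the use of
       mktemp -d (widely available, but not specified by this volume) are
       assumptions made for illustration. The function skips its work when no
       pax utility is installed.

```shell
# Hypothetical helper: runs write, list, read and copy mode on a scratch tree.
pax_modes_demo() {
    command -v pax >/dev/null 2>&1 || { echo "pax not installed; skipping"; return 0; }
    work=$(mktemp -d) || return 1
    (
        cd "$work" &&
        mkdir -p src/sub dest extract &&
        echo hello > src/sub/file.txt &&
        pax -w -f demo.pax src &&                  # write mode: archive src
        pax -f demo.pax &&                         # list mode: print member names
        (cd extract && pax -r -f ../demo.pax) &&   # read mode: extract into cwd
        pax -r -w src dest &&                      # copy mode: src -> dest/src
        cmp src/sub/file.txt extract/src/sub/file.txt &&
        cmp src/sub/file.txt dest/src/sub/file.txt
    )
    status=$?
    rm -rf "$work"
    return $status
}

pax_modes_demo
```

       The archive is written with -f so that the default output format,
       which is implementation-defined, is whatever the local pax chooses.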

       In read or copy modes, if intermediate  directories  are  necessary  to
       extract  an archive member, pax shall perform actions equivalent to the
       mkdir()  function  defined  in  the   System   Interfaces   volume   of
       IEEE Std 1003.1-2001, called with the following arguments:

        * The intermediate directory used as the path argument

        * The  value  of  the  bitwise-inclusive  OR  of S_IRWXU, S_IRWXG, and
          S_IRWXO as the mode argument
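
       As a small worked example of the mode argument above, the three
       symbols can be ORed together in shell arithmetic. The octal values
       used here are the ones <sys/stat.h> assigns to S_IRWXU, S_IRWXG, and
       S_IRWXO; the function name intermediate_dir_mode is hypothetical.

```shell
S_IRWXU=0700   # read, write, search by owner
S_IRWXG=0070   # read, write, search by group
S_IRWXO=0007   # read, write, search by others

# Print the mode argument passed to the mkdir()-equivalent action, in octal.
intermediate_dir_mode() {
    printf '%o\n' "$(( $S_IRWXU | $S_IRWXG | $S_IRWXO ))"
}

intermediate_dir_mode    # prints 777
```

       As with mkdir() itself, the mode actually given to the new directory
       is further restricted by the file mode creation mask of the process.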

       If any specified pattern or file operands are not matched by  at  least
       one  file  or  archive  member, pax shall write a diagnostic message to
       standard error for each one that did not match and exit with a non-zero
       exit status.

       The archive formats described in the EXTENDED DESCRIPTION section shall
       be automatically detected on input. The default output  archive  format
       shall be implementation-defined.

       A  single archive can span multiple files. The pax utility shall deter‐
       mine, in an implementation-defined manner, what file to read  or  write
       as the next file.

       If  the  selected  archive  format supports the specification of linked
       files, it shall be an error if these files cannot be  linked  when  the
       archive  is  extracted. For archive formats that do not store file con‐
       tents with each name that causes a hard link, if the file that contains
       the  data  is  not  extracted  during this pax session, either the data
       shall be restored from the original file, or a diagnostic message shall
       be  displayed  with  the name of a file that can be used to extract the
       data.

       In traversing directories, pax shall detect infinite loops; that is,
       entering a previously visited directory that is an ancestor of the last
       file visited. When it detects an infinite loop, pax shall write a
       diagnostic message to standard error and shall terminate.

OPTIONS

       The  pax  utility  shall  conform  to  the  Base  Definitions volume of
       IEEE Std 1003.1-2001, Section 12.2, Utility Syntax  Guidelines,  except
       that the order of presentation of the -o, -p, and -s options is signif‐
       icant.

       The following options shall be supported:

       -r     Read an archive file from standard input.

       -w     Write files to the standard output in the specified archive for‐
              mat.

       -a     Append  files  to  the end of the archive. It is implementation-
              defined which devices on the  system  support  appending.  Addi‐
              tional    file   formats   unspecified   by   this   volume   of
              IEEE Std 1003.1-2001 may impose restrictions on appending.

       -b  blocksize
              Block the output at a positive decimal integer number  of  bytes
              per  write  to the archive file. Devices and archive formats may
              impose restrictions on blocking. Blocking shall be automatically
              determined on input. Conforming applications shall not specify a
              blocksize value larger than 32256. Default blocking when  creat‐
              ing  archives  depends on the archive format. (See the -x option
              below.)

       -c     Match all file or archive members except those specified by  the
              pattern or file operands.

       -d     Cause  files  of  type directory being copied or archived or ar‐
              chive members of type directory being  extracted  or  listed  to
              match  only  the  file or archive member itself and not the file
              hierarchy rooted at the file.

       -f  archive
              Specify the pathname of the input or output archive,  overriding
              the  default  standard input (in list or read modes) or standard
              output (write mode).

       -H     If a symbolic link referencing a file of type directory is spec‐
              ified  on the command line, pax shall archive the file hierarchy
              rooted in the file referenced by the link, using the name of the
              link as the root of the file hierarchy. Otherwise, if a symbolic
              link referencing a file of any other file  type  which  pax  can
              normally  archive  is  specified  on  the command line, then pax
              shall archive the file referenced by the link, using the name of
              the  link. The default behavior shall be to archive the symbolic
              link itself.

       -i     Interactively rename files or archive members. For each  archive
              member  matching a pattern operand or file matching a file oper‐
              and, a prompt shall be written to the file /dev/tty.  The prompt
              shall  contain  the  name of the file or archive member, but the
              format is otherwise unspecified. A line shall then be read  from
              /dev/tty.  If  this  line  is  blank, the file or archive member
              shall be skipped. If this line consists of a single period,  the
              file  or  archive member shall be processed with no modification
              to its name. Otherwise, its name shall be replaced with the con‐
              tents of the line. The pax utility shall immediately exit with a
              non-zero exit status if end-of-file is encountered when  reading
              a response or if /dev/tty cannot be opened for reading and writ‐
              ing.

       The results of extracting a hard link to a file that has  been  renamed
       during extraction are unspecified.
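
       The three possible responses to the -i prompt can be modeled as a small
       decision function. This is an illustrative sketch only: the function
       name interactive_name is hypothetical, and it treats only an empty
       response as "blank", which is an assumption about how an implementation
       reads the line.

```shell
# $1 = current member name, $2 = line read from /dev/tty (without newline).
# Prints the name to use, or returns 1 when the member is to be skipped.
interactive_name() {
    case $2 in
        '') return 1 ;;              # blank line: skip the member
        .)  printf '%s\n' "$1" ;;    # single period: keep the name unchanged
        *)  printf '%s\n' "$2" ;;    # anything else: becomes the new name
    esac
}

interactive_name old.txt .          # prints old.txt
interactive_name old.txt new.txt    # prints new.txt
interactive_name old.txt '' || echo "skipped"
```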

       -k     Prevent the overwriting of existing files.

       -l     (The letter ell.) In copy mode, hard links shall be made between
              the source and destination file hierarchies  whenever  possible.
              If  specified in conjunction with -H or -L, when a symbolic link
              is encountered, the hard link created in  the  destination  file
              hierarchy  shall be to the file referenced by the symbolic link.
              If specified when neither -H nor -L is specified,  when  a  sym‐
              bolic  link  is  encountered,  the implementation shall create a
              hard link to the symbolic link in the source file  hierarchy  or
              copy the symbolic link to the destination.

       -L     If a symbolic link referencing a file of type directory is spec‐
              ified on the command line or encountered during the traversal of
              a file hierarchy, pax shall archive the file hierarchy rooted in
              the file referenced by the link, using the name of the  link  as
              the  root  of  the file hierarchy. Otherwise, if a symbolic link
              referencing a file of any other file type which pax can normally
              archive  is  specified on the command line or encountered during
              the traversal of a file hierarchy, pax shall  archive  the  file
              referenced  by the link, using the name of the link. The default
              behavior shall be to archive the symbolic link itself.

       -n     Select the first archive member that matches each pattern  oper‐
              and.   No more than one archive member shall be matched for each
              pattern (although members of type directory  shall  still  match
              the file hierarchy rooted at that file).

       -o  options
              Provide  information  to  the implementation to modify the algo‐
              rithm for extracting or writing  files.  The  value  of  options
              shall  consist  of  one  or more comma-separated keywords of the
              form:

              keyword[[:]=value][,keyword[[:]=value], ...]

       Some keywords apply only to certain file  formats,  as  indicated  with
       each  description.  Use  of  keywords that are inapplicable to the file
       format being processed produces undefined results.

       Keywords in the options argument shall be a  string  that  would  be  a
       valid  portable filename as described in the Base Definitions volume of
       IEEE Std 1003.1-2001, Section 3.276, Portable Filename Character Set.

       Note:
              Keywords are not expected to be filenames, merely to follow  the
              same character composition rules as portable filenames.

       Keywords  can  be preceded with white space. The value field shall con‐
       sist of zero or more characters; within value,  the  application  shall
       precede any literal comma with a backslash, which shall be ignored, but
       preserves the comma as part of value. A comma as the  final  character,
       or  a  comma followed solely by white space as the final characters, in
       options shall be ignored. Multiple -o options can be specified; if key‐
       words  given  to  these  multiple -o options conflict, the keywords and
       values appearing later in command line sequence shall  take  precedence
       and the earlier shall be silently ignored. The following keyword values
       of options shall be supported for the file formats as indicated:
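
       The comma and backslash rules above can be sketched as a small splitter
       that prints one keyword per line. This is not how any particular pax
       implementation parses -o; the function name split_o_options is
       hypothetical, and skipping empty items between commas is an assumption
       of the sketch.

```shell
# Split an -o option-argument into its keyword[[:]=value] items.
# A backslash before a comma is dropped and the comma is kept in the value;
# leading white space and a trailing comma are ignored.
split_o_options() {
    printf '%s\n' "$1" | awk '{
        out = ""
        n = length($0)
        for (i = 1; i <= n; i++) {
            c = substr($0, i, 1)
            if (c == "\\" && substr($0, i + 1, 1) == ",") {
                out = out ","      # escaped comma stays part of the value
                i++
            } else if (c == ",") {
                print_item(out)    # unescaped comma ends the item
                out = ""
            } else {
                out = out c
            }
        }
        print_item(out)
    }
    function print_item(s) {
        sub(/^[ \t]+/, "", s)
        if (s != "") print s
    }'
}

split_o_options 'comment=a\,b, times,'
# comment=a,b
# times
```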

       delete=pattern

              (Applicable only to the -x pax format.) When used  in  write  or
              copy  mode,  pax shall omit from extended header records that it
              produces any keywords matching the string pattern. When used  in
              read  or  list  mode, pax shall ignore any keywords matching the
              string pattern in the extended header records.  In  both  cases,
              matching  shall be performed using the pattern matching notation
              described in Patterns Matching a Single Character  and  Patterns
              Matching Multiple Characters. For example:

                     -o delete=security.*

              would  suppress  security-related  information. See pax Extended
              Header for extended header record keyword usage.

       exthdr.name=string

              (Applicable only to the -x pax format.) This keyword allows user
              control  over  the  name  that  is written into the ustar header
              blocks for the extended header produced under the  circumstances
              described in pax Header Block. The name shall be the contents of
              string, after the following character substitutions have been
              made:

                    string
                    Includes:   Replaced By:
                    %d          The directory name of the file, equiva‐
                                lent to the result of the dirname util‐
                                ity on the translated pathname.
                    %f          The filename of the file, equivalent to
                                the result of the basename utility on
                                the translated pathname.
                    %p          The process ID of the pax process.
                    %%          A '%' character.

              Any other '%' characters in string produce undefined results.

              If no -o exthdr.name=string is specified, pax shall use the
              following default value:

                     %d/PaxHeaders.%p/%f
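
       To make the default template concrete, the substitutions can be
       reproduced with the dirname and basename utilities that the table
       refers to. The function name exthdr_default_name and the sample
       pathname and process ID are made up for illustration.

```shell
# $1 = translated pathname of the file, $2 = process ID of the pax process.
# Expands the default template %d/PaxHeaders.%p/%f.
exthdr_default_name() {
    printf '%s/PaxHeaders.%s/%s\n' "$(dirname "$1")" "$2" "$(basename "$1")"
}

exthdr_default_name usr/share/doc/README 4242
# usr/share/doc/PaxHeaders.4242/README
```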

       globexthdr.name=string

              (Applicable only to the -x pax format.) When used  in  write  or
              copy  mode with the appropriate options, pax shall create global
              extended header records with ustar header blocks  that  will  be
              treated  as regular files by previous versions of pax. This key‐
              word allows user control over the name that is written into  the
              ustar header blocks for global extended header records. The name
              shall be the contents of string, after the  following  character
              substitutions have been made:

                    string
                    Includes:   Replaced By:
                    %n          An integer that represents the sequence
                                number of the global extended header
                                record in the archive, starting at 1.
                    %p          The process ID of the pax process.
                    %%          A '%' character.

              Any other '%' characters in string produce undefined results.

              If no -o globexthdr.name=string is specified, pax shall use the
              following default value:

                     $TMPDIR/GlobalHead.%p.%n

              where $TMPDIR represents the value of  the  TMPDIR  environment
              variable. If TMPDIR is not set, pax shall use /tmp.
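
       The default global header name, including the /tmp fallback, can be
       sketched with a parameter expansion. The function name
       globexthdr_default_name and the sample values are hypothetical; the
       ${TMPDIR-/tmp} form is used because it substitutes /tmp only when
       TMPDIR is not set, matching the wording above.

```shell
# $1 = process ID of the pax process, $2 = sequence number (starting at 1).
# Expands the default template $TMPDIR/GlobalHead.%p.%n.
globexthdr_default_name() {
    printf '%s/GlobalHead.%s.%s\n' "${TMPDIR-/tmp}" "$1" "$2"
}

( TMPDIR=/var/tmp; globexthdr_default_name 4242 1 )
# /var/tmp/GlobalHead.4242.1
( unset TMPDIR; globexthdr_default_name 4242 2 )
# /tmp/GlobalHead.4242.2
```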

       invalid=action

              (Applicable only to the -x pax format.) This keyword allows user
              control over the action pax takes upon encountering values in an
              extended  header  record that, in read or copy mode, are invalid
              in the destination hierarchy or, in list mode, cannot be written
              in  the  codeset  and  current locale of the implementation. The
              following are invalid values that shall be recognized by pax:

                      * In read or copy mode, a filename  or  link  name  that
                        contains  character  encodings invalid in the destina‐
                        tion hierarchy. (For example,  the  name  may  contain
                        embedded NULs.)

                      * In  read or copy mode, a filename or link name that is
                        longer than the maximum  allowed  in  the  destination
                        hierarchy  (for  either  a  pathname  component or the
                        entire pathname).

                      * In list mode, any character  string  value  (filename,
                        link  name, user name, and so on) that cannot be writ‐
                        ten in the codeset and current locale of the implemen‐
                        tation.

              The  following  mutually-exclusive values of the action argument
              are supported:

              bypass
                     In read or copy mode, pax shall bypass the file,  causing
                     no change to the destination hierarchy. In list mode, pax
                     shall write all requested valid values for the file,  but
                     its method for writing invalid values is unspecified.

              rename
                     In  read  or copy mode, pax shall act as if the -i option
                     were in effect for each file  with  invalid  filename  or
                     link name values, allowing the user to provide a replace‐
                     ment name interactively. In list mode, pax  shall  behave
                     identically to the bypass action.

              UTF-8
                     When  used  in  read,  copy, or list mode and a filename,
                     link name, owner name, or any other field in an  extended
                     header  record  cannot  be  translated from the pax UTF-8
                     codeset format to the codeset and current locale  of  the
                     implementation,  pax  shall use the actual UTF-8 encoding
                     for the name.

              write
                     In read or copy mode, pax shall write the file, translat‐
                     ing  or  truncating  the name, regardless of whether this
                     may overwrite an existing file with a valid name. In list
                     mode, pax shall behave identically to the bypass action.

              If no -o invalid= option is specified, pax shall act as if -o
              invalid=bypass were specified. Any overwriting of existing files
              that may be allowed by the -o invalid= actions shall be subject
              to permission (-p) and modification time (-u) restrictions, and
              shall be suppressed if the -k option is also specified.

       linkdata

              (Applicable only to the -x pax format.) In write mode, pax shall
              write  the contents of a file to the archive even when that file
              is merely a hard link to a file whose contents have already been
              written to the archive.

       listopt=format

              This  keyword  specifies  the output format of the table of con‐
              tents produced when the -v option is specified in list mode. See
              List Mode Format Specifications. To avoid ambiguity, the
              listopt=format shall be the only or final keyword=value pair in
              a -o option-argument; all characters in the remainder of the
              option-argument shall be considered part of the  format  string.
              When multiple -o listopt=format options are specified, the
              format strings shall be considered a single, concatenated
              string, evaluated in command line order.

       times

              (Applicable  only  to  the -x pax format.) When used in write or
              copy mode, pax shall include atime, ctime,  and  mtime  extended
              header records for each file. See pax Extended Header File
              Times.

       In addition to these keywords, if the -x pax format is  specified,  any
       of the keywords and values defined in pax Extended Header, including
       implementation extensions, can  be  used  in  -o  option-arguments,  in
       either of two modes:

       keyword=value

              When used in write or copy mode, these keyword/value pairs shall
              be included at the beginning of the archive as typeflag g global
              extended  header  records. When used in read or list mode, these
              keyword/value pairs shall act as if they had been at the  begin‐
              ning  of  the  archive  as  typeflag  g  global  extended header
              records.

       keyword:=value

              When used in write or copy mode, these keyword/value pairs shall
              be included as records at the beginning of a typeflag x extended
              header for each file. (This shall be equivalent  to  the  equal-
              sign  form  except that it creates no typeflag g global extended
              header records.) When used in read  or  list  mode,  these  key‐
              word/value  pairs  shall act as if they were included as records
              at the end of each extended header; thus,  they  shall  override
              any  global  or file-specific extended header record keywords of
              the same names. For example, in the command:

                     pax -r -o "
                     gname:=mygroup,
                     " <archive

              the group name will be forced to a new value for all files  read
              from the archive.

       The  precedence  of  -o  keywords over various fields in the archive is
       described in pax Extended Header Keyword Precedence.

       -p  string
              Specify one or more file  characteristic  options  (privileges).
              The  string  option-argument  shall  be a string specifying file
              characteristics to be retained or discarded on  extraction.  The
              string shall consist of the specification characters a, e, m, o,
              and p. Other implementation-defined characters can be included.
              Multiple characteristics can be concatenated within the same
              string and multiple -p options can be specified. The meanings of
              the specification characters are as follows:

       a
              Do not preserve file access times.

       e
              Preserve  the  user  ID,  group ID, file mode bits (see the Base
              Definitions volume of IEEE Std 1003.1-2001, Section 3.168,  File
              Mode Bits), access time, modification time, and any other imple‐
              mentation-defined file characteristics.

       m
              Do not preserve file modification times.

       o
              Preserve the user ID and group ID.

       p
              Preserve the file mode bits. Other  implementation-defined  file
              mode attributes may be preserved.

       In the preceding list, "preserve" indicates that an attribute stored in
       the archive shall be given to the extracted file, subject to  the  per‐
       missions  of the invoking process. The access and modification times of
       the file shall be preserved unless  otherwise  specified  with  the  -p
       option  or  not stored in the archive. All attributes that are not pre‐
       served shall be determined as part of the normal file  creation  action
       (see File Read, Write, and Creation).

       If neither the e nor the o specification character is specified, or the
       user ID and group ID are not preserved for any reason,  pax  shall  not
       set the S_ISUID and S_ISGID bits of the file mode.

       If  the  preservation  of  any of these items fails for any reason, pax
       shall write a diagnostic message to standard error.   Failure  to  pre‐
       serve  these  items  shall  affect the final exit status, but shall not
       cause the extracted file to be deleted.

       If file characteristic letters in any of  the  string  option-arguments
       are  duplicated  or conflict with each other, the ones given last shall
       take precedence. For example, if -p eme is specified, file modification
       times are preserved.
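
       The "last one wins" rule can be traced with a small model that walks a
       -p string from left to right. This is a simplified sketch: the function
       name resolve_p_string is hypothetical, it tracks only four attributes,
       and it starts from the defaults stated above (access and modification
       times preserved, ownership and mode bits not).

```shell
# $1 = the concatenation of all -p option-arguments, in command line order.
# Prints the resulting preservation state.
resolve_p_string() {
    atime=yes mtime=yes owner=no mode=no
    rest=$1
    while [ -n "$rest" ]; do
        c=${rest%"${rest#?}"}    # first character of the remaining string
        rest=${rest#?}
        case $c in
            a) atime=no ;;
            m) mtime=no ;;
            o) owner=yes ;;
            p) mode=yes ;;
            e) atime=yes mtime=yes owner=yes mode=yes ;;
        esac
    done
    printf 'atime=%s mtime=%s owner=%s mode=%s\n' "$atime" "$mtime" "$owner" "$mode"
}

resolve_p_string eme
# atime=yes mtime=yes owner=yes mode=yes
resolve_p_string em
# atime=yes mtime=no owner=yes mode=yes
```

       In -p eme the final e re-enables what the preceding m switched off,
       which is why modification times end up preserved.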

       -s  replstr
              Modify file or archive member names named by pattern or file op‐
              erands according to the substitution expression  replstr,  using
              the  syntax  of  the  ed  utility. The concepts of "address" and
              "line" are meaningless in the context of the  pax  utility,  and
              shall not be supplied. The format shall be:

              -s /old/new/[gp]

       where  as  in ed, old is a basic regular expression and new can contain
       an ampersand, '\n' (where n is a digit) backreferences,  or  subexpres‐
       sion  matching. The old string shall also be permitted to contain <new‐
       line>s.

       Any non-null character can be used as a delimiter ('/' shown here).
       Multiple  -s  expressions  can  be  specified; the expressions shall be
       applied in the order specified, terminating with the  first  successful
       substitution.  The  optional trailing 'g' is as defined in the ed util‐
       ity. The optional trailing 'p' shall cause successful substitutions  to
       be written to standard error. File or archive member names that substi‐
       tute to the empty string shall be ignored when reading and writing  ar‐
       chives.
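
       The "first successful substitution" rule can be previewed without
       touching an archive by modeling it in sed, whose s command uses the
       same basic regular expression syntax. This is an approximation for
       illustration, not the pax implementation: the function name
       rename_first_match and the two expressions are made up, and the sed t
       command stands in for "stop after the first expression that matched".

```shell
# Models: pax ... -s ',^src/,lib/,' -s ',^lib/,vendor/,'
# Reads member names on standard input and prints the renamed names.
rename_first_match() {
    sed -e 's,^src/,lib/,' -e t -e 's,^lib/,vendor/,'
}

printf '%s\n' src/main.c lib/util.c | rename_first_match
# lib/main.c
# vendor/util.c
```

       Without the t command, src/main.c would be rewritten twice and end up
       as vendor/main.c, which is not what pax does with two -s options.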

       -t     When reading files from the file system, and if the user has the
              permissions required by utime() to do so, set the access time of
              each  file read to the access time that it had before being read
              by pax.

       -u     Ignore files that are older (having a less recent file modifica‐
              tion  time)  than a pre-existing file or archive member with the
              same name. In read mode, an archive member with the same name as
              a file in the file system shall be extracted if the archive mem‐
              ber is newer than the file. In write mode, an archive file  mem‐
              ber  with  the  same  name as a file in the file system shall be
              superseded if the file is newer than the archive member.  If  -a
              is  also specified, this is accomplished by appending to the ar‐
              chive; otherwise, it is unspecified whether this is accomplished
              by  actual replacement in the archive or by appending to the ar‐
              chive. In copy mode, the file in the destination hierarchy shall
              be  replaced by the file in the source hierarchy or by a link to
              the file in the source hierarchy if the file in the source hier‐
              archy is newer.

       -v     In  list mode, produce a verbose table of contents (see the STD‐
              OUT section). Otherwise, write archive member pathnames to stan‐
              dard error (see the STDERR section).

       -x  format
              Specify the output archive format. The pax utility shall support
              the following formats:

       cpio
              The cpio interchange format; see the EXTENDED  DESCRIPTION  sec‐
              tion.   The default blocksize for this format for character spe‐
              cial archive files shall be 5120. Implementations shall  support
              all blocksize values less than or equal to 32256 that are multi‐
              ples of 512.

       pax
              The pax interchange format; see the  EXTENDED  DESCRIPTION  sec‐
              tion.   The default blocksize for this format for character spe‐
              cial archive files shall be 5120. Implementations shall  support
              all blocksize values less than or equal to 32256 that are multi‐
              ples of 512.

       ustar
              The tar interchange format; see the  EXTENDED  DESCRIPTION  sec‐
              tion.   The default blocksize for this format for character spe‐
              cial archive files shall be 10240. Implementations shall support
              all blocksize values less than or equal to 32256 that are multi‐
              ples of 512.

       Implementation-defined formats shall specify a default  block  size  as
       well  as  any other block sizes supported for character special archive
       files.
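
       The blocksize rules for the three standard formats can be combined into
       a simple check: a value that is a positive decimal integer, no larger
       than 32256, and a multiple of 512 is one that every implementation is
       required to support. The function name is_portable_blocksize is
       hypothetical, and rejecting other values is a choice of this sketch; an
       implementation may accept more.

```shell
# Succeeds when $1 is a blocksize that every conforming pax must support
# for the cpio, pax and ustar formats. Values with leading zeros are not
# handled specially by this sketch.
is_portable_blocksize() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;    # not an unsigned decimal integer
    esac
    [ "$1" -gt 0 ] && [ "$1" -le 32256 ] && [ "$(( $1 % 512 ))" -eq 0 ]
}

for size in 5120 10240 32256 32768 1000; do
    if is_portable_blocksize "$size"; then
        echo "$size: portable"
    else
        echo "$size: not guaranteed"
    fi
done
# 5120: portable
# 10240: portable
# 32256: portable
# 32768: not guaranteed
# 1000: not guaranteed
```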
574
575       Any attempt to append to an archive file in a format different from the
576       existing archive format shall cause pax to exit immediately with a non-
577       zero exit status.
578
579       In copy mode, if no -x format is specified, pax shall behave as  if  -x
580       pax were specified.
581
582       -X     When  traversing the file hierarchy specified by a pathname, pax
583              shall not descend into directories that have a different  device
584              ID   (   st_dev;   see   the   System   Interfaces   volume   of
585              IEEE Std 1003.1-2001, stat()).
586
587
588       The options that operate on the names of files or archive members (-c,
589       -i,  -n,  -s,  -u, and -v) shall interact as follows. In read mode, the
590       archive members shall be selected based on the  user-specified  pattern
591       operands as modified by the -c, -n, and -u options. Then, any -s and -i
592       options shall modify, in that order, the names of the  selected  files.
593       The -v option shall write names resulting from these modifications.
594
595       In  write mode, the files shall be selected based on the user-specified
596       pathnames as modified by the -n and -u options. Then,  any  -s  and  -i
597       options shall modify, in that order, the names of these selected files.
598       The -v option shall write names resulting from these modifications.
599
600       If both the -u and -n options are specified, pax shall not  consider  a
601       file selected unless it is newer than the file to which it is compared.
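
       The ordering described above (select first, then rename) can be
       sketched as below. This is an illustration under stated assumptions:
       Python's fnmatchcase stands in for the pattern matching notation,
       the renaming steps are plain callables rather than real -s or -i
       processing, and the -n and -u options are not modeled. The function
       names are hypothetical.

```python
from fnmatch import fnmatchcase


def select_members(names, patterns=(), complement=False):
    """Select names by pattern operands; complement=True mimics -c."""
    if not patterns:
        # With no pattern operands, every member is selected.
        return list(names)
    selected = []
    for name in names:
        hit = any(fnmatchcase(name, pattern) for pattern in patterns)
        if hit != complement:
            selected.append(name)
    return selected


def rename_members(names, renamers=()):
    """Apply each renaming step in order and return (old, new) pairs."""
    result = []
    for name in names:
        new_name = name
        for step in renamers:
            new_name = step(new_name)
        result.append((name, new_name))
    return result
```

       Selection always runs on the original names; the renaming steps only
       see names that were already selected.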
602
603   List Mode Format Specifications
604       In list mode with the -o listopt=format option, the format argument
605       shall be applied for each selected file. The pax utility shall append a
606       <newline>  to  the  listopt  output  for each selected file. The format
607       argument shall be used as the format string described in the Base Defi‐
608       nitions  volume  of  IEEE Std 1003.1-2001, Chapter 5, File Format Nota‐
609       tion, with the exceptions  1.   through  5.  defined  in  the  EXTENDED
610       DESCRIPTION section of printf, plus the following exceptions:
611
612       6.     The sequence (keyword) can occur before a format conversion
613              specifier. The conversion argument is defined by  the  value  of
614              keyword.  The  implementation  shall  support the following key‐
615              words:
616
617               * Any of the Field Name  entries  in  ustar  Header  Block  and
618                 Octet-Oriented  cpio  Archive  Entry. The implementation may
619                 support the cpio keywords without the leading c_ in  addition
620                 to the form required by Values for cpio c_mode Field.
621
622               * Any  keyword  defined for the extended header in pax Extended
623                 Header.
624
625               * Any keyword provided as an  implementation-defined  extension
626                 within the extended header defined in pax Extended Header.
627
628       For example, the sequence "%(charset)s" is the string value of the name
629       of the character set in the extended header.
630
631       The result of the keyword conversion argument shall be the  value  from
632       the  applicable  header  field or extended header, without any trailing
633       NULs.
634
635       All keyword values used as conversion  arguments  shall  be  translated
636       from  the UTF-8 encoding to the character set appropriate for the local
637       file system, user database, and so on, as applicable.
638
639       7.     An additional conversion specifier character, T, shall be  used
640              to  specify  time  formats. The T conversion specifier character
641              can be preceded by the sequence (keyword=subformat), where
642              subformat  is  a  date  format  as defined by date operands. The
643              default keyword shall be mtime and the default  subformat  shall
644              be:
645
646
647              %b %e %H:%M %Y
648
649       8.     An  additional conversion specifier character, M, shall be used
650              to specify the file mode string as defined in ls  Standard  Out‐
651              put.   If (keyword) is omitted, the mode keyword shall be used.
652              For example, %.1M writes the single character  corresponding  to
653              the <entry type> field of the ls -l command.
654
655       9.     An  additional conversion specifier character, D, shall be used
656              to specify the device for block or special files, if applicable,
657              in an implementation-defined format. If not applicable, and
658              (keyword) is specified, then this conversion shall be equivalent
659              to %(keyword)u. If not applicable, and (keyword) is omitted,
660              then this conversion shall be equivalent to <space>.
661
662       10.    An additional conversion specifier character, F, shall be  used
663              to  specify  a  pathname. The F conversion character can be pre‐
664              ceded by a sequence of comma-separated keywords:
665
666
667              (keyword[,keyword] ... )
668
669       The values for all the keywords that are non-null shall be concatenated
670       together, each separated by a '/'. The default shall be (path) if the
671       keyword path is defined; otherwise, the default shall be (prefix,
672       name).
673
674       11.    An  additional conversion specifier character, L, shall be used
675              to specify a symbolic link expansion. If the current file  is  a
676              symbolic link, then %L shall expand to:
677
678
679              "%s -> %s", <value of keyword>, <contents of link>
680
681       Otherwise,  the  %L conversion specification shall be the equivalent of
682       %F.
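
       The F and L conversions described in items 10 and 11 can be sketched
       as below. This is a minimal illustration, assuming the header and
       extended header values are already available in a dictionary; the
       names expand_F and expand_L and the dictionary layout are
       hypothetical, and no width or precision handling is attempted.

```python
def expand_F(fields, keywords=None):
    """Join the non-null values of the given keywords with '/'.

    fields maps keyword names to string values. When no keyword list is
    given, (path) is used if path is defined, otherwise (prefix,name).
    """
    if keywords is None:
        keywords = ["path"] if "path" in fields else ["prefix", "name"]
    parts = [fields[k] for k in keywords if fields.get(k)]
    return "/".join(parts)


def expand_L(fields, keywords=None, is_symlink=False, link_contents=""):
    """Expand like %F, adding ' -> <contents>' for symbolic links."""
    name = expand_F(fields, keywords)
    if is_symlink:
        return "%s -> %s" % (name, link_contents)
    return name
```

       For a ustar header with prefix "usr/share" and name "doc.txt",
       expand_F yields "usr/share/doc.txt"; an empty prefix is skipped
       rather than producing a leading '/'.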
683
684

OPERANDS

686       The following operands shall be supported:
687
688       directory
689              The destination directory pathname for copy mode.
690
691       file   A pathname of a file to be copied or archived.
692
693       pattern
694              A pattern matching one or more pathnames of archive  members.  A
695              pattern  must  be  given  in the name-generating notation of the
696              pattern matching notation in Pattern Matching Notation, includ‐
697              ing  the  filename expansion rules in Patterns Used for Filename
698              Expansion. The default, if  no  pattern  is  specified,  is  to
699              select all members in the archive.
700
701

STDIN

703       In  write  mode, the standard input shall be used only if no file oper‐
704       ands are specified. It shall be a text file containing a list of  path‐
705       names, one per line, without leading or trailing <blank>s.
706
707       In  list  and  read  modes,  if -f is not specified, the standard input
708       shall be an archive file.
709
710       Otherwise, the standard input shall not be used.
711

INPUT FILES

713       The input file named by the archive option-argument, or standard  input
714       when  the archive is read from there, shall be a file formatted accord‐
715       ing to one of the specifications in the EXTENDED DESCRIPTION section or
716       some other implementation-defined format.
717
718       The file /dev/tty shall be used to write prompts and read responses.
719

ENVIRONMENT VARIABLES

721       The following environment variables shall affect the execution of pax:
722
723       LANG   Provide  a  default value for the internationalization variables
724              that are unset or null. (See  the  Base  Definitions  volume  of
725              IEEE Std 1003.1-2001,  Section  8.2,  Internationalization Vari‐
726              ables for the precedence of internationalization variables  used
727              to determine the values of locale categories.)
728
729       LC_ALL If  set  to a non-empty string value, override the values of all
730              the other internationalization variables.
731
732       LC_COLLATE
733
734              Determine the locale for the  behavior  of  ranges,  equivalence
735              classes, and multi-character collating elements used in the pat‐
736              tern matching expressions for the  pattern  operand,  the  basic
737              regular  expression  for the -s option, and the extended regular
738              expression defined for the yesexpr locale keyword in the LC_MES‐
739              SAGES category.
740
741       LC_CTYPE
742              Determine  the  locale  for  the  interpretation of sequences of
743              bytes of text data as characters (for  example,  single-byte  as
744              opposed  to multi-byte characters in arguments and input files),
745              the behavior of character classes used in the  extended  regular
746              expression defined for the yesexpr locale keyword in the LC_MES‐
747              SAGES category, and pattern matching.
748
749       LC_MESSAGES
750              Determine the locale for the processing of affirmative responses
751              and the locale that should be used to affect the format and con‐
752              tents of diagnostic messages written to standard error.
753
754       LC_TIME
755              Determine the format and contents of date and time strings  when
756              the -v option is specified.
757
758       NLSPATH
759              Determine the location of message catalogs for the processing of
760              LC_MESSAGES.
761
762       TMPDIR Determine the pathname that provides part of the default  global
763              extended header record file, as described for the -o globexthdr=
764              keyword in the OPTIONS section.
765
766       TZ     Determine the timezone used to calculate date and  time  strings
767              when  the  -v  option  is  specified. If TZ is unset or null, an
768              unspecified default timezone shall be used.
769
770

ASYNCHRONOUS EVENTS

772       Default.
773

STDOUT

775       In write mode, if -f is not specified, the standard output shall be the
776       archive  formatted  according  to  one  of  the  specifications  in the
777       EXTENDED DESCRIPTION section, or some other implementation-defined for‐
778       mat (see -x format).
779
780       In list mode, when the -o listopt=format option has been specified, the
781       selected archive members shall be written to standard output using  the
782       format described under List Mode Format Specifications. In list mode
783       without the -o listopt=format option, the table  of  contents  of  the
784       selected  archive members shall be written to standard output using the
785       following format:
786
787
788              "%s\n", <pathname>
789
790       If the -v option is specified in list mode, the table  of  contents  of
791       the  selected archive members shall be written to standard output using
792       the following formats.
793
794       For pathnames representing hard links to previous members  of  the  ar‐
795       chive:
796
797
798              "%s == %s\n", <ls -l listing>, <linkname>
799
800       For all other pathnames:
801
802
803              "%s\n", <ls -l listing>
804
805       where  <ls  -l listing> shall be the format specified by the ls utility
806       with the -l option. When  writing  pathnames  in  this  format,  it  is
807       unspecified what is written for fields for which the underlying archive
808       format does not have the correct information, although the correct num‐
809       ber of <blank>-separated fields shall be written.
810
811       In list mode, standard output shall not be buffered more than a line at
812       a time.
813

STDERR

815       If -v is specified in read, write, or copy modes, pax shall  write  the
816       pathnames it processes to the standard error output using the following
817       format:
818
819
820              "%s\n", <pathname>
821
822       These pathnames shall be written as soon as processing is begun on  the
823       file  or  archive  member,  and shall be flushed to standard error. The
824       trailing <newline>, which shall not be buffered, is  written  when  the
825       file has been read or written.
826
827       If  the -s option is specified, and the replacement string has a trail‐
828       ing 'p', substitutions shall be written to standard error in the  fol‐
829       lowing format:
830
831
832              "%s >> %s\n", <original pathname>, <new pathname>
833
834       In  all operating modes of pax, optional messages of unspecified format
835       concerning the input archive format and volume number,  the  number  of
836       files,  blocks,  volumes,  and  media parts as well as other diagnostic
837       messages may be written to standard error.
838
839       In all formats, for both standard output  and  standard  error,  it  is
840       unspecified how non-printable characters in pathnames or link names are
841       written.
842
843       When pax is in read mode or list mode, using the -x pax archive format,
844       and  a  filename,  link  name,  owner  name,  or  any other field in an
845       extended header record cannot be translated from the pax UTF-8  codeset
846       format  to  the  codeset  and current locale of the implementation, pax
847       shall write a diagnostic message to standard error, shall  process  the
848       file  as  described  for the -o invalid= option, and then shall process
849       the next file in the archive.
850

OUTPUT FILES

852       In read mode, the extracted output files shall be of the archived  file
853       type.  In  copy  mode, the copied output files shall be the type of the
854       file being copied. In either mode, existing files  in  the  destination
855       hierarchy  shall be overwritten only when all permission (-p), modifi‐
856       cation time (-u), and invalid-value (-o invalid=) tests allow it.
857
858       In write mode, the output file named by the -f option-argument shall be
859       a file formatted according to one of the specifications in the EXTENDED
860       DESCRIPTION section, or some other implementation-defined format.
861

EXTENDED DESCRIPTION

863   pax Interchange Format
864       A pax archive tape or file produced in the -x pax format shall  contain
865       a series of blocks. The physical layout of the archive shall be identi‐
866       cal to the ustar format described in ustar Interchange Format. Each
867       file archived shall be represented by the following sequence:
868
869        * An  optional  header block with extended header records. This header
870          block is of the form described in pax Header Block, with a typeflag
871          value of x or g. The extended header records, described in pax
872          Extended Header, shall be included as the data for this header
873          block.
874
875        * A  header block that describes the file. Any fields in the preceding
876          optional extended header shall override  the  associated  fields  in
877          this header block for this file.
878
879        * Zero or more blocks that contain the contents of the file.
880
881       At  the  end  of  the  archive  file there shall be two 512-byte blocks
882       filled with binary zeros, interpreted as an end-of-archive indicator.
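
       The layout above can be observed with any tool that writes the pax
       format. The sketch below uses Python's standard tarfile module as a
       stand-in for pax -w -x pax; that substitution, the file name, and
       the helper names build_example_archive and walk_typeflags are
       assumptions made for illustration. The non-ASCII file name is chosen
       so that the writer emits a typeflag x header carrying a path record.

```python
import io
import tarfile

BLOCK = 512


def build_example_archive() -> bytes:
    """Write one small file whose name needs an extended header record."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tar:
        data = b"hello\n"
        info = tarfile.TarInfo(name="caf\u00e9.txt")
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def walk_typeflags(archive: bytes) -> list:
    """Return the typeflag of every header block before the end marker."""
    flags = []
    offset = 0
    while offset + BLOCK <= len(archive):
        header = archive[offset:offset + BLOCK]
        if header == bytes(BLOCK):
            break  # first all-zero block of the end-of-archive indicator
        flags.append(header[156:157].decode("ascii"))
        # The size field is an octal number at offset 124, 12 octets wide.
        size_text = header[124:136].rstrip(b"\0 ").decode("ascii") or "0"
        data_blocks = -(-int(size_text, 8) // BLOCK)  # round up to blocks
        offset += BLOCK * (1 + data_blocks)
    return flags
```

       Walking the example archive shows an x header followed by the
       regular file header (typeflag 0), and the archive ends with at least
       two 512-byte blocks of binary zeros.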
883
884       A schematic of an example archive with global extended  header  records
885       and two actual files is shown in pax Format Archive Example. In the
886       example, the second file in the archive has no extended header  preced‐
887       ing it, presumably because it has no need for extended attributes.
888
889
890
891                         Figure: pax Format Archive Example
892
893   pax Header Block
894       The  pax  header  block  shall  be  identical to the ustar header block
895       described in ustar Interchange Format, except that two additional
896       typeflag values are defined:
897
898       x      Represents extended header records for the following file in the
899              archive (which shall have its own ustar header block).  The for‐
900              mat  of  these  extended header records shall be as described in
901              pax Extended Header.
902
903       g      Represents global extended  header  records  for  the  following
904              files  in  the  archive.  The  format  of  these extended header
905              records shall be as described in pax Extended Header. Each
906              value  shall  affect  all  subsequent files that do not override
907              that value in their own extended header record and until another
908              global  extended  header record is reached that provides another
909              value for the same field. The typeflag g global  headers  should
910              not  be  used  with  interchange media that could suffer partial
911              data loss in transporting the archive.
912
913
914       For both of these types, the size  field  shall  be  the  size  of  the
915       extended header records in octets. The other fields in the header block
916       are not meaningful to this version of the pax utility. However, if this
917       archive  is  read  by  a pax utility conforming to the ISO POSIX-2:1993
918       standard, the header block fields are used to  create  a  regular  file
919       that  contains  the  extended header records as data. Therefore, header
920       block field values should be selected to provide reasonable file access
921       to this regular file.
922
923       A  further  difference  from the ustar header block is that data blocks
924       for files of typeflag 1 (the digit one) (hard link)  may  be  included,
925       which means that the size field may be greater than zero. Archives cre‐
926       ated by pax -o linkdata shall include these data blocks with  the  hard
927       links.
928
929   pax Extended Header
930       A  pax  extended  header contains values that are inappropriate for the
931       ustar header block  because  of  limitations  in  that  format:  fields
932       requiring  a  character  encoding  other  than  that  described  in the
933       ISO/IEC 646:1991 standard,  fields  representing  file  attributes  not
934       described in the ustar header, and fields whose format or length do not
935       fit the requirements of the ustar header. The  values  in  an  extended
936       header add attributes to the following file (or files; see the descrip‐
937       tion of the typeflag g header block) or override values in the  follow‐
938       ing header block(s), as indicated in the following list of keywords.
939
940       An  extended  header  shall  consist  of one or more records, each con‐
941       structed as follows:
942
943
944              "%d %s=%s\n", <length>, <keyword>, <value>
945
946       The  extended  header  records  shall  be  encoded  according  to   the
947       ISO/IEC 10646-1:2000  standard  (UTF-8).  The  <length> field, <blank>,
948       equals sign, and <newline> shown shall be limited to the portable char‐
949       acter set, as encoded in UTF-8. The <keyword> and <value> fields can be
950       any UTF-8 characters. The <length> field shall be the decimal length of
951       the extended header record in octets, including the trailing <newline>.
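
       Because the <length> field counts its own digits, building a record
       takes a small fixed-point computation. The sketch below shows one
       way to encode and decode records of the form given above; the
       function names make_record and parse_records are hypothetical, and
       binary (non-UTF-8) values are not handled.

```python
def make_record(keyword: str, value: str) -> bytes:
    """Encode one "%d %s=%s\\n" record with a self-inclusive length."""
    if "=" in keyword:
        raise ValueError("a keyword shall not include an equals sign")
    body = b" " + keyword.encode("utf-8") + b"=" + value.encode("utf-8") + b"\n"
    # Start by assuming a one-digit length, then grow until consistent.
    length = len(body) + 1
    while len(str(length)) + len(body) != length:
        length = len(str(length)) + len(body)
    return str(length).encode("ascii") + body


def parse_records(data: bytes) -> list:
    """Decode concatenated records into (keyword, value) pairs."""
    records = []
    pos = 0
    while pos < len(data):
        space = data.index(b" ", pos)
        length = int(data[pos:space].decode("ascii"))
        record = data[pos:pos + length]
        if len(record) != length or not record.endswith(b"\n"):
            raise ValueError("malformed extended header record")
        # Split on the first '=' only; the value may contain '=' itself.
        keyword, _, value = record[space - pos + 1:-1].partition(b"=")
        records.append((keyword.decode("utf-8"), value.decode("utf-8")))
        pos += length
    return records
```

       For example, the record for path=foo is 12 octets long: two for the
       digits "12" and ten for " path=foo" plus the trailing <newline>.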
952
953       The <keyword> field shall be one of the entries from the following list
954       or a keyword provided as an implementation extension. Keywords consist‐
955       ing entirely of lowercase letters, digits, and periods are reserved for
956       future standardization. A keyword shall not include an equals sign. (In
957       the  following  list,  the notations "file(s)" or "block(s)" is used to
958       acknowledge that a keyword affects the following single  file  after  a
959       typeflag  x extended header, but possibly multiple files after typeflag
960       g. Any requirements in the list for pax to include  a  record  when  in
961       write  or copy mode shall apply only when such a record has not already
962       been provided through the use of the -o option. When used in copy mode,
963       pax  shall  behave  as  if  an archive had been created with applicable
964       extended header records and then extracted.)
965
966       atime  The file access time for the following  file(s),  equivalent  to
967              the  value  of  the  st_atime member of the stat structure for a
968              file, as described by the stat() function. The access time shall
969              be  restored  if  the  process  has  the  appropriate  privilege
970              required to do so.  The  format  of  the  <value>  shall  be  as
971              described in pax Extended Header File Times.
972
973       charset
974              The  name  of  the  character set used to encode the data in the
975              following file(s).  The  entries  in  the  following  table  are
976              defined  to  refer  to  known standards; additional names may be
977              agreed on between the originator and recipient.
978
979                   <value>                  Formal Standard
980                   ISO-IR 646 1990          ISO/IEC 646:1990
981                   ISO-IR 8859 1 1998       ISO/IEC 8859-1:1998
982                   ISO-IR 8859 2 1999       ISO/IEC 8859-2:1999
983                   ISO-IR 8859 3 1999       ISO/IEC 8859-3:1999
984                   ISO-IR 8859 4 1998       ISO/IEC 8859-4:1998
985                   ISO-IR 8859 5 1999       ISO/IEC 8859-5:1999
986                   ISO-IR 8859 6 1999       ISO/IEC 8859-6:1999
987                   ISO-IR 8859 7 1987       ISO/IEC 8859-7:1987
988                   ISO-IR 8859 8 1999       ISO/IEC 8859-8:1999
989                   ISO-IR 8859 9 1999       ISO/IEC 8859-9:1999
991                   ISO-IR 8859 10 1998      ISO/IEC 8859-10:1998
992                   ISO-IR 8859 13 1998      ISO/IEC 8859-13:1998
993                   ISO-IR 8859 14 1998      ISO/IEC 8859-14:1998
994                   ISO-IR 8859 15 1999      ISO/IEC 8859-15:1999
995                   ISO-IR 10646 2000        ISO/IEC 10646:2000
996                   ISO-IR 10646 2000 UTF-8  ISO/IEC 10646, UTF-8 encoding
997                   BINARY                   None.
998
999       The encoding is included in an extended header  for  information  only;
1000       when  pax  is  used  as described in IEEE Std 1003.1-2001, it shall not
1001       translate the file data into any other encoding. The BINARY entry indi‐
1002       cates unencoded binary data.
1003
1004       When  used  in write or copy mode, it is implementation-defined whether
1005       pax includes a charset extended header record for a file.
1006
1007       comment
1008              A series of characters used as a comment. All characters in  the
1009              <value> field shall be ignored by pax.
1010
1011       ctime  The  file creation time for the following file(s), equivalent to
1012              the value of the st_ctime member of the  stat  structure  for  a
1013              file,  as  described  by  the stat() function. The creation time
1014              shall be restored if the process has the  appropriate  privilege
1015              required  to  do  so.  The  format  of  the  <value> shall be as
1016              described in pax Extended Header File Times.
1017
1018       gid    The group ID of the group that owns the  file,  expressed  as  a
1019              decimal  number using digits from the ISO/IEC 646:1991 standard.
1020              This record shall override the gid field in the following header
1021              block(s).  When  used in write or copy mode, pax shall include a
1022              gid extended header record for  each  file  whose  group  ID  is
1023              greater than 2097151 (octal 7777777).
1024
1025       gname  The group of the file(s), formatted as a group name in the group
1026              database.  This record shall override the gid and  gname  fields
1027              in  the  following  header block(s), and any gid extended header
1028              record. When used in read, copy, or list mode, pax shall  trans‐
1029              late  the  name  from the UTF-8 encoding in the header record to
1030              the character set appropriate for  the  group  database  on  the
1031              receiving  system.  If  any  of  the  UTF-8 characters cannot be
1032              translated, and if the -o invalid=UTF-8 option is not speci‐
1033              fied, the results are implementation-defined. When used in write
1034              or copy mode, pax shall include a gname extended  header  record
1035              for  each  file  whose group name cannot be represented entirely
1036              with the letters and digits of the portable character set.
1037
1038       linkpath
1039              The pathname of a link being created to  another  file,  of  any
1040              type,  previously  archived.  This  record  shall  override  the
1041              linkname field in the following ustar header block(s).  The fol‐
1042              lowing  ustar header block shall determine the type of link cre‐
1043              ated. If typeflag of the following header block is 1,  it  shall
1044              be  a  hard  link. If typeflag is 2, it shall be a symbolic link
1045              and the linkpath value shall be the  contents  of  the  symbolic
1046              link. The pax utility shall translate the name of the link (con‐
1047              tents of the symbolic link) from the UTF-8 encoding to the char‐
1048              acter  set  appropriate  for the local file system. When used in
1049              write or copy mode, pax shall include a linkpath extended header
1050              record  for  each  link  whose  pathname  cannot  be represented
1051              entirely with the members of the portable  character  set  other
1052              than NUL.
1053
1054       mtime  The  file modification time of the following file(s), equivalent
1055              to the value of the st_mtime member of the stat structure for  a
1056              file,  as  described  in  the stat() function. This record shall
1057              override the mtime field in the following header  block(s).  The
1058              modification  time  shall  be  restored  if  the process has the
1059              appropriate privilege required to  do  so.  The  format  of  the
1060              <value> shall be as described in pax Extended Header File Times.
1062
1063       path   The pathname of the following file(s). This record  shall  over‐
1064              ride  the  name  and  prefix  fields  in  the  following  header
1065              block(s). The pax utility shall translate the  pathname  of  the
1066              file  from  the  UTF-8 encoding to the character set appropriate
1067              for the local file system.
1068
1069       When used in write or copy mode, pax  shall  include  a  path  extended
1070       header  record  for  each  file  whose  pathname  cannot be represented
1071       entirely with the members of the portable character set other than NUL.
1072
1073       realtime.any
1074              The keywords prefixed by "realtime."  are  reserved  for  future
1075              standardization.
1076
1077       security.any
1078              The  keywords  prefixed  by  "security." are reserved for future
1079              standardization.
1080
1081       size   The size of the file in octets, expressed as  a  decimal  number
1082              using  digits  from  the  ISO/IEC 646:1991 standard. This record
1083              shall override the size field in the following header  block(s).
1084              When  used  in  write  or  copy  mode,  pax shall include a size
1085              extended header record for each file with a size  value  greater
1086              than 8589934591 (octal 77777777777).
1087
1088       uid    The  user  ID  of  the file owner, expressed as a decimal number
1089              using digits from the  ISO/IEC 646:1991  standard.  This  record
1090              shall  override  the uid field in the following header block(s).
1091              When used in write  or  copy  mode,  pax  shall  include  a  uid
1092              extended  header  record for each file whose owner ID is greater
1093              than 2097151 (octal 7777777).
1094
1095       uname  The owner of the following file(s), formatted as a user name  in
1096              the  user database. This record shall override the uid and uname
1097              fields in the following header block(s), and  any  uid  extended
1098              header  record. When used in read, copy, or list mode, pax shall
1099              translate the name from the UTF-8 encoding in the header  record
1100              to  the  character  set appropriate for the user database on the
1101              receiving system. If any  of  the  UTF-8  characters  cannot  be
1102              translated, and if the -o invalid=UTF-8 option is not speci‐
1103              fied, the results are implementation-defined. When used in write
1104              or  copy  mode, pax shall include a uname extended header record
1105              for each file whose user name  cannot  be  represented  entirely
1106              with the letters and digits of the portable character set.
1107
1108
1109       If  the  <value> field is zero length, it shall delete any header block
1110       field, previously entered extended header  value,  or  global  extended
1111       header value of the same name.
1112
1113       If  a keyword in an extended header record (or in a -o option-argument)
1114       overrides or deletes a corresponding field in the ustar  header  block,
1115       pax shall ignore the contents of that header block field.
1116
1117       Unlike  the ustar header block fields, NULs shall not delimit <value>s;
1118       all characters within the <value> field shall be  considered  data  for
1119       the  field.  None  of  the length limitations of the ustar header block
1120       fields in ustar  Header  Block  shall  apply  to  the  extended  header
1121       records.
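
       The write-mode requirements stated for the path, uid, gid, and size
       keywords can be collected into one check, sketched below. This is an
       illustration only: the function name required_records is
       hypothetical, the portable character set test is approximated by an
       ASCII test (an assumption, since the portable character set is
       narrower than ASCII), and the other keywords are not modeled.

```python
USTAR_MAX_ID = 0o7777777        # 2097151, largest uid or gid in a ustar field
USTAR_MAX_SIZE = 0o77777777777  # 8589934591, largest size in a ustar field


def required_records(path: str, uid: int, gid: int, size: int) -> dict:
    """Return the extended header records that write mode must include."""
    records = {}
    # Approximation: treat "portable character set other than NUL" as ASCII.
    if not path.isascii():
        records["path"] = path
    if uid > USTAR_MAX_ID:
        records["uid"] = str(uid)
    if gid > USTAR_MAX_ID:
        records["gid"] = str(gid)
    if size > USTAR_MAX_SIZE:
        records["size"] = str(size)
    return records
```

       A file with ordinary attributes yields an empty dictionary, so no
       extended header is needed for it under these rules.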
1122
1123   pax Extended Header Keyword Precedence
1124       This  section  describes  the  precedence  in  which the various header
1125       records and fields and command line options are selected to apply to  a
1126       file  in  the archive. When pax is used in read or list modes, it shall
1127       determine a file attribute in the following sequence:
1128
1129        1. If -o delete=keyword-prefix is used, the affected attributes shall
1130           be determined from step 7, if applicable, or ignored otherwise.
1131
1132        2. If -o keyword:= is used, the affected attributes shall be ignored.
1133
1134        3. If -o keyword:=value is used, the affected attribute shall be
1135           assigned the value.
1136
1137        4. If there is a typeflag  x  extended  header  record,  the  affected
1138           attribute  shall  be  assigned  the  <value>.  When extended header
1139           records conflict, the last one  given  in  the  header  shall  take
1140           precedence.
1141
1142        5. If -o keyword=value is used, the affected attribute shall be
1143           assigned the value.
1144
1145        6. If there is  a  typeflag  g  global  extended  header  record,  the
1146           affected  attribute  shall  be  assigned  the  <value>. When global
1147           extended header records conflict, the last one given in the  global
1148           header shall take precedence.
1149
1150        7. Otherwise,  the attribute shall be determined from the ustar header
1151           block.
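
       The seven steps above amount to a lookup in priority order, sketched
       below. This is a minimal illustration under stated assumptions: the
       function name and parameter names are hypothetical, each source of
       values is a dictionary in which later conflicting records have
       already replaced earlier ones, delete patterns are matched with
       fnmatchcase, and None stands for an ignored attribute.

```python
from fnmatch import fnmatchcase


def resolve_attribute(keyword, ustar, xhdr=None, ghdr=None,
                      o_delete=(), o_override=None, o_default=None):
    """Return the value pax would use for keyword in read or list mode.

    ustar      fields of the ustar header block
    xhdr       records from the typeflag x extended header
    ghdr       records from the typeflag g global extended header
    o_delete   patterns given with -o delete=
    o_override values given with -o keyword:=value ("" for -o keyword:=)
    o_default  values given with -o keyword=value
    """
    xhdr = xhdr or {}
    ghdr = ghdr or {}
    o_override = o_override or {}
    o_default = o_default or {}

    # Step 1: deleted keywords fall back to the ustar header, if any.
    if any(fnmatchcase(keyword, pattern) for pattern in o_delete):
        return ustar.get(keyword)
    # Steps 2 and 3: -o keyword:= ignores, -o keyword:=value assigns.
    if keyword in o_override:
        return o_override[keyword] or None
    # Step 4: per-file extended header record.
    if keyword in xhdr:
        return xhdr[keyword]
    # Step 5: -o keyword=value.
    if keyword in o_default:
        return o_default[keyword]
    # Step 6: global extended header record.
    if keyword in ghdr:
        return ghdr[keyword]
    # Step 7: the ustar header block.
    return ustar.get(keyword)
```

       Note that a per-file record (step 4) outranks -o keyword=value
       (step 5), while -o keyword:=value (step 3) outranks both.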
1152
1153   pax Extended Header File Times
1154       The pax utility shall write an mtime record for each file in  write  or
1155       copy  modes  if  the  file's  modification  time  cannot be represented
1156       exactly in the ustar header logical record described in ustar Inter‐
1157       change Format. This can occur if the time is out of ustar range, or if
1158       the file system of the underlying implementation  supports  non-integer
1159       time  granularities  and  the time is not an integer. All of these time
1160       records shall be formatted as a decimal representation of the  time  in
1161       seconds since the Epoch. If a period ( '.' ) decimal point character is
1162       present, the digits to the right of the point shall represent the units
1163       of a subsecond timing granularity, where the first digit is tenths of a
1164       second and each subsequent digit is a tenth of the previous  digit.  In
1165       read or copy mode, the pax utility shall truncate the time of a file to
1166       the greatest value that is not greater than the input header file time.
1167       In  write  or copy mode, the pax utility shall output a time exactly if
1168       it can be represented exactly as a decimal number, and otherwise  shall
1169       generate only enough digits so that the same time shall be recovered if
1170       the file is extracted on a system whose underlying implementation  sup‐
1171       ports the same time granularity.
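
       The decimal time format can be illustrated with a short sketch. The
       function names are hypothetical, times are assumed to be non-negative,
       and the local clock is assumed to be held as whole seconds plus
       nanoseconds; none of this is mandated by the text above.

```python
def format_pax_time(seconds, nanoseconds=0):
    """Format a time as decimal seconds since the Epoch.

    Trailing zeros of the fraction are dropped, so only as many digits are
    written as are needed to recover the same value.
    """
    if seconds < 0 or not 0 <= nanoseconds < 1_000_000_000:
        raise ValueError("sketch handles only non-negative times")
    if nanoseconds == 0:
        return str(seconds)
    fraction = f"{nanoseconds:09d}".rstrip("0")
    return f"{seconds}.{fraction}"


def parse_pax_time(text, resolution_digits=9):
    """Parse a decimal time, truncating to `resolution_digits` fractional digits.

    Returns (seconds, nanoseconds). Dropping the extra digits of a
    non-negative time yields the greatest representable value that is not
    greater than the time in the header.
    """
    if text.startswith("-"):
        raise ValueError("sketch handles only non-negative times")
    whole, _, fraction = text.partition(".")
    kept = fraction[:resolution_digits].ljust(9, "0")[:9]
    return int(whole), int(kept)
```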
1172
1173   ustar Interchange Format
1174       A ustar archive tape or file shall contain a series of logical records.
1175       Each logical record shall be a fixed-size logical record of 512  octets
1176       (see  below). Although this format may be thought of as being stored on
1177       9-track industry-standard 12.7 mm (0.5 in) magnetic tape,  other  types
1178       of  transportable  media are not excluded.  Each file archived shall be
1179       represented by a header logical record that describes  the  file,  fol‐
1180       lowed  by  zero  or  more logical records that give the contents of the
1181       file. At the end of the archive file there shall be two 512-octet logi‐
1182       cal  records filled with binary zeros, interpreted as an end-of-archive
1183       indicator.
1184
1185       The logical records may be grouped  for  physical  I/O  operations,  as
1186       described  under  the  -b blocksize and -x ustar options. Each group of
1187       logical records may be written with a single  operation  equivalent  to
1188       the write() function.  On magnetic tape, the result of this write shall
1189       be a single tape physical block. The last physical block  shall  always
1190       be the full size, so logical records after the two zero logical records
1191       may contain undefined data.
1192
1193       The header logical record shall be structured as shown in the following
1194       table. All lengths and offsets are in decimal.
1195
1196                              Table: ustar Header Block
1197
1198                   Field Name   Octet Offset   Length (in Octets)
1199                   name         0              100
1200                   mode         100            8
1201                   uid          108            8
1202                   gid          116            8
1203                   size         124            12
1204                   mtime        136            12
1205                   chksum       148            8
1206                   typeflag     156            1
1207                   linkname     157            100
1208                   magic        257            6
1209                   version      263            2
1210                   uname        265            32
1211                   gname        297            32
1212                   devmajor     329            8
1213                   devminor     337            8
1214                   prefix       345            155
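
       As an illustration of the layout in the table, the following sketch
       splits one 512-octet header logical record into its raw fields. The
       names USTAR_FIELDS, HEADER_STRUCT, and split_ustar_header are invented
       for the example; the offsets follow from the field lengths listed
       above, which total 500 octets.

```python
import struct

# (field name, length in octets), in the order given by the table.
USTAR_FIELDS = [
    ("name", 100), ("mode", 8), ("uid", 8), ("gid", 8),
    ("size", 12), ("mtime", 12), ("chksum", 8), ("typeflag", 1),
    ("linkname", 100), ("magic", 6), ("version", 2),
    ("uname", 32), ("gname", 32), ("devmajor", 8), ("devminor", 8),
    ("prefix", 155),
]

# The fields are contiguous, so one struct format describes the header.
HEADER_STRUCT = struct.Struct("".join(f"{length}s" for _, length in USTAR_FIELDS))


def split_ustar_header(block):
    """Return a dict mapping each field name to its raw bytes."""
    if len(block) != 512:
        raise ValueError("a header logical record is exactly 512 octets")
    values = HEADER_STRUCT.unpack_from(block)
    return {name: value for (name, _), value in zip(USTAR_FIELDS, values)}
```

       The fields are returned undecoded; interpreting them (NUL termination,
       octal numbers) is described in the paragraphs that follow.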
1215
1216       All characters in the header logical record shall be represented in the
1217       coded character set  of  the  ISO/IEC 646:1991  standard.  For  maximum
1218       portability  between  implementations,  names  should  be selected from
1219       characters represented by the portable filename character set as octets
1220       with  the most significant bit zero.  If an implementation supports the
1221       use of characters outside of slash and the portable filename  character
1222       set  in names for files, users, and groups, one or more implementation-
1223       defined encodings of these characters shall be provided for interchange
1224       purposes.
1225
1226       However, the pax utility shall never create filenames on the local sys‐
1227       tem  that  cannot  be  accessed  via  the   procedures   described   in
1228       IEEE Std 1003.1-2001.  If  a filename is found on the medium that would
1229       create an invalid filename, it is  implementation-defined  whether  the
1230       data  from the file is stored on the file hierarchy and under what name
1231       it is stored. The pax utility may choose to ignore these files as  long
1232       as it produces an error indicating that the file is being ignored.
1233
1234       Each  field  within  the  header logical record is contiguous; that is,
1235       there is no padding used. Each character on the archive medium shall be
1236       stored contiguously.
1237
       The fields magic, uname, and gname are character strings each terminated
       by a NUL character. The fields name, linkname, and prefix are
       NUL-terminated character strings except when every character in the
       array, including the last, is non-NUL. The version field is two octets
       containing the characters "00" (zero-zero).
1243       The typeflag contains a single character.  All other fields are leading
1244       zero-filled  octal numbers using digits from the ISO/IEC 646:1991 stan‐
1245       dard IRV. Each numeric field is terminated by one or  more  <space>  or
1246       NUL characters.
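
       A minimal sketch of reading and writing such a numeric field follows.
       The helper names are hypothetical, and writing a single NUL as the
       terminator is one permitted choice among several.

```python
def parse_octal_field(raw):
    """Decode a leading zero-filled octal field ended by <space> or NUL."""
    end = len(raw)
    for index, octet in enumerate(raw):
        if octet in (0x20, 0x00):  # <space> or NUL terminates the number
            end = index
            break
    digits = raw[:end].decode("ascii")
    return int(digits, 8) if digits else 0


def format_octal_field(value, width):
    """Encode `value` as zero-filled octal digits followed by one NUL."""
    digits = f"{value:0{width - 1}o}"
    if value < 0 or len(digits) != width - 1:
        raise ValueError("value does not fit in the field")
    return digits.encode("ascii") + b"\0"
```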
1247
1248       The  name and the prefix fields shall produce the pathname of the file.
1249       A new pathname shall be formed, if prefix is not an empty  string  (its
1250       first  character  is not NUL), by concatenating prefix (up to the first
1251       NUL character), a slash character, and name; otherwise,  name  is  used
1252       alone.  In  either case, name is terminated at the first NUL character.
1253       If prefix begins with a NUL character, it shall  be  ignored.  In  this
1254       manner,  pathnames  of  at  most  256 characters can be supported. If a
1255       pathname does not fit in the space provided, pax shall notify the  user
       of the error, and shall not store any part of the file, header or data,
       on the medium.
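
       The pathname rule can be sketched in both directions. The function
       names are hypothetical, and the choice of which slash to split at when
       writing is an assumption of this example; the text above only requires
       that each part fit in its field.

```python
def join_ustar_path(name_field, prefix_field):
    """Form the pathname from the raw name and prefix fields."""
    name = name_field.split(b"\0", 1)[0]      # name ends at the first NUL
    prefix = prefix_field.split(b"\0", 1)[0]  # empty if it begins with NUL
    return prefix + b"/" + name if prefix else name


def split_ustar_path(path):
    """Return (name, prefix) for writing, or raise if the path cannot fit."""
    if len(path) <= 100:
        return path, b""
    for index, octet in enumerate(path):
        tail = len(path) - index - 1
        # Split at a slash that leaves at most 155 octets of prefix and
        # at most 100 octets of name; the slash itself is not stored.
        if octet == 0x2F and 0 < index <= 155 and 0 < tail <= 100:
            return path[index + 1:], path[:index]
    raise ValueError("pathname does not fit in the name and prefix fields")
```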
1258
1259       The linkname field, described below, shall not use the prefix  to  pro‐
1260       duce  a  pathname. As such, a linkname is limited to 100 characters. If
1261       the name does not fit in the space provided, pax shall notify the  user
1262       of the error, and shall not attempt to store the link on the medium.
1263
1264       The  mode  field provides 12 bits encoded in the ISO/IEC 646:1991 stan‐
1265       dard octal digit representation. The encoded bits shall  represent  the
1266       following values:
1267
1268                               Table: ustar mode Field
1269
1270       Bit Value IEEE Std 1003.1-2001 Bit Description
1271       04000     S_ISUID                  Set UID on execution.
1272       02000     S_ISGID                  Set GID on execution.
1273       01000     <reserved>               Reserved for future standardization.
1274       00400     S_IRUSR                  Read permission for file owner class.
1275       00200     S_IWUSR                  Write permission for file owner
1276                                          class.
1277       00100     S_IXUSR                  Execute/search permission for file
1278                                          owner class.
1279       00040     S_IRGRP                  Read permission for file group class.
1280       00020     S_IWGRP                  Write permission for file group
1281                                          class.
1282       00010     S_IXGRP                  Execute/search permission for file
1283                                          group class.
1284       00004     S_IROTH                  Read permission for file other class.
1285       00002     S_IWOTH                  Write permission for file other
1286                                          class.
1287       00001     S_IXOTH                  Execute/search permission for file
1288                                          other class.
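
       For illustration, the following sketch decodes a mode field into a
       permission string and a list of set-ID flags, using the bit values from
       the table. The function name and the returned representation are
       choices made for this example.

```python
# Permission bits from the table, in owner, group, other order.
PERMISSION_BITS = [
    (0o400, "r"), (0o200, "w"), (0o100, "x"),
    (0o040, "r"), (0o020, "w"), (0o010, "x"),
    (0o004, "r"), (0o002, "w"), (0o001, "x"),
]
SET_ID_BITS = [(0o4000, "S_ISUID"), (0o2000, "S_ISGID")]


def describe_mode(mode_digits):
    """Return (permission string, set-ID flag names) for octal mode digits."""
    mode = int(mode_digits, 8) & 0o7777  # only 12 bits are defined
    permissions = "".join(
        letter if mode & bit else "-" for bit, letter in PERMISSION_BITS
    )
    flags = [name for bit, name in SET_ID_BITS if mode & bit]
    return permissions, flags
```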
1289
1290       When  appropriate  privilege is required to set one of these mode bits,
1291       and the user restoring the files from the archive  does  not  have  the
1292       appropriate  privilege,  the mode bits for which the user does not have
1293       appropriate privilege shall be ignored. Some of the mode  bits  in  the
1294       archive   format   are  not  mentioned  elsewhere  in  this  volume  of
1295       IEEE Std 1003.1-2001. If the  implementation  does  not  support  those
1296       bits, they may be ignored.
1297
1298       The uid and gid fields are the user and group ID of the owner and group
1299       of the file, respectively.
1300
1301       The size field is the size of the file in octets. If the typeflag field
1302       is  set  to  specify  a  file to be of type 1 (a link) or 2 (a symbolic
1303       link), the size field shall be specified as zero. If the typeflag field
1304       is set to specify a file of type 5 (directory), the size field shall be
1305       interpreted as described under the definition of that record  type.  No
1306       data  logical  records are stored for types 1, 2, or 5. If the typeflag
1307       field is set to 3 (character special file), 4 (block special file),  or
1308       6  (FIFO),  the meaning of the size field is unspecified by this volume
1309       of IEEE Std 1003.1-2001, and no data logical records shall be stored on
1310       the  medium.  Additionally, for type 6, the size field shall be ignored
       when reading. If the typeflag field is set to any other value, the number
       of logical records written following the header shall be
       (size+511)/512, ignoring any fraction in the result of the division.
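
       The record count can be computed with integer division, as in this
       sketch. The function name is hypothetical, and the typeflag is assumed
       to be passed as a one-character string.

```python
# Types for which no data logical records are stored on the medium.
NO_DATA_TYPES = {"1", "2", "3", "4", "5", "6"}


def data_record_count(size, typeflag):
    """Number of 512-octet logical records that follow the header."""
    if typeflag in NO_DATA_TYPES:
        return 0
    return (size + 511) // 512  # integer division discards the fraction
```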
1314
1315       The mtime field shall be the modification time of the file at the  time
1316       it  was archived. It is the ISO/IEC 646:1991 standard representation of
1317       the octal value of the modification time obtained from the stat() func‐
1318       tion.
1319
1320       The chksum field shall be the ISO/IEC 646:1991 standard IRV representa‐
1321       tion of the octal value of the simple sum of all octets in  the  header
1322       logical  record.  Each  octet  in  the  header  shall  be treated as an
1323       unsigned value. These values shall be added  to  an  unsigned  integer,
1324       initialized  to  zero, the precision of which is not less than 17 bits.
1325       When calculating the checksum, the chksum field is  treated  as  if  it
1326       were all spaces.
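
       A sketch of the checksum calculation follows; the function names are
       invented for the example. The 17-bit minimum is consistent with the
       largest possible sum, 512 * 255 = 130560, which is less than 2^17.

```python
def ustar_checksum(block):
    """Sum all 512 header octets as unsigned values, chksum taken as spaces."""
    if len(block) != 512:
        raise ValueError("a header logical record is exactly 512 octets")
    header = bytearray(block)
    header[148:156] = b" " * 8  # chksum occupies offsets 148 through 155
    return sum(header)


def checksum_matches(block):
    """Compare the stored chksum field with the recomputed value."""
    stored = bytes(block[148:156]).strip(b" \0")
    try:
        return int(stored.decode("ascii"), 8) == ustar_checksum(block)
    except ValueError:
        return False  # field is empty or not octal
```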
1327
1328       The typeflag field specifies the type of file archived. If a particular
1329       implementation does not recognize the type, or the user does  not  have
1330       appropriate  privilege to create that type, the file shall be extracted
1331       as if it were a regular file if the file type  is  defined  to  have  a
1332       meaning  for the size field that could cause data logical records to be
1333       written on the medium (see the previous description for size). If  con‐
1334       version  to  a  regular  file  occurs, the pax utility shall produce an
1335       error indicating that the conversion took place. All  of  the  typeflag
1336       fields shall be coded in the ISO/IEC 646:1991 standard IRV:
1337
       0      Represents a regular file. For backwards-compatibility, a typeflag
              value of binary zero ('\0') should be recognized as meaning a
              regular file when extracting files from the archive. Archives
              written with this version of the archive file format create
              regular files with a typeflag value of the ISO/IEC 646:1991
              standard IRV '0'.
1344
1345       1      Represents a file linked to another file, of  any  type,  previ‐
1346              ously  archived.  Such  files are identified by each file having
1347              the same device and file serial number. The  linked-to  name  is
1348              specified  in the linkname field with a NUL-character terminator
1349              if it is less than 100 octets in length.
1350
1351       2      Represents a symbolic link. The contents of  the  symbolic  link
1352              shall be stored in the linkname field.
1353
1354       3,4    Represent  character  special  files  and  block  special  files
1355              respectively.  In this case the  devmajor  and  devminor  fields
1356              shall  contain  information  defining  the device, the format of
1357              which is unspecified by  this  volume  of  IEEE Std 1003.1-2001.
1358              Implementations  may  map the device specifications to their own
1359              local specification or may ignore the entry.
1360
1361       5      Specifies a directory or subdirectory.  On  systems  where  disk
1362              allocation  is  performed  on  a directory basis, the size field
1363              shall contain the maximum number of octets (which may be rounded
1364              to  the  nearest  disk block allocation unit) that the directory
1365              may hold. A size field of zero indicates no such limiting.  Sys‐
1366              tems  that  do not support limiting in this manner should ignore
1367              the size field.
1368
1369       6      Specifies a FIFO special file. Note that the archiving of a FIFO
1370              file archives the existence of this file and not its contents.
1371
1372       7      Reserved  to  represent  a  file  to which an implementation has
1373              associated  some  high-performance  attribute.   Implementations
1374              without such extensions should treat this file as a regular file
1375              (type 0).
1376
       A-Z    The letters 'A' to 'Z', inclusive, are reserved for custom
              implementations. All other values are reserved for future
              versions of IEEE Std 1003.1-2001.
1380
1381
1382       Attempts to archive a socket using ustar interchange format shall  pro‐
1383       duce  a diagnostic message. Handling of other file types is implementa‐
1384       tion-defined.
1385
1386       The magic field is the specification that this archive  was  output  in
1387       this  archive format. If this field contains ustar (the five characters
1388       from the ISO/IEC 646:1991 standard IRV  shown  followed  by  NUL),  the
1389       uname  and gname fields shall contain the ISO/IEC 646:1991 standard IRV
1390       representation of the owner and group of the file, respectively  (trun‐
1391       cated to fit, if necessary). When the file is restored by a privileged,
1392       protection-preserving version of the utility, the user and group  data‐
1393       bases  shall  be scanned for these names.  If found, the user and group
1394       IDs contained within these files shall be used rather than  the  values
1395       contained within the uid and gid fields.
1396
1397   cpio Interchange Format
1398       The  octet-oriented  cpio  archive format shall be a series of entries,
1399       each comprising a header that describes the file, the name of the file,
1400       and then the contents of the file.
1401
1402       An  archive may be recorded as a series of fixed-size blocks of octets.
1403       This blocking shall be used only to make physical I/O  more  efficient.
1404       The last group of blocks shall always be at the full size.
1405
1406       For the octet-oriented cpio archive format, the individual entry infor‐
1407       mation shall be in the order indicated and described by  the  following
1408       table; see also the <cpio.h> header.
1409
1410                      Table: Octet-Oriented cpio Archive Entry
1411
1412              Header Field Name     Length (in Octets)  Interpreted as
1413              c_magic               6                   Octal number
1414              c_dev                 6                   Octal number
1415              c_ino                 6                   Octal number
1416              c_mode                6                   Octal number
1417              c_uid                 6                   Octal number
1418              c_gid                 6                   Octal number
1419              c_nlink               6                   Octal number
1420              c_rdev                6                   Octal number
1421              c_mtime               11                  Octal number
1422              c_namesize            6                   Octal number
1423              c_filesize            11                  Octal number
1424              Filename Field Name   Length              Interpreted as
1425              c_name                c_namesize          Pathname string
1426              File Data Field Name  Length              Interpreted as
1427              c_filedata            c_filesize          Data
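
       The fixed part of an entry, c_magic through c_filesize, is 76 octets
       long. The following sketch decodes it; the names CPIO_FIELDS and
       parse_cpio_header are hypothetical.

```python
# (field name, length in octets), in the order given by the table.
CPIO_FIELDS = [
    ("c_magic", 6), ("c_dev", 6), ("c_ino", 6), ("c_mode", 6),
    ("c_uid", 6), ("c_gid", 6), ("c_nlink", 6), ("c_rdev", 6),
    ("c_mtime", 11), ("c_namesize", 6), ("c_filesize", 11),
]
CPIO_HEADER_LENGTH = sum(length for _, length in CPIO_FIELDS)  # 76 octets


def parse_cpio_header(raw):
    """Decode the octal header fields of one cpio entry into integers."""
    if len(raw) < CPIO_HEADER_LENGTH:
        raise ValueError("truncated cpio header")
    fields = {}
    position = 0
    for name, length in CPIO_FIELDS:
        digits = raw[position:position + length].decode("ascii")
        fields[name] = int(digits, 8)
        position += length
    if fields["c_magic"] != 0o070707:
        raise ValueError("not an octet-oriented cpio header")
    return fields
```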
1428
1429   cpio Header
1430       For  each  file in the archive, a header as defined previously shall be
1431       written. The information in the header fields is written as streams  of
1432       the  ISO/IEC 646:1991 standard characters interpreted as octal numbers.
       The octal numbers shall be extended to the necessary length by adding
       ISO/IEC 646:1991 standard IRV zeros at the most-significant-digit end of
       the number; the result is written to the stream of octets
       most-significant digit first. The fields shall be interpreted as
       follows:
1438
1439       c_magic
              Identifies the archive as being a transportable archive by
              containing the identifying value "070707".
1442
1443       c_dev, c_ino
1444              Contains  values  that uniquely identify the file within the ar‐
1445              chive (that is, no files contain the  same  pair  of  c_dev  and
1446              c_ino values unless they are links to the same file). The values
1447              shall be determined in an unspecified manner.
1448
1449       c_mode Contains the file type and access permissions as defined in  the
1450              following table.
1451
1452                            Table: Values for cpio c_mode Field
1453
1454                   File Permissions Name  Value    Indicates
1455                   C_IRUSR                000400   Read by owner
1456                   C_IWUSR                000200   Write by owner
1457                   C_IXUSR                000100   Execute by owner
1458                   C_IRGRP                000040   Read by group
1459                   C_IWGRP                000020   Write by group
1460                   C_IXGRP                000010   Execute by group
1461                   C_IROTH                000004   Read by others
1462                   C_IWOTH                000002   Write by others
1463                   C_IXOTH                000001   Execute by others
1464                   C_ISUID                004000   Set uid
1465                   C_ISGID                002000   Set gid
1466                   C_ISVTX                001000   Reserved
1467                   File Type Name         Value    Indicates
1468                   C_ISDIR                040000   Directory
1469                   C_ISFIFO               010000   FIFO
1470                   C_ISREG                0100000  Regular file
1471                   C_ISLNK                0120000  Symbolic link
1472                   C_ISBLK                060000   Block special file
1473                   C_ISCHR                020000   Character special file
1474                   C_ISSOCK               0140000  Socket
1475                   C_ISCTG                0110000  Reserved
1476
1477       Directories,  FIFOs,  symbolic  links,  and regular files shall be sup‐
1478       ported on a system conforming to this volume  of  IEEE Std 1003.1-2001;
1479       additional  values  defined  previously  are reserved for compatibility
1480       with existing systems.  Additional file types may  be  supported;  how‐
1481       ever,  such  files  should  not  be  written to archives intended to be
1482       transported to other systems.
1483
1484       c_uid  Contains the user ID of the owner.
1485
1486       c_gid  Contains the group ID of the group.
1487
1488       c_nlink
1489              Contains the number of links referencing the file  at  the  time
1490              the archive was created.
1491
1492       c_rdev Contains  implementation-defined  information  for  character or
1493              block special files.
1494
1495       c_mtime
1496              Contains the latest time of modification of the file at the time
1497              the archive was created.
1498
1499       c_namesize
1500              Contains  the  length of the pathname, including the terminating
1501              NUL character.
1502
1503       c_filesize
1504              Contains the length of the file in octets.  This  shall  be  the
1505              length of the data section following the header structure.
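
       As an illustration of the c_mode table above, the following sketch
       separates the file type from the permission bits. The mask 0170000 is
       an assumption of this example (it covers every file type value in the
       table); the text above does not name it.

```python
FILE_TYPE_MASK = 0o170000  # assumed mask covering all file type values

CPIO_FILE_TYPES = {
    0o040000: "directory",
    0o010000: "FIFO",
    0o100000: "regular file",
    0o120000: "symbolic link",
    0o060000: "block special file",
    0o020000: "character special file",
    0o140000: "socket",
    0o110000: "reserved",
}


def cpio_file_type(c_mode):
    """Return a description of the file type encoded in c_mode."""
    return CPIO_FILE_TYPES.get(c_mode & FILE_TYPE_MASK, "unknown")


def cpio_permissions(c_mode):
    """Return the permission, set-ID, and reserved bits of c_mode."""
    return c_mode & 0o7777
```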
1506
1507
1508   cpio Filename
1509       The  c_name field shall contain the pathname of the file. The length of
1510       this field in octets is the value of c_namesize.
1511
1512       If a filename is found on the medium that would create an invalid path‐
1513       name,  it  is  implementation-defined whether the data from the file is
1514       stored on the file hierarchy and under what name it is stored.
1515
1516       All characters shall be represented in  the  ISO/IEC 646:1991  standard
1517       IRV.  For  maximum portability between implementations, names should be
1518       selected from characters represented by the portable filename character
1519       set  as octets with the most significant bit zero. If an implementation
1520       supports the use of characters outside the portable filename  character
1521       set  in names for files, users, and groups, one or more implementation-
1522       defined encodings of these characters shall be provided for interchange
1523       purposes.  However, the pax utility shall never create filenames on the
1524       local system that cannot be accessed via the procedures described  pre‐
1525       viously  in this volume of IEEE Std 1003.1-2001. If a filename is found
1526       on the medium that would create an invalid filename, it is  implementa‐
1527       tion-defined whether the data from the file is stored on the local file
1528       system and under what name it is stored. The pax utility may choose  to
1529       ignore  these files as long as it produces an error indicating that the
1530       file is being ignored.
1531
1532   cpio File Data
1533       Following c_name, there shall be c_filesize octets of data. Interpreta‐
1534       tion  of such data occurs in a manner dependent on the file. If c_file‐
1535       size is zero, no data shall be contained in c_filedata.
1536
1537       When restoring from an archive:
1538
1539        * If the user does not have the appropriate privilege to create a file
1540          of the specified type, pax shall ignore the entry and write an error
1541          message to standard error.
1542
1543        * Only regular files have data to be  restored.  Presuming  a  regular
1544          file  meets any selection criteria that might be imposed on the for‐
1545          mat-reading utility by the user, such data shall be restored.
1546
1547        * If a user does not have appropriate privilege to  set  a  particular
1548          mode  flag, the flag shall be ignored. Some of the mode flags in the
1549          archive format  are  not  mentioned  elsewhere  in  this  volume  of
1550          IEEE Std 1003.1-2001.  If  the implementation does not support those
1551          flags, they may be ignored.
1552
1553   cpio Special Entries
1554       FIFO special files, directories, and the trailer shall be recorded with
1555       c_filesize  equal  to  zero.  For  other  special  files, c_filesize is
1556       unspecified by this volume of IEEE Std 1003.1-2001.  The header for the
1557       next file entry in the archive shall be written directly after the last
1558       octet of the file entry preceding it. A header  denoting  the  filename
1559       TRAILER!!!  shall  indicate  the  end  of  the archive; the contents of
1560       octets in the last block of the archive following  such  a  header  are
1561       undefined.
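
       Reading entries until the trailer can be sketched as follows. The
       generator name is hypothetical, the whole of each file is read into
       memory for simplicity, and any blocking on the medium is assumed to
       have been removed already.

```python
# Header field lengths in octets, c_magic through c_filesize (76 in total).
FIELD_LENGTHS = (6, 6, 6, 6, 6, 6, 6, 6, 11, 6, 11)


def iter_cpio_entries(stream):
    """Yield (name, c_mode, data) for each entry before TRAILER!!!."""
    while True:
        header = stream.read(sum(FIELD_LENGTHS))
        if len(header) < sum(FIELD_LENGTHS):
            raise ValueError("archive ended before the trailer entry")
        values = []
        position = 0
        for length in FIELD_LENGTHS:
            values.append(int(header[position:position + length].decode("ascii"), 8))
            position += length
        c_mode, c_namesize, c_filesize = values[3], values[9], values[10]
        # c_namesize counts the terminating NUL, which is dropped here.
        name = stream.read(c_namesize)[:-1]
        if name == b"TRAILER!!!":
            return  # octets after the trailer are undefined
        # The next header starts directly after the last data octet.
        data = stream.read(c_filesize)
        yield name, c_mode, data
```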
1562

EXIT STATUS

1564       The following exit values shall be returned:
1565
1566        0     All files were processed successfully.
1567
1568       >0     An error occurred.
1569
1570

CONSEQUENCES OF ERRORS

1572       If pax cannot create a file or a link when reading an archive or cannot
1573       find a file when writing an archive, or cannot preserve  the  user  ID,
1574       group  ID,  or  file mode when the -p option is specified, a diagnostic
1575       message shall be written to standard error and a non-zero  exit  status
1576       shall be returned, but processing shall continue. In the case where pax
1577       cannot create a link to a file, pax shall not,  by  default,  create  a
1578       second copy of the file.
1579
1580       If  the  extraction of a file from an archive is prematurely terminated
1581       by a signal or error, pax may have only partially extracted the file or
1582       (if  the  -n option was not specified) may have extracted a file of the
1583       same name as that specified by the user, but which is not the file  the
1584       user  wanted. Additionally, the file modes of extracted directories may
1585       have additional bits from the S_IRWXU mask set  as  well  as  incorrect
1586       modification and access times.
1587
1588       The following sections are informative.
1589

APPLICATION USAGE

1591       The  -p  (privileges)  option  was  invented  to  reconcile differences
1592       between historical tar and cpio implementations. In particular, the two
1593       utilities use -m in diametrically opposed ways. The -p option also pro‐
1594       vides a consistent means of extending the ways  in  which  future  file
1595       attributes  can  be addressed, such as for enhanced security systems or
1596       high-performance files. Although it may seem complex, there are  really
1597       two modes that are most commonly used:
1598
       -p e   "Preserve everything". This would be used by the historical
1600              superuser, someone with all the appropriate privileges, to  pre‐
1601              serve  all  aspects of the files as they are recorded in the ar‐
1602              chive.  The e flag is the sum of o and p, and other  implementa‐
1603              tion-defined attributes.
1604
       -p p   "Preserve" the file mode bits. This would be used by the user
1606              with regular privileges who wished to preserve  aspects  of  the
1607              file  other  than the ownership. The file times are preserved by
1608              default, but two other flags are offered to  disable  these  and
1609              use the time of extraction.
1610
1611
1612       The  one pathname per line format of standard input precludes pathnames
1613       containing <newline>s. Although such  pathnames  violate  the  portable
1614       filename  guidelines,  they  may  exist  and their presence may inhibit
1615       usage of pax within shell scripts.  This problem is inherited from his‐
1616       torical  archive  programs. The problem can be avoided by listing file‐
1617       name arguments on the command line instead of on standard input.
1618
1619       It is almost certain that appropriate privileges are required  for  pax
1620       to  accomplish  parts  of this volume of IEEE Std 1003.1-2001. Specifi‐
1621       cally, creating files of  type  block  special  or  character  special,
1622       restoring file access times unless the files are owned by the user (the
1623       -t option), or preserving file owner, group, and mode (the  -p  option)
1624       all probably require appropriate privileges.
1625
1626       In read mode, implementations are permitted to overwrite files when the
1627       archive has multiple members with the same name.  This may fail if per‐
1628       missions  on the first version of the file do not permit it to be over‐
1629       written.
1630
1631       The cpio and ustar formats can only  support  files  up  to  8589934592
1632       bytes (8 * 2^30) in size.
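
       The figure follows from the field widths: the ustar size field holds 11
       octal digits plus a terminator, and c_filesize is 11 octal digits. A
       small check, with a hypothetical helper name, shows that the largest
       value such a field can hold is one less than 8 * 2^30:

```python
def max_octal_field_value(digits):
    """Largest number representable in `digits` octal digits."""
    return 8 ** digits - 1


largest_size = max_octal_field_value(11)
limit = 8 * 2 ** 30  # the figure quoted above, equal to 8 ** 11
```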
1633

EXAMPLES

1635       The following command:
1636
1637
1638              pax -w -f /dev/rmt/1m .
1639
       copies the contents of the current directory to tape drive 1, medium
       density (assuming historical System V device naming procedures; the
       historical BSD device name would be /dev/rmt9).
1643
1644       The following commands:
1645
1646
              mkdir newdir
              pax -rw olddir newdir
1648
1649       copy the olddir directory hierarchy to newdir.
1650
1651
1652              pax -r -s ',^//*usr//*,,' -f a.pax
1653
1654       reads  the  archive a.pax, with all files rooted in /usr in the archive
1655       extracted relative to the current directory.
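
       The effect of that substitution on member names can be illustrated in
       Python. This is only an approximation for explanation: pax uses ed-style
       basic regular expressions, whereas the sketch uses Python's re syntax,
       in which the pattern //* (one or more slashes) is written /+. The
       function name is hypothetical.

```python
import re

# Equivalent of the basic regular expression ^//*usr//* in Python syntax.
USR_PREFIX = re.compile(r"^/+usr/+")


def strip_usr_prefix(pathname):
    """Remove a leading /usr/ component, as -s ',^//*usr//*,,' would."""
    return USR_PREFIX.sub("", pathname, count=1)
```

       Names that do not begin with /usr/ are left unchanged, so they would
       still be extracted under their original pathnames.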
1656
1657       Using the option:
1658
1659
1660              -o listopt="%M %(atime)T %(size)D %(name)s"
1661
1662       overrides the default output description in Standard Output and instead
1663       writes:
1664
1665
1666              -rw-rw--- Jan 12 15:53 1492 /usr/foo/bar
1667
1668       Using the options:
1669
1670
1671              -o listopt='%L\t%(size)D\n%.7' \
1672              -o listopt='(name)s\n%(ctime)T\n%T'
1673
1674       overrides the default output description in Standard Output and instead
1675       writes:
1676
1677
1678              /usr/foo/bar -> /tmp   1492
1679              /usr/fo
1680              Jan 12 1991
1681              Jan 31 15:53
1682

RATIONALE

1684       The pax utility was new for the ISO POSIX-2:1993 standard.   It  repre‐
1685       sents a peaceful compromise between advocates of the historical tar and
1686       cpio utilities.
1687
1688       A fundamental difference between cpio and tar was in the  way  directo‐
1689       ries  were  treated. The cpio utility did not treat directories differ‐
1690       ently from other files, and to select  a  directory  and  its  contents
1691       required  that  each file in the hierarchy be explicitly specified. For
1692       tar, a directory matched every file in the file hierarchy it rooted.
1693
1694       The pax utility offers both interfaces;  by  default,  directories  map
1695       into the file hierarchy they root. The -d option causes pax to skip any
       file not explicitly referenced, as cpio historically did. The tar-style
       behavior was chosen as the default because it was believed that
1698       this was the more common usage and because tar  is  the  more  commonly
1699       available  interface,  as it was historically provided on both System V
1700       and BSD implementations.
1701
1702       The  data  interchange  format  specification   in   this   volume   of
1703       IEEE Std 1003.1-2001  requires  that processes with "appropriate privi‐
1704       leges" shall always restore the ownership and permissions of  extracted
1705       files  exactly  as  archived.  If  viewed from the historic equivalence
1706       between superuser and "appropriate privileges", there are two  problems
1707       with  this requirement.  First, users running as superusers may unknow‐
1708       ingly set dangerous permissions on extracted files. Second, it is need‐
1709       lessly  limiting,  in that superusers cannot extract files and own them
1710       as superuser unless the archive  was  created  by  the  superuser.  (It
1711       should  be noted that restoration of ownerships and permissions for the
1712       superuser, by default, is historical practice in cpio, but not in tar.)
1713       In  order  to  avoid  these  two problems, the pax specification has an
1714       additional "privilege" mechanism, the -p option. Only a pax  invocation
1715       with the privileges needed, and which has the -p option set using the e
1716       specification character, has the  "appropriate  privilege"  to  restore
1717       full ownership and permission information.
1718
1719       Note  also  that  this volume of IEEE Std 1003.1-2001 requires that the
1720       file ownership and access permissions shall be set, on  extraction,  in
1721       the  same  fashion  as the creat() function when provided with the mode
1722       stored in the archive. This means that the file creation  mask  of  the
1723       user is applied to the file permissions.
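
       The effect of the file creation mask can be worked out with shell
       arithmetic. This is a minimal sketch, not part of pax; effective_mode
       is a hypothetical helper and both arguments are assumed to be octal
       constants written with a leading 0.

```shell
# effective_mode ARCHIVED_MODE MASK
# Both arguments are octal with a leading 0 (for example 0666 and 022).
# Prints the mode bits that remain once the file creation mask has been
# applied, in the same way creat() combines a requested mode with the mask.
effective_mode() {
    printf '%o\n' $(( $1 & ~$2 & 07777 ))
}

effective_mode 0666 022    # prints 644
effective_mode 0777 077    # prints 700
```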
1724
1725       Users should note that directories may be created by pax while extract‐
1726       ing files with permissions that are different from those  that  existed
1727       at the time the archive was created. When extracting sensitive informa‐
1728       tion into a directory  hierarchy  that  no  longer  exists,  users  are
1729       encouraged  to  set  their  file creation mask appropriately to protect
1730       these files during extraction.
1731
1732       The table of contents output is written to standard output  to  facili‐
1733       tate pipeline processing.
1734
1735       An early proposal had hard links displaying for all pathnames. This was
1736       removed because it complicates the output of the case where -v  is  not
1737       specified  and  does  not  match  historical  cpio usage. The hard-link
1738       information is available in the -v display.
1739
1740       The description of the -l option allows implementations  to  make  hard
1741       links to symbolic links.  IEEE Std 1003.1-2001 does not specify any way
1742       to create a hard link to a symbolic link, but many implementations pro‐
1743       vide  this  capability as an extension. If there are hard links to sym‐
1744       bolic links when an archive is created, the implementation is  required
1745       to archive the hard link in the archive (unless -H or -L is specified).
1746       When in read mode and in copy  mode,  implementations  supporting  hard
1747       links to symbolic links should use them when appropriate.
1748
1749       The  archive formats inherited from the POSIX.1-1990 standard have cer‐
1750       tain restrictions that have been brought along from  historical  usage.
1751       For  example,  there are restrictions on the length of pathnames stored
1752       in the archive. When pax is used in copy (-rw) mode (copying directory
1753       hierarchies),  the  ability  to  use  extensions from the -x pax format
1754       overcomes these restrictions.
1755
1756       The default blocksize value of 5120 bytes for cpio was selected because
1757       it  is  one of the standard block-size values for cpio, set when the -B
1758       option is specified.  (The other default block-size value for  cpio  is
1759       512  bytes, and this was considered to be too small.) The default block
1760       value of 10240 bytes for tar was selected because that is the  standard
1761       block-size  value  for  BSD  tar. The maximum block size of 32256 bytes
1762       (2**15-512 bytes) is the largest multiple of 512 bytes that fits into a
1763       signed  16-bit tape controller transfer register. There are known limi‐
1764       tations in some historical systems that  would  prevent  larger  blocks
1765       from being accepted. Historical values were chosen to improve
1766       compatibility with historical scripts using dd or similar utilities to
1767       manipulate archives. Also, default block sizes for any file type other
1768       than character special files have been deleted from this volume of
1769       IEEE Std 1003.1-2001 as unimportant and not likely to affect the struc‐
1770       ture of the resulting archive.
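
       The block-size arithmetic above can be checked directly. The helper
       names below are illustrative assumptions; the sketch only encodes the
       constraints stated in this paragraph (a multiple of 512 bytes, at most
       2**15-512 bytes).

```shell
# Largest multiple of 512 that still fits below 2**15.
max_blocksize() {
    echo $(( (1 << 15) - 512 ))
}

# is_valid_blocksize N
# Succeeds when N is a positive multiple of 512 no larger than the maximum.
is_valid_blocksize() {
    [ "$1" -gt 0 ] && [ $(( $1 % 512 )) -eq 0 ] && [ "$1" -le "$(max_blocksize)" ]
}

max_blocksize    # prints 32256
```

       Both historical defaults mentioned above, 5120 and 10240, pass
       is_valid_blocksize; 32768 does not.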
1771
1772       Implementations are permitted to modify the block-size value  based  on
1773       the archive format or the device to which the archive is being written.
1774       This is to provide implementations with the opportunity to take  advan‐
1775       tage  of  special types of devices, and it should not be used without a
1776       great deal of consideration as it almost  certainly  decreases  archive
1777       portability.
1778
1779       The  intended  use  of the -n option was to permit extraction of one or
1780       more files from the archive without processing the entire archive. This
1781       was  viewed  by the standard developers as offering significant perfor‐
1782       mance advantages over historical  implementations.  The  -n  option  in
1783       early proposals had three effects; the first was to cause special char‐
1784       acters in patterns to not be treated specially. The second was to cause
1785       only  the  first file that matched a pattern to be extracted. The third
1786       was to cause pax to write a diagnostic message to standard  error  when
1787       no  file was found matching a specified pattern. Only the second behav‐
1788       ior is retained by this volume of IEEE Std 1003.1-2001, for  many  rea‐
1789       sons.  First,  it  is  in general not acceptable for a single option to
1790       have multiple effects. Second, the ability  to  make  pattern  matching
1791       characters  act  as  normal characters is useful for parts of pax other
1792       than file extraction.  Third, a finer degree of control over  the  spe‐
1793       cial  characters  is  useful because users may wish to normalize only a
1794       single special character in a single filename.  Fourth,  given  a  more
1795       general escape mechanism, the previous behavior of the -n option can be
1796       easily obtained using the -s option or a sed script.  Finally,  writing
1797       a  diagnostic message when a pattern specified by the user is unmatched
1798       by any file is useful behavior in all cases.
1799
1800       In this version, the -n option was removed from the copy mode synopsis of pax;
1801       it  is  inapplicable because there are no pattern operands specified in
1802       this mode.
1803
1804       IEEE Std 1003.1-2001 also describes a method other than pax for copying
1805       subtrees, as part of the cp utility. Both methods
1806       are historical practice: cp provides a simpler, more  intuitive  inter‐
1807       face,  while  pax  offers a finer granularity of control. Each provides
1808       additional functionality to the other; in particular, pax maintains the
1809       hard-link  structure  of  the  hierarchy  while  cp does not. It is the
1810       intention of the standard developers that the results be similar (using
1811       appropriate option combinations in both utilities). The results are not
1812       required to be identical; there seemed insufficient  gain  to  applica‐
1813       tions  to balance the difficulty of implementations having to guarantee
1814       that the results would be exactly identical.
1815
1816       A single archive may span more than one  file.  It  is  suggested  that
1817       implementations  provide  informative  messages to the user on standard
1818       error whenever the archive file is changed.
1819
1820       The -d option (do not create intermediate directories not listed in the
1821       archive)  found in early proposals was originally provided as a comple‐
1822       ment to the historic -d option of cpio.  It has been deleted.
1823
1824       The -s option in early proposals specified a subset of the substitution
1825       command  from  the ed utility. As there was no reason for only a subset
1826       to be supported, the -s option is now compatible with  the  current  ed
1827       specification.  Since  the delimiter can be any non-null character, the
1828       following usage with single spaces is valid:
1829
1830
1831              pax -s " foo bar " ...
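
       The s command of sed accepts the same freedom of delimiter, so a
       substitution of this shape can be previewed on a pathname without
       touching an archive. A small sketch; preview_rename is a hypothetical
       helper and is not part of pax.

```shell
# preview_rename PATHNAME
# Apply the substitution "foo" -> "bar", using a single space as the
# delimiter, and print the resulting pathname.
preview_rename() {
    printf '%s\n' "$1" | sed 's foo bar '
}

preview_rename foo.c    # prints bar.c
```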
1832
1833       The -t description is worded so as to note  that  this  may  cause  the
1834       access  time  update  caused by some other activity (which occurs while
1835       the file is being read) to be overwritten.
1836
1837       The default behavior of pax with regard to file modification  times  is
1838       the same as historical implementations of tar. It is not the historical
1839       behavior of cpio.
1840
1841       Because the -i option uses /dev/tty, utilities  without  a  controlling
1842       terminal are not able to use this option.
1843
1844       The  -y  option,  found  in early proposals, has been deleted because a
1845       line containing a single period for the -i option has equivalent  func‐
1846       tionality. The special lines for the -i option (a single period and the
1847       empty line) are historical practice in cpio.
1848
1849       In early drafts, a -e charmap option was included to increase portabil‐
1850       ity of files between systems using different coded character sets. This
1851       option was omitted because it was apparent that consensus could not  be
1852       formed  for it. In this version, the use of UTF-8 should be an adequate
1853       substitute.
1854
1855       The -k option was added to address  international  concerns  about  the
1856       dangers  involved  in  the  character set transformations of -e (if the
1857       target character set were different  from  the  source,  the  filenames
1858       might  be  transformed into names matching existing files) and also was
1859       made more general to protect files  transferred  between  file  systems
1860       with  different  {NAME_MAX}  values (truncating a filename on a smaller
1861       system might also inadvertently overwrite existing files).  As  stated,
1862       it  prevents any overwriting, even if the target file is older than the
1863       source. This version adds more granularity of  options  to  solve  this
1864       problem by introducing the -o invalid= option, specifically the UTF-8
1865       action. (Note that an existing file that is named with a UTF-8 encoding
1866       is still subject to overwriting in this case. The -k option closes that
1867       loophole.)
1868
1869       Some  of  the  file  characteristics  referenced  in  this  volume   of
1870       IEEE Std 1003.1-2001  might  not  be supported by some archive formats.
1871       For example, neither the tar nor cpio formats contain the  file  access
1872       time. For this reason, the e specification character has been provided,
1873       intended to cause all file characteristics specified in the archive  to
1874       be retained.
1875
1876       It  is  required  that  extracted  directories,  by default, have their
1877       access and modification times and permissions set to the values  speci‐
1878       fied  in the archive. This has obvious problems in that the directories
1879       are almost certainly modified after being extracted and that  directory
1880       permissions  may not permit file creation.  One possible solution is to
1881       create directories with the mode specified in the archive, as  modified
1882       by  the  umask  of  the user, with sufficient permissions to allow file
1883       creation. After all files have been extracted, pax would then reset the
1884       access and modification times and permissions as necessary.
1885
1886       The  list-mode  formatting  description  borrows  heavily  from the one
1887       defined by the printf utility. However, since there is no separate  op‐
1888       erand  list  to  get  conversion  arguments, the format was extended to
1889       allow specifying the name of the conversion argument  as  part  of  the
1890       conversion specification.
1891
1892       The T conversion specifier allows time fields to be displayed in any of
1893       the date formats. Unlike the ls utility, pax does not adjust the format
1894       when  the  date is less than six months in the past. This makes parsing
1895       the output more predictable.
1896
1897       The  D  conversion  specifier  handles  the  ability  to  display   the
1898       major/minor or file size, as with ls, by using %-8(size)D.
1899
1900       The L conversion specifier handles the ls display for symbolic links.
1901
1902       Conversion  specifiers were added to generate existing known types used
1903       for ls.
1904
1905   pax Interchange Format
1906       The new POSIX data interchange format was developed primarily  to  sat‐
1907       isfy  international  concerns  that  the ustar and cpio formats did not
1908       provide for file, user, and group names encoded in characters outside a
1909       subset  of the ISO/IEC 646:1991 standard. The standard developers real‐
1910       ized that this new POSIX data interchange format should be very  exten‐
1911       sible  because  there  were other requirements they foresaw in the near
1912       future:
1913
1914        * Support international character encodings and locale information
1915
1916        * Support security information (ACLs, and so on)
1917
1918        * Support future file types, such as realtime or contiguous files
1919
1920        * Include data areas for implementation use
1921
1922        * Support systems with words larger than 32 bits and timers with  sub‐
1923          second granularity
1924
1925       The  following  were not goals for this format because these are better
1926       handled by separate utilities or are inappropriate for a portable  for‐
1927       mat:
1928
1929        * Encryption
1930
1931        * Compression
1932
1933        * Data translation between locales and codesets
1934
1935        * inode storage
1936
1937       The  format  chosen  to  support the goals is an extension of the ustar
1938       format. Of the two formats previously available, only the ustar  format
1939       was selected for extensions because:
1940
1941        * It  was  easier  to  extend in an upwards-compatible way. It offered
1942          version flags and header block type  fields  with  room  for  future
1943          standardization.  The  cpio format, while possessing a more flexible
1944          file naming methodology, could not be extended without breaking some
1945          theoretical implementation or using a dummy filename that could be a
1946          legitimate filename.
1947
1948        * Industry experience since the original "tar wars" fought in
1949          developing the ISO POSIX-1 standard has clearly been in favor of the
1950          ustar format, which is generally the default output format  selected
1951          for pax implementations on new systems.
1952
1953       The  new  format was designed with one additional goal in mind: reason‐
1954       able behavior when an older tar or pax utility happened to read an  ar‐
1955       chive.  Since the POSIX.1-1990 standard mandated that a "format-reading
1956       utility" had to treat unrecognized typeflag values  as  regular  files,
1957       this  allowed  the  format to include all the extended information in a
1958       pseudo-regular file that preceded each real file. An  option  is  given
1959       that  allows  the  archive creator to set up reasonable names for these
1960       files on the older systems. Also, the normative text suggests that rea‐
1961       sonable file access values be used for this ustar header block.  Making
1962       these header files inaccessible for  convenient  reading  and  deleting
1963       would not be reasonable. File permissions of 600 or 700 are suggested.
1964
1965       The  ustar  typeflag field was used to accommodate the additional func‐
1966       tionality of the new format rather than magic or  version  because  the
1967       POSIX.1-1990 standard (and, by reference, the previous version of pax),
1968       mandated the behavior of the format-reading utility when it encountered
1969       an unknown typeflag, but was silent about the other two fields.
1970
1971       Early proposals of the first revision to IEEE Std 1003.1-2001 contained
1972       a proposed archive format that was  based  on  compatibility  with  the
1973       standard  for tape files (ISO 1001, similar to the format used histori‐
1974       cally on many mainframes and minicomputers).  This  format  was  overly
1975       complex  and  required  considerable  overhead  in  volume  and  header
1976       records. Furthermore, the standard developers felt that it would not be
1977       acceptable  to  the  community  of  POSIX  developers,  so it was later
1978       changed to be a format more closely related to historical  practice  on
1979       POSIX systems.
1980
1981       The  prefix  and  name  split of pathnames in ustar was replaced by the
1982       single path extended header record for simplicity.
1983
1984       The concept of a global extended header (typeflag g) was controversial.
1985       If this were applied to an archive being recorded on magnetic
1986       tape, a few unreadable blocks at the beginning of the tape could  be  a
1987       serious  problem; a utility attempting to extract as many files as pos‐
1988       sible from a damaged archive could lose  a  large  percentage  of  file
1989       header  information  in  this  case.  However, if the archive were on a
1990       reliable medium, such as a CD-ROM, the global  extended  header  offers
1991       considerable  potential size reductions by eliminating redundant infor‐
1992       mation. Thus, the text warns against using the global method for  unre‐
1993       liable media and provides a method for implanting global information in
1994       the extended header for each  file,  rather  than  in  the  typeflag  g
1995       records.
1996
1997       No  facility  for  data translation or filtering on a per-file basis is
1998       included because the standard developers could not invent an  interface
1999       that  would  allow  this  in  an efficient manner. If a filter, such as
2000       encryption or compression, is to be applied to all  the  files,  it  is
2001       more  efficient  to  apply the filter to the entire archive as a single
2002       file. The standard developers considered interfaces that would invoke a
2003       shell  script  for  each file going into or out of the archive, but the
2004       system overhead in this approach was considered to be too high.
2005
2006       One such approach would be to have filter= records that give a pathname
2007       for  an  executable.  When the program is invoked, the file and archive
2008       would be open for standard input/output and all the header fields would
2009       be  available  as  environment variables or command-line arguments. The
2010       standard developers did discuss such schemes,  but  they  were  omitted
2011       from  IEEE Std 1003.1-2001  due  to  concerns about excessive overhead.
2012       Also, the program itself would need to be in the archive if it were  to
2013       be used portably.
2014
2015       There  is  currently  no  portable  means  of identifying the character
2016       set(s) used for a file in the file system. Therefore, pax has not  been
2017       given  a  mechanism to generate charset records automatically. The only
2018       portable means of doing this is for the user to write the archive using
2019       the  -o  charset=  string command line option. This assumes that all of
2020       the files in the archive use the same  encoding.  The  "implementation-
2021       defined"  text  is included to allow for a system that can identify the
2022       encodings used for each of its files.
2023
2024       The table of standards that accompanies the charset record  description
2025       is  acknowledged to be very limited. Only a limited number of character
2026       set standards is reasonable for maximal interchange.  Any character set
2027       is,  of  course,  possible  by  prior  agreement. It was suggested that
2028       EBCDIC be listed, but it was omitted because it is  not  defined  by  a
2029       formal  standard. Formal standards, and then only those with reasonably
2030       large followings, can be included here, simply as a matter  of  practi‐
2031       cality. The <value>s represent names of officially registered character
2032       sets in the format required by the ISO 2375:1985 standard.
2033
2034       The normal comma or <blank>-separated list rules are  not  followed  in
2035       the  case  of  keyword  options  to  allow ease of argument parsing for
2036       getopts.
2037
2038       Further information on character encodings is in pax Archive  Character
2039       Set Encoding/Decoding.
2040
2041       The  standard  developers  have  reserved keyword name space for vendor
2042       extensions. It is suggested that the format to be used is:
2043
2044
2045              VENDOR.keyword
2046
2047       where VENDOR is the name of the vendor or organization in all uppercase
2048       letters.  It is further suggested that the keyword following the period
2049       be named differently than any of the standard keywords so that it could
2050       be  used  for  future  standardization, if appropriate, by omitting the
2051       VENDOR prefix.
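
       The suggested naming convention is simple to generate mechanically.
       The helper below is a hypothetical illustration only; the vendor name
       acme is made up for the example.

```shell
# vendor_keyword VENDOR KEYWORD
# Print an extended keyword in the suggested VENDOR.keyword form, with the
# vendor name folded to uppercase and the keyword left unchanged.
vendor_keyword() {
    vendor=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
    printf '%s.%s\n' "$vendor" "$2"
}

vendor_keyword acme devmajor    # prints ACME.devmajor
```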
2052
2053       The <length> field in the extended header record was included  to  make
2054       it  simpler  to  step through the records, even if a record contains an
2055       unknown format (to a particular pax) with complex interactions of  spe‐
2056       cial  characters.  It also provides a minor integrity checkpoint within
2057       the records to aid a program attempting to recover files from a damaged
2058       archive.
2059
2060       There  are  no  extended  header  versions of the devmajor and devminor
2061       fields because the unspecified format ustar header field should be suf‐
2062       ficient.  If  they  are not, vendor-specific extended keywords (such as
2063       VENDOR.devmajor) should be used.
2064
2065       Device and i-number labeling of files was not adopted from cpio;  files
2066       are interchanged strictly on a symbolic name basis, as in ustar.
2067
2068       Just  as  with  the  ustar format descriptions, the new format makes no
2069       special arrangements for multi-volume archives. Each of the pax archive
2070       types  is  assumed  to be inside a single POSIX file and splitting that
2071       file over multiple volumes (diskettes, tape  cartridges,  and  so  on),
2072       processing  their  labels, and mounting each in the proper sequence are
2073       considered to  be  implementation  details  that  cannot  be  described
2074       portably.
2075
2076       The  pax  format  is intended for interchange, not only for backup on a
2077       single (family of) systems. It is not as densely  packed  as  might  be
2078       possible for backup:
2079
2080        * It  contains  information as coded characters that could be coded in
2081          binary.
2082
2083        * It identifies extended records with name fields that could be  omit‐
2084          ted in favor of a fixed-field layout.
2085
2086        * It  translates  names  into  a portable character set and identifies
2087          locale-related information, both of which are  probably  unnecessary
2088          for backup.
2089
2090       The  requirements  on  restoring from an archive are slightly different
2091       from the historical wording, allowing for non-monolithic  privilege  to
2092       bring  forward  as  much as possible. In particular, attributes such as
2093       "high performance file" might be broadly but  not  universally  granted
2094       while  set-user-ID  or chown() might be much more restricted.  There is
2095       no implication in IEEE Std 1003.1-2001 that the security information be
2096       honored  after  it  is restored to the file hierarchy, in spite of what
2097       might be improperly inferred by the silence on that topic.  That  is  a
2098       topic for another standard.
2099
2100       Links  are recorded in the fashion described here because a link can be
2101       to any file type. It is desirable in general to be able to restore part
2102       of an archive selectively and restore all of those files completely. If
2103       the data is not associated with each link, it is  not  possible  to  do
2104       this.  However,  the data associated with a file can be large, and when
2105       selective restoration is not needed, this can be a significant  burden.
2106       The  archive  is  structured so that files that have no associated data
2107       can always be restored by the name of any link, and
2108       the  user  may  choose whether data is recorded with each instance of a
2109       file that contains data. The format permits mixing  of  both  types  of
2110       links  in a single archive; this can be done for special needs, and pax
2111       is expected to interpret such archives on input properly,  despite  the
2112       fact  that  there  is no pax option that would force this mixed case on
2113       output. (When -o linkdata is used, the output must contain  the  dupli‐
2114       cate data, but the implementation is free to include it or omit it when
2115       -o linkdata is not used.)
2116
2117       The time values are included  as  extended  header  records  for  those
2118       implementations  needing  more  than the eleven octal digits allowed by
2119       the ustar format. Portable file timestamps cannot be negative.  If  pax
2120       encounters  a  file with a negative timestamp in copy or write mode, it
2121       can reject the file, substitute a non-negative timestamp, or generate a
2122       non-portable timestamp with a leading '-'. Even though some
2123       implementations can support finer file-time granularities than seconds, the
2124       normative  text  requires  support  only  for  seconds  since the Epoch
2125       because the ISO POSIX-1 standard states them that way. The ustar format
2126       includes  only mtime; the new format adds atime and ctime for symmetry.
2127       The atime access time restored to the file system will be  affected  by
2128       the  -p  a  and  -p e options.  The ctime creation time (actually inode
2129       modification time) is described with "appropriate privilege" so that it
2130       can  be ignored when writing to the file system. POSIX does not provide
2131       a portable means to change file creation time. Nothing is  intended  to
2132       prevent a non-portable implementation of pax from restoring the value.
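
       The limit imposed by eleven octal digits can be computed directly.
       This sketch assumes a shell with at least 64-bit arithmetic; the helper
       names are illustrative and not part of pax.

```shell
# Largest value expressible in eleven octal digits: 8**11 - 1 == 2**33 - 1.
max_ustar_time() {
    echo $(( (1 << 33) - 1 ))
}

# fits_ustar_time SECONDS
# Succeeds when the timestamp is non-negative and small enough for the
# eleven-digit ustar field; otherwise an extended header record is needed.
fits_ustar_time() {
    [ "$1" -ge 0 ] && [ "$1" -le "$(max_ustar_time)" ]
}

max_ustar_time                       # prints 8589934591
printf '%o\n' "$(max_ustar_time)"    # prints 77777777777
```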
2133
2134       The  gid,  size, and uid extended header records were included to allow
2135       expansion beyond the sizes specified in the  regular  tar  header.  New
2136       file  system  architectures are emerging that will exhaust the 12-digit
2137       size field. There are probably not many systems requiring more  than  8
2138       digits  for  user  and  group  IDs, but the extended header values were
2139       included for completeness, allowing overrides for all  of  the  decimal
2140       values in the tar header.
2141
2142       The  standard  developers intended to describe the effective results of
2143       pax with regard to file ownerships and permissions; implementations are
2144       not  restricted  in  timing or sequencing the restoration of such, pro‐
2145       vided the results are as specified.
2146
2147       Much of the text describing the extended headers refers to use in
2148       "write or copy modes". The copy mode references are due to the normative
2149       text: "The effect of the copy shall be as  if  the  copied  files  were
2150       written  to an archive file and then subsequently extracted ...". There
2151       is certainly no way to test whether  pax  is  actually  generating  the
2152       extended headers in copy mode, but the effects must be as if it had.
2153
2154   pax Archive Character Set Encoding/Decoding
2155       There  is  a need to exchange archives of files between systems of dif‐
2156       ferent native codesets. Filenames, group names, and user names must  be
2157       preserved to the fullest extent possible when an archive is read on the
2158       receiving platform. Translation of the contents of files is not  within
2159       the scope of the pax utility.
2160
2161       There will also be the need to represent characters that are not avail‐
2162       able on the receiving platform. These unsupported characters cannot  be
2163       automatically  folded  to the local set of characters due to the chance
2164       of collisions. This could result in overwriting previously extracted
2165       files from the archive or pre-existing files on the system.
2166
2167       For  these reasons, the codeset used to represent characters within the
2168       extended header records of the pax archive must be sufficiently rich to
2169       handle  all commonly used character sets. The fields requiring transla‐
2170       tion include, at a minimum, filenames, user  names,  group  names,  and
2171       link  pathnames.  Implementations  may  wish to have localized extended
2172       keywords that use non-portable characters.
2173
2174       The standard developers considered the following options:
2175
2176        * The archive creator specifies the well-defined name  of  the  source
2177          codeset.  The receiver must then recognize the codeset name and per‐
2178          form the appropriate translations to the destination codeset.
2179
2180        * The archive creator includes within the archive the  character  map‐
2181          ping  table  for  the  source codeset used to encode extended header
2182          records. The receiver must then read the character mapping table and
2183          perform the appropriate translations to the destination codeset.
2184
2185        * The  archive  creator  translates the extended header records in the
2186          source codeset into a canonical form. The receiver must then perform
2187          the appropriate translations to the destination codeset.
2188
2189       The approach that incorporates the name of the source codeset poses the
2190       problem of codeset name registration, and makes the archive useless  to
2191       pax archive decoders that do not recognize that codeset.
2192
2193       Because  parts  of an archive may be corrupted, the standard developers
2194       felt that including the character map of the  source  codeset  was  too
2195       fragile.  The loss of this one key component could result in making the
2196       entire archive useless. (The difference between  this  and  the  global
2197       extended header decision was that the latter has a workaround, duplicating
2198       extended header records on unreliable media, but this would be too
2199       burdensome for large character set maps.)
2200
2201       Both  of  the  above approaches also put an undue burden on the pax ar‐
2202       chive receiver to handle the cross-product of all source  and  destina‐
2203       tion codesets.
2204
2205       To  simplify  the  translation from the source codeset to the canonical
2206       form and from the canonical form to the destination codeset, the  stan‐
2207       dard  developers  decided  that the internal representation should be a
2208       stateless encoding. A stateless encoding is one  where  each  codepoint
2209       has the same meaning, without regard to the decoder being in a specific
2210       state. An example of a stateful encoding would be the  Japanese  Shift-
2211       JIS;  an  example of a stateless encoding would be the ISO/IEC 646:1991
2212       standard (equivalent to 7-bit ASCII).
2213
2214       For these reasons, the standard developers decided to adopt a canonical
2215       format for the representation of file information strings. The obvious,
2216       well-endorsed candidate is the ISO/IEC 10646-1:2000 standard (based  in
2217       part on Unicode), which can be used to represent the characters of vir‐
2218       tually all standardized character sets. The  standard  developers  ini‐
2219       tially  agreed  upon using UCS2 (16-bit Unicode) as the internal repre‐
2220       sentation. This repertoire of characters provides a  sufficiently  rich
2221       set to represent all commonly-used codesets.
2222
       However, the standard developers found two problems with the 16-bit
       Unicode representation. It forced the issue of standardizing byte
       ordering, and the 2-byte length of each character made the extended
       header records twice as long for strings coded entirely in historical
       7-bit ASCII. For these reasons, the standard developers chose the
       UTF-8 encoding defined in the ISO/IEC 10646-1:2000 standard. This
       multi-byte representation encodes UCS2 or UCS4 characters reliably
       and deterministically, eliminating the need for a canonical byte
       ordering. In addition, NUL octets and other characters possibly
       confusing to POSIX file systems do not appear, except to represent
       themselves. It was realized that certain national codesets take up
       more space after the encoding, due to their placement within the UCS
       range; it was felt that the usefulness of the encoding of the names
       outweighs the disadvantage of size increase for file, user, and group
       names.
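
       The trade-offs described above can be illustrated with a short Python
       sketch. It uses Python's built-in codecs as a stand-in for a pax
       implementation, and both sample names are hypothetical.

```python
# Illustrative only: compare a 16-bit representation with UTF-8 for a
# hypothetical 7-bit ASCII member name.
name = "src/main.c"

utf16_be = name.encode("utf-16-be")   # big-endian 16-bit code units
utf16_le = name.encode("utf-16-le")   # little-endian 16-bit code units
utf8 = name.encode("utf-8")

# The 16-bit forms depend on byte order and double the size of ASCII text.
assert utf16_be != utf16_le
assert len(utf16_be) == 2 * len(name)

# UTF-8 has a single byte sequence, identical to ASCII for 7-bit characters,
# and contains no NUL octets for this name.
assert utf8 == name.encode("ascii")
assert b"\x00" in utf16_be and b"\x00" not in utf8

# Some national codesets grow: each katakana character below takes
# 2 octets in a 16-bit form but 3 octets in UTF-8.
kana = "\u30d5\u30a1\u30a4\u30eb"     # hypothetical katakana file name
utf8_kana = kana.encode("utf-8")
ucs2_kana = kana.encode("utf-16-be")
```

       The last two values show the size increase that the standard
       developers accepted in exchange for a byte-order-independent encoding.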
2237
2238       The encoding of UTF-8 is as follows:
2239
2240
2241              UCS4 Hex Encoding  UTF-8 Binary Encoding
2242
2243
2244              00000000-0000007F  0xxxxxxx
2245              00000080-000007FF  110xxxxx 10xxxxxx
2246              00000800-0000FFFF  1110xxxx 10xxxxxx 10xxxxxx
2247              00010000-001FFFFF  11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
2248              00200000-03FFFFFF  111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
2249              04000000-7FFFFFFF  1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
2250
2251       where each 'x' represents a bit value from the character  being  trans‐
2252       lated.
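
       The table can be read as an algorithm. The following Python sketch is
       one possible transcription of it, not a reference implementation; the
       function name ucs4_to_utf8 is hypothetical. Note that it follows the
       table above, including the 5-octet and 6-octet forms, whereas later
       definitions of UTF-8 stop at 4 octets.

```python
def ucs4_to_utf8(cp: int) -> bytes:
    """Encode one UCS4 value using the table above (1 to 6 octets)."""
    if not 0 <= cp <= 0x7FFFFFFF:
        raise ValueError("value outside the 31-bit UCS4 range")
    if cp <= 0x7F:
        return bytes([cp])                      # 0xxxxxxx
    # (upper bound of range, lead-octet marker, number of continuation octets)
    rows = [
        (0x7FF, 0xC0, 1),        # 110xxxxx
        (0xFFFF, 0xE0, 2),       # 1110xxxx
        (0x1FFFFF, 0xF0, 3),     # 11110xxx
        (0x3FFFFFF, 0xF8, 4),    # 111110xx
        (0x7FFFFFFF, 0xFC, 5),   # 1111110x
    ]
    for upper, marker, ncont in rows:
        if cp <= upper:
            break
    # The lead octet carries the most significant bits of the character.
    out = [marker | (cp >> (6 * ncont))]
    # Each continuation octet is 10xxxxxx and carries the next 6 bits.
    for i in range(ncont - 1, -1, -1):
        out.append(0x80 | ((cp >> (6 * i)) & 0x3F))
    return bytes(out)
```

       For values that Python itself can encode, the result should agree
       with chr(cp).encode("utf-8").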
2253
2254   ustar Interchange Format
2255       The description of the ustar format reflects numerous enhancements over
2256       pre-1988 versions of the historical tar  utility.  The  goal  of  these
2257       changes  was  not  only to provide the functional enhancements desired,
2258       but also to retain compatibility between new  and  old  versions.  This
2259       compatibility  has  been  retained.  Archives written using the old ar‐
2260       chive format are compatible with the new format.
2261
2262       Implementors should be aware that the  previous  file  format  did  not
2263       include  a  mechanism to archive directory type files. For this reason,
2264       the convention of using a filename ending with  slash  was  adopted  to
2265       specify a directory on the archive.
2266
       The total size of the name and prefix fields has been set to meet the
       minimum requirements for {PATH_MAX}. If a pathname will fit within the
       name field, it is recommended that the pathname be stored there without
       the use of the prefix field. Although the name field is known to be too
       small to contain {PATH_MAX} characters, its size was not changed in
       this version of the archive file format, in order to retain backwards
       compatibility; the prefix field was introduced instead. Also, because
       of the earlier version of the format, there is no way to remove the
       restriction that the linkname field is limited to the size of the name
       field.
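
       The recommendation above can be sketched in Python as follows. This
       is a minimal illustration, assuming the 100-octet name and 155-octet
       prefix fields of the ustar header; the function name split_ustar_path
       and the choice of the first workable slash are assumptions, not
       requirements of the format.

```python
NAME_SIZE = 100     # octets in the ustar name field
PREFIX_SIZE = 155   # octets in the ustar prefix field

def split_ustar_path(path: bytes) -> tuple:
    """Return (prefix, name) for a pathname, splitting at a slash if needed."""
    if len(path) <= NAME_SIZE:
        return b"", path                  # fits: leave the prefix empty
    # Try each slash as the split point; the slash itself is not stored.
    for i, octet in enumerate(path):
        if octet != ord("/"):
            continue
        prefix, name = path[:i], path[i + 1:]
        if 0 < len(prefix) <= PREFIX_SIZE and 0 < len(name) <= NAME_SIZE:
            return prefix, name
    raise ValueError("pathname cannot be represented in name and prefix")
```

       A pathname longer than 100 octets with no suitable slash cannot be
       stored in these two fields, which is why the sketch raises an error
       in that case.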
2277
2278       The size field is required  to  be  meaningful  in  all  implementation
2279       extensions,  although  it  could  be zero. This is required so that the
2280       data blocks can always be properly counted.
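
       As an illustration of why the size field must always be meaningful, a
       reader can use it to step over a member's data. The Python sketch
       below assumes the 512-octet block size of the ustar format; the
       function names are hypothetical.

```python
BLOCK_SIZE = 512  # ustar headers and data are stored in 512-octet blocks

def data_blocks(size: int) -> int:
    """Number of blocks occupied by 'size' octets of member data."""
    return (size + BLOCK_SIZE - 1) // BLOCK_SIZE

def next_header_offset(header_offset: int, size: int) -> int:
    """Offset of the header that follows a member with the given size."""
    return header_offset + BLOCK_SIZE + data_blocks(size) * BLOCK_SIZE
```

       A size of zero yields no data blocks, so the next header immediately
       follows the current one.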
2281
       If device special files that cannot be represented in the standard
       format need to be archived, it is suggested that one of the extension
       types (A-Z) be used, and that the additional information for the
       special file be represented as data and be reflected in the size
       field.
2287
2288       Attempting to restore a special file type, where  it  is  converted  to
2289       ordinary data and conflicts with an existing filename, need not be spe‐
2290       cially detected by the utility. If run as an ordinary user, pax  should
2291       not  be able to overwrite the entries in, for example, /dev in any case
2292       (whether the file is converted to another type or not).  If  run  as  a
2293       privileged user, it should be able to do so, and it would be considered
2294       a bug if it did not.  The same is true of ordinary data files and simi‐
2295       larly  named special files; it is impossible to anticipate the needs of
2296       the user (who could really intend to overwrite the file), so the behav‐
2297       ior should be predictable (and thus regular) and rely on the protection
2298       system as required.
2299
2300       The value 7 in the typeflag field is intended to define how  contiguous
2301       files  can be stored in a ustar archive.  IEEE Std 1003.1-2001 does not
2302       require the contiguous file extension, but does define a  standard  way
2303       of  archiving  such  files so that all conforming systems can interpret
2304       these file types in a meaningful and consistent  manner.  On  a  system
2305       that  does  not  support extended file types, the pax utility should do
2306       the best it can with the file and go on to the next.
2307
2308       The file protection modes are those conventionally used by the ls util‐
2309       ity.  This  is extended beyond the usage in the ISO POSIX-2 standard to
2310       support the "shared text" or "sticky" bit. It is intended that the con‐
2311       formance  document should not document anything beyond the existence of
       and support of such a mode. Further extensions to these bits are
       expected, particularly overloading of the set-user-ID and set-group-ID
       flags.
2315
2316   cpio Interchange Format
2317       The reference to appropriate privilege in the cpio format refers to  an
2318       error  on  standard  output;  the ustar format does not make comparable
2319       statements.
2320
2321       The model for this format was the historical  System  V  cpio  -c  data
2322       interchange  format.  This  model documents the portable version of the
2323       cpio format and not the binary version.   It  has  the  flexibility  to
2324       transfer data of any type described within IEEE Std 1003.1-2001, yet is
2325       extensible  to  transfer  data  types  specific  to  extensions  beyond
2326       IEEE Std 1003.1-2001   (for  example,  contiguous  files).  Because  it
2327       describes existing  practice,  there  is  no  question  of  maintaining
2328       upwards-compatibility.
2329
2330   cpio Header
2331       There  has  been  some  concern that the size of the c_ino field of the
2332       header is too small to handle those systems that have very large  inode
2333       numbers.  However,  the c_ino field in the header is used strictly as a
2334       hard-link resolution mechanism for archives. It is not necessarily  the
2335       same  value  as the inode number of the file in the location from which
2336       that file is extracted.
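
       Because c_ino only has to identify hard links within an archive, a
       writer is free to assign its own values. The Python sketch below
       shows one possible approach, assuming a six-digit octal c_ino field;
       the class InoMapper is hypothetical and is not taken from any
       existing implementation.

```python
class InoMapper:
    """Assign small archive-local c_ino values (hypothetical helper)."""

    MAX_C_INO = 0o777777  # largest value that fits in six octal digits

    def __init__(self):
        self._ids = {}

    def c_ino_for(self, st_dev: int, st_ino: int) -> int:
        """Return the same c_ino for every link to the same file."""
        key = (st_dev, st_ino)
        if key not in self._ids:
            if len(self._ids) >= self.MAX_C_INO:
                raise OverflowError("too many distinct files for c_ino")
            self._ids[key] = len(self._ids) + 1
        return self._ids[key]
```

       Two pathnames that refer to the same file receive the same c_ino, no
       matter how large the inode number is on the source system.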
2337
2338       The name c_magic is based on historical usage.
2339
2340   cpio Filename
2341       For most historical implementations of  the  cpio  utility,  {PATH_MAX}
2342       octets can be used to describe the pathname without the addition of any
2343       other header fields (the  NUL  character  would  be  included  in  this
2344       count).  {PATH_MAX}  is the minimum value for pathname size, documented
2345       as 256 bytes. However, an implementation may use c_namesize  to  deter‐
2346       mine  the exact length of the pathname. With the current description of
2347       the <cpio.h> header, this pathname size can be as  large  as  a  number
2348       that is described in six octal digits.
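
       As a worked example of the six-octal-digit limit, the following
       Python sketch decodes a c_namesize field. The function name is
       hypothetical, and the field is assumed to hold exactly six octal
       digits.

```python
def parse_c_namesize(field: bytes) -> int:
    """Decode the six-digit octal c_namesize field of a cpio header."""
    if len(field) != 6:
        raise ValueError("c_namesize must be six octets")
    return int(field.decode("ascii"), 8)

# Largest pathname size (including the terminating NUL) that fits the field.
MAX_NAMESIZE = parse_c_namesize(b"777777")
```

       The largest representable value, octal 777777, is 262143 in decimal.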
2349
2350       Two  values are documented under the c_mode field values to provide for
2351       extensibility for known file types:
2352
2353       0110 000
2354              Reserved for contiguous files. The implementation may treat  the
2355              rest  of  the  information for this archive like a regular file.
2356              If this file type is undefined, the  implementation  may  create
2357              the file as a regular file.
2358
2359
2360       This  provides  for extensibility of the cpio format while allowing for
2361       the ability to read old archives. Files of an unknown type may be  read
2362       as  "regular files" on some implementations.  On a system that does not
2363       support extended file types, the pax utility should do the best it  can
2364       with the file and go on to the next.
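
       The fallback described above might look like the following Python
       sketch. C_ISREG and C_ISCTG carry the values given for c_mode;
       TYPE_MASK and the function name are assumptions made for this
       illustration.

```python
C_ISREG = 0o100000    # regular file
C_ISCTG = 0o110000    # reserved for contiguous files
TYPE_MASK = 0o170000  # assumed mask selecting the file type bits of c_mode

def effective_type(c_mode: int, supports_contiguous: bool = False) -> int:
    """Return the file type to create for an archive member."""
    ftype = c_mode & TYPE_MASK
    if ftype == C_ISCTG and not supports_contiguous:
        return C_ISREG        # fall back to a regular file
    return ftype
```

       On a system without contiguous files, a member with mode 0110644 is
       therefore created as a regular file.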
2365

FUTURE DIRECTIONS

2367       None.
2368

SEE ALSO

       Shell Command Language, cp, ed, getopts, ls, printf(), the Base
       Definitions volume of IEEE Std 1003.1-2001, <cpio.h>, the System
       Interfaces volume of IEEE Std 1003.1-2001, chown(), creat(), mkdir(),
       mkfifo(), stat(), utime(), write()
2374

COPYRIGHT

2376       Portions of this text are reprinted and reproduced in  electronic  form
2377       from IEEE Std 1003.1, 2003 Edition, Standard for Information Technology
2378       -- Portable Operating System Interface (POSIX),  The  Open  Group  Base
2379       Specifications  Issue  6,  Copyright  (C) 2001-2003 by the Institute of
2380       Electrical and Electronics Engineers, Inc and The Open  Group.  In  the
2381       event of any discrepancy between this version and the original IEEE and
2382       The Open Group Standard, the original IEEE and The Open Group  Standard
2383       is  the  referee document. The original Standard can be obtained online
2384       at http://www.opengroup.org/unix/online.html .
2385
2386
2387
2388IEEE/The Open Group                  2003                               PAX(P)