PAX(1P)                    POSIX Programmer's Manual                   PAX(1P)

PROLOG

       This manual page is part of the POSIX Programmer's Manual. The Linux
       implementation of this interface may differ (consult the corresponding
       Linux manual page for details of Linux behavior), or the interface may
       not be implemented on Linux.

NAME

       pax - portable archive interchange

SYNOPSIS

       pax [-cdnv][-H|-L][-f archive][-s replstr]...[pattern...]

       pax -r[-cdiknuv][-H|-L][-f archive][-o options]...[-p string]...
              [-s replstr]...[pattern...]

       pax -w[-dituvX][-H|-L][-b blocksize][[-a][-f archive]][-o options]...
              [-s replstr]...[-x format][file...]

       pax -r -w[-diklntuvX][-H|-L][-p string]...[-s replstr]...
              [file...] directory

DESCRIPTION

28       The pax utility shall read, write, and write lists of  the  members  of
29       archive files and copy directory hierarchies. A variety of archive for‐
30       mats shall be supported; see the -x format option.
31
32       The action to be taken depends  on  the  presence  of  the  -r  and  -w
33       options. The four combinations of -r and -w are referred to as the four
34       modes of operation: list, read, write, and  copy  modes,  corresponding
35       respectively to the four forms shown in the SYNOPSIS section.
36
       list   In list mode (when neither -r nor -w is specified), pax shall
              write the names of the members of the archive file read from the
              standard input, with pathnames matching the specified patterns,
              to standard output. If a named file is of type directory, the
              file hierarchy rooted at that file shall be listed as well.
42
43       read   In  read  mode  (when -r is specified, but -w is not), pax shall
44              extract the members of the archive file read from  the  standard
45              input,  with  pathnames  matching  the specified patterns. If an
46              extracted file is of type directory, the file  hierarchy  rooted
47              at  that  file  shall  be extracted as well. The extracted files
48              shall be created performing pathname resolution with the  direc‐
49              tory in which pax was invoked as the current working directory.
50
51       If an attempt is made to extract a directory when the directory already
52       exists, this shall not be considered an error. If an attempt is made to
53       extract  a FIFO when the FIFO already exists, this shall not be consid‐
54       ered an error.
55
56       The ownership, access, and modification times, and  file  mode  of  the
57       restored files are discussed under the -p option.
58
59       write  In  write  mode (when -w is specified, but -r is not), pax shall
60              write the contents of the file operands to the  standard  output
61              in  an archive format. If no file operands are specified, a list
62              of files to copy, one per line, shall be read from the  standard
63              input.  A  file of type directory shall include all of the files
64              in the file hierarchy rooted at the file.
65
66       copy   In copy mode (when both -r and -w are specified), pax shall copy
67              the file operands to the destination directory.
68
69       If  no  file  operands  are specified, a list of files to copy, one per
70       line, shall be read from the standard input. A file of  type  directory
71       shall  include  all  of  the  files in the file hierarchy rooted at the
72       file.
73
74       The effect of the copy shall be as if the copied files were written  to
75       an  archive file and then subsequently extracted, except that there may
76       be hard links between the original and the copied files. If the  desti‐
77       nation  directory  is  a subdirectory of one of the files to be copied,
78       the results are unspecified. If the destination directory is a file  of
79       a   type   not   defined   by   the   System   Interfaces   volume   of
80       IEEE Std 1003.1-2001, the results  are  implementation-defined;  other‐
81       wise,  it shall be an error for the file named by the directory operand
82       not to exist, not be writable by the user, or not be  a  file  of  type
83       directory.
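
       As a rough illustration of the four modes described above, the
       following sketch exercises each one in a scratch directory. The
       function demo_modes and the names src, out, copy, and demo.pax are
       hypothetical, and mktemp is assumed to be available although this
       volume does not require it. The commands are skipped when no pax
       utility is installed.

```shell
#!/bin/sh
# demo_modes DIR: exercise write, list, read, and copy modes under DIR.
# All names below (src, out, copy, demo.pax) are made up for this sketch.
demo_modes() {
    work=$1
    mkdir -p "$work/src/sub" "$work/out" "$work/copy" || return 1
    echo hello > "$work/src/sub/a.txt" || return 1
    (
        cd "$work" || exit 1
        pax -w -f demo.pax src &&              # write mode: archive src
        pax -f demo.pax &&                     # list mode: names to stdout
        (cd out && pax -r -f ../demo.pax) &&   # read mode: extract under out
        pax -r -w src copy                     # copy mode: no archive file
    )
}

if command -v pax >/dev/null 2>&1; then
    scratch=$(mktemp -d) && demo_modes "$scratch" && rm -rf "$scratch" ||
        echo "demonstration failed" >&2
else
    echo "pax not installed; skipping demonstration" >&2
fi
```

       Because -f is given in the first three commands, the archive is a
       named file rather than standard input or standard output.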
84
85
86       In  read  or  copy  modes, if intermediate directories are necessary to
87       extract an archive member, pax shall perform actions equivalent to  the
88       mkdir()   function   defined   in   the  System  Interfaces  volume  of
89       IEEE Std 1003.1-2001, called with the following arguments:
90
91        * The intermediate directory used as the path argument
92
93        * The value of the  bitwise-inclusive  OR  of  S_IRWXU,  S_IRWXG,  and
94          S_IRWXO as the mode argument
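
       The mode argument above works out to octal 0777. Since the action is
       equivalent to mkdir(), the bits set in the file mode creation mask of
       the process are cleared from it. The sketch below shows that
       arithmetic; intermediate_dir_mode is a hypothetical helper, not part
       of pax, and it assumes the usual octal values 0700, 0070, and 0007
       for the three constants.

```shell
#!/bin/sh
# intermediate_dir_mode UMASK: print the permission bits an intermediate
# directory would receive, given an octal umask such as 022.
# Hypothetical helper; S_IRWXU, S_IRWXG, S_IRWXO are 0700, 0070, 0007.
intermediate_dir_mode() {
    requested=$(( 0700 | 0070 | 0007 ))        # 0777, the mode argument
    printf '%04o\n' $(( requested & ~0$1 ))    # mkdir() clears umask bits
}

intermediate_dir_mode 022    # prints 0755
intermediate_dir_mode 077    # prints 0700
```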
95
96       If  any  specified pattern or file operands are not matched by at least
97       one file or archive member, pax shall write  a  diagnostic  message  to
98       standard error for each one that did not match and exit with a non-zero
99       exit status.
100
101       The archive formats described in the EXTENDED DESCRIPTION section shall
102       be  automatically  detected on input. The default output archive format
103       shall be implementation-defined.
104
105       A single archive can span multiple files. The pax utility shall  deter‐
106       mine,  in  an implementation-defined manner, what file to read or write
107       as the next file.
108
109       If the selected archive format supports  the  specification  of  linked
110       files,  it  shall  be an error if these files cannot be linked when the
111       archive is extracted. For archive formats that do not store  file  con‐
112       tents with each name that causes a hard link, if the file that contains
113       the data is not extracted during this  pax  session,  either  the  data
114       shall be restored from the original file, or a diagnostic message shall
115       be displayed with the name of a file that can be used  to  extract  the
116       data.  In traversing directories, pax shall detect infinite loops; that
117       is, entering a previously visited directory that is an ancestor of  the
118       last  file visited. When it detects an infinite loop, pax shall write a
119       diagnostic message to standard error and shall terminate.
120

OPTIONS

122       The pax utility  shall  conform  to  the  Base  Definitions  volume  of
123       IEEE Std 1003.1-2001,  Section  12.2, Utility Syntax Guidelines, except
124       that the order of presentation of the -o, -p, and -s options is signif‐
125       icant.
126
127       The following options shall be supported:
128
129       -r     Read an archive file from standard input.
130
131       -w     Write files to the standard output in the specified archive for‐
132              mat.
133
134       -a     Append files to the end of the archive.  It  is  implementation-
135              defined  which  devices  on  the system support appending. Addi‐
136              tional   file   formats   unspecified   by   this   volume    of
137              IEEE Std 1003.1-2001 may impose restrictions on appending.
138
139       -b  blocksize
140              Block  the  output at a positive decimal integer number of bytes
141              per write to the archive file. Devices and archive  formats  may
142              impose restrictions on blocking. Blocking shall be automatically
143              determined on input. Conforming applications shall not specify a
144              blocksize  value larger than 32256. Default blocking when creat‐
145              ing archives depends on the archive format. (See the  -x  option
146              below.)
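
       A script that accepts a block size from its caller may want to check
       it before handing it to pax -b. The following sketch accepts only
       positive decimal integers no larger than 32256. The additional
       multiple-of-512 test is an assumption borrowed from the -x format
       descriptions, which only guarantee support for such values; the
       function name valid_blocksize is hypothetical.

```shell
#!/bin/sh
# valid_blocksize N: succeed if N is a block size a portable script can
# pass to "pax -b". Hypothetical helper, not part of pax itself.
valid_blocksize() {
    case $1 in
        ''|0*|*[!0-9]*) return 1 ;;    # empty, zero/leading zero, non-digit
    esac
    [ "$1" -le 32256 ] && [ $(( $1 % 512 )) -eq 0 ]
}

for size in 5120 10240 32256 32768 1000; do
    if valid_blocksize "$size"; then
        echo "$size: portable"
    else
        echo "$size: not portable"
    fi
done
```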
147
148       -c     Match  all file or archive members except those specified by the
149              pattern or file operands.
150
151       -d     Cause files of type directory being copied or  archived  or  ar‐
152              chive  members  of  type  directory being extracted or listed to
153              match only the file or archive member itself and  not  the  file
154              hierarchy rooted at the file.
155
156       -f  archive
              Specify the pathname of the input or output archive, overriding
              the default standard input (in list or read modes) or standard
              output (write mode).
160
161       -H     If a symbolic link referencing a file of type directory is spec‐
162              ified on the command line, pax shall archive the file  hierarchy
163              rooted in the file referenced by the link, using the name of the
164              link as the root of the file hierarchy. Otherwise, if a symbolic
165              link  referencing  a  file  of any other file type which pax can
166              normally archive is specified on  the  command  line,  then  pax
167              shall archive the file referenced by the link, using the name of
168              the link. The default behavior shall be to archive the  symbolic
169              link itself.
170
171       -i     Interactively  rename files or archive members. For each archive
172              member matching a pattern operand or file matching a file  oper‐
173              and, a prompt shall be written to the file /dev/tty.  The prompt
174              shall contain the name of the file or archive  member,  but  the
175              format  is otherwise unspecified. A line shall then be read from
176              /dev/tty. If this line is blank,  the  file  or  archive  member
177              shall  be skipped. If this line consists of a single period, the
178              file or archive member shall be processed with  no  modification
179              to its name. Otherwise, its name shall be replaced with the con‐
180              tents of the line. The pax utility shall immediately exit with a
181              non-zero  exit status if end-of-file is encountered when reading
182              a response or if /dev/tty cannot be opened for reading and writ‐
183              ing.
184
185       The  results  of extracting a hard link to a file that has been renamed
186       during extraction are unspecified.
187
188       -k     Prevent the overwriting of existing files.
189
190       -l     (The letter ell.) In copy mode, hard links shall be made between
191              the  source  and destination file hierarchies whenever possible.
192              If specified in conjunction with -H or -L, when a symbolic  link
193              is  encountered,  the  hard link created in the destination file
194              hierarchy shall be to the file referenced by the symbolic  link.
195              If  specified  when  neither -H nor -L is specified, when a sym‐
196              bolic link is encountered, the  implementation  shall  create  a
197              hard  link  to the symbolic link in the source file hierarchy or
198              copy the symbolic link to the destination.
199
200       -L     If a symbolic link referencing a file of type directory is spec‐
201              ified on the command line or encountered during the traversal of
202              a file hierarchy, pax shall archive the file hierarchy rooted in
203              the  file  referenced by the link, using the name of the link as
204              the root of the file hierarchy. Otherwise, if  a  symbolic  link
205              referencing a file of any other file type which pax can normally
206              archive is specified on the command line or  encountered  during
207              the  traversal  of  a file hierarchy, pax shall archive the file
208              referenced by the link, using the name of the link. The  default
209              behavior shall be to archive the symbolic link itself.
210
211       -n     Select  the first archive member that matches each pattern oper‐
212              and.  No more than one archive member shall be matched for  each
213              pattern  (although  members  of type directory shall still match
214              the file hierarchy rooted at that file).
215
216       -o  options
217              Provide information to the implementation to  modify  the  algo‐
218              rithm  for  extracting  or  writing  files. The value of options
219              shall consist of one or more  comma-separated  keywords  of  the
220              form:
221
222
223              keyword[[:]=value][,keyword[[:]=value], ...]
224
225       Some  keywords  apply  only  to certain file formats, as indicated with
226       each description. Use of keywords that are  inapplicable  to  the  file
227       format being processed produces undefined results.
228
229       Keywords  in  the  options  argument  shall be a string that would be a
230       valid portable filename as described in the Base Definitions volume  of
231       IEEE Std 1003.1-2001, Section 3.276, Portable Filename Character Set.
232
233       Note:
234              Keywords  are not expected to be filenames, merely to follow the
235              same character composition rules as portable filenames.
236
237
238       Keywords can be preceded with white space. The value field  shall  con‐
239       sist  of  zero  or more characters; within value, the application shall
240       precede any literal comma with a backslash, which shall be ignored, but
241       preserves  the  comma as part of value. A comma as the final character,
242       or a comma followed solely by white space as the final  characters,  in
243       options shall be ignored. Multiple -o options can be specified; if key‐
244       words given to these multiple -o options  conflict,  the  keywords  and
245       values  appearing  later in command line sequence shall take precedence
246       and the earlier shall be silently ignored. The following keyword values
247       of options shall be supported for the file formats as indicated:
248
249       delete=pattern
250
251              (Applicable  only  to  the -x pax format.) When used in write or
252              copy mode, pax shall omit from extended header records  that  it
253              produces  any keywords matching the string pattern. When used in
254              read or list mode, pax shall ignore any  keywords  matching  the
255              string  pattern  in  the extended header records. In both cases,
              matching shall be performed using the pattern matching notation
              described in Patterns Matching a Single Character and Patterns
              Matching Multiple Characters. For example:
259
260
261                     -o delete=security.*
262
263              would suppress security-related information.  See  pax  Extended
264              Header for extended header record keyword usage.
265
266       exthdr.name=string
267
              (Applicable only to the -x pax format.) This keyword allows user
              control over the name that is written into the ustar header
              blocks for the extended header produced under the circumstances
              described in pax Header Block. The name shall be the contents
              of string, after the following character substitutions have been
              made:
274
275                    string
276                    Includes:   Replaced By:
277                    %d          The directory name of the file, equiva‐
278                                lent to the result of the dirname util‐
279                                ity on the translated pathname.
280                    %f          The filename of the file, equivalent to
281                                the result of the basename utility on
282                                the translated pathname.
283                    %p          The process ID of the pax process.
284                    %%          A '%' character.
285
286              Any other '%' characters in string produce undefined results.
287
              If no -o exthdr.name=string is specified, pax shall use the
              following default value:
290
291
292                     %d/PaxHeaders.%p/%f
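
              The substitutions for exthdr.name can be modeled in a few
              lines of shell. This is only a sketch of the table above:
              expand_exthdr_name is a hypothetical function, the process
              ID is passed in by the caller, and a '%' followed by any
              other character (undefined by this volume) is simply copied
              through.

```shell
#!/bin/sh
# expand_exthdr_name TEMPLATE PATHNAME PID
# Apply the %d, %f, %p, and %% substitutions to TEMPLATE.
# Hypothetical helper; pax performs this expansion internally.
expand_exthdr_name() {
    template=$1 path=$2 pid=$3
    out=
    while [ -n "$template" ]; do
        case $template in
            %d*) out=$out$(dirname "$path");  template=${template#"%d"} ;;
            %f*) out=$out$(basename "$path"); template=${template#"%f"} ;;
            %p*) out=$out$pid;                template=${template#"%p"} ;;
            %%*) out=$out%;                   template=${template#"%%"} ;;
            *)   rest=${template#?}            # copy one ordinary character
                 out=$out${template%"$rest"}
                 template=$rest ;;
        esac
    done
    printf '%s\n' "$out"
}

# The default template applied to a sample pathname and process ID:
expand_exthdr_name '%d/PaxHeaders.%p/%f' /usr/lib/foo.c 1234
# prints /usr/lib/PaxHeaders.1234/foo.c
```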
293
294       globexthdr.name=string
295
296              (Applicable  only  to  the -x pax format.) When used in write or
297              copy mode with the appropriate options, pax shall create  global
298              extended  header  records  with ustar header blocks that will be
299              treated as regular files by previous versions of pax. This  key‐
300              word  allows user control over the name that is written into the
301              ustar header blocks for global extended header records. The name
302              shall  be  the contents of string, after the following character
303              substitutions have been made:
304
305                    string
306                    Includes:   Replaced By:
307                    %n          An integer that represents the sequence
308                                number of the global extended header
309                                record in the archive, starting at 1.
310                    %p          The process ID of the pax process.
311                    %%          A '%' character.
312
313              Any other '%' characters in string produce undefined results.
314
              If no -o globexthdr.name=string is specified, pax shall use the
              following default value:
317
318
319                     $TMPDIR/GlobalHead.%p.%n
320
              where $TMPDIR represents the value of the TMPDIR environment
              variable. If TMPDIR is not set, pax shall use /tmp.
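
              The TMPDIR fallback maps directly onto the shell's
              "${TMPDIR-/tmp}" expansion, which substitutes /tmp only when
              the variable is unset. A minimal sketch, with the
              hypothetical function globexthdr_default taking the process
              ID and sequence number as arguments:

```shell
#!/bin/sh
# globexthdr_default PID N: print the default global extended header name.
# Hypothetical helper; PID and N are supplied by the caller.
globexthdr_default() {
    printf '%s/GlobalHead.%s.%s\n' "${TMPDIR-/tmp}" "$1" "$2"
}

globexthdr_default "$$" 1
```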
323
324       invalid=action
325
326              (Applicable only to the -x pax format.) This keyword allows user
327              control over the action pax takes upon encountering values in an
328              extended header record that, in read or copy mode,  are  invalid
329              in the destination hierarchy or, in list mode, cannot be written
330              in the codeset and current locale  of  the  implementation.  The
331              following are invalid values that shall be recognized by pax:
332
333                      * In  read  or  copy  mode, a filename or link name that
334                        contains character encodings invalid in  the  destina‐
335                        tion  hierarchy.  (For  example,  the name may contain
336                        embedded NULs.)
337
338                      * In read or copy mode, a filename or link name that  is
339                        longer  than  the  maximum  allowed in the destination
340                        hierarchy (for either  a  pathname  component  or  the
341                        entire pathname).
342
343                      * In  list  mode,  any character string value (filename,
344                        link name, user name, and so on) that cannot be  writ‐
345                        ten in the codeset and current locale of the implemen‐
346                        tation.
347
348              The following mutually-exclusive values of the  action  argument
349              are supported:
350
351              bypass
352                     In  read or copy mode, pax shall bypass the file, causing
353                     no change to the destination hierarchy. In list mode, pax
354                     shall  write all requested valid values for the file, but
355                     its method for writing invalid values is unspecified.
356
357              rename
358                     In read or copy mode, pax shall act as if the  -i  option
359                     were  in  effect  for  each file with invalid filename or
360                     link name values, allowing the user to provide a replace‐
361                     ment  name  interactively. In list mode, pax shall behave
362                     identically to the bypass action.
363
364              UTF-8
365                     When used in read, copy, or list  mode  and  a  filename,
366                     link  name, owner name, or any other field in an extended
367                     header record cannot be translated  from  the  pax  UTF-8
368                     codeset  format  to the codeset and current locale of the
369                     implementation, pax shall use the actual  UTF-8  encoding
370                     for the name.
371
372              write
373                     In read or copy mode, pax shall write the file, translat‐
374                     ing or truncating the name, regardless  of  whether  this
375                     may overwrite an existing file with a valid name. In list
376                     mode, pax shall behave identically to the bypass action.
377
378
              If no -o invalid= option is specified, pax shall act as if -o
              invalid=bypass were specified. Any overwriting of existing
              files that may be allowed by the -o invalid= actions shall be
              subject to permission (-p) and modification time (-u)
              restrictions, and shall be suppressed if the -k option is also
              specified.
385
386       linkdata
387
388              (Applicable only to the -x pax format.) In write mode, pax shall
389              write the contents of a file to the archive even when that  file
390              is merely a hard link to a file whose contents have already been
391              written to the archive.
392
393       listopt=format
394
              This keyword specifies the output format of the table of
              contents produced when the -v option is specified in list mode.
              See List Mode Format Specifications. To avoid ambiguity, the
              listopt=format shall be the only or final keyword=value pair
              in a -o option-argument; all characters in the remainder of the
              option-argument shall be considered part of the format string.
              When multiple -o listopt=format options are specified, the
              format strings shall be considered a single, concatenated
              string, evaluated in command line order.
404
405       times
406
              (Applicable only to the -x pax format.) When used in write or
              copy mode, pax shall include atime, ctime, and mtime extended
              header records for each file. See pax Extended Header File
              Times.
411
412
413       In  addition  to these keywords, if the -x pax format is specified, any
414       of the keywords and values defined in pax  Extended  Header,  including
415       implementation  extensions,  can  be  used  in  -o option-arguments, in
416       either of two modes:
417
418       keyword=value
419
420              When used in write or copy mode, these keyword/value pairs shall
421              be included at the beginning of the archive as typeflag g global
422              extended header records. When used in read or list  mode,  these
423              keyword/value  pairs shall act as if they had been at the begin‐
424              ning of  the  archive  as  typeflag  g  global  extended  header
425              records.
426
427       keyword:=value
428
429              When used in write or copy mode, these keyword/value pairs shall
430              be included as records at the beginning of a typeflag x extended
431              header  for  each  file. (This shall be equivalent to the equal-
432              sign form except that it creates no typeflag g  global  extended
433              header  records.)  When  used  in  read or list mode, these key‐
434              word/value pairs shall act as if they were included  as  records
435              at  the  end  of each extended header; thus, they shall override
436              any global or file-specific extended header record  keywords  of
437              the same names. For example, in the command:
438
439
440                     pax -r -o "
441                     gname:=mygroup,
442                     " <archive
443
444              the  group name will be forced to a new value for all files read
445              from the archive.
446
447
       The precedence of -o keywords over various fields in the archive is
       described in pax Extended Header Keyword Precedence.
450
451       -p  string
              Specify one or more file characteristic options (privileges).
              The string option-argument shall be a string specifying file
              characteristics to be retained or discarded on extraction. The
              string shall consist of the specification characters a, e, m, o,
              and p. Other implementation-defined characters can be included.
              Multiple characteristics can be concatenated within the same
              string and multiple -p options can be specified. The meanings
              of the specification characters are as follows:
460
461       a
462              Do not preserve file access times.
463
464       e
465              Preserve the user ID, group ID, file mode  bits  (see  the  Base
466              Definitions  volume of IEEE Std 1003.1-2001, Section 3.168, File
467              Mode Bits), access time, modification time, and any other imple‐
468              mentation-defined file characteristics.
469
470       m
471              Do not preserve file modification times.
472
473       o
474              Preserve the user ID and group ID.
475
476       p
477              Preserve  the  file mode bits. Other implementation-defined file
478              mode attributes may be preserved.
479
480
       In the preceding list, "preserve" indicates that an attribute stored in
       the archive shall be given to the extracted file, subject to the
       permissions of the invoking process. The access and modification times
       of the file shall be preserved unless otherwise specified with the -p
       option or not stored in the archive. All attributes that are not
       preserved shall be determined as part of the normal file creation
       action (see File Read, Write, and Creation).
488
489       If neither the e nor the o specification character is specified, or the
490       user  ID  and  group ID are not preserved for any reason, pax shall not
491       set the S_ISUID and S_ISGID bits of the file mode.
492
493       If the preservation of any of these items fails  for  any  reason,  pax
494       shall  write  a  diagnostic message to standard error.  Failure to pre‐
495       serve these items shall affect the final exit  status,  but  shall  not
496       cause the extracted file to be deleted.
497
498       If  file  characteristic  letters in any of the string option-arguments
499       are duplicated or conflict with each other, the ones given  last  shall
500       take precedence. For example, if -p eme is specified, file modification
501       times are preserved.
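
       The "last one wins" rule can be modeled by scanning the
       specification characters left to right. The sketch below is a
       simplified model, not the pax implementation: resolve_p is a
       hypothetical function, it starts from the defaults stated above
       (times preserved, ownership and mode not preserved), and it ignores
       implementation-defined characters. Multiple -p option-arguments can
       be modeled by concatenating them in command line order.

```shell
#!/bin/sh
# resolve_p STRING: report which characteristics end up preserved.
# Hypothetical helper modeling the a, e, m, o, p specification characters.
resolve_p() {
    spec=$1 atime=yes mtime=yes owner=no mode=no
    while [ -n "$spec" ]; do
        rest=${spec#?}
        c=${spec%"$rest"}            # first remaining character
        spec=$rest
        case $c in
            a) atime=no ;;
            m) mtime=no ;;
            o) owner=yes ;;
            p) mode=yes ;;
            e) atime=yes mtime=yes owner=yes mode=yes ;;
        esac
    done
    printf 'atime=%s mtime=%s owner=%s mode=%s\n' \
        "$atime" "$mtime" "$owner" "$mode"
}

resolve_p eme    # the trailing e re-enables modification times
resolve_p em     # here m is last, so modification times are dropped
```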
502
503       -s  replstr
504              Modify file or archive member names named by pattern or file op‐
505              erands  according  to the substitution expression replstr, using
506              the syntax of the ed utility.  The  concepts  of  "address"  and
507              "line"  are  meaningless  in the context of the pax utility, and
508              shall not be supplied. The format shall be:
509
510
511              -s /old/new/[gp]
512
513       where as in ed, old is a basic regular expression and new  can  contain
514       an  ampersand,  '\n' (where n is a digit) backreferences, or subexpres‐
515       sion matching. The old string shall also be permitted to contain  <new‐
516       line>s.
517
       Any non-null character can be used as a delimiter ('/' shown here).
       Multiple -s expressions can be specified; the expressions shall be
       applied in the order specified, terminating with the first successful
       substitution. The optional trailing 'g' is as defined in the ed
       utility. The optional trailing 'p' shall cause successful substitutions
       to be written to standard error. File or archive member names that
       substitute to the empty string shall be ignored when reading and
       writing archives.
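
       The "first successful substitution" rule can be approximated with
       sed, whose s command shares its basic regular expression syntax with
       ed. In the sketch below, apply_replstr is a hypothetical function
       that tries each replstr in order and stops at the first one that
       matches. It assumes the expressions carry no trailing 'p' flag and
       that names contain no <newline>s, and it makes no claim about how a
       particular pax implements -s.

```shell
#!/bin/sh
# apply_replstr NAME REPLSTR...: print NAME after the first REPLSTR that
# matches, or unchanged if none matches. Hypothetical helper using sed.
apply_replstr() {
    name=$1
    shift
    for expr in "$@"; do
        # "t hit" branches only when the substitution succeeded; the
        # match is then marked with a leading x so that an empty result
        # can be told apart from "no match".
        out=$(printf '%s\n' "$name" |
              sed -n -e "s$expr" -e 't hit' -e 'b' -e ':hit' -e 's/^/x/p')
        if [ -n "$out" ]; then
            printf '%s\n' "${out#x}"
            return 0
        fi
    done
    printf '%s\n' "$name"
}

# Only the first expression applies to src/main.c; the second is skipped.
apply_replstr src/main.c ',^src/,lib/,' ',\.c$,.o,'
```

       The roughly corresponding pax invocation would pass the same
       expressions as -s ',^src/,lib/,' -s ',\.c$,.o,'.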
526
527       -t     When reading files from the file system, and if the user has the
528              permissions required by utime() to do so, set the access time of
529              each file read to the access time that it had before being  read
530              by pax.
531
532       -u     Ignore files that are older (having a less recent file modifica‐
533              tion time) than a pre-existing file or archive member  with  the
534              same name. In read mode, an archive member with the same name as
535              a file in the file system shall be extracted if the archive mem‐
536              ber  is newer than the file. In write mode, an archive file mem‐
537              ber with the same name as a file in the  file  system  shall  be
538              superseded  if  the file is newer than the archive member. If -a
539              is also specified, this is accomplished by appending to the  ar‐
540              chive; otherwise, it is unspecified whether this is accomplished
541              by actual replacement in the archive or by appending to the  ar‐
542              chive. In copy mode, the file in the destination hierarchy shall
543              be replaced by the file in the source hierarchy or by a link  to
544              the file in the source hierarchy if the file in the source hier‐
545              archy is newer.
546
547       -v     In list mode, produce a verbose table of contents (see the  STD‐
548              OUT section). Otherwise, write archive member pathnames to stan‐
549              dard error (see the STDERR section).
550
551       -x  format
552              Specify the output archive format. The pax utility shall support
553              the following formats:
554
555       cpio
556              The  cpio  interchange format; see the EXTENDED DESCRIPTION sec‐
557              tion.  The default blocksize for this format for character  spe‐
558              cial  archive files shall be 5120. Implementations shall support
559              all blocksize values less than or equal to 32256 that are multi‐
560              ples of 512.
561
562       pax
563              The  pax  interchange  format; see the EXTENDED DESCRIPTION sec‐
564              tion.  The default blocksize for this format for character  spe‐
565              cial  archive files shall be 5120. Implementations shall support
566              all blocksize values less than or equal to 32256 that are multi‐
567              ples of 512.
568
569       ustar
570              The  tar  interchange  format; see the EXTENDED DESCRIPTION sec‐
571              tion.  The default blocksize for this format for character  spe‐
572              cial archive files shall be 10240. Implementations shall support
573              all blocksize values less than or equal to 32256 that are multi‐
574              ples of 512.
575
576
577       Implementation-defined  formats  shall  specify a default block size as
578       well as any other block sizes supported for character  special  archive
579       files.
580
581       Any attempt to append to an archive file in a format different from the
582       existing archive format shall cause pax to exit immediately with a non-
583       zero exit status.
584
585       In  copy  mode, if no -x format is specified, pax shall behave as if -x
586       pax were specified.
587
588       -X     When traversing the file hierarchy specified by a pathname,  pax
589              shall  not descend into directories that have a different device
590              ID (st_dev; see the System Interfaces volume of
591              IEEE Std 1003.1-2001, stat()).
592
593
594       The options that operate on the names of files or archive members (-c,
595       -i, -n, -s, -u, and -v) shall interact as follows. In  read  mode,  the
596       archive  members  shall be selected based on the user-specified pattern
597       operands as modified by the -c, -n, and -u options. Then, any -s and -i
598       options  shall  modify, in that order, the names of the selected files.
599       The -v option shall write names resulting from these modifications.
600
601       In write mode, the files shall be selected based on the  user-specified
602       pathnames  as  modified  by  the -n and -u options. Then, any -s and -i
603       options shall modify, in that order, the names of these selected files.
604       The -v option shall write names resulting from these modifications.
605
606       If  both  the -u and -n options are specified, pax shall not consider a
607       file selected unless it is newer than the file to which it is compared.
608
609   List Mode Format Specifications
610       In list mode with the -o listopt=format option, the format argument
611       shall be applied for each selected file. The pax utility shall append a
612       <newline> to the listopt output for  each  selected  file.  The  format
613       argument shall be used as the format string described in the Base Defi‐
614       nitions volume of IEEE Std 1003.1-2001, Chapter 5,  File  Format  Nota‐
615       tion,  with  the  exceptions  1.   through  5.  defined in the EXTENDED
616       DESCRIPTION section of printf, plus the following exceptions:
617
618       6.     The sequence (keyword) can occur before a format conversion
619              specifier.  The  conversion  argument is defined by the value of
620              keyword. The implementation shall  support  the  following  key‐
621              words:
622
623               * Any  of  the  Field  Name  entries  in ustar Header Block and
624                 Octet-Oriented cpio Archive Entry .  The  implementation  may
625                 support  the cpio keywords without the leading c_ in addition
626                 to the form required by Values for cpio c_mode Field .
627
628               * Any keyword defined for the extended header in  pax  Extended
629                 Header .
630
631               * Any  keyword  provided as an implementation-defined extension
632                 within the extended header defined in pax Extended Header .
633
634       For example, the sequence "%(charset)s" is the string value of the name
635       of the character set in the extended header.
636
637       The  result  of the keyword conversion argument shall be the value from
638       the applicable header field or extended header,  without  any  trailing
639       NULs.
640
641       All  keyword  values  used  as conversion arguments shall be translated
642       from the UTF-8 encoding to the character set appropriate for the  local
643       file system, user database, and so on, as applicable.
644
645       7.     An  additional  conversion specifier character, T, shall be used
646              to specify time formats. The T  conversion  specifier  character
647              can be preceded by the sequence (keyword=subformat), where
648              subformat is a date format as  defined  by  date  operands.  The
649              default  keyword  shall be mtime and the default subformat shall
650              be:
651
652
653              %b %e %H:%M %Y
654
655       8.     An additional conversion specifier character, M, shall  be  used
656              to  specify  the file mode string as defined in ls Standard Out‐
657              put. If (keyword) is omitted, the mode keyword shall be used.
658              For  example,  %.1M writes the single character corresponding to
659              the <entry type> field of the ls -l command.
660
661       9.     An additional conversion specifier character, D, shall  be  used
662              to specify the device for block or special files, if applicable,
663              in an implementation-defined format. If not applicable, and
664              (keyword) is specified, then this conversion shall be equivalent
665              to %(keyword)u. If not applicable, and (keyword) is omitted,
666              then this conversion shall be equivalent to <space>.
667
668       10.    An  additional  conversion specifier character, F, shall be used
669              to specify a pathname. The F conversion character  can  be  pre‐
670              ceded by a sequence of comma-separated keywords:
671
672
673              (keyword[,keyword] ... )
674
675       The values for all the keywords that are non-null shall be concatenated
676       together, each separated by a '/'. The default shall be (path) if the
677       keyword path is defined; otherwise, the default shall be (prefix,
678       name).
679
680       11.    An additional conversion specifier character, L, shall  be  used
681              to specify a symbolic link expansion. If the current file is a
682              symbolic link, then %L shall expand to:
683
684
685              "%s -> %s", <value of keyword>, <contents of link>
686
687       Otherwise, the %L conversion specification shall be the  equivalent  of
688       %F .
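The %F and %L conversions described in items 10 and 11 can be sketched as follows. This is a minimal Python illustration, not the behavior of any particular pax implementation. The function names and the dictionary representation of a header are hypothetical, and the sketch assumes that the <value of keyword> in %L is formed in the same way as for %F.

```python
def format_F(header, keywords=None):
    """Sketch of %F: join the non-null keyword values with '/'."""
    if keywords is None:
        # Default is (path) if path is defined, otherwise (prefix,name).
        keywords = ["path"] if "path" in header else ["prefix", "name"]
    return "/".join(header[k] for k in keywords if header.get(k))


def format_L(header, link_contents=None, keywords=None):
    """Sketch of %L: 'name -> contents' for a symbolic link, else same as %F."""
    name = format_F(header, keywords)
    if link_contents is not None:
        # The current file is a symbolic link.
        return "%s -> %s" % (name, link_contents)
    return name
```

For example, a header with prefix "usr/share" and name "doc.txt" and no path keyword yields "usr/share/doc.txt" for %F.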
689
690

OPERANDS

692       The following operands shall be supported:
693
694       directory
695              The destination directory pathname for copy mode.
696
697       file   A pathname of a file to be copied or archived.
698
699       pattern
700              A  pattern  matching one or more pathnames of archive members. A
701              pattern must be given in the  name-generating  notation  of  the
702              pattern  matching notation in Pattern Matching Notation, includ‐
703              ing the filename expansion rules in Patterns Used  for  Filename
704              Expansion  .  The  default,  if  no  pattern is specified, is to
705              select all members in the archive.
706
707

STDIN

709       In write mode, the standard input shall be used only if no  file  oper‐
710       ands  are specified. It shall be a text file containing a list of path‐
711       names, one per line, without leading or trailing <blank>s.
712
713       In list and read modes, if -f is  not  specified,  the  standard  input
714       shall be an archive file.
715
716       Otherwise, the standard input shall not be used.
717

INPUT FILES

719       The  input file named by the archive option-argument, or standard input
720       when the archive is read from there, shall be a file formatted  accord‐
721       ing to one of the specifications in the EXTENDED DESCRIPTION section or
722       some other implementation-defined format.
723
724       The file /dev/tty shall be used to write prompts and read responses.
725

ENVIRONMENT VARIABLES

727       The following environment variables shall affect the execution of pax:
728
729       LANG   Provide a default value for the  internationalization  variables
730              that  are  unset  or  null.  (See the Base Definitions volume of
731              IEEE Std 1003.1-2001, Section  8.2,  Internationalization  Vari‐
732              ables  for the precedence of internationalization variables used
733              to determine the values of locale categories.)
734
735       LC_ALL If set to a non-empty string value, override the values  of  all
736              the other internationalization variables.
737
738       LC_COLLATE
739
740              Determine  the  locale  for  the behavior of ranges, equivalence
741              classes, and multi-character collating elements used in the pat‐
742              tern  matching  expressions  for  the pattern operand, the basic
743              regular expression for the -s option, and the  extended  regular
744              expression defined for the yesexpr locale keyword in the LC_MES‐
745              SAGES category.
746
747       LC_CTYPE
748              Determine the locale for  the  interpretation  of  sequences  of
749              bytes  of  text  data as characters (for example, single-byte as
750              opposed to multi-byte characters in arguments and input  files),
751              the  behavior  of character classes used in the extended regular
752              expression defined for the yesexpr locale keyword in the LC_MES‐
753              SAGES category, and pattern matching.
754
755       LC_MESSAGES
756              Determine the locale used to process affirmative responses and
757              to affect the format and contents of diagnostic messages
758              written to standard error.
759
760       LC_TIME
761              Determine  the format and contents of date and time strings when
762              the -v option is specified.
763
764       NLSPATH
765              Determine the location of message catalogs for the processing of
766              LC_MESSAGES .
767
768       TMPDIR Determine  the pathname that provides part of the default global
769              extended header record file, as described for the -o globexthdr=
770              keyword in the OPTIONS section.
771
772       TZ     Determine  the  timezone used to calculate date and time strings
773              when the -v option is specified. If TZ  is  unset  or  null,  an
774              unspecified default timezone shall be used.
775
776

ASYNCHRONOUS EVENTS

778       Default.
779

STDOUT

781       In write mode, if -f is not specified, the standard output shall be the
782       archive formatted  according  to  one  of  the  specifications  in  the
783       EXTENDED DESCRIPTION section, or some other implementation-defined for‐
784       mat (see -x format).
785
786       In list mode, when -o listopt=format has been specified, the
787       selected archive members shall be written to standard output using the
788       format described under List Mode Format Specifications. In list mode
789       without the -o listopt=format option, the table of contents of the
790       selected archive members shall be written to standard output using  the
791       following format:
792
793
794              "%s\n", <pathname>
795
796       If  the  -v  option is specified in list mode, the table of contents of
797       the selected archive members shall be written to standard output  using
798       the following formats.
799
800       For  pathnames  representing  hard links to previous members of the ar‐
801       chive:
802
803
804              "%s == %s\n", <ls -l listing>, <linkname>
805
806       For all other pathnames:
807
808
809              "%s\n", <ls -l listing>
810
811       where <ls -l listing> shall be the format specified by the ls utility
812       with  the  -l  option.  When  writing  pathnames  in this format, it is
813       unspecified what is written for fields for which the underlying archive
814       format does not have the correct information, although the correct num‐
815       ber of <blank>-separated fields shall be written.
816
817       In list mode, standard output shall not be buffered more than a line at
818       a time.
819

STDERR

821       If  -v  is specified in read, write, or copy modes, pax shall write the
822       pathnames it processes to the standard error output using the following
823       format:
824
825
826              "%s\n", <pathname>
827
828       These  pathnames shall be written as soon as processing is begun on the
829       file or archive member, and shall be flushed  to  standard  error.  The
830       trailing  <newline>,  which  shall not be buffered, is written when the
831       file has been read or written.
832
833       If the -s option is specified, and the replacement string has a  trail‐
834       ing  'p',  substitutions shall be written to standard error in the fol‐
835       lowing format:
836
837
838              "%s >> %s\n", <original pathname>, <new pathname>
839
840       In all operating modes of pax, optional messages of unspecified  format
841       concerning  the  input  archive format and volume number, the number of
842       files, blocks, volumes, and media parts as  well  as  other  diagnostic
843       messages may be written to standard error.
844
845       In  all  formats,  for  both  standard output and standard error, it is
846       unspecified how non-printable characters in pathnames or link names are
847       written.
848
849       When pax is in read mode or list mode, using the -x pax archive format,
850       and a filename, link name,  owner  name,  or  any  other  field  in  an
851       extended  header record cannot be translated from the pax UTF-8 codeset
852       format to the codeset and current locale  of  the  implementation,  pax
853       shall  write  a diagnostic message to standard error, shall process the
854       file as described for the -o invalid= option, and  then  shall  process
855       the next file in the archive.
856

OUTPUT FILES

858       In  read mode, the extracted output files shall be of the archived file
859       type. In copy mode, the copied output files shall be the  type  of  the
860       file  being  copied.  In either mode, existing files in the destination
861       hierarchy shall be overwritten only when all permission (-p), modifi‐
862       cation time (-u), and invalid-value (-o invalid=) tests allow it.
863
864       In write mode, the output file named by the -f option-argument shall be
865       a file formatted according to one of the specifications in the EXTENDED
866       DESCRIPTION section, or some other implementation-defined format.
867

EXTENDED DESCRIPTION

869   pax Interchange Format
870       A  pax archive tape or file produced in the -x pax format shall contain
871       a series of blocks. The physical layout of the archive shall be identi‐
872       cal  to  the  ustar format described in ustar Interchange Format . Each
873       file archived shall be represented by the following sequence:
874
875        * An optional header block with extended header records.  This  header
876          block  is of the form described in pax Header Block, with a typeflag
877          value of x or g. The  extended  header  records,  described  in  pax
878          Extended  Header,  shall  be  included  as  the data for this header
879          block.
880
881        * A header block that describes the file. Any fields in the  preceding
882          optional  extended  header  shall  override the associated fields in
883          this header block for this file.
884
885        * Zero or more blocks that contain the contents of the file.
886
887       At the end of the archive file  there  shall  be  two  512-byte  blocks
888       filled with binary zeros, interpreted as an end-of-archive indicator.
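A reader can recognize the end-of-archive indicator by checking for two consecutive all-zero 512-byte blocks. The Python sketch below illustrates that test on any binary stream; the function name at_end_of_archive is hypothetical, and a short read is treated as "not an indicator".

```python
import io

BLOCK = 512  # size of one archive block in bytes


def at_end_of_archive(stream) -> bool:
    """True if the next two 512-byte blocks consist only of binary zeros."""
    data = stream.read(2 * BLOCK)
    return len(data) == 2 * BLOCK and data == bytes(2 * BLOCK)


# Usage: an in-memory stream holding only the end-of-archive indicator.
print(at_end_of_archive(io.BytesIO(bytes(1024))))
```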
889
890       A  schematic  of an example archive with global extended header records
891       and two actual files is shown in pax Format Archive Example  .  In  the
892       example,  the second file in the archive has no extended header preced‐
893       ing it, presumably because it has no need for extended attributes.
894
895
896
897                         Figure: pax Format Archive Example
898
899   pax Header Block
900       The pax header block shall be  identical  to  the  ustar  header  block
901       described in ustar Interchange Format, except that two additional type‐
902       flag values are defined:
903
904       x      Represents extended header records for the following file in the
905              archive (which shall have its own ustar header block).  The for‐
906              mat of these extended header records shall be  as  described  in
907              pax Extended Header .
908
909       g      Represents  global  extended  header  records  for the following
910              files in the  archive.  The  format  of  these  extended  header
911              records  shall  be  as  described  in pax Extended Header . Each
912              value shall affect all subsequent files  that  do  not  override
913              that value in their own extended header record and until another
914              global extended header record is reached that  provides  another
915              value  for  the same field. The typeflag g global headers should
916              not be used with interchange media  that  could  suffer  partial
917              data loss in transporting the archive.
918
919
920       For  both  of  these  types,  the  size  field shall be the size of the
921       extended header records in octets. The other fields in the header block
922       are not meaningful to this version of the pax utility. However, if this
923       archive is read by a pax utility  conforming  to  the  ISO POSIX-2:1993
924       standard,  the  header  block  fields are used to create a regular file
925       that contains the extended header records as  data.  Therefore,  header
926       block field values should be selected to provide reasonable file access
927       to this regular file.
928
929       A further difference from the ustar header block is  that  data  blocks
930       for  files  of  typeflag 1 (the digit one) (hard link) may be included,
931       which means that the size field may be greater than zero. Archives cre‐
932       ated  by  pax -o linkdata shall include these data blocks with the hard
933       links.
934
935   pax Extended Header
936       A pax extended header contains values that are  inappropriate  for  the
937       ustar  header  block  because  of  limitations  in  that format: fields
938       requiring a  character  encoding  other  than  that  described  in  the
939       ISO/IEC 646:1991  standard,  fields  representing  file  attributes not
940       described in the ustar header, and fields whose format or length do not
941       fit  the  requirements  of  the ustar header. The values in an extended
942       header add attributes to the following file (or files; see the descrip‐
943       tion  of the typeflag g header block) or override values in the follow‐
944       ing header block(s), as indicated in the following list of keywords.
945
946       An extended header shall consist of one  or  more  records,  each  con‐
947       structed as follows:
948
949
950              "%d %s=%s\n", <length>, <keyword>, <value>
951
952       The   extended  header  records  shall  be  encoded  according  to  the
953       ISO/IEC 10646-1:2000 standard (UTF-8).  The  <length>  field,  <blank>,
954       equals sign, and <newline> shown shall be limited to the portable char‐
955       acter set, as encoded in UTF-8. The <keyword> and <value> fields can be
956       any UTF-8 characters. The <length> field shall be the decimal length of
957       the extended header record in octets, including the trailing <newline>.
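Because the <length> field counts its own digits, building a record requires a small fixed-point calculation. The Python sketch below shows one way to build and parse records in the "%d %s=%s\n" layout; make_record and parse_records are hypothetical names, and the parser does no error recovery.

```python
def make_record(keyword: str, value: str) -> bytes:
    """Build one '<length> <keyword>=<value>\\n' record, encoded as UTF-8."""
    body = (" %s=%s\n" % (keyword, value)).encode("utf-8")
    # The length includes its own decimal digits, so repeat until stable.
    length = len(body)
    while True:
        total = len(str(length)) + len(body)
        if total == length:
            break
        length = total
    return str(length).encode("ascii") + body


def parse_records(data: bytes) -> dict:
    """Split the data of an extended header block into keyword/value pairs."""
    records = {}
    pos = 0
    while pos < len(data):
        space = data.index(b" ", pos)
        length = int(data[pos:space])          # octets, including the newline
        record = data[space + 1 : pos + length]
        if not record.endswith(b"\n"):
            raise ValueError("record does not end with <newline>")
        keyword, _, value = record[:-1].partition(b"=")
        records[keyword.decode("utf-8")] = value.decode("utf-8")
        pos += length
    return records
```

For instance, make_record("path", "foo") produces the 12-octet record "12 path=foo\n": two digits, one blank, eight octets of keyword, equals sign, and value, and one newline.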
958
959       The <keyword> field shall be one of the entries from the following list
960       or a keyword provided as an implementation extension. Keywords consist‐
961       ing entirely of lowercase letters, digits, and periods are reserved for
962       future standardization. A keyword shall not include an equals sign. (In
963       the following list, the notation "file(s)" or "block(s)" is used to
964       acknowledge  that  a  keyword affects the following single file after a
965       typeflag x extended header, but possibly multiple files after  typeflag
966       g.  Any  requirements  in  the list for pax to include a record when in
967       write or copy mode shall apply only when such a record has not  already
968       been provided through the use of the -o option. When used in copy mode,
969       pax shall behave as if an archive  had  been  created  with  applicable
970       extended header records and then extracted.)
971
972       atime  The  file  access  time for the following file(s), equivalent to
973              the value of the st_atime member of the  stat  structure  for  a
974              file, as described by the stat() function. The access time shall
975              be  restored  if  the  process  has  the  appropriate  privilege
976              required  to  do  so.  The  format  of  the  <value> shall be as
977              described in pax Extended Header File Times .
978
979       charset
980              The name of the character set used to encode  the  data  in  the
981              following  file(s).  The  entries  in  the  following  table are
982              defined to refer to known standards;  additional  names  may  be
983              agreed on between the originator and recipient.
984
985                   <value>                  Formal Standard
986                   ISO-IR 646 1990          ISO/IEC 646:1990
987                   ISO-IR 8859 1 1998       ISO/IEC 8859-1:1998
988                   ISO-IR 8859 2 1999       ISO/IEC 8859-2:1999
989                   ISO-IR 8859 3 1999       ISO/IEC 8859-3:1999
990                   ISO-IR 8859 4 1998       ISO/IEC 8859-4:1998
991                   ISO-IR 8859 5 1999       ISO/IEC 8859-5:1999
992                   ISO-IR 8859 6 1999       ISO/IEC 8859-6:1999
993                   ISO-IR 8859 7 1987       ISO/IEC 8859-7:1987
994                   ISO-IR 8859 8 1999       ISO/IEC 8859-8:1999
995                   ISO-IR 8859 9 1999       ISO/IEC 8859-9:1999
996                   ISO-IR 8859 10 1998      ISO/IEC 8859-10:1998
997                   ISO-IR 8859 13 1998      ISO/IEC 8859-13:1998
998                   ISO-IR 8859 14 1998      ISO/IEC 8859-14:1998
999                   ISO-IR 8859 15 1999      ISO/IEC 8859-15:1999
1000                   ISO-IR 10646 2000        ISO/IEC 10646:2000
1001                   ISO-IR 10646 2000 UTF-8  ISO/IEC 10646, UTF-8 encoding
1002                   BINARY                   None.
1003
1004       The  encoding  is  included in an extended header for information only;
1005       when pax is used as described in  IEEE Std 1003.1-2001,  it  shall  not
1006       translate the file data into any other encoding. The BINARY entry indi‐
1007       cates unencoded binary data.
1008
1009       When used in write or copy mode, it is  implementation-defined  whether
1010       pax includes a charset extended header record for a file.
1011
1012       comment
1013              A  series of characters used as a comment. All characters in the
1014              <value> field shall be ignored by pax.
1015
1016       ctime  The file creation time for the following file(s), equivalent  to
1017              the  value  of  the  st_ctime member of the stat structure for a
1018              file, as described by the stat()  function.  The  creation  time
1019              shall  be  restored if the process has the appropriate privilege
1020              required to do so.  The  format  of  the  <value>  shall  be  as
1021              described in pax Extended Header File Times .
1022
1023       gid    The  group  ID  of  the group that owns the file, expressed as a
1024              decimal number using digits from the ISO/IEC 646:1991  standard.
1025              This record shall override the gid field in the following header
1026              block(s). When used in write or copy mode, pax shall  include  a
1027              gid  extended  header  record  for  each  file whose group ID is
1028              greater than 2097151 (octal 7777777).
1029
1030       gname  The group of the file(s), formatted as a group name in the group
1031              database.   This  record shall override the gid and gname fields
1032              in the following header block(s), and any  gid  extended  header
1033              record.  When used in read, copy, or list mode, pax shall trans‐
1034              late the name from the UTF-8 encoding in the  header  record  to
1035              the  character  set  appropriate  for  the group database on the
1036              receiving system. If any  of  the  UTF-8  characters  cannot  be
1037              translated, and if the -o invalid=UTF-8 option is not speci‐
1038              fied, the results are implementation-defined. When used in write
1039              or  copy  mode, pax shall include a gname extended header record
1040              for each file whose group name cannot  be  represented  entirely
1041              with the letters and digits of the portable character set.
1042
1043       linkpath
1044              The  pathname  of  a  link being created to another file, of any
1045              type,  previously  archived.  This  record  shall  override  the
1046              linkname field in the following ustar header block(s).  The fol‐
1047              lowing ustar header block shall determine the type of link  cre‐
1048              ated.  If  typeflag of the following header block is 1, it shall
1049              be a hard link. If typeflag is 2, it shall be  a  symbolic  link
1050              and  the  linkpath  value  shall be the contents of the symbolic
1051              link. The pax utility shall translate the name of the link (con‐
1052              tents of the symbolic link) from the UTF-8 encoding to the char‐
1053              acter set appropriate for the local file system.  When  used  in
1054              write or copy mode, pax shall include a linkpath extended header
1055              record for  each  link  whose  pathname  cannot  be  represented
1056              entirely  with  the  members of the portable character set other
1057              than NUL.
1058
1059       mtime  The file modification time of the following file(s),  equivalent
1060              to  the value of the st_mtime member of the stat structure for a
1061              file, as described in the stat()  function.  This  record  shall
1062              override  the  mtime field in the following header block(s). The
1063              modification time shall be  restored  if  the  process  has  the
1064              appropriate  privilege  required  to  do  so.  The format of the
1065              <value> shall be as described in pax Extended Header File  Times
1066              .
1067
1068       path   The  pathname  of the following file(s). This record shall over‐
1069              ride  the  name  and  prefix  fields  in  the  following  header
1070              block(s).  The  pax  utility shall translate the pathname of the
1071              file from the UTF-8 encoding to the  character  set  appropriate
1072              for the local file system.
1073
1074       When  used  in  write  or  copy mode, pax shall include a path extended
1075       header record for  each  file  whose  pathname  cannot  be  represented
1076       entirely with the members of the portable character set other than NUL.
1077
1078       realtime.any
1079              The  keywords  prefixed  by  "realtime." are reserved for future
1080              standardization.
1081
1082       security.any
1083              The keywords prefixed by "security."  are  reserved  for  future
1084              standardization.
1085
1086       size   The  size  of  the file in octets, expressed as a decimal number
1087              using digits from the  ISO/IEC 646:1991  standard.  This  record
1088              shall  override the size field in the following header block(s).
1089              When used in write or  copy  mode,  pax  shall  include  a  size
1090              extended  header  record for each file with a size value greater
1091              than 8589934591 (octal 77777777777).
1092
1093       uid    The user ID of the file owner, expressed  as  a  decimal  number
1094              using  digits  from  the  ISO/IEC 646:1991 standard. This record
1095              shall override the uid field in the following  header  block(s).
1096              When  used  in  write  or  copy  mode,  pax  shall include a uid
1097              extended header record for each file whose owner ID  is  greater
1098              than 2097151 (octal 7777777).
1099
1100       uname  The  owner of the following file(s), formatted as a user name in
1101              the user database. This record shall override the uid and  uname
1102              fields  in  the  following header block(s), and any uid extended
1103              header record. When used in read, copy, or list mode, pax  shall
1104              translate  the name from the UTF-8 encoding in the header record
1105              to the character set appropriate for the user  database  on  the
1106              receiving  system.  If  any  of  the  UTF-8 characters cannot be
1107              translated, and if the -o invalid=UTF-8 option is not speci‐
1108              fied, the results are implementation-defined. When used in write
1109              or copy mode, pax shall include a uname extended  header  record
1110              for  each  file  whose  user name cannot be represented entirely
1111              with the letters and digits of the portable character set.
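The write-mode requirements in the keyword list above (gid, gname, linkpath, path, size, uid, uname) can be collected into a single check. The Python sketch below is an illustration under stated assumptions, not a conforming implementation: required_records is a hypothetical name, "letters and digits of the portable character set" is taken to mean ASCII letters and digits, and "members of the portable character set other than NUL" is approximated as 7-bit ASCII without NUL.

```python
import string

USTAR_MAX_ID = 0o7777777        # 2097151, limit for uid and gid
USTAR_MAX_SIZE = 0o77777777777  # 8589934591, limit for size

_LETTERS_DIGITS = set(string.ascii_letters + string.digits)


def _portable(text: str) -> bool:
    # Approximation (assumption): 7-bit ASCII, excluding NUL.
    return text.isascii() and "\0" not in text


def required_records(path, size, uid, gid, uname, gname, linkpath=None):
    """Return the extended header records pax shall include for one file."""
    records = {}
    if not _portable(path):
        records["path"] = path
    if linkpath is not None and not _portable(linkpath):
        records["linkpath"] = linkpath
    if size > USTAR_MAX_SIZE:
        records["size"] = str(size)
    if uid > USTAR_MAX_ID:
        records["uid"] = str(uid)
    if gid > USTAR_MAX_ID:
        records["gid"] = str(gid)
    if not set(uname) <= _LETTERS_DIGITS:
        records["uname"] = uname
    if not set(gname) <= _LETTERS_DIGITS:
        records["gname"] = gname
    return records
```

A file whose attributes all fit the limits yields an empty dictionary, in which case no extended header is required by these rules.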
1112
1113
1114       If the <value> field is zero length, it shall delete any  header  block
1115       field,  previously  entered  extended  header value, or global extended
1116       header value of the same name.
1117
1118       If a keyword in an extended header record (or in a -o  option-argument)
1119       overrides  or  deletes a corresponding field in the ustar header block,
1120       pax shall ignore the contents of that header block field.
1121
1122       Unlike the ustar header block fields, NULs shall not delimit  <value>s;
1123       all  characters  within  the <value> field shall be considered data for
1124       the field. None of the length limitations of  the  ustar  header  block
1125       fields  in  ustar  Header  Block  shall  apply  to  the extended header
1126       records.
1127
1128   pax Extended Header Keyword Precedence
1129       This section describes the  precedence  in  which  the  various  header
1130       records  and fields and command line options are selected to apply to a
1131       file in the archive. When pax is used in read or list modes,  it  shall
1132       determine a file attribute in the following sequence:
1133
1134        1. If -o delete=keyword-prefix is used, the affected attributes shall
1135           be determined from step 7., if applicable, or ignored otherwise.
1136
1137        2. If -o keyword:= is used, the affected attributes shall be ignored.
1138
1139        3. If -o keyword:=value is used, the affected attribute shall be
1140           assigned the value.
1141
1142        4. If  there  is  a  typeflag  x  extended header record, the affected
1143           attribute shall be  assigned  the  <value>.  When  extended  header
1144           records  conflict,  the  last  one  given  in the header shall take
1145           precedence.
1146
1147        5. If -o keyword=value is used, the affected attribute shall be
1148           assigned the value.
1149
1150        6. If  there  is  a  typeflag  g  global  extended  header record, the
1151           affected attribute shall  be  assigned  the  <value>.  When  global
1152           extended  header records conflict, the last one given in the global
1153           header shall take precedence.
1154
1155        7. Otherwise, the attribute shall be determined from the ustar  header
1156           block.
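The seven steps above amount to a lookup that stops at the first source providing the attribute. The Python sketch below models them with plain dictionaries; all names are hypothetical. It assumes that -o delete= arguments are matched as shell-style patterns, that "ignored" can be represented by None, and it does not model deletion through zero-length <value> fields.

```python
from fnmatch import fnmatchcase


def resolve_attribute(keyword, ustar, ext_x=None, ext_g=None,
                      opt_delete=(), opt_override=None, opt_default=None):
    """Pick the value of one attribute in read or list mode.

    ustar        fields of the ustar header block
    ext_x, ext_g typeflag x and typeflag g extended header records
    opt_delete   arguments of -o delete=
    opt_override values given as -o keyword:=value ('' for -o keyword:=)
    opt_default  values given as -o keyword=value
    """
    ext_x = ext_x or {}
    ext_g = ext_g or {}
    opt_override = opt_override or {}
    opt_default = opt_default or {}

    # Step 1: deleted keywords fall back to the ustar header, if present.
    if any(fnmatchcase(keyword, pattern) for pattern in opt_delete):
        return ustar.get(keyword)
    # Steps 2 and 3: -o keyword:= ignores the attribute, := value sets it.
    if keyword in opt_override:
        value = opt_override[keyword]
        return value if value != "" else None
    # Step 4: per-file extended header record.
    if keyword in ext_x:
        return ext_x[keyword]
    # Step 5: -o keyword=value.
    if keyword in opt_default:
        return opt_default[keyword]
    # Step 6: global extended header record.
    if keyword in ext_g:
        return ext_g[keyword]
    # Step 7: the ustar header block.
    return ustar.get(keyword)
```

The dictionaries are expected to hold the last record given for each keyword already, which matches the rule that the last conflicting record takes precedence.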
1157
1158   pax Extended Header File Times
1159       The  pax  utility shall write an mtime record for each file in write or
1160       copy modes if  the  file's  modification  time  cannot  be  represented
1161       exactly  in  the  ustar header logical record described in ustar
1162       Interchange Format. This can occur if the time is out of ustar range, or if
1163       the  file  system of the underlying implementation supports non-integer
1164       time granularities and the time is not an integer. All  of  these  time
1165       records  shall  be formatted as a decimal representation of the time in
1166       seconds since the Epoch. If a period ( '.' ) decimal point character is
1167       present, the digits to the right of the point shall represent the units
1168       of a subsecond timing granularity, where the first digit is tenths of a
1169       second  and  each subsequent digit is a tenth of the previous digit. In
1170       read or copy mode, the pax utility shall truncate the time of a file to
1171       the greatest value that is not greater than the input header file time.
1172       In write or copy mode, the pax utility shall output a time  exactly  if
1173       it  can be represented exactly as a decimal number, and otherwise shall
1174       generate only enough digits so that the same time shall be recovered if
1175       the  file is extracted on a system whose underlying implementation sup‐
1176       ports the same time granularity.
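
       As a rough illustration of these rules, the following  Python  sketch
       formats a time as a decimal record and truncates a record on  reading.
       The function names are hypothetical, and the sketch assumes  that  the
       writer holds times as non-negative  whole  seconds  plus  nanoseconds,
       which can always be written exactly.

```python
from fractions import Fraction
import math


def format_pax_time(seconds, nanoseconds=0):
    """Write a non-negative time exactly, as decimal seconds since the Epoch."""
    if nanoseconds == 0:
        return str(seconds)
    # First digit after the point is tenths, the next hundredths, and so on;
    # trailing zeros carry no information and are dropped.
    fraction = f"{nanoseconds:09d}".rstrip("0")
    return f"{seconds}.{fraction}"


def truncate_pax_time(text, digits=0):
    """Return the greatest time with `digits` decimal places not above `text`."""
    scale = 10 ** digits
    return Fraction(math.floor(Fraction(text) * scale), scale)
```

       For example, a reader whose file system stores  whole  seconds  would
       call truncate_pax_time(record) with the default of zero digits.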
1177
1178   ustar Interchange Format
1179       A ustar archive tape or file shall contain a series of logical records.
1180       Each  logical record shall be a fixed-size logical record of 512 octets
1181       (see below). Although this format may be thought of as being stored  on
1182       9-track  industry-standard  12.7 mm (0.5 in) magnetic tape, other types
1183       of transportable media are not excluded.  Each file archived  shall  be
1184       represented  by  a  header logical record that describes the file, fol‐
1185       lowed by zero or more logical records that give  the  contents  of  the
1186       file. At the end of the archive file there shall be two 512-octet logi‐
1187       cal records filled with binary zeros, interpreted as an  end-of-archive
1188       indicator.
1189
1190       The  logical  records  may  be  grouped for physical I/O operations, as
1191       described under the -b blocksize and -x ustar options.  Each  group  of
1192       logical  records  may  be written with a single operation equivalent to
1193       the write() function.  On magnetic tape, the result of this write shall
1194       be  a  single tape physical block. The last physical block shall always
1195       be the full size, so logical records after the two zero logical records
1196       may contain undefined data.
1197
1198       The header logical record shall be structured as shown in the following
1199       table. All lengths and offsets are in decimal.
1200
1201                              Table: ustar Header Block
1202
1203                   Field Name   Octet Offset   Length (in Octets)
1204                   name         0              100
1205                   mode         100            8
1206                   uid          108            8
1207                   gid          116            8
1208                   size         124            12
1209                   mtime        136            12
1210                   chksum       148            8
1211                   typeflag     156            1
1212                   linkname     157            100
1213                   magic        257            6
1214                   version      263            2
1215                   uname        265            32
1216                   gname        297            32
1217                   devmajor     329            8
1218                   devminor     337            8
1219                   prefix       345            155
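
       The table can be transcribed directly into  code.  The  Python  sketch
       below slices a 512-octet header logical record  into  its  fields  and
       decodes a numeric field; the names USTAR_FIELDS,  split_ustar_header,
       and parse_octal are hypothetical. The fields occupy the first 500
       octets of the record.

```python
# (field name, octet offset, length in octets), copied from the table.
USTAR_FIELDS = [
    ("name", 0, 100), ("mode", 100, 8), ("uid", 108, 8), ("gid", 116, 8),
    ("size", 124, 12), ("mtime", 136, 12), ("chksum", 148, 8),
    ("typeflag", 156, 1), ("linkname", 157, 100), ("magic", 257, 6),
    ("version", 263, 2), ("uname", 265, 32), ("gname", 297, 32),
    ("devmajor", 329, 8), ("devminor", 337, 8), ("prefix", 345, 155),
]


def split_ustar_header(record):
    """Return a dict mapping each field name to its raw octets."""
    if len(record) != 512:
        raise ValueError("a header logical record is 512 octets")
    return {name: record[off:off + length] for name, off, length in USTAR_FIELDS}


def parse_octal(field):
    """Decode a numeric field: octal digits ended by <space> or NUL."""
    digits = field.rstrip(b" \0").decode("ascii")
    return int(digits, 8) if digits else 0
```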
1220
1221       All characters in the header logical record shall be represented in the
1222       coded  character  set  of  the  ISO/IEC 646:1991  standard. For maximum
1223       portability between implementations,  names  should  be  selected  from
1224       characters represented by the portable filename character set as octets
1225       with the most significant bit zero.  If an implementation supports  the
1226       use  of characters outside of slash and the portable filename character
1227       set in names for files, users, and groups, one or more  implementation-
1228       defined encodings of these characters shall be provided for interchange
1229       purposes.
1230
1231       However, the pax utility shall never create filenames on the local sys‐
1232       tem   that   cannot   be  accessed  via  the  procedures  described  in
1233       IEEE Std 1003.1-2001. If a filename is found on the medium  that  would
1234       create  an  invalid  filename, it is implementation-defined whether the
1235       data from the file is stored on the file hierarchy and under what  name
1236       it  is stored. The pax utility may choose to ignore these files as long
1237       as it produces an error indicating that the file is being ignored.
1238
1239       Each field within the header logical record  is  contiguous;  that  is,
1240       there is no padding used. Each character on the archive medium shall be
1241       stored contiguously.
1242
1243       The fields magic, uname, and gname are character  strings  each  termi‐
1244       nated  by  a  NUL  character. The fields name, linkname, and prefix are
1245       NUL-terminated character strings except  when  all  characters  in  the
1246       array contain non-NUL characters including the last character. The ver‐
1247       sion field is two octets containing the  characters  "00"  (zero-zero).
1248       The typeflag contains a single character.  All other fields are leading
1249       zero-filled octal numbers using digits from the ISO/IEC 646:1991  stan‐
1250       dard  IRV.  Each  numeric field is terminated by one or more <space> or
1251       NUL characters.
1252
1253       The name and the prefix fields shall produce the pathname of the  file.
1254       A  new  pathname shall be formed, if prefix is not an empty string (its
1255       first character is not NUL), by concatenating prefix (up to  the  first
1256       NUL  character),  a  slash character, and name; otherwise, name is used
1257       alone. In either case, name is terminated at the first  NUL  character.
1258       If  prefix  begins  with  a NUL character, it shall be ignored. In this
1259       manner, pathnames of at most 256 characters  can  be  supported.  If  a
1260       pathname  does not fit in the space provided, pax shall notify the user
1261       of the error, and shall not store any part of the file (header or
1262       data) on the medium.
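
       A minimal Python sketch of the name and prefix handling follows.  Both
       function names are hypothetical. The splitting strategy (take the
       first slash that makes both parts fit) is only one possible choice; the
       text above does not prescribe how a writer picks the split point.

```python
def join_ustar_path(prefix, name):
    """Rebuild the pathname from the raw prefix and name fields."""
    name = name.split(b"\0", 1)[0]      # name ends at the first NUL
    prefix = prefix.split(b"\0", 1)[0]  # empty if it begins with NUL
    return prefix + b"/" + name if prefix else name


def split_ustar_path(path):
    """Split a pathname into (prefix, name), or raise if it cannot fit."""
    if len(path) <= 100:
        return b"", path
    for i in range(1, len(path)):
        # The slash at index i is dropped: prefix is path[:i] (at most 155
        # octets) and name is path[i+1:] (1 to 100 octets).
        if path[i:i + 1] == b"/" and i <= 155 and 1 <= len(path) - i - 1 <= 100:
            return path[:i], path[i + 1:]
    raise ValueError("pathname does not fit in the ustar name and prefix fields")
```

       The upper bound of 256 characters follows from 155 octets of  prefix,
       one slash, and 100 octets of name.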
1263
1264       The  linkname  field, described below, shall not use the prefix to pro‐
1265       duce a pathname. As such, a linkname is limited to 100  characters.  If
1266       the  name does not fit in the space provided, pax shall notify the user
1267       of the error, and shall not attempt to store the link on the medium.
1268
1269       The mode field provides 12 bits encoded in the  ISO/IEC 646:1991  stan‐
1270       dard  octal  digit representation. The encoded bits shall represent the
1271       following values:
1272
1273                               Table: ustar mode Field
1274
1275       Bit Value IEEE Std 1003.1-2001 Bit Description
1276       04000     S_ISUID                  Set UID on execution.
1277       02000     S_ISGID                  Set GID on execution.
1278       01000     <reserved>               Reserved for future standardization.
1279       00400     S_IRUSR                  Read permission for file owner class.
1280       00200     S_IWUSR                  Write permission for file owner
1281                                          class.
1282       00100     S_IXUSR                  Execute/search permission for file
1283                                          owner class.
1284       00040     S_IRGRP                  Read permission for file group class.
1285       00020     S_IWGRP                  Write permission for file group
1286                                          class.
1287       00010     S_IXGRP                  Execute/search permission for file
1288                                          group class.
1289       00004     S_IROTH                  Read permission for file other class.
1290       00002     S_IWOTH                  Write permission for file other
1291                                          class.
1292       00001     S_IXOTH                  Execute/search permission for file
1293                                          other class.
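
       As an illustration, the mode field can be decoded into the bit  names
       from the table with a few lines of Python. The names USTAR_MODE_BITS
       and decode_ustar_mode are hypothetical.

```python
# (bit value, name) pairs copied from the table, most significant first.
USTAR_MODE_BITS = [
    (0o4000, "S_ISUID"), (0o2000, "S_ISGID"), (0o1000, "reserved"),
    (0o0400, "S_IRUSR"), (0o0200, "S_IWUSR"), (0o0100, "S_IXUSR"),
    (0o0040, "S_IRGRP"), (0o0020, "S_IWGRP"), (0o0010, "S_IXGRP"),
    (0o0004, "S_IROTH"), (0o0002, "S_IWOTH"), (0o0001, "S_IXOTH"),
]


def decode_ustar_mode(field):
    """Return the names of the bits set in a raw 8-octet mode field."""
    value = int(field.rstrip(b" \0").decode("ascii"), 8)
    return [name for bit, name in USTAR_MODE_BITS if value & bit]
```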
1294
1295       When appropriate privilege is required to set one of these  mode  bits,
1296       and  the  user  restoring  the files from the archive does not have the
1297       appropriate privilege, the mode bits for which the user does  not  have
1298       appropriate  privilege  shall  be ignored. Some of the mode bits in the
1299       archive  format  are  not  mentioned  elsewhere  in  this   volume   of
1300       IEEE Std 1003.1-2001.  If  the  implementation  does  not support those
1301       bits, they may be ignored.
1302
1303       The uid and gid fields are the user and group ID of the owner and group
1304       of the file, respectively.
1305
1306       The size field is the size of the file in octets. If the typeflag field
1307       is set to specify a file to be of type 1 (a  link)  or  2  (a  symbolic
1308       link), the size field shall be specified as zero. If the typeflag field
1309       is set to specify a file of type 5 (directory), the size field shall be
1310       interpreted  as  described under the definition of that record type. No
1311       data logical records are stored for types 1, 2, or 5. If  the  typeflag
1312       field  is set to 3 (character special file), 4 (block special file), or
1313       6 (FIFO), the meaning of the size field is unspecified by  this  volume
1314       of IEEE Std 1003.1-2001, and no data logical records shall be stored on
1315       the medium. Additionally, for type 6, the size field shall  be  ignored
1316       when reading. If the typeflag field is set to any other value, the
1317       number of logical records written following the header shall be
1318       (size+511)/512, ignoring any fraction in the result of the division.
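
       The record count described above amounts to an integer division  that
       rounds up. A small Python sketch, with hypothetical names:

```python
# Typeflag values for which no data logical records are stored.
NO_DATA_TYPEFLAGS = {"1", "2", "3", "4", "5", "6"}


def data_record_count(typeflag, size):
    """Number of 512-octet data logical records that follow the header."""
    if typeflag in NO_DATA_TYPEFLAGS:
        return 0
    # Integer division discards the fraction, as the text requires.
    return (size + 511) // 512
```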
1319
1320       The  mtime field shall be the modification time of the file at the time
1321       it was archived. It is the ISO/IEC 646:1991 standard representation  of
1322       the octal value of the modification time obtained from the stat() func‐
1323       tion.
1324
1325       The chksum field shall be the ISO/IEC 646:1991 standard IRV representa‐
1326       tion  of  the octal value of the simple sum of all octets in the header
1327       logical record. Each octet  in  the  header  shall  be  treated  as  an
1328       unsigned  value.  These  values  shall be added to an unsigned integer,
1329       initialized to zero, the precision of which is not less than  17  bits.
1330       When  calculating  the  checksum,  the chksum field is treated as if it
1331       were all spaces.
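
       A minimal Python sketch of the checksum calculation follows; the name
       ustar_checksum is hypothetical. The largest possible sum is 512 * 255
       = 130560, which is below 2^17 = 131072 and therefore fits in 17 bits.

```python
def ustar_checksum(record):
    """Sum all octets of a header record, with chksum read as spaces."""
    if len(record) != 512:
        raise ValueError("a header logical record is 512 octets")
    # The chksum field occupies octets 148 to 155 inclusive.
    blanked = record[:148] + b" " * 8 + record[156:]
    # Iterating over bytes yields unsigned values in the range 0..255.
    return sum(blanked)
```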
1332
1333       The typeflag field specifies the type of file archived. If a particular
1334       implementation  does  not recognize the type, or the user does not have
1335       appropriate privilege to create that type, the file shall be  extracted
1336       as  if  it  were  a  regular file if the file type is defined to have a
1337       meaning for the size field that could cause data logical records to  be
1338       written  on the medium (see the previous description for size). If con‐
1339       version to a regular file occurs, the  pax  utility  shall  produce  an
1340       error  indicating  that  the conversion took place. All of the typeflag
1341       fields shall be coded in the ISO/IEC 646:1991 standard IRV:
1342
1343       0      Represents a regular file. For backwards-compatibility, a  type‐
1344              flag value of binary zero ( '\0' ) should be recognized as mean‐
1345              ing a regular file when extracting files from the  archive.  Ar‐
1346              chives written with this version of the archive file format cre‐
1347              ate regular files with a typeflag value of the  ISO/IEC 646:1991
1348              standard IRV '0'.
1349
1350       1      Represents  a  file  linked to another file, of any type, previ‐
1351              ously archived. Such files are identified by  each  file  having
1352              the  same  device  and file serial number. The linked-to name is
1353              specified in the linkname field with a NUL-character  terminator
1354              if it is less than 100 octets in length.
1355
1356       2      Represents  a  symbolic  link. The contents of the symbolic link
1357              shall be stored in the linkname field.
1358
1359       3,4    Represent  character  special  files  and  block  special  files
1360              respectively.   In  this  case  the devmajor and devminor fields
1361              shall contain information defining the  device,  the  format  of
1362              which  is  unspecified  by  this volume of IEEE Std 1003.1-2001.
1363              Implementations may map the device specifications to  their  own
1364              local specification or may ignore the entry.
1365
1366       5      Specifies  a  directory  or  subdirectory. On systems where disk
1367              allocation is performed on a directory  basis,  the  size  field
1368              shall contain the maximum number of octets (which may be rounded
1369              to the nearest disk block allocation unit)  that  the  directory
1370              may hold. A size field of zero indicates no such limiting.  Sys‐
1371              tems that do not support limiting in this manner  should  ignore
1372              the size field.
1373
1374       6      Specifies a FIFO special file. Note that the archiving of a FIFO
1375              file archives the existence of this file and not its contents.
1376
1377       7      Reserved to represent a file  to  which  an  implementation  has
1378              associated   some  high-performance  attribute.  Implementations
1379              without such extensions should treat this file as a regular file
1380              (type 0).
1381
1382       A-Z    The  letters  'A'  to  'Z',  inclusive,  are reserved for custom
1383              implementations. All other values are reserved for  future  ver‐
1384              sions of IEEE Std 1003.1-2001.
1385
1386
1387       Attempts  to archive a socket using ustar interchange format shall pro‐
1388       duce a diagnostic message. Handling of other file types is  implementa‐
1389       tion-defined.
1390
1391       The  magic  field  is the specification that this archive was output in
1392       this archive format. If this field contains ustar (the five  characters
1393       from  the  ISO/IEC 646:1991  standard  IRV  shown followed by NUL), the
1394       uname and gname fields shall contain the ISO/IEC 646:1991 standard  IRV
1395       representation  of the owner and group of the file, respectively (trun‐
1396       cated to fit, if necessary). When the file is restored by a privileged,
1397       protection-preserving  version of the utility, the user and group data‐
1398       bases shall be scanned for these names.  If found, the user  and  group
1399       IDs  contained  within these files shall be used rather than the values
1400       contained within the uid and gid fields.
1401
1402   cpio Interchange Format
1403       The octet-oriented cpio archive format shall be a  series  of  entries,
1404       each comprising a header that describes the file, the name of the file,
1405       and then the contents of the file.
1406
1407       An archive may be recorded as a series of fixed-size blocks of  octets.
1408       This  blocking  shall be used only to make physical I/O more efficient.
1409       The last group of blocks shall always be at the full size.
1410
1411       For the octet-oriented cpio archive format, the individual entry infor‐
1412       mation  shall  be in the order indicated and described by the following
1413       table; see also the <cpio.h> header.
1414
1415                      Table: Octet-Oriented cpio Archive Entry
1416
1417              Header Field Name     Length (in Octets)  Interpreted as
1418              c_magic               6                   Octal number
1419              c_dev                 6                   Octal number
1420              c_ino                 6                   Octal number
1421              c_mode                6                   Octal number
1422              c_uid                 6                   Octal number
1423              c_gid                 6                   Octal number
1424              c_nlink               6                   Octal number
1425              c_rdev                6                   Octal number
1426              c_mtime               11                  Octal number
1427              c_namesize            6                   Octal number
1428              c_filesize            11                  Octal number
1429              Filename Field Name   Length              Interpreted as
1430              c_name                c_namesize          Pathname string
1431              File Data Field Name  Length              Interpreted as
1432              c_filedata            c_filesize          Data
1433
1434   cpio Header
1435       For each file in the archive, a header as defined previously  shall  be
1436       written.  The information in the header fields is written as streams of
1437       the ISO/IEC 646:1991 standard characters interpreted as octal  numbers.
1438       The octal numbers shall be extended to the necessary length by padding
1439       with ISO/IEC 646:1991 standard IRV zeros at the most-significant-digit
1440       end of the number; the result is written to the stream of octets with
1441       the most-significant digit first. The fields shall be interpreted as
1442       follows:
1443
1444       c_magic
1445              Identifies the archive as being a transportable archive by
1446              containing the identifying value "070707".
1447
1448       c_dev, c_ino
1449              Contains values that uniquely identify the file within  the  ar‐
1450              chive  (that  is,  no  files  contain the same pair of c_dev and
1451              c_ino values unless they are links to the same file). The values
1452              shall be determined in an unspecified manner.
1453
1454       c_mode Contains  the file type and access permissions as defined in the
1455              following table.
1456
1457                            Table: Values for cpio c_mode Field
1458
1459                   File Permissions Name  Value    Indicates
1460                   C_IRUSR                000400   Read by owner
1461                   C_IWUSR                000200   Write by owner
1462                   C_IXUSR                000100   Execute by owner
1463                   C_IRGRP                000040   Read by group
1464                   C_IWGRP                000020   Write by group
1465                   C_IXGRP                000010   Execute by group
1466                   C_IROTH                000004   Read by others
1467                   C_IWOTH                000002   Write by others
1468                   C_IXOTH                000001   Execute by others
1469                   C_ISUID                004000   Set uid
1470                   C_ISGID                002000   Set gid
1471                   C_ISVTX                001000   Reserved
1472                   File Type Name         Value    Indicates
1473                   C_ISDIR                040000   Directory
1474                   C_ISFIFO               010000   FIFO
1475                   C_ISREG                0100000  Regular file
1476                   C_ISLNK                0120000  Symbolic link
1477                   C_ISBLK                060000   Block special file
1478                   C_ISCHR                020000   Character special file
1479                   C_ISSOCK               0140000  Socket
1480                   C_ISCTG                0110000  Reserved
1481
1482       Directories, FIFOs, symbolic links, and regular  files  shall  be  sup‐
1483       ported  on  a system conforming to this volume of IEEE Std 1003.1-2001;
1484       additional values defined previously  are  reserved  for  compatibility
1485       with  existing  systems.   Additional file types may be supported; how‐
1486       ever, such files should not be  written  to  archives  intended  to  be
1487       transported to other systems.
1488
1489       c_uid  Contains the user ID of the owner.
1490
1491       c_gid  Contains the group ID of the group.
1492
1493       c_nlink
1494              Contains  the  number  of links referencing the file at the time
1495              the archive was created.
1496
1497       c_rdev Contains implementation-defined  information  for  character  or
1498              block special files.
1499
1500       c_mtime
1501              Contains the latest time of modification of the file at the time
1502              the archive was created.
1503
1504       c_namesize
1505              Contains the length of the pathname, including  the  terminating
1506              NUL character.
1507
1508       c_filesize
1509              Contains  the  length  of  the file in octets. This shall be the
1510              length of the data section following the header structure.
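
       The header encoding described in this section can be sketched in Python
       as follows. CPIO_FIELDS and encode_cpio_header are hypothetical names;
       the field widths come from the table of archive entry fields, and the
       resulting header is always 76 octets long.

```python
# (field name, length in octets), in the order the fields are written.
CPIO_FIELDS = [
    ("c_magic", 6), ("c_dev", 6), ("c_ino", 6), ("c_mode", 6),
    ("c_uid", 6), ("c_gid", 6), ("c_nlink", 6), ("c_rdev", 6),
    ("c_mtime", 11), ("c_namesize", 6), ("c_filesize", 11),
]


def encode_cpio_header(values):
    """Encode a dict of non-negative integers as a 76-octet cpio header."""
    parts = []
    for name, width in CPIO_FIELDS:
        number = values[name]
        if number < 0:
            raise ValueError(f"{name} must not be negative")
        # Octal digits, zero-filled at the most-significant end.
        text = format(number, "o").rjust(width, "0")
        if len(text) > width:
            raise ValueError(f"{name} does not fit in {width} octal digits")
        parts.append(text)
    return "".join(parts).encode("ascii")
```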
1511
1512
1513   cpio Filename
1514       The c_name field shall contain the pathname of the file. The length  of
1515       this field in octets is the value of c_namesize.
1516
1517       If a filename is found on the medium that would create an invalid path‐
1518       name, it is implementation-defined whether the data from  the  file  is
1519       stored on the file hierarchy and under what name it is stored.
1520
1521       All  characters  shall  be represented in the ISO/IEC 646:1991 standard
1522       IRV. For maximum portability between implementations, names  should  be
1523       selected from characters represented by the portable filename character
1524       set as octets with the most significant bit zero. If an  implementation
1525       supports  the use of characters outside the portable filename character
1526       set in names for files, users, and groups, one or more  implementation-
1527       defined encodings of these characters shall be provided for interchange
1528       purposes. However, the pax utility shall never create filenames on  the
1529       local  system that cannot be accessed via the procedures described pre‐
1530       viously in this volume of IEEE Std 1003.1-2001. If a filename is  found
1531       on  the medium that would create an invalid filename, it is implementa‐
1532       tion-defined whether the data from the file is stored on the local file
1533       system  and under what name it is stored. The pax utility may choose to
1534       ignore these files as long as it produces an error indicating that  the
1535       file is being ignored.
1536
1537   cpio File Data
1538       Following c_name, there shall be c_filesize octets of data. Interpreta‐
1539       tion of such data occurs in a manner dependent on the file. If  c_file‐
1540       size is zero, no data shall be contained in c_filedata.
1541
1542       When restoring from an archive:
1543
1544        * If the user does not have the appropriate privilege to create a file
1545          of the specified type, pax shall ignore the entry and write an error
1546          message to standard error.
1547
1548        * Only  regular  files  have  data to be restored. Presuming a regular
1549          file meets any selection criteria that might be imposed on the  for‐
1550          mat-reading utility by the user, such data shall be restored.
1551
1552        * If  a  user  does not have appropriate privilege to set a particular
1553          mode flag, the flag shall be ignored. Some of the mode flags in  the
1554          archive  format  are  not  mentioned  elsewhere  in  this  volume of
1555          IEEE Std 1003.1-2001. If the implementation does not  support  those
1556          flags, they may be ignored.
1557
1558   cpio Special Entries
1559       FIFO special files, directories, and the trailer shall be recorded with
1560       c_filesize equal to  zero.  For  other  special  files,  c_filesize  is
1561       unspecified by this volume of IEEE Std 1003.1-2001.  The header for the
1562       next file entry in the archive shall be written directly after the last
1563       octet  of  the  file entry preceding it. A header denoting the filename
1564       TRAILER!!! shall indicate the end  of  the  archive;  the  contents  of
1565       octets  in  the  last  block of the archive following such a header are
1566       undefined.
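
       To illustrate how entries follow one another up to the  trailer,  the
       Python sketch below walks an octet-oriented cpio archive. The names
       read_cpio_entries and make_entry are hypothetical, and make_entry
       exists only to build sample input, with most header fields set to
       zero.

```python
import io

# Lengths of the eleven header fields, in order; they total 76 octets.
HEADER_WIDTHS = (6, 6, 6, 6, 6, 6, 6, 6, 11, 6, 11)


def read_cpio_entries(stream):
    """Yield (pathname, data) for each entry until TRAILER!!! is reached."""
    while True:
        header = stream.read(76)
        if len(header) < 76:
            raise ValueError("archive ended before TRAILER!!!")
        fields, pos = [], 0
        for width in HEADER_WIDTHS:
            fields.append(int(header[pos:pos + width].decode("ascii"), 8))
            pos += width
        if fields[0] != 0o070707:
            raise ValueError("bad c_magic")
        namesize, filesize = fields[9], fields[10]
        name = stream.read(namesize)[:-1]   # drop the terminating NUL
        data = stream.read(filesize)
        if name == b"TRAILER!!!":
            return                          # octets after this are undefined
        yield name, data


def make_entry(name, data=b""):
    """Build one sample entry (regular file, link count 1)."""
    raw = name + b"\0"
    numbers = (0o070707, 0, 0, 0o100644, 0, 0, 1, 0, 0, len(raw), len(data))
    header = "".join(f"{n:0{w}o}" for n, w in zip(numbers, HEADER_WIDTHS))
    return header.encode("ascii") + raw + data
```

       Each header begins directly after the last octet  of  the  preceding
       entry, so the reader needs no padding or alignment logic.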
1567

EXIT STATUS

1569       The following exit values shall be returned:
1570
1571        0     All files were processed successfully.
1572
1573       >0     An error occurred.
1574
1575

CONSEQUENCES OF ERRORS

1577       If pax cannot create a file or a link when reading an archive or cannot
1578       find  a  file  when writing an archive, or cannot preserve the user ID,
1579       group ID, or file mode when the -p option is  specified,  a  diagnostic
1580       message  shall  be written to standard error and a non-zero exit status
1581       shall be returned, but processing shall continue. In the case where pax
1582       cannot  create  a  link  to a file, pax shall not, by default, create a
1583       second copy of the file.
1584
1585       If the extraction of a file from an archive is  prematurely  terminated
1586       by a signal or error, pax may have only partially extracted the file or
1587       (if the -n option was not specified) may have extracted a file  of  the
1588       same  name as that specified by the user, but which is not the file the
1589       user wanted. Additionally, the file modes of extracted directories  may
1590       have  additional  bits  from  the S_IRWXU mask set as well as incorrect
1591       modification and access times.
1592
1593       The following sections are informative.
1594

APPLICATION USAGE

1596       The -p  (privileges)  option  was  invented  to  reconcile  differences
1597       between historical tar and cpio implementations. In particular, the two
1598       utilities use -m in diametrically opposed ways. The -p option also pro‐
1599       vides  a  consistent  means  of extending the ways in which future file
1600       attributes can be addressed, such as for enhanced security  systems  or
1601       high-performance  files. Although it may seem complex, there are really
1602       two modes that are most commonly used:
1603
1604       -p e   "Preserve everything". This would be  used  by  the  historical
1605              superuser,  someone with all the appropriate privileges, to pre‐
1606              serve all aspects of the files as they are recorded in  the  ar‐
1607              chive.   The e flag is the sum of o and p, and other implementa‐
1608              tion-defined attributes.
1609
1610       -p p   "Preserve" the file mode bits. This would be used by  the  user
1611              with  regular  privileges  who wished to preserve aspects of the
1612              file other than the ownership. The file times are  preserved  by
1613              default,  but  two  other flags are offered to disable these and
1614              use the time of extraction.
1615
1616
1617       The one pathname per line format of standard input precludes  pathnames
1618       containing  <newline>s.  Although  such  pathnames violate the portable
1619       filename guidelines, they may exist  and  their  presence  may  inhibit
1620       usage of pax within shell scripts.  This problem is inherited from his‐
1621       torical archive programs. The problem can be avoided by  listing  file‐
1622       name arguments on the command line instead of on standard input.
1623
1624       It  is  almost certain that appropriate privileges are required for pax
1625       to accomplish parts of this volume  of  IEEE Std 1003.1-2001.  Specifi‐
1626       cally,  creating  files  of  type  block  special or character special,
1627       restoring file access times unless the files are owned by the user (the
1628       -t  option),  or preserving file owner, group, and mode (the -p option)
1629       all probably require appropriate privileges.
1630
1631       In read mode, implementations are permitted to overwrite files when the
1632       archive has multiple members with the same name.  This may fail if per‐
1633       missions on the first version of the file do not permit it to be  over‐
1634       written.
1635
1636       The  cpio  and  ustar  formats  can only support files up to 8589934592
1637       bytes (8 * 2^30) in size.
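
       The figure follows from the field widths: the ustar size field has 12
       octets, of which at least one is a terminator, and the cpio c_filesize
       field has 11 octets, so both hold at most 11 octal digits. A small
       Python check, using a hypothetical helper name:

```python
def max_octal_value(digits):
    """Largest number that fits in the given count of octal digits."""
    return 8 ** digits - 1


# Eleven octal digits give 8**11 = 2**33 = 8 * 2**30 distinct values,
# so the largest size that can be recorded is one less than that.
LARGEST_SIZE = max_octal_value(11)
```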
1638

EXAMPLES

1640       The following command:
1641
1642
1643              pax -w -f /dev/rmt/1m .
1644
1645       copies the contents of the current directory to tape drive 1, medium
1646       density (assuming historical System V device naming procedures; the
1647       historical BSD device name would be /dev/rmt9).
1648
1649       The following commands:
1650
1651              mkdir newdir
1652              pax -rw olddir newdir
1653
1654       copy the olddir directory hierarchy to newdir.
1655
1656
1657              pax -r -s ',^//*usr//*,,' -f a.pax
1658
1659       reads the archive a.pax, with all files rooted in /usr in  the  archive
1660       extracted relative to the current directory.
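
       The effect of this substitution on member names can be modelled with
       Python's re module. This is only an approximation for illustration:
       the function name is hypothetical, and the pattern is a translation
       of the basic regular expression ^//*usr//* into Python syntax.

```python
import re


def strip_usr_root(pathname):
    """Remove a leading /usr/ (with any repeated slashes) from a pathname."""
    # "//*" in the original means one slash followed by zero or more
    # slashes, which is written "/+" here.
    return re.sub(r"^/+usr/+", "", pathname, count=1)
```

       Names that do not begin with /usr/ are left unchanged.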
1661
1662       Using the option:
1663
1664
1665              -o listopt="%M %(atime)T %(size)D %(name)s"
1666
1667       overrides the default output description in Standard Output and instead
1668       writes:
1669
1670
1671              -rw-rw--- Jan 12 15:53 1492 /usr/foo/bar
1672
1673       Using the options:
1674
1675
1676              -o listopt='%L\t%(size)D\n%.7' \
1677              -o listopt='(name)s\n%(ctime)T\n%T'
1678
1679       overrides the default output description in Standard Output and instead
1680       writes:
1681
1682
1683              /usr/foo/bar -> /tmp   1492
1684              /usr/fo
1685              Jan 12 1991
1686              Jan 31 15:53
1687

RATIONALE

1689       The  pax  utility was new for the ISO POSIX-2:1993 standard.  It repre‐
1690       sents a peaceful compromise between advocates of the historical tar and
1691       cpio utilities.
1692
1693       A  fundamental  difference between cpio and tar was in the way directo‐
1694       ries were treated. The cpio utility did not treat  directories  differ‐
1695       ently  from  other  files,  and  to select a directory and its contents
1696       required that each file in the hierarchy be explicitly  specified.  For
1697       tar, a directory matched every file in the file hierarchy it rooted.
1698
1699       The  pax  utility  offers  both interfaces; by default, directories map
1700       into the file hierarchy they root. The -d option causes pax to skip any
1701       file  not  explicitly  referenced,  as cpio historically did. The
1702       tar-style behavior was chosen as the default because it was believed that
1703       this  was  the  more  common usage and because tar is the more commonly
1704       available interface, as it was historically provided on both  System  V
1705       and BSD implementations.
1706
1707       The   data   interchange   format   specification  in  this  volume  of
1708       IEEE Std 1003.1-2001 requires that processes with  "appropriate  privi‐
1709       leges"  shall always restore the ownership and permissions of extracted
1710       files exactly as archived. If  viewed  from  the  historic  equivalence
1711       between  superuser and "appropriate privileges", there are two problems
1712       with this requirement.  First, users running as superusers may  unknow‐
1713       ingly set dangerous permissions on extracted files. Second, it is need‐
1714       lessly limiting, in that superusers cannot extract files and  own  them
1715       as  superuser  unless  the  archive  was  created by the superuser. (It
1716       should be noted that restoration of ownerships and permissions for  the
1717       superuser, by default, is historical practice in cpio, but not in tar.)
1718       In order to avoid these two problems,  the  pax  specification  has  an
1719       additional  "privilege" mechanism, the -p option. Only a pax invocation
1720       with the privileges needed, and which has the -p option set using the e
1721       specification  character,  has  the  "appropriate privilege" to restore
1722       full ownership and permission information.
1723
1724       Note also that this volume of IEEE Std 1003.1-2001  requires  that  the
1725       file  ownership  and access permissions shall be set, on extraction, in
1726       the same fashion as the creat() function when provided  with  the  mode
1727       stored  in  the  archive. This means that the file creation mask of the
1728       user is applied to the file permissions.
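
           The mask arithmetic described above can be sketched in a few lines of
           Python. This is only an illustration: the function name effective_mode
           is hypothetical, and the calculation assumes the usual creat()
           behavior of clearing every bit that is set in the file creation mask.

```python
def effective_mode(archived_mode: int, umask: int) -> int:
    """Return the permission bits left after the file creation mask is applied."""
    return archived_mode & ~umask & 0o7777


# A file archived as rw-rw-rw- and extracted under umask 027 ends up rw-r-----.
print(oct(effective_mode(0o666, 0o027)))  # 0o640
```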
1729
1730       Users should note that directories may be created by pax while extract‐
1731       ing  files  with permissions that are different from those that existed
1732       at the time the archive was created. When extracting sensitive informa‐
1733       tion  into  a  directory  hierarchy  that  no  longer exists, users are
1734       encouraged to set their file creation  mask  appropriately  to  protect
1735       these files during extraction.
1736
1737       The  table  of contents output is written to standard output to facili‐
1738       tate pipeline processing.
1739
1740       An early proposal had hard links displaying for all pathnames. This was
1741       removed  because  it complicates the output of the case where -v is not
1742       specified and does not  match  historical  cpio  usage.  The  hard-link
1743       information is available in the -v display.
1744
1745       The  description  of  the -l option allows implementations to make hard
1746       links to symbolic links.  IEEE Std 1003.1-2001 does not specify any way
1747       to create a hard link to a symbolic link, but many implementations pro‐
1748       vide this capability as an extension. If there are hard links  to  sym‐
1749       bolic  links when an archive is created, the implementation is required
1750       to archive the hard link in the archive (unless -H or -L is specified).
1751       When  in  read  mode  and in copy mode, implementations supporting hard
1752       links to symbolic links should use them when appropriate.
1753
1754       The archive formats inherited from the POSIX.1-1990 standard have  cer‐
1755       tain  restrictions  that have been brought along from historical usage.
1756       For example, there are restrictions on the length of  pathnames  stored
1757       in the archive. When pax is used in copy (-rw) mode (copying  directory
1758       hierarchies), the ability to use extensions  from  the  -x  pax  format
1759       overcomes these restrictions.
1760
1761       The default blocksize value of 5120 bytes for cpio was selected because
1762       it is one of the standard block-size values for cpio, set when  the  -B
1763       option  is  specified.  (The other default block-size value for cpio is
1764       512 bytes, and this was considered to be too small.) The default  block
1765       value  of 10240 bytes for tar was selected because that is the standard
1766       block-size value for BSD tar. The maximum block  size  of  32256  bytes
1767       (2**15-512 bytes) is the largest multiple of 512 bytes that fits into a
1768       signed 16-bit tape controller transfer register. There are known  limi‐
1769       tations  in  some  historical  systems that would prevent larger blocks
1770       from being accepted. Historical values were chosen to improve  compati‐
1771       bility with historical scripts using dd or similar utilities to manipu‐
1772       late archives. Also, default block sizes for any file type  other  than
1773       character  special  file  have  been  deleted  from  this  volume   of
1774       IEEE Std 1003.1-2001 as unimportant and not likely to affect the struc‐
1775       ture of the resulting archive.
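
           The block-size figures quoted above can be checked with a short
           calculation. The sketch below is illustrative only; the helper name
           largest_multiple_below is hypothetical.

```python
BLOCK = 512  # archive block sizes are multiples of 512 bytes


def largest_multiple_below(limit: int, unit: int = BLOCK) -> int:
    """Largest multiple of `unit` that does not exceed `limit`."""
    return (limit // unit) * unit


signed_16_max = 2**15 - 1  # 32767, the largest signed 16-bit value
max_blocksize = largest_multiple_below(signed_16_max)
print(max_blocksize)                  # 32256, i.e. 2**15 - 512
print(5120 // BLOCK, 10240 // BLOCK)  # 10 20
```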
1776
1777       Implementations  are  permitted to modify the block-size value based on
1778       the archive format or the device to which the archive is being written.
1779       This  is to provide implementations with the opportunity to take advan‐
1780       tage of special types of devices, and it should not be used  without  a
1781       great  deal  of  consideration as it almost certainly decreases archive
1782       portability.
1783
1784       The intended use of the -n option was to permit extraction  of  one  or
1785       more files from the archive without processing the entire archive. This
1786       was viewed by the standard developers as offering  significant  perfor‐
1787       mance  advantages  over  historical  implementations.  The -n option in
1788       early proposals had three effects; the first was to cause special char‐
1789       acters in patterns to not be treated specially. The second was to cause
1790       only the first file that matched a pattern to be extracted.  The  third
1791       was  to  cause pax to write a diagnostic message to standard error when
1792       no file was found matching a specified pattern. Only the second  behav‐
1793       ior  is  retained by this volume of IEEE Std 1003.1-2001, for many rea‐
1794       sons. First, it is in general not acceptable for  a  single  option  to
1795       have  multiple  effects.  Second,  the ability to make pattern matching
1796       characters act as normal characters is useful for parts  of  pax  other
1797       than  file  extraction.  Third, a finer degree of control over the spe‐
1798       cial characters is useful because users may wish to  normalize  only  a
1799       single  special  character  in  a single filename. Fourth, given a more
1800       general escape mechanism, the previous behavior of the -n option can be
1801       easily  obtained using the -s option or a sed script.  Finally, writing
1802       a diagnostic message when a pattern specified by the user is  unmatched
1803       by any file is useful behavior in all cases.
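
           The retained behavior of -n (only the first member matching each
           pattern is selected, so reading can stop early) can be sketched as
           follows. This is a simplified illustration, not pax itself: the
           function name is hypothetical, and Python's fnmatchcase() is used as
           an approximation of pax pattern matching (for example, its "*" also
           matches "/").

```python
from fnmatch import fnmatchcase


def first_match_per_pattern(names, patterns):
    """Select, for each pattern, only the first archive member that matches."""
    unmatched = list(patterns)
    selected = []
    for name in names:
        if not unmatched:
            break  # every pattern is satisfied; the rest need not be read
        for pat in unmatched:
            if fnmatchcase(name, pat):
                selected.append(name)
                unmatched.remove(pat)
                break
    # Patterns left in `unmatched` would warrant a diagnostic message.
    return selected, unmatched


print(first_match_per_pattern(["a/x.c", "a/y.c", "b/z.h"], ["*.c", "*.h"]))
```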
1804
1805       In this version, the -n was removed from the copy mode synopsis of pax;
1806       it is inapplicable because there are no pattern operands  specified  in
1807       this mode.
1808
1809       IEEE Std 1003.1-2001 describes another method  besides  pax  for  copying
1810       subtrees, as part of the cp utility. Both  methods
1811       are  historical  practice: cp provides a simpler, more intuitive inter‐
1812       face, while pax offers a finer granularity of  control.  Each  provides
1813       additional functionality to the other; in particular, pax maintains the
1814       hard-link structure of the hierarchy while  cp  does  not.  It  is  the
1815       intention of the standard developers that the results be similar (using
1816       appropriate option combinations in both utilities). The results are not
1817       required  to  be  identical; there seemed insufficient gain to applica‐
1818       tions to balance the difficulty of implementations having to  guarantee
1819       that the results would be exactly identical.
1820
1821       A  single  archive  may  span  more than one file. It is suggested that
1822       implementations provide informative messages to the  user  on  standard
1823       error whenever the archive file is changed.
1824
1825       The -d option (do not create intermediate directories not listed in the
1826       archive) found in early proposals was originally provided as a  comple‐
1827       ment to the historic -d option of cpio.  It has been deleted.
1828
1829       The -s option in early proposals specified a subset of the substitution
1830       command from the ed utility. As there was no reason for only  a  subset
1831       to  be  supported,  the -s option is now compatible with the current ed
1832       specification. Since the delimiter can be any non-null  character,  the
1833       following usage with single spaces is valid:
1834
1835
1836              pax -s " foo bar " ...
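
           To illustrate why the expression above is valid, the sketch below
           splits a substitution expression on whatever character comes first.
           It is a simplification under stated assumptions: the function name is
           hypothetical and backslash-escaped delimiters are not handled.

```python
def split_substitution(expr: str) -> tuple[str, str, str]:
    """Split '<d>old<d>new<d>flags', using the first character as delimiter."""
    if len(expr) < 3:
        raise ValueError("expression too short")
    delim = expr[0]
    parts = expr[1:].split(delim)
    if len(parts) != 3:
        raise ValueError("expected exactly three delimiters")
    old, new, flags = parts
    return old, new, flags


print(split_substitution(" foo bar "))   # ('foo', 'bar', '')
print(split_substitution("/foo/bar/g"))  # ('foo', 'bar', 'g')
```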
1837
1838       The  -t  description  is  worded  so as to note that this may cause the
1839       access time update caused by some other activity  (which  occurs  while
1840       the file is being read) to be overwritten.
1841
1842       The  default  behavior of pax with regard to file modification times is
1843       the same as historical implementations of tar. It is not the historical
1844       behavior of cpio.
1845
1846       Because  the  -i  option uses /dev/tty, utilities without a controlling
1847       terminal are not able to use this option.
1848
1849       The -y option, found in early proposals, has  been  deleted  because  a
1850       line  containing a single period for the -i option has equivalent func‐
1851       tionality. The special lines for the -i option (a single period and the
1852       empty line) are historical practice in cpio.
1853
1854       In early drafts, a -e charmap option was included to increase portabil‐
1855       ity of files between systems using different coded character sets. This
1856       option  was omitted because it was apparent that consensus could not be
1857       formed for it. In this version, the use of UTF-8 should be an  adequate
1858       substitute.
1859
1860       The  -k  option  was  added to address international concerns about the
1861       dangers involved in the character set transformations  of  -e  (if  the
1862       target  character  set  were  different  from the source, the filenames
1863       might be transformed into names matching existing files) and  also  was
1864       made  more  general  to  protect files transferred between file systems
1865       with different {NAME_MAX} values (truncating a filename  on  a  smaller
1866       system  might  also inadvertently overwrite existing files). As stated,
1867       it prevents any overwriting, even if the target file is older than  the
1868       source.  This  version  adds  more granularity of options to solve this
1869       problem by introducing the -o invalid= option, specifically  the  UTF-8
1870       action. (Note that an existing file that is named with a UTF-8 encoding
1871       is still subject to overwriting in this case. The -k option closes that
1872       loophole.)
1873
1874       Some   of  the  file  characteristics  referenced  in  this  volume  of
1875       IEEE Std 1003.1-2001 might not be supported by  some  archive  formats.
1876       For  example,  neither the tar nor cpio formats contain the file access
1877       time. For this reason, the e specification character has been provided,
1878       intended  to cause all file characteristics specified in the archive to
1879       be retained.
1880
1881       It is required that  extracted  directories,  by  default,  have  their
1882       access  and modification times and permissions set to the values speci‐
1883       fied in the archive. This has obvious problems in that the  directories
1884       are  almost certainly modified after being extracted and that directory
1885       permissions may not permit file creation.  One possible solution is  to
1886       create  directories with the mode specified in the archive, as modified
1887       by the umask of the user, with sufficient  permissions  to  allow  file
1888       creation. After all files have been extracted, pax would then reset the
1889       access and modification times and permissions as necessary.
1890
1891       The list-mode formatting  description  borrows  heavily  from  the  one
1892       defined  by the printf utility. However, since there is no separate op‐
1893       erand list to get conversion arguments,  the  format  was  extended  to
1894       allow  specifying  the  name  of the conversion argument as part of the
1895       conversion specification.
1896
1897       The T conversion specifier allows time fields to be displayed in any of
1898       the date formats. Unlike the ls utility, pax does not adjust the format
1899       when the date is less than six months in the past. This  makes  parsing
1900       the output more predictable.
1901
1902       The   D  conversion  specifier  handles  the  ability  to  display  the
1903       major/minor or file size, as with ls, by using %-8(size)D.
1904
1905       The L conversion specifier handles the ls display for symbolic links.
1906
1907       Conversion specifiers were added to generate existing known types  used
1908       for ls.
1909
1910   pax Interchange Format
1911       The  new  POSIX data interchange format was developed primarily to sat‐
1912       isfy international concerns that the ustar and  cpio  formats  did  not
1913       provide for file, user, and group names encoded in characters outside a
1914       subset of the ISO/IEC 646:1991 standard. The standard developers  real‐
1915       ized  that this new POSIX data interchange format should be very exten‐
1916       sible because there were other requirements they foresaw  in  the  near
1917       future:
1918
1919        * Support international character encodings and locale information
1920
1921        * Support security information (ACLs, and so on)
1922
1923        * Support future file types, such as realtime or contiguous files
1924
1925        * Include data areas for implementation use
1926
1927        * Support  systems with words larger than 32 bits and timers with sub‐
1928          second granularity
1929
1930       The following were not goals for this format because these  are  better
1931       handled  by separate utilities or are inappropriate for a portable for‐
1932       mat:
1933
1934        * Encryption
1935
1936        * Compression
1937
1938        * Data translation between locales and codesets
1939
1940        * inode storage
1941
1942       The format chosen to support the goals is an  extension  of  the  ustar
1943       format.  Of the two formats previously available, only the ustar format
1944       was selected for extensions because:
1945
1946        * It was easier to extend in an  upwards-compatible  way.  It  offered
1947          version  flags  and  header  block  type fields with room for future
1948          standardization. The cpio format, while possessing a  more  flexible
1949          file naming methodology, could not be extended without breaking some
1950          theoretical implementation or using a dummy filename that could be a
1951          legitimate filename.
1952
1953        * Industry experience since  the  original  "tar  wars"  fought  in
1954          developing the ISO POSIX-1 standard has clearly been in favor of the
1955          ustar  format, which is generally the default output format selected
1956          for pax implementations on new systems.
1957
1958       The new format was designed with one additional goal in  mind:  reason‐
1959       able  behavior when an older tar or pax utility happened to read an ar‐
1960       chive. Since the POSIX.1-1990 standard mandated that a  "format-reading
1961       utility"  had  to  treat unrecognized typeflag values as regular files,
1962       this allowed the format to include all the extended  information  in  a
1963       pseudo-regular  file  that  preceded each real file. An option is given
1964       that allows the archive creator to set up reasonable  names  for  these
1965       files on the older systems. Also, the normative text suggests that rea‐
1966       sonable file access values be used for this ustar header block.  Making
1967       these  header  files  inaccessible  for convenient reading and deleting
1968       would not be reasonable. File permissions of 600 or 700 are suggested.
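
           The layout described above can be observed with Python's standard
           tarfile module, used here only as a stand-in for pax. The sketch
           writes one member whose name does not fit the 100-byte ustar name
           field and then inspects the typeflag byte (offset 156) of the first
           header block. The member name is an arbitrary example, and the name
           that tarfile gives the pseudo-file is its own choice.

```python
import io
import tarfile

buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tar:
    data = b"hello\n"
    info = tarfile.TarInfo(name="dir/" + "n" * 150)  # too long for ustar name
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))

raw = buf.getvalue()
# Offset 156 of each 512-byte header block holds the one-byte typeflag.
first_typeflag = raw[156:157]
print(first_typeflag)  # b'x': the extended header precedes the real file

with tarfile.open(fileobj=io.BytesIO(raw)) as tar:
    names = tar.getnames()  # a pax-aware reader recovers the full pathname
```

           A reader that does not understand typeflag x would, per the text
           above, extract that block as an ordinary file instead.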
1969
1970       The ustar typeflag field was used to accommodate the  additional  func‐
1971       tionality  of  the  new format rather than magic or version because the
1972       POSIX.1-1990 standard (and, by reference, the previous version of pax),
1973       mandated the behavior of the format-reading utility when it encountered
1974       an unknown typeflag, but was silent about the other two fields.
1975
1976       Early proposals of the first revision to IEEE Std 1003.1-2001 contained
1977       a  proposed  archive  format  that  was based on compatibility with the
1978       standard for tape files (ISO 1001, similar to the format used  histori‐
1979       cally  on  many  mainframes  and minicomputers). This format was overly
1980       complex  and  required  considerable  overhead  in  volume  and  header
1981       records. Furthermore, the standard developers felt that it would not be
1982       acceptable to the community  of  POSIX  developers,  so  it  was  later
1983       changed  to  be a format more closely related to historical practice on
1984       POSIX systems.
1985
1986       The prefix and name split of pathnames in ustar  was  replaced  by  the
1987       single path extended header record for simplicity.
1988
1989       The concept of a global extended header (typeflag g) was controversial.
1990       If this were applied to an archive  being  recorded  on  magnetic
1991       tape,  a  few unreadable blocks at the beginning of the tape could be a
1992       serious problem; a utility attempting to extract as many files as  pos‐
1993       sible  from  a  damaged  archive  could lose a large percentage of file
1994       header information in this case.  However, if the  archive  were  on  a
1995       reliable  medium,  such  as a CD-ROM, the global extended header offers
1996       considerable potential size reductions by eliminating redundant  infor‐
1997       mation.  Thus, the text warns against using the global method for unre‐
1998       liable media and provides a method for implanting global information in
1999       the  extended  header  for  each  file,  rather  than in the typeflag g
2000       records.
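
           The two placements discussed above (one global record, or the same
           information repeated in each file's extended header) can be compared
           with Python's tarfile module, again only as a stand-in for pax. The
           "comment" value and the build() helper are hypothetical.

```python
import io
import tarfile

SHARED = {"comment": "nightly build"}  # hypothetical shared record


def build(shared_as_global: bool) -> bytes:
    """Write two empty members, storing SHARED once (g) or per member (x)."""
    buf = io.BytesIO()
    global_headers = SHARED if shared_as_global else None
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT,
                      pax_headers=global_headers) as tar:
        for name in ("a.txt", "b.txt"):
            info = tarfile.TarInfo(name=name)
            if not shared_as_global:
                info.pax_headers = dict(SHARED)  # duplicated for robustness
            tar.addfile(info)
    return buf.getvalue()


global_raw = build(True)
per_file_raw = build(False)
print(global_raw[156:157], per_file_raw[156:157])  # b'g' b'x'

with tarfile.open(fileobj=io.BytesIO(per_file_raw)) as tar:
    per_member = [m.pax_headers.get("comment") for m in tar.getmembers()]
```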
2001
2002       No facility for data translation or filtering on a  per-file  basis  is
2003       included  because the standard developers could not invent an interface
2004       that would allow this in an efficient manner.  If  a  filter,  such  as
2005       encryption  or  compression,  is  to be applied to all the files, it is
2006       more efficient to apply the filter to the entire archive  as  a  single
2007       file. The standard developers considered interfaces that would invoke a
2008       shell script for each file going into or out of the  archive,  but  the
2009       system overhead in this approach was considered to be too high.
2010
2011       One such approach would be to have filter= records that give a pathname
2012       for an executable. When the program is invoked, the  file  and  archive
2013       would be open for standard input/output and all the header fields would
2014       be available as environment variables or  command-line  arguments.  The
2015       standard  developers  did  discuss  such schemes, but they were omitted
2016       from IEEE Std 1003.1-2001 due to  concerns  about  excessive  overhead.
2017       Also,  the program itself would need to be in the archive if it were to
2018       be used portably.
2019
2020       There is currently no  portable  means  of  identifying  the  character
2021       set(s)  used for a file in the file system. Therefore, pax has not been
2022       given a mechanism to generate charset records automatically.  The  only
2023       portable means of doing this is for the user to write the archive using
2024       the -o charset=string command line option. This  assumes  that  all  of
2025       the  files  in  the archive use the same encoding. The "implementation-
2026       defined" text is included to allow for a system that can  identify  the
2027       encodings used for each of its files.
2028
2029       The  table of standards that accompanies the charset record description
2030       is acknowledged to be very limited. Only a limited number of  character
2031       set standards is reasonable for maximal interchange.  Any character set
2032       is, of course, possible by  prior  agreement.  It  was  suggested  that
2033       EBCDIC  be  listed,  but  it was omitted because it is not defined by a
2034       formal standard. Formal standards, and then only those with  reasonably
2035       large  followings,  can be included here, simply as a matter of practi‐
2036       cality. The <value>s represent names of officially registered character
2037       sets in the format required by the ISO 2375:1985 standard.
2038
2039       The  normal  comma  or <blank>-separated list rules are not followed in
2040       the case of keyword options to  allow  ease  of  argument  parsing  for
2041       getopts.
2042
2043       Further  information on character encodings is in pax Archive Character
2044       Set Encoding/Decoding.
2045
2046       The standard developers have reserved keyword  name  space  for  vendor
2047       extensions. It is suggested that the format to be used is:
2048
2049
2050              VENDOR.keyword
2051
2052       where VENDOR is the name of the vendor or organization in all uppercase
2053       letters. It is further suggested that the keyword following the  period
2054       be named differently than any of the standard keywords so that it could
2055       be used for future standardization, if  appropriate,  by  omitting  the
2056       VENDOR prefix.
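
           A vendor keyword of the suggested shape can be round-tripped with
           Python's tarfile module, used here as a stand-in for pax. Both the
           keyword EXAMPLE.reviewer and the regular expression are hypothetical;
           the expression merely encodes "uppercase vendor name, period,
           keyword".

```python
import io
import re
import tarfile

VENDOR_KEYWORD = re.compile(r"^[A-Z]+\..+$")  # assumed shape: VENDOR.keyword

buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tar:
    info = tarfile.TarInfo(name="report.txt")
    info.pax_headers = {"EXAMPLE.reviewer": "alice"}  # hypothetical keyword
    tar.addfile(info)

with tarfile.open(fileobj=io.BytesIO(buf.getvalue())) as tar:
    headers = tar.getmember("report.txt").pax_headers

vendor_only = {k: v for k, v in headers.items() if VENDOR_KEYWORD.match(k)}
print(vendor_only)  # {'EXAMPLE.reviewer': 'alice'}
```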
2057
2058       The  <length>  field in the extended header record was included to make
2059       it simpler to step through the records, even if a  record  contains  an
2060       unknown  format (to a particular pax) with complex interactions of spe‐
2061       cial characters. It also provides a minor integrity  checkpoint  within
2062       the records to aid a program attempting to recover files from a damaged
2063       archive.
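
           The role of the <length> field can be sketched as follows, assuming
           the "<length> <keyword>=<value>\n" record layout of the pax format,
           in which <length> counts the entire record including its own digits.
           The helper names are hypothetical. The reader steps from record to
           record using only <length>, so values containing "=" or newlines do
           not confuse it.

```python
def make_record(keyword: str, value: str) -> bytes:
    """Build one record; <length> counts the whole record, digits included."""
    body = f" {keyword}={value}\n".encode("utf-8")
    length = len(body) + 1
    # The digit count of <length> is part of the length: find the fixed point.
    while len(str(length)) + len(body) != length:
        length = len(str(length)) + len(body)
    return str(length).encode("ascii") + body


def iter_records(data: bytes):
    """Step through records using only <length>; payloads are never scanned."""
    pos = 0
    while pos < len(data):
        space = data.index(b" ", pos)
        length = int(data[pos:space])
        record = data[space + 1:pos + length - 1]  # drop the trailing newline
        keyword, _, value = record.partition(b"=")
        yield keyword.decode("utf-8"), value.decode("utf-8")
        pos += length


print(make_record("path", "/etc/hosts"))  # b'19 path=/etc/hosts\n'
```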
2064
2065       There are no extended header versions  of  the  devmajor  and  devminor
2066       fields because the unspecified format ustar header field should be suf‐
2067       ficient. If they are not, vendor-specific extended  keywords  (such  as
2068       VENDOR.devmajor) should be used.
2069
2070       Device  and i-number labeling of files was not adopted from cpio; files
2071       are interchanged strictly on a symbolic name basis, as in ustar.
2072
2073       Just as with the ustar format descriptions, the  new  format  makes  no
2074       special arrangements for multi-volume archives. Each of the pax archive
2075       types is assumed to be inside a single POSIX file  and  splitting  that
2076       file  over  multiple  volumes  (diskettes, tape cartridges, and so on),
2077       processing their labels, and mounting each in the proper  sequence  are
2078       considered  to  be  implementation  details  that  cannot  be described
2079       portably.
2080
2081       The pax format is intended for interchange, not only for  backup  on  a
2082       single  (family  of)  systems.  It is not as densely packed as might be
2083       possible for backup:
2084
2085        * It contains information as coded characters that could be  coded  in
2086          binary.
2087
2088        * It  identifies extended records with name fields that could be omit‐
2089          ted in favor of a fixed-field layout.
2090
2091        * It translates names into a portable  character  set  and  identifies
2092          locale-related  information,  both of which are probably unnecessary
2093          for backup.
2094
2095       The requirements on restoring from an archive  are  slightly  different
2096       from  the  historical wording, allowing for non-monolithic privilege to
2097       bring forward as much as possible. In particular,  attributes  such  as
2098       "high  performance  file"  might be broadly but not universally granted
2099       while set-user-ID or chown() might be much more restricted.   There  is
2100       no implication in IEEE Std 1003.1-2001 that the security information be
2101       honored after it is restored to the file hierarchy, in  spite  of  what
2102       might  be  improperly  inferred by the silence on that topic. That is a
2103       topic for another standard.
2104
2105       Links are recorded in the fashion described here because a link can  be
2106       to any file type. It is desirable in general to be able to restore part
2107       of an archive selectively and restore all of those files completely. If
2108       the  data  is  not  associated with each link, it is not possible to do
2109       this. However, the data associated with a file can be large,  and  when
2110       selective  restoration is not needed, this can be a significant burden.
2111       The archive is structured so that files that have  no  associated  data
2112       can always be restored by the name of any link, and
2113       the user may choose whether data is recorded with each  instance  of  a
2114       file  that  contains  data.  The format permits mixing of both types of
2115       links in a single archive; this can be done for special needs, and  pax
2116       is  expected  to interpret such archives on input properly, despite the
2117       fact that there is no pax option that would force this  mixed  case  on
2118       output.  (When  -o linkdata is used, the output must contain the dupli‐
2119       cate data, but the implementation is free to include it or omit it when
2120       -o linkdata is not used.)
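
           The default case (data stored once, later names recorded as links)
           can be observed with Python's tarfile module, used here as a stand-in
           for pax. The sketch assumes a file system that supports hard links;
           the file names are arbitrary. Note that tarfile has no counterpart to
           -o linkdata, so the duplicate-data case is not shown.

```python
import io
import os
import tarfile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    first = os.path.join(tmp, "data.bin")
    second = os.path.join(tmp, "alias.bin")
    with open(first, "wb") as fh:
        fh.write(b"x" * 1000)
    os.link(first, second)  # a second name for the same file

    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tar:
        tar.add(first, arcname="data.bin")
        tar.add(second, arcname="alias.bin")

with tarfile.open(fileobj=io.BytesIO(buf.getvalue())) as tar:
    # Each entry: (name, is a hard link?, link target, bytes stored)
    summary = [(m.name, m.islnk(), m.linkname, m.size)
               for m in tar.getmembers()]

for row in summary:
    print(row)
```

           The second member carries no data of its own, which is why restoring
           it alone requires the first member to be available.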
2121
2122       The  time  values  are  included  as  extended header records for those
2123       implementations needing more than the eleven octal  digits  allowed  by
2124       the  ustar  format. Portable file timestamps cannot be negative. If pax
2125       encounters a file with a negative timestamp in copy or write  mode,  it
2126       can reject the file, substitute a non-negative timestamp, or generate a
2127       non-portable timestamp with a leading '-'. Even though some
2128       implementations can support finer file-time granularities than seconds,
2129       the normative text requires support only for seconds since the Epoch
2130       because the ISO POSIX-1 standard states them that way. The ustar format
2131       includes only mtime; the new format adds atime and ctime for  symmetry.
2132       The  atime  access time restored to the file system will be affected by
2133       the -p a and -p e options.  The ctime  creation  time  (actually  inode
2134       modification time) is described with "appropriate privilege" so that it
2135       can be ignored when writing to the file system. POSIX does not  provide
2136       a  portable  means to change file creation time. Nothing is intended to
2137       prevent a non-portable implementation of pax from restoring the value.
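
           The limit imposed by eleven octal digits can be made concrete with a
           small calculation. The function name needs_extended_mtime is
           hypothetical, and treating fractional seconds as needing an extended
           record is an assumption of this sketch rather than a requirement
           stated above.

```python
USTAR_MTIME_MAX = 8**11 - 1  # eleven octal digits: 8589934591 seconds


def needs_extended_mtime(seconds: float) -> bool:
    """True if the value cannot be stored exactly in eleven octal digits."""
    return seconds < 0 or seconds > USTAR_MTIME_MAX or seconds != int(seconds)


print(oct(USTAR_MTIME_MAX))         # 0o77777777777
print(needs_extended_mtime(2**33))  # True: one past the largest value
print(needs_extended_mtime(-1))     # True: negative, hence not portable
```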
2138
2139       The gid, size, and uid extended header records were included  to  allow
2140       expansion  beyond  the  sizes  specified in the regular tar header. New
2141       file system architectures are emerging that will exhaust  the  12-digit
2142       size  field.  There are probably not many systems requiring more than 8
2143       digits for user and group IDs, but  the  extended  header  values  were
2144       included  for  completeness,  allowing overrides for all of the decimal
2145       values in the tar header.
2146
2147       The standard developers intended to describe the effective  results  of
2148       pax with regard to file ownerships and permissions; implementations are
2149       not restricted in timing or sequencing the restoration  of  such,  pro‐
2150       vided the results are as specified.
2151
2152       Much of the text describing the extended headers refers  to  use  in
2153       "write or copy modes". The copy mode references are due to the normative
2154       text:  "The  effect  of  the  copy shall be as if the copied files were
2155       written to an archive file and then subsequently extracted ...".  There
2156       is  certainly  no  way  to  test whether pax is actually generating the
2157       extended headers in copy mode, but the effects must be as if it had.
2158
2159   pax Archive Character Set Encoding/Decoding
2160       There is a need to exchange archives of files between systems  of  dif‐
2161       ferent  native codesets. Filenames, group names, and user names must be
2162       preserved to the fullest extent possible when an archive is read on the
2163       receiving  platform. Translation of the contents of files is not within
2164       the scope of the pax utility.
2165
2166       There will also be the need to represent characters that are not avail‐
2167       able  on the receiving platform. These unsupported characters cannot be
2168       automatically folded to the local set of characters due to  the  chance
2169       of  collisions.  This  could  result  in overwriting previous extracted
2170       files from the archive or pre-existing files on the system.
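
           The collision risk can be demonstrated with a deliberately lossy
           fold. The sketch below is illustrative only: fold_to_ascii is a
           hypothetical helper, and the two filenames are invented examples.

```python
import unicodedata


def fold_to_ascii(name: str) -> str:
    """Lossy fold: decompose, then drop everything outside ASCII."""
    decomposed = unicodedata.normalize("NFKD", name)
    return decomposed.encode("ascii", "ignore").decode("ascii")


# Two distinct archived names (the first contains U+00E9 twice).
archived = ["r\u00e9sum\u00e9.txt", "resume.txt"]
folded = [fold_to_ascii(n) for n in archived]
print(folded)  # ['resume.txt', 'resume.txt']: the second would overwrite the first
```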
2171
2172       For these reasons, the codeset used to represent characters within  the
2173       extended header records of the pax archive must be sufficiently rich to
2174       handle all commonly used character sets. The fields requiring  transla‐
2175       tion  include,  at  a  minimum, filenames, user names, group names, and
2176       link pathnames. Implementations may wish  to  have  localized  extended
2177       keywords that use non-portable characters.
2178
2179       The standard developers considered the following options:
2180
2181        * The  archive  creator  specifies the well-defined name of the source
2182          codeset. The receiver must then recognize the codeset name and  per‐
2183          form the appropriate translations to the destination codeset.
2184
2185        * The  archive  creator includes within the archive the character map‐
2186          ping table for the source codeset used  to  encode  extended  header
2187          records. The receiver must then read the character mapping table and
2188          perform the appropriate translations to the destination codeset.
2189
2190        * The archive creator translates the extended header  records  in  the
2191          source codeset into a canonical form. The receiver must then perform
2192          the appropriate translations to the destination codeset.
2193
2194       The approach that incorporates the name of the source codeset poses the
2195       problem  of codeset name registration, and makes the archive useless to
2196       pax archive decoders that do not recognize that codeset.
2197
2198       Because parts of an archive may be corrupted, the  standard  developers
2199       felt  that  including  the  character map of the source codeset was too
2200       fragile. The loss of this one key component could result in making  the
2201       entire  archive  useless.  (The  difference between this and the global
2202       extended header decision was that the latter has a workaround,  namely
2203       duplicating extended header records on unreliable media, but this would
2204       be too burdensome for large character set maps.)
2205
2206       Both of the above approaches also put an undue burden on  the  pax  ar‐
2207       chive  receiver  to handle the cross-product of all source and destina‐
2208       tion codesets.
2209
2210       To simplify the translation from the source codeset  to  the  canonical
2211       form  and from the canonical form to the destination codeset, the stan‐
2212       dard developers decided that the internal representation  should  be  a
2213       stateless  encoding.  A  stateless encoding is one where each codepoint
2214       has the same meaning, without regard to the decoder being in a specific
2215       state.  An  example of a stateful encoding would be the Japanese Shift-
2216       JIS; an example of a stateless encoding would be  the  ISO/IEC 646:1991
2217       standard (equivalent to 7-bit ASCII).
2218
2219       For these reasons, the standard developers decided to adopt a canonical
2220       format for the representation of file information strings. The obvious,
2221       well-endorsed  candidate is the ISO/IEC 10646-1:2000 standard (based in
2222       part on Unicode), which can be used to represent the characters of vir‐
2223       tually  all  standardized  character sets. The standard developers ini‐
2224       tially agreed upon using UCS2 (16-bit Unicode) as the  internal  repre‐
2225       sentation.  This  repertoire of characters provides a sufficiently rich
2226       set to represent all commonly-used codesets.
2227
2228       However, the standard developers found that the 16-bit  Unicode  repre‐
2229       sentation  had some problems. It forced the issue of standardizing byte
2230       ordering. The 2-byte length of each character made the extended  header
2231       records  twice as long for the case of strings coded entirely from his‐
2232       torical 7-bit ASCII. For these reasons, the standard  developers  chose
2233       the UTF-8 encoding defined in ISO/IEC 10646-1:2000. This multi-byte
2234       representation encodes UCS2 or UCS4 characters reliably and determinis‐
2235       tically,  eliminating  the need for a canonical byte ordering. In addi‐
2236       tion, NUL octets and other characters possibly confusing to POSIX  file
2237       systems  do not appear, except to represent themselves. It was realized
2238       that certain national codesets take up more space after  the  encoding,
2239       due  to their placement within the UCS range; it was felt that the use‐
2240       fulness of the encoding of the names outweighs the disadvantage of size
2241       increase for file, user, and group names.
2242
2243       The encoding of UTF-8 is as follows:
2244
2245
2246              UCS4 Hex Encoding  UTF-8 Binary Encoding
2247
2248
2249              00000000-0000007F  0xxxxxxx
2250              00000080-000007FF  110xxxxx 10xxxxxx
2251              00000800-0000FFFF  1110xxxx 10xxxxxx 10xxxxxx
2252              00010000-001FFFFF  11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
2253              00200000-03FFFFFF  111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
2254              04000000-7FFFFFFF  1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
2255
2256       where  each  'x' represents a bit value from the character being trans‐
2257       lated.
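
       The bit layout in the table above can be sketched in  a  few  lines  of
       Python.  This is an illustration only, not part of the standard; the
       function name utf8_encode is hypothetical. It follows the six-row table
       as printed here. Later definitions of UTF-8 (for example RFC 3629)
       restrict the range to 0010FFFF and exclude surrogates, which this
       sketch does not check.

```python
def utf8_encode(codepoint: int) -> bytes:
    """Encode one UCS-4 value using the bit layout of the table above."""
    if not 0 <= codepoint <= 0x7FFFFFFF:
        raise ValueError("value outside the UCS-4 range covered by the table")
    if codepoint <= 0x7F:
        return bytes([codepoint])  # 0xxxxxxx: 7-bit ASCII is unchanged

    # (upper bound of the row, lead-octet marker, number of 10xxxxxx octets)
    rows = [
        (0x7FF, 0xC0, 1),
        (0xFFFF, 0xE0, 2),
        (0x1FFFFF, 0xF0, 3),
        (0x3FFFFFF, 0xF8, 4),
        (0x7FFFFFFF, 0xFC, 5),
    ]
    for upper, marker, ncont in rows:
        if codepoint <= upper:
            break

    # Lead octet carries the high-order bits; each continuation octet
    # carries the next six bits, most significant group first.
    out = [marker | (codepoint >> (6 * ncont))]
    for i in range(ncont - 1, -1, -1):
        out.append(0x80 | ((codepoint >> (6 * i)) & 0x3F))
    return bytes(out)
```

       Every octet of a multi-octet sequence has its high bit set, so  a  NUL
       or slash octet in the output can only stand for itself, and no byte
       order needs to be agreed on.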
2258
2259   ustar Interchange Format
2260       The description of the ustar format reflects numerous enhancements over
2261       pre-1988  versions  of  the  historical  tar utility. The goal of these
2262       changes was not only to provide the  functional  enhancements  desired,
2263       but  also  to  retain  compatibility between new and old versions. This
2264       compatibility has been retained.  Archives written using  the  old  ar‐
2265       chive format are compatible with the new format.
2266
2267       Implementors  should  be  aware  that  the previous file format did not
2268       include a mechanism to archive directory type files. For  this  reason,
2269       the  convention  of  using  a filename ending with slash was adopted to
2270       specify a directory on the archive.
2271
2272       The total size of the name and prefix fields has been set to meet  the
2273       minimum  requirements for {PATH_MAX}. If a pathname will fit within the
2274       name field, it is recommended that the pathname be stored there without
2275       the use of the prefix field. Although the name field is known to be too
2276       small to contain {PATH_MAX} characters, the value was  not  changed  in
2277       this version of the archive file format to retain backwards-compatibil‐
2278       ity, and instead the prefix was introduced. Also, because of  the  ear‐
2279       lier  version  of the format, there is no way to remove the restriction
2280       on the linkname field being limited in size to just that  of  the  name
2281       field.
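
       The recommendation above can be sketched as follows. This is a minimal
       illustration, assuming the field widths from the ustar header
       description (name: 100 octets, prefix: 155 octets) and that the full
       pathname is formed as prefix, a slash, then name. The function name
       split_ustar_path is hypothetical.

```python
NAME_SIZE = 100    # width of the ustar name field, in octets
PREFIX_SIZE = 155  # width of the ustar prefix field, in octets


def split_ustar_path(path: bytes) -> tuple:
    """Return (prefix, name) so that the path is name alone, or
    prefix + b"/" + name when the prefix field is needed."""
    if len(path) <= NAME_SIZE:
        # Fits in the name field: store it there and leave prefix empty.
        return b"", path
    # Otherwise look for a slash that leaves both parts within their fields.
    for i, octet in enumerate(path):
        if octet != ord("/"):
            continue
        name_len = len(path) - i - 1
        if 1 <= i <= PREFIX_SIZE and 1 <= name_len <= NAME_SIZE:
            return path[:i], path[i + 1:]
    raise ValueError("pathname does not fit the ustar name and prefix fields")
```

       A pathname longer than 100 octets with no usable slash cannot be
       stored this way, which is why the sketch raises an error rather than
       truncating.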
2282
2283       The  size  field  is  required  to  be meaningful in all implementation
2284       extensions, although it could be zero. This is  required  so  that  the
2285       data blocks can always be properly counted.
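
       As a sketch of why a meaningful size matters, a reader can locate the
       next header only by counting data blocks from the size field. The
       example assumes the 512-octet ustar block size; the function names are
       hypothetical.

```python
BLOCK_SIZE = 512  # ustar logical record size, in octets


def data_blocks(size: int) -> int:
    """Number of whole blocks occupied by `size` octets of file data."""
    if size < 0:
        raise ValueError("size must not be negative")
    return (size + BLOCK_SIZE - 1) // BLOCK_SIZE


def next_header_offset(header_offset: int, size: int) -> int:
    """Offset of the header that follows this member's header and data."""
    return header_offset + BLOCK_SIZE + data_blocks(size) * BLOCK_SIZE
```

       With a size of zero the next header follows immediately, which is the
       case the text allows for.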
2286
2287       It is suggested that, if device special files  that  cannot  be  repre‐
2288       sented in the standard format need to be archived, one of the extension
2289       types (A-Z) be used, and that the additional information for  the  spe‐
2290       cial file be represented as data and be reflected in the
2291       size field.
2292
2293       Attempting  to  restore  a  special file type, where it is converted to
2294       ordinary data and conflicts with an existing filename, need not be spe‐
2295       cially  detected by the utility. If run as an ordinary user, pax should
2296       not be able to overwrite the entries in, for example, /dev in any  case
2297       (whether  the  file  is  converted to another type or not). If run as a
2298       privileged user, it should be able to do so, and it would be considered
2299       a bug if it did not.  The same is true of ordinary data files and simi‐
2300       larly named special files; it is impossible to anticipate the needs  of
2301       the user (who could really intend to overwrite the file), so the behav‐
2302       ior should be predictable (and thus regular) and rely on the protection
2303       system as required.
2304
2305       The  value 7 in the typeflag field is intended to define how contiguous
2306       files can be stored in a ustar archive.  IEEE Std 1003.1-2001 does  not
2307       require  the  contiguous file extension, but does define a standard way
2308       of archiving such files so that all conforming  systems  can  interpret
2309       these  file  types  in  a meaningful and consistent manner. On a system
2310       that does not support extended file types, the pax  utility  should  do
2311       the best it can with the file and go on to the next.
2312
2313       The file protection modes are those conventionally used by the ls util‐
2314       ity. This is extended beyond the usage in the ISO POSIX-2  standard  to
2315       support the "shared text" or "sticky" bit. It is intended that the con‐
2316       formance document should not document anything beyond the existence  of
2317       and  support  of  such a mode. Further extensions are expected to these
2318       bits, particularly with overloading the  set-user-ID  and  set-group-ID
2319       flags.
2320
2321   cpio Interchange Format
2322       The  reference to appropriate privilege in the cpio format refers to an
2323       error on standard output; the ustar format  does  not  make  comparable
2324       statements.
2325
2326       The  model  for  this  format  was the historical System V cpio -c data
2327       interchange format. This model documents the portable  version  of  the
2328       cpio  format  and  not  the  binary version.  It has the flexibility to
2329       transfer data of any type described within IEEE Std 1003.1-2001, yet is
2330       extensible  to  transfer  data  types  specific  to  extensions  beyond
2331       IEEE Std 1003.1-2001  (for  example,  contiguous  files).  Because   it
2332       describes  existing  practice,  there  is  no  question  of maintaining
2333       upwards-compatibility.
2334
2335   cpio Header
2336       There has been some concern that the size of the  c_ino  field  of  the
2337       header  is too small to handle those systems that have very large inode
2338       numbers. However, the c_ino field in the header is used strictly  as  a
2339       hard-link  resolution mechanism for archives. It is not necessarily the
2340       same value as the inode number of the file in the location  from  which
2341       that file is extracted.
2342
2343       The name c_magic is based on historical usage.
2344
2345   cpio Filename
2346       For  most  historical  implementations  of the cpio utility, {PATH_MAX}
2347       octets can be used to describe the pathname without the addition of any
2348       other  header  fields  (the  NUL  character  would  be included in this
2349       count). {PATH_MAX} is the minimum value for pathname  size,  documented
2350       as  256  bytes. However, an implementation may use c_namesize to deter‐
2351       mine the exact length of the pathname. With the current description  of
2352       the  <cpio.h>  header,  this  pathname size can be as large as a number
2353       that is described in six octal digits.
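
       The limit mentioned above follows from the field width. The sketch
       below assumes c_namesize is a six-digit, zero-padded octal field that
       counts the terminating NUL, as the text describes; the function names
       are hypothetical.

```python
C_NAMESIZE_DIGITS = 6
MAX_NAMESIZE = int("7" * C_NAMESIZE_DIGITS, 8)  # 0o777777 == 262143


def format_namesize(pathname: bytes) -> bytes:
    """Render the c_namesize field for a pathname (NUL included in count)."""
    count = len(pathname) + 1  # include the terminating NUL
    if count > MAX_NAMESIZE:
        raise ValueError("pathname too long for a six-digit octal field")
    return ("%0*o" % (C_NAMESIZE_DIGITS, count)).encode("ascii")


def parse_namesize(field: bytes) -> int:
    """Decode a c_namesize field back into an octet count."""
    if len(field) != C_NAMESIZE_DIGITS:
        raise ValueError("c_namesize must be exactly six octal digits")
    return int(field.decode("ascii"), 8)
```

       For the 10-octet pathname etc/passwd, format_namesize() yields
       000013, that is, 11 in octal notation.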
2354
2355       Two values are documented under the c_mode field values to provide  for
2356       extensibility for known file types:
2357
2358       0110 000
2359              Reserved  for contiguous files. The implementation may treat the
2360              rest of the information for this archive like  a  regular  file.
2361              If  this  file  type is undefined, the implementation may create
2362              the file as a regular file.
2363
2364
2365       This provides for extensibility of the cpio format while  allowing  for
2366       the  ability to read old archives. Files of an unknown type may be read
2367       as "regular files" on some implementations.  On a system that does  not
2368       support  extended file types, the pax utility should do the best it can
2369       with the file and go on to the next.
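
       One way a reader might apply this fallback is sketched below. The
       values of C_ISREG and C_ISCTG are those defined in <cpio.h>; the
       TYPE_MASK constant and the function name are assumptions made for the
       illustration, not part of the format description.

```python
C_ISREG = 0o100000    # regular file, from <cpio.h>
C_ISCTG = 0o110000    # reserved for contiguous files, from <cpio.h>
TYPE_MASK = 0o170000  # assumption: the file-type bits of c_mode


def effective_mode(c_mode: int, supports_contiguous: bool = False) -> int:
    """Map a contiguous-file entry to a regular file where unsupported."""
    file_type = c_mode & TYPE_MASK
    if file_type == C_ISCTG and not supports_contiguous:
        # Keep the permission bits, replace only the file type.
        return (c_mode & ~TYPE_MASK) | C_ISREG
    return c_mode
```

       Entries of every other type pass through unchanged, so archives
       without extended file types are read exactly as before.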
2370

FUTURE DIRECTIONS

2372       None.
2373

SEE ALSO

2375       Shell Command Language, cp, ed, getopts, ls, printf(), the Base Defini‐
2376       tions  volume  of IEEE Std 1003.1-2001, <cpio.h>, the System Interfaces
2377       volume of IEEE Std 1003.1-2001, chown(),  creat(),  mkdir(),  mkfifo(),
2378       stat(), utime(), write()
2379

COPYRIGHT

2381       Portions  of  this text are reprinted and reproduced in electronic form
2382       from IEEE Std 1003.1, 2003 Edition, Standard for Information Technology
2383       --  Portable  Operating  System  Interface (POSIX), The Open Group Base
2384       Specifications Issue 6, Copyright (C) 2001-2003  by  the  Institute  of
2385       Electrical  and  Electronics  Engineers, Inc and The Open Group. In the
2386       event of any discrepancy between this version and the original IEEE and
2387       The  Open Group Standard, the original IEEE and The Open Group Standard
2388       is the referee document. The original Standard can be  obtained  online
2389       at http://www.opengroup.org/unix/online.html .
2390
2391
2392
2393IEEE/The Open Group                  2003                              PAX(1P)