SPAX(1L)                    Schily's USER COMMANDS                    SPAX(1L)

NAME

       pax - portable archive interchange

SYNOPSIS

       spax        [other options]      [-cdnv]      [-H|-L]      [-f archive]
              [-o options]...  [-s replstr]...  [pattern...]

       spax   -r   [other options]     [-cdiknuv]     [-H|-L]     [-f archive]
              [-o options]...  [-p string]...  [-s replstr]...  [pattern...]

       spax   -w   [other options]   [-dituvX]   [-H|-L]  [-b blocksize]  [-a]
              [-f archive]   [-o options]...    [-s replstr]...    [-x format]
              [file...]

       spax   -r -w [other options]   [-diklntuvX]   [-H|-L]   [-o options]...
              [-p string]...  [-s replstr]...  [file...] directory

DESCRIPTION

       The pax utility shall read, write, and list the members of archive
       files and copy directory hierarchies. A variety of archive formats
       shall be supported; see the -x format option.

       The action to be taken depends  on  the  presence  of  the  -r  and  -w
       options. The four combinations of -r and -w are referred to as the four
       modes of operation: list, read, write, and  copy  modes,  corresponding
       respectively to the four forms shown in the SYNOPSIS section.

       list   In  list  mode  (when neither -r nor -w is specified), pax shall
              write the names of the members of the archive file read from the
              standard  input, with pathnames matching the specified patterns,
              to standard output. If a named file is of  type  directory,  the
              file hierarchy rooted at that file shall be listed as well.

       read   In  read  mode  (when -r is specified, but -w is not), pax shall
              extract the members of the archive file read from  the  standard
              input,  with  pathnames  matching the specified patterns.  If an
              extracted file is of type directory, the file  hierarchy  rooted
              at  that  file  shall  be extracted as well. The extracted files
              shall be created performing pathname resolution with the  direc‐
              tory in which pax was invoked as the current working directory.

              If  an attempt is made to extract a directory when the directory
              already exists, this shall not be considered  an  error.  If  an
              attempt  is made to extract a FIFO when the FIFO already exists,
              this shall not be considered an error.

              The ownership, access, and modification times, and file mode  of
              the restored files are discussed under the -p option.

       write  In  write  mode (when -w is specified, but -r is not), pax shall
              write the contents of the file operands to the  standard  output
              in  an archive format. If no file operands are specified, a list
              of files to copy, one per line, shall be read from the  standard
              input.  A  file of type directory shall include all of the files
              in the file hierarchy rooted at the file.

       copy   In copy mode (when both -r and -w are specified), pax shall copy
              the file operands to the destination directory.

              If  no file operands are specified, a list of files to copy, one
              per line, shall be read from the standard input. A file of  type
              directory  shall  include all of the files in the file hierarchy
              rooted at the file.

              The effect of the copy shall be as  if  the  copied  files  were
              written  to  an  archive  file  and then subsequently extracted,
              except that there may be hard links between the original and the
              copied  files. If the destination directory is a subdirectory of
              one of the files to be copied, the results are  unspecified.  If
              the destination directory is a file of a type not defined by the
              System Interfaces volume of IEEE Std  1003.1-2001,  the  results
              are  implementation-defined; otherwise, it shall be an error for
              the file named by the directory operand not  to  exist,  not  be
              writable by the user, or not be a file of type directory.
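
       As a minimal sketch of the mode selection described above, the
       following Python fragment maps the presence of -r and -w to the four
       modes. The function name pax_mode is hypothetical and is used only
       for illustration; it is not part of pax or spax.

```python
def pax_mode(r: bool, w: bool) -> str:
    """Return the mode implied by the presence of -r and -w."""
    if r and w:
        return "copy"
    if r:
        return "read"
    if w:
        return "write"
    return "list"
```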

       In  read  or  copy  modes, if intermediate directories are necessary to
       extract an archive member, pax shall perform actions equivalent to  the
       mkdir()  function  defined  in the System Interfaces volume of IEEE Std
       1003.1-2001, called with the following arguments:

       ·      The intermediate directory used as the path argument.

       ·      The value of the bitwise-inclusive OR of S_IRWXU,  S_IRWXG,  and
              S_IRWXO as the mode argument.
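
       The mode argument above can be worked out as follows. This is only
       an illustrative sketch: the names INTERMEDIATE_DIR_MODE and
       effective_mode are hypothetical, and the second function merely
       shows how mkdir() combines the requested mode with the process
       file mode creation mask.

```python
import stat

# Mode passed to the mkdir()-equivalent call for intermediate directories.
INTERMEDIATE_DIR_MODE = stat.S_IRWXU | stat.S_IRWXG | stat.S_IRWXO


def effective_mode(umask: int) -> int:
    """Permission bits a new directory would receive under the given umask."""
    return INTERMEDIATE_DIR_MODE & ~umask
```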

       If  any  specified pattern or file operands are not matched by at least
       one file or archive member, pax shall write  a  diagnostic  message  to
       standard error for each one that did not match and exit with a non-zero
       exit status.

       The archive formats described in the EXTENDED DESCRIPTION section shall
       be  automatically  detected on input. The default output archive format
       shall be implementation-defined.

       The spax implementation defaults to -x ustar.

       A single archive can span multiple files. The pax utility shall  deter‐
       mine,  in  an implementation-defined manner, what file to read or write
       as the next file.

       If the selected archive format supports  the  specification  of  linked
       files,  it  shall  be an error if these files cannot be linked when the
       archive is extracted, except that if the files to be  linked  are  sym‐
       bolic  links and the system is not capable of making hard links to sym‐
       bolic links, then separate copies of the symbolic link shall be created
       instead.  For archive formats that do not store file contents with each
       name that causes a hard link, if the file that contains the data is not
       extracted  during  this  pax session, either the data shall be restored
       from the original file, or a diagnostic message shall be displayed with
       the  name of a file that can be used to extract the data. In traversing
       directories, pax shall detect infinite loops; that is, entering a  pre‐
       viously visited directory that is an ancestor of the last file visited.
       When it detects an infinite loop, pax shall write a diagnostic  message
       to standard error and shall terminate.
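
       One way the infinite loop check could work is sketched below in
       Python, assuming a POSIX file system and a traversal that follows
       symbolic links. The function walk and its use of device and inode
       numbers to identify directories are assumptions made for this
       illustration, not a description of how spax is implemented.

```python
import os
import stat


def walk(path, ancestors=()):
    """Yield path and everything below it, following symbolic links.

    Raises RuntimeError when a directory is entered that is already an
    ancestor of the current file, which is the infinite loop case.
    """
    st = os.stat(path)  # os.stat() follows symbolic links
    if not stat.S_ISDIR(st.st_mode):
        yield path
        return
    key = (st.st_dev, st.st_ino)  # identifies the directory itself
    if key in ancestors:
        raise RuntimeError(f"infinite loop detected at {path}")
    yield path
    for name in sorted(os.listdir(path)):
        yield from walk(os.path.join(path, name), ancestors + (key,))
```

       A caller would report the error on standard error and terminate,
       as required above.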

OPTIONS

       The  pax  utility  shall conform to the Base Definitions volume of IEEE
       Std 1003.1-2001, Section 12.2, Utility Syntax Guidelines,  except  that
       the order of presentation of the -o, -p, and -s options is significant.
       See also the OTHER OPTIONS section.

       The following options shall be supported:

       -r     Read an archive file from standard input.

       -w     Write files to the standard output in the specified archive for‐
              mat.

       -a     Append  files  to  the end of the archive. It is implementation-
              defined which devices on the  system  support  appending.  Addi‐
              tional  file  formats  unspecified  by  this  volume of IEEE Std
              1003.1-2001 may impose restrictions on appending.

       -b blocksize
              Block the output at a positive decimal integer number  of  bytes
              per  write  to the archive file. Devices and archive formats may
              impose restrictions on blocking. Blocking shall be automatically
              determined on input. Conforming applications shall not specify a
              blocksize value larger than 32256.  Default blocking when creat‐
              ing  archives  depends on the archive format. (See the -x option
              below.)

       -c     Match all file or archive members except those specified by  the
              pattern or file operands.

       -d     Cause  files  of  type directory being copied or archived or ar‐
              chive members of type directory being  extracted  or  listed  to
              match  only  the  file or archive member itself and not the file
              hierarchy rooted at the file.

       -f archive
              Specify the pathname of the input or output archive,  overriding
              the  default  standard input (in list or read modes) or standard
              output (write mode).

       -H     If a symbolic link referencing a file of type directory is spec‐
              ified  on the command line, pax shall archive the file hierarchy
              rooted in the file referenced by the link, using the name of the
              link  as  the  root of the file hierarchy.  Otherwise, if a sym‐
              bolic link referencing a file of any other file type  which  pax
              can  normally archive is specified on the command line, then pax
              shall archive the file referenced by the link, using the name of
              the  link. The default behavior shall be to archive the symbolic
              link itself.

       -i     Interactively rename files or archive members. For each  archive
              member  matching a pattern operand or file matching a file oper‐
              and, a prompt shall be written to the file /dev/tty.  The prompt
              shall  contain  the  name of the file or archive member, but the
              format is otherwise unspecified. A line shall then be read  from
              /dev/tty.  If  this  line  is  blank, the file or archive member
              shall be skipped. If this line consists of a single period,  the
              file  or  archive member shall be processed with no modification
              to its name. Otherwise, its name shall be replaced with the con‐
              tents of the line. The pax utility shall immediately exit with a
              non-zero exit status if end-of-file is encountered when  reading
              a response or if /dev/tty cannot be opened for reading and writ‐
              ing.

              The results of extracting a hard link to a file  that  has  been
              renamed during extraction are unspecified.

       -k     Prevent the overwriting of existing files.

       -l     (The letter ell.) In copy mode, hard links shall be made between
              the source and destination file hierarchies  whenever  possible.
              If  specified in conjunction with -H or -L, when a symbolic link
              is encountered, the hard link created in  the  destination  file
              hierarchy  shall be to the file referenced by the symbolic link.
              If specified when neither -H nor -L is specified,  when  a  sym‐
              bolic  link  is  encountered,  the implementation shall create a
              hard link to the symbolic link in the source file  hierarchy  or
              copy the symbolic link to the destination.

       -L     If a symbolic link referencing a file of type directory is spec‐
              ified on the command line or encountered during the traversal of
              a file hierarchy, pax shall archive the file hierarchy rooted in
              the file referenced by the link, using the name of the  link  as
              the  root  of the file hierarchy.  Otherwise, if a symbolic link
              referencing a file of any other file type which pax can normally
              archive  is  specified on the command line or encountered during
              the traversal of a file hierarchy, pax shall  archive  the  file
              referenced  by the link, using the name of the link. The default
              behavior shall be to archive the symbolic link itself.

       -n     Select the first archive member that matches each pattern  oper‐
              and.  No  more than one archive member shall be matched for each
              pattern (although members of type directory  shall  still  match
              the file hierarchy rooted at that file).

       -o options
              Provide  information  to  the implementation to modify the algo‐
              rithm for extracting or writing  files.  The  value  of  options
              shall  consist  of  one  or more comma-separated keywords of the
              form:

              keyword[[:]=value][,keyword[[:]=value],...]

              Some keywords apply only to certain file formats,  as  indicated
              with  each description. Use of keywords that are inapplicable to
              the file format being processed produces undefined results.

              Keywords in the options argument shall be a string that would be
              a  valid  portable filename as described in the Base Definitions
              volume of IEEE Std 1003.1-2001, Section 3.276, Portable Filename
              Character Set.

              Note:  Keywords are not expected to be filenames, merely to fol‐
                     low the same  character  composition  rules  as  portable
                     filenames.

              Keywords can be preceded with white space. The value field shall
              consist of zero or more characters; within value,  the  applica‐
              tion  shall  precede  any  literal comma with a backslash, which
              shall be ignored, but preserves the comma as part  of  value.  A
              comma  as  the  final  character,  or a comma followed solely by
              white space  as  the  final  characters,  in  options  shall  be
              ignored. Multiple -o options can be specified; if keywords given
              to these multiple -o options conflict, the keywords  and  values
              appearing  later  in command line sequence shall take precedence
              and the earlier shall be silently ignored. The following keyword
              values  of  options  shall  be supported for the file formats as
              indicated:

              delete=pattern
                     (Applicable only to the -x  pax  format.)  When  used  in
                     write  or  copy mode, pax shall omit from extended header
                     records that it produces any keywords matching the string
                     pattern. When used in read or list mode, pax shall ignore
                     any keywords matching the string pattern in the  extended
                     header  records.  In  both  cases, matching shall be per‐
                     formed using the pattern matching notation  described  in
                     Patterns  Matching a Single Character and Patterns Match‐
                     ing Multiple Characters. For example:

                     -o delete=security.*

                     would  suppress  security-related  information.  See  pax
                     Extended Header for extended header record keyword usage.

                     When  multiple  -o  delete=pattern options are specified,
                     the patterns shall be additive; all keywords matching the
                     specified  string patterns shall be omitted from extended
                     header records that pax produces.

              exthdr.name=string
                     (Applicable only to the  -x  pax  format.)  This  keyword
                     allows  user  control  over the name that is written into
                     the ustar header blocks for the extended header  produced
                     under  the  circumstances  described in pax Header Block.
                     The name shall be the contents of string, after the  fol‐
                     lowing character substitutions have been made:

                  ┌─────────────────┬─────────────────────────────────────────────┐
                  │string Includes: │ Replaced By:                                │
                  ├─────────────────┼─────────────────────────────────────────────┤
                  │%d               │ The directory name of the file, equivalent  │
                  │                 │ to the result of the dirname utility on the │
                  │                 │ translated pathname.                        │
                  ├─────────────────┼─────────────────────────────────────────────┤
                  │%f               │ The filename of the file, equivalent to the │
                  │                 │ result of the basename utility on the       │
                  │                 │ translated pathname.                        │
                  ├─────────────────┼─────────────────────────────────────────────┤
                  │%p               │ The process ID of the pax process.          │
                  ├─────────────────┼─────────────────────────────────────────────┤
                  │%%               │ A '%' character.                            │
                  └─────────────────┴─────────────────────────────────────────────┘
                     Any  other  '%'  characters  in  string produce undefined
                     results.

                     If no -o exthdr.name=string is specified, pax  shall  use
                     the following default value:

                             %d/PaxHeaders.%p/%f
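
                     The substitutions in the table above can be sketched in
                     Python as shown here. The function expand_exthdr_name is
                     hypothetical, posixpath only approximates the dirname
                     and basename utilities (for example with trailing
                     slashes), and raising an error for other '%' sequences
                     is a choice made by this sketch, since the results are
                     undefined.

```python
import posixpath
import re


def expand_exthdr_name(template: str, pathname: str, pid: int) -> str:
    """Expand %d, %f, %p and %% in an exthdr.name template."""
    values = {
        # posixpath only approximates the dirname and basename utilities
        "d": posixpath.dirname(pathname) or ".",
        "f": posixpath.basename(pathname),
        "p": str(pid),
        "%": "%",
    }

    def replace(match):
        key = match.group(1)
        if key not in values:
            # results are undefined in this case; the sketch rejects it
            raise ValueError(f"unsupported sequence %{key}")
        return values[key]

    return re.sub(r"%(.?)", replace, template)
```

                     With the default template, a member named
                     usr/lib/libc.a written by process 1234 would get the
                     header name usr/lib/PaxHeaders.1234/libc.a.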

              globexthdr.name=string
                     (Applicable  only  to  the  -x  pax format.) When used in
                     write or copy mode  with  the  appropriate  options,  pax
                     shall  create  global  extended header records with ustar
                     header blocks that will be treated as  regular  files  by
                     previous  versions of pax.  This keyword allows user con‐
                     trol over the name that is written into the ustar  header
                     blocks for global extended header records. The name shall
                     be the contents of string, after the following  character
                     substitutions have been made:

                  ┌─────────────────┬─────────────────────────────────────────────┐
                  │string Includes: │ Replaced By:                                │
                  ├─────────────────┼─────────────────────────────────────────────┤
                  │%n               │ An integer that represents the sequence     │
                  │                 │ number of the global extended header record │
                  │                 │ in the archive, starting at 1.              │
                  ├─────────────────┼─────────────────────────────────────────────┤
                  │%p               │ The process ID of the pax process.          │
                  ├─────────────────┼─────────────────────────────────────────────┤
                  │%%               │ A '%' character.                            │
                  └─────────────────┴─────────────────────────────────────────────┘
                     Any  other  '%'  characters  in  string produce undefined
                     results.

                     If no -o globexthdr.name=string is specified,  pax  shall
                     use the following default value:

                     $TMPDIR/GlobalHead.%p.%n

                     where $TMPDIR represents the value of the TMPDIR environ‐
                     ment variable. If TMPDIR is not set, pax shall use /tmp.
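
                     A short Python sketch of how the default value above
                     resolves, including the fallback to /tmp. The function
                     name default_globexthdr_name and its environ parameter
                     are hypothetical and exist only for this illustration.

```python
import os


def default_globexthdr_name(pid: int, sequence: int, environ=None) -> str:
    """Expand the default template $TMPDIR/GlobalHead.%p.%n."""
    env = os.environ if environ is None else environ
    tmpdir = env.get("TMPDIR", "/tmp")  # fall back to /tmp when TMPDIR is unset
    return f"{tmpdir}/GlobalHead.{pid}.{sequence}"
```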

              invalid=action
                     (Applicable only to the  -x  pax  format.)  This  keyword
                     allows  user  control  over  the  action  pax  takes upon
                     encountering values in an extended header record that, in
                     read or copy mode, are invalid in the destination hierar‐
                     chy or, in list mode, cannot be written  in  the  codeset
                     and  current  locale of the implementation. The following
                     are invalid values that shall be recognized by pax:

                     +      In read or copy mode, a filename or link name that
                            contains character encodings invalid in the desti‐
                            nation hierarchy. (For example, the name may  con‐
                            tain embedded NULs.)

                     +      In read or copy mode, a filename or link name that
                            is longer than the maximum allowed in the destina‐
                            tion hierarchy (for either a pathname component or
                            the entire pathname).

                     +      In list mode, any character  string  value  (file‐
                            name, link name, user name, and so on) that cannot
                            be written in the codeset and  current  locale  of
                            the implementation.

                     The  following  mutually-exclusive  values  of the action
                     argument are supported:

                     bypass In read or copy mode, pax shall bypass  the  file,
                            causing no change to the destination hierarchy. In
                            list mode, pax shall  write  all  requested  valid
                            values  for  the  file, but its method for writing
                            invalid values is unspecified.

                     rename In read or copy mode, pax shall act as if  the  -i
                            option  were  in effect for each file with invalid
                            filename or link name values, allowing the user to
                            provide  a replacement name interactively. In list
                            mode, pax shall behave identically to  the  bypass
                            action.

                     UTF-8  When  used in read, copy, or list mode and a file‐
                            name, link name, owner name, or any other field in
                            an  extended  header  record  cannot be translated
                            from the pax UTF-8 codeset format to  the  codeset
                            and  current  locale  of  the  implementation, pax
                            shall use the actual UTF-8 encoding for the name.

                     write  In read or copy mode, pax shall  write  the  file,
                            translating  the  name, regardless of whether this
                            may overwrite an existing file with a valid  name.
                            In  list mode, pax shall behave identically to the
                            bypass action.

                     If no -o invalid=action is specified, pax shall act as if
                     -o invalid=bypass were specified. Any overwriting of  ex‐
                     isting  files  that  may  be  allowed  by the -o invalid=
                     actions shall be subject to permission (-p) and modifica‐
                     tion time (-u) restrictions, and shall be  suppressed  if
                     the -k option is also specified.

              linkdata
                     (Applicable  only  to  the -x pax format.) In write mode,
                     pax shall write the contents of a  file  to  the  archive
                     even when that file is merely a hard link to a file whose
                     contents have already been written to the archive.

              listopt=format
                     This keyword specifies the output format of the table  of
                     contents produced when the -v option is specified in list
                     mode. See List Mode Format Specifications. To avoid ambi‐
                     guity, the listopt=format shall be  the  only  or  final
                     keyword=value pair in a -o option-argument; all  charac‐
                     ters  in  the  remainder  of the option-argument shall be
                     considered part of the format string.  When  multiple  -o
                     listopt=format options are specified, the format strings
                     shall be considered a single, concatenated string, evalu‐
                     ated in command line order.

              times  (Applicable  only  to  the  -x  pax format.) When used in
                     write or copy mode, pax shall  include  atime  and  mtime
                     extended  header  records for each file. See pax Extended
                     Header File Times.

              In addition to these keywords, if the -x pax  format  is  speci‐
              fied,  any  of  the  keywords and values defined in pax Extended
              Header, including implementation extensions, can be used  in  -o
              option-arguments, in either of two modes:

              keyword=value
                     When  used  in  write  or  copy mode, these keyword/value
                     pairs shall be included at the beginning of  the  archive
                     as  typeflag  g global extended header records. When used
                     in read or list mode, these keyword/value pairs shall act
                     as  if  they  had been at the beginning of the archive as
                     typeflag g global extended header records.

              keyword:=value
                     When used in write  or  copy  mode,  these  keyword/value
                     pairs  shall be included as records at the beginning of a
                     typeflag x extended header for each file. (This shall  be
                     equivalent  to the equal-sign form except that it creates
                     no typeflag g global extended header records.) When  used
                     in read or list mode, these keyword/value pairs shall act
                     as if they were included as records at the  end  of  each
                     extended  header; thus, they shall override any global or
                     file-specific extended header record keywords of the same
                     names. For example, in the command:

                     pax -r -o "gname:=mygroup," <archive

                     the  group  name  will  be  forced to a new value for all
                     files read from the archive.

              The precedence of -o keywords over various fields in the archive
              is described in pax Extended Header Keyword Precedence.
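
              The splitting rules for the -o option-argument (comma-separated
              items, backslash-escaped commas, leading white space, ignored
              trailing comma, and the = and := forms) can be sketched in
              Python as follows. The function parse_o_options is
              hypothetical, and the sketch deliberately ignores the special
              case of listopt=format, whose value extends to the end of the
              option-argument.

```python
def parse_o_options(options: str):
    """Split a -o option-argument into (keyword, operator, value) tuples."""
    items, current, i = [], "", 0
    while i < len(options):
        ch = options[i]
        if ch == "\\" and options[i + 1:i + 2] == ",":
            current += ","      # escaped comma: drop the backslash, keep the comma
            i += 2
        elif ch == ",":
            items.append(current)
            current = ""
            i += 1
        else:
            current += ch
            i += 1
    items.append(current)

    parsed = []
    for item in items:
        item = item.lstrip()    # keywords can be preceded with white space
        if not item:
            continue            # trailing comma, or comma followed by white space
        keyword, sep, value = item.partition("=")
        if not sep:
            parsed.append((keyword, None, None))
        elif keyword.endswith(":"):
            parsed.append((keyword[:-1], ":=", value))
        else:
            parsed.append((keyword, "=", value))
    return parsed
```

              Applied to the option-argument from the example above,
              "gname:=mygroup,", the sketch yields a single entry with the
              keyword gname, the := form, and the value mygroup.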

       -p string
              Specify  one  or  more file characteristic options (privileges).
              The string option-argument shall be  a  string  specifying  file
              characteristics  to be retained or discarded on extraction.  The
              string shall consist of the specification characters a,  e,  m,
              o,   and  p.  Other  implementation-defined  characters  can  be
              included. Multiple characteristics can  be  concatenated  within
              the  same  string  and multiple -p options can be specified. The
              meanings of the specification characters are as follows:

              a      Do not preserve file access times.

              e      Preserve the user ID, group ID, file mode bits  (see  the
                     Base  Definitions volume of IEEE Std 1003.1-2001, Section
                     3.168, File Mode Bits), access time,  modification  time,
                     and  any  other  implementation-defined file characteris‐
                     tics.

              m      Do not preserve file modification times.

              o      Preserve the user ID and group ID.

              p      Preserve the file mode bits. Other implementation-defined
                     file mode attributes may be preserved.

              In  the  preceding  list, "preserve" indicates that an attribute
              stored in the archive shall be given to the extracted file, sub‐
              ject  to the permissions of the invoking process. The access and
              modification times of the file shall be preserved unless  other‐
              wise  specified with the -p option or not stored in the archive.
              All attributes that are not preserved  shall  be  determined  as
              part  of  the normal file creation action (see File Read, Write,
              and Creation).

              If neither the e nor the o specification character is specified,
              or  the  user  ID and group ID are not preserved for any reason,
              pax shall not set the S_ISUID and S_ISGID bits of the file mode.

              If the preservation of any of these items fails for any  reason,
              pax  shall write a diagnostic message to standard error. Failure
              to preserve these items shall affect the final exit status,  but
              shall not cause the extracted file to be deleted.

              If file characteristic letters in any of the string option-argu‐
              ments are duplicated or conflict with each other, the ones given
              last shall take precedence. For example, if -p eme is specified,
              file modification times are preserved.
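
              The "last one wins" rule can be sketched in Python as below.
              The function resolve_privileges and the dictionary keys are
              hypothetical; the defaults reflect the text above (times are
              preserved unless stated otherwise), and rejecting unknown
              characters is a simplification, since implementations may
              define additional ones.

```python
def resolve_privileges(*strings: str) -> dict:
    """Apply -p specification characters in order; later ones win."""
    # Defaults: times are preserved unless stated otherwise; owner and
    # mode come from normal file creation unless requested.
    state = {"atime": True, "mtime": True, "owner": False, "mode": False}
    for spec in "".join(strings):
        if spec == "a":
            state["atime"] = False
        elif spec == "m":
            state["mtime"] = False
        elif spec == "o":
            state["owner"] = True
        elif spec == "p":
            state["mode"] = True
        elif spec == "e":
            state = dict.fromkeys(state, True)
        else:
            raise ValueError(f"unknown specification character {spec!r}")
    return state
```

              For -p eme the final e re-enables everything that the m had
              switched off, so modification times are preserved, matching
              the example above.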

       -s replstr
              Modify file or archive member names named by pattern or file op‐
              erands  according  to the substitution expression replstr, using
              the syntax of the ed utility.  The  concepts  of  "address"  and
              "line"  are  meaningless  in the context of the pax utility, and
              shall not be supplied. The format shall be:

              -s /old/new/[gp]

              where as in ed, old is a basic regular expression  and  new  can
              contain  an ampersand, '\n' (where n is a digit) backreferences,
              or subexpression matching. The old string shall also be  permit‐
              ted to contain <newline>s.

              Any  non-null  character  can be used as a delimiter ( '/' shown
              here). Multiple -s expressions can be specified; the expressions
              shall  be  applied  in the order specified, terminating with the
              first successful substitution. The optional trailing 'g'  is  as
              defined in the ed utility. The optional trailing 'p' shall cause
              successful substitutions to be written to standard  error.  File
              or  archive  member  names  that  substitute to the empty string
              shall be ignored when reading and writing archives.
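
              The processing order of multiple -s expressions can be
              sketched in Python as follows. This is an approximation under
              stated assumptions: Python's re syntax stands in for the basic
              regular expressions of ed, the helper names (parse_replstr,
              to_python_repl, rename) are hypothetical, and the text written
              to standard error for the 'p' flag is illustrative, since its
              format is not described here.

```python
import re
import sys


def parse_replstr(replstr: str):
    """Split '<d>old<d>new<d>[gp]' into (old, new, flags); <d> is any delimiter."""
    delim = replstr[0]
    parts, cur, i = [], "", 1
    while i < len(replstr) and len(parts) < 2:
        ch = replstr[i]
        if ch == "\\" and i + 1 < len(replstr):
            nxt = replstr[i + 1]
            # an escaped delimiter is a literal delimiter; keep other escapes
            cur += nxt if nxt == delim else ch + nxt
            i += 2
            continue
        if ch == delim:
            parts.append(cur)
            cur = ""
        else:
            cur += ch
        i += 1
    if len(parts) != 2:
        raise ValueError("expected <delim>old<delim>new<delim>[gp]")
    return parts[0], parts[1], replstr[i:]


def to_python_repl(new: str) -> str:
    """Translate ed's '&' (whole match) into Python's replacement syntax."""
    out, i = "", 0
    while i < len(new):
        ch = new[i]
        if ch == "\\" and i + 1 < len(new):
            nxt = new[i + 1]
            out += "&" if nxt == "&" else ch + nxt  # '\&' is a literal ampersand
            i += 2
        elif ch == "&":
            out += r"\g<0>"
            i += 1
        else:
            out += ch
            i += 1
    return out


def rename(name: str, replstrs):
    """Return the new name, or None if the name substitutes to the empty string."""
    for replstr in replstrs:
        old, new, flags = parse_replstr(replstr)
        count = 0 if "g" in flags else 1  # without 'g', only the first match
        result, hits = re.subn(old, to_python_repl(new), name, count=count)
        if hits:  # the first successful substitution ends the search
            if "p" in flags:
                print(f"{name} >> {result}", file=sys.stderr)
            return result or None
    return name
```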
526
527       -t     When reading files from the file system, and if the user has the
528              permissions required by utime() to do so, set the access time of
529              each file read to the access time that it had before being  read
530              by pax.
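
              A minimal sketch of the behavior -t asks for, assuming the caller
              has the permissions needed to reset file times. The function name
              read_preserving_atime is hypothetical.

```python
import os


def read_preserving_atime(path):
    """Read a file, then put back the access time it had before the read."""
    before = os.stat(path)
    with open(path, "rb") as handle:
        data = handle.read()
    # os.utime sets both times at once, so the saved modification time is
    # written back unchanged alongside the saved access time.
    os.utime(path, ns=(before.st_atime_ns, before.st_mtime_ns))
    return data
```

              Resetting the times this way still updates the file's status
              change time, which cannot be set by the caller.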
531
532       -u     Ignore files that are older (having a less recent file modifica‐
533              tion time) than a pre-existing file or archive member  with  the
534              same name. In read mode, an archive member with the same name as
535              a file in the file system shall be extracted if the archive mem‐
536              ber  is newer than the file. In write mode, an archive file mem‐
537              ber with the same name as a file in the  file  system  shall  be
538              superseded  if  the file is newer than the archive member. If -a
539              is also specified, this is accomplished by appending to the  ar‐
540              chive; otherwise, it is unspecified whether this is accomplished
541              by actual replacement in the archive or by appending to the  ar‐
542              chive. In copy mode, the file in the destination hierarchy shall
543              be replaced by the file in the source hierarchy or by a link  to
544              the file in the source hierarchy if the file in the source hier‐
545              archy is newer.
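
              The comparison that -u applies in every mode can be sketched as
              one predicate. The name is_selected_with_u is hypothetical, and
              treating equal modification times as "not newer" is an assumption
              of this sketch.

```python
def is_selected_with_u(candidate_mtime, existing_mtime):
    """Return True if -u would let the candidate replace the existing entry.

    `candidate_mtime` belongs to the archive member (read mode) or to the
    file being written or copied (write and copy modes).  `existing_mtime`
    is None when nothing with the same name exists yet.
    """
    if existing_mtime is None:
        return True
    return candidate_mtime > existing_mtime
```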
546
547       -v     In list mode, produce a verbose table of contents (see the  STD‐
548              OUT section). Otherwise, write archive member pathnames to stan‐
549              dard error (see the STDERR section).
550
551       -x format
552              Specify the output archive format. The pax utility shall support
553              the following formats:
554
555              cpio   The cpio interchange format; see the EXTENDED DESCRIPTION
556                     section. The default blocksize for this format for  char‐
557                     acter  special  archive  files shall be 5120. Implementa‐
558                     tions shall support all blocksize  values  less  than  or
559                     equal to 32256 that are multiples of 512.
560
561              pax    The  pax interchange format; see the EXTENDED DESCRIPTION
562                     section. The default blocksize for this format for  char‐
563                     acter  special  archive files shall be 5120.  Implementa‐
564                     tions shall support all blocksize  values  less  than  or
565                     equal to 32256 that are multiples of 512.
566
567              ustar  The  tar interchange format; see the EXTENDED DESCRIPTION
568                     section. The default blocksize for this format for  char‐
569                     acter  special archive files shall be 10240.  Implementa‐
570                     tions shall support all blocksize  values  less  than  or
571                     equal to 32256 that are multiples of 512.
572
573              Implementation-defined  formats  shall  specify  a default block
574              size as well as any other block sizes  supported  for  character
575              special archive files.
576
577              Any  attempt  to append to an archive file in a format different
578              from the existing archive format shall cause pax to exit immedi‐
579              ately with a non-zero exit status.
580
581              In  copy mode, if no -x format is specified, pax shall behave as
582              if -x pax were specified.
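
              The blocksize rules given above for the cpio, pax, and ustar
              formats can be checked with a few lines of Python. The names below
              are hypothetical; the numbers are the ones quoted in this section.

```python
# Default blocksizes for character special archive files, per format.
DEFAULT_BLOCKSIZE = {"cpio": 5120, "pax": 5120, "ustar": 10240}

# Largest blocksize every implementation is required to support.
MAX_REQUIRED_BLOCKSIZE = 32256


def is_required_blocksize(blocksize):
    """True if every conforming implementation must accept this blocksize."""
    return 0 < blocksize <= MAX_REQUIRED_BLOCKSIZE and blocksize % 512 == 0
```

              An implementation may accept other values as well; the predicate
              only describes the portable minimum.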
583
584       -X     When traversing the file hierarchy specified by a pathname,  pax
585              shall  not descend into directories that have a different device
586              ID (st_dev; see the  System  Interfaces  volume  of  IEEE  Std
587              1003.1-2001, stat()).
588
589       Specifying  more  than  one of the mutually-exclusive options -H and -L
590       shall not be considered an error and the last  option  specified  shall
591       determine the behavior of the utility.
592
593       The  options that operate on the names of files or archive members (-c,
594       -i, -n, -s, -u, and -v) shall interact as follows. In  read  mode,  the
595       archive  members  shall be selected based on the user-specified pattern
596       operands as modified by the -c, -n, and -u options. Then, any -s and -i
597       options  shall  modify, in that order, the names of the selected files.
598       The -v option shall write names resulting from these modifications.
599
600       In write mode, the files shall be selected based on the  user-specified
601       pathnames  as  modified  by the -n and -u options.  Then, any -s and -i
602       options shall modify, in that order, the names of these selected files.
603       The -v option shall write names resulting from these modifications.
604
605       If  both  the -u and -n options are specified, pax shall not consider a
606       file selected unless it is newer than the file to which it is compared.
607
608
609   List Mode Format Specifications
610       The manual page for spax is not yet ready.  The  following  text  is  a
611       quotation from the POSIX.1-2001 standard.
612
613       In  list  mode  with  the -o listopt=format option, the format argument
614       shall be applied for each selected file. The pax utility shall append a
615       <newline>  to  the  listopt  output  for each selected file. The format
616       argument shall be used as the format string described in the Base Defi‐
617       nitions  volume  of  IEEE Std 1003.1-2001, Chapter 5, File Format Nota‐
618       tion, with the exceptions 1. through 5. defined in the EXTENDED
619       DESCRIPTION section of printf(1), plus the following exceptions:
620
621       6.     The  sequence  (keyword)  can  occur  before a format conversion
622              specifier. The conversion argument is defined by  the  value  of
623              keyword.   The  implementation  shall support the following key‐
624              words:
625
626              ·      Any of the Field Name entries in ustar Header  Block  and
627                     Octet-Oriented cpio Archive Entry. The implementation may
628                     support the cpio keywords without the leading c_ in addi‐
629                     tion  to  the  form  required  by  Values for cpio c_mode
630                     Field.
631
632              ·      Any keyword  defined  for  the  extended  header  in  pax
633                     Extended Header.
634
635              ·      Any  keyword provided as an implementation-defined exten‐
636                     sion within the extended header defined in  pax  Extended
637                     Header.
638
639              For  example,  the sequence "%(charset)s" is the string value of
640              the name of the character set in the extended header.
641
642              The result of the keyword conversion argument shall be the value
643              from the applicable header field or extended header, without any
644              trailing NULs.
645
646              All keyword values used as conversion arguments shall be  trans‐
647              lated  from  the UTF-8 encoding to the character set appropriate
648              for the local file system, user database, and so on, as applica‐
649              ble.
650
651       7.     An  additional  conversion specifier character, T, shall be used
652              to specify time formats. The T  conversion  specifier  character
653              can  be preceded by the sequence (keyword=subformat), where sub‐
654              format is a date format as defined by date operands. The default
655              keyword shall be mtime and the default subformat shall be:
656
657                 %b %e %H:%M %Y
658
659       8.     An  additional  conversion specifier character, M, shall be used
660              to specify the file mode string as  defined  in  ls(1)  Standard
661              Output. If (keyword) is omitted, the mode keyword shall be used.
662              For example, %.1M writes the single character  corresponding  to
663              the <entry type> field of the ls -l command.
664
665       9.     An  additional  conversion specifier character, D, shall be used
666              to specify the device for block or special files, if applicable,
667              in  an  implementation-defined  format.  If  not applicable, and
668              (keyword) is specified, then this conversion shall be equivalent
669              to  %(keyword)u.   If  not applicable, and (keyword) is omitted,
670              then this conversion shall be equivalent to <space>.
671
672       10.    An additional conversion specifier character, F, shall  be  used
673              to  specify  a  pathname. The F conversion character can be pre‐
674              ceded by a sequence of comma-separated keywords:
675
676                 (keyword[,keyword] ... )

677              The values for all the keywords that are non-null shall be  con‐
678              catenated  together,  each separated by a '/'. The default shall
679              be (path) if the keyword path is defined; otherwise, the default
680              shall be (prefix, name).
681
682       11.    An  additional  conversion specifier character, L, shall be used
683              to specify a symbolic link expansion. If the current file  is  a
684              symbolic link, then %L shall expand to:
685
686                 "%s -> %s", <value of keyword>, <contents of link>
687
688              Otherwise, the %L conversion specification shall be the  equiva‐
689              lent of %F.
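
       The F and L conversions described in items 10 and 11 can be sketched
       as follows. The function names are hypothetical, header fields are
       assumed to be available as a dict keyed by keyword, and rendering the
       name part of %L the same way as %F is an assumption of this sketch.

```python
def expand_F(fields, keywords=None):
    """Join the non-null values of the given keywords with '/'."""
    if keywords is None:
        # Default: (path) if defined, otherwise (prefix, name).
        keywords = ("path",) if "path" in fields else ("prefix", "name")
    parts = [fields[key] for key in keywords if fields.get(key)]
    return "/".join(parts)


def expand_L(fields, link_target=None, keywords=None):
    """Expand %L: 'name -> target' for a symbolic link, else same as %F."""
    name = expand_F(fields, keywords)
    if link_target is None:
        return name
    return f"{name} -> {link_target}"
```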
690
691

OPERANDS

693       The following operands shall be supported:
694
695       directory
696              The destination directory pathname for copy mode.
697
698       file   A pathname of a file to be copied or archived.
699
700       pattern
701              A pattern matching one or more pathnames of archive members.   A
702              pattern  must  be  given  in the name-generating notation of the
703              pattern matching notation in Pattern Matching Notation,  includ‐
704              ing  the  filename expansion rules in Patterns Used for Filename
705              Expansion. The default, if no pattern is specified, is to select
706              all members in the archive.
707
708

STDIN

710       In  write  mode, the standard input shall be used only if no file oper‐
711       ands are specified. It shall be a text file containing a list of  path‐
712       names, one per line, without leading or trailing <blank>s.
713
714       In  list  and  read  modes,  if -f is not specified, the standard input
715       shall be an archive file.
716
717       Otherwise, the standard input shall not be used.
718
719

INPUT FILES

721       The input file named by the archive option-argument, or standard  input
722       when  the archive is read from there, shall be a file formatted accord‐
723       ing to one of the specifications in the EXTENDED DESCRIPTION section or
724       some other implementation-defined format.
725
726       The file /dev/tty shall be used to write prompts and read responses.
727
728

ENVIRONMENT VARIABLES

730       The following environment variables shall affect the execution of pax:
731
732       LANG   Provide  a  default value for the internationalization variables
733              that are unset or null. (See the Base Definitions volume of IEEE
734              Std 1003.1-2001, Section 8.2, Internationalization Variables for
735              the precedence of internationalization variables used to  deter‐
736              mine the values of locale categories.)
737
738       LC_ALL If  set  to a non-empty string value, override the values of all
739              the other internationalization variables.
740
741       LC_COLLATE
742              Determine the locale for the  behavior  of  ranges,  equivalence
743              classes, and multi-character collating elements used in the pat‐
744              tern matching expressions for the  pattern  operand,  the  basic
745              regular  expression  for the -s option, and the extended regular
746              expression defined for the yesexpr locale keyword in the LC_MES‐
747              SAGES category.
748
749       LC_CTYPE
750              Determine  the  locale  for  the  interpretation of sequences of
751              bytes of text data as characters (for  example,  single-byte  as
752              opposed  to multi-byte characters in arguments and input files),
753              the behavior of character classes used in the  extended  regular
754              expression defined for the yesexpr locale keyword in the LC_MES‐
755              SAGES category, and pattern matching.
756
757       LC_MESSAGES
758              Determine the locale used to process affirmative responses,  and
759              the locale used to affect the format and contents of  diagnostic
760              messages written to standard error.
761
762       LC_TIME
763              Determine the format and contents of date and time strings  when
764              the -v option is specified.
765
766       NLSPATH
767              [XSI] Determine the location of  message  catalogs  for  the
768              processing of LC_MESSAGES.
769
770       TMPDIR Determine the pathname that provides part of the default  global
771              extended header record file, as described for the -o globexthdr=
772              keyword in the OPTIONS section.
773
774       TZ     Determine the timezone used to calculate date and  time  strings
775              when  the  -v  option  is  specified. If TZ is unset or null, an
776              unspecified default timezone shall be used.
777
778

ASYNCHRONOUS EVENTS

780       Default.
781
782

STDOUT

784       In write mode, if -f is not specified, the standard output shall be the
785       archive  formatted  according  to  one  of  the  specifications  in the
786       EXTENDED DESCRIPTION section, or some other implementation-defined for‐
787       mat (see -x format).
788
789       In list mode, when the -o listopt=format option has been specified, the
790       selected archive members shall be written to standard output using  the
791       format  described  under  List Mode Format Specifications. In list mode
792       without the -o listopt=format option, the table  of  contents  of  the
793       selected  archive members shall be written to standard output using the
794       following format:
795
796            "%s\n", <pathname>
797
798       If the -v option is specified in list mode, the table  of  contents  of
799       the  selected archive members shall be written to standard output using
800       the following formats.
801
802       For pathnames representing hard links to previous members  of  the  ar‐
803       chive:
804
805            "%s == %s\n", <ls -l listing>, <linkname>
806
807       For all other pathnames:
808
809            "%s\n", <ls -l listing>
810
811       where  <ls -l listing> shall be the format specified by the ls(1) util‐
812       ity with the -l option. When writing pathnames in this  format,  it  is
813       unspecified what is written for fields for which the underlying archive
814       format does not have the correct information, although the correct num‐
815       ber of <blank>-separated fields shall be written.
816
817       In list mode, standard output shall not be buffered more than a line at
818       a time.
819
820

STDERR

822       If -v is specified in read, write, or copy modes, pax shall  write  the
823       pathnames it processes to the standard error output using the following
824       format:
825
826            "%s\n", <pathname>
827
828       These pathnames shall be written as soon as processing is begun on  the
829       file  or  archive  member,  and shall be flushed to standard error. The
830       trailing <newline>, which shall not be buffered, is  written  when  the
831       file has been read or written.
832
833       If  the -s option is specified, and the replacement string has a trail‐
834       ing 'p', substitutions shall be written to standard error in  the  fol‐
835       lowing format:
836
837            "%s >> %s\n", <original pathname>, <new pathname>
838
839       In  all operating modes of pax, optional messages of unspecified format
840       concerning the input archive format and volume number,  the  number  of
841       files,  blocks,  volumes,  and  media parts as well as other diagnostic
842       messages may be written to standard error.
843
844       In all formats, for both standard output  and  standard  error,  it  is
845       unspecified how non-printable characters in pathnames or link names are
846       written.
847
848       When pax is in read mode or list mode, using the -x pax archive format,
849       and  a  filename,  link  name,  owner  name,  or  any other field in an
850       extended header record cannot be translated from the pax UTF-8  codeset
851       format  to  the  codeset  and current locale of the implementation, pax
852       shall write a diagnostic message to standard error, shall  process  the
853       file  as  described  for the -o invalid= option, and then shall process
854       the next file in the archive.
855
856

OUTPUT FILES

858       In read mode, the extracted output files shall be of the archived  file
859       type.  In  copy  mode, the copied output files shall be the type of the
860       file being copied. In either mode, existing files  in  the  destination
861       hierarchy shall be overwritten only when all permission (-p), modifica‐
862       tion time (-u), and invalid-value (-o invalid=) tests allow it.
863
864       In write mode, the output file named by the -f option-argument shall be
865       a file formatted according to one of the specifications in the EXTENDED
866       DESCRIPTION section, or some other implementation-defined format.
867
868

EXTENDED DESCRIPTION

870   pax Interchange Format
871       A pax archive tape or file produced in the -x pax format shall  contain
872       a series of blocks. The physical layout of the archive shall be identi‐
873       cal to the ustar format described in  ustar  Interchange  Format.  Each
874       file archived shall be represented by the following sequence:
875
876              ·      An  optional  header  block with extended header records.
877                     This header block is of the form described in pax  Header
878                     Block,  with  a  typeflag  value of x or g.  The extended
879                     header records, described in pax Extended  Header,  shall
880                     be included as the data for this header block.
881
882              ·      A header block that describes the file. Any fields in the
883                     preceding optional extended  header  shall  override  the
884                     associated fields in this header block for this file.
885
886              ·      Zero  or  more  blocks  that  contain the contents of the
887                     file.
888
889       At the end of the archive file  there  shall  be  two  512-byte  blocks
890       filled with binary zeros, interpreted as an end-of-archive indicator.
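
       The end-of-archive indicator can be tested for as sketched below. The
       name is_end_of_archive is hypothetical, and the sketch assumes that
       the caller has already walked the archive to an offset where a header
       block would start, since file data may itself contain zero blocks.

```python
BLOCK_SIZE = 512


def is_end_of_archive(archive, offset):
    """True if two consecutive all-zero 512-byte blocks start at offset."""
    marker = archive[offset:offset + 2 * BLOCK_SIZE]
    # A short read cannot be the indicator; any() is False only if every
    # byte in the slice is zero.
    return len(marker) == 2 * BLOCK_SIZE and not any(marker)
```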
891
892       A  schematic  of an example archive with global extended header records
893       and two actual files is shown in pax Format  Archive  Example.  In  the
894       example,  the second file in the archive has no extended header preced‐
895       ing it, presumably because it has no need for extended attributes.
896
897                         Figure: pax Format Archive Example
898
899    ┌──────────────────────────────┬─────────────────────────────────────────────┐
900    │ustar Header [typeflag = 'g'] │                                             │
901    ├──────────────────────────────┤           Global Extended header            │
902    │Global Extended Header Data   │                                             │
903    ├──────────────────────────────┼─────────────────────────────────────────────┤
904    │ustar Header [typeflag = 'x'] │                                             │
905    ├──────────────────────────────┤                                             │
906    │Extended Header Data          │                                             │
907    ├──────────────────────────────┤  File 1: Extended Header data is included   │
908    │ustar Header [typeflag = '0'] │                                             │
909    ├──────────────────────────────┤                                             │
910    │Data for File 1               │                                             │
911    ├──────────────────────────────┼─────────────────────────────────────────────┤
912    │ustar Header [typeflag = '0'] │                                             │
913    ├──────────────────────────────┤ File 2: No Extended Header data is included │
914    │Data for File 2               │                                             │
915    ├──────────────────────────────┼─────────────────────────────────────────────┤
916    │Block of binary Zeroes        │                                             │
917    ├──────────────────────────────┤          End of Archive Indicator           │
918    │Block of binary Zeroes        │                                             │
919    └──────────────────────────────┴─────────────────────────────────────────────┘
920
921   pax Header Block
922       The pax header block shall be  identical  to  the  ustar  header  block
923       described in ustar Interchange Format, except that two additional type‐
924       flag values are defined:
925
926       x      Represents extended header records for the following file in the
927              archive (which shall have its own ustar header block).  The for‐
928              mat of these extended header records shall be  as  described  in
929              pax Extended Header.
930
931       g      Represents  global  extended  header  records  for the following
932              files in the  archive.  The  format  of  these  extended  header
933              records  shall  be  as  described  in pax Extended Header.  Each
934              value shall affect all subsequent files  that  do  not  override
935              that value in their own extended header record and until another
936              global extended header record is reached that  provides  another
937              value  for  the same field. The typeflag g global headers should
938              not be used with interchange media  that  could  suffer  partial
939              data loss in transporting the archive.
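
       The precedence between typeflag g records, typeflag x records, and
       the ordinary header block can be sketched as follows. The generator
       name resolve_headers is hypothetical, and representing every header
       as a dict keyed by extended header keyword is a simplification made
       for this sketch.

```python
def resolve_headers(entries):
    """Yield the effective fields of each file in an archive.

    `entries` is an iterable of (typeflag, fields) pairs in archive order.
    Typeflag 'g' carries global records, 'x' carries records for the next
    file only, and anything else is treated as an ordinary file header.
    """
    global_records = {}
    pending = {}
    for typeflag, fields in entries:
        if typeflag == "g":
            # A later global header replaces only the keywords it provides.
            global_records.update(fields)
        elif typeflag == "x":
            pending = dict(fields)
        else:
            # Per-file records win over global ones, which win over the
            # values stored in the header block itself.
            yield {**fields, **global_records, **pending}
            pending = {}
```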
940
941       For  both  of  these  types,  the  size  field shall be the size of the
942       extended header records in octets. The other fields in the header block
943       are  not  meaningful  to  this version of the pax utility.  However, if
944       this  archive  is  read  by  a  pax  utility  conforming  to  the   ISO
945       POSIX-2:1993  standard,  the  header  block fields are used to create a
946       regular file that contains the extended header records as data.  There‐
947       fore,  header  block field values should be selected to provide reason‐
948       able file access to this regular file.
949
950       A further difference from the ustar header block is  that  data  blocks
951       for  files  of  typeflag 1 (the digit one) (hard link) may be included,
952       which means that the size field may be greater than zero. Archives cre‐
953       ated  by  pax -o linkdata shall include these data blocks with the hard
954       links.
955
956
957   pax Extended Header
958       A pax extended header contains values that are  inappropriate  for  the
959       ustar  header  block  because  of  limitations  in  that format: fields
960       requiring a character encoding other than that described in the ISO/IEC
961       646:1991 standard, fields representing file attributes not described in
962       the ustar header, and fields whose format or  length  do  not  fit  the
963       requirements  of the ustar header. The values in an extended header add
964       attributes to the following file (or files; see the description of  the
965       typeflag  g  header  block)  or override values in the following header
966       block(s), as indicated in the following list of keywords.
967
968       An extended header shall consist of one  or  more  records,  each  con‐
969       structed as follows:
970
971            "%d %s=%s\n", <length>, <keyword>, <value>
972
973       The  extended  header records shall be encoded according to the ISO/IEC
974       10646-1:2000 standard (UTF-8).  The  <length>  field,  <blank>,  equals
975       sign,  and  <newline>  shown shall be limited to the portable character
976       set, as encoded in UTF-8. The <keyword> and <value> fields can  be  any
977       UTF-8 characters. The <length> field shall be the decimal length of the
978       extended header record in octets, including the trailing <newline>.
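
       Because <length> counts its own digits, building a record needs a
       small fixed-point computation. The sketch below shows one way to do
       it, together with a matching parser. The function names are
       hypothetical, and the parser assumes well-formed input.

```python
def build_record(keyword, value):
    """Return one extended header record as UTF-8 encoded bytes."""
    if "=" in keyword:
        raise ValueError("a keyword shall not include an equals sign")
    payload = f" {keyword}={value}\n".encode("utf-8")
    total = len(payload)
    while True:
        # The length field must also cover its own decimal digits.
        candidate = len(payload) + len(str(total))
        if candidate == total:
            return str(total).encode("ascii") + payload
        total = candidate


def parse_records(data):
    """Split concatenated records into a keyword -> value dict."""
    records = {}
    pos = 0
    while pos < len(data):
        space = data.index(b" ", pos)
        length = int(data[pos:space])
        record = data[space + 1:pos + length - 1]  # drop trailing newline
        keyword, _, value = record.partition(b"=")
        records[keyword.decode("utf-8")] = value.decode("utf-8")
        pos += length
    return records
```

       For example, the record for path=foo is "12 path=foo\n": ten octets
       for the text after the length, plus two for the digits "12".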
979
980       The <keyword> field shall be one of the entries from the following list
981       or  a  keyword  provided as an implementation extension.  Keywords con‐
982       sisting entirely of lowercase letters, digits, and periods are reserved
983       for future standardization. A keyword shall not include an equals sign.
984       (In the following list, the notation "file(s)" or  "block(s)"  is  used
985       to acknowledge that a keyword affects the following single file after a
986       typeflag x extended header, but possibly multiple files after  typeflag
987       g.   Any  requirements  in the list for pax to include a record when in
988       write or copy mode shall apply only when such a record has not  already
989       been provided through the use of the -o option. When used in copy mode,
990       pax shall behave as if an archive  had  been  created  with  applicable
991       extended header records and then extracted.)
992
993       atime  The  file  access  time for the following file(s), equivalent to
994              the value of the st_atime member of the  stat  structure  for  a
995              file,  as  described  by  the  stat(2) function. The access time
996              shall be restored if the process has the  appropriate  privilege
997              required  to  do  so.  The  format  of  the  <value> shall be as
998              described in pax Extended Header File Times.
999
1000       charset
1001              The name of the character set used to encode  the  data  in  the
1002              following  file(s).  The  entries  in  the  following  table are
1003              defined to refer to known standards;  additional  names  may  be
1004              agreed on between the originator and recipient.
1005
1006              ┌────────────────────────┬───────────────────────────────┐
1007              │<value>                 │ Formal Standard               │
1008              ├────────────────────────┼───────────────────────────────┤
1009              │ISO-IR 646 1990         │ ISO/IEC 646:1990              │
1010              │ISO-IR 8859 1 1998      │ ISO/IEC 8859-1:1998           │
1011              │ISO-IR 8859 2 1999      │ ISO/IEC 8859-2:1999           │
1012              │ISO-IR 8859 3 1999      │ ISO/IEC 8859-3:1999           │
1013              │ISO-IR 8859 4 1998      │ ISO/IEC 8859-4:1998           │
1014              │ISO-IR 8859 5 1999      │ ISO/IEC 8859-5:1999           │
1015              │ISO-IR 8859 6 1999      │ ISO/IEC 8859-6:1999           │
1016              │ISO-IR 8859 7 1987      │ ISO/IEC 8859-7:1987           │
1017              │ISO-IR 8859 8 1999      │ ISO/IEC 8859-8:1999           │
1018              │ISO-IR 8859 9 1999      │ ISO/IEC 8859-9:1999           │
1019              │ISO-IR 8859 10 1998     │ ISO/IEC 8859-10:1998          │
1020              │ISO-IR 8859 13 1998     │ ISO/IEC 8859-13:1998          │
1021              │ISO-IR 8859 14 1998     │ ISO/IEC 8859-14:1998          │
1022              │ISO-IR 8859 15 1999     │ ISO/IEC 8859-15:1999          │
1023              │ISO-IR 10646 2000       │ ISO/IEC 10646:2000            │
1024              │ISO-IR 10646 2000 UTF-8 │ ISO/IEC 10646, UTF-8 encoding │
1025              │BINARY                  │ None                          │
1026              └────────────────────────┴───────────────────────────────┘
1027              The encoding is included in an extended header for information
1028              only; when pax is used as described in IEEE Std 1003.1-2001, it
1029              shall not translate the file data into any other encoding. The
1030              BINARY entry indicates unencoded binary data.
1031
1032              When used in write or copy mode, it is implementation-defined
1033              whether pax includes a charset extended header record for a file.
1034
1035       comment
1036              A  series of characters used as a comment. All characters in the
1037              <value> field shall be ignored by pax.
1038
1039       gid    The group ID of the group that owns the  file,  expressed  as  a
1040              decimal  number using digits from the ISO/IEC 646:1991 standard.
1041              This record shall override the gid field in the following header
1042              block(s).  When  used in write or copy mode, pax shall include a
1043              gid extended header record for  each  file  whose  group  ID  is
1044              greater than 2097151 (octal 7777777).
1045
1046       gname  The group of the file(s), formatted as a group name in the group
1047              database. This record shall override the gid and gname fields in
1048              the  following  header  block(s),  and  any  gid extended header
1049              record. When used in read, copy, or list mode, pax shall  trans‐
1050              late  the  name  from the UTF-8 encoding in the header record to
1051              the character set appropriate for  the  group  database  on  the
1052              receiving  system.  If  any  of  the  UTF-8 characters cannot be
1053              translated, and if the -o invalid=UTF-8 option is not specified,
1054              the  results  are  implementation-defined. When used in write or
1055              copy mode, pax shall include a gname extended header record  for
1056              each  file  whose group name cannot be represented entirely with
1057              the letters and digits of the portable character set.
1058
1059       linkpath
1060              The pathname of a link being created to  another  file,  of  any
1061              type,  previously  archived.  This  record  shall  override  the
1062              linkname field in the following ustar header block(s). The  fol‐
1063              lowing  ustar header block shall determine the type of link cre‐
1064              ated. If typeflag of the following header block is 1,  it  shall
1065              be  a  hard  link. If typeflag is 2, it shall be a symbolic link
1066              and the linkpath value shall be the  contents  of  the  symbolic
1067              link. The pax utility shall translate the name of the link (con‐
1068              tents of the symbolic link) from the UTF-8 encoding to the char‐
1069              acter  set  appropriate  for the local file system. When used in
1070              write or copy mode, pax shall include a linkpath extended header
1071              record  for  each  link  whose  pathname  cannot  be represented
1072              entirely with the members of the portable  character  set  other
1073              than NUL.
1074
1075       mtime  The  file modification time of the following file(s), equivalent
1076              to the value of the st_mtime member of the stat structure for  a
1077              file,  as  described in the stat(2) function.  This record shall
1078              override the mtime field in the following header  block(s).  The
1079              modification  time  shall  be  restored  if  the process has the
1080              appropriate privilege required to do  so.   The  format  of  the
1081              <value> shall be as described in pax Extended Header File Times.
1082
1083       path   The  pathname  of the following file(s). This record shall over‐
1084              ride  the  name  and  prefix  fields  in  the  following  header
1085              block(s).  The  pax  utility shall translate the pathname of the
1086              file from the UTF-8 encoding to the  character  set  appropriate
1087              for the local file system.
1088
1089              When  used  in  write  or  copy  mode,  pax shall include a path
1090              extended header record for each file whose  pathname  cannot  be
1091              represented  entirely with the members of the portable character
1092              set other than NUL.
1093
1094       realtime.any
1095              The keywords prefixed by "realtime."  are  reserved  for  future
1096              standardization.
1097
1098       security.any
1099              The  keywords  prefixed  by  "security." are reserved for future
1100              standardization.
1101
1102       size   The size of the file in octets, expressed as  a  decimal  number
1103              using  digits  from  the  ISO/IEC 646:1991 standard. This record
1104              shall override the size field in the following header  block(s).
1105              When  used  in  write  or  copy  mode,  pax shall include a size
1106              extended header record for each file with a size  value  greater
1107              than 8589934591 (octal 77777777777).
1108
1109       uid    The  user  ID  of  the file owner, expressed as a decimal number
1110              using digits from the ISO/IEC  646:1991  standard.  This  record
1111              shall  override  the uid field in the following header block(s).
1112              When used in write  or  copy  mode,  pax  shall  include  a  uid
1113              extended  header  record for each file whose owner ID is greater
1114              than 2097151 (octal 7777777).
1115
1116       uname  The owner of the following file(s), formatted as a user name  in
1117              the  user database. This record shall override the uid and uname
1118              fields in the following header block(s), and  any  uid  extended
1119              header  record. When used in read, copy, or list mode, pax shall
1120              translate the name from the UTF-8 encoding in the header  record
1121              to  the  character  set appropriate for the user database on the
1122              receiving system. If any  of  the  UTF-8  characters  cannot  be
1123              translated, and if the -o invalid=UTF-8 option is not specified,
1124              the results are implementation-defined. When used  in  write  or
1125              copy  mode, pax shall include a uname extended header record for
1126              each file whose user name cannot be  represented  entirely  with
1127              the letters and digits of the portable character set.
1128
1129       If  the  <value> field is zero length, it shall delete any header block
1130       field, previously entered extended header  value,  or  global  extended
1131       header value of the same name.
1132
1133       If  a keyword in an extended header record (or in a -o option-argument)
1134       overrides or deletes a corresponding field in the ustar  header  block,
1135       pax shall ignore the contents of that header block field.
1136
1137       Unlike  the ustar header block fields, NULs shall not delimit <value>s;
1138       all characters within the <value> field shall be  considered  data  for
1139       the  field.  None  of  the length limitations of the ustar header block
1140       fields in ustar  Header  Block  shall  apply  to  the  extended  header
1141       records.
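
       The numeric thresholds given for the size and uid keywords  can  be
       illustrated with a minimal sketch. The function name and the returned
       dictionary layout are hypothetical and  are  not  part  of  pax;  only
       the two limits are taken from the keyword descriptions above.

```python
# Sketch: decide which numeric pax extended header records a writer needs.
# required_records() and its return layout are illustrative assumptions.

USTAR_MAX_SIZE = 0o77777777777   # 8589934591, limit from the size keyword
USTAR_MAX_UID = 0o7777777        # 2097151, limit from the uid keyword


def required_records(size, uid):
    """Return the extended header keywords needed for these values."""
    records = {}
    if size > USTAR_MAX_SIZE:
        records["size"] = str(size)   # decimal number, per the size keyword
    if uid > USTAR_MAX_UID:
        records["uid"] = str(uid)     # decimal number, per the uid keyword
    return records
```

       Values at or below the limits need no  extended  record,  since  the
       ustar header block fields can hold them.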
1142
1143
1144   pax Extended Header Keyword Precedence
1145       This  section  describes  the  precedence  in  which the various header
1146       records and fields and command line options are selected to apply to  a
1147       file  in  the archive. When pax is used in read or list modes, it shall
1148       determine a file attribute in the following sequence:
1149
1150              1.     If  -o  delete=keyword-prefix  is  used,   the   affected
1151                     attributes  shall be determined from step 7., if applica‐
1152                     ble, or ignored otherwise.
1153
1154              2.     If -o keyword:= is used, the affected attributes shall be
1155                     ignored.
1156
1157              3.     If  -o  keyword:=value  is  used,  the affected attribute
1158                     shall be assigned the value.
1159
1160              4.     If there is a typeflag  x  extended  header  record,  the
1161                     affected  attribute  shall be assigned the <value>.  When
1162                     extended header records conflict, the last one  given  in
1163                     the header shall take precedence.
1164
1165              5.     If -o keyword=value is used, the affected attribute shall
1166                     be assigned the value.
1167
1168              6.     If there is a typeflag g global extended  header  record,
1169                     the  affected  attribute  shall  be assigned the <value>.
1170                     When global extended header records  conflict,  the  last
1171                     one given in the global header shall take precedence.
1172
1173              7.     Otherwise,  the  attribute  shall  be determined from the
1174                     ustar header block.
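
       The seven steps above can be sketched as a single lookup function.
       This is an illustration only: the function and parameter names are
       hypothetical, matching delete patterns with fnmatchcase() is an
       assumption, and the deletion effect of a zero-length <value> in an
       extended header record is not modelled.

```python
from fnmatch import fnmatchcase


def resolve_attribute(keyword, ustar, global_hdr=None, ext_hdr=None,
                      opt_assign=None, opt_force=None, opt_delete=()):
    """Pick the value of one attribute following steps 1 to 7."""
    global_hdr = global_hdr or {}   # typeflag g records, last one wins
    ext_hdr = ext_hdr or {}         # typeflag x records, last one wins
    opt_assign = opt_assign or {}   # -o keyword=value
    opt_force = opt_force or {}     # -o keyword:=value ("" for keyword:=)

    # Step 1: -o delete=pattern bypasses all records; fall back to step 7.
    if any(fnmatchcase(keyword, pattern) for pattern in opt_delete):
        return ustar.get(keyword)
    # Steps 2 and 3: -o keyword:= ignores the attribute, := value sets it.
    if keyword in opt_force:
        return opt_force[keyword] or None
    # Step 4: per-file extended header record.
    if keyword in ext_hdr:
        return ext_hdr[keyword]
    # Step 5: -o keyword=value.
    if keyword in opt_assign:
        return opt_assign[keyword]
    # Step 6: global extended header record.
    if keyword in global_hdr:
        return global_hdr[keyword]
    # Step 7: the ustar header block, if it has such a field.
    return ustar.get(keyword)
```

       Storing each header as a dictionary filled in record  order  gives
       the "last one given" rule for free, because later records overwrite
       earlier ones.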
1175
1176
1177   pax Extended Header File Times
1178       The pax utility shall write an mtime record for each file in  write  or
1179       copy  modes  if  the  file's  modification  time  cannot be represented
1180       exactly in the ustar header logical record described  in  ustar  Inter‐
1181       change Format.  This can occur if the time is out of ustar range, or if
1182       the file system of the underlying implementation  supports  non-integer
1183       time  granularities  and  the time is not an integer. All of these time
1184       records shall be formatted as a decimal representation of the  time  in
1185       seconds  since  the Epoch. If a period ('.') decimal point character is
1186       present, the digits to the right of the point shall represent the units
1187       of a subsecond timing granularity, where the first digit is tenths of a
1188       second and each subsequent digit is a tenth of the previous  digit.  In
1189       read or copy mode, the pax utility shall truncate the time of a file to
1190       the greatest value that is not greater than the input header file time.
1191       In  write  or copy mode, the pax utility shall output a time exactly if
1192       it can be represented exactly as a decimal number, and otherwise  shall
1193       generate only enough digits so that the same time shall be recovered if
1194       the file is extracted on a system whose underlying implementation  sup‐
1195       ports the same time granularity.
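
       The truncation rule for read and copy mode amounts to rounding  the
       decimal time value toward negative infinity at the granularity of
       the receiving system. A minimal sketch, assuming the granularity is
       given as a number of fractional decimal digits (the function name
       and that parameter are hypothetical):

```python
from decimal import Decimal, ROUND_FLOOR


def floor_time(value, fraction_digits):
    """Truncate a pax time string to the given number of decimal places.

    The result is the greatest representable value that is not greater
    than the input, so negative times round away from zero.
    """
    quantum = Decimal(1).scaleb(-fraction_digits)   # e.g. 0.001 for 3
    return Decimal(value).quantize(quantum, rounding=ROUND_FLOOR)
```

       Decimal arithmetic is used instead of float so that  a  nanosecond
       value is not disturbed by binary rounding before it is truncated.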
1196
1197
1198   ustar Interchange Format
1199       A ustar archive tape or file shall contain a series of logical records.
1200       Each logical record shall be a fixed-size logical record of 512  octets
1201       (see  below). Although this format may be thought of as being stored on
1202       9-track industry-standard 12.7 mm (0.5 in) magnetic tape,  other  types
1203       of  transportable  media  are not excluded. Each file archived shall be
1204       represented by a header logical record that describes  the  file,  fol‐
1205       lowed  by  zero  or  more logical records that give the contents of the
1206       file. At the end of the archive file there shall be two 512-octet logi‐
1207       cal  records filled with binary zeros, interpreted as an end-of-archive
1208       indicator.
1209
1210       The logical records may be grouped  for  physical  I/O  operations,  as
1211       described  under  the  -b blocksize and -x ustar options. Each group of
1212       logical records may be written with a single  operation  equivalent  to
1213       the write(2) function. On magnetic tape, the result of this write shall
1214       be a single tape physical block. The last physical block  shall  always
1215       be the full size, so logical records after the two zero logical records
1216       may contain undefined data.
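
       The end-of-archive indicator and the full-size last physical block
       can be sketched as follows. The function name is hypothetical, the
       default blocking factor of 20 logical records is an assumption made
       for the example, and the trailing records are zero-filled although
       the text above allows them to contain undefined data.

```python
RECORD = 512   # size of one ustar logical record in octets


def finish_archive(body, blocking_factor=20):
    """Append the end-of-archive indicator and pad the last block."""
    if len(body) % RECORD:
        raise ValueError("archive body must be whole logical records")
    block = RECORD * blocking_factor
    out = body + bytes(2 * RECORD)      # two records of binary zeros
    out += bytes(-len(out) % block)     # pad to a full physical block
    return out
```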
1217
1218       The header logical record shall be structured as shown in the following
1219       table. All lengths and offsets are in decimal.
1220
1221                              Table: ustar Header Block
1222
1223                  ┌───────────┬──────────────┬────────────────────┐
1224                  │Field Name │ Octet Offset │ Length (in Octets) │
1225                  ├───────────┼──────────────┼────────────────────┤
1226                  │name       │       0      │        100         │
1227                  │mode       │     100      │          8         │
1228                  │uid        │     108      │          8         │
1229                  │gid        │     116      │          8         │
1230                  │size       │     124      │         12         │
1231                  │mtime      │     136      │         12         │
1232                  │chksum     │     148      │          8         │
1233                  │typeflag   │     156      │          1         │
1234                  │linkname   │     157      │        100         │
1235                  │magic      │     257      │          6         │
1236                  │version    │     263      │          2         │
1237                  │uname      │     265      │         32         │
1238                  │gname      │     297      │         32         │
1239                  │devmajor   │     329      │          8         │
1240                  │devminor   │     337      │          8         │
1241                  │prefix     │     345      │        155         │
1242                  └───────────┴──────────────┴────────────────────┘
1243       All characters in the header logical record shall be represented in the
1244       coded character set of  the  ISO/IEC  646:1991  standard.  For  maximum
1245       portability  between  implementations,  names  should  be selected from
1246       characters represented by the portable filename character set as octets
1247       with  the  most significant bit zero. If an implementation supports the
1248       use of characters outside of slash and the portable filename  character
1249       set  in names for files, users, and groups, one or more implementation-
1250       defined encodings of these characters shall be provided for interchange
1251       purposes.
1252
1253       However, the pax utility shall never create filenames on the local sys‐
1254       tem that cannot be accessed via the procedures described  in  IEEE  Std
1255       1003.1-2001.  If a filename is found on the medium that would create an
1256       invalid filename, it is implementation-defined whether  the  data  from
1257       the  file  is  stored  on  the file hierarchy and under what name it is
1258       stored. The pax utility may choose to ignore these files as long as  it
1259       produces an error indicating that the file is being ignored.
1260
1261       Each  field  within  the  header logical record is contiguous; that is,
1262       there is no padding used. Each character on the archive medium shall be
1263       stored contiguously.
1264
1265       The  fields  magic,  uname, and gname are character strings each termi‐
1266       nated by a NUL character. The fields name,  linkname,  and  prefix  are
1267       NUL-terminated  character  strings  except  when  all characters in the
1268       array contain non-NUL characters including the last character. The ver‐
1269       sion  field  is  two octets containing the characters "00" (zero-zero).
1270       The typeflag contains a single character. All other fields are  leading
1271       zero-filled  octal numbers using digits from the ISO/IEC 646:1991 stan‐
1272       dard IRV. Each numeric field is terminated by one or  more  <space>  or
1273       NUL characters.
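
       The header layout and the field encodings described  above  can  be
       read with a fixed-size structure. The following is a sketch, not  the
       spax implementation: the names are hypothetical, and returning 0 for
       a numeric field that holds no digits is an assumption.

```python
import struct

# Field names and lengths from the table "ustar Header Block".
USTAR_FIELDS = ("name", "mode", "uid", "gid", "size", "mtime", "chksum",
                "typeflag", "linkname", "magic", "version", "uname",
                "gname", "devmajor", "devminor", "prefix")
USTAR_STRUCT = struct.Struct("<100s8s8s8s12s12s8sc100s6s2s32s32s8s8s155s")
NUMERIC = {"mode", "uid", "gid", "size", "mtime", "chksum",
           "devmajor", "devminor"}
STRINGS = {"name", "linkname", "magic", "uname", "gname", "prefix"}


def parse_octal(field):
    """Decode an octal field terminated by <space> or NUL characters."""
    digits = field.rstrip(b" \0").decode("ascii")
    return int(digits, 8) if digits else 0   # assumption: empty means 0


def parse_header(record):
    """Split one 512-octet header logical record into its fields."""
    if len(record) != 512:
        raise ValueError("a header logical record is 512 octets")
    header = dict(zip(USTAR_FIELDS, USTAR_STRUCT.unpack_from(record)))
    for key in NUMERIC:
        header[key] = parse_octal(header[key])
    for key in STRINGS:
        # Strings end at the first NUL, or fill the whole field.
        header[key] = header[key].split(b"\0", 1)[0]
    return header   # typeflag and version stay as raw bytes
```

       The structure covers 500 octets (offset 345 plus  the  155  octets
       of prefix); the rest of the 512-octet record is not interpreted.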
1274
1275       The  name and the prefix fields shall produce the pathname of the file.
1276       A new pathname shall be formed, if prefix is not an empty  string  (its
1277       first  character  is not NUL), by concatenating prefix (up to the first
1278       NUL character), a slash character, and name; otherwise,  name  is  used
1279       alone.  In  either case, name is terminated at the first NUL character.
1280       If prefix begins with a NUL character, it shall  be  ignored.  In  this
1281       manner,  pathnames  of  at  most  256 characters can be supported. If a
1282       pathname does not fit in the space provided, pax shall notify the  user
1283       of the error, and shall not store any part of the file (header or
1284       data) on the medium.
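
       The rule for joining prefix and name can  be  written  in  a  few
       lines. This is a sketch with a hypothetical function name; it takes
       the raw field contents as bytes.

```python
def ustar_pathname(name, prefix):
    """Form the pathname from the raw name and prefix header fields."""
    name = name.split(b"\0", 1)[0]       # name ends at the first NUL
    prefix = prefix.split(b"\0", 1)[0]   # empty if it begins with NUL
    if prefix:
        return prefix + b"/" + name
    return name
```

       With both fields completely filled, the  result  is  155 + 1 + 100
       octets long, which is where the limit of 256 characters comes from.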
1285
1286       The linkname field, described below, shall not use the prefix  to  pro‐
1287       duce  a  pathname. As such, a linkname is limited to 100 characters. If
1288       the name does not fit in the space provided, pax shall notify the  user
1289       of the error, and shall not attempt to store the link on the medium.
1290
1291       The  mode  field provides 12 bits encoded in the ISO/IEC 646:1991 stan‐
1292       dard octal digit representation. The encoded bits shall  represent  the
1293       following values:
1294
1295                               Table: ustar mode Field
1296
1297     ┌──────┬─────────────────┬─────────────────────────────────────────────────┐
1298     │Bit   │ IEEE Std        │ Description                                     │
1299     │Value │ 1003.1-2001 Bit │                                                 │
1300     ├──────┼─────────────────┼─────────────────────────────────────────────────┤
1301     │04000 │ S_ISUID         │ Set UID on execution.                           │
1302     │02000 │ S_ISGID         │ Set GID on execution.                           │
1303     │01000 │ <reserved>      │ Reserved for future standardization.            │
1304     │00400 │ S_IRUSR         │ Read permission for file owner class.           │
1305     │00200 │ S_IWUSR         │ Write permission for file owner class.          │
1306     │00100 │ S_IXUSR         │ Execute/search permission for file owner class. │
1307     │00040 │ S_IRGRP         │ Read permission for file group class.           │
1308     │00020 │ S_IWGRP         │ Write permission for file group class.          │
1309     │00010 │ S_IXGRP         │ Execute/search permission for file group class. │
1310     │00004 │ S_IROTH         │ Read permission for file other class.           │
1311     │00002 │ S_IWOTH         │ Write permission for file other class.          │
1312     │00001 │ S_IXOTH         │ Execute/search permission for file other class. │
1313     └──────┴─────────────────┴─────────────────────────────────────────────────┘
1314       When  appropriate  privilege is required to set one of these mode bits,
1315       and the user restoring the files from the archive  does  not  have  the
1316       appropriate  privilege,  the mode bits for which the user does not have
1317       appropriate privilege shall be ignored. Some of the mode  bits  in  the
1318       archive  format  are not mentioned elsewhere in this volume of IEEE Std
1319       1003.1-2001. If the implementation does not support  those  bits,  they
1320       may be ignored.
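
       As an illustration of the mode field, the sketch below decodes the
       bits from the table and drops those the restoring  user  may  not
       set. The function names are hypothetical, and the default mask  of
       07000 is an assumption; which bits require privilege depends on the
       system.

```python
# Bit values and names from the table "ustar mode Field".
MODE_BITS = [
    (0o4000, "S_ISUID"), (0o2000, "S_ISGID"), (0o1000, "<reserved>"),
    (0o0400, "S_IRUSR"), (0o0200, "S_IWUSR"), (0o0100, "S_IXUSR"),
    (0o0040, "S_IRGRP"), (0o0020, "S_IWGRP"), (0o0010, "S_IXGRP"),
    (0o0004, "S_IROTH"), (0o0002, "S_IWOTH"), (0o0001, "S_IXOTH"),
]


def mode_names(mode):
    """List the names of all bits set in a decoded mode value."""
    return [name for bit, name in MODE_BITS if mode & bit]


def drop_privileged_bits(mode, privileged_mask=0o7000):
    """Ignore the bits the user lacks privilege for (mask is assumed)."""
    return mode & ~privileged_mask & 0o7777
```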
1321
1322       The uid and gid fields are the user and group ID of the owner and group
1323       of the file, respectively.
1324
1325       The size field is the size of the file in octets. If the typeflag field
1326       is  set  to  specify  a  file to be of type 1 (a link) or 2 (a symbolic
1327       link), the size field shall be specified as zero. If the typeflag field
1328       is set to specify a file of type 5 (directory), the size field shall be
1329       interpreted as described under the definition of that record  type.  No
1330       data  logical  records are stored for types 1, 2, or 5. If the typeflag
1331       field is set to 3 (character special file), 4 (block special file),  or
1332       6  (FIFO),  the meaning of the size field is unspecified by this volume
1333       of IEEE Std 1003.1-2001, and no data logical records shall be stored on
1334       the  medium.  Additionally, for type 6, the size field shall be ignored
1335       when reading. If the typeflag field is set to any other value, the num‐
1336       ber   of   logical  records  written  following  the  header  shall  be
1337       (size+511)/512, ignoring any fraction in the result of the division.
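
       The number of data logical records that follow a  header  can  be
       computed as shown in this sketch. The function name is hypothetical;
       the typeflag is passed as the single raw octet from the header.

```python
NO_DATA_TYPES = (b"1", b"2", b"3", b"4", b"5", b"6")


def data_records(typeflag, size):
    """Return how many 512-octet data records follow the header."""
    if typeflag in NO_DATA_TYPES:
        return 0                    # links, devices, directories, FIFOs
    return (size + 511) // 512      # integer division drops the fraction
```

       For example, a regular file of 513 octets occupies two data logical
       records, the second of which is mostly padding.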
1338
1339       The mtime field shall be the modification time of the file at the  time
1340       it  was archived. It is the ISO/IEC 646:1991 standard representation of
1341       the octal value of the modification  time  obtained  from  the  stat(2)
1342       function.
1343
1344       The chksum field shall be the ISO/IEC 646:1991 standard IRV representa‐
1345       tion of the octal value of the simple sum of all octets in  the  header
1346       logical  record.  Each  octet  in  the  header  shall  be treated as an
1347       unsigned value. These values shall be added  to  an  unsigned  integer,
1348       initialized  to  zero, the precision of which is not less than 17 bits.
1349       When calculating the checksum, the chksum field is  treated  as  if  it
1350       were all spaces.
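
       The checksum calculation can be sketched  as  follows.  The  function
       name is hypothetical; offset 148 and length 8 of the chksum field are
       taken from the header table.

```python
def ustar_checksum(record):
    """Sum all octets of a header record as unsigned values.

    The eight octets of the chksum field (offset 148) are counted as
    spaces, whatever they actually contain.
    """
    if len(record) != 512:
        raise ValueError("a header logical record is 512 octets")
    blanked = record[:148] + b" " * 8 + record[156:]
    return sum(blanked)
```

       The largest possible sum is 512 * 255 = 130560, which  is  why  an
       accumulator of at least 17 bits is sufficient.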
1351
1352       The typeflag field specifies the type of file archived. If a particular
1353       implementation does not recognize the type, or the user does  not  have
1354       appropriate  privilege to create that type, the file shall be extracted
1355       as if it were a regular file if the file type  is  defined  to  have  a
1356       meaning  for the size field that could cause data logical records to be
1357       written on the medium (see the previous description for size).  If con‐
1358       version  to  a  regular  file  occurs, the pax utility shall produce an
1359       error indicating that the conversion took place. All  of  the  typeflag
1360       fields shall be coded in the ISO/IEC 646:1991 standard IRV:
1361
1362       0      Represents  a regular file. For backwards-compatibility, a type‐
1363              flag value of binary zero ('\0') should be recognized as meaning
1364              a  regular file when extracting files from the archive. Archives
1365              written with this version of the archive file format create
1366              regular files with a typeflag value of the ISO/IEC 646:1991 standard
1367              IRV '0'.
1368
1369       1      Represents a file linked to another file, of  any  type,  previ‐
1370              ously  archived.  Such  files  are identified by having the same
1371              device and file serial numbers, and pathnames that refer to dif‐
1372              ferent  directory  entries.  All such files shall be archived as
1373              linked files. The linked-to name is specified  in  the  linkname
1374              field  with  a  NUL-character  terminator if it is less than 100
1375              octets in length.
1376
1377       2      Represents a symbolic link. The contents of  the  symbolic  link
1378              shall be stored in the linkname field.
1379
1380       3,4    Represent  character  special  files  and  block  special  files
1381              respectively. In this case  the  devmajor  and  devminor  fields
1382              shall  contain  information  defining  the device, the format of
1383              which is unspecified by this volume  of  IEEE  Std  1003.1-2001.
1384              Implementations  may  map the device specifications to their own
1385              local specification or may ignore the entry.
1386
1387       5      Specifies a directory or subdirectory.  On  systems  where  disk
1388              allocation  is  performed  on  a directory basis, the size field
1389              shall contain the maximum number of octets (which may be rounded
1390              to  the  nearest  disk block allocation unit) that the directory
1391              may hold. A size field of zero indicates no such limiting.  Sys‐
1392              tems  that  do not support limiting in this manner should ignore
1393              the size field.
1394
1395       6      Specifies a FIFO special file. Note that the archiving of a FIFO
1396              file archives the existence of this file and not its contents.
1397
1398       7      Reserved  to  represent  a  file  to which an implementation has
1399              associated  some  high-performance  attribute.   Implementations
1400              without such extensions should treat this file as a regular file
1401              (type 0).
1402
1403       A-Z    The letters 'A' to  'Z',  inclusive,  are  reserved  for  custom
1404              implementations.  All  other values are reserved for future ver‐
1405              sions of IEEE Std 1003.1-2001.
1406
1407       It is unspecified whether files with pathnames that refer to  the  same
1408       directory  entry  are archived as linked files or as separate files. If
1409       they are archived as  linked  files,  this  means  that  attempting  to
1410       extract  both pathnames from the resulting archive will always cause an
1411       error (unless the -u option is used) because the link  cannot  be  cre‐
1412       ated.
1413
1414       It  is  unspecified  whether files with the same device and file serial
1415       numbers being appended to an archive are treated  as  linked  files  to
1416       members that were in the archive before the append.
1417
1418       Attempts  to archive a socket using ustar interchange format shall pro‐
1419       duce a diagnostic message. Handling of other file types is  implementa‐
1420       tion-defined.
1421
1422       The  magic  field  is the specification that this archive was output in
1423       this archive format. If this field contains ustar (the five  characters
1424       from  the  ISO/IEC  646:1991  standard  IRV shown followed by NUL), the
1425       uname and gname fields shall contain the ISO/IEC 646:1991 standard  IRV
1426       representation  of the owner and group of the file, respectively (trun‐
1427       cated to fit, if necessary).  When the file is  restored  by  a  privi‐
1428       leged, protection-preserving version of the utility, the user and group
1429       databases shall be scanned for these names.  If  found,  the  user  and
1430       group  IDs  contained  within these files shall be used rather than the
1431       values contained within the uid and gid fields.
1432
1433
1434   cpio Interchange Format
1435       The octet-oriented cpio archive format shall be a  series  of  entries,
1436       each comprising a header that describes the file, the name of the file,
1437       and then the contents of the file.
1438
1439       An archive may be recorded as a series of fixed-size blocks of  octets.
1440       This  blocking  shall be used only to make physical I/O more efficient.
1441       The last group of blocks shall always be at the full size.
1442
1443       For the octet-oriented cpio archive format, the individual entry infor‐
1444       mation  shall  be in the order indicated and described by the following
1445       table; see also the <cpio.h> header.
1446
1447                      Table: Octet-Oriented cpio Archive Entry
1448
1449            ┌─────────────────────┬────────────────────┬─────────────────┐
1450            │Header Field Name    │ Length (in Octets) │ Interpreted as  │
1451            ├─────────────────────┼────────────────────┼─────────────────┤
1452            │c_magic              │ 6                  │ Octal number    │
1453            │c_dev                │ 6                  │ Octal number    │
1454            │c_ino                │ 6                  │ Octal number    │
1455            │c_mode               │ 6                  │ Octal number    │
1456            │c_uid                │ 6                  │ Octal number    │
1457            │c_gid                │ 6                  │ Octal number    │
1458            │c_nlink              │ 6                  │ Octal number    │
1459            │c_rdev               │ 6                  │ Octal number    │
1460            │c_mtime              │ 11                 │ Octal number    │
1461            │c_namesize           │ 6                  │ Octal number    │
1462            │c_filesize           │ 11                 │ Octal number    │
1463            │                     │                    │                 │
1464            │Filename Field Name  │ Length             │ Interpreted as  │
1465            │c_name               │ c_namesize         │ Pathname string │
1466            │                     │                    │                 │
1467            │File Data Field Name │ Length             │ Interpreted as  │
1468            │c_filedata           │ c_filesize         │ Data            │
1469            └─────────────────────┴────────────────────┴─────────────────┘
1470   cpio Header
1471       For each file in the archive, a header as defined previously  shall  be
1472       written.  The information in the header fields is written as streams of
1473       the ISO/IEC 646:1991 standard characters interpreted as octal  numbers.
1474       The  octal numbers shall be extended to the necessary length by append‐
1475       ing the ISO/IEC 646:1991 standard IRV zeros  at  the  most-significant-
1476       digit  end of the number; the result is written to the most-significant
1477       digit of the stream of octets first. The fields shall be interpreted as
1478       follows:
1479
1480       c_magic
1481              Identify  the  archive  as being a transportable archive by con‐
1482              taining the identifying value "070707".
1483
1484       c_dev, c_ino
1485              Contains values that uniquely identify the file within  the  ar‐
1486              chive  (that  is,  no  files  contain the same pair of c_dev and
1487              c_ino values unless they are links to the same file). The values
1488              shall be determined in an unspecified manner.
1489
1490       c_mode Contains  the file type and access permissions as defined in the
1491              following table.
1492
1493                            Table: Values for cpio c_mode Field
1494
1495                 ┌──────────────────────┬─────────┬────────────────────────┐
1496                 │File Permissions Name │ Value   │ Indicates              │
1497                 ├──────────────────────┼─────────┼────────────────────────┤
1498                 │C_IRUSR               │ 000400  │ Read by owner          │
1499                 │C_IWUSR               │ 000200  │ Write by owner         │
1500                 │C_IXUSR               │ 000100  │ Execute by owner       │
1501                 │C_IRGRP               │ 000040  │ Read by group          │
1502                 │C_IWGRP               │ 000020  │ Write by group         │
1503                 │C_IXGRP               │ 000010  │ Execute by group       │
1504                 │C_IROTH               │ 000004  │ Read by others         │
1505                 │C_IWOTH               │ 000002  │ Write by others        │
1506                 │C_IXOTH               │ 000001  │ Execute by others      │
1507                 │C_ISUID               │ 004000  │ Set uid                │
1508                 │C_ISGID               │ 002000  │ Set gid                │
1509                 │C_ISVTX               │ 001000  │ Reserved               │
1510                 ├──────────────────────┼─────────┼────────────────────────┤
1511                 │File Type Name        │ Value   │ Indicates              │
1512                 ├──────────────────────┼─────────┼────────────────────────┤
1513                 │C_ISDIR               │ 0040000 │ Directory              │
1514                 │C_ISFIFO              │ 0010000 │ FIFO                   │
1515                 │C_ISREG               │ 0100000 │ Regular file           │
1516                 │C_ISLNK               │ 0120000 │ Symbolic link          │
1517                 │C_ISBLK               │ 0060000 │ Block special file     │
1518                 │C_ISCHR               │ 0020000 │ Character special file │
1519                 │C_ISSOCK              │ 0140000 │ Socket                 │
1520                 │C_ISCTG               │ 0110000 │ Reserved               │
1521                 └──────────────────────┴─────────┴────────────────────────┘
1522              Directories, FIFOs, symbolic links, and regular files  shall  be
1523              supported  on  a  system  conforming  to this volume of IEEE Std
1524              1003.1-2001; additional values defined previously  are  reserved
1525              for  compatibility with existing systems.  Additional file types
1526              may be supported; however, such files should not be  written  to
1527              archives intended to be transported to other systems.
1528
1529       c_uid  Contains the user ID of the owner.
1530
1531       c_gid  Contains the group ID of the group.
1532
1533       c_nlink
1534              Contains  a  number greater than or equal to the number of links
1535              in the archive referencing the file. If the -a option is used to
1536              append  to a cpio archive, then the pax utility need not account
1537              for the files in the existing part of the archive when calculat‐
1538              ing the c_nlink values for the appended part of the archive, and
1539              need not alter the c_nlink values in the existing  part  of  the
1540              archive if additional files with the same c_dev and c_ino values
1541              are appended to the archive.
1542
1543       c_rdev Contains implementation-defined  information  for  character  or
1544              block special files.
1545
1546       c_mtime
1547              Contains the latest time of modification of the file at the time
1548              the archive was created.
1549
1550       c_namesize
1551              Contains the length of the pathname, including  the  terminating
1552              NUL character.
1553
1554       c_filesize
1555              Contains  the  length  of  the file in octets. This shall be the
1556              length of the data section following the header structure.
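
       The header fields described above can be decoded with  a  short
       routine. This is a sketch with hypothetical names. TYPE_MASK does not
       appear in the tables; it is an assumption chosen to cover all of the
       listed file type values.

```python
# Field names and lengths from "Octet-Oriented cpio Archive Entry".
CPIO_FIELDS = (("c_magic", 6), ("c_dev", 6), ("c_ino", 6), ("c_mode", 6),
               ("c_uid", 6), ("c_gid", 6), ("c_nlink", 6), ("c_rdev", 6),
               ("c_mtime", 11), ("c_namesize", 6), ("c_filesize", 11))
CPIO_HEADER_LEN = sum(length for _, length in CPIO_FIELDS)   # 76 octets

TYPE_MASK = 0o170000   # assumption: covers every file type value below
FILE_TYPES = {
    0o040000: "directory", 0o010000: "FIFO", 0o100000: "regular file",
    0o120000: "symbolic link", 0o060000: "block special file",
    0o020000: "character special file", 0o140000: "socket",
    0o110000: "reserved",
}


def parse_cpio_header(data):
    """Decode the fixed-length octal fields of one cpio header."""
    if len(data) < CPIO_HEADER_LEN:
        raise ValueError("truncated cpio header")
    fields, offset = {}, 0
    for name, length in CPIO_FIELDS:
        fields[name] = int(data[offset:offset + length].decode("ascii"), 8)
        offset += length
    if fields["c_magic"] != 0o070707:
        raise ValueError("not an octet-oriented cpio header")
    return fields


def file_type(c_mode):
    """Name the file type encoded in c_mode, or 'unknown'."""
    return FILE_TYPES.get(c_mode & TYPE_MASK, "unknown")
```

       The permission bits are the low twelve bits of c_mode, that is,
       c_mode & 07777.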
1557
1558
1559   cpio Filename
1560       The c_name field shall contain the pathname of the file. The length  of
1561       this field in octets is the value of c_namesize.
1562
1563       If a filename is found on the medium that would create an invalid path‐
1564       name, it is implementation-defined whether the data from  the  file  is
1565       stored on the file hierarchy and under what name it is stored.
1566
1567       All  characters  shall  be represented in the ISO/IEC 646:1991 standard
1568       IRV. For maximum portability between implementations, names  should  be
1569       selected from characters represented by the portable filename character
1570       set as octets with the most significant bit zero. If an  implementation
1571       supports  the use of characters outside the portable filename character
1572       set in names for files, users, and groups, one or more  implementation-
1573       defined encodings of these characters shall be provided for interchange
1574       purposes. However, the pax utility shall never create filenames on  the
1575       local  system that cannot be accessed via the procedures described pre‐
1576       viously in this volume of IEEE Std 1003.1-2001. If a filename is  found
1577       on  the medium that would create an invalid filename, it is implementa‐
1578       tion-defined whether the data from the file is stored on the local file
1579       system  and under what name it is stored. The pax utility may choose to
1580       ignore these files as long as it produces an error indicating that  the
1581       file is being ignored.
1582
1583
1584   cpio File Data
1585       Following  c_name, there shall be c_filesize octets of data.  Interpre‐
1586       tation of such data occurs in  a  manner  dependent  on  the  file.  If
1587       c_filesize is zero, no data shall be contained in c_filedata.
1588
1589       When restoring from an archive:
1590
1591       ·      If  the user does not have the appropriate privilege to create a
1592              file of the specified type, pax shall ignore the entry and write
1593              an error message to standard error.
1594
1595       ·      Only regular files have data to be restored. Presuming a regular
1596              file meets any selection criteria that might be imposed  on  the
1597              format-reading utility by the user, such data shall be restored.
1598
1599       ·      If  a user does not have appropriate privilege to set a particu‐
1600              lar mode flag, the flag shall be ignored. Some of the mode flags
1601              in the archive format are not mentioned elsewhere in this volume
1602              of IEEE Std 1003.1-2001. If the implementation does not  support
1603              those flags, they may be ignored.
1604
1605
1606   cpio Special Entries
1607       FIFO special files, directories, and the trailer shall be recorded with
1608       c_filesize equal to  zero.  For  other  special  files,  c_filesize  is
1609       unspecified  by this volume of IEEE Std 1003.1-2001. The header for the
1610       next file entry in the archive shall be written directly after the last
1611       octet  of  the  file entry preceding it. A header denoting the filename
1612       TRAILER!!!  shall indicate the end of  the  archive;  the  contents  of
1613       octets  in  the  last  block of the archive following such a header are
1614       undefined.
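
       The layout described above (header, then c_name, then c_filesize
       octets of data, ending at the TRAILER!!! entry) can be sketched as
       a small reader. This is an illustrative sketch, not the pax
       implementation. It assumes the usual octet-oriented cpio header of
       76 octets with magic 070707 and the field widths listed in FIELDS;
       the names read_entries and make_entry are hypothetical helpers.

```python
import io

HEADER_LEN = 76  # assumed total length of the fixed-width octal header

# (field name, width in octets) in archive order; widths are an assumption
# taken from the octet-oriented cpio header layout.
FIELDS = [
    ("c_magic", 6), ("c_dev", 6), ("c_ino", 6), ("c_mode", 6),
    ("c_uid", 6), ("c_gid", 6), ("c_nlink", 6), ("c_rdev", 6),
    ("c_mtime", 11), ("c_namesize", 6), ("c_filesize", 11),
]


def read_entries(stream):
    """Yield (header, name, data) for each entry before TRAILER!!!."""
    while True:
        raw = stream.read(HEADER_LEN)
        if len(raw) < HEADER_LEN:
            raise ValueError("archive ended before TRAILER!!!")
        header, pos = {}, 0
        for field, width in FIELDS:
            header[field] = int(raw[pos:pos + width], 8)  # octal digits
            pos += width
        if header["c_magic"] != 0o070707:
            raise ValueError("bad c_magic")
        # c_namesize includes the terminating NUL, so drop the last octet.
        name = stream.read(header["c_namesize"])[:-1].decode("ascii")
        if name == "TRAILER!!!":
            return  # octets after the trailer are undefined; stop here
        # The next header starts directly after the last data octet.
        data = stream.read(header["c_filesize"])
        yield header, name, data


def make_entry(name, data=b"", mode=0o100644):
    """Build one entry for demonstration; unset fields are written as zero."""
    encoded = name.encode("ascii") + b"\0"
    values = {"c_magic": 0o070707, "c_mode": mode, "c_nlink": 1,
              "c_namesize": len(encoded), "c_filesize": len(data)}
    header = "".join(f"{values.get(f, 0):0{w}o}" for f, w in FIELDS)
    return header.encode("ascii") + encoded + data


archive = (make_entry("hello.txt", b"hi\n")
           + make_entry("dir", mode=0o040755)   # directory: c_filesize is 0
           + make_entry("TRAILER!!!", mode=0))
for header, name, data in read_entries(io.BytesIO(archive)):
    print(name, header["c_filesize"], data)
```

       The directory entry carries no data, so the reader moves straight
       from its c_name to the next header.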
1615
1616

EXIT STATUS

1618       The following exit values shall be returned:
1619
1620        0     All files were processed successfully.
1621
1622       >0     An error occurred.
1623
1624

CONSEQUENCES OF ERRORS

1626       If pax cannot create a file or a link when reading an archive or cannot
1627       find  a  file  when writing an archive, or cannot preserve the user ID,
1628       group ID, or file mode when the -p option is  specified,  a  diagnostic
1629       message  shall  be written to standard error and a non-zero exit status
1630       shall be returned, but processing shall continue. In the case where pax
1631       cannot  create  a  link  to a file, pax shall not, by default, create a
1632       second copy of the file.
1633
1634       If the extraction of a file from an archive is  prematurely  terminated
1635       by a signal or error, pax may have only partially extracted the file or
1636       (if the -n option was not specified) may have extracted a file  of  the
1637       same  name as that specified by the user, but which is not the file the
1638       user wanted. Additionally, the file modes of extracted directories  may
1639       have  additional  bits  from  the S_IRWXU mask set as well as incorrect
1640       modification and access times.
1641
1642
1643_________________________________________________________________

The following sections are informative.

1645
1646

APPLICATION USAGE

1648       Caution is advised when using the -a option to append to a cpio  format
1649       archive. If any of the files being appended happen to be given the same
1650       c_dev and c_ino values as a file in the existing part of  the  archive,
1651       then  they may be treated as links to that file on extraction. Thus, it
1652       is risky to use -a with cpio format except when it is done on the  same
1653       system  that the original archive was created on, and with the same pax
1654       utility, and in the knowledge that there has been  little  or  no  file
1655       system  activity since the original archive was created that could lead
1656       to any of the files appended being given the same c_dev and c_ino  val‐
1657       ues  as  an  unrelated  file in the existing part of the archive. Also,
1658       when (intentionally) appending additional links to a file in the exist‐
1659       ing part of the archive, the c_nlink values in the modified archive can
1660       be smaller than the number of links to the file in the  archive,  which
1661       may mean that the links are not preserved on extraction.
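
       The risk described above can be illustrated with a simplified
       model of a reader that treats a repeated (c_dev, c_ino) pair as a
       link to the first member that carried it. The function name
       resolve_links and the tuple layout are assumptions for this
       sketch; a real implementation may consult further fields such as
       c_nlink.

```python
def resolve_links(entries):
    """Map each member name to the earlier member it would be linked to.

    entries is an iterable of (name, c_dev, c_ino) in archive order.
    The value is None for the first member seen with a given pair.
    """
    first_seen = {}
    result = {}
    for name, dev, ino in entries:
        key = (dev, ino)
        if key in first_seen:
            result[name] = first_seen[key]   # treated as a link
        else:
            first_seen[key] = name
            result[name] = None
    return result


original = [("report.txt", 1, 100), ("notes.txt", 1, 101)]
# Appended later with -a; the file system happened to reuse inode 100.
appended = [("unrelated.txt", 1, 100)]
print(resolve_links(original + appended))
```

       In this model unrelated.txt is resolved as a link to report.txt
       even though the two files never shared contents.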
1662
1663       The  -p  (privileges)  option  was  invented  to  reconcile differences
1664       between historical tar and cpio implementations. In particular, the two
1665       utilities use -m in diametrically opposed ways. The -p option also pro‐
1666       vides a consistent means of extending the ways  in  which  future  file
1667       attributes  can  be addressed, such as for enhanced security systems or
1668       high-performance files. Although it may seem complex, there are  really
1669       two modes that are most commonly used:
1670
1671       -p e   "Preserve everything". This would be used by the historical
1672              superuser, someone with all the appropriate privileges, to  pre‐
1673              serve  all  aspects of the files as they are recorded in the ar‐
1674              chive. The e flag is the sum of o and p, and  other  implementa‐
1675              tion-defined attributes.
1676
1677       -p p   "Preserve" the file mode bits. This would be used by the user
1678              with regular privileges who wished to preserve  aspects  of  the
1679              file  other  than the ownership. The file times are preserved by
1680              default, but two other flags are offered to  disable  these  and
1681              use the time of extraction.
1682
1683       The  one pathname per line format of standard input precludes pathnames
1684       containing <newline>s. Although such  pathnames  violate  the  portable
1685       filename  guidelines,  they  may  exist  and their presence may inhibit
1686       usage of pax within shell scripts. This problem is inherited from  his‐
1687       torical  archive  programs. The problem can be avoided by listing file‐
1688       name arguments on the command line instead of on standard input.
1689
1690       It is almost certain that appropriate privileges are required  for  pax
1691       to  accomplish  parts of this volume of IEEE Std 1003.1-2001.  Specifi‐
1692       cally, creating files of  type  block  special  or  character  special,
1693       restoring file access times unless the files are owned by the user (the
1694       -t option), or preserving file owner, group, and mode (the  -p  option)
1695       all probably require appropriate privileges.
1696
1697       In read mode, implementations are permitted to overwrite files when the
1698       archive has multiple members with the same name. This may fail if  per‐
1699       missions  on the first version of the file do not permit it to be over‐
1700       written.
1701
1702       The cpio and ustar formats can only  support  files  up  to  8589934592
1703       bytes (8 * 2^30) in size.
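
       The figure above follows from the width of the size fields. As a
       worked check, assuming that both the cpio c_filesize field and the
       ustar size field hold 11 octal digits:

```python
SIZE_DIGITS = 11  # assumed number of octal digits in the size field

limit = 8 ** SIZE_DIGITS   # first value that no longer fits in the field
largest = limit - 1        # largest size the field can represent

print(limit)               # 8589934592
print(limit == 8 * 2**30)  # True
print(f"{largest:o}")      # 77777777777
```

       Under this assumption the largest size that can actually be
       recorded is one octet less than the quoted limit.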
1704
1705

EXAMPLES

1707       The following command:
1708
1709            pax -w -f /dev/rmt/1m .
1710
1711       copies the contents of the current directory to tape drive 1, medium
1712       density (assuming historical System V device naming procedures; the
1713       historical BSD device name would be /dev/rmt9).
1714
1715       The following commands:
1716
1717            mkdir newdir
                pax -rw olddir newdir
1718
1719       copy the olddir directory hierarchy to newdir.
1720
1721            pax -r -s ',^//*usr//*,,' -f a.pax
1722
1723       reads  the  archive a.pax, with all files rooted in /usr in the archive
1724       extracted relative to the current directory.
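
       The effect of the replstr in the example above on individual
       member names can be sketched as follows. This models only the
       name rewriting, not pax itself; the function name strip_usr is
       hypothetical. The expression ^//*usr//* contains no constructs
       that differ between basic regular expressions and Python's re
       syntax, so it is used unchanged.

```python
import re

# One or more slashes, "usr", one or more slashes, anchored at the start.
PATTERN = re.compile(r"^//*usr//*")


def strip_usr(path):
    """Return path with a leading /usr/ component removed, if present."""
    return PATTERN.sub("", path, count=1)


for member in ("/usr/foo/bar", "//usr//lib/x", "/etc/passwd", "/usrlocal/x"):
    print(member, "->", strip_usr(member))
```

       Names that do not begin with /usr/ are left untouched, which is
       why only members rooted in /usr are relocated.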
1725
1726       Using the option:
1727
1728            -o listopt="%M %(atime)T %(size)D %(name)s"
1729
1730       overrides the default output description in Standard Output and instead
1731       writes:
1732
1733            -rw-rw--- Jan 12 15:53 1492 /usr/foo/bar
1734
1735       Using the options:
1736
1737            -o listopt='%L\t%(size)D\n%.7' \
1738            -o listopt='(name)s\n%(atime)T\n%T'
1739
1740       overrides the default output description in Standard Output and instead
1741       writes:
1742
1743       /usr/foo/bar -> /tmp   1492
1744       /usr/fo
1745       Jan 12 1991
1746       Jan 31 15:53
1747
1748

RATIONALE

1750       The pax utility was new for the ISO POSIX-2:1993  standard.  It  repre‐
1751       sents a peaceful compromise between advocates of the historical tar and
1752       cpio utilities.
1753
1754       A fundamental difference between cpio and tar was in the  way  directo‐
1755       ries  were  treated. The cpio utility did not treat directories differ‐
1756       ently from other files, and to select  a  directory  and  its  contents
1757       required  that  each file in the hierarchy be explicitly specified. For
1758       tar, a directory matched every file in the file hierarchy it rooted.
1759
1760       The pax utility offers both interfaces;  by  default,  directories  map
1761       into the file hierarchy they root. The -d option causes pax to skip any
1762       file not explicitly referenced, as cpio historically did. The tar-style
1763       behavior was chosen as the default because it was believed that
1764       this was the more common usage and because tar  is  the  more  commonly
1765       available  interface,  as it was historically provided on both System V
1766       and BSD implementations.
1767
1768       The data interchange format specification in this volume  of  IEEE  Std
1769       1003.1-2001 requires that processes with "appropriate privileges" shall
1770       always restore the ownership and permissions of extracted files exactly
1771       as  archived. If viewed from the historic equivalence between superuser
1772       and "appropriate privileges", there are two problems with this require‐
1773       ment.  First, users running as superusers may unknowingly set dangerous
1774       permissions on extracted files. Second, it is needlessly  limiting,  in
1775       that  superusers  cannot extract files and own them as superuser unless
1776       the archive was created by the superuser.  (It  should  be  noted  that
1777       restoration  of  ownerships  and  permissions  for  the  superuser,  by
1778       default, is historical practice in cpio, but not in tar.)  In order  to
1779       avoid  these  two  problems,  the  pax  specification has an additional
1780       "privilege" mechanism, the -p option. Only a pax  invocation  with  the
1781       privileges needed, and which has the -p option set using the e specifi‐
1782       cation character, has the "appropriate privilege" to restore full  own‐
1783       ership and permission information.
1784
1785       Note  also  that  this volume of IEEE Std 1003.1-2001 requires that the
1786       file ownership and access permissions shall be set, on  extraction,  in
1787       the  same  fashion as the creat(2) function when provided with the mode
1788       stored in the archive. This means that the file creation  mask  of  the
1789       user is applied to the file permissions.
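
       As a small illustration of the rule above, the permission bits
       that result from combining an archived mode with a file creation
       mask can be computed as below. The function name mode_after_umask
       is hypothetical and the sketch covers only the nine permission
       bits.

```python
def mode_after_umask(archived_mode, umask):
    """Permission bits left after the file creation mask is applied."""
    return archived_mode & ~umask & 0o777


print(oct(mode_after_umask(0o666, 0o022)))  # 0o644
print(oct(mode_after_umask(0o777, 0o077)))  # 0o700
```

       A member archived with mode 666 is therefore extracted as 644
       when the user's mask is 022, unless privileges are preserved
       with the -p option.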
1790
1791       Users should note that directories may be created by pax while extract‐
1792       ing files with permissions that are different from those  that  existed
1793       at the time the archive was created. When extracting sensitive informa‐
1794       tion into a directory  hierarchy  that  no  longer  exists,  users  are
1795       encouraged  to  set  their  file creation mask appropriately to protect
1796       these files during extraction.
1797
1798       The table of contents output is written to standard output  to  facili‐
1799       tate pipeline processing.
1800
1801       An  early  proposal  had hard links displaying for all pathnames.  This
1802       was removed because it complicates the output of the case where  -v  is
1803       not  specified  and does not match historical cpio usage. The hard-link
1804       information is available in the -v display.
1805
1806       The description of the -l option allows implementations  to  make  hard
1807       links  to symbolic links. IEEE Std 1003.1-2001 does not specify any way
1808       to create a hard link to a symbolic link, but many implementations pro‐
1809       vide  this  capability as an extension. If there are hard links to sym‐
1810       bolic links when an archive is created, the implementation is  required
1811       to archive the hard link in the archive (unless -H or -L is specified).
1812       When in read mode and in copy  mode,  implementations  supporting  hard
1813       links to symbolic links should use them when appropriate.
1814
1815       The  archive formats inherited from the POSIX.1-1990 standard have cer‐
1816       tain restrictions that have been brought along from  historical  usage.
1817       For  example,  there are restrictions on the length of pathnames stored
1818       in the archive. When pax is used in copy (-rw) mode (copying  directory
1819       hierarchies),  the  ability  to  use  extensions from the -x pax format
1820       overcomes these restrictions.
1821
1822       The default blocksize value of 5120 bytes for cpio was selected because
1823       it  is  one of the standard block-size values for cpio, set when the -B
1824       option is specified. (The other default block-size value  for  cpio  is
1825       512  bytes, and this was considered to be too small.) The default block
1826       value of 10240 bytes for tar was selected because that is the  standard
1827       block-size  value  for  BSD tar.  The maximum block size of 32256 bytes
1828       (2^15-512 bytes) is the largest multiple of 512 bytes that fits into  a
1829       signed  16-bit tape controller transfer register. There are known limi‐
1830       tations in some historical systems that  would  prevent  larger  blocks
1831       from  being accepted. Historical values were chosen to improve compati‐
1832       bility with historical scripts using  dd(1)  or  similar  utilities  to
1833       manipulate  archives. Also, default block sizes for any file type other
1834       than character special file have been deleted from this volume  of  IEEE
1835       Std  1003.1-2001  as unimportant and not likely to affect the structure
1836       of the resulting archive.
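
       The block-size figures quoted above can be checked with a few
       lines of arithmetic. This is only a worked check of the numbers
       in the text; the variable names are chosen for illustration.

```python
INT16_MAX = 2**15 - 1                    # largest signed 16-bit value

# Largest multiple of 512 that still fits into a signed 16-bit register.
max_block = (INT16_MAX // 512) * 512

print(max_block)                         # 32256
print(max_block == 2**15 - 512)          # True
print(5120 // 512, 10240 // 512)         # 10 20
```

       The cpio and tar defaults correspond to 10 and 20 blocks of 512
       bytes respectively.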
1837
1838       Implementations are permitted to modify the block-size value  based  on
1839       the archive format or the device to which the archive is being written.
1840       This is to provide implementations with the opportunity to take  advan‐
1841       tage  of  special types of devices, and it should not be used without a
1842       great deal of consideration as it almost  certainly  decreases  archive
1843       portability.
1844
1845       The  intended  use  of the -n option was to permit extraction of one or
1846       more files from the archive without processing the entire archive. This
1847       was  viewed  by the standard developers as offering significant perfor‐
1848       mance advantages over historical  implementations.  The  -n  option  in
1849       early proposals had three effects; the first was to cause special char‐
1850       acters in patterns to not be treated specially. The second was to cause
1851       only  the  first file that matched a pattern to be extracted. The third
1852       was to cause pax to write a diagnostic message to standard  error  when
1853       no  file was found matching a specified pattern. Only the second behav‐
1854       ior is retained by this volume of IEEE Std 1003.1-2001, for  many  rea‐
1855       sons.  First,  it  is  in general not acceptable for a single option to
1856       have multiple effects. Second, the ability  to  make  pattern  matching
1857       characters  act  as  normal characters is useful for parts of pax other
1858       than file extraction. Third, a finer degree of control over the special
1859       characters  is useful because users may wish to normalize only a single
1860       special character in a single filename. Fourth, given  a  more  general
1861       escape  mechanism, the previous behavior of the -n option can be easily
1862       obtained using the -s option or a sed script. Finally, writing a  diag‐
1863       nostic message when a pattern specified by the user is unmatched by any
1864       file is useful behavior in all cases.
1865
1866       In this version, the -n option was removed from the copy mode synopsis
1867       of pax; it is inapplicable because there are no pattern operands
1868       specified in this mode.
1869
1870       Besides pax, IEEE Std 1003.1-2001 describes another method for copying
1871       subtrees as part of the cp(1) utility. Both methods are
1872       historical practice: cp(1) provides a simpler,  more  intuitive  inter‐
1873       face,  while  pax  offers a finer granularity of control. Each provides
1874       additional functionality to the other; in particular, pax maintains the
1875       hard-link  structure  of  the hierarchy while cp(1) does not. It is the
1876       intention of the standard developers that the results be similar (using
1877       appropriate option combinations in both utilities). The results are not
1878       required to be identical; there seemed insufficient  gain  to  applica‐
1879       tions  to balance the difficulty of implementations having to guarantee
1880       that the results would be exactly identical.
1881
1882       A single archive may span more than one  file.  It  is  suggested  that
1883       implementations  provide  informative  messages to the user on standard
1884       error whenever the archive file is changed.
1885
1886       The -d option (do not create intermediate directories not listed in the
1887       archive)  found in early proposals was originally provided as a comple‐
1888       ment to the historic -d option of cpio.  It has been deleted.
1889
1890       The -s option in early proposals specified a subset of the substitution
1891       command  from  the ed utility. As there was no reason for only a subset
1892       to be supported, the -s option is now compatible with  the  current  ed
1893       specification.  Since  the delimiter can be any non-null character, the
1894       following usage with single spaces is valid:
1895
1896            pax -s " foo bar " ...
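
       How a replstr with an arbitrary delimiter breaks down can be
       sketched as below. This is a simplified model: the function name
       split_replstr is hypothetical, and delimiters escaped inside the
       expression are not handled.

```python
def split_replstr(replstr):
    """Split <d>old<d>new<d>[flags] using the first character as <d>."""
    if not replstr:
        raise ValueError("empty replstr")
    delim = replstr[0]
    parts = replstr[1:].split(delim)
    if len(parts) != 3:
        raise ValueError("expected <delim>old<delim>new<delim>[flags]")
    old, new, flags = parts
    return old, new, flags


print(split_replstr(" foo bar "))        # ('foo', 'bar', '')
print(split_replstr(",^//*usr//*,,"))    # ('^//*usr//*', '', '')
```

       With a space as the delimiter, " foo bar " is read as "replace
       foo with bar", exactly as /foo/bar/ would be.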
1897
1898       The -t description is worded so as to note  that  this  may  cause  the
1899       access  time  update  caused by some other activity (which occurs while
1900       the file is being read) to be overwritten.
1901
1902       The default behavior of pax with regard to file modification  times  is
1903       the  same as historical implementations of tar.  It is not the histori‐
1904       cal behavior of cpio.
1905
1906       Because the -i option uses /dev/tty, utilities  without  a  controlling
1907       terminal are not able to use this option.
1908
1909       The  -y  option,  found  in early proposals, has been deleted because a
1910       line containing a single period for the -i option has equivalent  func‐
1911       tionality. The special lines for the -i option (a single period and the
1912       empty line) are historical practice in cpio.
1913
1914       In early drafts, a -e charmap option was included to increase portabil‐
1915       ity of files between systems using different coded character sets. This
1916       option was omitted because it was apparent that consensus could not  be
1917       formed  for it. In this version, the use of UTF-8 should be an adequate
1918       substitute.
1919
1920       The -k option was added to address  international  concerns  about  the
1921       dangers  involved  in  the  character set transformations of -e (if the
1922       target character set were different  from  the  source,  the  filenames
1923       might  be  transformed into names matching existing files) and also was
1924       made more general to protect files  transferred  between  file  systems
1925       with  different  {NAME_MAX}  values (truncating a filename on a smaller
1926       system might also inadvertently overwrite existing files).  As  stated,
1927       it  prevents any overwriting, even if the target file is older than the
1928       source. This version adds more granularity of  options  to  solve  this
1929       problem  by  introducing the -o invalid=option, specifically the UTF-8
1930       action. (Note that an existing file that is named with a UTF-8 encoding
1931       is still subject to overwriting in this case. The -k option closes that
1932       loophole.)
1933
1934       Some of the file characteristics referenced in this volume of IEEE  Std
1935       1003.1-2001  might  not be supported by some archive formats. For exam‐
1936       ple, neither the tar nor cpio formats contain the file access time. For
1937       this  reason, the e specification character has been provided, intended
1938       to cause all file  characteristics  specified  in  the  archive  to  be
1939       retained.
1940
1941       It  is  required  that  extracted  directories,  by default, have their
1942       access and modification times and permissions set to the values  speci‐
1943       fied  in the archive. This has obvious problems in that the directories
1944       are almost certainly modified after being extracted and that  directory
1945       permissions  may  not permit file creation. One possible solution is to
1946       create directories with the mode specified in the archive, as  modified
1947       by  the  umask  of  the user, with sufficient permissions to allow file
1948       creation. After all files have been extracted, pax would then reset the
1949       access and modification times and permissions as necessary.
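
       The two-pass approach suggested above can be sketched as
       follows. This is one possible reading of the paragraph, not a
       description of any particular pax implementation; the function
       name extract_tree and its argument layout are assumptions.

```python
import os
import tempfile


def extract_tree(root, dirs, files):
    """Create dirs and files under root, then restore directory metadata.

    dirs is a list of (relative path, mode, mtime), parents first;
    files maps relative paths to their contents.
    """
    for rel, mode, _mtime in dirs:
        # First pass: add owner rwx so that members can be created
        # even if the archived mode would forbid it.
        os.makedirs(os.path.join(root, rel), mode=mode | 0o700, exist_ok=True)
    for rel, data in files.items():
        with open(os.path.join(root, rel), "wb") as out:
            out.write(data)
    # Second pass, deepest directory first: reset times and permissions
    # once nothing more will be created inside the directories.
    for rel, mode, mtime in reversed(dirs):
        path = os.path.join(root, rel)
        os.utime(path, (mtime, mtime))
        os.chmod(path, mode)


with tempfile.TemporaryDirectory() as demo_root:
    extract_tree(demo_root,
                 [("a", 0o750, 1000000000), ("a/b", 0o700, 1000000001)],
                 {"a/b/f.txt": b"data"})
    info = os.stat(os.path.join(demo_root, "a"))
    print(oct(info.st_mode & 0o777), int(info.st_mtime))
```

       Resetting the metadata last avoids the problem that creating a
       member updates the modification time of its parent directory.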
1950
1951       The  list-mode  formatting  description  borrows  heavily  from the one
1952       defined by the printf(1) utility. However, since there is  no  separate
1953       operand  list  to  get conversion arguments, the format was extended to
1954       allow specifying the name of the conversion argument  as  part  of  the
1955       conversion specification.
1956
1957       The T conversion specifier allows time fields to be displayed in any of
1958       the date formats. Unlike the ls(1) utility, pax  does  not  adjust  the
1959       format  when  the  date is less than six months in the past. This makes
1960       parsing the output more predictable.
1961
1962       The  D  conversion  specifier  handles  the  ability  to  display   the
1963       major/minor or file size, as with ls(1), by using %-8(size)D.
1964
1965       The L conversion specifier handles the ls display for symbolic links.
1966
1967       Conversion  specifiers were added to generate existing known types used
1968       for ls(1).
1969
1970
1971   pax Interchange Format
1972       The new POSIX data interchange format was developed primarily  to  sat‐
1973       isfy  international  concerns  that  the ustar and cpio formats did not
1974       provide for file, user, and group names encoded in characters outside a
1975       subset  of the ISO/IEC 646:1991 standard. The standard developers real‐
1976       ized that this new POSIX data interchange format should be very  exten‐
1977       sible  because  there  were other requirements they foresaw in the near
1978       future:
1979
1980       ·      Support international character encodings and locale information
1981
1982       ·      Support security information (ACLs, and so on)
1983
1984       ·      Support future file types, such as realtime or contiguous files
1985
1986       ·      Include data areas for implementation use
1987
1988       ·      Support systems with words larger than 32 bits and  timers  with
1989              subsecond granularity
1990
1991       The  following  were not goals for this format because these are better
1992       handled by separate utilities or are inappropriate for a portable  for‐
1993       mat:
1994
1995       ·      Encryption
1996
1997       ·      Compression
1998
1999       ·      Data translation between locales and codesets
2000
2001       ·      inode storage
2002
2003       The  format  chosen  to  support the goals is an extension of the ustar
2004       format. Of the two formats previously available, only the ustar  format
2005       was selected for extensions because:
2006
2007       ·      It was easier to extend in an upwards-compatible way. It offered
2008              version flags and header block type fields with room for  future
2009              standardization. The cpio format, while possessing a more flexi‐
2010              ble file naming  methodology,  could  not  be  extended  without
2011              breaking  some theoretical implementation or using a dummy file‐
2012              name that could be a legitimate filename.
2013
2014       ·      Industry experience since  the  original  "tar wars"  fought  in
2015              developing the ISO POSIX-1 standard has clearly been in favor of
2016              the ustar format, which is generally the default  output  format
2017              selected for pax implementations on new systems.
2018
2019       The  new  format was designed with one additional goal in mind: reason‐
2020       able behavior when an older tar or pax utility happened to read an  ar‐
2021       chive.  Since the POSIX.1-1990 standard mandated that a "format-reading
2022       utility" had to treat unrecognized typeflag values  as  regular  files,
2023       this  allowed  the  format to include all the extended information in a
2024       pseudo-regular file that preceded each real file. An  option  is  given
2025       that  allows  the  archive creator to set up reasonable names for these
2026       files on the older systems.  Also, the  normative  text  suggests  that
2027       reasonable file access values be used for this ustar header block. Mak‐
2028       ing these header files inaccessible for convenient reading and deleting
2029       would not be reasonable. File permissions of 600 or 700 are suggested.
2030
2031       The  ustar  typeflag field was used to accommodate the additional func‐
2032       tionality of the new format rather than magic or  version  because  the
2033       POSIX.1-1990 standard (and, by reference, the previous version of pax),
2034       mandated the behavior of the format-reading utility when it encountered
2035       an unknown typeflag, but was silent about the other two fields.
2036
2037       Early proposals of the first revision to IEEE Std 1003.1-2001 contained
2038       a proposed archive format that was  based  on  compatibility  with  the
2039       standard  for tape files (ISO 1001, similar to the format used histori‐
2040       cally on many mainframes and minicomputers).  This  format  was  overly
2041       complex  and  required  considerable  overhead  in  volume  and  header
2042       records. Furthermore, the standard developers felt that it would not be
2043       acceptable  to  the  community  of  POSIX  developers,  so it was later
2044       changed to be a format more closely related to historical  practice  on
2045       POSIX systems.
2046
2047       The  prefix  and  name  split of pathnames in ustar was replaced by the
2048       single path extended header record for simplicity.
2049
2050       The concept of a global extended header (typeflag g) was controversial.
2051       If  this  were applied to an archive being recorded on magnetic tape, a
2052       few unreadable blocks at the beginning of the tape could be  a  serious
2053       problem; a utility attempting to extract as many files as possible from
2054       a damaged archive could lose a large percentage of file header informa‐
2055       tion  in  this case. However, if the archive were on a reliable medium,
2056       such as a CD-ROM, the global extended header offers considerable poten‐
2057       tial  size  reductions  by eliminating redundant information. Thus, the
2058       text warns against using the global method  for  unreliable  media  and
2059       provides  a  method  for  implanting global information in the extended
2060       header for each file, rather than in the typeflag g records.
2061
2062       No facility for data translation or filtering on a  per-file  basis  is
2063       included  because the standard developers could not invent an interface
2064       that would allow this in an efficient manner.  If  a  filter,  such  as
2065       encryption  or  compression,  is  to be applied to all the files, it is
2066       more efficient to apply the filter to the entire archive  as  a  single
2067       file. The standard developers considered interfaces that would invoke a
2068       shell script for each file going into or out of the  archive,  but  the
2069       system overhead in this approach was considered to be too high.
2070
2071       One such approach would be to have filter= records that give a pathname
2072       for an executable. When the program is invoked, the  file  and  archive
2073       would be open for standard input/output and all the header fields would
2074       be available as environment variables or  command-line  arguments.  The
2075       standard  developers  did  discuss  such schemes, but they were omitted
2076       from IEEE Std 1003.1-2001 due to  concerns  about  excessive  overhead.
2077       Also,  the program itself would need to be in the archive if it were to
2078       be used portably.
2079
2080       There is currently no  portable  means  of  identifying  the  character
2081       set(s)  used for a file in the file system. Therefore, pax has not been
2082       given a mechanism to generate charset records automatically.  The  only
2083       portable means of doing this is for the user to write the archive using
2084       the -o charset=string command line option. This assumes that all of the
2085       files  in  the  archive  use  the  same  encoding. The "implementation-
2086       defined" text is included to allow for a system that can  identify  the
2087       encodings used for each of its files.
2088
2089       The  table of standards that accompanies the charset record description
2090       is acknowledged to be very limited. Only a limited number of  character
2091       set  standards is reasonable for maximal interchange. Any character set
2092       is, of course, possible by  prior  agreement.  It  was  suggested  that
2093       EBCDIC  be  listed,  but  it was omitted because it is not defined by a
2094       formal standard. Formal standards, and then only those with  reasonably
2095       large  followings,  can be included here, simply as a matter of practi‐
2096       cality. The <value>s represent names of officially registered character
2097       sets in the format required by the ISO 2375:1985 standard.
2098
2099       The  normal  comma  or <blank>-separated list rules are not followed in
2100       the case of keyword options to  allow  ease  of  argument  parsing  for
2101       getopts.
2102
2103       Further  information on character encodings is in pax Archive Character
2104       Set Encoding/Decoding.
2105
2106       The standard developers have reserved keyword  name  space  for  vendor
2107       extensions. It is suggested that the format to be used is:
2108
2109           VENDOR.keyword
2110
2111       where VENDOR is the name of the vendor or organization in all uppercase
2112       letters. It is further suggested that the keyword following the  period
2113       be named differently than any of the standard keywords so that it could
2114       be used for future standardization, if  appropriate,  by  omitting  the
2115       VENDOR prefix.
2116
2117       The  <length>  field in the extended header record was included to make
2118       it simpler to step through the records, even if a  record  contains  an
2119       unknown  format (to a particular pax) with complex interactions of spe‐
2120       cial characters. It also provides a minor integrity  checkpoint  within
2121       the records to aid a program attempting to recover files from a damaged
2122       archive.
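
       As an illustration only, the following Python sketch shows how a reader
       might use the <length> field to step from record to record without
       interpreting the values. It assumes the "%d %s=%s\n" record layout
       (<length>, <keyword>, <value>) of the pax extended header, where
       <length> counts the whole record including itself and the trailing
       <newline>. The name iter_pax_records and the sample data are
       hypothetical and are not taken from any pax implementation.

```python
def iter_pax_records(data: bytes):
    """Yield (keyword, value) pairs from a block of extended header records."""
    pos = 0
    while pos < len(data):
        space = data.index(b" ", pos)      # end of the decimal <length> field
        length = int(data[pos:space])      # counts the whole record, itself included
        record = data[pos:pos + length]
        # Minor integrity checkpoint: the record must be complete and newline-terminated.
        if (length <= space - pos + 1 or len(record) != length
                or not record.endswith(b"\n")):
            raise ValueError(f"damaged record at offset {pos}")
        body = record[space - pos + 1:-1]  # drop "<length> " and the trailing newline
        keyword, _, value = body.partition(b"=")
        yield keyword.decode("utf-8"), value
        pos += length                      # step to the next record; value is not parsed


# Hypothetical sample: one standard keyword and one vendor keyword.
sample = b"30 mtime=1350244992.023960108\n19 VENDOR.flag=a=b\n"
records = list(iter_pax_records(sample))
```

       Because the step to the next record relies only on <length>, a value
       containing '=' or other special characters (as in the second sample
       record) does not disturb the traversal.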
2123
2124       There are no extended header versions  of  the  devmajor  and  devminor
2125       fields because the unspecified format ustar header field should be suf‐
2126       ficient. If they are not, vendor-specific extended  keywords  (such  as
2127       VENDOR.devmajor) should be used.
2128
2129       Device  and i-number labeling of files was not adopted from cpio; files
2130       are interchanged strictly on a symbolic name basis, as in ustar.
2131
2132       Just as with the ustar format descriptions, the  new  format  makes  no
2133       special arrangements for multi-volume archives. Each of the pax archive
2134       types is assumed to be inside a single POSIX file  and  splitting  that
2135       file  over  multiple  volumes  (diskettes, tape cartridges, and so on),
2136       processing their labels, and mounting each in the proper  sequence  are
2137       considered  to  be  implementation  details  that  cannot  be described
2138       portably.
2139
2140       The pax format is intended for interchange, not only for  backup  on  a
2141       single  (family  of)  systems.  It is not as densely packed as might be
2142       possible for backup:
2143
2144       ·      It contains information as coded characters that could be  coded
2145              in binary.
2146
2147       ·      It  identifies  extended  records with name fields that could be
2148              omitted in favor of a fixed-field layout.
2149
2150       ·      It translates names into a portable character set and identifies
2151              locale-related  information, both of which are probably unneces‐
2152              sary for backup.
2153
2154       The requirements on restoring from an archive  are  slightly  different
2155       from  the  historical wording, allowing for non-monolithic privilege to
2156       bring forward as much as possible. In particular,  attributes  such  as
2157       "high  performance  file"  might be broadly but not universally granted
2158       while set-user-ID or chown(2) might be much more restricted.  There  is
2159       no implication in IEEE Std 1003.1-2001 that the security information be
2160       honored after it is restored to the file hierarchy, in  spite  of  what
2161       might  be  improperly  inferred by the silence on that topic. That is a
2162       topic for another standard.
2163
2164       Links are recorded in the fashion described here because a link can  be
2165       to any file type. It is desirable in general to be able to restore part
2166       of an archive selectively and restore all of those files completely. If
2167       the  data  is  not  associated with each link, it is not possible to do
2168       this. However, the data associated with a file can be large,  and  when
2169       selective  restoration is not needed, this can be a significant burden.
2170       The archive is structured so that files that have  no  associated  data
2171       can always be restored by the name of any link, and
2172       the user may choose whether data is recorded with each  instance  of  a
2173       file  that  contains  data.  The format permits mixing of both types of
2174       links in a single archive; this can be done for special needs, and  pax
2175       is  expected  to interpret such archives on input properly, despite the
2176       fact that there is no pax option that would force this  mixed  case  on
2177       output.  (When  -o linkdata is used, the output must contain the dupli‐
2178       cate data, but the implementation is free to include it or omit it when
2179       -o linkdata is not used.)
2180
2181       The  time  values  are  included  as  extended header records for those
2182       implementations needing more than the eleven octal  digits  allowed  by
2183       the  ustar format. Portable file timestamps cannot be negative.  If pax
2184       encounters a file with a negative timestamp in copy or write  mode,  it
2185       can reject the file, substitute a non-negative timestamp, or generate a
2186       non-portable timestamp with a leading '-'. Even though some implementa‐
2187       tions  can support finer file-time granularities than seconds, the nor‐
2188       mative text requires support only for seconds since the  Epoch  because
2189       the  ISO  POSIX-1  standard  states  them  that  way.  The ustar format
2190       includes only mtime; the new format adds atime and ctime for  symmetry.
2191       The  atime  access time restored to the file system will be affected by
2192       the -p a and -p e options. The ctime creation time (actually inode mod‐
2193       ification  time)  is  described with "appropriate privilege" so that it
2194       can be ignored when writing to the file system. POSIX does not  provide
2195       a  portable  means to change file creation time. Nothing is intended to
2196       prevent a non-portable implementation of pax from restoring the value.
2197
2198       The gid, size, and uid extended header records were included  to  allow
2199       expansion  beyond  the  sizes  specified in the regular tar header. New
2200       file system architectures are emerging that will exhaust  the  12-digit
2201       size  field.  There are probably not many systems requiring more than 8
2202       digits for user and group IDs, but  the  extended  header  values  were
2203       included  for  completeness,  allowing overrides for all of the decimal
2204       values in the tar header.
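
       The arithmetic behind these limits can be sketched as follows. This
       is a minimal Python illustration, not part of the standard: the helper
       names are hypothetical, and the number of octal digits available in a
       given header field is passed in by the caller rather than asserted
       here. Eleven octal digits, the figure quoted above for the ustar time
       values, can hold values up to 8^11 - 1.

```python
def max_octal_value(digits: int) -> int:
    """Largest value that fits in a field holding the given number of octal digits."""
    return 8 ** digits - 1


def needs_extended_record(value: int, digits: int) -> bool:
    """True if value cannot be stored in the fixed-width octal field."""
    return value < 0 or value > max_octal_value(digits)


# Eleven octal digits, as quoted for the ustar time fields.
limit_11 = max_octal_value(11)
```

       A writer could apply such a check to each numeric field and emit an
       extended header record only for the values that do not fit.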
2205
2206       The standard developers intended to describe the effective  results  of
2207       pax with regard to file ownerships and permissions; implementations are
2208       not restricted in timing or sequencing the restoration  of  such,  pro‐
2209       vided the results are as specified.
2210
2211       Much  of  the  text  describing  the  extended headers refers to use in
2212       "write or copy modes". The copy mode references are due to  the  norma‐
2213       tive text: "The effect of the copy shall be as if the copied files were
2214       written to an archive file and then subsequently extracted ...".  There
2215       is  certainly  no  way  to  test whether pax is actually generating the
2216       extended headers in copy mode, but the effects must be as if it had.
2217
2218
2219   pax Archive Character Set Encoding/Decoding
2220       There is a need to exchange archives of files between systems  of  dif‐
2221       ferent  native codesets. Filenames, group names, and user names must be
2222       preserved to the fullest extent possible when an archive is read on the
2223       receiving  platform. Translation of the contents of files is not within
2224       the scope of the pax utility.
2225
2226       There will also be the need to represent characters that are not avail‐
2227       able  on the receiving platform. These unsupported characters cannot be
2228       automatically folded to the local set of characters due to  the  chance
2229       of collisions. This could result in overwriting previously extracted
2230       files from the archive or pre-existing files on the system.
2231
2232       For these reasons, the codeset used to represent characters within  the
2233       extended header records of the pax archive must be sufficiently rich to
2234       handle all commonly used character sets. The fields requiring  transla‐
2235       tion  include,  at  a  minimum, filenames, user names, group names, and
2236       link pathnames. Implementations may wish  to  have  localized  extended
2237       keywords that use non-portable characters.
2238
2239       The standard developers considered the following options:
2240
2241       ·      The  archive  creator  specifies  the  well-defined  name of the
2242              source codeset. The receiver must  then  recognize  the  codeset
2243              name and perform the appropriate translations to the destination
2244              codeset.
2245
2246       ·      The archive creator includes within the  archive  the  character
2247              mapping  table  for  the  source codeset used to encode extended
2248              header records. The receiver must then read the  character  map‐
2249              ping  table and perform the appropriate translations to the des‐
2250              tination codeset.
2251
2252       ·      The archive creator translates the extended  header  records  in
2253              the source codeset into a canonical form. The receiver must then
2254              perform the appropriate translations to the destination codeset.
2255
2256       The approach that incorporates the name of the source codeset poses the
2257       problem  of codeset name registration, and makes the archive useless to
2258       pax archive decoders that do not recognize that codeset.
2259
2260       Because parts of an archive may be corrupted, the  standard  developers
2261       felt  that  including  the  character map of the source codeset was too
2262       fragile. The loss of this one key component could result in making  the
2263       entire  archive  useless.  (The  difference between this and the global
2264       extended header decision was that the latter has a workaround, namely
2265       duplicating extended header records on unreliable media, but this would
2266       be too burdensome for large character set maps.)
2267
2268       Both of the above approaches also put an undue burden on  the  pax  ar‐
2269       chive  receiver  to handle the cross-product of all source and destina‐
2270       tion codesets.
2271
2272       To simplify the translation from the source codeset  to  the  canonical
2273       form  and from the canonical form to the destination codeset, the stan‐
2274       dard developers decided that the internal representation  should  be  a
2275       stateless  encoding.  A  stateless encoding is one where each codepoint
2276       has the same meaning, without regard to the decoder being in a specific
2277       state.  An  example of a stateful encoding would be the Japanese Shift-
2278       JIS; an example of a stateless encoding would be the  ISO/IEC  646:1991
2279       standard (equivalent to 7-bit ASCII).
2280
2281       For these reasons, the standard developers decided to adopt a canonical
2282       format for the representation of file information strings. The obvious,
2283       well-endorsed  candidate is the ISO/IEC 10646-1:2000 standard (based in
2284       part on Unicode), which can be used to represent the characters of vir‐
2285       tually  all  standardized  character sets. The standard developers ini‐
2286       tially agreed upon using UCS2 (16-bit Unicode) as the  internal  repre‐
2287       sentation.  This  repertoire of characters provides a sufficiently rich
2288       set to represent all commonly-used codesets.
2289
2290       However, the standard developers found that the 16-bit  Unicode  repre‐
2291       sentation  had some problems. It forced the issue of standardizing byte
2292       ordering. The 2-byte length of each character made the extended  header
2293       records  twice as long for the case of strings coded entirely from his‐
2294       torical 7-bit ASCII. For these reasons, the standard  developers  chose
2295       the UTF-8 defined in the ISO/IEC 10646-1:2000 standard. This multi-byte
2296       representation encodes UCS2 or UCS4 characters reliably and determinis‐
2297       tically,  eliminating  the need for a canonical byte ordering. In addi‐
2298       tion, NUL octets and other characters possibly confusing to POSIX  file
2299       systems  do not appear, except to represent themselves. It was realized
2300       that certain national codesets take up more space after  the  encoding,
2301       due  to their placement within the UCS range; it was felt that the use‐
2302       fulness of the encoding of the names outweighs the disadvantage of size
2303       increase for file, user, and group names.
2304
2305       The encoding of UTF-8 is as follows:
2306
2307       UCS4 Hex Encoding   UTF-8 Binary Encoding
2308       00000000-0000007F   0xxxxxxx
2309       00000080-000007FF   110xxxxx 10xxxxxx
2310       00000800-0000FFFF   1110xxxx 10xxxxxx 10xxxxxx
2311       00010000-001FFFFF   11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
2312       00200000-03FFFFFF   111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
2313       04000000-7FFFFFFF   1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
2314
2315       where  each  'x' represents a bit value from the character being trans‐
2316       lated.
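
       The table can be read as an algorithm. The following Python sketch is
       one possible transcription of it, given for illustration only; the
       function name utf8_encode is hypothetical. It covers the full range
       shown in the table, whereas Python's built-in codec accepts only code
       points up to 10FFFF, so the two can be compared only on that smaller
       range.

```python
# (upper bound of range, continuation octets, lead-octet marker), one row per
# multi-octet line of the table above.
_ROWS = (
    (0x7FF, 1, 0xC0),
    (0xFFFF, 2, 0xE0),
    (0x1FFFFF, 3, 0xF0),
    (0x3FFFFFF, 4, 0xF8),
    (0x7FFFFFFF, 5, 0xFC),
)


def utf8_encode(cp: int) -> bytes:
    """Encode one UCS4 value according to the table above."""
    if not 0 <= cp <= 0x7FFFFFFF:
        raise ValueError("value outside the range covered by the table")
    if cp <= 0x7F:
        return bytes([cp])                      # 0xxxxxxx: the octet is the value itself
    for limit, ncont, lead in _ROWS:
        if cp <= limit:
            break
    out = [lead | (cp >> (6 * ncont))]          # lead octet carries the highest bits
    for i in range(ncont - 1, -1, -1):
        out.append(0x80 | ((cp >> (6 * i)) & 0x3F))   # 10xxxxxx: six bits each
    return bytes(out)
```

       Every octet of a multi-octet sequence has its high bit set, which is
       why NUL and other 7-bit characters appear only when they represent
       themselves.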
2317
2318
2319   ustar Interchange Format
2320       The description of the ustar format reflects numerous enhancements over
2321       pre-1988  versions  of  the  historical  tar utility. The goal of these
2322       changes was not only to provide the  functional  enhancements  desired,
2323       but  also  to  retain  compatibility between new and old versions. This
2324       compatibility has been retained. Archives written using the old archive
2325       format are compatible with the new format.
2326
2327       Implementors  should  be  aware  that  the previous file format did not
2328       include a mechanism to archive directory type files. For  this  reason,
2329       the  convention  of  using  a filename ending with slash was adopted to
2330       specify a directory on the archive.
2331
2332       The total size of the name and prefix fields has been set to meet the
2333       minimum requirements for {PATH_MAX}. If a pathname will fit within the
2334       name field, it is recommended that the pathname be stored there without
2335       the use of the prefix field. Although the name field is known to be too
2336       small to contain {PATH_MAX} characters, the value was  not  changed  in
2337       this version of the archive file format to retain backwards-compatibil‐
2338       ity, and instead the prefix was introduced. Also, because of  the  ear‐
2339       lier  version  of the format, there is no way to remove the restriction
2340       on the linkname field being limited in size to just that  of  the  name
2341       field.
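
       A writer following this recommendation might split a pathname as in
       the Python sketch below. It is illustrative only: the function name
       split_ustar_path is hypothetical, and the field widths of 100 octets
       for name and 155 octets for prefix are taken from the ustar header
       layout rather than from this section. The sketch assumes the pathname
       is reassembled as prefix, a slash, and name, so the split must fall on
       a slash that is itself stored in neither field.

```python
NAME_SIZE = 100    # assumed width of the ustar name field, in octets
PREFIX_SIZE = 155  # assumed width of the ustar prefix field, in octets


def split_ustar_path(path: bytes):
    """Return (prefix, name), or None if the pathname cannot be stored."""
    if len(path) <= NAME_SIZE:
        return b"", path                      # fits in name: leave prefix empty
    # The slash at index i is dropped; prefix = path[:i], name = path[i+1:].
    lowest = max(1, len(path) - NAME_SIZE - 1)    # name must fit in NAME_SIZE
    highest = min(PREFIX_SIZE, len(path) - 2)     # prefix must fit; name non-empty
    for i in range(highest, lowest - 1, -1):
        if path[i:i + 1] == b"/":
            return path[:i], path[i + 1:]
    return None                               # no suitable slash: does not fit
```

       A None result corresponds to the case in which a pathname cannot be
       represented in the ustar header alone.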
2342
2343       The  size  field  is  required  to  be meaningful in all implementation
2344       extensions, although it could be zero. This is  required  so  that  the
2345       data blocks can always be properly counted.
2346
2347       If device special files that cannot be represented in the standard
2348       format need to be archived, it is suggested that one of the extension
2349       types (A-Z) be used, and that the additional information for the
2350       special file be represented as data and be reflected in the size
2351       field.
2352
2353       Attempting  to  restore  a  special file type, where it is converted to
2354       ordinary data and conflicts with an existing filename, need not be spe‐
2355       cially  detected by the utility. If run as an ordinary user, pax should
2356       not be able to overwrite the entries in, for example, /dev in any  case
2357       (whether  the  file  is  converted to another type or not). If run as a
2358       privileged user, it should be able to do so, and it would be considered
2359       a  bug if it did not. The same is true of ordinary data files and simi‐
2360       larly named special files; it is impossible to anticipate the needs  of
2361       the user (who could really intend to overwrite the file), so the behav‐
2362       ior should be predictable (and thus regular) and rely on the protection
2363       system as required.
2364
2365       The  value 7 in the typeflag field is intended to define how contiguous
2366       files can be stored in a ustar archive.  IEEE Std 1003.1-2001 does  not
2367       require  the  contiguous file extension, but does define a standard way
2368       of archiving such files so that all conforming  systems  can  interpret
2369       these  file  types  in  a meaningful and consistent manner. On a system
2370       that does not support extended file types, the pax  utility  should  do
2371       the best it can with the file and go on to the next.
2372
2373       The  file  protection  modes are those conventionally used by the ls(1)
2374       utility. This is extended beyond the usage in the ISO POSIX-2  standard
2375       to  support  the "shared text" or "sticky" bit. It is intended that the
2376       conformance document should not document anything beyond the  existence
2377       of  and  support  of  such  a mode.  Further extensions are expected to
2378       these bits, particularly with  overloading  the  set-user-ID  and  set-
2379       group-ID flags.
2380
2381
2382   cpio Interchange Format
2383       The  reference to appropriate privilege in the cpio format refers to an
2384       error on standard output; the ustar format  does  not  make  comparable
2385       statements.
2386
2387       The  model  for  this  format  was the historical System V cpio -c data
2388       interchange format. This model documents the portable  version  of  the
2389       cpio  format  and  not  the  binary  version. It has the flexibility to
2390       transfer data of any type described within IEEE Std 1003.1-2001, yet is
2391       extensible  to  transfer  data types specific to extensions beyond IEEE
2392       Std 1003.1-2001 (for example, contiguous files). Because  it  describes
2393       existing practice, there is no question of maintaining upwards-compati‐
2394       bility.
2395
2396
2397   cpio Header
2398       There has been some concern that the size of the  c_ino  field  of  the
2399       header  is too small to handle those systems that have very large inode
2400       numbers. However, the c_ino field in the header is used strictly  as  a
2401       hard-link  resolution mechanism for archives. It is not necessarily the
2402       same value as the inode number of the file in the location  from  which
2403       that file is extracted.
2404
2405       The name c_magic is based on historical usage.
2406
2407
2408   cpio Filename
2409       For  most  historical  implementations  of the cpio utility, {PATH_MAX}
2410       octets can be used to describe the pathname without the addition of any
2411       other  header  fields  (the  NUL  character  would  be included in this
2412       count).  {PATH_MAX} is the minimum value for pathname size,  documented
2413       as  256  bytes. However, an implementation may use c_namesize to deter‐
2414       mine the exact length of the pathname.  With the current description of
2415       the  <cpio.h>  header,  this  pathname size can be as large as a number
2416       that is described in six octal digits.
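
       For illustration, six octal digits bound the pathname size at 777777
       octal, or 262143 decimal. The Python sketch below formats such a field
       under the assumption that the stored size includes the terminating
       NUL; the function name format_c_namesize is hypothetical.

```python
MAX_NAMESIZE = 0o777777   # largest value expressible in six octal digits


def format_c_namesize(pathname: bytes) -> bytes:
    """Return the six-digit octal size field for a pathname."""
    size = len(pathname) + 1              # assumption: count the terminating NUL
    if size > MAX_NAMESIZE:
        raise ValueError("pathname too long for a six-digit octal field")
    return format(size, "06o").encode("ascii")
```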
2417
2418       Two values are documented under the c_mode field values to provide  for
2419       extensibility for known file types:
2420
2421       0110 000
2422              Reserved  for contiguous files. The implementation may treat the
2423              rest of the information for this archive like a regular file. If
2424              this  file  type is undefined, the implementation may create the
2425              file as a regular file.
2426
2427       This provides for extensibility of the cpio format while  allowing  for
2428       the  ability to read old archives. Files of an unknown type may be read
2429       as "regular files" on some implementations. On a system that  does  not
2430       support  extended file types, the pax utility should do the best it can
2431       with the file and go on to the next.
2432
2433

FUTURE DIRECTIONS

2435       None.
2436
2437

End of informative sections.

2439_________________________________________________________________
2440
2441

SEE ALSO

2443       Shell Command Language, cp(1), ed(1), getopts(1), ls(1), printf(3), the
2444       Base  Definitions  volume of IEEE Std 1003.1-2001, <cpio.h>, the System
2445       Interfaces  volume  of  IEEE  Std  1003.1-2001,   chown(2),   creat(2),
2446       mkdir(2), mkfifo(3), stat(2), utime(2), write(2).
2447
2448

CHANGE HISTORY

2450       First released in Issue 4.
2451
2452
2453   Issue 5
2454       A  note  is added to the APPLICATION USAGE indicating that the cpio and
2455       tar formats can only support files up to 8 gigabytes in size.
2456
2457
2458   Issue 6
2459       The pax utility is aligned with the IEEE P1003.2b draft standard:
2460
2461       ·      Support has been added for symbolic links  in  the  options  and
2462              interchange formats.
2463
2464       ·      A new format has been devised, based on extensions to ustar.
2465
2466       ·      References  to  the "extended" tar and cpio formats derived from
2467              the POSIX.1-1990  standard  have  been  changed  to  remove  the
2468              "extended" adjective because this could cause confusion with the
2469              extended tar header added in this revision. (All  references  to
2470              tar are actually to ustar.)
2471
2472       The TZ entry is added to the ENVIRONMENT VARIABLES section.
2473
2474       IEEE  PASC  Interpretation  1003.2  #168  is  applied,  clarifying that
2475       mkdir(2) and mkfifo(3) calls can ignore an [EEXIST] error when extract‐
2476       ing an archive.
2477
2478       IEEE  PASC  Interpretation  1003.2  #180  is  applied,  clarifying  how
2479       extracted files are created when in read mode.
2480
2481       IEEE  PASC  Interpretation  1003.2  #181  is  applied,  clarifying  the
2482       description of the -t option.
2483
2484       IEEE PASC Interpretation 1003.2 #195 is applied.
2485
2486       IEEE  PASC  Interpretation  1003.2 #206 is applied, clarifying the han‐
2487       dling of links for the -H, -L, and -l options.
2488
2489       IEEE Std 1003.1-2001/Cor 1-2002, item XCU/TC1/D6/35 is applied,  adding
2490       the process ID of the pax process into certain fields. This change pro‐
2491       vides  a  method  for  the  implementation  to  ensure  that  different
2492       instances of pax extracting a file named /a/b/foo will not collide when
2493       processing the extended header information associated with foo.
2494
2495       IEEE Std 1003.1-2001/Cor 1-2002, item XCU/TC1/D6/36 is applied,  chang‐
2496       ing -x B to -x pax in the OPTIONS section.
2497
2498       IEEE  Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/20 is applied, updat‐
2499       ing the SYNOPSIS to be consistent with the normative text.
2500
2501       IEEE Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/21 is applied,  updat‐
2502       ing  the  DESCRIPTION  to describe the behavior when files to be linked
2503       are symbolic links and the system is not capable of making  hard  links
2504       to symbolic links.
2505
2506       IEEE  Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/22 is applied, updat‐
2507       ing the OPTIONS section to  describe  the  behavior  for  how  multiple
2508       options are to be handled.
2509
2510       IEEE  Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/23 is applied, updat‐
2511       ing the write option within the OPTIONS section.
2512
2513       IEEE Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/24 is applied,  adding
2514       a  paragraph  into the OPTIONS section that states that specifying more
2515       than one of the mutually-exclusive options (-H and -L) is  not  consid‐
2516       ered  an  error  and  that the last option specified will determine the
2517       behavior of the utility.
2518
2519       IEEE Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/25 is applied,  remov‐
2520       ing  the  ctime  paragraph within the EXTENDED DESCRIPTION.  There is a
2521       contradiction in the definition  of  the  ctime  keyword  for  the  pax
2522       extended header, in that the st_ctime member of the stat structure does
2523       not refer to a file creation time. No field in the standard stat struc‐
2524       ture from <sys/stat.h> includes a file creation time.
2525
2526       IEEE  Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/26 is applied, making
2527       it clear that typeflag 1 (ustar Interchange Format) applies not
2528       only  to files that are hard-linked, but also to files that are aliased
2529       via symlinks.
2530
2531       IEEE Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/27 is applied,  clari‐
2532       fying the cpio c_nlink field.
2533
2534       End of quoted text from the POSIX.1-2001 standard.
2535

OTHER OPTIONS

2537       The following other options are implemented as extensions to the POSIX
2538       standard.  Note that some other  non-POSIX  options  are  mentioned  in
2539       -help  and  -xhelp output - these are also supported in spax(1) and are
2540       described in the star(1) manual page.
2541
2542       -help  Prints a summary of the most important options for  spax(1)  and
2543              exits.
2544
2545       -do-statistics
2546              Prints statistics messages at the end of a spax(1) run.
2547
2548       -xhelp Prints  a  summary of the less important options for spax(1) and
2549              exits.
2550
2551       -version
2552              Prints the spax version number string and exits.
2553
2554

EXAMPLES

ENVIRONMENT

FILES

SEE ALSO

DIAGNOSTICS

NOTES

2561       The Institute of Electrical and  Electronics  Engineers  and  The  Open
2562       Group have given us permission to reprint portions of their
2563       documentation. In the following statement, the phrase ``this text''
2564       refers to portions of the system documentation.
2565
2566       Portions  of  this text are reprinted and reproduced in electronic form
2567       in the spax manual, from IEEE Std 1003.1, 2004 Edition, Standard for
2568       Information  Technology -- Portable Operating System Interface (POSIX),
2569       The Open Group Base Specifications Issue 6, Copyright (C) 2001-2004  by
2570       the Institute of Electrical and Electronics Engineers, Inc and The Open
2571       Group. In the event of any discrepancy between these versions  and  the
2572       original  IEEE  and  The Open Group Standard, the original IEEE and The
2573       Open Group Standard is the referee document. The original Standard  can
2574       be obtained online at http://www.opengroup.org/unix/online.html.
2575

BUGS

AUTHOR

2578       Joerg Schilling
2579       Seestr. 110
2580       D-13353 Berlin
2581       Germany
2582
2583       Mail bugs and suggestions to:
2584
2585       schilling@fokus.fraunhofer.de       or       js@cs.tu-berlin.de      or
2586       joerg@schily.isdn.cs.tu-berlin.de
2587
2588
2589
2590Joerg Schilling                    13/04/16                           SPAX(1L)