SPAX(1L)                    Schily´s USER COMMANDS                    SPAX(1L)

NAME

       pax - portable archive interchange

SYNOPSIS

       spax        [other options]      [-cdnv]      [-H|-L]      [-f archive]
              [-o options]...  [-s replstr]...  [pattern...]

       spax   -r   [other options]     [-cdiknuv]     [-H|-L]     [-f archive]
              [-o options]...  [-p string]...  [-s replstr]...  [pattern...]

       spax   -w   [other options]   [-dituvX]   [-H|-L]  [-b blocksize]  [-a]
              [-f archive]   [-o options]...    [-s replstr]...    [-x format]
              [file...]

       spax   -r -w   [other options]   [-diklntuvX]   [-H|-L]  [-o options]...
              [-p string]...  [-s replstr]...  [file...] directory

DESCRIPTION

       The pax utility shall read, write, and list the members of archive
       files and copy directory hierarchies. A variety of archive formats
       shall be supported; see the -x format option.
29
30       The action to be taken depends  on  the  presence  of  the  -r  and  -w
31       options. The four combinations of -r and -w are referred to as the four
32       modes of operation: list, read, write, and  copy  modes,  corresponding
33       respectively to the four forms shown in the SYNOPSIS section.
34
       list   In list mode (when neither -r nor -w is specified), pax shall
              write the names of the members of the archive file read from the
              standard input, with pathnames matching the specified patterns,
              to standard output. If a named file is of type directory, the
              file hierarchy rooted at that file shall be listed as well.
40
41       read   In  read  mode  (when -r is specified, but -w is not), pax shall
42              extract the members of the archive file read from  the  standard
43              input,  with  pathnames  matching the specified patterns.  If an
44              extracted file is of type directory, the file  hierarchy  rooted
45              at  that  file  shall  be extracted as well. The extracted files
46              shall be created performing pathname resolution with the  direc‐
47              tory in which pax was invoked as the current working directory.
48
49              If  an attempt is made to extract a directory when the directory
50              already exists, this shall not be considered  an  error.  If  an
51              attempt  is made to extract a FIFO when the FIFO already exists,
52              this shall not be considered an error.
53
54              The ownership, access, and modification times, and file mode  of
55              the restored files are discussed under the -p option.
56
57       write  In  write  mode (when -w is specified, but -r is not), pax shall
58              write the contents of the file operands to the  standard  output
59              in  an archive format. If no file operands are specified, a list
60              of files to copy, one per line, shall be read from the  standard
61              input.  A  file of type directory shall include all of the files
62              in the file hierarchy rooted at the file.
63
64       copy   In copy mode (when both -r and -w are specified), pax shall copy
65              the file operands to the destination directory.
66
67              If  no file operands are specified, a list of files to copy, one
68              per line, shall be read from the standard input. A file of  type
69              directory  shall  include all of the files in the file hierarchy
70              rooted at the file.
71
72              The effect of the copy shall be as  if  the  copied  files  were
73              written  to  an  archive  file  and then subsequently extracted,
74              except that there may be hard links between the original and the
75              copied  files. If the destination directory is a subdirectory of
76              one of the files to be copied, the results are  unspecified.  If
77              the destination directory is a file of a type not defined by the
78              System Interfaces volume of IEEE Std  1003.1-2001,  the  results
79              are  implementation-defined; otherwise, it shall be an error for
80              the file named by the directory operand not  to  exist,  not  be
81              writable by the user, or not be a file of type directory.
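
       The relationship between the -r and -w options and the four modes
       described above can be summarized in a short Python sketch. The
       function name pax_mode is hypothetical and exists only for
       illustration; it is not part of spax.

```python
def pax_mode(r: bool, w: bool) -> str:
    """Return the mode name implied by the presence of -r and -w."""
    if r and w:
        return "copy"    # both -r and -w
    if r:
        return "read"    # -r only
    if w:
        return "write"   # -w only
    return "list"        # neither option
```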
82
83       In  read  or  copy  modes, if intermediate directories are necessary to
84       extract an archive member, pax shall perform actions equivalent to  the
85       mkdir()  function  defined  in the System Interfaces volume of IEEE Std
86       1003.1-2001, called with the following arguments:
87
88       ·      The intermediate directory used as the path argument.
89
90       ·      The value of the bitwise-inclusive OR of S_IRWXU,  S_IRWXG,  and
91              S_IRWXO as the mode argument.
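
       As an illustration of the mkdir() arguments listed above, the
       following Python sketch creates the missing parent directories of an
       archive member. The helper name make_intermediate_dirs is
       hypothetical, and the sketch only approximates what an
       implementation does; the resulting permissions are still subject to
       the umask of the process.

```python
import os
import stat

# The bitwise-inclusive OR of the three masks is 0o777.
INTERMEDIATE_MODE = stat.S_IRWXU | stat.S_IRWXG | stat.S_IRWXO

def make_intermediate_dirs(member_path: str) -> None:
    """Create any missing parent directories of member_path."""
    parent = os.path.dirname(member_path)
    if parent:
        # An already existing directory is not treated as an error.
        os.makedirs(parent, mode=INTERMEDIATE_MODE, exist_ok=True)
```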
92
93       If  any  specified pattern or file operands are not matched by at least
94       one file or archive member, pax shall write  a  diagnostic  message  to
95       standard error for each one that did not match and exit with a non-zero
96       exit status.
97
98       The archive formats described in the EXTENDED DESCRIPTION section shall
99       be  automatically  detected on input. The default output archive format
100       shall be implementation-defined.
101
102       The spax implementation defaults to -x ustar.
103
104       A single archive can span multiple files. The pax utility shall  deter‐
105       mine,  in  an implementation-defined manner, what file to read or write
106       as the next file.
107
108       If the selected archive format supports  the  specification  of  linked
109       files,  it  shall  be an error if these files cannot be linked when the
110       archive is extracted, except that if the files to be  linked  are  sym‐
111       bolic  links and the system is not capable of making hard links to sym‐
112       bolic links, then separate copies of the symbolic link shall be created
113       instead.  For archive formats that do not store file contents with each
114       name that causes a hard link, if the file that contains the data is not
115       extracted  during  this  pax session, either the data shall be restored
116       from the original file, or a diagnostic message shall be displayed with
117       the  name of a file that can be used to extract the data. In traversing
118       directories, pax shall detect infinite loops; that is, entering a  pre‐
119       viously visited directory that is an ancestor of the last file visited.
120       When it detects an infinite loop, pax shall write a diagnostic  message
121       to standard error and shall terminate.
122
123

OPTIONS

125       The  pax  utility  shall conform to the Base Definitions volume of IEEE
126       Std 1003.1-2001, Section 12.2, Utility Syntax Guidelines,  except  that
127       the order of presentation of the -o, -p, and -s options is significant.
128
129       See also the "OTHER OPTIONS" section.
130
131
132       The following options shall be supported:
133
134       -r     Read an archive file from standard input.
135
136       -w     Write files to the standard output in the specified archive for‐
137              mat.
138
139       -a     Append files to the end of the archive.  It  is  implementation-
140              defined  which  devices  on  the system support appending. Addi‐
141              tional file formats unspecified  by  this  volume  of  IEEE  Std
142              1003.1-2001 may impose restrictions on appending.
143
144       -b blocksize
145              Block  the  output at a positive decimal integer number of bytes
146              per write to the archive file. Devices and archive  formats  may
147              impose restrictions on blocking. Blocking shall be automatically
148              determined on input. Conforming applications shall not specify a
149              blocksize value larger than 32256.  Default blocking when creat‐
150              ing archives depends on the archive format. (See the  -x  option
151              below.)
152
153       -c     Match  all file or archive members except those specified by the
154              pattern or file operands.
155
156       -d     Cause files of type directory being copied or  archived  or  ar‐
157              chive  members  of  type  directory being extracted or listed to
158              match only the file or archive member itself and  not  the  file
159              hierarchy rooted at the file.
160
161       -f archive
162              Specify  the pathname of the input or output archive, overriding
163              the default standard input (in list or read modes)  or  standard
164              output (write mode).
165
166       -H     If a symbolic link referencing a file of type directory is spec‐
167              ified on the command line, pax shall archive the file  hierarchy
168              rooted in the file referenced by the link, using the name of the
169              link as the root of the file hierarchy.  Otherwise,  if  a  sym‐
170              bolic  link  referencing a file of any other file type which pax
171              can normally archive is specified on the command line, then  pax
172              shall archive the file referenced by the link, using the name of
173              the link. The default behavior shall be to archive the  symbolic
174              link itself.
175
176       -i     Interactively  rename files or archive members. For each archive
177              member matching a pattern operand or file matching a file  oper‐
178              and, a prompt shall be written to the file /dev/tty.  The prompt
179              shall contain the name of the file or archive  member,  but  the
180              format  is otherwise unspecified. A line shall then be read from
181              /dev/tty. If this line is blank,  the  file  or  archive  member
182              shall  be skipped. If this line consists of a single period, the
183              file or archive member shall be processed with  no  modification
184              to its name. Otherwise, its name shall be replaced with the con‐
185              tents of the line. The pax utility shall immediately exit with a
186              non-zero  exit status if end-of-file is encountered when reading
187              a response or if /dev/tty cannot be opened for reading and writ‐
188              ing.
189
190              The  results  of  extracting a hard link to a file that has been
191              renamed during extraction are unspecified.
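
              The handling of a response line under -i can be sketched in
              Python as follows. The function name is hypothetical, and
              treating a line that contains only white space as "blank" is
              an assumption of this sketch, not something stated above.

```python
def interpret_rename_response(name, line):
    """Return the name to use for a member, or None to skip it."""
    response = line.rstrip("\n")
    if response.strip() == "":
        return None        # blank line: skip the file or archive member
    if response == ".":
        return name        # single period: keep the name unchanged
    return response        # anything else replaces the name
```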
192
193       -k     Prevent the overwriting of existing files.
194
195       -l     (The letter ell.) In copy mode, hard links shall be made between
196              the  source  and destination file hierarchies whenever possible.
197              If specified in conjunction with -H or -L, when a symbolic  link
198              is  encountered,  the  hard link created in the destination file
199              hierarchy shall be to the file referenced by the symbolic  link.
200              If  specified  when  neither -H nor -L is specified, when a sym‐
201              bolic link is encountered, the  implementation  shall  create  a
202              hard  link  to the symbolic link in the source file hierarchy or
203              copy the symbolic link to the destination.
204
205       -L     If a symbolic link referencing a file of type directory is spec‐
206              ified on the command line or encountered during the traversal of
207              a file hierarchy, pax shall archive the file hierarchy rooted in
208              the  file  referenced by the link, using the name of the link as
209              the root of the file hierarchy.  Otherwise, if a  symbolic  link
210              referencing a file of any other file type which pax can normally
211              archive is specified on the command line or  encountered  during
212              the  traversal  of  a file hierarchy, pax shall archive the file
213              referenced by the link, using the name of the link. The  default
214              behavior shall be to archive the symbolic link itself.
215
216       -n     Select  the first archive member that matches each pattern oper‐
217              and. No more than one archive member shall be matched  for  each
218              pattern  (although  members  of type directory shall still match
219              the file hierarchy rooted at that file).
220
221       -o options
222              Provide information to the implementation to  modify  the  algo‐
223              rithm  for  extracting  or  writing  files. The value of options
224              shall consist of one or more  comma-separated  keywords  of  the
225              form:
226
227              keyword[[:]=value][,keyword[[:]=value],...]
228
229              Some  keywords  apply only to certain file formats, as indicated
230              with each description. Use of keywords that are inapplicable  to
231              the file format being processed produces undefined results.
232
233              Keywords in the options argument shall be a string that would be
234              a valid portable filename as described in the  Base  Definitions
235              volume of IEEE Std 1003.1-2001, Section 3.276, Portable Filename
236              Character Set.
237
238              Note:  Keywords are not expected to be filenames, merely to fol‐
239                     low  the  same  character  composition  rules as portable
240                     filenames.
241
242              Keywords can be preceded with white space. The value field shall
243              consist  of  zero or more characters; within value, the applica‐
244              tion shall precede any literal comma  with  a  backslash,  which
245              shall  be  ignored,  but preserves the comma as part of value. A
246              comma as the final character, or  a  comma  followed  solely  by
247              white  space  as  the  final  characters,  in  options  shall be
248              ignored. Multiple -o options can be specified; if keywords given
249              to  these  multiple -o options conflict, the keywords and values
250              appearing later in command line sequence shall  take  precedence
251              and the earlier shall be silently ignored. The following keyword
252              values of options shall be supported for  the  file  formats  as
253              indicated:
254
255              delete=pattern
256                     (Applicable  only  to  the  -x  pax format.) When used in
257                     write or copy mode, pax shall omit from  extended  header
258                     records that it produces any keywords matching the string
259                     pattern. When used in read or list mode, pax shall ignore
260                     any  keywords matching the string pattern in the extended
261                     header records. In both cases,  matching  shall  be  per‐
262                     formed  using  the pattern matching notation described in
263                     Patterns Matching a Single Character and Patterns  Match‐
264                     ing Multiple Characters. For example:
265
266                     -o delete=security.*
267
268                     would  suppress  security-related  information.  See  pax
269                     Extended Header for extended header record keyword usage.
270
271                     When multiple -o delete=pattern  options  are  specified,
272                     the patterns shall be additive; all keywords matching the
273                     specified string patterns shall be omitted from  extended
274                     header records that pax produces.
275
276              exthdr.name=string
277                     (Applicable  only  to  the  -x  pax format.) This keyword
278                     allows user control over the name that  is  written  into
279                     the  ustar header blocks for the extended header produced
280                     under the circumstances described in  pax  Header  Block.
281                     The  name shall be the contents of string, after the fol‐
282                     lowing character substitutions have been made:
283
                  ┌─────────────────┬─────────────────────────────────────────────┐
                  │string Includes: │ Replaced By:                                │
                  ├─────────────────┼─────────────────────────────────────────────┤
                  │%d               │ The directory name of the file, equivalent  │
                  │                 │ to the result of the dirname utility on the │
                  │                 │ translated pathname.                        │
                  ├─────────────────┼─────────────────────────────────────────────┤
                  │%f               │ The filename of the file, equivalent to the │
                  │                 │ result of the basename utility on the       │
                  │                 │ translated pathname.                        │
                  ├─────────────────┼─────────────────────────────────────────────┤
                  │%p               │ The process ID of the pax process.          │
                  ├─────────────────┼─────────────────────────────────────────────┤
                  │%%               │ A '%' character.                            │
                  └─────────────────┴─────────────────────────────────────────────┘
299                     Any other '%'  characters  in  string  produce  undefined
300                     results.
301
                     If no -o exthdr.name=string is specified, pax shall use
                     the following default value:
304
305                             %d/PaxHeaders.%p/%f
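
                     The substitutions for exthdr.name can be illustrated
                     with a small Python sketch. The function name
                     expand_exthdr_name is hypothetical. The sketch uses
                     os.path.dirname and os.path.basename as stand-ins for
                     the dirname and basename utilities, which is only an
                     approximation (for example, pathnames with trailing
                     slashes are treated differently), and it raises an
                     error for other '%' sequences even though the text
                     above only calls their result undefined.

```python
import os
import re

def expand_exthdr_name(template, pathname, pid):
    """Expand %d, %f, %p and %% in an exthdr.name template."""
    substitutions = {
        "d": os.path.dirname(pathname) or ".",  # dirname prints "." here
        "f": os.path.basename(pathname),
        "p": str(pid),
        "%": "%",
    }

    def replace(match):
        key = match.group(1)
        if key not in substitutions:
            # Undefined by the description above; this sketch rejects it.
            raise ValueError("unsupported sequence %" + key)
        return substitutions[key]

    return re.sub(r"%(.)", replace, template)
```

                     With the default template, a member named
                     usr/lib/libc.so written by process 1234 would get the
                     header name usr/lib/PaxHeaders.1234/libc.so.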
306
307              globexthdr.name=string
308                     (Applicable only to the -x  pax  format.)  When  used  in
309                     write  or  copy  mode  with  the appropriate options, pax
310                     shall create global extended header  records  with  ustar
311                     header  blocks  that  will be treated as regular files by
312                     previous versions of pax.  This keyword allows user  con‐
313                     trol  over the name that is written into the ustar header
314                     blocks for global extended header records. The name shall
315                     be  the contents of string, after the following character
316                     substitutions have been made:
317
                  ┌─────────────────┬─────────────────────────────────────────────┐
                  │string Includes: │ Replaced By:                                │
                  ├─────────────────┼─────────────────────────────────────────────┤
                  │%n               │ An integer that represents the sequence     │
                  │                 │ number of the global extended header record │
                  │                 │ in the archive, starting at 1.              │
                  ├─────────────────┼─────────────────────────────────────────────┤
                  │%p               │ The process ID of the pax process.          │
                  ├─────────────────┼─────────────────────────────────────────────┤
                  │%%               │ A '%' character.                            │
                  └─────────────────┴─────────────────────────────────────────────┘
329                     Any other '%'  characters  in  string  produce  undefined
330                     results.
331
332                     If  no  -o globexthdr.name=string is specified, pax shall
333                     use the following default value:
334
335                     $TMPDIR/GlobalHead.%p.%n
336
337                     where $TMPDIR represents the value of the TMPDIR environ‐
338                     ment variable. If TMPDIR is not set, pax shall use /tmp.
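
                     A minimal Python sketch of the default
                     globexthdr.name value, including the /tmp fallback,
                     follows. The function name and the environ parameter
                     are hypothetical and exist only to make the example
                     testable.

```python
import os

def default_globexthdr_name(sequence, pid, environ=None):
    """Build $TMPDIR/GlobalHead.%p.%n, using /tmp if TMPDIR is not set."""
    env = os.environ if environ is None else environ
    tmpdir = env.get("TMPDIR", "/tmp")
    return f"{tmpdir}/GlobalHead.{pid}.{sequence}"
```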
339
340              invalid=action
341                     (Applicable  only  to  the  -x  pax format.) This keyword
342                     allows user  control  over  the  action  pax  takes  upon
343                     encountering values in an extended header record that, in
344                     read or copy mode, are invalid in the destination hierar‐
345                     chy  or,  in  list mode, cannot be written in the codeset
346                     and current locale of the implementation.  The  following
347                     are invalid values that shall be recognized by pax:
348
349                     +      In read or copy mode, a filename or link name that
350                            contains character encodings invalid in the desti‐
351                            nation  hierarchy. (For example, the name may con‐
352                            tain embedded NULs.)
353
354                     +      In read or copy mode, a filename or link name that
355                            is longer than the maximum allowed in the destina‐
356                            tion hierarchy (for either a pathname component or
357                            the entire pathname).
358
359                     +      In  list  mode,  any character string value (file‐
360                            name, link name, user name, and so on) that cannot
361                            be  written  in  the codeset and current locale of
362                            the implementation.
363
364                     The following mutually-exclusive  values  of  the  action
365                     argument are supported:
366
367                     bypass In  read  or copy mode, pax shall bypass the file,
368                            causing no change to the destination hierarchy. In
369                            list  mode,  pax  shall  write all requested valid
370                            values for the file, but its  method  for  writing
371                            invalid values is unspecified.
372
373                     rename In  read  or copy mode, pax shall act as if the -i
374                            option were in effect for each file  with  invalid
375                            filename or link name values, allowing the user to
376                            provide a replacement name interactively. In  list
377                            mode,  pax  shall behave identically to the bypass
378                            action.
379
380                     UTF-8  When used in read, copy, or list mode and a  file‐
381                            name, link name, owner name, or any other field in
382                            an extended header  record  cannot  be  translated
383                            from  the  pax UTF-8 codeset format to the codeset
384                            and current  locale  of  the  implementation,  pax
385                            shall use the actual UTF-8 encoding for the name.
386
387                     write  In  read  or  copy mode, pax shall write the file,
388                            translating the name, regardless of  whether  this
389                            may  overwrite an existing file with a valid name.
390                            In list mode, pax shall behave identically to  the
391                            bypass action.
392
                     If no -o invalid=action option is specified, pax shall
                     act as if -o invalid=bypass were specified. Any
                     overwriting of existing files that may be allowed by the
                     -o invalid= actions shall be subject to permission (-p)
                     and modification time (-u) restrictions, and shall be
                     suppressed if the -k option is also specified.
399
400              linkdata
401                     (Applicable only to the -x pax format.)  In  write  mode,
402                     pax  shall  write  the  contents of a file to the archive
403                     even when that file is merely a hard link to a file whose
404                     contents have already been written to the archive.
405
406              listopt=format
                     This keyword specifies the output format of the table of
                     contents produced when the -v option is specified in list
                     mode. See List Mode Format Specifications. To avoid
                     ambiguity, the listopt=format shall be the only or final
                     keyword=value pair in a -o option-argument; all
                     characters in the remainder of the option-argument shall
                     be considered part of the format string. When multiple
                     -o listopt=format options are specified, the format
                     strings shall be considered a single, concatenated
                     string, evaluated in command line order.
417
418              times  (Applicable only to the -x  pax  format.)  When  used  in
419                     write  or  copy  mode,  pax shall include atime and mtime
420                     extended header records for each file. See  pax  Extended
421                     Header File Times.
422
423              In  addition  to  these keywords, if the -x pax format is speci‐
424              fied, any of the keywords and values  defined  in  pax  Extended
425              Header,  including  implementation extensions, can be used in -o
426              option-arguments, in either of two modes:
427
428              keyword=value
429                     When used in write  or  copy  mode,  these  keyword/value
430                     pairs  shall  be included at the beginning of the archive
431                     as typeflag g global extended header records.  When  used
432                     in read or list mode, these keyword/value pairs shall act
433                     as if they had been at the beginning of  the  archive  as
434                     typeflag g global extended header records.
435
436              keyword:=value
437                     When  used  in  write  or  copy mode, these keyword/value
438                     pairs shall be included as records at the beginning of  a
439                     typeflag  x extended header for each file. (This shall be
440                     equivalent to the equal-sign form except that it  creates
441                     no  typeflag g global extended header records.) When used
442                     in read or list mode, these keyword/value pairs shall act
443                     as  if  they  were included as records at the end of each
444                     extended header; thus, they shall override any global  or
445                     file-specific extended header record keywords of the same
446                     names. For example, in the command:
447
448                     pax -r -o "gname:=mygroup," <archive
449
450                     the group name will be forced to  a  new  value  for  all
451                     files read from the archive.
452
453              The precedence of -o keywords over various fields in the archive
454              is described in pax Extended Header Keyword Precedence.
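
              The syntax of the -o option-argument (comma-separated
              keywords, an optional '=' or ':=' value, backslash-escaped
              commas, optional leading white space, and an ignored trailing
              comma) can be sketched in Python. The function name
              parse_o_options is hypothetical. The sketch is an
              interpretation of the rules above, not the spax parser: it
              does not implement the special listopt= rule (where the rest
              of the option-argument belongs to the format), and it
              silently drops empty fields anywhere in the list.

```python
def parse_o_options(option_argument):
    """Split one -o option-argument into (keyword, operator, value) triples.

    operator is "=", ":=", or None for a bare keyword such as "times".
    """
    fields, current, i = [], [], 0
    while i < len(option_argument):
        ch = option_argument[i]
        if ch == "\\" and option_argument[i + 1:i + 2] == ",":
            current.append(",")              # escaped comma stays in the value
            i += 2
        elif ch == ",":
            fields.append("".join(current))  # unescaped comma ends a field
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    fields.append("".join(current))

    triples = []
    for field in fields:
        field = field.lstrip()               # leading white space is allowed
        if not field:
            continue                         # e.g. a trailing comma
        keyword, sep, value = field.partition("=")
        if not sep:
            triples.append((field, None, None))
        elif keyword.endswith(":"):
            triples.append((keyword[:-1], ":=", value))
        else:
            triples.append((keyword, "=", value))
    return triples
```

              For the example above, parse_o_options("gname:=mygroup,")
              yields a single triple for gname with the ":=" operator.
              Loading the triples from all -o options into a dictionary in
              command line order would give the "later takes precedence"
              behavior for conflicting keywords.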
455
456       -p string
              Specify one or more file characteristic options (privileges).
              The string option-argument shall be a string specifying file
              characteristics to be retained or discarded on extraction. The
              string shall consist of the specification characters a, e, m,
              o, and p. Other implementation-defined characters can be
              included. Multiple characteristics can be concatenated within
              the same string and multiple -p options can be specified. The
              meanings of the specification characters are as follows:
465
466              a      Do not preserve file access times.
467
468              e      Preserve  the  user ID, group ID, file mode bits (see the
469                     Base Definitions volume of IEEE Std 1003.1-2001,  Section
470                     3.168,  File  Mode Bits), access time, modification time,
471                     and any other  implementation-defined  file  characteris‐
472                     tics.
473
              m      Do not preserve file modification times.
477
478              o      Preserve the user ID and group ID.
479
480              p      Preserve the file mode bits. Other implementation-defined
481                     file mode attributes may be preserved.
482
483              In the preceding list, "preserve" indicates  that  an  attribute
484              stored in the archive shall be given to the extracted file, sub‐
485              ject to the permissions of the invoking process. The access  and
486              modification  times of the file shall be preserved unless other‐
487              wise specified with the -p option or not stored in the  archive.
488              All  attributes  that  are  not preserved shall be determined as
489              part of the normal file creation action (see File  Read,  Write,
490              and Creation).
491
492              If neither the e nor the o specification character is specified,
493              or the user ID and group ID are not preserved  for  any  reason,
494              pax shall not set the S_ISUID and S_ISGID bits of the file mode.
495
496              If  the preservation of any of these items fails for any reason,
497              pax shall write a diagnostic message to standard error.  Failure
498              to  preserve these items shall affect the final exit status, but
499              shall not cause the extracted file to be deleted.
500
501              If file characteristic letters in any of the string option-argu‐
502              ments are duplicated or conflict with each other, the ones given
503              last shall take precedence. For example, if -p eme is specified,
504              file modification times are preserved.
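
              The "last one wins" rule for -p can be modelled with a short
              Python sketch. The function name and the dictionary keys are
              hypothetical, and the starting state (times preserved, owner
              and mode not preserved) is an interpretation of the text
              above rather than a description of the spax source.

```python
def effective_preservation(*p_strings):
    """Fold all -p strings, in command line order, into a final state."""
    state = {"atime": True, "mtime": True, "owner": False, "mode": False}
    for ch in "".join(p_strings):
        if ch == "a":
            state["atime"] = False
        elif ch == "m":
            state["mtime"] = False
        elif ch == "o":
            state["owner"] = True
        elif ch == "p":
            state["mode"] = True
        elif ch == "e":
            state = dict.fromkeys(state, True)   # preserve everything
        # Other characters are implementation-defined; ignored here.
    return state
```

              With -p eme, the final e re-enables what m switched off, so
              modification times are preserved, matching the example above.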
505
506       -s replstr
507              Modify file or archive member names named by pattern or file op‐
508              erands according to the substitution expression  replstr,  using
509              the  syntax  of  the  ed  utility. The concepts of "address" and
510              "line" are meaningless in the context of the  pax  utility,  and
511              shall not be supplied. The format shall be:
512
513              -s /old/new/[gp]
514
515              where  as  in  ed, old is a basic regular expression and new can
516              contain an ampersand, '\n' (where n is a digit)  backreferences,
517              or  subexpression matching. The old string shall also be permit‐
518              ted to contain <newline>s.
519
              Any non-null character can be used as a delimiter ('/' shown
              here). Multiple -s expressions can be specified; the expressions
522              shall be applied in the order specified,  terminating  with  the
523              first  successful  substitution. The optional trailing 'g' is as
524              defined in the ed utility. The optional trailing 'p' shall cause
525              successful  substitutions  to be written to standard error. File
526              or archive member names that  substitute  to  the  empty  string
527              shall be ignored when reading and writing archives.
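
              The order in which multiple -s expressions are applied can be
              sketched in Python. This is a rough model under several
              assumptions: the function name is hypothetical, old and new
              are interpreted with Python re syntax rather than the ed
              syntax described above (so '&' is not translated), escaped
              delimiters inside old or new are not handled, and the text
              printed for the 'p' flag is made up for illustration.

```python
import re
import sys

def apply_substitutions(name, replstrs):
    """Apply -s expressions in order; stop at the first one that matches.

    Returns the new name, or None if the name became the empty string
    (such members are ignored).
    """
    for replstr in replstrs:
        delimiter = replstr[0]
        parts = replstr[1:].split(delimiter)
        if len(parts) != 3:
            raise ValueError(f"unsupported replstr: {replstr!r}")
        old, new, flags = parts
        count = 0 if "g" in flags else 1     # for re.subn, 0 means "all"
        new_name, hits = re.subn(old, new, name, count=count)
        if hits:
            if "p" in flags:
                print(f"{name} >> {new_name}", file=sys.stderr)
            return new_name or None
    return name                              # no expression matched
```

              For example, given the two expressions /^usr/opt/ and
              /lib/LIB/g, the name usr/lib/lib.a becomes opt/lib/lib.a,
              because the first expression succeeds and the second is never
              tried.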
528
529       -t     When reading files from the file system, and if the user has the
530              permissions required by utime() to do so, set the access time of
531              each  file read to the access time that it had before being read
532              by pax.
533
534       -u     Ignore files that are older (having a less recent file modifica‐
535              tion  time)  than a pre-existing file or archive member with the
536              same name. In read mode, an archive member with the same name as
537              a file in the file system shall be extracted if the archive mem‐
538              ber is newer than the file. In write mode, an archive file  mem‐
539              ber  with  the  same  name as a file in the file system shall be
540              superseded if the file is newer than the archive member.  If  -a
541              is  also specified, this is accomplished by appending to the ar‐
542              chive; otherwise, it is unspecified whether this is accomplished
543              by  actual replacement in the archive or by appending to the ar‐
544              chive. In copy mode, the file in the destination hierarchy shall
545              be  replaced by the file in the source hierarchy or by a link to
546              the file in the source hierarchy if the file in the source hier‐
547              archy is newer.
548
549       -v     In  list mode, produce a verbose table of contents (see the STD‐
550              OUT section). Otherwise, write archive member pathnames to stan‐
551              dard error (see the STDERR section).
552
553       -x format
554              Specify the output archive format. The pax utility shall support
555              the following formats:
556
557              cpio   The cpio interchange format; see the EXTENDED DESCRIPTION
558                     section.  The default blocksize for this format for char‐
559                     acter special archive files shall  be  5120.  Implementa‐
560                     tions  shall  support  all  blocksize values less than or
561                     equal to 32256 that are multiples of 512.
562
563              pax    The pax interchange format; see the EXTENDED  DESCRIPTION
564                     section.  The default blocksize for this format for char‐
565                     acter special archive files shall be  5120.   Implementa‐
566                     tions  shall  support  all  blocksize values less than or
567                     equal to 32256 that are multiples of 512.
568
569              ustar  The tar interchange format; see the EXTENDED  DESCRIPTION
570                     section.  The default blocksize for this format for char‐
571                     acter special archive files shall be 10240.   Implementa‐
572                     tions  shall  support  all  blocksize values less than or
573                     equal to 32256 that are multiples of 512.
574
575              Implementation-defined formats shall  specify  a  default  block
576              size  as  well  as any other block sizes supported for character
577              special archive files.
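
       The blocksize rule shared by the three formats above can be checked
       mechanically. The sketch below is illustrative; the names
       DEFAULT_BLOCKSIZE and is_portable_blocksize are hypothetical. It
       reports only whether support for a value is guaranteed by the text
       above, since an implementation may accept other values as well.

```python
# Default blocksizes for character special archive files, per format.
DEFAULT_BLOCKSIZE = {"cpio": 5120, "pax": 5120, "ustar": 10240}

MAX_PORTABLE_BLOCKSIZE = 32256  # 63 * 512


def is_portable_blocksize(blocksize):
    """True if every conforming implementation must support this value."""
    return 0 < blocksize <= MAX_PORTABLE_BLOCKSIZE and blocksize % 512 == 0
```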
578
579              Any attempt to append to an archive file in a  format  different
580              from the existing archive format shall cause pax to exit immedi‐
581              ately with a non-zero exit status.
582
583              In copy mode, if no -x format is specified, pax shall behave  as
584              if -x pax were specified.
585
586       -X     When  traversing the file hierarchy specified by a pathname, pax
587              shall not descend into directories that have a different  device
588              ID (st_dev; see the System Interfaces volume of IEEE Std
589              1003.1-2001, stat()).
590
591       Specifying more than one of the mutually-exclusive options  -H  and  -L
592       shall  not  be  considered an error and the last option specified shall
593       determine the behavior of the utility.
594
595       The options that operate on the names of files or archive members  (-c,
596       -i,  -n,  -s,  -u, and -v) shall interact as follows. In read mode, the
597       archive members shall be selected based on the  user-specified  pattern
598       operands as modified by the -c, -n, and -u options. Then, any -s and -i
599       options shall modify, in that order, the names of the  selected  files.
600       The -v option shall write names resulting from these modifications.
601
602       In  write mode, the files shall be selected based on the user-specified
603       pathnames as modified by the -n and -u options.  Then, any  -s  and  -i
604       options shall modify, in that order, the names of these selected files.
605       The -v option shall write names resulting from these modifications.
606
607       If both the -u and -n options are specified, pax shall not  consider  a
608       file selected unless it is newer than the file to which it is compared.
609
610
611   List Mode Format Specifications
612       The  manual  page  for  spax is not yet ready.  The following text is a
613       quotation from the POSIX.1-2001 standard.
614
615       In list mode with the -o listopt=format  option,  the  format  argument
616       shall be applied for each selected file. The pax utility shall append a
617       <newline> to the listopt output for  each  selected  file.  The  format
618       argument shall be used as the format string described in the Base Defi‐
619       nitions volume of IEEE Std 1003.1-2001, Chapter 5,  File  Format  Nota‐
620       tion,  with  the  exceptions  1.  through  5.   defined in the EXTENDED
621       DESCRIPTION section of printf(3), plus the following exceptions:
622
623       6.     The sequence (keyword) can  occur  before  a  format  conversion
624              specifier.  The  conversion  argument is defined by the value of
625              keyword.  The implementation shall support  the  following  key‐
626              words:
627
628              ·      Any  of  the Field Name entries in ustar Header Block and
629                     Octet-Oriented cpio Archive Entry. The implementation may
630                     support the cpio keywords without the leading c_ in addi‐
631                     tion to the form  required  by  Values  for  cpio  c_mode
632                     Field.
633
634              ·      Any  keyword  defined  for  the  extended  header  in pax
635                     Extended Header.
636
637              ·      Any keyword provided as an implementation-defined  exten‐
638                     sion  within  the extended header defined in pax Extended
639                     Header.
640
641              For example, the sequence "%(charset)s" is the string  value  of
642              the name of the character set in the extended header.
643
644              The result of the keyword conversion argument shall be the value
645              from the applicable header field or extended header, without any
646              trailing NULs.
647
648              All  keyword values used as conversion arguments shall be trans‐
649              lated from the UTF-8 encoding to the character  set  appropriate
650              for the local file system, user database, and so on, as applica‐
651              ble.
652
653       7.     An additional conversion specifier character, T, shall  be  used
654              to  specify  time  formats. The T conversion specifier character
655              can be preceded by the sequence (keyword=subformat), where  sub‐
656              format is a date format as defined by date operands. The default
657              keyword shall be mtime and the default subformat shall be:
658
659                 %b %e %H:%M %Y
660
661       8.     An additional conversion specifier character, M, shall  be  used
662              to  specify  the  file  mode string as defined in ls(1) Standard
663              Output. If (keyword) is omitted, the mode keyword shall be used.
664              For  example,  %.1M writes the single character corresponding to
665              the <entry type> field of the ls -l command.
666
667       9.     An additional conversion specifier character, D, shall  be  used
668              to specify the device for block or special files, if applicable,
669              in an implementation-defined  format.  If  not  applicable,  and
670              (keyword) is specified, then this conversion shall be equivalent
671              to %(keyword)u.  If not applicable, and  (keyword)  is  omitted,
672              then this conversion shall be equivalent to <space>.
673
674       10.    An  additional  conversion specifier character, F, shall be used
675              to specify a pathname. The F conversion character  can  be  pre‐
676              ceded by a sequence of comma-separated keywords:
677
678                 (keyword[,keyword] ... )
679              The  values for all the keywords that are non-null shall be con‐
680              catenated together, each separated by a '/'. The  default  shall
681              be (path) if the keyword path is defined; otherwise, the default
682              shall be (prefix, name).
683
684       11.    An additional conversion specifier character, L, shall  be  used
685              to  specify  a symbolic link expansion. If the current file is a
686              symbolic link, then %L shall expand to:
687
688                 "%s -> %s", <value of keyword>, <contents of link>
689
690              Otherwise, the %L conversion specification shall be the equivalent
691              of %F.
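
       The %F and %L rules in items 10 and 11 can be sketched as follows.
       This is a minimal illustration under stated assumptions: the header
       is modelled as a plain dictionary called fields, the function names
       format_F and format_L are hypothetical, and the contents of the link
       are taken from linkpath if present, else linkname.

```python
def format_F(fields, keywords=None):
    """Join the non-null values of the given keywords with '/'."""
    if keywords is None:
        # Default: (path) if path is defined, otherwise (prefix,name).
        keywords = ["path"] if fields.get("path") else ["prefix", "name"]
    return "/".join(fields[k] for k in keywords if fields.get(k))


def format_L(fields, keywords=None):
    """Like %F, but show 'name -> target' for symbolic links (typeflag 2)."""
    name = format_F(fields, keywords)
    if fields.get("typeflag") == "2":
        target = fields.get("linkpath") or fields.get("linkname", "")
        return "%s -> %s" % (name, target)
    return name
```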
692
693

OPERANDS

695       The following operands shall be supported:
696
697       directory
698              The destination directory pathname for copy mode.
699
700       file   A pathname of a file to be copied or archived.
701
702       pattern
703              A  pattern matching one or more pathnames of archive members.  A
704              pattern must be given in the  name-generating  notation  of  the
705              pattern matching notation in Pattern Matching Notation, includ‐
706              ing the filename expansion rules in Patterns Used  for  Filename
707              Expansion. The default, if no pattern is specified, is to select
708              all members in the archive.
709
710

STDIN

712       In write mode, the standard input shall be used only if no  file  oper‐
713       ands  are specified. It shall be a text file containing a list of path‐
714       names, one per line, without leading or trailing <blank>s.
715
716       In list and read modes, if -f is  not  specified,  the  standard  input
717       shall be an archive file.
718
719       Otherwise, the standard input shall not be used.
720
721

INPUT FILES

723       The  input file named by the archive option-argument, or standard input
724       when the archive is read from there, shall be a file formatted  accord‐
725       ing to one of the specifications in the EXTENDED DESCRIPTION section or
726       some other implementation-defined format.
727
728       The file /dev/tty shall be used to write prompts and read responses.
729
730

ENVIRONMENT VARIABLES

732       The following environment variables shall affect the execution of pax:
733
734       LANG   Provide a default value for the  internationalization  variables
735              that are unset or null. (See the Base Definitions volume of IEEE
736              Std 1003.1-2001, Section 8.2, Internationalization Variables for
737              the  precedence of internationalization variables used to deter‐
738              mine the values of locale categories.)
739
740       LC_ALL If set to a non-empty string value, override the values  of  all
741              the other internationalization variables.
742
743       LC_COLLATE
744              Determine  the  locale  for  the behavior of ranges, equivalence
745              classes, and multi-character collating elements used in the pat‐
746              tern  matching  expressions  for  the pattern operand, the basic
747              regular expression for the -s option, and the  extended  regular
748              expression defined for the yesexpr locale keyword in the LC_MES‐
749              SAGES category.
750
751       LC_CTYPE
752              Determine the locale for  the  interpretation  of  sequences  of
753              bytes  of  text  data as characters (for example, single-byte as
754              opposed to multi-byte characters in arguments and input  files),
755              the  behavior  of character classes used in the extended regular
756              expression defined for the yesexpr locale keyword in the LC_MES‐
757              SAGES category, and pattern matching.
758
759       LC_MESSAGES
760              Determine the locale used to process affirmative responses,  and
761              the  locale  used  to  affect  the  format and contents of diag‐
762              nostic messages written to standard error.
763
764       LC_TIME
765              Determine  the format and contents of date and time strings when
766              the -v option is specified.
767
768       NLSPATH
769              [XSI] Determine the location of message catalogs for the
770              processing of LC_MESSAGES.
771
772       TMPDIR Determine  the pathname that provides part of the default global
773              extended header record file, as described for the -o globexthdr=
774              keyword in the OPTIONS section.
775
776       TZ     Determine  the  timezone used to calculate date and time strings
777              when the -v option is specified. If TZ  is  unset  or  null,  an
778              unspecified default timezone shall be used.
779
780

ASYNCHRONOUS EVENTS

782       Default.
783
784

STDOUT

786       In write mode, if -f is not specified, the standard output shall be the
787       archive formatted  according  to  one  of  the  specifications  in  the
788       EXTENDED DESCRIPTION section, or some other implementation-defined for‐
789       mat (see -x format).
790
791       In list mode, when the -o listopt=format option has been specified, the
792       selected  archive members shall be written to standard output using the
793       format described under List Mode Format Specifications.  In  list  mode
794       without the -o listopt=format option, the table of contents of the
795       selected archive members shall be written to standard output using  the
796       following format:
797
798            "%s\n", <pathname>
799
800       If  the  -v  option is specified in list mode, the table of contents of
801       the selected archive members shall be written to standard output  using
802       the following formats.
803
804       For  pathnames  representing  hard links to previous members of the ar‐
805       chive:
806
807            "%s == %s\n", <ls -l listing>, <linkname>
808
809       For all other pathnames:
810
811            "%s\n", <ls -l listing>
812
813       where <ls -l listing> shall be the format specified by the ls(1)  util‐
814       ity  with  the  -l option. When writing pathnames in this format, it is
815       unspecified what is written for fields for which the underlying archive
816       format does not have the correct information, although the correct num‐
817       ber of <blank>-separated fields shall be written.
818
819       In list mode, standard output shall not be buffered more than a line at
820       a time.
821
822

STDERR

824       If  -v  is specified in read, write, or copy modes, pax shall write the
825       pathnames it processes to the standard error output using the following
826       format:
827
828            "%s\n", <pathname>
829
830       These  pathnames shall be written as soon as processing is begun on the
831       file or archive member, and shall be flushed  to  standard  error.  The
832       trailing  <newline>,  which  shall not be buffered, is written when the
833       file has been read or written.
834
835       If the -s option is specified, and the replacement string has a  trail‐
836       ing  'p',  substitutions shall be written to standard error in the fol‐
837       lowing format:
838
839            "%s >> %s\n", <original pathname>, <new pathname>
840
841       In all operating modes of pax, optional messages of unspecified  format
842       concerning  the  input  archive format and volume number, the number of
843       files, blocks, volumes, and media parts as  well  as  other  diagnostic
844       messages may be written to standard error.
845
846       In  all  formats,  for  both  standard output and standard error, it is
847       unspecified how non-printable characters in pathnames or link names are
848       written.
849
850       When pax is in read mode or list mode, using the -x pax archive format,
851       and a filename, link name,  owner  name,  or  any  other  field  in  an
852       extended  header record cannot be translated from the pax UTF-8 codeset
853       format to the codeset and current locale  of  the  implementation,  pax
854       shall  write  a diagnostic message to standard error, shall process the
855       file as described for the -o invalid= option, and  then  shall  process
856       the next file in the archive.
857
858

OUTPUT FILES

860       In  read mode, the extracted output files shall be of the archived file
861       type. In copy mode, the copied output files shall be the  type  of  the
862       file  being  copied.  In either mode, existing files in the destination
863       hierarchy shall be overwritten only when all permission (-p), modifica‐
864       tion time (-u), and invalid-value (-o invalid=) tests allow it.
865
866       In write mode, the output file named by the -f option-argument shall be
867       a file formatted according to one of the specifications in the EXTENDED
868       DESCRIPTION section, or some other implementation-defined format.
869
870

EXTENDED DESCRIPTION

872   pax Interchange Format
873       A  pax archive tape or file produced in the -x pax format shall contain
874       a series of blocks. The physical layout of the archive shall be identi‐
875       cal  to  the  ustar  format described in ustar Interchange Format. Each
876       file archived shall be represented by the following sequence:
877
878              ·      An optional header block with  extended  header  records.
879                     This  header block is of the form described in pax Header
880                     Block, with a typeflag value of x  or  g.   The  extended
881                     header  records,  described in pax Extended Header, shall
882                     be included as the data for this header block.
883
884              ·      A header block that describes the file. Any fields in the
885                     preceding  optional  extended  header  shall override the
886                     associated fields in this header block for this file.
887
888              ·      Zero or more blocks that  contain  the  contents  of  the
889                     file.
890
891       At  the  end  of  the  archive  file there shall be two 512-byte blocks
892       filled with binary zeros, interpreted as an end-of-archive indicator.
893
894       A schematic of an example archive with global extended  header  records
895       and  two  actual  files  is shown in pax Format Archive Example. In the
896       example, the second file in the archive has no extended header  preced‐
897       ing it, presumably because it has no need for extended attributes.
898
899                         Figure: pax Format Archive Example
900
901    ┌──────────────────────────────┬─────────────────────────────────────────────┐
902    │ustar Header [typeflag = 'g'] │                                             │
903    ├──────────────────────────────┤           Global Extended header            │
904    │Global Extended Header Data   │                                             │
905    ├──────────────────────────────┼─────────────────────────────────────────────┤
906    │ustar Header [typeflag = 'x'] │                                             │
907    ├──────────────────────────────┤                                             │
908    │Extended Header Data          │                                             │
909    ├──────────────────────────────┤  File 1: Extended Header data is included   │
910    │ustar Header [typeflag = '0'] │                                             │
911    ├──────────────────────────────┤                                             │
912    │Data for File 1               │                                             │
913    ├──────────────────────────────┼─────────────────────────────────────────────┤
914    │ustar Header [typeflag = '0'] │                                             │
915    ├──────────────────────────────┤ File 2: No Extended Header data is included │
916    │Data for File 2               │                                             │
917    ├──────────────────────────────┼─────────────────────────────────────────────┤
918    │Block of binary Zeroes        │                                             │
919    ├──────────────────────────────┤          End of Archive Indicator           │
920    │Block of binary Zeroes        │                                             │
921    └──────────────────────────────┴─────────────────────────────────────────────┘
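
       The layout in the figure can be observed with any tool that writes
       the pax interchange format. The sketch below uses Python's standard
       tarfile module rather than spax, which is an assumption made only so
       that the example is self-contained. The helper names
       build_pax_archive and typeflags are hypothetical. A non-ASCII
       member name is used because it cannot be stored in the ustar name
       field, so the writer has to emit a typeflag x extended header.

```python
import io
import tarfile


def build_pax_archive(name, data):
    """Return the bytes of a one-member archive in pax interchange format."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tar:
        info = tarfile.TarInfo(name=name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def typeflags(archive):
    """Walk the 512-byte header blocks and list their typeflag characters."""
    flags = []
    offset = 0
    while offset + 512 <= len(archive):
        header = archive[offset:offset + 512]
        if header == bytes(512):
            break  # first block of the end-of-archive indicator
        flags.append(header[156:157].decode("ascii"))
        # The size field is an octal number at offset 124, 12 bytes wide.
        size = int(header[124:136].rstrip(b"\0 ").decode("ascii") or "0", 8)
        # Skip the header block plus the data, rounded up to whole blocks.
        offset += 512 + ((size + 511) // 512) * 512
    return flags


archive = build_pax_archive("na\u00efve.txt", b"hello")
```

       Here typeflags(archive) lists an x header followed by a 0 header,
       which corresponds to the "File 1" rows of the figure, and the archive
       ends with at least two 512-byte blocks of binary zeros.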
922
923   pax Header Block
924       The  pax  header  block  shall  be  identical to the ustar header block
925       described in ustar Interchange Format, except that two additional type‐
926       flag values are defined:
927
928       x      Represents extended header records for the following file in the
929              archive (which shall have its own ustar header block).  The for‐
930              mat  of  these  extended header records shall be as described in
931              pax Extended Header.
932
933       g      Represents global extended  header  records  for  the  following
934              files  in  the  archive.  The  format  of  these extended header
935              records shall be as described  in  pax  Extended  Header.   Each
936              value  shall  affect  all  subsequent files that do not override
937              that value in their own extended header record and until another
938              global  extended  header record is reached that provides another
939              value for the same field. The typeflag g global  headers  should
940              not  be  used  with  interchange media that could suffer partial
941              data loss in transporting the archive.
942
943       For both of these types, the size  field  shall  be  the  size  of  the
944       extended header records in octets. The other fields in the header block
945       are not meaningful to this version of the  pax  utility.   However,  if
946       this   archive  is  read  by  a  pax  utility  conforming  to  the  ISO
947       POSIX-2:1993 standard, the header block fields are  used  to  create  a
948       regular  file that contains the extended header records as data. There‐
949       fore, header block field values should be selected to  provide  reason‐
950       able file access to this regular file.
951
952       A  further  difference  from the ustar header block is that data blocks
953       for files of typeflag 1 (the digit one) (hard link)  may  be  included,
954       which means that the size field may be greater than zero. Archives cre‐
955       ated by pax -o linkdata shall include these data blocks with  the  hard
956       links.
957
958
959   pax Extended Header
960       A  pax  extended  header contains values that are inappropriate for the
961       ustar header block  because  of  limitations  in  that  format:  fields
962       requiring a character encoding other than that described in the ISO/IEC
963       646:1991 standard, fields representing file attributes not described in
964       the  ustar  header,  and  fields  whose format or length do not fit the
965       requirements of the ustar header. The values in an extended header  add
966       attributes  to the following file (or files; see the description of the
967       typeflag g header block) or override values  in  the  following  header
968       block(s), as indicated in the following list of keywords.
969
970       An  extended  header  shall  consist  of one or more records, each con‐
971       structed as follows:
972
973            "%d %s=%s\n", <length>, <keyword>, <value>
974
975       The extended header records shall be encoded according to  the  ISO/IEC
976       10646-1:2000  standard  (UTF-8).  The  <length>  field, <blank>, equals
977       sign, and <newline> shown shall be limited to  the  portable  character
978       set,  as  encoded in UTF-8. The <keyword> and <value> fields can be any
979       UTF-8 characters. The <length> field shall be the decimal length of the
980       extended header record in octets, including the trailing <newline>.
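
       Because the <length> field counts its own digits, a writer cannot
       simply prepend the length of the rest of the record. The following
       sketch shows one way to compute it, and a matching reader. The
       function names pax_record and parse_pax_records are hypothetical;
       the sketch assumes that values contain no embedded record boundaries
       beyond what the length field describes.

```python
def pax_record(keyword, value):
    """Build one '<length> <keyword>=<value>\\n' record as UTF-8 bytes."""
    body = (" %s=%s\n" % (keyword, value)).encode("utf-8")
    length = len(body)
    while True:
        # The total is the body plus however many digits the length needs.
        total = len(body) + len(str(length))
        if total == length:
            break
        length = total
    return str(length).encode("ascii") + body


def parse_pax_records(data):
    """Decode a sequence of extended header records into a dictionary."""
    records = {}
    pos = 0
    while pos < len(data):
        space = data.index(b" ", pos)
        length = int(data[pos:space])
        record = data[space + 1:pos + length]  # 'keyword=value\n'
        keyword, _, value = record[:-1].partition(b"=")
        records[keyword.decode("utf-8")] = value.decode("utf-8")
        pos += length
    return records
```

       For instance, the record for path=foo is 12 octets long: two digits,
       a blank, the eight octets of "path=foo", and the trailing <newline>.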
981
982       The <keyword> field shall be one of the entries from the following list
983       or a keyword provided as an implementation  extension.   Keywords  con‐
984       sisting entirely of lowercase letters, digits, and periods are reserved
985       for future standardization. A keyword shall not include an equals sign.
986       (In  the  following list, the notation "file(s)" or "block(s)" is used
987       to acknowledge that a keyword affects the following single file after a
988       typeflag  x extended header, but possibly multiple files after typeflag
989       g.  Any requirements in the list for pax to include a  record  when  in
990       write  or copy mode shall apply only when such a record has not already
991       been provided through the use of the -o option. When used in copy mode,
992       pax  shall  behave  as  if  an archive had been created with applicable
993       extended header records and then extracted.)
994
995       atime  The file access time for the following  file(s),  equivalent  to
996              the  value  of  the  st_atime member of the stat structure for a
997              file, as described by the  stat(2)  function.  The  access  time
998              shall  be  restored if the process has the appropriate privilege
999              required to do so.  The  format  of  the  <value>  shall  be  as
1000              described in pax Extended Header File Times.
1001
1002       charset
1003              The  name  of  the  character set used to encode the data in the
1004              following file(s).  The  entries  in  the  following  table  are
1005              defined  to  refer  to  known standards; additional names may be
1006              agreed on between the originator and recipient.
1007
1008              ┌────────────────────────┬───────────────────────────────┐
1009              │<value>                 │ Formal Standard               │
1010              ├────────────────────────┼───────────────────────────────┤
1011              │ISO-IR 646 1990         │ ISO/IEC 646:1990              │
1012              │ISO-IR 8859 1 1998      │ ISO/IEC 8859-1:1998           │
1013              │ISO-IR 8859 2 1999      │ ISO/IEC 8859-2:1999           │
1014              │ISO-IR 8859 3 1999      │ ISO/IEC 8859-3:1999           │
1015              │ISO-IR 8859 4 1998      │ ISO/IEC 8859-4:1998           │
1016              │ISO-IR 8859 5 1999      │ ISO/IEC 8859-5:1999           │
1017              │ISO-IR 8859 6 1999      │ ISO/IEC 8859-6:1999           │
1018              │ISO-IR 8859 7 1987      │ ISO/IEC 8859-7:1987           │
1019              │ISO-IR 8859 8 1999      │ ISO/IEC 8859-8:1999           │
1020              │ISO-IR 8859 9 1999      │ ISO/IEC 8859-9:1999           │
1021              │ISO-IR 8859 10 1998     │ ISO/IEC 8859-10:1998          │
1022              │ISO-IR 8859 13 1998     │ ISO/IEC 8859-13:1998          │
1023              │ISO-IR 8859 14 1998     │ ISO/IEC 8859-14:1998          │
1024              │ISO-IR 8859 15 1999     │ ISO/IEC 8859-15:1999          │
1025              │ISO-IR 10646 2000       │ ISO/IEC 10646:2000            │
1026              │ISO-IR 10646 2000 UTF-8 │ ISO/IEC 10646, UTF-8 encoding │
1027              │BINARY                  │ None                          │
1028              └────────────────────────┴───────────────────────────────┘
1029              The encoding is included in an extended header for information
1030              only; when pax is used as described in IEEE Std 1003.1-2001, it
1031              shall not translate the file data into any other encoding. The
1032              BINARY entry indicates unencoded binary data.
1033
1034              When used in write or copy mode, it is implementation-defined
1035              whether pax includes a charset extended header record for a file.
1036
1037       comment
1038              A series of characters used as a comment. All characters in  the
1039              <value> field shall be ignored by pax.
1040
1041       gid    The  group  ID  of  the group that owns the file, expressed as a
1042              decimal number using digits from the ISO/IEC 646:1991  standard.
1043              This record shall override the gid field in the following header
1044              block(s). When used in write or copy mode, pax shall  include  a
1045              gid  extended  header  record  for  each  file whose group ID is
1046              greater than 2097151 (octal 7777777).
1047
1048       gname  The group of the file(s), formatted as a group name in the group
1049              database. This record shall override the gid and gname fields in
1050              the following header  block(s),  and  any  gid  extended  header
1051              record.  When used in read, copy, or list mode, pax shall trans‐
1052              late the name from the UTF-8 encoding in the  header  record  to
1053              the  character  set  appropriate  for  the group database on the
1054              receiving system. If any  of  the  UTF-8  characters  cannot  be
1055              translated, and if the -o invalid=UTF-8 option is not specified,
1056              the results are implementation-defined. When used  in  write  or
1057              copy  mode, pax shall include a gname extended header record for
1058              each file whose group name cannot be represented  entirely  with
1059              the letters and digits of the portable character set.
1060
1061       linkpath
1062              The  pathname  of  a  link being created to another file, of any
1063              type,  previously  archived.  This  record  shall  override  the
1064              linkname  field in the following ustar header block(s). The fol‐
1065              lowing ustar header block shall determine the type of link  cre‐
1066              ated.  If  typeflag of the following header block is 1, it shall
1067              be a hard link. If typeflag is 2, it shall be  a  symbolic  link
1068              and  the  linkpath  value  shall be the contents of the symbolic
1069              link. The pax utility shall translate the name of the link (con‐
1070              tents of the symbolic link) from the UTF-8 encoding to the char‐
1071              acter set appropriate for the local file system.  When  used  in
1072              write or copy mode, pax shall include a linkpath extended header
1073              record for  each  link  whose  pathname  cannot  be  represented
1074              entirely  with  the  members of the portable character set other
1075              than NUL.
1076
1077       mtime  The file modification time of the following file(s),  equivalent
1078              to  the value of the st_mtime member of the stat structure for a
1079              file, as described in the stat(2) function.  This  record  shall
1080              override  the  mtime field in the following header block(s). The
1081              modification time shall be  restored  if  the  process  has  the
1082              appropriate  privilege  required  to  do  so.  The format of the
1083              <value> shall be as described in pax Extended Header File Times.
1084
1085       path   The pathname of the following file(s). This record  shall  over‐
1086              ride  the  name  and  prefix  fields  in  the  following  header
1087              block(s). The pax utility shall translate the  pathname  of  the
1088              file  from  the  UTF-8 encoding to the character set appropriate
1089              for the local file system.
1090
1091              When used in write or  copy  mode,  pax  shall  include  a  path
1092              extended  header  record  for each file whose pathname cannot be
1093              represented entirely with the members of the portable  character
1094              set other than NUL.
1095
1096       realtime.any
1097              The  keywords  prefixed  by  "realtime." are reserved for future
1098              standardization.
1099
1100       security.any
1101              The keywords prefixed by "security."  are  reserved  for  future
1102              standardization.
1103
1104       size   The  size  of  the file in octets, expressed as a decimal number
1105              using digits from the ISO/IEC  646:1991  standard.  This  record
1106              shall  override the size field in the following header block(s).
1107              When used in write or  copy  mode,  pax  shall  include  a  size
1108              extended  header  record for each file with a size value greater
1109              than 8589934591 (octal 77777777777).
1110
1111       uid    The user ID of the file owner, expressed  as  a  decimal  number
1112              using  digits  from  the  ISO/IEC 646:1991 standard. This record
1113              shall override the uid field in the following  header  block(s).
1114              When  used  in  write  or  copy  mode,  pax  shall include a uid
1115              extended header record for each file whose owner ID  is  greater
1116              than 2097151 (octal 7777777).
1117
1118       uname  The  owner of the following file(s), formatted as a user name in
1119              the user database. This record shall override the uid and  uname
1120              fields  in  the  following header block(s), and any uid extended
1121              header record. When used in read, copy, or list mode, pax  shall
1122              translate  the name from the UTF-8 encoding in the header record
1123              to the character set appropriate for the user  database  on  the
1124              receiving  system.  If  any  of  the  UTF-8 characters cannot be
1125              translated, and if the -o invalid=UTF-8 option is not specified,
1126              the  results  are  implementation-defined. When used in write or
1127              copy mode, pax shall include a uname extended header record  for
1128              each  file  whose  user name cannot be represented entirely with
1129              the letters and digits of the portable character set.
1130
1131       If the <value> field is zero length, it shall delete any  header  block
1132       field,  previously  entered  extended  header value, or global extended
1133       header value of the same name.
1134
1135       If a keyword in an extended header record (or in a -o  option-argument)
1136       overrides  or  deletes a corresponding field in the ustar header block,
1137       pax shall ignore the contents of that header block field.
1138
1139       Unlike the ustar header block fields, NULs shall not delimit  <value>s;
1140       all  characters  within  the <value> field shall be considered data for
1141       the field. None of the length limitations of  the  ustar  header  block
1142       fields  in  ustar  Header  Block  shall  apply  to  the extended header
1143       records.
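
       The size and uid thresholds quoted above follow from the widths of the
       corresponding ustar fields. The following is a minimal sketch, not
       part of pax itself, of how a writer might decide which of these two
       records to emit; the function name extended_records_needed is
       hypothetical.

```python
# Largest values quoted in the size and uid keyword descriptions.
USTAR_MAX_SIZE = 0o77777777777   # 8589934591, eleven octal digits
USTAR_MAX_UID = 0o7777777        # 2097151, seven octal digits


def extended_records_needed(size, uid):
    """Return the keywords (size, uid) that need an extended header record.

    Hypothetical helper: only the two numeric thresholds are checked.
    """
    needed = []
    if size > USTAR_MAX_SIZE:
        needed.append("size")
    if uid > USTAR_MAX_UID:
        needed.append("uid")
    return needed
```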
1144
1145
1146   pax Extended Header Keyword Precedence
1147       This section describes the  precedence  in  which  the  various  header
1148       records  and fields and command line options are selected to apply to a
1149       file in the archive. When pax is used in read or list modes,  it  shall
1150       determine a file attribute in the following sequence:
1151
1152              1.     If   -o   delete=keyword-prefix  is  used,  the  affected
1153                     attributes shall be determined from step 7., if  applica‐
1154                     ble, or ignored otherwise.
1155
1156              2.     If -o keyword:= is used, the affected attributes shall be
1157                     ignored.
1158
1159              3.     If -o keyword:=value  is  used,  the  affected  attribute
1160                     shall be assigned the value.
1161
1162              4.     If  there  is  a  typeflag  x extended header record, the
1163                     affected attribute shall be assigned the  <value>.   When
1164                     extended  header  records conflict, the last one given in
1165                     the header shall take precedence.
1166
1167              5.     If -o keyword=value is used, the affected attribute shall
1168                     be assigned the value.
1169
1170              6.     If  there  is a typeflag g global extended header record,
1171                     the affected attribute shall  be  assigned  the  <value>.
1172                     When  global  extended  header records conflict, the last
1173                     one given in the global header shall take precedence.
1174
1175              7.     Otherwise, the attribute shall  be  determined  from  the
1176                     ustar header block.
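
       The seven steps above can be sketched as a single lookup function.
       This is an illustration only: the parameter names are hypothetical,
       each source is assumed to have been parsed into a dictionary already
       (so that the last conflicting record has overwritten earlier ones),
       and delete patterns are assumed to be matched with shell-style
       pattern matching.

```python
from fnmatch import fnmatchcase

IGNORED = object()  # sentinel meaning "this attribute is ignored"


def resolve_attribute(keyword, ustar, *, delete=(), per_file=None,
                      ext_header=None, opt_global=None, global_header=None):
    """Pick the value for keyword following steps 1 to 7 (a sketch).

    All parameter names are hypothetical:
      delete         patterns from -o delete=keyword-prefix
      per_file       {keyword: value} from -o keyword:=value ("" for keyword:=)
      ext_header     records of the typeflag x extended header
      opt_global     {keyword: value} from -o keyword=value
      global_header  records of the typeflag g global extended header
      ustar          fields decoded from the ustar header block
    """
    per_file = per_file or {}
    ext_header = ext_header or {}
    opt_global = opt_global or {}
    global_header = global_header or {}

    # Step 1: a matching delete pattern skips straight to step 7.
    if any(fnmatchcase(keyword, pattern) for pattern in delete):
        return ustar.get(keyword, IGNORED)
    # Steps 2 and 3: -o keyword:= ignores, -o keyword:=value assigns.
    if keyword in per_file:
        value = per_file[keyword]
        return IGNORED if value == "" else value
    # Steps 4 to 6: the first source that carries the keyword wins.
    for source in (ext_header, opt_global, global_header):
        if keyword in source:
            return source[keyword]
    # Step 7: fall back to the ustar header block, if it has such a field.
    return ustar.get(keyword, IGNORED)
```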
1177
1178
1179   pax Extended Header File Times
1180       The  pax  utility shall write an mtime record for each file in write or
1181       copy modes if  the  file's  modification  time  cannot  be  represented
1182       exactly  in  the  ustar header logical record described in ustar Inter‐
1183       change Format.  This can occur if the time is out of ustar range, or if
1184       the  file  system of the underlying implementation supports non-integer
1185       time granularities and the time is not an integer. All  of  these  time
1186       records  shall  be formatted as a decimal representation of the time in
1187       seconds since the Epoch. If a period ('.') decimal point  character  is
1188       present, the digits to the right of the point shall represent the units
1189       of a subsecond timing granularity, where the first digit is tenths of a
1190       second  and  each subsequent digit is a tenth of the previous digit. In
1191       read or copy mode, the pax utility shall truncate the time of a file to
1192       the greatest value that is not greater than the input header file time.
1193       In write or copy mode, the pax utility shall output a time  exactly  if
1194       it  can be represented exactly as a decimal number, and otherwise shall
1195       generate only enough digits so that the same time shall be recovered if
1196       the  file is extracted on a system whose underlying implementation sup‐
1197       ports the same time granularity.
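
       As an illustration of the time format, the sketch below reads a time
       value by truncating it toward the greatest representable value that
       is not greater than the input, and writes a time with only as many
       fractional digits as are needed. The function names are hypothetical,
       and write_time assumes a non-negative time held as whole seconds plus
       nanoseconds.

```python
from decimal import Decimal, ROUND_FLOOR


def read_time(value, digits):
    """Truncate a time string to `digits` fractional decimal digits.

    ROUND_FLOOR yields the greatest value not greater than the input,
    which also holds for times before the Epoch.
    """
    quantum = Decimal(1).scaleb(-digits)
    return Decimal(value).quantize(quantum, rounding=ROUND_FLOOR)


def write_time(seconds, nanoseconds):
    """Format seconds and nanoseconds with no redundant trailing zeros."""
    if seconds < 0 or not 0 <= nanoseconds < 10**9:
        raise ValueError("sketch handles non-negative times only")
    text = f"{seconds}.{nanoseconds:09d}"
    # The period shields the integer part while trailing zeros are removed.
    return text.rstrip("0").rstrip(".")
```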
1198
1199
1200   ustar Interchange Format
1201       A ustar archive tape or file shall contain a series of logical records.
1202       Each  logical record shall be a fixed-size logical record of 512 octets
1203       (see below). Although this format may be thought of as being stored  on
1204       9-track  industry-standard  12.7 mm (0.5 in) magnetic tape, other types
1205       of transportable media are not excluded. Each file  archived  shall  be
1206       represented  by  a  header logical record that describes the file, fol‐
1207       lowed by zero or more logical records that give  the  contents  of  the
1208       file. At the end of the archive file there shall be two 512-octet logi‐
1209       cal records filled with binary zeros, interpreted as an  end-of-archive
1210       indicator.
1211
1212       The  logical  records  may  be  grouped for physical I/O operations, as
1213       described under the -b blocksize and -x ustar options.  Each  group  of
1214       logical  records  may  be written with a single operation equivalent to
1215       the write(2) function. On magnetic tape, the result of this write shall
1216       be  a  single tape physical block. The last physical block shall always
1217       be the full size, so logical records after the two zero logical records
1218       may contain undefined data.
1219
1220       The header logical record shall be structured as shown in the following
1221       table. All lengths and offsets are in decimal.
1222
                              Table: ustar Header Block

                  ┌───────────┬──────────────┬────────────────────┐
                  │Field Name │ Octet Offset │ Length (in Octets) │
                  ├───────────┼──────────────┼────────────────────┤
                  │name       │       0      │        100         │
                  │mode       │     100      │          8         │
                  │uid        │     108      │          8         │
                  │gid        │     116      │          8         │
                  │size       │     124      │         12         │
                  │mtime      │     136      │         12         │
                  │chksum     │     148      │          8         │
                  │typeflag   │     156      │          1         │
                  │linkname   │     157      │        100         │
                  │magic      │     257      │          6         │
                  │version    │     263      │          2         │
                  │uname      │     265      │         32         │
                  │gname      │     297      │         32         │
                  │devmajor   │     329      │          8         │
                  │devminor   │     337      │          8         │
                  │prefix     │     345      │        155         │
                  └───────────┴──────────────┴────────────────────┘

1245       All characters in the header logical record shall be represented in the
1246       coded  character  set  of  the  ISO/IEC  646:1991 standard. For maximum
1247       portability between implementations,  names  should  be  selected  from
1248       characters represented by the portable filename character set as octets
1249       with the most significant bit zero. If an implementation  supports  the
1250       use  of characters outside of slash and the portable filename character
1251       set in names for files, users, and groups, one or more  implementation-
1252       defined encodings of these characters shall be provided for interchange
1253       purposes.
1254
1255       However, the pax utility shall never create filenames on the local sys‐
1256       tem  that  cannot  be accessed via the procedures described in IEEE Std
1257       1003.1-2001. If a filename is found on the medium that would create  an
1258       invalid  filename,  it  is implementation-defined whether the data from
1259       the file is stored on the file hierarchy and  under  what  name  it  is
1260       stored.  The pax utility may choose to ignore these files as long as it
1261       produces an error indicating that the file is being ignored.
1262
1263       Each field within the header logical record  is  contiguous;  that  is,
1264       there is no padding used. Each character on the archive medium shall be
1265       stored contiguously.
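
       Because the fields are contiguous, each offset in the ustar Header
       Block table is the running sum of the preceding lengths. The sketch
       below, with hypothetical helper names, derives the offsets from the
       lengths and slices a 512-octet header logical record into raw fields.

```python
# (field name, length in octets), in the order given in the table.
USTAR_FIELDS = [
    ("name", 100), ("mode", 8), ("uid", 8), ("gid", 8), ("size", 12),
    ("mtime", 12), ("chksum", 8), ("typeflag", 1), ("linkname", 100),
    ("magic", 6), ("version", 2), ("uname", 32), ("gname", 32),
    ("devmajor", 8), ("devminor", 8), ("prefix", 155),
]


def field_offsets(fields=USTAR_FIELDS):
    """Return ({name: offset}, total length) for contiguous fields."""
    offsets, position = {}, 0
    for name, length in fields:
        offsets[name] = position
        position += length
    return offsets, position


def split_header(block):
    """Slice a 512-octet header logical record into raw field values."""
    if len(block) != 512:
        raise ValueError("a header logical record is 512 octets")
    offsets, _ = field_offsets()
    return {name: block[offsets[name]:offsets[name] + length]
            for name, length in USTAR_FIELDS}
```

       The fields occupy the first 500 octets; the remaining 12 octets of
       the record are not assigned to any field in the table.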
1266
       The fields magic, uname, and gname are character strings, each
       terminated by a NUL character. The fields name, linkname, and prefix
       are NUL-terminated character strings, except that the terminating NUL
       is omitted when the string fills the entire field. The version field
       is two octets containing the characters "00" (zero-zero). The
       typeflag field contains a single character. All other fields are
       leading zero-filled octal numbers using digits from the ISO/IEC
       646:1991 standard IRV. Each numeric field is terminated by one or
       more <space> or NUL characters.
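
       A minimal sketch of decoding the two kinds of field follows. The
       function names are hypothetical, and treating an entirely blank
       numeric field as zero is an assumption of this sketch rather than
       something the format states.

```python
def decode_string(raw):
    """Return the octets before the first NUL, or all of them if none."""
    return raw.split(b"\0", 1)[0]


def decode_octal(raw):
    """Decode a zero-filled octal field ended by <space> or NUL octets."""
    digits = raw.rstrip(b" \0")
    if not digits:
        return 0  # assumption: an empty field is read as zero
    return int(digits.decode("ascii"), 8)
```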
1276
       The name and the prefix fields shall produce the pathname of the file.
       A new pathname shall be formed, if prefix is not an empty string (its
       first character is not NUL), by concatenating prefix (up to the first
       NUL character), a slash character, and name; otherwise, name is used
       alone. In either case, name is terminated at the first NUL character.
       If prefix begins with a NUL character, it shall be ignored. In this
       manner, pathnames of at most 256 characters can be supported. If a
       pathname does not fit in the space provided, pax shall notify the user
       of the error, and shall not store any part of the file (header or
       data) on the medium.
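
       The rule for joining prefix and name can be sketched as follows; the
       function name ustar_pathname is hypothetical. The 256-character limit
       is the 155 octets of prefix, one slash, and the 100 octets of name.

```python
def ustar_pathname(name, prefix):
    """Build the pathname from the raw name and prefix header fields."""
    name = name.split(b"\0", 1)[0]      # name ends at its first NUL
    prefix = prefix.split(b"\0", 1)[0]  # so does prefix
    if prefix:
        return prefix + b"/" + name
    return name                          # empty prefix: name is used alone
```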
1287
1288       The  linkname  field, described below, shall not use the prefix to pro‐
1289       duce a pathname. As such, a linkname is limited to 100  characters.  If
1290       the  name does not fit in the space provided, pax shall notify the user
1291       of the error, and shall not attempt to store the link on the medium.
1292
1293       The mode field provides 12 bits encoded in the ISO/IEC  646:1991  stan‐
1294       dard  octal  digit representation. The encoded bits shall represent the
1295       following values:
1296
                               Table: ustar mode Field

     ┌──────┬─────────────────┬─────────────────────────────────────────────────┐
     │ Bit  │ IEEE Std        │                                                 │
     │Value │ 1003.1-2001 Bit │ Description                                     │
     ├──────┼─────────────────┼─────────────────────────────────────────────────┤
     │04000 │ S_ISUID         │ Set UID on execution.                           │
     │02000 │ S_ISGID         │ Set GID on execution.                           │
     │01000 │ <reserved>      │ Reserved for future standardization.            │
     │00400 │ S_IRUSR         │ Read permission for file owner class.           │
     │00200 │ S_IWUSR         │ Write permission for file owner class.          │
     │00100 │ S_IXUSR         │ Execute/search permission for file owner class. │
     │00040 │ S_IRGRP         │ Read permission for file group class.           │
     │00020 │ S_IWGRP         │ Write permission for file group class.          │
     │00010 │ S_IXGRP         │ Execute/search permission for file group class. │
     │00004 │ S_IROTH         │ Read permission for file other class.           │
     │00002 │ S_IWOTH         │ Write permission for file other class.          │
     │00001 │ S_IXOTH         │ Execute/search permission for file other class. │
     └──────┴─────────────────┴─────────────────────────────────────────────────┘

1316       When appropriate privilege is required to set one of these  mode  bits,
1317       and  the  user  restoring  the files from the archive does not have the
1318       appropriate privilege, the mode bits for which the user does  not  have
1319       appropriate  privilege  shall  be ignored. Some of the mode bits in the
1320       archive format are not mentioned elsewhere in this volume of  IEEE  Std
1321       1003.1-2001.  If  the  implementation does not support those bits, they
1322       may be ignored.
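
       As an illustration of the ustar mode Field table, the sketch below
       lists the names of the bits set in a decoded mode value. The helper
       name describe_mode is hypothetical; the bit values and names are
       those of the table.

```python
# (bit value, name) pairs in the order of the ustar mode Field table.
MODE_BITS = [
    (0o4000, "S_ISUID"), (0o2000, "S_ISGID"), (0o1000, "<reserved>"),
    (0o0400, "S_IRUSR"), (0o0200, "S_IWUSR"), (0o0100, "S_IXUSR"),
    (0o0040, "S_IRGRP"), (0o0020, "S_IWGRP"), (0o0010, "S_IXGRP"),
    (0o0004, "S_IROTH"), (0o0002, "S_IWOTH"), (0o0001, "S_IXOTH"),
]


def describe_mode(mode):
    """Return the names of the mode bits that are set in `mode`."""
    return [name for bit, name in MODE_BITS if mode & bit]
```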
1323
1324       The uid and gid fields are the user and group ID of the owner and group
1325       of the file, respectively.
1326
1327       The size field is the size of the file in octets. If the typeflag field
1328       is set to specify a file to be of type 1 (a  link)  or  2  (a  symbolic
1329       link), the size field shall be specified as zero. If the typeflag field
1330       is set to specify a file of type 5 (directory), the size field shall be
1331       interpreted  as  described under the definition of that record type. No
1332       data logical records are stored for types 1, 2, or 5. If  the  typeflag
1333       field  is set to 3 (character special file), 4 (block special file), or
1334       6 (FIFO), the meaning of the size field is unspecified by  this  volume
1335       of IEEE Std 1003.1-2001, and no data logical records shall be stored on
1336       the medium.  Additionally, for type 6, the size field shall be  ignored
1337       when reading. If the typeflag field is set to any other value, the num‐
1338       ber  of  logical  records  written  following  the  header   shall   be
1339       (size+511)/512, ignoring any fraction in the result of the division.
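
       The record count described above can be sketched as follows. The
       function name is hypothetical; typeflag values 1 to 6 are treated as
       carrying no data logical records, as the paragraph states.

```python
NO_DATA_TYPES = {"1", "2", "3", "4", "5", "6"}


def data_record_count(size, typeflag="0"):
    """Number of 512-octet data logical records that follow the header."""
    if typeflag in NO_DATA_TYPES:
        return 0
    # Integer division discards the fraction, as the text requires.
    return (size + 511) // 512
```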
1340
1341       The  mtime field shall be the modification time of the file at the time
1342       it was archived. It is the ISO/IEC 646:1991 standard representation  of
1343       the  octal  value  of  the  modification time obtained from the stat(2)
1344       function.
1345
1346       The chksum field shall be the ISO/IEC 646:1991 standard IRV representa‐
1347       tion  of  the octal value of the simple sum of all octets in the header
1348       logical record. Each octet  in  the  header  shall  be  treated  as  an
1349       unsigned  value.  These  values  shall be added to an unsigned integer,
1350       initialized to zero, the precision of which is not less than  17  bits.
1351       When  calculating  the  checksum,  the chksum field is treated as if it
1352       were all spaces.
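
       A minimal sketch of the checksum calculation follows; the function
       name is hypothetical. The chksum field occupies octets 148 to 155 of
       the header, so those eight octets are replaced by spaces before
       summing.

```python
def ustar_checksum(block):
    """Sum all octets of the header, treating chksum as eight spaces."""
    if len(block) != 512:
        raise ValueError("a header logical record is 512 octets")
    patched = block[:148] + b" " * 8 + block[156:]
    return sum(patched)  # iterating bytes yields unsigned values 0..255
```

       The largest possible sum is 504 * 255 + 8 * 32 = 128776, which is
       below 2^17 = 131072 and therefore fits in the 17 bits required above.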
1353
1354       The typeflag field specifies the type of file archived. If a particular
1355       implementation  does  not recognize the type, or the user does not have
1356       appropriate privilege to create that type, the file shall be  extracted
1357       as  if  it  were  a  regular file if the file type is defined to have a
1358       meaning for the size field that could cause data logical records to  be
1359       written on the medium (see the previous description for size).  If con‐
1360       version to a regular file occurs, the  pax  utility  shall  produce  an
1361       error  indicating  that  the conversion took place. All of the typeflag
1362       fields shall be coded in the ISO/IEC 646:1991 standard IRV:
1363
       0      Represents a regular file. For backwards-compatibility, a
              typeflag value of binary zero ('\0') should be recognized as
              meaning a regular file when extracting files from the archive.
              Archives written with this version of the archive file format
              create regular files with a typeflag value of the ISO/IEC
              646:1991 standard IRV '0'.
1370
1371       1      Represents  a  file  linked to another file, of any type, previ‐
1372              ously archived. Such files are identified  by  having  the  same
1373              device and file serial numbers, and pathnames that refer to dif‐
1374              ferent directory entries. All such files shall  be  archived  as
1375              linked  files.  The  linked-to name is specified in the linkname
1376              field with a NUL-character terminator if it  is  less  than  100
1377              octets in length.
1378
1379       2      Represents  a  symbolic  link. The contents of the symbolic link
1380              shall be stored in the linkname field.
1381
1382       3,4    Represent  character  special  files  and  block  special  files
1383              respectively.  In  this  case  the  devmajor and devminor fields
1384              shall contain information defining the  device,  the  format  of
1385              which  is  unspecified  by  this volume of IEEE Std 1003.1-2001.
1386              Implementations may map the device specifications to  their  own
1387              local specification or may ignore the entry.
1388
1389       5      Specifies  a  directory  or  subdirectory. On systems where disk
1390              allocation is performed on a directory  basis,  the  size  field
1391              shall contain the maximum number of octets (which may be rounded
1392              to the nearest disk block allocation unit)  that  the  directory
1393              may  hold. A size field of zero indicates no such limiting. Sys‐
1394              tems that do not support limiting in this manner  should  ignore
1395              the size field.
1396
1397       6      Specifies a FIFO special file. Note that the archiving of a FIFO
1398              file archives the existence of this file and not its contents.
1399
1400       7      Reserved to represent a file  to  which  an  implementation  has
1401              associated   some  high-performance  attribute.  Implementations
1402              without such extensions should treat this file as a regular file
1403              (type 0).
1404
1405       A-Z    The  letters  'A'  to  'Z',  inclusive,  are reserved for custom
1406              implementations. All other values are reserved for  future  ver‐
1407              sions of IEEE Std 1003.1-2001.
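
       The typeflag values listed above can be sketched as a lookup table.
       The function name and the descriptive strings are hypothetical; the
       x and g entries are the extended header types referred to under pax
       Extended Header Keyword Precedence.

```python
TYPEFLAGS = {
    "0": "regular file",
    "\0": "regular file",            # accepted for backwards-compatibility
    "1": "link",
    "2": "symbolic link",
    "3": "character special file",
    "4": "block special file",
    "5": "directory",
    "6": "FIFO",
    "7": "regular file",             # reserved; treated as type 0 here
    "x": "extended header",
    "g": "global extended header",
}


def classify_typeflag(flag):
    """Map a one-character typeflag to a description (a sketch)."""
    if flag in TYPEFLAGS:
        return TYPEFLAGS[flag]
    if len(flag) == 1 and "A" <= flag <= "Z":
        return "custom implementation"
    return "reserved"
```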
1408
1409       It  is  unspecified whether files with pathnames that refer to the same
1410       directory entry are archived as linked files or as separate  files.  If
1411       they  are  archived  as  linked  files,  this  means that attempting to
1412       extract both pathnames from the resulting archive will always cause  an
1413       error  (unless  the  -u option is used) because the link cannot be cre‐
1414       ated.
1415
1416       It is unspecified whether files with the same device  and  file  serial
1417       numbers  being  appended  to  an archive are treated as linked files to
1418       members that were in the archive before the append.
1419
1420       Attempts to archive a socket using ustar interchange format shall  pro‐
1421       duce  a diagnostic message. Handling of other file types is implementa‐
1422       tion-defined.
1423
1424       The magic field is the specification that this archive  was  output  in
1425       this  archive format. If this field contains ustar (the five characters
1426       from the ISO/IEC 646:1991 standard IRV  shown  followed  by  NUL),  the
1427       uname  and gname fields shall contain the ISO/IEC 646:1991 standard IRV
1428       representation of the owner and group of the file, respectively  (trun‐
1429       cated  to  fit,  if  necessary).  When the file is restored by a privi‐
1430       leged, protection-preserving version of the utility, the user and group
1431       databases  shall  be  scanned  for  these names. If found, the user and
1432       group IDs contained within these files shall be used  rather  than  the
1433       values contained within the uid and gid fields.
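
       The name lookup described above can be sketched with the user and
       group database interfaces of the Python standard library (available
       on UNIX-like systems only). The function name restore_ids and its
       parameters are hypothetical.

```python
import grp
import pwd


def restore_ids(magic, uname, gname, uid, gid):
    """Prefer IDs found by name; fall back to the numeric header fields."""
    if magic != b"ustar\0":
        return uid, gid          # uname and gname are not meaningful
    try:
        uid = pwd.getpwnam(uname).pw_uid
    except KeyError:
        pass                     # name not in the user database
    try:
        gid = grp.getgrnam(gname).gr_gid
    except KeyError:
        pass                     # name not in the group database
    return uid, gid
```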
1434
1435
1436   cpio Interchange Format
1437       The  octet-oriented  cpio  archive format shall be a series of entries,
1438       each comprising a header that describes the file, the name of the file,
1439       and then the contents of the file.
1440
1441       An  archive may be recorded as a series of fixed-size blocks of octets.
1442       This blocking shall be used only to make physical I/O  more  efficient.
1443       The last group of blocks shall always be at the full size.
1444
1445       For the octet-oriented cpio archive format, the individual entry infor‐
1446       mation shall be in the order indicated and described by  the  following
1447       table; see also the <cpio.h> header.
1448
                      Table: Octet-Oriented cpio Archive Entry

            ┌─────────────────────┬────────────────────┬─────────────────┐
            │Header Field Name    │ Length (in Octets) │ Interpreted as  │
            ├─────────────────────┼────────────────────┼─────────────────┤
            │c_magic              │ 6                  │ Octal number    │
            │c_dev                │ 6                  │ Octal number    │
            │c_ino                │ 6                  │ Octal number    │
            │c_mode               │ 6                  │ Octal number    │
            │c_uid                │ 6                  │ Octal number    │
            │c_gid                │ 6                  │ Octal number    │
            │c_nlink              │ 6                  │ Octal number    │
            │c_rdev               │ 6                  │ Octal number    │
            │c_mtime              │ 11                 │ Octal number    │
            │c_namesize           │ 6                  │ Octal number    │
            │c_filesize           │ 11                 │ Octal number    │
            │                     │                    │                 │
            │Filename Field Name  │ Length             │ Interpreted as  │
            │c_name               │ c_namesize         │ Pathname string │
            │                     │                    │                 │
            │File Data Field Name │ Length             │ Interpreted as  │
            │c_filedata           │ c_filesize         │ Data            │
            └─────────────────────┴────────────────────┴─────────────────┘

1472   cpio Header
       For each file in the archive, a header as defined previously shall be
       written. The information in the header fields is written as streams of
       ISO/IEC 646:1991 standard characters interpreted as octal numbers. The
       octal numbers shall be extended to the necessary length by adding
       ISO/IEC 646:1991 standard IRV zeros at the most-significant-digit end
       of the number; the result is written to the stream of octets
       most-significant digit first. The fields shall be interpreted as
       follows:
1481
       c_magic
              Identifies the archive as being a transportable archive by
              containing the identifying value "070707".

       c_dev, c_ino
              Contain values that uniquely identify the file within the
              archive (that is, no files contain the same pair of c_dev and
              c_ino values unless they are links to the same file). The values
              shall be determined in an unspecified manner.
1491
1492       c_mode Contains the file type and access permissions as defined in  the
1493              following table.
1494
                            Table: Values for cpio c_mode Field

                 ┌──────────────────────┬─────────┬────────────────────────┐
                 │File Permissions Name │ Value   │ Indicates              │
                 ├──────────────────────┼─────────┼────────────────────────┤
                 │C_IRUSR               │ 000400  │ Read by owner          │
                 │C_IWUSR               │ 000200  │ Write by owner         │
                 │C_IXUSR               │ 000100  │ Execute by owner       │
                 │C_IRGRP               │ 000040  │ Read by group          │
                 │C_IWGRP               │ 000020  │ Write by group         │
                 │C_IXGRP               │ 000010  │ Execute by group       │
                 │C_IROTH               │ 000004  │ Read by others         │
                 │C_IWOTH               │ 000002  │ Write by others        │
                 │C_IXOTH               │ 000001  │ Execute by others      │
                 │C_ISUID               │ 004000  │ Set uid                │
                 │C_ISGID               │ 002000  │ Set gid                │
                 │C_ISVTX               │ 001000  │ Reserved               │
                 ├──────────────────────┼─────────┼────────────────────────┤
                 │File Type Name        │ Value   │ Indicates              │
                 ├──────────────────────┼─────────┼────────────────────────┤
                 │C_ISDIR               │ 0040000 │ Directory              │
                 │C_ISFIFO              │ 0010000 │ FIFO                   │
                 │C_ISREG               │ 0100000 │ Regular file           │
                 │C_ISLNK               │ 0120000 │ Symbolic link          │
                 │C_ISBLK               │ 0060000 │ Block special file     │
                 │C_ISCHR               │ 0020000 │ Character special file │
                 │C_ISSOCK              │ 0140000 │ Socket                 │
                 │C_ISCTG               │ 0110000 │ Reserved               │
                 └──────────────────────┴─────────┴────────────────────────┘

1524              Directories,  FIFOs,  symbolic links, and regular files shall be
1525              supported on a system conforming to  this  volume  of  IEEE  Std
1526              1003.1-2001;  additional  values defined previously are reserved
1527              for compatibility with existing systems.  Additional file  types
1528              may  be  supported; however, such files should not be written to
1529              archives intended to be transported to other systems.
1530
1531       c_uid  Contains the user ID of the owner.
1532
1533       c_gid  Contains the group ID of the group.
1534
1535       c_nlink
1536              Contains a number greater than or equal to the number  of  links
1537              in the archive referencing the file. If the -a option is used to
1538              append to a cpio archive, then the pax utility need not  account
1539              for the files in the existing part of the archive when calculat‐
1540              ing the c_nlink values for the appended part of the archive, and
1541              need  not  alter  the c_nlink values in the existing part of the
1542              archive if additional files with the same c_dev and c_ino values
1543              are appended to the archive.
1544
1545       c_rdev Contains  implementation-defined  information  for  character or
1546              block special files.
1547
1548       c_mtime
1549              Contains the latest time of modification of the file at the time
1550              the archive was created.
1551
1552       c_namesize
1553              Contains  the  length of the pathname, including the terminating
1554              NUL character.
1555
1556       c_filesize
1557              Contains the length of the file in octets.  This  shall  be  the
1558              length of the data section following the header structure.
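
       The header fields described above can be sketched as a pair of
       encode and decode helpers. The function names are hypothetical; the
       field order and lengths are those of the Octet-Oriented cpio Archive
       Entry table, which add up to a 76-octet header.

```python
# (field name, length in octets), in the order given in the table.
CPIO_FIELDS = [
    ("c_magic", 6), ("c_dev", 6), ("c_ino", 6), ("c_mode", 6),
    ("c_uid", 6), ("c_gid", 6), ("c_nlink", 6), ("c_rdev", 6),
    ("c_mtime", 11), ("c_namesize", 6), ("c_filesize", 11),
]
CPIO_HEADER_LEN = sum(length for _, length in CPIO_FIELDS)  # 76 octets


def pack_cpio_header(values):
    """Encode integer field values as zero-filled octal digit strings."""
    parts = []
    for name, length in CPIO_FIELDS:
        text = format(values[name], "o").rjust(length, "0")
        if len(text) > length:
            raise ValueError(f"{name} does not fit in {length} octets")
        parts.append(text)
    return "".join(parts).encode("ascii")


def unpack_cpio_header(header):
    """Decode a 76-octet header into a dictionary of integers."""
    if len(header) != CPIO_HEADER_LEN:
        raise ValueError("a cpio header is 76 octets")
    values, position = {}, 0
    for name, length in CPIO_FIELDS:
        values[name] = int(header[position:position + length].decode("ascii"), 8)
        position += length
    return values
```

       Note that c_namesize counts the terminating NUL character, so a
       pathname of n octets is recorded with c_namesize equal to n + 1.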
1559
1560
1561   cpio Filename
1562       The  c_name field shall contain the pathname of the file. The length of
1563       this field in octets is the value of c_namesize.
1564
1565       If a filename is found on the medium that would create an invalid path‐
1566       name,  it  is  implementation-defined whether the data from the file is
1567       stored on the file hierarchy and under what name it is stored.
1568
1569       All characters shall be represented in the  ISO/IEC  646:1991  standard
1570       IRV.  For  maximum portability between implementations, names should be
1571       selected from characters represented by the portable filename character
1572       set  as octets with the most significant bit zero. If an implementation
1573       supports the use of characters outside the portable filename  character
1574       set  in names for files, users, and groups, one or more implementation-
1575       defined encodings of these characters shall be provided for interchange
1576       purposes.  However, the pax utility shall never create filenames on the
1577       local system that cannot be accessed via the procedures described  pre‐
1578       viously  in this volume of IEEE Std 1003.1-2001. If a filename is found
1579       on the medium that would create an invalid filename, it is  implementa‐
1580       tion-defined whether the data from the file is stored on the local file
1581       system and under what name it is stored. The pax utility may choose  to
1582       ignore  these files as long as it produces an error indicating that the
1583       file is being ignored.
1584
1585
1586   cpio File Data
1587       Following c_name, there shall be c_filesize octets of data.   Interpre‐
1588       tation  of  such  data  occurs  in  a  manner dependent on the file. If
1589       c_filesize is zero, no data shall be contained in c_filedata.
1590
1591       When restoring from an archive:
1592
1593       ·      If the user does not have the appropriate privilege to create  a
1594              file of the specified type, pax shall ignore the entry and write
1595              an error message to standard error.
1596
1597       ·      Only regular files have data to be restored. Presuming a regular
1598              file  meets  any selection criteria that might be imposed on the
1599              format-reading utility by the user, such data shall be restored.
1600
1601       ·      If a user does not have appropriate privilege to set a  particu‐
1602              lar mode flag, the flag shall be ignored. Some of the mode flags
1603              in the archive format are not mentioned elsewhere in this volume
1604              of  IEEE Std 1003.1-2001. If the implementation does not support
1605              those flags, they may be ignored.
1606
1607
1608   cpio Special Entries
1609       FIFO special files, directories, and the trailer shall be recorded with
1610       c_filesize  equal  to  zero.  For  other  special  files, c_filesize is
1611       unspecified by this volume of IEEE Std 1003.1-2001. The header for  the
1612       next file entry in the archive shall be written directly after the last
1613       octet of the file entry preceding it. A header  denoting  the  filename
1614       TRAILER!!!   shall  indicate  the  end  of the archive; the contents of
1615       octets in the last block of the archive following  such  a  header  are
1616       undefined.
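
       The entry layout described above (header, then c_namesize octets  of
       name,  then  c_filesize  octets  of data, ending at TRAILER!!!) can be
       walked with a short reader. The following Python sketch is not part of
       pax;  it  assumes  the  octet-oriented header layout given earlier in
       this section (ASCII octal fields, six octets wide  except  for  the
       eleven-octet  c_mtime  and c_filesize). The helper names make_entry and
       walk are illustrative only.

```python
import io

# Field names and widths (in octets) of the octet-oriented cpio header;
# every field holds an octal number written as ASCII digits.
FIELDS = [("c_magic", 6), ("c_dev", 6), ("c_ino", 6), ("c_mode", 6),
          ("c_uid", 6), ("c_gid", 6), ("c_nlink", 6), ("c_rdev", 6),
          ("c_mtime", 11), ("c_namesize", 6), ("c_filesize", 11)]
HEADER_LEN = sum(width for _, width in FIELDS)  # 76 octets


def make_entry(name, data=b"", mode=0o100644):
    """Build one archive entry (hypothetical helper for the demo)."""
    raw_name = name.encode("ascii") + b"\0"   # c_namesize counts the NUL
    values = {"c_magic": 0o070707, "c_mode": mode, "c_nlink": 1,
              "c_namesize": len(raw_name), "c_filesize": len(data)}
    header = "".join("%0*o" % (width, values.get(field, 0))
                     for field, width in FIELDS)
    return header.encode("ascii") + raw_name + data


def walk(stream):
    """Yield (name, data) for each member until the trailer is seen."""
    while True:
        raw = stream.read(HEADER_LEN)
        if len(raw) < HEADER_LEN:
            raise ValueError("archive ended before TRAILER!!!")
        header, offset = {}, 0
        for field, width in FIELDS:
            header[field] = int(raw[offset:offset + width], 8)
            offset += width
        # The name is stored with its terminating NUL; drop it.
        name = stream.read(header["c_namesize"])[:-1].decode("ascii")
        # Directories, FIFOs and the trailer carry c_filesize == 0.
        data = stream.read(header["c_filesize"])
        if name == "TRAILER!!!":
            return          # octets after the trailer are undefined
        yield name, data


archive = (make_entry("dir", mode=0o040755)
           + make_entry("dir/a", b"hello")
           + make_entry("TRAILER!!!", mode=0)
           + b"\0" * 100)   # padding after the trailer is ignored
for name, data in walk(io.BytesIO(archive)):
    print(name, len(data))
```

       Because each header follows directly after the last octet of the pre‐
       vious entry, the reader needs no alignment or padding logic.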
1617
1618

EXIT STATUS

1620       The following exit values shall be returned:
1621
1622        0     All files were processed successfully.
1623
1624       >0     An error occurred.
1625
1626

CONSEQUENCES OF ERRORS

1628       If pax cannot create a file or a link when reading an archive or cannot
1629       find a file when writing an archive, or cannot preserve  the  user  ID,
1630       group  ID,  or  file mode when the -p option is specified, a diagnostic
1631       message shall be written to standard error and a non-zero  exit  status
1632       shall be returned, but processing shall continue. In the case where pax
1633       cannot create a link to a file, pax shall not,  by  default,  create  a
1634       second copy of the file.
1635
1636       If  the  extraction of a file from an archive is prematurely terminated
1637       by a signal or error, pax may have only partially extracted the file or
1638       (if  the  -n option was not specified) may have extracted a file of the
1639       same name as that specified by the user, but which is not the file  the
1640       user  wanted. Additionally, the file modes of extracted directories may
1641       have additional bits from the S_IRWXU mask set  as  well  as  incorrect
1642       modification and access times.
1643
1644
1645_________________________________________________________________

The following sections are informative.

1647
1648

APPLICATION USAGE

1650       Caution  is advised when using the -a option to append to a cpio format
1651       archive. If any of the files being appended happen to be given the same
1652       c_dev  and  c_ino values as a file in the existing part of the archive,
1653       then they may be treated as links to that file on extraction. Thus,  it
1654       is  risky to use -a with cpio format except when it is done on the same
1655       system that the original archive was created on, and with the same  pax
1656       utility,  and  in  the  knowledge that there has been little or no file
1657       system activity since the original archive was created that could  lead
1658       to  any of the files appended being given the same c_dev and c_ino val‐
1659       ues as an unrelated file in the existing part  of  the  archive.  Also,
1660       when (intentionally) appending additional links to a file in the exist‐
1661       ing part of the archive, the c_nlink values in the modified archive can
1662       be  smaller  than the number of links to the file in the archive, which
1663       may mean that the links are not preserved on extraction.
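
       The hazard with -a can be illustrated with a deliberately  simplified
       extractor.  The  Python  sketch  below is an assumption about how link
       detection might be done, not a description of any  particular  imple‐
       mentation:  it  treats any entry whose (c_dev, c_ino) pair has already
       been seen as a link to the first entry with that pair.  The  function
       name plan_extraction and the sample data are hypothetical.

```python
def plan_extraction(entries):
    """Map (name, c_dev, c_ino) tuples to extraction actions.

    Hypothetical, simplified extractor: any entry whose (c_dev, c_ino)
    pair was already seen is restored as a link to the first such entry.
    """
    first_seen = {}
    actions = []
    for name, dev, ino in entries:
        key = (dev, ino)
        if key in first_seen:
            actions.append(("link", name, first_seen[key]))
        else:
            first_seen[key] = name
            actions.append(("create", name))
    return actions


original = [("report.txt", 1, 100)]
appended = [("notes.txt", 1, 100)]   # unrelated file, inode number reused
print(plan_extraction(original + appended))
```

       In this run notes.txt is planned as a link to report.txt even  though
       the  two  files  were never related, which is the failure the caution
       above describes.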
1664
1665       The -p  (privileges)  option  was  invented  to  reconcile  differences
1666       between historical tar and cpio implementations. In particular, the two
1667       utilities use -m in diametrically opposed ways. The -p option also pro‐
1668       vides  a  consistent  means  of extending the ways in which future file
1669       attributes can be addressed, such as for enhanced security  systems  or
1670       high-performance  files. Although it may seem complex, there are really
1671       two modes that are most commonly used:
1672
       -p e   "Preserve everything". This would be  used  by  the  historical
1674              superuser,  someone with all the appropriate privileges, to pre‐
1675              serve all aspects of the files as they are recorded in  the  ar‐
1676              chive.  The  e flag is the sum of o and p, and other implementa‐
1677              tion-defined attributes.
1678
       -p p   "Preserve" the file mode bits. This would be used by  the  user
1680              with  regular  privileges  who wished to preserve aspects of the
1681              file other than the ownership. The file times are  preserved  by
1682              default,  but  two  other flags are offered to disable these and
1683              use the time of extraction.
1684
1685       The one pathname per line format of standard input precludes  pathnames
1686       containing  <newline>s.  Although  such  pathnames violate the portable
1687       filename guidelines, they may exist  and  their  presence  may  inhibit
1688       usage  of pax within shell scripts. This problem is inherited from his‐
1689       torical archive programs. The problem can be avoided by  listing  file‐
1690       name arguments on the command line instead of on standard input.
1691
1692       It  is  almost certain that appropriate privileges are required for pax
1693       to accomplish parts of this volume of IEEE Std  1003.1-2001.   Specifi‐
1694       cally,  creating  files  of  type  block  special or character special,
1695       restoring file access times unless the files are owned by the user (the
1696       -t  option),  or preserving file owner, group, and mode (the -p option)
1697       all probably require appropriate privileges.
1698
1699       In read mode, implementations are permitted to overwrite files when the
1700       archive  has multiple members with the same name. This may fail if per‐
1701       missions on the first version of the file do not permit it to be  over‐
1702       written.
1703
1704       The  cpio  and  ustar  formats  can only support files up to 8589934592
1705       bytes (8 * 2^30) in size.
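
       The figure follows from the width of the size field.  As  a  hedged
       illustration  (Python, assuming an eleven-digit octal size field as in
       the cpio c_filesize field), the arithmetic is:

```python
FILESIZE_DIGITS = 11  # width of the octal size field assumed here

largest = int("7" * FILESIZE_DIGITS, 8)   # 77777777777 octal
limit = largest + 1                       # first size that no longer fits

print(largest, limit, limit == 8 * 2**30)
```

       The largest size that can be stored is therefore one octet less  than
       8589934592.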
1706
1707

EXAMPLES

1709       The following command:
1710
1711            pax -w -f /dev/rmt/1m .
1712
       copies the contents of the current directory to tape  drive  1,  medium
       density  (assuming  historical  System V device naming procedures; the
       historical BSD device name would be /dev/rmt9).
1716
1717       The following commands:
1718
            mkdir newdir
            pax -rw olddir newdir
1720
       copy the olddir directory hierarchy to newdir.

       The following command:
1722
1723            pax -r -s ',^//*usr//*,,' -f a.pax
1724
1725       reads the archive a.pax, with all files rooted in /usr in  the  archive
1726       extracted relative to the current directory.
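
       The effect of the replacement string ',^//*usr//*,,' on  individual
       pathnames  can be previewed outside pax. The Python sketch below is an
       illustration only: Python's re module  is  not  a  basic  regular
       expression  engine,  but this particular pattern uses only constructs
       that mean the same in both. The function name strip_usr is hypothet‐
       ical.

```python
import re


def strip_usr(path):
    """Apply the equivalent of -s ',^//*usr//*,,' to one pathname."""
    # One or more leading slashes, "usr", one or more slashes -> removed.
    # count=1 mirrors a substitution without the g flag.
    return re.sub(r"^//*usr//*", "", path, count=1)


for path in ("/usr/bin/ls", "//usr///lib/x", "/etc/passwd", "usr/bin"):
    print(path, "->", strip_usr(path))
```

       Pathnames that do not begin with at least  one  slash  followed  by
       usr/ are left unchanged.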
1727
1728       Using the option:
1729
1730            -o listopt="%M %(atime)T %(size)D %(name)s"
1731
1732       overrides the default output description in Standard Output and instead
1733       writes:
1734
1735            -rw-rw--- Jan 12 15:53 1492 /usr/foo/bar
1736
1737       Using the options:
1738
1739            -o listopt='%L\t%(size)D\n%.7' \
1740            -o listopt='(name)s\n%(atime)T\n%T'
1741
1742       overrides the default output description in Standard Output and instead
1743       writes:
1744
            /usr/foo/bar -> /tmp   1492
            /usr/fo
            Jan 12 1991
            Jan 31 15:53
1749
1750

RATIONALE

1752       The  pax  utility  was new for the ISO POSIX-2:1993 standard. It repre‐
1753       sents a peaceful compromise between advocates of the historical tar and
1754       cpio utilities.
1755
1756       A  fundamental  difference between cpio and tar was in the way directo‐
1757       ries were treated. The cpio utility did not treat  directories  differ‐
1758       ently  from  other  files,  and  to select a directory and its contents
1759       required that each file in the hierarchy be explicitly  specified.  For
1760       tar, a directory matched every file in the file hierarchy it rooted.
1761
1762       The  pax  utility  offers  both interfaces; by default, directories map
1763       into the file hierarchy they root. The -d option causes pax to skip any
       file not explicitly referenced, as cpio historically did. The tar-style
       behavior  was  chosen  as  the  default  because  it was believed that
1766       this  was  the  more  common usage and because tar is the more commonly
1767       available interface, as it was historically provided on both  System  V
1768       and BSD implementations.
1769
1770       The  data  interchange  format specification in this volume of IEEE Std
1771       1003.1-2001 requires that processes with "appropriate privileges" shall
1772       always restore the ownership and permissions of extracted files exactly
1773       as archived. If viewed from the historic equivalence between  superuser
1774       and "appropriate privileges", there are two problems with this require‐
1775       ment. First, users running as superusers may unknowingly set  dangerous
1776       permissions  on  extracted files. Second, it is needlessly limiting, in
1777       that superusers cannot extract files and own them as  superuser  unless
1778       the  archive  was  created  by  the superuser. (It should be noted that
1779       restoration  of  ownerships  and  permissions  for  the  superuser,  by
1780       default,  is historical practice in cpio, but not in tar.)  In order to
1781       avoid these two problems,  the  pax  specification  has  an  additional
1782       "privilege"  mechanism,  the  -p option. Only a pax invocation with the
1783       privileges needed, and which has the -p option set using the e specifi‐
1784       cation  character, has the "appropriate privilege" to restore full own‐
1785       ership and permission information.
1786
1787       Note also that this volume of IEEE Std 1003.1-2001  requires  that  the
1788       file  ownership  and access permissions shall be set, on extraction, in
1789       the same fashion as the creat(2) function when provided with  the  mode
1790       stored  in  the  archive. This means that the file creation mask of the
1791       user is applied to the file permissions.
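
       As a small worked illustration (Python; the function name is hypothet‐
       ical  and  only  the permission bits are modeled), applying the file
       creation mask to an archived mode amounts to clearing the  bits  set
       in the mask:

```python
def effective_mode(archived_mode, umask):
    """Permission bits a creat()-style call would leave on the file."""
    return archived_mode & ~umask & 0o777


print(oct(effective_mode(0o666, 0o022)))
print(oct(effective_mode(0o777, 0o077)))
```

       An archived mode of 666 extracted under a mask of 022 thus yields 644.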
1792
1793       Users should note that directories may be created by pax while extract‐
1794       ing  files  with permissions that are different from those that existed
1795       at the time the archive was created. When extracting sensitive informa‐
1796       tion  into  a  directory  hierarchy  that  no  longer exists, users are
1797       encouraged to set their file creation  mask  appropriately  to  protect
1798       these files during extraction.
1799
1800       The  table  of contents output is written to standard output to facili‐
1801       tate pipeline processing.
1802
1803       An early proposal had hard links displaying for  all  pathnames.   This
1804       was  removed  because it complicates the output of the case where -v is
1805       not specified and does not match historical cpio usage.  The  hard-link
1806       information is available in the -v display.
1807
1808       The  description  of  the -l option allows implementations to make hard
1809       links to symbolic links. IEEE Std 1003.1-2001 does not specify any  way
1810       to create a hard link to a symbolic link, but many implementations pro‐
1811       vide this capability as an extension. If there are hard links  to  sym‐
1812       bolic  links when an archive is created, the implementation is required
1813       to archive the hard link in the archive (unless -H or -L is specified).
1814       When  in  read  mode  and in copy mode, implementations supporting hard
1815       links to symbolic links should use them when appropriate.
1816
1817       The archive formats inherited from the POSIX.1-1990 standard have  cer‐
1818       tain  restrictions  that have been brought along from historical usage.
1819       For example, there are restrictions on the length of  pathnames  stored
1820       in  the archive. When pax is used in copy (-rw) mode (copying directory
1821       hierarchies), the ability to use extensions  from  the  -x  pax  format
1822       overcomes these restrictions.
1823
1824       The default blocksize value of 5120 bytes for cpio was selected because
1825       it is one of the standard block-size values for cpio, set when  the  -B
1826       option  is  specified.  (The other default block-size value for cpio is
1827       512 bytes, and this was considered to be too small.) The default  block
1828       value  of 10240 bytes for tar was selected because that is the standard
1829       block-size value for BSD tar.  The maximum block size  of  32256  bytes
1830       (2^15-512  bytes) is the largest multiple of 512 bytes that fits into a
1831       signed 16-bit tape controller transfer register. There are known  limi‐
1832       tations  in  some  historical  systems that would prevent larger blocks
1833       from being accepted. Historical values were chosen to improve  compati‐
1834       bility  with  historical  scripts  using  dd(1) or similar utilities to
       manipulate archives. Also, default block sizes for any file type  other
       than  character special files have been deleted from this volume of IEEE
1837       Std 1003.1-2001 as unimportant and not likely to affect  the  structure
1838       of the resulting archive.
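
       The block-size figures quoted above can be checked with  a  few  lines
       of  arithmetic.  The Python sketch below is illustrative; the constant
       names are not taken from any implementation.

```python
SIGNED_16_MAX = 2**15 - 1     # largest value of a signed 16-bit register
RECORD = 512                  # block sizes are multiples of 512 bytes

# Largest multiple of 512 that does not exceed the register limit.
max_block = SIGNED_16_MAX // RECORD * RECORD

print(max_block, max_block == 2**15 - 512)
print(5120 // RECORD, 10240 // RECORD)   # cpio and tar defaults in records
```

       The cpio default thus corresponds to 10 records of 512 bytes and  the
       tar default to 20.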
1839
1840       Implementations  are  permitted to modify the block-size value based on
1841       the archive format or the device to which the archive is being written.
1842       This  is to provide implementations with the opportunity to take advan‐
1843       tage of special types of devices, and it should not be used  without  a
1844       great  deal  of  consideration as it almost certainly decreases archive
1845       portability.
1846
1847       The intended use of the -n option was to permit extraction  of  one  or
1848       more files from the archive without processing the entire archive. This
1849       was viewed by the standard developers as offering  significant  perfor‐
1850       mance  advantages  over  historical  implementations.  The -n option in
1851       early proposals had three effects; the first was to cause special char‐
1852       acters in patterns to not be treated specially. The second was to cause
1853       only the first file that matched a pattern to be extracted.  The  third
1854       was  to  cause pax to write a diagnostic message to standard error when
1855       no file was found matching a specified pattern. Only the second  behav‐
1856       ior  is  retained by this volume of IEEE Std 1003.1-2001, for many rea‐
1857       sons. First, it is in general not acceptable for  a  single  option  to
1858       have  multiple  effects.  Second,  the ability to make pattern matching
1859       characters act as normal characters is useful for parts  of  pax  other
1860       than file extraction. Third, a finer degree of control over the special
1861       characters is useful because users may wish to normalize only a  single
1862       special  character  in  a single filename. Fourth, given a more general
1863       escape mechanism, the previous behavior of the -n option can be  easily
1864       obtained  using the -s option or a sed script. Finally, writing a diag‐
1865       nostic message when a pattern specified by the user is unmatched by any
1866       file is useful behavior in all cases.
1867
1868       In this version, the -n was removed from the copy mode synopsis of pax;
1869       it is inapplicable because there are no pattern operands  specified  in
1870       this mode.
1871
       There is a method other than pax for copying  subtrees  in  IEEE  Std
1873       1003.1-2001 described as part of the cp(1) utility.  Both  methods  are
1874       historical  practice:  cp(1)  provides a simpler, more intuitive inter‐
1875       face, while pax offers a finer granularity of  control.  Each  provides
1876       additional functionality to the other; in particular, pax maintains the
1877       hard-link structure of the hierarchy while cp(1) does not.  It  is  the
1878       intention of the standard developers that the results be similar (using
1879       appropriate option combinations in both utilities). The results are not
1880       required  to  be  identical; there seemed insufficient gain to applica‐
1881       tions to balance the difficulty of implementations having to  guarantee
1882       that the results would be exactly identical.
1883
1884       A  single  archive  may  span  more than one file. It is suggested that
1885       implementations provide informative messages to the  user  on  standard
1886       error whenever the archive file is changed.
1887
1888       The -d option (do not create intermediate directories not listed in the
1889       archive) found in early proposals was originally provided as a  comple‐
1890       ment to the historic -d option of cpio.  It has been deleted.
1891
1892       The -s option in early proposals specified a subset of the substitution
1893       command from the ed utility. As there was no reason for only  a  subset
1894       to  be  supported,  the -s option is now compatible with the current ed
1895       specification. Since the delimiter can be any non-null  character,  the
1896       following usage with single spaces is valid:
1897
1898            pax -s " foo bar " ...
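
       The way the first character of the replacement string acts  as  the
       delimiter  can be sketched as follows. This Python fragment is a sim‐
       plification made for illustration: it does  not  handle  a  delimiter
       escaped  with  a backslash inside the expressions, and parse_replstr is
       a hypothetical name.

```python
def parse_replstr(replstr):
    """Split '<d>old<d>new<d>flags' on its own first character <d>."""
    delim = replstr[0]
    # Raises ValueError if fewer than three delimiters are present.
    old, new, flags = replstr[1:].split(delim, 2)
    return old, new, flags


print(parse_replstr(" foo bar "))
print(parse_replstr(",^//*usr//*,,"))
print(parse_replstr("/a/b/g"))
```

       With a space as the delimiter, " foo bar " yields  the  old  string
       foo, the new string bar, and no flags.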
1899
1900       The  -t  description  is  worded  so as to note that this may cause the
1901       access time update caused by some other activity  (which  occurs  while
1902       the file is being read) to be overwritten.
1903
1904       The  default  behavior of pax with regard to file modification times is
1905       the same as historical implementations of tar.  It is not the  histori‐
1906       cal behavior of cpio.
1907
1908       Because  the  -i  option uses /dev/tty, utilities without a controlling
1909       terminal are not able to use this option.
1910
1911       The -y option, found in early proposals, has  been  deleted  because  a
1912       line  containing a single period for the -i option has equivalent func‐
1913       tionality. The special lines for the -i option (a single period and the
1914       empty line) are historical practice in cpio.
1915
1916       In early drafts, a -e charmap option was included to increase portabil‐
1917       ity of files between systems using different coded character sets. This
1918       option  was omitted because it was apparent that consensus could not be
1919       formed for it. In this version, the use of UTF-8 should be an  adequate
1920       substitute.
1921
1922       The  -k  option  was  added to address international concerns about the
1923       dangers involved in the character set transformations  of  -e  (if  the
1924       target  character  set  were  different  from the source, the filenames
1925       might be transformed into names matching existing files) and  also  was
1926       made  more  general  to  protect files transferred between file systems
1927       with different {NAME_MAX} values (truncating a filename  on  a  smaller
1928       system  might  also inadvertently overwrite existing files). As stated,
1929       it prevents any overwriting, even if the target file is older than  the
1930       source.  This  version  adds  more granularity of options to solve this
       problem by introducing the -o invalid= option,  specifically  the  UTF-8
1932       action. (Note that an existing file that is named with a UTF-8 encoding
1933       is still subject to overwriting in this case. The -k option closes that
1934       loophole.)
1935
1936       Some  of the file characteristics referenced in this volume of IEEE Std
1937       1003.1-2001 might not be supported by some archive formats.  For  exam‐
1938       ple, neither the tar nor cpio formats contain the file access time. For
1939       this reason, the e specification character has been provided,  intended
1940       to  cause  all  file  characteristics  specified  in  the archive to be
1941       retained.
1942
1943       It is required that  extracted  directories,  by  default,  have  their
1944       access  and modification times and permissions set to the values speci‐
1945       fied in the archive. This has obvious problems in that the  directories
1946       are  almost certainly modified after being extracted and that directory
1947       permissions may not permit file creation. One possible solution  is  to
1948       create  directories with the mode specified in the archive, as modified
1949       by the umask of the user, with sufficient  permissions  to  allow  file
1950       creation. After all files have been extracted, pax would then reset the
1951       access and modification times and permissions as necessary.
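
       The possible solution outlined above can be sketched in Python.  This
       is  an  illustration under stated assumptions, not the behavior of any
       particular pax: directories are assumed to be listed  before  their
       contents, owner rwx permission is added temporarily, and the function
       names extract and current_umask are hypothetical.

```python
import os


def current_umask():
    mask = os.umask(0)   # the mask can only be read by setting it
    os.umask(mask)
    return mask


def extract(root, dirs, files):
    """dirs: (relpath, mode, mtime) tuples; files: (relpath, data) tuples."""
    mask = current_umask()
    pending = []
    for rel, mode, mtime in dirs:            # parents listed before children
        path = os.path.join(root, rel)
        # Archived mode plus owner rwx, so files can be created inside;
        # os.mkdir applies the umask itself.
        os.mkdir(path, (mode & 0o7777) | 0o700)
        pending.append((path, mode & ~mask & 0o7777, mtime))
    for rel, data in files:
        with open(os.path.join(root, rel), "wb") as out:
            out.write(data)
    # Second pass: restore archived permissions and times only after all
    # files exist, deepest directories first.
    for path, mode, mtime in reversed(pending):
        os.chmod(path, mode)
        os.utime(path, (mtime, mtime))
```

       Called as extract(target, [("d", 0o555, 1000000000)], [("d/f", b"x")]),
       the  sketch  creates  d/f even though the archived directory mode does
       not permit writing, then resets the mode and times of d.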
1952
1953       The list-mode formatting  description  borrows  heavily  from  the  one
1954       defined  by  the printf(1) utility. However, since there is no separate
1955       operand list to get conversion arguments, the format  was  extended  to
1956       allow  specifying  the  name  of the conversion argument as part of the
1957       conversion specification.
1958
1959       The T conversion specifier allows time fields to be displayed in any of
1960       the  date  formats.  Unlike  the ls(1) utility, pax does not adjust the
1961       format when the date is less than six months in the  past.  This  makes
1962       parsing the output more predictable.
1963
1964       The   D  conversion  specifier  handles  the  ability  to  display  the
1965       major/minor or file size, as with ls(1), by using %-8(size)D.
1966
1967       The L conversion specifier handles the ls display for symbolic links.
1968
1969       Conversion specifiers were added to generate existing known types  used
1970       for ls(1).
1971
1972
1973   pax Interchange Format
1974       The  new  POSIX data interchange format was developed primarily to sat‐
1975       isfy international concerns that the ustar and  cpio  formats  did  not
1976       provide for file, user, and group names encoded in characters outside a
1977       subset of the ISO/IEC 646:1991 standard. The standard developers  real‐
1978       ized  that this new POSIX data interchange format should be very exten‐
1979       sible because there were other requirements they foresaw  in  the  near
1980       future:
1981
1982       ·      Support international character encodings and locale information
1983
1984       ·      Support security information (ACLs, and so on)
1985
1986       ·      Support future file types, such as realtime or contiguous files
1987
1988       ·      Include data areas for implementation use
1989
1990       ·      Support  systems  with words larger than 32 bits and timers with
1991              subsecond granularity
1992
1993       The following were not goals for this format because these  are  better
1994       handled  by separate utilities or are inappropriate for a portable for‐
1995       mat:
1996
1997       ·      Encryption
1998
1999       ·      Compression
2000
2001       ·      Data translation between locales and codesets
2002
2003       ·      inode storage
2004
2005       The format chosen to support the goals is an  extension  of  the  ustar
2006       format.  Of the two formats previously available, only the ustar format
2007       was selected for extensions because:
2008
2009       ·      It was easier to extend in an upwards-compatible way. It offered
2010              version  flags and header block type fields with room for future
2011              standardization. The cpio format, while possessing a more flexi‐
2012              ble  file  naming  methodology,  could  not  be extended without
2013              breaking some theoretical implementation or using a dummy  file‐
2014              name that could be a legitimate filename.
2015
2016       ·      Industry  experience  since  the  original  "tar wars" fought in
2017              developing the ISO POSIX-1 standard has clearly been in favor of
2018              the  ustar  format, which is generally the default output format
2019              selected for pax implementations on new systems.
2020
2021       The new format was designed with one additional goal in  mind:  reason‐
2022       able  behavior when an older tar or pax utility happened to read an ar‐
2023       chive. Since the POSIX.1-1990 standard mandated that a  "format-reading
2024       utility"  had  to  treat unrecognized typeflag values as regular files,
2025       this allowed the format to include all the extended  information  in  a
2026       pseudo-regular  file  that  preceded each real file. An option is given
2027       that allows the archive creator to set up reasonable  names  for  these
2028       files  on  the  older  systems.  Also, the normative text suggests that
2029       reasonable file access values be used for this ustar header block. Mak‐
2030       ing these header files inaccessible for convenient reading and deleting
2031       would not be reasonable. File permissions of 600 or 700 are suggested.
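
       The pseudo-regular file that precedes a real file can be observed  by
       scanning  header blocks for their typeflag. The following sketch uses
       Python's standard tarfile module as  a  stand-in  writer  (it  is  not
       pax);  it assumes the ustar block layout, with the size field at off‐
       set 124 and the typeflag at offset 156 of each 512-octet header.  The
       function names are illustrative.

```python
import io
import tarfile


def typeflags(archive_bytes):
    """Return the typeflag character of each header block, in order."""
    flags, pos = [], 0
    while pos + 512 <= len(archive_bytes):
        block = archive_bytes[pos:pos + 512]
        if block == b"\0" * 512:          # end-of-archive marker
            break
        flags.append(block[156:157].decode("ascii"))
        size = int(block[124:136].rstrip(b"\0 ") or b"0", 8)
        # Skip the header plus the data, rounded up to whole blocks.
        pos += 512 + (size + 511) // 512 * 512
    return flags


def build_pax_archive(name):
    """Write one empty regular file into an in-memory pax archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w",
                      format=tarfile.PAX_FORMAT) as tar:
        tar.addfile(tarfile.TarInfo(name))
    return buf.getvalue()


print(typeflags(build_pax_archive("plain.txt")))
print(typeflags(build_pax_archive("n\u00e4me.txt")))   # non-ASCII name
```

       A reader that does not know typeflag x would, as described above, see
       the extended header as an ordinary file ahead of the real member.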
2032
2033       The ustar typeflag field was used to accommodate the  additional  func‐
2034       tionality  of  the  new format rather than magic or version because the
2035       POSIX.1-1990 standard (and, by reference, the previous version of pax),
2036       mandated the behavior of the format-reading utility when it encountered
2037       an unknown typeflag, but was silent about the other two fields.
2038
2039       Early proposals of the first revision to IEEE Std 1003.1-2001 contained
2040       a  proposed  archive  format  that  was based on compatibility with the
2041       standard for tape files (ISO 1001, similar to the format used  histori‐
2042       cally  on  many  mainframes  and minicomputers). This format was overly
2043       complex  and  required  considerable  overhead  in  volume  and  header
2044       records. Furthermore, the standard developers felt that it would not be
2045       acceptable to the community  of  POSIX  developers,  so  it  was  later
2046       changed  to  be a format more closely related to historical practice on
2047       POSIX systems.
2048
2049       The prefix and name split of pathnames in ustar  was  replaced  by  the
2050       single path extended header record for simplicity.
2051
2052       The concept of a global extended header (typeflag g) was controversial.
2053       If this were applied to an archive being recorded on magnetic  tape,  a
2054       few  unreadable  blocks at the beginning of the tape could be a serious
2055       problem; a utility attempting to extract as many files as possible from
2056       a damaged archive could lose a large percentage of file header informa‐
2057       tion in this case. However, if the archive were on a  reliable  medium,
2058       such as a CD-ROM, the global extended header offers considerable poten‐
2059       tial size reductions by eliminating redundant  information.  Thus,  the
2060       text  warns  against  using  the global method for unreliable media and
2061       provides a method for implanting global  information  in  the  extended
2062       header for each file, rather than in the typeflag g records.
2063
2064       No  facility  for  data translation or filtering on a per-file basis is
2065       included because the standard developers could not invent an  interface
2066       that  would  allow  this  in  an efficient manner. If a filter, such as
2067       encryption or compression, is to be applied to all  the  files,  it  is
2068       more  efficient  to  apply the filter to the entire archive as a single
2069       file. The standard developers considered interfaces that would invoke a
2070       shell  script  for  each file going into or out of the archive, but the
2071       system overhead in this approach was considered to be too high.
2072
2073       One such approach would be to have filter= records that give a pathname
2074       for  an  executable.  When the program is invoked, the file and archive
2075       would be open for standard input/output and all the header fields would
2076       be  available  as  environment variables or command-line arguments. The
2077       standard developers did discuss such schemes,  but  they  were  omitted
2078       from  IEEE  Std  1003.1-2001  due to concerns about excessive overhead.
2079       Also, the program itself would need to be in the archive if it were  to
2080       be used portably.
2081
2082       There  is  currently  no  portable  means  of identifying the character
2083       set(s) used for a file in the file system. Therefore, pax has not  been
2084       given  a mechanism to generate charset records automatically.  The only
2085       portable means of doing this is for the user to write the archive using
2086       the -o charset=string command line option. This assumes that all of the
2087       files in the  archive  use  the  same  encoding.  The  "implementation-
2088       defined"  text  is included to allow for a system that can identify the
2089       encodings used for each of its files.
2090
2091       The table of standards that accompanies the charset record  description
2092       is  acknowledged to be very limited. Only a limited number of character
2093       set standards is reasonable for maximal interchange. Any character  set
2094       is,  of  course,  possible  by  prior  agreement. It was suggested that
2095       EBCDIC be listed, but it was omitted because it is  not  defined  by  a
2096       formal  standard. Formal standards, and then only those with reasonably
2097       large followings, can be included here, simply as a matter  of  practi‐
2098       cality. The <value>s represent names of officially registered character
2099       sets in the format required by the ISO 2375:1985 standard.
2100
2101       The normal comma or <blank>-separated list rules are  not  followed  in
2102       the  case  of  keyword  options  to  allow ease of argument parsing for
2103       getopts.
2104
2105       Further information on character encodings is in pax Archive  Character
2106       Set Encoding/Decoding.
2107
2108       The  standard  developers  have  reserved keyword name space for vendor
2109       extensions. It is suggested that the format to be used is:
2110
2111           VENDOR.keyword
2112
2113       where VENDOR is the name of the vendor or organization in all uppercase
2114       letters.  It is further suggested that the keyword following the period
2115       be named differently than any of the standard keywords so that it could
2116       be  used  for  future  standardization, if appropriate, by omitting the
2117       VENDOR prefix.
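
       The naming suggestion above can be checked mechanically. The
       following is a minimal sketch, not part of any pax implementation:
       the function name check_vendor_keyword is hypothetical, and the set
       of standard keywords is an illustrative subset assumed for this
       example only.

```python
import re

# Illustrative subset of standard extended header keywords (an assumption
# of this sketch, not an authoritative list).
STANDARD_KEYWORDS = {
    "atime", "charset", "comment", "gid", "gname",
    "linkpath", "mtime", "path", "size", "uid", "uname",
}

# VENDOR part in all uppercase letters, then a period, then the keyword.
_VENDOR_RE = re.compile(r"^([A-Z]+)\.(.+)$")


def check_vendor_keyword(keyword: str) -> bool:
    """Return True if keyword follows the suggested VENDOR.keyword form
    and the part after the period does not reuse a standard keyword."""
    match = _VENDOR_RE.match(keyword)
    if match is None:
        return False
    return match.group(2) not in STANDARD_KEYWORDS
```

       With these assumptions, VENDOR.devmajor is accepted, while
       vendor.devmajor (lowercase prefix) and VENDOR.path (reuses a
       standard keyword) are rejected.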
2118
2119       The <length> field in the extended header record was included  to  make
2120       it  simpler  to  step through the records, even if a record contains an
2121       unknown format (to a particular pax) with complex interactions of  spe‐
2122       cial  characters.  It also provides a minor integrity checkpoint within
2123       the records to aid a program attempting to recover files from a damaged
2124       archive.
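
       To illustrate why the <length> field makes stepping through records
       simple, the sketch below builds and walks records of the form
       "<length> <keyword>=<value>\n", where <length> counts every octet
       of the record including its own digits. The helper names
       make_record and iter_records are hypothetical and exist only for
       this example.

```python
def make_record(keyword: str, value: str) -> bytes:
    """Build one record b"<length> <keyword>=<value>\\n".

    <length> is the decimal size of the whole record in octets,
    including the length digits, the space, and the trailing newline.
    """
    body = f" {keyword}={value}\n".encode("utf-8")
    length = len(body)
    # The digit count of <length> is part of <length>, so iterate until
    # the value is consistent with its own width.
    while len(str(length)) + len(body) != length:
        length = len(str(length)) + len(body)
    return str(length).encode("ascii") + body


def iter_records(data: bytes):
    """Yield (keyword, value) pairs, advancing by <length> alone."""
    pos = 0
    while pos < len(data):
        space = data.index(b" ", pos)
        length = int(data[pos:space])
        record = data[pos:pos + length]
        # Minor integrity checkpoint: the record must be complete and
        # must end with a newline.
        if len(record) != length or not record.endswith(b"\n"):
            raise ValueError(f"damaged record at offset {pos}")
        keyword, _, value = record[space - pos + 1:-1].partition(b"=")
        yield keyword.decode("utf-8"), value.decode("utf-8")
        pos += length
```

       Because the reader advances by <length>, a value that itself
       contains newlines or equals signs does not confuse the walk, and a
       truncated record is detected rather than silently misparsed.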
2125
2126       There  are  no  extended  header  versions of the devmajor and devminor
2127       fields because the unspecified format ustar header field should be suf‐
2128       ficient.  If  they  are not, vendor-specific extended keywords (such as
2129       VENDOR.devmajor) should be used.
2130
2131       Device and i-number labeling of files was not adopted from cpio;  files
2132       are interchanged strictly on a symbolic name basis, as in ustar.
2133
2134       Just  as  with  the  ustar format descriptions, the new format makes no
2135       special arrangements for multi-volume archives. Each of the pax archive
2136       types  is  assumed  to be inside a single POSIX file and splitting that
2137       file over multiple volumes (diskettes, tape  cartridges,  and  so  on),
2138       processing  their  labels, and mounting each in the proper sequence are
2139       considered to  be  implementation  details  that  cannot  be  described
2140       portably.
2141
2142       The  pax  format  is intended for interchange, not only for backup on a
2143       single (family of) systems. It is not as densely  packed  as  might  be
2144       possible for backup:
2145
2146       ·      It  contains information as coded characters that could be coded
2147              in binary.
2148
2149       ·      It identifies extended records with name fields  that  could  be
2150              omitted in favor of a fixed-field layout.
2151
2152       ·      It translates names into a portable character set and identifies
2153              locale-related information, both of which are probably  unneces‐
2154              sary for backup.
2155
2156       The  requirements  on  restoring from an archive are slightly different
2157       from the historical wording, allowing for non-monolithic  privilege  to
2158       bring  forward  as  much as possible. In particular, attributes such as
2159       "high performance file" might be broadly but  not  universally  granted
2160       while  set-user-ID  or chown(2) might be much more restricted. There is
2161       no implication in IEEE Std 1003.1-2001 that the security information be
2162       honored  after  it  is restored to the file hierarchy, in spite of what
2163       might be improperly inferred by the silence on that topic.  That  is  a
2164       topic for another standard.
2165
2166       Links  are recorded in the fashion described here because a link can be
2167       to any file type. It is desirable in general to be able to restore part
2168       of an archive selectively and restore all of those files completely. If
2169       the data is not associated with each link, it is  not  possible  to  do
2170       this.  However,  the data associated with a file can be large, and when
2171       selective restoration is not needed, this can be a significant  burden.
2172       The  archive  is  structured so that files that have no associated data
2173       can always be restored by the name of any link, and
2174       the  user  may  choose whether data is recorded with each instance of a
2175       file that contains data. The format permits mixing  of  both  types  of
2176       links  in a single archive; this can be done for special needs, and pax
2177       is expected to interpret such archives on input properly,  despite  the
2178       fact  that  there  is no pax option that would force this mixed case on
2179       output. (When -o linkdata is used, the output must contain  the  dupli‐
2180       cate data, but the implementation is free to include it or omit it when
2181       -o linkdata is not used.)
2182
2183       The time values are included  as  extended  header  records  for  those
2184       implementations  needing  more  than the eleven octal digits allowed by
2185       the ustar format. Portable file timestamps cannot be negative.  If  pax
2186       encounters  a  file with a negative timestamp in copy or write mode, it
2187       can reject the file, substitute a non-negative timestamp, or generate a
2188       non-portable timestamp with a leading '-'. Even though some implementa‐
2189       tions can support finer file-time granularities than seconds, the  nor‐
2190       mative  text  requires support only for seconds since the Epoch because
2191       the ISO POSIX-1  standard  states  them  that  way.  The  ustar  format
2192       includes  only mtime; the new format adds atime and ctime for symmetry.
2193       The atime access time restored to the file system will be  affected  by
2194       the -p a and -p e options. The ctime creation time (actually inode mod‐
2195       ification time) is described with "appropriate privilege"  so  that  it
2196       can  be ignored when writing to the file system. POSIX does not provide
2197       a portable means to change file creation time. Nothing is  intended  to
2198       prevent a non-portable implementation of pax from restoring the value.
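
       The three choices for a negative timestamp, and the eleven octal
       digit limit, can be sketched as follows. This is an illustration
       under stated assumptions, not the behavior of any particular pax:
       the function name encode_mtime and the policy names "reject",
       "substitute", and "keep" are hypothetical, and the sketch handles
       whole seconds only.

```python
USTAR_MTIME_MAX = 8**11 - 1  # largest value that fits in eleven octal digits


def encode_mtime(seconds: int, on_negative: str = "reject"):
    """Return (ustar_field, extended_value) for a timestamp in seconds.

    ustar_field is an eleven-digit octal string, or None if the value
    does not fit. extended_value is the decimal string for an extended
    header record, or None if no extended record is needed.
    """
    if seconds < 0:
        if on_negative == "reject":
            raise ValueError("negative timestamps are not portable")
        if on_negative == "substitute":
            seconds = 0  # replace with a non-negative timestamp
        elif on_negative == "keep":
            # Non-portable: decimal value with a leading '-'.
            return None, str(seconds)
        else:
            raise ValueError(f"unknown policy: {on_negative!r}")
    if seconds <= USTAR_MTIME_MAX:
        return f"{seconds:011o}", None
    return None, str(seconds)
```

       A real implementation would still have to place some value in the
       ustar header when an extended record is used; that detail is left
       out of the sketch.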
2199
2200       The  gid,  size, and uid extended header records were included to allow
2201       expansion beyond the sizes specified in the  regular  tar  header.  New
2202       file  system  architectures are emerging that will exhaust the 12-digit
2203       size field. There are probably not many systems requiring more  than  8
2204       digits  for  user  and  group  IDs, but the extended header values were
2205       included for completeness, allowing overrides for all  of  the  decimal
2206       values in the tar header.
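
       As a rough illustration of when the gid, size, and uid records
       become necessary, the sketch below reports which values overflow
       the regular header. It assumes that the size field is 12 octets
       wide and the uid and gid fields 8 octets wide, with one octet of
       each reserved for a terminator, leaving 11 and 7 usable octal
       digits. The name needed_overrides and the dictionary layout are
       hypothetical.

```python
# Usable octal digits per field (assumption: one octet of each header
# field is taken by a terminating NUL or space).
USABLE_OCTAL_DIGITS = {"size": 11, "uid": 7, "gid": 7}


def needed_overrides(meta: dict) -> dict:
    """Return the extended header records needed for values that do not
    fit in the regular header, as {keyword: decimal string}."""
    overrides = {}
    for key, digits in USABLE_OCTAL_DIGITS.items():
        if meta[key] >= 8**digits:
            overrides[key] = str(meta[key])
    return overrides
```

       Under these assumptions a file of exactly 8**11 octets (8
       gigabytes) is the first size that requires a size record.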
2207
2208       The  standard  developers intended to describe the effective results of
2209       pax with regard to file ownerships and permissions; implementations are
2210       not  restricted  in  timing or sequencing the restoration of such, pro‐
2211       vided the results are as specified.
2212
2213       Much of the text describing the  extended  headers  refers  to  use  in
2214       "write  or  copy modes". The copy mode references are due to the norma‐
2215       tive text: "The effect of the copy shall be as if the copied files were
2216       written  to an archive file and then subsequently extracted ...". There
2217       is certainly no way to test whether  pax  is  actually  generating  the
2218       extended headers in copy mode, but the effects must be as if it had.
2219
2220
2221   pax Archive Character Set Encoding/Decoding
2222       There  is  a need to exchange archives of files between systems of dif‐
2223       ferent native codesets. Filenames, group names, and user names must  be
2224       preserved to the fullest extent possible when an archive is read on the
2225       receiving platform. Translation of the contents of files is not  within
2226       the scope of the pax utility.
2227
2228       There will also be the need to represent characters that are not avail‐
2229       able on the receiving platform. These unsupported characters cannot  be
2230       automatically  folded  to the local set of characters due to the chance
2231       of collisions. This could result in overwriting previously extracted
2232       files from the archive or pre-existing files on the system.
2233
2234       For  these reasons, the codeset used to represent characters within the
2235       extended header records of the pax archive must be sufficiently rich to
2236       handle  all commonly used character sets. The fields requiring transla‐
2237       tion include, at a minimum, filenames, user  names,  group  names,  and
2238       link  pathnames.  Implementations  may  wish to have localized extended
2239       keywords that use non-portable characters.
2240
2241       The standard developers considered the following options:
2242
2243       ·      The archive creator  specifies  the  well-defined  name  of  the
2244              source  codeset.  The  receiver  must then recognize the codeset
2245              name and perform the appropriate translations to the destination
2246              codeset.
2247
2248       ·      The  archive  creator  includes within the archive the character
2249              mapping table for the source codeset  used  to  encode  extended
2250              header  records.  The receiver must then read the character map‐
2251              ping table and perform the appropriate translations to the  des‐
2252              tination codeset.
2253
2254       ·      The  archive  creator  translates the extended header records in
2255              the source codeset into a canonical form. The receiver must then
2256              perform the appropriate translations to the destination codeset.
2257
2258       The approach that incorporates the name of the source codeset poses the
2259       problem of codeset name registration, and makes the archive useless  to
2260       pax archive decoders that do not recognize that codeset.
2261
2262       Because  parts  of an archive may be corrupted, the standard developers
2263       felt that including the character map of the  source  codeset  was  too
2264       fragile.  The loss of this one key component could result in making the
2265       entire archive useless. (The difference between  this  and  the  global
2266       extended header decision was that the latter has a workaround,
2267       duplicating extended header records on unreliable media, but this would be too
2268       burdensome for large character set maps.)
2269
2270       Both  of  the  above approaches also put an undue burden on the pax ar‐
2271       chive receiver to handle the cross-product of all source  and  destina‐
2272       tion codesets.
2273
2274       To  simplify  the  translation from the source codeset to the canonical
2275       form and from the canonical form to the destination codeset, the  stan‐
2276       dard  developers  decided  that the internal representation should be a
2277       stateless encoding. A stateless encoding is one  where  each  codepoint
2278       has the same meaning, without regard to the decoder being in a specific
2279       state. An example of a stateful encoding would be the  Japanese  Shift-
2280       JIS;  an  example of a stateless encoding would be the ISO/IEC 646:1991
2281       standard (equivalent to 7-bit ASCII).
2282
2283       For these reasons, the standard developers decided to adopt a canonical
2284       format for the representation of file information strings. The obvious,
2285       well-endorsed candidate is the ISO/IEC 10646-1:2000 standard (based  in
2286       part on Unicode), which can be used to represent the characters of vir‐
2287       tually all standardized character sets. The  standard  developers  ini‐
2288       tially  agreed  upon using UCS2 (16-bit Unicode) as the internal repre‐
2289       sentation. This repertoire of characters provides a  sufficiently  rich
2290       set to represent all commonly-used codesets.
2291
2292       However,  the  standard developers found that the 16-bit Unicode repre‐
2293       sentation had some problems. It forced the issue of standardizing  byte
2294       ordering.  The 2-byte length of each character made the extended header
2295       records twice as long for the case of strings coded entirely from  his‐
2296       torical  7-bit  ASCII. For these reasons, the standard developers chose
2297       the UTF-8 defined in the ISO/IEC 10646-1:2000 standard. This multi-byte
2298       representation encodes UCS2 or UCS4 characters reliably and determinis‐
2299       tically, eliminating the need for a canonical byte ordering.  In  addi‐
2300       tion,  NUL octets and other characters possibly confusing to POSIX file
2301       systems do not appear, except to represent themselves. It was  realized
2302       that  certain  national codesets take up more space after the encoding,
2303       due to their placement within the UCS range; it was felt that the  use‐
2304       fulness of the encoding of the names outweighs the disadvantage of size
2305       increase for file, user, and group names.
2306
2307       The encoding of UTF-8 is as follows:
2308
2309       UCS4 Hex Encoding   UTF-8 Binary Encoding
2310       00000000-0000007F   0xxxxxxx
2311       00000080-000007FF   110xxxxx 10xxxxxx
2312       00000800-0000FFFF   1110xxxx 10xxxxxx 10xxxxxx
2313       00010000-001FFFFF   11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
2314       00200000-03FFFFFF   111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
2315       04000000-7FFFFFFF   1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
2316
2317       where each 'x' represents a bit value from the character  being  trans‐
2318       lated.
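
       The table can be turned into code almost line by line. The sketch
       below is an illustration of the encoding as tabulated above,
       including the five-octet and six-octet forms; the function name
       utf8_encode is hypothetical.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one UCS4 value according to the table above."""
    if cp < 0 or cp > 0x7FFFFFFF:
        raise ValueError("value outside the UCS4 range of the table")
    if cp <= 0x7F:
        return bytes([cp])  # 0xxxxxxx
    # (upper bound of range, lead-octet marker, continuation octets)
    for limit, lead, ncont in (
        (0x7FF, 0xC0, 1),
        (0xFFFF, 0xE0, 2),
        (0x1FFFFF, 0xF0, 3),
        (0x3FFFFFF, 0xF8, 4),
        (0x7FFFFFFF, 0xFC, 5),
    ):
        if cp <= limit:
            break
    out = []
    for _ in range(ncont):
        out.append(0x80 | (cp & 0x3F))  # 10xxxxxx, lowest six bits first
        cp >>= 6
    out.append(lead | cp)  # remaining high bits go into the lead octet
    return bytes(reversed(out))
```

       For values up to U+10FFFF (excluding surrogates) the result
       matches Python's built-in str.encode("utf-8"). Later definitions of
       UTF-8, such as RFC 3629, restrict the encoding to at most four
       octets, so the last two table rows are historical.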
2319
2320
2321   ustar Interchange Format
2322       The description of the ustar format reflects numerous enhancements over
2323       pre-1988 versions of the historical tar  utility.  The  goal  of  these
2324       changes  was  not  only to provide the functional enhancements desired,
2325       but also to retain compatibility between new  and  old  versions.  This
2326       compatibility has been retained. Archives written using the old archive
2327       format are compatible with the new format.
2328
2329       Implementors should be aware that the  previous  file  format  did  not
2330       include  a  mechanism to archive directory type files. For this reason,
2331       the convention of using a filename ending with  slash  was  adopted  to
2332       specify a directory on the archive.
2333
2334       The total size of the name and prefix fields has been set to meet the
2335       minimum requirements for {PATH_MAX}. If a pathname will fit within the
2336       name field, it is recommended that the pathname be stored there without
2337       the use of the prefix field. Although the name field is known to be too
2338       small  to  contain  {PATH_MAX} characters, the value was not changed in
2339       this version of the archive file format to retain backwards-compatibil‐
2340       ity,  and  instead the prefix was introduced. Also, because of the ear‐
2341       lier version of the format, there is no way to remove  the  restriction
2342       on  the  linkname  field being limited in size to just that of the name
2343       field.
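
       The recommendation to prefer the name field, and to fall back to
       the prefix only when necessary, can be sketched as follows. The
       sketch assumes the ustar field sizes of 100 octets for name and 155
       octets for prefix, and that the split must fall on a slash that is
       then not stored. The function name split_ustar_path is
       hypothetical.

```python
NAME_SIZE = 100    # assumed size of the ustar name field, in octets
PREFIX_SIZE = 155  # assumed size of the ustar prefix field, in octets


def split_ustar_path(path: bytes):
    """Return (prefix, name) for a pathname, or None if it cannot be
    represented in the two fields."""
    if len(path) <= NAME_SIZE:
        return b"", path  # fits in name alone; leave prefix empty
    for i, octet in enumerate(path):
        if octet != ord("/"):
            continue
        name_len = len(path) - i - 1
        if 0 < i <= PREFIX_SIZE and 0 < name_len <= NAME_SIZE:
            return path[:i], path[i + 1:]
    return None
```

       A result of None corresponds to a pathname that needs another
       mechanism, such as an extended header record, to be archived.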
2344
2345       The size field is required  to  be  meaningful  in  all  implementation
2346       extensions,  although  it  could  be zero. This is required so that the
2347       data blocks can always be properly counted.
2348
2349       It is suggested that if device special files  need  to  be  represented
2350       that  cannot  be  represented  in  the standard format, that one of the
2351       extension types (A-Z) be used, and that the additional information  for
2352       the  special  file  be represented as data and be reflected in the size
2353       field.
2354
2355       Attempting to restore a special file type, where  it  is  converted  to
2356       ordinary data and conflicts with an existing filename, need not be spe‐
2357       cially detected by the utility. If run as an ordinary user, pax  should
2358       not  be able to overwrite the entries in, for example, /dev in any case
2359       (whether the file is converted to another type or not).  If  run  as  a
2360       privileged user, it should be able to do so, and it would be considered
2361       a bug if it did not. The same is true of ordinary data files and  simi‐
2362       larly  named special files; it is impossible to anticipate the needs of
2363       the user (who could really intend to overwrite the file), so the behav‐
2364       ior should be predictable (and thus regular) and rely on the protection
2365       system as required.
2366
2367       The value 7 in the typeflag field is intended to define how  contiguous
2368       files  can be stored in a ustar archive.  IEEE Std 1003.1-2001 does not
2369       require the contiguous file extension, but does define a  standard  way
2370       of  archiving  such  files so that all conforming systems can interpret
2371       these file types in a meaningful and consistent  manner.  On  a  system
2372       that  does  not  support extended file types, the pax utility should do
2373       the best it can with the file and go on to the next.
2374
2375       The file protection modes are those conventionally used  by  the  ls(1)
2376       utility.  This is extended beyond the usage in the ISO POSIX-2 standard
2377       to support the "shared text" or "sticky" bit. It is intended  that  the
2378       conformance  document should not document anything beyond the existence
2379       of and support of such a mode.   Further  extensions  are  expected  to
2380       these  bits,  particularly  with  overloading  the set-user-ID and set-
2381       group-ID flags.
2382
2383
2384   cpio Interchange Format
2385       The reference to appropriate privilege in the cpio format refers to  an
2386       error  on  standard  output;  the ustar format does not make comparable
2387       statements.
2388
2389       The model for this format was the historical  System  V  cpio  -c  data
2390       interchange  format.  This  model documents the portable version of the
2391       cpio format and not the binary  version.  It  has  the  flexibility  to
2392       transfer data of any type described within IEEE Std 1003.1-2001, yet is
2393       extensible to transfer data types specific to  extensions  beyond  IEEE
2394       Std  1003.1-2001  (for example, contiguous files). Because it describes
2395       existing practice, there is no question of maintaining upwards-compati‐
2396       bility.
2397
2398
2399   cpio Header
2400       There  has  been  some  concern that the size of the c_ino field of the
2401       header is too small to handle those systems that have very large  inode
2402       numbers.  However,  the c_ino field in the header is used strictly as a
2403       hard-link resolution mechanism for archives. It is not necessarily  the
2404       same  value  as the inode number of the file in the location from which
2405       that file is extracted.
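
       Used purely as a link-resolution key, c_ino only has to be
       consistent within one archive. The sketch below groups entries
       that share the same key; it assumes, following the cpio header
       description, that the pair of c_dev and c_ino identifies a file
       within the archive. The tuple layout and the name group_links are
       hypothetical.

```python
from collections import defaultdict


def group_links(entries):
    """Group pathnames that refer to the same archived file.

    entries is an iterable of (c_dev, c_ino, pathname) tuples, a
    hypothetical in-memory form of parsed cpio headers. Only groups
    with more than one name (that is, hard links) are returned.
    """
    groups = defaultdict(list)
    for c_dev, c_ino, pathname in entries:
        groups[(c_dev, c_ino)].append(pathname)
    return [names for names in groups.values() if len(names) > 1]
```

       Nothing in this grouping depends on the values matching real inode
       numbers on the source or destination file system.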
2406
2407       The name c_magic is based on historical usage.
2408
2409
2410   cpio Filename
2411       For most historical implementations of  the  cpio  utility,  {PATH_MAX}
2412       octets can be used to describe the pathname without the addition of any
2413       other header fields (the  NUL  character  would  be  included  in  this
2414       count).   {PATH_MAX} is the minimum value for pathname size, documented
2415       as 256 bytes. However, an implementation may use c_namesize  to  deter‐
2416       mine the exact length of the pathname.  With the current description of
2417       the <cpio.h> header, this pathname size can be as  large  as  a  number
2418       that is described in six octal digits.
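
       The six octal digit limit works out to 0o777777, or 262143 octets.
       The following sketch formats a c_namesize value, assuming that the
       count includes the terminating NUL as noted above; the function
       name c_namesize_field is hypothetical.

```python
C_NAMESIZE_MAX = 0o777777  # largest value expressible in six octal digits


def c_namesize_field(pathname: bytes) -> str:
    """Return the six-digit octal c_namesize value for a pathname."""
    size = len(pathname) + 1  # the count includes the terminating NUL
    if size > C_NAMESIZE_MAX:
        raise ValueError("pathname too long for six octal digits")
    return f"{size:06o}"
```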
2419
2420       Two  values are documented under the c_mode field values to provide for
2421       extensibility for known file types:
2422
2423       0110 000
2424              Reserved for contiguous files. The implementation may treat  the
2425              rest of the information for this archive like a regular file. If
2426              this file type is undefined, the implementation may  create  the
2427              file as a regular file.
2428
2429       This  provides  for extensibility of the cpio format while allowing for
2430       the ability to read old archives. Files of an unknown type may be  read
2431       as  "regular  files" on some implementations. On a system that does not
2432       support extended file types, the pax utility should do the best it  can
2433       with the file and go on to the next.
2434
2435

FUTURE DIRECTIONS

2437       None.
2438
2439

End of informative sections.

2441_________________________________________________________________
2442
2443

SEE ALSO

2445       Shell Command Language, cp(1), ed(1), getopts(1), ls(1), printf(3), the
2446       Base Definitions volume of IEEE Std 1003.1-2001, <cpio.h>,  the  System
2447       Interfaces   volume   of  IEEE  Std  1003.1-2001,  chown(2),  creat(2),
2448       mkdir(2), mkfifo(3), stat(2), utime(2), write(2).
2449
2450

CHANGE HISTORY

2452       First released in Issue 4.
2453
2454
2455   Issue 5
2456       A note is added to the APPLICATION USAGE indicating that the  cpio  and
2457       tar formats can only support files up to 8 gigabytes in size.
2458
2459
2460   Issue 6
2461       The pax utility is aligned with the IEEE P1003.2b draft standard:
2462
2463       ·      Support  has  been  added  for symbolic links in the options and
2464              interchange formats.
2465
2466       ·      A new format has been devised, based on extensions to ustar.
2467
2468       ·      References to the "extended" tar and cpio formats  derived  from
2469              the  POSIX.1-1990  standard  have  been  changed  to  remove the
2470              "extended" adjective because this could cause confusion with the
2471              extended  tar  header added in this revision. (All references to
2472              tar are actually to ustar.)
2473
2474       The TZ entry is added to the ENVIRONMENT VARIABLES section.
2475
2476       IEEE PASC  Interpretation  1003.2  #168  is  applied,  clarifying  that
2477       mkdir(2) and mkfifo(3) calls can ignore an [EEXIST] error when extract‐
2478       ing an archive.
2479
2480       IEEE  PASC  Interpretation  1003.2  #180  is  applied,  clarifying  how
2481       extracted files are created when in read mode.
2482
2483       IEEE  PASC  Interpretation  1003.2  #181  is  applied,  clarifying  the
2484       description of the -t option.
2485
2486       IEEE PASC Interpretation 1003.2 #195 is applied.
2487
2488       IEEE PASC Interpretation 1003.2 #206 is applied,  clarifying  the  han‐
2489       dling of links for the -H, -L, and -l options.
2490
2491       IEEE  Std 1003.1-2001/Cor 1-2002, item XCU/TC1/D6/35 is applied, adding
2492       the process ID of the pax process into certain fields. This change pro‐
2493       vides  a  method  for  the  implementation  to  ensure  that  different
2494       instances of pax extracting a file named /a/b/foo will not collide when
2495       processing the extended header information associated with foo.
2496
2497       IEEE  Std 1003.1-2001/Cor 1-2002, item XCU/TC1/D6/36 is applied, chang‐
2498       ing -x B to -x pax in the OPTIONS section.
2499
2500       IEEE Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/20 is applied,  updat‐
2501       ing the SYNOPSIS to be consistent with the normative text.
2502
2503       IEEE  Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/21 is applied, updat‐
2504       ing the DESCRIPTION to describe the behavior when files  to  be  linked
2505       are  symbolic  links and the system is not capable of making hard links
2506       to symbolic links.
2507
2508       IEEE Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/22 is applied,  updat‐
2509       ing  the  OPTIONS  section  to  describe  the behavior for how multiple
2510       options are to be handled.
2511
2512       IEEE Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/23 is applied,  updat‐
2513       ing the write option within the OPTIONS section.
2514
2515       IEEE  Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/24 is applied, adding
2516       a paragraph into the OPTIONS section that states that  specifying  more
2517       than  one  of the mutually-exclusive options (-H and -L) is not consid‐
2518       ered an error and that the last option  specified  will  determine  the
2519       behavior of the utility.
2520
2521       IEEE  Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/25 is applied, remov‐
2522       ing the ctime paragraph within the EXTENDED DESCRIPTION.   There  is  a
2523       contradiction  in  the  definition  of  the  ctime  keyword for the pax
2524       extended header, in that the st_ctime member of the stat structure does
2525       not refer to a file creation time. No field in the standard stat struc‐
2526       ture from <sys/stat.h> includes a file creation time.
2527
2528       IEEE Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/26 is applied,  making
2529       it clear that typeflag 1 (ustar Interchange Format) applies not
2530       only to files that are hard-linked, but also to files that are  aliased
2531       via symlinks.
2532
2533       IEEE  Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/27 is applied, clari‐
2534       fying the cpio c_nlink field.
2535
2536       End of quoted text from the POSIX.1-2001 standard.
2537

OTHER OPTIONS

2539       The following other options are implemented as extensions to the POSIX
2540       standard. Note that some other non-POSIX options are mentioned in the
2541       -help and -xhelp output; these are also supported in spax(1) and are
2542       well described in the star(1) manual page.
2543
2544       -help  Prints  a  summary of the most important options for spax(1) and
2545              exits.
2546
2547       -xhelp Prints a summary of the less important options for  spax(1)  and
2548              exits.
2549
2550       -version
2551              Prints the spax version number string and exits.
2552
2553       -do-statistics
2554              Prints statistics messages at the end of a spax(1) run.
2555
2556

EXAMPLES

ENVIRONMENT

FILES

SEE ALSO

DIAGNOSTICS

NOTES

2563       The Institute of Electrical and Electronics Engineers and The Open
2564       Group have given us permission to reprint portions of their
2565       documentation. In the following statement, the phrase ``this text'' refers to
2566       portions of the system documentation.
2567
2568       Portions of this text are reprinted and reproduced in  electronic  form
2569       in  the  sfind manual, from IEEE Std 1003.1, 2004 Edition, Standard for
2570       Information Technology -- Portable Operating System Interface  (POSIX),
2571       The  Open Group Base Specifications Issue 6, Copyright (C) 2001-2004 by
2572       the Institute of Electrical and Electronics Engineers, Inc and The Open
2573       Group.  In  the event of any discrepancy between these versions and the
2574       original IEEE and The Open Group Standard, the original  IEEE  and  The
2575       Open  Group Standard is the referee document. The original Standard can
2576       be obtained online at http://www.opengroup.org/unix/online.html.
2577

BUGS

AUTHOR

2580       Joerg Schilling
2581       Seestr. 110
2582       D-13353 Berlin
2583       Germany
2584
2585       Mail bugs and suggestions to:
2586
2587       schilling@fokus.fraunhofer.de      or       js@cs.tu-berlin.de       or
2588       joerg@schily.isdn.cs.tu-berlin.de
2589
2590
2591
2592Joerg Schilling                    10/08/01                           SPAX(1L)