SPAX(1L)                    Schily´s USER COMMANDS                    SPAX(1L)

NAME

       pax - portable archive interchange

SYNOPSIS

       spax        [other options]      [-cdnv]      [-H|-L]      [-f archive]
              [-o options]...  [-s replstr]...  [pattern...]

       spax   -r   [other options]     [-cdiknuv]     [-H|-L]     [-f archive]
              [-o options]...  [-p string]...  [-s replstr]...  [pattern...]

       spax   -w   [other options]   [-dituvX]   [-H|-L]  [-b blocksize]  [-a]
              [-f archive]   [-o options]...    [-s replstr]...    [-x format]
              [file...]

       spax   -r -w  [other options]   [-diklntuvX]   [-H|-L]   [-o options]...
              [-p string]...  [-s replstr]...  [file...] directory

DESCRIPTION

       The pax utility shall read, write, and write lists of  the  members  of
       archive files and copy directory hierarchies. A variety of archive for‐
       mats shall be supported; see the -x format option.

       The action to be taken depends  on  the  presence  of  the  -r  and  -w
       options. The four combinations of -r and -w are referred to as the four
       modes of operation: list, read, write, and  copy  modes,  corresponding
       respectively to the four forms shown in the SYNOPSIS section.

       list   In  list  mode  (when neither -r nor -w is specified), pax shall
              write the names of the members of the archive file read from the
              standard  input, with pathnames matching the specified patterns,
              to standard output. If a named file is of  type  directory,  the
              file hierarchy rooted at that file shall be listed as well.

       read   In  read  mode  (when -r is specified, but -w is not), pax shall
              extract the members of the archive file read from  the  standard
              input,  with  pathnames  matching the specified patterns.  If an
              extracted file is of type directory, the file  hierarchy  rooted
              at  that  file  shall  be extracted as well. The extracted files
              shall be created performing pathname resolution with the  direc‐
              tory in which pax was invoked as the current working directory.

              If  an attempt is made to extract a directory when the directory
              already exists, this shall not be considered  an  error.  If  an
              attempt  is made to extract a FIFO when the FIFO already exists,
              this shall not be considered an error.

              The ownership, access, and modification times, and file mode  of
              the restored files are discussed under the -p option.

       write  In  write  mode (when -w is specified, but -r is not), pax shall
              write the contents of the file operands to the  standard  output
              in  an archive format. If no file operands are specified, a list
              of files to copy, one per line, shall be read from the  standard
              input.  A  file of type directory shall include all of the files
              in the file hierarchy rooted at the file.

       copy   In copy mode (when both -r and -w are specified), pax shall copy
              the file operands to the destination directory.

              If  no file operands are specified, a list of files to copy, one
              per line, shall be read from the standard input. A file of  type
              directory  shall  include all of the files in the file hierarchy
              rooted at the file.

              The effect of the copy shall be as  if  the  copied  files  were
              written  to  an  archive  file  and then subsequently extracted,
              except that there may be hard links between the original and the
              copied  files. If the destination directory is a subdirectory of
              one of the files to be copied, the results are  unspecified.  If
              the destination directory is a file of a type not defined by the
              System Interfaces volume of IEEE Std  1003.1-2001,  the  results
              are  implementation-defined; otherwise, it shall be an error for
              the file named by the directory operand not  to  exist,  not  be
              writable by the user, or not be a file of type directory.

       In  read  or  copy  modes, if intermediate directories are necessary to
       extract an archive member, pax shall perform actions equivalent to  the
       mkdir()  function  defined  in the System Interfaces volume of IEEE Std
       1003.1-2001, called with the following arguments:

       ·      The intermediate directory used as the path argument.

       ·      The value of the bitwise-inclusive OR of S_IRWXU,  S_IRWXG,  and
              S_IRWXO as the mode argument.
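
       As a rough illustration of the mkdir() arguments listed above, the fol‐
       lowing  Python  sketch creates missing intermediate directories with a
       mode of S_IRWXU | S_IRWXG | S_IRWXO. The helper name make_intermedi‐
       ate_dirs is hypothetical and is not part of pax; the process umask still
       applies, as it does for mkdir() itself.

```python
import os
import stat
import tempfile

# Mode argument described above: rwx for user, group, and other (0o777).
INTERMEDIATE_MODE = stat.S_IRWXU | stat.S_IRWXG | stat.S_IRWXO


def make_intermediate_dirs(member_path):
    """Create each missing parent directory of member_path (hypothetical helper).

    The resulting permissions are INTERMEDIATE_MODE with the umask bits cleared.
    Returns the list of directories that were created, outermost first.
    """
    parent = os.path.dirname(member_path)
    missing = []
    # Walk upward until an existing directory (or the empty path) is reached.
    while parent and not os.path.isdir(parent):
        missing.append(parent)
        parent = os.path.dirname(parent)
    created = []
    for directory in reversed(missing):
        os.mkdir(directory, INTERMEDIATE_MODE)
        created.append(directory)
    return created


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        print(make_intermediate_dirs(os.path.join(root, "a", "b", "file.txt")))
```

       The sketch only creates directories; extracting the member itself  and
       any later adjustment of directory modes are outside its scope.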
       
       If  any  specified pattern or file operands are not matched by at least
       one file or archive member, pax shall write  a  diagnostic  message  to
       standard error for each one that did not match and exit with a non-zero
       exit status.

       The archive formats described in the EXTENDED DESCRIPTION section shall
       be  automatically  detected on input. The default output archive format
       shall be implementation-defined.

       The spax implementation defaults to -x ustar.

       A single archive can span multiple files. The pax utility shall  deter‐
       mine,  in  an implementation-defined manner, what file to read or write
       as the next file.

       If the selected archive format supports  the  specification  of  linked
       files,  it  shall  be an error if these files cannot be linked when the
       archive is extracted, except that if the files to be  linked  are  sym‐
       bolic  links and the system is not capable of making hard links to sym‐
       bolic links, then separate copies of the symbolic link shall be created
       instead.  For archive formats that do not store file contents with each
       name that causes a hard link, if the file that contains the data is not
       extracted  during  this  pax session, either the data shall be restored
       from the original file, or a diagnostic message shall be displayed with
       the  name of a file that can be used to extract the data. In traversing
       directories, pax shall detect infinite loops; that is, entering a  pre‐
       viously visited directory that is an ancestor of the last file visited.
       When it detects an infinite loop, pax shall write a diagnostic  message
       to standard error and shall terminate.

OPTIONS

       The  pax  utility  shall conform to the Base Definitions volume of IEEE
       Std 1003.1-2001, Section 12.2, Utility Syntax Guidelines,  except  that
       the order of presentation of the -o, -p, and -s options is significant.

       The following options shall be supported:

       -r     Read an archive file from standard input.

       -w     Write files to the standard output in the specified archive for‐
              mat.

       -a     Append files to the end of the archive.  It  is  implementation-
              defined  which  devices  on  the system support appending. Addi‐
              tional file formats unspecified  by  this  volume  of  IEEE  Std
              1003.1-2001 may impose restrictions on appending.

       -b blocksize
              Block  the  output at a positive decimal integer number of bytes
              per write to the archive file. Devices and archive  formats  may
              impose restrictions on blocking. Blocking shall be automatically
              determined on input. Conforming applications shall not specify a
              blocksize value larger than 32256.  Default blocking when creat‐
              ing archives depends on the archive format. (See the  -x  option
              below.)

       -c     Match  all file or archive members except those specified by the
              pattern or file operands.

       -d     Cause files of type directory being copied or  archived  or  ar‐
              chive  members  of  type  directory being extracted or listed to
              match only the file or archive member itself and  not  the  file
              hierarchy rooted at the file.

       -f archive
              Specify  the pathname of the input or output archive, overriding
              the default standard input (in list or read modes)  or  standard
              output (write mode).

       -H     If a symbolic link referencing a file of type directory is spec‐
              ified on the command line, pax shall archive the file  hierarchy
              rooted in the file referenced by the link, using the name of the
              link as the root of the file hierarchy.  Otherwise,  if  a  sym‐
              bolic  link  referencing a file of any other file type which pax
              can normally archive is specified on the command line, then  pax
              shall archive the file referenced by the link, using the name of
              the link. The default behavior shall be to archive the  symbolic
              link itself.

       -i     Interactively  rename files or archive members. For each archive
              member matching a pattern operand or file matching a file  oper‐
              and, a prompt shall be written to the file /dev/tty.  The prompt
              shall contain the name of the file or archive  member,  but  the
              format  is otherwise unspecified. A line shall then be read from
              /dev/tty. If this line is blank,  the  file  or  archive  member
              shall  be skipped. If this line consists of a single period, the
              file or archive member shall be processed with  no  modification
              to its name. Otherwise, its name shall be replaced with the con‐
              tents of the line. The pax utility shall immediately exit with a
              non-zero  exit status if end-of-file is encountered when reading
              a response or if /dev/tty cannot be opened for reading and writ‐
              ing.

              The  results  of  extracting a hard link to a file that has been
              renamed during extraction are unspecified.
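
              The three possible responses to the -i prompt can be  summarized
              in a short Python sketch. The function name interactive_action is
              hypothetical, and treating a whitespace-only line as "blank"  is
              an assumption of the sketch rather than something stated above.

```python
def interactive_action(name, response):
    """Map one line read from the terminal to the action described for -i.

    `response` is the line without its trailing newline, or None to model
    end-of-file. Returns ("skip", None), ("keep", name) or ("rename", new_name).
    """
    if response is None:
        # End-of-file while reading a response: exit with a non-zero status.
        raise SystemExit(1)
    if response.strip() == "":
        # Blank line: the file or archive member is skipped.
        return ("skip", None)
    if response == ".":
        # Single period: process the member under its existing name.
        return ("keep", name)
    # Anything else replaces the name with the contents of the line.
    return ("rename", response)
```

              For  example,  interactive_action("a.txt",  "b.txt")  yields  a
              rename to b.txt, while a response of "." keeps a.txt unchanged.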
       
       -k     Prevent the overwriting of existing files.

       -l     (The letter ell.) In copy mode, hard links shall be made between
              the  source  and destination file hierarchies whenever possible.
              If specified in conjunction with -H or -L, when a symbolic  link
              is  encountered,  the  hard link created in the destination file
              hierarchy shall be to the file referenced by the symbolic  link.
              If  specified  when  neither -H nor -L is specified, when a sym‐
              bolic link is encountered, the  implementation  shall  create  a
              hard  link  to the symbolic link in the source file hierarchy or
              copy the symbolic link to the destination.

       -L     If a symbolic link referencing a file of type directory is spec‐
              ified on the command line or encountered during the traversal of
              a file hierarchy, pax shall archive the file hierarchy rooted in
              the  file  referenced by the link, using the name of the link as
              the root of the file hierarchy.  Otherwise, if a  symbolic  link
              referencing a file of any other file type which pax can normally
              archive is specified on the command line or  encountered  during
              the  traversal  of  a file hierarchy, pax shall archive the file
              referenced by the link, using the name of the link. The  default
              behavior shall be to archive the symbolic link itself.

       -n     Select  the first archive member that matches each pattern oper‐
              and. No more than one archive member shall be matched  for  each
              pattern  (although  members  of type directory shall still match
              the file hierarchy rooted at that file).

       -o options
              Provide information to the implementation to  modify  the  algo‐
              rithm  for  extracting  or  writing  files. The value of options
              shall consist of one or more  comma-separated  keywords  of  the
              form:

              keyword[[:]=value][,keyword[[:]=value],...]

              Some  keywords  apply only to certain file formats, as indicated
              with each description. Use of keywords that are inapplicable  to
              the file format being processed produces undefined results.

              Keywords in the options argument shall be a string that would be
              a valid portable filename as described in the  Base  Definitions
              volume of IEEE Std 1003.1-2001, Section 3.276, Portable Filename
              Character Set.

              Note:  Keywords are not expected to be filenames, merely to fol‐
                     low  the  same  character  composition  rules as portable
                     filenames.

              Keywords can be preceded with white space. The value field shall
              consist  of  zero or more characters; within value, the applica‐
              tion shall precede any literal comma  with  a  backslash,  which
              shall  be  ignored,  but preserves the comma as part of value. A
              comma as the final character, or  a  comma  followed  solely  by
              white  space  as  the  final  characters,  in  options  shall be
              ignored. Multiple -o options can be specified; if keywords given
              to  these  multiple -o options conflict, the keywords and values
              appearing later in command line sequence shall  take  precedence
              and the earlier shall be silently ignored. The following keyword
              values of options shall be supported for  the  file  formats  as
              indicated:

              delete=pattern
                     (Applicable  only  to  the  -x  pax format.) When used in
                     write or copy mode, pax shall omit from  extended  header
                     records that it produces any keywords matching the string
                     pattern. When used in read or list mode, pax shall ignore
                     any  keywords matching the string pattern in the extended
                     header records. In both cases,  matching  shall  be  per‐
                     formed  using  the pattern matching notation described in
                     Patterns Matching a Single Character and Patterns  Match‐
                     ing Multiple Characters. For example:

                     -o delete=security.*

                     would  suppress  security-related  information.  See  pax
                     Extended Header for extended header record keyword usage.

                     When multiple -o delete=pattern  options  are  specified,
                     the patterns shall be additive; all keywords matching the
                     specified string patterns shall be omitted from  extended
                     header records that pax produces.
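
                     The  additive behavior of several delete= patterns can be
                     sketched in Python. This is only an approximation: it as‐
                     sumes that fnmatch.fnmatchcase() is close enough  to  the
                     pattern  matching  notation  referenced  above,  and  the
                     function name drop_deleted_keywords and the sample record
                     values are hypothetical.

```python
from fnmatch import fnmatchcase


def drop_deleted_keywords(records, delete_patterns):
    """Return the extended header records whose keyword matches no pattern.

    `records` maps keyword -> value; `delete_patterns` holds the pattern of
    every -o delete=pattern option, so the patterns act additively.
    """
    return {
        keyword: value
        for keyword, value in records.items()
        if not any(fnmatchcase(keyword, pattern) for pattern in delete_patterns)
    }


if __name__ == "__main__":
    sample = {"path": "a/b", "security.label": "x", "mtime": "1", "atime": "2"}
    print(drop_deleted_keywords(sample, ["security.*", "?time"]))
```

                     With the two patterns shown, only the path record is kept.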
       
              exthdr.name=string
                     (Applicable  only  to  the  -x  pax format.) This keyword
                     allows user control over the name that  is  written  into
                     the  ustar header blocks for the extended header produced
                     under the circumstances described in  pax  Header  Block.
                     The  name shall be the contents of string, after the fol‐
                     lowing character substitutions have been made:

                  ┌─────────────────┬─────────────────────────────────────────────┐
                  │string Includes: │ Replaced By:                                │
                  ├─────────────────┼─────────────────────────────────────────────┤
                  │%d               │ The directory name of the file, equivalent  │
                  │                 │ to the result of the dirname utility on the │
                  │                 │ translated pathname.                        │
                  ├─────────────────┼─────────────────────────────────────────────┤
                  │%f               │ The filename of the file, equivalent to the │
                  │                 │ result of the basename utility on the       │
                  │                 │ translated pathname.                        │
                  ├─────────────────┼─────────────────────────────────────────────┤
                  │%p               │ The process ID of the pax process.          │
                  ├─────────────────┼─────────────────────────────────────────────┤
                  │%%               │ A '%' character.                            │
                  └─────────────────┴─────────────────────────────────────────────┘
                     Any other '%'  characters  in  string  produce  undefined
                     results.

                     If  no  -o  exthdr.name=string is specified, pax shall use
                     the following default value:

                             %d/PaxHeaders.%p/%f

              globexthdr.name=string
                     (Applicable only to the -x  pax  format.)  When  used  in
                     write  or  copy  mode  with  the appropriate options, pax
                     shall create global extended header  records  with  ustar
                     header  blocks  that  will be treated as regular files by
                     previous versions of pax.  This keyword allows user  con‐
                     trol  over the name that is written into the ustar header
                     blocks for global extended header records. The name shall
                     be  the contents of string, after the following character
                     substitutions have been made:

                  ┌─────────────────┬─────────────────────────────────────────────┐
                  │string Includes: │ Replaced By:                                │
                  ├─────────────────┼─────────────────────────────────────────────┤
                  │%n               │ An integer that represents the sequence     │
                  │                 │ number of the global extended header record │
                  │                 │ in the archive, starting at 1.              │
                  ├─────────────────┼─────────────────────────────────────────────┤
                  │%p               │ The process ID of the pax process.          │
                  ├─────────────────┼─────────────────────────────────────────────┤
                  │%%               │ A '%' character.                            │
                  └─────────────────┴─────────────────────────────────────────────┘
                     Any other '%'  characters  in  string  produce  undefined
                     results.

                     If  no  -o globexthdr.name=string is specified, pax shall
                     use the following default value:

                     $TMPDIR/GlobalHead.%p.%n

                     where $TMPDIR represents the value of the TMPDIR environ‐
                     ment variable. If TMPDIR is not set, pax shall use /tmp.
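
                     The  substitutions  for  exthdr.name  and globexthdr.name
                     can be illustrated with a small Python sketch.  All  func‐
                     tion  names  below  are  hypothetical. The sketch assumes
                     that posixpath approximates the dirname and basename util‐
                     ities  for  ordinary  pathnames, and it chooses to reject
                     any other '%' sequence, which the text above leaves unde‐
                     fined.

```python
import os
import posixpath
import re


def _dirname(path):
    # Approximates the dirname utility: "file" -> ".", "a/b/" -> "a".
    stripped = path.rstrip("/") or "/"
    return posixpath.dirname(stripped) or "."


def _basename(path):
    # Approximates the basename utility: "a/b/" -> "b".
    stripped = path.rstrip("/") or "/"
    return posixpath.basename(stripped) or "/"


def expand_header_name(template, substitutions):
    """Expand %-sequences in template using the substitutions mapping."""
    def replace(match):
        key = match.group(1)
        if key == "%":
            return "%"
        if key not in substitutions:
            # Undefined by the specification; this sketch raises instead.
            raise ValueError("unsupported sequence %" + key)
        return str(substitutions[key])
    return re.sub(r"%(.?)", replace, template)


def exthdr_name(path, pid, template="%d/PaxHeaders.%p/%f"):
    """Name for a per-file extended header, using the default template."""
    values = {"d": _dirname(path), "f": _basename(path), "p": pid}
    return expand_header_name(template, values)


def globexthdr_name(sequence, pid, environ=os.environ, template=None):
    """Name for a global extended header; TMPDIR falls back to /tmp."""
    if template is None:
        tmpdir = environ.get("TMPDIR", "/tmp")
        # Escape '%' in the directory so it is not taken as a substitution.
        template = tmpdir.replace("%", "%%") + "/GlobalHead.%p.%n"
    return expand_header_name(template, {"n": sequence, "p": pid})


if __name__ == "__main__":
    print(exthdr_name("dir/sub/file.txt", os.getpid()))
    print(globexthdr_name(1, os.getpid()))
```

                     Passing the process ID and environment as arguments keeps
                     the helpers deterministic, which makes them easy to check
                     in isolation.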
       
              invalid=action
                     (Applicable  only  to  the  -x  pax format.) This keyword
                     allows user  control  over  the  action  pax  takes  upon
                     encountering values in an extended header record that, in
                     read or copy mode, are invalid in the destination hierar‐
                     chy  or,  in  list mode, cannot be written in the codeset
                     and current locale of the implementation.  The  following
                     are invalid values that shall be recognized by pax:

                     +      In read or copy mode, a filename or link name that
                            contains character encodings invalid in the desti‐
                            nation  hierarchy. (For example, the name may con‐
                            tain embedded NULs.)

                     +      In read or copy mode, a filename or link name that
                            is longer than the maximum allowed in the destina‐
                            tion hierarchy (for either a pathname component or
                            the entire pathname).

                     +      In  list  mode,  any character string value (file‐
                            name, link name, user name, and so on) that cannot
                            be  written  in  the codeset and current locale of
                            the implementation.

                     The following mutually-exclusive  values  of  the  action
                     argument are supported:

                     bypass In  read  or copy mode, pax shall bypass the file,
                            causing no change to the destination hierarchy. In
                            list  mode,  pax  shall  write all requested valid
                            values for the file, but its  method  for  writing
                            invalid values is unspecified.

                     rename In  read  or copy mode, pax shall act as if the -i
                            option were in effect for each file  with  invalid
                            filename or link name values, allowing the user to
                            provide a replacement name interactively. In  list
                            mode,  pax  shall behave identically to the bypass
                            action.

                     UTF-8  When used in read, copy, or list mode and a  file‐
                            name, link name, owner name, or any other field in
                            an extended header  record  cannot  be  translated
                            from  the  pax UTF-8 codeset format to the codeset
                            and current  locale  of  the  implementation,  pax
                            shall use the actual UTF-8 encoding for the name.

                     write  In  read  or  copy mode, pax shall write the file,
                            translating the name, regardless of  whether  this
                            may  overwrite an existing file with a valid name.
                            In list mode, pax shall behave identically to  the
                            bypass action.

                     If no -o invalid=action is specified, pax shall act as if
                     -o  invalid=bypass  were  specified.  Any  overwriting of
                     existing files that may be allowed  by  the  -o  invalid=
                     actions  shall  be subject to permission (-p) and modifi‐
                     cation time (-u) restrictions, and shall be suppressed if
                     the -k option is also specified.

              linkdata
                     (Applicable only to the -x pax format.)  In  write  mode,
                     pax  shall  write  the  contents of a file to the archive
                     even when that file is merely a hard link to a file whose
                     contents have already been written to the archive.

              listopt=format
                     This  keyword specifies the output format of the table of
                     contents produced when the -v option is specified in list
                     mode. See List Mode Format Specifications. To avoid ambi‐
                     guity, the listopt=format shall  be  the  only  or  final
                     keyword=value  pair  in a -o option-argument; all charac‐
                     ters in the remainder of  the  option-argument  shall  be
                     considered  part  of  the format string. When multiple -o
                     listopt=format options are specified, the format  strings
                     shall be considered a single, concatenated string, evalu‐
                     ated in command line order.

              times  (Applicable only to the -x  pax  format.)  When  used  in
                     write  or  copy  mode,  pax shall include atime and mtime
                     extended header records for each file. See  pax  Extended
                     Header File Times.

              In  addition  to  these keywords, if the -x pax format is speci‐
              fied, any of the keywords and values  defined  in  pax  Extended
              Header,  including  implementation extensions, can be used in -o
              option-arguments, in either of two modes:

              keyword=value
                     When used in write  or  copy  mode,  these  keyword/value
                     pairs  shall  be included at the beginning of the archive
                     as typeflag g global extended header records.  When  used
                     in read or list mode, these keyword/value pairs shall act
                     as if they had been at the beginning of  the  archive  as
                     typeflag g global extended header records.

              keyword:=value
                     When  used  in  write  or  copy mode, these keyword/value
                     pairs shall be included as records at the beginning of  a
                     typeflag  x extended header for each file. (This shall be
                     equivalent to the equal-sign form except that it  creates
                     no  typeflag g global extended header records.) When used
                     in read or list mode, these keyword/value pairs shall act
                     as  if  they  were included as records at the end of each
                     extended header; thus, they shall override any global  or
                     file-specific extended header record keywords of the same
                     names. For example, in the command:

                     pax -r -o "gname:=mygroup," <archive

                     the group name will be forced to  a  new  value  for  all
                     files read from the archive.

              The precedence of -o keywords over various fields in the archive
              is described in pax Extended Header Keyword Precedence.
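
              The option-argument syntax described under  -o  (comma-separated
              items,  a backslash protecting a literal comma, an ignored trail‐
              ing comma, and the keyword=value  and  keyword:=value  forms)  can
              be  sketched  in Python as follows. The function names are hypo‐
              thetical, and the sketch deliberately does not model the special
              rules  for  listopt=  (which consumes the rest of the option-ar‐
              gument) or for delete= (whose patterns are additive).

```python
def split_options(options):
    """Split one -o option-argument on commas not preceded by a backslash."""
    items, current = [], []
    i = 0
    while i < len(options):
        ch = options[i]
        if ch == "\\" and i + 1 < len(options) and options[i + 1] == ",":
            current.append(",")  # the backslash is dropped, the comma kept
            i += 2
            continue
        if ch == ",":
            items.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    items.append("".join(current))
    # A final comma, or one followed only by white space, is ignored.
    if items[-1].strip() == "":
        items.pop()
    return items


def parse_option(item):
    """Return (keyword, per_file, value) for one 'keyword[[:]=value]' item."""
    item = item.lstrip()  # keywords can be preceded by white space
    keyword, sep, value = item.partition("=")
    if not sep:
        return (keyword, False, None)  # bare keyword such as 'linkdata'
    per_file = keyword.endswith(":")   # the 'keyword:=value' form
    if per_file:
        keyword = keyword[:-1]
    return (keyword, per_file, value)


def parse_o_arguments(arguments):
    """Fold several -o option-arguments; later keywords override earlier ones."""
    settings = {}
    for argument in arguments:
        for item in split_options(argument):
            keyword, per_file, value = parse_option(item)
            settings[keyword] = (per_file, value)
    return settings


if __name__ == "__main__":
    print(parse_o_arguments(["gname:=mygroup,"]))
```

              Applied  to  the  gname:=mygroup,  example  above,  the  trailing
              comma is dropped and gname is recorded as a per-file override.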
       
       -p string
              Specify one or more file  characteristic  options  (privileges).
              The  string  option-argument  shall  be a string specifying file
              characteristics to be retained or discarded on extraction.   The
              string  shall  consist  of the specification characters a, e, m,
              o,  and  p.  Other  implementation-defined  characters  can   be
              included.  Multiple  characteristics  can be concatenated within
              the same string and multiple -p options can  be  specified.  The
              meanings of the specification characters are as follows:

              a      Do not preserve file access times.

              e      Preserve  the  user ID, group ID, file mode bits (see the
                     Base Definitions volume of IEEE Std 1003.1-2001,  Section
                     3.168,  File  Mode Bits), access time, modification time,
                     and any other  implementation-defined  file  characteris‐
                     tics.

              m      Do not preserve file modification times.

              o      Preserve the user ID and group ID.

              p      Preserve the file mode bits. Other implementation-defined
                     file mode attributes may be preserved.

              In the preceding list, "preserve" indicates  that  an  attribute
              stored in the archive shall be given to the extracted file, sub‐
              ject to the permissions of the invoking process. The access  and
              modification  times of the file shall be preserved unless other‐
              wise specified with the -p option or not stored in the  archive.
              All  attributes  that  are  not preserved shall be determined as
              part of the normal file creation action (see File  Read,  Write,
              and Creation).

              If neither the e nor the o specification character is specified,
              or the user ID and group ID are not preserved  for  any  reason,
              pax shall not set the S_ISUID and S_ISGID bits of the file mode.

              If  the preservation of any of these items fails for any reason,
              pax shall write a diagnostic message to standard error.  Failure
              to  preserve these items shall affect the final exit status, but
              shall not cause the extracted file to be deleted.

              If file characteristic letters in any of the string option-argu‐
              ments are duplicated or conflict with each other, the ones given
              last shall take precedence. For example, if -p eme is specified,
              file modification times are preserved.
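
              The "last one wins" rule for -p strings can be made concrete with
              a  Python  sketch.  The  function  names are hypothetical, other
              implementation-defined characters are simply  ignored,  and  the
              starting  state  (times  preserved,  ownership and mode bits not
              preserved) is an assumption drawn from the paragraphs above.

```python
import stat


def resolve_privileges(p_strings):
    """Fold the characters of every -p string, in order, into final settings."""
    preserve = {"atime": True, "mtime": True, "owner": False, "mode": False}
    for string in p_strings:
        for ch in string:
            if ch == "a":
                preserve["atime"] = False
            elif ch == "m":
                preserve["mtime"] = False
            elif ch == "o":
                preserve["owner"] = True
            elif ch == "p":
                preserve["mode"] = True
            elif ch == "e":
                # 'e' preserves everything, overriding earlier 'a' or 'm'.
                preserve = dict.fromkeys(preserve, True)
            # Any other character is implementation-defined; ignored here.
    return preserve


def strip_setid_unless_owner_kept(mode, preserve):
    """Clear S_ISUID and S_ISGID when user and group IDs are not preserved."""
    if not preserve["owner"]:
        mode &= ~(stat.S_ISUID | stat.S_ISGID)
    return mode


if __name__ == "__main__":
    print(resolve_privileges(["eme"]))
```

              For -p eme the final e re-enables what m disabled, so  modifica‐
              tion times are preserved, matching the example in the text.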
       
       -s replstr
              Modify file or archive member names named by pattern or file op‐
              erands according to the substitution expression  replstr,  using
              the  syntax  of  the  ed  utility. The concepts of "address" and
              "line" are meaningless in the context of the  pax  utility,  and
              shall not be supplied. The format shall be:

              -s /old/new/[gp]

              where,  as  in ed, old is a basic regular expression and new can
              contain an ampersand, '\n' (where n is a digit)  backreferences,
              or  subexpression matching. The old string shall also be permit‐
              ted to contain <newline>s.

              Any non-null character can be used as a  delimiter  ('/'  shown
              here). Multiple -s expressions can be specified; the expressions
              shall be applied in the order specified,  terminating  with  the
              first  successful  substitution. The optional trailing 'g' is as
              defined in the ed utility. The optional trailing 'p' shall cause
              successful  substitutions  to be written to standard error. File
              or archive member names that  substitute  to  the  empty  string
              shall be ignored when reading and writing archives.
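
              The  order of evaluation for several -s expressions can be illus‐
              trated in Python. This is a sketch under clear assumptions:  the
              function  names  are hypothetical, Python's re syntax stands in
              for basic regular expressions (so groups are written with  plain
              parentheses  rather  than  '\('  and  '\)'),  an escaped back‐
              slash directly before a delimiter is not handled, and the format
              of the 'p' output is invented for the example.

```python
import re
import sys


def parse_replstr(replstr):
    """Split 'DoldDnewD[gp]' on its delimiter D, the first character."""
    delim = replstr[0]
    # Split on delimiters that are not preceded by a backslash.
    parts = re.split(r"(?<!\\)" + re.escape(delim), replstr[1:])
    if len(parts) != 3:
        raise ValueError("expected /old/new/[gp]: " + replstr)
    return tuple(parts)


def expand_replacement(new, match):
    """Expand '&' and '\\n' backreferences in new for one match."""
    out = []
    i = 0
    while i < len(new):
        ch = new[i]
        if ch == "\\" and i + 1 < len(new):
            nxt = new[i + 1]
            if nxt in "0123456789":
                out.append(match.group(int(nxt)) or "")
            else:
                out.append(nxt)  # escaped character taken literally
            i += 2
            continue
        out.append(match.group(0) if ch == "&" else ch)
        i += 1
    return "".join(out)


def apply_substitutions(name, replstrs):
    """Apply -s expressions in order, stopping at the first that matches."""
    for replstr in replstrs:
        old, new, flags = parse_replstr(replstr)
        count = 0 if "g" in flags else 1  # 0 means "replace all" in re.subn
        result, hits = re.subn(
            old, lambda m: expand_replacement(new, m), name, count=count
        )
        if hits:
            if "p" in flags:
                print(name + " >> " + result, file=sys.stderr)
            return result
    return name


if __name__ == "__main__":
    print(apply_substitutions("src/main.c", ["/^src/lib/"]))
```

              A caller would skip any name for which the result is  the  empty
              string, in line with the last sentence of the description.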
525
526       -t     When reading files from the file system, and if the user has the
527              permissions required by utime() to do so, set the access time of
528              each  file read to the access time that it had before being read
529              by pax.
530
531       -u     Ignore files that are older (having a less recent file modifica‐
532              tion  time)  than a pre-existing file or archive member with the
533              same name. In read mode, an archive member with the same name as
534              a file in the file system shall be extracted if the archive mem‐
535              ber is newer than the file. In write mode, an archive file  mem‐
536              ber  with  the  same  name as a file in the file system shall be
537              superseded if the file is newer than the archive member.  If  -a
538              is  also specified, this is accomplished by appending to the ar‐
539              chive; otherwise, it is unspecified whether this is accomplished
540              by  actual replacement in the archive or by appending to the ar‐
541              chive. In copy mode, the file in the destination hierarchy shall
542              be  replaced by the file in the source hierarchy or by a link to
543              the file in the source hierarchy if the file in the source hier‐
544              archy is newer.
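
       The comparison that -u performs can be sketched as a single
       predicate. The function name is hypothetical and times are assumed to
       be plain numbers (seconds since the Epoch); the sketch only captures
       the "strictly newer" rule described above.

```python
def supersedes(candidate_mtime, existing_mtime):
    """Return True when the candidate should replace the existing entry.

    existing_mtime is None when no file or archive member of that name
    exists yet, in which case there is nothing to compare against.
    """
    if existing_mtime is None:
        return True
    # Older (or equally old) candidates are ignored under -u.
    return candidate_mtime > existing_mtime
```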
545
546       -v     In  list mode, produce a verbose table of contents (see the STD‐
547              OUT section). Otherwise, write archive member pathnames to stan‐
548              dard error (see the STDERR section).
549
550       -x format
551              Specify the output archive format. The pax utility shall support
552              the following formats:
553
554              cpio   The cpio interchange format; see the EXTENDED DESCRIPTION
555                     section.  The default blocksize for this format for char‐
556                     acter special archive files shall  be  5120.  Implementa‐
557                     tions  shall  support  all  blocksize values less than or
558                     equal to 32256 that are multiples of 512.
559
560              pax    The pax interchange format; see the EXTENDED  DESCRIPTION
561                     section.  The default blocksize for this format for char‐
562                     acter special archive files shall be  5120.   Implementa‐
563                     tions  shall  support  all  blocksize values less than or
564                     equal to 32256 that are multiples of 512.
565
566              ustar  The tar interchange format; see the EXTENDED  DESCRIPTION
567                     section.  The default blocksize for this format for char‐
568                     acter special archive files shall be 10240.   Implementa‐
569                     tions  shall  support  all  blocksize values less than or
570                     equal to 32256 that are multiples of 512.
571
572              Implementation-defined formats shall  specify  a  default  block
573              size  as  well  as any other block sizes supported for character
574              special archive files.
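
       The blocksize rules for the three standard formats can be summarized
       in a short Python sketch. The names below are hypothetical; the
       numbers are the ones listed above. Note that the text only guarantees
       support for these values, so an implementation may accept others.

```python
# Default blocksizes for character special archive files, per format.
DEFAULT_BLOCKSIZE = {"cpio": 5120, "pax": 5120, "ustar": 10240}

def blocksize_guaranteed(blocksize):
    """True if every conforming implementation must support this value."""
    return 0 < blocksize <= 32256 and blocksize % 512 == 0

def effective_blocksize(fmt, blocksize=None):
    """Pick the blocksize for a format, falling back to its default."""
    if blocksize is None:
        return DEFAULT_BLOCKSIZE[fmt]
    if not blocksize_guaranteed(blocksize):
        # A real implementation may still accept it; this sketch refuses.
        raise ValueError("blocksize %d is not guaranteed portable" % blocksize)
    return blocksize
```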
575
576              Any attempt to append to an archive file in a  format  different
577              from the existing archive format shall cause pax to exit immedi‐
578              ately with a non-zero exit status.
579
580              In copy mode, if no -x format is specified, pax shall behave  as
581              if -x pax were specified.
582
583       -X     When  traversing the file hierarchy specified by a pathname, pax
584              shall not descend into directories that have a different  device
585              ID (st_dev; see the System Interfaces volume of IEEE Std
586              1003.1-2001, stat()).
587
588       Specifying more than one of the mutually-exclusive options  -H  and  -L
589       shall  not  be  considered an error and the last option specified shall
590       determine the behavior of the utility.
591
592       The options that operate on the names of files or archive members  (-c,
593       -i, -n, -s, -u, and -v) shall interact as follows. In read mode, the archive
594       members shall be selected based on the user-specified pattern  operands
595       as  modified by the -c, -n, and -u options. Then, any -s and -i options
596       shall modify, in that order, the names of the selected  files.  The  -v
597       option shall write names resulting from these modifications.
598
599       In  write mode, the files shall be selected based on the user-specified
600       pathnames as modified by the -n and -u options.  Then, any  -s  and  -i
601       options shall modify, in that order, the names of these selected files.
602       The -v option shall write names resulting from these modifications.
603
604       If both the -u and -n options are specified, pax shall not  consider  a
605       file selected unless it is newer than the file to which it is compared.
606
607
608   List Mode Format Specifications
609       The  manual  page  for  spax is not yet ready.  The following text is a
610       quotation from the POSIX.1-2001 standard.
611
612       In list mode with the -o listopt=format  option,  the  format  argument
613       shall be applied for each selected file. The pax utility shall append a
614       <newline> to the listopt output for  each  selected  file.  The  format
615       argument shall be used as the format string described in the Base Defi‐
616       nitions volume of IEEE Std 1003.1-2001, Chapter 5,  File  Format  Nota‐
617       tion,  with  the  exceptions  1.  through  5.   defined in the EXTENDED
618       DESCRIPTION section of printf(3), plus the following exceptions:
619
620       6.     The sequence (keyword) can  occur  before  a  format  conversion
621              specifier.  The  conversion  argument is defined by the value of
622              keyword.  The implementation shall support  the  following  key‐
623              words:
624
625              ·      Any  of  the Field Name entries in ustar Header Block and
626                     Octet-Oriented cpio Archive Entry. The implementation may
627                     support the cpio keywords without the leading c_ in addi‐
628                     tion to the form  required  by  Values  for  cpio  c_mode
629                     Field.
630
631              ·      Any  keyword  defined  for  the  extended  header  in pax
632                     Extended Header.
633
634              ·      Any keyword provided as an implementation-defined  exten‐
635                     sion  within  the extended header defined in pax Extended
636                     Header.
637
638              For example, the sequence "%(charset)s" is the string  value  of
639              the name of the character set in the extended header.
640
641              The result of the keyword conversion argument shall be the value
642              from the applicable header field or extended header, without any
643              trailing NULs.
644
645              All  keyword values used as conversion arguments shall be trans‐
646              lated from the UTF-8 encoding to the character  set  appropriate
647              for the local file system, user database, and so on, as applica‐
648              ble.
649
650       7.     An additional conversion specifier character, T, shall  be  used
651              to  specify  time  formats. The T conversion specifier character
652              can be preceded by the sequence (keyword=subformat), where  sub‐
653              format is a date format as defined by date operands. The default
654              keyword shall be mtime and the default subformat shall be:
655
656                 %b %e %H:%M %Y
657
658       8.     An additional conversion specifier character, M, shall  be  used
659              to  specify  the  file  mode string as defined in ls(1) Standard
660              Output. If (keyword) is omitted, the mode keyword shall be used.
661              For  example,  %.1M writes the single character corresponding to
662              the <entry type> field of the ls -l command.
663
664       9.     An additional conversion specifier character, D, shall  be  used
665              to specify the device for block or special files, if applicable,
666              in an implementation-defined  format.  If  not  applicable,  and
667              (keyword) is specified, then this conversion shall be equivalent
668              to %(keyword)u.  If not applicable, and  (keyword)  is  omitted,
669              then this conversion shall be equivalent to <space>.
670
671       10.    An  additional  conversion specifier character, F, shall be used
672              to specify a pathname. The F conversion character  can  be  pre‐
673              ceded by a sequence of comma-separated keywords:
674
675                 (keyword[,keyword] ... )
676              The  values for all the keywords that are non-null shall be con‐
677              catenated together, each separated by a '/'. The  default  shall
678              be (path) if the keyword path is defined; otherwise, the default
679              shall be (prefix, name).
680
681       11.    An additional conversion specifier character, L, shall  be  used
682              to  specify  a symbolic link expansion. If the current file is a
683              symbolic link, then %L shall expand to:
684
685                 "%s -> %s", <value of keyword>, <contents of link>
686
687              Otherwise, the %L conversion specification shall be the
688              equivalent of %F.
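
       The F and L conversions described in items 10 and 11 can be sketched
       as follows. This is an illustration under assumptions: the header is
       modeled as a plain dictionary of keyword values (a hypothetical
       representation), a symbolic link is recognized by typeflag 2, and the
       link contents are taken from linkpath or, failing that, linkname.

```python
def conv_F(header, keywords=None):
    """Join the non-null values of the given keywords with '/'."""
    if keywords is None:
        # Default: (path) if defined, otherwise (prefix, name).
        keywords = ("path",) if header.get("path") else ("prefix", "name")
    return "/".join(header[k] for k in keywords if header.get(k))

def conv_L(header, keywords=None):
    """Like %F, but append ' -> target' for symbolic links."""
    name = conv_F(header, keywords)
    if header.get("typeflag") == "2":
        target = header.get("linkpath") or header.get("linkname", "")
        return "%s -> %s" % (name, target)
    return name
```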
689
690

OPERANDS

692       The following operands shall be supported:
693
694       directory
695              The destination directory pathname for copy mode.
696
697       file   A pathname of a file to be copied or archived.
698
699       pattern
700              A  pattern matching one or more pathnames of archive members.  A
701              pattern must be given in the  name-generating  notation  of  the
702              pattern matching notation in Pattern Matching Notation, includ‐
703              ing the filename expansion rules in Patterns Used  for  Filename
704              Expansion. The default, if no pattern is specified, is to select
705              all members in the archive.
706
707

STDIN

709       In write mode, the standard input shall be used only if no  file  oper‐
710       ands  are specified. It shall be a text file containing a list of path‐
711       names, one per line, without leading or trailing <blank>s.
712
713       In list and read modes, if -f is  not  specified,  the  standard  input
714       shall be an archive file.
715
716       Otherwise, the standard input shall not be used.
717
718

INPUT FILES

720       The  input file named by the archive option-argument, or standard input
721       when the archive is read from there, shall be a file formatted  accord‐
722       ing to one of the specifications in the EXTENDED DESCRIPTION section or
723       some other implementation-defined format.
724
725       The file /dev/tty shall be used to write prompts and read responses.
726
727

ENVIRONMENT VARIABLES

729       The following environment variables shall affect the execution of pax:
730
731       LANG   Provide a default value for the  internationalization  variables
732              that are unset or null. (See the Base Definitions volume of IEEE
733              Std 1003.1-2001, Section 8.2, Internationalization Variables for
734              the  precedence of internationalization variables used to deter‐
735              mine the values of locale categories.)
736
737       LC_ALL If set to a non-empty string value, override the values  of  all
738              the other internationalization variables.
739
740       LC_COLLATE
741              Determine  the  locale  for  the behavior of ranges, equivalence
742              classes, and multi-character collating elements used in the pat‐
743              tern  matching  expressions  for  the pattern operand, the basic
744              regular expression for the -s option, and the  extended  regular
745              expression defined for the yesexpr locale keyword in the LC_MES‐
746              SAGES category.
747
748       LC_CTYPE
749              Determine the locale for  the  interpretation  of  sequences  of
750              bytes  of  text  data as characters (for example, single-byte as
751              opposed to multi-byte characters in arguments and input  files),
752              the  behavior  of character classes used in the extended regular
753              expression defined for the yesexpr locale keyword in the LC_MES‐
754              SAGES category, and pattern matching.
755
756       LC_MESSAGES
757              Determine the locale used to process affirmative responses, and
758              the locale used to affect the format and contents of diagnostic
759              messages written to standard error.
760
761       LC_TIME
762              Determine  the format and contents of date and time strings when
763              the -v option is specified.
764
765       NLSPATH
766              [XSI] Determine the location of message catalogs for the
767              processing of LC_MESSAGES.
768
769       TMPDIR Determine  the pathname that provides part of the default global
770              extended header record file, as described for the -o globexthdr=
771              keyword in the OPTIONS section.
772
773       TZ     Determine  the  timezone used to calculate date and time strings
774              when the -v option is specified. If TZ  is  unset  or  null,  an
775              unspecified default timezone shall be used.
776
777

ASYNCHRONOUS EVENTS

779       Default.
780
781

STDOUT

783       In write mode, if -f is not specified, the standard output shall be the
784       archive formatted  according  to  one  of  the  specifications  in  the
785       EXTENDED DESCRIPTION section, or some other implementation-defined for‐
786       mat (see -x format).
787
788       In list mode, when the -o  listopt=  format  has  been  specified,  the
789       selected  archive members shall be written to standard output using the
790       format described under List Mode Format Specifications.  In  list  mode
791       without  the  -o  listopt=  format option, the table of contents of the
792       selected archive members shall be written to standard output using  the
793       following format:
794
795            "%s\n", <pathname>
796
797       If  the  -v  option is specified in list mode, the table of contents of
798       the selected archive members shall be written to standard output  using
799       the following formats.
800
801       For  pathnames  representing  hard links to previous members of the ar‐
802       chive:
803
804            "%s == %s\n", <ls -l listing>, <linkname>
805
806       For all other pathnames:
807
808            "%s\n", <ls -l listing>
809
810       where <ls -l listing> shall be the format specified by the ls(1)  util‐
811       ity  with  the  -l option. When writing pathnames in this format, it is
812       unspecified what is written for fields for which the underlying archive
813       format does not have the correct information, although the correct num‐
814       ber of <blank>-separated fields shall be written.
815
816       In list mode, standard output shall not be buffered more than a line at
817       a time.
818
819

STDERR

821       If  -v  is specified in read, write, or copy modes, pax shall write the
822       pathnames it processes to the standard error output using the following
823       format:
824
825            "%s\n", <pathname>
826
827       These  pathnames shall be written as soon as processing is begun on the
828       file or archive member, and shall be flushed  to  standard  error.  The
829       trailing  <newline>,  which  shall not be buffered, is written when the
830       file has been read or written.
831
832       If the -s option is specified, and the replacement string has a  trail‐
833       ing  'p',  substitutions shall be written to standard error in the fol‐
834       lowing format:
835
836            "%s >> %s\n", <original pathname>, <new pathname>
837
838       In all operating modes of pax, optional messages of unspecified  format
839       concerning  the  input  archive format and volume number, the number of
840       files, blocks, volumes, and media parts as  well  as  other  diagnostic
841       messages may be written to standard error.
842
843       In  all  formats,  for  both  standard output and standard error, it is
844       unspecified how non-printable characters in pathnames or link names are
845       written.
846
847       When pax is in read mode or list mode, using the -x pax archive format,
848       and a filename, link name,  owner  name,  or  any  other  field  in  an
849       extended  header record cannot be translated from the pax UTF-8 codeset
850       format to the codeset and current locale  of  the  implementation,  pax
851       shall  write  a diagnostic message to standard error, shall process the
852       file as described for the -o invalid= option, and  then  shall  process
853       the next file in the archive.
854
855

OUTPUT FILES

857       In  read mode, the extracted output files shall be of the archived file
858       type. In copy mode, the copied output files shall be the  type  of  the
859       file  being  copied.  In either mode, existing files in the destination
860       hierarchy shall be overwritten only when all permission (-p), modifica‐
861       tion time (-u), and invalid-value (-o invalid=) tests allow it.
862
863       In write mode, the output file named by the -f option-argument shall be
864       a file formatted according to one of the specifications in the EXTENDED
865       DESCRIPTION section, or some other implementation-defined format.
866
867

EXTENDED DESCRIPTION

869   pax Interchange Format
870       A  pax archive tape or file produced in the -x pax format shall contain
871       a series of blocks. The physical layout of the archive shall be identi‐
872       cal  to  the  ustar  format described in ustar Interchange Format. Each
873       file archived shall be represented by the following sequence:
874
875              ·      An optional header block with  extended  header  records.
876                     This  header block is of the form described in pax Header
877                     Block, with a typeflag value of x  or  g.   The  extended
878                     header  records,  described in pax Extended Header, shall
879                     be included as the data for this header block.
880
881              ·      A header block that describes the file. Any fields in the
882                     preceding  optional  extended  header  shall override the
883                     associated fields in this header block for this file.
884
885              ·      Zero or more blocks that  contain  the  contents  of  the
886                     file.
887
888       At  the  end  of  the  archive  file there shall be two 512-byte blocks
889       filled with binary zeros, interpreted as an end-of-archive indicator.
890
891       A schematic of an example archive with global extended  header  records
892       and  two  actual  files  is shown in pax Format Archive Example. In the
893       example, the second file in the archive has no extended header  preced‐
894       ing it, presumably because it has no need for extended attributes.
895
896                         Figure: pax Format Archive Example
897
898    ┌──────────────────────────────┬─────────────────────────────────────────────┐
899    │ustar Header [typeflag = 'g'] │                                             │
900    ├──────────────────────────────┤           Global Extended header            │
901    │Global Extended Header Data   │                                             │
902    ├──────────────────────────────┼─────────────────────────────────────────────┤
903    │ustar Header [typeflag = 'x'] │                                             │
904    ├──────────────────────────────┤                                             │
905    │Extended Header Data          │                                             │
906    ├──────────────────────────────┤  File 1: Extended Header data is included   │
907    │ustar Header [typeflag = '0'] │                                             │
908    ├──────────────────────────────┤                                             │
909    │Data for File 1               │                                             │
910    ├──────────────────────────────┼─────────────────────────────────────────────┤
911    │ustar Header [typeflag = '0'] │                                             │
912    ├──────────────────────────────┤ File 2: No Extended Header data is included │
913    │Data for File 2               │                                             │
914    ├──────────────────────────────┼─────────────────────────────────────────────┤
915    │Block of binary Zeroes        │                                             │
916    ├──────────────────────────────┤          End of Archive Indicator           │
917    │Block of binary Zeroes        │                                             │
918    └──────────────────────────────┴─────────────────────────────────────────────┘
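
       The layout in the figure can be observed with Python's standard
       tarfile module, which can write the pax format. The sketch below is
       not spax itself; it builds a one-member archive in memory and walks
       the 512-byte header blocks, collecting each typeflag. The helper
       names are hypothetical. A member whose name is not plain ASCII is
       used because tarfile then emits a typeflag x extended header in front
       of it.

```python
import io
import tarfile

def build_pax_archive(name, data):
    """Return the bytes of a pax-format archive holding one regular file."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tar:
        info = tarfile.TarInfo(name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()

def typeflags(archive):
    """List the typeflag of every header block up to the end-of-archive."""
    flags = []
    offset = 0
    while offset < len(archive):
        block = archive[offset:offset + 512]
        if block == bytes(512):          # first all-zero block: stop
            break
        flags.append(chr(block[156]))    # typeflag lives at byte 156
        size = int(block[124:136].rstrip(b"\0 ") or b"0", 8)  # octal size
        # Skip the header plus the data, rounded up to whole blocks.
        offset += 512 + (size + 511) // 512 * 512
    return flags
```

       For a member named "naïve.txt" the walk yields ['x', '0'], matching
       the "File 1" rows of the figure; for a plain ASCII name it yields
       ['0'], matching "File 2". In both cases the archive ends in at least
       two zero-filled 512-byte blocks.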
919
920   pax Header Block
921       The  pax  header  block  shall  be  identical to the ustar header block
922       described in ustar Interchange Format, except that two additional type‐
923       flag values are defined:
924
925       x      Represents extended header records for the following file in the
926              archive (which shall have its own ustar header block).  The for‐
927              mat  of  these  extended header records shall be as described in
928              pax Extended Header.
929
930       g      Represents global extended  header  records  for  the  following
931              files  in  the  archive.  The  format  of  these extended header
932              records shall be as described  in  pax  Extended  Header.   Each
933              value  shall  affect  all  subsequent files that do not override
934              that value in their own extended header record and until another
935              global  extended  header record is reached that provides another
936              value for the same field. The typeflag g global  headers  should
937              not  be  used  with  interchange media that could suffer partial
938              data loss in transporting the archive.
939
940       For both of these types, the size  field  shall  be  the  size  of  the
941       extended header records in octets. The other fields in the header block
942       are not meaningful to this version of the  pax  utility.   However,  if
943       this   archive  is  read  by  a  pax  utility  conforming  to  the  ISO
944       POSIX-2:1993 standard, the header block fields are  used  to  create  a
945       regular  file that contains the extended header records as data. There‐
946       fore, header block field values should be selected to  provide  reason‐
947       able file access to this regular file.
948
949       A  further  difference  from the ustar header block is that data blocks
950       for files of typeflag 1 (the digit one) (hard link)  may  be  included,
951       which means that the size field may be greater than zero. Archives cre‐
952       ated by pax -o linkdata shall include these data blocks with  the  hard
953       links.
954
955
956   pax Extended Header
957       A  pax  extended  header contains values that are inappropriate for the
958       ustar header block  because  of  limitations  in  that  format:  fields
959       requiring a character encoding other than that described in the ISO/IEC
960       646:1991 standard, fields representing file attributes not described in
961       the  ustar  header,  and  fields  whose format or length do not fit the
962       requirements of the ustar header. The values in an extended header  add
963       attributes  to the following file (or files; see the description of the
964       typeflag g header block) or override values  in  the  following  header
965       block(s), as indicated in the following list of keywords.
966
967       An  extended  header  shall  consist  of one or more records, each con‐
968       structed as follows:
969
970            "%d %s=%s\n", <length>, <keyword>, <value>
971
972       The extended header records shall be encoded according to  the  ISO/IEC
973       10646-1:2000  standard  (UTF-8).  The  <length>  field, <blank>, equals
974       sign, and <newline> shown shall be limited to  the  portable  character
975       set,  as  encoded in UTF-8. The <keyword> and <value> fields can be any
976       UTF-8 characters. The <length> field shall be the decimal length of the
977       extended header record in octets, including the trailing <newline>.
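
       Because <length> counts its own digits, computing it takes a small
       fixed-point loop. The following Python sketch shows one way to build
       and parse such records; the function names are hypothetical and error
       handling for malformed input is omitted.

```python
def pax_record(keyword, value):
    """Encode one '<length> <keyword>=<value>\\n' record as UTF-8 bytes."""
    body = (" %s=%s\n" % (keyword, value)).encode("utf-8")
    length = len(body)
    while True:
        # The total is the body plus however many digits the length needs.
        total = len(body) + len(str(length))
        if total == length:
            break
        length = total
    return str(length).encode("ascii") + body

def parse_records(data):
    """Decode a sequence of records into a keyword -> value dictionary."""
    records = {}
    pos = 0
    while pos < len(data):
        space = data.index(b" ", pos)
        length = int(data[pos:space])
        # The record ends at pos + length; drop the trailing <newline>.
        record = data[space + 1:pos + length - 1]
        keyword, _, value = record.partition(b"=")
        records[keyword.decode("utf-8")] = value.decode("utf-8")
        pos += length
    return records
```

       For example, pax_record("path", "a") gives b"9 path=a\n": eight
       octets for " path=a\n" plus one for the digit 9. Lengthening the
       value by one octet pushes the record to eleven octets, since the
       length field itself grows to two digits.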
978
979       The <keyword> field shall be one of the entries from the following list
980       or a keyword provided as an implementation  extension.   Keywords  con‐
981       sisting entirely of lowercase letters, digits, and periods are reserved
982       for future standardization. A keyword shall not include an equals sign.
983       (In  the  following list, the notation "file(s)" or "block(s)" is used
984       to acknowledge that a keyword affects the following single file after a
985       typeflag  x extended header, but possibly multiple files after typeflag
986       g.  Any requirements in the list for pax to include a  record  when  in
987       write  or copy mode shall apply only when such a record has not already
988       been provided through the use of the -o option. When used in copy mode,
989       pax  shall  behave  as  if  an archive had been created with applicable
990       extended header records and then extracted.)
991
992       atime  The file access time for the following  file(s),  equivalent  to
993              the  value  of  the  st_atime member of the stat structure for a
994              file, as described by the  stat(2)  function.  The  access  time
995              shall  be  restored if the process has the appropriate privilege
996              required to do so.  The  format  of  the  <value>  shall  be  as
997              described in pax Extended Header File Times.
998
999       charset
1000              The  name  of  the  character set used to encode the data in the
1001              following file(s).  The  entries  in  the  following  table  are
1002              defined  to  refer  to  known standards; additional names may be
1003              agreed on between the originator and recipient.
1004
1005              ┌────────────────────────┬───────────────────────────────┐
1006              │<value>                 │ Formal Standard               │
1007              ├────────────────────────┼───────────────────────────────┤
1008              │ISO-IR 646 1990         │ ISO/IEC 646:1990              │
1009              │ISO-IR 8859 1 1998      │ ISO/IEC 8859-1:1998           │
1010              │ISO-IR 8859 2 1999      │ ISO/IEC 8859-2:1999           │
1011              │ISO-IR 8859 3 1999      │ ISO/IEC 8859-3:1999           │
1012              │ISO-IR 8859 4 1998      │ ISO/IEC 8859-4:1998           │
1013              │ISO-IR 8859 5 1999      │ ISO/IEC 8859-5:1999           │
1014              │ISO-IR 8859 6 1999      │ ISO/IEC 8859-6:1999           │
1015              │ISO-IR 8859 7 1987      │ ISO/IEC 8859-7:1987           │
1016              │ISO-IR 8859 8 1999      │ ISO/IEC 8859-8:1999           │
1017              │ISO-IR 8859 9 1999      │ ISO/IEC 8859-9:1999           │
1018              │ISO-IR 8859 10 1998     │ ISO/IEC 8859-10:1998          │
1019              │ISO-IR 8859 13 1998     │ ISO/IEC 8859-13:1998          │
1020              │ISO-IR 8859 14 1998     │ ISO/IEC 8859-14:1998          │
1021              │ISO-IR 8859 15 1999     │ ISO/IEC 8859-15:1999          │
1022              │ISO-IR 10646 2000       │ ISO/IEC 10646:2000            │
1023              │ISO-IR 10646 2000 UTF-8 │ ISO/IEC 10646, UTF-8 encoding │
1024              │BINARY                  │ None                          │
1025              └────────────────────────┴───────────────────────────────┘
1026       The encoding is included in an extended header  for  information  only;
1027       when  pax  is  used  as described in IEEE Std 1003.1-2001, it shall not
1028       translate the file data into any other encoding. The BINARY entry indi‐
1029       cates unencoded binary data.
1030
1031       When  used  in write or copy mode, it is implementation-defined whether
1032       pax includes a charset extended header record for a file.
1033
1034       comment
1035              A series of characters used as a comment. All characters in  the
1036              <value> field shall be ignored by pax.
1037
1038       gid    The  group  ID  of  the group that owns the file, expressed as a
1039              decimal number using digits from the ISO/IEC 646:1991  standard.
1040              This record shall override the gid field in the following header
1041              block(s). When used in write or copy mode, pax shall  include  a
1042              gid  extended  header  record  for  each  file whose group ID is
1043              greater than 2097151 (octal 7777777).
1044
1045       gname  The group of the file(s), formatted as a group name in the group
1046              database. This record shall override the gid and gname fields in
1047              the following header  block(s),  and  any  gid  extended  header
1048              record.  When used in read, copy, or list mode, pax shall trans‐
1049              late the name from the UTF-8 encoding in the  header  record  to
1050              the  character  set  appropriate  for  the group database on the
1051              receiving system. If any  of  the  UTF-8  characters  cannot  be
1052              translated, and if the -o invalid=UTF-8 option is not specified,
1053              the results are implementation-defined. When used  in  write  or
1054              copy  mode, pax shall include a gname extended header record for
1055              each file whose group name cannot be represented  entirely  with
1056              the letters and digits of the portable character set.
1057
1058       linkpath
1059              The  pathname  of  a  link being created to another file, of any
1060              type,  previously  archived.  This  record  shall  override  the
1061              linkname  field in the following ustar header block(s). The fol‐
1062              lowing ustar header block shall determine the type of link  cre‐
1063              ated.  If  typeflag of the following header block is 1, it shall
1064              be a hard link. If typeflag is 2, it shall be  a  symbolic  link
1065              and  the  linkpath  value  shall be the contents of the symbolic
1066              link. The pax utility shall translate the name of the link (con‐
1067              tents of the symbolic link) from the UTF-8 encoding to the char‐
1068              acter set appropriate for the local file system.  When  used  in
1069              write or copy mode, pax shall include a linkpath extended header
1070              record for  each  link  whose  pathname  cannot  be  represented
1071              entirely  with  the  members of the portable character set other
1072              than NUL.
1073
1074       mtime  The file modification time of the following file(s),  equivalent
1075              to  the value of the st_mtime member of the stat structure for a
1076              file, as described in the stat(2) function.  This  record  shall
1077              override  the  mtime field in the following header block(s). The
1078              modification time shall be  restored  if  the  process  has  the
1079              appropriate  privilege  required  to  do  so.  The format of the
1080              <value> shall be as described in pax Extended Header File Times.
1081
1082       path   The pathname of the following file(s). This record  shall  over‐
1083              ride  the  name  and  prefix  fields  in  the  following  header
1084              block(s). The pax utility shall translate the  pathname  of  the
1085              file  from  the  UTF-8 encoding to the character set appropriate
1086              for the local file system.
1087
1088              When used in write or  copy  mode,  pax  shall  include  a  path
1089              extended  header  record  for each file whose pathname cannot be
1090              represented entirely with the members of the portable  character
1091              set other than NUL.
1092
1093       realtime.any
1094              The  keywords  prefixed  by  "realtime." are reserved for future
1095              standardization.
1096
1097       security.any
1098              The keywords prefixed by "security."  are  reserved  for  future
1099              standardization.
1100
1101       size   The  size  of  the file in octets, expressed as a decimal number
1102              using digits from the ISO/IEC  646:1991  standard.  This  record
1103              shall  override the size field in the following header block(s).
1104              When used in write or  copy  mode,  pax  shall  include  a  size
1105              extended  header  record for each file with a size value greater
1106              than 8589934591 (octal 77777777777).
1107
1108       uid    The user ID of the file owner, expressed  as  a  decimal  number
1109              using  digits  from  the  ISO/IEC 646:1991 standard. This record
1110              shall override the uid field in the following  header  block(s).
1111              When  used  in  write  or  copy  mode,  pax  shall include a uid
1112              extended header record for each file whose owner ID  is  greater
1113              than 2097151 (octal 7777777).
1114
1115       uname  The  owner of the following file(s), formatted as a user name in
1116              the user database. This record shall override the uid and  uname
1117              fields  in  the  following header block(s), and any uid extended
1118              header record. When used in read, copy, or list mode, pax  shall
1119              translate  the name from the UTF-8 encoding in the header record
1120              to the character set appropriate for the user  database  on  the
1121              receiving  system.  If  any  of  the  UTF-8 characters cannot be
1122              translated, and if the -o invalid=UTF-8 option is not specified,
1123              the  results  are  implementation-defined. When used in write or
1124              copy mode, pax shall include a uname extended header record  for
1125              each  file  whose  user name cannot be represented entirely with
1126              the letters and digits of the portable character set.
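
       The size and uid thresholds above are simply the largest values that
       fit in the octal digits of the corresponding ustar fields. A minimal
       sketch of that check follows; the function name and the limits table
       are illustrative, not part of pax.

```python
# Largest values the ustar header can hold, as stated in the size and uid
# keyword descriptions: 11 and 7 octal digits respectively.
USTAR_LIMITS = {
    "size": 0o77777777777,  # 8589934591
    "uid": 0o7777777,       # 2097151
}


def needs_extended_record(keyword, value):
    """Return True if value is too large for the ustar field (hypothetical helper)."""
    return value > USTAR_LIMITS[keyword]
```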
1127
1128       If the <value> field is zero length, it shall delete any  header  block
1129       field,  previously  entered  extended  header value, or global extended
1130       header value of the same name.
1131
1132       If a keyword in an extended header record (or in a -o  option-argument)
1133       overrides  or  deletes a corresponding field in the ustar header block,
1134       pax shall ignore the contents of that header block field.
1135
1136       Unlike the ustar header block fields, NULs shall not delimit  <value>s;
1137       all  characters  within  the <value> field shall be considered data for
1138       the field. None of the length limitations of  the  ustar  header  block
1139       fields  in  ustar  Header  Block  shall  apply  to  the extended header
1140       records.
1141
1142
1143   pax Extended Header Keyword Precedence
1144       This section describes the  precedence  in  which  the  various  header
1145       records  and fields and command line options are selected to apply to a
1146       file in the archive. When pax is used in read or list modes,  it  shall
1147       determine a file attribute in the following sequence:
1148
1149              1.     If   -o   delete=keyword-prefix  is  used,  the  affected
1150                     attributes shall be determined from step 7., if  applica‐
1151                     ble, or ignored otherwise.
1152
1153              2.     If -o keyword:= is used, the affected attributes shall be
1154                     ignored.
1155
1156              3.     If -o keyword:=value  is  used,  the  affected  attribute
1157                     shall be assigned the value.
1158
1159              4.     If  there  is  a  typeflag  x extended header record, the
1160                     affected attribute shall be assigned the  <value>.   When
1161                     extended  header  records conflict, the last one given in
1162                     the header shall take precedence.
1163
1164              5.     If -o keyword=value is used, the affected attribute shall
1165                     be assigned the value.
1166
1167              6.     If  there  is a typeflag g global extended header record,
1168                     the affected attribute shall  be  assigned  the  <value>.
1169                     When  global  extended  header records conflict, the last
1170                     one given in the global header shall take precedence.
1171
1172              7.     Otherwise, the attribute shall  be  determined  from  the
1173                     ustar header block.
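
       The seven steps above can be read as a chain of fallbacks. The sketch
       below is one possible rendering for a single attribute; the function
       and parameter names are hypothetical, an empty string stands for the
       -o keyword:= form, and the deletion effect of zero-length extended
       header values is not modeled.

```python
def resolve_attribute(ustar_value, deleted=False, override=None,
                      extended=(), default=None, global_extended=()):
    """Pick one attribute value following steps 1 to 7 (illustrative only).

    ustar_value      value from the ustar header block, or None if not applicable
    deleted          True if -o delete=keyword-prefix matched this keyword
    override         value from -o keyword:=value ("" means -o keyword:=)
    extended         values from typeflag x records, in header order
    default          value from -o keyword=value
    global_extended  values from typeflag g records, in header order
    """
    if deleted:                      # step 1: fall through to the ustar header
        return ustar_value
    if override == "":               # step 2: attribute is ignored
        return None
    if override is not None:         # step 3
        return override
    if extended:                     # step 4: last record wins
        return extended[-1]
    if default is not None:          # step 5
        return default
    if global_extended:              # step 6: last record wins
        return global_extended[-1]
    return ustar_value               # step 7
```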
1174
1175
1176   pax Extended Header File Times
1177       The  pax  utility shall write an mtime record for each file in write or
1178       copy modes if  the  file's  modification  time  cannot  be  represented
1179       exactly  in  the  ustar header logical record described in ustar Inter‐
1180       change Format.  This can occur if the time is out of ustar range, or if
1181       the  file  system of the underlying implementation supports non-integer
1182       time granularities and the time is not an integer. All  of  these  time
1183       records  shall  be formatted as a decimal representation of the time in
1184       seconds since the Epoch. If a period ('.') decimal point  character  is
1185       present, the digits to the right of the point shall represent the units
1186       of a subsecond timing granularity, where the first digit is tenths of a
1187       second  and  each subsequent digit is a tenth of the previous digit. In
1188       read or copy mode, the pax utility shall truncate the time of a file to
1189       the greatest value that is not greater than the input header file time.
1190       In write or copy mode, the pax utility shall output a time  exactly  if
1191       it  can be represented exactly as a decimal number, and otherwise shall
1192       generate only enough digits so that the same time shall be recovered if
1193       the  file is extracted on a system whose underlying implementation sup‐
1194       ports the same time granularity.
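
       As an illustration of the truncation rule for read and copy modes, the
       following sketch parses a time <value> and rounds it down to a given
       granularity. The function name and the default granularity of one
       nanosecond are assumptions made for the example.

```python
from decimal import Decimal, ROUND_FLOOR


def parse_pax_time(value, granularity=Decimal("0.000000001")):
    """Parse a decimal seconds-since-Epoch string and truncate it.

    The result is the greatest multiple of the granularity that is not
    greater than the input, so negative times round toward minus infinity.
    """
    return Decimal(value).quantize(granularity, rounding=ROUND_FLOOR)
```

       For example, parse_pax_time("1.9999", Decimal("0.01")) yields 1.99.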
1195
1196
1197   ustar Interchange Format
1198       A ustar archive tape or file shall contain a series of logical records.
1199       Each  logical record shall be a fixed-size logical record of 512 octets
1200       (see below). Although this format may be thought of as being stored  on
1201       9-track  industry-standard  12.7 mm (0.5 in) magnetic tape, other types
1202       of transportable media are not excluded. Each file  archived  shall  be
1203       represented  by  a  header logical record that describes the file, fol‐
1204       lowed by zero or more logical records that give  the  contents  of  the
1205       file. At the end of the archive file there shall be two 512-octet logi‐
1206       cal records filled with binary zeros, interpreted as an  end-of-archive
1207       indicator.
1208
1209       The  logical  records  may  be  grouped for physical I/O operations, as
1210       described under the -b blocksize and -x ustar options.  Each  group  of
1211       logical  records  may  be written with a single operation equivalent to
1212       the write(2) function. On magnetic tape, the result of this write shall
1213       be  a  single tape physical block. The last physical block shall always
1214       be the full size, so logical records after the two zero logical records
1215       may contain undefined data.
1216
1217       The header logical record shall be structured as shown in the following
1218       table. All lengths and offsets are in decimal.
1219
                              Table: ustar Header Block

                  ┌───────────┬──────────────┬────────────────────┐
                  │Field Name │ Octet Offset │ Length (in Octets) │
                  ├───────────┼──────────────┼────────────────────┤
                  │name       │       0      │        100         │
                  │mode       │     100      │          8         │
                  │uid        │     108      │          8         │
                  │gid        │     116      │          8         │
                  │size       │     124      │         12         │
                  │mtime      │     136      │         12         │
                  │chksum     │     148      │          8         │
                  │typeflag   │     156      │          1         │
                  │linkname   │     157      │        100         │
                  │magic      │     257      │          6         │
                  │version    │     263      │          2         │
                  │uname      │     265      │         32         │
                  │gname      │     297      │         32         │
                  │devmajor   │     329      │          8         │
                  │devminor   │     337      │          8         │
                  │prefix     │     345      │        155         │
                  └───────────┴──────────────┴────────────────────┘
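
       The offsets and lengths in the table map directly onto a fixed-layout
       record. The sketch below splits a 512-octet header into raw fields
       with Python's struct module. The fields in the table end at octet 500
       (345 + 155); the remaining 12 octets are skipped here as padding,
       which is an assumption since the table does not describe them.

```python
import struct

# One format item per table row, in table order; "12x" skips the 12 octets
# after prefix. "=" selects standard sizes with no alignment padding.
USTAR_HEADER = struct.Struct("=100s8s8s8s12s12s8sc100s6s2s32s32s8s8s155s12x")

USTAR_FIELDS = (
    "name", "mode", "uid", "gid", "size", "mtime", "chksum", "typeflag",
    "linkname", "magic", "version", "uname", "gname", "devmajor",
    "devminor", "prefix",
)


def split_header(block):
    """Return a dict of raw (undecoded) header fields from a 512-octet record."""
    if len(block) != USTAR_HEADER.size:
        raise ValueError("a ustar header logical record is 512 octets")
    return dict(zip(USTAR_FIELDS, USTAR_HEADER.unpack(block)))
```
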
1242       All characters in the header logical record shall be represented in the
1243       coded  character  set  of  the  ISO/IEC  646:1991 standard. For maximum
1244       portability between implementations,  names  should  be  selected  from
1245       characters represented by the portable filename character set as octets
1246       with the most significant bit zero. If an implementation  supports  the
1247       use  of characters outside of slash and the portable filename character
1248       set in names for files, users, and groups, one or more  implementation-
1249       defined encodings of these characters shall be provided for interchange
1250       purposes.
1251
1252       However, the pax utility shall never create filenames on the local sys‐
1253       tem  that  cannot  be accessed via the procedures described in IEEE Std
1254       1003.1-2001. If a filename is found on the medium that would create  an
1255       invalid  filename,  it  is implementation-defined whether the data from
1256       the file is stored on the file hierarchy and  under  what  name  it  is
1257       stored.  The pax utility may choose to ignore these files as long as it
1258       produces an error indicating that the file is being ignored.
1259
1260       Each field within the header logical record  is  contiguous;  that  is,
1261       there is no padding used. Each character on the archive medium shall be
1262       stored contiguously.
1263
       The fields magic, uname, and gname are character strings, each
       terminated by a NUL character. The fields name, linkname, and prefix
       are NUL-terminated character strings, except when every character in
       the array, including the last, is non-NUL. The version field is two
       octets containing the characters "00" (zero-zero). The typeflag field
       contains a single character. All other fields are leading zero-filled
       octal numbers using digits from the ISO/IEC 646:1991 standard IRV.
       Each numeric field is terminated by one or more <space> or NUL
       characters.
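
       A reader of the header might decode the two kinds of fields described
       above as follows. This is a sketch with hypothetical helper names;
       treating an all-blank numeric field as zero is an assumption, not
       something the text specifies.

```python
def parse_octal(field):
    """Decode a zero-filled octal field terminated by <space> or NUL octets."""
    digits = field.rstrip(b" \0")
    return int(digits, 8) if digits else 0  # empty field read as 0 (assumption)


def parse_string(field):
    """Return the field up to the first NUL, or the whole field if none is present."""
    return field.split(b"\0", 1)[0]
```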
1273
1274       The name and the prefix fields shall produce the pathname of the  file.
1275       A  new  pathname shall be formed, if prefix is not an empty string (its
1276       first character is not NUL), by concatenating prefix (up to  the  first
1277       NUL  character),  a  slash character, and name; otherwise, name is used
1278       alone. In either case, name is terminated at the first  NUL  character.
1279       If  prefix  begins  with  a NUL character, it shall be ignored. In this
1280       manner, pathnames of at most 256 characters  can  be  supported.  If  a
       pathname does not fit in the space provided, pax shall notify the user
       of the error, and shall not store any part of the file (header or
       data) on the medium.
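
       The rule for combining prefix and name when reading can be sketched
       as below. The function name is illustrative; both arguments are the
       raw field contents from the header block.

```python
def ustar_pathname(name, prefix):
    """Join the raw prefix and name fields into a pathname."""
    name = name.split(b"\0", 1)[0]      # name ends at its first NUL
    prefix = prefix.split(b"\0", 1)[0]  # empty if the first octet is NUL
    if prefix:
        return prefix + b"/" + name
    return name
```

       With a full 155-octet prefix, the slash, and a full 100-octet name,
       the result is 256 octets long, which is the limit stated above.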
1284
1285       The  linkname  field, described below, shall not use the prefix to pro‐
1286       duce a pathname. As such, a linkname is limited to 100  characters.  If
1287       the  name does not fit in the space provided, pax shall notify the user
1288       of the error, and shall not attempt to store the link on the medium.
1289
1290       The mode field provides 12 bits encoded in the ISO/IEC  646:1991  stan‐
1291       dard  octal  digit representation. The encoded bits shall represent the
1292       following values:
1293
                               Table: ustar mode Field

     ┌──────┬─────────────────┬─────────────────────────────────────────────────┐
     │Bit   │ IEEE Std        │ Description                                     │
     │Value │ 1003.1-2001 Bit │                                                 │
     ├──────┼─────────────────┼─────────────────────────────────────────────────┤
     │04000 │ S_ISUID         │ Set UID on execution.                           │
     │02000 │ S_ISGID         │ Set GID on execution.                           │
     │01000 │ <reserved>      │ Reserved for future standardization.            │
     │00400 │ S_IRUSR         │ Read permission for file owner class.           │
     │00200 │ S_IWUSR         │ Write permission for file owner class.          │
     │00100 │ S_IXUSR         │ Execute/search permission for file owner class. │
     │00040 │ S_IRGRP         │ Read permission for file group class.           │
     │00020 │ S_IWGRP         │ Write permission for file group class.          │
     │00010 │ S_IXGRP         │ Execute/search permission for file group class. │
     │00004 │ S_IROTH         │ Read permission for file other class.           │
     │00002 │ S_IWOTH         │ Write permission for file other class.          │
     │00001 │ S_IXOTH         │ Execute/search permission for file other class. │
     └──────┴─────────────────┴─────────────────────────────────────────────────┘
1313       When appropriate privilege is required to set one of these  mode  bits,
1314       and  the  user  restoring  the files from the archive does not have the
1315       appropriate privilege, the mode bits for which the user does  not  have
1316       appropriate  privilege  shall  be ignored. Some of the mode bits in the
1317       archive format are not mentioned elsewhere in this volume of  IEEE  Std
1318       1003.1-2001.  If  the  implementation does not support those bits, they
1319       may be ignored.
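
       For illustration, the mode table can be applied to a decoded mode
       value as in the sketch below. The bit values and names are taken from
       the table; the helper name is hypothetical.

```python
# (bit value, name) pairs in the order of the ustar mode Field table.
USTAR_MODE_BITS = (
    (0o4000, "S_ISUID"), (0o2000, "S_ISGID"), (0o1000, "<reserved>"),
    (0o0400, "S_IRUSR"), (0o0200, "S_IWUSR"), (0o0100, "S_IXUSR"),
    (0o0040, "S_IRGRP"), (0o0020, "S_IWGRP"), (0o0010, "S_IXGRP"),
    (0o0004, "S_IROTH"), (0o0002, "S_IWOTH"), (0o0001, "S_IXOTH"),
)


def mode_bit_names(mode):
    """List the names of the bits set in a numeric ustar mode value."""
    return [name for bit, name in USTAR_MODE_BITS if mode & bit]
```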
1320
1321       The uid and gid fields are the user and group ID of the owner and group
1322       of the file, respectively.
1323
1324       The size field is the size of the file in octets. If the typeflag field
1325       is set to specify a file to be of type 1 (a  link)  or  2  (a  symbolic
1326       link), the size field shall be specified as zero. If the typeflag field
1327       is set to specify a file of type 5 (directory), the size field shall be
1328       interpreted  as  described under the definition of that record type. No
1329       data logical records are stored for types 1, 2, or 5. If  the  typeflag
1330       field  is set to 3 (character special file), 4 (block special file), or
1331       6 (FIFO), the meaning of the size field is unspecified by  this  volume
1332       of IEEE Std 1003.1-2001, and no data logical records shall be stored on
1333       the medium.  Additionally, for type 6, the size field shall be  ignored
1334       when reading. If the typeflag field is set to any other value, the num‐
1335       ber  of  logical  records  written  following  the  header   shall   be
1336       (size+511)/512, ignoring any fraction in the result of the division.
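
       The record count described above amounts to a ceiling division, with
       the link, device, directory, and FIFO types storing no data records.
       A minimal sketch, assuming typeflag is passed as a one-character
       string and using an illustrative function name:

```python
def data_records(size, typeflag):
    """Number of 512-octet data logical records that follow the header."""
    if typeflag in ("1", "2", "3", "4", "5", "6"):
        return 0                   # these types store no data records
    return (size + 511) // 512     # integer division drops the fraction
```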
1337
1338       The  mtime field shall be the modification time of the file at the time
1339       it was archived. It is the ISO/IEC 646:1991 standard representation  of
1340       the  octal  value  of  the  modification time obtained from the stat(2)
1341       function.
1342
1343       The chksum field shall be the ISO/IEC 646:1991 standard IRV representa‐
1344       tion  of  the octal value of the simple sum of all octets in the header
1345       logical record. Each octet  in  the  header  shall  be  treated  as  an
1346       unsigned  value.  These  values  shall be added to an unsigned integer,
1347       initialized to zero, the precision of which is not less than  17  bits.
1348       When  calculating  the  checksum,  the chksum field is treated as if it
1349       were all spaces.
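
       The checksum rule can be sketched as follows, using the chksum offset
       (148) and length (8) from the header table. The function name is
       illustrative. Python integers are unbounded, so the 17-bit minimum
       precision is satisfied trivially.

```python
def ustar_checksum(block):
    """Sum all 512 header octets as unsigned values, counting the chksum
    field (octets 148 to 155) as eight spaces."""
    if len(block) != 512:
        raise ValueError("a ustar header logical record is 512 octets")
    return sum(block[:148]) + 8 * ord(" ") + sum(block[156:])
```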
1350
1351       The typeflag field specifies the type of file archived. If a particular
1352       implementation  does  not recognize the type, or the user does not have
1353       appropriate privilege to create that type, the file shall be  extracted
1354       as  if  it  were  a  regular file if the file type is defined to have a
1355       meaning for the size field that could cause data logical records to  be
1356       written on the medium (see the previous description for size).  If con‐
1357       version to a regular file occurs, the  pax  utility  shall  produce  an
1358       error  indicating  that  the conversion took place. All of the typeflag
1359       fields shall be coded in the ISO/IEC 646:1991 standard IRV:
1360
       0      Represents a regular file. For backwards compatibility, a
              typeflag value of binary zero ('\0') should be recognized as
              meaning a regular file when extracting files from the archive.
              Archives written with this version of the archive file format
              create regular files with a typeflag value of the ISO/IEC
              646:1991 standard IRV '0'.
1367
1368       1      Represents  a  file  linked to another file, of any type, previ‐
1369              ously archived. Such files are identified  by  having  the  same
1370              device and file serial numbers, and pathnames that refer to dif‐
1371              ferent directory entries. All such files shall  be  archived  as
1372              linked  files.  The  linked-to name is specified in the linkname
1373              field with a NUL-character terminator if it  is  less  than  100
1374              octets in length.
1375
1376       2      Represents  a  symbolic  link. The contents of the symbolic link
1377              shall be stored in the linkname field.
1378
1379       3,4    Represent  character  special  files  and  block  special  files
1380              respectively.  In  this  case  the  devmajor and devminor fields
1381              shall contain information defining the  device,  the  format  of
1382              which  is  unspecified  by  this volume of IEEE Std 1003.1-2001.
1383              Implementations may map the device specifications to  their  own
1384              local specification or may ignore the entry.
1385
1386       5      Specifies  a  directory  or  subdirectory. On systems where disk
1387              allocation is performed on a directory  basis,  the  size  field
1388              shall contain the maximum number of octets (which may be rounded
1389              to the nearest disk block allocation unit)  that  the  directory
1390              may  hold. A size field of zero indicates no such limiting. Sys‐
1391              tems that do not support limiting in this manner  should  ignore
1392              the size field.
1393
1394       6      Specifies a FIFO special file. Note that the archiving of a FIFO
1395              file archives the existence of this file and not its contents.
1396
1397       7      Reserved to represent a file  to  which  an  implementation  has
1398              associated   some  high-performance  attribute.  Implementations
1399              without such extensions should treat this file as a regular file
1400              (type 0).
1401
1402       A-Z    The  letters  'A'  to  'Z',  inclusive,  are reserved for custom
1403              implementations. All other values are reserved for  future  ver‐
1404              sions of IEEE Std 1003.1-2001.
1405
1406       It  is  unspecified whether files with pathnames that refer to the same
1407       directory entry are archived as linked files or as separate  files.  If
1408       they  are  archived  as  linked  files,  this  means that attempting to
1409       extract both pathnames from the resulting archive will always cause  an
1410       error  (unless  the  -u option is used) because the link cannot be cre‐
1411       ated.
1412
1413       It is unspecified whether files with the same device  and  file  serial
1414       numbers  being  appended  to  an archive are treated as linked files to
1415       members that were in the archive before the append.
1416
1417       Attempts to archive a socket using ustar interchange format shall  pro‐
1418       duce  a diagnostic message. Handling of other file types is implementa‐
1419       tion-defined.
1420
1421       The magic field is the specification that this archive  was  output  in
1422       this  archive format. If this field contains ustar (the five characters
1423       from the ISO/IEC 646:1991 standard IRV  shown  followed  by  NUL),  the
1424       uname  and gname fields shall contain the ISO/IEC 646:1991 standard IRV
1425       representation of the owner and group of the file, respectively  (trun‐
1426       cated  to  fit,  if  necessary).  When the file is restored by a privi‐
1427       leged, protection-preserving version of the utility, the user and group
1428       databases  shall  be  scanned  for  these names. If found, the user and
1429       group IDs contained within these files shall be used  rather  than  the
1430       values contained within the uid and gid fields.
1431
1432
1433   cpio Interchange Format
1434       The  octet-oriented  cpio  archive format shall be a series of entries,
1435       each comprising a header that describes the file, the name of the file,
1436       and then the contents of the file.
1437
1438       An  archive may be recorded as a series of fixed-size blocks of octets.
1439       This blocking shall be used only to make physical I/O  more  efficient.
1440       The last group of blocks shall always be at the full size.
1441
1442       For the octet-oriented cpio archive format, the individual entry infor‐
1443       mation shall be in the order indicated and described by  the  following
1444       table; see also the <cpio.h> header.
1445
                      Table: Octet-Oriented cpio Archive Entry

            ┌─────────────────────┬────────────────────┬─────────────────┐
            │Header Field Name    │ Length (in Octets) │ Interpreted as  │
            ├─────────────────────┼────────────────────┼─────────────────┤
            │c_magic              │ 6                  │ Octal number    │
            │c_dev                │ 6                  │ Octal number    │
            │c_ino                │ 6                  │ Octal number    │
            │c_mode               │ 6                  │ Octal number    │
            │c_uid                │ 6                  │ Octal number    │
            │c_gid                │ 6                  │ Octal number    │
            │c_nlink              │ 6                  │ Octal number    │
            │c_rdev               │ 6                  │ Octal number    │
            │c_mtime              │ 11                 │ Octal number    │
            │c_namesize           │ 6                  │ Octal number    │
            │c_filesize           │ 11                 │ Octal number    │
            │                     │                    │                 │
            │Filename Field Name  │ Length             │ Interpreted as  │
            │c_name               │ c_namesize         │ Pathname string │
            │                     │                    │                 │
            │File Data Field Name │ Length             │ Interpreted as  │
            │c_filedata           │ c_filesize         │ Data            │
            └─────────────────────┴────────────────────┴─────────────────┘
1469   cpio Header
       For each file in the archive, a header as defined previously shall be
       written. The information in the header fields is written as streams of
       ISO/IEC 646:1991 standard characters interpreted as octal numbers. The
       octal numbers shall be extended to the necessary length by adding
       ISO/IEC 646:1991 standard IRV zeros at the most-significant-digit end
       of the number; the result is written to the stream of octets with the
       most-significant digit first. The fields shall be interpreted as
       follows:
1478
       c_magic
              Identifies the archive as being a transportable archive by
              containing the identifying value "070707".

       c_dev, c_ino
              Contain values that uniquely identify the file within the
              archive (that is, no files contain the same pair of c_dev and
              c_ino values unless they are links to the same file). The values
              shall be determined in an unspecified manner.
1488
1489       c_mode Contains the file type and access permissions as defined in  the
1490              following table.
1491
                            Table: Values for cpio c_mode Field

                 ┌──────────────────────┬─────────┬────────────────────────┐
                 │File Permissions Name │ Value   │ Indicates              │
                 ├──────────────────────┼─────────┼────────────────────────┤
                 │C_IRUSR               │ 000400  │ Read by owner          │
                 │C_IWUSR               │ 000200  │ Write by owner         │
                 │C_IXUSR               │ 000100  │ Execute by owner       │
                 │C_IRGRP               │ 000040  │ Read by group          │
                 │C_IWGRP               │ 000020  │ Write by group         │
                 │C_IXGRP               │ 000010  │ Execute by group       │
                 │C_IROTH               │ 000004  │ Read by others         │
                 │C_IWOTH               │ 000002  │ Write by others        │
                 │C_IXOTH               │ 000001  │ Execute by others      │
                 │C_ISUID               │ 004000  │ Set uid                │
                 │C_ISGID               │ 002000  │ Set gid                │
                 │C_ISVTX               │ 001000  │ Reserved               │
                 ├──────────────────────┼─────────┼────────────────────────┤
                 │File Type Name        │ Value   │ Indicates              │
                 ├──────────────────────┼─────────┼────────────────────────┤
                 │C_ISDIR               │ 0040000 │ Directory              │
                 │C_ISFIFO              │ 0010000 │ FIFO                   │
                 │C_ISREG               │ 0100000 │ Regular file           │
                 │C_ISLNK               │ 0120000 │ Symbolic link          │
                 │C_ISBLK               │ 0060000 │ Block special file     │
                 │C_ISCHR               │ 0020000 │ Character special file │
                 │C_ISSOCK              │ 0140000 │ Socket                 │
                 │C_ISCTG               │ 0110000 │ Reserved               │
                 └──────────────────────┴─────────┴────────────────────────┘
1521              Directories,  FIFOs,  symbolic links, and regular files shall be
1522              supported on a system conforming to  this  volume  of  IEEE  Std
1523              1003.1-2001;  additional  values defined previously are reserved
1524              for compatibility with existing systems.  Additional file  types
1525              may  be  supported; however, such files should not be written to
1526              archives intended to be transported to other systems.
1527
1528       c_uid  Contains the user ID of the owner.
1529
1530       c_gid  Contains the group ID of the group.
1531
1532       c_nlink
1533              Contains a number greater than or equal to the number  of  links
1534              in the archive referencing the file. If the -a option is used to
1535              append to a cpio archive, then the pax utility need not  account
1536              for the files in the existing part of the archive when calculat‐
1537              ing the c_nlink values for the appended part of the archive, and
1538              need  not  alter  the c_nlink values in the existing part of the
1539              archive if additional files with the same c_dev and c_ino values
1540              are appended to the archive.
1541
1542       c_rdev Contains  implementation-defined  information  for  character or
1543              block special files.
1544
1545       c_mtime
1546              Contains the latest time of modification of the file at the time
1547              the archive was created.
1548
1549       c_namesize
1550              Contains  the  length of the pathname, including the terminating
1551              NUL character.
1552
1553       c_filesize
1554              Contains the length of the file in octets.  This  shall  be  the
1555              length of the data section following the header structure.
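
       Because every header field is a fixed-width octal string, the header
       described above can be decoded by walking the field list in order. A
       sketch, with field names and widths taken from the cpio entry table
       and a hypothetical function name:

```python
# (field name, length in octets), in archive order; 76 octets in total.
CPIO_FIELDS = (
    ("c_magic", 6), ("c_dev", 6), ("c_ino", 6), ("c_mode", 6),
    ("c_uid", 6), ("c_gid", 6), ("c_nlink", 6), ("c_rdev", 6),
    ("c_mtime", 11), ("c_namesize", 6), ("c_filesize", 11),
)


def parse_cpio_header(data):
    """Decode the octal header fields at the start of a cpio entry."""
    header = {}
    offset = 0
    for name, width in CPIO_FIELDS:
        header[name] = int(data[offset:offset + width], 8)
        offset += width
    if header["c_magic"] != 0o070707:
        raise ValueError("c_magic is not 070707")
    return header
```

       The c_name field starts at octet 76 of the entry and is c_namesize
       octets long, followed by c_filesize octets of file data.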
1556
1557
1558   cpio Filename
1559       The  c_name field shall contain the pathname of the file. The length of
1560       this field in octets is the value of c_namesize.
1561
1562       If a filename is found on the medium that would create an invalid path‐
1563       name,  it  is  implementation-defined whether the data from the file is
1564       stored on the file hierarchy and under what name it is stored.
1565
1566       All characters shall be represented in the  ISO/IEC  646:1991  standard
1567       IRV.  For  maximum portability between implementations, names should be
1568       selected from characters represented by the portable filename character
1569       set  as octets with the most significant bit zero. If an implementation
1570       supports the use of characters outside the portable filename  character
1571       set  in names for files, users, and groups, one or more implementation-
1572       defined encodings of these characters shall be provided for interchange
1573       purposes.  However, the pax utility shall never create filenames on the
1574       local system that cannot be accessed via the procedures described  pre‐
1575       viously  in this volume of IEEE Std 1003.1-2001. If a filename is found
1576       on the medium that would create an invalid filename, it is  implementa‐
1577       tion-defined whether the data from the file is stored on the local file
1578       system and under what name it is stored. The pax utility may choose  to
1579       ignore  these files as long as it produces an error indicating that the
1580       file is being ignored.
1581
1582
1583   cpio File Data
1584       Following c_name, there shall be c_filesize octets of data.   Interpre‐
1585       tation  of  such  data  occurs  in  a  manner dependent on the file. If
1586       c_filesize is zero, no data shall be contained in c_filedata.
1587
1588       When restoring from an archive:
1589
1590       ·      If the user does not have the appropriate privilege to create  a
1591              file of the specified type, pax shall ignore the entry and write
1592              an error message to standard error.
1593
1594       ·      Only regular files have data to be restored. Presuming a regular
1595              file  meets  any selection criteria that might be imposed on the
1596              format-reading utility by the user, such data shall be restored.
1597
1598       ·      If a user does not have appropriate privilege to set a  particu‐
1599              lar mode flag, the flag shall be ignored. Some of the mode flags
1600              in the archive format are not mentioned elsewhere in this volume
1601              of  IEEE Std 1003.1-2001. If the implementation does not support
1602              those flags, they may be ignored.
1603
1604
1605   cpio Special Entries
1606       FIFO special files, directories, and the trailer shall be recorded with
1607       c_filesize  equal  to  zero.  For  other  special  files, c_filesize is
1608       unspecified by this volume of IEEE Std 1003.1-2001. The header for  the
1609       next file entry in the archive shall be written directly after the last
1610       octet of the file entry preceding it. A header  denoting  the  filename
1611       TRAILER!!!   shall  indicate  the  end  of the archive; the contents of
1612       octets in the last block of the archive following  such  a  header  are
1613       undefined.
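
       The entry layout described in the cpio sections above can be walked
       with a short reader. The following Python sketch is illustrative only
       and is not part of pax: it assumes the 76-octet, all-octal header of
       the POSIX cpio format (magic 070707) with the field widths listed in
       FIELDS. The names read_entries and make_entry are hypothetical helpers
       introduced here.

```python
import io

# Field widths in octets; every header field is stored as octal text.
FIELDS = [("c_magic", 6), ("c_dev", 6), ("c_ino", 6), ("c_mode", 6),
          ("c_uid", 6), ("c_gid", 6), ("c_nlink", 6), ("c_rdev", 6),
          ("c_mtime", 11), ("c_namesize", 6), ("c_filesize", 11)]
HEADER_LEN = sum(width for _, width in FIELDS)  # 76 octets


def read_entries(stream):
    """Yield (name, header, data) for each member that precedes the trailer."""
    while True:
        raw = stream.read(HEADER_LEN)
        if len(raw) < HEADER_LEN:
            raise ValueError("archive ended before TRAILER!!! was seen")
        header, pos = {}, 0
        for field, width in FIELDS:
            header[field] = int(raw[pos:pos + width], 8)
            pos += width
        if header["c_magic"] != 0o70707:
            raise ValueError("bad c_magic")
        # c_namesize counts the terminating NUL, so drop the last octet.
        name = stream.read(header["c_namesize"])[:-1].decode("ascii")
        if name == "TRAILER!!!":
            return  # octets after the trailer are undefined and ignored
        # The data follows c_name directly; the next header follows the data.
        data = stream.read(header["c_filesize"])
        yield name, header, data


def make_entry(name, data=b"", mode=0o100644):
    """Build one entry for demonstration; unset fields are written as zero."""
    name_bytes = name.encode("ascii") + b"\0"
    values = {"c_magic": 0o70707, "c_mode": mode, "c_nlink": 1,
              "c_namesize": len(name_bytes), "c_filesize": len(data)}
    header = "".join("%0*o" % (width, values.get(field, 0))
                     for field, width in FIELDS)
    return header.encode("ascii") + name_bytes + data


archive = (make_entry("a.txt", b"hello")
           + make_entry("dir", mode=0o040755)   # directory: c_filesize is zero
           + make_entry("TRAILER!!!", mode=0)
           + b"\0" * 10)                        # padding after the trailer
for name, header, data in read_entries(io.BytesIO(archive)):
    print(name, oct(header["c_mode"]), len(data))
```

       The sketch reads c_filesize octets for every entry, so it does not
       model special files whose c_filesize is left unspecified.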
1614
1615

EXIT STATUS

1617       The following exit values shall be returned:
1618
1619        0     All files were processed successfully.
1620
1621       >0     An error occurred.
1622
1623

CONSEQUENCES OF ERRORS

1625       If pax cannot create a file or a link when reading an archive or cannot
1626       find a file when writing an archive, or cannot preserve  the  user  ID,
1627       group  ID,  or  file mode when the -p option is specified, a diagnostic
1628       message shall be written to standard error and a non-zero  exit  status
1629       shall be returned, but processing shall continue. In the case where pax
1630       cannot create a link to a file, pax shall not,  by  default,  create  a
1631       second copy of the file.
1632
1633       If  the  extraction of a file from an archive is prematurely terminated
1634       by a signal or error, pax may have only partially extracted the file or
1635       (if  the  -n option was not specified) may have extracted a file of the
1636       same name as that specified by the user, but which is not the file  the
1637       user  wanted. Additionally, the file modes of extracted directories may
1638       have additional bits from the S_IRWXU mask set  as  well  as  incorrect
1639       modification and access times.
1640
1641
_________________________________________________________________

The following sections are informative.

1644
1645

APPLICATION USAGE

1647       Caution  is advised when using the -a option to append to a cpio format
1648       archive. If any of the files being appended happen to be given the same
1649       c_dev  and  c_ino values as a file in the existing part of the archive,
1650       then they may be treated as links to that file on extraction. Thus,  it
1651       is  risky to use -a with cpio format except when it is done on the same
1652       system that the original archive was created on, and with the same  pax
1653       utility,  and  in  the  knowledge that there has been little or no file
1654       system activity since the original archive was created that could  lead
1655       to  any of the files appended being given the same c_dev and c_ino val‐
1656       ues as an unrelated file in the existing part  of  the  archive.  Also,
1657       when (intentionally) appending additional links to a file in the exist‐
1658       ing part of the archive, the c_nlink values in the modified archive can
1659       be  smaller  than the number of links to the file in the archive, which
1660       may mean that the links are not preserved on extraction.
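
       The collision risk can be made concrete with a small sketch. It
       assumes, for illustration only, that an extractor treats members
       sharing a (c_dev, c_ino) pair as links to one file; the function
       link_groups and the sample values are hypothetical.

```python
from collections import defaultdict


def link_groups(entries):
    """Group member names by (c_dev, c_ino).

    entries is an iterable of (name, c_dev, c_ino) tuples. Any group with
    more than one name would be restored as links to a single file.
    """
    groups = defaultdict(list)
    for name, c_dev, c_ino in entries:
        groups[(c_dev, c_ino)].append(name)
    return [names for names in groups.values() if len(names) > 1]


members = [
    ("old/report", 8, 1201),     # from the original archive
    ("old/notes", 8, 1202),
    ("new/unrelated", 8, 1201),  # appended later; inode number was reused
]
print(link_groups(members))
```

       Here the appended file is unrelated to old/report, yet the shared
       pair makes the two indistinguishable from hard links.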
1661
1662       The -p  (privileges)  option  was  invented  to  reconcile  differences
1663       between historical tar and cpio implementations. In particular, the two
1664       utilities use -m in diametrically opposed ways. The -p option also pro‐
1665       vides  a  consistent  means  of extending the ways in which future file
1666       attributes can be addressed, such as for enhanced security  systems  or
1667       high-performance  files. Although it may seem complex, there are really
1668       two modes that are most commonly used:
1669
       -p e   "Preserve everything". This would be used by the historical
1671              superuser,  someone with all the appropriate privileges, to pre‐
1672              serve all aspects of the files as they are recorded in  the  ar‐
1673              chive.  The  e flag is the sum of o and p, and other implementa‐
1674              tion-defined attributes.
1675
       -p p   "Preserve" the file mode bits. This would be used by the user
1677              with  regular  privileges  who wished to preserve aspects of the
1678              file other than the ownership. The file times are  preserved  by
1679              default,  but  two  other flags are offered to disable these and
1680              use the time of extraction.
1681
1682       The one pathname per line format of standard input precludes  pathnames
1683       containing  <newline>s.  Although  such  pathnames violate the portable
1684       filename guidelines, they may exist  and  their  presence  may  inhibit
1685       usage  of pax within shell scripts. This problem is inherited from his‐
1686       torical archive programs. The problem can be avoided by  listing  file‐
1687       name arguments on the command line instead of on standard input.
1688
1689       It  is  almost certain that appropriate privileges are required for pax
1690       to accomplish parts of this volume of IEEE Std  1003.1-2001.   Specifi‐
1691       cally,  creating  files  of  type  block  special or character special,
1692       restoring file access times unless the files are owned by the user (the
1693       -t  option),  or preserving file owner, group, and mode (the -p option)
1694       all probably require appropriate privileges.
1695
1696       In read mode, implementations are permitted to overwrite files when the
1697       archive  has multiple members with the same name. This may fail if per‐
1698       missions on the first version of the file do not permit it to be  over‐
1699       written.
1700
1701       The  cpio  and  ustar  formats  can only support files up to 8589934592
1702       bytes (8 * 2^30) in size.
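
       The figure follows from the width of the size fields. As a worked
       check in Python, assuming the 11 octal digits available in both the
       cpio c_filesize field and the ustar size field (the constant names are
       illustrative):

```python
OCTAL_DIGITS = 11             # digits available for the file size
LIMIT = 8 ** OCTAL_DIGITS     # number of distinct values, 8 * 2**30
LARGEST = LIMIT - 1           # largest size that can be written: 11 sevens

print(LIMIT, format(LARGEST, "o"))
```

       The largest size that actually fits in the field is therefore one
       octet less than the quoted limit.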
1703
1704

EXAMPLES

1706       The following command:
1707
1708            pax -w -f /dev/rmt/1m .
1709
       copies the contents of the current directory to tape drive 1, medium
       density (assuming historical System V device naming procedures; the
       historical BSD device name would be /dev/rmt9).
1713
1714       The following commands:
1715
            mkdir newdir
            pax -rw olddir newdir
1717
1718       copy the olddir directory hierarchy to newdir.
1719
       The command:

            pax -r -s ',^//*usr//*,,' -f a.pax
1721
1722       reads the archive a.pax, with all files rooted in /usr in  the  archive
1723       extracted relative to the current directory.
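
       The effect of that -s expression on individual pathnames can be
       previewed outside pax. This Python sketch applies the same pattern
       with the re module; it is only an approximation of the ed-style
       substitution, and strip_usr is a hypothetical helper name.

```python
import re

# Same pattern as in the -s option: one or more leading slashes,
# "usr", then one or more slashes.
USR_PREFIX = re.compile(r"^//*usr//*")


def strip_usr(path):
    """Return path with a leading /usr/ component removed, if present."""
    return USR_PREFIX.sub("", path, count=1)


for path in ("/usr/foo/bar", "//usr///lib/x", "/etc/passwd", "/usrlocal/x"):
    print(path, "->", strip_usr(path))
```

       Pathnames that do not begin with /usr/ are left unchanged, so they
       are still extracted under their archived names.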
1724
1725       Using the option:
1726
1727            -o listopt="%M %(atime)T %(size)D %(name)s"
1728
1729       overrides the default output description in Standard Output and instead
1730       writes:
1731
1732            -rw-rw--- Jan 12 15:53 1492 /usr/foo/bar
1733
1734       Using the options:
1735
1736            -o listopt='%L\t%(size)D\n%.7' \
1737            -o listopt='(name)s\n%(atime)T\n%T'
1738
1739       overrides the default output description in Standard Output and instead
1740       writes:
1741
1742       /usr/foo/bar -> /tmp   1492
1743       /usr/fo
1744       Jan 12 1991
1745       Jan 31 15:53
1746
1747

RATIONALE

1749       The  pax  utility  was new for the ISO POSIX-2:1993 standard. It repre‐
1750       sents a peaceful compromise between advocates of the historical tar and
1751       cpio utilities.
1752
1753       A  fundamental  difference between cpio and tar was in the way directo‐
1754       ries were treated. The cpio utility did not treat  directories  differ‐
1755       ently  from  other  files,  and  to select a directory and its contents
1756       required that each file in the hierarchy be explicitly  specified.  For
1757       tar, a directory matched every file in the file hierarchy it rooted.
1758
1759       The  pax  utility  offers  both interfaces; by default, directories map
1760       into the file hierarchy they root. The -d option causes pax to skip any
       file not explicitly referenced, as cpio historically did. The
       tar-style behavior was chosen as the default because it was believed that
1763       this  was  the  more  common usage and because tar is the more commonly
1764       available interface, as it was historically provided on both  System  V
1765       and BSD implementations.
1766
1767       The  data  interchange  format specification in this volume of IEEE Std
1768       1003.1-2001 requires that processes with "appropriate privileges" shall
1769       always restore the ownership and permissions of extracted files exactly
1770       as archived. If viewed from the historic equivalence between  superuser
1771       and "appropriate privileges", there are two problems with this require‐
1772       ment. First, users running as superusers may unknowingly set  dangerous
1773       permissions  on  extracted files. Second, it is needlessly limiting, in
1774       that superusers cannot extract files and own them as  superuser  unless
1775       the  archive  was  created  by  the superuser. (It should be noted that
1776       restoration  of  ownerships  and  permissions  for  the  superuser,  by
1777       default,  is historical practice in cpio, but not in tar.)  In order to
1778       avoid these two problems,  the  pax  specification  has  an  additional
1779       "privilege"  mechanism,  the  -p option. Only a pax invocation with the
1780       privileges needed, and which has the -p option set using the e specifi‐
1781       cation  character, has the "appropriate privilege" to restore full own‐
1782       ership and permission information.
1783
1784       Note also that this volume of IEEE Std 1003.1-2001  requires  that  the
1785       file  ownership  and access permissions shall be set, on extraction, in
1786       the same fashion as the creat(2) function when provided with  the  mode
1787       stored  in  the  archive. This means that the file creation mask of the
1788       user is applied to the file permissions.
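
       As a worked illustration of that rule, the following Python sketch
       computes the permission bits a file would receive when the archived
       mode is filtered through the file creation mask. The function name
       creat_mode is hypothetical, and only the nine permission bits are
       considered.

```python
def creat_mode(archived_mode, umask):
    """Permission bits after applying the file creation mask."""
    return archived_mode & ~umask & 0o777


print(oct(creat_mode(0o666, 0o022)))  # group and other lose write permission
print(oct(creat_mode(0o777, 0o077)))  # only the owner keeps access
```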
1789
1790       Users should note that directories may be created by pax while extract‐
1791       ing  files  with permissions that are different from those that existed
1792       at the time the archive was created. When extracting sensitive informa‐
1793       tion  into  a  directory  hierarchy  that  no  longer exists, users are
1794       encouraged to set their file creation  mask  appropriately  to  protect
1795       these files during extraction.
1796
1797       The  table  of contents output is written to standard output to facili‐
1798       tate pipeline processing.
1799
1800       An early proposal had hard links displaying for  all  pathnames.   This
1801       was  removed  because it complicates the output of the case where -v is
1802       not specified and does not match historical cpio usage.  The  hard-link
1803       information is available in the -v display.
1804
1805       The  description  of  the -l option allows implementations to make hard
1806       links to symbolic links. IEEE Std 1003.1-2001 does not specify any  way
1807       to create a hard link to a symbolic link, but many implementations pro‐
1808       vide this capability as an extension. If there are hard links  to  sym‐
1809       bolic  links when an archive is created, the implementation is required
1810       to archive the hard link in the archive (unless -H or -L is specified).
1811       When  in  read  mode  and in copy mode, implementations supporting hard
1812       links to symbolic links should use them when appropriate.
1813
1814       The archive formats inherited from the POSIX.1-1990 standard have  cer‐
1815       tain  restrictions  that have been brought along from historical usage.
1816       For example, there are restrictions on the length of  pathnames  stored
1817       in  the archive. When pax is used in copy (-rw) mode (copying directory
1818       hierarchies), the ability to use extensions  from  the  -x  pax  format
1819       overcomes these restrictions.
1820
1821       The default blocksize value of 5120 bytes for cpio was selected because
1822       it is one of the standard block-size values for cpio, set when  the  -B
1823       option  is  specified.  (The other default block-size value for cpio is
1824       512 bytes, and this was considered to be too small.) The default  block
1825       value  of 10240 bytes for tar was selected because that is the standard
1826       block-size value for BSD tar.  The maximum block size  of  32256  bytes
1827       (2^15-512  bytes) is the largest multiple of 512 bytes that fits into a
1828       signed 16-bit tape controller transfer register. There are known  limi‐
1829       tations  in  some  historical  systems that would prevent larger blocks
1830       from being accepted. Historical values were chosen to improve  compati‐
1831       bility  with  historical  scripts  using  dd(1) or similar utilities to
       manipulate archives. Also, default block sizes for any file type other
       than character special file have been deleted from this volume of IEEE
       Std 1003.1-2001 as unimportant and not likely to affect the structure
       of the resulting archive.
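
       The block-size figures in the preceding paragraph can be checked with
       a few lines of arithmetic. The Python sketch below is only a worked
       example; the names are illustrative.

```python
SIGNED_16_MAX = 2**15 - 1   # largest value of a signed 16-bit register


def largest_block(limit=SIGNED_16_MAX, unit=512):
    """Largest multiple of unit that does not exceed limit."""
    return (limit // unit) * unit


print(largest_block())           # maximum block size in bytes
print(5120 // 512, 10240 // 512)  # cpio and tar defaults in 512-byte units
```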
1836
1837       Implementations  are  permitted to modify the block-size value based on
1838       the archive format or the device to which the archive is being written.
1839       This  is to provide implementations with the opportunity to take advan‐
1840       tage of special types of devices, and it should not be used  without  a
1841       great  deal  of  consideration as it almost certainly decreases archive
1842       portability.
1843
1844       The intended use of the -n option was to permit extraction  of  one  or
1845       more files from the archive without processing the entire archive. This
1846       was viewed by the standard developers as offering  significant  perfor‐
1847       mance  advantages  over  historical  implementations.  The -n option in
1848       early proposals had three effects; the first was to cause special char‐
1849       acters in patterns to not be treated specially. The second was to cause
1850       only the first file that matched a pattern to be extracted.  The  third
1851       was  to  cause pax to write a diagnostic message to standard error when
1852       no file was found matching a specified pattern. Only the second  behav‐
1853       ior  is  retained by this volume of IEEE Std 1003.1-2001, for many rea‐
1854       sons. First, it is in general not acceptable for  a  single  option  to
1855       have  multiple  effects.  Second,  the ability to make pattern matching
1856       characters act as normal characters is useful for parts  of  pax  other
1857       than file extraction. Third, a finer degree of control over the special
1858       characters is useful because users may wish to normalize only a  single
1859       special  character  in  a single filename. Fourth, given a more general
1860       escape mechanism, the previous behavior of the -n option can be  easily
1861       obtained  using the -s option or a sed script. Finally, writing a diag‐
1862       nostic message when a pattern specified by the user is unmatched by any
1863       file is useful behavior in all cases.
1864
       In this version, the -n option was removed from the copy mode synopsis
       of pax;
1866       it is inapplicable because there are no pattern operands  specified  in
1867       this mode.
1868
       Besides pax, IEEE Std 1003.1-2001 describes another method for copying
       subtrees as part of the cp(1) utility. Both methods are
1871       historical  practice:  cp(1)  provides a simpler, more intuitive inter‐
1872       face, while pax offers a finer granularity of  control.  Each  provides
1873       additional functionality to the other; in particular, pax maintains the
1874       hard-link structure of the hierarchy while cp(1) does not.  It  is  the
1875       intention of the standard developers that the results be similar (using
1876       appropriate option combinations in both utilities). The results are not
1877       required  to  be  identical; there seemed insufficient gain to applica‐
1878       tions to balance the difficulty of implementations having to  guarantee
1879       that the results would be exactly identical.
1880
1881       A  single  archive  may  span  more than one file. It is suggested that
1882       implementations provide informative messages to the  user  on  standard
1883       error whenever the archive file is changed.
1884
1885       The -d option (do not create intermediate directories not listed in the
1886       archive) found in early proposals was originally provided as a  comple‐
1887       ment to the historic -d option of cpio.  It has been deleted.
1888
1889       The -s option in early proposals specified a subset of the substitution
1890       command from the ed utility. As there was no reason for only  a  subset
1891       to  be  supported,  the -s option is now compatible with the current ed
1892       specification. Since the delimiter can be any non-null  character,  the
1893       following usage with single spaces is valid:
1894
1895            pax -s " foo bar " ...
1896
1897       The  -t  description  is  worded  so as to note that this may cause the
1898       access time update caused by some other activity  (which  occurs  while
1899       the file is being read) to be overwritten.
1900
1901       The  default  behavior of pax with regard to file modification times is
1902       the same as historical implementations of tar.  It is not the  histori‐
1903       cal behavior of cpio.
1904
1905       Because  the  -i  option uses /dev/tty, utilities without a controlling
1906       terminal are not able to use this option.
1907
1908       The -y option, found in early proposals, has  been  deleted  because  a
1909       line  containing a single period for the -i option has equivalent func‐
1910       tionality. The special lines for the -i option (a single period and the
1911       empty line) are historical practice in cpio.
1912
1913       In early drafts, a -e charmap option was included to increase portabil‐
1914       ity of files between systems using different coded character sets. This
1915       option  was omitted because it was apparent that consensus could not be
1916       formed for it. In this version, the use of UTF-8 should be an  adequate
1917       substitute.
1918
1919       The  -k  option  was  added to address international concerns about the
1920       dangers involved in the character set transformations  of  -e  (if  the
1921       target  character  set  were  different  from the source, the filenames
1922       might be transformed into names matching existing files) and  also  was
1923       made  more  general  to  protect files transferred between file systems
1924       with different {NAME_MAX} values (truncating a filename  on  a  smaller
1925       system  might  also inadvertently overwrite existing files). As stated,
1926       it prevents any overwriting, even if the target file is older than  the
       source. This version adds more granularity of options to solve this
       problem by introducing the -o invalid= option, specifically the UTF-8
1929       action. (Note that an existing file that is named with a UTF-8 encoding
1930       is still subject to overwriting in this case. The -k option closes that
1931       loophole.)
1932
1933       Some  of the file characteristics referenced in this volume of IEEE Std
1934       1003.1-2001 might not be supported by some archive formats.  For  exam‐
1935       ple, neither the tar nor cpio formats contain the file access time. For
1936       this reason, the e specification character has been provided,  intended
1937       to  cause  all  file  characteristics  specified  in  the archive to be
1938       retained.
1939
1940       It is required that  extracted  directories,  by  default,  have  their
1941       access  and modification times and permissions set to the values speci‐
1942       fied in the archive. This has obvious problems in that the  directories
1943       are  almost certainly modified after being extracted and that directory
1944       permissions may not permit file creation. One possible solution  is  to
1945       create  directories with the mode specified in the archive, as modified
1946       by the umask of the user, with sufficient  permissions  to  allow  file
1947       creation. After all files have been extracted, pax would then reset the
1948       access and modification times and permissions as necessary.
1949
1950       The list-mode formatting  description  borrows  heavily  from  the  one
1951       defined  by  the printf(1) utility. However, since there is no separate
1952       operand list to get conversion arguments, the format  was  extended  to
1953       allow  specifying  the  name  of the conversion argument as part of the
1954       conversion specification.
1955
1956       The T conversion specifier allows time fields to be displayed in any of
1957       the  date  formats.  Unlike  the ls(1) utility, pax does not adjust the
1958       format when the date is less than six months in the  past.  This  makes
1959       parsing the output more predictable.
1960
1961       The   D  conversion  specifier  handles  the  ability  to  display  the
1962       major/minor or file size, as with ls(1), by using %-8(size)D.
1963
1964       The L conversion specifier handles the ls display for symbolic links.
1965
1966       Conversion specifiers were added to generate existing known types  used
1967       for ls(1).
1968
1969
1970   pax Interchange Format
1971       The  new  POSIX data interchange format was developed primarily to sat‐
1972       isfy international concerns that the ustar and  cpio  formats  did  not
1973       provide for file, user, and group names encoded in characters outside a
1974       subset of the ISO/IEC 646:1991 standard. The standard developers  real‐
1975       ized  that this new POSIX data interchange format should be very exten‐
1976       sible because there were other requirements they foresaw  in  the  near
1977       future:
1978
1979       ·      Support international character encodings and locale information
1980
1981       ·      Support security information (ACLs, and so on)
1982
1983       ·      Support future file types, such as realtime or contiguous files
1984
1985       ·      Include data areas for implementation use
1986
1987       ·      Support  systems  with words larger than 32 bits and timers with
1988              subsecond granularity
1989
1990       The following were not goals for this format because these  are  better
1991       handled  by separate utilities or are inappropriate for a portable for‐
1992       mat:
1993
1994       ·      Encryption
1995
1996       ·      Compression
1997
1998       ·      Data translation between locales and codesets
1999
2000       ·      inode storage
2001
2002       The format chosen to support the goals is an  extension  of  the  ustar
2003       format.  Of the two formats previously available, only the ustar format
2004       was selected for extensions because:
2005
2006       ·      It was easier to extend in an upwards-compatible way. It offered
2007              version  flags and header block type fields with room for future
2008              standardization. The cpio format, while possessing a more flexi‐
2009              ble  file  naming  methodology,  could  not  be extended without
2010              breaking some theoretical implementation or using a dummy  file‐
2011              name that could be a legitimate filename.
2012
2013       ·      Industry  experience  since  the  original  "tar wars" fought in
2014              developing the ISO POSIX-1 standard has clearly been in favor of
2015              the  ustar  format, which is generally the default output format
2016              selected for pax implementations on new systems.
2017
2018       The new format was designed with one additional goal in  mind:  reason‐
2019       able  behavior when an older tar or pax utility happened to read an ar‐
2020       chive. Since the POSIX.1-1990 standard mandated that a  "format-reading
2021       utility"  had  to  treat unrecognized typeflag values as regular files,
2022       this allowed the format to include all the extended  information  in  a
2023       pseudo-regular  file  that  preceded each real file. An option is given
2024       that allows the archive creator to set up reasonable  names  for  these
2025       files  on  the  older  systems.  Also, the normative text suggests that
2026       reasonable file access values be used for this ustar header block. Mak‐
2027       ing these header files inaccessible for convenient reading and deleting
2028       would not be reasonable. File permissions of 600 or 700 are suggested.
2029
2030       The ustar typeflag field was used to accommodate the  additional  func‐
2031       tionality  of  the  new format rather than magic or version because the
2032       POSIX.1-1990 standard (and, by reference, the previous version of pax),
2033       mandated the behavior of the format-reading utility when it encountered
2034       an unknown typeflag, but was silent about the other two fields.
2035
2036       Early proposals of the first revision to IEEE Std 1003.1-2001 contained
2037       a  proposed  archive  format  that  was based on compatibility with the
2038       standard for tape files (ISO 1001, similar to the format used  histori‐
2039       cally  on  many  mainframes  and minicomputers). This format was overly
2040       complex  and  required  considerable  overhead  in  volume  and  header
2041       records. Furthermore, the standard developers felt that it would not be
2042       acceptable to the community  of  POSIX  developers,  so  it  was  later
2043       changed  to  be a format more closely related to historical practice on
2044       POSIX systems.
2045
2046       The prefix and name split of pathnames in ustar  was  replaced  by  the
2047       single path extended header record for simplicity.
2048
2049       The concept of a global extended header (typeflag g) was controversial.
2050       If this were applied to an archive being recorded on magnetic  tape,  a
2051       few  unreadable  blocks at the beginning of the tape could be a serious
2052       problem; a utility attempting to extract as many files as possible from
2053       a damaged archive could lose a large percentage of file header informa‐
2054       tion in this case. However, if the archive were on a  reliable  medium,
2055       such as a CD-ROM, the global extended header offers considerable poten‐
2056       tial size reductions by eliminating redundant  information.  Thus,  the
2057       text  warns  against  using  the global method for unreliable media and
2058       provides a method for implanting global  information  in  the  extended
2059       header for each file, rather than in the typeflag g records.
2060
2061       No  facility  for  data translation or filtering on a per-file basis is
2062       included because the standard developers could not invent an  interface
2063       that  would  allow  this  in  an efficient manner. If a filter, such as
2064       encryption or compression, is to be applied to all  the  files,  it  is
2065       more  efficient  to  apply the filter to the entire archive as a single
2066       file. The standard developers considered interfaces that would invoke a
2067       shell  script  for  each file going into or out of the archive, but the
2068       system overhead in this approach was considered to be too high.
2069
2070       One such approach would be to have filter= records that give a pathname
2071       for  an  executable.  When the program is invoked, the file and archive
2072       would be open for standard input/output and all the header fields would
2073       be  available  as  environment variables or command-line arguments. The
2074       standard developers did discuss such schemes,  but  they  were  omitted
2075       from  IEEE  Std  1003.1-2001  due to concerns about excessive overhead.
2076       Also, the program itself would need to be in the archive if it were  to
2077       be used portably.
2078
2079       There  is  currently  no  portable  means  of identifying the character
2080       set(s) used for a file in the file system. Therefore, pax has not  been
2081       given  a mechanism to generate charset records automatically.  The only
2082       portable means of doing this is for the user to write the archive using
2083       the -o charset=string command line option. This assumes that all of the
2084       files in the  archive  use  the  same  encoding.  The  "implementation-
2085       defined"  text  is included to allow for a system that can identify the
2086       encodings used for each of its files.
2087
2088       The table of standards that accompanies the charset record  description
2089       is  acknowledged to be very limited. Only a limited number of character
2090       set standards is reasonable for maximal interchange. Any character  set
2091       is,  of  course,  possible  by  prior  agreement. It was suggested that
2092       EBCDIC be listed, but it was omitted because it is  not  defined  by  a
2093       formal  standard. Formal standards, and then only those with reasonably
2094       large followings, can be included here, simply as a matter  of  practi‐
2095       cality. The <value>s represent names of officially registered character
2096       sets in the format required by the ISO 2375:1985 standard.
2097
2098       The normal comma or <blank>-separated list rules are  not  followed  in
2099       the  case  of  keyword  options  to  allow ease of argument parsing for
2100       getopts.
2101
2102       Further information on character encodings is in pax Archive  Character
2103       Set Encoding/Decoding.
2104
2105       The  standard  developers  have  reserved keyword name space for vendor
2106       extensions. It is suggested that the format to be used is:
2107
2108           VENDOR.keyword
2109
2110       where VENDOR is the name of the vendor or organization in all uppercase
2111       letters.  It is further suggested that the keyword following the period
2112       be named differently than any of the standard keywords so that it could
2113       be  used  for  future  standardization, if appropriate, by omitting the
2114       VENDOR prefix.
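
       The naming convention above can be sketched as a small check. This
       is only an illustration: the function name split_vendor_keyword is
       hypothetical, and the test for "all uppercase letters" is one
       possible reading of the suggestion, not something the standard
       mandates.

```python
def split_vendor_keyword(keyword: str):
    """Split 'VENDOR.keyword' into (vendor, name).

    Returns (None, keyword) when the part before the first period is not
    an all-uppercase ASCII name, i.e. when the keyword does not follow the
    suggested vendor-extension convention.
    """
    vendor, dot, name = keyword.partition(".")
    if dot and name and vendor.isascii() and vendor.isalpha() and vendor.isupper():
        return vendor, name
    return None, keyword
```

       A keyword such as VENDOR.devmajor splits into the vendor prefix and
       devmajor, so dropping the prefix later yields a plain keyword name.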
2115
2116       The <length> field in the extended header record was included  to  make
2117       it  simpler  to  step through the records, even if a record contains an
2118       unknown format (to a particular pax) with complex interactions of  spe‐
2119       cial  characters.  It also provides a minor integrity checkpoint within
2120       the records to aid a program attempting to recover files from a damaged
2121       archive.
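
       A minimal sketch of how the <length> field lets a reader step
       through records without understanding their contents. It assumes
       the record layout given in the extended header description of this
       manual ("<length> <keyword>=<value>" followed by a newline, with
       <length> counting the whole record in decimal). The function names
       are hypothetical.

```python
def make_pax_record(keyword: str, value: bytes) -> bytes:
    """Build one extended header record; the length counts its own digits."""
    body = b" " + keyword.encode("utf-8") + b"=" + value + b"\n"
    length = len(body)
    while True:
        total = len(body) + len(str(length))
        if total == length:
            break
        length = total
    return str(length).encode("ascii") + body


def parse_pax_records(data: bytes) -> dict:
    """Walk the records using only <length>; unknown keywords are kept as-is."""
    records = {}
    pos = 0
    while pos < len(data):
        space = data.index(b" ", pos)
        length = int(data[pos:space])      # decimal length of the whole record
        record = data[pos:pos + length]
        # Integrity checkpoint: the record must end exactly on a newline.
        if len(record) != length or not record.endswith(b"\n"):
            raise ValueError(f"damaged record at offset {pos}")
        keyword, sep, value = record[space - pos + 1:-1].partition(b"=")
        if not sep:
            raise ValueError(f"record at offset {pos} has no '='")
        records[keyword.decode("utf-8")] = value
        pos += length
    return records
```

       Because the parser advances by <length> alone, a value containing
       spaces, equals signs, or newlines does not confuse it.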
2122
2123       There  are  no  extended  header  versions of the devmajor and devminor
2124       fields because the unspecified format ustar header field should be suf‐
2125       ficient.  If  they  are not, vendor-specific extended keywords (such as
2126       VENDOR.devmajor) should be used.
2127
2128       Device and i-number labeling of files was not adopted from cpio;  files
2129       are interchanged strictly on a symbolic name basis, as in ustar.
2130
2131       Just  as  with  the  ustar format descriptions, the new format makes no
2132       special arrangements for multi-volume archives. Each of the pax archive
2133       types  is  assumed  to be inside a single POSIX file and splitting that
2134       file over multiple volumes (diskettes, tape  cartridges,  and  so  on),
2135       processing  their  labels, and mounting each in the proper sequence are
2136       considered to  be  implementation  details  that  cannot  be  described
2137       portably.
2138
2139       The  pax  format  is intended for interchange, not only for backup on a
2140       single (family of) systems. It is not as densely  packed  as  might  be
2141       possible for backup:
2142
2143       ·      It  contains information as coded characters that could be coded
2144              in binary.
2145
2146       ·      It identifies extended records with name fields  that  could  be
2147              omitted in favor of a fixed-field layout.
2148
2149       ·      It translates names into a portable character set and identifies
2150              locale-related information, both of which are probably  unneces‐
2151              sary for backup.
2152
2153       The  requirements  on  restoring from an archive are slightly different
2154       from the historical wording, allowing for non-monolithic  privilege  to
2155       bring  forward  as  much as possible. In particular, attributes such as
2156       "high performance file" might be broadly but  not  universally  granted
2157       while  set-user-ID  or chown(2) might be much more restricted. There is
2158       no implication in IEEE Std 1003.1-2001 that the security information be
2159       honored  after  it  is restored to the file hierarchy, in spite of what
2160       might be improperly inferred by the silence on that topic.  That  is  a
2161       topic for another standard.
2162
2163       Links  are recorded in the fashion described here because a link can be
2164       to any file type. It is desirable in general to be able to restore part
2165       of an archive selectively and restore all of those files completely. If
2166       the data is not associated with each link, it is  not  possible  to  do
2167       this.  However,  the data associated with a file can be large, and when
2168       selective restoration is not needed, this can be a significant  burden.
2169       The  archive  is  structured so that files that have no associated data
2170       can always be restored by the name of any link, and
2171       the  user  may  choose whether data is recorded with each instance of a
2172       file that contains data. The format permits mixing  of  both  types  of
2173       links  in a single archive; this can be done for special needs, and pax
2174       is expected to interpret such archives on input properly,  despite  the
2175       fact  that  there  is no pax option that would force this mixed case on
2176       output. (When -o linkdata is used, the output must contain  the  dupli‐
2177       cate data, but the implementation is free to include it or omit it when
2178       -o linkdata is not used.)
2179
2180       The time values are included  as  extended  header  records  for  those
2181       implementations  needing  more  than the eleven octal digits allowed by
2182       the ustar format. Portable file timestamps cannot be negative.  If  pax
2183       encounters  a  file with a negative timestamp in copy or write mode, it
2184       can reject the file, substitute a non-negative timestamp, or generate a
2185       non-portable timestamp with a leading '-'. Even though some
           implementations can support finer file-time granularities than seconds,
2186       the normative text requires support only for seconds since the Epoch
2187       because the ISO POSIX-1 standard states them that way. The ustar format
2188       includes only mtime; the new format adds atime and ctime for  symmetry.
2189       The  atime  access time restored to the file system will be affected by
2190       the -p a and -p e options. The ctime creation time (actually inode mod‐
2191       ification  time)  is  described with "appropriate privilege" so that it
2192       can be ignored when writing to the file system. POSIX does not  provide
2193       a  portable  means to change file creation time. Nothing is intended to
2194       prevent a non-portable implementation of pax from restoring the value.
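
       The limit imposed by eleven octal digits can be worked out
       directly. The sketch below is illustrative only; the helper name
       needs_extended_mtime is hypothetical, and it simply encodes the
       constraints stated above (non-negative, whole seconds, at most
       eleven octal digits).

```python
from datetime import datetime, timedelta, timezone

USTAR_MTIME_DIGITS = 11                       # octal digits, as stated above
MAX_USTAR_MTIME = 8 ** USTAR_MTIME_DIGITS - 1  # largest value that fits

# Date corresponding to the largest representable ustar timestamp.
LATEST_USTAR_DATE = datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(
    seconds=MAX_USTAR_MTIME
)


def needs_extended_mtime(mtime: float) -> bool:
    """True when mtime cannot be stored exactly in the ustar header field."""
    whole_seconds = mtime == int(mtime)
    in_range = 0 <= mtime <= MAX_USTAR_MTIME
    return not (whole_seconds and in_range)
```

       Timestamps that are negative, fractional, or beyond the year 2242
       would need an extended header record under these assumptions.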
2195
2196       The gid, size, and uid extended header records were included  to  allow
2197       expansion  beyond  the  sizes  specified in the regular tar header. New
2198       file system architectures are emerging that will exhaust  the  12-digit
2199       size  field.  There are probably not many systems requiring more than 8
2200       digits for user and group IDs, but  the  extended  header  values  were
2201       included  for  completeness,  allowing overrides for all of the decimal
2202       values in the tar header.
2203
2204       The standard developers intended to describe the effective  results  of
2205       pax with regard to file ownerships and permissions; implementations are
2206       not restricted in timing or sequencing the restoration  of  such,  pro‐
2207       vided the results are as specified.
2208
2209       Much  of  the  text  describing  the  extended headers refers to use in
2210       "write or copy modes". The copy mode references are due to  the  norma‐
2211       tive text: "The effect of the copy shall be as if the copied files were
2212       written to an archive file and then subsequently extracted ...".  There
2213       is  certainly  no  way  to  test whether pax is actually generating the
2214       extended headers in copy mode, but the effects must be as if it had.
2215
2216
2217   pax Archive Character Set Encoding/Decoding
2218       There is a need to exchange archives of files between systems  of  dif‐
2219       ferent  native codesets. Filenames, group names, and user names must be
2220       preserved to the fullest extent possible when an archive is read on the
2221       receiving  platform. Translation of the contents of files is not within
2222       the scope of the pax utility.
2223
2224       There will also be the need to represent characters that are not avail‐
2225       able  on the receiving platform. These unsupported characters cannot be
2226       automatically folded to the local set of characters due to  the  chance
2227       of  collisions.  This could result in overwriting previously extracted
2228       files from the archive or pre-existing files on the system.
2229
2230       For these reasons, the codeset used to represent characters within  the
2231       extended header records of the pax archive must be sufficiently rich to
2232       handle all commonly used character sets. The fields requiring  transla‐
2233       tion  include,  at  a  minimum, filenames, user names, group names, and
2234       link pathnames. Implementations may wish  to  have  localized  extended
2235       keywords that use non-portable characters.
2236
2237       The standard developers considered the following options:
2238
2239       ·      The  archive  creator  specifies  the  well-defined  name of the
2240              source codeset. The receiver must  then  recognize  the  codeset
2241              name and perform the appropriate translations to the destination
2242              codeset.
2243
2244       ·      The archive creator includes within the  archive  the  character
2245              mapping  table  for  the  source codeset used to encode extended
2246              header records. The receiver must then read the  character  map‐
2247              ping  table and perform the appropriate translations to the des‐
2248              tination codeset.
2249
2250       ·      The archive creator translates the extended  header  records  in
2251              the source codeset into a canonical form. The receiver must then
2252              perform the appropriate translations to the destination codeset.
2253
2254       The approach that incorporates the name of the source codeset poses the
2255       problem  of codeset name registration, and makes the archive useless to
2256       pax archive decoders that do not recognize that codeset.
2257
2258       Because parts of an archive may be corrupted, the  standard  developers
2259       felt  that  including  the  character map of the source codeset was too
2260       fragile. The loss of this one key component could result in making  the
2261       entire  archive  useless.  (The  difference between this and the global
2262       extended header decision was that the latter has a workaround,
2263       duplicating extended header records on unreliable media, but this
2264       would be too burdensome for large character set maps.)
2265
2266       Both of the above approaches also put an undue burden on  the  pax  ar‐
2267       chive  receiver  to handle the cross-product of all source and destina‐
2268       tion codesets.
2269
2270       To simplify the translation from the source codeset  to  the  canonical
2271       form  and from the canonical form to the destination codeset, the stan‐
2272       dard developers decided that the internal representation  should  be  a
2273       stateless  encoding.  A  stateless encoding is one where each codepoint
2274       has the same meaning, without regard to the decoder being in a specific
2275       state.  An  example of a stateful encoding would be the Japanese Shift-
2276       JIS; an example of a stateless encoding would be the  ISO/IEC  646:1991
2277       standard (equivalent to 7-bit ASCII).
2278
2279       For these reasons, the standard developers decided to adopt a canonical
2280       format for the representation of file information strings. The obvious,
2281       well-endorsed  candidate is the ISO/IEC 10646-1:2000 standard (based in
2282       part on Unicode), which can be used to represent the characters of vir‐
2283       tually  all  standardized  character sets. The standard developers ini‐
2284       tially agreed upon using UCS2 (16-bit Unicode) as the  internal  repre‐
2285       sentation.  This  repertoire of characters provides a sufficiently rich
2286       set to represent all commonly-used codesets.
2287
2288       However, the standard developers found that the 16-bit  Unicode  repre‐
2289       sentation  had some problems. It forced the issue of standardizing byte
2290       ordering. The 2-byte length of each character made the extended  header
2291       records  twice as long for the case of strings coded entirely from his‐
2292       torical 7-bit ASCII. For these reasons, the standard  developers  chose
2293       the UTF-8 defined in the ISO/IEC 10646-1:2000 standard. This multi-byte
2294       representation encodes UCS2 or UCS4 characters reliably and determinis‐
2295       tically,  eliminating  the need for a canonical byte ordering. In addi‐
2296       tion, NUL octets and other characters possibly confusing to POSIX  file
2297       systems  do not appear, except to represent themselves. It was realized
2298       that certain national codesets take up more space after  the  encoding,
2299       due  to their placement within the UCS range; it was felt that the use‐
2300       fulness of the encoding of the names outweighs the disadvantage of size
2301       increase for file, user, and group names.
2302
2303       The encoding of UTF-8 is as follows:
2304
2305       UCS4 Hex Encoding   UTF-8 Binary Encoding
2306       00000000-0000007F   0xxxxxxx
2307       00000080-000007FF   110xxxxx 10xxxxxx
2308       00000800-0000FFFF   1110xxxx 10xxxxxx 10xxxxxx
2309       00010000-001FFFFF   11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
2310       00200000-03FFFFFF   111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
2311       04000000-7FFFFFFF   1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
2312
2313       where  each  'x' represents a bit value from the character being trans‐
2314       lated.
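
       The table can be turned into code almost mechanically. The sketch
       below follows the six rows exactly as printed here; the function
       name utf8_encode is hypothetical. Note that the later definition of
       UTF-8 in RFC 3629 stops at four bytes (U+10FFFF), so the five- and
       six-byte forms are of historical interest only.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one UCS4 value using the bit layout from the table above."""
    if not 0 <= code_point <= 0x7FFFFFFF:
        raise ValueError("code point outside the UCS4 range of the table")
    if code_point <= 0x7F:
        return bytes([code_point])            # 0xxxxxxx
    # (upper bound of the row, total bytes, marker bits of the lead byte)
    rows = (
        (0x7FF, 2, 0xC0),
        (0xFFFF, 3, 0xE0),
        (0x1FFFFF, 4, 0xF0),
        (0x3FFFFFF, 5, 0xF8),
        (0x7FFFFFFF, 6, 0xFC),
    )
    for limit, nbytes, lead in rows:
        if code_point <= limit:
            break
    out = bytearray(nbytes)
    # Fill continuation bytes (10xxxxxx) from the right, six bits at a time.
    for i in range(nbytes - 1, 0, -1):
        out[i] = 0x80 | (code_point & 0x3F)
        code_point >>= 6
    out[0] = lead | code_point                # remaining high bits
    return bytes(out)
```

       For values up to U+10FFFF (excluding surrogates) the result agrees
       with Python's built-in str.encode("utf-8").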
2315
2316
2317   ustar Interchange Format
2318       The description of the ustar format reflects numerous enhancements over
2319       pre-1988  versions  of  the  historical  tar utility. The goal of these
2320       changes was not only to provide the  functional  enhancements  desired,
2321       but  also  to  retain  compatibility between new and old versions. This
2322       compatibility has been retained. Archives written using the old archive
2323       format are compatible with the new format.
2324
2325       Implementors  should  be  aware  that  the previous file format did not
2326       include a mechanism to archive directory type files. For  this  reason,
2327       the  convention  of  using  a filename ending with slash was adopted to
2328       specify a directory on the archive.
2329
2330       The total size of the name and prefix fields has been set to  meet  the
2331       minimum  requirements for {PATH_MAX}. If a pathname will fit within the
2332       name field, it is recommended that the pathname be stored there without
2333       the use of the prefix field. Although the name field is known to be too
2334       small to contain {PATH_MAX} characters, the value was  not  changed  in
2335       this version of the archive file format to retain backwards-compatibil‐
2336       ity, and instead the prefix was introduced. Also, because of  the  ear‐
2337       lier  version  of the format, there is no way to remove the restriction
2338       on the linkname field being limited in size to just that  of  the  name
2339       field.
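
       A sketch of the recommended use of the two fields. It assumes the
       ustar field sizes of 100 bytes for name and 155 bytes for prefix,
       joined by a slash that is not itself stored. The function name
       split_ustar_path and the use of UTF-8 for the byte count are
       assumptions made for illustration.

```python
NAME_SIZE = 100     # bytes available in the ustar name field (assumed)
PREFIX_SIZE = 155   # bytes available in the ustar prefix field (assumed)


def split_ustar_path(path: str):
    """Return (prefix, name) as bytes, or None if the path cannot be stored.

    A path that fits in the name field is stored there with an empty
    prefix, as recommended above. Otherwise the path is split at a slash;
    the slash itself is implied and not stored in either field.
    """
    raw = path.encode("utf-8")
    if len(raw) <= NAME_SIZE:
        return b"", raw
    for i, byte in enumerate(raw):
        if byte != 0x2F:                      # only split at '/'
            continue
        prefix, name = raw[:i], raw[i + 1:]
        if 0 < len(prefix) <= PREFIX_SIZE and 0 < len(name) <= NAME_SIZE:
            return prefix, name
    return None
```

       A None result corresponds to the case where a plain ustar header is
       not enough and an extended header path record would be needed.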
2340
2341       The  size  field  is  required  to  be meaningful in all implementation
2342       extensions, although it could be zero. This is  required  so  that  the
2343       data blocks can always be properly counted.
2344
2345       It  is  suggested  that  if device special files need to be represented
2346       that cannot be represented in the standard  format,  that  one  of  the
2347       extension  types (A-Z) be used, and that the additional information for
2348       the special file be represented as data and be reflected  in  the  size
2349       field.
2350
2351       Attempting  to  restore  a  special file type, where it is converted to
2352       ordinary data and conflicts with an existing filename, need not be spe‐
2353       cially  detected by the utility. If run as an ordinary user, pax should
2354       not be able to overwrite the entries in, for example, /dev in any  case
2355       (whether  the  file  is  converted to another type or not). If run as a
2356       privileged user, it should be able to do so, and it would be considered
2357       a  bug if it did not. The same is true of ordinary data files and simi‐
2358       larly named special files; it is impossible to anticipate the needs  of
2359       the user (who could really intend to overwrite the file), so the behav‐
2360       ior should be predictable (and thus regular) and rely on the protection
2361       system as required.
2362
2363       The  value 7 in the typeflag field is intended to define how contiguous
2364       files can be stored in a ustar archive.  IEEE Std 1003.1-2001 does  not
2365       require  the  contiguous file extension, but does define a standard way
2366       of archiving such files so that all conforming  systems  can  interpret
2367       these  file  types  in  a meaningful and consistent manner. On a system
2368       that does not support extended file types, the pax  utility  should  do
2369       the best it can with the file and go on to the next.
2370
2371       The  file  protection  modes are those conventionally used by the ls(1)
2372       utility. This is extended beyond the usage in the ISO POSIX-2  standard
2373       to  support  the "shared text" or "sticky" bit. It is intended that the
2374       conformance document should not document anything beyond the  existence
2375       of  and  support  of  such  a mode.  Further extensions are expected to
2376       these bits, particularly with  overloading  the  set-user-ID  and  set-
2377       group-ID flags.
2378
2379
2380   cpio Interchange Format
2381       The  reference to appropriate privilege in the cpio format refers to an
2382       error on standard output; the ustar format  does  not  make  comparable
2383       statements.
2384
2385       The  model  for  this  format  was the historical System V cpio -c data
2386       interchange format. This model documents the portable  version  of  the
2387       cpio  format  and  not  the  binary  version. It has the flexibility to
2388       transfer data of any type described within IEEE Std 1003.1-2001, yet is
2389       extensible  to  transfer  data types specific to extensions beyond IEEE
2390       Std 1003.1-2001 (for example, contiguous files). Because  it  describes
2391       existing practice, there is no question of maintaining upwards-compati‐
2392       bility.
2393
2394
2395   cpio Header
2396       There has been some concern that the size of the  c_ino  field  of  the
2397       header  is too small to handle those systems that have very large inode
2398       numbers. However, the c_ino field in the header is used strictly  as  a
2399       hard-link  resolution mechanism for archives. It is not necessarily the
2400       same value as the inode number of the file in the location  from  which
2401       that file is extracted.
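
       Since c_ino only has to distinguish hard-linked files within one
       archive, a writer can substitute small archive-local numbers for
       large inode numbers. The class below is a hypothetical sketch of
       that idea, not a description of how any particular pax behaves; it
       assumes the six-octal-digit width of the c_ino field.

```python
class InodeMapper:
    """Map (device, inode) pairs to small numbers that fit in c_ino."""

    def __init__(self, limit: int = 0o777777):
        self._ids = {}
        self._limit = limit                   # largest value six octal digits hold

    def archive_ino(self, st_dev: int, st_ino: int) -> int:
        key = (st_dev, st_ino)
        if key not in self._ids:
            if len(self._ids) >= self._limit:
                raise OverflowError("too many distinct files for c_ino")
            # Number files in the order they are first seen, starting at 1.
            self._ids[key] = len(self._ids) + 1
        return self._ids[key]
```

       Two names that refer to the same file receive the same number, which
       is all that hard-link resolution on extraction requires.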
2402
2403       The name c_magic is based on historical usage.
2404
2405
2406   cpio Filename
2407       For  most  historical  implementations  of the cpio utility, {PATH_MAX}
2408       octets can be used to describe the pathname without the addition of any
2409       other  header  fields  (the  NUL  character  would  be included in this
2410       count).  {PATH_MAX} is the minimum value for pathname size,  documented
2411       as  256  bytes. However, an implementation may use c_namesize to deter‐
2412       mine the exact length of the pathname.  With the current description of
2413       the  <cpio.h>  header,  this  pathname size can be as large as a number
2414       that is described in six octal digits.
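
       As a worked example of the six-octal-digit limit, the sketch below
       formats a c_namesize value. The function name encode_c_namesize is
       hypothetical; the count includes the terminating NUL, as noted
       above.

```python
MAX_C_NAMESIZE = 0o777777   # largest value six octal digits can express


def encode_c_namesize(pathname: bytes) -> bytes:
    """Return the six-octal-digit c_namesize field for a pathname."""
    size = len(pathname) + 1                  # include the terminating NUL
    if size > MAX_C_NAMESIZE:
        raise ValueError("pathname too long for c_namesize")
    return b"%06o" % size
```

       The upper bound works out to 262143 octets, far above the 256 bytes
       mentioned for {PATH_MAX}.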
2415
2416       Two values are documented under the c_mode field values to provide  for
2417       extensibility for known file types:
2418
2419       0110 000
2420              Reserved  for contiguous files. The implementation may treat the
2421              rest of the information for this archive like a regular file. If
2422              this  file  type is undefined, the implementation may create the
2423              file as a regular file.
2424
2425       This provides for extensibility of the cpio format while  allowing  for
2426       the  ability to read old archives. Files of an unknown type may be read
2427       as "regular files" on some implementations. On a system that  does  not
2428       support  extended file types, the pax utility should do the best it can
2429       with the file and go on to the next.
2430
2431

FUTURE DIRECTIONS

2433       None.
2434
2435

End of informative sections.

2437_________________________________________________________________
2438
2439

SEE ALSO

2441       Shell Command Language, cp(1), ed(1), getopts(1), ls(1), printf(3), the
2442       Base  Definitions  volume of IEEE Std 1003.1-2001, <cpio.h>, the System
2443       Interfaces  volume  of  IEEE  Std  1003.1-2001,   chown(2),   creat(2),
2444       mkdir(2), mkfifo(3), stat(2), utime(2), write(2).
2445
2446

CHANGE HISTORY

2448       First released in Issue 4.
2449
2450
2451   Issue 5
2452       A  note  is added to the APPLICATION USAGE indicating that the cpio and
2453       tar formats can only support files up to 8 gigabytes in size.
2454
2455
2456   Issue 6
2457       The pax utility is aligned with the IEEE P1003.2b draft standard:
2458
2459       ·      Support has been added for symbolic links  in  the  options  and
2460              interchange formats.
2461
2462       ·      A new format has been devised, based on extensions to ustar.
2463
2464       ·      References  to  the "extended" tar and cpio formats derived from
2465              the POSIX.1-1990  standard  have  been  changed  to  remove  the
2466              "extended" adjective because this could cause confusion with the
2467              extended tar header added in this revision. (All  references  to
2468              tar are actually to ustar.)
2469
2470       The TZ entry is added to the ENVIRONMENT VARIABLES section.
2471
2472       IEEE  PASC  Interpretation  1003.2  #168  is  applied,  clarifying that
2473       mkdir(2) and mkfifo(3) calls can ignore an [EEXIST] error when extract‐
2474       ing an archive.
2475
2476       IEEE  PASC  Interpretation  1003.2  #180  is  applied,  clarifying  how
2477       extracted files are created when in read mode.
2478
2479       IEEE  PASC  Interpretation  1003.2  #181  is  applied,  clarifying  the
2480       description of the -t option.
2481
2482       IEEE PASC Interpretation 1003.2 #195 is applied.
2483
2484       IEEE  PASC  Interpretation  1003.2 #206 is applied, clarifying the han‐
2485       dling of links for the -H, -L, and -l options.
2486
2487       IEEE Std 1003.1-2001/Cor 1-2002, item XCU/TC1/D6/35 is applied,  adding
2488       the process ID of the pax process into certain fields. This change pro‐
2489       vides  a  method  for  the  implementation  to  ensure  that  different
2490       instances of pax extracting a file named /a/b/foo will not collide when
2491       processing the extended header information associated with foo.
2492
2493       IEEE Std 1003.1-2001/Cor 1-2002, item XCU/TC1/D6/36 is applied,  chang‐
2494       ing -x B to -x pax in the OPTIONS section.
2495
2496       IEEE  Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/20 is applied, updat‐
2497       ing the SYNOPSIS to be consistent with the normative text.
2498
2499       IEEE Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/21 is applied,  updat‐
2500       ing  the  DESCRIPTION  to describe the behavior when files to be linked
2501       are symbolic links and the system is not capable of making  hard  links
2502       to symbolic links.
2503
2504       IEEE  Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/22 is applied, updat‐
2505       ing the OPTIONS section to  describe  the  behavior  for  how  multiple
2506       options are to be handled.
2507
2508       IEEE  Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/23 is applied, updat‐
2509       ing the write option within the OPTIONS section.
2510
2511       IEEE Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/24 is applied,  adding
2512       a  paragraph  into the OPTIONS section that states that specifying more
2513       than one of the mutually-exclusive options (-H and -L) is  not  consid‐
2514       ered  an  error  and  that the last option specified will determine the
2515       behavior of the utility.
2516
2517       IEEE Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/25 is applied,  remov‐
2518       ing  the  ctime  paragraph within the EXTENDED DESCRIPTION.  There is a
2519       contradiction in the definition  of  the  ctime  keyword  for  the  pax
2520       extended header, in that the st_ctime member of the stat structure does
2521       not refer to a file creation time. No field in the standard stat struc‐
2522       ture from <sys/stat.h> includes a file creation time.
2523
2524       IEEE  Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/26 is applied, making
2525       it clear that typeflag 1 (see ustar Interchange Format) applies not
2526       only  to files that are hard-linked, but also to files that are aliased
2527       via symlinks.
2528
2529       IEEE Std 1003.1-2001/Cor 2-2004, item XCU/TC2/D6/27 is applied,  clari‐
2530       fying the cpio c_nlink field.
2531
2532       End of quoted text from the POSIX.1-2001 standard.
2533

OPTIONS

2535       The following other options are implemented as extensions to the POSIX
2536       standard:
2537
2538       -help  Prints a summary of the most important options for  spax(1)  and
2539              exits.
2540
2541       -xhelp Prints  a  summary of the less important options for spax(1) and
2542              exits.
2543
2544       -version
2545              Prints the spax version number string and exits.
2546
2547

EXAMPLES

ENVIRONMENT

FILES

SEE ALSO

DIAGNOSTICS

NOTES

2554       The Institute of Electrical and  Electronics  Engineers  and  The  Open
2555       Group have given us permission to reprint portions of their
2556       documentation. In the following statement, the phrase ``this text''
2557       refers to portions of the system documentation.
2558
2559       Portions  of  this text are reprinted and reproduced in electronic form
2560       in the spax manual, from IEEE Std 1003.1, 2004 Edition,  Standard  for
2561       Information  Technology -- Portable Operating System Interface (POSIX),
2562       The Open Group Base Specifications Issue 6, Copyright (C) 2001-2004  by
2563       the Institute of Electrical and Electronics Engineers, Inc and The Open
2564       Group. In the event of any discrepancy between these versions  and  the
2565       original  IEEE  and  The Open Group Standard, the original IEEE and The
2566       Open Group Standard is the referee document. The original Standard  can
2567       be obtained online at http://www.opengroup.org/unix/online.html.
2568

BUGS

AUTHOR

2571       Joerg Schilling
2572       Seestr. 110
2573       D-13353 Berlin
2574       Germany
2575
2576       Mail bugs and suggestions to:
2577
2578       schilling@fokus.fraunhofer.de       or       js@cs.tu-berlin.de      or
2579       joerg@schily.isdn.cs.tu-berlin.de
2580
2581
2582
2583Joerg Schilling                    09/04/10                           SPAX(1L)