PAX(1P)                    POSIX Programmer's Manual                   PAX(1P)

PROLOG

       This  manual  page is part of the POSIX Programmer's Manual.  The Linux
       implementation of this interface may differ (consult the  corresponding
       Linux  manual page for details of Linux behavior), or the interface may
       not be implemented on Linux.

NAME

       pax - portable archive interchange

SYNOPSIS

       pax [-dv] [-c|-n] [-H|-L] [-o options] [-f archive] [-s replstr]...
           [pattern...]

       pax -r [-c|-n] [-dikuv] [-H|-L] [-f archive] [-o options]... [-p string]...
           [-s replstr]... [pattern...]

       pax -w [-dituvX] [-H|-L] [-b blocksize] [[-a] [-f archive]] [-o options]...
           [-s replstr]... [-x format] [file...]

       pax -r -w [-diklntuvX] [-H|-L] [-o options]... [-p string]...
           [-s replstr]... [file...] directory

DESCRIPTION

       The pax utility shall read, write, and write lists of  the  members  of
       archive files and copy directory hierarchies. A variety of archive for‐
       mats shall be supported; see the -x format option.

       The action to be taken depends  on  the  presence  of  the  -r  and  -w
       options. The four combinations of -r and -w are referred to as the four
       modes of operation: list, read, write, and  copy  modes,  corresponding
       respectively to the four forms shown in the SYNOPSIS section.

       list      In  list  mode  (when  neither  -r nor -w is specified),  pax
                 shall write the names of the members of the archive file read
                 from  the  standard input, with pathnames matching the speci‐
                 fied patterns, to standard output. If a named file is of type
                 directory,  the  file  hierarchy rooted at that file shall be
                 listed as well.

       read      In read mode (when -r is specified, but -w is not), pax shall
                 extract  the  members of the archive file read from the stan‐
                 dard input, with pathnames matching the  specified  patterns.
                 If an extracted file is of type directory, the file hierarchy
                 rooted at that file shall be extracted as well. The extracted
                 files  shall  be  created performing pathname resolution with
                 the directory in which pax was invoked as the current working
                 directory.

                 If  an attempt is made to extract a directory when the direc‐
                 tory already exists, this shall not be considered  an  error.
                 If an attempt is made to extract a FIFO when the FIFO already
                 exists, this shall not be considered an error.

                 The ownership, access, and modification times, and file  mode
                 of the restored files are discussed under the -p option.

       write     In  write  mode  (when  -w  is specified, but -r is not), pax
                 shall write the contents of the file operands to the standard
                 output  in  an archive format. If no file operands are speci‐
                 fied, a list of files to copy, one per line,  shall  be  read
                 from  the standard input and each entry in this list shall be
                 processed as if it had been a file  operand  on  the  command
                 line. A file of type directory shall include all of the files
                 in the file hierarchy rooted at the file.

       copy      In copy mode (when both -r and -w are specified),  pax  shall
                 copy the file operands to the destination directory.

                 If  no  file operands are specified, a list of files to copy,
                 one per line, shall be read from the standard input.  A  file
                 of  type directory shall include all of the files in the file
                 hierarchy rooted at the file.

                 The effect of the copy shall be as if the copied  files  were
                 written  to  a  pax format archive file and then subsequently
                 extracted, except that copying of sockets  may  be  supported
                 even  if  archiving  them in write mode is not supported, and
                 that there may be hard links between  the  original  and  the
                 copied  files. If the destination directory is a subdirectory
                 of one of the files to be copied, the  results  are  unspeci‐
                 fied.  If  the  destination directory is a file of a type not
                 defined by the System Interfaces volume of POSIX.1‐2017,  the
                 results are implementation-defined; otherwise, it shall be an
                 error for the file named by  the  directory  operand  not  to
                 exist,  not be writable by the user, or not be a file of type
                 directory.

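       The four modes can be tried on a scratch tree. The  following  is  a
       minimal sketch rather than part of the standard text: the directory and
       archive names (src, dest, extract, demo.tar) are made up for  illustra‐
       tion, the -f and -x options are described under OPTIONS, and the block
       is skipped when no pax utility is installed.

```shell
# Hypothetical scratch layout; every name below is illustrative only.
work=$(mktemp -d)
if command -v pax >/dev/null 2>&1; then
    (
        cd "$work" || exit 1
        mkdir -p src/docs dest extract
        echo hello > src/docs/note.txt

        pax -w -x ustar -f demo.tar src          # write mode: archive src
        pax -f demo.tar                          # list mode: member names
        (cd extract && pax -r -f ../demo.tar)    # read mode: extract here
        pax -r -w src dest                       # copy mode: src -> dest/src
    ) && modes_demo=done || modes_demo=failed
else
    modes_demo=skipped
fi
echo "modes demo: $modes_demo"
```

       Without -f, the same write and read commands would use standard  output
       and standard input, as described for each mode above.
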
       In read or copy modes, if intermediate  directories  are  necessary  to
       extract  an archive member, pax shall perform actions equivalent to the
       mkdir()  function  defined  in  the   System   Interfaces   volume   of
       POSIX.1‐2017, called with the following arguments:

        *  The intermediate directory used as the path argument

        *  The  value  of  the  bitwise-inclusive  OR of S_IRWXU, S_IRWXG, and
           S_IRWXO as the mode argument

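       As a rough illustration of the mode argument: the bitwise-inclusive  OR
       of S_IRWXU (0700), S_IRWXG (0070), and S_IRWXO (0007) is 0777, and
       mkdir() then clears the bits set in the file mode creation mask of  the
       process. The helper name effective_mode is hypothetical.

```shell
# S_IRWXU | S_IRWXG | S_IRWXO, using the octal values POSIX assigns to them.
requested=$(( 0700 | 0070 | 0007 ))

# Hypothetical helper: permission bits that remain after applying a umask,
# given as octal digits (for example 022).
effective_mode() {
    printf '%o\n' $(( requested & ~0$1 ))
}

effective_mode 022    # typical umask
effective_mode 077    # restrictive umask
```

       With a umask of 022 the intermediate directories would  end  up  with
       mode 755, and with 077 they would end up with mode 700.
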
       If any specified pattern or file operands are not matched by  at  least
       one  file  or  archive  member, pax shall write a diagnostic message to
       standard error for each one that did not match and exit with a non-zero
       exit status.

       The archive formats described in the EXTENDED DESCRIPTION section shall
       be automatically detected on input. The default output  archive  format
       shall be implementation-defined.

       A  single archive can span multiple files. The pax utility shall deter‐
       mine, in an implementation-defined manner, what file to read  or  write
       as the next file.

       If  the  selected  archive  format supports the specification of linked
       files, it shall be an error if these files cannot be  linked  when  the
       archive  is  extracted. For archive formats that do not store file con‐
       tents with each name that causes a hard link, if the file that contains
       the  data  is  not  extracted  during this pax session, either the data
       shall be restored from the original file, or a diagnostic message shall
       be  displayed  with  the name of a file that can be used to extract the
       data. In traversing directories, pax shall detect infinite loops;  that
       is,  entering a previously visited directory that is an ancestor of the
       last file visited. When it detects an infinite loop, pax shall write  a
       diagnostic message to standard error and shall terminate.

OPTIONS

       The  pax  utility  shall  conform  to  the  Base  Definitions volume of
       POSIX.1‐2017, Section 12.2, Utility Syntax Guidelines, except that  the
       order of presentation of the -o, -p, and -s options is significant.

       The following options shall be supported:

       -r        Read an archive file from standard input.

       -w        Write  files  to the standard output in the specified archive
                 format.

       -a        Append files to the end of the archive. It is implementation-
                 defined  which devices on the system support appending. Addi‐
                 tional  file  formats   unspecified   by   this   volume   of
                 POSIX.1‐2017 may impose restrictions on appending.

       -b blocksize
                 Block  the  output  at  a  positive decimal integer number of
                 bytes per write to the archive file. Devices and archive for‐
                 mats  may  impose restrictions on blocking. Blocking shall be
                 automatically determined on  input.  Conforming  applications
                 shall  not  specify  a  blocksize  value  larger  than 32256.
                 Default blocking when creating archives depends  on  the  ar‐
                 chive format. (See the -x option below.)

       -c        Match  all  file or archive members except those specified by
                 the pattern or file operands.

       -d        Cause files of type directory being copied or archived or ar‐
                 chive  members of type directory being extracted or listed to
                 match only the file or archive member itself and not the file
                 hierarchy rooted at the file.

       -f archive
                 Specify the pathname of the input or output archive, overrid‐
                 ing the default standard input (in list  or  read  modes)  or
                 standard output (write mode).

       -H        If  a  symbolic  link referencing a file of type directory is
                 specified on the command line, pax  shall  archive  the  file
                 hierarchy  rooted  in  the file referenced by the link, using
                 the name of the link as the root of the file hierarchy.  Oth‐
                 erwise,  if  a  symbolic link referencing a file of any other
                 file type which pax can normally archive is specified on  the
                 command  line,  then pax shall archive the file referenced by
                 the link, using the name of the link. The  default  behavior,
                 when neither -H nor -L is specified, shall be to archive  the
                 symbolic link itself.

       -i        Interactively rename files or archive members. For  each  ar‐
                 chive  member  matching  a pattern operand or file matching a
                 file operand, a prompt shall be written to the file /dev/tty.
                 The prompt shall contain the name of the file or archive mem‐
                 ber, but the format is otherwise unspecified.  A  line  shall
                 then  be read from /dev/tty.  If this line is blank, the file
                 or archive member shall be skipped. If this line consists  of
                 a  single  period,  the  file or archive member shall be pro‐
                 cessed with no modification to its name. Otherwise, its  name
                 shall  be  replaced  with  the  contents of the line. The pax
                 utility shall immediately exit with a non-zero exit status if
                 end-of-file  is  encountered  when  reading  a response or if
                 /dev/tty cannot be opened for reading and writing.

                 The results of extracting a hard link to a file that has been
                 renamed during extraction are unspecified.

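       The three outcomes of a -i response (skip, keep, rename) can be sketched
       as a small shell function. This is only an illustration of the rules
       stated above, not how any pax implementation is written: the function
       name interactive_name is hypothetical, the response is passed as an
       argument instead of being read from /dev/tty, and "blank" is assumed to
       mean empty or consisting only of white space.

```shell
# Hypothetical helper: prints the name to use, or returns 1 to skip the member.
# Assumption: a "blank" reply is empty or contains only white space.
interactive_name() {
    name=$1 reply=$2
    case $reply in
        *[![:space:]]*) ;;      # at least one non-blank character: keep going
        *) return 1 ;;          # blank reply: skip this file or archive member
    esac
    if [ "$reply" = . ]; then
        printf '%s\n' "$name"   # single period: keep the name unchanged
    else
        printf '%s\n' "$reply"  # anything else replaces the name
    fi
}

interactive_name docs/note.txt .
interactive_name docs/note.txt docs/renamed.txt
interactive_name docs/note.txt '' || echo '(skipped)'
```
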
       -k        Prevent the overwriting of existing files.

       -l        (The  letter  ell.)  In  copy  mode, hard links shall be made
                 between the source and destination file hierarchies  whenever
                 possible.  If  specified in conjunction with -H or -L, when a
                 symbolic link is encountered, the hard link  created  in  the
                 destination file hierarchy shall be to the file referenced by
                 the symbolic link. If specified when neither  -H  nor  -L  is
                 specified, when a symbolic link is encountered, the implemen‐
                 tation shall create a hard link to the symbolic link  in  the
                 source file hierarchy or copy the symbolic link to the desti‐
                 nation.

       -L        If a symbolic link referencing a file of  type  directory  is
                 specified  on the command line or encountered during the tra‐
                 versal of a file hierarchy, pax shall archive the file  hier‐
                 archy  rooted  in  the file referenced by the link, using the
                 name of the link as the root of the file  hierarchy.   Other‐
                 wise, if a symbolic link referencing a file of any other file
                 type which pax can normally archive is specified on the  com‐
                 mand line or encountered during the traversal of a file hier‐
                 archy, pax shall archive the file  referenced  by  the  link,
                 using  the  name of the link. The default behavior, when nei‐
                 ther -H nor -L is specified, shall be to archive the symbolic
                 link itself.

       -n        Select the first archive member that matches each pattern op‐
                 erand. No more than one archive member shall be  matched  for
                 each  pattern (although members of type directory shall still
                 match the file hierarchy rooted at that file).

       -o options
                 Provide information to the implementation to modify the algo‐
                 rithm  for  extracting or writing files. The value of options
                 shall consist of one or more  <comma>-separated  keywords  of
                 the form:


                     keyword[[:]=value][,keyword[[:]=value], ...]

                 Some  keywords  apply  only to certain file formats, as indi‐
                 cated with each description. Use of keywords that  are  inap‐
                 plicable  to  the  file format being processed produces unde‐
                 fined results.

                 Keywords in the options argument shall be a string that would
                 be a valid portable filename as described in the Base Defini‐
                 tions volume of POSIX.1‐2017, Section 3.282,  Portable  File‐
                 name Character Set.

                 Note:     Keywords  are  not expected to be filenames, merely
                           to follow the same character composition  rules  as
                           portable filenames.

                 Keywords  can  be  preceded with white space. The value field
                 shall consist of zero or more characters; within  value,  the
                 application  shall  precede any literal <comma> with a <back‐
                 slash>, which shall be ignored, but preserves the <comma>  as
                 part  of  value.   A  <comma>  as  the  final character, or a
                 <comma> followed solely by white space as the  final  charac‐
                 ters, in options shall be ignored. Multiple -o options can be
                 specified; if keywords given to  these  multiple  -o  options
                 conflict,  the keywords and values appearing later in command
                 line sequence shall take precedence and the earlier shall  be
                 silently  ignored.  The  following  keyword values of options
                 shall be supported for the file formats as indicated:

                 delete=pattern
                       (Applicable only to the -x pax format.)  When  used  in
                       write or copy mode, pax shall omit from extended header
                       records that it  produces  any  keywords  matching  the
                       string  pattern.  When  used  in read or list mode, pax
                       shall ignore any keywords matching the  string  pattern
                       in the extended header records. In both cases, matching
                       shall be performed using the pattern matching  notation
                       described in Section 2.13.1, Patterns Matching a Single
                       Character and Section 2.13.2, Patterns Matching  Multi‐
                       ple Characters.  For example:


                           -o delete=security.*

                       would  suppress  security-related  information. See pax
                       Extended Header  for  extended  header  record  keyword
                       usage.

                       When  multiple  -odelete=pattern options are specified,
                       the patterns shall be additive; all  keywords  matching
                       the  specified  string  patterns  shall be omitted from
                       extended header records that pax produces.

                 exthdr.name=string
                       (Applicable only to the -x pax  format.)  This  keyword
                       allows  user control over the name that is written into
                       the ustar header blocks for the  extended  header  pro‐
                       duced  under  the circumstances described in pax Header
                       Block.  The name shall be the contents of string, after
                       the following character substitutions have been made:

                        ┌──────────┬────────────────────────────────────────┐
                        │ string   │                                        │
                        │Includes: │              Replaced by:              │
                        ├──────────┼────────────────────────────────────────┤
                        │%d        │ The directory name of the file, equiv‐ │
                        │          │ alent to the  result  of  the  dirname │
                        │          │ utility on the translated pathname.    │
                        │%f        │ The  filename  of the file, equivalent │
                        │          │ to the result of the basename  utility │
                        │          │ on the translated pathname.            │
                        │%p        │ The process ID of the pax process.     │
                        │%%        │ A '%' character.                       │
                        └──────────┴────────────────────────────────────────┘
                       Any  other  '%'  characters in string produce undefined
                       results.

                       If no -o exthdr.name=string is specified, pax shall use
                       the following default value:


                           %d/PaxHeaders.%p/%f

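       The substitutions in the exthdr.name table can be  sketched  in  shell,
       using the dirname and basename utilities that the table refers to. The
       function name expand_exthdr is hypothetical, the process ID  is  passed
       in as an argument so that the result is reproducible, and any other '%'
       sequence is copied through unchanged here only because  the  result  is
       undefined.

```shell
# Hypothetical helper: expand an exthdr.name template for one pathname.
# Usage: expand_exthdr TEMPLATE PATHNAME PID
expand_exthdr() {
    tmpl=$1 path=$2 pid=$3 out=
    while [ -n "$tmpl" ]; do
        case $tmpl in
            %d*) out=$out$(dirname "$path");  tmpl=${tmpl#??} ;;
            %f*) out=$out$(basename "$path"); tmpl=${tmpl#??} ;;
            %p*) out=$out$pid;                tmpl=${tmpl#??} ;;
            %%*) out=$out%;                   tmpl=${tmpl#??} ;;
            *)   # ordinary character: copy it and move on
                 rest=${tmpl#?}
                 out=$out${tmpl%"$rest"}
                 tmpl=$rest ;;
        esac
    done
    printf '%s\n' "$out"
}

# The default template applied to an example pathname and a fixed PID.
expand_exthdr '%d/PaxHeaders.%p/%f' src/lib/util.c 1234
```

       For the example pathname this yields src/lib/PaxHeaders.1234/util.c.
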
                 globexthdr.name=string
                       (Applicable  only  to  the -x pax format.) When used in
                       write or copy mode with the  appropriate  options,  pax
                       shall  create global extended header records with ustar
                       header blocks that will be treated as regular files  by
                       previous  versions  of  pax.   This keyword allows user
                       control over the name that is written  into  the  ustar
                       header  blocks  for global extended header records. The
                       name shall be the contents of string, after the follow‐
                       ing character substitutions have been made:

                        ┌──────────┬────────────────────────────────────────┐
                        │ string   │                                        │
                        │Includes: │              Replaced by:              │
                        ├──────────┼────────────────────────────────────────┤
                        │%n        │ An   integer   that   represents   the │
                        │          │ sequence number of the global extended │
                        │          │ header record in the archive, starting │
                        │          │ at 1.                                  │
                        │%p        │ The process ID of the pax process.     │
                        │%%        │ A '%' character.                       │
                        └──────────┴────────────────────────────────────────┘
                       Any other '%' characters in  string  produce  undefined
                       results.

                       If no -o globexthdr.name=string is specified, pax shall
                       use the following default value:


                           $TMPDIR/GlobalHead.%p.%n

                       where $TMPDIR represents the value of the TMPDIR  envi‐
                       ronment  variable.  If TMPDIR is not set, pax shall use
                       /tmp.

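       The default globexthdr.name value, including the fallback to /tmp, maps
       directly onto a shell parameter expansion. In this sketch the  function
       name globexthdr_default is hypothetical, and the process  ID  and  the
       sequence number are supplied as arguments.

```shell
# Hypothetical helper: default global extended header name.
# Usage: globexthdr_default PID SEQUENCE_NUMBER
globexthdr_default() {
    # ${TMPDIR-/tmp} falls back to /tmp only when TMPDIR is not set.
    printf '%s/GlobalHead.%s.%s\n' "${TMPDIR-/tmp}" "$1" "$2"
}

# First global extended header record written by the current shell process.
globexthdr_default "$$" 1
```
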
                 invalid=action
                       (Applicable only to the -x pax  format.)  This  keyword
                       allows  user  control  over  the  action pax takes upon
                       encountering values in an extended header record  that,
                       in  read  or  copy mode, are invalid in the destination
                       hierarchy or, in list mode, cannot be  written  in  the
                       codeset  and  current locale of the implementation. The
                       following are invalid values that shall  be  recognized
                       by pax:

                       --  In  read or copy mode, a filename or link name that
                           contains character encodings invalid in the  desti‐
                           nation  hierarchy.  (For example, the name may con‐
                           tain embedded NULs.)

                       --  In read or copy mode, a filename or link name  that
                           is  longer than the maximum allowed in the destina‐
                           tion hierarchy (for either a pathname component  or
                           the entire pathname).

                       --  In list mode, any character string value (filename,
                           link name, user name, and so  on)  that  cannot  be
                           written  in  the  codeset and current locale of the
                           implementation.

                       The following mutually-exclusive values of  the  action
                       argument are supported:

                       binary    In  write  mode,  pax  shall  generate a hdr‐
                                 charset=BINARY  extended  header  record  for
                                 each  file  with a filename, link name, group
                                 name, owner name, or any other  field  in  an
                                 extended  header record that cannot be trans‐
                                 lated to the UTF‐8 codeset, allowing the  ar‐
                                 chive  to  contain  the  files with unencoded
                                 extended header record  values.  In  read  or
                                 copy mode, pax shall use the values specified
                                 in the header without translation, regardless
                                 of  whether  this  may  overwrite an existing
                                 file with a valid name.  In  list  mode,  pax
                                 shall   behave   identically  to  the  bypass
                                 action.

                       bypass    In read or copy mode, pax  shall  bypass  the
                                 file,  causing  no  change to the destination
                                 hierarchy.  In list mode, pax shall write all
                                 requested  valid values for the file, but its
                                 method for writing invalid values is unspeci‐
                                 fied.

                       rename    In read or copy mode, pax shall act as if the
                                 -i option were in effect for each  file  with
                                 invalid  filename or link name values, allow‐
                                 ing the user to provide  a  replacement  name
                                 interactively.    In  list  mode,  pax  shall
                                 behave identically to the bypass action.

                       UTF‐8     When used in read, copy, or list mode  and  a
                                 filename, link name, owner name, or any other
                                 field in an extended header record cannot  be
                                 translated  from the pax UTF‐8 codeset format
                                 to the codeset  and  current  locale  of  the
                                 implementation,  pax  shall  use  the  actual
                                 UTF‐8 encoding for the name. If a  hdrcharset
                                 extended  header record is in effect for this
                                 file, the character  set  specified  by  that
                                 record  shall  be used instead of UTF‐8. If a
                                 hdrcharset=BINARY extended header  record  is
                                 in effect for this file, no translation shall
                                 be performed.

                       write     In read or copy mode,  pax  shall  write  the
                                 file,  translating  the  name,  regardless of
                                 whether this may overwrite an  existing  file
                                 with  a  valid  name. In list mode, pax shall
                                 behave identically to the bypass action.

                       If no -o invalid=action is specified, pax shall act  as
                       if  -oinvalid=bypass were specified. Any overwriting of
                       existing files that may be allowed  by  the  -oinvalid=
                       actions shall be subject to permission (-p) and modifi‐
                       cation time (-u) restrictions, and shall be  suppressed
                       if the -k option is also specified.

                 linkdata
                       (Applicable  only to the -x pax format.) In write mode,
                       pax shall write the contents of a file to  the  archive
                       even  when  that  file  is merely a hard link to a file
                       whose contents have already been  written  to  the  ar‐
                       chive.

                 listopt=format
                       This  keyword  specifies the output format of the table
                       of contents produced when the -v option is specified in
                       list  mode.  See  List  Mode Format Specifications.  To
                       avoid ambiguity, the listopt=format shall be  the  only
                       or  final  keyword=value  pair in a -o option-argument;
                       all characters in the remainder of the  option-argument
                       shall  be  considered  part  of the format string. When
                       multiple -olistopt=format options  are  specified,  the
                       format  strings  shall be considered a single, concate‐
                       nated string, evaluated in command line order.

                 times
                       (Applicable only to the -x pax format.)  When  used  in
                       write  or  copy mode, pax shall include atime and mtime
                       extended header records for each file. See pax Extended
                       Header File Times.

                 In addition to these keywords, if the -x pax format is speci‐
                 fied, any of the keywords and values defined in pax  Extended
                 Header,  including  implementation extensions, can be used in
                 -o option-arguments, in either of two modes:

                 keyword=value
                       When used in write or copy  mode,  these  keyword/value
                       pairs shall be included at the beginning of the archive
                       as typeflag g global extended header records. When used
                       in  read  or list mode, these keyword/value pairs shall
                       act as if they had been at the beginning of the archive
                       as typeflag g global extended header records.

                 keyword:=value
                       When  used  in  write or copy mode, these keyword/value
                       pairs shall be included as records at the beginning  of
                       a typeflag x extended header for each file. (This shall
                       be equivalent to the <equals-sign> form except that  it
                       creates  no typeflag g global extended header records.)
                       When used in read or  list  mode,  these  keyword/value
                       pairs  shall act as if they were included as records at
                       the end of each extended header; thus, they shall over‐
                       ride any global or file-specific extended header record
                       keywords of the same names. For example,  in  the  com‐
                       mand:


                           pax -r -o "
                           gname:=mygroup,
                           " <archive

                       the  group  name  will be forced to a new value for all
                       files read from the archive.

                 The precedence of -o keywords over various fields in the  ar‐
                 chive is described in pax Extended Header Keyword Precedence.
                 If the  -o  delete=pattern,  -o  keyword=value,  or  -o  key‐
                 word:=value  options  are  used  to  override  or  remove any
                 extended header data needed  to  find  files  in  an  archive
                 (e.g.,  -o delete=size for a file whose size cannot be repre‐
                 sented in a ustar header or -o size=100 for a file whose size
                 is not 100 bytes), the behavior is undefined.

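       The separator rules for the -o option-argument (leading  white  space,
       <backslash>-escaped <comma> characters, and an ignored trailing <comma>)
       can be sketched as a small splitter. This is an illustration  only:  the
       function name split_options is hypothetical, and  the  special  case  of
       listopt=format, where the rest of the option-argument belongs to the
       format string, is deliberately not handled.

```shell
# Hypothetical helper: print one keyword[[:]=value] item per line.
split_options() {
    rest=$1 item=
    while :; do
        case $rest in
            \\,*)   # escaped comma: drop the backslash, keep the comma
                item=$item,
                rest=${rest#??} ;;
            '' | ,*)  # separator or end of input: emit the collected item
                item=${item#"${item%%[![:space:]]*}"}   # trim leading blanks
                if [ -n "$item" ]; then printf '%s\n' "$item"; fi
                item=
                [ -n "$rest" ] || break
                rest=${rest#?} ;;
            *)      # ordinary character: append it to the current item
                tail=${rest#?}
                item=$item${rest%"$tail"}
                rest=$tail ;;
        esac
    done
}

split_options ' gname:=mygroup, comment=one\,two,'
```

       For the argument shown, the sketch prints gname:=mygroup on  one  line
       and comment=one,two on the next.
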
502       -p string Specify one or more file characteristic options (privileges).
503                 The string option-argument shall be a string specifying  file
504                 characteristics  to  be  retained or discarded on extraction.
505                 The string shall consist of the specification  characters  a,
506                 e,  m, o, and p.  Other implementation-defined characters can
507                 be included. Multiple  characteristics  can  be  concatenated
508                 within  the same string and multiple -p options can be speci‐
509                 fied. The meaning of the specification characters are as fol‐
510                 lows:
511
512                 a     Do not preserve file access times.
513
514                 e     Preserve the user ID, group ID, file mode bits (see the
515                       Base Definitions volume of POSIX.1‐2017, Section 3.169,
516                       File  Mode  Bits),  access time, modification time, and
517                       any other implementation-defined file characteristics.
518
519                 m     Do not preserve file modification times.
520
521                 o     Preserve the user ID and group ID.
522
523                 p     Preserve the  file  mode  bits.  Other  implementation-
524                       defined file mode attributes may be preserved.
525
526                 In   the  preceding  list,  ``preserve''  indicates  that  an
527                 attribute stored  in  the  archive  shall  be  given  to  the
528                 extracted  file,  subject  to the permissions of the invoking
529                 process. The access and modification times of the file  shall
530                 be preserved unless otherwise specified with the -p option or
531                 not stored in the archive. All attributes that are  not  pre‐
532                 served  shall  be  determined as part of the normal file cre‐
533                 ation action (see Section 1.1.1.4, File Read, Write, and Cre‐
534                 ation).
535
536                 If  neither the e nor the o specification character is speci‐
537                 fied, or the user ID and group ID are not preserved  for  any
538                 reason, pax shall not set the S_ISUID and S_ISGID bits of the
539                 file mode.
540
541                 If the preservation of any of these items fails for any  rea‐
542                 son,  pax shall write a diagnostic message to standard error.
543                 Failure to preserve these items shall affect the  final  exit
544                 status, but shall not cause the extracted file to be deleted.
545
546                 If  file  characteristic letters in any of the string option-
547                 arguments are duplicated or conflict  with  each  other,  the
548                 ones given last shall take precedence. For example, if -p eme
549                 is specified, file modification times are preserved.
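
                 The precedence rule above can be sketched in a few lines of
                 Python. This is only an illustration, not part of any pax
                 implementation: the function name resolve_p and the attribute
                 labels are hypothetical, and the starting state (times
                 preserved, owner and mode not preserved) is an assumption
                 drawn from the paragraphs above.

```python
def resolve_p(*strings):
    """Fold one or more -p option-arguments into a map of preserved attributes."""
    # Assumed defaults: access and modification times are preserved unless
    # -p says otherwise; owner and mode are left to normal file creation.
    keep = {"atime": True, "mtime": True, "owner": False, "mode": False}
    for ch in "".join(strings):
        if ch == "a":
            keep["atime"] = False
        elif ch == "m":
            keep["mtime"] = False
        elif ch == "o":
            keep["owner"] = True
        elif ch == "p":
            keep["mode"] = True
        elif ch == "e":
            # 'e' preserves everything, overriding any earlier 'a' or 'm'.
            keep = dict.fromkeys(keep, True)
        else:
            # This sketch knows only the five standard letters.
            raise ValueError("unknown -p character: %r" % ch)
    return keep
```

                 Because letters are applied left to right, resolve_p("eme")
                 ends with mtime preserved, while resolve_p("em") does not.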
550
551       -s replstr
552                 Modify file or archive member names named by pattern or  file
553                 operands  according  to  the substitution expression replstr,
554                 using  the  syntax  of  the  ed  utility.  The  concepts   of
555                 ``address''  and  ``line''  are meaningless in the context of
556                 the pax utility, and shall not be supplied. The format  shall
557                 be:
558
559
560                     -s /old/new/[gp]
561
562                 where, as in ed, old is a basic regular expression and new can
563                 contain an <ampersand>, '\n' (where n is a digit) back-refer‐
564                 ences,  or  subexpression matching. The old string shall also
565                 be permitted to contain <newline> characters.
566
567                 Any non-null character can be used as a delimiter ('/'  shown
568                 here).  Multiple -s expressions can be specified; the expres‐
569                 sions shall be applied in the  order  specified,  terminating
570                 with  the first successful substitution.  The optional trail‐
571                 ing 'g' is as defined in the ed utility. The optional  trail‐
572                 ing 'p' shall cause successful substitutions to be written to
573                 standard error.  File or archive member names that substitute
574                 to the empty string shall be ignored when reading and writing
575                 archives.
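
                 The way multiple -s expressions interact can be sketched in
                 Python as follows. This is a rough model under stated
                 assumptions, not the behavior of any particular pax: the
                 helper names parse_replstr and rename are hypothetical, old is
                 interpreted with Python's re syntax rather than as a POSIX
                 basic regular expression, and escaped delimiters inside old or
                 new are not handled.

```python
import re
import sys

def parse_replstr(replstr):
    """Split '/old/new/[gp]' on its delimiter, the first character."""
    delim = replstr[0]
    old, new, flags = replstr[1:].split(delim)
    return old, new, flags

def rename(name, replstrs):
    """Return the substituted name, or None when the name should be ignored."""
    for replstr in replstrs:
        old, new, flags = parse_replstr(replstr)
        # ed writes the whole match as '&'; Python's re spells it \g<0>.
        new = new.replace("&", r"\g<0>")
        result, hits = re.subn(old, new, name, count=0 if "g" in flags else 1)
        if hits:
            if "p" in flags:
                # Same shape as the "%s >> %s\n" format in the STDERR section.
                print("%s >> %s" % (name, result), file=sys.stderr)
            # Stop at the first successful substitution; an empty result
            # means the file or member is skipped.
            return result or None
    return name
```

                 For instance, rename("src/main.c", ["/^src/lib/",
                 "/main/app/"]) applies only the first expression, because it
                 succeeds, and the second is never tried.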
576
577       -t        When reading files from the file system, and if the user  has
578                 the  permissions required by utime() to do so, set the access
579                 time of each file read to the access time that it had  before
580                 being read by pax.
581
582       -u        Ignore  files that are older (having a less recent file modi‐
583                 fication time) than a pre-existing  file  or  archive  member
584                 with the same name.  In read mode, an archive member with the
585                 same name as a file in the file system shall be extracted  if
586                 the  archive member is newer than the file. In write mode, an
587                 archive file member with the same name as a file in the  file
588                 system  shall be superseded if the file is newer than the ar‐
589                 chive member. If -a is also specified, this  is  accomplished
590                 by  appending  to  the  archive; otherwise, it is unspecified
591                 whether this is accomplished by actual replacement in the ar‐
592                 chive  or by appending to the archive. In copy mode, the file
593                 in the destination hierarchy shall be replaced by the file in
594                 the  source  hierarchy or by a link to the file in the source
595                 hierarchy if the file in the source hierarchy is newer.
596
597       -v        In list mode, produce a verbose table of  contents  (see  the
598                 STDOUT  section).   Otherwise, write archive member pathnames
599                 to standard error (see the STDERR section).
600
601       -x format Specify the output archive format. The pax utility shall sup‐
602                 port the following formats:
603
604                 cpio      The  cpio  interchange  format;  see  the  EXTENDED
605                           DESCRIPTION section. The default blocksize for this
606                           format for character special archive files shall be
607                           5120.  Implementations shall support all  blocksize
608                           values  less than or equal to 32256 that are multi‐
609                           ples of 512.
610
611                 pax       The  pax  interchange  format;  see  the   EXTENDED
612                           DESCRIPTION section. The default blocksize for this
613                           format for character special archive files shall be
614                           5120.   Implementations shall support all blocksize
615                           values less than or equal to 32256 that are  multi‐
616                           ples of 512.
617
618                 ustar     The   tar  interchange  format;  see  the  EXTENDED
619                           DESCRIPTION section. The default blocksize for this
620                           format for character special archive files shall be
621                           10240.  Implementations shall support all blocksize
622                           values  less than or equal to 32256 that are multi‐
623                           ples of 512.
624
625                 Implementation-defined formats shall specify a default  block
626                 size as well as any other block sizes supported for character
627                 special archive files.
628
629                 Any attempt to append to an archive file in a format  differ‐
630                 ent  from the existing archive format shall cause pax to exit
631                 immediately with a non-zero exit status.
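
                 The blocksize requirements for the three standard formats can
                 be summarized in a short Python sketch. The names
                 DEFAULT_BLOCKSIZE and is_required_blocksize are hypothetical;
                 the numbers come from the format descriptions above. An
                 implementation may accept other values as well; this only
                 checks the ones every implementation must support.

```python
# Default blocksize, in bytes, for character special archive files.
DEFAULT_BLOCKSIZE = {"cpio": 5120, "pax": 5120, "ustar": 10240}

def is_required_blocksize(n):
    """True if every conforming implementation must support blocksize n."""
    return 0 < n <= 32256 and n % 512 == 0
```

                 Note that 32256 is 63 * 512, so the largest required value is
                 itself a multiple of 512.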
632
633       -X        When traversing the file hierarchy specified by  a  pathname,
634                 pax  shall not descend into directories that have a different
635                 device ID  (st_dev;  see  the  System  Interfaces  volume  of
636                 POSIX.1‐2017, stat()).
637
638       Specifying  more  than  one of the mutually-exclusive options -H and -L
639       shall not be considered an error and the last  option  specified  shall
640       determine the behavior of the utility.
641
642       The  options that operate on the names of files or archive members (-c,
643       -i, -n, -s, -u, and -v) shall interact as follows. In  read  mode,  the
644       archive  members  shall be selected based on the user-specified pattern
645       operands as modified by the -c, -n, and -u options. Then, any -s and -i
646       options  shall  modify, in that order, the names of the selected files.
647       The -v option shall write names resulting from these modifications.
648
649       In write mode, the files shall be selected based on the  user-specified
650       pathnames  as  modified  by  the -n and -u options. Then, any -s and -i
651       options shall modify, in that order, the names of these selected files.
652       The -v option shall write names resulting from these modifications.
653
654       If  both  the -u and -n options are specified, pax shall not consider a
655       file selected unless it is newer than the file to which it is compared.
656
657   List Mode Format Specifications
658       In list mode with the -o listopt=format  option,  the  format  argument
659       shall be applied for each selected file. The pax utility shall append a
660       <newline> to the listopt output for  each  selected  file.  The  format
661       argument shall be used as the format string described in the Base Defi‐
662       nitions volume of POSIX.1‐2017, Chapter 5, File Format  Notation,  with
663       the  exceptions  1. through 6. defined in the EXTENDED DESCRIPTION sec‐
664       tion of printf, plus the following exceptions:
665
666       7.    The sequence (keyword) can occur before a format conversion spec‐
667             ifier.  The  conversion  argument is defined by the value of key‐
668             word.  The implementation shall support the following keywords:
669
670             --  Any of the Field Name entries in  Table  4-14,  ustar  Header
671                 Block and Table 4-16, Octet-Oriented cpio Archive Entry.  The
672                 implementation may support  the  cpio  keywords  without  the
673                 leading  c_  in  addition to the form required by Table 4-16,
674                 Octet-Oriented cpio Archive Entry.
675
676             --  Any keyword defined for the extended header in  pax  Extended
677                 Header.
678
679             --  Any  keyword  provided as an implementation-defined extension
680                 within the extended header defined in pax Extended Header.
681
682             For example, the sequence "%(charset)s" is the  string  value  of
683             the name of the character set in the extended header.
684
685             The  result of the keyword conversion argument shall be the value
686             from the applicable header field or extended header, without  any
687             trailing NULs.
688
689             All  keyword  values used as conversion arguments shall be trans‐
690             lated from the UTF‐8 encoding (or alternative encoding  specified
691             by  any  hdrcharset  extended header record) to the character set
692             appropriate for the local file system, user database, and so  on,
693             as applicable.
694
695       8.    An additional conversion specifier character, T, shall be used to
696             specify time formats. The T conversion specifier character can be
697             preceded  by the sequence (keyword=subformat), where subformat is
698             a date format as defined by date operands.  The  default  keyword
699             shall be mtime and the default subformat shall be:
700
701
702                 %b %e %H:%M %Y
703
704       9.    An additional conversion specifier character, M, shall be used to
705             specify the file mode string as defined in ls Standard Output. If
706             (keyword)  is  omitted, the mode keyword shall be used. For exam‐
707             ple, %.1M  writes  the  single  character  corresponding  to  the
708             <entry type> field of the ls -l command.
709
710       10.   An additional conversion specifier character, D, shall be used to
711             specify the device for block or special files, if applicable,  in
712             an  implementation-defined  format.  If not applicable, and (key‐
713             word) is specified, then this conversion shall be  equivalent  to
714             %(keyword)u.  If  not  applicable, and (keyword) is omitted, then
715             this conversion shall be equivalent to <space>.
716
717       11.   An additional conversion specifier character, F, shall be used to
718             specify a pathname. The F conversion character can be preceded by
719             a sequence of <comma>-separated keywords:
720
721
722                 (keyword[,keyword] ... )
723
724             The values for all the keywords that are non-null shall  be  con‐
725             catenated  together,  each separated by a '/'.  The default shall
726             be (path) if the keyword path is defined; otherwise, the  default
727             shall be (prefix,name).
728
729       12.   An additional conversion specifier character, L, shall be used to
730             specify a symbolic link expansion. If the current file is a  sym‐
731             bolic link, then %L shall expand to:
732
733
734                 "%s -> %s", <value of keyword>, <contents of link>
735
736             Otherwise,  the  %L conversion specification shall be the equiva‐
737             lent of %F.
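
       As an illustration of the F and L conversions described in items 11 and
       12, the following Python sketch builds the pathname string from a
       mapping of header fields. The function names, the use of a plain dict
       for header fields, and the link_target parameter are all assumptions
       made for the example; a real implementation would take these values
       from the header block and extended header.

```python
def expand_F(fields, keywords=None):
    """Join the non-null values of the given keywords with '/'."""
    if keywords is None:
        # Default is (path) when path is defined, otherwise (prefix,name).
        keywords = ("path",) if "path" in fields else ("prefix", "name")
    return "/".join(fields[k] for k in keywords if fields.get(k))

def expand_L(fields, link_target=None, keywords=None):
    """Like expand_F, but append ' -> target' for a symbolic link."""
    name = expand_F(fields, keywords)
    if link_target is not None:
        return "%s -> %s" % (name, link_target)
    return name
```

       With an empty prefix, expand_F falls back to the name alone, since null
       values are left out of the concatenation.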
738

OPERANDS

740       The following operands shall be supported:
741
742       directory The destination directory pathname for copy mode.
743
744       file      A pathname of a file to be copied or archived.
745
746       pattern   A pattern matching one or more pathnames of archive  members.
747                 A  pattern  must  be given in the name-generating notation of
748                 the pattern matching notation in Section 2.13, Pattern Match‐
749                 ing  Notation, including the filename expansion rules in Sec‐
750                 tion 2.13.3,  Patterns  Used  for  Filename  Expansion.   The
751                 default, if no pattern is specified, is to select all members
752                 in the archive.
753

STDIN

755       In write mode, the standard input shall be used only if no  file  oper‐
756       ands  are specified. It shall be a file containing a list of pathnames,
757       each terminated by a <newline> character.
758
759       In list and read modes, if -f is  not  specified,  the  standard  input
760       shall be an archive file.
761
762       Otherwise, the standard input shall not be used.
763

INPUT FILES

765       The  input file named by the archive option-argument, or standard input
766       when the archive is read from there, shall be a file formatted  accord‐
767       ing to one of the specifications in the EXTENDED DESCRIPTION section or
768       some other implementation-defined format.
769
770       The file /dev/tty shall be used to write prompts and read responses.
771

ENVIRONMENT VARIABLES

773       The following environment variables shall affect the execution of pax:
774
775       LANG      Provide a default value for  the  internationalization  vari‐
776                 ables  that are unset or null. (See the Base Definitions vol‐
777                 ume of POSIX.1‐2017, Section 8.2, Internationalization  Vari‐
778                 ables for the precedence of internationalization variables used
779                 to determine the values of locale categories.)
780
781       LC_ALL    If set to a non-empty string value, override  the  values  of
782                 all the other internationalization variables.
783
784       LC_COLLATE
785                 Determine  the locale for the behavior of ranges, equivalence
786                 classes, and multi-character collating elements used  in  the
787                 pattern  matching  expressions  for  the pattern operand, the
788                 basic regular expression for the -s option, and the  extended
789                 regular  expression defined for the yesexpr locale keyword in
790                 the LC_MESSAGES category.
791
792       LC_CTYPE  Determine the locale for the interpretation of  sequences  of
793                 bytes of text data as characters (for example, single-byte as
794                 opposed to  multi-byte  characters  in  arguments  and  input
795                 files),  the  behavior  of  character  classes  used  in  the
796                 extended regular expression defined for  the  yesexpr  locale
797                 keyword in the LC_MESSAGES category, and pattern matching.
798
799       LC_MESSAGES
800                 Determine  the  locale used to process affirmative responses,
801                 and the locale used to affect  the  format  and  contents  of
802                 diagnostic messages and prompts written to standard error.
803
804       LC_TIME   Determine  the  format  and contents of date and time strings
805                 when the -v option is specified.
806
807       NLSPATH   Determine the location of message catalogs for the processing
808                 of LC_MESSAGES.
809
810       TMPDIR    Determine  the  pathname  that  provides  part of the default
811                 global extended header record file, as described for  the  -o
812                 globexthdr= keyword in the OPTIONS section.
813
814       TZ        Determine  the  timezone  used  to  calculate  date  and time
815                 strings when the -v option is specified. If TZ  is  unset  or
816                 null, an unspecified default timezone shall be used.
817

ASYNCHRONOUS EVENTS

819       Default.
820

STDOUT

822       In write mode, if -f is not specified, the standard output shall be the
823       archive formatted  according  to  one  of  the  specifications  in  the
824       EXTENDED DESCRIPTION section, or some other implementation-defined for‐
825       mat (see -x format).
826
827       In list  mode,  when  the  -olistopt=format  has  been  specified,  the
828       selected  archive members shall be written to standard output using the
829       format described under List Mode Format Specifications.  In  list  mode
830       without  the  -olistopt=format  option,  the  table  of contents of the
831       selected archive members shall be written to standard output using  the
832       following format:
833
834
835           "%s\n", <pathname>
836
837       If  the  -v  option is specified in list mode, the table of contents of
838       the selected archive members shall be written to standard output  using
839       the following formats.
840
841       For  pathnames  representing  hard links to previous members of the ar‐
842       chive:
843
844
845           "%s == %s\n", <ls -l listing>, <linkname>
846
847       For all other pathnames:
848
849
850           "%s\n", <ls -l listing>
851
852       where <ls -l listing> shall be the format specified by the  ls  utility
853       with  the  -l  option.  When  writing  pathnames  in this format, it is
854       unspecified what is written for fields for which the underlying archive
855       format does not have the correct information, although the correct num‐
856       ber of <blank>-separated fields shall be written.
857
858       In list mode, standard output shall not be buffered more than  a  path‐
859       name  (plus any associated information and a <newline> terminator) at a
860       time.
861

STDERR

863       If -v is specified in read, write, or copy modes, pax shall  write  the
864       pathnames it processes to the standard error output using the following
865       format:
866
867
868           "%s\n", <pathname>
869
870       These pathnames shall be written as soon as processing is begun on  the
871       file  or  archive  member,  and shall be flushed to standard error. The
872       trailing <newline>, which shall not be buffered, is  written  when  the
873       file has been read or written.
874
875       If  the -s option is specified, and the replacement string has a trail‐
876       ing 'p', substitutions shall be written to standard error in  the  fol‐
877       lowing format:
878
879
880           "%s >> %s\n", <original pathname>, <new pathname>
881
882       In  all operating modes of pax, optional messages of unspecified format
883       concerning the input archive format and volume number,  the  number  of
884       files,  blocks,  volumes,  and  media parts as well as other diagnostic
885       messages may be written to standard error.
886
887       In all formats, for both standard output  and  standard  error,  it  is
888       unspecified how non-printable characters in pathnames or link names are
889       written.
890
891       When using the -xpax archive format, if a filename,  link  name,  group
892       name,  owner name, or any other field in an extended header record can‐
893       not be translated between the codeset in use for that  extended  header
894       record  and  the character set of the current locale, pax shall write a
895       diagnostic message  to  standard  error,  shall  process  the  file  as
896       described  for the -o invalid= option, and then shall continue process‐
897       ing with the next file.
898

OUTPUT FILES

900       In read mode, the extracted output files shall be of the archived  file
901       type.   In  copy mode, the copied output files shall be the type of the
902       file being copied. In either mode, existing files  in  the  destination
903       hierarchy shall be overwritten only when all permission (-p), modifica‐
904       tion time (-u), and invalid-value (-oinvalid=) tests allow it.
905
906       In write mode, the output file named by the -f option-argument shall be
907       a file formatted according to one of the specifications in the EXTENDED
908       DESCRIPTION section, or some other implementation-defined format.
909

EXTENDED DESCRIPTION

911   pax Interchange Format
912       A pax archive tape or file produced in the -xpax format shall contain a
913       series of blocks. The physical layout of the archive shall be identical
914       to the ustar format described in ustar Interchange Format.   Each  file
915       archived shall be represented by the following sequence:
916
917        *  An  optional header block with extended header records. This header
918           block is of the form described in pax Header Block, with a typeflag
919           value  of  x  or  g.  The extended header records, described in pax
920           Extended Header, shall be included as  the  data  for  this  header
921           block.
922
923        *  A header block that describes the file. Any fields in the preceding
924           optional extended header shall override the  associated  fields  in
925           this header block for this file.
926
927        *  Zero or more blocks that contain the contents of the file.
928
929       At  the  end  of  the  archive  file there shall be two 512-byte blocks
930       filled with binary zeros, interpreted as an end-of-archive indicator.
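
       The block sequence described above can be observed with Python's
       standard tarfile module, which can write the pax interchange format.
       This is a sketch for inspection only: build_archive and typeflags are
       hypothetical helper names, the member name and comment value are made
       up, and the header offsets used (size at 124, typeflag at 156) are those
       of the ustar header block.

```python
import io
import tarfile

BLOCK = 512

def build_archive():
    """Write a one-member pax archive with a per-file extended header."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tar:
        payload = b"hello\n"
        info = tarfile.TarInfo("hello.txt")
        info.size = len(payload)
        info.pax_headers = {"comment": "example"}  # forces a typeflag x block
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()

def typeflags(data):
    """List the typeflag of each header block, skipping over data blocks."""
    flags, off = [], 0
    while data[off:off + BLOCK].strip(b"\0"):      # stop at a zero-filled block
        size = int(data[off + 124:off + 136].rstrip(b"\0 "), 8)
        flags.append(data[off + 156:off + 157].decode("ascii"))
        off += BLOCK + -(-size // BLOCK) * BLOCK   # header plus padded data
    return flags
```

       Calling typeflags(build_archive()) walks the extended header block
       first and then the header block that describes the file; the archive
       ends with zero-filled blocks.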
931
932       A schematic of an example archive with global extended  header  records
933       and  two  actual files is shown in Figure 4-1, pax Format Archive Exam‐
934       ple.  In the example, the second file in the archive  has  no  extended
935       header  preceding  it,  presumably  because it has no need for extended
936       attributes.
937
938                       Figure 4-1: pax Format Archive Example
939
940   pax Header Block
941       The pax header block shall be  identical  to  the  ustar  header  block
942       described in ustar Interchange Format, except that two additional type‐
943       flag values are defined:
944
945       x     Represents extended header records for the following file in  the
946             archive (which shall have its own ustar header block). The format
947             of these extended header records shall be  as  described  in  pax
948             Extended Header.
949
950       g     Represents global extended header records for the following files
951             in the archive. The format of these extended header records shall
952             be  as described in pax Extended Header.  Each value shall affect
953             all subsequent files that do not override that value in their own
954             extended  header  record and until another global extended header
955             record is reached that provides another value for the same field.
956             The typeflag g global headers should not be used with interchange
957             media that could suffer partial data loss in transporting the ar‐
958             chive.
959
960       For  both  of  these  types,  the  size  field shall be the size of the
961       extended header records in octets. The other fields in the header block
962       are not meaningful to this version of the pax utility. However, if this
963       archive is read by a pax utility  conforming  to  the  ISO POSIX‐2:1993
964       standard,  the  header  block  fields are used to create a regular file
965       that contains the extended header records as  data.  Therefore,  header
966       block field values should be selected to provide reasonable file access
967       to this regular file.
968
969       A further difference from the ustar header block is  that  data  blocks
970       for  files  of  typeflag 1 (the digit one) (hard link) may be included,
971       which means that the size field may be greater than zero. Archives cre‐
972       ated  by  pax -o linkdata shall include these data blocks with the hard
973       links.
974
975   pax Extended Header
976       A pax extended header contains values that are  inappropriate  for  the
977       ustar  header  block  because  of  limitations  in  that format: fields
978       requiring a  character  encoding  other  than  that  described  in  the
979       ISO/IEC 646:1991  standard,  fields  representing  file  attributes not
980       described in the ustar header, and fields whose format or length do not
981       fit  the  requirements  of  the ustar header. The values in an extended
982       header add attributes to the following file (or files; see the descrip‐
983       tion  of the typeflag g header block) or override values in the follow‐
984       ing header block(s), as indicated in the following list of keywords.
985
986       An extended header shall consist of one  or  more  records,  each  con‐
987       structed as follows:
988
989
990           "%d %s=%s\n", <length>, <keyword>, <value>
991
992       The   extended  header  records  shall  be  encoded  according  to  the
993       ISO/IEC 10646‐1:2000  standard  UTF‐8  encoding.  The  <length>  field,
994       <blank>,  <equals-sign>,  and  <newline>  shown shall be limited to the
995       portable character set, as encoded in UTF‐8. The <keyword>  fields  can
996       be  any  UTF‐8  characters.   The  <length>  field shall be the decimal
997       length of the extended header record in octets, including the  trailing
998       <newline>.   If  there  is a hdrcharset extended header in effect for a
999       file, the value field for any gname, linkpath, path, and uname extended
1000       header  records  shall  be encoded using the character set specified by
1001       the hdrcharset extended header record; otherwise, the value field shall
1002       be  encoded  using UTF‐8. The value field for all other keywords speci‐
1003       fied by POSIX.1‐2008 shall be encoded using UTF‐8.
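
       Because the <length> field counts its own digits, building a record
       takes a small fixed-point computation. The following Python sketch
       shows one way to build and parse records. The function names are
       hypothetical, and values are assumed to be UTF‐8 (that is, no hdrcharset
       record is in effect).

```python
def pax_record(keyword, value):
    """Build one "%d %s=%s\\n" record whose length includes its own digits."""
    body = (" %s=%s\n" % (keyword, value)).encode("utf-8")
    length = len(body)
    while True:
        total = len(body) + len(str(length))
        if total == length:
            break
        length = total
    return str(length).encode("ascii") + body

def parse_records(data):
    """Split extended header data into a keyword -> value mapping."""
    records, pos = {}, 0
    while pos < len(data):
        space = data.index(b" ", pos)
        length = int(data[pos:space])
        # Drop the trailing newline; the keyword ends at the first '='.
        keyword, _, value = data[space + 1:pos + length - 1].partition(b"=")
        records[keyword.decode("utf-8")] = value.decode("utf-8")
        pos += length
    return records
```

       For example, the record for path=foo has a 10-octet body (<blank>,
       keyword, <equals-sign>, value, <newline>), so its length field is 12.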
1004
1005       The <keyword> field shall be one of the entries from the following list
1006       or  a  keyword  provided as an implementation extension.  Keywords con‐
1007       sisting entirely of lowercase letters, digits, and periods are reserved
1008       for  future  standardization.  A  keyword shall not include an <equals-
1009       sign>.   (In  the  following  list,  the   notation   ``file(s)''   or
1010       ``block(s)''  is used to acknowledge that a keyword affects the follow‐
1011       ing single file after a typeflag x extended header, but possibly multi‐
1012       ple  files  after  typeflag g.  Any requirements in the list for pax to
1013       include a record when in write or copy mode shall apply only when  such
1014       a  record  has  not  already  been  provided  through the use of the -o
1015       option. When used in copy mode, pax shall behave as if an  archive  had
1016       been   created   with  applicable  extended  header  records  and  then
1017       extracted.)
1018
1019       atime     The file access time for the following file(s), equivalent to
1020                 the  value of the st_atime member of the stat structure for a
1021                 file, as described by the stat() function.  The  access  time
1022                 shall  be  restored if the process has appropriate privileges
1023                 required to do so. The format of  the  <value>  shall  be  as
1024                 described in pax Extended Header File Times.
1025
1026       charset   The  name of the character set used to encode the data in the
1027                 following file(s). The entries in  the  following  table  are
1028                 defined  to refer to known standards; additional names may be
1029                 agreed on between the originator and recipient.
1030
1031                   ┌────────────────────────┬───────────────────────────────┐
1032                   │        <value>         │        Formal Standard        │
1033                   ├────────────────────────┼───────────────────────────────┤
1034                   │ISO-IR 646 1990         │ ISO/IEC 646:1990              │
1035                   │ISO-IR 8859 1 1998      │ ISO/IEC 8859‐1:1998           │
1036                   │ISO-IR 8859 2 1999      │ ISO/IEC 8859‐2:1999           │
1037                   │ISO-IR 8859 3 1999      │ ISO/IEC 8859‐3:1999           │
1038                   │ISO-IR 8859 4 1998      │ ISO/IEC 8859‐4:1998           │
1039                   │ISO-IR 8859 5 1999      │ ISO/IEC 8859‐5:1999           │
1040                   │ISO-IR 8859 6 1999      │ ISO/IEC 8859‐6:1999           │
1041                   │ISO-IR 8859 7 1987      │ ISO/IEC 8859‐7:1987           │
1042                   │ISO-IR 8859 8 1999      │ ISO/IEC 8859‐8:1999           │
1043                   │ISO-IR 8859 9 1999      │ ISO/IEC 8859‐9:1999           │
1044                   │ISO-IR 8859 10 1998     │ ISO/IEC 8859‐10:1998          │
1045                   │ISO-IR 8859 13 1998     │ ISO/IEC 8859‐13:1998          │
1046                   │ISO-IR 8859 14 1998     │ ISO/IEC 8859‐14:1998          │
1047                   │ISO-IR 8859 15 1999     │ ISO/IEC 8859‐15:1999          │
1048                   │ISO-IR 10646 2000       │ ISO/IEC 10646:2000            │
1049                   │ISO-IR 10646 2000 UTF-8 │ ISO/IEC 10646, UTF-8 encoding │
1050                   │BINARY                  │ None.                         │
1051                   └────────────────────────┴───────────────────────────────┘
1052                 The encoding is included in an extended header  for  informa‐
1053                 tion  only; when pax is used as described in POSIX.1‐2008, it
1054                 shall not translate the file data into  any  other  encoding.
1055                 The BINARY entry indicates unencoded binary data.
1056
1057                 When used in write or copy mode, it is implementation-defined
1058                 whether pax includes a charset extended header record  for  a
1059                 file.
1060
1061       comment   A  series  of characters used as a comment. All characters in
1062                 the <value> field shall be ignored by pax.
1063
1064       gid       The group ID of the group that owns the file, expressed as  a
1065                 decimal  number  using digits from the ISO/IEC 646:1991 stan‐
1066                 dard. This record shall override the gid field in the follow‐
1067                 ing  header  block(s).  When  used in write or copy mode, pax
1068                 shall include a gid extended  header  record  for  each  file
1069                 whose group ID is greater than 2097151 (octal 7777777).
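
                 The threshold in the gid description follows from the ustar
                 header, whose gid field holds seven octal digits. A minimal
                 Python sketch of the check, using the hypothetical names
                 USTAR_ID_MAX and needs_gid_record:

```python
USTAR_ID_MAX = 0o7777777  # largest value that fits in seven octal digits

def needs_gid_record(gid):
    """True if the group ID requires a gid extended header record."""
    return gid > USTAR_ID_MAX
```

                 The octal constant 7777777 equals 2097151 in decimal, which
                 matches the limit quoted above.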
1070
1071       gname     The  group  of  the file(s), formatted as a group name in the
1072                 group database. This record shall override the gid and  gname
1073                 fields in the following header block(s), and any gid extended
1074                 header record. When used in read, copy,  or  list  mode,  pax
1075                 shall  translate  the  name  from  the encoding in the header
1076                 record to the character set appropriate for the  group  data‐
1077                 base on the receiving system. If any of the characters cannot
1078                 be translated, and if neither the -oinvalid=UTF‐8 option  nor
1079                 the  -oinvalid=binary  option  is  specified, the results are
1080                 implementation-defined.  When used in write or copy mode, pax
1081                 shall  include  a  gname extended header record for each file
1082                 whose group name cannot be represented entirely with the let‐
1083                 ters and digits of the portable character set.
1084
1085       hdrcharset
1086                 The  name of the character set used to encode the value field
1087                 of the gname, linkpath, path, and uname pax  extended  header
1088                 records.  The  entries  in the following table are defined to
1089                 refer to known standards;  additional  names  may  be  agreed
1090                 between the originator and the recipient.
1091
1092                   ┌────────────────────────┬───────────────────────────────┐
1093                   │        <value>         │        Formal Standard        │
1094                   ├────────────────────────┼───────────────────────────────┤
1095                   │ISO-IR 10646 2000 UTF-8 │ ISO/IEC 10646, UTF-8 encoding │
1096                   │BINARY                  │ None.                         │
1097                   └────────────────────────┴───────────────────────────────┘
1098                 If  no  hdrcharset  extended  header record is specified, the
1099                 default character set used to encode all values  in  extended
1100                 header  records  shall  be  the ISO/IEC 10646‐1:2000 standard
1101                 UTF‐8 encoding.
1102
1103                 The BINARY  entry  indicates  that  all  values  recorded  in
1104                 extended headers for affected files are unencoded binary data
1105                 from the underlying system.
1106
1107       linkpath  The pathname of a link being created to another file, of  any
1108                 type,  previously  archived.  This  record shall override the
1109                 linkname field in the following ustar  header  block(s).  The
1110                 following ustar header block shall determine the type of link
1111                 created. If typeflag of the following header block is  1,  it
1112                 shall  be  a  hard link. If typeflag is 2, it shall be a sym‐
1113                 bolic link and the linkpath value shall be  the  contents  of
1114                 the  symbolic  link. The pax utility shall translate the name
1115                 of the link (contents of the symbolic link) from the encoding
1116                 in  the header to the character set appropriate for the local
1117                 file system. When used in  write  or  copy  mode,  pax  shall
1118                 include a linkpath extended header record for each link whose
1119                 pathname cannot be represented entirely with the  members  of
1120                 the portable character set other than NUL.
1121
1122       mtime     The  file modification time of the following file(s), equiva‐
1123                 lent to the value of the st_mtime member of the  stat  struc‐
1124                 ture  for  a  file, as described in the stat() function. This
1125                 record shall override the mtime field in the following header
1126                 block(s).  The  modification  time  shall  be restored if the
1127                 process has appropriate privileges required  to  do  so.  The
1128                 format  of  the <value> shall be as described in pax Extended
1129                 Header File Times.
1130
1131       path      The pathname of the  following  file(s).  This  record  shall
1132                 override  the  name and prefix fields in the following header
1133                 block(s). The pax utility shall translate the pathname of the
1134                 file  from  the  encoding  in the header to the character set
1135                 appropriate for the local file system.
1136
1137                 When used in write or copy mode, pax  shall  include  a  path
1138                 extended header record for each file whose pathname cannot be
1139                 represented entirely with the members of the portable charac‐
1140                 ter set other than NUL.
1141
1142       realtime.any
1143                 The  keywords  prefixed  by  ``realtime.''  are  reserved for
1144                 future standardization.
1145
1146       security.any
1147                 The keywords  prefixed  by  ``security.''  are  reserved  for
1148                 future standardization.
1149
1150       size      The size of the file in octets, expressed as a decimal number
1151                 using digits from the ISO/IEC 646:1991 standard. This  record
1152                 shall  override  the  size  field  in  the  following  header
1153                 block(s). When used in write or copy mode, pax shall  include
1154                 a size extended header record for each file with a size value
1155                 greater than 8589934591 (octal 77777777777).
1156
1157       uid       The user ID of the file owner, expressed as a decimal  number
1158                 using  digits from the ISO/IEC 646:1991 standard. This record
1159                 shall  override  the  uid  field  in  the  following   header
1160                 block(s).  When used in write or copy mode, pax shall include
1161                 a uid extended header record for each file whose owner ID  is
1162                 greater than 2097151 (octal 7777777).
1163
1164       uname     The  owner of the following file(s), formatted as a user name
1165                 in the user database. This record shall override the uid  and
1166                 uname  fields  in  the following header block(s), and any uid
1167                 extended header record. When used  in  read,  copy,  or  list
1168                 mode,  pax  shall translate the name from the encoding in the
1169                 header record to the character set appropriate for  the  user
1170                 database  on  the  receiving system. If any of the characters
1171                 cannot be translated,  and  if  neither  the  -oinvalid=UTF‐8
1172                 option  nor  the  -oinvalid=binary  option  is specified, the
1173                 results are implementation-defined.  When used  in  write  or
1174                 copy  mode,  pax shall include a uname extended header record
1175                 for each file whose user name cannot be represented  entirely
1176                 with the letters and digits of the portable character set.
1177
1178       If  the  <value> field is zero length, it shall delete any header block
1179       field, previously entered extended header  value,  or  global  extended
1180       header value of the same name.
1181
1182       If  a keyword in an extended header record (or in a -o option-argument)
1183       overrides or deletes a corresponding field in the ustar  header  block,
1184       pax shall ignore the contents of that header block field.
1185
1186       Unlike  the ustar header block fields, NULs shall not delimit <value>s;
1187       all characters within the <value> field shall be  considered  data  for
1188       the  field.  None  of  the length limitations of the ustar header block
1189       fields in Table 4-14, ustar Header Block shall apply  to  the  extended
1190       header records.
1191
1192   pax Extended Header Keyword Precedence
1193       This  section  describes  the  precedence  in  which the various header
1194       records and fields and command line options are selected to apply to  a
1195       file  in  the archive. When pax is used in read or list modes, it shall
1196       determine a file attribute in the following sequence:
1197
1198        1. If -odelete=keyword-prefix is used, the affected  attributes  shall
1199           be determined from step 7, if applicable, or ignored otherwise.
1200
1201        2. If -okeyword:= is used, the affected attributes shall be ignored.
1202
1203        3. If  -okeyword:=value  is  used,  the  affected  attribute  shall be
1204           assigned the value.
1205
1206        4. If there is a typeflag  x  extended  header  record,  the  affected
1207           attribute  shall  be  assigned  the  <value>.  When extended header
1208           records conflict, the last one  given  in  the  header  shall  take
1209           precedence.
1210
1211        5. If  -okeyword=value  is  used,  the  affected  attribute  shall  be
1212           assigned the value.
1213
1214        6. If there is  a  typeflag  g  global  extended  header  record,  the
1215           affected  attribute  shall  be  assigned  the  <value>. When global
1216           extended header records conflict, the last one given in the  global
1217           header shall take precedence.
1218
1219        7. Otherwise,  the attribute shall be determined from the ustar header
1220           block.
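
       The sequence above can be sketched as a lookup that stops at the first
       source mentioning the keyword. This is only an illustration: the
       function name, its parameters, and the use of dictionaries are
       assumptions, and Python's fnmatchcase() merely approximates the
       pattern matching notation used by -o delete. An empty value is treated
       as "attribute ignored", following the zero-length <value> rule.

```python
from fnmatch import fnmatchcase


def resolve_attribute(keyword, ustar_fields, *, delete_patterns=(),
                      forced_opts=None, ext_header=None,
                      default_opts=None, global_header=None):
    """Return the value to apply for `keyword`, or None if it is ignored.

    forced_opts   : values given as -o keyword:=value  (steps 2 and 3)
    ext_header    : typeflag x extended header records (step 4)
    default_opts  : values given as -o keyword=value   (step 5)
    global_header : typeflag g extended header records (step 6)
    Each mapping is assumed to already hold the last value given.
    """
    forced_opts = forced_opts or {}
    ext_header = ext_header or {}
    default_opts = default_opts or {}
    global_header = global_header or {}

    # Step 1: a matching -o delete pattern skips straight to step 7.
    if any(fnmatchcase(keyword, pat) for pat in delete_patterns):
        return ustar_fields.get(keyword)

    # Steps 2 to 6: the first source that mentions the keyword wins.
    for source in (forced_opts, ext_header, default_opts, global_header):
        if keyword in source:
            value = source[keyword]
            return value if value != "" else None

    # Step 7: fall back to the ustar header block.
    return ustar_fields.get(keyword)
```

       For example, with uname set to "alice" in an extended header and to
       "bob" in a global extended header, the sketch returns "alice".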
1221
1222   pax Extended Header File Times
1223       The pax utility shall write an mtime record for each file in  write  or
1224       copy  modes  if  the  file's  modification  time  cannot be represented
1225       exactly in the ustar header logical record described  in  ustar  Inter‐
1226       change Format.  This can occur if the time is out of ustar range, or if
1227       the file system of the underlying implementation  supports  non-integer
1228       time  granularities  and  the time is not an integer. All of these time
1229       records shall be formatted as a decimal representation of the  time  in
1230       seconds  since  the Epoch. If a <period> ('.')  decimal point character
1231       is present, the digits to the right of the point  shall  represent  the
1232       units  of  a  subsecond  timing  granularity,  where the first digit is
1233       tenths of a second and each subsequent digit is a tenth of the previous
1234       digit. In read or copy mode, the pax utility shall truncate the time of
1235       a file to the greatest value that is not greater than the input  header
1236       file  time.  In write or copy mode, the pax utility shall output a time
1237       exactly if it can be represented exactly as a decimal number, and  oth‐
1238       erwise shall generate only enough digits so that the same time shall be
1239       recovered if the file is extracted on a system whose underlying  imple‐
1240       mentation supports the same time granularity.
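
       As an illustration of the time format, the following sketch formats a
       time held as whole seconds plus nanoseconds, and truncates a header
       time to a coarser granularity. The function names and the
       seconds/nanoseconds representation are assumptions made for the
       example; negative times are rejected when formatting to keep the
       sketch simple.

```python
from decimal import Decimal, ROUND_FLOOR


def format_pax_time(seconds, nanoseconds=0):
    """Format a time as decimal seconds since the Epoch.

    Trailing zeros of the fraction are dropped, so only as many digits
    are produced as are needed to recover the same time.
    """
    if seconds < 0 or not 0 <= nanoseconds < 1_000_000_000:
        raise ValueError("unsupported time value in this sketch")
    if nanoseconds == 0:
        return str(seconds)
    fraction = f"{nanoseconds:09d}".rstrip("0")
    return f"{seconds}.{fraction}"


def truncate_pax_time(value, digits):
    """Truncate a header time to `digits` fractional decimal digits.

    ROUND_FLOOR yields the greatest representable value that is not
    greater than the input header time.
    """
    quantum = Decimal(1).scaleb(-digits)
    return Decimal(value).quantize(quantum, rounding=ROUND_FLOOR)
```

       A file system with millisecond granularity would call
       truncate_pax_time(value, 3) in this sketch; one with whole-second
       granularity would pass 0.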
1241
1242   ustar Interchange Format
1243       A ustar archive tape or file shall contain a series of logical records.
1244       Each logical record shall be a fixed-size logical record of 512  octets
1245       (see  below). Although this format may be thought of as being stored on
1246       9-track industry-standard 12.7 mm (0.5 in) magnetic tape,  other  types
1247       of  transportable  media  are not excluded. Each file archived shall be
1248       represented by a header logical record that describes  the  file,  fol‐
1249       lowed  by  zero  or  more logical records that give the contents of the
1250       file. At the end of the archive file there shall be two 512-octet logi‐
1251       cal  records filled with binary zeros, interpreted as an end-of-archive
1252       indicator.
1253
1254       The logical records may be grouped  for  physical  I/O  operations,  as
1255       described  under  the  -bblocksize  and -x ustar options. Each group of
1256       logical records may be written with a single  operation  equivalent  to
1257       the  write() function. On magnetic tape, the result of this write shall
1258       be a single tape physical block. The last physical block  shall  always
1259       be the full size, so logical records after the two zero logical records
1260       may contain undefined data.
1261
1262       The header logical record shall be structured as shown in the following
1263       table. All lengths and offsets are in decimal.
1264
1265                           Table 4-14: ustar Header Block
1266
1267                  ┌───────────┬──────────────┬────────────────────┐
1268                  │Field Name │ Octet Offset │ Length (in Octets) │
1269                  ├───────────┼──────────────┼────────────────────┤
1270                  │name       │       0      │        100         │
1271                  │mode       │     100      │          8         │
1272                  │uid        │     108      │          8         │
1273                  │gid        │     116      │          8         │
1274                  │size       │     124      │         12         │
1275                  │mtime      │     136      │         12         │
1276                  │chksum     │     148      │          8         │
1277                  │typeflag   │     156      │          1         │
1278                  │linkname   │     157      │        100         │
1279                  │magic      │     257      │          6         │
1280                  │version    │     263      │          2         │
1281                  │uname      │     265      │         32         │
1282                  │gname      │     297      │         32         │
1283                  │devmajor   │     329      │          8         │
1284                  │devminor   │     337      │          8         │
1285                  │prefix     │     345      │        155         │
1286                  └───────────┴──────────────┴────────────────────┘
1287       All characters in the header logical record shall be represented in the
1288       coded character set  of  the  ISO/IEC 646:1991  standard.  For  maximum
1289       portability  between  implementations,  names  should  be selected from
1290       characters represented by the portable filename character set as octets
1291       with  the  most significant bit zero. If an implementation supports the
1292       use of characters outside of <slash> and the portable filename  charac‐
1293       ter  set in names for files, users, and groups, one or more implementa‐
1294       tion-defined encodings of these characters shall be provided for inter‐
1295       change purposes.
1296
1297       However, the pax utility shall never create filenames on the local sys‐
1298       tem  that  cannot  be  accessed  via  the   procedures   described   in
1299       POSIX.1‐2008. If a filename is found on the medium that would create an
1300       invalid filename, it is implementation-defined whether  the  data  from
1301       the  file  is  stored  on  the file hierarchy and under what name it is
1302       stored. The pax utility may choose to ignore these files as long as  it
1303       produces an error indicating that the file is being ignored.
1304
1305       Each  field  within  the  header logical record is contiguous; that is,
1306       there is no padding used. Each character on the archive medium shall be
1307       stored contiguously.
1308
1309       The fields magic, uname, and gname are character strings, each
1310       terminated by a NUL character. The fields name, linkname, and prefix
1311       are NUL-terminated character strings, except when every octet of the
1312       field, including the last, is non-NUL. The version field is two
1313       octets containing the characters "00" (zero-zero).
1314       The typeflag contains a single character. All other fields are  leading
1315       zero-filled  octal numbers using digits from the ISO/IEC 646:1991 stan‐
1316       dard IRV. Each numeric field is terminated by one or  more  <space>  or
1317       NUL characters.
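
       The layout in Table 4-14 and the field conventions just described can
       be sketched as a small decoder. The names USTAR_FIELDS and
       parse_ustar_header are hypothetical; the sketch is deliberately
       tolerant (it strips any surrounding <space> or NUL from numeric
       fields) and does not verify the checksum.

```python
# (field name, octet offset, length in octets), taken from Table 4-14.
USTAR_FIELDS = [
    ("name", 0, 100), ("mode", 100, 8), ("uid", 108, 8), ("gid", 116, 8),
    ("size", 124, 12), ("mtime", 136, 12), ("chksum", 148, 8),
    ("typeflag", 156, 1), ("linkname", 157, 100), ("magic", 257, 6),
    ("version", 263, 2), ("uname", 265, 32), ("gname", 297, 32),
    ("devmajor", 329, 8), ("devminor", 337, 8), ("prefix", 345, 155),
]
NUMERIC_FIELDS = {"mode", "uid", "gid", "size", "mtime", "chksum",
                  "devmajor", "devminor"}
FIXED_FIELDS = {"typeflag", "version"}  # not NUL-terminated strings


def parse_ustar_header(block):
    """Decode one 512-octet header logical record into a dict."""
    if len(block) != 512:
        raise ValueError("a header logical record is 512 octets")
    header = {}
    for name, offset, length in USTAR_FIELDS:
        raw = block[offset:offset + length]
        if name in NUMERIC_FIELDS:
            # Octal digits terminated by one or more <space> or NUL.
            digits = raw.strip(b" \0").decode("ascii")
            header[name] = int(digits, 8) if digits else 0
        elif name in FIXED_FIELDS:
            header[name] = raw.decode("ascii")
        else:
            # String ends at the first NUL, or fills the whole field.
            header[name] = raw.split(b"\0", 1)[0].decode("ascii")
    return header
```

       An all-NUL numeric field is reported as 0 here; that is a choice made
       for the sketch, not something this section requires.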
1318
1319       The  name and the prefix fields shall produce the pathname of the file.
1320       A new pathname shall be formed, if prefix is not an empty  string  (its
1321       first  character  is not NUL), by concatenating prefix (up to the first
1322       NUL character), a <slash> character, and name; otherwise, name is  used
1323       alone.  In  either case, name is terminated at the first NUL character.
1324       If prefix begins with a NUL character, it shall  be  ignored.  In  this
1325       manner,  pathnames  of  at  most  256 characters can be supported. If a
1326       pathname does not fit in the space provided, pax shall notify the  user
1327       of  the error, and shall not store any part of the file—header or data—
1328       on the medium.
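
       The rule for combining prefix and name, and one possible way to split
       a long pathname when writing, might look as follows. Both function
       names are hypothetical, and the choice of the first usable <slash> as
       the split point is an assumption; any split that fits both fields
       would do.

```python
NAME_LEN, PREFIX_LEN = 100, 155


def join_ustar_pathname(name, prefix):
    """Rebuild the pathname from the raw name and prefix fields."""
    name = name.split(b"\0", 1)[0]      # name ends at the first NUL
    prefix = prefix.split(b"\0", 1)[0]  # empty if it begins with NUL
    return prefix + b"/" + name if prefix else name


def split_ustar_pathname(path):
    """Return (prefix, name) for writing, or raise ValueError."""
    if len(path) <= NAME_LEN:
        return b"", path
    # The <slash> at index i must leave a non-empty prefix of at most
    # 155 octets and a non-empty name of at most 100 octets.
    lowest = max(1, len(path) - NAME_LEN - 1)
    highest = min(PREFIX_LEN, len(path) - 2)
    i = path.find(b"/", lowest, highest + 1)
    if i == -1:
        raise ValueError("pathname does not fit in name and prefix")
    return path[:i], path[i + 1:]
```

       A caller that gets ValueError would report the error and skip the
       file, storing neither its header nor its data.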
1329
1330       The linkname field, described below, shall not use the prefix  to  pro‐
1331       duce  a  pathname. As such, a linkname is limited to 100 characters. If
1332       the name does not fit in the space provided, pax shall notify the  user
1333       of the error, and shall not attempt to store the link on the medium.
1334
1335       The  mode  field provides 12 bits encoded in the ISO/IEC 646:1991 stan‐
1336       dard octal digit representation.  The encoded bits shall represent  the
1337       following values:
1338
1339                               Table: ustar mode Field
1340
1341   ┌──────────┬──────────────────┬─────────────────────────────────────────────────┐
1342   │Bit Value │ POSIX.1‐2008 Bit │                   Description                   │
1343   ├──────────┼──────────────────┼─────────────────────────────────────────────────┤
1344   │  04000   │ S_ISUID          │ Set UID on execution.                           │
1345   │  02000   │ S_ISGID          │ Set GID on execution.                           │
1346   │  01000   │ <reserved>       │ Reserved for future standardization.            │
1347   │  00400   │ S_IRUSR          │ Read permission for file owner class.           │
1348   │  00200   │ S_IWUSR          │ Write permission for file owner class.          │
1349   │  00100   │ S_IXUSR          │ Execute/search permission for file owner class. │
1350   │  00040   │ S_IRGRP          │ Read permission for file group class.           │
1351   │  00020   │ S_IWGRP          │ Write permission for file group class.          │
1352   │  00010   │ S_IXGRP          │ Execute/search permission for file group class. │
1353   │  00004   │ S_IROTH          │ Read permission for file other class.           │
1354   │  00002   │ S_IWOTH          │ Write permission for file other class.          │
1355   │  00001   │ S_IXOTH          │ Execute/search permission for file other class. │
1356   └──────────┴──────────────────┴─────────────────────────────────────────────────┘
1357       When appropriate privileges are required to set one of these mode bits,
1358       and the user restoring the files from the archive does not have  appro‐
1359       priate  privileges,  the  mode  bits  for  which the user does not have
1360       appropriate privileges shall be ignored. Some of the mode bits  in  the
1361       archive   format   are  not  mentioned  elsewhere  in  this  volume  of
1362       POSIX.1‐2017. If the implementation does not support those  bits,  they
1363       may be ignored.
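
       A minimal sketch of decoding the mode field with the bit values from
       the table above. The function names are hypothetical, and the
       assumption that only the set-ID bits require appropriate privileges
       is made purely for illustration; which bits need privileges depends
       on the system.

```python
# (bit value, POSIX.1 name) pairs in table order; 01000 is reserved.
USTAR_MODE_BITS = [
    (0o4000, "S_ISUID"), (0o2000, "S_ISGID"), (0o1000, "<reserved>"),
    (0o0400, "S_IRUSR"), (0o0200, "S_IWUSR"), (0o0100, "S_IXUSR"),
    (0o0040, "S_IRGRP"), (0o0020, "S_IWGRP"), (0o0010, "S_IXGRP"),
    (0o0004, "S_IROTH"), (0o0002, "S_IWOTH"), (0o0001, "S_IXOTH"),
]


def mode_bit_names(mode):
    """List the names of the bits set in a decoded mode value."""
    return [name for bit, name in USTAR_MODE_BITS if mode & bit]


def drop_unprivileged_bits(mode, privileged):
    """Ignore the set-ID bits when the user lacks privileges (assumed)."""
    return mode if privileged else mode & ~0o6000
```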
1364
1365       The uid and gid fields are the user and group ID of the owner and group
1366       of the file, respectively.
1367
1368       The size field is the size of the file in octets. If the typeflag field
1369       is  set  to  specify  a  file to be of type 1 (a link) or 2 (a symbolic
1370       link), the size field shall be specified as zero. If the typeflag field
1371       is set to specify a file of type 5 (directory), the size field shall be
1372       interpreted as described under the definition of that record  type.  No
1373       data  logical records are stored for types 1, 2, or 5.  If the typeflag
1374       field is set to 3 (character special file), 4 (block special file),  or
1375       6  (FIFO),  the meaning of the size field is unspecified by this volume
1376       of POSIX.1‐2017, and no data logical records shall  be  stored  on  the
1377       medium.  Additionally, for type 6, the size field shall be ignored when
1378       reading. If the typeflag field is set to any other value, the number of
1379       logical  records  written following the header shall be (size+511)/512,
1380       ignoring any fraction in the result of the division.
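
       The record count described above can be written as a short helper.
       The function name is hypothetical; typeflag is passed as a
       one-character string.

```python
RECORD_SIZE = 512


def data_record_count(typeflag, size):
    """Number of data logical records that follow a header."""
    # Types 1, 2 and 5 store no data; neither do types 3, 4 and 6.
    if typeflag in ("1", "2", "3", "4", "5", "6"):
        return 0
    # (size + 511) / 512, ignoring any fraction.
    return (size + RECORD_SIZE - 1) // RECORD_SIZE
```

       A 513-octet regular file therefore occupies two data logical records,
       the second of which is mostly padding.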
1381
1382       The mtime field shall be the modification time of the file at the  time
1383       it  was archived. It is the ISO/IEC 646:1991 standard representation of
1384       the octal value of the modification time obtained from the stat() func‐
1385       tion.
1386
1387       The chksum field shall be the ISO/IEC 646:1991 standard IRV representa‐
1388       tion of the octal value of the simple sum of all octets in  the  header
1389       logical  record.  Each  octet  in  the  header  shall  be treated as an
1390       unsigned value. These values shall be added  to  an  unsigned  integer,
1391       initialized  to  zero, the precision of which is not less than 17 bits.
1392       When calculating the checksum, the chksum field is  treated  as  if  it
1393       were all <space> characters.
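
       The checksum rule can be sketched as follows. The function names are
       hypothetical, and writing the field as seven octal digits followed by
       a NUL is one layout that satisfies the numeric field rules, not the
       only one.

```python
CHKSUM_OFFSET, CHKSUM_LENGTH = 148, 8


def ustar_checksum(block):
    """Sum all header octets as unsigned values.

    The chksum field itself is counted as if it held eight <space>
    characters.
    """
    if len(block) != 512:
        raise ValueError("a header logical record is 512 octets")
    before = block[:CHKSUM_OFFSET]
    after = block[CHKSUM_OFFSET + CHKSUM_LENGTH:]
    return sum(before) + CHKSUM_LENGTH * ord(" ") + sum(after)


def store_checksum(block):
    """Return a copy of the header with its chksum field filled in."""
    field = b"%07o\0" % ustar_checksum(block)
    return block[:CHKSUM_OFFSET] + field + block[CHKSUM_OFFSET + CHKSUM_LENGTH:]


def checksum_matches(block):
    """Check the stored chksum field against the computed sum."""
    raw = block[CHKSUM_OFFSET:CHKSUM_OFFSET + CHKSUM_LENGTH].strip(b" \0")
    return bool(raw) and int(raw.decode("ascii"), 8) == ustar_checksum(block)
```

       The largest possible sum is 512 * 255 = 130560, which is why 17 bits
       of precision are sufficient.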
1394
1395       The typeflag field specifies the type of file archived. If a particular
1396       implementation does not recognize the type, or the user does  not  have
1397       appropriate privileges to create that type, the file shall be extracted
1398       as if it were a regular file if the file type  is  defined  to  have  a
1399       meaning  for the size field that could cause data logical records to be
1400       written on the medium (see the previous description for size).  If con‐
1401       version  to  a  regular  file  occurs, the pax utility shall produce an
1402       error indicating that the conversion took place. All  of  the  typeflag
1403       fields shall be coded in the ISO/IEC 646:1991 standard IRV:
1404
1405       0       Represents a regular file. For backwards-compatibility, a type‐
1406               flag value of binary zero ('\0') should be recognized as  mean‐
1407               ing  a regular file when extracting files from the archive. Ar‐
1408               chives written with this version of  the  archive  file  format
1409               create   regular   files   with   a   typeflag   value  of  the
1410               ISO/IEC 646:1991 standard IRV '0'.
1411
1412       1       Represents a file linked to another file, of any  type,  previ‐
1413               ously  archived.  Such  files are identified by having the same
1414               device and file serial numbers, and  pathnames  that  refer  to
1415               different  directory  entries. All such files shall be archived
1416               as linked files.   The  linked-to  name  is  specified  in  the
1417               linkname  field  with  a NUL-character terminator if it is less
1418               than 100 octets in length.
1419
1420       2       Represents a symbolic link. The contents of the  symbolic  link
1421               shall be stored in the linkname field.
1422
1423       3,4     Represent  character  special  files  and  block  special files
1424               respectively.  In this case the devmajor  and  devminor  fields
1425               shall  contain  information  defining the device, the format of
1426               which is unspecified by this volume of POSIX.1‐2017.  Implemen‐
1427               tations  may  map  the device specifications to their own local
1428               specification or may ignore the entry.
1429
1430       5       Specifies a directory or subdirectory. On  systems  where  disk
1431               allocation  is  performed  on a directory basis, the size field
1432               shall contain the  maximum  number  of  octets  (which  may  be
1433               rounded  to  the  nearest  disk block allocation unit) that the
1434               directory may hold.  A size field of  zero  indicates  no  such
1435               limiting.  Systems  that do not support limiting in this manner
1436               should ignore the size field.
1437
1438       6       Specifies a FIFO special file. Note that  the  archiving  of  a
1439               FIFO  file archives the existence of this file and not its con‐
1440               tents.
1441
1442       7       Reserved to represent a file to  which  an  implementation  has
1443               associated  some  high-performance  attribute.  Implementations
1444               without such extensions should treat this  file  as  a  regular
1445               file (type 0).
1446
1447       A‐Z     The  letters  'A'  to  'Z',  inclusive, are reserved for custom
1448               implementations. All other values are reserved for future  ver‐
1449               sions of this standard.
1450
1451       It  is  unspecified whether files with pathnames that refer to the same
1452       directory entry are archived as linked files or as separate  files.  If
1453       they  are  archived  as  linked  files,  this  means that attempting to
1454       extract both pathnames from the resulting archive will always cause  an
1455       error  (unless  the  -u option is used) because the link cannot be cre‐
1456       ated.
1457
1458       It is unspecified whether files with the same device  and  file  serial
1459       numbers  being  appended  to  an archive are treated as linked files to
1460       members that were in the archive before the append.
1461
1462       Attempts to archive a socket shall produce a  diagnostic  message  when
1463       ustar  interchange  format  is used, but may be allowed when pax inter‐
1464       change format is used. Handling of other file types is  implementation-
1465       defined.
1466
1467       The  magic  field  is the specification that this archive was output in
1468       this archive format. If this field contains ustar (the five  characters
1469       from  the  ISO/IEC 646:1991  standard  IRV  shown followed by NUL), the
1470       uname and gname fields shall contain the ISO/IEC 646:1991 standard  IRV
1471       representation  of the owner and group of the file, respectively (trun‐
1472       cated to fit, if necessary). When the file is restored by a privileged,
1473       protection-preserving  version of the utility, the user and group data‐
1474       bases shall be scanned for these names. If found, the  user  and  group
1475       IDs  contained  within these files shall be used rather than the values
1476       contained within the uid and gid fields.
1477
1478   cpio Interchange Format
1479       The octet-oriented cpio archive format shall be a  series  of  entries,
1480       each comprising a header that describes the file, the name of the file,
1481       and then the contents of the file.
1482
1483       An archive may be recorded as a series of fixed-size blocks of  octets.
1484       This  blocking  shall be used only to make physical I/O more efficient.
1485       The last group of blocks shall always be at the full size.
1486
1487       For the octet-oriented cpio archive format, the individual entry infor‐
1488       mation  shall  be in the order indicated and described by the following
1489       table; see also the <cpio.h> header.
1490
1491                    Table 4-16: Octet-Oriented cpio Archive Entry
1492
1493            ┌─────────────────────┬────────────────────┬─────────────────┐
1494            │Header Field Name    │ Length (in Octets) │ Interpreted as  │
1495            ├─────────────────────┼────────────────────┼─────────────────┤
1496            │c_magic              │          6         │ Octal number    │
1497            │c_dev                │          6         │ Octal number    │
1498            │c_ino                │          6         │ Octal number    │
1499            │c_mode               │          6         │ Octal number    │
1500            │c_uid                │          6         │ Octal number    │
1501            │c_gid                │          6         │ Octal number    │
1502            │c_nlink              │          6         │ Octal number    │
1503            │c_rdev               │          6         │ Octal number    │
1504            │c_mtime              │         11         │ Octal number    │
1505            │c_namesize           │          6         │ Octal number    │
1506            │c_filesize           │         11         │ Octal number    │
1507            ├─────────────────────┼────────────────────┼─────────────────┤
1508            │Filename Field Name  │ Length             │ Interpreted as  │
1509            ├─────────────────────┴────────────────────┴─────────────────┤
1510            │c_name                 c_namesize           Pathname string │
1511            ├─────────────────────┬────────────────────┬─────────────────┤
1512            │File Data Field Name │ Length             │ Interpreted as  │
1513            ├─────────────────────┴────────────────────┴─────────────────┤
1514            │c_filedata             c_filesize           Data            │
1515            └────────────────────────────────────────────────────────────┘
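
       The fixed-width layout in Table 4-16 can be decoded with a short
       loop. This is a sketch under stated assumptions: the names
       CPIO_FIELDS, parse_cpio_header, and read_cpio_name are hypothetical,
       every header field is read as an octal number, c_magic is checked
       against the identifying value "070707", and c_namesize is taken to
       count the NUL that terminates the pathname.

```python
# (header field name, length in octets), in archive order.
CPIO_FIELDS = [
    ("c_magic", 6), ("c_dev", 6), ("c_ino", 6), ("c_mode", 6),
    ("c_uid", 6), ("c_gid", 6), ("c_nlink", 6), ("c_rdev", 6),
    ("c_mtime", 11), ("c_namesize", 6), ("c_filesize", 11),
]
CPIO_HEADER_LEN = sum(length for _, length in CPIO_FIELDS)  # 76 octets


def parse_cpio_header(buf):
    """Decode the fixed part of one octet-oriented cpio entry."""
    if len(buf) < CPIO_HEADER_LEN:
        raise ValueError("truncated cpio header")
    fields = {}
    pos = 0
    for name, length in CPIO_FIELDS:
        fields[name] = int(buf[pos:pos + length].decode("ascii"), 8)
        pos += length
    if fields["c_magic"] != 0o070707:
        raise ValueError("not an octet-oriented cpio header")
    return fields


def read_cpio_name(buf, fields):
    """Return c_name, which follows the header, without its NUL."""
    end = CPIO_HEADER_LEN + fields["c_namesize"]
    return buf[CPIO_HEADER_LEN:end].rstrip(b"\0")
```

       The file data (c_filedata) starts immediately after c_name and is
       c_filesize octets long.
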
1516   cpio Header
1517       For each file in the archive, a header as defined previously  shall  be
1518       written.  The information in the header fields is written as streams of
1519       the ISO/IEC 646:1991 standard characters interpreted as octal  numbers.
1520       The  octal numbers shall be extended to the necessary length by append‐
1521       ing the ISO/IEC 646:1991 standard IRV zeros  at  the  most-significant-
1522       digit  end of the number; the result is written to the most-significant
1523       digit of the stream of octets first.  The fields shall  be  interpreted
1524       as follows:
1525
1526       c_magic   Identify the archive as being a transportable archive by con‐
1527                 taining the identifying value "070707".
1528
1529       c_dev, c_ino
1530                 Contains values that uniquely identify the  file  within  the
1531                 archive (that is, no files contain the same pair of c_dev and
1532                 c_ino values unless they are links to  the  same  file).  The
1533                 values shall be determined in an unspecified manner.
1534
1535       c_mode    Contains  the  file type and access permissions as defined in
1536                 the following table.
1537
1538                           Table 4-17: Values for cpio c_mode Field
1539
1540                 ┌──────────────────────┬─────────┬────────────────────────┐
1541                 │File Permissions Name │  Value  │ Indicates              │
1542                 ├──────────────────────┼─────────┼────────────────────────┤
1543                 │C_IRUSR               │  000400 │ Read by owner          │
1544                 │C_IWUSR               │  000200 │ Write by owner         │
1545                 │C_IXUSR               │  000100 │ Execute by owner       │
1546                 │C_IRGRP               │  000040 │ Read by group          │
1547                 │C_IWGRP               │  000020 │ Write by group         │
1548                 │C_IXGRP               │  000010 │ Execute by group       │
1549                 │C_IROTH               │  000004 │ Read by others         │
1550                 │C_IWOTH               │  000002 │ Write by others        │
1551                 │C_IXOTH               │  000001 │ Execute by others      │
1552                 │C_ISUID               │  004000 │ Set uid                │
1553                 │C_ISGID               │  002000 │ Set gid                │
1554                 │C_ISVTX               │  001000 │ Reserved               │
1555                 ├──────────────────────┼─────────┼────────────────────────┤
1556                 │File Type Name        │  Value  │ Indicates              │
1557                 ├──────────────────────┼─────────┼────────────────────────┤
1558                 │C_ISDIR               │  040000 │ Directory              │
1559                 │C_ISFIFO              │  010000 │ FIFO                   │
1560                 │C_ISREG               │ 0100000 │ Regular file           │
1561                 │C_ISLNK               │ 0120000 │ Symbolic link          │
1562                 │                      │         │                        │
1563                 │C_ISBLK               │  060000 │ Block special file     │
1564                 │C_ISCHR               │  020000 │ Character special file │
1565                 │C_ISSOCK              │ 0140000 │ Socket                 │
1566                 │                      │         │                        │
1567                 │C_ISCTG               │ 0110000 │ Reserved               │
1568                 └──────────────────────┴─────────┴────────────────────────┘
1569                 Directories, FIFOs, symbolic links, and regular  files  shall
1570                 be  supported  on  a  system  conforming  to  this  volume of
1571                 POSIX.1‐2017;  additional  values  defined   previously   are
1572                 reserved for compatibility with existing systems.  Additional
1573                 file types may be supported; however, such files  should  not
1574                 be  written  to  archives intended to be transported to other
1575                 systems.
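
       As an illustration of how the c_mode values above combine, the
       following Python sketch splits an octal c_mode string into a file
       type and permission flags. The names decode_c_mode, TYPE_MASK, and
       TYPE_NAMES are hypothetical and not part of pax. The mask 0170000 is
       an assumption, chosen because it covers every file type value in the
       table.

```python
# File type values copied from the c_mode table above (octal).
TYPE_NAMES = {
    0o040000: "directory",
    0o010000: "FIFO",
    0o100000: "regular file",
    0o120000: "symbolic link",
    0o060000: "block special file",
    0o020000: "character special file",
    0o140000: "socket",
    0o110000: "reserved (C_ISCTG)",
}
TYPE_MASK = 0o170000  # assumption: wide enough for every type value above

# Permission bits from C_IRUSR down to C_IXOTH, in display order.
PERMISSION_BITS = (0o400, 0o200, 0o100, 0o040, 0o020, 0o010, 0o004, 0o002, 0o001)


def decode_c_mode(field):
    """Return (file type, rwx string, special flags) for an octal c_mode string."""
    mode = int(field, 8)
    file_type = TYPE_NAMES.get(mode & TYPE_MASK, "unknown")
    rwx = "".join(
        letter if mode & bit else "-"
        for bit, letter in zip(PERMISSION_BITS, "rwxrwxrwx")
    )
    special = [
        name
        for bit, name in ((0o4000, "setuid"), (0o2000, "setgid"))
        if mode & bit
    ]
    return file_type, rwx, special
```

       For example, decode_c_mode("100644") describes a regular file that
       is readable by everyone and writable by its owner.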
1576
1577       c_uid     Contains the user ID of the owner.
1578
1579       c_gid     Contains the group ID of the group.
1580
1581       c_nlink   Contains a number greater than or  equal  to  the  number  of
1582                 links  in  the archive referencing the file. If the -a option
1583                 is used to append to a cpio archive,  then  the  pax  utility
1584                 need  not  account  for the files in the existing part of the
1585                 archive when calculating the c_nlink values for the  appended
1586                 part of the archive, and need not alter the c_nlink values in
1587                 the existing part of the archive if additional files with the
1588                 same c_dev and c_ino values are appended to the archive.
1589
1590       c_rdev    Contains  implementation-defined information for character or
1591                 block special files.
1592
1593       c_mtime   Contains the latest time of modification of the file  at  the
1594                 time the archive was created.
1595
1596       c_namesize
1597                 Contains  the length of the pathname, including the terminat‐
1598                 ing NUL character.
1599
1600       c_filesize
1601                 Contains the length in octets of the data  section  following
1602                 the header structure.
1603
1604   cpio Filename
1605       The  c_name field shall contain the pathname of the file. The length of
1606       this field in octets is the value of c_namesize.
1607
1608       If a filename is found on the medium that would create an invalid path‐
1609       name,  it  is  implementation-defined whether the data from the file is
1610       stored on the file hierarchy and under what name it is stored.
1611
1612       All characters shall be represented in  the  ISO/IEC 646:1991  standard
1613       IRV.  For  maximum portability between implementations, names should be
1614       selected from characters represented by the portable filename character
1615       set  as octets with the most significant bit zero. If an implementation
1616       supports the use of characters outside the portable filename  character
1617       set  in names for files, users, and groups, one or more implementation-
1618       defined encodings of these characters shall be provided for interchange
1619       purposes.  However, the pax utility shall never create filenames on the
1620       local system that cannot be accessed via the procedures described  pre‐
1621       viously  in  this volume of POSIX.1‐2017. If a filename is found on the
1622       medium that would create an invalid  filename,  it  is  implementation-
1623       defined whether the data from the file is stored on the local file sys‐
1624       tem and under what name it is stored. The pax  utility  may  choose  to
1625       ignore  these files as long as it produces an error indicating that the
1626       file is being ignored.
1627
1628   cpio File Data
1629       Following c_name, there shall be c_filesize octets of data. Interpreta‐
1630       tion of such data occurs in a manner dependent on the file. For regular
1631       files, the data shall consist of the contents of the file. For symbolic
1632       links,  the data shall consist of the contents of the symbolic link. If
1633       c_filesize is zero, no data shall be contained in c_filedata.
1634
1635       When restoring from an archive:
1636
1637        *  If the user does not have appropriate privileges to create  a  file
1638           of  the  specified  type,  pax  shall ignore the entry and write an
1639           error message to standard error.
1640
1641        *  Only regular files and symbolic links have  data  to  be  restored.
1642           Presuming a regular file meets any selection criteria that might be
1643           imposed on the format-reading utility by the user, such data  shall
1644           be restored.
1645
1646        *  If  a user does not have appropriate privileges to set a particular
1647           mode flag, the flag shall be ignored. Some of the mode flags in the
1648           archive  format  are  not  mentioned  elsewhere  in  this volume of
1649           POSIX.1‐2017. If the implementation does not support  those  flags,
1650           they may be ignored.
1651
1652   cpio Special Entries
1653       FIFO special files, directories, and the trailer shall be recorded with
1654       c_filesize equal to zero. Symbolic links shall be recorded with c_file‐
1655       size  equal  to  the  length of the contents of the symbolic link.  For
1656       other special files,  c_filesize  is  unspecified  by  this  volume  of
1657       POSIX.1‐2017.  The  header for the next file entry in the archive shall
1658       be written directly after the last octet of the  file  entry  preceding
1659       it.  A  header denoting the filename TRAILER!!!  shall indicate the end
1660       of the archive; the contents of octets in the last block of the archive
1661       following such a header are undefined.
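
       The entry layout described above (header, c_name, file data, and a
       final TRAILER!!! entry) can be sketched in Python as follows. This is
       a minimal reader, not a conforming implementation: the field widths
       are taken from the cpio header table earlier in this page, the names
       iter_cpio and make_entry are hypothetical, and filenames are assumed
       to be plain ASCII.

```python
import io

# (field name, width in octets); every field is an octal number in ASCII.
FIELDS = (
    ("c_magic", 6), ("c_dev", 6), ("c_ino", 6), ("c_mode", 6),
    ("c_uid", 6), ("c_gid", 6), ("c_nlink", 6), ("c_rdev", 6),
    ("c_mtime", 11), ("c_namesize", 6), ("c_filesize", 11),
)
HEADER_SIZE = sum(width for _, width in FIELDS)


def iter_cpio(stream):
    """Yield (header, name, data) for each member before the trailer."""
    while True:
        raw = stream.read(HEADER_SIZE)
        if len(raw) < HEADER_SIZE:
            raise ValueError("archive ended before the TRAILER!!! entry")
        header, offset = {}, 0
        for field, width in FIELDS:
            header[field] = int(raw[offset:offset + width], 8)
            offset += width
        if header["c_magic"] != 0o070707:
            raise ValueError("unexpected c_magic value")
        # c_namesize counts the terminating NUL, which is dropped here.
        name = stream.read(header["c_namesize"])[:-1].decode("ascii")
        data = stream.read(header["c_filesize"])
        if name == "TRAILER!!!":
            return
        yield header, name, data


def make_entry(name, data=b"", mode=0o100644):
    """Build one entry for testing; fields not listed are written as zero."""
    name_bytes = name.encode("ascii") + b"\0"
    values = {
        "c_magic": 0o070707, "c_mode": mode, "c_nlink": 1,
        "c_namesize": len(name_bytes), "c_filesize": len(data),
    }
    header = "".join(f"{values.get(f, 0):0{w}o}" for f, w in FIELDS)
    return header.encode("ascii") + name_bytes + data
```

       A two-entry archive built with make_entry("hello.txt", b"hi\n")
       followed by make_entry("TRAILER!!!", mode=0) yields a single member
       when passed to iter_cpio through io.BytesIO.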
1662

EXIT STATUS

1664       The following exit values shall be returned:
1665
1666        0    All files were processed successfully.
1667
1668       >0    An error occurred.
1669

CONSEQUENCES OF ERRORS

1671       If pax cannot create a file or a link when reading an archive or cannot
1672       find a file when writing an archive, or cannot preserve  the  user  ID,
1673       group  ID,  or  file mode when the -p option is specified, a diagnostic
1674       message shall be written to standard error and a non-zero  exit  status
1675       shall be returned, but processing shall continue. In the case where pax
1676       cannot create a link to a file, pax shall not,  by  default,  create  a
1677       second copy of the file.
1678
1679       If  the  extraction of a file from an archive is prematurely terminated
1680       by a signal or error, pax may have only partially extracted the file or
1681       (if  the  -n option was not specified) may have extracted a file of the
1682       same name as that specified by the user, but which is not the file  the
1683       user wanted.  Additionally, the file modes of extracted directories may
1684       have additional bits from the S_IRWXU mask set  as  well  as  incorrect
1685       modification and access times.
1686
1687       The following sections are informative.
1688

APPLICATION USAGE

1690       Caution  is advised when using the -a option to append to a cpio format
1691       archive. If any of the files being appended happen to be given the same
1692       c_dev  and  c_ino values as a file in the existing part of the archive,
1693       then they may be treated as links to that file on extraction. Thus,  it
1694       is  risky to use -a with cpio format except when it is done on the same
1695       system that the original archive was created on, and with the same  pax
1696       utility,  and  in  the  knowledge that there has been little or no file
1697       system activity since the original archive was created that could  lead
1698       to  any of the files appended being given the same c_dev and c_ino val‐
1699       ues as an unrelated file in the existing part  of  the  archive.  Also,
1700       when (intentionally) appending additional links to a file in the exist‐
1701       ing part of the archive, the c_nlink values in the modified archive can
1702       be  smaller  than the number of links to the file in the archive, which
1703       may mean that the links are not preserved on extraction.
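
       The risk described above can be sketched as follows: a reader that
       identifies links by their (c_dev, c_ino) pair cannot tell an appended,
       unrelated file from a genuine link. The function link_groups and the
       sample device and inode numbers are hypothetical.

```python
from collections import defaultdict


def link_groups(entries):
    """Group member names that share a (c_dev, c_ino) pair, in archive order."""
    groups = defaultdict(list)
    for c_dev, c_ino, name in entries:
        groups[(c_dev, c_ino)].append(name)
    # Only pairs seen more than once look like hard links to a reader.
    return {key: names for key, names in groups.items() if len(names) > 1}


# Existing part of the archive.
existing = [(8, 1201, "docs/a.txt"), (8, 1202, "docs/b.txt")]
# Appended later, after inode 1201 was freed and reused by an unrelated file.
appended = [(8, 1201, "notes/unrelated.txt")]

suspect = link_groups(existing + appended)
```

       Here suspect reports notes/unrelated.txt as a link to docs/a.txt,
       although the two files were never linked.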
1704
1705       The -p  (privileges)  option  was  invented  to  reconcile  differences
1706       between historical tar and cpio implementations. In particular, the two
1707       utilities use -m in diametrically opposed ways. The -p option also pro‐
1708       vides  a  consistent  means  of extending the ways in which future file
1709       attributes can be addressed, such as for enhanced security  systems  or
1710       high-performance  files. Although it may seem complex, there are really
1711       two modes that are most commonly used:
1712
1713       -p e    ``Preserve everything''. This would be used by  the  historical
1714               superuser, someone with all appropriate privileges, to preserve
1715               all aspects of the files as they are recorded in  the  archive.
1716               The  e  flag  is  the sum of o and p, and other implementation-
1717               defined attributes.
1718
1719       -p p    ``Preserve'' the file mode bits. This would be used by the user
1720               with  regular  privileges who wished to preserve aspects of the
1721               file other than the ownership. The file times are preserved  by
1722               default,  but  two other flags are offered to disable these and
1723               use the time of extraction.
1724
1725       The one pathname per line format of standard input precludes  pathnames
1726       containing  <newline>  characters.  Although such pathnames violate the
1727       portable filename guidelines, they may exist  and  their  presence  may
1728       inhibit  usage  of  pax within shell scripts. This problem is inherited
1729       from historical archive programs. The problem can be avoided by listing
1730       filename arguments on the command line instead of on standard input.
1731
1732       It  is  almost certain that appropriate privileges are required for pax
1733       to accomplish parts of this volume of POSIX.1‐2017. Specifically,  cre‐
1734       ating  files of type block special or character special, restoring file
1735       access times unless the files are owned by the user (the -t option), or
1736       preserving  file  owner,  group,  and mode (the -p option) all probably
1737       require appropriate privileges.
1738
1739       In read mode, implementations are permitted to overwrite files when the
1740       archive  has multiple members with the same name. This may fail if per‐
1741       missions on the first version of the file do not permit it to be  over‐
1742       written.
1743
1744       The  cpio  and  ustar  formats  can only support files up to 8589934592
1745       bytes (8 ∗ 2^30) in size.
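
       The figure above follows from the width of the size fields. A short
       Python check, assuming 11 octal digits are available for the size (the
       width of c_filesize in the cpio header, and of the ustar size field
       once its terminator is excluded):

```python
OCTAL_DIGITS = 11  # assumption: usable digits in the cpio and ustar size fields

# Number of distinct values that 11 octal digits can express.
limit = 8 ** OCTAL_DIGITS

# Largest size that can actually be written: eleven '7' digits.
largest_encodable = int("7" * OCTAL_DIGITS, 8)
```

       limit equals 8 * 2**30, and largest_encodable is one less than that.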
1746
1747       When archives containing binary header information are listed, the
1748       filenames printed may cause strange behavior on some terminals.
1749
1750       When all of the following are true:
1751
1752        1. A file of type directory is being placed into an archive.
1753
1754        2. The ustar archive format is being used.
1755
1756        3. The  pathname  of  the directory is less than or equal to 155 bytes
1757           long (it will fit in the prefix field in the ustar header block).
1758
1759        4. The last component of the pathname of the directory is longer  than
1760           100  bytes  (it  will  not  fit in the name field in the ustar
1761           header block).
1762
1763       some implementations of the pax utility will place the entire directory
1764       pathname  in  the  prefix field, set the name field to an empty string,
1765       and place the directory in the archive.  Other implementations  of  the
1766       pax  utility will give an error under these conditions because the name
1767       field is not large enough to hold the last component of  the  directory
1768       name.  This standard allows either behavior. However, when extracting a
1769       directory from a ustar format archive, this standard requires that  all
1770       implementations  be  able to extract a directory even if the name field
1771       contains an empty string as long as the prefix field does not also con‐
1772       tain an empty string.
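
       The two permitted behaviors can be sketched in Python as follows. The
       functions split_ustar and join_ustar and the empty_name_ok switch are
       hypothetical; the field limits of 100 and 155 bytes are those given in
       the list above. Returning None stands for the implementations that
       report an error.

```python
NAME_LIMIT = 100    # bytes available in the ustar name field
PREFIX_LIMIT = 155  # bytes available in the ustar prefix field


def split_ustar(path, is_dir=False, empty_name_ok=True):
    """Return (prefix, name) as bytes, or None if the path cannot be stored."""
    if len(path) <= NAME_LIMIT:
        return b"", path
    # Try each <slash> as the split point; the slash itself is not stored.
    for i, byte in enumerate(path):
        name_len = len(path) - i - 1
        if byte == ord("/") and 0 < i <= PREFIX_LIMIT and 0 < name_len <= NAME_LIMIT:
            return path[:i], path[i + 1:]
    # Directory whose last component is too long for name but whose whole
    # pathname fits in prefix: either store it there or give up.
    if is_dir and empty_name_ok and len(path) <= PREFIX_LIMIT:
        return path, b""
    return None


def join_ustar(prefix, name):
    """Rebuild the pathname on extraction, tolerating an empty name field."""
    if prefix and name:
        return prefix + b"/" + name
    return prefix or name
```

       join_ustar shows the extraction side: a header with an empty name
       field and a non-empty prefix field still yields a usable pathname.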
1773

EXAMPLES

1775       The following command:
1776
1777
1778           pax -w -f /dev/rmt/1m .
1779
1780       copies  the  contents  of the current directory to tape drive 1, medium
1781       density (assuming historical System V device naming procedures—the his‐
1782       torical BSD device name would be /dev/rmt9).
1783
1784       The following commands:
1785
1786
1787           mkdir newdir
1788           pax -rw olddir newdir
1789
1790       copy the olddir directory hierarchy to newdir.
1791
1792       The following command:

1793           pax -r -s ',^//*usr//*,,' -f a.pax
1794
1795       reads  the  archive a.pax, with all files rooted in /usr in the archive
1796       extracted relative to the current directory.
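
       The effect of that substitution on individual pathnames can be
       approximated in Python. The helper strip_usr_root is hypothetical.
       The pattern uses only constructs that mean the same thing in POSIX
       basic regular expressions and in the Python re module.

```python
import re


def strip_usr_root(path):
    """Apply the equivalent of -s ',^//*usr//*,,' to one pathname."""
    # '//*' matches one or more slashes, so '/usr/', '//usr//' and so on
    # are removed from the start of the name; count=1 mirrors the absence
    # of the g flag in the replstr.
    return re.sub(r"^//*usr//*", "", path, count=1)
```

       With this helper, /usr/bin/ls becomes bin/ls, while usr/bin and
       /usrlocal/x are left unchanged because they do not match the pattern.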
1797
1798       Using the option:
1799
1800
1801           -o listopt="%M %(atime)T %(size)D %(name)s"
1802
1803       overrides the default output description in Standard Output and instead
1804       writes:
1805
1806
1807           -rw-rw---- Jan 12 15:53 2003 1492 /usr/foo/bar
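
       A rough Python equivalent of that listopt string is sketched below.
       It assumes that the T conversion without a subformat prints the time
       as %b %e %H:%M %Y, that times are shown in UTC (pax would use the
       local timezone), and that the member is not a device file, so D
       prints the size. The function format_entry is hypothetical.

```python
import calendar
import stat
import time

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def format_entry(mode, atime, size, name):
    """Approximate -o listopt="%M %(atime)T %(size)D %(name)s" for one member."""
    t = time.gmtime(atime)  # assumption: UTC instead of the local timezone
    when = (f"{MONTHS[t.tm_mon - 1]} {t.tm_mday:2d} "
            f"{t.tm_hour:02d}:{t.tm_min:02d} {t.tm_year}")
    # stat.filemode() renders the mode the way ls -l does.
    return f"{stat.filemode(mode)} {when} {size} {name}"


# Sample member: a regular file with mode 0660, accessed 12 Jan 2003 15:53 UTC.
sample_atime = calendar.timegm((2003, 1, 12, 15, 53, 0))
line = format_entry(0o100660, sample_atime, 1492, "/usr/foo/bar")
```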
1808
1809       Using the options:
1810
1811
1812           -o listopt='%L\t%(size)D\n%.7' \
1813           -o listopt='(name)s\n%(atime)T\n%T'
1814
1815       overrides the default output description in Standard Output and instead
1816       writes:
1817
1818
1819           /usr/foo/bar -> /tmp   1492
1820           /usr/fo
1821           Jan 12 15:53 1991
1822           Jan 31 15:53 2003
1823

RATIONALE

1825       The pax utility was new for the ISO POSIX‐2:1993  standard.  It  repre‐
1826       sents a peaceful compromise between advocates of the historical tar and
1827       cpio utilities.
1828
1829       A fundamental difference between cpio and tar was in the  way  directo‐
1830       ries  were  treated. The cpio utility did not treat directories differ‐
1831       ently from other files, and to select  a  directory  and  its  contents
1832       required  that  each file in the hierarchy be explicitly specified. For
1833       tar, a directory matched every file in the file hierarchy it rooted.
1834
1835       The pax utility offers both interfaces;  by  default,  directories  map
1836       into the file hierarchy they root. The -d option causes pax to skip any
1837       file not explicitly referenced,  as  cpio  historically  did.  The
1838       tar-style behavior was chosen as the default because it was believed that
1839       this was the more common usage and because tar  is  the  more  commonly
1840       available  interface,  as it was historically provided on both System V
1841       and BSD implementations.
1842
1843       The  data  interchange  format  specification   in   this   volume   of
1844       POSIX.1‐2017  requires  that  processes with ``appropriate privileges''
1845       shall always restore the ownership and permissions of  extracted  files
1846       exactly  as  archived.  If viewed from the historic equivalence between
1847       superuser and ``appropriate privileges'', there are two  problems  with
1848       this  requirement.  First,  users running as superusers may unknowingly
1849       set dangerous permissions on extracted files. Second, it is  needlessly
1850       limiting, in that superusers cannot extract files and own them as supe‐
1851       ruser unless the archive was created by the superuser.  (It  should  be
1852       noted that restoration of ownerships and permissions for the superuser,
1853       by default, is historical practice in cpio, but not in tar.)  In  order
1854       to  avoid  these  two problems, the pax specification has an additional
1855       ``privilege'' mechanism, the -p option. Only a pax invocation with  the
1856       privileges needed, and which has the -p option set using the e specifi‐
1857       cation character, has appropriate privileges to restore full  ownership
1858       and permission information.
1859
1860       Note  also that this volume of POSIX.1‐2017 requires that the file own‐
1861       ership and access permissions shall be set, on extraction, in the  same
1862       fashion  as  the creat() function when provided with the mode stored in
1863       the archive. This means that the file creation  mask  of  the  user  is
1864       applied to the file permissions.
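
       The effect of the file creation mask can be shown with a small Python
       calculation. The function extracted_mode is hypothetical. It models
       only the default case, where the mode is not explicitly preserved
       with the -p option.

```python
def extracted_mode(archived_mode, umask):
    """Permission bits of a file created creat()-style with the archived mode."""
    # creat() clears every permission bit that is set in the creation mask.
    return archived_mode & ~umask & 0o777
```

       For example, a member archived with mode 0666 and extracted under a
       mask of 022 ends up with mode 0644.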
1865
1866       Users should note that directories may be created by pax while extract‐
1867       ing files with permissions that are different from those  that  existed
1868       at the time the archive was created. When extracting sensitive informa‐
1869       tion into a directory  hierarchy  that  no  longer  exists,  users  are
1870       encouraged  to  set  their  file creation mask appropriately to protect
1871       these files during extraction.
1872
1873       The table of contents output is written to standard output  to  facili‐
1874       tate pipeline processing.
1875
1876       An early proposal had hard links displaying for all pathnames. This was
1877       removed because it complicates the output of the case where -v  is  not
1878       specified  and  does  not  match  historical  cpio usage. The hard-link
1879       information is available in the -v display.
1880
1881       The description of the -l option allows implementations  to  make  hard
1882       links  to  symbolic  links.   Earlier versions of this standard did not
1883       specify any way to create a hard link to  a  symbolic  link,  but  many
1884       implementations  provided this capability as an extension. If there are
1885       hard links to symbolic links when an archive is created, the  implemen‐
1886       tation  is  required to archive the hard link in the archive (unless -H
1887       or -L is specified). When in read mode and in  copy  mode,  implementa‐
1888       tions  supporting  hard  links  to  symbolic links should use them when
1889       appropriate.
1890
1891       The archive formats inherited from the POSIX.1‐1990 standard have  cer‐
1892       tain  restrictions  that have been brought along from historical usage.
1893       For example, there are restrictions on the length of  pathnames  stored
1894       in  the archive.  When pax is used in copy (-rw) mode (copying directory
1895       hierarchies), the ability to use extensions from the -x pax format
1896       overcomes these restrictions.
1897
1898       The default blocksize value of 5120 bytes for cpio was selected because
1899       it is one of the standard block-size values for cpio, set when  the  -B
1900       option  is  specified.  (The other default block-size value for cpio is
1901       512 bytes, and this was considered to be too small.) The default  block
1902       value  of 10240 bytes for tar was selected because that is the standard
1903       block-size value for BSD tar.  The maximum block size  of  32256  bytes
1904       (2^15-512 bytes) is the largest multiple of 512 bytes that fits into a
1905       signed 16-bit tape controller transfer register. There are known  limi‐
1906       tations  in  some  historical  systems that would prevent larger blocks
1907       from being accepted. Historical values were chosen to improve  compati‐
1908       bility with historical scripts using dd or similar utilities to manipu‐
1909       late archives. Also, default block sizes for any file type  other  than
1910       character special file have been deleted  from  this  volume  of
1911       POSIX.1‐2017 as unimportant and not likely to affect the  structure  of
1912       the resulting archive.
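
       The block-size arithmetic in this paragraph can be checked with a few
       lines of Python. The variable names are illustrative only.

```python
SIGNED_16_MAX = 2**15 - 1  # largest value of a signed 16-bit register
BLOCK_UNIT = 512           # block sizes are multiples of 512 bytes

# Largest multiple of 512 that does not exceed the register maximum.
max_blocksize = (SIGNED_16_MAX // BLOCK_UNIT) * BLOCK_UNIT

# The defaults quoted above, expressed in 512-byte units.
cpio_default_units = 5120 // BLOCK_UNIT
tar_default_units = 10240 // BLOCK_UNIT
```

       max_blocksize is 63 units of 512 bytes; one more unit would be 32768
       bytes, which no longer fits in the signed register.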
1913
1914       Implementations  are  permitted to modify the block-size value based on
1915       the archive format or the device to which the archive is being written.
1916       This  is to provide implementations with the opportunity to take advan‐
1917       tage of special types of devices, and it should not be used  without  a
1918       great  deal  of  consideration as it almost certainly decreases archive
1919       portability.
1920
1921       The intended use of the -n option was to permit extraction  of  one  or
1922       more files from the archive without processing the entire archive. This
1923       was viewed by the standard developers as offering  significant  perfor‐
1924       mance  advantages  over  historical  implementations.  The -n option in
1925       early proposals had three effects; the first was to cause special char‐
1926       acters in patterns to not be treated specially. The second was to cause
1927       only the first file that matched a pattern to be extracted.  The  third
1928       was  to  cause pax to write a diagnostic message to standard error when
1929       no file was found matching a specified pattern. Only the second  behav‐
1930       ior  is  retained  by  this  volume  of POSIX.1‐2017, for many reasons.
1931       First, it is in general not acceptable for a single option to have mul‐
1932       tiple  effects. Second, the ability to make pattern matching characters
1933       act as normal characters is useful for parts of  pax  other  than  file
1934       extraction.  Third,  a finer degree of control over the special charac‐
1935       ters is useful because users may wish to normalize only a  single  spe‐
1936       cial  character  in  a  single  filename.  Fourth, given a more general
1937       escape mechanism, the previous behavior of the -n option can be  easily
1938       obtained  using the -s option or a sed script. Finally, writing a diag‐
1939       nostic message when a pattern specified by the user is unmatched by any
1940       file is useful behavior in all cases.
1941
1942       In this version, the -n was removed from the copy mode synopsis of pax;
1943       it is inapplicable because there are no pattern operands  specified  in
1944       this mode.
1945
1946       Besides pax, POSIX.1‐2008 describes another method for copying subtrees
1947       as part of the cp utility. Both methods are historical practice:
1948       cp provides a simpler, more intuitive interface, while pax offers
1949       a finer granularity of control. Each provides additional  functionality
1950       to  the  other; in particular, pax maintains the hard-link structure of
1951       the hierarchy while cp does not. It is the intention  of  the  standard
1952       developers that the results be similar (using appropriate option combi‐
1953       nations in both utilities). The results are not required to be  identi‐
1954       cal; there seemed insufficient gain to applications to balance the dif‐
1955       ficulty of implementations having to guarantee that the  results  would
1956       be exactly identical.
1957
1958       A  single  archive  may  span  more than one file. It is suggested that
1959       implementations provide informative messages to the  user  on  standard
1960       error whenever the archive file is changed.
1961
1962       The -d option (do not create intermediate directories not listed in the
1963       archive) found in early proposals was originally provided as a  comple‐
1964       ment to the historic -d option of cpio.  It has been deleted.
1965
1966       The -s option in early proposals specified a subset of the substitution
1967       command from the ed utility. As there was no reason for only  a  subset
1968       to  be  supported,  the -s option is now compatible with the current ed
1969       specification. Since the delimiter can be any non-null  character,  the
1970       following usage with single <space> characters is valid:
1971
1972
1973           pax -s " foo bar " ...
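
       How such a replstr is taken apart can be sketched in Python. The
       function parse_replstr is hypothetical, and it is deliberately
       simplified: it does not handle a delimiter that is escaped inside the
       old or new string.

```python
def parse_replstr(replstr):
    """Split a -s argument of the form <d>old<d>new<d>[flags] into its parts."""
    delimiter = replstr[0]  # the first character chooses the delimiter
    parts = replstr[1:].split(delimiter)
    if len(parts) != 3:
        raise ValueError("expected exactly three delimiters")
    old, new, flags = parts
    return old, new, flags
```

       With this sketch, the argument " foo bar " is read as old string foo,
       new string bar, and no flags.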
1974
1975       The  -t  description  is  worded  so as to note that this may cause the
1976       access time update caused by some other activity  (which  occurs  while
1977       the file is being read) to be overwritten.
1978
1979       The  default  behavior of pax with regard to file modification times is
1980       the same as historical implementations of tar.  It is not the  histori‐
1981       cal behavior of cpio.
1982
1983       Because  the  -i  option uses /dev/tty, utilities without a controlling
1984       terminal are not able to use this option.
1985
1986       The -y option, found in early proposals, has  been  deleted  because  a
1987       line  containing  a  single  <period>  for the -i option has equivalent
1988       functionality. The special lines for the -i option (a  single  <period>
1989       and the empty line) are historical practice in cpio.
1990
1991       In early drafts, a -e charmap option was included to increase portability
1992       of files between systems using different coded character sets. This
1993       option  was omitted because it was apparent that consensus could not be
1994       formed for it. In this version, the use of UTF‐8 should be an  adequate
1995       substitute.
1996
1997       The ISO POSIX‐2:1993 standard and ISO POSIX‐1 standard requirements for
1998       pax, however, made it very difficult to create a  single  archive  con‐
1999       taining  files  created using extended characters provided by different
2000       locales.  This version adds the hdrcharset keyword to make it  possible
2001       to  archive files in these cases without dropping files due to transla‐
2002       tion errors.
2003
2004       Translating filenames and other attributes from a locale's encoding  to
2005       UTF‐8  and then back again can lose information, as the resulting file‐
2006       name might not be byte-for-byte equivalent to the  original.  To  avoid
2007       this  problem, users can specify the -o hdrcharset=binary option, which
2008       will cause the resulting archive to use binary format for all names and
2009       attributes. Such archives are not portable among hosts that use differ‐
2010       ent native encodings (e.g., EBCDIC versus ASCII-based  encodings),  but
2011       they  will allow interchange among the vast majority of POSIX file sys‐
2012       tems in practical use. Also, the -o hdrcharset=binary option will cause
2013       pax  in  copy mode to behave more like other standard utilities such as
2014       cp.
2015
2016       If the  values  specified  by  the  -o  exthdr.name=value,  -o  globex‐
2017       thdr.name=value, or by $TMPDIR (if -o globexthdr.name is not specified)
2018       require  a  character  encoding  other  than  that  described  in   the
2019       ISO/IEC 646:1991  standard,  a path extended header record will have to
2020       be created for the file. If a  hdrcharset  extended  header  record  is
2021       active  for  such  headers,  it will determine the codeset used for the
2022       value field in these extended path header records. These path  extended
2023       header  records  always need to be created when writing an archive even
2024       if hdrcharset=binary has been specified  and  would  contain  the  same
2025       (binary)  data  that appears in the ustar header record prefix and name
2026       fields. (In other words, an  extended  header  path  record  is  always
2027       required to be generated if the prefix or name fields contain non-ASCII
2028       characters even when hdrcharset=binary  is  also  in  effect  for  that
2029       file.)
2030
2031       The  -k  option  was  added to address international concerns about the
2032       dangers involved in the character set transformations  of  -e  (if  the
2033       target  character  set  were  different  from the source, the filenames
2034       might be transformed into names matching existing files) and  also  was
2035       made  more  general  to  protect files transferred between file systems
2036       with different {NAME_MAX} values (truncating a filename  on  a  smaller
2037       system  might  also inadvertently overwrite existing files). As stated,
2038       it prevents any overwriting, even if the target file is older than  the
2039       source.  This  version  adds  more granularity of options to solve this
2040       problem by introducing the -o invalid=option, specifically the UTF‐8  and
2041       binary  actions.  (Note that an existing file is still subject to over‐
2042       writing in this case. The -k option closes that loophole.)
2043
2044       Some  of  the  file  characteristics  referenced  in  this  volume   of
2045       POSIX.1‐2017  might not be supported by some archive formats. For exam‐
2046       ple, neither the tar nor cpio formats contain the file access time. For
2047       this  reason, the e specification character has been provided, intended
2048       to cause all file  characteristics  specified  in  the  archive  to  be
2049       retained.
2050
2051       It  is  required  that  extracted  directories,  by default, have their
2052       access and modification times and permissions set to the values  speci‐
2053       fied  in the archive. This has obvious problems in that the directories
2054       are almost certainly modified after being extracted and that  directory
2055       permissions  may  not permit file creation. One possible solution is to
2056       create directories with the mode specified in the archive, as  modified
2057       by  the  umask  of  the user, with sufficient permissions to allow file
2058       creation. After all files have been extracted, pax would then reset the
2059       access and modification times and permissions as necessary.
2060
2061       The  list-mode  formatting  description  borrows  heavily  from the one
2062       defined by the printf utility. However, since there is no separate  op‐
2063       erand  list  to  get  conversion  arguments, the format was extended to
2064       allow specifying the name of the conversion argument  as  part  of  the
2065       conversion specification.
2066
2067       The T conversion specifier allows time fields to be displayed in any of
2068       the date formats. Unlike the ls utility, pax does not adjust the format
2069       when  the  date is less than six months in the past. This makes parsing
2070       the output more predictable.
2071
2072       The  D  conversion  specifier  handles  the  ability  to  display   the
2073       major/minor or file size, as with ls, by using %-8(size)D.
2074
2075       The L conversion specifier handles the ls display for symbolic links.
2076
2077       Conversion  specifiers were added to generate existing known types used
2078       for ls.
2079
2080   pax Interchange Format
2081       The new POSIX data interchange format was developed primarily  to  sat‐
2082       isfy  international  concerns  that  the ustar and cpio formats did not
2083       provide for file, user, and group names encoded in characters outside a
2084       subset  of the ISO/IEC 646:1991 standard. The standard developers real‐
2085       ized that this new POSIX data interchange format should be very  exten‐
2086       sible  because  there  were other requirements they foresaw in the near
2087       future:
2088
2089        *  Support international character encodings and locale information
2090
2091        *  Support security information (ACLs, and so on)
2092
2093        *  Support future file types, such as realtime or contiguous files
2094
2095        *  Include data areas for implementation use
2096
2097        *  Support systems with words larger than 32 bits and timers with sub‐
2098           second granularity
2099
2100       The  following  were not goals for this format because these are better
2101       handled by separate utilities or are inappropriate for a portable  for‐
2102       mat:
2103
2104        *  Encryption
2105
2106        *  Compression
2107
2108        *  Data translation between locales and codesets
2109
2110        *  inode storage
2111
2112       The  format  chosen  to  support the goals is an extension of the ustar
2113       format. Of the two formats previously available, only the ustar  format
2114       was selected for extensions because:
2115
2116        *  It  was  easier  to extend in an upwards-compatible way. It offered
2117           version flags and header block type fields  with  room  for  future
2118           standardization.  The cpio format, while possessing a more flexible
2119           file naming methodology, could not  be  extended  without  breaking
2120           some  theoretical  implementation  or  using  a dummy filename that
2121           could be a legitimate filename.
2122
2123        *  Industry experience since  the  original  ``tar  wars''  fought  in
2124           developing  the  ISO POSIX‐1  standard has clearly been in favor of
2125           the ustar format, which is  generally  the  default  output  format
2126           selected for pax implementations on new systems.
2127
2128       The  new  format was designed with one additional goal in mind: reason‐
2129       able behavior when an older tar or pax utility happened to read an  ar‐
2130       chive. Since the POSIX.1‐1990 standard mandated that a ``format-reading
2131       utility'' had to treat unrecognized typeflag values as  regular  files,
2132       this  allowed  the  format to include all the extended information in a
2133       pseudo-regular file that preceded each real file. An  option  is  given
2134       that  allows  the  archive creator to set up reasonable names for these
2135       files on the older systems. Also, the normative text suggests that rea‐
2136       sonable  file access values be used for this ustar header block. Making
2137       these header files inaccessible for  convenient  reading  and  deleting
2138       would not be reasonable. File permissions of 600 or 700 are suggested.
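
       The layout described above, where the extended information travels in
       a pseudo-regular file placed ahead of the real file, can be observed
       with a short Python sketch. It uses the tarfile module of the Python
       standard library as a stand-in for pax, which is an assumption of this
       example and not something the standard prescribes; the helper names
       are hypothetical. A non-ASCII member name is used so that an extended
       header is needed.

```python
import io
import tarfile


def write_pax_archive(name: str, data: bytes) -> bytes:
    """Write a one-member archive in the pax interchange format, in memory."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tf:
        info = tarfile.TarInfo(name)
        info.size = len(data)
        tf.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def first_typeflag(archive: bytes) -> bytes:
    """Return the typeflag octet (offset 156) of the first header block."""
    return archive[156:157]


def list_members(archive: bytes) -> list:
    """List member names as seen by a reader that understands pax headers."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r") as tf:
        return tf.getnames()


archive = write_pax_archive("résumé.txt", b"hello")
print(first_typeflag(archive))   # typeflag of the header that precedes the file
print(list_members(archive))     # only the real member is reported
```

       A reader that does not know typeflag x would instead see that first
       block as an ordinary file, which is the fallback behavior the
       paragraph above relies on.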
2139
2140       The  ustar  typeflag field was used to accommodate the additional func‐
2141       tionality of the new format rather than magic or  version  because  the
2142       POSIX.1‐1990 standard (and, by reference, the previous version of pax),
2143       mandated the behavior of the format-reading utility when it encountered
2144       an unknown typeflag, but was silent about the other two fields.
2145
2146       Early proposals for the first version of this standard contained a pro‐
2147       posed archive format that was based on compatibility with the  standard
2148       for  tape  files  (ISO 1001, similar to the format used historically on
2149       many mainframes and minicomputers). This format was overly complex  and
2150       required  considerable  overhead in volume and header records. Further‐
2151       more, the standard developers felt that it would not be  acceptable  to
2152       the community of POSIX developers, so it was later changed to be a for‐
2153       mat more closely related to historical practice on POSIX systems.
2154
2155       The prefix and name split of pathnames in ustar  was  replaced  by  the
2156       single path extended header record for simplicity.
2157
2158       The concept of a global extended header (typeflag g) was controversial.
2159       If this were applied to an archive being recorded on magnetic  tape,  a
2160       few  unreadable  blocks at the beginning of the tape could be a serious
2161       problem; a utility attempting to extract as many files as possible from
2162       a damaged archive could lose a large percentage of file header informa‐
2163       tion in this case. However, if the archive were on a  reliable  medium,
2164       such as a CD‐ROM, the global extended header offers considerable poten‐
2165       tial size reductions by eliminating redundant  information.  Thus,  the
2166       text  warns  against  using  the global method for unreliable media and
2167       provides a method for implanting global  information  in  the  extended
2168       header for each file, rather than in the typeflag g records.
2169
2170       No  facility  for  data translation or filtering on a per-file basis is
2171       included because the standard developers could not invent an  interface
2172       that  would  allow  this  in  an efficient manner. If a filter, such as
2173       encryption or compression, is to be applied to all  the  files,  it  is
2174       more  efficient  to  apply the filter to the entire archive as a single
2175       file. The standard developers considered interfaces that would invoke a
2176       shell  script  for  each file going into or out of the archive, but the
2177       system overhead in this approach was considered to be too high.
2178
2179       One such approach would be to have filter= records that give a pathname
2180       for  an  executable.  When the program is invoked, the file and archive
2181       would be open for standard input/output and all the header fields would
2182       be  available  as  environment variables or command-line arguments. The
2183       standard developers did discuss such schemes,  but  they  were  omitted
2184       from  POSIX.1‐2008  due to concerns about excessive overhead. Also, the
2185       program itself would need to be in the archive if it were  to  be  used
2186       portably.
2187
2188       There  is  currently  no  portable  means  of identifying the character
2189       set(s) used for a file in the file system. Therefore, pax has not  been
2190       given  a  mechanism to generate charset records automatically. The only
2191       portable means of doing this is for the user to write the archive using
2192       the  -ocharset=string command line option. This assumes that all of the
2193       files in the archive  use  the  same  encoding.  The  ``implementation-
2194       defined''  text is included to allow for a system that can identify the
2195       encodings used for each of its files.
2196
2197       The table of standards that accompanies the charset record  description
2198       is  acknowledged to be very limited. Only a limited number of character
2199       set standards is reasonable for maximal interchange. Any character  set
2200       is,  of  course,  possible  by  prior  agreement. It was suggested that
2201       EBCDIC be listed, but it was omitted because it is  not  defined  by  a
2202       formal  standard. Formal standards, and then only those with reasonably
2203       large followings, can be included here, simply as a matter  of  practi‐
2204       cality. The <value>s represent names of officially registered character
2205       sets in the format required by the ISO 2375:1985 standard.
2206
2207       The normal <comma> or <blank>-separated list rules are not followed  in
2208       the  case  of  keyword  options  to  allow ease of argument parsing for
2209       getopts.
2210
2211       Further information on character encodings is in pax Archive  Character
2212       Set Encoding/Decoding.
2213
2214       The  standard  developers  have  reserved keyword name space for vendor
2215       extensions. It is suggested that the format to be used is:
2216
2217
2218           VENDOR.keyword
2219
2220       where VENDOR is the name of the vendor or organization in all uppercase
2221       letters.  It  is  further  suggested  that  the  keyword  following the
2222       <period> be named differently than any of the standard keywords so that
2223       it  could  be used for future standardization, if appropriate, by omit‐
2224       ting the VENDOR prefix.
2225
2226       The <length> field in the extended header record was included  to  make
2227       it  simpler  to  step through the records, even if a record contains an
2228       unknown format (to a particular pax) with complex interactions of  spe‐
2229       cial  characters.  It also provides a minor integrity checkpoint within
2230       the records to aid a program attempting to recover files from a damaged
2231       archive.
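
       As an illustration of how the <length> field lets a reader step over
       records it does not understand, the following Python sketch builds and
       walks extended header records of the form "<length> <keyword>=<value>"
       followed by a <newline>, where <length> counts the whole record
       including the length digits themselves. The function names are
       hypothetical and the sketch omits the error recovery a real pax would
       need.

```python
def make_record(keyword: str, value: str) -> bytes:
    """Build one extended header record with a self-inclusive length."""
    body = f" {keyword}={value}\n".encode("utf-8")
    length = len(body)
    # The length field counts its own digits, so iterate to a fixed point.
    while True:
        total = len(str(length)) + len(body)
        if total == length:
            break
        length = total
    return str(length).encode("ascii") + body


def parse_records(data: bytes) -> list:
    """Step through records using only the <length> field of each one."""
    records = []
    pos = 0
    while pos < len(data):
        space = data.index(b" ", pos)
        length = int(data[pos:space])
        record = data[pos:pos + length]
        if len(record) != length or not record.endswith(b"\n"):
            raise ValueError("damaged record at offset %d" % pos)
        keyword, _, value = record[space - pos + 1:-1].partition(b"=")
        records.append((keyword.decode("utf-8"), value.decode("utf-8")))
        pos += length  # jump to the next record without scanning the value
    return records


blob = make_record("path", "/tmp/x") + make_record("VENDOR.devmajor", "8")
print(parse_records(blob))
```

       Because the reader advances by <length> rather than by searching for
       a terminator, a value containing <newline> or '=' characters does not
       confuse the walk.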
2232
2233       There  are  no  extended  header  versions of the devmajor and devminor
2234       fields because the unspecified format ustar header field should be suf‐
2235       ficient.  If  they  are not, vendor-specific extended keywords (such as
2236       VENDOR.devmajor) should be used.
2237
2238       Device and i-number labeling of files was not adopted from cpio;  files
2239       are interchanged strictly on a symbolic name basis, as in ustar.
2240
2241       Just  as  with  the  ustar format descriptions, the new format makes no
2242       special arrangements for multi-volume archives. Each of the pax archive
2243       types  is  assumed  to be inside a single POSIX file and splitting that
2244       file over multiple volumes (diskettes, tape  cartridges,  and  so  on),
2245       processing  their  labels, and mounting each in the proper sequence are
2246       considered to  be  implementation  details  that  cannot  be  described
2247       portably.
2248
2249       The  pax  format  is intended for interchange, not only for backup on a
2250       single (family of) systems. It is not as densely  packed  as  might  be
2251       possible for backup:
2252
2253        *  It  contains information as coded characters that could be coded in
2254           binary.
2255
2256        *  It identifies extended records with name fields that could be omit‐
2257           ted in favor of a fixed-field layout.
2258
2259        *  It  translates  names  into a portable character set and identifies
2260           locale-related information, both of which are probably  unnecessary
2261           for backup.
2262
2263       The  requirements  on  restoring from an archive are slightly different
2264       from the historical wording, allowing for non-monolithic  privilege  to
2265       bring  forward  as  much as possible. In particular, attributes such as
2266       ``high performance file'' might be broadly but not universally  granted
2267       while set-user-ID or chown() might be much more restricted. There is no
2268       implication in POSIX.1‐2008 that the security  information  be  honored
2269       after  it  is restored to the file hierarchy, in spite of what might be
2270       improperly inferred by the silence on that topic. That is a  topic  for
2271       another standard.
2272
2273       Links  are recorded in the fashion described here because a link can be
2274       to any file type. It is desirable in general to be able to restore part
2275       of an archive selectively and restore all of those files completely. If
2276       the data is not associated with each link, it is  not  possible  to  do
2277       this.  However,  the data associated with a file can be large, and when
2278       selective restoration is not needed, this can be a significant  burden.
2279       The  archive  is  structured so that files that have no associated data
2280       can always be restored by the name of any link, and
2281       the  user  may  choose whether data is recorded with each instance of a
2282       file that contains data. The format permits mixing  of  both  types  of
2283       links  in a single archive; this can be done for special needs, and pax
2284       is expected to interpret such archives on input properly,  despite  the
2285       fact  that  there  is no pax option that would force this mixed case on
2286       output. (When -o linkdata is used, the output must contain  the  dupli‐
2287       cate data, but the implementation is free to include it or omit it when
2288       -o linkdata is not used.)
2289
2290       The time values are included  as  extended  header  records  for  those
2291       implementations  needing  more  than the eleven octal digits allowed by
2292       the ustar format. Portable file timestamps cannot be negative.  If  pax
2293       encounters  a  file with a negative timestamp in copy or write mode, it
2294       can reject the file, substitute a non-negative timestamp, or generate a
2295       non-portable  timestamp with a leading '-'.  Even though some implemen‐
2296       tations can support finer file-time  granularities  than  seconds,  the
2297       normative  text  requires  support  only  for  seconds  since the Epoch
2298       because the ISO POSIX‐1 standard states them that way. The ustar format
2299       includes  only mtime; the new format adds atime and ctime for symmetry.
2300       The atime access time restored to the file system will be  affected  by
2301       the -p a and -p e options. The ctime creation time (actually inode mod‐
2302       ification time) is described with appropriate privileges so that it can
2303       be  ignored  when  writing to the file system. POSIX does not provide a
2304       portable means to change file creation time.  Nothing  is  intended  to
2305       prevent a non-portable implementation of pax from restoring the value.
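
       The eleven-octal-digit limit mentioned above can be made concrete
       with a small Python sketch. The function names and the three outcome
       labels are hypothetical, and treating a fractional timestamp as
       needing an extended header record is an assumption of this sketch
       rather than a requirement of the normative text.

```python
USTAR_MTIME_MAX = 0o77777777777  # largest value in eleven octal digits


def mtime_plan(timestamp: float) -> str:
    """Classify how a timestamp could be recorded."""
    if timestamp < 0:
        # Not portable: an implementation may reject or substitute it.
        return "non-portable"
    if timestamp != int(timestamp) or timestamp > USTAR_MTIME_MAX:
        return "extended"
    return "ustar"


def ustar_mtime_field(timestamp: int) -> str:
    """Format a timestamp as eleven zero-padded octal digits."""
    if not 0 <= timestamp <= USTAR_MTIME_MAX:
        raise ValueError("timestamp does not fit in eleven octal digits")
    return format(timestamp, "011o")


print(USTAR_MTIME_MAX)
print(mtime_plan(USTAR_MTIME_MAX + 1))
```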
2306
2307       The  gid,  size, and uid extended header records were included to allow
2308       expansion beyond the sizes specified in the  regular  tar  header.  New
2309       file  system  architectures are emerging that will exhaust the 12-digit
2310       size field. There are probably not many systems requiring more  than  8
2311       digits  for  user  and  group  IDs, but the extended header values were
2312       included for completeness, allowing overrides for all  of  the  decimal
2313       values in the tar header.
2314
2315       The  standard  developers intended to describe the effective results of
2316       pax with regard to file ownerships and permissions; implementations are
2317       not  restricted  in  timing or sequencing the restoration of such, pro‐
2318       vided the results are as specified.
2319
2320       Much of the text describing the  extended  headers  refers  to  use  in
2321       ``write or copy modes''. The copy mode references are due to the norma‐
2322       tive text: ``The effect of the copy shall be as  if  the  copied  files
2323       were  written to an archive file and then subsequently extracted ...''.
2324       There is certainly no way to test whether pax  is  actually  generating
2325       the  extended  headers  in  copy mode, but the effects must be as if it
2326       had.
2327
2328   pax Archive Character Set Encoding/Decoding
2329       There is a need to exchange archives of files between systems  of  dif‐
2330       ferent  native codesets. Filenames, group names, and user names must be
2331       preserved to the fullest extent possible when an archive is read on the
2332       receiving  platform. Translation of the contents of files is not within
2333       the scope of the pax utility.
2334
2335       There will also be the need to represent characters that are not avail‐
2336       able  on the receiving platform. These unsupported characters cannot be
2337       automatically folded to the local set of characters due to  the  chance
2338       of  collisions.  This  could  result  in overwriting previous extracted
2339       files from the archive or pre-existing files on the system.
2340
2341       For these reasons, the codeset used to represent characters within  the
2342       extended header records of the pax archive must be sufficiently rich to
2343       handle all commonly used character sets. The fields requiring  transla‐
2344       tion  include,  at  a  minimum, filenames, user names, group names, and
2345       link pathnames. Implementations may wish  to  have  localized  extended
2346       keywords that use non-portable characters.
2347
2348       The standard developers considered the following options:
2349
2350        *  The  archive  creator specifies the well-defined name of the source
2351           codeset. The receiver must then recognize the codeset name and per‐
2352           form the appropriate translations to the destination codeset.
2353
2354        *  The  archive creator includes within the archive the character map‐
2355           ping table for the source codeset used to  encode  extended  header
2356           records.   The  receiver must then read the character mapping table
2357           and perform the appropriate translations to the  destination  code‐
2358           set.
2359
2360        *  The  archive  creator translates the extended header records in the
2361           source codeset into a canonical form. The receiver must  then  per‐
2362           form the appropriate translations to the destination codeset.
2363
2364       The approach that incorporates the name of the source codeset poses the
2365       problem of codeset name registration, and makes the archive useless  to
2366       pax archive decoders that do not recognize that codeset.
2367
2368       Because  parts  of an archive may be corrupted, the standard developers
2369       felt that including the character map of the  source  codeset  was  too
2370       fragile.  The loss of this one key component could result in making the
2371       entire archive useless. (The difference between  this  and  the  global
2372       extended header decision was that the latter has a workaround—duplicat‐
2373       ing extended header records on unreliable media—but this would  be  too
2374       burdensome for large character set maps.)
2375
2376       Both  of  the  above approaches also put an undue burden on the pax ar‐
2377       chive receiver to handle the cross-product of all source  and  destina‐
2378       tion codesets.
2379
2380       To  simplify  the  translation from the source codeset to the canonical
2381       form and from the canonical form to the destination codeset, the  stan‐
2382       dard  developers  decided  that the internal representation should be a
2383       stateless encoding. A stateless encoding is one  where  each  codepoint
2384       has the same meaning, without regard to the decoder being in a specific
2385       state. An example of a stateful encoding would be the  Japanese  Shift-
2386       JIS;  an  example of a stateless encoding would be the ISO/IEC 646:1991
2387       standard (equivalent to 7-bit ASCII).
2388
2389       For these reasons, the standard developers decided to adopt a canonical
2390       format for the representation of file information strings. The obvious,
2391       well-endorsed candidate is the ISO/IEC 10646‐1:2000 standard (based  in
2392       part on Unicode), which can be used to represent the characters of vir‐
2393       tually all standardized character sets. The  standard  developers  ini‐
2394       tially  agreed  upon using UCS2 (16-bit Unicode) as the internal repre‐
2395       sentation. This repertoire of characters provides a  sufficiently  rich
2396       set to represent all commonly-used codesets.
2397
2398       However,  the  standard developers found that the 16-bit Unicode repre‐
2399       sentation had some problems. It forced the issue of standardizing  byte
2400       ordering.  The 2-byte length of each character made the extended header
2401       records twice as long for the case of strings coded entirely from  his‐
2402       torical  7-bit  ASCII. For these reasons, the standard developers chose
2403       the UTF‐8 defined in the ISO/IEC 10646‐1:2000 standard. This multi-byte
2404       representation encodes UCS2 or UCS4 characters reliably and determinis‐
2405       tically, eliminating the need for a canonical byte ordering.  In  addi‐
2406       tion,  NUL octets and other characters possibly confusing to POSIX file
2407       systems do not appear, except to represent themselves. It was  realized
2408       that  certain  national codesets take up more space after the encoding,
2409       due to their placement within the UCS range; it was felt that the  use‐
2410       fulness of the encoding of the names outweighs the disadvantage of size
2411       increase for file, user, and group names.
2412
2413       The encoding of UTF‐8 is as follows:
2414
2415
2416           UCS4 Hex Encoding  UTF-8 Binary Encoding
2417
2418           00000000-0000007F  0xxxxxxx
2419           00000080-000007FF  110xxxxx 10xxxxxx
2420           00000800-0000FFFF  1110xxxx 10xxxxxx 10xxxxxx
2421           00010000-001FFFFF  11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
2422           00200000-03FFFFFF  111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
2423           04000000-7FFFFFFF  1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
2424
2425       where each 'x' represents a bit value from the character  being  trans‐
2426       lated.
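
       Read as an algorithm, the table selects a row by the magnitude of the
       UCS4 value and then distributes the value's bits, six at a time, over
       the continuation octets. The Python sketch below follows the table
       literally; the function name is hypothetical. Current definitions of
       UTF-8 (for example RFC 3629) stop at four octets, so the five- and
       six-octet rows are shown only because the table above includes them.

```python
# (number of octets, largest value in the row, lead-octet marker bits)
_ROWS = (
    (2, 0x000007FF, 0xC0),
    (3, 0x0000FFFF, 0xE0),
    (4, 0x001FFFFF, 0xF0),
    (5, 0x03FFFFFF, 0xF8),
    (6, 0x7FFFFFFF, 0xFC),
)


def utf8_encode(code_point: int) -> bytes:
    """Encode one UCS4 value using the rows of the table above."""
    if not 0 <= code_point <= 0x7FFFFFFF:
        raise ValueError("value outside the range covered by the table")
    if code_point <= 0x7F:
        return bytes([code_point])  # 0xxxxxxx: the value represents itself
    for octets, limit, lead in _ROWS:
        if code_point <= limit:
            tail = []
            for _ in range(octets - 1):
                # Each continuation octet is 10xxxxxx with the low six bits.
                tail.append(0x80 | (code_point & 0x3F))
                code_point >>= 6
            return bytes([lead | code_point] + tail[::-1])
    raise AssertionError("unreachable")


print(utf8_encode(0x20AC))
```

       For values that Python itself can encode, the result can be compared
       with chr(value).encode("utf-8").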
2427
2428   ustar Interchange Format
2429       The description of the ustar format reflects numerous enhancements over
2430       pre-1988 versions of the historical tar  utility.  The  goal  of  these
2431       changes  was  not  only to provide the functional enhancements desired,
2432       but also to retain compatibility between new  and  old  versions.  This
2433       compatibility  has  been  retained.  Archives written using the old ar‐
2434       chive format are compatible with the new format.
2435
2436       Implementors should be aware that the  previous  file  format  did  not
2437       include  a  mechanism to archive directory type files. For this reason,
2438       the convention of using a filename ending with <slash> was  adopted  to
2439       specify a directory on the archive.
2440
2441       The total size of the name and prefix fields has been set to meet the
2442       minimum requirements for {PATH_MAX}.  If a pathname will fit within the
2443       name field, it is recommended that the pathname be stored there without
2444       the use of the prefix field. Although the name field is known to be too
2445       small  to  contain  {PATH_MAX} characters, the value was not changed in
2446       this version of the archive file format to retain backwards-compatibil‐
2447       ity,  and  instead the prefix was introduced. Also, because of the ear‐
2448       lier version of the format, there is no way to remove  the  restriction
2449       on  the  linkname  field being limited in size to just that of the name
2450       field.
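
       The recommendation above (use the name field alone when the pathname
       fits, and fall back to the prefix otherwise) can be sketched in
       Python as follows. The sketch assumes the 100-octet name and
       155-octet prefix fields of the ustar header block and splits only at
       a <slash>; the function name is hypothetical, and a real pax would
       fall back to a path extended header record rather than raise an
       error.

```python
NAME_SIZE = 100    # octets in the ustar name field
PREFIX_SIZE = 155  # octets in the ustar prefix field


def split_ustar_path(path: str):
    """Return (prefix, name) octet strings for a ustar header block."""
    raw = path.encode("utf-8")
    if len(raw) <= NAME_SIZE:
        return b"", raw  # fits in name: leave the prefix empty
    # The split point must be a slash that keeps both halves within bounds.
    first = max(0, len(raw) - NAME_SIZE - 1)
    last = min(len(raw) - 1, PREFIX_SIZE)
    for i in range(first, last + 1):
        if raw[i:i + 1] == b"/":
            prefix, name = raw[:i], raw[i + 1:]
            if prefix and name:
                return prefix, name
    raise ValueError("pathname cannot be split into prefix and name")


print(split_ustar_path("usr/share/doc/README"))
```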
2451
2452       The size field is required  to  be  meaningful  in  all  implementation
2453       extensions,  although  it  could  be zero. This is required so that the
2454       data blocks can always be properly counted.
2455
2456       It is suggested that if device special files  need  to  be  represented
2457       that  cannot  be  represented  in  the standard format, that one of the
2458       extension types (A-Z) be used, and that the additional information  for
2459       the  special  file  be represented as data and be reflected in the size
2460       field.
2461
2462       Attempting to restore a special file type, where  it  is  converted  to
2463       ordinary data and conflicts with an existing filename, need not be spe‐
2464       cially detected by the utility. If run as an ordinary user, pax  should
2465       not  be able to overwrite the entries in, for example, /dev in any case
2466       (whether the file is converted to another type or not).  If  run  as  a
2467       privileged user, it should be able to do so, and it would be considered
2468       a bug if it did not. The same is true of ordinary data files and  simi‐
2469       larly  named special files; it is impossible to anticipate the needs of
2470       the user (who could really intend to overwrite the file), so the behav‐
2471       ior should be predictable (and thus regular) and rely on the protection
2472       system as required.
2473
2474       The value 7 in the typeflag field is intended to define how  contiguous
2475       files  can  be stored in a ustar archive. POSIX.1‐2008 does not require
2476       the contiguous file extension, but does define a standard  way  of  ar‐
2477       chiving  such  files so that all conforming systems can interpret these
2478       file types in a meaningful and consistent manner. On a system that does
2479       not  support extended file types, the pax utility should do the best it
2480       can with the file and go on to the next.
2481
2482       The file protection modes are those conventionally used by the ls util‐
2483       ity.  This  is extended beyond the usage in the ISO POSIX‐2 standard to
2484       support the ``shared text'' or ``sticky'' bit. It is intended that  the
2485       conformance  document should not document anything beyond the existence
2486       of and support of such a mode. Further extensions are expected to these
2487       bits,  particularly  with  overloading the set-user-ID and set-group-ID
2488       flags.
2489
2490   cpio Interchange Format
2491       The reference to appropriate privileges in the cpio format refers to an
2492       error  on  standard  output;  the ustar format does not make comparable
2493       statements.
2494
2495       The model for this format was  the  historical  System  V  cpio-c  data
2496       interchange  format.  This  model documents the portable version of the
2497       cpio format and not the binary  version.  It  has  the  flexibility  to
2498       transfer  data of any type described within POSIX.1‐2008, yet is exten‐
2499       sible to transfer data types specific to extensions beyond POSIX.1‐2008
2500       (for  example,  contiguous  files). Because it describes existing prac‐
2501       tice, there is no question of maintaining upwards-compatibility.
2502
2503   cpio Header
2504       There has been some concern that the size of the  c_ino  field  of  the
2505       header  is too small to handle those systems that have very large inode
2506       numbers. However, the c_ino field in the header is used strictly  as  a
2507       hard-link  resolution mechanism for archives. It is not necessarily the
2508       same value as the inode number of the file in the location  from  which
2509       that file is extracted.
2510
2511       The name c_magic is based on historical usage.
2512
2513   cpio Filename
2514       For  most  historical  implementations  of the cpio utility, {PATH_MAX}
2515       octets can be used to describe the pathname without the addition of any
2516       other  header  fields  (the  NUL  character  would  be included in this
2517       count).  {PATH_MAX} is the minimum value for pathname size,  documented
2518       as  256 bytes.  However, an implementation may use c_namesize to deter‐
2519       mine the exact length of the pathname. With the current description  of
2520       the  <cpio.h>  header,  this  pathname size can be as large as a number
2521       that is described in six octal digits.
2522
2523       Two values are documented under the c_mode field values to provide  for
2524       extensibility for known file types:
2525
2526       0110 000  Reserved  for  contiguous files. The implementation may treat
2527                 the rest of the information for this archive like  a  regular
2528                 file.  If this file type is undefined, the implementation may
2529                 create the file as a regular file.
2530
2531       This provides for extensibility of the cpio format while  allowing  for
2532       the  ability to read old archives. Files of an unknown type may be read
2533       as ``regular files'' on some implementations.  On a  system  that  does
2534       not  support extended file types, the pax utility should do the best it
2535       can with the file and go on to the next.
2536

FUTURE DIRECTIONS

2538       None.
2539

SEE ALSO

2541       Chapter 2, Shell Command Language, cp, ed, getopts, ls, printf
2542
2543       The Base Definitions volume of POSIX.1‐2017, Section 3.169,  File  Mode
2544       Bits,  Chapter  5,  File  Format Notation, Chapter 8, Environment Vari‐
2545       ables, Section 12.2, Utility Syntax Guidelines, <cpio.h>, <tar.h>
2546
2547       The  System  Interfaces  volume  of  POSIX.1‐2017,  chown(),   creat(),
2548       fstatat(), mkdir(), mkfifo(), utime(), write()
2549

COPYRIGHT

2551       Portions  of  this text are reprinted and reproduced in electronic form
2552       from IEEE Std 1003.1-2017, Standard for Information Technology --  Por‐
2553       table  Operating System Interface (POSIX), The Open Group Base Specifi‐
2554       cations Issue 7, 2018 Edition, Copyright (C) 2018 by the  Institute  of
2555       Electrical  and  Electronics Engineers, Inc and The Open Group.  In the
2556       event of any discrepancy between this version and the original IEEE and
2557       The  Open Group Standard, the original IEEE and The Open Group Standard
2558       is the referee document. The original Standard can be  obtained  online
2559       at http://www.opengroup.org/unix/online.html .
2560
2561       Any  typographical  or  formatting  errors that appear in this page are
2562       most likely to have been introduced during the conversion of the source
2563       files  to  man page format. To report such errors, see
2564       https://www.kernel.org/doc/man-pages/reporting_bugs.html .
2565
2566
2567
2568IEEE/The Open Group                  2017                              PAX(1P)