FILE(1P)                   POSIX Programmer's Manual                  FILE(1P)

PROLOG

       This manual page is part of the POSIX Programmer's Manual. The Linux
       implementation of this interface may differ (consult the corresponding
       Linux manual page for details of Linux behavior), or the interface may
       not be implemented on Linux.

NAME

       file - determine file type

SYNOPSIS

       file [-dh] [-M file] [-m file] file...

       file -i [-h] file...

DESCRIPTION

       The file utility shall perform a series of tests in sequence on each
       specified file in an attempt to classify it:

        1. If file does not exist, cannot be read, or its file status could
           not be determined, the output shall indicate that the file was
           processed, but that its type could not be determined.

        2. If the file is not a regular file, its file type shall be
           identified. The file types directory, FIFO, socket, block special,
           and character special shall be identified as such. Other
           implementation-defined file types may also be identified. If file
           is a symbolic link, by default the link shall be resolved and file
           shall test the type of file referenced by the symbolic link. (See
           the -h and -i options below.)

        3. If the length of file is zero, it shall be identified as an empty
           file.

        4. The file utility shall examine an initial segment of file and shall
           make a guess at identifying its contents based on
           position-sensitive tests. (The answer is not guaranteed to be
           correct; see the -d, -M, and -m options below.)

        5. The file utility shall examine file and make a guess at identifying
           its contents based on context-sensitive default system tests. (The
           answer is not guaranteed to be correct.)

        6. The file shall be identified as a data file.

       If file does not exist, cannot be read, or its file status could not be
       determined, the output shall indicate that the file was processed, but
       that its type could not be determined.

       If file is a symbolic link, by default the link shall be resolved and
       file shall test the type of file referenced by the symbolic link.
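
       The file type tests in steps 1 to 3 can be approximated with the test
       utility. The following is a minimal sketch, not a description of how
       any implementation of file works. The function name classify_type is
       hypothetical, the content tests of steps 4 to 6 are collapsed into a
       single fallback label, and the target of a symbolic link is not
       printed.

```shell
# classify_type PATH
# Hypothetical helper: print "<file>: <type>" using only file type tests.
classify_type() {
    f=$1
    if [ -L "$f" ] && [ ! -e "$f" ]; then
        # Dangling symbolic link: identified as a link, as if -h were given.
        t='symbolic link'
    elif [ ! -e "$f" ]; then
        t='cannot open'                              # step 1: does not exist
    elif [ -d "$f" ]; then t='directory'             # step 2: file type tests
    elif [ -p "$f" ]; then t='fifo'
    elif [ -S "$f" ]; then t='socket'
    elif [ -b "$f" ]; then t='block special'
    elif [ -c "$f" ]; then t='character special'
    elif [ ! -r "$f" ]; then t='cannot open'         # step 1: cannot be read
    elif [ ! -s "$f" ]; then t='empty'               # step 3: zero length
    else t='regular file, contents not examined'     # steps 4-6 not modelled
    fi
    printf '%s: %s\n' "$f" "$t"
}

classify_type /
```

       Because the test primaries other than -L follow symbolic links, a link
       to an existing file is classified by its target, which mirrors the
       default behavior described above.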

OPTIONS

       The file utility shall conform to the Base Definitions volume of
       POSIX.1‐2017, Section 12.2, Utility Syntax Guidelines, except that the
       order of the -m, -d, and -M options shall be significant.

       The following options shall be supported by the implementation:

       -d        Apply any position-sensitive default system tests and
                 context-sensitive default system tests to the file. This is
                 the default if no -M or -m option is specified.

       -h        When a symbolic link is encountered, identify the file as a
                 symbolic link. If -h is not specified and file is a symbolic
                 link that refers to a nonexistent file, file shall identify
                 the file as a symbolic link, as if -h had been specified.

       -i        If a file is a regular file, do not attempt to classify the
                 type of the file further, but identify the file as specified
                 in the STDOUT section.

       -M file   Specify the name of a file containing position-sensitive
                 tests that shall be applied to a file in order to classify it
                 (see the EXTENDED DESCRIPTION). No position-sensitive default
                 system tests nor context-sensitive default system tests shall
                 be applied unless the -d option is also specified.

       -m file   Specify the name of a file containing position-sensitive
                 tests that shall be applied to a file in order to classify it
                 (see the EXTENDED DESCRIPTION).

       If the -m option is specified without specifying the -d option or the
       -M option, position-sensitive default system tests shall be applied
       after the position-sensitive tests specified by the -m option. If the
       -M option is specified with the -d option, the -m option, or both, or
       the -m option is specified with the -d option, the concatenation of the
       position-sensitive tests specified by these options shall be applied in
       the order specified by the appearance of these options. If a -M or -m
       file option-argument is -, the results are unspecified.
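
       The ordering rules above can be modelled as a small function that
       prints the sources of tests in the order they would be applied. This
       is only a sketch of one reading of the text: the name test_order and
       the labels system-position and system-context are hypothetical, and it
       assumes that context-sensitive default system tests run last whenever
       default system tests are in effect.

```shell
# test_order [-d] [-M file] [-m file]...
# Hypothetical helper: print the test sources in the order they would apply.
test_order() {
    order='' has_d=0 has_M=0
    while [ "$#" -gt 0 ]; do
        case $1 in
            -d) has_d=1; order="$order system-position" ;;
            -M) has_M=1; order="$order $2"; shift ;;
            -m) order="$order $2"; shift ;;
        esac
        shift
    done
    if [ "$has_d" -eq 0 ] && [ "$has_M" -eq 0 ]; then
        # Neither -d nor -M: position-sensitive default system tests
        # follow any tests given with -m.
        order="$order system-position"
        has_d=1
    fi
    if [ "$has_d" -eq 1 ]; then
        # Context-sensitive default system tests come after all
        # position-sensitive tests.
        order="$order system-context"
    fi
    printf '%s\n' "${order# }"
}

test_order                      # system-position system-context
test_order -m my.magic          # my.magic system-position system-context
test_order -d -m my.magic       # system-position my.magic system-context
test_order -M my.magic          # my.magic
```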

OPERANDS

       The following operand shall be supported:

       file      A pathname of a file to be tested.

STDIN

       The standard input shall be used if a file operand is '-' and the
       implementation treats the '-' as meaning standard input. Otherwise,
       the standard input shall not be used.

INPUT FILES

       The file can be any file type.

ENVIRONMENT VARIABLES

       The following environment variables shall affect the execution of file:

       LANG      Provide a default value for the internationalization
                 variables that are unset or null. (See the Base Definitions
                 volume of POSIX.1‐2017, Section 8.2, Internationalization
                 Variables for the precedence of internationalization
                 variables used to determine the values of locale categories.)

       LC_ALL    If set to a non-empty string value, override the values of
                 all the other internationalization variables.

       LC_CTYPE  Determine the locale for the interpretation of sequences of
                 bytes of text data as characters (for example, single-byte as
                 opposed to multi-byte characters in arguments and input
                 files).

       LC_MESSAGES
                 Determine the locale that should be used to affect the format
                 and contents of diagnostic messages written to standard error
                 and informative messages written to standard output.

       NLSPATH   Determine the location of message catalogs for the processing
                 of LC_MESSAGES.

ASYNCHRONOUS EVENTS

       Default.

STDOUT

       In the POSIX locale, the following format shall be used to identify
       each operand, file specified:

           "%s: %s\n", <file>, <type>

       The values for <type> are unspecified, except that in the POSIX locale,
       if file is identified as one of the types listed in the following
       table, <type> shall contain (but is not limited to) the corresponding
       string, unless the file is identified by a position-sensitive test
       specified by a -M or -m option. Each <space> shown in the strings shall
       be exactly one <space>.

                       Table 4-9: File Utility Output Strings

       ┌─────────────────────────────────────────────┬──────────────────────────────────┬───────┐
       │If file is:                                  │ <type> shall contain the string: │ Notes │
       ├─────────────────────────────────────────────┼──────────────────────────────────┼───────┤
       │Nonexistent                                  │ cannot open                      │       │
       │                                             │                                  │       │
       │Block special                                │ block special                    │ 1     │
       │Character special                            │ character special                │ 1     │
       │Directory                                    │ directory                        │ 1     │
       │FIFO                                         │ fifo                             │ 1     │
       │Socket                                       │ socket                           │ 1     │
       │Symbolic link                                │ symbolic link to                 │ 1     │
       │Regular file                                 │ regular file                     │ 1,2   │
       │Empty regular file                           │ empty                            │ 3     │
       │Regular file that cannot be read             │ cannot open                      │ 3     │
       │                                             │                                  │       │
       │Executable binary                            │ executable                       │ 3,4,6 │
       │ar archive library (see ar)                  │ archive                          │ 3,4,6 │
       │Extended cpio format (see pax)               │ cpio archive                     │ 3,4,6 │
       │Extended tar format (see ustar in pax)       │ tar archive                      │ 3,4,6 │
       │                                             │                                  │       │
       │Shell script                                 │ commands text                    │ 3,5,6 │
       │C-language source                            │ c program text                   │ 3,5,6 │
       │FORTRAN source                               │ fortran program text             │ 3,5,6 │
       │                                             │                                  │       │
       │Regular file whose type cannot be determined │ data                             │ 3     │
       └─────────────────────────────────────────────┴──────────────────────────────────┴───────┘
       Notes:

                  1. This is a file type test.

                  2. This test is applied only if the -i option is specified.

                  3. This test is applied only if the -i option is not
                     specified.

                  4. This is a position-sensitive default system test.

                  5. This is a context-sensitive default system test.

                  6. Position-sensitive default system tests and
                     context-sensitive default system tests are not applied if
                     the -M option is specified unless the -d option is also
                     specified.

       In the POSIX locale, if file is identified as a symbolic link (see the
       -h option), the following alternative output format shall be used:

           "%s: %s %s\n", <file>, <type>, <contents of link>

       If the file named by the file operand does not exist, cannot be read,
       or the type of the file named by the file operand cannot be determined,
       this shall not be considered an error that affects the exit status.
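
       Because a pathname may itself contain ": ", a script that already
       knows the operand can recover <type> more reliably by stripping the
       known "<file>: " prefix than by splitting on the first colon. A
       minimal sketch follows; the function name type_of_line is hypothetical
       and the sample lines are canned text, not the output of any particular
       implementation.

```shell
# type_of_line FILE LINE
# Hypothetical helper: strip the "<file>: " prefix from one line of output.
type_of_line() {
    # The inner quotes make the operand a literal prefix, not a pattern.
    printf '%s\n' "${2#"$1: "}"
}

# Canned sample lines in the format above; no real file is inspected here.
type_of_line 'notes: draft.txt' 'notes: draft.txt: empty'
type_of_line '/tmp/lnk' '/tmp/lnk: symbolic link to /etc/passwd'
```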

STDERR

       The standard error shall be used only for diagnostic messages.

OUTPUT FILES

       None.

EXTENDED DESCRIPTION

       A file specified as an option-argument to the -m or -M options shall
       contain one position-sensitive test per line, which shall be applied to
       the file. If the test succeeds, the message field of the line shall be
       printed and no further tests shall be applied, with the exception that
       tests on immediately following lines beginning with a single '>'
       character shall be applied.

       Each line shall be composed of the following four <tab>-separated
       fields. (Implementations may allow any combination of one or more
       white-space characters other than <newline> to act as field
       separators.)

       offset    An unsigned number (optionally preceded by a single '>'
                 character) specifying the offset, in bytes, of the value in
                 the file that is to be compared against the value field of
                 the line. If the file is shorter than the specified offset,
                 the test shall fail.

                 If the offset begins with the character '>', the test
                 contained in the line shall not be applied to the file unless
                 the test on the last line for which the offset did not begin
                 with a '>' was successful. By default, the offset shall be
                 interpreted as an unsigned decimal number. With a leading 0x
                 or 0X, the offset shall be interpreted as a hexadecimal
                 number; otherwise, with a leading 0, the offset shall be
                 interpreted as an octal number.

       type      The type of the value in the file to be tested. The type
                 shall consist of the type specification characters d, s, and
                 u, specifying signed decimal, string, and unsigned decimal,
                 respectively.

                 The type string shall be interpreted as the bytes from the
                 file starting at the specified offset and including the same
                 number of bytes specified by the value field. If insufficient
                 bytes remain in the file past the offset to match the value
                 field, the test shall fail.

                 The type specification characters d and u can be followed by
                 an optional unsigned decimal integer that specifies the
                 number of bytes represented by the type. The type
                 specification characters d and u can be followed by an
                 optional C, S, I, or L, indicating that the value is of type
                 char, short, int, or long, respectively.

                 The default number of bytes represented by the type
                 specifiers d, f, and u shall correspond to their respective
                 C-language types as follows. If the system claims conformance
                 to the C-Language Development Utilities option, those
                 specifiers shall correspond to the default sizes used in the
                 c99 utility. Otherwise, the default sizes shall be
                 implementation-defined.

                 For the type specifier characters d and u, the default number
                 of bytes shall correspond to the size of a basic integer type
                 of the implementation. For these specifier characters, the
                 implementation shall support values of the optional number of
                 bytes to be converted corresponding to the number of bytes in
                 the C-language types char, short, int, or long. These numbers
                 can also be specified by an application as the characters C,
                 S, I, and L, respectively. The byte order used when
                 interpreting numeric values is implementation-defined, but
                 shall correspond to the order in which a constant of the
                 corresponding type is stored in memory on the system.

                 All type specifiers, except for s, can be followed by a mask
                 specifier of the form &number. The mask value shall be AND'ed
                 with the value of the input file before the comparison with
                 the value field of the line is made. By default, the mask
                 shall be interpreted as an unsigned decimal number. With a
                 leading 0x or 0X, the mask shall be interpreted as an
                 unsigned hexadecimal number; otherwise, with a leading 0, the
                 mask shall be interpreted as an unsigned octal number.

                 The strings byte, short, long, and string shall also be
                 supported as type fields, being interpreted as dC, dS, dL,
                 and s, respectively.

       value     The value to be compared with the value from the file.

                 If the specifier from the type field is s or string, then
                 interpret the value as a string. Otherwise, interpret it as a
                 number. If the value is a string, then the test shall succeed
                 only when a string value exactly matches the bytes from the
                 file.

                 If the value is a string, it can contain the following
                 sequences:

                 \character  The <backslash>-escape sequences as specified in
                             the Base Definitions volume of POSIX.1‐2017,
                             Table 5-1, Escape Sequences and Associated
                             Actions ('\\', '\a', '\b', '\f', '\n', '\r',
                             '\t', '\v'). In addition, the escape sequence
                             '\ ' (the <backslash> character followed by a
                             <space> character) shall be recognized to
                             represent a <space> character. The results of
                             using any other character, other than an octal
                             digit, following the <backslash> are unspecified.

                 \octal      Octal sequences that can be used to represent
                             characters with specific coded values. An octal
                             sequence shall consist of a <backslash> followed
                             by the longest sequence of one, two, or three
                             octal-digit characters (01234567).

                 By default, any value that is not a string shall be
                 interpreted as a signed decimal number. Any such value, with
                 a leading 0x or 0X, shall be interpreted as an unsigned
                 hexadecimal number; otherwise, with a leading zero, the value
                 shall be interpreted as an unsigned octal number.

                 If the value is not a string, it can be preceded by a
                 character indicating the comparison to be performed.
                 Permissible characters and the comparisons they specify are
                 as follows:

                 =     The test shall succeed if the value from the file
                       equals the value field.

                 <     The test shall succeed if the value from the file is
                       less than the value field.

                 >     The test shall succeed if the value from the file is
                       greater than the value field.

                 &     The test shall succeed if all of the set bits in the
                       value field are set in the value from the file.

                 ^     The test shall succeed if at least one of the set bits
                       in the value field is not set in the value from the
                       file.

                 x     The test shall succeed if the file is large enough to
                       contain a value of the type specified starting at the
                       offset specified.

       message   The message to be printed if the test succeeds. The message
                 shall be interpreted using the notation for the printf
                 formatting specification; see printf. If the value field was
                 a string, then the value from the file shall be the argument
                 for the printf formatting specification; otherwise, the value
                 from the file shall be the argument.
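
       The numeric comparisons and the mask specifier can be illustrated with
       shell arithmetic, which accepts the same 0x and leading-0 prefixes as
       the offset, mask, and value fields. The following is a sketch only:
       the function name numeric_test and the sample byte 0x90 are
       hypothetical, and reading the value from a file at a given offset is
       left out.

```shell
# numeric_test OP FILE_VALUE VALUE_FIELD
# Hypothetical helper: succeed (exit 0) when the comparison selected by OP
# holds between the value from the file and the value field.
numeric_test() {
    op=$1
    have=$(($2))    # value from the file, after any mask has been applied
    want=$(($3))    # value field; 0x means hexadecimal, leading 0 means octal
    case $op in
        '=') [ "$have" -eq "$want" ] ;;
        '<') [ "$have" -lt "$want" ] ;;
        '>') [ "$have" -gt "$want" ] ;;
        '&') [ "$((have & want))" -eq "$want" ] ;;   # all set bits present
        '^') [ "$((have & want))" -ne "$want" ] ;;   # some set bit missing
        x)   return 0 ;;   # a value of the requested size could be read
        *)   return 2 ;;   # unknown comparison character
    esac
}

byte=$((0x90))      # hypothetical byte found at offset 2 of some file
# Equivalent of a "byte&0x80 >0" test: apply the mask, then compare.
numeric_test '>' "$((byte & 0x80))" 0 && echo 'Block compressed'
# Equivalent of "byte&0x1f x" with the message "%d bits".
numeric_test x "$((byte & 0x1f))" 0 && printf '%d bits\n' "$((byte & 0x1f))"
```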

EXIT STATUS

       The following exit values shall be returned:

        0    Successful completion.

       >0    An error occurred.

CONSEQUENCES OF ERRORS

       Default.

       The following sections are informative.

APPLICATION USAGE

       The file utility can only be required to guess at many of the file
       types because only exhaustive testing can determine some types with
       certainty. For example, binary data on some implementations might match
       the initial segment of an executable or a tar archive.

       Note that the table indicates that the output contains the stated
       string. Systems may add text before or after the string. For
       executables, as an example, the machine architecture and various facts
       about how the file was link-edited may be included. Note also that on
       systems that recognize shell script files starting with "#!" as
       executable files, these may be identified as executable binary files
       rather than as shell scripts.

EXAMPLES

       Determine whether an argument is a binary executable file:

           file -- "$1" | grep -q ':.*executable' &&
               printf "%s is executable.\n" "$1"

RATIONALE

       The -f option was omitted because the same effect can (and should) be
       obtained using the xargs utility.

       Historical versions of the file utility attempt to identify the
       following types of files: symbolic link, directory, character special,
       block special, socket, tar archive, cpio archive, SCCS archive, archive
       library, empty, compress output, pack output, binary data, C source,
       FORTRAN source, assembler source, nroff/troff/eqn/tbl source, troff
       output, shell script, C shell script, English text, ASCII text, various
       executables, APL workspace, compiled terminfo entries, and CURSES
       screen images. Only those types that are reasonably well specified in
       POSIX or are directly related to POSIX utilities are listed in the
       table.

       Historical systems have used a "magic file" named /etc/magic to help
       identify file types. Because it is generally useful for users and
       scripts to be able to identify special file types, the -m flag and a
       portable format for user-created magic files have been specified. No
       requirement is made that an implementation of file use this method of
       identifying files, only that users be permitted to add their own
       classifying tests.

       In addition, three options have been added to historical practice. The
       -d flag has been added to permit users to cause their tests to follow
       any default system tests. The -i flag has been added to permit users to
       test portably for regular files in shell scripts. The -M flag has been
       added to permit users to ignore any default system tests.

       The POSIX.1‐2008 description of default system tests and the
       interaction between the -d, -M, and -m options did not clearly indicate
       that there were two types of "default system tests". The
       "position-sensitive tests" determine file types by looking for certain
       string or binary values at specific offsets in the file being examined.
       These position-sensitive tests were implemented in historical systems
       using the magic file described above. Some of these tests are now built
       into the file utility itself on some implementations so the output can
       provide more detail than can be provided by magic files. For example, a
       magic file can easily identify a core file on most implementations, but
       cannot name the program file that dropped the core. A magic file could
       produce output such as:

           /home/dwc/core: ELF 32-bit MSB core file SPARC Version 1

       but by building the test into the file utility, you could get output
       such as:

           /home/dwc/core: ELF 32-bit MSB core file SPARC Version 1, from 'testprog'

       These extended built-in tests are still to be treated as
       position-sensitive default system tests even if they are not listed in
       /etc/magic or any other magic file.

       The context-sensitive default system tests were always built into the
       file utility. These tests looked for language constructs in text files
       trying to identify shell scripts, C, FORTRAN, and other computer
       language source files, and even plain text files. With the addition of
       the -m and -M options the distinction between position-sensitive and
       context-sensitive default system tests became important because the
       order of testing is important. The context-sensitive system default
       tests should never be applied before any position-sensitive tests even
       if the -d option is specified before a -m option or -M option due to
       the high probability that the context-sensitive system default tests
       will incorrectly identify arbitrary text files as text files before
       position-sensitive tests specified by the -m or -M option would be
       applied to give a more accurate identification.

       Leaving the meaning of -M - and -m - unspecified allows an existing
       prototype of these options to continue to work in a
       backwards-compatible manner. (In that implementation, -M - was roughly
       equivalent to -d in POSIX.1‐2008.)

       The historical -c option was omitted as not particularly useful to
       users or portable shell scripts. In addition, a reasonable
       implementation of the file utility would report any errors found each
       time the magic file is read.

       The historical format of the magic file was the same as that specified
       by the Rationale in the ISO POSIX‐2:1993 standard for the offset,
       value, and message fields; however, it used less precise type fields
       than the format specified by the current normative text. The new type
       field values are a superset of the historical ones.

       The following is an example magic file:

           0  short     070707              cpio archive
           0  short     0143561             Byte-swapped cpio archive
           0  string    070707              ASCII cpio archive
           0  long      0177555             Very old archive
           0  short     0177545             Old archive
           0  short     017437              Old packed data
           0  string    \037\036            Packed data
           0  string    \377\037            Compacted data
           0  string    \037\235            Compressed data
           >2 byte&0x80 >0                  Block compressed
           >2 byte&0x1f x                   %d bits
           0  string    \032\001            Compiled Terminfo Entry
           0  short     0433                Curses screen image
           0  short     0434                Curses screen image
           0  string    <ar>                System V Release 1 archive
           0  string    !<arch>\n__.SYMDEF  Archive random library
           0  string    !<arch>             Archive
           0  string    ARF_BEGARF          PHIGS clear text archive
           0  long      0x137A2950          Scalable OpenFont binary
           0  long      0x137A2951          Encrypted scalable OpenFont binary

       The use of a basic integer data type is intended to allow the
       implementation to choose a word size commonly used by applications on
       that architecture.

       Earlier versions of this standard allowed for implementations with
       bytes other than eight bits, but this has been modified in this
       version.
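
       To make the string entries of the example magic file more concrete,
       the test "0 string !<arch>" can be imitated with dd. This is a sketch
       under stated assumptions rather than the way file is implemented: the
       function name string_test is hypothetical, the value is assumed to be
       plain single-byte text with no NUL bytes or trailing <newline>
       characters, and the sample file is created only for the demonstration.

```shell
# string_test FILE OFFSET VALUE
# Hypothetical helper: succeed when the bytes of FILE at OFFSET equal VALUE.
string_test() {
    # dd copies ${#3} bytes starting at byte offset $2. A file that is too
    # short yields fewer bytes, so the comparison below fails.
    got=$(dd if="$1" bs=1 skip="$(($2))" count="${#3}" 2>/dev/null)
    [ "$got" = "$3" ]
}

sample=$(mktemp)    # scratch file standing in for an archive
printf '!<arch>\nmember data\n' > "$sample"
string_test "$sample" 0 '!<arch>' && printf '%s: %s\n' "$sample" 'Archive'
rm -f "$sample"
```

       The offset is passed through arithmetic expansion so that 0x and
       leading-0 forms are accepted, as the offset field requires.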

FUTURE DIRECTIONS

       None.

SEE ALSO

       ar, ls, pax, printf

       The Base Definitions volume of POSIX.1‐2017, Table 5-1, Escape
       Sequences and Associated Actions, Chapter 8, Environment Variables,
       Section 12.2, Utility Syntax Guidelines

COPYRIGHT

       Portions of this text are reprinted and reproduced in electronic form
       from IEEE Std 1003.1-2017, Standard for Information Technology --
       Portable Operating System Interface (POSIX), The Open Group Base
       Specifications Issue 7, 2018 Edition, Copyright (C) 2018 by the
       Institute of Electrical and Electronics Engineers, Inc and The Open
       Group. In the event of any discrepancy between this version and the
       original IEEE and The Open Group Standard, the original IEEE and The
       Open Group Standard is the referee document. The original Standard can
       be obtained online at http://www.opengroup.org/unix/online.html .

       Any typographical or formatting errors that appear in this page are
       most likely to have been introduced during the conversion of the source
       files to man page format. To report such errors, see
       https://www.kernel.org/doc/man-pages/reporting_bugs.html .

IEEE/The Open Group                  2017                             FILE(1P)