FILE(1P)                   POSIX Programmer's Manual                  FILE(1P)

PROLOG

       This  manual  page is part of the POSIX Programmer's Manual.  The Linux
       implementation of this interface may differ (consult the  corresponding
       Linux  manual page for details of Linux behavior), or the interface may
       not be implemented on Linux.

NAME

       file - determine file type

SYNOPSIS

       file [-dh][-M file][-m file] file ...

       file -i [-h] file ...

DESCRIPTION

       The file utility shall perform a series of tests in  sequence  on  each
       specified file in an attempt to classify it:

        1. If  file  does  not exist, cannot be read, or its file status could
           not be determined, the output shall indicate that the file was pro‐
           cessed, but that its type could not be determined.

        2. If  the  file is not a regular file, its file type shall be identi‐
           fied.  The file types directory, FIFO, socket, block  special,  and
           character  special  shall  be identified as such. Other implementa‐
           tion-defined file types may also be identified. If file is  a  sym‐
           bolic  link,  by  default the link shall be resolved and file shall
           test the type of file referenced by the symbolic link.  (See the -h
           and -i options below.)

        3. If  the  length of file is zero, it shall be identified as an empty
           file.

        4. The file utility shall examine an initial segment of file and shall
           make  a  guess at identifying its contents based on position-sensi‐
           tive tests. (The answer is not guaranteed to be  correct;  see  the
           -d, -M, and -m options below.)

        5. The file utility shall examine file and make a guess at identifying
           its contents based on context-sensitive default system tests.  (The
           answer is not guaranteed to be correct.)

        6. The file shall be identified as a data file.

       If file does not exist, cannot be read, or its file status could not be
       determined, the output shall indicate that the file was processed,  but
       that its type could not be determined.

       If  file  is a symbolic link, by default the link shall be resolved and
       file shall test the type of file referenced by the symbolic link.

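       The file type tests in steps 1 to 3, and the fallback in step 6, can be
       approximated in the shell with the test utility. The following is  only
       a  minimal  sketch and not how any implementation of file works: the
       function name classify is hypothetical, the content tests of steps 4
       and 5 are skipped, the target of a symbolic link is not printed, and
       the output strings are borrowed from the table in the STDOUT section.

```shell
# classify PATH: hypothetical helper approximating steps 1-3 and 6 only.
classify() {
    f=$1
    if [ -L "$f" ] && [ ! -e "$f" ]; then
        kind='symbolic link to'     # dangling link: reported as if -h were given
    elif [ ! -e "$f" ]; then
        kind='cannot open'          # step 1: file does not exist
    elif [ -d "$f" ]; then kind='directory'          # step 2: file type tests
    elif [ -p "$f" ]; then kind='fifo'
    elif [ -S "$f" ]; then kind='socket'
    elif [ -b "$f" ]; then kind='block special'
    elif [ -c "$f" ]; then kind='character special'
    elif [ ! -r "$f" ]; then
        kind='cannot open'          # step 1: regular file that cannot be read
    elif [ ! -s "$f" ]; then
        kind='empty'                # step 3: zero length
    else
        kind='data'                 # step 6: steps 4 and 5 are not attempted
    fi
    printf '%s: %s\n' "$f" "$kind"
}
```

       Because test resolves symbolic links for every operator except  -L  and
       -h, the sketch follows links by default, as the text above requires.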

OPTIONS

       The file utility shall  conform  to  the  Base  Definitions  volume  of
       IEEE Std 1003.1-2001,  Section  12.2, Utility Syntax Guidelines, except
       that the order of the -m, -d, and -M options shall be significant.

       The following options shall be supported by the implementation:

       -d     Apply any position-sensitive default system tests  and  context-
              sensitive  default system tests to the file. This is the default
              if no -M or -m option is specified.

       -h     When a symbolic link is encountered, identify the file as a sym‐
              bolic  link.  If -h is not specified and file is a symbolic link
              that refers to a nonexistent file, file shall identify the  file
              as a symbolic link, as if -h had been specified.

       -i     If a file is a regular file, do not attempt to classify the type
              of the file further, but identify the file as specified  in  the
              STDOUT section.

       -M  file
              Specify  the  name of a file containing position-sensitive tests
              that shall be applied to a file in order to classify it (see the
              EXTENDED  DESCRIPTION).  No  position-sensitive  default  system
              tests  nor  context-sensitive  default  system  tests  shall  be
              applied unless the -d option is also specified.

       -m  file
              Specify  the  name of a file containing position-sensitive tests
              that shall be applied to a file in order to classify it (see the
              EXTENDED DESCRIPTION).

       If  the  -m option is specified without specifying the -d option or the
       -M option, position-sensitive default system  tests  shall  be  applied
       after  the  position-sensitive tests specified by the -m option. If the
       -M option is specified with the -d option, the -m option, or  both,  or
       the -m option is specified with the -d option, the concatenation of the
       position-sensitive tests specified by these options shall be applied in
       the  order  specified by the appearance of these options. If a -M or -m
       file option-argument is -, the results are unspecified.

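       The ordering rules above can be traced with a small  shell  sketch.  It
       is  an  illustration under stated assumptions, not part of any file
       implementation: the function name test_order is hypothetical, the word
       "default" stands for the position-sensitive default system tests, each
       option-argument is assumed to be a separate word, and context-sensitive
       tests are not modelled.

```shell
# test_order [-d] [-M file] [-m file]...: hypothetical helper that prints the
# sources of position-sensitive tests in the order they would be applied.
test_order() {
    order='' seen_d=no seen_M=no
    while [ $# -gt 0 ]; do
        case $1 in
            -d) order="${order:+$order }default"; seen_d=yes ;;
            -M) order="${order:+$order }$2"; seen_M=yes; shift ;;
            -m) order="${order:+$order }$2"; shift ;;
        esac
        shift
    done
    # With neither -d nor -M, the default tests run after any -m tests;
    # with no options at all, this is simply the -d default.
    if [ "$seen_d" = no ] && [ "$seen_M" = no ]; then
        order="${order:+$order }default"
    fi
    printf '%s\n' "$order"
}
```

       With this sketch, test_order -m mymagic prints "mymagic default",
       test_order -d -m mymagic prints "default mymagic", and test_order -M
       mymagic prints only "mymagic".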

OPERANDS

       The following operand shall be supported:

       file   A pathname of a file to be tested.

STDIN

       Not used.

INPUT FILES

       The file can be any file type.

ENVIRONMENT VARIABLES

       The following environment variables shall affect the execution of file:

       LANG   Provide a default value for the  internationalization  variables
              that  are  unset  or  null.  (See the Base Definitions volume of
              IEEE Std 1003.1-2001, Section  8.2,  Internationalization  Vari‐
              ables  for the precedence of internationalization variables used
              to determine the values of locale categories.)

       LC_ALL If set to a non-empty string value, override the values  of  all
              the other internationalization variables.

       LC_CTYPE
              Determine  the  locale  for  the  interpretation of sequences of
              bytes of text data as characters (for  example,  single-byte  as
              opposed to multi-byte characters in arguments and input files).

       LC_MESSAGES
              Determine  the  locale  that should be used to affect the format
              and contents of diagnostic messages written  to  standard  error
              and informative messages written to standard output.

       NLSPATH
              Determine the location of message catalogs for the processing of
              LC_MESSAGES.

ASYNCHRONOUS EVENTS

       Default.

STDOUT

       In the POSIX locale, the following format shall  be  used  to  identify
       each operand, file specified:

              "%s: %s\n", <file>, <type>

       The values for <type> are unspecified, except that in the POSIX locale,
       if file is identified as one of the types listed in the  following  ta‐
       ble,  <type>  shall  contain  (but is not limited to) the corresponding
       string, unless the file is  identified  by  a  position-sensitive  test
       specified  by  a -M or -m option. Each space shown in the strings shall
       be exactly one <space>.

                         Table: File Utility Output Strings

       If file is:                              <type> shall contain the  Notes
                                                string:
       Nonexistent                              cannot open
       Block special                            block special             1
       Character special                        character special         1
       Directory                                directory                 1
       FIFO                                     fifo                      1
       Socket                                   socket                    1
       Symbolic link                            symbolic link to          1
       Regular file                             regular file              1,2
       Empty regular file                       empty                     3
       Regular file that cannot be read         cannot open               3
       Executable binary                        executable                4,6
       ar archive library (see ar)              archive                   4,6
       Extended cpio format (see pax)           cpio archive              4,6
       Extended tar format (see ustar in pax)   tar archive               4,6
       Shell script                             commands text             5,6
       C-language source                        c program text            5,6
       FORTRAN source                           fortran program text      5,6
       Regular file whose type cannot be deter‐ data
       mined

       Notes:

               1. This is a file type test.

               2. This test is applied only if the -i option is specified.

               3. This test is applied only if the -i option is not specified.

               4. This is a position-sensitive default system test.

               5. This is a context-sensitive default system test.

               6. Position-sensitive default system tests  and  context-sensi‐
                  tive  default  system tests are not applied if the -M option
                  is specified unless the -d option is also specified.

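       Because <type> is only required to contain the strings in the table,  a
       script  should  match  them as substrings rather than compare the whole
       field. The following sketch shows one way to do that; the helper  names
       type_field and is_directory are hypothetical, and the sketch assumes a
       pathname without <newline> characters and output produced in the POSIX
       locale.

```shell
# type_field NAME LINE: hypothetical helper that strips the "<file>: " prefix
# from one line of file output, leaving the <type> field.
type_field() {
    printf '%s\n' "${2#"$1: "}"
}

# is_directory NAME LINE: succeed if <type> contains the string "directory".
is_directory() {
    case $(type_field "$1" "$2") in
        *directory*) return 0 ;;
        *) return 1 ;;
    esac
}
```

       A caller might use it as: line=$(file "$path"); is_directory "$path"
       "$line". Stripping the known pathname first avoids false matches when
       the pathname itself contains one of the table strings.
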
       In the POSIX locale, if file is identified as a symbolic link (see  the
       -h option), the following alternative output format shall be used:

              "%s: %s %s\n", <file>, <type>, <contents of link>

       If  the  file named by the file operand does not exist, cannot be read,
       or the type of the file named by the file operand cannot be determined,
       this shall not be considered an error that affects the exit status.

STDERR

       The standard error shall be used only for diagnostic messages.

OUTPUT FILES

       None.

EXTENDED DESCRIPTION

       A  file  specified  as an option-argument to the -m or -M options shall
       contain one position-sensitive test per line, which shall be applied to
       the  file. If the test succeeds, the message field of the line shall be
       printed and no further tests shall be applied, with the exception  that
       tests  on immediately following lines beginning with a single '>' char‐
       acter shall be applied.

       Each line shall be composed of  the  following  four  <blank>-separated
       fields:

       offset An  unsigned number (optionally preceded by a single '>' charac‐
              ter) specifying the offset, in bytes, of the value in  the  file
              that  is  to be compared against the value field of the line. If
              the file is shorter than the specified offset,  the  test  shall
              fail.

       If  the offset begins with the character '>', the test contained in the
       line shall not be applied to the file unless the test on the last  line
       for  which  the  offset  did  not  begin  with a '>' was successful. By
       default, the offset shall be interpreted as an unsigned decimal number.
       With a leading 0x or 0X, the offset shall be interpreted as a hexadeci‐
       mal number; otherwise, with a leading 0, the  offset  shall  be  inter‐
       preted as an octal number.

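       The base rules for the offset field match those of integer constants in
       shell  arithmetic  expansion,  so  they can be illustrated with a short
       sketch. The function name offset_value is hypothetical, and the  sketch
       assumes a well-formed offset field.

```shell
# offset_value FIELD: hypothetical helper that prints the byte offset encoded
# in an offset field as a decimal number, ignoring any leading '>'.
offset_value() {
    o=${1#'>'}
    # Arithmetic expansion reads 0x.. as hexadecimal and 0.. as octal.
    printf '%d\n' "$(($o))"
}
```

       For example, offset_value 16, offset_value 0x10, and  offset_value  020
       all print 16, and offset_value '>2' prints 2.
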
       type   The  type  of the value in the file to be tested. The type shall
              consist of the type specification characters c, d, f, s, and  u,
              specifying  character,  signed  decimal, floating point, string,
              and unsigned decimal, respectively.

       The type string shall be interpreted as the bytes from the file  start‐
       ing  at  the  specified  offset  and including the same number of bytes
       specified by the value field. If insufficient bytes remain in the  file
       past the offset to match the value field, the test shall fail.

       The  type  specification  characters  d, f, and u can be followed by an
       optional unsigned decimal integer that specifies the  number  of  bytes
       represented  by  the  type.   The type specification character f can be
       followed by an optional F, D, or L, indicating that  the  value  is  of
       type  float,  double, or long double, respectively. The type specifica‐
       tion characters d and u can be followed by an optional C, S, I,  or  L,
       indicating that the value is of type char, short, int, or long, respec‐
       tively.

       The default number of bytes represented by the type  specifiers  d,  f,
       and u shall correspond to their respective C-language types as follows.
       If the system claims conformance to the C-Language  Development  Utili‐
       ties  option,  those  specifiers  shall correspond to the default sizes
       used in the c99 utility.  Otherwise, the default sizes shall be  imple‐
       mentation-defined.

       For  the type specifier characters d and u, the default number of bytes
       shall correspond to the size of a basic integer type of the implementa‐
       tion.  For these specifier characters, the implementation shall support
       values of the optional number of bytes to be converted corresponding to
       the  number of bytes in the C-language types char, short, int, or long.
       These numbers can also be specified by an application as the characters
       C,  S,  I,  and  L, respectively. The byte order used when interpreting
       numeric values is implementation-defined, but shall correspond  to  the
       order in which a constant of the corresponding type is stored in memory
       on the system.

       For the type specifier f, the default number of bytes shall  correspond
       to  the  number  of  bytes in the basic double precision floating-point
       data type of the underlying implementation.  The  implementation  shall
       support  values  of the optional number of bytes to be converted corre‐
       sponding to the number of bytes in the C-language types float,  double,
       and  long double. These numbers can also be specified by an application
       as the characters F, D, and L, respectively.

       All type specifiers, except for s, can be followed by a mask  specifier
       of  the  form &number. The mask value shall be AND'ed with the value of
       the input file before the comparison with the value field of  the  line
       is made. By default, the mask shall be interpreted as an unsigned deci‐
       mal number. With a leading 0x or 0X, the mask shall be  interpreted  as
       an  unsigned  hexadecimal number; otherwise, with a leading 0, the mask
       shall be interpreted as an unsigned octal number.

       The strings byte, short, long, and string shall also  be  supported  as
       type fields, being interpreted as dC, dS, dL, and s, respectively.

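       What a one-byte type with a mask specifier selects can be imitated with
       the  od  utility. The sketch below is an approximation only: the name
       masked_byte is hypothetical, and the byte is read as unsigned (closer
       to uC than to the signed dC that byte stands for), which makes no
       difference once a mask that fits in one byte is applied.

```shell
# masked_byte FILE OFFSET MASK: hypothetical helper that prints the unsigned
# byte at OFFSET AND'ed with MASK; fails if the file is too short.
masked_byte() {
    # -An: no address column, -tu1: unsigned 1-byte units,
    # -j: bytes to skip, -N: bytes to read.
    b=$(od -An -tu1 -j "$2" -N 1 "$1" 2>/dev/null | tr -d ' \n')
    [ -n "$b" ] || return 1
    printf '%d\n' "$(($b & $3))"
}
```

       For a file whose third byte is octal 220, masked_byte FILE 2 0x80
       prints 128 and masked_byte FILE 2 0x1f prints 16.
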
       value  The value to be compared with the value from the file.

       If the specifier from the type field is s or string, then interpret the
       value as a string. Otherwise, interpret it as a number. If the value is
       a  string, then the test shall succeed only when a string value exactly
       matches the bytes from the file.

       If the value is a string, it can contain the following sequences:

       \character
              The backslash-escape sequences as specified in the Base  Defini‐
              tions   volume   of   IEEE Std 1003.1-2001,  Table  5-1,  Escape
              Sequences and Associated Actions ('\\', '\a', '\b', '\f',  '\n',
              '\r',  '\t',  '\v').  The  results of using any other character,
              other than an octal digit, following the backslash are  unspeci‐
              fied.

       \octal
              Octal  sequences  that  can be used to represent characters with
              specific coded values. An octal  sequence  shall  consist  of  a
              backslash followed by the longest sequence of one, two, or three
              octal-digit characters (01234567). If the size of a byte on  the
              system is greater than 9 bits, the valid escape sequence used to
              represent a byte is implementation-defined.

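       A string test can be imitated by expanding the escapes  in  the  value,
       then comparing the result byte for byte with the file contents at the
       offset. The following is a sketch under several assumptions: the name
       string_test is hypothetical, mktemp is used although POSIX does not
       specify it, OFFSET is given in decimal, and the value is passed to
       printf as a format string, so it must not contain a '%' character.

```shell
# string_test FILE OFFSET VALUE: hypothetical helper that succeeds if the
# bytes at OFFSET exactly match VALUE after backslash escapes are expanded.
string_test() {
    want=$(mktemp) || return 1
    printf "$3" > "$want"               # expands \n, \t, \ddd, and so on
    len=$(($(wc -c < "$want")))         # number of bytes to compare
    dd if="$1" bs=1 skip="$2" count="$len" 2>/dev/null | cmp -s - "$want"
    rc=$?
    rm -f "$want"
    return "$rc"
}
```

       If too few bytes remain past the offset, dd supplies a shorter  stream
       and cmp reports a mismatch, so the test fails as the text requires.
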
       By default, any value that is not a string shall be  interpreted  as  a
       signed  decimal  number. Any such value, with a leading 0x or 0X, shall
       be interpreted as an unsigned hexadecimal  number;  otherwise,  with  a
       leading  zero, the value shall be interpreted as an unsigned octal num‐
       ber.

       If the value is not a string, it can be preceded by a  character  indi‐
       cating  the  comparison to be performed. Permissible characters and the
       comparisons they specify are as follows:

       =
              The test shall succeed if the value from  the  file  equals  the
              value field.

       <
              The  test  shall succeed if the value from the file is less than
              the value field.

       >
              The test shall succeed if the value from  the  file  is  greater
              than the value field.

       &
              The test shall succeed if all of the set bits in the value field
              are set in the value from the file.

       ^
              The test shall succeed if at least one of the set  bits  in  the
              value field is not set in the value from the file.

       x
              The  test shall succeed if the file is large enough to contain a
              value of the type specified starting at the offset specified.

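       The comparison characters above map directly onto shell arithmetic.  In
       the sketch below, the name magic_compare is hypothetical, both  values
       are  assumed  to  be integers the shell can represent, and the signed
       versus unsigned distinction between type specifiers is ignored.

```shell
# magic_compare OP FILE_VALUE [FIELD_VALUE]: hypothetical helper that succeeds
# if the comparison selected by OP holds. Values are shell integers.
magic_compare() {
    case $1 in
        '=') [ "$(( $2 == $3 ))" -eq 1 ] ;;
        '<') [ "$(( $2 < $3 ))" -eq 1 ] ;;
        '>') [ "$(( $2 > $3 ))" -eq 1 ] ;;
        '&') [ "$(( ($2 & $3) == $3 ))" -eq 1 ] ;;  # all set bits present
        '^') [ "$(( ($2 & $3) != $3 ))" -eq 1 ] ;;  # some set bit missing
        x)   return 0 ;;   # a value was read, so the file is large enough
        *)   return 2 ;;   # unknown comparison character
    esac
}
```

       For example, magic_compare '&' 0x90 0x80 succeeds because bit  0x80  is
       set in 0x90, while magic_compare '^' 0x90 0x80 fails for the same
       reason.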
       message
              The message to be printed if  the  test  succeeds.  The  message
              shall  be  interpreted using the notation for the printf format‐
              ting specification; see printf().  If  the  value  field  was  a
              string,  then  the value from the file shall be the argument for
              the printf formatting specification; otherwise, the  value  from
              the file shall be the argument.

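       The role of the message field as a printf format can be shown with  the
       printf utility. This is a sketch only: the name render_message is hypo‐
       thetical, and how an implementation joins the messages of several  suc‐
       cessful lines is not modelled.

```shell
# render_message MESSAGE VALUE: hypothetical helper that prints MESSAGE with
# VALUE as the single printf argument.
render_message() {
    case $1 in
        *%*) printf "$1\n" "$2" ;;   # message has a conversion specification
        *)   printf "$1\n" ;;        # plain text: do not pass an argument
    esac
}
```

       For example, render_message '%d bits' 16 prints "16 bits".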

EXIT STATUS

       The following exit values shall be returned:

        0     Successful completion.

       >0     An error occurred.

CONSEQUENCES OF ERRORS

       Default.

       The following sections are informative.

APPLICATION USAGE

       The  file  utility  can  only  be required to guess at many of the file
       types because only exhaustive testing can  determine  some  types  with
       certainty. For example, binary data on some implementations might match
       the initial segment of an executable or a tar archive.

       Note that the table indicates  that  the  output  contains  the  stated
       string.  Systems  may add text before or after the string. For executa‐
       bles, as an example, the machine architecture and various  facts  about
       how the file was link-edited may be included. Note also that on systems
       that recognize shell script files  starting  with  "#!"  as  executable
       files,  these  may be identified as executable binary files rather than
       as shell scripts.

EXAMPLES

       Determine whether an argument is a binary executable file:

              file "$1" | grep -Fq executable &&
                  printf "%s is executable.\n" "$1"

RATIONALE

       The -f option was omitted because the same effect can (and  should)  be
       obtained using the xargs utility.

       Historical versions of the file utility attempt to identify the follow‐
       ing types of files: symbolic link, directory, character special,  block
       special,  socket,  tar  archive,  cpio  archive,  SCCS archive, archive
       library, empty, compress output, pack output, binary  data,  C  source,
       FORTRAN  source,  assembler  source, nroff/troff/eqn/tbl source, troff
       output, shell script, C shell script, English text, ASCII text, various
       executables,  APL  workspace,  compiled  terminfo  entries,  and CURSES
       screen images. Only those types that are reasonably well  specified  in
       POSIX  or are directly related to POSIX utilities are listed in the ta‐
       ble.

       Historical systems have used a "magic file" named  /etc/magic  to  help
       identify  file  types.  Because  it  is  generally useful for users and
       scripts to be able to identify special file types, the -m  flag  and  a
       portable  format  for  user-created magic files have been specified. No
       requirement is made that an implementation of file use this  method  of
       identifying  files, only that users be permitted to add their own clas‐
       sifying tests.

       In addition, three options have been added to historical practice.  The
       -d  flag  has been added to permit users to cause their tests to follow
       any default system tests. The -i flag has been added to permit users to
       test  portably for regular files in shell scripts. The -M flag has been
       added to permit users to ignore any default system tests.

       The IEEE Std 1003.1-2001 description of default system  tests  and  the
       interaction between the -d, -M, and -m options did not clearly indicate
       that there were two types of "default system tests". The "position-sen‐
       sitive  tests"  determine  file  types by looking for certain string or
       binary values at specific offsets in the  file  being  examined.  These
       position-sensitive  tests  were implemented in historical systems using
       the magic file described above. Some of these tests are now built  into
       the  file utility itself on some implementations so the output can pro‐
       vide more detail than can be provided by magic files.  For  example,  a
       magic file can easily identify a core file on most implementations, but
       cannot name the program file that dropped the core. A magic file  could
       produce output such as:

              /home/dwc/core: ELF 32-bit MSB core file SPARC Version 1

       but  by  building  the test into the file utility, you could get output
       such as:

              /home/dwc/core: ELF 32-bit MSB core file SPARC Version 1, from 'testprog'

       These extended built-in tests are still to be treated as  position-sen‐
       sitive  default  system tests even if they are not listed in /etc/magic
       or any other magic file.

       The context-sensitive default system tests were always built  into  the
       file  utility. These tests looked for language constructs in text files
       trying to identify shell scripts, C, FORTRAN, and other  computer  lan‐
       guage source files, and even plain text files. With the addition of the
       -m and -M options the distinction between position-sensitive  and  con‐
       text-sensitive  default system tests became important because the order
       of testing is important. The  context-sensitive  system  default  tests
       should never be applied before any position-sensitive tests even if the
       -d option is specified before a -m option or -M option due to the  high
       probability that the context-sensitive system default tests will incor‐
       rectly identify arbitrary text files as text files before position-sen‐
       sitive  tests specified by the -m or -M option would be applied to give
       a more accurate identification.

       Leaving the meaning of -M - and -m -  unspecified  allows  an  existing
       prototype  of these options to continue to work in a backwards-compati‐
       ble manner. (In that implementation, -M - was roughly equivalent to  -d
       in IEEE Std 1003.1-2001.)

       The  historical  -c  option  was  omitted as not particularly useful to
       users or portable shell scripts. In addition, a reasonable  implementa‐
       tion  of  the  file utility would report any errors found each time the
       magic file is read.

       The historical format of the magic file was the same as that  specified
       by  the  Rationale  in  the  ISO POSIX-2:1993  standard for the offset,
       value, and message fields; however, it used less  precise  type  fields
       than  the  format specified by the current normative text. The new type
       field values are a superset of the historical ones.

       The following is an example magic file:

              0  short     070707              cpio archive
              0  short     0143561             Byte-swapped cpio archive
              0  string    070707              ASCII cpio archive
              0  long      0177555             Very old archive
              0  short     0177545             Old archive
              0  short     017437              Old packed data
              0  string    \037\036            Packed data
              0  string    \377\037            Compacted data
              0  string    \037\235            Compressed data
              >2 byte&0x80 >0                  Block compressed
              >2 byte&0x1f x                   %d bits
              0  string    \032\001            Compiled Terminfo Entry
              0  short     0433                Curses screen image
              0  short     0434                Curses screen image
              0  string    <ar>                System V Release 1 archive
              0  string    !<arch>\n__.SYMDEF  Archive random library
              0  string    !<arch>             Archive
              0  string    ARF_BEGARF          PHIGS clear text archive
              0  long      0x137A2950          Scalable OpenFont binary
              0  long      0x137A2951          Encrypted scalable OpenFont binary

       The use of a basic integer data type is intended to allow the implemen‐
       tation  to  choose  a  word  size commonly used by applications on that
       architecture.

FUTURE DIRECTIONS

       None.

SEE ALSO

       ar, ls, pax

COPYRIGHT

       Portions of this text are reprinted and reproduced in  electronic  form
       from IEEE Std 1003.1, 2003 Edition, Standard for Information Technology
       -- Portable Operating System Interface (POSIX),  The  Open  Group  Base
       Specifications  Issue  6,  Copyright  (C) 2001-2003 by the Institute of
       Electrical and Electronics Engineers, Inc and The Open  Group.  In  the
       event of any discrepancy between this version and the original IEEE and
       The Open Group Standard, the original IEEE and The Open Group  Standard
       is  the  referee document. The original Standard can be obtained online
       at http://www.opengroup.org/unix/online.html .

IEEE/The Open Group                  2003                             FILE(1P)