FILE(P)                    POSIX Programmer's Manual                   FILE(P)

NAME

       file - determine file type

SYNOPSIS

       file [-dh][-M file][-m file] file ...

       file -i [-h] file ...

DESCRIPTION

       The  file  utility  shall perform a series of tests in sequence on each
       specified file in an attempt to classify it:

        1. If file does not exist, cannot be read, or its  file  status  could
           not be determined, the output shall indicate that the file was pro‐
           cessed, but that its type could not be determined.

        2. If the file is not a regular file, its file type shall  be  identi‐
           fied.   The  file types directory, FIFO, socket, block special, and
           character special shall be identified as  such.  Other  implementa‐
           tion-defined  file  types may also be identified. If file is a sym‐
           bolic link, by default the link shall be resolved  and  file  shall
           test the type of file referenced by the symbolic link.  (See the -h
           and -i options below.)

        3. If the length of file is zero, it shall be identified as  an  empty
           file.

        4. The file utility shall examine an initial segment of file and shall
           make a guess at identifying its contents based  on  position-sensi‐
           tive  tests.  (The  answer is not guaranteed to be correct; see the
           -d, -M, and -m options below.)

        5. The file utility shall examine file and make a guess at identifying
           its  contents based on context-sensitive default system tests. (The
           answer is not guaranteed to be correct.)

        6. The file shall be identified as a data file.
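
       The ordering of tests 1 to 3 can be approximated with the test utility.
       The following is a minimal sketch, not the way any implementation of
       file is required to work; the function name classify and the wording
       of its last message are hypothetical. It stops where the content-based
       tests (4 to 6) would begin.

```shell
# Hypothetical helper: approximate tests 1-3 with the POSIX test utility.
# It sketches the ordering only and is not a replacement for file.
classify() {
    if [ ! -e "$1" ]; then
        # test -e follows symbolic links, so a dangling link lands here
        if [ -h "$1" ]; then
            echo "symbolic link"
        else
            echo "cannot open"
        fi
    elif [ -d "$1" ]; then echo "directory"
    elif [ -p "$1" ]; then echo "fifo"
    elif [ -S "$1" ]; then echo "socket"
    elif [ -b "$1" ]; then echo "block special"
    elif [ -c "$1" ]; then echo "character special"
    elif [ ! -r "$1" ]; then echo "cannot open"
    elif [ ! -s "$1" ]; then echo "empty"
    else echo "regular file, content tests needed"
    fi
}
```

       The file type checks come before the zero-length check, matching the
       sequence in the list above.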

       If file does not exist, cannot be read, or its file status could not be
       determined,  the output shall indicate that the file was processed, but
       that its type could not be determined.

       If file is a symbolic link, by default the link shall be  resolved  and
       file shall test the type of file referenced by the symbolic link.

OPTIONS

       The  file  utility  shall  conform  to  the  Base Definitions volume of
       IEEE Std 1003.1-2001, Section 12.2, Utility Syntax  Guidelines,  except
       that the order of the -m, -d, and -M options shall be significant.

       The following options shall be supported by the implementation:

       -d     Apply  any  position-sensitive default system tests and context-
              sensitive default system tests to the file. This is the  default
              if no -M or -m option is specified.

       -h     When a symbolic link is encountered, identify the file as a sym‐
              bolic link. If -h is not specified and file is a  symbolic  link
              that  refers to a nonexistent file, file shall identify the file
              as a symbolic link, as if -h had been specified.

       -i     If a file is a regular file, do not attempt to classify the type
              of  the  file further, but identify the file as specified in the
              STDOUT section.

       -M  file
              Specify the name of a file containing  position-sensitive  tests
              that shall be applied to a file in order to classify it (see the
              EXTENDED  DESCRIPTION).  No  position-sensitive  default  system
              tests  nor  context-sensitive  default  system  tests  shall  be
              applied unless the -d option is also specified.

       -m  file
              Specify the name of a file containing  position-sensitive  tests
              that shall be applied to a file in order to classify it (see the
              EXTENDED DESCRIPTION).

       If the -m option is specified without specifying the -d option  or  the
       -M  option,  position-sensitive  default  system tests shall be applied
       after the position-sensitive tests specified by the -m option.  If  the
       -M  option  is specified with the -d option, the -m option, or both, or
       the -m option is specified with the -d option, the concatenation of the
       position-sensitive tests specified by these options shall be applied in
       the order specified by the appearance of these options. If a -M  or  -m
       file option-argument is -, the results are unspecified.

OPERANDS

       The following operand shall be supported:

       file   A pathname of a file to be tested.

STDIN

       Not used.

INPUT FILES

       The file can be any file type.

ENVIRONMENT VARIABLES

       The following environment variables shall affect the execution of file:

       LANG   Provide  a  default value for the internationalization variables
              that are unset or null. (See  the  Base  Definitions  volume  of
              IEEE Std 1003.1-2001,  Section  8.2,  Internationalization Vari‐
              ables for the precedence of internationalization variables  used
              to determine the values of locale categories.)

       LC_ALL If  set  to a non-empty string value, override the values of all
              the other internationalization variables.

       LC_CTYPE
              Determine the locale for  the  interpretation  of  sequences  of
              bytes  of  text  data as characters (for example, single-byte as
              opposed to multi-byte characters in arguments and input files).

       LC_MESSAGES
              Determine the locale that should be used to  affect  the  format
              and  contents  of  diagnostic messages written to standard error
              and informative messages written to standard output.

       NLSPATH
              Determine the location of message catalogs for the processing of
              LC_MESSAGES.

ASYNCHRONOUS EVENTS

       Default.

STDOUT

       In  the  POSIX  locale,  the following format shall be used to identify
       each operand, file specified:

              "%s: %s\n", <file>, <type>

       The values for <type> are unspecified, except that in the POSIX locale,
       if  file  is identified as one of the types listed in the following ta‐
       ble, <type> shall contain (but is not  limited  to)  the  corresponding
       string,  unless  the  file  is  identified by a position-sensitive test
       specified by a -M or -m option. Each space shown in the  strings  shall
       be exactly one <space>.

                         Table: File Utility Output Strings

       If file is:                              <type> shall contain the  Notes
                                                string:
       Nonexistent                              cannot open
       Block special                            block special             1
       Character special                        character special         1
       Directory                                directory                 1
       FIFO                                     fifo                      1
       Socket                                   socket                    1
       Symbolic link                            symbolic link to          1
       Regular file                             regular file              1,2
       Empty regular file                       empty                     3
       Regular file that cannot be read         cannot open               3
       Executable binary                        executable                4,6
       ar archive library (see ar)              archive                   4,6
       Extended cpio format (see pax)           cpio archive              4,6
       Extended tar format (see ustar in pax)   tar archive               4,6
       Shell script                             commands text             5,6
       C-language source                        c program text            5,6
       FORTRAN source                           fortran program text      5,6
       Regular file whose type cannot be deter‐ data
       mined

       Notes:

               1. This is a file type test.

               2. This test is applied only if the -i option is specified.

               3. This test is applied only if the -i option is not specified.

               4. This is a position-sensitive default system test.

               5. This is a context-sensitive default system test.

               6. Position-sensitive  default  system tests and context-sensi‐
                  tive default system tests are not applied if the  -M  option
                  is specified unless the -d option is also specified.

       In  the POSIX locale, if file is identified as a symbolic link (see the
       -h option), the following alternative output format shall be used:

              "%s: %s %s\n", <file>, <type>, <contents of link>

       If the file named by the file operand does not exist, cannot  be  read,
       or the type of the file named by the file operand cannot be determined,
       this shall not be considered an error that affects the exit status.
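
       Because a pathname can itself contain ": ", a script that needs the
       <type> part is safer stripping the operand it already knows than
       splitting on the first colon. The following is a minimal sketch that
       assumes output in exactly the "%s: %s\n" format shown above; the func‐
       tion name type_field and the sample line are hypothetical, and imple‐
       mentations that pad the separator would need extra handling.

```shell
# Hypothetical helper: print the <type> part of one line of file output.
# $1 is the output line, $2 is the operand exactly as it was passed to file.
type_field() {
    # Strip the known "<file>: " prefix; quoting $2 keeps it a literal string.
    printf '%s\n' "${1#"$2: "}"
}

# Sample line written by hand for illustration, not captured from file.
type_field "notes: draft.txt: directory" "notes: draft.txt"
```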

STDERR

       The standard error shall be used only for diagnostic messages.

OUTPUT FILES

       None.

EXTENDED DESCRIPTION

       A file specified as an option-argument to the -m or  -M  options  shall
       contain one position-sensitive test per line, which shall be applied to
       the file. If the test succeeds, the message field of the line shall  be
       printed  and no further tests shall be applied, with the exception that
       tests on immediately following lines beginning with a single '>'  char‐
       acter shall be applied.

       Each  line  shall  be  composed of the following four <blank>-separated
       fields:

       offset An unsigned number (optionally preceded by a single '>'  charac‐
              ter)  specifying  the offset, in bytes, of the value in the file
              that is to be compared against the value field of the  line.  If
              the  file  is  shorter than the specified offset, the test shall
              fail.

       If the offset begins with the character '>', the test contained in  the
       line  shall not be applied to the file unless the test on the last line
       for which the offset did not  begin  with  a  '>'  was  successful.  By
       default, the offset shall be interpreted as an unsigned decimal number.
       With a leading 0x or 0X, the offset shall be interpreted as a hexadeci‐
       mal  number;  otherwise,  with  a leading 0, the offset shall be inter‐
       preted as an octal number.
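
       The radix rules for the offset field match those of constants in shell
       arithmetic expansion, so they can be illustrated as follows. This is a
       minimal sketch; the function name offset_to_decimal is hypothetical,
       and the input is assumed to be a well-formed offset field.

```shell
# Hypothetical helper: convert a magic-file offset field to decimal.
offset_to_decimal() {
    # Drop the optional leading '>' continuation marker.
    off=${1#">"}
    # Shell arithmetic reads 0x/0X as hexadecimal and a leading 0 as octal.
    printf '%d\n' "$(( $off ))"
}

offset_to_decimal 0x10
offset_to_decimal '>010'
```

       Arithmetic expansion evaluates its operand, so a helper of this kind
       is only suitable for trusted magic files.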

       type   The type of the value in the file to be tested. The  type  shall
              consist of the type specification characters c, d, f, s, and  u,
              specifying  character,  signed  decimal, floating point, string,
              and unsigned decimal, respectively.

       The  type string shall be interpreted as the bytes from the file start‐
       ing at the specified offset and including  the  same  number  of  bytes
       specified  by the value field. If insufficient bytes remain in the file
       past the offset to match the value field, the test shall fail.

       The type specification characters d, f, and u can  be  followed  by  an
       optional  unsigned  decimal  integer that specifies the number of bytes
       represented by the type.  The type specification  character  f  can  be
       followed  by  an  optional F, D, or L, indicating that the value is  of
       type float, double, or long double, respectively. The  type  specifica‐
       tion  characters d and u can be followed by an optional C, S, I, or  L,
       indicating that the value is of type char, short, int, or long, respec‐
       tively.

       The  default number of bytes represented by the type specifiers d, f,
       and u shall correspond to their respective C-language types as follows.
       If  the  system claims conformance to the C-Language Development Utili‐
       ties option, those specifiers shall correspond  to  the  default  sizes
       used  in the c99 utility.  Otherwise, the default sizes shall be imple‐
       mentation-defined.

       For the type specifier characters d and u, the default number of  bytes
       shall correspond to the size of a basic integer type of the implementa‐
       tion. For these specifier characters, the implementation shall  support
       values of the optional number of bytes to be converted corresponding to
       the number of bytes in the C-language types char, short, int, or  long.
       These numbers can also be specified by an application as the characters
       C, S, I, and L, respectively. The byte  order  used  when  interpreting
       numeric  values  is implementation-defined, but shall correspond to the
       order in which a constant of the corresponding type is stored in memory
       on the system.

       For  the type specifier f, the default number of bytes shall correspond
       to the number of bytes in the  basic  double  precision  floating-point
       data  type  of the underlying implementation.  The implementation shall
       support values of the optional number of bytes to be  converted  corre‐
       sponding  to the number of bytes in the C-language types float, double,
       and long double. These numbers can also be specified by an  application
       as the characters F, D, and L, respectively.

       All  type specifiers, except for s, can be followed by a mask specifier
       of the form &number. The mask value shall be AND'ed with the  value  of
       the  input  file before the comparison with the value field of the line
       is made. By default, the mask shall be interpreted as an unsigned deci‐
       mal  number.  With a leading 0x or 0X, the mask shall be interpreted as
       an unsigned hexadecimal number; otherwise, with a leading 0,  the  mask
       shall be interpreted as an unsigned octal number.
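
       The effect of a mask can be illustrated with shell arithmetic, using
       the type fields byte&0x80 and byte&0x1f that appear in the example
       magic file in RATIONALE. This is a minimal sketch; the byte value 0x90
       is an assumed sample chosen for illustration, not data read from a
       real file.

```shell
# Assumed sample value standing in for the byte read from the file.
byte=$(( 0x90 ))

# Mirror "byte&0x80 >0": the mask is applied first, then the comparison.
if [ "$(( byte & 0x80 ))" -gt 0 ]; then
    echo "Block compressed"
fi

# Mirror "byte&0x1f x": the masked value becomes the printf argument.
printf '%d bits\n' "$(( byte & 0x1f ))"
```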

       The  strings  byte,  short, long, and string shall also be supported as
       type fields, being interpreted as dC, dS, dL, and s, respectively.

       value  The value to be compared with the value from the file.

       If the specifier from the type field is s or string, then interpret the
       value as a string. Otherwise, interpret it as a number. If the value is
       a string, then the test shall succeed only when a string value  exactly
       matches the bytes from the file.

       If the value is a string, it can contain the following sequences:

       \character
              The  backslash-escape sequences as specified in the Base Defini‐
              tions  volume  of  IEEE Std 1003.1-2001,   Table   5-1,   Escape
              Sequences  and  Associated  Actions  ('\\', '\a', '\b', '\f',
              '\n', '\r', '\t', '\v'). The results of using any other  charac‐
              ter, other than an octal digit,  following  the  backslash  are
              unspecified.

       \octal
              Octal sequences that can be used to  represent  characters  with
              specific  coded  values.  An  octal  sequence shall consist of a
              backslash followed by the longest sequence of one, two, or three
              octal-digit  characters (01234567). If the size of a byte on the
              system is greater than 9 bits, the valid escape sequence used to
              represent a byte is implementation-defined.
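
       The format operand of the printf utility accepts octal escapes of the
       same one-to-three-digit form, which gives a quick way to see the bytes
       a string value stands for. This is a minimal sketch using the value
       \037\235 from the example magic file in RATIONALE; the exact spacing of
       the od output varies between implementations.

```shell
# Write the two bytes named by the value string \037\235 and dump them in hex.
printf '\037\235' | od -An -tx1
```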

       By  default,  any  value that is not a string shall be interpreted as a
       signed decimal number. Any such value, with a leading 0x or  0X,  shall
       be  interpreted  as  an  unsigned hexadecimal number; otherwise, with a
       leading zero, the value shall be interpreted as an unsigned octal  num‐
       ber.

       If  the  value is not a string, it can be preceded by a character indi‐
       cating the comparison to be performed. Permissible characters  and  the
       comparisons they specify are as follows:

       =
              The  test  shall  succeed  if the value from the file equals the
              value field.

       <
              The test shall succeed if the value from the file is  less  than
              the value field.

       >
              The  test  shall  succeed  if the value from the file is greater
              than the value field.

       &
              The test shall succeed if all of the set bits in the value field
              are set in the value from the file.

       ^
              The  test  shall  succeed if at least one of the set bits in the
              value field is not set in the value from the file.

       x
              The test shall succeed if the file is large enough to contain  a
              value of the type specified starting at the offset specified.
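
       The comparison characters can be expressed in shell arithmetic as fol‐
       lows. This is a minimal sketch; the function name magic_compare and its
       argument order are hypothetical, and both operands are assumed to be
       integers that have already been read and, where applicable, masked.

```shell
# Hypothetical helper: succeed when <file value> <op> <value field> holds.
# $1 = comparison character, $2 = value from the file, $3 = value field.
magic_compare() {
    case $1 in
        '=') [ "$(( $2 ))" -eq "$(( $3 ))" ] ;;
        '<') [ "$(( $2 ))" -lt "$(( $3 ))" ] ;;
        '>') [ "$(( $2 ))" -gt "$(( $3 ))" ] ;;
        # every set bit of the value field is also set in the file value
        '&') [ "$(( ($2) & ($3) ))" -eq "$(( $3 ))" ] ;;
        # at least one set bit of the value field is clear in the file value
        '^') [ "$(( ($2) & ($3) ))" -ne "$(( $3 ))" ] ;;
        # x: a value was available to read, so the test succeeds
        'x') return 0 ;;
        *)   return 2 ;;
    esac
}

magic_compare '&' 0x90 0x80 && echo "bit 0x80 is set"
```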

       message
              The  message  to  be  printed  if the test succeeds. The message
              shall be interpreted using the notation for the  printf  format‐
              ting  specification;  see  printf().  If  the value field was  a
              string, then the value from the file shall be the  argument  for
              the  printf  formatting specification; otherwise, the value from
              the file shall be the argument.

EXIT STATUS

       The following exit values shall be returned:

        0     Successful completion.

       >0     An error occurred.

CONSEQUENCES OF ERRORS

       Default.

       The following sections are informative.

APPLICATION USAGE

       The file utility can only be required to guess  at  many  of  the  file
       types  because  only  exhaustive  testing can determine some types with
       certainty. For example, binary data on some implementations might match
       the initial segment of an executable or a tar archive.

       Note  that  the  table  indicates  that  the output contains the stated
       string. Systems may add text before or after the string.  For  executa‐
       bles,  as  an example, the machine architecture and various facts about
       how the file was link-edited may be included. Note also that on systems
       that  recognize  shell  script  files  starting with "#!" as executable
       files, these may be identified as executable binary files  rather  than
       as shell scripts.

EXAMPLES

       Determine whether an argument is a binary executable file:

              file "$1" | grep -Fq executable &&
                  printf "%s is executable.\n" "$1"

RATIONALE

       The  -f  option was omitted because the same effect can (and should) be
       obtained using the xargs utility.

       Historical versions of the file utility attempt to identify the follow‐
       ing  types of files: symbolic link, directory, character special, block
       special, socket, tar  archive,  cpio  archive,  SCCS  archive,  archive
       library,  empty,  compress  output, pack output, binary data, C source,
       FORTRAN source, assembler  source,  nroff/troff/eqn/tbl  source,  troff
       output, shell script, C shell script, English text, ASCII text, various
       executables, APL  workspace,  compiled  terminfo  entries,  and  CURSES
       screen  images.  Only those types that are reasonably well specified in
       POSIX or are directly related to POSIX utilities are listed in the  ta‐
       ble.

       Historical  systems  have  used a "magic file" named /etc/magic to help
       identify file types. Because it  is  generally  useful  for  users  and
       scripts  to  be  able to identify special file types, the -m flag and a
       portable format for user-created magic files have  been  specified.  No
       requirement  is  made that an implementation of file use this method of
       identifying files, only that users be permitted to add their own  clas‐
       sifying tests.

       In addition, three options have been added to historical practice.  The
       -d flag has been added to permit users to cause their tests  to  follow
       any default system tests. The -i flag has been added to permit users to
       test portably for regular files in shell scripts. The -M flag has  been
       added to permit users to ignore any default system tests.

       The  IEEE Std 1003.1-2001  description  of default system tests and the
       interaction between the -d, -M, and -m options did not clearly indicate
       that there were two types of "default system tests". The "position-sen‐
       sitive tests" determine file types by looking  for  certain  string  or
       binary  values  at  specific  offsets in the file being examined. These
       position-sensitive tests were implemented in historical  systems  using
       the  magic file described above. Some of these tests are now built into
       the file utility itself on some implementations so the output can  pro‐
       vide  more  detail  than can be provided by magic files. For example, a
       magic file can easily identify a core file on most implementations, but
       cannot  name the program file that dropped the core. A magic file could
       produce output such as:

              /home/dwc/core: ELF 32-bit MSB core file SPARC Version 1

       but by building the test into the file utility, you  could  get  output
       such as:

              /home/dwc/core: ELF 32-bit MSB core file SPARC Version 1, from 'testprog'

       These  extended built-in tests are still to be treated as position-sen‐
       sitive default system tests even if they are not listed  in  /etc/magic
       or any other magic file.

       The  context-sensitive  default system tests were always built into the
       file utility. These tests looked for language constructs in text  files
       trying  to  identify shell scripts, C, FORTRAN, and other computer lan‐
       guage source files, and even plain text files. With the addition of the
       -m  and  -M options the distinction between position-sensitive and con‐
       text-sensitive default system tests became important because the  order
       of  testing  is  important.  The context-sensitive system default tests
       should never be applied before any position-sensitive tests even if the
       -d  option is specified before a -m option or -M option due to the high
       probability that the context-sensitive system default tests will incor‐
       rectly identify arbitrary text files as text files before position-sen‐
       sitive tests specified by the -m or -M option would be applied to  give
       a more accurate identification.

       Leaving  the  meaning  of  -M - and -m - unspecified allows an existing
       prototype of these options to continue to work in a  backwards-compati‐
       ble  manner. (In that implementation, -M - was roughly equivalent to -d
       in IEEE Std 1003.1-2001.)

       The historical -c option was omitted  as  not  particularly  useful  to
       users  or portable shell scripts. In addition, a reasonable implementa‐
       tion of the file utility would report any errors found  each  time  the
       magic file is read.

       The  historical format of the magic file was the same as that specified
       by the Rationale in  the  ISO POSIX-2:1993  standard  for  the  offset,
       value,  and  message  fields; however, it used less precise type fields
       than the format specified by the current normative text. The  new  type
       field values are a superset of the historical ones.

       The following is an example magic file:

              0  short     070707              cpio archive
              0  short     0143561             Byte-swapped cpio archive
              0  string    070707              ASCII cpio archive
              0  long      0177555             Very old archive
              0  short     0177545             Old archive
              0  short     017437              Old packed data
              0  string    \037\036            Packed data
              0  string    \377\037            Compacted data
              0  string    \037\235            Compressed data
              >2 byte&0x80 >0                  Block compressed
              >2 byte&0x1f x                   %d bits
              0  string    \032\001            Compiled Terminfo Entry
              0  short     0433                Curses screen image
              0  short     0434                Curses screen image
              0  string    <ar>                System V Release 1 archive
              0  string    !<arch>\n__.SYMDEF  Archive random library
              0  string    !<arch>             Archive
              0  string    ARF_BEGARF          PHIGS clear text archive
              0  long      0x137A2950          Scalable OpenFont binary
              0  long      0x137A2951          Encrypted scalable OpenFont binary

       The use of a basic integer data type is intended to allow the implemen‐
       tation to choose a word size commonly  used  by  applications  on  that
       architecture.

FUTURE DIRECTIONS

       None.

SEE ALSO

       ar, ls, pax

COPYRIGHT

       Portions  of  this text are reprinted and reproduced in electronic form
       from IEEE Std 1003.1, 2003 Edition, Standard for Information Technology
       --  Portable  Operating  System  Interface (POSIX), The Open Group Base
       Specifications Issue 6, Copyright (C) 2001-2003  by  the  Institute  of
       Electrical  and  Electronics  Engineers, Inc and The Open Group. In the
       event of any discrepancy between this version and the original IEEE and
       The  Open Group Standard, the original IEEE and The Open Group Standard
       is the referee document. The original Standard can be  obtained  online
       at http://www.opengroup.org/unix/online.html .



IEEE/The Open Group                  2003                              FILE(P)