FILE(1)                   BSD General Commands Manual                  FILE(1)

NAME

     file - determine file type

SYNOPSIS

     file [-bchiklLNnprsvz0] [--apple] [--mime-encoding] [--mime-type]
          [-e testname] [-F separator] [-f namefile] [-m magicfiles] file ...
     file -C [-m magicfiles]
     file [--help]

DESCRIPTION

     This manual page documents version 5.11 of the file command.

     file tests each argument in an attempt to classify it.  There are three
     sets of tests, performed in this order: filesystem tests, magic tests,
     and language tests.  The first test that succeeds causes the file type to
     be printed.

     The type printed will usually contain one of the words text (the file
     contains only printing characters and a few common control characters and
     is probably safe to read on an ASCII terminal), executable (the file con‐
     tains the result of compiling a program in a form understandable to some
     UNIX kernel or another), or data meaning anything else (data is usually
     “binary” or non-printable).  Exceptions are well-known file formats (core
     files, tar archives) that are known to contain binary data.  When modify‐
     ing magic files or the program itself, make sure to preserve these
     keywords.  Users depend on knowing that all the readable files in a
     directory have the word “text” printed.  Don't do as Berkeley did and
     change “shell commands text” to “shell script”.

     The filesystem tests are based on examining the return from a stat(2)
     system call.  The program checks to see if the file is empty, or if it's
     some sort of special file.  Any known file types appropriate to the sys‐
     tem you are running on (sockets, symbolic links, or named pipes (FIFOs)
     on those systems that implement them) are intuited if they are defined in
     the system header file <sys/stat.h>.
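
     The filesystem tests can be sketched in Python as follows.  This is an
     illustration, not file's own C code: the function name filesystem_test
     and the strings it returns are assumptions made for the example.  Only
     the idea of classifying a file from its stat(2) mode and size comes from
     the description above.

```python
import os
import stat


def filesystem_test(path):
    """Classify path from lstat() data alone.

    Returns a short description, or None when the file is a non-empty
    regular file and the later tests would have to read its contents.
    """
    st = os.lstat(path)  # lstat: describe a symlink itself, do not follow it
    mode = st.st_mode
    if stat.S_ISLNK(mode):
        return "symbolic link to " + os.readlink(path)
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISFIFO(mode):
        return "fifo (named pipe)"
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISBLK(mode):
        return "block special"
    if stat.S_ISCHR(mode):
        return "character special"
    if stat.S_ISREG(mode) and st.st_size == 0:
        return "empty"
    return None
```

     A result of None corresponds to the case where the filesystem tests do
     not succeed and the magic tests are tried next.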

     The magic tests are used to check for files with data in particular fixed
     formats.  The canonical example of this is a binary executable (compiled
     program) a.out file, whose format is defined in <elf.h>, <a.out.h> and
     possibly <exec.h> in the standard include directory.  These files have a
     “magic number” stored in a particular place near the beginning of the
     file that tells the UNIX operating system that the file is a binary exe‐
     cutable, and which of several types thereof.  The concept of a “magic”
     has been applied by extension to data files.  Any file with some invari‐
     ant identifier at a small fixed offset into the file can usually be
     described in this way.  The information identifying these files is read
     from the compiled magic file /usr/share/misc/magic.mgc, or the files in
     the directory /usr/share/misc/magic if the compiled file does not exist.
     In addition, if $HOME/.magic.mgc or $HOME/.magic exists, it will be used
     in preference to the system magic files.
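
     The idea of an invariant identifier at a small fixed offset can be shown
     with a short Python sketch.  The table MAGIC and the function magic_test
     are hypothetical and stand in for the real magic files, which use the
     much richer syntax described in magic(5).  The four signatures listed are
     well-known ones chosen for illustration.

```python
# Hypothetical miniature magic table: (offset, expected bytes, description).
MAGIC = [
    (0, b"\x7fELF", "ELF"),
    (0, b"\x1f\x8b", "gzip compressed data"),
    (0, b"%PDF-", "PDF document"),
    (257, b"ustar", "POSIX tar archive"),
]


def magic_test(data):
    """Return the description of the first entry whose bytes match."""
    for offset, expected, description in MAGIC:
        # Compare the bytes found at the fixed offset with the identifier.
        if data[offset:offset + len(expected)] == expected:
            return description
    return None  # no entry matched; fall through to the text tests
```

     As in the real magic files, the entries are tried in order, so the order
     of the table matters when more than one entry could match.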

     If a file does not match any of the entries in the magic file, it is
     examined to see if it seems to be a text file.  ASCII, ISO-8859-x, non-
     ISO 8-bit extended-ASCII character sets (such as those used on Macintosh
     and IBM PC systems), UTF-8-encoded Unicode, UTF-16-encoded Unicode, and
     EBCDIC character sets can be distinguished by the different ranges and
     sequences of bytes that constitute printable text in each set.  If a file
     passes any of these tests, its character set is reported.  ASCII,
     ISO-8859-x, UTF-8, and extended-ASCII files are identified as “text”
     because they will be mostly readable on nearly any terminal; UTF-16 and
     EBCDIC are only “character data” because, while they contain text, it is
     text that will require translation before it can be read.  In addition,
     file will attempt to determine other characteristics of text-type files.
     If the lines of a file are terminated by CR, CRLF, or NEL, instead of the
     Unix-standard LF, this will be reported.  Files that contain embedded
     escape sequences or overstriking will also be identified.
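
     A simplified Python sketch of the character set and line terminator
     checks is given below.  It is an assumption-laden illustration: the
     function name text_test, the set of permitted control characters, and
     the exact strings returned are chosen for the example, and EBCDIC, NEL,
     escape sequence and overstriking detection are left out.

```python
# Control characters tolerated in text: tab, LF, CR, FF, BS, ESC (assumed set).
TEXT_CONTROLS = b"\t\n\r\f\x08\x1b"


def text_test(data):
    """Guess a character set and line terminator for a bytes object."""
    # A UTF-16 byte order mark marks "character data", not plain text.
    if data.startswith((b"\xff\xfe", b"\xfe\xff")):
        return "UTF-16 Unicode character data"
    # Any other control byte (or DEL) means the file is not text.
    if any((b < 0x20 and b not in TEXT_CONTROLS) or b == 0x7F for b in data):
        return "data"
    if all(b < 0x80 for b in data):
        charset = "ASCII"
    else:
        try:
            data.decode("utf-8")
            charset = "UTF-8 Unicode"
        except UnicodeDecodeError:
            # Bytes 0x80-0x9f are not printable in the ISO-8859 sets.
            has_c1 = any(0x80 <= b <= 0x9F for b in data)
            charset = "Non-ISO extended-ASCII" if has_c1 else "ISO-8859"
    result = charset + " text"
    if b"\r\n" in data:
        result += ", with CRLF line terminators"
    elif b"\r" in data:
        result += ", with CR line terminators"
    return result
```

     The checks run from the most restrictive character set to the least,
     which mirrors the way the byte ranges of each set are used to tell them
     apart.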

     Once file has determined the character set used in a text-type file, it
     will attempt to determine in what language the file is written.  The lan‐
     guage tests look for particular strings (cf.  <names.h>) that can appear
     anywhere in the first few blocks of a file.  For example, the keyword .br
     indicates that the file is most likely a troff(1) input file, just as the
     keyword struct indicates a C program.  These tests are less reliable than
     the previous two groups, so they are performed last.  The language test
     routines also test for some miscellany (such as tar(1) archives).
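
     The keyword search can be sketched in Python as follows.  The table
     KEYWORDS, the function language_test, the 512-byte block size and the
     limit of four blocks are all assumptions made for the example; the real
     list of strings lives in <names.h>.

```python
# Hypothetical keyword table, using the two examples from the text above.
KEYWORDS = [
    (b".br", "troff input text"),
    (b"struct", "C program text"),
]


def language_test(data, block_size=512, blocks=4):
    """Look for a known keyword anywhere in the first few blocks."""
    head = data[:block_size * blocks]
    for keyword, description in KEYWORDS:
        if keyword in head:
            return description
    return None
```

     Because a keyword such as struct can also occur in ordinary prose, a
     match is only a hint, which is one reason these tests are less reliable
     than the others.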

     Any file that cannot be identified as having been written in any of the
     character sets listed above is simply said to be “data”.

OPTIONS

     -b, --brief
             Do not prepend filenames to output lines (brief mode).

     -C, --compile
             Write a magic.mgc output file that contains a pre-parsed version
             of the magic file or directory.

     -c, --checking-printout
             Cause a checking printout of the parsed form of the magic file.
             This is usually used in conjunction with the -m flag to debug a
             new magic file before installing it.

     -e, --exclude testname
             Exclude the test named in testname from the list of tests made to
             determine the file type.  Valid test names are:

             apptype   EMX application type (only on EMX).

             ascii     Various types of text files (this test will try to
                       guess the text encoding, irrespective of the setting of
                       the ‘encoding’ option).

             encoding  Different text encodings for soft magic tests.

             tokens    Ignored for backwards compatibility.

             cdf       Prints details of Compound Document Files.

             compress  Checks for, and looks inside, compressed files.

             elf       Prints ELF file details.

             soft      Consults magic files.

             tar       Examines tar files.

     -F, --separator separator
             Use the specified string as the separator between the filename
             and the file result returned.  Defaults to ‘:’.

     -f, --files-from namefile
             Read the names of the files to be examined from namefile (one per
             line) before the argument list.  Either namefile or at least one
             filename argument must be present; to test the standard input,
             use ‘-’ as a filename argument.  Please note that namefile is
             unwrapped and the enclosed filenames are processed when this
             option is encountered and before any further options processing
             is done.  This allows one to process multiple lists of files with
             different command line arguments on the same file invocation.
             Thus if you want to set the delimiter, you need to do it before
             you specify the list of files, like: “-F @ -f namefile”, instead
             of: “-f namefile -F @”.

     -h, --no-dereference
             Do not follow symbolic links (on systems that support them).
             This is the default if the environment variable POSIXLY_CORRECT
             is not defined.

     -i, --mime
             Causes the file command to output mime type strings rather than
             the more traditional human readable ones.  Thus it may say
             ‘text/plain; charset=us-ascii’ rather than “ASCII text”.

     --mime-type, --mime-encoding
             Like -i, but print only the specified element(s).

     -k, --keep-going
             Don't stop at the first match, keep going.  Subsequent matches
             will have the string ‘\012- ’ prepended.  (If you want a newline,
             see the -r option.)

     -l, --list
             Print the sorted list of magic patterns, in the order used for
             matching, together with the strength of each pattern.

     -L, --dereference
             Follow symbolic links, like the like-named option in ls(1) (on
             systems that support symbolic links).  This is the default if
             the environment variable POSIXLY_CORRECT is defined.

     -m, --magic-file magicfiles
             Specify an alternate list of files and directories containing
             magic.  This can be a single item, or a colon-separated list.  If
             a compiled magic file is found alongside a file or directory, it
             will be used instead.

     -N, --no-pad
             Don't pad filenames so that they align in the output.

     -n, --no-buffer
             Force stdout to be flushed after checking each file.  This is
             only useful if checking a list of files.  It is intended to be
             used by programs that want filetype output from a pipe.

     -p, --preserve-date
             On systems that support utime(3) or utimes(2), attempt to pre‐
             serve the access time of files analyzed, to pretend that file
             never read them.

     -r, --raw
             Don't translate unprintable characters to \ooo.  Normally file
             translates unprintable characters to their octal representation.

     -s, --special-files
             Normally, file only attempts to read and determine the type of
             argument files which stat(2) reports are ordinary files.  This
             prevents problems, because reading special files may have pecu‐
             liar consequences.  Specifying the -s option causes file to also
             read argument files which are block or character special files.
             This is useful for determining the filesystem types of the data
             in raw disk partitions, which are block special files.  This
             option also causes file to disregard the file size as reported by
             stat(2) since on some systems it reports a zero size for raw disk
             partitions.

     -v, --version
             Print the version of the program and exit.

     -z, --uncompress
             Try to look inside compressed files.

     -0, --print0
             Output a null character ‘\0’ after the end of the filename.  This
             is convenient when processing the output with cut(1).  It does
             not affect the separator, which is still printed.

     --help  Print a help message and exit.
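
     A program that reads the output of file can make use of the -0 and -k
     conventions described above.  The Python sketch below assumes, from the
     option descriptions alone, that each record written with -0 consists of
     the filename, a null byte, the separator, optional padding, the
     description and a newline, and that -k joins matches with the literal
     text ‘\012- ’.  The names parse_print0 and split_matches are
     hypothetical, and the sample data in the usage note is made up.

```python
def parse_print0(output, separator=b":"):
    """Split the bytes written by `file -0` into (name, description) pairs."""
    records = []
    pos = 0
    while pos < len(output):
        nul = output.index(b"\0", pos)   # the filename ends at the null byte
        end = output.find(b"\n", nul)    # the description ends at the newline
        if end == -1:
            end = len(output)
        name = output[pos:nul]
        rest = output[nul + 1:end]
        if rest.startswith(separator):   # the separator is still printed
            rest = rest[len(separator):]
        records.append((name, rest.strip()))
        pos = end + 1
    return records


def split_matches(description):
    """Split one `file -k` description into its individual matches."""
    return description.split(b"\\012- ")
```

     Reading up to the null byte first means that a filename containing a
     newline, such as the second one in
     b"a b.txt\0: ASCII text\nodd\nname\0:  data\n", is still recovered
     intact.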

FILES

     /usr/share/misc/magic.mgc  Default compiled list of magic.
     /usr/share/misc/magic      Directory containing default magic files.

ENVIRONMENT

     The environment variable MAGIC can be used to set the default magic file
     name.  If that variable is set, then file will not attempt to open
     $HOME/.magic.  file adds “.mgc” to the value of this variable as appro‐
     priate.  However, the named magic file has to exist in order for its
     “.mime” counterpart to be considered.  The environment variable
     POSIXLY_CORRECT controls (on systems that support symbolic links) whether
     file will attempt to follow symlinks or not.  If set, then file follows
     symlinks, otherwise it does not.  This is also controlled by the -L and
     -h options.

SEE ALSO

     magic(5), hexdump(1), od(1), strings(1)

STANDARDS CONFORMANCE

     This program is believed to exceed the System V Interface Definition of
     FILE(CMD), as near as one can determine from the vague language contained
     therein.  Its behavior is mostly compatible with the System V program of
     the same name.  This version knows more magic, however, so it will pro‐
     duce different (albeit more accurate) output in many cases.

     The one significant difference between this version and System V is that
     this version treats any white space as a delimiter, so that spaces in
     pattern strings must be escaped.  For example,

           >10     string  language impress        (imPRESS data)

     in an existing magic file would have to be changed to

           >10     string  language\ impress       (imPRESS data)

     In addition, in this version, if a pattern string contains a backslash,
     it must be escaped.  For example

           0       string          \begindata      Andrew Toolkit document

     in an existing magic file would have to be changed to

           0       string          \\begindata     Andrew Toolkit document

     SunOS releases 3.2 and later from Sun Microsystems include a file command
     derived from the System V one, but with some extensions.  This version
     differs from Sun's only in minor ways.  It includes the extension of the
     ‘&’ operator, used as, for example,

           >16     long&0x7fffffff >0              not stripped

MAGIC DIRECTORY

     The magic file entries have been collected from various sources, mainly
     USENET, and contributed by various authors.  Christos Zoulas (address
     below) will collect additional or corrected magic file entries.  A con‐
     solidation of magic file entries will be distributed periodically.

     The order of entries in the magic file is significant.  Depending on what
     system you are using, the order that they are put together may be incor‐
     rect.  If your old file command uses a magic file, keep the old magic
     file around for comparison purposes (rename it to
     /usr/share/misc/magic.orig).

EXAMPLES

           $ file file.c file /dev/{wd0a,hda}
           file.c:   C program text
           file:     ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV),
                     dynamically linked (uses shared libs), stripped
           /dev/wd0a: block special (0/0)
           /dev/hda: block special (3/0)

           $ file -s /dev/wd0{b,d}
           /dev/wd0b: data
           /dev/wd0d: x86 boot sector

           $ file -s /dev/hda{,1,2,3,4,5,6,7,8,9,10}
           /dev/hda:   x86 boot sector
           /dev/hda1:  Linux/i386 ext2 filesystem
           /dev/hda2:  x86 boot sector
           /dev/hda3:  x86 boot sector, extended partition table
           /dev/hda4:  Linux/i386 ext2 filesystem
           /dev/hda5:  Linux/i386 swap file
           /dev/hda6:  Linux/i386 swap file
           /dev/hda7:  Linux/i386 swap file
           /dev/hda8:  Linux/i386 swap file
           /dev/hda9:  empty
           /dev/hda10: empty

           $ file -i file.c file /dev/{wd0a,hda}
           file.c:      text/x-c
           file:        application/x-executable
           /dev/hda:    application/x-not-regular-file
           /dev/wd0a:   application/x-not-regular-file

HISTORY

     There has been a file command in every UNIX since at least Research
     Version 4 (man page dated November, 1973).  The System V version intro‐
     duced one significant change: the external list of magic types.  This
     slowed the program down slightly but made it a lot more flexible.

     This program, based on the System V version, was written by Ian Darwin
     ⟨ian@darwinsys.com⟩ without looking at anybody else's source code.

     John Gilmore revised the code extensively, making it better than the
     first version.  Geoff Collyer found several inadequacies and provided
     some magic file entries.  Contribution of the ‘&’ operator by Rob McMa‐
     hon, ⟨cudcv@warwick.ac.uk⟩, 1989.

     Guy Harris, ⟨guy@netapp.com⟩, made many changes from 1993 to the present.

     Primary development and maintenance from 1990 to the present by Christos
     Zoulas ⟨christos@astron.com⟩.

     Altered by Chris Lowth ⟨chris@lowth.com⟩, 2000: handle the -i option to
     output mime type strings, using an alternative magic file and internal
     logic.

     Altered by Eric Fischer ⟨enf@pobox.com⟩, July, 2000, to identify charac‐
     ter codes and attempt to identify the languages of non-ASCII files.

     Altered by Reuben Thomas ⟨rrt@sc3d.org⟩, 2007-2011, to improve MIME sup‐
     port, merge MIME and non-MIME magic, support directories as well as files
     of magic, apply many bug fixes, update and fix a lot of magic, improve
     the build system, improve the documentation, and rewrite the Python bind‐
     ings in pure Python.

     The list of contributors to the ‘magic’ directory (magic files) is too
     long to include here.  You know who you are; thank you.  Many contribu‐
     tors are listed in the source files.

     Copyright (c) Ian F. Darwin, Toronto, Canada, 1986-1999.  Covered by the
     standard Berkeley Software Distribution copyright; see the file COPYING
     in the source distribution.

     The files tar.h and is_tar.c were written by John Gilmore from his pub‐
     lic-domain tar(1) program, and are not covered by the above license.
RETURN CODE

     file returns 0 on success, and non-zero on error.

     If the file named by the file operand does not exist, cannot be read, or
     the type of the file named by the file operand cannot be determined, this
     is not considered an error that affects the exit status.

BUGS

     Please report bugs and send patches to the bug tracker at
     http://bugs.gw.com/ or the mailing list at ⟨file@mx.gw.com⟩.

TODO

     Fix output so that tests for MIME and APPLE flags are not needed all over
     the place, and actual output is only done in one place. This needs a
     design. Suggestion: push possible outputs on to a list, then pick the
     last-pushed (most specific, one hopes) value at the end, or use a default
     if the list is empty. This should not slow down evaluation.

     Continue to squash all magic bugs. See Debian BTS for a good source.

     Store arbitrarily long strings, for example for %s patterns, so that they
     can be printed out. Fixes Debian bug #271672. Would require more complex
     store/load code in apprentice.

     Add syntax for relative offsets after current level (Debian bug #466037).

     Make file -ki work, i.e. give multiple MIME types.

     Add a zip library so we can peek inside Office2007 documents to figure
     out what they are.

     Add an option to print URLs for the sources of the file descriptions.

AVAILABILITY

     You can obtain the original author's latest version by anonymous FTP on
     ftp.astron.com in the directory /pub/file/file-X.YZ.tar.gz.

BSD                            October 17, 2011                            BSD