FILE(1P)                   POSIX Programmer's Manual                  FILE(1P)
2
3
4
PROLOG
6 This manual page is part of the POSIX Programmer's Manual. The Linux
7 implementation of this interface may differ (consult the corresponding
8 Linux manual page for details of Linux behavior), or the interface may
9 not be implemented on Linux.
10
NAME
12 file - determine file type
13
SYNOPSIS
15 file [-dh][-M file][-m file] file ...
16
17 file -i [-h] file ...
18
19
DESCRIPTION
21 The file utility shall perform a series of tests in sequence on each
22 specified file in an attempt to classify it:
23
24 1. If file does not exist, cannot be read, or its file status could
25 not be determined, the output shall indicate that the file was pro‐
26 cessed, but that its type could not be determined.
27
28 2. If the file is not a regular file, its file type shall be identi‐
29 fied. The file types directory, FIFO, socket, block special, and
30 character special shall be identified as such. Other implementa‐
31 tion-defined file types may also be identified. If file is a sym‐
32 bolic link, by default the link shall be resolved and file shall
33 test the type of file referenced by the symbolic link. (See the -h
34 and -i options below.)
35
36 3. If the length of file is zero, it shall be identified as an empty
37 file.
38
39 4. The file utility shall examine an initial segment of file and shall
40 make a guess at identifying its contents based on position-sensi‐
41 tive tests. (The answer is not guaranteed to be correct; see the
42 -d, -M, and -m options below.)
43
44 5. The file utility shall examine file and make a guess at identifying
45 its contents based on context-sensitive default system tests. (The
46 answer is not guaranteed to be correct.)
47
48 6. The file shall be identified as a data file.
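
The file-type part of this sequence (steps 1 to 3, with step 6 as the fallback) can be approximated with the test utility. This is only a sketch: the function name classify is hypothetical, the content-based guesses of steps 4 and 5 are not attempted, and the printed strings are borrowed from the output table of this page.

```shell
# classify PATH: print a rough file type for PATH (hypothetical helper).
classify() {
    if [ -h "$1" ] && [ ! -e "$1" ]; then
        echo "symbolic link"        # dangling link, reported as if -h were given
    elif [ ! -e "$1" ] || [ ! -r "$1" ]; then
        echo "cannot open"          # step 1: missing or unreadable
    elif [ -d "$1" ]; then
        echo "directory"            # step 2: not a regular file
    elif [ -p "$1" ]; then
        echo "fifo"
    elif [ -S "$1" ]; then
        echo "socket"
    elif [ -b "$1" ]; then
        echo "block special"
    elif [ -c "$1" ]; then
        echo "character special"
    elif [ ! -s "$1" ]; then
        echo "empty"                # step 3: zero length
    else
        echo "data"                 # step 6; steps 4 and 5 are skipped here
    fi
}
```

Because the test operators other than -h follow symbolic links, a link to an existing file is classified by its target, which mirrors the default behavior described in step 2.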
49
50 If file does not exist, cannot be read, or its file status could not be
51 determined, the output shall indicate that the file was processed, but
52 that its type could not be determined.
53
54 If file is a symbolic link, by default the link shall be resolved and
55 file shall test the type of file referenced by the symbolic link.
56
OPTIONS
58 The file utility shall conform to the Base Definitions volume of
59 IEEE Std 1003.1-2001, Section 12.2, Utility Syntax Guidelines, except
60 that the order of the -m, -d, and -M options shall be significant.
61
62 The following options shall be supported by the implementation:
63
64 -d Apply any position-sensitive default system tests and context-
65 sensitive default system tests to the file. This is the default
66 if no -M or -m option is specified.
67
68 -h When a symbolic link is encountered, identify the file as a sym‐
69 bolic link. If -h is not specified and file is a symbolic link
70 that refers to a nonexistent file, file shall identify the file
71 as a symbolic link, as if -h had been specified.
72
73 -i If a file is a regular file, do not attempt to classify the type
74 of the file further, but identify the file as specified in the
75 STDOUT section.
76
77 -M file
78 Specify the name of a file containing position-sensitive tests
79 that shall be applied to a file in order to classify it (see the
80 EXTENDED DESCRIPTION). No position-sensitive default system
81 tests nor context-sensitive default system tests shall be
82 applied unless the -d option is also specified.
83
84 -m file
85 Specify the name of a file containing position-sensitive tests
86 that shall be applied to a file in order to classify it (see the
87 EXTENDED DESCRIPTION).
88
89
90 If the -m option is specified without specifying the -d option or the
91 -M option, position-sensitive default system tests shall be applied
92 after the position-sensitive tests specified by the -m option. If the
93 -M option is specified with the -d option, the -m option, or both, or
94 the -m option is specified with the -d option, the concatenation of the
95 position-sensitive tests specified by these options shall be applied in
96 the order specified by the appearance of these options. If a -M or -m
97 file option-argument is -, the results are unspecified.
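
The ordering rules above can be modeled with a small shell function. This is a sketch of one reading of the text, not part of any implementation: the name position_test_order and the word "default" (standing for the position-sensitive default system tests) are hypothetical, and repeated -m options are assumed to be applied in order of appearance.

```shell
# position_test_order [-d] [-m file] [-M file] ...
# Print the order in which position-sensitive test sources would be applied.
position_test_order() {
    order="" seen_d=no seen_M=no
    while [ $# -gt 0 ]; do
        case $1 in
            -d) order="${order:+$order }default"; seen_d=yes; shift ;;
            -m) order="${order:+$order }$2"; shift 2 ;;
            -M) order="${order:+$order }$2"; seen_M=yes; shift 2 ;;
            *)  break ;;
        esac
    done
    # With neither -d nor -M, the default tests follow any -m tests.
    if [ "$seen_d" = no ] && [ "$seen_M" = no ]; then
        order="${order:+$order }default"
    fi
    printf '%s\n' "$order"
}

position_test_order -m my.magic        # my.magic default
position_test_order -d -m my.magic     # default my.magic
position_test_order -M my.magic        # my.magic
```

The model covers only position-sensitive tests; context-sensitive default tests are outside its scope.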
98
OPERANDS
100 The following operand shall be supported:
101
102 file A pathname of a file to be tested.
103
104
STDIN
106 Not used.
107
INPUT FILES
109 The file can be any file type.
110
ENVIRONMENT VARIABLES
112 The following environment variables shall affect the execution of file:
113
114 LANG Provide a default value for the internationalization variables
115 that are unset or null. (See the Base Definitions volume of
116 IEEE Std 1003.1-2001, Section 8.2, Internationalization Vari‐
117 ables for the precedence of internationalization variables used
118 to determine the values of locale categories.)
119
120 LC_ALL If set to a non-empty string value, override the values of all
121 the other internationalization variables.
122
123 LC_CTYPE
124 Determine the locale for the interpretation of sequences of
125 bytes of text data as characters (for example, single-byte as
126 opposed to multi-byte characters in arguments and input files).
127
128 LC_MESSAGES
129 Determine the locale that should be used to affect the format
130 and contents of diagnostic messages written to standard error
131 and informative messages written to standard output.
132
NLSPATH
Determine the location of message catalogs for the processing of
LC_MESSAGES.
136
137
ASYNCHRONOUS EVENTS
139 Default.
140
STDOUT
142 In the POSIX locale, the following format shall be used to identify
143 each operand, file specified:
144
145
    "%s: %s\n", <file>, <type>
147
148 The values for <type> are unspecified, except that in the POSIX locale,
149 if file is identified as one of the types listed in the following ta‐
150 ble, <type> shall contain (but is not limited to) the corresponding
151 string, unless the file is identified by a position-sensitive test
152 specified by a -M or -m option. Each space shown in the strings shall
153 be exactly one <space>.
154
Table: File Utility Output Strings

If file is:                               <type> shall contain the string:   Notes
Nonexistent                               cannot open
Block special                             block special                      1
Character special                         character special                  1
Directory                                 directory                          1
FIFO                                      fifo                               1
Socket                                    socket                             1
Symbolic link                             symbolic link to                   1
Regular file                              regular file                       1,2
Empty regular file                        empty                              3
Regular file that cannot be read          cannot open                        3
Executable binary                         executable                         4,6
ar archive library (see ar)               archive                            4,6
Extended cpio format (see pax)            cpio archive                       4,6
Extended tar format (see ustar in pax)    tar archive                        4,6
Shell script                              commands text                      5,6
C-language source                         c program text                     5,6
FORTRAN source                            fortran program text               5,6
Regular file whose type cannot be         data
determined
178
179 Notes:
180
181 1. This is a file type test.
182
183 2. This test is applied only if the -i option is specified.
184
185 3. This test is applied only if the -i option is not specified.
186
187 4. This is a position-sensitive default system test.
188
189 5. This is a context-sensitive default system test.
190
191 6. Position-sensitive default system tests and context-sensi‐
192 tive default system tests are not applied if the -M option
193 is specified unless the -d option is also specified.
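
Because <type> is only required to contain the listed string, a script should test for containment rather than equality. A minimal sketch follows, assuming the file name itself does not contain ": "; the helper names type_of and type_has and the sample line are made up for illustration.

```shell
# type_of LINE: print the <type> part of one "<file>: <type>" output line.
type_of() {
    printf '%s\n' "${1#*: }"
}

# type_has LINE STRING: succeed if <type> contains STRING.
type_has() {
    case $(type_of "$1") in
        *"$2"*) return 0 ;;
        *)      return 1 ;;
    esac
}

line='prog.c: c program text, ASCII'     # made-up sample output line
type_has "$line" 'c program text' && echo "looks like C source"
```

Matching on the whole line instead of the <type> part would also match file names that happen to contain the string, which is why the name is stripped first.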
194
195 In the POSIX locale, if file is identified as a symbolic link (see the
196 -h option), the following alternative output format shall be used:
197
198
    "%s: %s %s\n", <file>, <type>, <contents of link>
200
201 If the file named by the file operand does not exist, cannot be read,
202 or the type of the file named by the file operand cannot be determined,
203 this shall not be considered an error that affects the exit status.
204
STDERR
206 The standard error shall be used only for diagnostic messages.
207
OUTPUT FILES
209 None.
210
EXTENDED DESCRIPTION
212 A file specified as an option-argument to the -m or -M options shall
213 contain one position-sensitive test per line, which shall be applied to
214 the file. If the test succeeds, the message field of the line shall be
215 printed and no further tests shall be applied, with the exception that
216 tests on immediately following lines beginning with a single '>' char‐
217 acter shall be applied.
218
219 Each line shall be composed of the following four <blank>-separated
220 fields:
221
222 offset An unsigned number (optionally preceded by a single '>' charac‐
223 ter) specifying the offset, in bytes, of the value in the file
224 that is to be compared against the value field of the line. If
225 the file is shorter than the specified offset, the test shall
226 fail.
227
228 If the offset begins with the character '>', the test contained in the
229 line shall not be applied to the file unless the test on the last line
230 for which the offset did not begin with a '>' was successful. By
231 default, the offset shall be interpreted as an unsigned decimal number.
232 With a leading 0x or 0X, the offset shall be interpreted as a hexadeci‐
233 mal number; otherwise, with a leading 0, the offset shall be inter‐
234 preted as an octal number.
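
Shell arithmetic expansion uses the same C-style prefixes (0x or 0X for hexadecimal, a leading 0 for octal), so the offset rules can be illustrated directly. The function name offset_value is hypothetical; this is an illustration of the number syntax, not of how any file implementation parses the field.

```shell
# offset_value FIELD: print the byte offset encoded by an offset field.
offset_value() {
    digits=${1#'>'}          # drop the optional leading '>'
    echo $(( $digits ))      # the shell applies the 0x / 0 prefix rules
}

offset_value 512       # 512
offset_value 0x200     # 512
offset_value 01000     # 512
offset_value '>2'      # 2
```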
235
236 type The type of the value in the file to be tested. The type shall
237 consist of the type specification characters c, d, f, s, and u,
238 specifying character, signed decimal, floating point, string,
239 and unsigned decimal, respectively.
240
241 The type string shall be interpreted as the bytes from the file start‐
242 ing at the specified offset and including the same number of bytes
243 specified by the value field. If insufficient bytes remain in the file
244 past the offset to match the value field, the test shall fail.
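
A string test of this kind can be sketched with od, comparing hexadecimal dumps so that arbitrary bytes survive the shell. The helper names hex_at and string_test are hypothetical, and backslash escapes in the value are not interpreted in this sketch.

```shell
# hex_at FILE OFFSET COUNT: hex digits of COUNT bytes starting at OFFSET.
hex_at() {
    od -An -v -tx1 -j "$2" -N "$3" "$1" 2>/dev/null | tr -d ' \n'
}

# string_test FILE OFFSET VALUE: succeed if the bytes at OFFSET equal VALUE.
string_test() {
    want=$(printf '%s' "$3" | od -An -v -tx1 | tr -d ' \n')
    count=$(( ${#want} / 2 ))        # number of bytes in VALUE
    [ "$(hex_at "$1" "$2" "$count")" = "$want" ]
}

sample=$(mktemp)
printf '!<arch>\nmember' > "$sample"
string_test "$sample" 0 '!<arch>' && echo "Archive"
string_test "$sample" 12 '!<arch>' || echo "too few bytes remain: test fails"
rm -f "$sample"
```

When fewer bytes remain than the value needs, od returns a shorter dump and the comparison fails, which matches the rule stated above.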
245
246 The type specification characters d, f, and u can be followed by an
247 optional unsigned decimal integer that specifies the number of bytes
248 represented by the type. The type specification character f can be
249 followed by an optional F, D, or L, indicating that the value is of
250 type float, double, or long double, respectively. The type specifica‐
251 tion characters d and u can be followed by an optional C, S, I, or L,
252 indicating that the value is of type char, short, int, or long, respec‐
253 tively.
254
255 The default number of bytes represented by the type specifiers d, f,
256 and u shall correspond to their respective C-language types as follows.
257 If the system claims conformance to the C-Language Development Utili‐
258 ties option, those specifiers shall correspond to the default sizes
259 used in the c99 utility. Otherwise, the default sizes shall be imple‐
260 mentation-defined.
261
262 For the type specifier characters d and u, the default number of bytes
263 shall correspond to the size of a basic integer type of the implementa‐
264 tion. For these specifier characters, the implementation shall support
265 values of the optional number of bytes to be converted corresponding to
266 the number of bytes in the C-language types char, short, int, or long.
267 These numbers can also be specified by an application as the characters
268 C, S, I, and L, respectively. The byte order used when interpreting
269 numeric values is implementation-defined, but shall correspond to the
270 order in which a constant of the corresponding type is stored in memory
271 on the system.
272
273 For the type specifier f, the default number of bytes shall correspond
274 to the number of bytes in the basic double precision floating-point
275 data type of the underlying implementation. The implementation shall
276 support values of the optional number of bytes to be converted corre‐
277 sponding to the number of bytes in the C-language types float, double,
278 and long double. These numbers can also be specified by an application
279 as the characters F, D, and L, respectively.
280
281 All type specifiers, except for s, can be followed by a mask specifier
282 of the form &number. The mask value shall be AND'ed with the value of
283 the input file before the comparison with the value field of the line
284 is made. By default, the mask shall be interpreted as an unsigned deci‐
285 mal number. With a leading 0x or 0X, the mask shall be interpreted as
286 an unsigned hexadecimal number; otherwise, with a leading 0, the mask
287 shall be interpreted as an unsigned octal number.
288
289 The strings byte, short, long, and string shall also be supported as
290 type fields, being interpreted as dC, dS, dL, and s, respectively.
291
292 value The value to be compared with the value from the file.
293
294 If the specifier from the type field is s or string, then interpret the
295 value as a string. Otherwise, interpret it as a number. If the value is
296 a string, then the test shall succeed only when a string value exactly
297 matches the bytes from the file.
298
299 If the value is a string, it can contain the following sequences:
300
301 \character
302 The backslash-escape sequences as specified in the Base Defini‐
303 tions volume of IEEE Std 1003.1-2001, Table 5-1, Escape
304 Sequences and Associated Actions ( '\\', '\a', '\b', '\f', '\n',
305 '\r', '\t', '\v' ). The results of using any other character,
306 other than an octal digit, following the backslash are unspeci‐
307 fied.
308
309 \octal
310 Octal sequences that can be used to represent characters with
311 specific coded values. An octal sequence shall consist of a
312 backslash followed by the longest sequence of one, two, or three
313 octal-digit characters (01234567). If the size of a byte on the
314 system is greater than 9 bits, the valid escape sequence used to
315 represent a byte is implementation-defined.
316
317
318 By default, any value that is not a string shall be interpreted as a
319 signed decimal number. Any such value, with a leading 0x or 0X, shall
320 be interpreted as an unsigned hexadecimal number; otherwise, with a
321 leading zero, the value shall be interpreted as an unsigned octal num‐
322 ber.
323
324 If the value is not a string, it can be preceded by a character indi‐
325 cating the comparison to be performed. Permissible characters and the
326 comparisons they specify are as follows:
327
328 =
329 The test shall succeed if the value from the file equals the
330 value field.
331
332 <
333 The test shall succeed if the value from the file is less than
334 the value field.
335
336 >
337 The test shall succeed if the value from the file is greater
338 than the value field.
339
340 &
341 The test shall succeed if all of the set bits in the value field
342 are set in the value from the file.
343
344 ^
345 The test shall succeed if at least one of the set bits in the
346 value field is not set in the value from the file.
347
348 x
349 The test shall succeed if the file is large enough to contain a
350 value of the type specified starting at the offset specified.
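
The comparison characters map onto ordinary integer and bitwise operations. The following sketch uses a hypothetical function compare that takes the value read from the file, the comparison character, and the value field; the size check implied by x is assumed to have been done by the caller.

```shell
# compare FILE_VALUE OP FIELD_VALUE: succeed if the test would succeed.
compare() {
    case $2 in
        '=') [ $(( $1 == $3 )) -eq 1 ] ;;
        '<') [ $(( $1 <  $3 )) -eq 1 ] ;;
        '>') [ $(( $1 >  $3 )) -eq 1 ] ;;
        '&') [ $(( ($1 & $3) == $3 )) -eq 1 ] ;;   # all set bits of the field are set
        '^') [ $(( ($1 & $3) != $3 )) -eq 1 ] ;;   # at least one of them is clear
        x)   return 0 ;;                           # only requires enough bytes
        *)   return 2 ;;                           # unknown comparison character
    esac
}

compare 0x90 '&' 0x80 && echo "high bit set"
compare 0x90 '^' 0x0f && echo "some low bit clear"
```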
351
352
353 message
354 The message to be printed if the test succeeds. The message
355 shall be interpreted using the notation for the printf format‐
356 ting specification; see printf(). If the value field was a
357 string, then the value from the file shall be the argument for
358 the printf formatting specification; otherwise, the value from
359 the file shall be the argument.
360
361
EXIT STATUS
363 The following exit values shall be returned:
364
365 0 Successful completion.
366
367 >0 An error occurred.
368
369
CONSEQUENCES OF ERRORS
371 Default.
372
373 The following sections are informative.
374
APPLICATION USAGE
376 The file utility can only be required to guess at many of the file
377 types because only exhaustive testing can determine some types with
378 certainty. For example, binary data on some implementations might match
379 the initial segment of an executable or a tar archive.
380
381 Note that the table indicates that the output contains the stated
382 string. Systems may add text before or after the string. For executa‐
383 bles, as an example, the machine architecture and various facts about
384 how the file was link-edited may be included. Note also that on systems
385 that recognize shell script files starting with "#!" as executable
386 files, these may be identified as executable binary files rather than
387 as shell scripts.
388
EXAMPLES
390 Determine whether an argument is a binary executable file:
391
392
    file "$1" | grep -Fq executable &&
        printf "%s is executable.\n" "$1"
395
RATIONALE
397 The -f option was omitted because the same effect can (and should) be
398 obtained using the xargs utility.
399
Historical versions of the file utility attempt to identify the
following types of files: symbolic link, directory, character special,
block special, socket, tar archive, cpio archive, SCCS archive, archive
library, empty, compress output, pack output, binary data, C source,
FORTRAN source, assembler source, nroff/troff/eqn/tbl source, troff
output, shell script, C shell script, English text, ASCII text, various
executables, APL workspace, compiled terminfo entries, and CURSES
screen images. Only those types that are reasonably well specified in
POSIX or are directly related to POSIX utilities are listed in the
table.
410
411 Historical systems have used a "magic file" named /etc/magic to help
412 identify file types. Because it is generally useful for users and
413 scripts to be able to identify special file types, the -m flag and a
414 portable format for user-created magic files has been specified. No
415 requirement is made that an implementation of file use this method of
416 identifying files, only that users be permitted to add their own clas‐
417 sifying tests.
418
419 In addition, three options have been added to historical practice. The
420 -d flag has been added to permit users to cause their tests to follow
421 any default system tests. The -i flag has been added to permit users to
422 test portably for regular files in shell scripts. The -M flag has been
423 added to permit users to ignore any default system tests.
424
The IEEE Std 1003.1-2001 description of default system tests and the
interaction between the -d, -M, and -m options did not clearly indicate
that there were two types of "default system tests". The
"position-sensitive tests" determine file types by looking for certain
string or binary values at specific offsets in the file being examined.
These position-sensitive tests were implemented in historical systems
using the magic file described above. Some of these tests are now built
into the file utility itself on some implementations so the output can
provide more detail than can be provided by magic files. For example, a
magic file can easily identify a core file on most implementations, but
cannot name the program file that dropped the core. A magic file could
produce output such as:

    /home/dwc/core: ELF 32-bit MSB core file SPARC Version 1

but by building the test into the file utility, you could get output
such as:

    /home/dwc/core: ELF 32-bit MSB core file SPARC Version 1, from 'testprog'

These extended built-in tests are still to be treated as
position-sensitive default system tests even if they are not listed in
/etc/magic or any other magic file.
450
451 The context-sensitive default system tests were always built into the
452 file utility. These tests looked for language constructs in text files
453 trying to identify shell scripts, C, FORTRAN, and other computer lan‐
454 guage source files, and even plain text files. With the addition of the
455 -m and -M options the distinction between position-sensitive and con‐
456 text-sensitive default system tests became important because the order
457 of testing is important. The context-sensitive system default tests
458 should never be applied before any position-sensitive tests even if the
459 -d option is specified before a -m option or -M option due to the high
460 probability that the context-sensitive system default tests will incor‐
461 rectly identify arbitrary text files as text files before position-sen‐
462 sitive tests specified by the -m or -M option would be applied to give
463 a more accurate identification.
464
465 Leaving the meaning of -M - and -m - unspecified allows an existing
466 prototype of these options to continue to work in a backwards-compati‐
467 ble manner. (In that implementation, -M - was roughly equivalent to -d
468 in IEEE Std 1003.1-2001.)
469
470 The historical -c option was omitted as not particularly useful to
471 users or portable shell scripts. In addition, a reasonable implementa‐
472 tion of the file utility would report any errors found each time the
473 magic file is read.
474
475 The historical format of the magic file was the same as that specified
476 by the Rationale in the ISO POSIX-2:1993 standard for the offset,
477 value, and message fields; however, it used less precise type fields
478 than the format specified by the current normative text. The new type
479 field values are a superset of the historical ones.
480
481 The following is an example magic file:
482
483
0    short       070707               cpio archive
0    short       0143561              Byte-swapped cpio archive
0    string      070707               ASCII cpio archive
0    long        0177555              Very old archive
0    short       0177545              Old archive
0    short       017437               Old packed data
0    string      \037\036             Packed data
0    string      \377\037             Compacted data
0    string      \037\235             Compressed data
>2   byte&0x80   >0                   Block compressed
>2   byte&0x1f   x                    %d bits
0    string      \032\001             Compiled Terminfo Entry
0    short       0433                 Curses screen image
0    short       0434                 Curses screen image
0    string      <ar>                 System V Release 1 archive
0    string      !<arch>\n__.SYMDEF   Archive random library
0    string      !<arch>              Archive
0    string      ARF_BEGARF           PHIGS clear text archive
0    long        0x137A2950           Scalable OpenFont binary
0    long        0x137A2951           Encrypted scalable OpenFont binary
504
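To show how an entry and its '>' continuation lines combine, the following sketch hand-codes the "Compressed data" entry and its two continuation lines from the example magic file. It is not a magic-file interpreter. The names u1 and describe are hypothetical, the byte is read as unsigned for simplicity (the masked results are the same here), and joining the messages with a single space is an assumption.

```shell
# u1 FILE OFFSET: print one byte of FILE as an unsigned decimal number.
u1() {
    od -An -v -tu1 -j "$2" -N 1 "$1" 2>/dev/null | tr -d ' \n'
}

# describe FILE: apply only the "Compressed data" entry and its two
# continuation lines; anything else falls through to "data".
describe() {
    # 0 string \037\235 Compressed data
    if [ "$(u1 "$1" 0)" = "$((037))" ] && [ "$(u1 "$1" 1)" = "$((0235))" ]; then
        msg="Compressed data"
        flags=$(u1 "$1" 2)
        if [ -n "$flags" ]; then
            # >2 byte&0x80 >0 Block compressed
            [ $(( $flags & 0x80 )) -gt 0 ] && msg="$msg Block compressed"
            # >2 byte&0x1f x %d bits   (x succeeds because the byte exists)
            msg="$msg $(printf '%d bits' $(( $flags & 0x1f )))"
        fi
    else
        msg="data"
    fi
    printf '%s: %s\n' "$1" "$msg"
}

sample=$(mktemp)
printf '\037\235\220payload' > "$sample"    # third byte is 0x90
describe "$sample"
rm -f "$sample"
```

For the sample file the third byte is 0x90, so the mask 0x80 yields a nonzero value and the mask 0x1f yields 16; the printed type is "Compressed data Block compressed 16 bits".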
505 The use of a basic integer data type is intended to allow the implemen‐
506 tation to choose a word size commonly used by applications on that
507 architecture.
508
FUTURE DIRECTIONS
510 None.
511
SEE ALSO
513 ar, ls, pax
514
COPYRIGHT
516 Portions of this text are reprinted and reproduced in electronic form
517 from IEEE Std 1003.1, 2003 Edition, Standard for Information Technology
518 -- Portable Operating System Interface (POSIX), The Open Group Base
519 Specifications Issue 6, Copyright (C) 2001-2003 by the Institute of
520 Electrical and Electronics Engineers, Inc and The Open Group. In the
521 event of any discrepancy between this version and the original IEEE and
522 The Open Group Standard, the original IEEE and The Open Group Standard
523 is the referee document. The original Standard can be obtained online
524 at http://www.opengroup.org/unix/online.html .
525
526
527
IEEE/The Open Group                  2003                          FILE(1P)