FILE(1P)                POSIX Programmer's Manual               FILE(1P)

PROLOG
This manual page is part of the POSIX Programmer's Manual. The Linux
implementation of this interface may differ (consult the corresponding
Linux manual page for details of Linux behavior), or the interface may
not be implemented on Linux.

NAME
file - determine file type

SYNOPSIS
file [−dh] [−M file] [−m file] file...

file −i [−h] file...

DESCRIPTION
The file utility shall perform a series of tests in sequence on each
specified file in an attempt to classify it:

 1. If file does not exist, cannot be read, or its file status could
    not be determined, the output shall indicate that the file was
    processed, but that its type could not be determined.

 2. If the file is not a regular file, its file type shall be
    identified. The file types directory, FIFO, socket, block special,
    and character special shall be identified as such. Other
    implementation-defined file types may also be identified. If file
    is a symbolic link, by default the link shall be resolved and file
    shall test the type of file referenced by the symbolic link. (See
    the −h and −i options below.)

 3. If the length of file is zero, it shall be identified as an empty
    file.

 4. The file utility shall examine an initial segment of file and shall
    make a guess at identifying its contents based on
    position-sensitive tests. (The answer is not guaranteed to be
    correct; see the −d, −M, and −m options below.)

 5. The file utility shall examine file and make a guess at identifying
    its contents based on context-sensitive default system tests. (The
    answer is not guaranteed to be correct.)

 6. The file shall be identified as a data file.

If file does not exist, cannot be read, or its file status could not be
determined, the output shall indicate that the file was processed, but
that its type could not be determined.

If file is a symbolic link, by default the link shall be resolved and
file shall test the type of file referenced by the symbolic link.
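
The file-status checks in steps 1 to 3 above can be approximated with
the test utility. The following is a minimal sketch, not a conforming
implementation: classify_basic is a hypothetical name, the content
tests of steps 4 to 6 are replaced by a fixed "data" result, and the
contents of a symbolic link are not printed.

```shell
#!/bin/sh
# classify_basic is a hypothetical helper, not a standard utility. It covers
# only the file-status checks (steps 1 to 3); no content tests are attempted.
classify_basic() {
    f=$1
    if [ -h "$f" ] && [ ! -e "$f" ]; then
        kind='symbolic link to'      # dangling link: reported as a link
    elif [ ! -e "$f" ]; then
        kind='cannot open'           # step 1: no such file
    elif [ -d "$f" ]; then
        kind='directory'             # step 2: non-regular file types
    elif [ -p "$f" ]; then
        kind='fifo'
    elif [ -S "$f" ]; then
        kind='socket'
    elif [ -b "$f" ]; then
        kind='block special'
    elif [ -c "$f" ]; then
        kind='character special'
    elif [ ! -r "$f" ]; then
        kind='cannot open'           # step 1: exists but cannot be read
    elif [ ! -s "$f" ]; then
        kind='empty'                 # step 3: zero length
    else
        kind='data'                  # placeholder for steps 4 to 6
    fi
    printf '%s: %s\n' "$f" "$kind"
}
```

Because test follows symbolic links for every operator except −h, the
sketch resolves links by default, as the description above requires.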

OPTIONS
The file utility shall conform to the Base Definitions volume of
POSIX.1-2008, Section 12.2, Utility Syntax Guidelines, except that the
order of the −m, −d, and −M options shall be significant.

The following options shall be supported by the implementation:

−d        Apply any position-sensitive default system tests and
          context-sensitive default system tests to the file. This is
          the default if no −M or −m option is specified.

−h        When a symbolic link is encountered, identify the file as a
          symbolic link. If −h is not specified and file is a symbolic
          link that refers to a nonexistent file, file shall identify
          the file as a symbolic link, as if −h had been specified.

−i        If a file is a regular file, do not attempt to classify the
          type of the file further, but identify the file as specified
          in the STDOUT section.

−M file   Specify the name of a file containing position-sensitive
          tests that shall be applied to a file in order to classify it
          (see the EXTENDED DESCRIPTION). No position-sensitive default
          system tests nor context-sensitive default system tests shall
          be applied unless the −d option is also specified.

−m file   Specify the name of a file containing position-sensitive
          tests that shall be applied to a file in order to classify it
          (see the EXTENDED DESCRIPTION).

If the −m option is specified without specifying the −d option or the
−M option, position-sensitive default system tests shall be applied
after the position-sensitive tests specified by the −m option. If the
−M option is specified with the −d option, the −m option, or both, or
the −m option is specified with the −d option, the concatenation of the
position-sensitive tests specified by these options shall be applied in
the order specified by the appearance of these options. If a −M or −m
file option-argument is −, the results are unspecified.

OPERANDS
The following operand shall be supported:

file      A pathname of a file to be tested.

STDIN
The standard input shall be used if a file operand is '−' and the
implementation treats the '−' as meaning standard input. Otherwise,
the standard input shall not be used.

INPUT FILES
The file can be any file type.

ENVIRONMENT VARIABLES
The following environment variables shall affect the execution of file:

LANG      Provide a default value for the internationalization
          variables that are unset or null. (See the Base Definitions
          volume of POSIX.1-2008, Section 8.2, Internationalization
          Variables for the precedence of internationalization
          variables used to determine the values of locale categories.)

LC_ALL    If set to a non-empty string value, override the values of
          all the other internationalization variables.

LC_CTYPE  Determine the locale for the interpretation of sequences of
          bytes of text data as characters (for example, single-byte as
          opposed to multi-byte characters in arguments and input
          files).

LC_MESSAGES
          Determine the locale that should be used to affect the format
          and contents of diagnostic messages written to standard error
          and informative messages written to standard output.

NLSPATH   Determine the location of message catalogs for the processing
          of LC_MESSAGES.

ASYNCHRONOUS EVENTS
Default.

STDOUT
In the POSIX locale, the following format shall be used to identify
each operand, file specified:

    "%s: %s\n", <file>, <type>

The values for <type> are unspecified, except that in the POSIX locale,
if file is identified as one of the types listed in the following
table, <type> shall contain (but is not limited to) the corresponding
string, unless the file is identified by a position-sensitive test
specified by a −M or −m option. Each <space> shown in the strings shall
be exactly one <space>.

Table 4-9: File Utility Output Strings

┌─────────────────────────────────────────────┬──────────────────────────────────┬───────┐
│If file is:                                  │ <type> shall contain the string: │ Notes │
├─────────────────────────────────────────────┼──────────────────────────────────┼───────┤
│Nonexistent                                  │ cannot open                      │       │
│                                             │                                  │       │
│Block special                                │ block special                    │ 1     │
│Character special                            │ character special                │ 1     │
│Directory                                    │ directory                        │ 1     │
│FIFO                                         │ fifo                             │ 1     │
│Socket                                       │ socket                           │ 1     │
│Symbolic link                                │ symbolic link to                 │ 1     │
│Regular file                                 │ regular file                     │ 1,2   │
│Empty regular file                           │ empty                            │ 3     │
│Regular file that cannot be read             │ cannot open                      │ 3     │
│                                             │                                  │       │
│Executable binary                            │ executable                       │ 3,4,6 │
│ar archive library (see ar)                  │ archive                          │ 3,4,6 │
│Extended cpio format (see pax)               │ cpio archive                     │ 3,4,6 │
│Extended tar format (see ustar in pax)       │ tar archive                      │ 3,4,6 │
│                                             │                                  │       │
│Shell script                                 │ commands text                    │ 3,5,6 │
│C-language source                            │ c program text                   │ 3,5,6 │
│FORTRAN source                               │ fortran program text             │ 3,5,6 │
│                                             │                                  │       │
│Regular file whose type cannot be determined │ data                             │ 3     │
└─────────────────────────────────────────────┴──────────────────────────────────┴───────┘
Notes:

 1. This is a file type test.

 2. This test is applied only if the −i option is specified.

 3. This test is applied only if the −i option is not specified.

 4. This is a position-sensitive default system test.

 5. This is a context-sensitive default system test.

 6. Position-sensitive default system tests and context-sensitive
    default system tests are not applied if the −M option is specified
    unless the −d option is also specified.

In the POSIX locale, if file is identified as a symbolic link (see the
−h option), the following alternative output format shall be used:

    "%s: %s %s\n", <file>, <type>, <contents of link>

If the file named by the file operand does not exist, cannot be read,
or the type of the file named by the file operand cannot be determined,
this shall not be considered an error that affects the exit status.
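
Because a missing or unreadable file does not affect the exit status,
a script has to inspect the <type> field rather than the return code.
The following sketch shows one way to do that for the "%s: %s\n"
format. Both function names are hypothetical, and the operand is
passed in so that a pathname containing ": " is not split in the wrong
place.

```shell
#!/bin/sh
# file_type_field OPERAND LINE (hypothetical helper): given the operand and
# one output line in the "%s: %s\n" format, print only the <type> field.
file_type_field() {
    operand=$1
    line=$2
    case $line in
        "$operand: "*) printf '%s\n' "${line#"$operand: "}" ;;
        *) return 1 ;;               # line does not belong to this operand
    esac
}

# type_contains TYPE STRING (hypothetical helper): succeed if TYPE contains
# STRING, matching the "shall contain" wording of the output table.
type_contains() {
    case $1 in
        *"$2"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

A caller might pass the output of file for a single operand as the
second argument and then check the result against a string from the
table, such as "cannot open".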

STDERR
The standard error shall be used only for diagnostic messages.

OUTPUT FILES
None.

EXTENDED DESCRIPTION
A file specified as an option-argument to the −m or −M options shall
contain one position-sensitive test per line, which shall be applied to
the file. If the test succeeds, the message field of the line shall be
printed and no further tests shall be applied, with the exception that
tests on immediately following lines beginning with a single '>'
character shall be applied.

Each line shall be composed of the following four <tab>-separated
fields. (Implementations may allow any combination of one or more
white-space characters other than <newline> to act as field
separators.)

offset    An unsigned number (optionally preceded by a single '>'
          character) specifying the offset, in bytes, of the value in
          the file that is to be compared against the value field of
          the line. If the file is shorter than the specified offset,
          the test shall fail.

          If the offset begins with the character '>', the test
          contained in the line shall not be applied to the file unless
          the test on the last line for which the offset did not begin
          with a '>' was successful. By default, the offset shall be
          interpreted as an unsigned decimal number. With a leading 0x
          or 0X, the offset shall be interpreted as a hexadecimal
          number; otherwise, with a leading 0, the offset shall be
          interpreted as an octal number.
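
The offset rules above match the way shell arithmetic reads integer
constants, so a script can parse the field with very little code. The
following is a sketch only; parse_offset is a hypothetical name and
its output format is an assumption made for illustration.

```shell
#!/bin/sh
# parse_offset FIELD (hypothetical helper): print "<continuation> <bytes>",
# where <continuation> is 1 if FIELD began with '>' and 0 otherwise.
parse_offset() {
    raw=$1
    cont=0
    case $raw in
        '>'*) cont=1; raw=${raw#?} ;;
    esac
    # Validate the digits before handing the text to arithmetic expansion.
    case $raw in
        0[xX]*) case ${raw#0?} in ''|*[!0-9a-fA-F]*) return 1 ;; esac ;;
        0*)     case $raw in *[!0-7]*) return 1 ;; esac ;;
        *)      case $raw in ''|*[!0-9]*) return 1 ;; esac ;;
    esac
    # Arithmetic expansion applies the same prefix rules: 0x or 0X means
    # hexadecimal, a leading 0 means octal, anything else is decimal.
    printf '%s %s\n' "$cont" "$(( $raw ))"
}
```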

type      The type of the value in the file to be tested. The type
          shall consist of the type specification characters d, s, and
          u, specifying signed decimal, string, and unsigned decimal,
          respectively.

          The type string shall be interpreted as the bytes from the
          file starting at the specified offset and including the same
          number of bytes specified by the value field. If insufficient
          bytes remain in the file past the offset to match the value
          field, the test shall fail.

          The type specification characters d and u can be followed by
          an optional unsigned decimal integer that specifies the
          number of bytes represented by the type. The type
          specification characters d and u can be followed by an
          optional C, S, I, or L, indicating that the value is of type
          char, short, int, or long, respectively.

          The default number of bytes represented by the type
          specifiers d, f, and u shall correspond to their respective
          C-language types as follows. If the system claims conformance
          to the C-Language Development Utilities option, those
          specifiers shall correspond to the default sizes used in the
          c99 utility. Otherwise, the default sizes shall be
          implementation-defined.

          For the type specifier characters d and u, the default number
          of bytes shall correspond to the size of a basic integer type
          of the implementation. For these specifier characters, the
          implementation shall support values of the optional number of
          bytes to be converted corresponding to the number of bytes in
          the C-language types char, short, int, or long. These numbers
          can also be specified by an application as the characters C,
          S, I, and L, respectively. The byte order used when
          interpreting numeric values is implementation-defined, but
          shall correspond to the order in which a constant of the
          corresponding type is stored in memory on the system.

          All type specifiers, except for s, can be followed by a mask
          specifier of the form &number. The mask value shall be AND'ed
          with the value of the input file before the comparison with
          the value field of the line is made. By default, the mask
          shall be interpreted as an unsigned decimal number. With a
          leading 0x or 0X, the mask shall be interpreted as an
          unsigned hexadecimal number; otherwise, with a leading 0, the
          mask shall be interpreted as an unsigned octal number.

          The strings byte, short, long, and string shall also be
          supported as type fields, being interpreted as dC, dS, dL,
          and s, respectively.

value     The value to be compared with the value from the file.

          If the specifier from the type field is s or string, then
          interpret the value as a string. Otherwise, interpret it as a
          number. If the value is a string, then the test shall succeed
          only when a string value exactly matches the bytes from the
          file.

          If the value is a string, it can contain the following
          sequences:

          \character  The <backslash>-escape sequences as specified in
                      the Base Definitions volume of POSIX.1-2008,
                      Table 5-1, Escape Sequences and Associated
                      Actions ('\\', '\a', '\b', '\f', '\n', '\r',
                      '\t', '\v'). In addition, the escape sequence
                      '\ ' (the <backslash> character followed by a
                      <space> character) shall be recognized to
                      represent a <space> character. The results of
                      using any other character, other than an octal
                      digit, following the <backslash> are unspecified.

          \octal      Octal sequences that can be used to represent
                      characters with specific coded values. An octal
                      sequence shall consist of a <backslash> followed
                      by the longest sequence of one, two, or three
                      octal-digit characters (01234567).

          By default, any value that is not a string shall be
          interpreted as a signed decimal number. Any such value, with
          a leading 0x or 0X, shall be interpreted as an unsigned
          hexadecimal number; otherwise, with a leading zero, the value
          shall be interpreted as an unsigned octal number.

          If the value is not a string, it can be preceded by a
          character indicating the comparison to be performed.
          Permissible characters and the comparisons they specify are
          as follows:

          =  The test shall succeed if the value from the file equals
             the value field.

          <  The test shall succeed if the value from the file is less
             than the value field.

          >  The test shall succeed if the value from the file is
             greater than the value field.

          &  The test shall succeed if all of the set bits in the value
             field are set in the value from the file.

          ^  The test shall succeed if at least one of the set bits in
             the value field is not set in the value from the file.

          x  The test shall succeed if the file is large enough to
             contain a value of the type specified starting at the
             offset specified.
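
The numeric comparisons listed above, together with the optional
&number mask from the type field, can be written as a small shell
function. This is an illustrative sketch: magic_compare and its
argument order are hypothetical, the value from the file is assumed to
have been read and converted to a number already, and the size check
required by x is assumed to be done by the caller.

```shell
#!/bin/sh
# magic_compare OP FILE_VALUE VALUE [MASK] (hypothetical helper): succeed
# when the numeric test selected by OP passes.
magic_compare() {
    op=$1
    fv=$2
    v=$3
    mask=${4:-}
    if [ -n "$mask" ]; then
        fv=$(( $fv & $mask ))        # apply the "&number" mask first
    fi
    case $op in
        '=') [ "$(( $fv == $v ))" -eq 1 ] ;;
        '<') [ "$(( $fv <  $v ))" -eq 1 ] ;;
        '>') [ "$(( $fv >  $v ))" -eq 1 ] ;;
        '&') [ "$(( ($fv & $v) == $v ))" -eq 1 ] ;;   # all set bits present
        '^') [ "$(( ($fv & $v) != $v ))" -eq 1 ] ;;   # some set bit missing
        'x') return 0 ;;             # size check is left to the caller
        *)   return 2 ;;             # unknown comparison character
    esac
}
```

For example, a file byte of 0x90 tested with mask 0x80 and value >0
succeeds, because the masked value is 128.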

message   The message to be printed if the test succeeds. The message
          shall be interpreted using the notation for the printf
          formatting specification; see printf. If the value field was
          a string, then the value from the file shall be the argument
          for the printf formatting specification; otherwise, the value
          from the file shall be the argument.

EXIT STATUS
The following exit values shall be returned:

 0    Successful completion.

>0    An error occurred.

CONSEQUENCES OF ERRORS
Default.

The following sections are informative.

APPLICATION USAGE
The file utility can only be required to guess at many of the file
types because only exhaustive testing can determine some types with
certainty. For example, binary data on some implementations might match
the initial segment of an executable or a tar archive.

Note that the table indicates that the output contains the stated
string. Systems may add text before or after the string. For
executables, as an example, the machine architecture and various facts
about how the file was link-edited may be included. Note also that on
systems that recognize shell script files starting with "#!" as
executable files, these may be identified as executable binary files
rather than as shell scripts.

EXAMPLES
Determine whether an argument is a binary executable file:

    file -- "$1" | grep -q ':.*executable' &&
        printf "%s is executable.\n" "$1"

RATIONALE
The −f option was omitted because the same effect can (and should) be
obtained using the xargs utility.

Historical versions of the file utility attempt to identify the
following types of files: symbolic link, directory, character special,
block special, socket, tar archive, cpio archive, SCCS archive, archive
library, empty, compress output, pack output, binary data, C source,
FORTRAN source, assembler source, nroff/troff/eqn/tbl source, troff
output, shell script, C shell script, English text, ASCII text, various
executables, APL workspace, compiled terminfo entries, and CURSES
screen images. Only those types that are reasonably well specified in
POSIX or are directly related to POSIX utilities are listed in the
table.

Historical systems have used a ``magic file'' named /etc/magic to help
identify file types. Because it is generally useful for users and
scripts to be able to identify special file types, the −m flag and a
portable format for user-created magic files have been specified. No
requirement is made that an implementation of file use this method of
identifying files, only that users be permitted to add their own
classifying tests.

In addition, three options have been added to historical practice. The
−d flag has been added to permit users to cause their tests to follow
any default system tests. The −i flag has been added to permit users to
test portably for regular files in shell scripts. The −M flag has been
added to permit users to ignore any default system tests.

The POSIX.1-2008 description of default system tests and the
interaction between the −d, −M, and −m options did not clearly indicate
that there were two types of ``default system tests''. The
``position-sensitive tests'' determine file types by looking for
certain string or binary values at specific offsets in the file being
examined. These position-sensitive tests were implemented in historical
systems using the magic file described above. Some of these tests are
now built into the file utility itself on some implementations so the
output can provide more detail than can be provided by magic files. For
example, a magic file can easily identify a core file on most
implementations, but cannot name the program file that dropped the
core. A magic file could produce output such as:

    /home/dwc/core: ELF 32-bit MSB core file SPARC Version 1

but by building the test into the file utility, you could get output
such as:

    /home/dwc/core: ELF 32-bit MSB core file SPARC Version 1, from 'testprog'

These extended built-in tests are still to be treated as
position-sensitive default system tests even if they are not listed in
/etc/magic or any other magic file.

The context-sensitive default system tests were always built into the
file utility. These tests looked for language constructs in text files
trying to identify shell scripts, C, FORTRAN, and other computer
language source files, and even plain text files. With the addition of
the −m and −M options the distinction between position-sensitive and
context-sensitive default system tests became important because the
order of testing is important. The context-sensitive system default
tests should never be applied before any position-sensitive tests even
if the −d option is specified before a −m option or −M option due to
the high probability that the context-sensitive system default tests
will incorrectly identify arbitrary text files as text files before
position-sensitive tests specified by the −m or −M option would be
applied to give a more accurate identification.

Leaving the meaning of −M − and −m − unspecified allows an existing
prototype of these options to continue to work in a
backwards-compatible manner. (In that implementation, −M − was roughly
equivalent to −d in POSIX.1-2008.)

The historical −c option was omitted as not particularly useful to
users or portable shell scripts. In addition, a reasonable
implementation of the file utility would report any errors found each
time the magic file is read.

The historical format of the magic file was the same as that specified
by the Rationale in the ISO POSIX-2:1993 standard for the offset,
value, and message fields; however, it used less precise type fields
than the format specified by the current normative text. The new type
field values are a superset of the historical ones.

The following is an example magic file:

    0       short         070707              cpio archive
    0       short         0143561             Byte-swapped cpio archive
    0       string        070707              ASCII cpio archive
    0       long          0177555             Very old archive
    0       short         0177545             Old archive
    0       short         017437              Old packed data
    0       string        \037\036            Packed data
    0       string        \377\037            Compacted data
    0       string        \037\235            Compressed data
    >2      byte&0x80     >0                  Block compressed
    >2      byte&0x1f     x                   %d bits
    0       string        \032\001            Compiled Terminfo Entry
    0       short         0433                Curses screen image
    0       short         0434                Curses screen image
    0       string        <ar>                System V Release 1 archive
    0       string        !<arch>\n__.SYMDEF  Archive random library
    0       string        !<arch>             Archive
    0       string        ARF_BEGARF          PHIGS clear text archive
    0       long          0x137A2950          Scalable OpenFont binary
    0       long          0x137A2951          Encrypted scalable OpenFont binary
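
To make the continuation lines concrete, the "Compressed data" entry
and its two '>' lines can be evaluated by hand with dd and od. The
following is a sketch under stated assumptions: bytes_at and
describe_compress are hypothetical names, and joining the messages of
successful lines with a single space is an assumption, since the text
above does not say how they are combined.

```shell
#!/bin/sh
# bytes_at OFFSET COUNT FILE (hypothetical helper): print COUNT bytes of
# FILE starting at byte OFFSET as hexadecimal digits with no separators.
bytes_at() {
    dd if="$3" bs=1 skip="$1" count="$2" 2>/dev/null |
        od -A n -t x1 | tr -d ' \n'
}

# describe_compress FILE (hypothetical helper): hand-coded equivalent of
# the "Compressed data" entry and its two '>' continuation lines.
describe_compress() {
    f=$1
    # 0  string  \037\235  Compressed data
    [ "$(bytes_at 0 2 "$f")" = 1f9d ] || return 1
    msg='Compressed data'
    b=$(bytes_at 2 1 "$f")
    if [ -n "$b" ]; then             # continuation tests need a third byte
        flags=$(( 0x$b ))
        # >2  byte&0x80  >0  Block compressed
        if [ "$(( flags & 0x80 ))" -gt 0 ]; then
            msg="$msg Block compressed"
        fi
        # >2  byte&0x1f  x  %d bits
        msg="$msg $(( flags & 0x1f )) bits"
    fi
    printf '%s\n' "$msg"
}
```

The byte is read here as an unsigned value, whereas byte is defined as
dC; after masking with 0x80 or 0x1f both readings give the same result.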

The use of a basic integer data type is intended to allow the
implementation to choose a word size commonly used by applications on
that architecture.

Earlier versions of this standard allowed for implementations with
bytes other than eight bits, but this has been modified in this
version.

FUTURE DIRECTIONS
None.

SEE ALSO
ar, ls, pax, printf

The Base Definitions volume of POSIX.1-2008, Table 5-1, Escape
Sequences and Associated Actions, Chapter 8, Environment Variables,
Section 12.2, Utility Syntax Guidelines

COPYRIGHT
Portions of this text are reprinted and reproduced in electronic form
from IEEE Std 1003.1, 2013 Edition, Standard for Information Technology
-- Portable Operating System Interface (POSIX), The Open Group Base
Specifications Issue 7, Copyright (C) 2013 by the Institute of
Electrical and Electronics Engineers, Inc and The Open Group. (This is
POSIX.1-2008 with the 2013 Technical Corrigendum 1 applied.) In the
event of any discrepancy between this version and the original IEEE and
The Open Group Standard, the original IEEE and The Open Group Standard
is the referee document. The original Standard can be obtained online
at http://www.unix.org/online.html .

Any typographical or formatting errors that appear in this page are
most likely to have been introduced during the conversion of the source
files to man page format. To report such errors, see
https://www.kernel.org/doc/man-pages/reporting_bugs.html .

IEEE/The Open Group                  2013                       FILE(1P)