1 .\" $File: file.man,v 1.79 2008/11/06 22:49:08 rrt Exp $
7 .Nd determine file type
15 .Op Fl m Ar magicfiles
23 This manual page documents version __VERSION__ of the
28 tests each argument in an attempt to classify it.
29 There are three sets of tests, performed in this order:
30 filesystem tests, magic tests, and language tests.
33 test that succeeds causes the file type to be printed.
35 The type printed will usually contain one of the words
37 (the file contains only
38 printing characters and a few common control
39 characters and is probably safe to read on an
43 (the file contains the result of compiling a program
44 in a form understandable to some
49 meaning anything else (data is usually
52 Exceptions are well-known file formats (core files, tar archives)
53 that are known to contain binary data.
54 When modifying magic files or the program itself, make sure to
55 .Em "preserve these keywords" .
56 Users depend on knowing that all the readable files in a directory
60 Don't do as Berkeley did and change
61 .Sq shell commands text
65 The filesystem tests are based on examining the return from a
68 The program checks to see if the file is empty,
69 or if it's some sort of special file.
70 Any known file types appropriate to the system you are running on
71 (sockets, symbolic links, or named pipes (FIFOs) on those systems that
73 are intuited if they are defined in
74 the system header file
77 The magic tests are used to check for files with data in
78 particular fixed formats.
79 The canonical example of this is a binary executable (compiled program)
81 file, whose format is defined in
86 in the standard include directory.
89 stored in a particular place
90 near the beginning of the file that tells the
91 .Dv UNIX operating system
92 that the file is a binary executable, and which of several types thereof.
95 has been applied by extension to data files.
96 Any file with some invariant identifier at a small fixed
97 offset into the file can usually be described in this way.
98 The information identifying these files is read from the compiled
101 or the files in the directory
103 if the compiled file does not exist. In addition, if
107 exists, it will be used in preference to the system magic files.
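.Pp
As a rough illustration of the fixed-offset idea described above, the
following Python sketch compares the bytes at a few known offsets against
a small table.
The table, its descriptions, and the name match_signature are assumptions
made for this example; the sketch does not show how the magic files are
actually parsed or stored.

```python
# Illustrative signature table: (offset, expected bytes, description).
# A hand-picked subset for this sketch, not the contents of any magic file.
SIGNATURES = [
    (0, bytes([0x7F]) + b"ELF", "ELF"),
    (0, bytes([0x1F, 0x8B]), "gzip compressed data"),
    (257, b"ustar", "tar archive"),
]


def match_signature(data):
    """Return the description of the first matching signature, or None."""
    for offset, magic, description in SIGNATURES:
        # Slicing past the end of data yields a shorter value, so short
        # inputs simply fail to match instead of raising an error.
        if data[offset:offset + len(magic)] == magic:
            return description
    return None
```

Real magic entries describe far more formats and support nested tests;
the sketch only shows the basic lookup at a fixed offset.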
109 If a file does not match any of the entries in the magic file,
110 it is examined to see if it seems to be a text file.
111 ASCII, ISO-8859-x, non-ISO 8-bit extended-ASCII character sets
112 (such as those used on Macintosh and IBM PC systems),
113 UTF-8-encoded Unicode, UTF-16-encoded Unicode, and EBCDIC
114 character sets can be distinguished by the different
115 ranges and sequences of bytes that constitute printable text
117 If a file passes any of these tests, its character set is reported.
118 ASCII, ISO-8859-x, UTF-8, and extended-ASCII files are identified
121 because they will be mostly readable on nearly any terminal;
122 UTF-16 and EBCDIC are only
125 they contain text, it is text that will require translation
126 before it can be read.
129 will attempt to determine other characteristics of text-type files.
130 If the lines of a file are terminated by CR, CRLF, or NEL, instead
131 of the Unix-standard LF, this will be reported.
132 Files that contain embedded escape sequences or overstriking
133 will also be identified.
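.Pp
The line-terminator check can be pictured with a short Python sketch.
It is an illustration only: the function name line_terminators is
hypothetical, NEL is left out because its byte value depends on the
character set, and the logic used by the program itself may differ.

```python
def line_terminators(data):
    """Count CRLF, lone CR, and lone LF terminators in a bytes object."""
    counts = {"CRLF": 0, "CR": 0, "LF": 0}
    i = 0
    while i < len(data):
        byte = data[i]
        if byte == 13:  # CR
            if i + 1 < len(data) and data[i + 1] == 10:
                counts["CRLF"] += 1
                i += 1  # skip the LF that completes this CRLF pair
            else:
                counts["CR"] += 1
        elif byte == 10:  # LF
            counts["LF"] += 1
        i += 1
    return counts
```

A caller could report any terminator other than LF whose count is
non-zero, which mirrors the behavior described above.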
137 has determined the character set used in a text-type file,
139 attempt to determine in what language the file is written.
140 The language tests look for particular strings (cf.
142 ) that can appear anywhere in the first few blocks of a file.
143 For example, the keyword
145 indicates that the file is most likely a
147 input file, just as the keyword
149 indicates a C program.
150 These tests are less reliable than the previous
151 two groups, so they are performed last.
152 The language test routines also test for some miscellany
157 Any file that cannot be identified as having been written
158 in any of the character sets listed above is simply said to be
161 .Bl -tag -width indent
163 Do not prepend filenames to output lines (brief mode).
164 .It Fl c , -checking-printout
165 Cause a checking printout of the parsed form of the magic file.
166 This is usually used in conjunction with the
168 flag to debug a new magic file before installing it.
172 output file that contains a pre-parsed version of the magic file or directory.
173 .It Fl e , -exclude Ar testname
174 Exclude the test named in
176 from the list of tests made to determine the file type. Valid test names
181 application type (only on EMX).
183 Various types of text files (this test will try to guess the text encoding, irrespective of the setting of the
187 Different text encodings for soft magic tests.
189 Looks for known tokens inside text files.
191 Prints details of Compound Document Files.
193 Checks for, and looks inside, compressed files.
195 Prints ELF file details.
197 Consults magic files.
201 .It Fl f , -files-from Ar namefile
202 Read the names of the files to be examined from
205 before the argument list.
208 or at least one filename argument must be present;
209 to test the standard input, use
211 as a filename argument.
212 .It Fl F , -separator Ar separator
213 Use the specified string as the separator between the filename and the
214 file result returned. Defaults to
216 .It Fl h , -no-dereference
217 option causes symlinks not to be followed
218 (on systems that support symbolic links). This is the default if the
Causes the file command to output MIME type strings rather than the more
traditional human-readable ones. Thus it may say
225 .Sq text/plain; charset=us-ascii
In order for this option to work, file changes the way
it handles files recognized by the command itself (such as many of the
text file types, directories, etc.), and makes use of an alternative
233 (See the FILES section, below).
234 .It Fl -mime-type , -mime-encoding
237 but print only the specified element(s).
238 .It Fl k , -keep-going
Don't stop at the first match; keep going. Subsequent matches will be
243 (If you want a newline, see the
246 .It Fl L , -dereference
247 option causes symlinks to be followed, as the like-named option in
249 (on systems that support symbolic links).
250 This is the default if the environment variable
253 .It Fl m , -magic-file Ar list
254 Specify an alternate list of files and directories containing magic.
255 This can be a single item, or a colon-separated list.
256 If a compiled magic file is found alongside a file or directory, it will be used instead.
257 .It Fl n , -no-buffer
258 Force stdout to be flushed after checking each file.
259 This is only useful if checking a list of files.
260 It is intended to be used by programs that want filetype output from a pipe.
262 Don't pad filenames so that they align in the output.
263 .It Fl p , -preserve-date
264 On systems that support
268 attempt to preserve the access time of files analyzed, to pretend that
272 Don't translate unprintable characters to \eooo.
275 translates unprintable characters to their octal representation.
276 .It Fl s , -special-files
279 only attempts to read and determine the type of argument files which
281 reports are ordinary files.
282 This prevents problems, because reading special files may have peculiar
288 to also read argument files which are block or character special files.
289 This is useful for determining the filesystem types of the data in raw
290 disk partitions, which are block special files.
291 This option also causes
293 to disregard the file size as reported by
295 since on some systems it reports a zero size for raw disk partitions.
297 Print the version of the program and exit.
298 .It Fl z , -uncompress
299 Try to look inside compressed files.
301 Output a null character
303 after the end of the filename. Nice to
305 the output. This does not affect the separator which is still printed.
307 Print a help message and exit.
310 .Bl -tag -width __MAGIC__.mgc -compact
312 Default compiled list of magic.
314 Directory containing default magic files.
317 The environment variable
319 can be used to set the default magic file name.
320 If that variable is set, then
322 will not attempt to open
327 to the value of this variable as appropriate.
328 The environment variable
controls (on systems that support symbolic links) whether
332 will attempt to follow symlinks or not. If set, then
follows symlinks; otherwise it does not. This is also controlled
341 .Xr magic __FSECTION__ ,
346 .Sh STANDARDS CONFORMANCE
347 This program is believed to exceed the System V Interface Definition
348 of FILE(CMD), as near as one can determine from the vague language
350 Its behavior is mostly compatible with the System V program of the same name.
351 This version knows more magic, however, so it will produce
352 different (albeit more accurate) output in many cases.
353 .\" URL: http://www.opengroup.org/onlinepubs/009695399/utilities/file.html
355 The one significant difference
356 between this version and System V
357 is that this version treats any white space
358 as a delimiter, so that spaces in pattern strings must be escaped.
360 .Bd -literal -offset indent
>10 string language impress\ (imPRESS data)
364 in an existing magic file would have to be changed to
365 .Bd -literal -offset indent
>10 string language\e impress (imPRESS data)
369 In addition, in this version, if a pattern string contains a backslash,
372 .Bd -literal -offset indent
0 string \ebegindata Andrew Toolkit document
376 in an existing magic file would have to be changed to
377 .Bd -literal -offset indent
0 string \e\ebegindata Andrew Toolkit document
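.Pp
The two escaping rules above can be sketched as a small Python helper.
The name escape_pattern is hypothetical, and the helper handles only
backslashes and plain spaces; it is a sketch of the rules as stated here,
not a tool shipped with this program.

```python
# chr(92) is the backslash character; spelling it this way keeps the
# example free of escape sequences.
BACKSLASH = chr(92)


def escape_pattern(pattern):
    """Escape backslashes, then spaces, for use as a magic pattern string."""
    # Backslashes must be doubled first, so that the backslashes added
    # in front of spaces in the next step are not doubled again.
    escaped = pattern.replace(BACKSLASH, BACKSLASH * 2)
    return escaped.replace(" ", BACKSLASH + " ")
```

Applied to the two examples above, the helper puts a backslash before the
space in the imPRESS pattern and doubles the backslash in the Andrew
Toolkit pattern.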
381 SunOS releases 3.2 and later from Sun Microsystems include a
383 command derived from the System V one, but with some extensions.
384 My version differs from Sun's only in minor ways.
385 It includes the extension of the
389 .Bd -literal -offset indent
>16 long&0x7fffffff >0 not stripped
393 The magic file entries have been collected from various sources,
394 mainly USENET, and contributed by various authors.
395 Christos Zoulas (address below) will collect additional
396 or corrected magic file entries.
397 A consolidation of magic file entries
398 will be distributed periodically.
400 The order of entries in the magic file is significant.
Depending on what system you are using, the order in which
they are put together may be incorrect.
405 command uses a magic file,
406 keep the old magic file around for comparison purposes
408 .Pa __MAGIC__.orig ).
410 .Bd -literal -offset indent
$ file file.c file /dev/{wd0a,hda}
file.c: C program text
file: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV),
      dynamically linked (uses shared libs), stripped
/dev/wd0a: block special (0/0)
/dev/hda: block special (3/0)

$ file -s /dev/wd0{b,d}
/dev/wd0d: x86 boot sector

$ file -s /dev/hda{,1,2,3,4,5,6,7,8,9,10}
/dev/hda: x86 boot sector
/dev/hda1: Linux/i386 ext2 filesystem
/dev/hda2: x86 boot sector
/dev/hda3: x86 boot sector, extended partition table
/dev/hda4: Linux/i386 ext2 filesystem
/dev/hda5: Linux/i386 swap file
/dev/hda6: Linux/i386 swap file
/dev/hda7: Linux/i386 swap file
/dev/hda8: Linux/i386 swap file

$ file -i file.c file /dev/{wd0a,hda}
file: application/x-executable
/dev/hda: application/x-not-regular-file
/dev/wd0a: application/x-not-regular-file
446 .Dv UNIX since at least Research Version 4
447 (man page dated November, 1973).
The System V version introduced one significant change:
449 the external list of magic types.
450 This slowed the program down slightly but made it a lot more flexible.
452 This program, based on the System V version,
453 was written by Ian Darwin <ian@darwinsys.com>
454 without looking at anybody else's source code.
456 John Gilmore revised the code extensively, making it better than
458 Geoff Collyer found several inadequacies
459 and provided some magic file entries.
Contribution of the `&' operator by Rob McMahon, cudcv@warwick.ac.uk, 1989.
462 Guy Harris, guy@netapp.com, made many changes from 1993 to the present.
464 Primary development and maintenance from 1990 to the present by
465 Christos Zoulas (christos@astron.com).
467 Altered by Chris Lowth, chris@lowth.com, 2000:
470 option to output mime type strings, using an alternative
471 magic file and internal logic.
473 Altered by Eric Fischer (enf@pobox.com), July, 2000,
474 to identify character codes and attempt to identify the languages
477 Altered by Reuben Thomas (rrt@sc3d.org), 2007 to 2008, to improve MIME
478 support and merge MIME and non-MIME magic, support directories as well
479 as files of magic, apply many bug fixes and improve the build system.
481 The list of contributors to the
483 directory (magic files)
484 is too long to include here.
485 You know who you are; thank you.
486 Many contributors are listed in the source files.
488 Copyright (c) Ian F. Darwin, Toronto, Canada, 1986-1999.
489 Covered by the standard Berkeley Software Distribution copyright; see the file
490 LEGAL.NOTICE in the source distribution.
496 were written by John Gilmore from his public-domain
498 program, and are not covered by the above license.
501 There must be a better way to automate the construction of the Magic
502 file from all the glop in Magdir.
uses several algorithms that favor speed over accuracy;
thus it can be misled about the contents of
511 The support for text files (primarily for programming languages)
is simplistic and inefficient, and requires recompilation to update.
514 The list of keywords in
516 probably belongs in the Magic file.
517 This could be done by using some keyword like
519 for the offset value.
521 Complain about conflicts in the magic file entries.
522 Make a rule that the magic entries sort based on file offset rather
523 than position within the magic file?
525 The program should provide a way to give an estimate
529 We end up removing guesses (e.g.
531 as first 5 chars of file) because
532 they are not as good as other guesses (e.g.
537 Still, if the others don't pan out, it should be possible to use the
540 This manual page, and particularly this section, is too long.
543 returns 0 on success, and non-zero on error.
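.Pp
A calling program can rely on this exit status.
The following Python sketch is one possible wrapper, offered as an
assumption-laden example: the function name file_type and its command
parameter are hypothetical, and the sketch uses brief mode so that only
the type description is returned.

```python
import subprocess


def file_type(path, command=("file", "-b")):
    """Run the command on path and return its output.

    Raises RuntimeError if the command exits with a non-zero status.
    The command parameter is a hypothetical hook for this sketch; by
    default it runs file in brief mode.
    """
    result = subprocess.run(
        list(command) + [path], capture_output=True, text=True
    )
    if result.returncode != 0:
        raise RuntimeError(
            "command exited with status %d" % result.returncode
        )
    return result.stdout.strip()
```

Treating any non-zero status as an error follows the rule stated above;
the sketch makes no attempt to interpret specific status values.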
545 You can obtain the original author's latest version by anonymous FTP
549 .Dv /pub/file/file-X.YZ.tar.gz