MIME-Type – File Used by file(1) and libmagic to Determine MIME Types


According to man 5 magic:

"The file /usr/share/misc/magic specifies what patterns are to be tested for, what message or MIME type to print if a particular pattern is found, and additional information to extract from the file."

So I went looking for that file:

$ file /usr/share/misc/magic
/usr/share/misc/magic: symbolic link to `../file/magic'

$ ll /usr/share/file/magic
total 8
drwxr-xr-x 2 root root 4096 2011-08-08 13:52 ./
drwxr-xr-x 3 root root 4096 2011-10-12 07:27 ../

So the file named in the man page is in fact a symbolic link to an empty directory. Where is the real file on my Ubuntu 11.10 system?

The reason I want to look at it is that both the file --mime command and the Python magic module return the same incorrect MIME types for some files, and I'd like to see the format of that file so I can modify the relevant associations responsibly. Thanks.


Thanks to @Caesium for pointing me to the strace command. Piping the output from that to grep magic, I got the following output:

open("/usr/lib/libmagic.so.1", O_RDONLY) = 3
access("/home/phoenix/.magic", R_OK)    = -1 ENOENT (No such file or directory)
open("/etc/magic.mgc", O_RDONLY)        = -1 ENOENT (No such file or directory)
stat("/etc/magic", {st_mode=S_IFREG|0644, st_size=111, ...}) = 0
open("/etc/magic", O_RDONLY)            = 3
open("/usr/share/misc/magic.mgc", O_RDONLY) = 3

So it would seem that file first looks in /home/username/.magic, then /etc/magic.mgc, then /etc/magic, and finally in /usr/share/misc/magic.mgc to determine file types. This suggests that the proper place to add user-specific association rules is in /home/username/.magic, and system-wide rules in /etc/magic. I chose the latter option.
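The lookup order above can be checked on another machine with a small sketch like the following. It only tests which of the candidate paths exist and are readable; the paths are the ones from the strace output, while the function name readable_magic_files is hypothetical and other systems or libmagic versions may look elsewhere.

```python
import os

# Candidate locations, in the order they appear in the strace output above.
# "~/.magic" stands in for /home/username/.magic (assumption: current user).
CANDIDATES = [
    "~/.magic",
    "/etc/magic.mgc",
    "/etc/magic",
    "/usr/share/misc/magic.mgc",
]


def readable_magic_files(candidates=CANDIDATES):
    """Return the candidate paths that exist and are readable, in order."""
    found = []
    for path in candidates:
        full = os.path.expanduser(path)
        if os.access(full, os.R_OK):
            found.append(full)
    return found


if __name__ == "__main__":
    for path in readable_magic_files():
        print(path)
```

This does not reproduce how libmagic merges the files; it only shows which of them are present to be read.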

For the record, here are my additions to /etc/magic:

# python: file(1) magic for python modules and scripts
0 string """ a python script text executable
!:mime text/x-python
0 regex #!\ .*\ python a python script text executable
!:mime text/x-python
# pyc file: first four bytes are magic number
# which changes with each python version.
# this is for version 2.7.2:
0 belong 0x03f30d0a python compiled
!:mime application/x-python-bytecode

The man page for magic discourages the use of "regex" (for performance reasons), but I thought it would be the simplest option for me. I hope this helps others who run into the same problem. The files which are now detected as text/x-python were previously identified as text/x-java by libmagic, which seemed frankly ridiculous.
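To make the intent of the three rules concrete, here is a rough Python approximation of what they test. It is a sketch, not how libmagic evaluates rules: the helper name guess_python_mime is hypothetical, and the shebang pattern is a loosened stand-in for the regex in the rule, not a copy of it.

```python
import re
import struct

# Magic number for Python 2.7.2 .pyc files, taken from the rule above.
PYC_MAGIC_2_7_2 = 0x03F30D0A

# Loosened approximation of the shebang rule (assumption, not the exact regex).
SHEBANG = re.compile(rb"#!.*python")


def guess_python_mime(data):
    """Return a MIME type for Python-looking bytes, or None if nothing matches."""
    # "belong" in magic(5) is a 4-byte big-endian integer at the given offset.
    if len(data) >= 4 and struct.unpack(">I", data[:4])[0] == PYC_MAGIC_2_7_2:
        return "application/x-python-bytecode"
    first_line = data.split(b"\n", 1)[0]
    # Either a leading docstring or a shebang line that mentions python.
    if data.startswith(b'"""') or SHEBANG.match(first_line):
        return "text/x-python"
    return None


if __name__ == "__main__":
    print(guess_python_mime(b"#!/usr/bin/env python\nprint('hi')\n"))
```

Note that the docstring test is broad: any file that begins with three double quotes is reported as Python, which is the same trade-off the /etc/magic rule makes.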

Best Answer

You were almost there; it's in /usr/share/file/magic.mgc:

$ file /usr/share/file/magic.mgc
/usr/share/file/magic.mgc: magic binary file for file(1) cmd (version 7) (little endian)

As a slight aside, I found this just by looking around a bit, but you can confirm that file really uses it via strace:

$ strace file /
<snip lots of output>
open("/usr/share/misc/magic.mgc", O_RDONLY) = 3
<snip a bit more output>

/usr/share/misc/magic.mgc is just yet another symlink. I guess the manpage is out of date.
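The symlinks mentioned here can also be followed one hop at a time, which is handy when a link points at another link or at a relative target such as ../file/magic. The following is a minimal sketch; the function name symlink_hops and the hop limit are assumptions made for illustration.

```python
import os


def symlink_hops(path, max_hops=32):
    """Return the list of paths visited while following symlinks from path."""
    hops = [path]
    for _ in range(max_hops):  # guard against symlink loops
        if not os.path.islink(path):
            break
        target = os.readlink(path)
        # Relative targets are resolved against the directory holding the link.
        path = os.path.normpath(os.path.join(os.path.dirname(path), target))
        hops.append(path)
    return hops


if __name__ == "__main__":
    for start in ("/usr/share/misc/magic", "/usr/share/misc/magic.mgc"):
        print(" -> ".join(symlink_hops(start)))
```

On a system laid out like the one in the question, the last entry of each chain is where the data actually lives; on other systems the paths may simply not be symlinks, in which case each chain has a single entry.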
