More sophisticated file command for deep inspection

file format, file-command, files

Sometimes the standard file command (5.04 on my Ubuntu system) does not seem sophisticated enough (or I am just using it wrong, which could well be the case).

For example, when I run it on an .exe file that I am fairly sure contains an archive, I would expect output like this:

$ improved-file foo.exe
foo.exe: PE32 executable for MS Windows (GUI) Intel 80386 32-bit
         .zip archive included (just use unzip to extract)

Other issues:

  • It doesn't detect concatenations of different formats
  • It doesn't recognize common container-based formats, e.g. .epub, which is just a .zip container with some standardized .xml files etc. inside (file reports 'data')

I have an example of such an .exe file containing an archive: I guessed some archive formats and tried the corresponding unpack commands by trial and error, which worked in the end, but I would much prefer a workflow based on automatic inspection.

Best Answer

I can't think of an all-in-one tool, but there are programs that can cope with a large array of files of a given category.

For example, p7zip recognizes a large number of archive formats, so if you suspect that a file is an archive, try running 7z l on it.

$ 7z l ta12b563enu.exe
…
Type = Cab
Method = MSZip
…

If you suspect that a file is an image, try ImageMagick.

$ identify keyboard.jpg.gz
keyboard.jpg.gz=>/tmp/magick-XXV8aR5R JPEG 639x426 639x426+0+0 8-bit DirectClass 37.5KB 0.000u 0:00.000

For audio or video files, try mplayer -identify -frames 0.
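The tools above can be chained so that one command shows what each of them makes of a file. This is only a rough sketch: the function names probe and run_tool are made up for illustration, the commands are called with the same arguments as above, and it assumes a non-zero exit status means "not recognized", which may not hold for every tool or version.

```shell
#!/bin/sh
# Hypothetical helpers (not part of any existing tool).

# run_tool LABEL COMMAND [ARGS...]
# Print a header, then the command's output, or a note if the command
# is missing or exits with a non-zero status.
run_tool() {
    label=$1; shift
    printf '== %s ==\n' "$label"
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "(not installed)"
    elif ! "$@" 2>/dev/null </dev/null; then
        echo "(not recognized)"
    fi
}

# probe FILE
# Try the general-purpose detector first, then the specialised ones.
probe() {
    f=$1
    if [ ! -r "$f" ]; then
        echo "probe: cannot read $f" >&2
        return 1
    fi
    run_tool "file"     file -- "$f"
    run_tool "7z"       7z l "$f"
    run_tool "identify" identify "$f"
    run_tool "mplayer"  mplayer -identify -frames 0 "$f"
}
```

Running probe foo.exe then prints one section per tool. The output is deliberately left unfiltered, since what counts as a useful line differs from tool to tool.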

If you find a file that file can't identify, you might make a feature request to the author of your magic library.
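In the meantime, a format that is a thin wrapper around a container can sometimes be checked by hand. As an illustration for the .epub case from the question, here is a minimal sketch. It assumes the layout the EPUB container format requires: the first ZIP entry is named mimetype and is stored uncompressed with no extra field, so its name and content sit right after the 30-byte local file header. The function name is_epub is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: succeed if FILE looks like an EPUB container.
is_epub() {
    # A ZIP local file header starts with the two bytes "PK".
    magic=$(dd if="$1" bs=1 count=2 2>/dev/null | tr -d '\000')
    [ "$magic" = "PK" ] || return 1
    # Bytes 30-57: entry name "mimetype" followed directly by its
    # content "application/epub+zip" (8 + 20 = 28 bytes).
    # NUL bytes are dropped because shell variables cannot hold them;
    # a genuine signature contains none, so the comparison is unaffected.
    sig=$(dd if="$1" bs=1 skip=30 count=28 2>/dev/null | tr -d '\000')
    [ "$sig" = "mimetypeapplication/epub+zip" ]
}
```

For example, is_epub book.epub && echo "looks like an EPUB". A file that fails this check may still be a valid ZIP archive, so 7z l remains the more general test.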
