MacOS – How to identify file type for a bulk lot of files and attach suitable extension to each

automator, file conversion, macos, terminal

I have just over a thousand files whose suffix/extension has been changed by a database engine to the same thing (e.g. fileName.abcd, where .abcd is on every file regardless of whether it is a JPG, a PDF, or anything else).

The goal is to convert all the files to PDF but I first just want to sort the various file types into separate folders.

I use Mac OS X and figure that there must be a Terminal script that would:

  1. Identify the file type (perhaps with the 'file' command?) and then…
  2. Append the appropriate extension which would then…
  3. Allow me to manually sort and place them into separate folders for further processing depending on their file type.

Running the Terminal 'file' command on a file with the database suffix gives quite wordy results like the following, which at least show that the Mac can identify the file types even though they have the 'wrong' suffixes:

  • JPEG image data, JFIF standard 1.01
  • PDF document, version 1.3
  • Rich Text Format data, version 1, ANSI
  • etc.

So, I only need a script that would tag the files so I can use Automator to rename the files later with the appropriate suffix.

I note that if I type 'file' in Terminal, drop or copy multiple files onto it, and hit return, Terminal correctly identifies them all in the same order, but that isn't useful unless it at least tags each file type differently.

I figure this initial task is too difficult for Automator but would love to be proven wrong.

Any assistance in doing this would be appreciated. I have checked this board and elsewhere expecting someone else to have had a similar issue but found no such problem listed anywhere.

Best Answer

This is rather easy when you combine some standard tools:

  • file for looking up the MIME type
  • tr to remove the slashes (otherwise, you'd have nested folders for different file groups)
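  • well, some obvious mkdir and mv commands

for file in *
do
    # Skip directories (e.g. type folders left over from an earlier run) and other non-regular files.
    [ -f "$file" ] || continue
    # Turn e.g. "image/jpeg" into "image_jpeg" so it can serve as a single folder name.
    mime=$(file --brief --mime-type -- "$file" | tr '/' '_')
    mkdir -p -- "$mime"
    mv -- "$file" "$mime/$file"
done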
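
Before moving anything, it may help to see which types are actually present in the folder. The following is only a minimal sketch and not part of the original answer: count_types is a hypothetical helper name, and it assumes the same file options used in the loop above.

```shell
# Hypothetical helper: tally the MIME types of the regular files in a directory.
# Nothing is moved or renamed; the function only reads the files.
count_types() {
    dir="${1:-.}"
    for f in "$dir"/*
    do
        # Ignore directories and anything else that is not a regular file.
        [ -f "$f" ] || continue
        file --brief --mime-type -- "$f"
    done | sort | uniq -c | sort -rn
}
```

Calling count_types /path/to/files prints one line per detected MIME type, preceded by the number of files of that type, with the most common type first. Any type that shows up there unexpectedly is worth checking by hand before running the sorting loop.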

Alternatively, you could map each expected MIME type to a file extension and rename the files accordingly:

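for file in *
do
    # Only rename regular files.
    [ -f "$file" ] || continue

    mime=$(file --brief --mime-type -- "$file")

    case "$mime" in
    "image/jpeg")
        extension="jpeg"
        ;;
    "text/plain")
        extension="txt"
        ;;
    "application/pdf")
        extension="pdf"
        ;;
    *)
        # Unknown type: leave the file untouched.
        continue
        ;;
    esac

    # Strip the old suffix (e.g. ".abcd") and attach the detected one.
    filename="${file%.*}"

    mv -- "$file" "$filename.$extension"
done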
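
Since a rename across a thousand files is tedious to undo, a dry run may be worth doing first. This is a sketch under stated assumptions rather than a finished tool: ext_for_mime and preview_renames are hypothetical names, and the text/rtf entry is an assumption about what file reports for the RTF documents mentioned in the question, so verify it against one of your own files.

```shell
# Hypothetical helper: map a MIME type to an extension.
# Prints nothing for types that are not listed.
ext_for_mime() {
    case "$1" in
        image/jpeg)      echo jpeg ;;
        text/plain)      echo txt ;;
        application/pdf) echo pdf ;;
        text/rtf)        echo rtf ;;   # assumption: check what file reports for your RTF files
    esac
}

# Hypothetical helper: print the planned renames without touching any file.
preview_renames() {
    dir="${1:-.}"
    for f in "$dir"/*
    do
        [ -f "$f" ] || continue
        ext=$(ext_for_mime "$(file --brief --mime-type -- "$f")")
        # Skip files whose type has no extension in the table above.
        [ -n "$ext" ] || continue
        base="${f##*/}"      # file name without the directory part
        stem="${base%.*}"    # file name without its old suffix
        printf '%s -> %s\n' "$f" "$dir/$stem.$ext"
    done
}
```

Running preview_renames /path/to/files lists each file next to the name it would receive. The directory part is split off before the old suffix is stripped, so a dot in the directory path is not mistaken for the start of an extension.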

Note that every variable expansion in both loops is double-quoted, so file names containing spaces are handled correctly and need no extra treatment. I have kept the scripts otherwise minimal for the sake of readability, as further safeguards would make them look much more complicated than they are.