Sort files into folders based on first two digits of filename

Tags: applescript, automator, file, file-transfer, finder

I constantly get hundreds of .pdf files I would like sorted into folders based on the filename.

The filenames are made up of a couple of elements, although only the first string of letters and the first two digits are relevant. The letters represent the customer code and the digits represent the year.

Here are two examples:

  • TX204190_100_GR.pdf. TX is the customer code and 20 represents 2020.
  • SFLYMK220921_CR2050_BLKHTH.pdf. SFLYMK is the customer code and 22 represents 2022.

The PDFs start off as direct siblings to the year folders. I need these PDFs to go inside the correct year folder and then the correct customer folder. So for example, TX204190_100_GR.pdf would need to go inside 2020 and then go inside the TX folder. The folders will all already exist.

Before Sorting Example

After Sorting Example

I'm struggling to find a way to select only the customer code (which can be anywhere from 2 to 8 characters) and the following two digits with Automator so that I can properly move the files. I'd assume it requires some kind of regex solution, but I don't know where to begin, or even whether that's possible using only Automator. Any advice or help with this is greatly appreciated.

Best Answer

Here is an example Automator workflow that you might find helpful.

Example shell script code for the Run Shell Script action:

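# Base directory that contains the year folders (2020, 2022, ...).
d="$HOME/Documents/Orders"

for f in "$@"; do
    # ${f##*/} strips the directory path, leaving only the filename.
    if [[ ${f##*/} =~ ([[:alpha:]]{2,8})([[:digit:]]{2})(.*) ]]; then
        c="$match[1]"    # customer code, e.g. TX
        y="$match[2]"    # two-digit year, e.g. 20
        [ ! -d "$d/20$y/$c/" ] && mkdir -p "$d/20$y/$c/"
        # -n: do not overwrite an existing file of the same name.
        mv -n "$f" "$d/20$y/$c/"
    fi
done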

Notes:

Change the value of d to the fully qualified pathname of the directory that contains the year folders, e.g. the Orders directory.

In this example Automator workflow the settings for the Run Shell Script action are: Shell: [/bin/zsh] and Pass input: [as arguments].



Notes:

As I said, "Here is an example Automator workflow ..." and it is just that, an example. The main part is the shell script code in the Run Shell Script action, as it does the real work! However you have the workflow pass the list of qualifying files to the Run Shell Script action, and with the code modified as necessary to suit how the files are passed, you should be in business. Any files not matching the regex are ignored.

Addressing a comment: mkdir -p "$d/20$y/$c/" does not fail when the directory already exists, so the test in front of it is not strictly required; it is a coding force of habit, if you will, used in similar paradigms. You can omit the [ ! -d "$d/20$y/$c/" ] && portion of that command if you so choose. That said, I'd leave it in, as [ is a zsh shell built-in command and testing for the directory's existence incurs far less overhead than needlessly executing the external mkdir command every time the target directory already exists!
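
To illustrate the point about mkdir -p, here is a minimal sketch that runs in a scratch directory. The tmp, target, first and second variable names are made up for this demonstration and are not part of the workflow.

```shell
# Scratch directory so the demonstration touches nothing real.
tmp="$(mktemp -d)"
target="$tmp/2020/TX"

mkdir -p "$target"; first=$?     # creates both 2020 and 2020/TX
mkdir -p "$target"; second=$?    # directory already exists, still succeeds

# The guarded form from the script: mkdir is only run when the directory is missing.
[ ! -d "$target" ] && mkdir -p "$target"

echo "first=$first second=$second"

# Clean up the scratch directory.
rm -rf "$tmp"
```

Both calls should report an exit status of 0, which is why the guard is an optimization rather than a requirement.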



A different Automator workflow

As an example, the Automator workflow could consist of just the Run Shell Script action with the following example shell script code:

d="$HOME/Documents/Orders"

cd "$d" || exit

[ -z "$(ls *.[pP][dD][fF])" ] && exit

for f in *.[pP][dD][fF]; do
    if [[ $f =~ ([[:alpha:]]{2,8})([[:digit:]]{2})(.*) ]]; then
        c="$match[1]"
        y="$match[2]"
        [ ! -d "$d/20$y/$c/" ] && mkdir -p "$d/20$y/$c/"
        mv -n "$f" "$d/20$y/$c/"
    fi
done

Notes:

Change the value of d to the fully qualified pathname of the directory that contains both the PDF files and the year folders, e.g. the Orders directory.

In this example Automator workflow the settings for the Run Shell Script action are: Shell: [/bin/zsh] and Pass input: [to stdin].


What the example shell script code does:

  • d="$HOME/Documents/Orders" -- Sets the value of the d variable to the target location of the files to be processed.
  • cd "$d" || exit -- Changes directory to the target directory or exits the script.
  • [ -z "$(ls *.[pP][dD][fF])" ] && exit -- Tests for the existence of the target PDF files and exits the script if none exist at the target location. This test is necessary because zsh raises a "no matches found" error for an unmatched glob in the for loop, which would make the Automator workflow throw an error if it is run when no target PDF files exist.
  • for f in *.[pP][dD][fF]; do -- For each target PDF file do something.
  • if [[ $f =~ ([[:alpha:]]{2,8})([[:digit:]]{2})(.*) ]]; then -- If the target file matches the regex, then do something.
  • c="$match[1]" -- Sets the value of the c variable to the Customer Code, i.e. the value of the first Capturing Group of the regex.
  • y="$match[2]" -- Sets the value of the y variable to the Year, i.e. the value of the second Capturing Group of the regex.
  • [ ! -d "$d/20$y/$c/" ] && mkdir -p "$d/20$y/$c/" -- If the target directory doesn't exist, then create it.
  • mv -n "$f" "$d/20$y/$c/" -- Move the target file to the target directory.
  • fi -- Closes the if statement block.
  • done -- Closes the for loop.
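
As a possible alternative to the ls-based existence test, the check can be sketched with find, which prints no glob error when nothing matches. This is only a sketch: has_pdfs is a hypothetical helper name, and it assumes a find that supports -maxdepth and -iname (the BSD find on macOS and GNU find both do).

```shell
# Hypothetical helper: succeed if directory $1 directly contains at least one
# regular file whose name ends in .pdf (any letter case).
has_pdfs() {
    [ -n "$(find "$1" -maxdepth 1 -type f -iname '*.pdf' 2>/dev/null | head -n 1)" ]
}

d="$HOME/Documents/Orders"   # same example location as in the script

if has_pdfs "$d"; then
    echo "PDF files found in $d"
else
    echo "No PDF files in $d, nothing to do"
fi
```

In the workflow, the else branch would be the place to exit before the for loop is reached.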

Explanation of the Regex:

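  • ([[:alpha:]]{2,8}) -- First Capturing Group. Matches 2 to 8 alphabetic characters: the Customer Code.
  • ([[:digit:]]{2}) -- Second Capturing Group. Matches exactly two digits: the Year.
  • (.*) -- Third Capturing Group. Matches the rest of the filename, which the script does not use.

The pattern is not anchored, so it matches the first place in the filename where 2 to 8 letters are directly followed by two digits.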
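
To try the pattern on sample names without moving anything, here is a minimal sketch using sed -E instead of the zsh =~ operator, so it runs in bash as well as zsh. dest_for is a hypothetical helper name, and the leading ^ anchor is an assumption added here (the regex in the scripts above is unanchored) so that only names beginning with the customer code match.

```shell
# Hypothetical helper: print the "year/customer" folder a filename maps to,
# or print nothing if the name does not match.
dest_for() {
    # ${1##*/} strips any leading directory path, as in the first script.
    printf '%s\n' "${1##*/}" |
        sed -nE 's|^([[:alpha:]]{2,8})([[:digit:]]{2}).*|20\2/\1|p'
}

dest_for "TX204190_100_GR.pdf"             # prints 2020/TX
dest_for "SFLYMK220921_CR2050_BLKHTH.pdf"  # prints 2022/SFLYMK
dest_for "2020_summary.pdf"                # prints nothing: no leading letters
```

The first two names are the examples from the question; the third is a made-up name showing a file that would be ignored.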