Shell scripting: Select folder based on part of file name

My project

I'm creating a bash shell script to execute from the Terminal. Its purpose is to archive lots and lots of project folders. Each folder follows a prescribed nomenclature: [YYYY.MM.DD] - Medium - Client - Project name - details--details - JobNumber. For example: [2006.02.01] - Print - Development - Appeal I - Kids Art Show Insert - D0601-11. These projects are currently all in one folder. I want to sort them into folders by Client name. There are 7 (internal) clients, so I'm using the following shell script:

#!/bin/bash

# Go to the Completed Projects folder.
cd /Volumes/communications/Projects/Completed\ Projects/

# Find a folder with a specified string (e.g. "Academics") in its name.
# Move (not copy) the folder to its corresponding sub-folder of the Archived Projects folder. (e.g. /Academics)

for folder in *; do
    if [[ -d "$folder" ]]; then
        if [[ "$folder" == *Academics* ]]; then
            echo "Archiving $folder to Archived Projects → Academics..."
            mv "$folder" /Volumes/communications/Projects/Archived\ Projects/Academics/
        elif [[ "$folder" == *Admissions* ]]; then
            echo "Archiving $folder to Archived Projects → Admissions..."
            mv "$folder" /Volumes/communications/Projects/Archived\ Projects/Admissions/
        elif [[ "$folder" == *Alumni* ]]; then
            echo "Archiving $folder to Archived Projects → Alumni..."
            mv "$folder" /Volumes/communications/Projects/Archived\ Projects/Alumni/
        elif [[ "$folder" == *Communications* ]]; then
            echo "Archiving $folder to Archived Projects → Communications..."
            mv "$folder" /Volumes/communications/Projects/Archived\ Projects/Communications/
        elif [[ "$folder" == *Development* ]]; then
            echo "Archiving $folder to Archived Projects → Development..."
            mv "$folder" /Volumes/communications/Projects/Archived\ Projects/Development/
        elif [[ "$folder" == *President* ]]; then
            echo "Archiving $folder to Archived Projects → President..."
            mv "$folder" /Volumes/communications/Projects/Archived\ Projects/President/
        elif [[ "$folder" == *Student\ Life* ]]; then
            echo "Archiving $folder to Archived Projects → Student Life..."
            mv "$folder" /Volumes/communications/Projects/Archived\ Projects/Student\ Life/
        else # Folders that don't match any client prompt the user to move them by hand.
            echo "$folder does not have a Department name. Move it by hand."
        fi
    fi
done

My problem

My script would mis-parse and mis-file a project named [2006.03.01] - Print - Development - Academics and Accreditation - D0601-08. It would match "Academics" before it ever got to the conditional for the client "Development". As a result, the project would be filed into "Academics". And I'd have to pick it back out by hand!

My system's advantage

My colleagues and I have been scrupulous about our nomenclature (described above). I know that the Client name falls in between the 2nd and 3rd hyphens.

My question

How can I leverage my system's advantage to solve my problem? I want this script to match only the part of the folder name that comes after the first two hyphens and before the third hyphen, i.e., I only want this script to search the Client "field" in the folder name. I keep thinking "regular expressions" but have no idea how to implement them.

Note: I prefer for a solution to augment my current script, rather than replace it. I arrived at it via @patrix on this site and his idea circumvented some errors.

Best Answer

There are several ways to get this done in bash and friends (you could really knock yourself out using sed or awk). A rather simple way is to use cut to get the name of the target folder:

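if [[ -d "$folder" ]]; then
    # Field 3 (split on "-") is the Client; the nested echo trims the surrounding spaces.
    target=$(echo $(echo "$folder" | cut -d- -f 3))
    echo "Archiving $folder to Archived Projects → $target..."
    # Quote the destination so clients with spaces (e.g. "Student Life") work.
    mv "$folder" "/Volumes/communications/Projects/Archived Projects/$target/"
fi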

The $(echo $(echo ... )) is a lazy approach to get rid of the leading/trailing space (because cut doesn't support multi-char delimiters).
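If the multi-character delimiter is the sticking point, awk accepts one as its field separator, which makes the trimming step unnecessary. A minimal sketch, assuming every folder name separates its fields with exactly " - " (space, hyphen, space); the function name client_field is made up for illustration:

```shell
#!/bin/bash

# Hypothetical helper: print the third field of a folder name,
# splitting on " - " (space, hyphen, space) instead of a bare hyphen.
client_field() {
    printf '%s\n' "$1" | awk -F ' - ' '{ print $3 }'
}

client_field "[2006.03.01] - Print - Development - Academics and Accreditation - D0601-08"
```

Because the separator includes the surrounding spaces, a job number such as D0601-08 is not split, and the result can be assigned with target=$(client_field "$folder").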


If you want to knock yourself out with sed you can use

    target=$(echo "$folder" | sed -n 's/^[^\-]*-[^\-]*- \([^\-]*\) -.*/\1/p')

instead of cut. This only works if the target folder name doesn't contain a - itself.
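Since the question mentions regular expressions, the same extraction can be sketched with bash's own [[ =~ ]] operator, which avoids the extra process. This assumes the "[YYYY.MM.DD] - Medium - Client - ..." layout from the question and, like the sed version, a Client name without a hyphen; the names re and client_from_name are illustrative only:

```shell
#!/bin/bash

# Assumed layout: "[YYYY.MM.DD] - Medium - Client - ...".
# The single capture group is the Client field.
re='^\[[0-9]{4}\.[0-9]{2}\.[0-9]{2}\] - [^-]+ - ([^-]+) - '

# Hypothetical helper: print the Client field, or return 1
# when the folder name does not follow the layout.
client_from_name() {
    if [[ $1 =~ $re ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}"
    else
        return 1
    fi
}

client_from_name "[2006.02.01] - Print - Development - Appeal I - Kids Art Show Insert - D0601-11"
```

Keeping the pattern in a variable and leaving it unquoted on the right-hand side of =~ is what makes bash treat it as a regular expression rather than a literal string.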


Instead of pattern matching you could also use a shell function to encapsulate most of the complexity.

#!/bin/bash

function checkAndMove() {
    # $1 = folder name, $2 = client name to look for
    if [[ "$1" == *"$2"* ]]; then
        echo "Archiving $1 to Archived Projects → $2..."
        mv "$1" "/Volumes/communications/Projects/Archived Projects/$2/"
    fi
}

cd /Volumes/communications/Projects/Completed\ Projects/

for folder in *; do
    if [[ -d "$folder" ]]; then
        checkAndMove "$folder" Academics
        checkAndMove "$folder" Admissions
        ...
    fi
done
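To keep the "move it by hand" fallback from the question while looking only at the Client field, one option is to isolate that field with parameter expansion and check it against the seven known clients in a case statement. The following is a dry-run sketch under that assumption: client_for is a hypothetical helper, the sample names are for illustration, and it prints where a real script would call mv.

```shell
#!/bin/bash

# Hypothetical helper: print the client a folder belongs to,
# judged only by the third " - "-separated field of its name.
client_for() {
    local field=${1#* - }       # drop the date field
    field=${field#* - }         # drop the medium field
    field=${field%% - *}        # cut at the next " - "
    case $field in
        Academics|Admissions|Alumni|Communications|Development|President|"Student Life")
            printf '%s\n' "$field" ;;
        *)
            return 1 ;;         # not one of the seven known clients
    esac
}

# Dry run on sample names; a real script would call mv where this prints.
for folder in \
    "[2006.03.01] - Print - Development - Academics and Accreditation - D0601-08" \
    "Untitled folder"; do
    if client=$(client_for "$folder"); then
        echo "Would archive $folder to Archived Projects/$client/"
    else
        echo "$folder does not have a Department name. Move it by hand."
    fi
done
```

Validating against a fixed list means a misnamed folder is reported instead of being moved into a directory that may not exist.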