I want to build a folder action that cleans up the filenames of my downloaded files.
For example, Youtube_MyVideofile_(1080p_30fps_H264-128kbit_AAC).mp4 should have "Youtube", "30fps", "128kbit", "AAC", "(" and ")" stripped, and "_" should be replaced with a space. So the result would be MyVideofile 1080p H264.mp4
I know I could do this with Automator, but then I would have to set up a "search/replace" element for every word. I'd rather use a single list of words, which would be easier to maintain: I get files from a lot of different sources on a regular basis, so the actual list of words to be removed will be very long and may be updated from time to time.
I found this question, "Automator or AppleScript to Remove Multiple Strings from File Names?", which is similar, but it only worked with selected folders. Instead I want to set it up so it works automatically as a folder action.
I guess I therefore also need a whitelist of file extensions that the script won't touch, such as ".download" for Safari downloads that are still in progress.
Best Answer
Using Automator, in macOS Sierra 10.12.5, I created a Folder Action with a single Run AppleScript action, using the AppleScript code below, and set it to run on my Downloads folder. (It has also been tested and works on OS X 10.8.5 and OS X 10.11.6.)
Change the set theBlackWhiteList to POSIX path of ... line of code as necessary. Read the comments, included with the code, for what's necessary to use this code in the Folder Action.
To test the Folder Action, open Terminal and run
cd Downloads
then create the test file with
touch 'Youtube_MyVideofile_(1080p_30fps_H264-128kbit_AAC).mp4'
which will create a zero-length file that will be processed by the Folder Action and renamed to MyVideofile 1080p H264.mp4, as shown in Downloads in Finder or in Terminal with:
ls -l My*.mp4
AppleScript code:
Example contents of the plain text data file used by the Folder Action:
The logic behind the renaming process:
Using the variable theStringsToRemoveList, which starts with a single space character followed by the comma-delimiter, in conjunction with the underscore character as the text item delimiter, turns all spaces, along with all other strings to be removed, into underscores during the AppleScript text items and text item delimiters portion of the code.

This is done so sed can be used to replace all consecutive underscore characters with a single underscore character, then remove the leading underscore, if it exists, followed by an underscore preceding the dot before the filename extension, if it exists, and finally all remaining single underscore characters are replaced with a single space character.

set theFileName to - The variable theFileName will contain the output of the do shell script command.
do shell script "_command_" - Runs the command in a shell.
printf " & quoted form of theFileName & " | - Prints the value of the variable theFileName, and pipes | it to the sed command.
sed -E -e 's/[_]{2,}/_/g' -e 's/^_//' -e 's/_\\./\\./g' -e 's/_/ /g'
  sed - Stream EDitor.
  -E - Interpret regular expressions as extended (modern) regular expressions rather than basic regular expressions (BRE’s). The re_format(7) manual page fully describes both formats.
  -e command - Append the editing commands specified by the command argument to the list of commands.
  s/[_]{2,}/_/g
    s - Substitute command.
    [_] - Match a single character present in the list, matches the character _ literally (case sensitive).
    {2,} - Quantifier: matches between 2 and unlimited times, as many times as possible, giving back as needed (greedy).
    /_/ - Replaces the matched pattern with a single character _ literally (case sensitive).
    g - Global pattern flag, matches all occurrences of the pattern (doesn't return after first match).
  s/^_//
    ^ - Asserts position at start of the string.
    _ - Matches the character _ literally (case sensitive).
    // - Replaces the matched pattern with literally nothing.
  s/_\\./\\./g
    _ - Matches the character _ literally (case sensitive).
    \\. - Matches the character . literally (case sensitive).
    /\\./ - Replaces the matched pattern with the character . literally (case sensitive).
    \\ - The double backslash is necessary when used in a do shell script command; however, from the command line a single backslash \ would be used to make the character that follows a literal . character, in this case.
  s/_/ /g - Replaces each remaining character _ literally, with a space character literally (case sensitive).
Note that the info above is abbreviated in places; however, it should provide a bit of an understanding of what's happening.
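To see the two stages described above in isolation, here is a minimal POSIX sh sketch. It is not the AppleScript from this answer: the function names (to_underscore, tidy_underscores, clean_name) and the inline list of strings are assumptions made for illustration, and the list includes "-" because the expected result requires removing it. Since this runs directly in a shell rather than through do shell script, the dot is escaped with a single backslash, and printf '%s' is used so that a "%" in a filename is not treated as a format specifier.

```shell
#!/bin/sh
# Sketch only: mimics the "replace every listed string with an underscore,
# then tidy up with sed" logic. All names here are illustrative assumptions.

# Replace every literal occurrence of $2 in $1 with an underscore.
to_underscore() {
    rest=$1
    out=
    # An empty search string would loop forever, so return the input as-is.
    [ -n "$2" ] || { printf '%s' "$rest"; return; }
    while :; do
        case $rest in
            *"$2"*)
                out=$out${rest%%"$2"*}_   # text before the first match, plus "_"
                rest=${rest#*"$2"}        # text after the first match
                ;;
            *) break ;;
        esac
    done
    printf '%s' "$out$rest"
}

# Collapse runs of underscores, drop the leading one and the one just before
# a dot, then turn the remaining underscores into spaces.
tidy_underscores() {
    printf '%s' "$1" |
        sed -E -e 's/[_]{2,}/_/g' -e 's/^_//' -e 's/_\././g' -e 's/_/ /g'
}

# Apply both stages using a hypothetical list of strings to remove.
clean_name() {
    name=$1
    for s in ' ' 'Youtube' '30fps' '128kbit' 'AAC' '(' ')' '-'; do
        name=$(to_underscore "$name" "$s")
    done
    tidy_underscores "$name"
}

clean_name 'Youtube_MyVideofile_(1080p_30fps_H264-128kbit_AAC).mp4'; echo
```

After the first stage the example name becomes __MyVideofile__1080p___H264_____.mp4, which is why sed has to collapse the runs of underscores before converting what is left to spaces.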
On an added note, if you want to also ensure capitalization of each word in the filename, then replace the existing do shell script command with the do shell script command below, which has an added awk command that receives the output from sed to perform the capitalization. Note that I found this awk command on the Internet and tested that it works; however, I will not be adding an explanation of how it functions for lack of time.

Update to address .'s in the filename, per the comments.

In the plain text data file, on Line 2, add a ., after the leading space and its comma-delimiter. In other words, the first item in the list on Line 2 is a blank space followed by a comma-delimiter, followed by . followed by a comma-delimiter, and so on.

Add the following lines of code after the repeat loop that is directly before the comment starting with -- # Using the example filename in the OP. ... which is above the tell current application ... do shell script block of code.

By adding the ., to Line 2 in the plain text data file, all . in the filename are replaced with _ in the original code. Then with the extra lines of code above, it replaces e.g. _mp4 with .mp4, or . and whatever the actual filename extension is.

Now when it gets to the do shell script command there is only the . for the filename extension, and all the underscores are processed out of the name as they should be.

Obviously, the way the original code is coded, underscores cannot be a part of the final filename, and this modification to the original code doesn't change that.