I have a folder of 6000+ PDF files (chapters, articles, etc.). I'm trying to weed out/sort those that I've just downloaded but never annotated. Is there a way to do this? Those PDFs that I've never annotated usually have the same "created" and "modified" dates, so I was thinking those criteria could be used (i.e., look for files whose modified date is later than/not the same as the created date), but I have no idea how to do that.
In other words, I need to be able to find any PDF on my computer that has been modified.
Thank you for any help!
Best Answer
Per info in the OP and comments, this will do as you asked.
In Automator:
Add a Run AppleScript action.
Replace the default code with the example AppleScript code shown further below.
Note: If Skim is not in the /Applications folder, then modify the value of the skimpdfPathFilename variable accordingly. You should not need to modify anything else unless you want to set the value of the offsetInSeconds variable, e.g. set offsetInSeconds to 60, to a different value. This variable is used to help find the files that really have been modified since they were created. The gap between the creation date and modification date when a file is first created can range from 0 seconds to a higher value, and it is not consistent because it depends on how the file was created. Make adjustments as you see fit for your use case.
What the workflow and example AppleScript code do:
[...] Run AppleScript action.
[...] creation date, per the value of the offsetInSeconds variable.
[...] repeat loop. Files meeting the criteria are stored in modifiedFilesList to be used in the next repeat loop.
[...] xattr to get the extended attributes of the target files. If a file has the target extended attributes, a flag is set to true; if not, it is set to false. The files flagged as true go into annotatedSkimFilesList to be used in the next repeat loop.
[...] skimpdf utility within Skim on the files in annotatedSkimFilesList, annotations are embedded in place. Thus there is no need to export to a second file, then delete the original and replace it.
NOTE: While I have tested this and it works without issue for me, do not run this until you are sure you have a proper backup! You should also test the workflow on a small sample of copied files placed outside of the actual search folder that the workflow will be run on once testing is done.
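As a rough illustration of the date filter described above, here is a minimal sketch in Python rather than AppleScript. It is not the workflow's own code. The names was_modified and find_modified_pdfs are hypothetical, the strict "greater than" comparison is an assumption, and st_birthtime (the creation time) is only available on platforms such as macOS; files without it are skipped.

```python
from __future__ import annotations

from pathlib import Path

# Mirrors the example value given for the answer's offsetInSeconds variable.
OFFSET_IN_SECONDS = 60


def was_modified(created: float, modified: float,
                 offset: float = OFFSET_IN_SECONDS) -> bool:
    """Return True when `modified` is more than `offset` seconds after `created`.

    Assumption: a strict comparison, so identical dates never count as modified.
    """
    return (modified - created) > offset


def find_modified_pdfs(folder: str,
                       offset: float = OFFSET_IN_SECONDS) -> list[Path]:
    """Collect PDFs under `folder` whose modification date is later than
    their creation date by more than `offset` seconds."""
    results = []
    for path in sorted(Path(folder).rglob("*")):
        if not path.is_file() or path.suffix.lower() != ".pdf":
            continue
        info = path.stat()
        # st_birthtime is platform-specific (present on macOS).
        created = getattr(info, "st_birthtime", None)
        if created is None:
            continue  # no creation time available, so the file cannot be judged
        if was_modified(created, info.st_mtime, offset):
            results.append(path)
    return results
```

Calling find_modified_pdfs on a copy of the folder first, in line with the backup advice above, gives a list to review before anything is changed. A larger offset reduces false positives from files whose two dates differ slightly at download time.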
Example AppleScript code:
Understanding the do shell script command in the second repeat loop:
When a PDF is annotated in Skim and saved, extended attributes are set on the file, e.g.:
The output is piped | to:
Which tests the output of grep, counting the occurrences of the pattern. If grep finds one or more occurrences of the pattern, the value of the withNotes variable is set to true; otherwise it is set to false.
Note that Skim does have a built-in command line utility, e.g. /Applications/Skim.app/Contents/SharedSupport/skimnotes, that can be used to test if a PDF has annotations made in Skim. However, because of its output, this utility is better used in a shell script run in Terminal than in a do shell script command, which is why I used xattr and grep instead.
Note: The example AppleScript code above is just that, and does not include any error handling as may be appropriate/needed/wanted. The onus is upon the user to add any appropriate error handling for any example code presented and/or code written by oneself.
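The xattr-and-grep test explained above can be sketched in Python as follows. This is an illustration, not the answer's do shell script command. The attribute prefix net_sourceforge_skim-app is an assumption about how Skim names its extended attributes, and has_skim_notes and file_has_skim_notes are hypothetical helper names. The xattr command is only present on macOS.

```python
import subprocess

# Assumption: Skim stores its notes in extended attributes whose names
# start with this prefix. Adjust it if your files show different names.
SKIM_XATTR_PREFIX = "net_sourceforge_skim-app"


def has_skim_notes(xattr_output: str,
                   prefix: str = SKIM_XATTR_PREFIX) -> bool:
    """Return True if any attribute name listed in `xattr_output`
    (one name per line) starts with `prefix`."""
    return any(line.strip().startswith(prefix)
               for line in xattr_output.splitlines())


def file_has_skim_notes(path: str) -> bool:
    """Run `xattr <path>` and check the listed names (macOS only)."""
    result = subprocess.run(["xattr", path],
                            capture_output=True, text=True, check=False)
    return result.returncode == 0 and has_skim_notes(result.stdout)
```

Keeping the text check separate from the subprocess call means the matching logic can be tried on sample output without touching any real files.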