Automator PDF text to Spoken Audio file: What’s wrong with this script


I have pages of PDF notes which I want converted to spoken audio to aid revision.

There are several suggested workflows online, but I'm still not getting the results I want.

I used this model from Macworld hints (see comment from chrischram).

An RTF file in TextEdit gets converted happily, but when I try to add a PDF, I get errors.

If I select the text myself, the workflow functions perfectly… What have I missed?

Current workflow (translated from German):

Get finder items
Extract PDF text (as RTF, save to folder)
Get TextEdit document contents
Convert Text to Audio, Save
Import data to iTunes
Add imported data to Playlist

For any help, I'd be grateful!

Best Answer

There might be nothing wrong with the script, but rather with the way the PDF is constructed. Text extraction from PDF is much trickier than simply reading in an RTF file. How has the PDF been created?

For lack of a better solution, I’d simply open the PDF in Preview.app, select all (⌘+A), copy (⌘+C), open a new text document in your text editor of choice, paste (⌘+V), save the file, then use that as the text source instead of the PDF file.
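Once the text is saved, the macOS say command can read it straight from the file. A minimal sketch, assuming the pasted text was saved as notes.txt (a placeholder name, as is the speak_file helper); the empty-file check is there because a PDF without a text layer would leave nothing to paste:

```shell
# Sketch: turn a saved plain-text file into spoken audio with macOS `say`.
# speak_file, notes.txt and notes.aiff are illustrative names, not part of the workflow above.
speak_file() {
    src=$1
    dest=$2
    # -s is true only for an existing, non-empty file
    if [ ! -s "$src" ]; then
        echo "no text found in $src" >&2
        return 1
    fi
    # -f reads the text from a file, -o writes the audio to a file
    say -f "$src" -o "$dest"
}
```

Usage would be something like: speak_file notes.txt notes.aiff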

Related Solutions

MacOS – An Automator workflow for extracting text as speech from PDF files

While Automator is pretty useful, I personally find the Terminal / command line a nice place to solve problems like this.

The basic idea is still to use the steps you describe, but to do all the work from the command line. I looked for a PDF-to-text converter and found PDFMiner quite useful. If you can get it to run, half of your work is done!

pip install pdfminer

Steps one and two can then be solved with this one-liner in Terminal:

pdf2txt.py example.pdf | say -v Daniel -o example.aiff
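For pages of notes spread over several PDFs, the one-liner could be wrapped in a small loop. This is only a sketch: it assumes pdf2txt.py is on the PATH and the Daniel voice is installed, and the helper names out_name and pdf_to_speech are made up for illustration.

```shell
#!/bin/sh
# Sketch: convert each PDF given on the command line to a spoken AIFF file.
# Assumes PDFMiner's pdf2txt.py and the macOS `say` command are available.

# Derive the audio file name: notes/example.pdf -> notes/example.aiff
out_name() {
    printf '%s\n' "${1%.pdf}.aiff"
}

# Extract the text and pipe it straight into `say`, as in the one-liner above
pdf_to_speech() {
    pdf2txt.py "$1" | say -v Daniel -o "$(out_name "$1")"
}

for pdf in "$@"; do
    pdf_to_speech "$pdf"
done
```

Saved as, say, pdfs2speech.sh (a hypothetical name), it would be called as: sh pdfs2speech.sh *.pdf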

Still missing is the addition of metadata. What do you need here: title, album, "artist"?

In a final step, you would add the file to a certain iTunes playlist. Depending on your ideal workflow, one then could build a little LaunchAgent that monitors a folder for new files...

Running a gpg shell script to decrypt a file via Automator

The Automator “Run Shell Script” action runs the script in a non-interactive shell (for an explanation of the difference between interactive and non-interactive shells, see the pertinent section of the Advanced Bash Scripting Guide) – there is, simply put, no terminal to get user input from. I suppose the gpg utility recognizes this and skips the password prompt (otherwise your script would hang).
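A quick way to see the difference is to test whether standard input is attached to a terminal, which is roughly what matters for a password prompt. A small sketch (has_terminal is a hypothetical helper name) that should report a terminal when run from Terminal.app and none when run from an Automator action:

```shell
# Sketch: check whether a terminal is available for prompting.
has_terminal() {
    # [ -t 0 ] succeeds only when file descriptor 0 (stdin) is a terminal
    [ -t 0 ]
}

if has_terminal; then
    echo "terminal attached: a passphrase prompt is possible"
else
    echo "no terminal: the passphrase has to be supplied some other way"
fi
```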

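You should, however, be able to pipe your passphrase to gpg inside such an action using the --passphrase-fd 0 option (see gpg’s man page; GnuPG 2.x only honours it together with --batch, and from 2.1 on --pinentry-mode loopback is needed as well), i.e.

echo "passphrase" | gpg --batch --passphrase-fd 0 --output "$outfile" --decrypt /path/to/file.gpg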

You can securely store your passphrase in the OS X Keychain and retrieve it from there. Although this is possible via a shell script (the TextMate blog has details on how to achieve that – be sure to read the comments), there are so many gotchas that I’d recommend using a bit of AppleScript and Daniel Jalkut’s excellent Usable Keychain Scripting app. Once installed, the following bit of AppleScript will retrieve your password (assuming the account name is “GPG”):

tell application "Usable Keychain Scripting" to get password of first generic item of current keychain whose account is "GPG"

Either wrap it in an osascript shell command, i.e.

passphrase=$(osascript -e '<command above>')

or, as you are using Automator, add an AppleScript action, retrieve the passphrase inside it and pass it to the shell script.
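Put together, the shell variant might look roughly like the sketch below. It assumes Usable Keychain Scripting is installed and the keychain item uses the account name “GPG”; get_passphrase and decrypt are illustrative names. The passphrase is piped directly into gpg so that it never appears on a command line.

```shell
# Sketch: fetch the passphrase from the keychain and hand it to gpg on stdin.

# Ask Usable Keychain Scripting for the password of the "GPG" account
get_passphrase() {
    osascript -e 'tell application "Usable Keychain Scripting" to get password of first generic item of current keychain whose account is "GPG"'
}

# $1 = encrypted input file, $2 = decrypted output file
decrypt() {
    # --passphrase-fd 0 reads the passphrase from stdin; GnuPG 2.x needs --batch for that.
    # GnuPG 2.1 and later additionally need --pinentry-mode loopback.
    get_passphrase | gpg --batch --passphrase-fd 0 --output "$2" --decrypt "$1"
}
```

It would be called as, for example: decrypt /path/to/file.gpg "$outfile"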
