Automator: extract PDF text and save in current directory

applescript automator pdf snow-leopard

I've made an Automator service for extracting PDF text and saving it in the current directory, and it works reasonably well.

It takes one or several PDF files, extracts the text, and saves it as separate .rtfs placed in the same directory as the original PDFs.
Fine, excellent, except for one small rub: what is invariably also saved alongside the other .rtfs is an empty one (zero bytes), with the name of the current directory.

Looking at the workflow below, it seems like both variables ("PDF" and "Bane") are passed to the action "Hent ut PDF-tekst". Is my assumption reasonable, and in any case, how do I fix it?

[Image: screenshot of the Automator workflow]

The script in plain text:

on run {input, parameters}
    -- Return the POSIX path of the folder that contains the first input file
    tell application "System Events"
        set thePath to POSIX path of (container of (item 1 of input))
    end tell
    return thePath
end run

As an aside, Automator automatically creates a PNG representation of your workflow when you save it, and it can be easily got at by revealing the package contents.
Genius me realized this only after a bit of faffing about with screen capture and GIMP.

Best Answer

First, a translation issue I ran into: Google Translate renders the Norwegian "Hent ut PDF-tekst" as "Get the PDF text", so I typed "Get" in the Actions search box, but none of the "Get" actions it showed were PDF actions. Upon further examination I found Extract PDF Text and used it.

I was able to replicate the issue of a zero-byte RTF file being created along with the one for the actual PDF file. However, I was not able to work out why, even though I tried many different things. In the workflow results, Bane showed up as a folder (the path to the directory of the selected PDF file), but it was then converted to the zero-byte RTF along with the proper RTF file. To me it looks like a bug in Automator.

That said, I present a workaround that you can choose to use if no one else has an answer that resolves the issue without resorting to this workaround.

Add a Run Shell Script action to the end of the list of actions, set Shell to /bin/bash and Pass input to "as arguments", and use the following code:

for f in "$@"; do
    if [ ! -s "$f" ]; then
        rm "$f"
    fi
done

Which translates to: If this file does not have a size greater than zero, then delete it.
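Note that the -s test is also false for a path that does not exist, so the loop above may try to rm something that is not there. A slightly more defensive sketch is shown below. The function name remove_empty_files is my own (hypothetical), and it assumes the previous action passes plain file paths as arguments:

```shell
#!/bin/bash
# Hypothetical helper (not part of Automator): delete only zero-byte regular files.
remove_empty_files() {
    local f
    for f in "$@"; do
        # -f: path exists and is a regular file; ! -s: its size is zero
        if [ -f "$f" ] && [ ! -s "$f" ]; then
            # "--" stops option parsing in case a name starts with a dash
            rm -- "$f"
        fi
    done
}

# Automator passes the incoming file paths as positional arguments.
remove_empty_files "$@"
```

Directories and missing paths are skipped, so only the empty RTF should be removed.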

BTW, if you want to test the code first, you can temporarily replace rm "$f" with something like say deleting "$f", so you can hear which file it is going to delete. When you are satisfied that it will delete only the zero-byte file, put rm "$f" back.
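If you would rather read the result than listen to it, a silent dry run could look like the sketch below. The function name report_empty_files is hypothetical. I am assuming the printed lines can be inspected in the action's Results pane, which is how Run Shell Script output normally shows up:

```shell
#!/bin/bash
# Hypothetical dry run: print zero-byte regular files instead of deleting them.
report_empty_files() {
    local f
    for f in "$@"; do
        if [ -f "$f" ] && [ ! -s "$f" ]; then
            printf 'would delete: %s\n' "$f"
        fi
    done
}

# Automator passes the incoming file paths as positional arguments.
report_empty_files "$@"
```

Nothing is removed here; swap it for the deleting version once the reported paths look right.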

Here is an image of my Automator Service.

[Image: Export PDF Text Automator Service]