I'm using pdftotext (part of poppler-utils) to convert PDF documents to text. It works, for the most part, but one thing I wish it did was to insert blank lines between separate paragraphs instead of mashing them together.
Is there a way to get pdftotext to do this? And if not, is there another PDF-to-text utility that can?
Best Answer
You could try ebook-convert from Calibre. If anything, I'd say it errs in the other direction: too many line breaks.
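For what it's worth, here is a rough Python sketch of how I'd wrap that. It assumes ebook-convert is on your PATH and relies on it picking the output format from the file extension. The helper names (build_command, collapse_blank_lines, pdf_to_text) are just made up for illustration, and the cleanup step is only my guess at how to tame the extra line breaks, so adjust to taste.

```python
import re
import shutil
import subprocess


def build_command(pdf_path, txt_path):
    # ebook-convert takes an input file and an output file; the output
    # format is inferred from the output file's extension (.txt here).
    return ["ebook-convert", pdf_path, txt_path]


def collapse_blank_lines(text):
    # Replace any run of blank (or whitespace-only) lines with exactly
    # one blank line, so paragraphs stay separated but not over-spaced.
    return re.sub(r"\n\s*\n+", "\n\n", text)


def pdf_to_text(pdf_path, txt_path):
    # Hypothetical wrapper: convert, then tidy the result in place.
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found; install Calibre first")
    subprocess.run(build_command(pdf_path, txt_path), check=True)
    with open(txt_path, encoding="utf-8") as f:
        cleaned = collapse_blank_lines(f.read())
    with open(txt_path, "w", encoding="utf-8") as f:
        f.write(cleaned)
```

Calling pdf_to_text("input.pdf", "output.txt") would then leave a text file with a single blank line between paragraphs, assuming the converter detected the paragraph boundaries in the first place.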
Another thing I'd definitely consider, though, is converting to HTML using pdfreflow and then converting the HTML to TXT.