I have a few PDFs that contain ligatures in the text (e.g., ff is combined into a single character, ﬀ).
Is there an easy way to remove them when copying the text from the PDF? (i.e., when I paste, I'd like the ﬀ to be pasted as ff).
I copy a lot of text from these PDFs into answers on Stack Overflow and I find the ligatures at best obnoxious (ok, I admit, I'm really picky :-P); the ligatures also do not show up correctly when copied into other places (e.g., if I copy them into Notepad, they show up as blocks).
I cannot modify the PDFs.
I use both Adobe Acrobat Reader and Foxit Reader, but I'd be open to trying a new PDF reader.
Best Answer
The Evince reader seemed to decode ligatures into their individual characters when I tested this.
By the way, for pdflatex documents you can use this in the preamble to display ligatures in the PDF document but have them copied as individual characters:
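A minimal sketch of such a preamble, assuming a pdfTeX-based toolchain, Type 1 fonts (the default Computer Modern fonts qualify), and the `glyphtounicode.tex` file that ships with common TeX distributions. This is the usual approach and not necessarily the only one; the sample sentence is just an illustration:

```latex
\documentclass{article}

% Load the table that maps glyph names (including ligatures such as
% ff, fi, fl) to the Unicode code points of their component letters.
\input{glyphtounicode}

% Tell pdfTeX to write ToUnicode maps for Type 1 fonts into the PDF,
% so that viewers can translate glyphs back to plain characters on copy.
\pdfgentounicode=1

\begin{document}
An affordable office offers efficient filing.
\end{document}
```

The ligatures should still be typeset as single glyphs on the page; only the text that a viewer extracts on copy or search changes. This helps only if you can recompile the document, so it does not apply to PDFs you cannot modify.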