I have Adobe Reader, Okular and Document Viewer as PDF readers. The papers I read are often texts with mathematical formulae, generated by LaTeX.
But searching for special characters or mathematical symbols in PDF files with these viewers does not work reliably. What I usually do is select the key part (special characters or mathematical expressions) in the file, press Ctrl+C, then Ctrl+F, then Ctrl+V. Quite often, what the viewer highlights is unfortunately not correct.
I believe this is an important feature for the viewer, and there is a real need to look for not only words but also special characters in a document.
Could anyone tell me how you work around this? Is there a better PDF reader or a smarter way to search?
Best Answer
There is probably no generic solution to your problem, even though it would be cool if there was.
The core of the problem is that PDF is designed to specify how something should look when printed. Being able to search the PDF for a formula was probably not a major concern. So the problem is not the viewer; the problem is that the PDF doesn't contain the information you are looking for in an accessible way.
When you have, for example, an alpha (α) in a formula, this could be coded (1) as the Unicode character U+03B1, (2) as the letter "a" in a Greek font (the Windows font Symbol comes to mind), or (3) as a graphic with no text behind it.
In the first case your solution should probably work, but in the second case the search will stop at every single "a" in the text. In the third case the search will come up with nothing at all, since there is no text to be searched.
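A small Python sketch of why the same search behaves so differently. The three strings below are made-up stand-ins for the text a viewer might extract from the same printed sentence; they are assumptions for illustration, not output from any real PDF.

```python
ALPHA = "\u03b1"  # GREEK SMALL LETTER ALPHA

# Hypothetical extracted text for the printed sentence "the angle α is small".
extracted = {
    "unicode": "the angle \u03b1 is small",  # alpha stored as U+03B1
    "symbol_font": "the angle a is small",   # alpha stored as "a" in a Greek font
    "graphic": "the angle  is small",        # alpha drawn as a graphic, no text
}


def find_all(haystack, needle):
    """Return the start index of every occurrence of needle in haystack."""
    hits = []
    start = haystack.find(needle)
    while start != -1:
        hits.append(start)
        start = haystack.find(needle, start + 1)
    return hits


def search_hits(query):
    """Run the same query against each hypothetical text layer."""
    return {name: find_all(text, query) for name, text in extracted.items()}


if __name__ == "__main__":
    print(search_hits(ALPHA))  # only the "unicode" layer can match
    print(search_hits("a"))    # the "symbol_font" layer matches every "a"
```

Searching for α only succeeds on the first layer. Searching for "a" on the second layer finds the alpha, but also the "a" in "angle" and "small", which is exactly the noise described above.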
The problem gets more difficult when you search for elements with indices, such as $A_B^C$. This needs to be typeset in a certain way (the B below the A, the C above it), but there is no fixed rule for the order in which the PDF creator should insert the three characters into a text box; it could even decide to create three separate text boxes, or decide that all the upper indices of a formula come first and the lower indices come last.
So as an example, the formula
$A_B^C = D^E_F$
could be represented as
or
or
or any other way the PDF creator pleases, as long as the position information for each letter is correct to produce the right formula. Needless to say, in the first and third cases you will have a hard time searching for $A_B$.
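To make the ordering problem concrete, here is a minimal Python sketch. The glyph coordinates and the three emission orders are invented for illustration (they are not taken from any particular PDF producer), and `has_subscript` is a hypothetical helper, not a feature of any viewer. All three orders draw the same formula, yet a plain text search for "AB" only works in one of them, while a search based on positions works in all three.

```python
from typing import NamedTuple


class Glyph(NamedTuple):
    char: str
    x: float  # horizontal position on the page
    y: float  # vertical offset: 0 = baseline, > 0 = raised, < 0 = lowered


# Assumed layout of A_B^C = D^E_F (coordinates are made up).
A, B, C = Glyph("A", 0, 0), Glyph("B", 1, -1), Glyph("C", 1, 1)
EQ = Glyph("=", 2, 0)
D, E, F = Glyph("D", 3, 0), Glyph("E", 4, 1), Glyph("F", 4, -1)

# Three hypothetical orders in which a PDF creator might emit the glyphs.
orders = {
    "base, sub, super": [A, B, C, EQ, D, F, E],
    "all super first, all sub last": [C, E, A, EQ, D, B, F],
    "base, super, sub": [A, C, B, EQ, D, E, F],
}


def stream_text(glyphs):
    """Text as a naive search sees it: characters in emission order."""
    return "".join(g.char for g in glyphs)


def has_subscript(glyphs, base, sub, max_gap=1.0):
    """Position-based check: is `sub` just right of and below `base`?"""
    return any(
        a.char == base and b.char == sub
        and 0 < b.x - a.x <= max_gap and b.y < a.y
        for a in glyphs for b in glyphs
    )


if __name__ == "__main__":
    for name, glyphs in orders.items():
        text = stream_text(glyphs)
        print(name, text, "AB" in text, has_subscript(glyphs, "A", "B"))
```

The position-based check ignores emission order entirely, which is the only information the PDF is guaranteed to get right; the price is that the search tool has to understand layout rather than just strings.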
After all this explaining, what can you do?