My OS is Ubuntu 12.04. How can I convert a PDF file from grayscale to black and white? The grayscale PDF comes from scanning with the grayscale option, and the OCR step requires a black-and-white PDF.
Update:
I tried Marco's answer, but the resulting black-and-white PDF isn't good. The original file is here.
Best Answer
1) Use ghostscript to convert the PDF to a monochrome PostScript file using the psmono device:
2) Then convert the monochrome PostScript back to PDF:
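The two steps above can be sketched as a small shell function. This is only an illustration, not necessarily the exact commands from the answer: the function name gray_to_mono_gs and the file names are placeholders, and it assumes your Ghostscript build still ships the psmono device (newer releases may not).

```shell
#!/bin/sh
# Sketch: grayscale PDF -> 1-bit half-toned PDF via Ghostscript.
# gray_to_mono_gs is a hypothetical helper name; file names are placeholders.
# Assumes the installed Ghostscript still provides the psmono device.
gray_to_mono_gs() {
    in_pdf=$1
    out_pdf=$2
    tmp_ps="${out_pdf%.pdf}.ps"   # intermediate PostScript file next to the output

    # Step 1: render the PDF to monochrome PostScript with the psmono device.
    gs -q -dNOPAUSE -dBATCH -sDEVICE=psmono \
       -sOutputFile="$tmp_ps" "$in_pdf" || return 1

    # Step 2: convert the monochrome PostScript back to PDF.
    ps2pdf "$tmp_ps" "$out_pdf"
}
```

Calling gray_to_mono_gs scan.pdf scan-mono.pdf would leave scan-mono.ps behind as the intermediate file; delete it once the PDF looks right.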
EDIT: The psmono device creates a 1-bit half-tone image, which is apparently not what you want. I couldn't find a way to specify a threshold using Ghostscript, so I resorted to ImageMagick.
ImageMagick's convert command internally uses Ghostscript to render the PDF. It then applies the threshold filter to produce a 1-bit image and uses Ghostscript again to create a PDF. Since convert uses a resolution of 72 DPI by default, which might not match the actual resolution of your scan, you can pass the -density argument. Also experiment with the -threshold setting. The optimal values depend heavily on the input file.
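The ImageMagick approach might look roughly like the sketch below. The function name gray_to_bw_im is hypothetical, and the defaults of 300 DPI and 50% are assumptions to start experimenting from, not values taken from the answer. Note that -density is placed before the input file because it controls how the PDF is rasterized when it is read.

```shell
#!/bin/sh
# Sketch: grayscale PDF -> thresholded black-and-white PDF via ImageMagick.
# gray_to_bw_im is a hypothetical helper name; the default density (300)
# and threshold (50%) are assumed starting points, tune them per scan.
gray_to_bw_im() {
    in_pdf=$1
    out_pdf=$2
    density=${3:-300}        # rasterization resolution in DPI
    threshold=${4:-"50%"}    # pixels brighter than this become white

    # -density applies while reading the PDF; -threshold is applied afterwards.
    convert -density "$density" "$in_pdf" -threshold "$threshold" "$out_pdf"
}
```

For example, gray_to_bw_im scan.pdf scan-bw.pdf 150 60% would rasterize at 150 DPI and turn every pixel darker than 60% brightness black. Raising the threshold makes faint text darker but also picks up more background noise.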