How to convert a PDF file from grayscale to black and white

command-line, pdf

My OS is Ubuntu 12.04. How can I convert a PDF file from grayscale to black and white? The grayscale PDF comes from scanning with the grayscale option, and the OCR step requires a black-and-white PDF.


Update:

I followed Marco's reply, but the resulting black-and-white PDF isn't good. The original file is here.

Best Answer

1) Use ghostscript to convert the PDF to a monochrome PostScript file using the psmono device:

gs -q -sDEVICE=psmono -o mono.ps input.pdf

2) Then convert the monochrome PostScript back to PDF:

ps2pdf mono.ps
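
The two steps above can be chained in a small shell function. This is only a sketch: the function name pdf_to_mono and the DRY_RUN switch are made up for illustration, and it assumes a Ghostscript build that still provides the psmono device.

```shell
#!/bin/sh
# Sketch: run both steps (PDF -> monochrome PostScript -> PDF) in one call.
# pdf_to_mono and DRY_RUN are illustrative names, not part of Ghostscript.
pdf_to_mono() {
    src=$1                  # input PDF
    dst=$2                  # output PDF
    ps="${dst%.pdf}.ps"     # intermediate PostScript file next to the output
    run=${DRY_RUN:+echo}    # if DRY_RUN is set, print the commands instead of running them

    # Step 1: halftone to 1-bit PostScript; step 2: convert back to PDF.
    $run gs -q -sDEVICE=psmono -o "$ps" "$src" &&
    $run ps2pdf "$ps" "$dst"
}

# Example call:
#   pdf_to_mono input.pdf mono.pdf
```

Setting DRY_RUN=1 before the call prints the two commands without executing them, which is handy for checking the file names first.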

EDIT: The psmono device produces a 1-bit halftone image, which is apparently not what you want. I couldn't find a way to specify a threshold in Ghostscript, so I resorted to ImageMagick. Its convert command uses Ghostscript internally to rasterize the PDF, applies the threshold to produce a 1-bit image, and then writes the result back out as a PDF. Since convert rasterizes at 72 DPI by default, which probably doesn't match your scan resolution, pass the -density argument explicitly. Experiment with the threshold setting as well; the best values depend heavily on the input file.

convert -density 150 input.pdf -threshold 50% output.pdf
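
Since the right threshold depends on the scan, one way to experiment is to render one candidate PDF per threshold and compare them by eye. A minimal sketch follows; the function name threshold_sweep, the DRY_RUN switch and the out-NN.pdf naming are assumptions for illustration. The -density setting is placed before the input file and the -threshold operator after it, which is the order ImageMagick's command-line documentation describes.

```shell
#!/bin/sh
# Sketch: try several thresholds on the same input and write one PDF per value.
# threshold_sweep, DRY_RUN and the out-NN.pdf names are illustrative choices.
threshold_sweep() {
    src=$1                  # input PDF
    density=$2              # rasterization resolution in DPI, e.g. the scan resolution
    shift 2                 # remaining arguments are threshold percentages
    run=${DRY_RUN:+echo}    # if DRY_RUN is set, print the commands instead of running them

    for t in "$@"; do
        $run convert -density "$density" "$src" -threshold "${t}%" "out-${t}.pdf" || return 1
    done
}

# Example call: thresholds of 40%, 50% and 60% at 150 DPI
#   threshold_sweep input.pdf 150 40 50 60
```

Lower thresholds turn more pixels white and higher ones turn more pixels black, so faint text usually needs a higher value than dense print.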