Linux – Convert a colored PDF into black and white

linux pdf

On Debian Sid, I have a PDF with a blue background and yellow font. I've searched a lot on Super User but I haven't found anything useful for my case.

I have tried to convert the PDF into a grayscale one with:

gs -o grayscale.pdf -sDEVICE=pdfwrite -sColorConversionStrategy=Gray -sProcessColorModel=DeviceGray -dCompatibilityLevel=1.4 colored.pdf

The problem is that I obtain a PDF with white fonts and a dark grey background, so I cannot print it.

After that I tried:

convert -density 96x96 gs2.pdf -density 96x96 -negate -compress zip inv.pdf

I got a PDF with black fonts (and this is okay) and grey background (and this is not okay).

What can I do to obtain a PDF with white background and black fonts?

Best Answer

GENERAL WARNING: work on a COPY of your file, so that you have a second chance if you make a mistake.

A vector background (that is, not a raster image) in a PDF file can be changed easily in a couple of steps (see also my Stack Overflow answer, which I extend and improve here).


  • PRELIMINARY CHECK:

Open your PDF file with an editor that can show the internal PDF structure, such as Notepad++, and check whether you can see code snippets like these:

1.000 1.000 0.000 rg (it means yellow)

0.000 0.000 1.000 rg (it means blue, if your blue is the pure blue with RGB triplet 0, 0, 255; otherwise, read the rest of the answer to identify the right triplet in the PDF code)

and so on...

The exact form can vary. For instance, in PDFs produced by OpenOffice's built-in PDF export, the same code snippets take this form:

0 0 0 rg (it means *black*)
1 1 1 rg (it means *white*)

and so on...

If you can see these code snippets, you can start changing values. Otherwise, you first need to decompress the streams in the file.

You can do this with pdftk (http://www.pdflabs.com/docs/install-pdftk/):

pdftk file.pdf output uncompressed.pdf uncompress

and recompress once you have finished your changes:

pdftk uncompressed.pdf output recompressed.pdf compress

Now, once you can see these code snippets, you can change the values.
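
The preliminary check can also be done from a shell instead of an editor. The following is only a minimal sketch and not part of the original procedure: `list_fill_colors` is a hypothetical helper name, and it assumes an uncompressed PDF (such as the uncompressed.pdf produced by the pdftk command above) is fed on standard input. It lists each distinct `rg` color operator it finds, which may help to spot the triplets used for text and background.

```shell
# Hypothetical helper: print every distinct "r g b rg" fill-color
# operator found on standard input, one per line.
# NUL bytes are stripped first so that grep treats the PDF as text.
list_fill_colors() {
    LC_ALL=C tr -d '\000' |
        LC_ALL=C grep -o -E '([0-9]*\.?[0-9]+ +){3}rg' |
        LC_ALL=C sort -u
}
```

A possible invocation would be `list_fill_colors < uncompressed.pdf`. If nothing is printed, the streams are probably still compressed, or the file sets its colors with other operators.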

STEP 1 (for pdf editing)

The first thing you need is the correspondence between the RGB color values of the text and background and the internal PDF representation of the same colors.

You can use a free color picker to identify the RGB values of the text and background colors.

Once you have these values, you need to convert them into the internal PDF representation.

To do this, keep this proportion in mind:

1 : 255 = x : (color value you selected), that is, x = (color value) / 255

For instance, let's say you have this RGB triplet for the background: 30,144,255

To get the corresponding values to insert in the PDF code snippet that sets the background color, compute each component (you can use http://www.wolframalpha.com/ for a precise result):

1 : 255 = x : 30, so x = 30/255 = 0.117 (truncated to three decimals)

1 : 255 = x : 144, so x = 144/255 = 0.564 (truncated to three decimals)

1 : 255 = x : 255, so x = 255/255 = 1

so, the whole triplet in pdf, corresponding to RGB 30,144,255, will be:

0.117 0.564 1.000
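
The same arithmetic can be sketched as a small shell function. This is only an illustration: `rgb_to_pdf` is a hypothetical name, and the function truncates to three decimals purely to reproduce the figures in the worked example above.

```shell
# Hypothetical helper: convert an 8-bit RGB triplet (0-255 each) to the
# three-decimal form used in PDF color operators.
# Values are truncated, not rounded, to match the worked example
# (30/255 = 0.1176... becomes 0.117).
rgb_to_pdf() {
    LC_ALL=C awk -v r="$1" -v g="$2" -v b="$3" 'BEGIN {
        printf "%.3f %.3f %.3f\n",
            int(r * 1000 / 255) / 1000,
            int(g * 1000 / 255) / 1000,
            int(b * 1000 / 255) / 1000
    }'
}

rgb_to_pdf 30 144 255   # prints: 0.117 0.564 1.000
```

A PDF producer may round rather than truncate, which for this color would give 0.118 0.565 1.000, so if the truncated triplet is not found in the file it may be worth searching for the rounded one as well.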


STEP 2 (for pdf editing)

We search for 0.117 0.564 1.000 in the PDF file with Notepad++ (with "Wrap around" and "Match whole word only" checked) to find the internal PDF representation of the background, which we can then change from azure to, let's say, white:

1.000 1.000 1.000

or

1 1 1

But since you wrote about a blue background, to be more precise, I created a sample PDF with a blue background (pure blue, RGB 0,0,255; if your blue is a different shade, adapt my tips as needed) and yellow text.

Since we know that 0.000 0.000 1.000 rg means blue, we search for it and change 0.000 0.000 1.000 rg to 1.000 1.000 1.000 rg (white), BUT...

at the same time, you must also change the text from yellow to black,

by searching for 1.000 1.000 0.000 (yellow text) and changing it to black, 0.000 0.000 0.000.

We now have a vector PDF with black text and a white background.
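
The two substitutions could also be scripted instead of being made by hand in an editor. The sketch below is an assumption-laden illustration, not the method used for the sample file: `recolor_blue_yellow` is a hypothetical name, the color values are those of the pure-blue and pure-yellow sample, and it assumes GNU sed (as shipped on Debian) reading an uncompressed PDF on standard input.

```shell
# Hypothetical helper: rewrite the two sample colors on standard input.
#   pure blue fill (background) -> white
#   pure yellow fill (text)     -> black
# Each replacement has the same byte length as the text it replaces,
# so byte offsets recorded elsewhere in the PDF are not shifted.
recolor_blue_yellow() {
    LC_ALL=C sed \
        -e 's/0\.000 0\.000 1\.000 rg/1.000 1.000 1.000 rg/g' \
        -e 's/1\.000 1\.000 0\.000 rg/0.000 0.000 0.000 rg/g'
}
```

It might be used as `recolor_blue_yellow < uncompressed.pdf > recolored.pdf`. For a file that uses other shades, or the short `0 0 1 rg` form, the patterns would need to be adapted to the triplets actually present.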

Please remember to:

    • compress the modified PDF again, if you uncompressed it with pdftk
    • repair it:
pdftk file.pdf output fixed.pdf

There is another way to perform the same task, starting from PostScript, but since you already have the PDF file, converting it to PostScript first would be a superfluous step.

Please give feedback, and feel free to ask more.
