Replace an image in a PDF using the command line

command line, images, pdf

I need to process some PDF files. The task is to replace a given image with another one. My first problem is how to replace an image in a PDF from the command line as part of a batch process. Later I'll address other problems, such as identifying which image to replace (the PDF files may contain more than one image). But first I want to solve the basic problem: how to replace one image in a PDF with another.

I've read about poppler-utils and pdftk but, as far as I know, neither of these tools can replace images in a PDF.

Best Answer

OK ... I think pdflatex is the missing piece here.

The OP said he has looked into poppler-utils and pdftk. Let me add pdfimages to that (it ships with poppler-utils). These, together with pdflatex, are the pieces of a solution.

pdfimages -f 4 -l 20 -j -png target.pdf imageroot

In the example above, pdfimages scans pages 4 through 20 of target.pdf and extracts every image into files whose names begin with imageroot. The -j option saves JPEG-encoded images as .jpg files, and -png writes the remaining images as PNG.
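The question also asks how to tell which image is the one to replace. A possible starting point, sketched here under assumptions, is the table printed by pdfimages -list: it reports the page, index and pixel size of every image without extracting anything. The helper name match_images and the 640x480 size are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: find the page and index of images with a known pixel size.
# Assumes the table from poppler's `pdfimages -list`: two header lines,
# then one row per image with columns: page num type width height ...

# Hypothetical helper: reads a `pdfimages -list` table on stdin and
# prints "page num" for every image whose width and height match.
match_images() {
    local width=$1 height=$2
    awk -v w="$width" -v h="$height" \
        'NR > 2 && $4 == w && $5 == h { print $1, $2 }'
}

# Usage (target.pdf and the size are placeholders):
#   pdfimages -list target.pdf | match_images 640 480
```

Matching on size alone is only a heuristic; two different images can share the same dimensions, so the extracted files may still need a visual check.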

poppler-utils provides pdftotext. I recommend the -layout option, which does a great job of keeping the document human-readable.

pdftotext -layout "$1.pdf" "$1.txt"
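Since only one page has to be rebuilt, it may be handier to dump the text of that page alone. The sketch below assumes pdftotext's -f and -l page-range options; the function name and the output naming scheme are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: dump the text of a single page into its own file.
# Assumes poppler's pdftotext; extract_page_text is a hypothetical helper.
extract_page_text() {
    local pdf=$1 page=$2
    local out="${pdf%.pdf}-page${page}.txt"
    # -f and -l set the first and last page; the same number selects one page.
    pdftotext -layout -f "$page" -l "$page" "$pdf" "$out" && printf '%s\n' "$out"
}

# Usage (placeholder file name and page number):
#   extract_page_text target.pdf 7    # writes target-page7.txt
```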

The OP's objection to the ImageMagick solution offered by pidosaurus is that an image does not have extractable text. With the utilities outlined above, the OP will have all the images as well as all the extracted text; page order and contents are retained, with -layout preserving the arrangement of the text. The OP could identify the correct page of text and put it into a .tex file that ends with an \includegraphics command referring to the replacement picture by filename. Running pdflatex on that file produces a new single-page PDF, which pdftk can insert into the rest of the document in place of the original page.

If you know where in the text of the original page the image sat, you can put the \includegraphics at that point instead, wrapped in a figure environment with the [h] placement option, so the image lands close to the right place.
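The rebuild-and-splice step could look roughly like the sketch below. It is not a tested pipeline: the function names, the file names and the choice to wrap the page text in a verbatim environment are all assumptions made for illustration. The second function only prints the pdftk command, so it can be inspected before running.

```shell
#!/usr/bin/env bash
# Sketch: rebuild one page with a replacement image, then splice it back.

# Write a one-page LaTeX file: the extracted page text, then the image.
# Arguments (all placeholders): text file, replacement image, output .tex.
write_page_tex() {
    local text_file=$1 image_file=$2 tex_file=$3
    {
        printf '%s\n' '\documentclass{article}'
        printf '%s\n' '\usepackage{graphicx}'
        printf '%s\n' '\begin{document}'
        printf '%s\n' '\begin{verbatim}'
        # Drop the form feeds that pdftotext puts between pages.
        tr -d '\f' < "$text_file"
        printf '%s\n' '\end{verbatim}'
        printf '%s\n' '\begin{figure}[h]'
        printf '\\includegraphics[width=\\linewidth]{%s}\n' "$image_file"
        printf '%s\n' '\end{figure}'
        printf '%s\n' '\end{document}'
    } > "$tex_file"
}

# Print the pdftk command that swaps page N of the original for the new
# single-page PDF. Assumes N is neither the first nor the last page.
splice_command() {
    local original=$1 new_page=$2 page=$3 output=$4
    printf 'pdftk A=%s B=%s cat A1-%d B1 A%d-end output %s\n' \
        "$original" "$new_page" "$((page - 1))" "$((page + 1))" "$output"
}

# Usage (placeholder names):
#   write_page_tex target-page7.txt replacement.png page7.tex
#   pdflatex page7.tex
#   splice_command target.pdf page7.pdf 7 new.pdf
```

Keep in mind that a page rebuilt this way is typeset from scratch, so its fonts and exact layout will differ from the original page.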
