Command-Line Tool – How to Bulk Extract Images from a PDF

batch, command line, image processing, pdf

A client gave me a catalog in PDF format. They don't have the original image files, but the images are embedded in the PDF.

Is there a way to extract all images from a PDF using a command-line tool while preserving their original file names?

I looked at the question "Extract images from PDF with layer masks", but it only covers individual images.

Best Answer

The program pdfimages from the package poppler-utils might be what you are looking for. From the man page:

Pdfimages reads the PDF file PDF-file, scans one or more pages, and writes one PPM, PBM, or JPEG file for each image.

Newer versions of poppler-utils have an -all switch that extracts the images as jpg or png:

pdfimages -all input.pdf images/prefix

will output files in the form prefix-nnn.[png|jpg] in the images folder.
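
If there are several PDFs to process, the same command can be wrapped in a small loop. The following is only a rough sketch: the function name extract_pdf_images and the src_dir/out_dir arguments are made up for illustration and are not part of poppler-utils. It assumes pdfimages does not create the output folder itself, so it makes one per PDF first. As far as I know, pdfimages names its output only by prefix and index, so the sketch uses each PDF's base name as the prefix rather than any original image file name.

```shell
#!/bin/sh
# Sketch: run `pdfimages -all` on every PDF in a directory.
# extract_pdf_images, src_dir and out_dir are illustrative names.
extract_pdf_images() {
    src_dir=$1
    out_dir=$2
    for pdf in "$src_dir"/*.pdf; do
        # Skip the unexpanded pattern when the directory has no PDFs.
        [ -e "$pdf" ] || continue
        # Use the PDF's base name (without .pdf) as folder and prefix.
        name=$(basename "$pdf" .pdf)
        # Assumption: the target folder must exist before pdfimages runs.
        mkdir -p "$out_dir/$name"
        # Writes files such as out_dir/name/name-000.jpg, name-001.png, ...
        pdfimages -all "$pdf" "$out_dir/$name/$name"
    done
}
```

Calling extract_pdf_images ./pdfs ./images would then leave one subfolder per PDF under ./images. To check what a file contains before extracting, pdfimages -list input.pdf prints a table of the embedded images.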