I am using convert (an ImageMagick component, delegating to Ghostscript in the background) to transform the first page of PDF files into images.
Usually, convert -density 200 file.pdf[0] first_page.png will do the job, and it will sample the PDF file at 200 pixels per inch of paper.
However, it occasionally happens that some PDFs are abnormally huge (sometimes A0 paper, and recently a PDF with a page exceeding 23 m²: 183 inches in length, 185 in width).
For such files, convert will hang and eat CPU time. Images of 35000+ pixels in width and height are simply not usable.
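For scale, here is a rough sketch of the arithmetic behind those numbers. The helper name pixel_size is purely illustrative and not part of any tool.

```python
def pixel_size(width_in, height_in, density):
    """Pixel dimensions of a page rendered at `density` pixels per inch."""
    return round(width_in * density), round(height_in * density)


# The 183 x 185 inch page mentioned above, at the usual -density 200.
w_px, h_px = pixel_size(183, 185, 200)
print(w_px, h_px)                    # 36600 37000
print(w_px * h_px // 1_000_000)      # 1354 (megapixels)
```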
Therefore the question: is there a switch in ImageMagick that would adapt the density to the page size, or at least limit sampling to a portion of the page with a maximal area (the top left corner, 30×30 inches for example)?
Thanks.
EDIT: In its official git repository, MuPDF has added the -w and -h switches which, jointly with -r, will do what is wanted here.
Best Answer
I modified mupdf's pdfdraw to support drawing in best fit mode, so I could state that the output needed to be 128x128 at most and it would fit the output in the box while maintaining the aspect ratio. Before I did that, the only way was to use pdfinfo to get the page size, do the calculations to fit it in a box, and then ask pdfdraw to draw it with that scale factor (dots per inch).
Well, after that long story the process to do that is rather simple:
get the page size of the page to render (in PDF terms, the media box). This can be done via pdfinfo and grep, which report it in pts (points, 1/72 of an inch), or via a PDF library such as pyPDF.
for a box fit, do
dpi = min( A/(w/72.), B/(h/72.) )
where A is the maximum width and B is the maximum height, in pixels; w and h are the width and height of the page, in points.
pass dpi to convert -density $dpi
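The steps above can be sketched in Python roughly as follows. This is a minimal sketch, not the code from the answer: the function names are hypothetical, the parser assumes pdfinfo prints a single line of the form "Page size: W x H pts", and the max_dpi cap (so that small pages are not rendered above the usual 200) is an added assumption.

```python
import re

POINTS_PER_INCH = 72.0


def parse_page_size(pdfinfo_output):
    """Return (width, height) in points from pdfinfo's 'Page size:' line."""
    match = re.search(r"^Page size:\s+([\d.]+) x ([\d.]+) pts",
                      pdfinfo_output, re.MULTILINE)
    if match is None:
        raise ValueError("no 'Page size' line found")
    return float(match.group(1)), float(match.group(2))


def box_fit_dpi(width_pts, height_pts, max_width_px, max_height_px,
                max_dpi=200.0):
    """Density at which the page fits the pixel box, capped at max_dpi."""
    fit = min(max_width_px / (width_pts / POINTS_PER_INCH),
              max_height_px / (height_pts / POINTS_PER_INCH))
    # The cap is an assumption, not part of the formula above.
    return min(fit, max_dpi)


def convert_command(pdf_path, png_path, dpi):
    """Argument list for rendering the first page with ImageMagick."""
    return ["convert", "-density", f"{dpi:.2f}", f"{pdf_path}[0]", png_path]


# Sample text standing in for real pdfinfo output: a 183 x 185 inch page.
sample = "Pages:          1\nPage size:      13176 x 13320 pts\n"
w, h = parse_page_size(sample)
dpi = box_fit_dpi(w, h, 6000, 6000)
print(convert_command("file.pdf", "first_page.png", dpi))
```

In real use the sample string would be replaced by the captured output of pdfinfo, and the resulting list could be handed to subprocess.run. The 6000×6000 pixel box is an arbitrary choice for illustration.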
and as requested a slightly fudged git commit diff: