PDF – Reduce resolution, size, dpi, number of pixels in PDF images

dpi pdf resolution

I scanned a text at 600dpi and it turned out to be much more than I needed to make a PDF out of it. I've already OCR'ed the text and I want to retain the OCR in the PDF.

I want to decrease the number of pixels (dpi? Sorry, I'm not sure what I'd call them, I'm not used to image processing), so I can make the PDF size smaller. The images are too big when I open the PDF, it would be fine to shrink them by decreasing the number of pixels (as it is now, I can zoom in the images much more than I need).

How can I shrink image size by reducing the number of pixels (dpi)?

I don't want to re-print the PDF, or rescan it, because I don't want to lose the OCR. I tried using Adobe Acrobat Pro DC "Save as Optimized PDF", and shrank all images above 50dpi to 50dpi. It made the PDF bigger! (I think the PDF is already compressed; but I don't want compression, I want to reduce the number of pixels/the resolution.)

I'm on Windows 7, 64 bit

Best Answer

I think you were on the right track with Acrobat Pro.

You need to change the actual image resolution, though. You could do this by manually creating a low-res version (e.g. 50% W x 50% H) and replacing the existing image in the pdf with your new one. In this case, the image dpi inside the pdf would need to be halved too in order to preserve the size. If you kept the same dpi, the image would appear at half the width and half the height, i.e. a quarter of the area.
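The arithmetic behind the halving can be checked with a small sketch. The page dimensions below are a hypothetical example (a US Letter page scanned at 600 dpi), not values taken from the question, and `physical_size_in` is just an illustrative helper name.

```python
def physical_size_in(width_px, height_px, dpi):
    """Return the (width, height) in inches that an image occupies at a given dpi."""
    return (width_px / dpi, height_px / dpi)

# Hypothetical scan: US Letter (8.5 x 11 in) at 600 dpi.
orig_w, orig_h, orig_dpi = 5100, 6600, 600

# Low-res version at 50% W x 50% H.
half_w, half_h = orig_w // 2, orig_h // 2

size_before = physical_size_in(orig_w, orig_h, orig_dpi)
size_halved_dpi = physical_size_in(half_w, half_h, orig_dpi / 2)  # dpi halved too
size_same_dpi = physical_size_in(half_w, half_h, orig_dpi)        # dpi left unchanged

print("original:       ", size_before)
print("halved dpi:     ", size_halved_dpi)
print("same dpi:       ", size_same_dpi)
```

With the dpi halved, the image keeps its size on the page; with the dpi unchanged, each side is halved.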

As long as the document dimensions don't change the OCR text should be mapped to the same spatial coordinates.

Edit: batch processing using Acrobat Pro

Acrobat Pro (XI) can show the image properties. Three of them, labeled 1 to 3 in the screenshot, are explained here:

  1. Once an image is part of a pdf it gets a physical "size" on the 'virtual paper'.
  2. The ppi (or dpi, but that is more for printing contexts) is a pdf metric that gives the ratio between physical size and number of pixels. I believe the unit pt/inch shown in the image is incorrect; it should be ppi. I also think calling it resolution is a poor word choice.
  3. The real image resolution (width × height in pixels) is a pdf-independent image property. It affects how large the file is and how much you can meaningfully zoom in when viewing digitally.

There is a simple mathematical relationship: 2 = 3 / 1, that is, ppi = number of pixels / physical size in inches, along each dimension.

  • What you want to do is reduce 3 while keeping 1 constant, thereby implicitly reducing 2 by a corresponding amount.
  • Most editors use the wording "change dpi" which is effectively the same: change 2 and implicitly adjust 3 such that 1 remains the same size.
  • But under the hood the largest change occurs to the image resolution (3), the ppi/dpi is just a number that needs to be updated in the pdf; so I find my wording better:)
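As a rough illustration of the relationship 2 = 3 / 1, the sketch below computes the pixel dimensions (3) needed to reach a target ppi (2) while the physical size (1) stays fixed. The function name and the example numbers are assumptions made for illustration; this only does the arithmetic and does not touch a pdf.

```python
def pixels_for_target_ppi(width_px, height_px, current_ppi, target_ppi):
    """Pixel dimensions that keep the physical size fixed at a new ppi."""
    # (1) physical size in inches = pixels / ppi
    width_in = width_px / current_ppi
    height_in = height_px / current_ppi
    # (3) new pixel count = physical size * target ppi
    return round(width_in * target_ppi), round(height_in * target_ppi)

# Hypothetical 600 ppi scan of a US Letter page, downsampled to 150 ppi.
src_w, src_h = 5100, 6600
dst_w, dst_h = pixels_for_target_ppi(src_w, src_h, 600, 150)

# Fraction of the original pixel count that remains.
pixel_ratio = (dst_w * dst_h) / (src_w * src_h)

print(dst_w, dst_h, pixel_ratio)
```

Because the pixel count scales with the square of the ppi ratio, going from 600 to 150 ppi leaves one sixteenth of the pixels. The actual file size also depends on the compression used, so it will not necessarily shrink by the same factor.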

You can run a sort of 'smart filter' on your pdf using Acrobat Pro; one of the available preset filters reduces image dpi. So you can just run this preflight option or create your own. You can adjust downscaling options and image compression methods.

I think you can batch process multiple pdf files using this method in combination with the "action wizard" tool.