Ubuntu – How to extract a page range from a PDF file AND retain the PDF tagging in the new file

pdf

I've found and tried several solutions for extracting a page range from a PDF file (pdftk, Ghostscript, etc.). They all work, but they all seem to strip the tagging (PDF tags, used to make documents more accessible) from the resulting file.

Does anyone know of a solution, or a set of options I can use with an existing solution, to extract a page range AND retain the PDF tags in the extracted file?

Best Answer

To preserve the original tags in the extracted page range, use an application that supports tag extraction. One excellent open-source option is PDFSAM.

  1. Install it by running the following command in the terminal:

sudo apt install pdfsam

  2. Open your file with PDFSAM and choose Extract. Then specify the page ranges, separated by commas, choose your output directory, and click Run.

Done: the extracted page range, with tags preserved, will be located in the chosen output directory.
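
To confirm that the tags actually survived, one option is to inspect the output with pdfinfo from the poppler-utils package (an assumption on my part, it is not part of PDFSAM; install it with sudo apt install poppler-utils). Its report includes a "Tagged:" line. The helper name is_tagged below is made up for illustration, and this is only a rough check: it reflects whether the file is marked as tagged, not whether the tags are any good.

```shell
# Print "yes" or "no" depending on the "Tagged:" line that pdfinfo
# (poppler-utils) reports for the given PDF file.
# is_tagged is a hypothetical helper name, not an existing command.
is_tagged() {
    pdfinfo "$1" 2>/dev/null | awk '/^Tagged:/ { print $2 }'
}
```

For example, is_tagged extracted.pdf should print yes if the extracted file is still marked as tagged (extracted.pdf is a placeholder for whatever file PDFSAM wrote). Running pdfinfo -struct on the same file prints the structure tree itself, if you want to look at the tags in more detail.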
