How to make ghostscript not wipe PDF metadata

ghostscript pdf

Ghostscript wipes PDF metadata such as author, title, and subject. How can I tell Ghostscript to leave the metadata untouched? I invoke it as follows:

gs \
  -dBATCH                    \
  -dNOPAUSE                  \
  -sOutputFile=<output_file> \
  -sDEVICE=pdfwrite          \
  -dPDFSETTINGS=/ebook       \
  <input_file>

Best Answer

Apparently Ghostscript cannot be told to keep the PDF metadata. Here is a workaround: first save the metadata to a file with pdftk, then compress the PDF with Ghostscript, and finally write the metadata back with pdftk.

# replace the placeholders with real paths; the quotes keep paths with spaces intact
INPUTPDF="<input_file>"
OUTPUTPDF="<output_file>"
TMPPDF=$(mktemp)
METADATA=$(mktemp)

# save metadata
pdftk "$INPUTPDF" dump_data_utf8 > "$METADATA"

# compress
gs                       \
  -q                     \
  -sOutputFile="$TMPPDF" \
  -sDEVICE=pdfwrite      \
  -dNOPAUSE              \
  -dBATCH                \
  -dPDFSETTINGS=/ebook   \
  "$INPUTPDF"

# restore metadata
pdftk "$TMPPDF" update_info_utf8 "$METADATA" output "$OUTPUTPDF"

# clean up
rm -f "$TMPPDF" "$METADATA"
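
To check that the restore step in the script above worked, the metadata dump taken before compression can be compared with a fresh dump of the output file. The following is only a sketch: the helper names info_value and check_metadata are made up for illustration, and it assumes the dump contains pairs of "InfoKey: <name>" and "InfoValue: <text>" lines, which is how pdftk's dump_data output is usually laid out.

```shell
#!/bin/sh
# Hypothetical helpers, not part of pdftk or Ghostscript.

# Print the value stored under one Info key (e.g. Title) in a pdftk dump
# read from stdin. Assumes each "InfoKey: X" line is directly followed by
# its "InfoValue: ..." line. Prints nothing if the key is absent.
info_value() {
  awk -v key="$1" '
    $0 == ("InfoKey: " key) { want = 1; next }
    want { if ($0 ~ /^InfoValue: /) print substr($0, 12); exit }
  '
}

# Compare a few common keys between two dump files (before, after).
# Reports each key that differs and returns non-zero if any did.
check_metadata() {
  status=0
  for key in Title Author Subject Keywords; do
    before=$(info_value "$key" < "$1")
    after=$(info_value "$key" < "$2")
    if [ "$before" != "$after" ]; then
      echo "$key differs: '$before' -> '$after'"
      status=1
    fi
  done
  return $status
}

# Possible usage, run before the clean-up step so "$METADATA" still exists
# (after.txt is an arbitrary file name):
#   pdftk "$OUTPUTPDF" dump_data_utf8 > after.txt
#   check_metadata "$METADATA" after.txt && echo "metadata preserved"
```

Only individual keys are compared rather than the whole dump, because the two dumps may list entries in a different order or contain extra entries such as the producer.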

Edit: This is a bug in Ghostscript; see the bug report and the confirmation that this is not supposed to happen.
