Ubuntu – PDF smart file diff

libreoffice pdf

I have a LibreOffice document that I converted to PDF at some point in time using the built-in capabilities. The timestamp on the PDF is later than on the word processing document, so that makes sense, but I am not absolutely sure that the word processing document produces exactly that PDF. The document is 20 pages long so it's not a good idea to check it manually.

One possibility is to redo the PDF in a different folder and then do a binary diff of the two PDFs. Unfortunately the command line diff indicates that the "binary files are different".

Is there a "smart binary diff" that will help me determine whether the difference is merely in metadata or some other inconsequential detail?

Best Answer

In general, it is a good idea to check whether command + file extension gives you what you are looking for: diff + pdf results in diffpdf.

sudo apt-get install diffpdf

DiffPDF is used to compare two PDF files. By default the comparison is of the text on each pair of pages, but comparing the appearance of pages is also supported (for example, if a diagram is changed or a paragraph reformatted). It is also possible to compare particular pages or page ranges. For example, if there are two versions of a PDF file, one with pages 1-12 and the other with pages 1-13 because of an extra page having been added as page 4, they can be compared by specifying two page ranges, 1-12 for the first and 1-3, 5-13 for the second. This will make DiffPDF compare pages in the pairs (1, 1), (2, 2), (3, 3), (4, 5), (5, 6), and so on, to (12, 13).
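To make the page-range pairing concrete, here is a minimal Python sketch of how two range specifications line up into page pairs. It is not DiffPDF's actual code: the function names expand_ranges and pair_pages are hypothetical, and raising an error when the two ranges differ in length is an assumption of this sketch, not documented DiffPDF behaviour.

```python
def expand_ranges(spec):
    """Expand a page-range string such as "1-3, 5-13" into a list of page numbers."""
    pages = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            # "5-13" means pages 5 through 13 inclusive.
            first, last = (int(n) for n in part.split("-", 1))
            pages.extend(range(first, last + 1))
        else:
            pages.append(int(part))
    return pages


def pair_pages(spec_a, spec_b):
    """Pair the n-th selected page of file A with the n-th selected page of file B."""
    pages_a = expand_ranges(spec_a)
    pages_b = expand_ranges(spec_b)
    # Assumption of this sketch: both selections must contain the same number of pages.
    if len(pages_a) != len(pages_b):
        raise ValueError("page ranges select a different number of pages")
    return list(zip(pages_a, pages_b))


# The example from the description: an extra page was inserted as page 4 of the second file.
pairs = pair_pages("1-12", "1-3, 5-13")
print(pairs[:5], "...", pairs[-1])
```

With the ranges from the description, the first file's page 4 is paired with the second file's page 5, so the inserted page is skipped and the remaining pages stay aligned.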


Source: Ubuntugeek.com.