For privacy reasons, I want to remove all metadata from a document (e.g. pdf, jpg, docx, …). Metadata in general is additional information stored alongside the actual content, such as:
- Software used
- Operating system used
- Time and sometimes place
- Camera model, used gear… (photographs, see Exif)
- …
How do I reliably strip all metadata from my pdf, jpg, docx, etc. files?
Best Answer
MAT
Have a look at MAT (Metadata Anonymisation Toolkit)! It comes from the Tor people and ships by default with Tails, a privacy- and anonymity-focused live OS.
Since it's kind of a wrapper around exiftool, it supports more file formats than exiftool alone. By now, they are:
For some more details, have a look at this paper.
BEWARE
JPEG
Comments and the standard Exif, IPTC, and XMP tags are deleted. Proprietary, non-standard tags (such as Canon Raw tags) may remain because MAT does not touch them. Such tags can be added by, for example, proprietary RAW → JPEG conversion tools.
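To see whether anything is left in a JPEG after cleaning, you can list the metadata-carrying segments yourself. The following is only a minimal sketch using the Python standard library, not part of MAT; the function name list_metadata_segments is made up for this example. It walks the marker segments in front of the image data and reports APPn and COM segments without interpreting their contents, so it cannot tell you which tags are inside them.

```python
import struct


def list_metadata_segments(data: bytes):
    """Return (marker_name, payload_length) for each APPn/COM segment.

    Hypothetical helper for illustration: it only walks the segment
    headers that precede the image scan and does not decode any tags.
    """
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream (missing SOI marker)")
    segments = []
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            break  # lost sync with the marker structure, give up
        marker = data[pos + 1]
        if marker == 0xFF:
            pos += 1  # fill byte before a marker, skip it
            continue
        if marker in (0xDA, 0xD9):
            break  # start of scan or end of image: no more headers
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if 0xE0 <= marker <= 0xEF:
            segments.append((f"APP{marker - 0xE0}", length - 2))
        elif marker == 0xFE:
            segments.append(("COM", length - 2))
        pos += 2 + length
    return segments


if __name__ == "__main__":
    import sys

    # Usage: python this_script.py cleaned.jpg
    with open(sys.argv[1], "rb") as handle:
        for name, size in list_metadata_segments(handle.read()):
            print(name, size, "bytes")
```

An APP0 segment is normally just the JFIF header. APP1 usually holds Exif or XMP and APP13 usually holds IPTC data, so those are the ones to look at more closely if they survive.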
ZIP
MAT does not alter the content of the archive. If a tool creates additional files containing metadata within the archive, they will not be touched.
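Because of this, it can be worth looking inside an archive for files that exist mainly to hold metadata. Here is a rough sketch with Python's zipfile module; the function name find_metadata_members and the list of name hints are my own assumptions for illustration and are far from exhaustive. Since docx files are ZIP archives too, the same check works on them.

```python
import io
import zipfile

# Illustrative name hints only (an assumption, not a complete list):
# OOXML document properties, ODF metadata, and files that macOS or
# Windows tools tend to drop into archives.
DEFAULT_HINTS = ("docProps/", "meta.xml", "__MACOSX/", ".DS_Store", "Thumbs.db")


def find_metadata_members(zip_source, hints=DEFAULT_HINTS):
    """Return archive member names that contain one of the hints.

    zip_source may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(zip_source) as archive:
        return [
            info.filename
            for info in archive.infolist()
            if any(hint in info.filename for hint in hints)
        ]


if __name__ == "__main__":
    # Build a small in-memory archive so the sketch runs without files.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("photo.jpg", b"image data")
        archive.writestr("__MACOSX/._photo.jpg", b"resource fork")
    print(find_metadata_members(buffer))
```

This only matches file names. It says nothing about metadata embedded inside the archived files themselves, which would still have to be cleaned one by one.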
Installation
Ubuntu 12.10 and above
Since Ubuntu 12.10, MAT has been available in the standard universe repository.
sudo apt install mat
Below Ubuntu 12.10
For older versions of Ubuntu, it has to be installed by hand. The dependencies are:
Install them via:
Then get MAT here (e.g. mat-0.6.1.tar.xz). If you want to verify your download with GnuPG, get the .asc file as well.
To check it, import the key given at the bottom of the page e.g. via
and check with:
The output should be something like
Extract and install via
Debian users will find it in the testing repository, Arch users in the AUR.
If everything went fine, you have the console tool mat as well as the GUI mat-gui.
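If you want to check from a script that both commands actually ended up on your PATH, something like this should do. It is just a small sketch; the helper name find_mat_tools is made up, and it only looks the executables up without running them.

```python
import shutil


def find_mat_tools(names=("mat", "mat-gui")):
    """Map each command name to its full path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_mat_tools().items():
        print(name, "->", path or "not found")
```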