I have two identical folders, with the same structure and contents, like this:
folder_1
    hello.txt
    subfolder
        byebye.txt
folder_2
    hello.txt
    subfolder
        byebye.txt
If I compress them as tar.xz archives, I get two different archives with two different file sizes (the difference is just a few bytes, but they're not identical).
$ (cd folder_1 && tar -Jcf archive.tar.xz *)
$ (cd folder_2 && tar -Jcf archive.tar.xz *)
I get:
folder_1/archive.tar.xz != folder_2/archive.tar.xz
and of course if I run md5sum or sha1sum on them, I get two different hashes.
And that's my problem… I need to check whether a provided archive is identical to the one I have in my storage, so I cannot rely on hashing or on simply comparing file sizes.
Using zip instead of tar.xz works, as zip always produces identical archives from identical files.
Why is this happening? Is there a way to prevent it?
Best Answer
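OK, the explanation given by ddnomad is correct. It's about the timestamp: tar stores each file's modification time in the archive, so two copies of the same files created at different moments produce different archives.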
Here is the solution: add
--mtime='1970-01-01 00:00:00'
to the tar command. This forces the timestamp of every archived file to a fixed value, so identical contents result in identical archives.
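For reference, here is a minimal sketch of the whole check, assuming GNU tar with xz support. The temporary directory, the sample file contents, and the archive names are made up for illustration. The extra options (--sort=name, which needs GNU tar 1.28 or later, and --owner/--group/--numeric-owner) are not part of the fix above; they are an assumption on my side, added in case file order or ownership also differs between the two copies.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch area holding two copies of the same tree.
work=$(mktemp -d)
for d in folder_1 folder_2; do
    mkdir -p "$work/$d/subfolder"
    printf 'hello\n' > "$work/$d/hello.txt"
    printf 'bye bye\n' > "$work/$d/subfolder/byebye.txt"
done

# Give folder_1 deliberately older timestamps so the two copies
# differ in mtime, which is the situation described in the question.
touch -t 200001010000 "$work/folder_1/hello.txt" \
    "$work/folder_1/subfolder/byebye.txt" "$work/folder_1/subfolder"

# Archive each copy with a fixed mtime. The archives are written
# outside the folders so they never end up inside themselves.
for d in folder_1 folder_2; do
    (
        cd "$work/$d"
        tar --sort=name \
            --mtime='1970-01-01 00:00:00' \
            --owner=0 --group=0 --numeric-owner \
            -Jcf "$work/$d.tar.xz" *
    )
done

# cmp exits with status 0 only if the two files are byte-identical.
if cmp -s "$work/folder_1.tar.xz" "$work/folder_2.tar.xz"; then
    echo "archives are identical"
else
    echo "archives differ"
fi
```

If the archives still differ on your system, something other than the timestamp (for example file permissions) is probably different between the two trees.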