I want to be able to compress a file losslessly, and if the original file is identical to another user's file, I want both of our compressed files to match, even if the original file dates are different.
I want to use a maximum of 1GB of RAM while compressing. I'm leaning towards an asymmetric algorithm because the files I have are fairly large, and they take at least an hour to compress with LZMA1 "ultra" in 7-zip on a P4 machine with 1GB RAM and nothing else running. I think 7-zip and FreeARC can be used for my purposes. I've tried to find the commands I should be using, but I'm not having much luck.
Edit: 100% identical files should be produced, even if the dates of creation are different. This should be possible through --nodates in Freearc, and with ???? in 7-zip. I'm looking for an equivalent command for 7-zip, and a way to standardize compression across multiple computers.
Best Answer
Create a couple of identical files:
gzip them...
observe timestamp field as the only difference:
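The three steps above can be reproduced in a short Python sketch. This is not the original shell session; it uses the standard-library `gzip` module, and the helper names `gzip_bytes` and `differing_offsets` as well as the two timestamp values are made up for illustration.

```python
import gzip
import io


def gzip_bytes(data, mtime):
    """Compress data into an in-memory gzip stream with the given MTIME."""
    buf = io.BytesIO()
    # fileobj has no name, so no FNAME field is written to the header.
    with gzip.GzipFile(fileobj=buf, mode="wb", mtime=mtime) as gz:
        gz.write(data)
    return buf.getvalue()


def differing_offsets(a, b):
    """Return the byte offsets at which two equal-length blobs differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Same payload, two arbitrary timestamps that differ in every byte.
payload = b"identical content\n" * 1000
first = gzip_bytes(payload, mtime=0x01020304)
second = gzip_bytes(payload, mtime=0x05060708)

print(len(first) == len(second))         # True
print(differing_offsets(first, second))  # [4, 5, 6, 7]
```

Offsets 4 to 7 are the little-endian MTIME field of the gzip header; everything else in the two streams is identical.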
For more info on the timestamp (the MTIME header field), see RFC 1952, the gzip file format specification.
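Now you can do one of three things: take an MD5 of each file that starts after byte 8 (so the hash skips the timestamp), zero the four timestamp bytes (offsets 4 to 7) in your files and lose their timestamps, or extract the CRC-32 of the uncompressed data from the last eight bytes of each gzip (the RFC describes the trailer layout). Note that gzip's trailer checksum is a CRC-32; the only CRC16 in the format is an optional header checksum that is usually absent.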
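A minimal sketch of those three options, assuming a single-member gzip stream held in memory. The function names (`make_gzip`, `md5_skipping_mtime`, `zero_mtime`, `trailer_crc32`) are hypothetical helpers, not part of any tool mentioned here.

```python
import gzip
import hashlib
import io
import struct
import zlib


def make_gzip(data, mtime):
    """Build an in-memory gzip stream with a chosen MTIME (for the demo)."""
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb", mtime=mtime) as gz:
        gz.write(data)
    return buf.getvalue()


def md5_skipping_mtime(blob):
    """Option 1: hash everything from offset 8 on, past the MTIME field."""
    return hashlib.md5(blob[8:]).hexdigest()


def zero_mtime(blob):
    """Option 2: overwrite the four MTIME bytes (offsets 4..7) with zeros."""
    return blob[:4] + b"\x00" * 4 + blob[8:]


def trailer_crc32(blob):
    """Option 3: read the CRC-32 of the uncompressed data from the trailer.

    The last 8 bytes are CRC32 then ISIZE, both little-endian 32-bit.
    Assumes the blob contains exactly one gzip member.
    """
    crc, _isize = struct.unpack("<II", blob[-8:])
    return crc


data = b"identical content\n" * 1000
a = make_gzip(data, mtime=1000)
b = make_gzip(data, mtime=2000000000)

print(a == b)                                            # False
print(md5_skipping_mtime(a) == md5_skipping_mtime(b))    # True
print(zero_mtime(a) == zero_mtime(b))                    # True
print(trailer_crc32(a) == (zlib.crc32(data) & 0xFFFFFFFF))  # True
```

Option 1 also skips the first four header bytes (magic, method, flags), which is harmless when both files come from the same tool with the same settings. Option 3 compares only a 32-bit checksum of the content, so it is a much weaker identity test than an MD5.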
Or, you could save without the timestamp:
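On the command line, gzip's `-n` (`--no-name`) option omits the original file name and timestamp. The same idea can be sketched in Python by writing a fixed `mtime=0`; the function name `gzip_without_timestamp` is made up for this example, and byte-identical output across machines additionally assumes the same zlib version and compression level on each.

```python
import gzip
import io


def gzip_without_timestamp(data, compresslevel=9):
    """Compress data with MTIME fixed to 0 and no stored file name."""
    buf = io.BytesIO()
    # mtime=0 means "no timestamp available" in the gzip format.
    with gzip.GzipFile(fileobj=buf, mode="wb",
                       compresslevel=compresslevel, mtime=0) as gz:
        gz.write(data)
    return buf.getvalue()


data = b"identical content\n" * 1000
one = gzip_without_timestamp(data)
two = gzip_without_timestamp(data)

print(one == two)                    # True
print(gzip.decompress(one) == data)  # True
```

Because the header no longer depends on when or where the file was compressed, an ordinary hash of the whole output can be compared between users.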