I rsynced a large file from a remote CentOS machine to a local Ubuntu machine with
rsync -avzP user@<remote-ip>:/path/to/file .
It reported the transfer went well:
sent 30 bytes received 257,293,476 bytes 1,296,188.95 bytes/sec
total size is 8,217,194,015 speedup is 31.94
As far as I know, rsync automatically verifies the transfer with hash checks once it has completed.
Out of curiosity I computed MD5 hashes on CentOS and Ubuntu, and they differ:
centos: 0faa300b7b0b81bfe65199da932eb6e2
ubuntu: f3a0fcc59516d4e68fd207bdbb1fc169
Both hashes are computed with md5sum:
centos> md5sum --version
md5sum (GNU coreutils) 8.22
ubuntu> md5sum --version
md5sum (GNU coreutils) 8.25
So the versions are slightly different, but can that lead to different hash values?
Edit:
Here is the ls -l output:
centos: -rw-rw-r--. 1 username username 8217194015
ubuntu: -rw-rw-r-- 1 username username 8217194015
The CentOS output includes a mysterious dot that I've never seen before. (Could it be related to LVM? LVM is used on that CentOS machine.)
Edit 2:
Checking with md5sum -b leads to different results as well:
centos: 0faa300b7b0b81bfe65199da932eb6e2
ubuntu: 6d799f6981066d82c7f861576b4980e1
What hash algorithm does rsync use? According to Wikipedia, rsync uses MD5 to check whether a chunk is the same:
The recipient splits its copy of the file into chunks and computes two checksums for each chunk: the MD5 hash, and a weaker but easier to compute 'rolling checksum'. It sends these checksums to the sender. The sender quickly computes the rolling checksum for each chunk in its version of the file; if they differ, it must be sent. If they're the same, the sender uses the more computationally expensive MD5 hash to verify the chunks are the same.
Best Answer
There's a wrong assumption here:
rsync uses checksums to determine whether a sync is needed. However, rsync does not reread the copy it has created; it trusts the kernel to report errors. So the conclusion is simple: the files are not identical. The difference could be a single bit or much more; a checksum cannot tell you how large the mismatch is.
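Since a single whole-file checksum cannot show where or how much the two copies differ, one option is to hash the file in fixed-size chunks on each machine and compare the lists. The following is a minimal sketch, not something rsync does for you: chunk_md5 is a hypothetical helper name, and the 64 MiB default chunk size is an arbitrary assumption. It relies only on dd, wc, cut, and md5sum.

```shell
#!/bin/sh
# chunk_md5 FILE [CHUNK_BYTES]
# Print "<chunk index> <md5>" for each fixed-size chunk of FILE.
# chunk_md5 is a hypothetical helper; 64 MiB is an assumed default.
chunk_md5() {
    file=$1
    chunk_bytes=${2:-67108864}
    size=$(wc -c < "$file")                            # file size in bytes
    chunks=$(( (size + chunk_bytes - 1) / chunk_bytes ))  # round up
    i=0
    while [ "$i" -lt "$chunks" ]; do
        # Read only chunk number i and keep just the hex digest.
        sum=$(dd if="$file" bs="$chunk_bytes" skip="$i" count=1 2>/dev/null \
              | md5sum | cut -d' ' -f1)
        printf '%s %s\n' "$i" "$sum"
        i=$((i + 1))
    done
}
```

Running chunk_md5 /path/to/file > sums.txt on both machines and then diffing the two sums.txt files shows which chunks differ. Running it twice on the same machine is also worthwhile here: the two Ubuntu digests in the question differ from each other, which may mean that reads on that machine are not stable. Once the cause is found, rsync -c (--checksum) makes rsync compare full-file checksums instead of size and modification time when deciding whether to transfer the file again.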