What you want to do is reasonable, but using rsync on its own to do it is not. So the answer is no.
The reason is simple: rsync keeps no history of what was in each directory, so it has no way of knowing what needs to be deleted and what does not. Not without additional support.
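To make "additional support" concrete: one way to know what was deleted on the source is to save the list of files seen on the previous run and compare it with the current list. This is only a sketch of that idea, not something rsync does itself; the function name `list_deleted` and the manifest file are hypothetical.

```shell
# Sketch: record which files exist in a source directory on each run, so the
# next run can tell which ones have been deleted since.
# list_deleted and the manifest path are hypothetical names for illustration.
# File names containing newlines are not handled.

list_deleted() {
    src=$1        # source directory
    manifest=$2   # file list saved by the previous run (may not exist yet)

    current=$(mktemp) || return 1
    ( cd "$src" && find . -type f | LC_ALL=C sort ) > "$current"

    if [ -f "$manifest" ]; then
        # Lines present only in the old manifest: files deleted since last run.
        LC_ALL=C comm -23 "$manifest" "$current"
    fi

    # Remember the current state for the next run.
    mv "$current" "$manifest"
}
```

Only the names printed would then need to be removed on the destination, which leaves files that exist solely on the destination untouched.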
You should ask yourself why you want to do this with rsync, and make that clearer. There are other programs that use librsync1.so and are more intelligent.
With the relaxed constraint that you don't need rsync per se, you can take a look at rdiff-backup:
mkdir a
touch a/xx
touch a/yy
rdiff-backup a b
ls b
This shows xx and yy are in b.
touch b/zz
rm a/xx
rdiff-backup a b
ls b
This shows yy and zz are in b. rdiff-backup also keeps a directory rdiff-backup-data in b so you can roll back any changes; you should purge this on a regular basis using the rdiff-backup commands. (The example uses local files to show that extra data in the target does not get deleted, but rdiff-backup works over a network as well.)
Another alternative is to set up a distributed revision control system (mercurial, bazaar, git). With mercurial, for example, you can have a script (I use a Makefile for that) that pushes all the changes to the server and then updates the checked-out files there, ignoring any additional files that are on the remote server (but have not been put under revision control).
On the server you would do:
hg init
hg add file_list_excluding_those_that_should_not_be_deleted_if_not_on_client
hg commit -m "initial setup"
On the client:
hg clone ssh://username@server/dir_to_repository
Now if you remove a file on the client (with hg remove) and do:
hg commit -m "removed file"
hg push
ssh username@server "cd dir_to_repository; hg update --clean"
Your removed file is removed on the server, but any other data (not added to the repository) does not get deleted.
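The commit, push, and remote update sequence can be wrapped in a small script. What follows is a minimal sketch rather than the Makefile mentioned above; `sync_to_server`, `run`, and the `DRY_RUN` variable are hypothetical names, and `username@server` and `dir_to_repository` are the same placeholders used in the example.

```shell
# Sketch of the commit / push / remote-update sequence as one function.
# sync_to_server, run and DRY_RUN are hypothetical names for illustration.

run() {
    # Print the command instead of running it when DRY_RUN=1.
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

sync_to_server() {
    message=$1
    # Stop at the first step that fails.
    run hg commit -m "$message" &&
    run hg push &&
    run ssh username@server "cd dir_to_repository; hg update --clean"
}
```

Running `DRY_RUN=1 sync_to_server "removed file"` prints the three commands without executing them, which is a cheap way to check the sequence before pointing it at a real server.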
There are two parts to this question. First, why is there a difference between "Number of files" and "Number of files transferred"? This is explained in the rsync manpage:
Number of files: is the count of all "files" (in the generic sense), which includes directories, symlinks, etc.
Number of files transferred: is the count of normal files that were updated via rsync’s delta-transfer algorithm, which does not include created dirs, symlinks, etc.
The difference here should equal the total number of directories, symlinks, and other special files. Those were not "transferred" but just re-created.
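The split between regular files and everything else can be cross-checked roughly with find. This is a sketch only: `count_entries` is a hypothetical helper, the count includes the top-level directory itself, and it may not match rsync's figures exactly (for example, when some regular files were already up to date and not transferred).

```shell
# Sketch: count regular files and everything else under a directory.
# count_entries is a hypothetical helper name.
# File names containing newlines would be miscounted.

count_entries() {
    dir=$1
    # $(( ... )) also strips any padding that wc adds on some systems.
    total=$(( $(find "$dir" | wc -l) ))
    regular=$(( $(find "$dir" -type f | wc -l) ))
    # Directories, symlinks and other special files.
    other=$((total - regular))
    echo "total=$total regular=$regular other=$other"
}
```

On a first full copy, `other` is the number you would expect as the gap between the two rsync statistics.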
Now for the second part: why is there a size difference with du? du shows the amount of disk space used by a file, not the size of the file. The same file can take up a different amount of disk space if, for example, the filesystems' block sizes differ.
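The distinction between a file's size and the space it occupies is easy to see on a single small file. A minimal sketch, where `show_sizes` is a hypothetical helper name:

```shell
# Sketch: compare a file's size in bytes with the disk space it occupies.
# show_sizes is a hypothetical helper name.

show_sizes() {
    file=$1
    # Size of the content in bytes.
    bytes=$(( $(wc -c < "$file") ))
    # Disk space allocated to the file, in 1024-byte units.
    disk_kib=$(du -k "$file" | awk '{print $1}')
    echo "bytes=$bytes disk_kib=$disk_kib"
}
```

On many filesystems a one-byte file is still reported as occupying a whole block (often 4 KiB), and that figure can change when the file is copied to a filesystem with a different block size.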
If you are still worried about data integrity, an easy way to be sure is to create hashes for all your files and compare:
( cd /home/hholtmann && find . -type f -exec md5sum {} \; ) > /tmp/hholtmann.md5sum
( cd /media/wd750/c51/home/ && md5sum -c /tmp/hholtmann.md5sum )
rsync will report changes for
In comments, @roaima pointed out that there is an option in the rsync manual page to give a summary of these changes:
You may find it useful, though the summary is terse and (in the version I have at hand) only reports the type (file, link or directory) and name. Here is what I see with rsync 3.0.9-4 and 3.1.1-3 on my Debian 7 and testing machines:
For my own use, changes of timestamps for directories are relatively unimportant. I use a script which shows only files which are changed:
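The script itself is not reproduced here. As a rough sketch of the idea, assuming the output of `rsync -i` (`--itemize-changes`) arrives on standard input: the second character of each itemized line is the file type, so entries for directories can be filtered out. The function name `only_files` is made up for illustration and is not the author's script.

```shell
# Sketch: keep only regular-file entries from `rsync -i` output.
# In the itemized string the first character is the update type
# (< sent, > received, c local change/creation, . not updated) and the
# second is the file type (f file, d directory, L symlink).
# only_files is a hypothetical name.

only_files() {
    # Keep entries for regular files that are being transferred (< or >).
    grep '^[<>]f'
}

# Typical use (not run here): rsync -ai --dry-run src/ dest/ | only_files
```

Directory lines such as timestamp-only updates start with `.d` and are dropped by the filter.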