Rsync – Include Only Certain File Types Excluding Some Directories

file-copy rsync

I want to rsync only certain file types (e.g. .py) and I want to exclude files in some directories (e.g. venv).

This is what I have tried:

rsync -avz --include='*/' --exclude='venv/' --include='*.py' --exclude='*' /tmp/src/ /tmp/dest/

But it doesn't work.

What am I missing?

I also followed the answer to this question but it didn't help.

Best Answer

venv/ needs to be excluded before */ is included:

rsync -avz --exclude='venv/' --include='*/' --include='*.py' --exclude='*' /tmp/src/ /tmp/dest/

The subtlety is that rsync processes rules in order and the first matching rule wins. So, if --include='*/' is before --exclude='venv/', then the directory venv/ is included by --include='*/' and the exclude rule is never consulted.
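
To see the ordering effect on a throwaway tree, something like the sketch below can help. The directory layout and file names (pkg, main.py, site.py and so on) are made up for illustration and are not from the question; only the rsync options come from the two commands above.

```shell
# Scratch tree with hypothetical file names, created under a temp directory.
work=$(mktemp -d)
mkdir -p "$work/src/pkg" "$work/src/venv/lib"
touch "$work/src/main.py" "$work/src/pkg/mod.py" \
      "$work/src/pkg/notes.txt" "$work/src/venv/lib/site.py"

# Order from the question: '*/' matches venv/ first, so venv/ is traversed.
rsync -a --include='*/' --exclude='venv/' --include='*.py' --exclude='*' \
      "$work/src/" "$work/wrong/"

# Corrected order: venv/ hits the exclude rule before '*/' can match it.
rsync -a --exclude='venv/' --include='*/' --include='*.py' --exclude='*' \
      "$work/src/" "$work/right/"

# List what each run copied.
find "$work/wrong" "$work/right" -type f | sort
```

With this layout, venv/lib/site.py should show up only under wrong/, while main.py and pkg/mod.py appear in both.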

Could we simplify this?

Why do we need --include='*/' and --exclude='*'? Why isn't --exclude=venv/ --include='*.py' sufficient?

The default is to include files/directories. So, consider:

rsync -avz --exclude='venv/' --include='*.py' source target

This would include everything except files or directories under venv/. You, however, only want .py files. That means that we have to explicitly exclude other files with --exclude='*'.
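
A quick way to check this default-include behaviour is to run the shortened command on a scratch directory. This is only a sketch; the file names are invented for the test.

```shell
# Scratch tree with hypothetical file names.
work=$(mktemp -d)
mkdir -p "$work/src/venv"
touch "$work/src/main.py" "$work/src/notes.txt" "$work/src/venv/site.py"

# No final --exclude='*': anything not matched by a rule is included by default.
rsync -a --exclude='venv/' --include='*.py' "$work/src/" "$work/dest/"

# notes.txt is copied along with main.py; only venv/ is left out.
ls "$work/dest"
```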

--exclude='*' excludes both files and directories. So, if we specify --exclude='*' on its own, all directories are excluded and only the .py files in the root directory are found. .py files in subdirectories are never found because rsync does not descend into directories that are excluded. Thus, if we have --exclude='*', we need to precede it with --include='*/' to ensure that the contents of all directories are explored.
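
The difference that --include='*/' makes can be checked with a small experiment like the one below. It is a sketch on an invented directory layout (pkg, docs, main.py are placeholder names), not the asker's actual tree.

```shell
# Scratch tree with hypothetical file names.
work=$(mktemp -d)
mkdir -p "$work/src/pkg" "$work/src/docs"
touch "$work/src/main.py" "$work/src/pkg/mod.py" "$work/src/docs/readme.txt"

# Without --include='*/': pkg/ and docs/ match --exclude='*' and are never entered.
rsync -a --include='*.py' --exclude='*' "$work/src/" "$work/flat/"

# With --include='*/': every directory is traversed, so pkg/mod.py is found.
rsync -a --include='*/' --include='*.py' --exclude='*' "$work/src/" "$work/deep/"

# Compare the two results.
find "$work/flat" "$work/deep" | sort
```

One side effect worth knowing: because --include='*/' includes every directory, directories with no .py files (docs/ here) are still created empty in the destination. Adding rsync's -m (--prune-empty-dirs) option is one way to avoid that, if empty directories are unwanted.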
