I wrote a backup script on my Debian 8 system that uses the tar command with "--exclude-from" to exclude some files and directories.
This works great, but today I would like to exclude some files sharing the same path pattern, like /home/www-data/sites/<some_string>log.txt, or directories like /home/www-data/sites/<one_or_two_directories>/vendor.
I tried to append /home/www-data/sites/*log.txt to the file, but tar fails and prints the following errors on stderr:
tar: /home/www-data/sites/*log.txt: Cannot stat: No such file or directory
tar: Exiting with failure status due to previous errors
Did I miss something when trying to use * or **?
I then read that in Unix, programs usually do not interpret wildcards themselves, which means that neither * nor ** is expanded by tar.
As far as I know, my last resort here is to expand the list using bash and append it to the exclusion file (if it's not already there) prior to the tar call. Is there a cleaner way?
EDIT
Here is the script snippet:
# ...
broot=$(dirname "${PWD}")
i="${PWD}/list.include"
x="${PWD}/list.exclude"
o="$broot/archive.tgz"
tar -zpcf "$o" -T "$i" -X "$x"
# ...
And here is the exclusion file:
/etc/php5/fpm
/etc/nginx
/etc/mysql
/home/me/websites/*log.txt
/home/me/websites/**/vendor
The goal is to exclude all log files located inside the "websites" directory, and all "vendor" directories found in any subdirectory of "websites".
Thank you!
Best Answer
The shell expands wildcards in arguments, so most applications don't need to perform any wildcard expansion. However, tar's exclude list does support wildcards, which happen to match the wildcards supported by traditional shells. Beware that there may be slight differences; for example, tar doesn't distinguish * and ** like ksh, bash and zsh can. With tar, * can match any character including /, so for example */.svn excludes a file called .svn at any level of the hierarchy. You can use tar --no-wildcards-match-slash, in which case * doesn't match directory separators.
For example, excluding /home/me/websites/*log.txt excludes /home/me/websites/log.txt, /home/me/websites/foo-log.txt and /home/me/websites/subdir/log.txt. Excluding /home/me/websites/**/vendor excludes /home/me/websites/one/vendor and /home/me/websites/one/two/vendor but not /home/me/websites/vendor. With the --no-wildcards-match-slash option, /home/me/websites/*log.txt does not exclude /home/me/websites/subdir/log.txt and /home/me/websites/**/vendor does not exclude /home/me/websites/one/two/vendor.
tar … --exclude='/home/www-data/sites/*include' … excludes the files and directories under /home/www-data/sites whose name ends with include. You might get away without the quotes, but not if you write --exclude /home/www-data/sites/*include (because then the shell would expand the wildcards before tar can see them) or if you use a shell that signals an error on non-matching wildcards (e.g. zsh in its default, and recommended, configuration).
The option --exclude-from requires a file name. The file must contain one pattern per line. Do not confuse --exclude (followed by a pattern) and --exclude-from (followed by the name of a file containing patterns).
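To try the matching rules described above without touching real data, here is a rough sketch that assumes GNU tar. It builds a throwaway tree under a temporary directory and uses relative paths; the names websites, list.exclude and the .tar files are made up for illustration and are not taken from the question's system.

```shell
#!/bin/sh
# Sketch only: scratch tree with illustrative names, assuming GNU tar.
set -e
tmp=$(mktemp -d)
cd "$tmp"
mkdir -p websites/subdir websites/vendor websites/one/vendor websites/one/two/vendor
touch websites/log.txt websites/foo-log.txt websites/subdir/log.txt websites/index.php

# One pattern per line, as --exclude-from (-X) expects.
cat > list.exclude <<'EOF'
websites/*log.txt
websites/**/vendor
EOF

# Default matching: * may also match "/".
tar -cf default.tar -X list.exclude websites
default_list=$(tar -tf default.tar)

# With --no-wildcards-match-slash, * stops at directory separators.
# The option is placed before -X so that it applies to the patterns read afterwards.
tar -cf noslash.tar --no-wildcards-match-slash -X list.exclude websites
noslash_list=$(tar -tf noslash.tar)

# --exclude takes the pattern itself; the quotes keep the shell from expanding it.
tar -cf quoted.tar --exclude='websites/*log.txt' websites
quoted_list=$(tar -tf quoted.tar)

printf 'default:\n%s\n\nno-wildcards-match-slash:\n%s\n\n--exclude:\n%s\n' \
    "$default_list" "$noslash_list" "$quoted_list"
```

The default listing should contain no log files and no vendor directory except websites/vendor/. The second listing should additionally keep websites/subdir/log.txt and websites/one/two/vendor/. The third one only drops the log files.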