I currently use Btrfs and would like to take advantage of its copy-on-write (CoW) support so that when files are copied within the same Btrfs filesystem, they are automatically deduplicated by reusing the existing extents. There are two ways I can think of to do this:
Solution one (Local)
I could simply set an alias in my .bashrc so that whenever I call cp it automatically appends the --reflink=auto flag.
alias cp='cp --reflink=auto'
Solution two (Global)
The other solution I can think of would be to create /usr/local/bin/cp, which has higher precedence in the PATH variable. The script would be something along the lines of:
#!/bin/sh
CP=/bin/cp
exec $CP --reflink=auto $*
I do not think it would be a good idea to replace /bin/cp, as updates of coreutils would end up overwriting my changes. Solution two, however, should mean that applications that call cp from the PATH (rather than directly through /bin/cp) will always automatically use reflinks.
Question
Is there any argument against this, or any situation where having this imposed would cause a problem? I assume that with the flag set to auto, cp will determine whether the underlying filesystem supports reflinks and whether source and destination are on the same filesystem, and only then use reflinks, so that there won't be a problem when I connect an external ext4 filesystem or copy between btrfs filesystems?
I have read "Why is cp --reflink=auto not the default behaviour?" and it would seem the main argument is that cp may be used to create a backup of a file. I would argue, however, that I would rather consume less space locally and have the data duplicated to another machine entirely, which is where I aim to back up data. In this case, would implementing solution 2 be safe?
In terms of minimising the local disk space usage, I have seen the suggestion of setting --sparse=always, so I suppose a similar question applies to this.
Best Answer
Note that there's a problem in your code. Leaving $* unquoted never makes sense. $* is the concatenation of the positional parameters with the first character of $IFS. And then, though there are some variations in behaviour when IFS is empty, that is then subject to word splitting and filename generation. Here, you want "$@", which expands to all the positional parameters as separate arguments.
If you want to update /bin/cp and the change to be preserved upon updates, then most systems will have a canonical way to do that. On Debian and derivatives, you'd divert the packaged file with dpkg-divert, then write /bin/cp as a wrapper script that executes /bin/cp.distrib with --reflink=auto and "$@". Every update of coreutils will then update cp.distrib instead of cp.
Note that there's a performance implication in that it needs to load and run sh before running cp. That's not as bad on Debian where /bin/sh is based on dash.
That also means error messages and help messages will mention cp.distrib instead of cp. That last part, you can work around by writing the script in bash and setting argv[0] explicitly to the script's own $0 (bash's exec builtin has a -a option for that; same with ksh93 or zsh, all, like bash, bloated shells compared to dash, though). It will not be strictly equivalent, as $0 will contain the path to that script as opposed to the argv[0] that cp initially received, but at least it will be something like /bin/cp instead of /bin/cp.distrib.
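To make the point about $* versus "$@" concrete, here is a minimal sketch that can be run without touching the real /bin/cp. The names fake_cp, wrapper_star and wrapper_at are hypothetical and exist only for this illustration; fake_cp merely prints the arguments it receives, one per line, so you can see what a real cp would have been given.

```shell
#!/bin/sh
# Illustration only: fake_cp, wrapper_star and wrapper_at are made-up names.
# Nothing here calls or modifies the real cp.

# Stand-in for cp: print every argument it receives, one per line.
fake_cp() {
    for arg in "$@"; do
        printf '<%s>\n' "$arg"
    done
}

# Forwards arguments the way the script in the question does (unquoted $*).
# The arguments are re-split on IFS characters, and any glob characters
# they contain would also be expanded against the current directory.
wrapper_star() {
    fake_cp --reflink=auto $*
}

# Forwards arguments with "$@": each one is passed through unchanged.
wrapper_at() {
    fake_cp --reflink=auto "$@"
}

echo 'with unquoted $*:'
wrapper_star "my file.txt" "backup dir"

echo 'with "$@":'
wrapper_at "my file.txt" "backup dir"
```

With the unquoted $*, the two file names arrive as four separate words, so a real cp would look for files called my, file.txt and backup. With "$@" they arrive intact. On a Debian system set up as described above, the body of the diverted /bin/cp wrapper would then be along the lines of exec /bin/cp.distrib --reflink=auto "$@".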