Linux – Syntactic differences in cp -r and how to overcome them

Tags: command line, cp, history, linux

Let's say we are in an empty directory. Then the following commands:

mkdir dir1
cp -r dir1 dir2

yield two empty directories, dir1 and dir2, where dir2 has been created as a copy of dir1. However, if we do this:

mkdir dir1
mkdir dir2
cp -r dir1 dir2

Then we instead find that dir1 has now been put inside dir2. This means that the exact same cp command behaves differently depending on whether the destination directory exists. If it does, then the cp command is doing the same as this:

mkdir dir1
mkdir dir2
cp -r dir1 dir2/.
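
The two cases can be reproduced side by side in a scratch directory. This is only a minimal sketch: the temporary directory from mktemp -d, the variable name work, and the case1/case2 subdirectories are assumptions made for illustration and are not part of the question.

```shell
#!/bin/sh
# Scratch area so nothing in the current directory is touched.
work=$(mktemp -d)
mkdir "$work/case1" "$work/case2"

# Case 1: the destination does not exist, so dir2 becomes a copy of dir1.
(
  cd "$work/case1" || exit 1
  mkdir dir1
  cp -r dir1 dir2
  ls dir2          # prints nothing: dir2 is an empty copy of dir1
)

# Case 2: the destination already exists, so dir1 is copied *into* dir2.
(
  cd "$work/case2" || exit 1
  mkdir dir1 dir2
  cp -r dir1 dir2
  ls dir2          # prints: dir1
)
```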

This seems extremely counter-intuitive to me. I would have expected cp -r dir1 dir2 (when dir2 already exists) to remove the existing dir2 (and any contents) and replace it with dir1, since cp overwrites the destination when both operands are files. I understand that recursive copies are a bit different because of how directories work in Linux (and more broadly in Unix-like systems), but I'm looking for more explanation of why this behavior was chosen. Bonus points if you can point me to a way to make cp behave as I had expected (without having to, say, test for and remove the destination directory beforehand). I tried a few cp options without any luck. I'll also accept rsync solutions, for the sake of others who happen upon this question and don't know that command.

In case this behavior is not universal, I'm on CentOS, using bash.

Best Answer

The behaviour you're seeing is what POSIX specifies for the recursive form of cp:

cp -R [-H|-L|-P] [-fip] source_file... target

[This] form is denoted by two or more operands where the -R option is specified. The cp utility shall copy each file in the file hierarchy rooted in each source_file to a destination path named as follows:

  • If target exists and names an existing directory, the name of the corresponding destination path for each file in the file hierarchy shall be the concatenation of target, a single <slash> character if target did not end in a <slash>, and the pathname of the file relative to the directory containing source_file.
  • If target does not exist and two operands are specified, the name of the corresponding destination path for source_file shall be target; the name of the corresponding destination path for all other files in the file hierarchy shall be the concatenation of target, a <slash> character, and the pathname of the file relative to source_file.

It shall be an error if target does not exist and more than two operands are specified ...

Therefore I'd say it's not possible to make cp do what you want.
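
That holds for the interface POSIX specifies. As a hedged aside, GNU coreutils cp (the implementation CentOS ships) has a -T (--no-target-directory) option that treats the destination as the target itself rather than as a directory to copy into. It avoids the nesting, but it merges into an existing dir2 instead of replacing it, so it covers only part of what the question asks for. The sketch below assumes GNU cp; the scratch directory and the file names a.txt and old.txt are made up for illustration.

```shell
#!/bin/sh
# Assumes GNU coreutils cp; -T is not part of POSIX.
work=$(mktemp -d)
mkdir "$work/dir1" "$work/dir2"
touch "$work/dir1/a.txt" "$work/dir2/old.txt"

(
  cd "$work" || exit 1
  # -T: treat dir2 as the destination itself, not as a directory to copy into.
  cp -rT dir1 dir2
  # dir2 now holds a.txt alongside the pre-existing old.txt; no nested dir1.
  ls dir2
)
```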


Since your expected behaviour is "cp -r dir1 dir2 (when dir2 already exists) would remove the existing dir2 (and any contents) and replace it with dir1":

rm -rf dir2 && cp -r dir1 dir2

You don't even need to check whether dir2 exists, because rm -f does not fail when its target is missing.
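
If this is needed repeatedly, the two steps can be wrapped in a small shell function. This is a sketch rather than a standard tool: the name replace_dir is hypothetical, the demo file names are invented, and -- is added so that names beginning with a dash are not parsed as options.

```shell
#!/bin/sh
# replace_dir SRC DST: make DST a fresh copy of SRC, discarding any old DST.
# Hypothetical helper name; not a standard command.
replace_dir() {
  rm -rf -- "$2" && cp -r -- "$1" "$2"
}

# Demo in a scratch directory (file names are illustrative).
work=$(mktemp -d)
mkdir "$work/dir1" "$work/dir2"
touch "$work/dir1/new.txt" "$work/dir2/stale.txt"

replace_dir "$work/dir1" "$work/dir2"
ls "$work/dir2"    # lists only new.txt
```

Because the old destination is deleted before the copy starts, a failed copy leaves dir2 missing or incomplete, so this is best kept for cases where dir2 can be regenerated.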


With rsync, add a trailing / to the source so that it copies the contents of dir1 into dir2 rather than copying dir1 itself into dir2. Note that files already in dir2 are kept:

$ tree dir*
dir1
└── test.txt
dir2
└── test2.txt

0 directories, 2 files
$ rsync -a dir1/ dir2
$ tree dir*
dir1
└── test.txt
dir2
├── test.txt
└── test2.txt

0 directories, 3 files
$ rm -r dir2
$ rsync -a dir1/ dir2
$ tree dir*
dir1
└── test.txt
dir2
└── test.txt

0 directories, 2 files
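
To get the replace semantics from the question without removing dir2 first, rsync's --delete option can be added so that files present only in the destination are removed. A hedged sketch, assuming rsync is installed; the scratch directory is illustrative and the file names mirror the transcript above.

```shell
#!/bin/sh
# Skip quietly if rsync is not available on this machine.
command -v rsync >/dev/null 2>&1 || { echo "rsync not installed" >&2; exit 0; }

work=$(mktemp -d)
mkdir "$work/dir1" "$work/dir2"
touch "$work/dir1/test.txt" "$work/dir2/test2.txt"

# Trailing slash on the source: copy the contents of dir1, not dir1 itself.
# --delete: remove anything in dir2 that does not exist in dir1.
rsync -a --delete "$work/dir1/" "$work/dir2"
ls "$work/dir2"    # lists only test.txt
```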