Transferring millions of files from one server to another

file-transfer, performance, rsync, scp

I have two servers. One of them holds 15 million text files (about 40 GB), which I am trying to transfer to the other. I considered zipping them and transferring the archive, but I realized that this is not a good idea.

So I used the following command:

scp -r username@ip-address:/var/www/html/txt /var/www/html/txt

But I noticed that this command transfers only about 50,000 files before the connection is lost.

Is there a better solution that lets me transfer the entire collection of files? I mean something like rsync, which would transfer only the files that didn't make it across when the connection was lost. Whenever the connection is interrupted again, I would run the command once more, and it would skip the files that have already been transferred successfully.

This is not possible with scp, because it always starts over from the first file.

Best Answer

As you say, use rsync:

rsync -azP /var/www/html/txt/ username@ip-address:/var/www/html/txt

The options are:

-a : enables archive mode, which recurses into directories and preserves symbolic links, permissions, and timestamps
-z : compresses the data during transfer to minimise network usage
-P : displays a progress bar and enables you to resume partial transfers (see the retry sketch below)
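If the connection drops, simply re-running the same rsync command resumes the job: files that already arrived intact are skipped. On a flaky link you could even wrap the command in a retry loop. This is just a sketch, reusing the paths from above; the 30-second pause between attempts is an arbitrary choice:

# Repeat until rsync exits cleanly (exit code 0); files that were
# already transferred are skipped on each pass.
until rsync -azP /var/www/html/txt/ username@ip-address:/var/www/html/txt; do
    echo "Transfer interrupted; retrying in 30 seconds..." >&2
    sleep 30
done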

As @aim says in his answer, make sure you have a trailing / on the source directory (having one on the destination as well is fine).
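To make the distinction concrete (same paths as above):

# With the trailing slash: copies the contents of txt into the remote txt directory
rsync -azP /var/www/html/txt/ username@ip-address:/var/www/html/txt
# Without it: creates an extra nested directory, /var/www/html/txt/txt, on the remote
rsync -azP /var/www/html/txt username@ip-address:/var/www/html/txt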

More info from the man page
