SSH – How to copy a file that is still being written over SSH

file-transfer, ssh

Here is the situation:

  1. I am uploading a large file from client A to a server using sftp.
  2. I also need to download this file from the server to client B over ssh.

What I would like to do is start the transfer from the server to client B when the upload is still happening from client A.

What is the best method/tool to get this done?

UPDATE:

The answers so far are interesting; I'll be sure to read and test them all. Bonus points for answers that don't depend on controlling how client A is uploading the file (i.e. the only thing we know about client A is that the file is being written to a known filename).

Best Answer

For a single file, instead of using SFTP you could pipe the file over ssh, using cat or pv at the sending side and tee on the middle server to both write the data to a file there and send a copy over another ssh link, the far end of which just writes the data to a file. A sketch of the required plumbing is below. Note that this method only works if the second destination is reachable over SSH from the server, which may not be the case given that you describe it as a client machine.
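Something like this, assuming key-based (or agent-forwarded) SSH access from the server to client B; the hostnames and paths are placeholders:

    # Run on client A. pv streams the file with a progress bar (plain cat
    # works too). On the server, tee writes a local copy while its stdout
    # is piped over a second ssh hop to client B, which writes the file.
    pv /path/to/bigfile | ssh user@server \
        "tee /srv/upload/bigfile | ssh user@clientB 'cat > /incoming/bigfile'"

The inner ssh runs on the server, so the server needs credentials for client B (agent forwarding with ssh -A on the outer connection is one way to provide them).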

Another approach, which is less of a "run and wait" solution but may otherwise be easier, is to use rsync between the server and client B. The first run may only get a partial copy of the data, but you can simply re-run it to pick up more data afterwards (with one final run once the client1->server transfer is complete). This will only work if the server writes the data directly into the final filename during the SFTP transfer (sometimes the data goes into a temporary file which is renamed once the transfer completes - this makes the file update more atomic, but it would render the rsync idea unusable).

You could also use rsync for the client1->server transfer instead of scp (with the --inplace option, to avoid the problem mentioned above). Using rsync there also protects you against having to resend everything if the client1->server connection has problems during a large transfer - I tend to use rsync --inplace -a --progress <source> <dest> instead of scp/sftp whenever rsync is available, for this "transfer resume" behaviour.

To summarise the above, run:

rsync --inplace -a --progress <source> user@server:/<destination_file_or_folder>

on client1, then run

rsync --inplace -a --progress user@server:/<destination_file_or_folder> <destination_on_cli2>

on client2 repeatedly until the first transfer is complete (then run it once more to make sure you've got everything). rsync is very good at transferring only the absolute minimum needed to bring the destination up to date, instead of transferring the whole lot each time. For paranoia you might want to add the --checksum option to the rsync commands (this takes much more CPU time for large files, but won't result in significantly more data being transferred unless it is needed), and for speed the --compress option will help if the data you are transferring is not already in a compressed format.
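If you want to automate the re-running, here is a rough polling loop for client B, assuming the upload goes directly into the final filename as discussed above. stat -c %s is the GNU coreutils syntax for file size (BSD/macOS would need stat -f %z), the hostname and paths are placeholders, and treating an unchanged size across two polls as "upload finished" is only a heuristic:

    #!/bin/sh
    # Run on client B: keep pulling the growing file with rsync, then do
    # one final checksummed pass once the remote size stops changing.
    remote=user@server
    file=/srv/upload/bigfile
    dest=/local/destination/

    prev=-1
    size=$(ssh "$remote" stat -c %s "$file")
    while [ "$size" != "$prev" ]; do
        rsync --inplace -a --progress "$remote:$file" "$dest"
        prev=$size
        sleep 30
        size=$(ssh "$remote" stat -c %s "$file")
    done

    # Final pass with --checksum, for the paranoia mentioned above.
    rsync --inplace --checksum -a --progress "$remote:$file" "$dest"

The sleep interval is arbitrary; shorten it to reduce the lag between the upload finishing and the final pass starting.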
