SSH – How to copy a file that is still being written over SSH

file-transfer, ssh

Here is the situation:

  1. I am uploading a large file from client A to a server using sftp.
  2. I also need to download this file from the server to client B over ssh.

What I would like to do is start the transfer from the server to client B when the upload is still happening from client A.

What is the best method/tool to get this done?

UPDATE:

The answers so far are interesting; I'll be sure to read and test them all. Bonus points for answers that don't depend on controlling how client A is uploading the file (i.e. the only thing we know about client A is that the file is being written to a known filename).

Best Answer

For a single file, instead of using SFTP you could pipe the file over ssh, using cat or pv at the sending side and tee on the middle server to both write the data to a file there and send a copy over another ssh link, the far end of which just writes the data to a file. A sketch of the required plumbing is below. Note that this method only works if the second destination is reachable over SSH from the server, which may not be the case given that you describe it as a client machine.
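Something like this, assuming key-based (or agent-forwarded) SSH access from the server to client B; the hostnames and paths are placeholders:

    # Run on client A. pv streams the file with a progress bar (plain cat
    # works too). On the server, tee writes a local copy while its stdout
    # is piped over a second ssh hop to client B, which writes the file.
    pv /path/to/bigfile | ssh user@server \
        "tee /srv/upload/bigfile | ssh user@clientB 'cat > /incoming/bigfile'"

The inner ssh runs on the server, so the server needs credentials for client B (agent forwarding with ssh -A on the outer connection is one way to provide them).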

Another approach, which is less of a "run and wait" solution but may otherwise be easier, is to use rsync between the server and client B. The first run may only get a partial copy of the data, but you can simply re-run it to pick up more data afterwards (with one final run once the client1->server transfer is complete). This will only work if the server writes the data directly into the final filename during the SFTP transfer (sometimes the data goes into a temporary file which is renamed once the transfer completes - this makes the file update more atomic, but it would render the rsync idea unusable).

You could also use rsync for the client1->server transfer instead of scp (with the --inplace option, to avoid the problem mentioned above). Using rsync there also protects you against having to resend everything if the client1->server connection has problems during a large transfer - I tend to use rsync --inplace -a --progress <source> <dest> instead of scp/sftp whenever rsync is available, for this "transfer resume" behaviour.

To summarise the above, run:

rsync --inplace -a --progress <source> user@server:/<destination_file_or_folder>

on client1, then run

rsync --inplace -a --progress user@server:/<destination_file_or_folder> <destination_on_cli2>

on client2 repeatedly until the first transfer is complete (then run it once more to make sure you've got everything). rsync is very good at transferring only the absolute minimum needed to bring the destination up to date, instead of transferring the whole lot each time. For paranoia you might want to add the --checksum option to the rsync commands (this takes much more CPU time for large files, but won't result in significantly more data being transferred unless it is needed), and for speed the --compress option will help if the data you are transferring is not already in a compressed format.
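If you want to automate the re-running, here is a rough polling loop for client B, assuming the upload goes directly into the final filename as discussed above. stat -c %s is the GNU coreutils syntax for file size (BSD/macOS would need stat -f %z), the hostname and paths are placeholders, and treating an unchanged size across two polls as "upload finished" is only a heuristic:

    #!/bin/sh
    # Run on client B: keep pulling the growing file with rsync, then do
    # one final checksummed pass once the remote size stops changing.
    remote=user@server
    file=/srv/upload/bigfile
    dest=/local/destination/

    prev=-1
    size=$(ssh "$remote" stat -c %s "$file")
    while [ "$size" != "$prev" ]; do
        rsync --inplace -a --progress "$remote:$file" "$dest"
        prev=$size
        sleep 30
        size=$(ssh "$remote" stat -c %s "$file")
    done

    # Final pass with --checksum, for the paranoia mentioned above.
    rsync --inplace --checksum -a --progress "$remote:$file" "$dest"

The sleep interval is arbitrary; shorten it to reduce the lag between the upload finishing and the final pass starting.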
