Background: I'm investigating methods for encrypted storage on untrusted machines. My current setup uses sshfs to access a LUKS-encrypted image on the remote machine, which is decrypted locally and mounted as ext3. (If I were to use sshfs only, someone gaining access to the remote machine could see my data.) Here's my example setup:
# On the local machine:
sshfs remote:/home/crypt /home/crypt
cryptsetup luksOpen /home/crypt/container.img container
mount /dev/mapper/container /home/crypt-open
# Place cleartext files in /home/crypt-open,
# then reverse the above steps to unmount.
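For completeness, the teardown would be the setup steps in reverse; this is a sketch using the same paths and mapper name as above (these commands need the mounts to actually exist, so they are illustrative only):

```shell
# On the local machine, in reverse order of the setup:
umount /home/crypt-open            # flush and detach the local ext3 filesystem
cryptsetup luksClose container     # close the LUKS mapping
fusermount -u /home/crypt          # unmount the sshfs share
```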
I want to make this resilient against network failures. To do this, I'd like to understand what caching / buffering happens with this setup. Consider these two commands:
dd if=/dev/random of=/home/crypt-open/test.dat bs=1000000 count=100
dd if=/dev/random of=/home/crypt-open/test.dat bs=1000000 count=100 conv=fsync
The first command returns very quickly, and I can see from the network traffic that the data is still being transmitted after the command has returned. The second command seems to wait until the data is finished transferring.
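The difference between the two commands can be reproduced on any local filesystem: conv=fsync makes dd issue a single fsync() after the last write, while oflag=sync opens the output file with O_SYNC so every write is synchronous. A small self-contained demonstration on a scratch file:

```shell
# Demonstration on a local scratch file (no remote setup needed).
tmpfile=$(mktemp)

# conv=fsync: dd calls fsync(fd) once after the final write, so it does not
# return until the filesystem below reports the data stable.
dd if=/dev/zero of="$tmpfile" bs=1M count=4 conv=fsync 2>/dev/null

# oflag=sync: the file is opened O_SYNC, syncing after *every* write instead.
dd if=/dev/zero of="$tmpfile" bs=1M count=4 oflag=sync 2>/dev/null

rm -f "$tmpfile"
```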
Concrete questions: What guarantees does fsync() make under this setup? When fsync() returns, how far down these layers is the data guaranteed to be synced? And what can I do to guarantee that it gets synced all the way down to the remote machine's hard drive?
--- /home/crypt-open on the local machine
|
| (ext3 fs)
|
--- /dev/mapper/container on the local machine
|
| (LUKS)
|
--- /home/crypt/container.img on the local machine
|
| (sshfs)
|
--- /home/crypt/container.img on the remote machine
|
| (ext3 fs)
|
--- hard drive on the remote machine
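For reference, one way to exercise each layer's sync path from the shell is coreutils sync (a reasonably recent version), which can fsync a specific file or syncfs the filesystem containing it; whether the sshfs layer then forces the remote disk to sync is exactly the question:

```shell
# Flush the local ext3-on-LUKS layers: fsync the file itself, then
# syncfs() the filesystem that contains it.
sync /home/crypt-open/test.dat     # sync(1) with a file argument calls fsync()
sync -f /home/crypt-open/test.dat  # -f calls syncfs() on the containing fs

# Flush the container image through the sshfs layer (FUSE fsync):
sync /home/crypt/container.img
```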
Best Answer
I'd assume the weakest link here is the SSHFS code; the rest of the stack is in the kernel and heavily used, so it's probably fine. I've never actually looked at any FUSE code before, so there could be something going on that I've missed, but according to the SSHFS source code, SSHFS's implementation of fsync() doesn't do much: it just calls flush() on the IO stream. At sshfs.c:2551 we can see that the sshfs_flush() function doesn't send any sort of sync command to the remote machine that would force an fsync there. I believe the sshfs.sync_write flag means "wait for writes to reach the server before returning from write()", not "fsync on the server on every write", because that second meaning would be very odd. So your fsync measurement is slower because it's bottlenecked by network speed, not remote disk speed.

It's possible that the remote SFTP implementation does fsync on every write, but I don't think that's what's happening. An old draft of the SFTP standard (the best reference I can find) defines a way to request that behavior when opening a file, which implies it isn't the default (as it's faster not to fsync). That same document doesn't appear to offer a way to request an fsync of a remote file on demand, but OpenSSH supports one as an extension to SFTP (fsync@openssh.com).
I doubt it would be hard to query for that extension and properly support fsync() in SSHFS; that seems a pretty reasonable thing to do. That said, it would probably be easier to use Linux's network block device (NBD) support instead, which I assume handles all of this properly (though I've never used it myself, so it could be horrible).
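A rough sketch of the NBD alternative, assuming the standard nbd-server/nbd-client tools; the export name is made up for illustration and exact option syntax varies between nbd-client versions, so treat this as a starting point rather than a recipe:

```shell
# On the remote machine: export the container file as a block device.
# /etc/nbd-server/config (illustrative):
#   [generic]
#   [crypt]
#   exportname = /home/crypt/container.img
nbd-server

# On the local machine: attach the export as /dev/nbd0, then layer
# LUKS and ext3 on top of it, exactly as with the sshfs setup.
nbd-client remote -N crypt /dev/nbd0
cryptsetup luksOpen /dev/nbd0 container
mount /dev/mapper/container /home/crypt-open
```

With this arrangement fsync() on the local ext3 filesystem propagates as block-layer flush requests over NBD rather than relying on sshfs's fsync implementation.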