Ssh – Meaning of fsync() in sshfs+LUKS setup

buffer luks sshfs

Background: I'm investigating methods for encrypted storage on untrusted machines. My current setup uses sshfs to access a LUKS-encrypted image on the remote machine, which is decrypted locally and mounted as ext3. (If I were to use sshfs only, someone gaining access to the remote machine could see my data.) Here's my example setup:

# On the local machine:
sshfs remote:/home/crypt /home/crypt
cryptsetup luksOpen /home/crypt/container.img container
mount /dev/mapper/container /home/crypt-open

# Place cleartext files in /home/crypt-open,
# then reverse the above steps to unmount.

I want to make this resilient against network failures. To do this, I'd like to understand what caching / buffering happens with this setup. Consider these two commands:

dd if=/dev/random of=/home/crypt-open/test.dat bs=1000000 count=100
dd if=/dev/random of=/home/crypt-open/test.dat bs=1000000 count=100 conv=fsync

The first command returns very quickly, and I can see from the network traffic that the data is still being transmitted after the command has returned. The second command seems to wait until the data is finished transferring.
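For reference, the two dd invocations differ only in whether fsync() is called on the output file before dd exits. Here is a minimal C sketch of that difference, assuming a hypothetical helper named write_file() (not part of dd or any library). It says nothing about how far the sync propagates through the layers in this setup, which is exactly what I'm asking about.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write len bytes to path and optionally fsync()
 * before closing, roughly what dd does without and with conv=fsync.
 * Returns 0 on success, -1 on failure with errno set. */
static int write_file(const char *path, const void *buf, size_t len,
                      int do_fsync)
{
    const char *p = buf;
    int saved;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;

    /* write() may write fewer bytes than requested, so loop. */
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            goto fail;
        }
        p += n;
        len -= (size_t)n;
    }

    /* Without this call, a successful write() only means the data was
     * accepted by the local kernel (typically into the page cache). */
    if (do_fsync && fsync(fd) < 0)
        goto fail;

    return close(fd);

fail:
    saved = errno;
    close(fd);
    errno = saved;
    return -1;
}
```

Calling write_file(path, buf, len, 0) corresponds to the first dd command, and write_file(path, buf, len, 1) to the second.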

Concrete questions: What guarantees does fsync() make under this setup? When fsync() returns, how far along these layers is the data guaranteed to be synced? And what can I do to guarantee that it gets synced all the way down to the remote machine's hard drive?

--- /home/crypt-open on the local machine
|
| (ext3 fs)
|
--- /dev/mapper/container on the local machine
|
| (LUKS)
|
--- /home/crypt/container.img on the local machine
|
| (sshfs)
|
--- /home/crypt/container.img on the remote machine
|
| (ext3 fs)
|
--- hard drive on the remote machine

Best Answer

I'd assume the weakest link here is the SSHFS code; the rest of the stack is in the kernel and heavily used, so it's probably fine. I've never looked at any FUSE code before, so I may have missed something, but according to the SSHFS source code, SSHFS's implementation of fsync() doesn't do much: it ignores the isdatasync argument and just calls sshfs_flush().

static int sshfs_fsync(const char *path, int isdatasync,
                       struct fuse_file_info *fi)
{
    (void) isdatasync;
    return sshfs_flush(path, fi);
}

At sshfs.c:2551, we can see that the sshfs_flush() function doesn't send any sort of sync command to the remote machine to force an fsync there. As far as I can tell, it only waits for the write requests already queued for the file to finish, or returns immediately when sshfs.sync_write is set. I believe the sshfs.sync_write flag means "wait for commands to go to the server before returning from write", not "fsync on the server on every write", because that second meaning would be very odd. Thus your fsync measurement is slower because it's bottlenecked by network speed, not by remote disk speed.

static int sshfs_flush(const char *path, struct fuse_file_info *fi)
{
    int err;
    struct sshfs_file *sf = get_sshfs_file(fi);
    struct list_head write_reqs;
    struct list_head *curr_list;

    if (!sshfs_file_is_conn(sf))
        return -EIO;

    if (sshfs.sync_write)
        return 0;

    (void) path;
    pthread_mutex_lock(&sshfs.lock);
    if (!list_empty(&sf->write_reqs)) {
        curr_list = sf->write_reqs.prev;
        list_del(&sf->write_reqs);
        list_init(&sf->write_reqs);
        list_add(&write_reqs, curr_list);
        while (!list_empty(&write_reqs))
            pthread_cond_wait(&sf->write_finished, &sshfs.lock);
    }
    err = sf->write_error;
    sf->write_error = 0;
    pthread_mutex_unlock(&sshfs.lock);
    return err;
}

Note that it's possible that the remote SFTP implementation does fsync on writes, but I don't think that's what's happening. According to an old draft of the SFTP standard (the best I can find), there is a way to request this behavior:

7.9. attrib-bits and attrib-bits-valid
...
SSH_FILEXFER_ATTR_FLAGS_SYNC
       When the file is modified, the changes are written synchronously
       to the disk.

which would imply that this isn't the default (since not fsyncing is faster). According to that draft, there doesn't appear to be a way to request an fsync on the remote file, but it looks like OpenSSH supports this as an extension to SFTP:

/* SSH2_FXP_EXTENDED submessages */
struct sftp_handler extended_handlers[] = {
    ...
    { "fsync", "fsync@openssh.com", 0, process_extended_fsync, 1 },
    ...
};

static void
process_extended_fsync(u_int32_t id)
{
    int handle, fd, ret, status = SSH2_FX_OP_UNSUPPORTED;

    handle = get_handle();
    debug3("request %u: fsync (handle %u)", id, handle);
    verbose("fsync \"%s\"", handle_to_name(handle));
    if ((fd = handle_to_fd(handle)) < 0)
        status = SSH2_FX_NO_SUCH_FILE;
    else if (handle_is_ok(handle, HANDLE_FILE)) {
        ret = fsync(fd);
        status = (ret == -1) ? errno_to_portable(errno) : SSH2_FX_OK;
    }
    send_status(id, status);
}
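To give an idea of what supporting this on the client side would involve, here is a rough sketch of serializing the request. It assumes the layout given in OpenSSH's PROTOCOL document as I understand it: an SSH_FXP_EXTENDED packet (type 200) carrying a request id, the string "fsync@openssh.com", and the file handle as a string. The function names are hypothetical and are not taken from SSHFS or OpenSSH. A real client should also first check that the server advertised the extension in its version reply.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SSH_FXP_EXTENDED 200 /* SFTP packet type for extension requests */

/* Store a big-endian uint32 at out; returns the number of bytes written. */
static size_t put_u32(uint8_t *out, uint32_t v)
{
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
    return 4;
}

/* Store an SSH "string": a uint32 length followed by the raw bytes. */
static size_t put_string(uint8_t *out, const void *data, uint32_t len)
{
    put_u32(out, len);
    memcpy(out + 4, data, len);
    return 4 + (size_t)len;
}

/* Hypothetical helper: serialize an fsync@openssh.com request into out.
 * Returns the total packet size, or 0 if cap is too small. */
static size_t build_fsync_request(uint8_t *out, size_t cap, uint32_t id,
                                  const void *handle, uint32_t handle_len)
{
    static const char name[] = "fsync@openssh.com";
    const uint32_t name_len = (uint32_t)(sizeof(name) - 1);
    /* Payload: type byte, request id, extension name, file handle. */
    const size_t payload = 1 + 4 + (4 + (size_t)name_len)
                         + (4 + (size_t)handle_len);
    size_t off = 0;

    if (cap < 4 + payload)
        return 0;

    off += put_u32(out + off, (uint32_t)payload); /* length prefix */
    out[off++] = SSH_FXP_EXTENDED;
    off += put_u32(out + off, id);
    off += put_string(out + off, name, name_len);
    off += put_string(out + off, handle, handle_len);
    return off;
}
```

The server answers with a status message, as process_extended_fsync() above shows, so the client would have to wait for that reply before returning from its own fsync().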

I doubt it'd be hard to query for that extension and properly support fsync in SSHFS; that seems a pretty reasonable thing to do. That said, I think it'd probably be easier to just use Linux's network block device support, which I assume supports all of this properly (though I've never used it myself, so it could be horrible).
