Using MySQL and LxD over Btrfs

btrfsext4lxcMySQLsnapshot

I have a Ubuntu 16.04 workstation, with an ext4 filesystem.

  1. When playing with LxD, I would like to have a lightening-fast snapshoting ability (since my images would usually be big). This ability, in my understanding, would be achievable only if the backing filesystem is a CoW filesystem such as Btrfs.

(Note: The only major performance-related caveat for Btrfs that I've come across so far is the recommended use of the noatime mount flag.)

  1. But since I also have a MySQL instance on this system whose performance I would not like to see suffer (relative to ext4 filesystem), I decided to make one final check for any potential issues with my switchover to a Btrfs-backed MySQL. And, look what I found:

It's usually best to mount Btrfs with the 'nodatacow' option, disabling copy-on-write, because COW causes fragmentation, dish thrashing, and CPU and RAM spikes when you have a lot of random writes.

Now, that appears to be a real dampener!

Question: Is there any way I can have both fast snapshotting and a performant MySQL instance with Btrfs?

Best Answer

Fragmentation is a largely unavoidable side effect of a file system being designed with copy-on-write. It's also what allows the nearly free file system snapshots in the first place.

The reason for this is fairly simple: Every time a block is changed, the new block must be written to some location other than that of the original block. So even if the file was contiguous originally, it won't be after it has been modified.

I don't know how Btrfs nodatacow interacts with snapshots, but I have a feeling that the instant you have a snapshot on a dataset, you force at least partial copy-on-write behavior no matter what flags you are using; otherwise how would you be able to access the old data via the snapshot?

However, it is not a given that this will necessarily severely affect your MySQL performance, for two reasons:

  • Modern disks are really pretty fast for single-user workloads (which I take it that you are most interested in because you mention that your system is a "workstation")
  • Modern operating systems have pretty good caching algorithms, thereby reducing the need to actually hit physical storage

Just to give you an idea, I'm running ZFS myself (from which Btrfs borrows many ideas), and there's currently a scrub in progress. The pool in question is a six-disk raidz2, which isn't really known for its stellar performance, physically backed by six 7200 rpm disks (two SATA, four SAS) which also aren't exactly known for stellar IOPS in particular. A ZFS scrub navigates the entire on-disk Merkle tree, reads all data, and verifies checksums on everything to make sure everything reads back as it was previously written; in my case, computing SHA-256 hashes of everything along the way. The current scrub speed (after it got past the initial, metadata-heavy portion, which involves heavy seeking) is hovering right around 200 MB/s and actually climbing slowly. And that's for actual platter I/O, with no caching involved (because caching doesn't make any sense when you want to verify what's on persistent storage).

Sure, it's very likely that you will see some performance degredation from fragmentation if you move to a copy-on-write file system. But you can't eat the cake and keep it, too; if fast, low cost snapshots is something you want, it's likely that you are going to have to give up something else to get them.

What I would do in your case is benchmark. Set up some Btrfs storage, put a copy of the MySQL database there, and see how the two perform under reasonable workloads.

Related Question