MySQL Backup – Non-Blocking MySQL Backup Methods

Tags: backup, MySQL, mysqldump, performance, replication

I'm running a MySQL server with a couple of customer databases, some of which are quite large. We do a complete MySQL dump once a day, which takes about half an hour to complete.

During this time, every request to the MySQL server gets a ridiculously slow response (sometimes around 1 s instead of the usual ~10 ms). Even though there are no timeout errors, I would love to see queries complete in a reasonable amount of time, so customer websites are not slowed down by hanging MySQL queries.

I researched a lot and learned about data replication, LVM snapshots, various mysqldump flags and other workarounds, but nothing really improved the situation.

Is there any option to set a query priority in MySQL?

If there were a way to define query priorities, I could give the backup process a lower priority to make sure all other queries are executed faster. The mysqldump process could be temporarily paused (SIGTSTP) whenever it competes with another request and resumed (SIGCONT) once that query has finished. However, I couldn't find a feature like this in MySQL natively.
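For what it's worth, the pause/resume mechanism itself is easy to sketch outside of MySQL with plain stop signals. In this illustration a long `sleep` stands in for the mysqldump process, and SIGSTOP is used instead of SIGTSTP because SIGSTOP cannot be caught, ignored, or discarded:

```shell
# Sketch of the pause/resume idea using stop signals.
# A long 'sleep' stands in for the mysqldump process here.
sleep 60 &            # stand-in for the backup process
pid=$!

kill -STOP "$pid"     # pause the backup
sleep 0.2
ps -o stat= -p "$pid" # process state now starts with 'T' (stopped)

kill -CONT "$pid"     # resume it once the urgent query has finished
sleep 0.2
ps -o stat= -p "$pid" # back to 'S' (sleeping) / running

kill "$pid"           # clean up
```

The missing piece, of course, is the trigger: something would still have to watch for incoming queries and send the signals, which is exactly what MySQL doesn't provide natively.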

Giving the mysqldump process a higher nice value doesn't help, because it only increases the time during which both the backup process and the table locks are present. (I'm locking per table.) Likewise, limiting IO write rates just prolongs the pain.
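A commonly suggested variant of the "lower priority" idea targets IO scheduling rather than CPU: on Linux, ionice (from util-linux) can put the backup in the idle IO class, so it only gets disk time when nothing else is waiting. Whether this helps depends on the IO scheduler in use (it needs one with priority support, such as CFQ/BFQ); the paths here are placeholders:

```shell
# Run the backup's disk-heavy step in the idle IO scheduling class (-c 3):
# it only receives disk bandwidth when no other process is requesting IO.
# Only effective with IO schedulers that honor priorities (e.g. CFQ/BFQ).
ionice -c 3 tar -czf /backup/output.tgz ./
```

Note that, like the nice approach, this can stretch out the backup window, so it shares the "longer pain" trade-off described above.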

Are there any solutions I missed, other than LVM and replication?

Best Answer

Indeed, the bottleneck was IO. Since I was using tar -czf to compress the files and the CPU was fast enough to keep up, the drive simply couldn't handle the data arriving that quickly (even though we use mirrored RAID volumes).

The mysqldump itself was no problem at all, but tar was slowing down the whole system massively.

I'm now using pv to shape the pipe stream:

tar -czf - ./ | pv -q -L 10m > output.tgz
  • -q disables pv's output.
  • -L 10m limits the pipe to 10 MB per second. Test which value fits your server by watching the gzip process in top; I wanted it at around 50% CPU usage, which lowers the IO impact to about 50% as well.
  • CPU usage is lowered implicitly, because pv slows down the whole pipeline (except for the dump itself, which is finished before compression starts).
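The advice to test which -L value fits can be sanity-checked with a small timed pipe before touching the real backup; the numbers below are just for illustration (1 MiB at -L 512k should take roughly 2 seconds):

```shell
# Sanity-check pv's rate limit: push a known amount of data through it
# and time the transfer. 1 MiB limited to 512 KiB/s ~= 2 seconds.
start=$(date +%s)
dd if=/dev/zero bs=1024 count=1024 2>/dev/null | pv -q -L 512k > /dev/null
end=$(date +%s)
echo "elapsed: $((end - start))s"
```

The same measurement works with the real tar pipeline: raise or lower -L until top shows the gzip process near the CPU share you are aiming for.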

Many thanks to @Michael - sqlbot and others for pointing me in the right direction.