Mysql – How to perform this database backup/sanitization task

backupgaleramariadbMySQL

I have a 3-node MariaDB/Galera cluster. I need to do two things:

  1. Back up the database regularly (multiple gigs in size) for a normal, in-case-everything-crashes-I-can-restore-from-this backup.
  2. Have a sanitized version in which private data is obfuscated and key values are changed, for the purpose of distributing to developers. (The idea is that every day, they can grab a fresh "devel snapshot" that is a content- and structural- facsimile of the real thing, but that in which any potentially sensitive data is sanitized, and devel-centric settings are changed.

I know how to do a mysqldump, but I'm afraid this will lock the db.

I know how to write the queries to obfuscate the data, but obviously I can't do this on the live db. (I'm assuming that I need to import this into a different database… perhaps on a separate machine not in the cluster?)

I need this to be scripted (which I'm not scared of doing, I'm just not sure where to start.)

Can someone point me in the right direction?

Best Answer

Setup a slave.

Once everything is sync'd up and good.

  • Stop the slave (sql thread).
  • mysqldump slave data to orig.sql
  • run your queries to modify the data
  • mysqldump again to dev.sql
  • Restore the data using orig.sql
  • start slave to resume collection from the master.

This will take some time with a large db of course but it seems pretty straight forward to me. It won't lock your master cluster at all (at least until you create the slave). If you are modifying only a few tables you can tweak the mysqldump.

You can cron the operation and upload it to s3 or something for the devs. Use some fast SSDs.