MySQL – How should I migrate a large MySQL database to RDS?

Tags: MySQL, mysqldump

I've already looked into this a little bit. I realize there are similar questions on Stack Overflow, and Amazon themselves have a helpful document giving advice here:

http://aws.amazon.com/articles/2933

My concerns are the following:

Amazon recommends using mysqldump only for "small amounts of data", which they define as less than 1GB. The database I intend to migrate is over 20GB.

One thing that's nice about mysqldump, however, is that it has the --single-transaction flag, which allows me to ensure a DB state that is consistent with a single point in time.
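For reference, the consistent dump I have in mind looks roughly like this (host, user, and database names are placeholders; --single-transaction and --quick are real mysqldump flags, the latter streaming rows instead of buffering whole tables in memory):

```shell
# Placeholder credentials/names; adjust for your source server.
DB=mydb
DUMP_CMD="mysqldump --single-transaction --quick \
--host=sourcehost --user=sourceuser --password=sourcepass $DB"
echo "$DUMP_CMD"               # inspect the command
# $DUMP_CMD > /tmp/$DB.sql     # uncomment to run against a live server
```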

For larger amounts of data, Amazon's recommendation is to export the database into flat (e.g., CSV) files and then use mysqlimport to import those to RDS. The best way I know how to do this, however, is through the SELECT ... INTO OUTFILE command, which only operates one table at a time. The downside to this, of course, is that it doesn't provide the consistency guarantee of --single-transaction.
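To avoid writing one SELECT ... INTO OUTFILE statement per table by hand, the statements can be generated from information_schema — a sketch, assuming a hypothetical database named mydb and a source server where the MySQL user has the FILE privilege:

```shell
DB=mydb   # hypothetical source database name
# Build a query that emits one "SELECT ... INTO OUTFILE" statement per table;
# feeding its output (e.g. via mysql -N -e "$GEN_SQL") back into mysql runs
# the exports. Note INTO OUTFILE writes files on the *server* host.
GEN_SQL="SELECT CONCAT('SELECT * FROM ', table_name,
  ' INTO OUTFILE ''/tmp/', table_name, '.csv''',
  ' FIELDS TERMINATED BY '','';')
FROM information_schema.tables WHERE table_schema = '$DB';"
echo "$GEN_SQL"
# The resulting CSVs could then be loaded into RDS with something like:
# mysqlimport --local --fields-terminated-by=',' \
#   --host=yourrds.rds.amazonaws.com --user=rdsuser --password=rdspassword \
#   "$DB" /tmp/*.csv
```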

I suppose I could ensure consistency by taking the entire DB down temporarily, but I'd like to avoid that if at all possible.

  1. What's the best way to get my large (> 20GB) database into flat files so that I can then use mysqlimport?
  2. If it is indeed the SELECT ... INTO OUTFILE command, how do I export all of the tables in the database (preferably without having to do one at a time)?
  3. Is there any good way to ensure consistency throughout all this?

Best Answer

I recently spent a lot of time figuring out a 15GB migration to RDS. I ended up finding a script on one of the Amazon forums, modified it for my own use, and it seems to work well. I'm not sure whether you can do a single transaction, but the dump itself is very quick compared to the actual transfer; my 15GB took only about 12 minutes to dump, so even without the single-transaction option I don't think there's a very long window for inconsistencies to occur. I'm not sure if that's good enough for you, but I found this solution a lot more graceful than the flat-file method.

#!/bin/bash

# Databases to migrate (placeholder names)
declare -a dbs=(dbname1 dbname2 dbname3 dbname4)

for db in "${dbs[@]}"; do

echo "Dumping $db DB"
time mysqldump --order-by-primary --host=sourcehost --user=sourceuser --password=sourcepass "$db" > "/tmp/$db.sql"

echo "Adding optimizations to $db"
# Prepend session settings that speed up the bulk load
awk 'NR==1{$0="SET autocommit=0; SET unique_checks=0; SET foreign_key_checks=0;\n"$0}1' "/tmp/$db.sql" > "/tmp/${db}X.sql"
mv "/tmp/${db}X.sql" "/tmp/$db.sql"
# Re-enable the checks and commit at the end of the dump
echo "SET unique_checks=1; SET foreign_key_checks=1; COMMIT;" >> "/tmp/$db.sql"

echo "Copy $db into RDS"
# The trailing & starts each import in the background so the loads run in parallel
time mysql --host=yourrds.rds.amazonaws.com --user=rdsuser --password=rdspassword "$db" < "/tmp/$db.sql" &

done
wait   # don't exit until all background imports finish
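If a consistent snapshot matters, mysqldump does accept --single-transaction (for InnoDB tables) alongside the flags above, so the dump line in the script could be amended — a sketch using the same placeholder credentials:

```shell
db=dbname1   # one of the databases from the script above
# --single-transaction takes a consistent InnoDB snapshot without locking tables.
dump_cmd="mysqldump --single-transaction --order-by-primary \
--host=sourcehost --user=sourceuser --password=sourcepass $db"
echo "$dump_cmd"                 # inspect the command
# $dump_cmd > "/tmp/$db.sql"     # uncomment to run against a live server
```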