MySQL – How should I migrate a large MySQL database to RDS?

Tags: MySQL, mysqldump

I've already looked into this a little bit. I realize there are similar questions on Stack Overflow, and Amazon themselves have a helpful document giving advice here:

http://aws.amazon.com/articles/2933

My concerns are the following:

Amazon recommends using mysqldump only for "small amounts of data", which they define as less than 1GB. The database I intend to migrate is over 20GB.

One thing that's nice about mysqldump, however, is that it has the --single-transaction flag, which allows me to ensure a DB state that is consistent with a single point in time.
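For reference, the consistent dump I have in mind looks roughly like this (host, user, and database names are placeholders; --single-transaction and --quick are real mysqldump flags, the latter streaming rows instead of buffering whole tables in memory):

```shell
# Placeholder credentials/names; adjust for your source server.
DB=mydb
DUMP_CMD="mysqldump --single-transaction --quick \
--host=sourcehost --user=sourceuser --password=sourcepass $DB"
echo "$DUMP_CMD"               # inspect the command
# $DUMP_CMD > /tmp/$DB.sql     # uncomment to run against a live server
```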

For larger amounts of data, Amazon's recommendation is to export the database into flat (e.g., CSV) files and then use mysqlimport to import those to RDS. The best way I know how to do this, however, is through the SELECT ... INTO OUTFILE command, which only operates one table at a time. The downside to this, of course, is that it doesn't provide the consistency guarantee of --single-transaction.
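To avoid writing one SELECT ... INTO OUTFILE statement per table by hand, the statements can be generated from information_schema — a sketch, assuming a hypothetical database named mydb and a source server where the MySQL user has the FILE privilege:

```shell
DB=mydb   # hypothetical source database name
# Build a query that emits one "SELECT ... INTO OUTFILE" statement per table;
# feeding its output (e.g. via mysql -N -e "$GEN_SQL") back into mysql runs
# the exports. Note INTO OUTFILE writes files on the *server* host.
GEN_SQL="SELECT CONCAT('SELECT * FROM ', table_name,
  ' INTO OUTFILE ''/tmp/', table_name, '.csv''',
  ' FIELDS TERMINATED BY '','';')
FROM information_schema.tables WHERE table_schema = '$DB';"
echo "$GEN_SQL"
# The resulting CSVs could then be loaded into RDS with something like:
# mysqlimport --local --fields-terminated-by=',' \
#   --host=yourrds.rds.amazonaws.com --user=rdsuser --password=rdspassword \
#   "$DB" /tmp/*.csv
```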

I suppose I could ensure consistency by taking the entire DB down temporarily, but I'd like to avoid that if at all possible.

  1. What's the best way to get my large (> 20GB) database into flat files so that I can then use mysqlimport?
  2. If it is indeed the SELECT ... INTO OUTFILE command, how do I export all of the tables in the database (preferably without having to do one at a time)?
  3. Is there any good way to ensure consistency throughout all this?

Best Answer

I recently spent a lot of time figuring out a 15GB migration to RDS. I ended up finding a script on one of the Amazon forums, modified it for my own use, and it seems to work well. I'm not sure whether you can do a single transaction, but the dump itself is very quick compared to the actual transfer; my 15GB took only about 12 minutes to dump, so even without the single-transaction option I don't think there's a very long window for inconsistencies to occur. I'm not sure if that's good enough for you, but I found this solution a lot more graceful than the flat-file method.

#!/bin/bash

# Databases to migrate (placeholder names)
declare -a dbs=(dbname1 dbname2 dbname3 dbname4)

for db in "${dbs[@]}"; do

echo "Dumping $db DB"
time mysqldump --order-by-primary --host=sourcehost --user=sourceuser --password=sourcepass "$db" > "/tmp/$db.sql"

echo "Adding optimizations to $db"
# Prepend session settings that speed up the bulk load
awk 'NR==1{$0="SET autocommit=0; SET unique_checks=0; SET foreign_key_checks=0;\n"$0}1' "/tmp/$db.sql" > "/tmp/${db}X.sql"
mv "/tmp/${db}X.sql" "/tmp/$db.sql"
# Re-enable the checks and commit at the end of the dump
echo "SET unique_checks=1; SET foreign_key_checks=1; COMMIT;" >> "/tmp/$db.sql"

echo "Copy $db into RDS"
# The trailing & starts each import in the background so the loads run in parallel
time mysql --host=yourrds.rds.amazonaws.com --user=rdsuser --password=rdspassword "$db" < "/tmp/$db.sql" &

done
wait   # don't exit until all background imports finish
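If a consistent snapshot matters, mysqldump does accept --single-transaction (for InnoDB tables) alongside the flags above, so the dump line in the script could be amended — a sketch using the same placeholder credentials:

```shell
db=dbname1   # one of the databases from the script above
# --single-transaction takes a consistent InnoDB snapshot without locking tables.
dump_cmd="mysqldump --single-transaction --order-by-primary \
--host=sourcehost --user=sourceuser --password=sourcepass $db"
echo "$dump_cmd"                 # inspect the command
# $dump_cmd > "/tmp/$db.sql"     # uncomment to run against a live server
```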