I can see mysqldump being a bit of a pain in the neck because of transporting data out and then back in.
Let's say you are copying data from a production database to a staging database (in the example below, schema stack_dev to schema stack_test). Here is a query that generates a script to copy every table straight across, server-side:
select
concat
(
'drop table if exists ',db2,'.',tbl,'; ',
'create table ',db2,'.',tbl,' like ',db1,'.',tbl,'; ',
'insert into ',db2,'.',tbl,' select * from ',db1,'.',tbl,';'
)
from (select table_schema db1,table_name tbl
from information_schema.tables where table_schema='stack_dev') tbls,
(select 'stack_test' db2) dbname;
Just generate the script as a SQL text file and import the text file to the target server:
PROD_IP=10.1.2.20
STAG_IP=10.1.2.40
mysql -h${PROD_IP} -uusername -p --skip-column-names -A -e"select concat('drop table if exists ',db2,'.',tbl,'; ','create table ',db2,'.',tbl,' like ',db1,'.',tbl,'; ','insert into ',db2,'.',tbl,' select * from ',db1,'.',tbl,';') from (select table_schema db1,table_name tbl from information_schema.tables where table_schema='stack_dev') tbls,(select 'stack_test' db2) dbname;" > /root/CopyFromProdToStage.sql
mysql -h${STAG_IP} -uusername -p -A < /root/CopyFromProdToStage.sql
This should work just fine for MyISAM. It should also work for InnoDB if there are no constraints. If there are constraints, you may have to disable them for the import session:
PROD_IP=10.1.2.20
STAG_IP=10.1.2.40
echo "SET FOREIGN_KEY_CHECKS=0;" > /root/CopyFromProdToStage.sql
mysql -h${PROD_IP} -uusername -p --skip-column-names -A -e"select concat('drop table if exists ',db2,'.',tbl,'; ','create table ',db2,'.',tbl,' like ',db1,'.',tbl,'; ','insert into ',db2,'.',tbl,' select * from ',db1,'.',tbl,';') from (select table_schema db1,table_name tbl from information_schema.tables where table_schema='stack_dev') tbls,(select 'stack_test' db2) dbname;" >> /root/CopyFromProdToStage.sql
mysql -h${STAG_IP} -uusername -p -A < /root/CopyFromProdToStage.sql
UPDATE 2012-03-09 17:14 EST
I have a stored procedure that will accomplish this. Run the following only once:
DROP DATABASE IF EXISTS utility;
CREATE DATABASE utility;
DELIMITER $$
DROP PROCEDURE IF EXISTS `utility`.`CopyDB` $$
CREATE PROCEDURE `utility`.`CopyDB` (sourceDB VARCHAR(64),targetDB VARCHAR(64))
TheStoredProcedure:BEGIN
DECLARE found_count,ndx,ndx_last INT;
DECLARE sqlcmd VARCHAR(1024);
SELECT COUNT(1) INTO found_count
FROM information_schema.tables
WHERE table_schema = sourceDB;
IF found_count = 0 THEN
LEAVE TheStoredProcedure;
END IF;
DROP TABLE IF EXISTS DBTablesToCopy;
CREATE TABLE DBTablesToCopy
(
id INT NOT NULL AUTO_INCREMENT,
src VARCHAR(64),
tgt VARCHAR(64),
tbl VARCHAR(64),
PRIMARY KEY (id)
) ENGINE=MyISAM;
DROP TABLE IF EXISTS SQLScriptToCopyTables;
CREATE TABLE SQLScriptToCopyTables
(
id INT NOT NULL AUTO_INCREMENT,
sqltext VARCHAR(1024),
PRIMARY KEY (id)
) ENGINE=MyISAM;
INSERT INTO DBTablesToCopy (src,tgt,tbl)
SELECT sourceDB,targetDB,table_name
FROM information_schema.tables
WHERE table_schema = sourceDB;
INSERT INTO SQLScriptToCopyTables (sqltext) VALUES
(CONCAT('DROP DATABASE IF EXISTS ',targetDB));
INSERT INTO SQLScriptToCopyTables (sqltext) VALUES
(CONCAT('CREATE DATABASE ',targetDB));
SELECT MAX(id) INTO ndx_last FROM DBTablesToCopy;
SET ndx = 0;
WHILE ndx < ndx_last DO
SET ndx = ndx + 1;
INSERT INTO SQLScriptToCopyTables (sqltext)
SELECT CONCAT('CREATE TABLE ',tgt,'.',tbl,' LIKE ',src,'.',tbl)
FROM DBTablesToCopy WHERE id = ndx;
END WHILE;
SET ndx = 0;
WHILE ndx < ndx_last DO
SET ndx = ndx + 1;
INSERT INTO SQLScriptToCopyTables (sqltext)
SELECT CONCAT('INSERT INTO ',tgt,'.',tbl,' SELECT * FROM ',src,'.',tbl)
FROM DBTablesToCopy WHERE id = ndx;
END WHILE;
SELECT MAX(id) INTO ndx_last FROM SQLScriptToCopyTables;
SET ndx = 0;
WHILE ndx < ndx_last DO
SET ndx = ndx + 1;
SELECT sqltext INTO @stmt FROM SQLScriptToCopyTables WHERE id = ndx;
PREPARE s1 FROM @stmt;
EXECUTE s1;
DEALLOCATE PREPARE s1;
END WHILE;
END $$
DELIMITER ;
Then, any time you feel like copying stack_dev to stack_test, just do this:
CALL utility.CopyDB('stack_dev','stack_test');
and that's all.
Give it a Try !!!
I'm a big AWS fan in general... but RDS, not so much.
@RolandoMySQLDBA has pointed out some pretty good points against it.
The only advantage I see in RDS compared to MySQL on EC2 are the ability to do point and click snapshots, clones, and point-in-time recovery, but these are not nearly sufficient to make up for the loss of control and flexibility and they most certainly don't justify the price being higher. RDS is sexy in some ways, but you can't ultimately trust what you can't ultimately fix, because you can't get to all the moving parts.
I don't like not having the SUPER privilege. I don't like not being able to tail the error log. I don't like not being able to run "top" or "iostat" on my database server to see how the cores and drives are enjoying the load. I don't like not having access to the federated storage engine. I don't like the thought of paying for a hot standby (multi-AZ) backup master machine that I can't even leverage as a read replica. Sure, there are perfectly reasonable explanations why all of these constraints have to be in place for MySQL to be successfully packaged and sold as RDBMS-in-a-box. The only thing is, RDBMS-in-a-box "solves" a whole series of problems I don't have... and gets in my way, causing new problems.
But the absolute deal-breaker for me with RDS is the complete lack of access to the binary logs and replication. Binlogs, especially row-based, are a fantastic recovery tool for minor disasters, but they are of no help to you if you can't access them. Want to configure an on-premise server at your office as a read-replica of your production database in RDS? Something to take local backups from, do reporting, have on hand for disaster recovery should something unthinkable happen to your data that lives in RDS? That's an idea that is simultaneously obvious and brilliant.
Oops, sorry, direct access to replication is not available. Sure, you can pay for read replicas... but only as other RDS instances. Not a value proposition in my book.
Update: One Significant Change in RDS for MySQL 5.6
I still prefer running my own server (even in EC2) as opposed to running RDS for a number of reasons, including the lack of support for User-Defined Functions, the inability to use the Federated Storage Engine, and the inability to have the one extra connection available for emergency access... however...
Amazon has made a significant change in MySQL 5.6 for RDS that eliminates one of my major objections -- perhaps my largest objection: the binary logs are now accessible and you can run a non-RDS instance as a slave, or connect other utilities to the server that read the binlog stream.
http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/MySQL.Procedural.Exporting.NonRDSRepl.html
Officially, the documentation indicates that they are exposing this so that you can set up a slave for the purpose of doing a live migration -- you synchronize the external future master server from the existing RDS instance using mysqldump, then connect it to RDS as a slave to get a live feed of updates via the replication stream until your application is migrated to the new master -- but unofficially, you can do this on an ongoing basis as long as you don't expect them to support you... which, to me, seems reasonable.
This was confirmed in a recent Webinar, in a conversation that begins at around 56:45:
"You can keep it in a replicated state indefinitely...
"...as long as you take the responsibility to maintain the replication..."
"We are not preventing you from doing ongoing replication if that's what you want."
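As a sketch of what that ongoing setup looks like (every hostname, credential, and binlog coordinate below is a placeholder; `mysql.rds_set_configuration` is the RDS-provided procedure for binlog retention), the external slave is pointed at the RDS endpoint with an ordinary CHANGE MASTER TO:

```sql
-- On the RDS instance: keep binary logs around long enough for an
-- external slave to read them (24 hours here is just an example value).
CALL mysql.rds_set_configuration('binlog retention hours', 24);

-- On the external (non-RDS) slave: point it at the RDS endpoint.
-- Host, user, password, and coordinates are placeholders.
CHANGE MASTER TO
    MASTER_HOST = 'myinstance.xxxxxxxx.us-east-1.rds.amazonaws.com',
    MASTER_USER = 'repl_user',
    MASTER_PASSWORD = 'repl_password',
    MASTER_LOG_FILE = 'mysql-bin-changelog.000001',
    MASTER_LOG_POS = 4;
START SLAVE;
```

Without the retention call, RDS purges binlogs aggressively and the external slave can fall permanently behind.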
This new capability was enough for me to drop my blanket objection to using RDS in our public-facing website-backing MySQL instances, where we don't use FEDERATED or some of the other things as much or at all.
So I'm still not in favor of it, but I'm no longer against it, since having a live stream of the binary logs puts me ultimately back in control of the data in real time and the responsibility for ensuring that no transactions are lost in a catastrophic outage is back with me, because I, as the DBA, am back in control -- which is exactly how I want it. Having a third party vendor to point fingers at, or file a lawsuit against, or whatever, doesn't get your lost data back if it disappears down a black hole inside a black box.
Management seems to like the "idea" of RDS and doesn't object to the cost difference, so we are now launching all new web sites with RDS behind them.
The point and click point-in-time recovery, I admit, is a nice feature in RDS... it doesn't alter or disrupt your existing machine -- instead, it fires up an entirely new instance, using the backup that was closest-in-time to the selected point in time, and then applies the necessary binlogs to bring that new machine forward to the point in time that you've specified.
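For reference, the same point-in-time restore can be scripted rather than clicked (a sketch; the instance identifiers and timestamp are placeholders, and this assumes the AWS CLI's `rds restore-db-instance-to-point-in-time` command):

```shell
# Creates a NEW instance restored to the given point in time;
# the source instance is left untouched.
# Identifiers and timestamp below are placeholders.
aws rds restore-db-instance-to-point-in-time \
    --source-db-instance-identifier mydb-prod \
    --target-db-instance-identifier mydb-prod-restored \
    --restore-time 2014-03-09T17:14:00Z
```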
Related to this, but in the other direction, it's also possible, now, to use a similar strategy to migrate a live MySQL database into RDS... you can connect an RDS master (presumably, typically, this would be a newly-deployed instance) as a slave of an existing system so that the RDS instance has the live version of the data at the time you migrate into it. Unlike access to the RDS binlogs for outward replication, which only works in 5.6, the inward replication is supported in RDS beginning with 5.5.33 and 5.6.13.
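A rough sketch of that inward direction (everything below is a placeholder; `mysql.rds_set_external_master` and `mysql.rds_start_replication` are the RDS-provided procedures for this, since CHANGE MASTER TO is blocked because it requires SUPER):

```sql
-- On the RDS instance (which acts as the slave during the migration in):
CALL mysql.rds_set_external_master(
    'onpremise-master.example.com',  -- placeholder master host
    3306,
    'repl_user',                     -- placeholder credentials
    'repl_password',
    'mysql-bin.000042',              -- placeholder binlog coordinates
    107,
    0);                              -- 0 = no SSL
CALL mysql.rds_start_replication;
```

Once the application has been cut over to the RDS endpoint, replication from the old master can be stopped and the external-master configuration cleared.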
Best Answer
You may find this surprising, but there are lots of little transactions that occur inside ibdata1. I wrote about this back on Jan 14th: AWS RDS showing large growth in Write Ops/sec, despite low # DB connections?
Even though MySQL 5.6 has innodb_file_per_table on by default, ibdata1 can still grow when there are lots of transactions. The InnoDB architecture diagram (not reproduced here) shows that ibdata1 holds 1023 rollback segments for supporting MVCC and Transaction Isolation. In a heavy-write DB environment, ibdata1 can grow. I have discussed this before in my old post How can Innodb ibdata1 file grows by 5X even with innodb_file_per_table set?
In that post, I mentioned the following:
According to mysqlperformanceblog.com's Reasons for run-away main Innodb Tablespace:
Thus, I would expect ibdata1 to outgrow your current disk allocation if you chose a number too small.
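One way to watch that pressure building (a rough check of my own, not from the linked post): the "History list length" line in the InnoDB status output counts undo records not yet purged, and a persistently large value means the rollback segments in ibdata1 are being held open.

```sql
-- Look for the "History list length" line under the TRANSACTIONS
-- section; a persistently large value indicates unpurged undo data
-- keeping ibdata1's rollback segments from being reused.
SHOW ENGINE INNODB STATUS\G
```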
BTW, where do ibdata1 and the transaction logs actually reside? When you run
show variables like 'innodb%home_dir';
you will find those two folders. You cannot just reach into RDS and expand them that easily.
Just to make sure you have nothing of yours in the database, run this query:
If you get nothing back, then you can reasonably assert what I just said. Otherwise, go delete your data.
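The query itself did not survive above; a plausible stand-in (an assumption on my part, not the original query) that lists any non-system tables on the instance:

```sql
-- Lists tables outside the system schemas; an empty result means
-- nothing of yours is stored on the instance.
SELECT table_schema, table_name
FROM information_schema.tables
WHERE table_schema NOT IN
    ('mysql','information_schema','performance_schema');
```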
You may want to check for binary logs as well. Just run one of the following:
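The commands themselves were not included above; either of these (they are synonyms in MySQL) should list the binary logs:

```sql
SHOW BINARY LOGS;
SHOW MASTER LOGS;   -- synonym for SHOW BINARY LOGS
```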
You may or may not have rights to purge them. Amazon may not want you to touch them since they control binary logging for the sake of spinning up read-only Slaves in the cloud.
SUGGESTIONS