Mysql – Populate a table on a selective replication slave

MySQLmysql-5.6replication

We have a setup with a data mining slave with a lot of replicate-ignore-table= for tables that aren't needed. Now the developers have requested to populate a table that is currently ignored. How do I go about populating it so that the replication can be started again at the correct point.

I want to avoid locking the table on the master instance as that would lead to application downtime.

We are not yet using GTID.

I have access to MySQL Enterprise Backup if that helps.

Best Answer

Do these two steps while nothing is being added to the table:

Manually copy the table to the Slave(s) (even if empty -- need table definition).
Remove the replacate-ignore setting.

Related Solutions

Mysql – Will an ALTER TABLE ADD COLUMN on a slave break replication

I think that, if your purpose is to avoid locking, you should use pt-online-schema-change (a tool from Percona Toolkit) or Gh-ost (from GitHub).

Here is how pt-osc works:

Creates a ghost table identical to the original table, but empty
Runs ALTER TABLE on the ghost table
Creates triggers on the original table, which insert new rows in the new table, and updates and deletes them if they are already there.
It copies the rows in chunks, monitoring the server performance (it can slow down or pause its work if necessary)
Switch the table names

gh-ost is very similar, the main difference is that it creates no triggers, and uses the binary log instead to detect the changes to the original table.

To answer your question more directly - yes, it is possible to add a column on the slave. And most probably it will simply work, as long as it has a default value. But it is very likely that in the future you will regret this choice, because you'll need to do something that will break replication, like adding a column in the master. I recommend to sure to check the documentation and fully understand the dangers, before doing such a thing.

MySQL Replication – How to Copy a Table

replicate-ignore-table and the related configuration options are problematic at best. Not because they don't do what they're supposed to do -- because they do -- but rather because they have side effects that require a deeper understanding of MySQL replication.

The problem you have stems from the way the events are replicating. MySQL can replicate table data changes using two different mechanisms, statement-based and row-based, based on the setting of the global configuration variable binlog_format.

STATEMENT -- the actual queries are replicated from server to server, with each server executing the query and modifying its data set.

ROW -- the actual queries are not sent, but instead, tightly-packed row images of the changes actually caused by each query are sent.

MIXED -- the server chooses on a query-by-query basis how to replicate the events.

Importantly, when a query is replicated as a statement event, triggers on the affected tables fire on the slave in order to make the same changes to other tables that would have been made on the master. When a query replicates as a row event, triggers do not fire on the slave. Instead, the master server sends over additional row images from the affected tables so that the additional impacted tables also get updated to match the way they were on the master. The same thing is true when procedures are called -- either the individual statements in the procedure are replicated and have to be valid and execute on the slave (statement-based replication), or the rows changed by the procedure will be replicated (row-based replication).

The scenario you described cannot work unless your master server's binlog_format is set to ROW, because the queries running on the slave have to have access to the "user" table data.

Once that is set, if it were me, I would remove the replicate-ignore-table from the configuration, and instead provision the user table on the slave with the BLACKHOLE storage engine, which discards data sent to it, returning "success" on inserts and 0 rows found/affected on updates and deletes.

However, keep reading, because you are doing this the hard way.

The solution is much simpler: alter the table to move the confidential data to the far right side of the table, and then drop those columns from the slave's copy of the table. No, really:

Source and target tables for replication do not have to be identical. A table on the master can have more or fewer columns than the slave's copy of the table.

You can replicate a table from the master to the slave such that the master and slave copies of the table have differing numbers of columns, subject to the following conditions:

Columns common to both versions of the table must be defined in the same order on the master and the slave. (This is true even if both tables have the same number of columns.)

Columns common to both versions of the table must be defined before any additional columns.

^{— http://dev.mysql.com/doc/refman/5.6/en/replication-features-differing-tables.html}

I have linked to the documentation from MySQL 5.6 since that's the current GA version, but this feature is available as far back as 5.1.21. When doing things this way, you would still be advised to switch your master to binlog_format = ROW.

Best Answer

Related Solutions

Mysql – Will an ALTER TABLE ADD COLUMN on a slave break replication

MySQL Replication – How to Copy a Table

Related Question