You say you are using pt-table-checksum 2.0.1. I would recommend updating to 2.1, as there are many improvements in the tool.
Next, let me address your test. You say the slave was not updated after the first or second commands that you ran. The second command looks to be trying to connect directly to the slave. pt-table-checksum won't report any differences unless the server you're connecting to has slaves.
Also, the --replicate-check-only option will not do any checksumming. From the docs:
If specified, pt-table-checksum doesn’t checksum any tables. It checks replicas for differences found by previous checksumming, and then exits.
Your first command doesn't seem to be able to connect to the slave host, which is why it doesn't report any differences. Make sure the user/pass that is connecting to the master can also connect to the slave.
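To make the order of operations concrete, here is a rough sketch. The host names and the checksum_user account are placeholders I made up, and percona.checksums is simply the tool's default --replicate table. The idea is to confirm the account can log in to the slave, checksum through the master, and only afterwards use --replicate-check-only to report on the results.

```shell
# Placeholder values (assumptions): replace with your own hosts and account.
MASTER_HOST="master1.example.com"
SLAVE_HOST="slave1.example.com"
CHECKSUM_USER="checksum_user"

# Overridable binaries, so the commands can be previewed before running them.
MYSQL="${MYSQL:-mysql}"
PT_TABLE_CHECKSUM="${PT_TABLE_CHECKSUM:-pt-table-checksum}"

# 1. Confirm the checksum account can log in to the slave.
check_slave_login() {
    "$MYSQL" -h "$SLAVE_HOST" -u "$CHECKSUM_USER" -p -e "SELECT 1"
}

# 2. Checksum through the master; the results reach the slaves via replication.
checksum_from_master() {
    "$PT_TABLE_CHECKSUM" --replicate=percona.checksums \
        --ask-pass "h=$MASTER_HOST,u=$CHECKSUM_USER"
}

# 3. Later, report differences found by step 2 without checksumming again.
report_differences() {
    "$PT_TABLE_CHECKSUM" --replicate=percona.checksums --replicate-check-only \
        --ask-pass "h=$MASTER_HOST,u=$CHECKSUM_USER"
}
```

Call check_slave_login first. If that fails, the tool will not be able to reach the slave either, and fixing the grant is the first thing to do.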
Now, as for your complex setup, you are right to worry about breaking replication. With some slaves replicating only certain tables, you should heed the warning here:
If the replicas are configured with any filtering options, you should be careful not to checksum any databases or tables that exist on the master and not the replicas.
You can specify which databases you want to checksum with the --databases option, and give a specific list of tables with the --tables option. Alternatively, you can use the --ignore-databases and --ignore-tables options to provide a list of databases/tables to not checksum.
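As a sketch of how those options might be combined, something like the following could work. The host, account, and the database and table names passed in are placeholders of mine, not names from your setup.

```shell
# Placeholder values (assumptions): replace with your own host and account.
MASTER_HOST="master1.example.com"
CHECKSUM_USER="checksum_user"
PT_TABLE_CHECKSUM="${PT_TABLE_CHECKSUM:-pt-table-checksum}"

# Checksum only the listed databases (comma-separated).
checksum_only_databases() {
    "$PT_TABLE_CHECKSUM" --replicate=percona.checksums \
        --databases "$1" \
        --ask-pass "h=$MASTER_HOST,u=$CHECKSUM_USER"
}

# Checksum only the listed tables; names may be qualified as db.table.
checksum_only_tables() {
    "$PT_TABLE_CHECKSUM" --replicate=percona.checksums \
        --tables "$1" \
        --ask-pass "h=$MASTER_HOST,u=$CHECKSUM_USER"
}

# Checksum everything except the listed databases ($1) and tables ($2).
checksum_except() {
    "$PT_TABLE_CHECKSUM" --replicate=percona.checksums \
        --ignore-databases "$1" --ignore-tables "$2" \
        --ask-pass "h=$MASTER_HOST,u=$CHECKSUM_USER"
}
```

For example, checksum_only_databases sales,billing would cover just those two databases. Keep in mind that, per the docs, the tool checks for replication filters on the replicas by default and will not checksum if it finds any. The --no-check-replication-filters option turns that check off, so only use it once you are sure the lists above match what each slave actually replicates.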
This will probably mean you will want separate pt-table-checksum commands based on which slaves you are trying to checksum. You will probably have to use the 'dsn' --recursion-method to accomplish this (I've never done it, personally).
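Going by the documentation rather than experience, a 'dsn' setup might look roughly like this. The table layout follows the one shown in the pt-table-checksum docs. The percona.dsns name, the hosts, and the account are assumptions of mine, so treat this as an untested sketch.

```shell
# Placeholder values (assumptions): replace with your own hosts and account.
MASTER_HOST="master1.example.com"
CHECKSUM_USER="checksum_user"
MYSQL="${MYSQL:-mysql}"
PT_TABLE_CHECKSUM="${PT_TABLE_CHECKSUM:-pt-table-checksum}"

# SQL for a DSN table that lists only the slaves one run should check.
dsn_table_sql() {
    cat <<'SQL'
CREATE DATABASE IF NOT EXISTS percona;
CREATE TABLE IF NOT EXISTS percona.dsns (
  id        INT NOT NULL AUTO_INCREMENT,
  parent_id INT DEFAULT NULL,
  dsn       VARCHAR(255) NOT NULL,
  PRIMARY KEY (id)
);
INSERT INTO percona.dsns (dsn) VALUES
  ('h=slave1.example.com'),
  ('h=slave2.example.com');
SQL
}

# Run once: create and fill the DSN table on the master.
create_dsn_table() {
    dsn_table_sql | "$MYSQL" -h "$MASTER_HOST" -u "$CHECKSUM_USER" -p
}

# Checksum the given databases, checking only the slaves listed in percona.dsns.
checksum_listed_slaves() {
    "$PT_TABLE_CHECKSUM" --replicate=percona.checksums \
        --recursion-method "dsn=h=$MASTER_HOST,D=percona,t=dsns" \
        --databases "$1" \
        --ask-pass "h=$MASTER_HOST,u=$CHECKSUM_USER"
}
```

With several groups of slaves, one option is a separate DSN table per group (changing the t= part), paired with the --databases or --tables list that group actually replicates.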
As for load, pt-table-checksum comes with some options to throttle itself, namely --max-load and --max-lag.
The tool keeps track of how quickly the server is able to execute the queries, and adjusts the chunks as it learns more about the server’s performance. It uses an exponentially decaying weighted average to keep the chunk size stable, yet remain responsive if the server’s performance changes during checksumming for any reason. This means that the tool will quickly throttle itself if your server becomes heavily loaded during a traffic spike or a background task, for example.
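A throttled run might look something like the sketch below. The thresholds are illustrative values I picked, not recommendations, and the host and account are placeholders.

```shell
# Placeholder values (assumptions): replace with your own host and account.
MASTER_HOST="master1.example.com"
CHECKSUM_USER="checksum_user"
PT_TABLE_CHECKSUM="${PT_TABLE_CHECKSUM:-pt-table-checksum}"

# Illustrative thresholds; tune them for your own servers.
MAX_THREADS_RUNNING=15   # pause while Threads_running is above this
MAX_LAG="5s"             # pause while any checked slave lags more than this

checksum_throttled() {
    "$PT_TABLE_CHECKSUM" --replicate=percona.checksums \
        --max-load "Threads_running=$MAX_THREADS_RUNNING" \
        --max-lag "$MAX_LAG" \
        --ask-pass "h=$MASTER_HOST,u=$CHECKSUM_USER"
}
```

Both options make the tool pause between chunks rather than abort, so a busy period should only slow the checksum down.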
Well, you are absolutely right to be concerned about log-slave-updates causing an issue with your Master-Master setup, though it won't necessarily be an infinite loop of changes. I suspect that if you are writing to both masters, you will constantly be having to set sql_slave_skip_counter and restart the slave thread. In a word, not ideal.
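For reference, the manual fix I mean looks roughly like this each time replication stops on a conflicting event (the server variable is sql_slave_skip_counter). The repl_admin account is a placeholder, and skipping an event means that change is simply never applied on that server, which is exactly why you don't want to be doing it routinely.

```shell
# Placeholder account (assumption): needs the privileges to stop and start the slave.
ADMIN_USER="repl_admin"
MYSQL="${MYSQL:-mysql}"

# Skip one replicated event on the given host and resume replication.
skip_one_event() {
    "$MYSQL" -h "$1" -u "$ADMIN_USER" -p \
        -e "STOP SLAVE; SET GLOBAL sql_slave_skip_counter = 1; START SLAVE;"
}
```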
I would look over this post, specifically the short section on data integrity:
The safest solution is to simply never write data to both masters.
So, if both masters are up, but your clients only ever write to one master at a time (Master 1 by default), then you shouldn't have an issue.
*shouldn't doesn't mean you won't, though.
Best Answer
Problem
When a Master is also a Slave, you need to have log-slave-updates.
From what you described, each Master does not have log-slave-updates configured.
Solution
STEP 01) STOP SLAVE; on Slave1
STEP 02) STOP SLAVE; on Slave2
STEP 03) STOP SLAVE; on Master1
STEP 04) STOP SLAVE; on Master2
STEP 05) On Master1, add log-slave-updates under the [mysqld] section of /etc/my.cnf
STEP 06) On Master2, add log-slave-updates under the [mysqld] section of /etc/my.cnf
STEP 07) On Master1, run service mysql restart --skip-slave-start
STEP 08) On Master2, run service mysql restart --skip-slave-start
STEP 09) START SLAVE; on Slave1
STEP 10) START SLAVE; on Slave2
STEP 11) START SLAVE; on Master1
STEP 12) START SLAVE; on Master2
That's it. Everything should replicate properly from here.
Give it a Try !!!
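Once the steps are done, a quick check along these lines can confirm the setting took effect and that replication is running. The repl_admin account and the host names you pass in are placeholders.

```shell
# Placeholder account (assumption): replace with your own.
ADMIN_USER="repl_admin"
MYSQL="${MYSQL:-mysql}"

# Should report log_slave_updates = ON on each Master after the restart.
show_log_slave_updates() {
    "$MYSQL" -h "$1" -u "$ADMIN_USER" -p \
        -e "SHOW VARIABLES LIKE 'log_slave_updates'"
}

# Look for Slave_IO_Running: Yes and Slave_SQL_Running: Yes in the output.
show_slave_status() {
    "$MYSQL" -h "$1" -u "$ADMIN_USER" -p -e "SHOW SLAVE STATUS\G"
}
```

Run show_log_slave_updates against Master1 and Master2, and show_slave_status against all four servers.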
I have discussed this before:
Dec 05, 2012: MySQL Slave Relay Logging but not logging Binary Log
May 29, 2012: mysqld-multi with first DB as Slave and second DB as Master
May 07, 2012: Setting Circular Replication in mysql