The Primary thing you need to do was left in a comment:
Follow the suggestion in Duplicate entry in MySQL slave:
STOP SLAVE;
SET GLOBAL sql_slave_skip_counter=1;
START SLAVE;
However, here is how you can start off at the better position with the CHANGE MASTER TO
command.
Let's take a look at a sample SHOW SLAVE STATUS\G
Slave_IO_State: Waiting for master to send event
Master_Host: 10.4.16.245
Master_User: replicant
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.001527
Read_Master_Log_Pos: 554619670
Relay_Log_File: relay-bin.004561
Relay_Log_Pos: 554619815
Relay_Master_Log_File: mysql-bin.001527
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB: phpmyadmin
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 554619670
Relay_Log_Space: 554620007
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Please take note of Relay_Master_Log_File
and Exec_Master_Log_Pos
. These represent the log file and the position of the last SQL statement that was completed on the Master that is next in line to be executed on the Slave.
So, the CHANGE MASTER TO
in this case would be:
CHANGE MASTER TO master_log_file='mysql-bin.001527',master_log_pos=554619670;
For your particular case, here is what you need to do on the Slave Server:
Step 01) Start mysql up with replication disabled at startup
$ service mysql restart --skip-slave-start
Step 02) Login to mysql and show the slave status:
mysql> SHOW SLAVE STATUS\G
Step 03) Get Relay_Master_Log_File
and Exec_Master_Log_Pos
from SHOW SLAVE STATUS\G
Step 04) Run CHANGE MASTER TO
command using Relay_Master_Log_File
and Exec_Master_Log_Pos
This will erase any relay logs collected and start collecting with a fresh, empty relay log.
Step 05) mysql> START SLAVE;
Step 06) mysql> SHOW SLAVE STATUS\G
repeatedly to watch Seconds_Behind_Master
go to 0
Step 07) If replication breaks due to Error 1062 (Duplicate Key), now you can implement @DTest's suggestion. Then, goto Step 06. Repeat this until Seconds_Behind_Master
go to 0.
We stopped at log 'mysql1-bin.000362' position 3384732.
That's what you use:
CHANGE MASTER TO master_log_file='mysql1-bin.000362'. master_log_pos=3384732;
If you ran RESET SLAVE
on the Slave, do not despair. Just run the full syntax version of the CHANGE MASTER TO
command:
CHANGE MASTER TO
MASTER_HOST='IP Address of Master',
MASTER_PORT=3306,
MASTER_USER='Replication Username',
MASTER_PASSWORD='Replication Password',
MASTER_LOG_FILE='mysql1-bin.000362',
MASTER_LOG_POS=3384732;
Best Answer
Regarding de-synchronization of a single or master server: Mosty, the cause of these problems come from the fact that the master is not in full ACID mode or it is using MyISAM tables. In order to avoid problems with a server, only use InnoDB as the engine of your tables and set
innodb_flush_log_at_trx
commit to 1. This is not efficient -you will lose a lot of performance in a busy server except if you can get advantage of the group commit, but it is the only way to guarantee data consistency with disk (no transaction loss).Many people, that do not have appropriate hardware (hardware cache), are forced to relax the value of this variable to 0 or 2, but in that case, you (or your hardware) is responsible to assure the availability of the disk/service in case something goes wrong (crash).
The second reason where disk can be out of sync with a master or single server is not using
sync_binlog=1
. By default, this is set to 0, and that means that you can write data internally, but that doesn't assure that it is written to the binary log. This shouldn't be an issue if you assume the master is always going to be back online, but you must understand that may be an issue. Activating it almost doubles your IOPS requirements, so be careful. It is not usually necessary and I do not recommend you to blindly activate it.Third, the most common cause of desynchronization is the storage of the replication position in the non-transactional
master.info
file in the slave. This is the default configuration. I highly recommend changing the configuration of the variables master_info_repository and relay_log_info_repository to useTABLE
(InnoDB transactional tables) instead ofFILE
. This will get rid of the desynchronization problems when the slave crashes.Additionally, in order to make sure that the master and slave are always in sync, and that no transactions are skipped, I recommend you to use the GTID replication, available in MySQL 5.6/MariaDB 10. While it does not prevent the desync of the master and slave, it makes easier to identify it and resync master and slave. This is a tutorial on how to enable it on MariaDB 10. Please note that GTID implementations between MySQL and MariaDB are not compatible. Once GTID is running, things like failover are greatly simplified.
There could be other reasons for desynchronization, like undeterministic updates, or replication filters. Avoid the later, and you can mitigate the former by using row-based replication
binlog_format = ROW
.On a different note, it is important to remark that MySQL default replication protocol is asynchronous, so one can never guarantee the same state on the master and the slave. If slave drift and other associated issues are a no-go, you can try alternatives like the semi-sync plugin for replication or Galera cluster, which both make tighter links between nodes.
Alternatively, I recommend you to both constantly monitor and repair replication problems with tools like pt-heartbeat, pt-table-checksum and pt-table-sync, which will avoid the need to manually provision the slave for every incident you encounter.
In order to have a more accurate answer, I recommend you to hire a consultant so that you can apply the best strategy in your case (some of the above may have an important impact on your performance while not being your specific problem).