MariaDB – Manually modify master-bin.index

mariadb, replication

I have an issue with replication on MariaDB Server 5.5.56. The slave got stuck, and by the time I noticed, we had already lost some of the binary logs on the master. This is the scenario:

On the master:

MariaDB [(none)]> show binary logs;
+-------------------+------------+
| Log_name          | File_size  |
+-------------------+------------+
| master-bin.000320 | 1073742333 |
| master-bin.000321 | 1074247558 |
| master-bin.000322 |  753717941 |
| master-bin.000323 |  883803465 |
+-------------------+------------+
4 rows in set (0.01 sec)

cat master-bin.index
./master-bin.000320
./master-bin.000321
./master-bin.000322
./master-bin.000323

On the slave (excerpt):

MariaDB [(none)]> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
Slave_IO_State:
Master_Log_File: master-bin.000312
Read_Master_Log_Pos: 405852801
Relay_Log_File: mariadb-relay-bin.000942
Relay_Log_Pos: 405852988
Relay_Master_Log_File: master-bin.000312
Slave_IO_Running: No
Slave_SQL_Running: Yes
Exec_Master_Log_Pos: 405852703
Relay_Log_Space: 405853426
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 1236
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Could not find first log file name in binary log index file'
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
1 row in set (0.00 sec)
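
Error 1236 here means that the file the slave asks for (Master_Log_File, master-bin.000312) is not listed in the master's index, which starts at master-bin.000320. A minimal shell sketch of that check follows. The helper name `binlog_in_index` is hypothetical, and the index path is passed as an argument because the data directory is not given above.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if a binlog name is listed in an index file.
# $1 = path to the index file (e.g. master-bin.index)
# $2 = binlog file name to look for (e.g. master-bin.000312)
binlog_in_index() {
    index_file=$1
    wanted=$2
    # Index entries look like "./master-bin.000320": strip the directory part,
    # then compare whole lines as fixed strings (no regex surprises from the dot).
    sed 's|.*/||' "$index_file" | grep -qxF "$wanted"
}
```

Run against the index shown above, `binlog_in_index master-bin.index master-bin.000312` would fail while the same call with `master-bin.000320` would succeed, which matches what the master reports to the slave.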

My idea for recovering the slave is:

1. On the master: stop the mariadb service
2. On the master: copy binary logs 312 to 319 back from backup
3. On the master: manually add binary logs 312 to 319 to master-bin.index
4. On the master: start the mariadb service
5. On the slave: start the slave

Would that work? Is there any other way to solve my problem? I know that manually modifying master-bin.index is not recommended, but I cannot think of any other way to avoid recreating the slave.
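
For illustration only, the index edit in step 3 could be sketched as below. This only shows what the edited index would contain. It writes to a separate candidate file and never touches the live index, and whether the server accepts the result is exactly the open question. Both helper names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print index entries for binlog numbers $1..$2 (inclusive),
# in the same "./master-bin.NNNNNN" form the existing index uses.
# Pass the numbers without leading zeros (312, not 000312).
binlog_index_entries() {
    n=$1
    while [ "$n" -le "$2" ]; do
        printf './master-bin.%06d\n' "$n"
        n=$((n + 1))
    done
}

# Hypothetical helper: write a candidate index to $4, restored entries first,
# followed by the current index content. The current index ($3) is only read.
build_candidate_index() {
    { binlog_index_entries "$1" "$2"; cat "$3"; } > "$4"
}
```

For example, `build_candidate_index 312 319 master-bin.index master-bin.index.candidate` would produce a file that can be reviewed (and compared with the restored files on disk) before anything is changed on the stopped master.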

Thanks in advance.

Best Answer

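The best advice is to rebuild the slave. Missing 8 binary logs (master-bin.000312 through master-bin.000319) means the slave may be missing a lot of transactions. Another option is to run CHANGE MASTER TO on the slave and point it at the next available binary log, which is master-bin.000320 according to your SHOW BINARY LOGS output. You may need to skip slave errors along the way until the slave catches up. Because the skipped transactions leave the slave's data out of sync with the master, you then have to use pt-table-checksum and pt-table-sync to resync the data between the two nodes.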
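
A minimal sketch of the CHANGE MASTER TO option, assuming the slave's connection settings (host, user, password) are already in place and only the log coordinates change. The helper name `repoint_slave_sql` is hypothetical. It only prints the statements so they can be reviewed before being run on the slave. Position 4 is the offset of the first event in a binary log file.

```shell
#!/bin/sh
# Hypothetical helper: print the statements that repoint a slave at the start
# of the given master binlog. Nothing is executed here.
# $1 = binlog file name on the master (e.g. master-bin.000320)
repoint_slave_sql() {
    cat <<EOF
STOP SLAVE;
CHANGE MASTER TO MASTER_LOG_FILE='$1', MASTER_LOG_POS=4;
START SLAVE;
EOF
}
```

On the slave this could be used as `repoint_slave_sql master-bin.000320 | mysql`, assuming the client credentials are configured. If the SQL thread then stops on an error, `SET GLOBAL sql_slave_skip_counter = 1; START SLAVE;` skips one event group. Each skip is a known divergence from the master, which is why the checksum and sync step afterwards is not optional.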