MySQL Replication Lag – Resolving 23-Hour Lag Issues

MySQLreplication

I came into a new environment where the production DB was running on MySQL 4. Replication is usually about an hour behind but in the last few days it has been well over 24 hours behind. I'm not completely sure why. I'm not a DBA, but I'm all the company has at the moment.

What should I do based on the info below? Can I easily upgrade to MySQL 5? Any help, suggestions, etc., will be greatly appreciated. Let me know if you need more info.

Here are the stats

Master

`mysql> show slave status\G;
*************************** 1. row ***************************
             Slave_IO_State: 
                Master_Host: xxxxx.1.xx
                Master_User: repl
                Master_Port: 3306
              Connect_Retry: 60
            Master_Log_File: 
        Read_Master_Log_Pos: 4
             Relay_Log_File: XXXXXXX.000002
              Relay_Log_Pos: 4
      Relay_Master_Log_File: 
           Slave_IO_Running: No
          Slave_SQL_Running: No
            Replicate_Do_DB: 
        Replicate_Ignore_DB: 
         Replicate_Do_Table: 
     Replicate_Ignore_Table: 
    Replicate_Wild_Do_Table: 
Replicate_Wild_Ignore_Table: 
                 Last_Errno: 0
                 Last_Error: 
               Skip_Counter: 0
        Exec_Master_Log_Pos: 0
            Relay_Log_Space: 12
            Until_Condition: None
             Until_Log_File: 
              Until_Log_Pos: 0
         Master_SSL_Allowed: No
         Master_SSL_CA_File: 
         Master_SSL_CA_Path: 
            Master_SSL_Cert: 
          Master_SSL_Cipher: 
             Master_SSL_Key: 
      Seconds_Behind_Master: NULL
1 row in set (0.00 sec)`

Slave 1

`mysql> show slave status\G;
*************************** 1. row ***************************
             Slave_IO_State: Waiting for master to send event
                Master_Host: xxxxx.3.xxx
                Master_User: repl
                Master_Port: 3306
              Connect_Retry: 60
            Master_Log_File: .001746
        Read_Master_Log_Pos: 123599834
             Relay_Log_File: XXXXXXX.000014
              Relay_Log_Pos: 290086986
      Relay_Master_Log_File: .001733
           Slave_IO_Running: Yes
          Slave_SQL_Running: Yes
            Replicate_Do_DB: 
        Replicate_Ignore_DB: 
         Replicate_Do_Table: 
     Replicate_Ignore_Table: 
    Replicate_Wild_Do_Table: 
Replicate_Wild_Ignore_Table: 
                 Last_Errno: 0
                 Last_Error: 
               Skip_Counter: 0
        Exec_Master_Log_Pos: 740137529
            Relay_Log_Space: 13632228696
            Until_Condition: None
             Until_Log_File: 
              Until_Log_Pos: 0
         Master_SSL_Allowed: No
         Master_SSL_CA_File: 
         Master_SSL_CA_Path: 
            Master_SSL_Cert: 
          Master_SSL_Cipher: 
             Master_SSL_Key: 
      Seconds_Behind_Master: 82286
1 row in set (0.00 sec)`

Master Process List

`mysql> show processlist\G;
*************************** 1. row ***************************
     Id: 4409572
   User: repl
   Host: XXXXXXX:58629
     db: NULL
Command: Binlog Dump
   Time: 81339
  State: Has sent all binlog to slave; waiting for binlog to be updated
   Info: NULL
*************************** 2. row ***************************
     Id: 4784043
   User: repl
   Host: xxxxx:37305
     db: NULL
Command: Binlog Dump
   Time: 76254
  State: Has sent all binlog to slave; waiting for binlog to be updated
   Info: NULL
*************************** 3. row ***************************
     Id: xxxxx1573
   User: xxxxx
   Host: dhcp-010-003-010-111.xxxxx:50560
     db: NULL
Command: Sleep
   Time: 58
  State: 
   Info: NULL
*************************** 4. row ***************************
     Id: 11208590
   User: DELAYED
   Host: 
     db: member
Command: Delayed insert
   Time: 28
  State: Waiting for INSERT
   Info: 
*************************** 5. row ***************************
     Id: 11240346
   User: root
   Host: localhost
     db: NULL
Command: Query
   Time: 0
  State: NULL
   Info: show processlist
*************************** 6. row ***************************
     Id: 11264206
   User: site
   Host: xxxxxxxxxx:51643
     db: member
Command: Sleep
   Time: 4
  State: 
   Info: NULL
*************************** 7. row ***************************
     Id: 11264297
   User: site
   Host: xxxxxxxxxx:58104
     db: member
Command: Sleep
   Time: 4
  State: 
   Info: NULL
*************************** 8. row ***************************
     Id: 11264455
   User: xxxxx
   Host: xxxxxweb12.dig.lo:51377
     db: drupal3
Command: Sleep
   Time: 1
  State: 
   Info: NULL
*************************** 9. row ***************************
     Id: 11264490
   User: xxxxx
   Host: xxxxxxxxx:43853
     db: NULL
Command: Sleep
   Time: 2
  State: 
   Info: NULL
*************************** 10. row ***************************
     Id: 11264538
   User: site
   Host: xxxxx:43254
     db: tarot
Command: Sleep
   Time: 1
  State: 
   Info: NULL
*************************** 11. row ***************************
     Id: 11264555
   User: site
   Host: xxxxxxxxxx:43261
     db: trialpay
Command: Sleep
   Time: 1
  State: 
   Info: NULL
*************************** 12. row ***************************
     Id: 11264567
   User: site
   Host: xxxxxxxxxx:42371
     db: xxxxxxxxx
Command: Sleep
   Time: 0
  State: 
   Info: NULL
*************************** 13. row ***************************
     Id: 11264584
   User: site
   Host: xxxxxxxxxx:49003
     db: partner
Command: Sleep
   Time: 1
  State: 
   Info: NULL
*************************** 14. row ***************************
     Id: 11264594
   User: site
   Host: xxxxx:34120
     db: partner
Command: Sleep
   Time: 1
  State: 
   Info: NULL
*************************** 15. row ***************************
     Id: 11264595
   User: site
   Host: xxxxx:38290
     db: NULL
Command: Sleep
   Time: 1
  State: 
   Info: NULL
*************************** 16. row ***************************
     Id: 11264627
   User: site
   Host: xxxxx:38297
     db: NULL
Command: Sleep
   Time: 1
  State: 
   Info: NULL
*************************** 17. row ***************************
     Id: 11264628
   User: site
   Host: xxxxx:47308
     db: NULL
Command: Sleep
   Time: 1
  State: 
   Info: NULL
*************************** 18. row ***************************
     Id: 11264630
   User: site
   Host: xxxxx:38298
     db: NULL
Command: Sleep
   Time: 1
  State: 
   Info: NULL
*************************** 19. row ***************************
     Id: 11264637
   User: site
   Host: xxxxx:38299
     db: NULL
Command: Sleep
   Time: 1
  State: 
   Info: NULL
*************************** 20. row ***************************
     Id: 11264638
   User: site
   Host: xxxxx:38300
     db: NULL
Command: Sleep
   Time: 1
  State: 
   Info: NULL
*************************** 21. row ***************************
     Id: 11264641
   User: site
   Host: xxxxx:47309
     db: NULL
Command: Sleep
   Time: 0
  State: 
   Info: NULL
*************************** 22. row ***************************
     Id: 11264643
   User: site
   Host: xxxxx:47310
     db: NULL
Command: Sleep
   Time: 0
  State: 
   Info: NULL
*************************** 23. row ***************************
     Id: 11264645
   User: site
   Host: xxxxx:38302
     db: NULL
Command: Query
   Time: 0
  State: statistics
   Info: SELECT * FROM member.session_3 WHERE ID='xxxxxxxxxxxxxx' LIMIT 1
23 rows in set (0.00 sec)`

Slave Process List

`mysql> show processlist\G;
*************************** 1. row ***************************
     Id: 2772944380
   User: system user
   Host: 
     db: NULL
Command: Connect
   Time: 81410
  State: Waiting for master to send event
   Info: NULL
*************************** 2. row ***************************
     Id: 2772944381
   User: system user
   Host: 
     db: 
Command: Connect
   Time: 82185
  State: update
   Info: INSERT INTO member.session_1 (ID, session_data, timestamp) VALUES ('xxxxxxxxxxxxxx', 'pa
*************************** 3. row ***************************
     Id: 2775506496
   User: xxxxx
   Host: dhcp-010-003-010-111.xxxxx:50559
     db: NULL
Command: Sleep
   Time: 13
  State: 
   Info: NULL
*************************** 4. row ***************************
     Id: 2775668174
   User: xxxxx
   Host: dhcp-010-003-010-125.xxxxx:49403
     db: NULL
Command: Sleep
   Time: 52
  State: 
   Info: NULL
*************************** 5. row ***************************
     Id: 2775668202
   User: xxxxx
   Host: dhcp-010-003-010-125.xxxxx:49404
     db: NULL
Command: Sleep
   Time: 32
  State: 
   Info: NULL
*************************** 6. row ***************************
     Id: 2775711517
   User: root
   Host: localhost
     db: NULL
Command: Query
   Time: 0
  State: NULL
   Info: show processlist
*************************** 7. row ***************************
     Id: 2775720172
   User: site
   Host: xxxxxxxxxx:58384
     db: partner
Command: Sleep
   Time: 25
  State: 
   Info: NULL
*************************** 8. row ***************************
     Id: 2775720176
   User: site
   Host: xxxxxxxxxx:58385
     db: NULL
Command: Sleep
   Time: 25
  State: 
   Info: NULL
*************************** 9. row ***************************
     Id: 2775720944
   User: site
   Host: xxxxxxxxxx:49278
     db: member
Command: Sleep
   Time: 2
  State: 
   Info: NULL
*************************** 10. row ***************************
     Id: 2775720948
   User: site
   Host: xxxxxxxxxx:49282
     db: NULL
Command: Sleep
   Time: 2
  State: 
   Info: NULL
*************************** 11. row ***************************
     Id: 2775721005
   User: site
   Host: xxxxx:57930
     db: NULL
Command: Sleep
   Time: 1
  State: 
   Info: NULL
*************************** 12. row ***************************
     Id: 2775721010
   User: site
   Host: xxxxx:57933
     db: NULL
Command: Sleep
   Time: 1
  State: 
   Info: NULL
*************************** 13. row ***************************
     Id: 2775721011
   User: site
   Host: xxxxx:49800
     db: NULL
Command: Sleep
   Time: 1
  State: 
   Info: NULL
*************************** 14. row ***************************
     Id: 2775721015
   User: site
   Host: xxxxx:45660
     db: NULL
Command: Sleep
   Time: 1
  State: 
   Info: NULL
*************************** 15. row ***************************
     Id: 2775721018
   User: site
   Host: xxxxx:34960
     db: NULL
Command: Sleep
   Time: 1
  State: 
   Info: NULL
15 rows in set (0.00 sec)`

Slave innodb status

`mysql> SHOW ENGINE INNODB STATUS\G
Status: 
=====================================
140807 11:21:08 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 0 seconds
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 319355932, signal count 203235888
Mutex spin waits 11228826125, rounds 43916655475, OS waits 166874078
RW-shared spins 201134536, OS waits 98462607; RW-excl spins 44393878, OS waits 9197002
------------
TRANSACTIONS
------------
Trx id counter 4 3013774387
Purge done for trx's n:o < 4 3013773661 undo n:o < 0 0
History list length 284
Total number of lock structs in row lock hash table 0
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 0 0, not started, process no 20493, OS thread id 1187813696
MySQL thread id 2775754253, query id 21361709229 localhost root
SHOW ENGINE INNODB STATUS
---TRANSACTION 4 3013774330, not started, process no 20493, OS thread id 1309239616
MySQL thread id 2775755720, query id 21361709124 troweb01.dig.lo 10.3.3.110 site
---TRANSACTION 4 3013774386, ACTIVE 0 sec, process no 20493, OS thread id 1149479232 starting index read
mysql tables in use 1, locked 1
MySQL thread id 2772944381, query id 21361709235 updating
DELETE FROM member.session_0 WHERE ID='mskpjjebs84q468ss2j9rokjv0'
--------
FILE I/O
--------
I/O thread 0 state: waiting for i/o request (insert buffer thread)
I/O thread 1 state: waiting for i/o request (log thread)
I/O thread 2 state: waiting for i/o request (read thread)
I/O thread 3 state: waiting for i/o request (write thread)
Pending normal aio reads: 0, aio writes: 0,
 ibuf aio reads: 0, log i/o's: 0, sync i/o's: 0
Pending flushes (fsync) log: 0; buffer pool: 0
1100679534 OS file reads, 1320249436 OS file writes, 864366793 OS fsyncs
31000.00 reads/s, 16384 avg bytes/read, 45000.00 writes/s, 46000.00 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf for space 0: size 1, free list len 8178, seg size 8180, is not empty
Ibuf for space 0: size 1, free list len 8178, seg size 8180,
23231155 inserts, 23231152 merged recs, 8712216 merges
Hash table size 17700857, used cells 9797487, node heap has 21054 buffer(s)
8000.00 hash searches/s, 292000.00 non-hash searches/s
---
LOG
---
Log sequence number 1109 524358194
Log flushed up to   1109 524357891
Last checkpoint at  1109 408163048
1 pending log writes, 0 pending chkp writes
757938365 log i/o's done, 56000.00 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 9246607770; in additional pool allocated 20813056
Buffer pool size   524288
Free buffers       0
Database pages     503234
Modified db pages  37692
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages read 1600906446, created 39766617, written 1028068613
35000.00 reads/s, 0.00 creates/s, 0.00 writes/s
Buffer pool hit rate 977 / 1000
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
Main thread process no. 20493, id 1149278528, state: sleeping
Number of rows inserted 322671496, updated 4889952955, deleted 7871272, read 315280388069
2000.00 inserts/s, 54000.00 updates/s, 0.00 deletes/s, 89000.00 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================`

`mysql> show variables like 'innodb%';
+---------------------------------+------------------------+
| Variable_name                   | Value                  |
+---------------------------------+------------------------+
| innodb_additional_mem_pool_size | 20971520               |
| innodb_autoextend_increment     | 8                      |
| innodb_buffer_pool_awe_mem_mb   | 0                      |
| innodb_buffer_pool_size         | 8589934592             |
| innodb_data_file_path           | ibdata1:10M:autoextend |
| innodb_data_home_dir            |                        |
| innodb_fast_shutdown            | ON                     |
| innodb_file_io_threads          | 4                      |
| innodb_file_per_table           | ON                     |
| innodb_flush_log_at_trx_commit  | 1                      |
| innodb_flush_method             |                        |
| innodb_force_recovery           | 0                      |
| innodb_lock_wait_timeout        | 50                     |
| innodb_locks_unsafe_for_binlog  | OFF                    |
| innodb_log_arch_dir             |                        |
| innodb_log_archive              | OFF                    |
| innodb_log_buffer_size          | 8388608                |
| innodb_log_file_size            | 100663296              |
| innodb_log_files_in_group       | 2                      |
| innodb_log_group_home_dir       | ./                     |
| innodb_max_dirty_pages_pct      | 90                     |
| innodb_max_purge_lag            | 0                      |
| innodb_mirrored_log_groups      | 1                      |
| innodb_open_files               | 300                    |
| innodb_table_locks              | ON                     |
| innodb_thread_concurrency       | 500                    |
+---------------------------------+------------------------+
26 rows in set (0.00 sec)`

Master

`
mysql> show variables like 'innodb%';
+---------------------------------+------------------------+
| Variable_name                   | Value                  |
+---------------------------------+------------------------+
| innodb_additional_mem_pool_size | 20971520               |
| innodb_autoextend_increment     | 8                      |
| innodb_buffer_pool_awe_mem_mb   | 0                      |
| innodb_buffer_pool_size         | 8589934592             |
| innodb_data_file_path           | ibdata1:10M:autoextend |
| innodb_data_home_dir            |                        |
| innodb_fast_shutdown            | ON                     |
| innodb_file_io_threads          | 4                      |
| innodb_file_per_table           | ON                     |
| innodb_flush_log_at_trx_commit  | 1                      |
| innodb_flush_method             |                        |
| innodb_force_recovery           | 0                      |
| innodb_lock_wait_timeout        | 50                     |
| innodb_locks_unsafe_for_binlog  | OFF                    |
| innodb_log_arch_dir             |                        |
| innodb_log_archive              | OFF                    |
| innodb_log_buffer_size          | 8388608                |
| innodb_log_file_size            | 100663296              |
| innodb_log_files_in_group       | 2                      |
| innodb_log_group_home_dir       | ./                     |
| innodb_max_dirty_pages_pct      | 90                     |
| innodb_max_purge_lag            | 0                      |
| innodb_mirrored_log_groups      | 1                      |
| innodb_open_files               | 300                    |
| innodb_table_locks              | ON                     |
| innodb_thread_concurrency       | 500                    |
+---------------------------------+------------------------+`

Binary Logs on Master

`mysql> SHOW BINARY LOGS;
+----------+------------+
| Log_name | File_size  |
+----------+------------+
| .001668  | 1073743525 |
| .001669  | 1073742181 |
| .001670  | 1073742965 |
| .001671  | 1073742675 |
| .001672  | 1073744261 |
| .001673  | 1073742315 |
| .001674  | 1073742012 |
| .001675  | 1073747721 |
| .001676  | 1073742152 |
| .001677  | 1073741870 |
| .001678  | 1073741996 |
| .001679  | 1073743689 |
| .001680  | 1073745854 |
| .001681  | 1073743341 |
| .001682  | 1073743705 |
| .001683  | 1073741918 |
| .001684  | 1073743384 |
| .001685  | 1073743746 |
| .001686  | 1073742489 |
| .001687  | 1073742964 |
| .001688  | 1073743698 |
| .001689  | 1073741992 |
| .001690  | 1073743910 |
| .001691  | 1073742432 |
| .001692  | 1073768692 |
| .001693  | 1073742854 |
| .001694  | 1073744278 |
| .001695  | 1073742950 |
| .001696  | 1073743193 |
| .001697  | 1073743926 |
| .001698  | 1073742996 |
| .001699  | 1073741934 |
| .001700  | 1073752502 |
| .001701  | 1073742946 |
| .001702  | 1073741946 |
| .001703  | 1073743213 |
| .001704  | 1073743600 |
| .001705  | 1073742102 |
| .001706  | 1073742708 |
| .001707  | 1073741920 |
| .001708  | 1073742893 |
| .001709  | 1073746745 |
| .001710  | 1073742241 |
| .001711  | 1073747731 |
| .001712  | 1073742207 |
| .001713  | 1073742084 |
| .001714  | 1073744461 |
| .001715  | 1073742364 |
| .001716  |  518459392 |
| .001717  |         79 |
| .001718  | 1073742235 |
| .001719  | 1073742165 |
| .001720  | 1073741922 |
| .001721  | 1073744403 |
| .001722  | 1073743551 |
| .001723  |  623701238 |
| .001724  |         79 |
| .001725  |         79 |
| .001726  | 1073741876 |
| .001727  | 1073742369 |
| .001728  | 1073742004 |
| .001729  | 1073742101 |
| .001730  | 1073742300 |
| .001731  | 1073741996 |
| .001732  | 1073744726 |
| .001733  | 1073746953 |
| .001734  | 1073741871 |
| .001735  | 1073747600 |
| .001736  | 1073743845 |
| .001737  | 1073747567 |
| .001738  | 1073742703 |
| .001739  | 1073742219 |
| .001740  | 1073749183 |
| .001741  | 1073741888 |
| .001742  | 1073747031 |
| .001743  | 1073744041 |
| .001744  | 1073742142 |
| .001745  | 1073741924 |
| .001746  |  939175970 |
+----------+------------+`

Log File

Number of processes running now: 0
140805 21:29:20  mysqld restarted
/usr/libexec/mysqld: File '/var/log/slowquries.log' not found (Errcode: 13)
140805 21:29:20 [ERROR] Could not use /var/log/slowquries.log for logging (error 13). Turning logging off for the whole duration of the MySQL server process. To turn it on again: fix the cause, shutdown the MySQL server and restart it.
140805 21:29:21  InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
140805 21:29:23  InnoDB: Operating system error number 13 in a file operation.
InnoDB: The error means mysqld does not have the access rights to
InnoDB: the directory.
InnoDB: Error: could not open single-table tablespace file
InnoDB: ./member/session_0.ibd!
InnoDB: We do not continue the crash recovery, because the table may become
InnoDB: corrupt if we cannot apply the log records in the InnoDB log to it.
InnoDB: To fix the problem and start mysqld:
InnoDB: 1) If there is a permission problem in the file and mysqld cannot
InnoDB: open the file, you should modify the permissions.
InnoDB: 2) If the table is not needed, or you can restore it from a backup,
InnoDB: then you can remove the .ibd file, and InnoDB will do a normal
InnoDB: crash recovery and ignore that table.
InnoDB: 3) If the file system or the disk is broken, and you cannot remove
InnoDB: the .ibd file, you can set innodb_force_recovery > 0 in my.cnf
InnoDB: and force InnoDB to continue crash recovery here.
140805 21:29:23  mysqld ended

140805 21:29:37  mysqld started
/usr/libexec/mysqld: File '/var/log/slowquries.log' not found (Errcode: 13)
140805 21:29:37 [ERROR] Could not use /var/log/slowquries.log for logging (error 13). Turning logging off for the whole duration of the MySQL server process. To turn it on again: fix the cause, shutdown the MySQL server and restart it.
140805 21:29:39  InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
140805 21:29:39  InnoDB: Operating system error number 13 in a file operation.
InnoDB: The error means mysqld does not have the access rights to
InnoDB: the directory.
InnoDB: Error: could not open single-table tablespace file
InnoDB: ./member/session_0.ibd!
InnoDB: We do not continue the crash recovery, because the table may become
InnoDB: corrupt if we cannot apply the log records in the InnoDB log to it.
InnoDB: To fix the problem and start mysqld:
InnoDB: 1) If there is a permission problem in the file and mysqld cannot
InnoDB: open the file, you should modify the permissions.
InnoDB: 2) If the table is not needed, or you can restore it from a backup,
InnoDB: then you can remove the .ibd file, and InnoDB will do a normal
InnoDB: crash recovery and ignore that table.
InnoDB: 3) If the file system or the disk is broken, and you cannot remove
InnoDB: the .ibd file, you can set innodb_force_recovery > 0 in my.cnf
InnoDB: and force InnoDB to continue crash recovery here.
140805 21:29:39  mysqld ended

140805 21:31:18  mysqld started
/usr/libexec/mysqld: File '/var/log/slowquries.log' not found (Errcode: 13)
140805 21:31:18 [ERROR] Could not use /var/log/slowquries.log for logging (error 13). Turning logging off for the whole duration of the MySQL server process. To turn it on again: fix the cause, shutdown the MySQL server and restart it.
140805 21:31:19  InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
140805 21:31:20  InnoDB: Starting log scan based on checkpoint at
InnoDB: log sequence number 1111 3006237038.
InnoDB: Doing recovery: scanned up to log sequence number 1111 3011479552
Stuff...
InnoDB: Doing recovery: scanned up to log sequence number 1111 3122643558
InnoDB: 1 transaction(s) which must be rolled back or cleaned up
InnoDB: in total 16118026 row operations to undo
InnoDB: Trx id counter is 5 204953088
140805 21:31:29  InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percents:.... 
InnoDB: Apply batch completed
InnoDB: Starting rollback of uncommitted transactions
InnoDB: Rolling back trx with id 5 197110837, 16118026 rows to undo
InnoDB: Progress in percents: .....
InnoDB: Rolling back of trx id 5 197110837 completed
InnoDB: Rollback of uncommitted transactions completed
InnoDB: In a MySQL replication slave the last master binlog file
InnoDB: position 0 963211909, file name binlog.000571
InnoDB: Last MySQL binlog file position 0 623701238, file name /mnt/eql-mystones/.001723
140806  0:40:54  InnoDB: Flushing modified pages from the buffer pool...
140806  0:42:42  InnoDB: Started; log sequence number 1111 3122643558
/usr/libexec/mysqld: ready for connections.
Version: '4.1.22-log'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  Source distribution

Best Answer

Look at the Slave Status

            Master_Log_File: .001746 <<---------------
        Read_Master_Log_Pos: 123599834 <<-------------
             Relay_Log_File: XXXXXXX.000014
              Relay_Log_Pos: 290086986
      Relay_Master_Log_File: .001733 <<---------------
           Slave_IO_Running: Yes
          Slave_SQL_Running: Yes
            Replicate_Do_DB: 
        Replicate_Ignore_DB: 
         Replicate_Do_Table: 
     Replicate_Ignore_Table: 
    Replicate_Wild_Do_Table: 
Replicate_Wild_Ignore_Table: 
                 Last_Errno: 0
                 Last_Error: 
               Skip_Counter: 0
        Exec_Master_Log_Pos: 740137529 <<-------------
            Relay_Log_Space: 13632228696 <<-----------
            Until_Condition: None

You have 13GB worth of transactions yet to execute

  • Since each binlog is 1GB, 1746 - 1733 is 13 * 1G = 13G
  • Relay_Log_Space is also 13G (sum of all relay log filesizes)

If replication never broke over the past 23 hours, you need to find out if any large transactions were executed in the last 24 hours.

If you performed a LOAD DATA INFILE on the Master within the last 24 hours, it replicated to the Slave. In fact, the entire file in represented in the relay logs. My guess is that the file must be about 13G.

This is most likely what happened: Once you run LOAD DATA INFILE and it was being replicated, Seconds_Behind_Master remained 0 until the entire input file was represented as binlog event chunks spread throughout the relay logs. Then, suddenly, without warning, Seconds_Behind_Master shot right up to thousands of seconds. That time displacement would have been the time it took to completely replicate the file from the Master's Binary Logs to the Slave's Relay Logs, extract it from the Relay Logs, and store it as a CSV on disk on the Slave. That time does not even take into account the time it takes for LOAD DATA INFILE to be executed on the Slave.

I wrote about this scenario before

UPDATE 2014-08-07 16:36 EDT

Based on your comments, if you want to following the steps for setting up sync_binlog, please proceed as follows:

STEP 01) On the Slave, run STOP SLAVE;

STEP 02) On the Master, add sync_binlog=1 to /etc/my.cnf:

[mysqld]
sync-binlog=1

STEP 03) Run one of the following on the Master:

SET GLOBAL sync_binlog = 1;

STEP 04) On the Slave, run START SLAVE;

No need to restart MySQL. Thus, no downtime.