MySQL Database Poisoning: How to recover to a known state quickly

backup, mysql, replication

If you want to recover from database poisoning (i.e., get back to a known point in time quickly), which methodology do you prefer?

Let me define data poisoning: you insert some data into your database that totally messes up its internal structure and interdependencies. I know this means the database design probably needs to be revisited as well, but the damage is done.

The methods I have in mind are:

1. Somehow set up replication in which the slave is passive and runs X hours behind. If I have a failure, all I have to do is reset the application and point it to the slave as my new master. I suspect that this is possible.
2. Do a hot backup of MySQL every few hours and, when a failure is detected, restore the backup from X hours before. This would mean downtime for the application, since I cannot let the current application keep running. One could use innobackup or Percona XtraBackup for quick backup and recovery.
3. Design the application and database specifically so that newly added data can be nuked (or shelved). This means storing all events/states. (I guess this is the most difficult and theoretical solution.)

If the first option is possible and it also stores all the relay logs (i.e., whatever happens on the Master is transferred to the Slave at the same instant but is applied automatically a few hours later), then it would be a perfect solution. Perhaps one could set up multiple slaves to recover from both an outage and data poisoning.

Best Answer

You can use the pt-slave-delay tool from Percona Toolkit to keep a replica delayed by the amount of time you choose.
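As a rough sketch of what that could look like: pt-slave-delay works by stopping and starting the replica's SQL thread, so the relay logs should keep arriving while their execution is held back. On MySQL 5.6 and later there is also a built-in alternative, the MASTER_DELAY option of CHANGE MASTER TO. The helper names below (delayed_replica_sql, pt_slave_delay_cmd) and the host name are hypothetical; the functions only print the commands so you can review them before running anything.

```shell
# Print the SQL that sets a built-in replication delay of $1 hours.
# Assumes MySQL 5.6+ where CHANGE MASTER TO accepts MASTER_DELAY (in seconds).
delayed_replica_sql() {
    delay_seconds=$(( $1 * 3600 ))
    printf 'STOP SLAVE; CHANGE MASTER TO MASTER_DELAY = %s; START SLAVE;\n' "$delay_seconds"
}

# Print a pt-slave-delay invocation that keeps the replica $1 hours behind.
# $2 is the replica host, passed as a Percona Toolkit DSN (h=...).
pt_slave_delay_cmd() {
    printf 'pt-slave-delay --delay %sh --interval 1m h=%s\n' "$1" "$2"
}

# Example (hypothetical host name):
#   delayed_replica_sql 4
#   pt_slave_delay_cmd 4 replica01
```

Either way, when poisoning is detected you would stop the replica before the bad statements are applied and then decide how far to let it catch up.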

Related Solutions

How to Recover a Single MySQL Database on a Busy Master-Slave System

If all your databases use InnoDB only, I have some good news.

You should be able to dump all the databases in parallel from a slave.

In fact, you can force all the databases into the same point-in-time.

The first thing to remember about the Slave is that it is not required to have binary logging enabled if it is not a Master for other Slaves.

You cannot use the --master-data option for parallel dumps because each dump will have a different position written at line 22 of each dump file. It is better to record the Master's last log file and position that the Slave executed, using SHOW SLAVE STATUS\G, while replication is stopped. That way, all the databases have the same point-in-time position.

You can collect all the database names and script the parallel dump of all the databases.

DBLIST=/tmp/ListOfDatabasesToParallelDump.txt
SSS=/tmp/ShowSlaveStatusDisplay.txt
BACKUP_BASE=/backups
BACKUP_DATE=`date +"%Y%m%d_%H%M%S"`
BACKUP_HOME=${BACKUP_BASE}/${BACKUP_DATE}
mkdir ${BACKUP_HOME}
cd ${BACKUP_HOME}

mysql -h... -u... -p... -e"STOP SLAVE;"
mysql -h... -u... -p... -e"SHOW SLAVE STATUS\G" > ${SSS}
LOGFIL=`cat ${SSS} | grep "Relay_Master_Log_File" | awk '{print $2}'`
LOGPOS=`cat ${SSS} | grep "Exec_Master_Log_Pos"   | awk '{print $2}'`
echo "Master was at ${LOGFIL} Position ${LOGPOS} for this Backup" > Master_Log_FilePos.txt

mysql -h... -u... -p... -AN -e"SELECT schema_name FROM information_schema.schemata WHERE schema_name NOT IN ('information_schema','mysql','performance_schema')" > ${DBLIST}

for DB in `cat ${DBLIST}` 
do 
    mysqldump -h... -u... -p... --hex-blob --routines --triggers ${DB} | gzip > ${DB}.sql.gz & 
done 
wait 

mysql -h... -u... -p... -e"START SLAVE;"

If there are simply too many databases, dump them 10 or 20 at a time as follows:

DBLIST=/tmp/ListOfDatabasesToParallelDump.txt
SSS=/tmp/ShowSlaveStatusDisplay.txt
BACKUP_BASE=/backups
BACKUP_DATE=`date +"%Y%m%d_%H%M%S"`
BACKUP_HOME=${BACKUP_BASE}/${BACKUP_DATE}
mkdir ${BACKUP_HOME}
cd ${BACKUP_HOME}

mysql -h... -u... -p... -e"STOP SLAVE;"
mysql -h... -u... -p... -e"SHOW SLAVE STATUS\G" > ${SSS}
LOGFIL=`cat ${SSS} | grep "Relay_Master_Log_File" | awk '{print $2}'`
LOGPOS=`cat ${SSS} | grep "Exec_Master_Log_Pos"   | awk '{print $2}'`
echo "Master was at ${LOGFIL} Position ${LOGPOS} for this Backup" > Master_Log_FilePos.txt

mysql -h... -u... -p... -AN -e"SELECT schema_name FROM information_schema.schemata WHERE schema_name NOT IN ('information_schema','mysql','performance_schema')" > ${DBLIST}

COMMIT_LIMIT=20
COMMIT_COUNT=0    
for DB in `cat ${DBLIST}` 
do 
    mysqldump -h... -u... -p... --hex-blob --routines --triggers ${DB} | gzip > ${DB}.sql.gz & 
    (( COMMIT_COUNT++ ))
    if [ ${COMMIT_COUNT} -eq ${COMMIT_LIMIT} ]
    then
        COMMIT_COUNT=0
        wait
    fi
done 
wait 
if [ ${COMMIT_COUNT} -gt 0 ]
then
    wait
fi

mysql -h... -u... -p... -e"START SLAVE;"

If you need to recover a single table, you can parallel dump tables 20 at a time in size order.

Try this:

TBLIST=/tmp/ListOfTablesToParallelDump.txt
SSS=/tmp/ShowSlaveStatusDisplay.txt
BACKUP_BASE=/backups
BACKUP_DATE=`date +"%Y%m%d_%H%M%S"`
BACKUP_HOME=${BACKUP_BASE}/${BACKUP_DATE}
mkdir ${BACKUP_HOME}
cd ${BACKUP_HOME}

mysql -h... -u... -p... -e"STOP SLAVE;"
mysql -h... -u... -p... -e"SHOW SLAVE STATUS\G" > ${SSS}
LOGFIL=`cat ${SSS} | grep "Relay_Master_Log_File" | awk '{print $2}'`
LOGPOS=`cat ${SSS} | grep "Exec_Master_Log_Pos"   | awk '{print $2}'`
echo "Master was at ${LOGFIL} Position ${LOGPOS} for this Backup" > Master_Log_FilePos.txt

mysql -h... -u... -p... -AN -e"SELECT CONCAT(table_schema,'.',table_name) FROM information_schema.tables WHERE table_schema NOT IN ('information_schema','mysql','performance_schema') ORDER BY data_length" > ${TBLIST}

COMMIT_LIMIT=20
COMMIT_COUNT=0    
for DBTB in `cat ${TBLIST}` 
do
    DB=`echo "${DBTB}" | sed 's/\./ /g' | awk '{print $1}'`
    TB=`echo "${DBTB}" | sed 's/\./ /g' | awk '{print $2}'`
    DUMPFILE=DB-${DB}-TBL-${TB}.sql.gz
    mysqldump -h... -u... -p... --hex-blob --routines --triggers ${DB} ${TB} | gzip >  ${DUMPFILE} & 
    (( COMMIT_COUNT++ ))
    if [ ${COMMIT_COUNT} -eq ${COMMIT_LIMIT} ]
    then
        COMMIT_COUNT=0
        wait
    fi
done 
wait 
if [ ${COMMIT_COUNT} -gt 0 ]
then
    wait
fi

mysql -h... -u... -p... -e"START SLAVE;"

Now that you have scripts to dump databases or individual tables, you can load that data at your discretion. If you need to get SQL executed from the binary logs on the master, you can use mysqlbinlog, give it a position or datetime, and output the SQL to other text files. You just have to perform due diligence to find the amount of data you need from whatever timestamps the binary logs have. Just remember that every binary log's timestamp in the OS represents the last time it was written.
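For the mysqlbinlog step, something along these lines might work as a starting point. It starts at the position recorded in Master_Log_FilePos.txt and stops just before the moment the bad data went in, using mysqlbinlog's --start-position and --stop-datetime options. The function name binlog_extract_cmd, the stop time, and the output path are all made up for illustration; the function only prints the command so you can check it first.

```shell
# Print a mysqlbinlog command that extracts SQL from one binary log.
# $1 = binary log file on the master
# $2 = start position (e.g. the Exec_Master_Log_Pos saved with the backup)
# $3 = stop datetime, just before the poisoning happened
# $4 = text file to receive the extracted SQL
binlog_extract_cmd() {
    printf 'mysqlbinlog --start-position=%s --stop-datetime="%s" %s > %s\n' \
        "$2" "$3" "$1" "$4"
}

# Example (hypothetical stop time and output path):
#   binlog_extract_cmd mysql-bin.000254 858190247 '2012-06-01 12:00:00' /tmp/replay.sql
```

Review the extracted SQL before loading it, since the whole point is to leave out the statements that caused the damage.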

MySQL – How to resolve master server shutdown/unavailability in MySQL with master-slave replication

Before you perform any mysqldump to fully restore a Slave, you should consult the output of SHOW SLAVE STATUS\G. Let's start with a sample SHOW SLAVE STATUS\G:

mysql> show slave status\G
*************************** 1. row ***************************
             Slave_IO_State: Waiting for master to send event
                Master_Host: 10.48.20.253
                Master_User: replicant
                Master_Port: 3306
              Connect_Retry: 60
            Master_Log_File: mysql-bin.000254
        Read_Master_Log_Pos: 858190247
             Relay_Log_File: relay-bin.066069
              Relay_Log_Pos: 873918
      Relay_Master_Log_File: mysql-bin.000254
           Slave_IO_Running: Yes
          Slave_SQL_Running: Yes
            Replicate_Do_DB:
        Replicate_Ignore_DB:
         Replicate_Do_Table:
     Replicate_Ignore_Table:
    Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
                 Last_Errno: 0
                 Last_Error:
               Skip_Counter: 0
        Exec_Master_Log_Pos: 858190247
            Relay_Log_Space: 873772
            Until_Condition: None
             Until_Log_File:
              Until_Log_Pos: 0
         Master_SSL_Allowed: No
         Master_SSL_CA_File:
         Master_SSL_CA_Path:
            Master_SSL_Cert:
          Master_SSL_Cipher:
             Master_SSL_Key:
      Seconds_Behind_Master: 0
1 row in set (0.00 sec)

Please notice that there are two sets of replication coordinates from the Master

(Master_Log_File,Read_Master_Log_Pos)
(Relay_Master_Log_File,Exec_Master_Log_Pos)

There is a major difference between them

(Master_Log_File,Read_Master_Log_Pos) tells you the Master's log file and log position of the last binlog statement that the Slave read from the Master and placed in its Relay Logs.
(Relay_Master_Log_File,Exec_Master_Log_Pos) tells you the Master's log file and log position up to which the Slave has executed statements from its Relay Logs. The statement at that position is THE NEXT TO BE EXECUTED ON THE SLAVE.

The gap between these two coordinates is what Seconds_Behind_Master reflects: roughly, how far the Slave's SQL thread lags behind its I/O thread.
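To make the comparison of the two coordinate pairs concrete, here is a small sketch that reads saved SHOW SLAVE STATUS\G output from a file and reports whether the SQL thread has executed everything the I/O thread has read. The function names slave_status_field and sql_thread_caught_up are hypothetical helpers, not part of MySQL.

```shell
# Print the value of one field from saved "SHOW SLAVE STATUS\G" output.
# $1 = file holding the output, $2 = exact field name (without the colon).
# Matching the whole field name keeps Master_Log_File from also matching
# Relay_Master_Log_File, which a plain grep would do.
slave_status_field() {
    awk -v key="$2:" '$1 == key { print $2 }' "$1"
}

# Compare what was read from the Master with what was executed on the Slave.
sql_thread_caught_up() {
    status_file="$1"
    read_file=$(slave_status_field "$status_file" Master_Log_File)
    read_pos=$(slave_status_field "$status_file" Read_Master_Log_Pos)
    exec_file=$(slave_status_field "$status_file" Relay_Master_Log_File)
    exec_pos=$(slave_status_field "$status_file" Exec_Master_Log_Pos)
    if [ "$read_file" = "$exec_file" ] && [ "$read_pos" = "$exec_pos" ]; then
        echo "caught up at ${exec_file}:${exec_pos}"
    else
        echo "read ${read_file}:${read_pos}, executed ${exec_file}:${exec_pos}"
    fi
}
```

In the sample output above both pairs are (mysql-bin.000254,858190247), so the Slave has nothing left to execute from its Relay Logs.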

Knowing these things, here is what you can do:

Step 01) Run SHOW SLAVE STATUS\G
Step 02) Get Relay_Master_Log_File,Exec_Master_Log_Pos from SHOW SLAVE STATUS\G (in the sample, that would be (mysql-bin.000254,858190247))
Step 03) STOP SLAVE;
Step 04) CHANGE MASTER TO master_log_file='mysql-bin.000254',master_log_pos=858190247;
Step 05) START SLAVE;
Step 06) Wait 10 seconds
Step 07) Run SHOW SLAVE STATUS\G and check Seconds_Behind_Master

If Seconds_Behind_Master is a number and eventually drops to zero, replication is fully reestablished.
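Steps 02, 04 and 07 can be partly scripted. The sketch below assumes you have saved the SHOW SLAVE STATUS\G output to a file; it builds the CHANGE MASTER TO statement from the executed coordinates and classifies Seconds_Behind_Master. The function names change_master_from_status and replication_state are hypothetical, and the labels it prints are just my own wording.

```shell
# Build the Step 04 statement from saved "SHOW SLAVE STATUS\G" output ($1).
# Uses the coordinates the Slave has executed, not the ones it has only read.
change_master_from_status() {
    log_file=$(awk '$1 == "Relay_Master_Log_File:" { print $2 }' "$1")
    log_pos=$(awk '$1 == "Exec_Master_Log_Pos:" { print $2 }' "$1")
    printf "CHANGE MASTER TO master_log_file='%s',master_log_pos=%s;\n" \
        "$log_file" "$log_pos"
}

# Classify Seconds_Behind_Master for Step 07 from saved output ($1).
# NULL (or a missing value) means replication is not running.
replication_state() {
    lag=$(awk '$1 == "Seconds_Behind_Master:" { print $2 }' "$1")
    case "$lag" in
        ''|NULL) echo "broken" ;;
        0)       echo "caught up" ;;
        *)       echo "lagging ${lag}s" ;;
    esac
}
```

You would run replication_state against fresh output every few seconds until it reports "caught up", or investigate if it keeps reporting "broken".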

After doing all this, if replication breaks because of a corrupt binary log from the master, then you do the last resort:

Step 01) On the Master, run RESET MASTER; to erase all binary logs and start with a new one
Step 02) On the Master, run the following to create a proper dump for the Slave:

echo "STOP SLAVE;" > /root/MySQLData.sql
mysqldump --all-databases --routines --triggers --flush-privileges --master-data=1 >> /root/MySQLData.sql
echo "START SLAVE;" >> /root/MySQLData.sql

Step 03) scp /root/MySQLData.sql over to the Slave and load it into MySQL on the Slave

Give it a Try !!!
