MySQL Delete Rows from Slave

innodb, MySQL, replication

I want to truncate the majority of data on a slave such that it only has the most recent records. I'll eventually demote the current master, making it a read-only archive, and promote the truncated slave to master. The primary motivation for truncation is reclamation of disk space; I want to dump and restore the slave.

Questions:

Regarding the dump/restore operation on the slave, I'm planning on stopping the IO and SQL threads, then dumping the table, then dropping the table, then restoring from the dump, then starting the slave threads. Does this sound like the right process?
What happens, after deleting rows from the slave, if the master sends an UPDATE or DELETE referencing a deleted row?

Best Answer

- If you are using InnoDB with innodb_file_per_table off (the default before MySQL 5.6.6), you need to stop the server after the dump (make sure you are not using innodb_fast_shutdown = 2), then delete at least ibdata1, ib_logfile0, ib_logfile1 and your database directory, and finally restart MySQL and import only the final truncated data. Otherwise, your ibdata1 file won't shrink.
- If innodb_file_per_table is enabled (the default since MySQL 5.6.6), you have separate .ibd files per table, and your ibdata1 file is not too big, you do not need to export/import the data and delete the InnoDB files. Just delete the appropriate rows and defragment the .ibd files by running:
```
ALTER TABLE <your table> ENGINE=InnoDB; -- for 5.5 and before

ALTER TABLE <your table> ENGINE=InnoDB, ALGORITHM=COPY; -- for 5.6 and later
```
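To decide which of the two cases above applies, you can first read the setting from the server. The following is only a rough sketch: the function name `reclaim_plan` is made up for illustration, and it assumes the value comes from `SELECT @@innodb_file_per_table`, which returns 1 or 0. Tables created while the option was off still live in ibdata1 even if it is on now, so treat the answer as a hint rather than a guarantee.

```shell
# Hypothetical helper: map the innodb_file_per_table value to the approach
# described above. The value would normally come from something like:
#   mysql -h... -u... -p... -AN -e"SELECT @@innodb_file_per_table"
reclaim_plan() {
    case "$1" in
        1|ON)
            echo "delete rows, then rebuild each table with ALTER TABLE ... ENGINE=InnoDB" ;;
        0|OFF)
            echo "dump, stop server, remove ibdata1 and ib_logfile*, restart, reload" ;;
        *)
            echo "unknown setting: $1" ;;
    esac
}

# Example with a hard-coded value instead of a live server.
reclaim_plan 1
```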
Replication will probably break. With ROW-based replication it will certainly break, because the slave won't find the rows it is supposed to change. In STATEMENT mode some queries may continue working, affecting 0 rows, but as time passes you are very likely to hit an incompatible query, such as an UPDATE that succeeds on the master but fails on the slave because of a unique key constraint. I do not recommend running replication with different data on master and slave unless you know for sure that the differing parts are never modified, or are modified only in a very restricted way. I can tell you that most of the replication problems I fix are due to replication filters.

Related Solutions

How to Recover a Single MySQL Database on a Busy Master-Slave System

If all your databases use InnoDB only, I have some good news.

You should be able to dump all the databases in parallel from a slave.

In fact, you can force all the databases into the same point-in-time.

The first thing to remember about a Slave is that it is not required to have binary logging enabled if it is not a Master for other Slaves.

You cannot use --master-data option for parallel dumps because each dump will have a different position written at line 22 of each dump file. It is better to record the Master's last log file and position the Slave executed using SHOW SLAVE STATUS\G. That way, all the databases have the same point-in-time position.
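As a rough sketch of recording that position, the snippet below pulls the two fields out of saved `SHOW SLAVE STATUS\G` output with awk, matching the field name exactly rather than by substring. The function name `status_field` and the sample values are assumptions made up for illustration.

```shell
# Hypothetical helper: print the value of one field from saved
# SHOW SLAVE STATUS\G output. $1 = exact field name, $2 = file.
status_field() {
    awk -F': ' -v key="$1" '
        {
            k = $1
            gsub(/^ +/, "", k)      # strip the leading padding mysql adds
            if (k == key) print $2
        }
    ' "$2"
}

# Made-up sample of the relevant lines, standing in for real output.
SAMPLE=$(mktemp)
cat > "${SAMPLE}" <<'EOF'
        Master_Log_File: mysql-bin.000042
    Read_Master_Log_Pos: 98765
  Relay_Master_Log_File: mysql-bin.000041
    Exec_Master_Log_Pos: 12345
EOF

LOGFIL=$(status_field Relay_Master_Log_File "${SAMPLE}")
LOGPOS=$(status_field Exec_Master_Log_Pos "${SAMPLE}")
echo "Master was at ${LOGFIL} Position ${LOGPOS} for this Backup"
rm -f "${SAMPLE}"
```

Relay_Master_Log_File and Exec_Master_Log_Pos describe what the SQL thread has actually executed, which is why they are used here instead of Master_Log_File and Read_Master_Log_Pos.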

You can collect the list of databases and script a parallel dump of all of them.

DBLIST=/tmp/ListOfDatabasesToParallelDump.txt
SSS=/tmp/ShowSlaveStatusDisplay.txt
BACKUP_BASE=/backups
BACKUP_DATE=`date +"%Y%m%d_%H%M%S"`
BACKUP_HOME=${BACKUP_BASE}/${BACKUP_DATE}
mkdir ${BACKUP_HOME}
cd ${BACKUP_HOME}

mysql -h... -u... -p... -e"STOP SLAVE;"
mysql -h... -u... -p... -e"SHOW SLAVE STATUS\G" > ${SSS}
LOGFIL=`cat ${SSS} | grep "Relay_Master_Log_File" | awk '{print $2}'`
LOGPOS=`cat ${SSS} | grep "Exec_Master_Log_Pos"   | awk '{print $2}'`
echo "Master was at ${LOGFIL} Position ${LOGPOS} for this Backup" > Master_Log_FilePos.txt

mysql -h... -u... -p... -AN -e"SELECT schema_name FROM information_schema.schemata WHERE schema_name NOT IN ('information_schema','mysql','performance_schema')" > ${DBLIST}

for DB in `cat ${DBLIST}` 
do 
    mysqldump -h... -u... -p... --hex-blob --routines --triggers ${DB} | gzip > ${DB}.sql.gz & 
done 
wait 

mysql -h... -u... -p... -e"START SLAVE;"

If there are simply too many databases, dump them 10 or 20 at a time as follows:

DBLIST=/tmp/ListOfDatabasesToParallelDump.txt
SSS=/tmp/ShowSlaveStatusDisplay.txt
BACKUP_BASE=/backups
BACKUP_DATE=`date +"%Y%m%d_%H%M%S"`
BACKUP_HOME=${BACKUP_BASE}/${BACKUP_DATE}
mkdir ${BACKUP_HOME}
cd ${BACKUP_HOME}

mysql -h... -u... -p... -e"STOP SLAVE;"
mysql -h... -u... -p... -e"SHOW SLAVE STATUS\G" > ${SSS}
LOGFIL=`cat ${SSS} | grep "Relay_Master_Log_File" | awk '{print $2}'`
LOGPOS=`cat ${SSS} | grep "Exec_Master_Log_Pos"   | awk '{print $2}'`
echo "Master was at ${LOGFIL} Position ${LOGPOS} for this Backup" > Master_Log_FilePos.txt

mysql -h... -u... -p... -AN -e"SELECT schema_name FROM information_schema.schemata WHERE schema_name NOT IN ('information_schema','mysql','performance_schema')" > ${DBLIST}

COMMIT_LIMIT=20
COMMIT_COUNT=0    
for DB in `cat ${DBLIST}` 
do 
    mysqldump -h... -u... -p... --hex-blob --routines --triggers ${DB} | gzip > ${DB}.sql.gz & 
    (( COMMIT_COUNT++ ))
    if [ ${COMMIT_COUNT} -eq ${COMMIT_LIMIT} ]
    then
        COMMIT_COUNT=0
        wait
    fi
done 
wait 
if [ ${COMMIT_COUNT} -gt 0 ]
then
    wait
fi

mysql -h... -u... -p... -e"START SLAVE;"

If you need to recover a single table, you can parallel dump tables 20 at a time in size order.

Try this:

TBLIST=/tmp/ListOfTablesToParallelDump.txt
SSS=/tmp/ShowSlaveStatusDisplay.txt
BACKUP_BASE=/backups
BACKUP_DATE=`date +"%Y%m%d_%H%M%S"`
BACKUP_HOME=${BACKUP_BASE}/${BACKUP_DATE}
mkdir ${BACKUP_HOME}
cd ${BACKUP_HOME}

mysql -h... -u... -p... -e"STOP SLAVE;"
mysql -h... -u... -p... -e"SHOW SLAVE STATUS\G" > ${SSS}
LOGFIL=`cat ${SSS} | grep "Relay_Master_Log_File" | awk '{print $2}'`
LOGPOS=`cat ${SSS} | grep "Exec_Master_Log_Pos"   | awk '{print $2}'`
echo "Master was at ${LOGFIL} Position ${LOGPOS} for this Backup" > Master_Log_FilePos.txt

mysql -h... -u... -p... -AN -e"SELECT CONCAT(table_schema,'.',table_name) FROM information_schema.tables WHERE table_schema NOT IN ('information_schema','mysql','performance_schema') ORDER BY data_length" > ${TBLIST}

COMMIT_LIMIT=20
COMMIT_COUNT=0    
for DBTB in `cat ${TBLIST}` 
do
    DB=`echo "${DBTB}" | sed 's/\./ /g' | awk '{print $1}'`
    TB=`echo "${DBTB}" | sed 's/\./ /g' | awk '{print $2}'`
    DUMPFILE=DB-${DB}-TBL-${TB}.sql.gz
    mysqldump -h... -u... -p... --hex-blob --routines --triggers ${DB} ${TB} | gzip >  ${DUMPFILE} & 
    (( COMMIT_COUNT++ ))
    if [ ${COMMIT_COUNT} -eq ${COMMIT_LIMIT} ]
    then
        COMMIT_COUNT=0
        wait
    fi
done 
wait 
if [ ${COMMIT_COUNT} -gt 0 ]
then
    wait
fi

mysql -h... -u... -p... -e"START SLAVE;"

Now that you have scripts to dump databases or individual tables, you can load that data at your discretion. If you need to get SQL executed from the binary logs on the master, you can use mysqlbinlog, give it the position or datetime, and output the SQL to other text files. You just have to perform due diligence to find the amount of data you need from whatever timestamps the binary logs have. Just remember that every binary log's timestamp in the OS represents the last time it was written.
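For the mysqlbinlog step, something along these lines may help. It is only a sketch: the wrapper name `binlog_cmd`, the log file name, the position and the datetime are placeholders, and the function just prints the command line so you can review it before running it. `--start-position` and `--stop-datetime` are standard mysqlbinlog options.

```shell
# Hypothetical wrapper: print a mysqlbinlog command line.
# $1 = binary log file, $2 = start position, $3 = stop datetime (optional)
binlog_cmd() {
    cmd="mysqlbinlog --start-position=$2"
    if [ -n "$3" ]; then
        cmd="${cmd} --stop-datetime='$3'"
    fi
    echo "${cmd} $1"
}

# Placeholder values; substitute the file and position recorded at backup time.
binlog_cmd mysql-bin.000041 12345 "2024-01-31 23:59:59"
```

Redirect the output of the real command to a text file (for example `> replay.sql`) and inspect it before loading it anywhere.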

Mysql – Rows already DELETEd on MySQL slave; How to DELETE on master

Here is something quick and dirty you can do

Run the DELETE like this on the Master:

SET SQL_LOG_BIN=0;
DELETE FROM ... ;

The first line tells the DB Session not to record the SQL that follows. Thus, when you run the DELETE on the Master, the SQL will not be written in the Master's binary logs. Consequently, the Slave will never receive the DELETE in its relay logs.

This will only affect the session you use to run these two lines. All other DB Connections will replicate as usual. Once you disconnect, any new session will replicate properly.
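Because the setting is per session, both statements have to travel over the same connection. One way to make sure of that from a script is to generate them together and pipe them into a single mysql invocation. This is a minimal sketch: the helper name `unlogged_sql` is made up, and the account running it typically needs an elevated privilege (SUPER on older versions) to change SQL_LOG_BIN.

```shell
# Hypothetical helper: wrap one statement so it runs with binary logging
# switched off, then switch logging back on for the rest of the session.
unlogged_sql() {
    printf 'SET SQL_LOG_BIN=0;\n%s\nSET SQL_LOG_BIN=1;\n' "$1"
}

# Print the statements for review. To execute them in one session,
# pipe the output into a single client call, e.g.:
#   unlogged_sql "DELETE FROM ... ;" | mysql -h... -u... -p...
unlogged_sql "DELETE FROM ... ;"
```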

Give it a Try !!!
