Mysql – How to check the consistency of tables in thesql

data validationMySQL

I have few questions regarding consistency in Mysql in two scenarios

1st scenario:
How can I verify a restored backup matches the original DB data on all the tables?

We have two hosts with the same database A and B (the same data, restored to different DBs) if there any way to check if the data is really the same. Maybe kind of md5 hash on table would help. If there are existing solutions? Of course it's not complicated to do something by myself, kind of validator.

2nd scenario
Let's say we have very strict logical schema for database. I need to check that our DB works according to schema. Let's say every Order has ProductId and etc. It sounds much more complicated, if you have faced these problem, how you deal with them.

Thanks!

Best Answer

Run CHECKSUM TABLE command against a table.

There is a caveat for this. According to the MySQL Documentation on CHECKSUM TABLE:

If the checksums for two tables are different, then it is almost certain that the tables are different in some way. However, because the hashing function used by CHECKSUM TABLE is not guaranteed to be collision-free, there is a slight chance that two tables which are not identical can produce the same checksum.

OK, CHECKSUM TABLE is 99.99%, not 100%.

How can you aggressively check all the items of a table for matches and mismatches? You can actually run a great tool from Percona Tools called pt-table-sync. It will check for differences in tables between Master and Slave, as long as the table structure is identical.

For example, to sync the table db.tb1 on a Slave in relation to its Master, run pt-table-sync as follows:

pt-table-sync --print --sync-to-master h=SlaveIP,D=db,t=tbl > SQLtoFixDifferences.sql

When you run this, the table db.tb1 on the Slave is compared to that of its Master using the Primary Key (or Unique Key is there is no Primary Key) of the Slave's db.tb1. The output will be a series of SQL commands, usually REPLACE INTO and DELETE FROM. The script SQLtoFixDifferences.sql is to be executed on the Slave. When done, the data in db.tb1 should be identical to its counterpart on the Master.

To test it then, run these commands:

CHECKSUM TABLE db.tb1;
SELECT COUNT(1) FROM db.tb1;

The results should be identical on Master and Slave.

CAVEAT

I have never used pt-table-sync. I have been using mk-table-sync (MAATKIT tool from Percona) for years and I know it has always worked for me. I fully trust Percona's pt-table-sync is at the least the same, if not superior, quality.