MySQL – Finding Damage from FOREIGN_KEY_CHECKS=0

foreign keyMySQL

MySQL allows an end user to temporarily disable foreign key checks with the following statement

SET FOREIGN_KEY_CHECKS=0;

This means it's incredibly common (at least in my corner of the world) to find production MySQL databases where some rows have invalid foreign keys. For example, if you setup two tables with a foreign key relationship

CREATE TABLE `foo` (
  `id` int(10) NOT NULL AUTO_INCREMENT,
  `name` varchar(10) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=utf8;

CREATE TABLE `bar` (
  `id` int(10) NOT NULL AUTO_INCREMENT,
  `foo_id` int(11) DEFAULT NULL,
  `name` varchar(10) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `foo_id` (`foo_id`),
  CONSTRAINT `bar_ibfk_1` FOREIGN KEY (`foo_id`) REFERENCES `foo` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=8 DEFAULT CHARSET=utf8;

and insert some legitimate data

insert into foo(id,name) VALUES (1,'Test 1');

insert into foo(id,name) VALUES (2,'Test 2');

insert into bar(name, foo_id) VALUES ("Hello", 1);

and then use the above statements to turn off foreign key checks and set some bogus data

SET FOREIGN_KEY_CHECKS=0;
insert into bar(name, foo_id) VALUES ("Hello", 100);
SET FOREIGN_KEY_CHECKS=1;

You now have a bar table with a foo_id of 100, but no such row in the foo table.

My Question: Does MySQL have any built-in tools for finding this sort of bogus data? If not, are there common DBA tools for finding this sort of bogus data? Or am I writing a program to manually scan every table for foreign_keys, and then check each row individually?

Best Answer

The way to check for "incorrect" foreign keys, you have to do something like:

SELECT 
   bar.*
FROM
   bar
   LEFT JOIN foo ON bar.foo_id = foo.id
WHERE
   bar.foo_id IS NOT NULL AND foo.id IS NULL ;

or its equivalent

SELECT
    bar.*
FROM
    bar
WHERE
    bar.foo_id IS NOT NULL AND
    bar.foo_id NOT IN (SELECT foo.id FROM foo WHERE foo.id IS NOT NULL) ;

There is a tool, written in Ruby, called mysql-foreign-key-checker, by Michael Irey that can do this kind of checks systematically for you. The program is actually quite simple and easy to understand, and could be translated quite easyly to the language of your choice.

Check the queries at SQLFiddle.


NOTE: I know that some IS NOT NULL shouldn't be necessary in the queries, because you already have NOT NULL constraints. But... if you're checking for data that actually violates some constraints... you'd better not rely on any constraint.