Mysql – Distinguish between unknown and non-present foreign keys

database-designforeign keyMySQLnull

I'm in the situation of importing data from multiple other applications.

The data is stored in multiple tables that contain foreign keys.

Some of these applications don't provide all the needed data – the application doesn't give me an ID I could put in the foreign key column (either because that application doesn't manage that data or the API doesn't provide access to that field).

I've used NULL if one value is not known. In cases where the value is nothing, I've used 0 or an empty string, for example an empty string for a title if something is untitled.

However, this doesn't work with foreign keys. When using NULL I don't know if the relation is currently unknown or if there is no relation. Using a special value (e.g. 0) also doesn't work because that would violate the foreign key constraint.

I want to find a method to distinguish between the data that (a) applies for a given foreign key column but is currently unknown, and (b) the data that simply does not apply for such column and therefore will never be known.

I can distinguish the two cases (foreign key exists or not) when I get the data from the applications, but I don't know how to write that information to the database without losing that information.

I'm currently using MySQL, but I maybe switch to PostgreSQL (for other reasons).

Example

Let's say I have among others the tables music_album and file:

music_album(album_id, title, cover_id, ...)
file(file_id, ...)

music_album.cover_id is a foreign key to file.file_id.

Now the problem is as following:

  • application A gives me the title of the album, etc. including the cover. If the album doesn't have a cover the application will also give me that information
  • application B gives me the title, etc. but not the cover

If I simply stored a NULL for an album without cover from application A and also store all albums from application B with cover_id=NULL I later don't know if the album doesn't have a cover or that the application didn't give me the information about the cover.

Best Answer

There is a mismatch between you and your DBMS regarding special values. Special values are special in your mind, but the DBMS can't tell that a special value is different from an ordinary value. So when you put a special value in, it expects a valid reference to a row in the lookup table. If the reference is invalid, it flags a constraint error.

There is a way you can get around this, but it will create further problems down the road. You can put the special value in as a key in the lookup table, with some sort of descriptive field set to "[Unknown Value]". Then, when you do the join, this text will pop up in place of the known value.

I hesitate to recommend this solution. In every case where I had to work with this, dummy values in lookup tables ended up causing more problems than they solved:

Wherever the foreign key and the primary key are joined, you have to include a condition to weed out the special values, if appropriate. If you forget, that's a bug. It you don't forget, that's a slowdown.

It depends on what the special value means. If the special value means "this is not a value", among other things, then there is, conceptually, nothing to join to. If the actual join produces a result, that result is likely to be meaningless or misleading.

But, if it works for you, great.