Database Design – Why Avoid Using NULLs?

database-designnull

I remember reading this one article about database design and I also remember it said you should have field properties of NOT NULL. I don't remember why this was the case though.

All I can seem to think of is that, as an application developer, you wouldn't have to test for NULL and a possible nonexistent data value (for instance, an empty string for strings).

But what do you do in the case of dates, datetime, and time (SQL Server 2008)? You'd have to use some historic or bottomed-out date.

Any ideas on this?

Best Answer

I think the question is poorly phrased, as the wording implies that you've already decided NULLs are bad. Perhaps you meant "Should we allow NULLs?"

Anyway, here is my take on it: I think NULLs are a good thing. When you start preventing NULLs just because "NULLs are bad" or "NULLs are hard", you start making up data. For example, what if you don't know my birth date? What are you going to put in the column until you know? If you're anything like a lot of anti-NULL folks, you're going to enter 1900-01-01. Now I'm going to be placed in the geriatric ward and probably get a call from my local news station congratulating me on my long life, asking me my secrets to living such a long life, etc.

If a row can be entered where it is possible that you don't know the value of a column, I think NULL makes a lot more sense than picking some arbitrary token value to represent the fact that it is unknown - a value which others will have to already know, reverse engineer, or ask around to figure out what it means.

There is a balance, though - not every column in your data model should be nullable. There are often optional fields on a form, or pieces of information that otherwise don't get collected at the time the row is created. But that doesn't mean you can defer populating all of the data. :-)

Also the ability to use NULL can be limited by crucial requirements in real life. In the medical field, for example, it can be a life-or-death matter to know why a value is unknown. Is the heart rate NULL because there wasn't a pulse, or because we haven't measured it yet? In such a case, can we put NULL in the heart rate column, and have notes or a different column with a NULL-because reason?

Don't be afraid of NULLs, but be willing to learn or dictate when and where they should be used, and when and where they shouldn't.