If a session can only belong to a single user at any given time, the more appropriate model would be without a join table (`user_sessions`). What's the problem with the column `user_id` being `NULL`?
If you need it to be `NOT NULL` and you don't always know the user at creation time, just insert an "unknown" user (with `user_id = 0` in my example). That can serve as the `DEFAULT` value, too:
```sql
CREATE TABLE sessions (
   session_id   serial PRIMARY KEY
 , user_id      int NOT NULL DEFAULT 0 REFERENCES users(user_id)
 , session_date timestamp
 , ...
);
```
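As a sketch of how that works in practice (assuming the `users` table has columns `user_id` and `user_name`), the dummy user has to exist before any session can reference it:

```sql
-- Seed the placeholder row once (column names are assumptions):
INSERT INTO users (user_id, user_name) VALUES (0, 'unknown');

-- Sessions can now be created before the real user is known;
-- user_id falls back to the DEFAULT of 0:
INSERT INTO sessions (session_date) VALUES (now());
```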
Note on indexing strategy
Postgres can combine multiple indices with excellent speed. Typically, in your original scenario, with a nested loop join:

> **nested loop join:** The right relation is scanned once for every row found in the left relation. This strategy is easy to implement but can be very time consuming. (However, if the right relation can be scanned with an index scan, this can be a good strategy. It is possible to use values from the current row of the left relation as keys for the index scan of the right.)
Say you have these indices:

```sql
sessions(session_id)
user_sessions(session_id, user_id)
```

Both should be covered by the PK constraints in your original model automatically. However, the second one could go either way, `(user_id, session_id)` or `(session_id, user_id)`, and that matters:
Is a composite index also good for queries on the first field?
For best performance in a query like:

```sql
SELECT u.user_name, s.session_date
FROM   users u
JOIN   user_sessions us USING (user_id)
JOIN   sessions s USING (session_id)
WHERE  u.user_name = 'foo';
```

you would need the index (or PK constraint) to be `user_sessions(user_id, session_id)`, or have an additional index with `user_id` as the first column.
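A sketch of the two options (constraint and index names are illustrative):

```sql
-- Option 1: order the PK to lead with user_id:
ALTER TABLE user_sessions
  ADD CONSTRAINT user_sessions_pkey PRIMARY KEY (user_id, session_id);

-- Option 2: keep the existing PK and add a second index led by user_id:
CREATE INDEX user_sessions_user_id_idx ON user_sessions (user_id, session_id);
```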
The thing about indexes is that they are excellent tools for speeding up queries, but they bring overhead as well.
- If you choose to index every column in a table that is involved in a query, you have chosen to store considerably more data. This affects the storage required and the overhead of maintaining all those indexes (or maintaining all columns in one very wide index).
- In past experience, having many indexes, especially several indexes with similar distribution statistics, can make it easier for the server to select a poor index and thus a poor plan.
My experience shows that indexing just about everything will hurt your performance overall, for the reasons mentioned above.
Both table `A` and table `B` should only contain the indexes that are of best use for your queries.
Of course, your first indexing decisions may need to be refined over time as you learn more about your database's behavior. Do not be afraid to add indexes when needed, and do not be afraid to drop an index that turns out not to be useful.
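One way to spot candidates for dropping, assuming statistics have been collected over a representative workload, is the statistics view `pg_stat_user_indexes`:

```sql
-- Indexes that have never been used by an index scan:
SELECT relname      AS table_name
     , indexrelname AS index_name
     , idx_scan
FROM   pg_stat_user_indexes
WHERE  idx_scan = 0
ORDER  BY relname, indexrelname;
```

Bear in mind that a zero scan count does not prove an index is useless; it may still enforce a constraint or serve a rarely run query.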
Best Answer
That's not what indexes are for; you need a check constraint for that.
For the "at most one can be not null" rule you can use a little trick: casting a boolean expression to an integer yields `0` or `1`. If you add up those values, one per `IS NOT NULL` condition, the rule "at most one can be not null" translates to "the sum of those must always be 1".

The check can be simplified by using the `num_nonnulls()` function, which also covers the "not all three can be null" requirement.
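The constraints described above could look like this; the table and column names `tbl`, `a`, `b`, `c` are assumptions for illustration:

```sql
CREATE TABLE tbl (
  a int,
  b int,
  c int,
  -- each boolean cast contributes 0 or 1; the sum must be exactly 1:
  CONSTRAINT one_not_null CHECK (
    (a IS NOT NULL)::int + (b IS NOT NULL)::int + (c IS NOT NULL)::int = 1
  )
);

-- Simplified with num_nonnulls() (Postgres 9.6 or later);
-- requiring = 1 also rules out "all three null":
-- CONSTRAINT one_not_null CHECK (num_nonnulls(a, b, c) = 1)
```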