PostgreSQL – Enforcing Data Integrity for Sequential Relations

constraintdatabase-designforeign keypostgresql

These are my tables:

create table trips (
  trip_id serial primary key,
  trip_nm text
);

create table trip_segments (
  segment_id serial primary key,
  departure_ts timestamp with time zone, -- departure time
  arrival_ts   timestamp with time zone, -- arrival time
  trip_id integer references trips (trip_id)
);

An insert would look like:

insert into trips (trip_nm) values ('my trip');

-- first segment
insert into trip_segments (departure_ts,arrival_ts,trip_id) values
                          ('2013-01-30 12:00', '2013-01-30 20:00', trip_id);

-- second segment
insert into trip_segments (departure_ts,arrival_ts,trip_id) values
                          ('2013-01-30 21:00', '2013-01-30 22:00', trip_id);

Given the above data: I want to ensure that a departure time in the second segment is after the arrival time of the first segment.

Which means the following insert should fail for the second segment:

insert into trip_segments (departure_ts,arrival_ts,trip_id) values
                          ('2013-01-30 19:55', '2013-01-31 22:00', trip_id);

There could be many segments for one trip. There are many trips.

So, how do I do that?

My first thought was to use a check constraint function for the trip_segments table, that looks like:

create function previous_ts(trip_id integer)
  returns timestamp with time zone as $$

  /* do some hackery here using the lag() function */

  return previous_arrival_ts;

$$ language 'sql';

Okay, so I haven't implemented that, but that would be the idea. However, I read (on stackoverflow, lost the answer) that using
a check constraint function is, and I'm para-phrasing, a really bad idea.

Ok, so there must be a better way, maybe using a trigger, I don't know. What's the best way to solve this problem?

I'm using postgresql 9.5

Best Answer

You can make use of an EXCLUSION constraint and tsrange datatype instead of two timestamps.

create table trip_segments (
  segment_id serial primary key,
  travel_ts tstzrange, -- departure time to arrival time
  trip_id integer references trips (trip_id),
  EXCLUDE USING gist (trip_id WITH =, travel_ts WITH &&)
);

-- first segment
insert into trip_segments (travel_ts,trip_id) values
                          ('[2013-01-30 12:00,2013-01-30 20:00)', 1);

-- second segment overlaps and fails
insert into trip_segments (travel_ts,trip_id) values
                          ('[2013-01-30 19:55,2013-01-30 22:00]', 1);

You get:

ERROR: conflicting key value violates exclusion constraint "trip_segments_travel_ts_excl" DETAIL: Key (travel_ts)=(["2013-01-30 19:55:00+11","2013-01-30 22:00:00+11"]) conflicts with existing key (travel_ts)=(["2013-01-30 12:00:00+11","2013-01-30 20:00:00+11")).

The exclusion is set-up exclusively for time ranges specific to each trip_id.

insert into trips (trip_nm) values ('their trip'); -- add other trip

-- segment overlaps with trip_id=1, but the exclusion is setup specific to trips
insert into trip_segments (travel_ts,trip_id) values
                          ('[2013-01-30 19:55,2013-01-30 22:00)', 2);

Note that timestamp ranges cannot be specified backwards (ending timestamp must be equal or higher than the starting one), and their boundaries are either inclusive ([]) or exclusive (()) boundaries, being the default [), i.e. an inclusive starting and exclusive ending time.