Schema:
CREATE TABLE "expenses_commissionrule" (
"id" serial NOT NULL PRIMARY KEY,
"country" varchar(2) NOT NULL,
"created" timestamp with time zone NOT NULL,
"modified" timestamp with time zone NOT NULL,
"activity" tsrange NOT NULL
);
Description:
I'd like to create an application to manage commission calculations. Every rule has activity period. Each country has independent set of rules.
Constraints:
First. To avoid ambiguity these activity periods should not overlap. I did it with the following constraint:
ALTER TABLE expenses_commissionrule
ADD CONSTRAINT non_overlapping_activity
EXCLUDE USING GIST (country WITH =, activity WITH &&);
Second. In any point of time there should be only one commission rule, so, there should be no gaps between intervals in the table. In other words sum of all intervals should be -INF:+INF.
Question:
How can I add the second constraint?
Use case:
We have a rule with infinite period. I want to switch to a new rule, which should start from the next month. In this case I'd like to set current rule end period to the end of the current month and add a new rule in a single action.
Update:
In the future I'd like to add the following behaviour:
- An ability to specify a rule only for specific user (via nullable foreign key "user_id");
- Every user could have his own rule for each country. If user has no rule – "global" rules will be used for commission calculations, so it is like a fallback;
- These "user" rules could have any period of activity – (-inf: inf) and concrete intervals as well. It is different from "global" rules – gaps are OK here.
- As "may be" feature – extra varchar field, which will have a type of situation, when particular rule could be applied to have more flexibility with the calculation process. This "partial" rules might have different intervals as well, gaps are OK.
In other words all previous constraints should be applied only to rows where user_id IS NULL. How can this be accomplished?
PostgreSQL version: 9.6.2
Best Answer
Chapter 1: Linked List
One way to have this (no gaps) type of constraint enforced in the database is to split
activity
into the starting and ending parts and then useUNIQUE
andFOREIGN KEY
constraints to emulate a linked list.1a.
activity_start
should reference the previousactivity_end
:(country, activity_start) REFERENCES (country, activity_end)
.activity_start
andcountry
or activity end and country:UNIQUE
constraint on(country, activity_start)
.and
UNIQUE (country, activity_end)
.(-Infinity, +Infinity)
or series like:-Infinity -> DateA -> Infinity -> DateB -> +Infinity
. This is achieved with the two partial indexes.Code:
We can then try to insert (valid) data:
Works fine:
And try with invalid data:
Another try:
1b.
After all these, we can spot that
activity
is not really needed in the table as we have start and end and we can compute it. Thus it can be removed:1c.
And then we realize that - due to the foreign key we added - we don't really need the exclusion constraint any more. We can have the same effect by enforcing that
activity_end
is afteractivity_start
.Chapter 2: No List
After the all the tedious effort, let try something simpler. No explanation this time, lets read the code first and explain later:
OK, what happened here? This is much simpler, it can't possibly be used the same way as the previous designs! It can't possibly be equivalent. Or maybe it can?
Lets see what happens here:
activity_start
is missing completely. How are we supposed to find the periods if they are not stored in the table?The answer is that activity start is indeed stored, just not in the same row as activity end. That's what the foreign key was all about, to ensure that every end had a matching start. So we can easily find the start of every period using
LAG()
:Chapter 3: Spot the difference
While all the above design manage to enforce the "no gaps" rule, they all fail to enforce the second rule: "the sum of all intervals should be
-INF : +INF
."Can this be amended? It seems to be easier after all once we have achieved no gaps.
Well, yes and no. And no. It isn't easier. It's similar to enforcing that a table has at least one row. That seems easy, too, but it's actually very hard, if not impossible with DDL alone.
For the specific problem though, the "yes and no" means that the rule:
can be enforced with design 1 (although it makes it even more complicated).
but not with design 2 (at least I can't see a way).
For design 1, we'd need to add a second foreign key (from activity end to the next start) and a unique constraint (country, activity_start). This essentially would convert our list to a doubly linked list. And for the foreign keys to be satisfied, the list would have to either be infinite (which isn't possible) or have ends both left and right, which means two rows with nulls, one for left and one for right (foreign keys are satisfied if one of the columns is
NULL
).For design 2, you'd have to ensure - outside of DDL - that for every country there is one row where
activity_end
isNULL
(that row is the rule for the period up to+Infinity
).