Say I wish to store when staff are on holiday (FromDate
, ToDate
) and then I wish to find all staff that are on holiday between two given dates (QFromDate
, QToDate
).
Now assume that I have many such records (more than will fit in the server’s RAM) and need to do this query often.
Now assume that I also have sick_leave
tables, shift_pattern
tables, pay_rate
tables, etc – all that have FromDate
and ToDate
with the requirement to join them based on overlapping dates.
How should I store the date ranges and how can the queries be written to run fast?
(The choice of RDBMS is not fixed, but being able to run on any “standard” RDBMS is of value unless doing so has a large adverse effect.)
I have posted some answers that I have considered, but don't like! However they may help other people.
Best Answer
You can generate a customized index table, for example, that has a row for all year&week pairs that the date range (a certain vacation etc) encompasses. Then you can join dateranges by going through that index table. It will be large, but does avoid large scans as you can just list all vacations that have any weeks that are in common with another vacation, as an inner join.
pseudo example:
Similar things are used for example spatial coordinate stuff, dividing linear coordinates to grid cells that can be indexed and scanned more easily.
Update: fixed primary key