How to Get Missing Dates from an Interval in MySQL

MySQLmysql-5.6

I am working with a database schema shown below and I am struggling with writing a query on that schema.

A hotel has a contract with a period of validity. Also it has many seasons, and each season has many periods. The task is to return all dates from the contract validity period that are not available in the seasonal periods.

This is my schema:

DB schema

Ideally, I would like to get all possible intervals that do not cover the periods of the hotel. But if it is not possible then at least a list of dates.

Example (all dates are in the DD.MM.YYYY format):

  • Contract period: 1.01.2018 – 31.12.2018
  • Hotel periods: 1.01.2018 – 30.06.2018; 1.08.2018 – 31.12.2018

In this case I lose one month 1.07.2018 – 31.07.2018.
And I want to get this period, or several periods if there happen to be several. So, something like this:

BeginDate  EndDate
---------  ----------
1.07.2018  31.07.2018

If it's not possible to get a list of periods, then I want to get all missing dates, and group them in periods on the server side.

Best Answer

If a hotel can never have overlapping seasonal periods, then it is possible to obtain the result as a list of intervals by direct matching of intervals using the method below.

First you need to invert the seasonal period list, which means getting the list of gaps between the items of the list complementing it with two more intervals representing the period before the first item and the period after the last item. In other words, a list like this:

date_from    date_to
----------  ---------
date_from1   date_to1
date_from2   date_to2
.
.
.
date_fromN   date_toN

would be transformed to a list like this:

   date_from            date_to
-----------------   ------------------
   0001-01-01        date_from1 - 1 day
date_to1   + 1 day   date_from2 - 1 day
date_to2   + 1 day   date_from3 - 1 day
.
.
.
date_toN-1 + 1 day   date_fromN - 1 day
date_toN   + 1 day      9999-12-31

The date_fromi and date_toi in the second list refer to the corresponding values from the first list. The one-day adjustments are there to account for the fact that both the input intervals and the output intervals are inclusive.

This is how you could generate the list of gaps in SQL based on your schema:

SELECT
  IFNULL(
    (
      SELECT
        hs_last.date_to + INTERVAL 1 DAY
      FROM
        hotel_season AS hs_last
        INNER JOIN hotel_season_period AS hsp_last
          ON hs_last.id = hsp_last.hotel_season_id
      WHERE
        hs_last.hotel_id = hs_this.hotel_id
        AND hsp_last.date_from < hsp_this.date_from
      ORDER BY
        hsp_last.date_from DESC
      LIMIT
        0, 1
    ),
    CAST('0001-01-01' AS date)
  )                                      AS date_from,
  hs_this.date_from - INTERVAL 1 DAY     AS date_to
FROM
  hotel_season AS hs_this
  INNER JOIN hotel_season_period AS hsp_this
    ON hs_this.id = hsp_this.hotel_season_id
WHERE
  hs_this.hotel_id = @param_hotel_id

UNION ALL

SELECT
  MAX(hs_this.date_to) + INTERVAL 1 DAY  AS date_from,
  CAST('9999-12-31' AS date)             AS date_to
FROM
  hotel_season AS hs_this
  INNER JOIN hotel_season_period AS hsp_this
    ON hs_this.id = hsp_this.hotel_season_id
WHERE
  hs_this.hotel_id = @param_hotel_id
;

Note that adjacent seasonal periods will generate "gaps" for which date_from > date_to. Such ranges would be invalid, of course, and you would need to filter them out before further processing.

The further processing will involve finding the gaps intersecting with the hotel contract period. Those that do intersect with it will actually be the intervals that you want to return, except the you may need to adjust the beginning of the first matching gap as well as the end of the last one, because those might go outside the contract period bounds.

This is the matching condition, which takes into account that both the contract period and the gaps are inclusive intervals:

gap.date_to >= contract.date_from AND gap.date_from <= contract.date_to

That is, a gap is considered as intersecting the contract period when its end either comes after or exactly matches the contract beginning while at the same time the gap's beginning either comes before or exactly matches the contract's end.

As I have already said, the first and the last matching gap may go partly outside the contract period, i.e. like this:

contract:    [                 ]
first gap: [    ]
last gap:                    [    ]

For the output you will want to modify them like this:

contract:    [                 ]
first gap:   [  ]
last gap:                    [ ]

This means you will need to calculate both the beginning and the end for each gap: take the latest between gap.date_from and contract.date_from as the beginning and the earliest between gap.date_to and contract.date_to as the end:

GREATEST(gap.date_from, contract.date_from)   AS date_from,
LEAST   (gap.date_to  , contract.date_to  )   AS date_to

Putting everything together, this is what you get:

SELECT
  GREATEST(gap.date_from, contract.date_from)   AS date_from,
  LEAST   (gap.date_to  , contract.date_to  )   AS date_to
FROM
  list_of_gaps AS gap
  CROSS JOIN contract
WHERE
  contract.company_id = @param_hotel_id
  AND gap.date_to >= contract.date_from
  AND gap.date_from <= contract.date_to
;

For the list_of_gaps you can directly use the first query as a derived table:

...
FROM
  (
    SELECT
      IFNULL(
    .
    .
    .
  ) AS gap
...

Or you can store that query's results in a temporary table first:

CREATE TEMPORARY TABLE tmp_gap_list
AS
SELECT
  IFNULL(
.
.
.

and use the temporary table as the list_of_gaps instead:

SELECT
  GREATEST(gap.date_from, contract.date_from)   AS date_from,
  LEAST   (gap.date_to  , contract.date_to  )   AS date_to
FROM
  tmp_gap_list AS gap
.
.
.

The latter option – doing the job in two distinct steps – might be faster, depending on whether MySQL will find the single-query solution too complex to come up with an efficient plan, and also depending on the amount of data in your tables. Test both options for yourself to choose the one that works best for you.