Postgresql – calculate days spent of all visitors from encounters

datedatetimepostgresqlquerytime

I have an encounters table

CREATE TABLE visitors(
 id serial PRIMARY KEY,
 start_date TIMESTAMP NOT NULL,
 --  visitor_id bigint NOT NULL -- skip for now,
 end_date TIMESTAMP
);

(end_date can be null when visitor is still inside)

and two dates(start_dt and end_dt) are provided on which I need to calculate how many total days all visitors have spent in a certain given date range. eg

  • if from jan 1 2019 till jan 7, 100 visitors were present, these are 7 * 100 = 700 visitor days
  • if on jan 1, there were 10 visitors and on jan 2, there were 20 visitors then 1 * 10 + 1 * 20 = 30 visitor days for these given dates

Note: dates will be inclusive

Any help in building this query is very much appreciated(I am not able to achieve this query. I am also newbie in this area). thanks

Edit: So far I have tried/reached this

(WITH dates AS ( SELECT start_date, COALESCE(end_date::date, '2019-01-01') AS end_date FROM "visitors"),
    calendar AS (
    SELECT
    i :: DATE AS DATE,
    date_trunc( 'month', i ) :: DATE AS MONTH
    FROM
    generate_series ( '2018-01-01' , '2019-01-01', '1 day' :: INTERVAL ) i
    )
    SELECT COUNT ( * ) days FROM dates d
    INNER JOIN calendar C ON C.DATE BETWEEN d.start_date AND d.end_date);

Edit: I see some answers e.g age difference but these calculate age based on difference and not both dates being inclusive

Best Answer

I have achieve required results with this query

(WITH params (start_dt, end_dt) as (values ('2019-02-14'::timestamp, '2019-02-18'::timestamp)),
  dates AS ( SELECT (CASE
                      WHEN (visitors.start_date < params.start_dt) THEN params.start_dt
                      ELSE visitors.start_date END)                        AS start_date,
                    (CASE
                      WHEN (COALESCE(visitors.end_date, params.end_dt) > params.end_dt) THEN params.end_dt
                      ELSE COALESCE(visitors.end_date, params.end_dt) END) AS end_date,
                    visitors.id                                             as visitor_id
            FROM "visitors",
                  params)
  SELECT date_part('day', age(dates.end_date, dates.start_date)) days,
        dates.end_date,
        dates.start_date,
        dates.visitor_id
  FROM dates,
      params
  where dates.start_date >= start_dt
    AND dates.end_date <= end_dt);

Note: I would love to edit this answer(if wrong or needs update) or will appreciate if someone has better answer