PostgreSQL – Correct Way to Select Data Grouped by Time Interval

date, postgresql

I need to display a chart that shows how many ticket sales happen at different time intervals before an event on different dates, e.g. how many tickets for a Friday show are sold 3 hours in advance.

The end result will look something like:

[image: chart]

I have written a query that calculates the count for each offset in the SELECT list:

WITH
  -- Create a date range; use LEFT JOIN/ COALESCE to construct date specific report; we don't want gaps in the data
  dates AS (
    SELECT generate_series(
      '2018-04-26',
      '2018-04-28',
      INTERVAL '1 day'
    ) AS date
  ),
  sales AS (
    SELECT
      ts1.date,
      SUM(CASE WHEN ts1.event_starts_at - ts1.created_at < INTERVAL '0 hour' THEN 1 ELSE 0 END) after_event_start,
      SUM(CASE WHEN ts1.event_starts_at - ts1.created_at BETWEEN INTERVAL '0 hour' AND INTERVAL '1 hour' THEN 1 ELSE 0 END) hour_0_to_hour_1,
      SUM(CASE WHEN ts1.event_starts_at - ts1.created_at BETWEEN INTERVAL '1 hour' AND INTERVAL '2 hour' THEN 1 ELSE 0 END) hour_1_to_hour_2,
    -- [..]
      SUM(CASE WHEN ts1.event_starts_at - ts1.created_at BETWEEN INTERVAL '22 hour' AND INTERVAL '23 hour' THEN 1 ELSE 0 END) hour_22_to_hour_23,
      SUM(CASE WHEN ts1.event_starts_at - ts1.created_at BETWEEN INTERVAL '23 hour' AND INTERVAL '24 hour' THEN 1 ELSE 0 END) hour_23_to_hour_24,
      SUM(CASE WHEN ts1.event_starts_at - ts1.created_at > INTERVAL '24 hour' THEN 1 ELSE 0 END) after_24_hour
    FROM ticket_sale ts1
    WHERE
      ts1.movie_id = 1012718 AND
      ts1.starts_at > '2018-04-26' AND
      ts1.starts_at < '2018-04-28'::date + INTERVAL '1 day'
    GROUP BY ts1.date
  )
SELECT
  to_char(d1.date, 'YYYY-MM-DD') "date",
  COALESCE(s1.hour_0_to_hour_1, 0) hour_0_to_hour_1,
  COALESCE(s1.hour_1_to_hour_2, 0) hour_1_to_hour_2,
  -- [..]
  COALESCE(s1.hour_22_to_hour_23, 0) hour_22_to_hour_23,
  COALESCE(s1.hour_23_to_hour_24, 0) hour_23_to_hour_24,
  COALESCE(s1.after_24_hour, 0) after_24_hour
FROM dates d1
LEFT JOIN sales s1 ON s1.date = to_char(d1.date, 'YYYY-MM-DD')
ORDER BY d1.date

This works. However, as a query it looks odd.

Is there a better way to count sales at different time intervals?

Best Answer

I really overcomplicated this.

As suggested by @Gaius, all I needed to do was query the data grouped by the number of hours between the purchase and the start of the event, i.e.

SELECT
  ts1.date,
  ROUND(EXTRACT(EPOCH FROM (ts1.starts_at - ts1.created_at))/3600) "hour",
  COUNT(*) sales_count
FROM ticket_sale ts1
WHERE
  ts1.movie_id = 1012718 AND
  ts1.date = '2018-04-26'
GROUP BY ts1.date, hour

I can post-process the data on the client side to group it into the required intervals.
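For illustration, the client-side grouping might look something like the Python sketch below. It assumes the rows arrive as (date, hour, sales_count) tuples in the shape the query returns. The function names `bucket_label` and `group_sales` and the sample rows are made up for this example, and the bucket labels simply mirror the column names from the question. Note that the query uses ROUND, so each hour value is centred on the hour; using FLOOR in the query instead would make the values line up with the 0-to-1, 1-to-2, ... intervals.

```python
from collections import defaultdict


def bucket_label(hour):
    """Map an hour offset (event start minus purchase time) to a bucket label.

    The labels are hypothetical and mirror the column names in the question.
    """
    if hour < 0:
        return "after_event_start"
    if hour >= 24:
        return "after_24_hour"
    return f"hour_{hour}_to_hour_{hour + 1}"


def group_sales(rows):
    """Aggregate (date, hour, sales_count) rows into per-date bucket counts."""
    report = defaultdict(lambda: defaultdict(int))
    for date, hour, sales_count in rows:
        # The driver may return the hour as a float or Decimal; int() normalises it.
        report[date][bucket_label(int(hour))] += sales_count
    return {date: dict(buckets) for date, buckets in report.items()}


# Made-up sample rows in the shape the query returns.
rows = [
    ("2018-04-26", -1, 2),
    ("2018-04-26", 0, 5),
    ("2018-04-26", 3, 7),
    ("2018-04-26", 30, 4),
    ("2018-04-26", 48, 1),
]
result = group_sales(rows)
```

Dates or buckets with no sales are simply absent from the result, so the chart code would still need to fill those gaps with zeros, much as the LEFT JOIN / COALESCE does in the original query.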