How to Compare the Same Column on Different Dates in PostgreSQL

postgresql postgresql-9.3

I have two tables

project_key: project_id, key_id
project_report: project_id, key_id, level, created_at

I would like to compare the level column between rows where created_at is today (now()::date) and rows where created_at is yesterday (now()::date - 1).

Sample data:

project_id: 1, keyword_id: 1, level: 2, created_at: today
project_id: 1, keyword_id: 2, level: 3, created_at: today
project_id: 1, keyword_id: 3, level: 5, created_at: today
project_id: 1, keyword_id: 4, level: 6, created_at: today

project_id: 1, keyword_id: 1, level: 1, created_at: yesterday
project_id: 1, keyword_id: 2, level: 6, created_at: yesterday
project_id: 1, keyword_id: 3, level: 2, created_at: yesterday
project_id: 1, keyword_id: 4, level: 6, created_at: yesterday
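
A minimal setup to reproduce the sample rows above could look like the following. The column types are assumptions (the question does not state them), created_at is assumed to be a plain date, and the key column is named keyword_id as in the sample data:

```sql
-- Hypothetical DDL: types and NOT NULL constraints are assumptions.
CREATE TABLE project_report (
  project_id integer NOT NULL,
  keyword_id integer NOT NULL,
  level      integer NOT NULL,
  created_at date    NOT NULL
);

INSERT INTO project_report (project_id, keyword_id, level, created_at) VALUES
  -- today's rows
  (1, 1, 2, current_date),
  (1, 2, 3, current_date),
  (1, 3, 5, current_date),
  (1, 4, 6, current_date),
  -- yesterday's rows
  (1, 1, 1, current_date - 1),
  (1, 2, 6, current_date - 1),
  (1, 3, 2, current_date - 1),
  (1, 4, 6, current_date - 1);
```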

In the end, a report should be produced. The expected output:

Legend:

increased: today < yesterday
decreased: today > yesterday
not_changed: today = yesterday

Output:

project_id: 1, increased: 1, decreased: 2, not_changed: 1

Also, when one of the levels (today's or yesterday's) is 0, the definitions of increased and decreased are reversed:

increased: today > yesterday
decreased: today < yesterday

Best Answer

A first-cut query, using a full outer self-join:

SELECT
  coalesce(project_today.project_id, project_yesterday.project_id) AS project_id,
  -- Keywords that exist on only one of the two days
  sum(CASE WHEN project_yesterday.keyword_id IS NULL THEN 1 END) AS new_today,
  sum(CASE WHEN project_today.keyword_id IS NULL THEN 1 END) AS deleted_today,
  -- Following the question's legend: a higher level today is a decrease, a lower one an increase
  sum(CASE WHEN project_today.level > project_yesterday.level THEN 1 END) AS decreased,
  sum(CASE WHEN project_today.level < project_yesterday.level THEN 1 END) AS increased,
  sum(CASE WHEN project_today.level = project_yesterday.level THEN 1 END) AS not_changed
FROM (
  SELECT *
  FROM project_report
  WHERE created_at = current_date)
AS project_today
FULL OUTER JOIN (
  SELECT *
  FROM project_report
  WHERE created_at = current_date - INTERVAL '1' DAY
) AS project_yesterday
ON (project_today.project_id = project_yesterday.project_id
    AND project_today.keyword_id = project_yesterday.keyword_id)
-- Group by the first output column; a bare project_id would be ambiguous here
GROUP BY 1;

This query also handles keywords that were deleted or newly added. If you don't care about those, you can simplify it to an inner join, which only counts keywords that are present on both days:

SELECT
  project_today.project_id AS project_id,
  -- Following the question's legend: a higher level today is a decrease, a lower one an increase
  sum(CASE WHEN project_today.level > project_yesterday.level THEN 1 END) AS decreased,
  sum(CASE WHEN project_today.level < project_yesterday.level THEN 1 END) AS increased,
  sum(CASE WHEN project_today.level = project_yesterday.level THEN 1 END) AS not_changed
FROM project_report project_today
INNER JOIN project_report project_yesterday
ON (project_today.project_id = project_yesterday.project_id
    AND project_today.keyword_id = project_yesterday.keyword_id)
WHERE project_today.created_at = current_date
  AND project_yesterday.created_at = current_date - INTERVAL '1' DAY
-- Qualify the column; both sides of the join have a project_id
GROUP BY project_today.project_id;
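
The question also asks for the comparison to flip when either level is 0. One way to fold that in is to classify each keyword in a subquery and count the classes outside. This is only a sketch: it assumes the project_report table and keyword_id column from the question, that created_at is a date, that level is never NULL, and that the legend means "a lower level is an increase unless one side is 0". The alias names and the trend column are made up for illustration. count(...) is used instead of sum(...) so that empty buckets show 0 rather than NULL.

```sql
SELECT
  project_id,
  count(CASE WHEN trend = 'increased'   THEN 1 END) AS increased,
  count(CASE WHEN trend = 'decreased'   THEN 1 END) AS decreased,
  count(CASE WHEN trend = 'not_changed' THEN 1 END) AS not_changed
FROM (
  SELECT
    t.project_id,
    CASE
      WHEN t.level = y.level THEN 'not_changed'
      -- One side is 0: the rule flips, a higher level today is an increase.
      WHEN t.level = 0 OR y.level = 0 THEN
        CASE WHEN t.level > y.level THEN 'increased' ELSE 'decreased' END
      -- Both sides non-zero: a lower level today is an increase.
      WHEN t.level < y.level THEN 'increased'
      ELSE 'decreased'
    END AS trend
  FROM project_report AS t
  INNER JOIN project_report AS y
    ON y.project_id = t.project_id
   AND y.keyword_id = t.keyword_id
  WHERE t.created_at = current_date
    AND y.created_at = current_date - 1
) AS compared
GROUP BY project_id;
```

Against the sample rows in the question (none of which has a level of 0), this classifies keyword 2 as increased, keywords 1 and 3 as decreased and keyword 4 as not_changed, which matches the expected output.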

If you don't care about "yesterday", but rather "the previous entry", you can probably use a window function in a subquery, but it'll be fiddly to write without sample data and schema.
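
A rough sketch of that window-function variant, using lag() to fetch the previous entry per keyword. It assumes the project_report table and keyword_id column from the question, at most one row per (project_id, keyword_id, created_at), and the question's legend (a lower level is an increase). The zero-level rule is not handled here, and the names prev_level, rn, w and ranked are made up for illustration.

```sql
SELECT
  project_id,
  count(CASE WHEN level < prev_level THEN 1 END) AS increased,
  count(CASE WHEN level > prev_level THEN 1 END) AS decreased,
  count(CASE WHEN level = prev_level THEN 1 END) AS not_changed
FROM (
  SELECT
    project_id,
    keyword_id,
    level,
    -- Level of the entry just before this one, for the same keyword
    lag(level) OVER w AS prev_level,
    -- 1 marks the most recent entry for each keyword
    row_number() OVER (PARTITION BY project_id, keyword_id
                       ORDER BY created_at DESC) AS rn
  FROM project_report
  WINDOW w AS (PARTITION BY project_id, keyword_id ORDER BY created_at)
) AS ranked
WHERE rn = 1
GROUP BY project_id;
```

Keywords with only a single entry have a NULL prev_level, so they fall into none of the three counts.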