According to your current explanation, indexes are not going to help much (if at all) with your current query.

So there are 3.25M rows with a project key.

That's also the total number of rows, so this predicate is true for (almost) every row ... and not selective at all. But there is no useful index for the jsonb column "references". Including it in the btree index on ("createdAt", "references", type) is just pointless.
Even if you had a generally more useful GIN index on "references" like:

CREATE INDEX stats_references_gix ON stats USING gin ("references");

... Postgres would still have no useful statistics about individual keys inside the jsonb column.
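An expression index on the extracted key is different in this regard: Postgres does gather statistics on the indexed expression after ANALYZE, so the planner gets usable estimates for it. A minimal sketch (the index name is made up):

CREATE INDEX stats_project_expr_idx ON stats (("references" ->> 'project'));
ANALYZE stats;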
There are only 5 distinct values for the type
Your query selects all of one type and an unknown fraction of another type. That's an estimated 20 - 40 % of all rows. A sequential scan is most certainly going to be the fastest plan. Indexes start to make sense for around 5 % of all rows or less.
To test, you can force the use of a possible index for debugging purposes in your session:
SET enable_seqscan = off;
Reset with:
RESET enable_seqscan;
You'll see slower queries ...
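A typical debugging round-trip might look like this, with an example query standing in for yours:

SET enable_seqscan = off;

EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM stats WHERE "references" ? 'project';  -- substitute your query

RESET enable_seqscan;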
You group by project values:
GROUP BY "references"->> 'project'
And:
There are exactly 400,000 distinct values for the project reference.
That's 8 rows per project on average. Depending on value frequencies, we would still have to retrieve an estimated 3 - 20 % of all rows even if we only picked min and max per project in a LATERAL subquery ...
Try this index; it makes more sense than what you have now:
CREATE INDEX stats_special_idx ON stats (type, ("references" ->> 'project'), "createdAt")
WHERE "references" ? 'project';
Postgres might still fall back to a sequential scan ...
More might be done with a normalized schema, more selective criteria, or a smarter query that only picks min and max "createdAt" ...
Query
I would write your query like this:
SELECT EXTRACT(EPOCH FROM (MAX("createdAt") - MIN("createdAt")))
FROM   stats
WHERE  "references" ? 'project'
AND   (type = 'event1'
    OR type = 'event2'
       AND "createdAt" >= '2015-11-02 00:00:00+08:00'  -- I guess you want this
       AND "createdAt" <  '2015-12-04 00:00:00+08:00')
GROUP  BY "references" ->> 'project';  -- don't cast
Notes
Don't cast here:

stats."references"::jsonb ? 'project'

The column is jsonb already; you gain nothing. If the predicate were selective, index usage might be prohibited by the cast.
Your predicates on "createdAt" are probably incorrect at the lower and upper bound. To include whole days, consider my suggested alternative.
references is a reserved word, so you always have to double-quote it. Don't use it as an identifier. The same goes for double-quoted CaMeL-case names like "createdAt": allowed, but error-prone and a needless complication.
type
type character varying(255) NOT NULL,
The type is a simple string no longer than 50 characters.
There are only 5 distinct values for the type field; this would probably be no more than a few hundred in production.
None of this seems to make sense.
- varchar(255) in itself hardly ever makes any sense. 255 characters is an arbitrary limit without significance in Postgres.
- If it's no longer than 50 characters, then the limit of 255 makes even less sense.
- In a properly normalized design you would have a small integer column type_id (referencing a small type table) which occupies only 4 bytes per row and makes indexes smaller and faster. See the sketch below.
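A minimal sketch of that normalization; all names are assumptions for illustration:

-- hypothetical lookup table for the handful of distinct type values
CREATE TABLE type (
   type_id serial PRIMARY KEY
 , type    text NOT NULL UNIQUE
);

-- stats then carries a small FK column instead of varchar(255)
ALTER TABLE stats ADD COLUMN type_id int REFERENCES type;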
Ideally, you would have a project table, listing all projects, and another small integer FK column project_id in stats. That would make any such query faster. And for selective criteria, much faster queries would be possible - even without the suggested normalization. Along these lines:
Optimize GROUP BY query to retrieve latest record per user
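For illustration, a hypothetical sketch along those lines, assuming the suggested project table with a text column project matching "references" ->> 'project' (LATERAL requires Postgres 9.3+; the time-range predicate is simplified here):

SELECT p.project
     , EXTRACT(EPOCH FROM (s.max_ts - s.min_ts)) AS duration
FROM   project p
CROSS  JOIN LATERAL (
   SELECT min("createdAt") AS min_ts
        , max("createdAt") AS max_ts
   FROM   stats
   WHERE  "references" ->> 'project' = p.project
   AND    type IN ('event1', 'event2')
   ) s;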
I think you found the main culprit alright. There are more ways to miss rows with this setup. Consider D3, which I added to your time diagram:
D1 |st1....................|
D2      |st2......|
D3           |st3......|
R                |.......|
                 ^ -- read query starts here
Transaction D3 starts after D2, but since it's not committed when your read query in transaction R starts, it is invisible as well. It will be included in the next page, but it may still be confusing that it's missing this time. That's how the default transaction isolation level Read Committed is defined. You seem to be aware of it, but for the general public:
When a transaction uses this isolation level, a SELECT query (without a FOR UPDATE/SHARE clause) sees only data committed before the query began; it never sees either uncommitted data or changes committed during query execution by concurrent transactions. In effect, a SELECT query sees a snapshot of the database as of the instant the query begins to run.
There is more. Read the manual.
With concurrent access to the DB, various conflicts might stall a DELETE query - which will patiently wait until it can get an exclusive lock on the rows to delete. 10 minutes of missed rows would indicate a (possibly unrelated) problem; I agree with your suspicion. Transactions that remain "idle in transaction" are prime suspects. To diagnose, look for (or log) long-running statements or transactions. Some pointers:
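For instance, a quick check for lingering transactions (column names as in pg_stat_activity since Postgres 9.2):

SELECT pid, state, xact_start, query_start, query
FROM   pg_stat_activity
WHERE  state = 'idle in transaction'
ORDER  BY xact_start;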
Database design
You seem to be using data type timestamp. Your INSERT with statement_timestamp() AT TIME ZONE 'UTC' should work correctly for your purpose, but you shouldn't rely on INSERT statements to get this right. Rather, move the logic to the table itself: use the data type timestamptz (timestamp with time zone) for tombstone.delete_timestamp to begin with. This rules out any discrepancies with time zones or DST (daylight saving time). Details:

Also, don't rely on the INSERT statement to provide the correct time. Put the logic in the column default (or even in a trigger BEFORE INSERT to overwrite any provided timestamp).
CREATE TABLE tombstone (
id int NOT NULL PRIMARY KEY -- assuming PK
, delete_timestamp timestamptz DEFAULT statement_timestamp()
);
Since timestamps are not visible until the transaction is committed anyway, the cheaper current_timestamp (transaction timestamp) may be just as useful as statement_timestamp() - depends on unknown details.
The simpler INSERT statement in the trigger to go with the above:
INSERT INTO tombstone (id) VALUES (OLD.id);
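And a sketch of the BEFORE INSERT trigger mentioned above, overwriting any timestamp the client might provide (function and trigger names are made up):

CREATE OR REPLACE FUNCTION tombstone_set_ts()
  RETURNS trigger
  LANGUAGE plpgsql AS
$func$
BEGIN
   NEW.delete_timestamp := statement_timestamp();  -- ignore client-provided value
   RETURN NEW;
END
$func$;

CREATE TRIGGER tombstone_set_ts
BEFORE INSERT ON tombstone
FOR EACH ROW EXECUTE PROCEDURE tombstone_set_ts();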
Query
Your current SELECT query can be faster with a row value comparison, which is also index-backed in Postgres:
SELECT id, delete_timestamp
FROM tombstone
WHERE (delete_timestamp, id) > (@sinceUTC, @id)
ORDER BY delete_timestamp, id
LIMIT @count;
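For the row comparison to actually be index-backed, you'd want a matching multicolumn index (the index name is an assumption):

CREATE INDEX tombstone_ts_id_idx ON tombstone (delete_timestamp, id);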
Details:
Secure alternative
All this aside, timestamps are just not completely reliable for queuing or paging in a multi-user environment. If you need it to be completely reliable, consider a different approach: lock unread rows before retrieving and mark them as read. The new FOR UPDATE SKIP LOCKED in Postgres 9.5 may prove useful for this:
Or, if the reading is done by a single user, you don't need locks. Just mark rows you have read. A partial index for unmarked rows makes reading fast.
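A sketch combining both ideas; the boolean processed flag and the index name are assumptions, and FOR UPDATE SKIP LOCKED requires Postgres 9.5+:

-- partial index covering only unread rows
CREATE INDEX tombstone_unread_idx ON tombstone (delete_timestamp, id)
WHERE  NOT processed;

-- grab a batch of unread rows, skipping rows locked by concurrent readers,
-- and mark them as read in the same statement
WITH batch AS (
   SELECT id
   FROM   tombstone
   WHERE  NOT processed
   ORDER  BY delete_timestamp, id
   LIMIT  100
   FOR    UPDATE SKIP LOCKED
   )
UPDATE tombstone t
SET    processed = true
FROM   batch b
WHERE  t.id = b.id
RETURNING t.id, t.delete_timestamp;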
Upgrade to current Postgres
Postgres 9.2 is getting old. Upgrading to the current version might help for various reasons.
Best Answer
So, I got it figured out and I'm leaving the answer for posterity.
Essentially, remove the * from the query and add a LIMIT 1, and it all appears to work: the query then returns a single, latest time. In addition, as @ypercubetm stated, you can use AND in place of OR, but that depends on your use case.