I need to update the "oldest" row in the database (for now, its limit 1, but I need to also be able to set it to n
).
I'm essentially doing a constant stream of "update the oldest rows and retrieve them at the same time".
The column is indexed, it's called last_sent_at
and I'm calling it ~20K times a minute.
The thing is, last_sent_at
can be NULL
and that should always be favored over ones with a filled in time. Once all of them are filled in it should choose the "oldest".
This works for me, the problem is I don't know if its efficient, and it also doesn't work when I need to update 5 rows at once. I guess I could use IN
I'm just afraid it would be inefficient.
UPDATE subscribers
SET last_sent_at = '2018-11-17 00:02:27'
WHERE id = (
SELECT id
FROM subscribers
ORDER BY last_sent_at NULLS FIRST
LIMIT 1
)
RETURNING id;
Best Answer
If the table is big, the key to performance is a matching index on
(last_sent_at NULLS FIRST)
.Since you are updating so much, have aggressive
VACUUM
settings. This can be adjusted per table. See:For only a single row,
UPDATE
with a subquery like you have is fine.For
LIMIT N
with N > 1 a CTE is your safe bet. Like:If you do this under concurrent write load, add another expression to your
ORDER BY
to make the order deterministic. And add the same expression to your index to match! (There can be many rows withNULL
, right?) And use the same order for all commands taking row locks in the same table. Else you might face deadlocks from multiple transaction taking locks in arbitrary order. Better yet, useFOR UPDATE SKIP LOCKED
if you are not absolutely bound to the strict order. Closely related, with in-depth explanation:Related: