Postgresql update with a subquery limit 1 that has joins sometimes doesn’t respect the limit

concurrencypostgresqlupdate

As per Postgres UPDATE … LIMIT 1 I'm trying to update a single record but based on a join. However, ~30% of the time on the pre-installed postgresql 9.5.5 from travis-ci the following SQL actually returns multiple records. Is there something I'm misunderstanding about this query like it's running one instance of the subquery for each tuple the update sees or is this a potential bug in postgresql?

UPDATE "users" SET touched_at = now() 
FROM (
   SELECT "some_table".* 
   FROM "some_table" 
     INNER JOIN "users" ON "users"."id" = "some_table"."user_id"
   WHERE "some_table"."handled_at" IS NULL 
     AND ("users".touched_at IS NULL OR "users".touched_at < '2016-12-21') 
  ORDER BY "some_table"."created_at" ASC 
  LIMIT 1 
  FOR UPDATE OF "users"
) dt 
WHERE "users".id = dt.user_id 
AND ("users".touched_at IS NULL OR "users".touched_at < '2016-12-21') 
RETURNING dt.*

Best Answer

In March Erwin Brandstetter revised his answer to mention "The planner may choose to generate a plan that executes a nested loop over the LIMITing subquery, causing more UPDATEs than LIMIT": https://dba.stackexchange.com/posts/69497/revisions

There appear to be two options that actually work for limiting the number of rows updated:

  1. an uncorrelated scalar subquery when you only need a single row, e.g. UPDATE users SET touched_at = now() WHERE users.id = (SELECT id FROM users LIMIT 1)

OR

  1. if you need multiple rows then a common table expression (CTE) which is always materialized once and is a fence to the optimizer in postgresql (https://blog.2ndquadrant.com/postgresql-ctes-are-optimization-fences/), e.g. WITH cte AS (SELECT id FROM users LIMIT 10) UPDATE users SET touched_at = now() WHERE users.id = cte.id