Postgresql – Deletion of orphaned rows in a violation-free & non-blocking manner: predictive order of execution of predicates in case of double-checked locking

cascadeconcurrencyforeign keylockingpostgresql

I have tables person and toy. There is a foreign key person.favorite_toy_id from person to toy. This foreign key is declared as ON DELETE RESTRICT. I now want to delete all toys which are no longer declared as favorite in a violation-free and non-blocking manner:

  • I want to delete as much orphaned toys as possible, while avoiding a foreign key violation because we attempt to delete a toy which is still in use.
  • I do not want to wait for other ongoing transactions, which are possibly introducing a reference to a toy (which requires a key share lock) or simply updating a toy (which requires a (no key) update lock), to finish. Both lock types block our update lock request, required to delete the toy.

The first, naive approach would be:

delete from toy
where       not exists(select 1 from person where person.favorite_toy_id = toy.id)

This will not work in a concurrent environment: after the completion of the not exists predicate, a concurrent transaction could declare the toy in question as favorite. In such case, we end up with a foreign key violation. Also, as said, I prefer this delete to happen in a non-blocking fashion, which is not attempted in this query.

So, my second approach in attempt to avoid this foreign key violation and any blocking is:

delete from toy
where       toy.id in
            (
              select toy.id
              from   toy
              where  not exists(select 1 from person where person.favorite_toy_id = toy.id)
              for update skip locked
            )

However, this doesn't solve the requirement to avoid the foreign key violation, because the lock is taken after the evaluation of the not exists predicate. So there is a small chance that we attempt to delete a toy which is still marked as favorite, resulting in a foreign key violation.

My third attempt to fix this is the following:

delete from toy
where       toy.id in
            (
              select toy.id
              from   toy
              where  not exists(select 1 from person where person.favorite_toy_id = toy.id)
              for update skip locked
            ) and
            not exists(select 1 from person where person.favorite_toy_id = toy.id)

This applies double-checked locking (https://en.wikipedia.org/wiki/Double-checked_locking). This would work if and only if we have the guarantee that the subquery is always evaluated before the additional not exists predicate. As far as I know, there is no such guarantee.

My question is rather educational: can this be solved in a pure SQL query? We can of course implement this in a plpgsql function as shown below, but let's assume we want to solve this in one single plain SQL query.

create function prune_toys() returns void as
$$
declare
  _id int;
begin
  for _id in select toy.id from toy where not exists(...) for update skip locked loop
    delete from toy where toy.id = _id and not exists(...);
  end loop;
end;
$$
language plpgsql;

In all this, I assume the read committed transaction isolation level.

Best Answer

This is more a business process / UI race. It is often solved by deprecating / hiding values for a period before deleting and/or not accepting them as valid anymore.

One approach would be to add a DateDeprecated column to toys. Update it to the current date when you check and nobody has this toy as a favorite. Clear the field if/when the toy is selected as a favorite. In the UI don't show/list deprecated toys at all or don't show toys deprecated for a certain period of time like over 7 days. At some point delete long deprecated toys, say after they have been deprecated for 14 days.