Mat and Erwin are both right, and I'm only adding another answer to further expand on what they said in a way which won't fit in a comment. Since their answers don't seem to satisfy everyone, and there was a suggestion that PostgreSQL developers should be consulted, and I am one, I will elaborate.
The important point here is that under the SQL standard, within a transaction running at the READ COMMITTED transaction isolation level, the restriction is that the work of uncommitted transactions must not be visible. When the work of committed transactions becomes visible is implementation-dependent. What you are pointing out is a difference in how two products have chosen to implement that. Neither implementation is violating the requirements of the standard.
Here's what happens within PostgreSQL, in detail:
S1-1 runs (1 row deleted)
The old row is left in place, because S1 might still roll back, but S1 now holds a lock on the row, so any other session attempting to modify the row will wait to see whether S1 commits or rolls back. Any reads of the table can still see the old row, unless they attempt to lock it with SELECT FOR UPDATE or SELECT FOR SHARE.
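To make this concrete, here is a minimal two-session sketch of that locking behavior. The table and column names (a table t with primary key id) are illustrative assumptions, not from the original scenario:

```sql
-- Hypothetical setup: CREATE TABLE t (id int PRIMARY KEY, val text);

-- Session S1
BEGIN;
DELETE FROM t WHERE id = 1;   -- old row version remains on disk; S1 holds a row lock

-- Session S2, concurrently
SELECT * FROM t WHERE id = 1;             -- still sees the old row (snapshot read)
SELECT * FROM t WHERE id = 1 FOR UPDATE;  -- blocks until S1 commits or rolls back
```

Note that the plain SELECT returns immediately; only the locking read has to wait for S1's outcome.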
S2-1 runs (but is blocked since S1 has a write lock)
S2 now has to wait to see the outcome of S1. If S1 were to roll back rather than commit, S2 would delete the row. Note that if S1 inserted a new version before rolling back, the new version would never have been there from the perspective of any other transaction, nor would the old version have been deleted from the perspective of any other transaction.
S1-2 runs (1 row inserted)
This row is independent of the old one. If there had been an update of the row with id = 1, the old and new versions would be related, and S2 could delete the updated version of the row when it became unblocked. That a new row happens to have the same values as some row that existed in the past doesn't make it the same as an updated version of that row.
S1-3 runs, releasing the write lock
So S1's changes are persisted. One row is gone. One row has been added.
S2-1 runs, now that it can get the lock. But reports 0 rows deleted. HUH???
What happens internally is that there is a pointer from one version of a row to the next version of that same row, if it is updated. If the row is deleted, there is no next version. When a READ COMMITTED transaction awakens from a block on a write conflict, it follows that update chain to the end; if the row has not been deleted and it still meets the selection criteria of the query, it will be processed. This row has been deleted, so S2's query moves on.
S2 may or may not get to the new row during its scan of the table. If it does, it will see that the new row was created after S2's DELETE statement started, and so it is not part of the set of rows visible to it.
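By contrast, if S1 had updated the row rather than deleting it and inserting a new one, S2 would follow the update chain to the committed new version. A sketch, again assuming a hypothetical table t with primary key id containing a row with id = 1:

```sql
-- Session S1
BEGIN;
UPDATE t SET val = 'new' WHERE id = 1;  -- new version is linked to the old one

-- Session S2 (blocks on S1's row lock)
DELETE FROM t WHERE id = 1;

-- Session S1
COMMIT;

-- S2 now wakes, follows the update chain to the committed new version,
-- re-checks its WHERE clause against it, and deletes it: reports 1 row deleted.
```

This is exactly why the DELETE + INSERT in the original scenario reports 0 rows deleted: the freshly inserted row is not part of any update chain reachable from the deleted row.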
If PostgreSQL were to restart S2's entire DELETE statement from the beginning with a new snapshot, it would behave the same as SQL Server. The PostgreSQL community has not chosen to do that, for performance reasons. In this simple case you would never notice the difference in performance, but if you were ten million rows into a DELETE when you got blocked, you certainly would. There is a trade-off here where PostgreSQL has chosen performance, since the faster version still complies with the requirements of the standard.
S2-2 runs, reports a unique key constraint violation
Of course, the row already exists. This is the least surprising part of the picture.
While there is some surprising behavior here, everything is in conformance with the SQL standard and within the bounds of what is "implementation-specific" according to the standard. It certainly can be surprising if you are assuming that some other implementation's behavior will be present in all implementations, but PostgreSQL tries very hard to avoid serialization failures in the READ COMMITTED isolation level, and allows some behaviors that differ from other products in order to achieve that.
Now, personally I'm not a big fan of the READ COMMITTED transaction isolation level in any product's implementation. They all allow race conditions to create surprising behaviors from a transactional point of view. Once someone becomes accustomed to the weird behaviors allowed by one product, they tend to consider that "normal" and the trade-offs chosen by another product odd. But every product has to make some sort of trade-off for any mode not actually implemented as SERIALIZABLE. Where PostgreSQL developers have chosen to draw the line in READ COMMITTED is to minimize blocking (reads don't block writes and writes don't block reads) and to minimize the chance of serialization failures.
The standard requires that SERIALIZABLE transactions be the default, but most products don't do that because it causes a performance hit compared to the more lax transaction isolation levels. Some products don't even provide truly serializable transactions when SERIALIZABLE is chosen -- most notably Oracle and versions of PostgreSQL prior to 9.1. But using truly SERIALIZABLE transactions is the only way to avoid surprising effects from race conditions, and SERIALIZABLE transactions must always either block to avoid race conditions or roll back some transactions to avoid a developing race condition. The most common implementation of SERIALIZABLE transactions is Strict Two-Phase Locking (S2PL), which has both blocking and serialization failures (in the form of deadlocks).
Full disclosure: I worked with Dan Ports of MIT to add truly serializable transactions to PostgreSQL version 9.1 using a new technique called Serializable Snapshot Isolation.
If I understand the request correctly, the goal is to delete batches of rows while DML operations are occurring on rows throughout the table. We want to delete a batch; however, if any rows within the range defined by that batch are locked, we must skip that batch and move on to the next one. We must then return to any batches that were not previously deleted and retry our original delete logic, repeating this cycle until all required batches of rows are deleted.
As has been mentioned, it is reasonable to use a READPAST hint and the READ COMMITTED (default) isolation level, in order to skip past ranges that may contain blocked rows. I will go a step further and recommend using the SERIALIZABLE isolation level and nibbling deletes.
SQL Server uses key-range locks to protect a range of rows implicitly included in a record set being read by a Transact-SQL statement while using the serializable transaction isolation level. You can find more here:
https://technet.microsoft.com/en-US/library/ms191272(v=SQL.105).aspx
With nibbling deletes, our goal is to isolate a range of rows and ensure that no changes will occur to those rows while we are deleting them, that is to say, we do not want phantom reads or insertions. The serializable isolation level is meant to solve this problem.
Before I demonstrate my solution, I would like to add that I am neither recommending switching your database's default isolation level to SERIALIZABLE nor claiming that my solution is the best. I merely wish to present it and see where we can go from here.
A few house-keeping notes:
- The SQL Server version that I am using is Microsoft SQL Server 2012 - 11.0.5343.0 (X64)
- My test database is using the FULL recovery model
To begin my experiment, I will set up a test database, a sample table, and I will fill the table with 2,000,000 rows.
USE [master];
GO
SET NOCOUNT ON;
IF DATABASEPROPERTYEX (N'test', N'Version') > 0
BEGIN
ALTER DATABASE [test] SET SINGLE_USER
WITH ROLLBACK IMMEDIATE;
DROP DATABASE [test];
END
GO
-- Create the test database
CREATE DATABASE [test];
GO
-- Set the recovery model to FULL
ALTER DATABASE [test] SET RECOVERY FULL;
-- Create a FULL database backup
-- in order to ensure we are in fact using
-- the FULL recovery model
-- I pipe it to dev null for simplicity
BACKUP DATABASE [test]
TO DISK = N'nul';
GO
USE [test];
GO
-- Create our table
IF OBJECT_ID('dbo.tbl','U') IS NOT NULL
BEGIN
DROP TABLE dbo.tbl;
END;
CREATE TABLE dbo.tbl
(
c1 BIGINT IDENTITY (1,1) NOT NULL
, c2 INT NOT NULL
) ON [PRIMARY];
GO
-- Insert 2,000,000 rows
INSERT INTO dbo.tbl
SELECT TOP 2000
number
FROM
master..spt_values
ORDER BY
number
GO 1000
At this point, we will need one or more indexes upon which the locking mechanisms of the SERIALIZABLE isolation level can act.
-- Add a clustered index
CREATE UNIQUE CLUSTERED INDEX CIX_tbl_c1
ON dbo.tbl (c1);
GO
-- Add a non-clustered index
CREATE NONCLUSTERED INDEX IX_tbl_c2
ON dbo.tbl (c2);
GO
Now, let us check that our 2,000,000 rows were created:
SELECT
COUNT(*)
FROM
tbl;
So, we have our database, table, indexes, and rows. Now, let us set up the experiment for nibbling deletes. First, we must decide how best to create a typical nibbling delete mechanism.
DECLARE
@BatchSize INT = 100
, @LowestValue BIGINT = 20000
, @HighestValue BIGINT = 20010
, @DeletedRowsCount BIGINT = 0
, @RowCount BIGINT = 1;
SET NOCOUNT ON;
-- No GO here: a batch separator would end the batch, and the variables
-- declared above would be out of scope in the loop below.
WHILE @DeletedRowsCount < ( @HighestValue - @LowestValue )
BEGIN
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRANSACTION
DELETE
FROM
dbo.tbl
WHERE
c1 IN (
SELECT TOP (@BatchSize)
c1
FROM
dbo.tbl
WHERE
c1 BETWEEN @LowestValue AND @HighestValue
ORDER BY
c1
);
SET @RowCount = ROWCOUNT_BIG();
COMMIT TRANSACTION;
SET @DeletedRowsCount += @RowCount;
WAITFOR DELAY '000:00:00.025';
CHECKPOINT;
END;
As you can see, I placed the explicit transaction inside the while loop. If you would like to limit log flushes, then feel free to place it outside the loop. Furthermore, since we are in the FULL recovery model, you may wish to create transaction log backups more often while running your nibbling delete operations, in order to keep your transaction log from growing excessively.
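For example, you could interleave log backups with the delete batches. The backup destination below is purely illustrative; substitute your own backup location:

```sql
-- Illustrative only: back up the transaction log between nibbles
-- so that, under the FULL recovery model, the log can be reused
-- rather than growing for the duration of the delete.
BACKUP LOG [test]
    TO DISK = N'X:\Backups\test_log.trn';
```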
So, I have a couple of goals with this setup. First, I want my key-range locks, so I try to keep the batches as small as possible. I also do not want to negatively impact concurrency on my "gigantic" table, so I want to take my locks and release them as fast as I can. Hence, I recommend that you make your batch sizes small.
Now, I want to provide a very short example of this deletion routine in action. We must open a new window within SSMS and delete one row from our table. I will do this within an implicit transaction using the default READ COMMITTED isolation level.
DELETE FROM
dbo.tbl
WHERE
c1 = 20005;
Was this row actually deleted?
SELECT
c1
FROM
dbo.tbl
WHERE
c1 BETWEEN 20000 AND 20010;
Yes, it was deleted.
Now, in order to see our locks, let us open a new window within SSMS and add a code snippet or two. I am using Adam Machanic's sp_whoisactive, which can be found here: sp_whoisactive
SELECT
DB_NAME(resource_database_id) AS DatabaseName
, resource_type
, request_mode
FROM
sys.dm_tran_locks
WHERE
DB_NAME(resource_database_id) = 'test'
AND resource_type = 'KEY'
ORDER BY
request_mode;
-- Our insert
sp_lock 55;
-- Our deletions
sp_lock 52;
-- Our active sessions
sp_whoisactive;
Now, we are ready to begin. In a new SSMS window, let us begin an explicit transaction that will attempt to re-insert the one row that we deleted. At the same time, we will fire off our nibbling delete operation.
The insert code:
BEGIN TRANSACTION
SET IDENTITY_INSERT dbo.tbl ON;
INSERT INTO dbo.tbl
( c1 , c2 )
VALUES
( 20005 , 1 );
SET IDENTITY_INSERT dbo.tbl OFF;
--COMMIT TRANSACTION;
Let us kick off both operations beginning with the insert and followed by our deletes. We can see the key-range locks and exclusive locks.
The insert generated these locks:
The nibbling delete/select is holding these locks:
Our insert is blocking our delete as expected:
Now, let us commit the insert transaction and see what is up.
And as expected, all transactions complete. Now, we must check to see whether the insert was a phantom or whether the delete operation removed it as well.
SELECT
c1
FROM
dbo.tbl
WHERE
c1 BETWEEN 20000 AND 20015;
In fact, the insert was deleted; so, no phantom insert was allowed.
So, in conclusion, I think the true intention of this exercise is not to try to track every single row, page, or table-level lock and determine whether an element of a batch is locked and would therefore require our delete operation to wait. That may have been the intent of the questioners; however, that task is herculean and basically impractical, if not impossible. The real goal is to ensure that no unwanted phenomena arise once we have isolated the range of our batch with locks of our own and then proceed to delete the batch. The SERIALIZABLE isolation level achieves this objective. The key is to keep your nibbles small, your transaction log under control, and to eliminate unwanted phenomena.
If you want speed, then don't build gigantically deep tables that cannot be partitioned, leaving you unable to use partition switching for the fastest results. The key to speed is partitioning and parallelism; the key to suffering is nibbles and live-locking.
Please let me know what you think.
I created some further examples of the SERIALIZABLE isolation level in action. They should be available at the links below.
Delete Operation
Insert Operation
Equality Operations - Key-Range Locks on Next Key Values
Equality Operations - Singleton Fetch of Existent Data
Equality Operations - Singleton Fetch of Nonexistent Data
Inequality Operations - Key-Range Locks on Range and Next Key Values
Best Answer
This is more a business process / UI race. It is often solved by deprecating / hiding values for a period before deleting and/or not accepting them as valid anymore.
One approach would be to add a DateDeprecated column to toys. Set it to the current date when you check and find that nobody has this toy as a favorite. Clear the field if/when the toy is selected as a favorite. In the UI, don't show/list deprecated toys at all, or don't show toys deprecated for over a certain period of time, like 7 days. At some point, delete long-deprecated toys, say after they have been deprecated for 14 days.
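A sketch of that approach in SQL; the toys and favorites table and column names are assumptions, and the date arithmetic will vary by dialect:

```sql
-- Hypothetical schema: toys(id, name) and favorites(user_id, toy_id)
ALTER TABLE toys ADD date_deprecated date NULL;

-- Deprecate a toy when a check finds nobody has it as a favorite
UPDATE toys
SET date_deprecated = CURRENT_DATE
WHERE id = 42
  AND NOT EXISTS (SELECT 1 FROM favorites WHERE favorites.toy_id = toys.id);

-- Clear the flag if/when the toy is selected as a favorite again
UPDATE toys SET date_deprecated = NULL WHERE id = 42;

-- In the UI, list only non-deprecated toys
SELECT id, name FROM toys WHERE date_deprecated IS NULL;

-- Eventually delete toys that have been deprecated for more than 14 days
DELETE FROM toys
WHERE date_deprecated < CURRENT_DATE - INTERVAL '14' DAY;
```

Because the actual DELETE only touches rows that have been hidden for two weeks, the business-process race between "pick a favorite" and "remove the toy" is resolved well before any row is physically removed.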