Are recent versions of Oracle able to implement a queue with SKIP LOCKED?

oracle queue

My question: what is the very latest status on whether or not recent versions of Oracle actually make it possible to implement a queue in a straightforward way with SKIP LOCKED and limiting to a single row, without resorting to indirect solutions/fragile solutions?

There seems to be a lot of history, going back a long time, showing SKIP LOCKED queueing to be problematic with Oracle. I'm trying to determine whether the newest Oracle versions have cleared these problems up.

I have implemented queue style functionality using Postgres with SKIP LOCKED, and I now wish to do the same with Oracle. I'm happy to use any Oracle version that makes it possible. So before I head down the path of trying to implement this for Oracle, I wanted to first ask if it is impossible to do so.

I've been reading a lot of documentation on the web to try to determine if it can be done. Older information seems to indicate that Oracle is not able to truly limit the results returned to a single row, which is a big problem when using "SKIP LOCKED" in a queue, because for queue processing you want only one row.

Previous information indicating row limit in Oracle depends on fragile/indirect solutions:

https://stackoverflow.com/questions/16299663/select-for-update-skip-locked-with-row-limit

https://stackoverflow.com/questions/6117254/force-oracle-to-return-top-n-rows-with-skip-locked

https://stackoverflow.com/questions/54766489/oracle-how-to-limit-number-of-rows-in-select-for-update-skip-locked

https://stackoverflow.com/questions/470542/how-do-i-limit-the-number-of-rows-returned-by-an-oracle-query-after-ordering

https://stackoverflow.com/questions/50390146/query-limit-in-oracle-database

Recent Oracle versions appear to have added a FETCH clause, but again it is not clear whether this truly restricts access to a single row. Does FETCH make it possible to implement a queue with SKIP LOCKED in a direct and robust manner?

Please note I am aware Oracle has Advanced Queue functionality built in – I do not want to use that.

Here's what I wrote for Postgres – it's pretty straightforward – note I am aware it lacks needed transaction handling:

    import psycopg2
    import psycopg2.extras
    import random

    db_params = {
        'database': 'jobs',
        'user': 'jobsuser',
        'password': 'superSecret',
        'host': '127.0.0.1',
        'port': '5432',
    }
    
    conn = psycopg2.connect(**db_params)
    cur = conn.cursor(cursor_factory=psycopg2.extras.DictCursor)
    
    def do_some_work(job_data):
        if random.choice([True, False]):
            print('do_some_work FAILED')
            raise Exception
        else:
            print('do_some_work SUCCESS')
    
    def process_job():
    
        sql = """DELETE FROM message_queue 
    WHERE id = (
      SELECT id
      FROM message_queue
      WHERE status = 'new'
      ORDER BY created ASC 
      FOR UPDATE SKIP LOCKED
      LIMIT 1
    )
    RETURNING *;
    """
        cur.execute(sql)
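        queue_item = cur.fetchone()
        if queue_item is None:
            # Queue is empty, or every 'new' row is currently locked by another worker.
            print('no queue item available')
            conn.commit()
            return
        print('message_queue says to process job id: ', queue_item['target_id'])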
        sql = """SELECT * FROM jobs WHERE id =%s AND status='new_waiting' AND attempts <= 3 FOR UPDATE;"""
        cur.execute(sql, (queue_item['target_id'],))
        job_data = cur.fetchone()
        if job_data:
            try:
                do_some_work(job_data)
                sql = """UPDATE jobs SET status = 'complete' WHERE id =%s;"""
                cur.execute(sql, (queue_item['target_id'],))
            except Exception as e:
                sql = """UPDATE jobs SET status = 'failed', attempts = attempts + 1 WHERE id =%s;"""
                # if we want the job to run again, insert a new item to the message queue with this job id
                cur.execute(sql, (queue_item['target_id'],))
        else:
            print('no job found, did not get job id: ', queue_item['target_id'])
        conn.commit()
    
    process_job()
    cur.close()
    conn.close()
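
The missing transaction handling mentioned above could be sketched roughly as follows. This is only an illustration, not part of the original script: `run_in_transaction`, `take_oldest` and `failing_work` are hypothetical helpers, and the demo uses an in-memory `sqlite3` database rather than Postgres so that it runs standalone. The point is only that a failure after the queue DELETE rolls the DELETE back, so the item is not lost.

```python
import sqlite3


def run_in_transaction(conn, work):
    """Run work(cursor); commit on success, roll back on any exception."""
    cur = conn.cursor()
    try:
        result = work(cur)
        conn.commit()
        return result
    except Exception:
        conn.rollback()  # undo the queue DELETE so the item stays available
        raise
    finally:
        cur.close()


def take_oldest(cur):
    # Hypothetical stand-in for the DELETE ... RETURNING step (no row locking here).
    cur.execute("SELECT id FROM message_queue ORDER BY created LIMIT 1")
    row = cur.fetchone()
    if row is None:
        return None
    cur.execute("DELETE FROM message_queue WHERE id = ?", (row[0],))
    return row[0]


def failing_work(cur):
    # Simulates a job that fails after the queue item was already deleted.
    take_oldest(cur)
    raise RuntimeError("do_some_work FAILED")


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE message_queue (id INTEGER PRIMARY KEY, created INTEGER)")
    conn.executemany("INSERT INTO message_queue VALUES (?, ?)", [(1, 10), (2, 20)])
    conn.commit()
    try:
        run_in_transaction(conn, failing_work)
    except RuntimeError:
        pass
    after_failure = conn.execute("SELECT COUNT(*) FROM message_queue").fetchone()[0]
    taken = run_in_transaction(conn, take_oldest)
    after_success = conn.execute("SELECT COUNT(*) FROM message_queue").fetchone()[0]
    conn.close()
    return after_failure, taken, after_success
```

Calling `demo()` shows both rows still present after the failed attempt and one row left after the successful one. A psycopg2 connection exposes the same `commit()` and `rollback()` methods, so the wrapper should carry over, but that is an assumption that has not been tested here.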

Best Answer

"it's not clear what problem you think there is"

The problem is that you rely on non-deterministic behaviour (which, by being non-deterministic, is bound to change at any time).

SQL is a declarative language. It does not define how you want to achieve something; it only defines what you want to achieve. Let's see what you want to achieve by rephrasing your SQL statement:

Give me all rows where status = 'new'
I'm planning to update them, so make sure nobody else can do that before me.
If you find rows that somebody else is planning to update, skip them.
Order them by created.
Oh, and I only want the first row once you've ordered them.
Now that I've got it, please delete it.

Given a lucky combination of indexes, table statistics, and possibly other things, a particular SQL engine might at this time find a plan that locks only one row before it gets the one eligible to be returned. At some other time it might find it more advantageous (or merely possible) to lock more rows, or the entire table, without violating the semantics of the query. Then you will suddenly discover that your application works somewhat differently from what you have come to expect. The same applies to moving your application to a different SQL engine.
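
To make the point concrete, here is a purely illustrative toy model in Python. It does not describe how any real optimizer or lock manager works; the two "plans", the row data and all function names are made up for the example. Both plans satisfy the rephrased statement and return the same row, yet they lock different sets of rows along the way.

```python
def plan_lock_then_sort(rows, locked_by_others):
    """Plan A: lock every eligible unlocked row, then sort, then keep the first."""
    mine = [r for r in rows
            if r["status"] == "new" and r["id"] not in locked_by_others]
    locked = {r["id"] for r in mine}  # everything that matched got locked
    mine.sort(key=lambda r: r["created"])
    return (mine[0]["id"] if mine else None), locked


def plan_walk_in_order(rows, locked_by_others):
    """Plan B: walk rows in created order and lock only the first unlocked match."""
    for r in sorted(rows, key=lambda r: r["created"]):
        if r["status"] == "new" and r["id"] not in locked_by_others:
            return r["id"], {r["id"]}
    return None, set()


# Hypothetical queue contents; row 1 is already locked by another session.
ROWS = [
    {"id": 1, "status": "new", "created": 10},
    {"id": 2, "status": "new", "created": 20},
    {"id": 3, "status": "new", "created": 30},
    {"id": 4, "status": "done", "created": 5},
]

result_a = plan_lock_then_sort(ROWS, locked_by_others={1})
result_b = plan_walk_in_order(ROWS, locked_by_others={1})
print(result_a)  # row 2 returned, rows 2 and 3 locked
print(result_b)  # row 2 returned, only row 2 locked
```

Under plan A a second worker running at the same moment would find nothing to process even though row 3 is untouched, which is exactly the kind of behaviour change the query text alone does not rule out.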

In short, SQL is not the right tool for a job where you want to prescribe how it is done. Also, your fixation on SKIP LOCKED is misplaced.

Related Solutions

Sql-server – Work queue with complex select and long processing: Ensuring concurrency

You mentioned you could not use 'locking by deletion' because the row must stay in the table. The code example below does use it, but only after creating a global queue first and populating it. It then uses the 'locking by deletion' for the ##Queue table rather than the table you're working with.

    USE master
    GO

    -- 1. Create a Stored Procedure that creates the Queue and populates it based on the criteria you have.
    --      (this doesn't have to be a Stored Procedure, but can also be run via some other method to populate the queue.)
    --      (also update 'DatabaseName.SchemaName.the_table' to be the name of your table.)

    IF EXISTS (SELECT * FROM master.dbo.sysobjects o WHERE o.xtype IN ('P') AND o.id = object_id('master.dbo.CreateQueue'))
    DROP PROC CreateQueue
    GO

    CREATE PROCEDURE dbo.CreateQueue
    AS
    BEGIN
        IF OBJECT_ID('Tempdb.dbo.##Queue') IS NOT NULL
        DROP TABLE ##Queue

        SELECT * INTO ##Queue FROM DatabaseName.SchemaName.the_table WHERE <replace with your criteria> 
    END
    GO

    -- 2. Execute the SP above or use the code some other way to create the ##Queue

    -- 3. Copy the code below and run it via multiple processes if you like, as there should not be concurrency issues.
    --      (note: #RowToProcess must already exist in the session, with columns matching ##Queue.)
    WHILE 1 = 1
        BEGIN
            DELETE TOP ( 1 )
                    ##Queue WITH ( READPAST )
            OUTPUT  Deleted.*
                    INTO #RowToProcess
            IF @@ROWCOUNT > 0 
                BEGIN
                    --Place logic here to work with the row...
                    DELETE  FROM #RowToProcess
                END
            ELSE
                BREAK
        END

Oracle: Performance centric ways to notify user on reaching “logical” lock expiration event

This seems like a serious design flaw, probably resulting from a pile of flaws. Why does the application know when the lock should expire, and yet check again and again to see whether it has expired? That makes the expiry time meaningless.

If you can change the complete design, consider doing so: the current situation probably results from the fragmentation of the data-handling parts of your application. If you can't, you still need to recognize the problem of going for a time-based lock with a rigid constant (which goes as high as 30 seconds, a performance red flag) while at the same time creating a propagating mechanism to notify waiters of the release.

In my opinion, the much-less-than-perfect solution in your case is that those waiting for the logical lock to be released should sleep for the time-frame in which the lock is held, and then check whether the record is still locked. If it is still not free, they should sleep for a short random period (realistic in your app's terms) before checking again. Some upper threshold should be set for these waits as well.

In general, what you need is a messaging system between waiters.
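
The waiting scheme described above might look roughly like this in Python. It is a minimal sketch under stated assumptions, not code from the original application: `wait_for_release`, `is_locked`, the jitter bounds and the injectable `sleep`/`uniform` parameters are all hypothetical names chosen for the example.

```python
import random
import time


def wait_for_release(is_locked, hold_seconds, max_wait_seconds,
                     jitter=(0.1, 1.0), sleep=time.sleep, uniform=random.uniform):
    """Sleep for the known hold time, then re-check after short random sleeps.

    is_locked is a caller-supplied function that reports whether the record
    is still logically locked. Returns True once it reports False, or False
    if the next sleep would push the total wait past max_wait_seconds.
    """
    waited = 0.0
    delay = hold_seconds  # first wait: the full time-frame the lock is held
    while waited + delay <= max_wait_seconds:
        sleep(delay)
        waited += delay
        if not is_locked():
            return True
        delay = uniform(*jitter)  # later waits: a short random period
    return False  # upper threshold reached without seeing the lock released
```

With a 30-second logical lock, a waiter might call `wait_for_release(check_record, 30, 45)`, where `check_record` is whatever function the application already uses to test the lock. The random delay keeps several waiters from re-checking at the same instant.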
