Sql-server – How to use proper hints for Upsert involving a top 1 subquery

deadlocklockingsql serverupsert

For this specific case (tracking Load Balancer switches) we wish to optimize an Upsert so that

it does not exhibit any race conditions,
cause any PK violation, or
acquire any over sized locks.

I understand larger locks (page) may be more efficient but for purposes of question the goal is minimal (row). There are numerous links on the upsert/lock subject but the answers are are somewhat inconsistent (esp. updlock, and multi-statements) and this particular case involves an embedded sub query.

Table definition:

create table [User].[SessionWebServerLog] (
  [SessionId] bigint not null,
  [IsSSL] bit not null default ((0)),
  [LastRequestUtc] datetime2(7) not null default (sysutcdatetime()),
  [WebServerProcessInstanceId] bigint not null,
  [RequestCount] int not null default ((1)),
  [FirstRequestUtc] datetime2(7) not null default (sysutcdatetime()),
  foreign key ([SessionId]) references [User].[Session] ( [SessionId] ) on delete cascade,
  primary key clustered ([SessionId] asc, [IsSSL] asc, [LastRequestUtc] desc, [WebServerProcessInstanceId] asc)
  with ( 
    allow_row_locks = on,
    allow_page_locks = off,  -- Needed else page locks were taken
  )
)

The SP should insert only if the Session+IsSsl combination has changed server ids since the most recent request for that Session+IsSsl:

create proc [User].[usp_LogSessionWebServerRequest]    
    @pSessionId                   bigint, 
    @pWebServerProcessInstanceId  bigint,    
    @pIsSsl                       bit,        -- True for https, false for http
    @pDebug                       bit = 0     -- debug flag for print statements
as    
begin try        
    set xact_abort on;
    begin transaction;

        update l
           set RequestCount = RequestCount + 1,
               LastRequestUtc = sysutcdatetime()
          from [User].SessionWebServerLog l             
          with (rowlock, xlock, serializable) -- row level, exclusively held, until end of xact
         cross apply
             (
               select top(1) WebServerProcessInstanceId, LastRequestUtc
                 from [User].SessionWebServerLog 
                 with (rowlock, xlock, serializable) -- row level, exclusively held, until end of xact
                   -- PK supports this join:  SessionId, IsSsl, LastRequestUtc (desc), WebServerProcessId                       
                where SessionId = @pSessionId
                  and IsSSL = @pIsSsl
                order by LastRequestUtc desc 
             ) prev -- previous request
         where SessionId = @pSessionId
           and IsSSL = @pIsSsl
           and prev.WebServerProcessInstanceId = @pWebServerProcessInstanceId
           and l.WebServerProcessInstanceId = @pWebServerProcessInstanceId
           and l.LastRequestUtc = prev.LastRequestUtc;

        if (@@rowcount = 0) -- if no update occurred, insert new
        begin
            insert into [user].SessionWebServerLog
                 ( SessionId, WebServerProcessInstanceId, IsSSL )
            values 
                 ( @pSessionId, @pWebServerProcessInstanceId, @pIsSsl );                
        end            

    commit;            
end try
begin catch    
   if (xact_state() = -1 or @@trancount > 0)
    rollback;
   -- log, etc.
end catch

This routine appears to work for simple cases by testing using two windows and executing the first half of the transaction within each window and checking blocking.

Q1: It blocks when the update does not match any rows for either window but yet they are different keys. Does blocking occur because the key range lock is held on existing keys only?

Win1:

declare
    @pSessionId                   bigint = 3, -- does not exist in table
    @pWebServerProcessInstanceId  bigint = 100,    
    @pIsSsl                       bit = 0;

 sp_lock 72:

spid  dbid  ObjId      IndId   Type       Resource  Mode      Status
  72    16      0          0     DB                 S          GRANT 
  72    16  388964512      1    KEY (6c2787a590a2)  RangeX-X   GRANT
  72    16  388964512      0    TAB                 IX         GRANT

Win2:

 declare
        @pSessionId                   bigint = 4,  -- does not exist in table
        @pWebServerProcessInstanceId  bigint = 100,    
        @pIsSsl                       bit = 0;

    sp_lock 92:

    spid  dbid      ObjId   IndId   Type  Resource         Mode   Status
    92      16          0       0     DB                    S      GRANT
    92      16  388964512       1    KEY  (6c2787a590a2) RangeX-X   WAIT
    92      16  388964512       0    TAB                    IX     GRANT

declare
@pSessionId bigint = 4,
@pWebServerProcessInstanceId bigint = 100,
@pIsSsl bit = 0;

Q2: If I allow page locks (the default) on the PK, why is a page lock is taken out even though the row lock hint was specified.

spid  dbid      ObjId   IndId   Type     Resource      Mode      Status
72      16          0       0      DB                    S       GRANT
72      16  388964512       1     PAG       1:444       IX       GRANT
72      16  388964512       1     KEY (6c2787a590a2)  RangeX-X   GRANT
72      16  388964512       0     TAB                   IX       GRANT

The transaction isolation level is the default "read committed." I chose not to alter for this specific because restoring it seems messier (for success and failure and assuming/determining default) than just using table locks (imo).

Query plan for the zero case:

When no matching rows to update

Query plan when multiple rows for WebSession+Ssl with different dates exist (exactly one row from branch to top, perfect, apparently using the PK by date):

enter image description here

Q3: Is this overkill–are there other hints that will accomplish the goals? (Please do not re-arrange query or try to convert to merge statement for purposes of this question).

Best Answer

Q1: The range lock is likely held becauase SessionId,IsSSL,LastRequestUtc is not declared unique. Is this correct? This means that the value before and after the one you are looking for has to be locked (as you are requesting an XLOCK on the table) to avoid the range being modified while you read it. If you were to declare the combination unique, I believe this problem should be gone. This can also cause the INSERT to lock up.

Q2: First of all, a clarification. Page lock are NOT the default, row lock are (Page locks were default in SQL Server 7.0). That being said, SQL server will still take out an intent Exclusive (IX) lock on the page. This is not a problem, because this lock is compatible with other IX locks. You can actually make locking cheaper by using ALLOW_PAGE_LOCKS = OFF on your indexes (this has side effects if you do frequent table scans, so be careful with this).

Q3: You dont have to force rowlock or to force the xlock in the CROSS APPLY. You MAY need to force serializable if you expect multiple statement to insert into the same session and ranges. If you dont, then you can get rid of that too (for example, if every inserter only inserts for the same session).

Related Solutions

Single DELETE statement deadlocking in Sybase ASE

Original post describes a classic deadlock scenario with a twist, namely, a single DELETE ... WITH statement, which acquires 2 different kinds of locks.

Recommendation - change table from All Page Lock (APL) to Data Row Lock (DRL) scheme

clustered index will be replaced by placement index separate from data page
index access / updates will use latches rather than logical locks
row-level deadlock conflict is less probably (than page-level's)

Both SPIDs are executing the same sproc with the 1st DELETE ... WHERE statement demanding 2 different kinds of locks:

initially an UPDATE lock on a page when scanning for a row to delete (UPDATE lock is compatible with shared locks only but not with another UPDATE lock or exclusive lock);
once a target row is found on a page, then the UPDATE lock must be upgraded to an exclusive lockbefore deleting that row.

The DEADLOCK events posted in your original question confirm the scenario :

SPID 134 has held an exclusive lock on a page while waiting to acquire an UPDATE lock on another page i.e. it has deleted at least one row already and is continuing with searching for another (row to delete);
SPID 166 blocks the first SPID's acquiring an UPDATE lock on a page while waiting to apply an exclusive lock on another page, which has an exclusive lock held by SPID 134 already (after deleting 1 row or more).

Sql-server – deteriorating stored procedure running times

What is up with FROM part JOIN model ON 1=1? This the same as FROM part, model, which is a cartesian join and will result in a very large number of rows. Is that join supposed to be like that?

You will likely help us help you if you provide details about the tables involved. Please "script" the definition of the tables, along with any indexes defined on those tables.

This sounds like a classic case of parameter sniffing resulting in good plan/bad plan choices for various scenarios in your data.

You may be able to get more reliable performance by making SQL Server cache different plans for different scenarios by using sp_executesql, as in the following example:

CREATE PROCEDURE [dbo].[create_grid_materials2] 
(
    @partlistid bigint
    , @pid bigint
    , @masterid bigint
)
AS
BEGIN
    begin
        DECLARE @cmd NVARCHAR(MAX);

        SET @cmd = '   
        INSERT INTO material (partid, personid, modelID)
        SELECT 
            partid = part.id
            , personid = @pid
            , modelid = model.id  
        FROM part
            INNER JOIN model ON 1=1
        WHERE (
            model.masterid = ' + CONVERT(NVARCHAR(50), @masterid) + ' 
                AND model.modelSetID IS NULL
                AND part.partlistid = ' + CONVERT(NVARCHAR(50), @partlistid) + '
                AND (
                    part.partType = 100 
                    or part.partType=120 
                    or part.partType = 130
                )
            )
            AND NOT EXISTS (
                SELECT 1 
                FROM material AS a1 
                WHERE a1.partid = part.id 
                    AND a1.personid=@pid 
                    AND a1.modelid=model.id
                )';
        DECLARE @Params VARCHAR(200);
        SET @Params = '@pid INT';
        EXEC sys.sp_executesql @cmd
            , @Params
            , @pid = @pid;
    end
End

The above code will cause a new plan to be generated for each combination of @partlistid, and @masterid.

The presumption here is some combinations of those two variables lead to a very small number of rows, whereas some combinations lead to a very large number of rows.

Forcing a plan for each combination allows SQL Server to generate more efficient plans for each. I've explicitly not included @pid since you probably want to try it with a fairly small number of combinations first; adding a third variable to the mix will make for an exponentially larger number of possible plans.

Best Answer

Related Solutions

Single DELETE statement deadlocking in Sybase ASE

Sql-server – deteriorating stored procedure running times

Related Question