Sql-server – INSTEAD OF INSERT trigger

bulk-insertidentitysql-server-2008-r2trigger

Consider this a follow-up to BULK INSERT: Trials and Tribulations

I have the following VIEW defined and would like to insert into the multiple underlying tables when sending a BULK INSERT from a CSV.

CREATE VIEW dbo.StagingViewB as 
  SELECT 
    Q.[SQSQueue],
    Q.[SQSMessageID],
    D.[ReceiverID],
    Q.[DLOrder],
    D.[DetectionDate],
    D.[DetectionMS],
    D.[TagID],
    M.[TxAmplitude],
    M.[TxOffset],
    M.[TxNBW],
    M.[TxChkSum]
  FROM         [StQueue] as Q 
    INNER JOIN [StDetection] as D ON Q.eID=D.eID
    INNER JOIN [StDetMeta] as M ON Q.eID=D.eID
WITH VIEW_METADATA

I want an underlying autoincrement primary key (eID) to tie the 3 underlying tables together. In other words, this is the StQueue DDL

CREATE TABLE [dbo].[StQueue](
    [eID] [bigint] IDENTITY(1,1) NOT NULL,
    [SQSQueue] [varchar](25) NULL,
    [SQSMessageID] [varchar](50) NULL,
    [DLOrder] [int] NOT NULL,
PRIMARY KEY CLUSTERED 
(
    [eID] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

The Q,M,D aliased tables all have eID, but I'm willing to remove the IDENTITY(1,1) part of the definition (on all except one), as I don't think it's wise to trust that three separate tables will remain synchronous with respect to the value of eID for the next insert.

To avoid this issue, should I do a SET @ID = COALESCE(IDENT_CURRENT('StQueue'),0)+1
or

SET @ID = SELECT MAX(eID) FROM StQueue
SET @ID = COALESCE(ID,0)+1

And then do

INSERT INTO StQueue ([SQSQueue], [SQSMessageID],[DLOrder])
  VALUES (SQSQueue, SQSMessageID, DLOrder) From Inserted
INSERT INTO StDetection (eID, ReceiverID, DetectionDate, DetectionMS, TagID)
  VALUES (@ID, ReceiverID, DetectionDate, DetectionMS, TagID) FROM Inserted
INSERT INTO StDetMeta (eID, [TxAmplitude], [TxOffset], [TxNBW], [TxChkSum])
  VALUES (@ID, [TxAmplitude], [TxOffset], [TxNBW], [TxChkSum]) FROM Inserted

Microsoft warns:

Be cautious about using IDENT_CURRENT to predict the next generated
identity value. The actual generated value may be different from
IDENT_CURRENT plus IDENT_INCR because of insertions performed by other
sessions.

I assume this warning applies equally to the MAX-based method. Should I INSERT...WITH TABLOCK on the underlying tables to avoid that issue? Or would the multi-session/thread issue still be a problem. The problem might still exist because the computation for the next ID number happens before attempting to grab a lock, discovering it's already locked, and spinning wheels until the existing lock is released, the value would be stale any time a lock is encountered or a race exists where two threads are trying to grab the attention of the table (establish a lock) at the same time? Is there a way I can request a lock for all three tables at once? Would this be wise to do before grabbing IDENT_CURRENT('StQueue')?

Will this whole process be slower than BULK INSERTing and then SELECT eID,x,y,z from a single master table into the 3 "child" tables? Would iterating through a CURSOR be worse?

Best Answer

I ended up abandoning the INSTEAD OF INSERT trigger method initially asked about, as syncing the unique ID across the several tables otherwise was proving problematic.

My current solution, involving the MERGE syntax, which as @AaronBertrand blogged about here, may be bug-ridden. For example, it may be subject to unexpected sessions of deadlock down the line. But the following works for now:P

CREATE TABLE TxStaging (
    [eID] [bigint] IDENTITY(1,1) NOT NULL,
    [SQueue] [varchar](25) NULL,
    [SMessageID] [varchar](50) NULL,
    [RecID] [int] NOT NULL,
    [DLOrder] [int] NOT NULL,
    [DetectionDate] [datetime2](7) NOT NULL,
    [DetectionUS] [decimal](7, 6) NOT NULL,
    [TagID] [varchar](4) NOT NULL,
    [TxAmplitude] [int] NULL,
    [TxOffset] [int] NULL,
    [TxNBW] [int] NULL,
    [TxChkSum] [varchar](25) NULL,
PRIMARY KEY CLUSTERED ( [eID] ASC )
WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, 
      IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON,
      ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]
GO

CREATE TABLE [dbo].[StQueue](
    [eID] [bigint] /*IDENTITY(1,1)*/ NOT NULL,
    [SQueue] [varchar](25) NULL,
    [SMessageID] [varchar](50) NULL,
    [DLOrder] [int] NOT NULL,
PRIMARY KEY CLUSTERED ( [eID] ASC )
WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF,
      IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, 
      ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]
GO

CREATE TABLE [dbo].[StDets](
    [eID] [bigint] /*IDENTITY(1,1)*/ NOT NULL,
    [RecID] [int] NOT NULL,
    [DetectionDate] [datetime2] NOT NULL, --(7)?
--  [DetectionUS] [decimal](7, 6) NULL,
    [TagID] [varchar](4) NOT NULL,
PRIMARY KEY CLUSTERED ( [eID] ASC )
WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, 
      IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, 
      ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]
GO

CREATE TABLE [dbo].[StDetMeta](
    [eID] [bigint] /*IDENTITY(1,1)*/ NOT NULL,
    [TxAmplitude] [int] NULL,
    [TxOffset] [int] NULL,
    [TxNBW] [int] NULL,
    [TxChkSum] [varchar](25) NULL,
    [NeworDup] int NULL, --unevaluated = NULL; New = 0; Duplicated_in_this_staging_table = 1; Duplicated_in_main_table = 2
PRIMARY KEY CLUSTERED ( [eID] ASC )
WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, 
      IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, 
      ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]
GO

CREATE VIEW InsertView as 
SELECT 
    [SQueue],
    [SMessageID],
    [RecID],
    [DLOrder],
    [DetectionDate],
    [DetectionUS],
    [TagID],
    [TxAmplitude],
    [TxOffset],
    [TxNBW],
    [TxChkSum]
FROM TxStaging
GO

ALTER PROCEDURE [dbo].[usp_PropegateSubTables]  --CREATE or ALTER
AS BEGIN
    MERGE INTO realtimetest.dbo.StQueue WITH (HOLDLOCK) as T
        USING realtimetest.dbo.TxStaging as S on T.eID=S.eID
        WHEN MATCHED THEN
            UPDATE SET T.SQueue=S.SQueue, T.[SMessageID]=S.[SMessageID], T.[DLOrder]=S.[DLOrder]
        WHEN NOT MATCHED BY TARGET THEN
            INSERT (eID, [SQueue], [SMessageID], [DLOrder])
            VALUES (S.eID, S.[SQueue], S.[SMessageID], S.[DLOrder]);
    MERGE INTO realtimetest.dbo.StDets WITH (HOLDLOCK) as T
        USING realtimetest.dbo.TxStaging as S on T.eID=S.eID
        WHEN MATCHED THEN
            UPDATE SET T.RecID=S.RecID, T.DetectionDate=DATEADD(mcs,CAST(S.DetectionUS*1000000.0 as INT),S.DetectionDate), T.TagID=S.TagID
        WHEN NOT MATCHED BY TARGET THEN
            INSERT (eID, RecID, DetectionDate, TagID)
            VALUES (S.eID, S.RecID, DATEADD(mcs,CAST(S.DetectionUS*1000000.0 as INT),S.DetectionDate), S.TagID);
    MERGE INTO realtimetest.dbo.StDetMeta WITH (HOLDLOCK) as T
        USING realtimetest.dbo.TxStaging as S on T.eID=S.eID
        WHEN MATCHED THEN
            UPDATE SET T.[TxAmplitude]=S.[TxAmplitude], T.[TxOffset]=S.[TxOffset], T.[TxNBW]=S.[TxNBW], T.[TxChkSum]=S.[TxChkSum]
        WHEN NOT MATCHED BY TARGET THEN
            INSERT (eID, [TxAmplitude], [TxOffset], [TxNBW], [TxChkSum])
            VALUES (S.eID, S.[TxAmplitude], S.[TxOffset], S.[TxNBW], [TxChkSum]);
END
GO

Python code to fill TxStaging and then the child tables (has some other Try: type logic not included here)

tname = 'InsertView'
fname = 'S:/path/to/In.csv'
fieldSep = ','
linestart = 1
ename = 'S:/path/to/In.err'
sqlconn.execute_non_query("BULK INSERT {} FROM '{}' WITH ( FIELDTERMINATOR = '{}', ROWTERMINATOR = '{}', MAXERRORS = 1024, FIRSTROW = {}, KEEPNULLS, ERRORFILE = '{}', TABLOCK )".format(tname, fname, fieldSep, chr(10), linestart, ename))
procObj = sqlconn.init_procedure('usp_PropegateSubTables')
procObj.execute()
qstring = "DELETE FROM {}".format(tname) #TO DO: should check for presence of error file before going ahead and deleting this table.
    # clear table/view after propagated, but leave counter valid. Don't use TRUNCATE for this reason
    # TRUNCATE could be used if proceeded by a "check max eID" and then a `DBCC checkident(tname, RESEED, @maxeID)`
sqlconn.execute_non_query(qstring)