SQL Server Linked Server Insert – Inserting a 30 Million Row Table from a Linked Server

linked-serversql servert-sql

I have a large table that needs to be updated on a nightly basis from a linked server. Initially I had simply truncated the table and inserted all of the records but this was taking too long. I'm now using a merge statement to copy over only the records that need updating. However this appears to still take too long, i.e. 10+ hours What other methods or techniques can I use to get this table refreshed from the linked server on a daily basis?

BEGIN
SET NOCOUNT ON
MERGE dbo.[ANNOUNCEMENTS] AS Target
USING (SELECT [ANN_ID]
      ,[ANNOUNCEMENT]
      ,[SCH_OR_DIST_NUM]
      ,[START_DATE]
      ,[END_DATE]
      ,[CREATED_BY]
      ,[CREATION_DATE]
      ,[UPDATED_BY]
      ,[LAST_UPDATE_DATE]
    FROM [LINKEDSERVER].[DATABASE].[dbo].[ANNOUNCEMENTS]) 
            AS source ([ANN_ID]
                        ,[ANNOUNCEMENT]
                        ,[SCH_OR_DIST_NUM]
                        ,[START_DATE]
                        ,[END_DATE]
                        ,[CREATED_BY]
                        ,[CREATION_DATE]
                        ,[UPDATED_BY]
                        ,[LAST_UPDATE_DATE])
    ON (Target.ANN_ID = source.ANN_ID)
    WHEN NOT MATCHED THEN
        INSERT ([ANN_ID]
              ,[ANNOUNCEMENT]
              ,[SCH_OR_DIST_NUM]
              ,[START_DATE]
              ,[END_DATE]
              ,[CREATED_BY]
              ,[CREATION_DATE]
              ,[UPDATED_BY]
              ,[LAST_UPDATE_DATE])
        VALUES (source.[ANN_ID]
              ,source.[ANNOUNCEMENT]
              ,source.[SCH_OR_DIST_NUM]
              ,source.[START_DATE]
              ,source.[END_DATE]
              ,source.[CREATED_BY]
              ,source.[CREATION_DATE]
              ,source.[UPDATED_BY]
              ,source.[LAST_UPDATE_DATE]);
END

Best Answer

MERGE doesn't seem to be what you want here. You're only inserting new rows with a CREATION_DATE from the past 24 hours, right?

SET NOCOUNT ON;

DECLARE @Yesterday DATE = DATEADD(DAY, -1, GETDATE());

INSERT dbo.ANNOUNCEMENTS(column list)
SELECT column list
  FROM [LINKEDSERVER].[DATABASE].[dbo].[ANNOUNCEMENTS] AS r
  WHERE CREATION_DATE >= @Yesterday 
  AND CREATION_DATE < DATEADD(DAY, 1, @Yesterday)
  AND NOT EXISTS
  (SELECT 1 FROM dbo.ANNOUNCEMENTS WHERE ANN_ID = r.ANN_ID);

Of course having useful indexes on CREATION_DATE and/or ANN_ID will help.