SQL Server – Increase Performance of UPDATE for 50 Million Entries

performance, query-performance, sql-server, sql-server-2008, t-sql

We have a query that uses a temporary table to update certain columns in the database:

CREATE TABLE #NEW_OLD ( 
    NEW nvarchar(20), 
    OLD nvarchar(20) 
); 
INSERT INTO #NEW_OLD
VALUES
('DOMAIN\KEY', 'NEW_KEY'),
('DOMAIN\KEY2', 'NEW_KEY2')
(...) 
-- (~5,000 keys are inserted)

BEGIN TRANSACTION
BEGIN TRY

    DECLARE @TABLES_COLUMNS TABLE(TABLE_NAME varchar(100), COLUMN_NAME varchar(100))

    INSERT INTO @TABLES_COLUMNS (TABLE_NAME, COLUMN_NAME)
    VALUES
    ('[dbo].[MyTable1]', '[MyColumn1]'),
    ('[dbo].[MyTable2]', '[MyColumn2]')
    (...)
    -- ~20 columns are included. The tables have ~10,000,000 rows.

    -- Looping over table and column name variables with cursor
    DECLARE TABLES_COLUMNS_CURSOR CURSOR LOCAL FAST_FORWARD FOR
        SELECT TABLE_NAME, COLUMN_NAME 
        FROM @TABLES_COLUMNS

    -- Variables to hold table and column name (sized to match @TABLES_COLUMNS,
    -- otherwise names longer than 30 characters would be silently truncated)
    DECLARE @TABLE_NAME varchar(100), @COLUMN_NAME varchar(100)

    OPEN TABLES_COLUMNS_CURSOR

    -- Instead of fetching twice, we set up no-exit loop
    WHILE 1 = 1
    BEGIN

        -- And then fetch
        FETCH NEXT FROM TABLES_COLUMNS_CURSOR 
        INTO @TABLE_NAME, @COLUMN_NAME

        -- And stop if nothing is fetched
        IF @@FETCH_STATUS <> 0
        BEGIN
            BREAK
        END

        -- Now, update each field with the new key that is associated with that key in the temporary table
        EXECUTE
        (
            'UPDATE t ' + 
            'SET t.' + @COLUMN_NAME + ' = new_old.NEW ' + 
            'FROM ' + @TABLE_NAME + ' AS t ' + 
            'INNER JOIN #NEW_OLD AS new_old ' + 
            'ON t.' + @COLUMN_NAME + ' COLLATE DATABASE_DEFAULT = new_old.OLD COLLATE DATABASE_DEFAULT'
        )

    END

    CLOSE TABLES_COLUMNS_CURSOR
    DEALLOCATE TABLES_COLUMNS_CURSOR

END TRY
BEGIN CATCH

    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;
END CATCH;


IF @@TRANCOUNT > 0
    COMMIT TRANSACTION;

This query takes hours to run. The dynamic part (the EXECUTE) should not influence performance much in SQL Server 2008.

How can the performance be improved?

Best Answer

One more thing: if these tables are very large, then even with good indexes you should update the rows in batches, because the transaction log grows with every change and one big transaction takes much longer than many small ones.

Example:

CREATE TABLE UpdateTable(id int IDENTITY(1,1), name varchar(200), namenew varchar(200))

-- Column list is required here because id is an IDENTITY column
INSERT INTO UpdateTable (name, namenew) VALUES ('Old1','New1');
INSERT INTO UpdateTable (name, namenew) VALUES ('Old2','New2');
go

SET NOCOUNT ON;
DECLARE @cnt INT = -1;

WHILE (@cnt <> 0)
BEGIN
    UPDATE TOP (1) UpdateTable
    SET name = namenew
    WHERE name <> namenew;

    SET @cnt = @@ROWCOUNT;
END

You should change the TOP (1) to an acceptable batch size, for example 100,000 or bigger; measure what works best in your environment.
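Applied to the dynamic update from the question, the same batching idea might look like the sketch below. This is untested against your schema; the `TOP (100000)` batch size and the clustered index on `#NEW_OLD (OLD)` are assumptions to tune, and it assumes no NEW value also appears as an OLD value (otherwise already-updated rows could be matched and rewritten again).

```sql
-- Helps the join: the lookup side is small and sorted on the join key
CREATE CLUSTERED INDEX IX_NEW_OLD_OLD ON #NEW_OLD (OLD);

DECLARE @sql nvarchar(max), @rows int = 1;

SET @sql = N'
    UPDATE TOP (100000) t              -- batch size: tune for your log/IO
    SET t.' + @COLUMN_NAME + N' = new_old.NEW
    FROM ' + @TABLE_NAME + N' AS t
    INNER JOIN #NEW_OLD AS new_old
        ON t.' + @COLUMN_NAME + N' COLLATE DATABASE_DEFAULT
         = new_old.OLD COLLATE DATABASE_DEFAULT;
    SET @rc = @@ROWCOUNT;';            -- pass the batch row count back out

WHILE @rows > 0
BEGIN
    -- OUTPUT parameter avoids relying on @@ROWCOUNT surviving the EXEC
    EXECUTE sp_executesql @sql, N'@rc int OUTPUT', @rc = @rows OUTPUT;
END
```

Each iteration is its own small transaction, so the log can be reused between batches (in SIMPLE recovery) instead of growing for one giant update.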