SQL Server – How to Update Fields with Certain Prefix Using Lookup Table

performancequery-performanceregexsql servert-sql

I have a temporary table #NAMEOLD_NAMENEW which contains a column NAMEOLD and a column NAMENEW where every row contains the old and new value of a value to be updated. This table consists of roughly 2.000 rows. I created a clustered index on NAMEOLD.

The temporary table will be used to update certain fields in the database. I know which columns in which tables need to be updated (roughly 25 columns over 20 tables).

Now, the fields that need to be updated contain the string DOMAIN\ as a prefix and after that a NAMEOLD value of the temporary table.

What I want to do is: use the temporary table to replace the full value of the string DOMAIN\ plus the string after that with the appropriate value in the #NAMEOLD_NAMENEW table.

Example:

Example data in #NAMEOLD_NAMENEW:

NAME_OLD NAMENEW
user1 user5
user2 user6

Example data in TABLE1:

ColumnWithName1
DOMAIN\user1
DOMAIN\user2

Now running the query should update TABLE1 to:

ColumnWithName1
user5
user6

I think that scanning the full table multiple times will not perform well enough (SELECT on certain tables already takes minutes).

What I thought of is typing out all the columns that need to be updated, and then updating them with a regex with a capture group. Something like this (non-working code):

UPDATE [dbo].[Table1]
SET [ColumnWithName1] = (SELECT [NAMENEW] from new_old WHERE [NAMEOLD] = [/1]) -- first CAPTUREGROUP??
FROM #NAMEOLD_NAMENEW new_old
WHERE [ColumnWithName1] like '%DOMAIN\(.*)'

Here the capture group does not work since it does not seem to exist in T-SQL. Furthermore, using the nested SELECT is I believe a bad practice for some reason.

Is it possible to do something like this in T-sql? Or is there perhaps even a better approach?

Best Answer

If the prefix is always DOMAIN\ then you need neither RegEx nor a string splitter. You just need to use SUBSTRING as follows:

UPDATE tbl 
SET    tbl.[ColumnWithName1] = new_old.[NAMENEW] -- to keep the prefix, add "'DOMAIN\' + "
FROM   [dbo].[Table1] tbl
INNER JOIN #NAMEOLD_NAMENEW new_old
        ON new_old.[NAMEOLD] = SUBSTRING(tbl.[ColumnWithName1], 8, 1000);

Just be sure to set the 3rd parameter of the SUBSTRING to be the longest max length of all of the VARCHAR fields being updated (which should already be the length of the NAMENEW field anyway).

Better yet: You could probably improve the efficiency of the operation and reduce the complexity at the same time by approaching this from the opposite direction. Meaning, rather than trying to deal with removing the prefix from the field being updated, simply add in that prefix on the other side. This is similar to what @YperSillyCubeᵀᴹ suggested in a comment on the Question, but given that this is just a temporary table and it is not millions of rows, just prefix all of the values in the NAMEOLD field such that the JOIN does not have any expressions in it at all: it is just a straight equality operation:

-- one-time update
UPDATE tmp
SET    tmp.[NAMEOLD] = 'DOMAIN\' + tmp.[NAMEOLD];


UPDATE tbl 
SET    tbl.[ColumnWithName1] = new_old.[NAMENEW] -- to keep the prefix, add "'DOMAIN\' + "
FROM   [dbo].[Table1] tbl
INNER JOIN #NAMEOLD_NAMENEW new_old
        ON new_old.[NAMEOLD] = tbl.[ColumnWithName1];

Please note that the "pretty-print" formatting for SQL does not handle the \' sequence of characters correctly when formatting a string literal, and does not recognize it as being the end of the string literal, hence the red continues beyond where it should.