How can I modify my SELECT DISTINCT
query to an UPDATE DISTINCT
query?
It is important that it only updates the distinct records, as there are multiple records associated with each [Finance_Project_Number] (due to CRUD operations). I only want to update a single record, as this will simply set in motion a different process of validating the data etc.
If there are multiple records collapsed into a single row as a result of the DISTINCT, any of these can be updated – it does not matter.
When I run my select query, I get a result of: 6 982:
SELECT DISTINCT
[Finance_Project_Number]
FROM [InterfaceInfor].[dbo].[ProjectMaster]
WHERE
NOT EXISTS
(
SELECT *
FROM [IMS].[dbo].[THEOPTION]
WHERE
[InterfaceInfor].[dbo].[ProjectMaster].[Finance_Project_Number] =
[IMS].[dbo].[THEOPTION].[NAME]
);
Here is my attempt at converting my query to a DISTINCT UPDATE
query, but this updates 15 353 records:
UPDATE [InterfaceInfor].[dbo].[ProjectMaster]
SET
[Processing_Result_Text] = 'UNIQUE',
[Processing_Result] = 0
WHERE
NOT EXISTS
(
SELECT *
FROM [IMS].[dbo].[THEOPTION]
WHERE
[InterfaceInfor].[dbo].[ProjectMaster].[Finance_Project_Number] =
[IMS].[dbo].[THEOPTION].[NAME]
);
Best Answer
To just update an arbitrary one from each distinct group you could use
If you decide there is some criteria by which the row to be updated should be selected after all simply change the
ORDER BY (SELECT 0)
accordingly so that the desired target row is ordered first - e.g.ORDER BY DateInserted desc
would update the latest one as ordered by a column called DateInserted if such a column exists.This uses a common table expression (CTE) because it is not permitted to reference ranking functions such as
ROW_NUMBER
directly in theWHERE
clause. It is allowed to update data via common table expressions in the same circumstances as for updatable views (basically the data being updated must be able to be mapped back straight forwardly to specific items in a single base table).If you are not yet familiar with ranking functions you may well find
SELECT
-ing from the CTE first to be beneficial.Example results are below
The
C47474
rows are filtered out as they do exist in the other table so don't meet theNOT EXISTS
, the remaining rows are grouped byFinance_Project_Number
and assigned a sequential number within each group.In this case the yellow rows would meet the
RN = 1
condition and be updated. However there is no guarantee exactly how these numbers will be assigned within each group unless you use aORDER BY
clause on an expression guaranteed to be unique. Without this it could potentially change even between successive executions of the same statement.