I am a little stumped with this CTE update stmt:
DECLARE @a TABLE (ID int, Value int);
DECLARE @b TABLE (ID int, Value int);
INSERT @a VALUES (1, 10), (2, 20);
INSERT @b VALUES (1, 100),(2, 200);
WITH cte AS
(
SELECT * FROM @a
)
UPDATE cte
SET Value = b.Value
FROM cte AS a
INNER JOIN @b AS b
ON b.ID = a.ID
SELECT * FROM @a
GO
Why does this result in table @a
having 100 for both rows? I thought it should be 100 for ID 1 and 200 for ID 2.
I get expected results if I use a table instead of a common table expression to do the updates:
DECLARE @a TABLE (ID int, Value int);
DECLARE @b TABLE (ID int, Value int);
INSERT @a VALUES (1, 10), (2, 20);
INSERT @b VALUES (1, 100),(2, 200);
SELECT *
FROM @a
UPDATE @a
SET Value = b.Value
FROM @a AS XX
INNER JOIN @b AS b ON b.ID = xx.ID
SELECT *
FROM @a
This results in table @a
with 100 and 200. But aren't we supposed to get both the same values? based on the previous explanation of referencing issue? — Update being done to @a
table and not the referenced XX.
Best Answer
To expand on MguerraTorres' answer:
(Updated with the info from your secondary query)
In your first query
UPDATE cte
says to update the table from the CTE.FROM cte as a
says to refer to the table from the CTE asa
.So, we've referred to our CTE in two places.
What you may not realize is that a CTE is re-evaluated for each time it appears in your query, just as if you replaced the reference with a subquery. Since you've referenced the CTE two separate times, you've generated two separate resultsets for the DB engine to work with.
When you say to use
b.Value
wherea.ID = b.ID
, we've got two rows - one whereb.Value
is 100, and one where it's 200 - from tableb
and from our second CTE resultset.However, we're updating the first CTE resultset based on these two rows. Therefore, it updates each row in that first resultset from the two rows returned. There's no relationship between the two resultsets, even though they represent the same underlying data. The engine is doing a
CROSS JOIN
between the results of your join, and the first resultset, to perform the update.Your
UPDATE
statement updates both of your rows to to 200, then to 100 (because the engine decides the quickest way to apply the cross-joined rows, they may not go in the order in which they were entered). Both rows are updated to the same value because they're being updated from the same multiple rows.Your first query is functionally identical to:
In your second query, the DB engine knows that both
a
and@a
reference a table outside the query, and it knows thata
and@a
mean the same thing, so it correctly ties the rows from@b
to@a
when performing the update.In the comments, you asked:
Whether it's 100 or 200 can vary.
I would say that it's likely that, given the same statements shown in your first query, executed in the same way, you would almost certainly get the same outcome.
However, in the real world, with tables seeing other activity, you couldn't really on one outcome or the other, especially over time. It would depend on how the DB engine matched the tables in the join, and then processed the rows in applying the update.