Sql-server – Update a column, selected by query

sql serverupdate

Just stumbled upon this stackoverflow question: https://stackoverflow.com/questions/13648898/sql-update-with-row-number. The second answer struck me as odd:

UPDATE x
SET x.CODE_DEST = x.New_CODE_DEST
FROM (
      SELECT CODE_DEST, ROW_NUMBER() OVER (ORDER BY [RS_NOM]) AS New_CODE_DEST
      FROM DESTINATAIRE_TEMP
      ) x

Here, a column CODE_DEST is updated with the value of ROW_NUMBER(). However, x.CODE_DEST itself is the result of a select query. So, I'm updating the result of a select query. Is this common practice?

Best Answer

It's not common but it can work. Under certain circumstances (and I'm not sure of the rules) you can update a column from a CTE or subquery. My guess is that it's very similar to being able to update a view and probably has the same rules.

The times I've done it it's quite a bit faster than joining back to the original table and updating it.

Related Solutions

Sql-server – Can someone explain why select with nolock will query a potion of updated data

It is not quite true that NOLOCK means placing no locks at all. Queries under this hint will still take Sch-S locks and (possibly HOBT locks).

Under read committed isolation level SQL Server will (usually) take row level S locks and release them as soon as the data is read. These are incompatible with the X locks held on uncommited updates and thus prevent dirty reads.

In the example in the linked answer the SELECT query is not blocked when it encounters a modified row so reading partial updates is quite likely.

It can also happen at default read committed isolation level too though that a SELECT reads some rows with the "before" value and others with the "after" value. It is just needed to engineer a situation where

Select query reads value of row R1 and releases its S lock
Update query updates R2 and takes an X lock
Select query tries to read R2 and is blocked.
Update query updates R1 and takes an X lock.
Update transaction commits thus releasing its locks and allowing the Select to read R2

This type of situation might arise for example if the SELECT and UPDATE are using different indexes to locate the rows of interest.

Example

CREATE TABLE T
(
X INT IDENTITY PRIMARY KEY,
Y AS -X UNIQUE,
Name varchar(10),
Filler char(4000) DEFAULT 'X'
)


INSERT INTO T (Name)
SELECT TOP 2500 'A'
FROM master..spt_values

Now in one query window run

DECLARE @Sum int

SELECT 'SET @@ROWCOUNT' WHERE 1=0

WHILE (@@ROWCOUNT = 0)
SELECT @Sum = SUM(LEN(Name))
FROM T 
WHERE Y IN (-1, -2500)
HAVING SUM(LEN(Name)) = 3

This will run in an infinite loop. In another run

UPDATE T 
SET Name=CASE WHEN Name = 'A' THEN 'AA' ELSE 'A' END

This will likely stop the loop in the other query (try again if not) meaning that it must have read either A,AA or AA,A

MySQL Update a table based on the last data from another table while grouping with conditions

The correct query is:

UPDATE
    lp_plates_backup AS t
    INNER JOIN  (
        SELECT
            plate_uid, brand, model, date_validated
        FROM
            lp_pictures_backup as parent
        WHERE
            brand <> '' AND
            date_validated = (
                 SELECT MAX(date_validated)
                 FROM lp_pictures_backup as t2
                 WHERE t2.plate_uid = parent.plate_uid
                 GROUP BY
                     plate_uid)
) AS m ON
    m.plate_uid = t.uid
SET
    t.brand = m.brand,
    t.model = m.model
WHERE
    t.brand <> m.brand
    OR
    t.model <> m.model;

Just some little explanations. You need INNER JOIN because you must update row of lp_plates_backup only if plate_uid exists into lp_plates_backup. ORDER BY is useless because you are selecting all rows, order is not important.

You need the max date_validated of rows grouped by plate_uid, select MAX(date_validate) must individuate only a row for plate_uid through the date_validated field. So you need to add group by to select max(..).

The query select plate_uid returns a row for plate_uid so you don't need to aggregate here. The condition of single row is already builded into select max subquery.

I hope you did not get confused by me :-)

Updated

The previous works well if date_validate is the pair (plate_uid, date_validated) is unique. If you have this kind of data:

| plate_uid | brand    | model    | date_validated      |
|         1 | Fiat     | Panda    | 2014-10-11 10:03:20 | 
|         1 | BMW      | 7-Series | 2014-10-11 10:03:20 |   <- changed data
|         1 | BMW      | 7-Series | 2014-07-28 19:14:02 |
|         1 | Mercedes | S-Class  | 2014-06-12 08:54:57 |   
|         a | Tesla    | Model S  | 2014-12-17 11:00:00 | 
|         a | BMW      | 3-Series | 2014-11-07 14:34:11 |

The following query returns the first two rows for plate_uid 1.

SELECT plate_uid, brand, model, date_validated
FROM lp_pictures_backup as parent
WHERE
      brand <> '' AND
      date_validated = (
                 SELECT MAX(date_validated)
                 FROM lp_pictures_backup as t2
                 WHERE t2.plate_uid = parent.plate_uid
                 GROUP BY plate_uid)

The update will choice the values from the first or the second row. I think of the following alternatives:

using some other lp_pictures_backup fields to choise between rows with the same date_validated.
enforcing a unique constraint on table. I.e. alter table lp_pictures_backup add unique index (plate_id, date_validated). Rejecting invalid data.
detecting valid lp_pictures_backup pair of (plate_uid, date_validated). Updating lp_plates_backup only with valid pairs, review invalid pairs and correct them.

Maybe there are more alternatives. I prefer enforcing contraint on data so to have better data. I expand 3th alternative. Just create a view to define what a valid pair (plate_uid, date_validated) is:

CREATE VIEW lp_pictures_backup_valid as
SELECT plate_uid, date_validated
FROM lp_pictures_backup as parent
WHERE date_validated = (
        select max(date_validated) 
        from lp_pictures_backup t2 
        where t2.plate_uid =  parent.plate_uid GROUP BY plate_uid
      )
group by plate_uid, date_validated
having count(*) = 1;                    <-- you can change this to make an invalid row

A valid pair (plate_uid, date_validated) is a pair with the max date_validated only if there are a unique date_validated value.

I rewrite the update statements to consider only valid pair:

UPDATE
    lp_plates_backup AS t
    INNER JOIN  (
        -- modification start
        SELECT p.plate_uid, p.brand, p.model, p.date_validated
        FROM lp_pictures_backup as p
             INNER JOIN lp_pictures_backup_valid valid 
                        ON p.plate_uid = valid.plate_uid and 
                           p.date_validated = valid.date_validated
        WHERE p.brand <> '')
        -- modification end
    ) 
    AS m ON m.plate_uid = t.uid
SET
    t.brand = m.brand,
    t.model = m.model
WHERE
    t.brand <> m.brand
    OR
    t.model <> m.model;

Hope this make sense.

Update: 2014-03-20

In the first case:

Blockquote 1. using some other lp_pictures_backup fields to choise between rows with the same date_validated.

I have assumed your date is like this:

|id| plate_uid | brand    | model    | date_validated      |
|4 |         1 | Fiat     | Panda    | 2014-10-11 10:03:20 | 
|3 |         1 | BMW      | 7-Series | 2014-10-11 10:03:20 |   
|2 |         1 | BMW      | 7-Series | 2014-07-28 19:14:02 |
|1 |         1 | Mercedes | S-Class  | 2014-06-12 08:54:57 |    
|2 |         a | Tesla    | Model S  | 2014-12-17 11:00:00 | 
|1 |         a | BMW      | 3-Series | 2014-11-07 14:34:11 |

You can try this:

UPDATE
    lp_plates_backup AS t
    INNER JOIN  (
        SELECT t1.plate_uid, t1.brand, t1.model, t1.date_validated
        FROM lp_pictures_backup as t1,
          (SELECT t2.plate_uid, MAX(id) as id,  MAX(date_validated) as dv
           FROM lp_pictures_backup as t2
           GROUP BY t2.plate_uid) as t3                 
        WHERE t1.brand <> '' AND
              t1.plate_uid = t3.plate_uid AND
              t1.date_validated = t3.dv AND
              t1.id = t3.id
) AS m ON
    m.plate_uid = t.uid
SET
    t.brand = m.brand,
    t.model = m.model
WHERE
    t.brand <> m.brand
    OR
    t.model <> m.model;

The fields used to choise the row to be updated are extracted by this part:

 ...
 (SELECT t2.plate_uid, MAX(id) as id,  MAX(date_validated) as dv
  FROM lp_pictures_backup as t2
  GROUP BY t2.plate_uid) as t3 
 ...

So it assumes a correlation between date_validated and id: at increasing dates corresponds to increasing id.

Hope it help.

Best Answer

Related Solutions

Sql-server – Can someone explain why select with nolock will query a potion of updated data

MySQL Update a table based on the last data from another table while grouping with conditions

Related Question