The correct query is:
UPDATE
lp_plates_backup AS t
INNER JOIN (
SELECT
plate_uid, brand, model, date_validated
FROM
lp_pictures_backup as parent
WHERE
brand <> '' AND
date_validated = (
SELECT MAX(date_validated)
FROM lp_pictures_backup as t2
WHERE t2.plate_uid = parent.plate_uid
GROUP BY
plate_uid)
) AS m ON
m.plate_uid = t.uid
SET
t.brand = m.brand,
t.model = m.model
WHERE
t.brand <> m.brand
OR
t.model <> m.model;
Just some little explanations. You need INNER JOIN
because you must update row of lp_plates_backup
only if plate_uid
exists into lp_plates_backup
. ORDER BY
is useless because you are selecting all rows, order is not important.
You need the max date_validated
of rows grouped by plate_uid
, select MAX(date_validate) must individuate only a row for plate_uid
through the date_validated
field. So you need to add group by to select max(..).
The query select plate_uid
returns a row for plate_uid
so you don't need to aggregate here. The condition of single row is already builded into select max
subquery.
I hope you did not get confused by me :-)
Updated
The previous works well if date_validate
is the pair (plate_uid
, date_validated
) is unique.
If you have this kind of data:
| plate_uid | brand | model | date_validated |
| 1 | Fiat | Panda | 2014-10-11 10:03:20 |
| 1 | BMW | 7-Series | 2014-10-11 10:03:20 | <- changed data
| 1 | BMW | 7-Series | 2014-07-28 19:14:02 |
| 1 | Mercedes | S-Class | 2014-06-12 08:54:57 |
| a | Tesla | Model S | 2014-12-17 11:00:00 |
| a | BMW | 3-Series | 2014-11-07 14:34:11 |
The following query returns the first two rows for plate_uid
1.
SELECT plate_uid, brand, model, date_validated
FROM lp_pictures_backup as parent
WHERE
brand <> '' AND
date_validated = (
SELECT MAX(date_validated)
FROM lp_pictures_backup as t2
WHERE t2.plate_uid = parent.plate_uid
GROUP BY plate_uid)
The update will choice the values from the first or the second row. I think of the following alternatives:
- using some other
lp_pictures_backup
fields to choise between rows with the same date_validated.
- enforcing a unique constraint on table. I.e.
alter table lp_pictures_backup add unique index (plate_id, date_validated)
. Rejecting invalid data.
- detecting valid
lp_pictures_backup
pair of (plate_uid, date_validated)
. Updating lp_plates_backup
only with valid pairs, review invalid pairs and correct them.
Maybe there are more alternatives. I prefer enforcing contraint on data so to have better data. I expand 3th alternative. Just create a view to define what a valid pair (plate_uid, date_validated)
is:
CREATE VIEW lp_pictures_backup_valid as
SELECT plate_uid, date_validated
FROM lp_pictures_backup as parent
WHERE date_validated = (
select max(date_validated)
from lp_pictures_backup t2
where t2.plate_uid = parent.plate_uid GROUP BY plate_uid
)
group by plate_uid, date_validated
having count(*) = 1; <-- you can change this to make an invalid row
A valid pair (plate_uid, date_validated)
is a pair with the max date_validated only if there are a unique date_validated value.
I rewrite the update statements to consider only valid pair:
UPDATE
lp_plates_backup AS t
INNER JOIN (
-- modification start
SELECT p.plate_uid, p.brand, p.model, p.date_validated
FROM lp_pictures_backup as p
INNER JOIN lp_pictures_backup_valid valid
ON p.plate_uid = valid.plate_uid and
p.date_validated = valid.date_validated
WHERE p.brand <> '')
-- modification end
)
AS m ON m.plate_uid = t.uid
SET
t.brand = m.brand,
t.model = m.model
WHERE
t.brand <> m.brand
OR
t.model <> m.model;
Hope this make sense.
Update: 2014-03-20
In the first case:
Blockquote
1. using some other lp_pictures_backup
fields to choise between rows with the same date_validated.
I have assumed your date is like this:
|id| plate_uid | brand | model | date_validated |
|4 | 1 | Fiat | Panda | 2014-10-11 10:03:20 |
|3 | 1 | BMW | 7-Series | 2014-10-11 10:03:20 |
|2 | 1 | BMW | 7-Series | 2014-07-28 19:14:02 |
|1 | 1 | Mercedes | S-Class | 2014-06-12 08:54:57 |
|2 | a | Tesla | Model S | 2014-12-17 11:00:00 |
|1 | a | BMW | 3-Series | 2014-11-07 14:34:11 |
You can try this:
UPDATE
lp_plates_backup AS t
INNER JOIN (
SELECT t1.plate_uid, t1.brand, t1.model, t1.date_validated
FROM lp_pictures_backup as t1,
(SELECT t2.plate_uid, MAX(id) as id, MAX(date_validated) as dv
FROM lp_pictures_backup as t2
GROUP BY t2.plate_uid) as t3
WHERE t1.brand <> '' AND
t1.plate_uid = t3.plate_uid AND
t1.date_validated = t3.dv AND
t1.id = t3.id
) AS m ON
m.plate_uid = t.uid
SET
t.brand = m.brand,
t.model = m.model
WHERE
t.brand <> m.brand
OR
t.model <> m.model;
The fields used to choise the row to be updated are extracted by this part:
...
(SELECT t2.plate_uid, MAX(id) as id, MAX(date_validated) as dv
FROM lp_pictures_backup as t2
GROUP BY t2.plate_uid) as t3
...
So it assumes a correlation between date_validated and id: at increasing dates corresponds to increasing id.
Hope it help.
So it turns out that confluence uses a different structure to manage this data and now the discrepancies appear to be resolved.
Here is the query:
select
User_Name,
First_Name,
Last_Name,
max(Last_Login) as Last_Login
FROM
(
SELECT
u.user_name as User_Name,
u.first_name as First_Name,
u.last_name Last_Name,
ifnull(from_unixtime((cast(a.attribute_value AS UNSIGNED)/1000), '%Y-%m-%d'), '1970-01-01') AS Last_Login
FROM jira.cwd_user u
LEFT JOIN jira.cwd_user_attributes a ON u.id = a.user_id AND a.attribute_name = 'login.lastLoginMillis'
JOIN jira.cwd_directory d ON u.directory_id = d.id
where u.active = 1
UNION
SELECT
u.user_name,
u.first_name as First_Name,
u.last_name as Last_Name,
date_format(l.successdate, '%Y-%m-%d') as Last_Login
FROM confluence.logininfo l
JOIN confluence.user_mapping m ON m.user_key = l.username
JOIN confluence.cwd_user u ON m.username = u.user_name
JOIN confluence.cwd_directory d ON u.directory_id = d.id and u.active = 'T'
UNION
select
u.user_name as User_Name,
u.first_name as First_Name,
u.last_name Last_Name,
ifnull(from_unixtime((cast(sa.attribute_value AS UNSIGNED)/1000), '%Y-%m-%d'), '1970-01-01') AS Last_Login
FROM stash.cwd_user u
LEFT JOIN stash.cwd_user_attribute sa ON u.id = sa.user_id AND sa.attribute_name = 'login.lastLoginMillis'
JOIN stash.cwd_directory d ON u.directory_id = d.id
where u.is_active = 'T'
) as z
group by User_Name, First_Name, Last_Name
having (max(Last_Login)) <= date_sub(now(), interval 90 day)
order by User_Name asc, Last_Login desc
Best Answer
Yours appears to be a "greatest N per group" problem. What you can do is get the maximum seats per engine results:
and, using them as a derived table, join them back to the source to get the rows matching the maximums: