I have two tables definitions
and history
with the following schema:
create table definitions (
id UNIQUEIDENTIFIER,
revision INT,
majorVersion INT
);
create table history (
id UNIQUEIDENTIFIER,
revision INT
);
I want to update table history
with another column majorVersion
where the value ranges are defined in table definitions
. For example, if records in definitions
look like:
+--------------------------------------+-----------+---------------+
| id | revision | majorVersion |
+--------------------------------------+-----------+---------------+
| 9f717823-b9ca-4c7b-97f9-7770aaafb468 | 2 | 1 |
| 9f717823-b9ca-4c7b-97f9-7770aaafb468 | 4 | 2 |
+--------------------------------------+-----------+---------------+
Then the updated table history
should look something like this:
+--------------------------------------+-----------+---------------+
| id | revision | majorVersion |
+--------------------------------------+-----------+---------------+
| 9f717823-b9ca-4c7b-97f9-7770aaafb468 | 1 | 1 |
| 9f717823-b9ca-4c7b-97f9-7770aaafb468 | 2 | 1 |
| 9f717823-b9ca-4c7b-97f9-7770aaafb468 | 3 | 2 |
| 9f717823-b9ca-4c7b-97f9-7770aaafb468 | 4 | 2 |
+--------------------------------------+-----------+---------------+
Here is a SQL Fiddle with a more number of rows.
The table definitions
can contain potentially thousands of different id
and multiple majorVersion. Subsequently, table history
can contain can contains close to a million rows.
I want to make the query as fast and optimized as possible. One possible solution is to use something like this:
update history
set majorVersion = (
select top 1 majorVersion
from definitions
where definitions.id = history.id
and definitions.revision >= history.revision
order by definitions.majorVersion
)
where history.majorVersion = 0;
But the problem with this is that we are querying rows for every row in table history
(which can be very large compared to definitions
). Any suggestions on how to improve upon this?
Best Answer
One of two solutions
solution 1
If you are able to split the work in two update statements, first update all the matching revisions
Then use a
CTE
and windowing functions to update themajorVersion
's that are still 0 on the updated tableAn index like this could help
Result
SQL Fiddle
solution 2, still two updates but with a
CROSS APPLY
and a self join. The self join is not ideal.First update all the matching revisions
Then do a self join to update the other majorVersions to the next one that is not 0
SQLFiddle
You would have to add
BEGIN TRANSACTION ... COMMIT TRANSACTION
to the updates if they have to be executed in one batch. The fastest way will depend on additional factors, such as your indexes & data, more than an example can show. YMMV