I currently use the following statement, for 10,000 rows it takes about 150 seconds. I tried removing the index on the target table, but this didn't help. Running the loop without the INSERT INTO
takes less then 50ms. I need it to update about 300 million rows and I can't really wait 52 days (!) for it to complete.
Bottom line of the following update query is that I need to go over each row, perform calculations on a VARBINARY
and extract proper values from it (we need to get rid of the packed VARBINARY
fields), and store that in a new table.
FETCH NEXT FROM LocCities INTO @LocCity
WHILE (@@FETCH_STATUS = 0)
BEGIN
-- several sets, removed calculations for clarity
SET @LocationId = Calculation1()
SET @CityId = Calculation2()
IF(@LocCity <> 0)
BEGIN
-- left out an inner loop here on the VARBINARY based on its length
INSERT INTO LocationCities (LocationId, CityId)
VALUES (@LocationId, @CityId)
END
FETCH NEXT FROM RespCursor INTO @TuningRow
END
I understand that I can use the WITH
keyword with table hints, but I am not sure what to use. I expect the final update query to run in several hours, and hope there's a way to do that. I really can't wait almost two months ;).
Isn't there something similar like BULKINSERT
that I can use?
Best Answer
I really don't think table hints or BULKINSERT are going to help you here - your approach is still to process each varbinary value one at a time, and this will be your downfall regardless - especially when you discard the idea of set-based queries because you "don't think it's possible."
Here's a set-based approach with no awful loops or cursors. This assumes that the pattern is always the same (LocationID is the first byte, and CityID is the next two).
Results:
Some articles that will help you understand numbers tables and why generating sets in SQL Server is far superior to even the most efficient loop you can derive.