SQL Server – How to Update Every Datetime in Bulk

performancesql serversql-server-2008-r2

The current database assumes all times are local to the user. Now we're changing this so the database stores everything in UTC and our application converts the UTC time in the database to the correct timezone for each user.

Now we need to update every datetime in the database to UTC time. Is there a better way than having our application (which has a Timezone library) update each one individually?

Due to daylight savings time (BST), a simple DATEADD isn't an option. There are roughly 10 million cells to update.

Best Answer

Assuming you are dealing with a single time zone at a time (e.g. British Time), and can identify all rows that are in British Time (maybe that's everyone so far), you could create a calendar table that has the DST time change dates for Summer Time. You subtract an hour for all times that fall between, and don't touch the ones that don't. Sample data:

CREATE TABLE dbo.SampleData
(
  LocalDT DATETIME,
  IsBritishTime BIT,
  UTC DATETIME
);

INSERT dbo.SampleData(LocalDT,IsBritishTime) VALUES
 ('20000101 00:00',1), -- should NOT be changed
 ('20000326 00:59',1), -- should NOT be changed
 ('20000326 01:01',1), -- should go back
 ('20001029 01:59',1), -- should go back
 ('20001029 02:01',1), -- should NOT be changed
 ('20001231 23:59',1), -- should NOT be changed
 ('20000401 00:00',0); -- should NOT be changed (different time zone)

Now you can identify the ranges using some rather tricky T-SQL. We grab some rows from system tables based on the years your data needs to cover, then for each year we calculate the beginning and end of BST. We can then use those outputs to update the main table:

SET DATEFIRST 7;

;WITH y AS 
(
  -- all the years from 2000 through 50 years after the current year:
  SELECT TOP (YEAR(GETDATE())-2000+51) 
    y = DATEADD(DAY,-1,DATEADD(YEAR,number,'20000101'))
      FROM [master].dbo.spt_values 
      WHERE [type] = N'P' ORDER BY number
),
s AS
(
  SELECT 
    -- BST starts last Sunday in March @ 1:00 AM UTC:
    BSTStart = DATEADD(HOUR, 1, DATEADD(DAY,(1-DATEPART(
      WEEKDAY,DATEADD(MONTH,3,y))) % 7,DATEADD(MONTH,3,y))),
    -- and ends last Sunday in October @ 2:00 AM UTC:
    BSTEnd = DATEADD(HOUR,2,DATEADD(DAY,(1-DATEPART(
      WEEKDAY,DATEADD(MONTH,10,y))) % 7,DATEADD(MONTH,10,y)))
  FROM y
)
-- UPDATE d SET 
SELECT *,
  UTC = DATEADD(HOUR, 
  CASE WHEN s.BSTStart IS NULL THEN 0 ELSE -1 END, d.LocalDT)
FROM dbo.SampleData AS d
LEFT OUTER JOIN s 
ON d.LocalDT >= s.BSTStart
AND d.LocalDT < s.BSTEnd
WHERE d.IsBritishTime = 1 
-- or more likely WHERE EXISTS 
-- (SELECT 1 FROM dbo.Users 
--    WHERE UserID = d.UserID
--    AND TimeZone = 'BritishTime');

As written, this just shows you the join and identifies the rows that will be updated with a different UTC value (they will have non-null BSTStart/BSTEnd values). If you uncomment -- UPDATE d SET and comment SELECT *,, then select the data from dbo.SampleData, you should see this:

LocalDT          IsBritishTime UTC
---------------- ------------- ----------------
2000-01-01 00:00 1             2000-01-01 00:00
2000-03-26 00:59 1             2000-03-26 00:59
2000-03-26 01:01 1             2000-03-26 00:01 -- changed
2000-10-29 01:59 1             2000-10-29 00:59 -- changed
2000-10-29 02:01 1             2000-10-29 02:01
2000-12-31 23:59 1             2000-12-31 23:59
2000-04-01 00:00 0             NULL

(And actually, the March 26 1:01 local time shouldn't be possible - at 1:00 AM, the clocks rolled forward to 2 AM, so how could it ever have been 1:01 AM locally? So, you may want to special case that hour and deal with those cases individually. This is why every system I design stores UTC from the start - there is no plausible reason to store local time, since I can always get that from UTC, but I can't always get back.)

Of course, this will affect all rows that match the WHERE clause, it's just that some of the rows won't have a different value for UTC. I highly recommend putting this into a different column first, rather than just changing the start time, so that you maintain the original data in case something goes wrong, you need to troubleshoot, you do end up needing that data later, etc. You can change the column names if you want to use the UTC value in existing queries instead of changing those one-by-one.

This will work best if you have supporting indexes on the underlying table. If you are performing this update against multiple time zones (and have a column that can identify rows by local time zone), then creating a filtered index for each operation might be something worth trying. If you are identifying by some other indirection (such as UserID), maybe you could do a user at a time, or the set of users from each time zone. Chunking will work best in terms of blocking and log impact, but having no other details about the structure and how you could partition the data, that's the best I can do.

I provided British Time above, since it was the example in the question, but you'll obviously need different calculations for different timezones that have different offsets and different start/end dates (and those may differ from year to year, for example in 2007 the US dates changed).

This is based on the fact that the UK (well, London specifically) moves forward to 2 AM BST on the last Sunday of March at 1 AM local (or UTC), and back to 1 AM GMT the last Sunday of October at 2 AM local (1 AM UTC). Corroborate here; link courtesy @JackDouglas.

Related Solutions

Sql-server – Update Table with Network Client Connection Status, I need help with DB design. SQL 2008 Ex R2

On the Devices table, add two fields: LastStatus and LastDateTimeChecked. Update it whenever you insert records into the Connections table. The Connections is your history, but you shouldn't be hitting it for live reporting queries.

Sql-server – User can alter procedures without permission

Do you see anything that has been granted or role membership that shouldn't exist?

EXECUTE AS USER = N'your user';
GO
SELECT * FROM sys.fn_my_permissions(NULL, NULL);
GO
REVERT;
GO

SELECT * FROM sys.database_permissions
WHERE grantee_principal_id = USER_ID(N'your user');
GO

SELECT r.name
FROM sys.database_role_members AS m
INNER JOIN sys.database_principals AS p
ON m.member_principal_id = p.principal_id
INNER JOIN sys.database_principals AS r
ON m.role_principal_id = r.principal_id
WHERE p.name = N'your user';

SELECT r.name
FROM sys.server_role_members AS m
INNER JOIN sys.server_principals AS p
ON m.member_principal_id = p.principal_id
INNER JOIN sys.server_principals AS r
ON m.role_principal_id = r.principal_id
WHERE p.name = N'your login';

It is possible these developers are in the db_owner role without your knowledge or have been granted ALL on the dbo or another schema.

Make sure you execute these statements in the right database, and also validate that your users are logging in as who they say they are logging in as. Perhaps they have the sa password and you don't know it. Finally, you should make sure that the database principal maps to the correct server-level login.

SELECT dp.name, sp.name
FROM sys.database_principals AS dp
FULL OUTER JOIN sys.server_principals AS sp
ON dp.sid = sp.sid
WHERE 
(
  dp.type <> 'R' 
  OR sp.type NOT IN ('C', 'K')
)
ORDER BY 
  dp.name COLLATE DATABASE_DEFAULT + sp.name COLLATE DATABASE_DEFAULT DESC;

Best Answer

Related Solutions

Sql-server – Update Table with Network Client Connection Status, I need help with DB design. SQL 2008 Ex R2

Sql-server – User can alter procedures without permission

Related Question