SQL Server – Constructing a datetime from 2 fields: one integer and a datetime

datetime, sql-server, sql-server-2008

I am looking for the most efficient way to combine 2 values: the first is stored as a datetime field (with hh:mm:ss.fff all being 0) and the second is a time field stored as an int with the format hhmmssmmm. I need to take these 2 fields and convert them to a single datetime value. For example:

date field = '2013-09-04'
time field = 192423998 

Should return:

2013-09-04 19:24:23.997

I've seen solutions using DATEADD and CONVERT, and I came up with my own (shown below), but I wonder if there's a better/more efficient way.

DECLARE @dt DATETIME = '2013-09-04'

-- Pad the int to 9 digits, then insert '.' and ':' separators (right to left)
-- to build 'hh:mm:ss.mmm'; the string is implicitly converted and added to @dt
SELECT  @dt+' '+ STUFF(STUFF(STUFF(RIGHT('0'+CAST(192423998 AS VARCHAR(9)),9)  ,7,0,'.'),5,0,':'),3,0,':')

Note: in case you wonder why the RIGHT('0'+CAST(192423998 AS VARCHAR(9)),9) is there – it's because a time before 10 AM only has one digit for the hour part, so the int is only 8 digits long.
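For illustration, here is how that padding plays out for a made-up morning value:

-- 09:01:02.003 AM arrives as the 8-digit int 90102003; the pad restores
-- the leading zero so the STUFF positions line up
SELECT RIGHT('0'+CAST(90102003 AS VARCHAR(9)),9)  -- '090102003' -> '09:01:02.003'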

Best Answer

So, you're going to be parsing strings and ints and doing all kinds of conversions; I don't think you're really going to save anything by seeking out a "more efficient" way to turn crap data into good data - even on billions of rows. An alternative is to avoid converting anything to a string at all (string conversion is notoriously bad for performance):

DECLARE @dt DATETIME = '20130904', @t INT = 192423998;

SELECT DATEADD(MILLISECOND, @t % 100000 -- ssmmm: seconds + milliseconds, already a count of ms
  + 60000 * (@t / 100000 % 100) -- minutes converted to milliseconds
  + 60000 * 60 * (@t / 10000000), -- hours converted to milliseconds
  @dt);
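Plugging the sample value in, the three terms break down like this (just a sanity check of the arithmetic):

-- @t = 192423998 (hh = 19, mm = 24, ss = 23, mmm = 998)
SELECT 192423998 % 100000,              -- 23998 ms (23 s + 998 ms)
  60000 * (192423998 / 100000 % 100),   -- 1440000 ms (24 minutes)
  60000 * 60 * (192423998 / 10000000);  -- 68400000 ms (19 hours)
-- sum: 69863998 ms past midnight = 19:24:23.998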

Of course, keep in mind that the milliseconds your trading system passes in can round in either direction when using datetime, since datetime is only accurate to 1/300 of a second (.998 isn't possible and becomes .997, for example), so you may want this instead:

SELECT DATEADD(MILLISECOND, @t % 100000 -- ssmmm: seconds + milliseconds, already a count of ms
  + 60000 * (@t / 100000 % 100) -- minutes converted to milliseconds
  + 60000 * 60 * (@t / 10000000), -- hours converted to milliseconds
  CONVERT(DATETIME2(3), @dt));
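To see why the target type matters, compare how the two types store the same literal (datetime rounds to 1/300-second ticks, datetime2(3) doesn't):

SELECT CAST('2013-09-04 19:24:23.998' AS DATETIME),  -- 2013-09-04 19:24:23.997
  CAST('2013-09-04 19:24:23.998' AS DATETIME2(3));   -- 2013-09-04 19:24:23.998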

I ran each of these 100,000 times and compared them to your existing approach. The difference?

Your approach:               41.8 seconds
My approach (no convert):    40.5 seconds
My approach (with convert):  38.9 seconds
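The harness itself isn't shown above, but a simple timing loop along these lines would reproduce the test (the structure and variable names here are my own, and your numbers will differ):

-- Hypothetical timing harness; times 100,000 executions of one variant
DECLARE @start DATETIME2 = SYSDATETIME(),
  @i INT = 0,
  @dt DATETIME = '20130904',
  @t INT = 192423998,
  @result DATETIME2(3);

WHILE @i < 100000
BEGIN
  SELECT @result = DATEADD(MILLISECOND, @t % 100000
    + 60000 * (@t / 100000 % 100)
    + 60000 * 60 * (@t / 10000000),
    CONVERT(DATETIME2(3), @dt));
  SET @i += 1;
END

SELECT DATEDIFF(MILLISECOND, @start, SYSDATETIME()) AS elapsed_ms;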

Now, this should not be extrapolated to mean that on 1 billion rows my method will save you 13,000 or 29,000 seconds (the per-100,000 differences of 1.3 and 2.9 seconds, scaled up 10,000×), as the translation of 100,000 single-statement batches into operating on 100,000 rows in a single statement will not be linear. But it could be worth testing (you have the data necessary to test this; I would just be making stuff up, and my processors probably have different performance characteristics anyway, so anything else I could offer would not necessarily be relevant).
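If you do test it against real rows, a set-based version would look something like this (dbo.Trades, TradeDate, and TradeTime are placeholder names for your actual table):

-- Hypothetical set-based test: apply the expression to every row at once
SELECT DATEADD(MILLISECOND, TradeTime % 100000
  + 60000 * (TradeTime / 100000 % 100)
  + 60000 * 60 * (TradeTime / 10000000),
  CONVERT(DATETIME2(3), TradeDate)) AS TradeDateTime
FROM dbo.Trades;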

I still think you're better off taking the hit, fixing the design, and intercepting the incoming trading data and correcting it. As this system gets bigger, these translations and conversions are going to hurt more, and more, and more...
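For example, if altering the table is an option, a persisted computed column would pay the conversion cost once, at write time, instead of in every query (again, placeholder names):

-- Hypothetical fix-the-design option: materialize the combined value
ALTER TABLE dbo.Trades ADD TradeDateTime AS
  DATEADD(MILLISECOND, TradeTime % 100000
    + 60000 * (TradeTime / 100000 % 100)
    + 60000 * 60 * (TradeTime / 10000000),
    CONVERT(DATETIME2(3), TradeDate)) PERSISTED;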