We are in the process of moving data from a legacy table (all varchar fields) into a strongly typed counterpart cue cheering
As part of this effort, we are taking data from the base Entity table and dumping it into Entity_New if all the data can be correctly converted to the appropriate type. Otherwise, it goes into a copy of the existing table named Entity_Bad.
We have a rules engine that validated the data and types as it when in so in theory, the data should be clean even though it's stored in character fields. The reality is my posting here because some thing's off and I'll can't find it. The CompletionDate field in Entity is varchar(46) NULL
Environment is
productversion productlevel edition
10.0.4064.0 SP2 Enterprise Edition (64-bit)
My scripts demonstrating what I was doing and what is but isn't working
SET NOCOUNT ON
DECLARE
@startid int = 0
, @stopid int = 796833
-------------------------------------------------------------------------------
-- Check doesn't find anything wrong with CompletionDate
-------------------------------------------------------------------------------
SELECT
1
FROM
[dbo].[Entity] E
INNER JOIN
dbo.EntityBatch_New PB
ON E.FiscalYear = PB.FiscalYear
AND E.HashCode = PB.HashCode
AND E.AAKey = PB.AAKey
AND PB.ProcessResultCode IN ('A','W','M')
WHERE
PB.EntityBatchId BETWEEN @StartId AND @StopId
AND
(
-- check
(isDate(E.[CompletionDate]) = 0 AND E.[CompletionDate] IS NOT NULL)
AND (isDate(E.[CompletionDate]) = 1 AND CAST(E.[CompletionDate] AS datetime) BETWEEN '1753-01-01T00:00:00.000' AND '9999-12-31T23:59:59.997')
)
-------------------------------------------------------------------------------
-- Only row that shows as non-date is the NULL one, which is expected
-------------------------------------------------------------------------------
SELECT DISTINCT
(E.[CompletionDate] )
, isDate(E.[CompletionDate])
FROM
[dbo].[Entity] E
INNER JOIN
dbo.EntityBatch_New PB
ON E.FiscalYear = PB.FiscalYear
AND E.HashCode = PB.HashCode
AND PB.ProcessResultCode IN ('A','W','M')
-- Ensure we aren't pulling something we have already processed
LEFT OUTER JOIN
[dbo].[Entity_new] N
ON N.HashCode = E.HashCode
AND N.FiscalYear = E.FiscalYear
AND E.AAKey = N.AAKey
-- Ensure we aren't pulling something we have already processed (or was bad)
LEFT OUTER JOIN
[dbo].[Entity_bad] BAD
ON BAD.HashCode = E.HashCode
AND BAD.FiscalYear = E.FiscalYear
AND E.AAKey = BAD.AAKey
WHERE
PB.EntityBatchId BETWEEN @StartId AND @StopId
AND N.FiscalYear IS NULL
AND BAD.FiscalYear IS NULL
ORDER BY 2
-------------------------------------------------------------------------------
-- Make the cast and it blows with
-- The conversion of a varchar data type to a datetime data type resulted in an out-of-range value.
-------------------------------------------------------------------------------
SELECT DISTINCT
(E.[CompletionDate] )
, CAST(E.[CompletionDate] AS datetime) AS [CompletionDate]
FROM
[dbo].[Entity] E
INNER JOIN
dbo.EntityBatch_New PB
ON E.FiscalYear = PB.FiscalYear
AND E.HashCode = PB.HashCode
AND PB.ProcessResultCode IN ('A','W','M')
-- Ensure we aren't pulling something we have already processed
LEFT OUTER JOIN
[dbo].[Entity_new] N
ON N.HashCode = E.HashCode
AND N.FiscalYear = E.FiscalYear
AND E.AAKey = N.AAKey
-- Ensure we aren't pulling something we have already processed (or was bad)
LEFT OUTER JOIN
[dbo].[Entity_bad] BAD
ON BAD.HashCode = E.HashCode
AND BAD.FiscalYear = E.FiscalYear
AND E.AAKey = BAD.AAKey
WHERE
PB.EntityBatchId BETWEEN @StartId AND @StopId
AND N.FiscalYear IS NULL
AND BAD.FiscalYear IS NULL
-------------------------------------------------------------------------------
-- Dump the values into a temporary table to slice and dice the values
-------------------------------------------------------------------------------
DECLARE @debug TABLE
(
CompletionDate varchar(46) NULL
)
INSERT INTO
@debug
SELECT DISTINCT
(E.[CompletionDate] )
FROM
[dbo].[Entity] E
INNER JOIN
dbo.EntityBatch_New PB
ON E.FiscalYear = PB.FiscalYear
AND E.HashCode = PB.HashCode
AND PB.ProcessResultCode IN ('A','W','M')
-- Ensure we aren't pulling something we have already processed
LEFT OUTER JOIN
[dbo].[Entity_new] N
ON N.HashCode = E.HashCode
AND N.FiscalYear = E.FiscalYear
AND E.AAKey = N.AAKey
-- Ensure we aren't pulling something we have already processed (or was bad)
LEFT OUTER JOIN
[dbo].[Entity_bad] BAD
ON BAD.HashCode = E.HashCode
AND BAD.FiscalYear = E.FiscalYear
AND E.AAKey = BAD.AAKey
WHERE
PB.EntityBatchId BETWEEN @StartId AND @StopId
AND N.FiscalYear IS NULL
AND BAD.FiscalYear IS NULL
-------------------------------------------------------------------------------
-- This is operating on all the same values as the failing query but magically works
-------------------------------------------------------------------------------
SELECT ALL
CAST(E.[CompletionDate] AS datetime) AS [CompletionDate]
FROM
@debug E
-------------------------------------------------------------------------------
-- Clearly, something is amiss when we extract the data so process each row
-- and find the culprit that way. Except this finds nothing wrong
-------------------------------------------------------------------------------
DECLARE @hash uniqueidentifier
, @zee_date varchar(46)
, @real_date datetime
DECLARE
CSR CURSOR READ_ONLY
FOR
SELECT
E.HashCode
, E.[CompletionDate]
FROM
[dbo].[Entity] E
INNER JOIN
dbo.EntityBatch_New PB
ON E.FiscalYear = PB.FiscalYear
AND E.HashCode = PB.HashCode
AND PB.ProcessResultCode IN ('A','W','M')
-- Ensure we aren't pulling something we have already processed
LEFT OUTER JOIN
[dbo].[Entity_new] N
ON N.HashCode = E.HashCode
AND N.FiscalYear = E.FiscalYear
AND E.AAKey = N.AAKey
-- Ensure we aren't pulling something we have already processed (or was bad)
LEFT OUTER JOIN
[dbo].[Entity_bad] BAD
ON BAD.HashCode = E.HashCode
AND BAD.FiscalYear = E.FiscalYear
AND E.AAKey = BAD.AAKey
WHERE
PB.EntityBatchId BETWEEN @StartId AND @StopId
AND N.FiscalYear IS NULL
AND BAD.FiscalYear IS NULL
OPEN CSR
FETCH NEXT FROM CSR INTO
@hash, @zee_date
WHILE (@@fetch_status = 0)
BEGIN
BEGIN TRY
SELECT @real_date = cast(@zee_date AS datetime)
END TRY
BEGIN CATCH
print 'In here'
print @hash
print @zee_date
SELECT @hash, @zee_date
END CATCH
FETCH NEXT FROM CSR INTO
@hash, @zee_date
END
CLOSE csr
DEALLOCATE csr
After critting you with with the above wall of code, here are the 406 unique values the above queries are operating on.
DECLARE @REAL_DATES TABLE
(
CompletionDate varchar(46) NULL
)
INSERT INTO
@REAL_DATES
SELECT
NULL
UNION ALL SELECT '19000101'
UNION ALL SELECT '20100208'
UNION ALL SELECT '20100228'
UNION ALL SELECT '20100309'
UNION ALL SELECT '20100314'
UNION ALL SELECT '20100401'
UNION ALL SELECT '20100409'
UNION ALL SELECT '20100420'
UNION ALL SELECT '20100427'
UNION ALL SELECT '20100429'
UNION ALL SELECT '20100507'
UNION ALL SELECT '20100615'
UNION ALL SELECT '20100617'
UNION ALL SELECT '20100629'
UNION ALL SELECT '20100701'
UNION ALL SELECT '20100703'
UNION ALL SELECT '20100704'
UNION ALL SELECT '20100706'
UNION ALL SELECT '20100709'
UNION ALL SELECT '20100713'
UNION ALL SELECT '20100714'
UNION ALL SELECT '20100715'
UNION ALL SELECT '20100716'
UNION ALL SELECT '20100720'
UNION ALL SELECT '20100721'
UNION ALL SELECT '20100726'
UNION ALL SELECT '20100727'
UNION ALL SELECT '20100728'
UNION ALL SELECT '20100729'
UNION ALL SELECT '20100731'
UNION ALL SELECT '20100801'
UNION ALL SELECT '20100802'
UNION ALL SELECT '20100803'
UNION ALL SELECT '20100804'
UNION ALL SELECT '20100807'
UNION ALL SELECT '20100809'
UNION ALL SELECT '20100810'
UNION ALL SELECT '20100811'
UNION ALL SELECT '20100813'
UNION ALL SELECT '20100817'
UNION ALL SELECT '20100819'
UNION ALL SELECT '20100820'
UNION ALL SELECT '20100822'
UNION ALL SELECT '20100823'
UNION ALL SELECT '20100825'
UNION ALL SELECT '20100827'
UNION ALL SELECT '20100828'
UNION ALL SELECT '20100830'
UNION ALL SELECT '20100831'
UNION ALL SELECT '20100901'
UNION ALL SELECT '20100902'
UNION ALL SELECT '20100904'
UNION ALL SELECT '20100907'
UNION ALL SELECT '20100908'
UNION ALL SELECT '20100909'
UNION ALL SELECT '20100910'
UNION ALL SELECT '20100911'
UNION ALL SELECT '20100913'
UNION ALL SELECT '20100916'
UNION ALL SELECT '20100919'
UNION ALL SELECT '20100920'
UNION ALL SELECT '20100922'
UNION ALL SELECT '20100923'
UNION ALL SELECT '20100925'
UNION ALL SELECT '20100928'
UNION ALL SELECT '20101002'
UNION ALL SELECT '20101004'
UNION ALL SELECT '20101007'
UNION ALL SELECT '20101009'
UNION ALL SELECT '20101010'
UNION ALL SELECT '20101013'
UNION ALL SELECT '20101016'
UNION ALL SELECT '20101018'
UNION ALL SELECT '20101019'
UNION ALL SELECT '20101020'
UNION ALL SELECT '20101022'
UNION ALL SELECT '20101023'
UNION ALL SELECT '20101025'
UNION ALL SELECT '20101028'
UNION ALL SELECT '20101030'
UNION ALL SELECT '20101102'
UNION ALL SELECT '20101107'
UNION ALL SELECT '20101108'
UNION ALL SELECT '20101109'
UNION ALL SELECT '20101111'
UNION ALL SELECT '20101112'
UNION ALL SELECT '20101114'
UNION ALL SELECT '20101117'
UNION ALL SELECT '20101119'
UNION ALL SELECT '20101124'
UNION ALL SELECT '20101126'
UNION ALL SELECT '20101127'
UNION ALL SELECT '20101129'
UNION ALL SELECT '20101201'
UNION ALL SELECT '20101203'
UNION ALL SELECT '20101204'
UNION ALL SELECT '20101206'
UNION ALL SELECT '20101209'
UNION ALL SELECT '20101210'
UNION ALL SELECT '20101213'
UNION ALL SELECT '20101214'
UNION ALL SELECT '20101215'
UNION ALL SELECT '20101216'
UNION ALL SELECT '20101220'
UNION ALL SELECT '20101222'
UNION ALL SELECT '20101227'
UNION ALL SELECT '20101228'
UNION ALL SELECT '20101229'
UNION ALL SELECT '20101230'
UNION ALL SELECT '20110102'
UNION ALL SELECT '20110103'
UNION ALL SELECT '20110104'
UNION ALL SELECT '20110106'
UNION ALL SELECT '20110107'
UNION ALL SELECT '20110108'
UNION ALL SELECT '20110109'
UNION ALL SELECT '20110110'
UNION ALL SELECT '20110111'
UNION ALL SELECT '20110114'
UNION ALL SELECT '20110116'
UNION ALL SELECT '20110118'
UNION ALL SELECT '20110119'
UNION ALL SELECT '20110120'
UNION ALL SELECT '20110122'
UNION ALL SELECT '20110123'
UNION ALL SELECT '20110125'
UNION ALL SELECT '20110126'
UNION ALL SELECT '20110128'
UNION ALL SELECT '20110130'
UNION ALL SELECT '20110203'
UNION ALL SELECT '20110205'
UNION ALL SELECT '20110206'
UNION ALL SELECT '20110208'
UNION ALL SELECT '20110210'
UNION ALL SELECT '20110212'
UNION ALL SELECT '20110213'
UNION ALL SELECT '20110215'
UNION ALL SELECT '20110218'
UNION ALL SELECT '20110221'
UNION ALL SELECT '20110224'
UNION ALL SELECT '20110226'
UNION ALL SELECT '20110301'
UNION ALL SELECT '20110302'
UNION ALL SELECT '20110304'
UNION ALL SELECT '20110307'
UNION ALL SELECT '20110309'
UNION ALL SELECT '20110311'
UNION ALL SELECT '20110314'
UNION ALL SELECT '20110316'
UNION ALL SELECT '20110317'
UNION ALL SELECT '20110320'
UNION ALL SELECT '20110321'
UNION ALL SELECT '20110323'
UNION ALL SELECT '20110326'
UNION ALL SELECT '20110328'
UNION ALL SELECT '20110329'
UNION ALL SELECT '20110331'
UNION ALL SELECT '20110403'
UNION ALL SELECT '20110405'
UNION ALL SELECT '20110406'
UNION ALL SELECT '20110408'
UNION ALL SELECT '20110410'
UNION ALL SELECT '20110415'
UNION ALL SELECT '20110416'
UNION ALL SELECT '20110417'
UNION ALL SELECT '20110418'
UNION ALL SELECT '20110421'
UNION ALL SELECT '20110422'
UNION ALL SELECT '20110423'
UNION ALL SELECT '20110426'
UNION ALL SELECT '20110429'
UNION ALL SELECT '20110501'
UNION ALL SELECT '20110503'
UNION ALL SELECT '20110504'
UNION ALL SELECT '20110506'
UNION ALL SELECT '20110508'
UNION ALL SELECT '20110509'
UNION ALL SELECT '20110511'
UNION ALL SELECT '20110512'
UNION ALL SELECT '20110514'
UNION ALL SELECT '20110517'
UNION ALL SELECT '20110518'
UNION ALL SELECT '20110519'
UNION ALL SELECT '20110520'
UNION ALL SELECT '20110522'
UNION ALL SELECT '20110526'
UNION ALL SELECT '20110531'
UNION ALL SELECT '20110603'
UNION ALL SELECT '20110604'
UNION ALL SELECT '20110605'
UNION ALL SELECT '20110606'
UNION ALL SELECT '20110608'
UNION ALL SELECT '20110611'
UNION ALL SELECT '20110613'
UNION ALL SELECT '20110614'
UNION ALL SELECT '20110616'
UNION ALL SELECT '20110622'
UNION ALL SELECT '20110624'
UNION ALL SELECT '20110627'
UNION ALL SELECT '20110703'
UNION ALL SELECT '20110704'
UNION ALL SELECT '20110711'
UNION ALL SELECT '20110712'
UNION ALL SELECT '20110713'
UNION ALL SELECT '20110714'
UNION ALL SELECT '20110719'
UNION ALL SELECT '20110720'
UNION ALL SELECT '20110725'
UNION ALL SELECT '20110726'
UNION ALL SELECT '20110802'
UNION ALL SELECT '20110804'
UNION ALL SELECT '20110811'
UNION ALL SELECT '20090611'
UNION ALL SELECT '20091124'
UNION ALL SELECT '20100201'
UNION ALL SELECT '20100202'
UNION ALL SELECT '20100204'
UNION ALL SELECT '20100220'
UNION ALL SELECT '20100305'
UNION ALL SELECT '20100323'
UNION ALL SELECT '20100414'
UNION ALL SELECT '20100417'
UNION ALL SELECT '20100508'
UNION ALL SELECT '20100512'
UNION ALL SELECT '20100527'
UNION ALL SELECT '20100616'
UNION ALL SELECT '20100702'
UNION ALL SELECT '20100705'
UNION ALL SELECT '20100707'
UNION ALL SELECT '20100708'
UNION ALL SELECT '20100710'
UNION ALL SELECT '20100711'
UNION ALL SELECT '20100712'
UNION ALL SELECT '20100717'
UNION ALL SELECT '20100719'
UNION ALL SELECT '20100722'
UNION ALL SELECT '20100723'
UNION ALL SELECT '20100724'
UNION ALL SELECT '20100725'
UNION ALL SELECT '20100730'
UNION ALL SELECT '20100805'
UNION ALL SELECT '20100806'
UNION ALL SELECT '20100808'
UNION ALL SELECT '20100812'
UNION ALL SELECT '20100814'
UNION ALL SELECT '20100815'
UNION ALL SELECT '20100816'
UNION ALL SELECT '20100818'
UNION ALL SELECT '20100821'
UNION ALL SELECT '20100824'
UNION ALL SELECT '20100826'
UNION ALL SELECT '20100829'
UNION ALL SELECT '20100903'
UNION ALL SELECT '20100906'
UNION ALL SELECT '20100912'
UNION ALL SELECT '20100914'
UNION ALL SELECT '20100915'
UNION ALL SELECT '20100917'
UNION ALL SELECT '20100918'
UNION ALL SELECT '20100921'
UNION ALL SELECT '20100924'
UNION ALL SELECT '20100926'
UNION ALL SELECT '20100927'
UNION ALL SELECT '20100929'
UNION ALL SELECT '20100930'
UNION ALL SELECT '20101001'
UNION ALL SELECT '20101003'
UNION ALL SELECT '20101005'
UNION ALL SELECT '20101006'
UNION ALL SELECT '20101008'
UNION ALL SELECT '20101011'
UNION ALL SELECT '20101012'
UNION ALL SELECT '20101014'
UNION ALL SELECT '20101015'
UNION ALL SELECT '20101017'
UNION ALL SELECT '20101021'
UNION ALL SELECT '20101024'
UNION ALL SELECT '20101026'
UNION ALL SELECT '20101027'
UNION ALL SELECT '20101029'
UNION ALL SELECT '20101031'
UNION ALL SELECT '20101101'
UNION ALL SELECT '20101103'
UNION ALL SELECT '20101104'
UNION ALL SELECT '20101105'
UNION ALL SELECT '20101106'
UNION ALL SELECT '20101110'
UNION ALL SELECT '20101113'
UNION ALL SELECT '20101115'
UNION ALL SELECT '20101116'
UNION ALL SELECT '20101118'
UNION ALL SELECT '20101120'
UNION ALL SELECT '20101122'
UNION ALL SELECT '20101123'
UNION ALL SELECT '20101125'
UNION ALL SELECT '20101128'
UNION ALL SELECT '20101130'
UNION ALL SELECT '20101202'
UNION ALL SELECT '20101205'
UNION ALL SELECT '20101207'
UNION ALL SELECT '20101208'
UNION ALL SELECT '20101212'
UNION ALL SELECT '20101217'
UNION ALL SELECT '20101218'
UNION ALL SELECT '20101219'
UNION ALL SELECT '20101221'
UNION ALL SELECT '20101223'
UNION ALL SELECT '20101224'
UNION ALL SELECT '20101226'
UNION ALL SELECT '20101231'
UNION ALL SELECT '20110101'
UNION ALL SELECT '20110105'
UNION ALL SELECT '20110112'
UNION ALL SELECT '20110113'
UNION ALL SELECT '20110115'
UNION ALL SELECT '20110117'
UNION ALL SELECT '20110121'
UNION ALL SELECT '20110124'
UNION ALL SELECT '20110127'
UNION ALL SELECT '20110129'
UNION ALL SELECT '20110131'
UNION ALL SELECT '20110201'
UNION ALL SELECT '20110202'
UNION ALL SELECT '20110204'
UNION ALL SELECT '20110207'
UNION ALL SELECT '20110209'
UNION ALL SELECT '20110211'
UNION ALL SELECT '20110214'
UNION ALL SELECT '20110216'
UNION ALL SELECT '20110217'
UNION ALL SELECT '20110219'
UNION ALL SELECT '20110220'
UNION ALL SELECT '20110222'
UNION ALL SELECT '20110223'
UNION ALL SELECT '20110225'
UNION ALL SELECT '20110228'
UNION ALL SELECT '20110303'
UNION ALL SELECT '20110305'
UNION ALL SELECT '20110306'
UNION ALL SELECT '20110308'
UNION ALL SELECT '20110310'
UNION ALL SELECT '20110312'
UNION ALL SELECT '20110313'
UNION ALL SELECT '20110315'
UNION ALL SELECT '20110318'
UNION ALL SELECT '20110322'
UNION ALL SELECT '20110324'
UNION ALL SELECT '20110325'
UNION ALL SELECT '20110327'
UNION ALL SELECT '20110330'
UNION ALL SELECT '20110401'
UNION ALL SELECT '20110404'
UNION ALL SELECT '20110407'
UNION ALL SELECT '20110409'
UNION ALL SELECT '20110411'
UNION ALL SELECT '20110412'
UNION ALL SELECT '20110413'
UNION ALL SELECT '20110414'
UNION ALL SELECT '20110419'
UNION ALL SELECT '20110420'
UNION ALL SELECT '20110425'
UNION ALL SELECT '20110427'
UNION ALL SELECT '20110428'
UNION ALL SELECT '20110430'
UNION ALL SELECT '20110502'
UNION ALL SELECT '20110505'
UNION ALL SELECT '20110507'
UNION ALL SELECT '20110510'
UNION ALL SELECT '20110513'
UNION ALL SELECT '20110515'
UNION ALL SELECT '20110516'
UNION ALL SELECT '20110521'
UNION ALL SELECT '20110523'
UNION ALL SELECT '20110524'
UNION ALL SELECT '20110525'
UNION ALL SELECT '20110527'
UNION ALL SELECT '20110528'
UNION ALL SELECT '20110530'
UNION ALL SELECT '20110601'
UNION ALL SELECT '20110602'
UNION ALL SELECT '20110607'
UNION ALL SELECT '20110609'
UNION ALL SELECT '20110610'
UNION ALL SELECT '20110615'
UNION ALL SELECT '20110617'
UNION ALL SELECT '20110618'
UNION ALL SELECT '20110620'
UNION ALL SELECT '20110621'
UNION ALL SELECT '20110623'
UNION ALL SELECT '20110626'
UNION ALL SELECT '20110628'
UNION ALL SELECT '20110629'
UNION ALL SELECT '20110630'
UNION ALL SELECT '20110701'
UNION ALL SELECT '20110706'
UNION ALL SELECT '20110707'
UNION ALL SELECT '20110708'
UNION ALL SELECT '20110715'
UNION ALL SELECT '20110717'
UNION ALL SELECT '20110721'
UNION ALL SELECT '20110722'
UNION ALL SELECT '20110727'
UNION ALL SELECT '20110729'
UNION ALL SELECT '20110801'
UNION ALL SELECT '20110810'
SELECT
CAST(RD.CompletionDate AS datetime) AS casts_fine
FROm
@REAL_DATES RD
I do appreciate the comments about SSIS or other approaches but we're married to the TSQL conversion approach at this point in the game. If anyone can point out what I'm missing, I'll name my first child after you, assuming you don't mind changing your name to James.
Best Answer
What happens when you do this?
What I was getting at on twitter was that the conversion can be attempted by the optimizer before rows are eliminated, so you can't only consider the CompletionDate values that are returned by the join.
My first suggestion would be to use the right data type. Why are you using VARCHAR(46) to store a date? This is why you have bad data in the table and why you have to explicitly convert when you want rich data that is not a string (and should never have been a string in the first place IMHO).
My next suggestion would be to correct ALL data in that column, and put measures in place so that it won't become invalid again. For example, a check constraint that validates
ISDATE(columnname) = 1
.Failing those two, next on my list would be to return the data to the client and let it convert to datetime or for display or what have you. No matter where you filter out the rows that are causing the problem, the optimizer could push that evaluation around so that the convert is attempted before the bad rows are weeded out.
And finally, you could dump the result of the query into a temp table / table variable, and perform the convert as a second step when querying that intermediate object (since you should be confident that the dates here are valid - in fact you can check first and raise an error if your join happened to return some rows with invalid dates).
Bottom line: (a) you can't make any assumptions about where in the stack a conversion attempt will be made and (b) these workarounds and hacks would not be necessary if you used the right data type.