Sql-server – Why does SQL Server 2008 R2 insert an invalid datetime when using bulk insert

bulk-insertdatetimesql-server-2008-r2t-sql

I want to insert rows with an optional datetime value (per row) using T-SQL BULK INSERT
on a SQL Server 2008 R2.

A table could look like this:

CREATE TABLE [dbo].[tbl_bulk_insert_datetime_issue] (
    [id] [int] NOT NULL,
    [description] [varchar](20) NOT NULL,
    [datetime] [datetime] NULL,
CONSTRAINT [pk_bulk_insert_datetime_issue] PRIMARY KEY CLUSTERED (
    [id] ASC
))

The insert batch:

BULK
INSERT [dbo].[tbl_bulk_insert_datetime_issue]
FROM 'C:\temp\bulkinsertsample.csv'
WITH
(
    FIELDTERMINATOR=';'
)

If I am to insert the following CSV content:

1;row01;
2;row02;20130401
3;row03;
4;row04;20130515

The table' content was parsed and inserted as expected:

id          description          datetime
----------- -------------------- -----------------------
1           row01                NULL
2           row02                2013-04-01 00:00:00.000
3           row03                NULL
4           row04                2013-05-15 00:00:00.000


But if the CSV file contains invalid data for the optional datetime

1;row01;
2;row02;20130401
3;row03;not_a_datetime
4;row04;20130515

SQL Server inserts the last valid datetime value instead of discarding the row with an error message:

id          description          datetime
----------- -------------------- -----------------------
1           row01                NULL
2           row02                2013-04-01 00:00:00.000
3           row03                2013-04-01 00:00:00.000
4           row04                2013-05-15 00:00:00.000

If I am to insert a row with an invalid datetime and no valid parsable datetime at all

1;row01;not_a_datetime

SQL Server inserts the default value of DATETIME:

id          description          datetime
----------- -------------------- -----------------------
1           row01                1900-01-01 00:00:00.000

Why does Sql Server insert the (for that specific row) invalid datetime when using bulk insert?
If I try to insert rubbish to an INT column the row will not be inserted by the bulk batch and an error will be thrown…


UPDATE:
I will share some information about the environment I used to reproduce the behaviour.

  • OS: Windows 7 Enterprise SP1 x64
  • regional settings: de-DE
  • Microsoft SQL Server 2008 R2 (SP1) – 10.50.2500.0 (X64) Enterprise Edition
  • database collation: Latin1_General_100_CS_AS
  • file encoding: UCS-2 Little Endian (with Windows CRLF)

As hinted by Max Vernon the file encoding seems to be at fault. Setting the encoding to UTF-8 without BOM the bulk insert will throw the error I expected and discards the invalid row.

Best Answer

On SQL Server 2005, and SQL Server 2012, I did the following:

USE tempdb;
CREATE TABLE ImpTest
(
    ImpTestID INT NULL
    , ImpTestDate DATETIME NULL
);
GO
BULK INSERT ImpTest FROM 'C:\SQLServer\ImportTest.txt' WITH (FIELDTERMINATOR=',');
SELECT * FROM ImpTest;
DROP TABLE ImpTest;

With the following data:

1,
2,20130406
3,NOT_DATE
4,
5,20130409

On both servers, I received the following:

Msg 4864, Level 16, State 1, Line 1
Bulk load data conversion error (type mismatch or invalid character for the specified
codepage) for row 3, column 2 (ImpTestDate).

Perhaps, as Aaron noted, this is a bug that needs to be addressed through http://connect.microsoft.com

I'm wondering about some of the details of your system, including locale settings, collations, physical format of the import file (is it from a Unix system? etc).