Sql-server – SSMS: Import from flat file (.csv)

importsql-server-2008-r2ssisssms

In SQL Server Management Studio (SSMS) 2008 R2, I wish to import a not-quite-properly formatted CSV into a database table. I was trying to use the wizard at <database name>->Tasks->Import Data.

Differences from the CSV "standard"

  1. null values are represented by a space between commas
  2. some fields have commas in the middle of them and are not quote-escaped
  3. there is a terminal comma on all lines except the header line (WTF??)

I eventually addressed these issues manually by editing the CSV prior to import. Issue 2 was fixed by replacing commas with dashes for the 4 families of cell values that were causing it. Issue 3 I addressed with a search-and-replace for those terminal commas. Issue 1, which was a problem for numeric fields, also with a search-and-replace , , -> ,,.

When my file did eventually import properly, null values were replaced with zeros, even though I had allowed null values in the field definitions.

My basic question: is there a way that I can alter the code for CAST/CONVERT operations during an SSMS import? The only stage at which I could get to SQL code was in the field definitions of the created table (from Select Source Tables and Views->Edit Mappings->Edit SQL – which was available if and only if I hadn't changed the destination table name on the Select Source Tables page), but that was just for the CREATE TABLE definition and didn't address the import process at all. It really seems to me that I should be able to allow a single-character space string to be cast to a null value when converted to a number (instead the wizard fails and states that a value is out of bounds). Further, the null string certainly should be cast to the null value integer, rather than 0.

I can save a SSIS package but it wasn't clear to me how to edit and run that to allow for what I want to do. Is there a template file somewhere that I can edit to allow for null strings to be cast to null integer values (like something here….)? Is there a way to use parts of the Import Wizard to generate tSQL code? Should I instead use a BULK INSERT statement rather than the wizard?

Best Answer

The Import/Export wizard builds an SSIS package under the covers but the package the wizard builds only performs the E and L of ETL (Extract, Transform, Load). What you're looking for is T because you needed to transform this weird data into something more manageable.

For the weird header row, you can always specify that there is no header row and the connection manager should skip N rows.

Unfortunately, there's not much you can do about the parsing itself when there's embedded delimiters. Logically, you can look and see that Kansas City, MO was all one field but the parser is simplistic and breaks the string on the delimiter, baring an escape character.

For spaces not empty strings, the Import-Export wizard way would be to import to a table of all varchar columns, and then write an extract query performing whatever preferred magic to make it into a NULL NULLIF(RTRIM(MyCol), '') AS MyCol

Random notes for future readers

If you're on 2005 or 2008, then the only way to get the SSIS editor, BIDS (Business Intelligence Design Studio) is to have a copy of the SQL Server installation media. 2012+ Microsoft allows for the download and installation of the SQL Server Data Tools-BI edition without the need for a SQL Server installation.

The Import/Export wizard is available in all editions of SQL Server, including Express, but the limitation here is that you cannot save the generated SSIS package if you are using Express. All the other versions work fine.