Sql-server – problems when importing csv files into SQL Server using SSIS when the columns are not consistent from quarter to quarter

csvsql serverssis

I am using SSIS to upload a bunch of csv tables into SQL Server Express. The csv files are quarterly data files. They are largely consistent but every couple of years a column may be added or an existing column deleted.

Every time SSIS encounters a change in the columns present it throws one or both of these errors:

Error: 0xC0202055 at Data Flow Task, Flat File Source [2]: The column delimiter for column "XYZ" was not found.

Error: 0xC0202091 at Data Flow Task, Flat File Source [2]: An error occurred while skipping data rows.

When a new column is added, both errors appear. When an existing column is excluded, only the 2091 error appears. Note that the columns are not just null when they are deleted, they no longer in the data.

Is there setting or something I can use so that SSIS accommodates these changes in columns? I've tried setting up the target table with the complete set of columns but that hasn't helped.

Thanks,

Best Answer

You can use a script transformation to create the actual column mapping. For your flat file source, make every output line become a single large column.

In your script transformation, create column definitions for each column you need mapped / already know about.

This assumes your first line contains headers:

Then, in your script transformation , split each row by the delimiter you need. You can create a boolean value indicating if the first line has been read yet. If not, parse the line and store the column mapping positions in a Dictionary<string,int> such as ColumnMappings and do not write to OutputBuffer. If your row reader sees that the column mappings have been read, then you can set your output to the correct column with something like:

 OutputBuffer.MyMappedColumn = SplitRow[ColumnMappings["MyMappedColumn"]];

An example of an entire transformation with a column "Day" is below

using System;
using System.Data;
using System.Text;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;

[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent {
    UTF8Encoding enc = new UTF8Encoding();
    char[] separator = { ',' };
    bool _isColumnMappingComplete = false;
    System.Collections.Generic.Dictionary<string, int> _columnMapping;
    bool isWritable = false;

    bool isColumnMappingComplete { //Set true after first row processed
        get{ return _isColumnMappingComplete; }
        set{ _isColumnMappingComplete = value;}   }

    System.Collections.Generic.Dictionary<string, int> columnMapping{ //Object to hold column mapping indexes
        get {
            if (_columnMapping == null) { _columnMapping = new System.Collections.Generic.Dictionary<string, int>(); }
            return _columnMapping; }
        set {
            _columnMapping = value;
            isColumnMappingComplete = true; } }

    public override void PreExecute() {
        base.PreExecute(); /* Add your code here for preprocessing or remove if not needed */ }

    public override void PostExecute()    {
        isWritable = true;
        base.PostExecute();
        /* Add your code here for postprocessing or remove if not needed
          You can set read/write variables here, for example: Variables.MyIntVar = 100 */  }

    public override void Input0_ProcessInputRow(Input0Buffer Row)    {
        bool isEmptyRow = false;
        //byte[] rawRowBytes = Row.Column0.GetBlobData(0,(int)Row.Column0.Length); //uncomment if using Text input datatype
        //string rawRow = enc.GetString(rawRowBytes); //uncomment if using Text input datatype
        string rawRow = Row.Column0; //comment out if using Text input datatype
        if (rawRow.Replace("\"", "").Replace(",", "").Replace(" ", "").Length == 0)
        { isEmptyRow = true; }
        if (!isEmptyRow) {
            string[] rowData = rawRow.Split(separator);

            if (!isColumnMappingComplete) {
                System.Collections.Generic.Dictionary<string, int> firstColMap = new System.Collections.Generic.Dictionary<string, int>();
                for (int i = 0; i < rowData.Length; i++)
                { firstColMap.Add(rowData[i], i); }
                columnMapping = firstColMap;    }
            else {
                Output0Buffer.AddRow();
                bool _rowError = false;
                //Day
                if (columnMapping.ContainsKey("Day"))
                { Output0Buffer.Day = rowData[columnMapping["Day"]]; }
                else
                { //Output0Buffer.Day = "-";
                    _rowError = true; }   }      }    }

    public override void CreateNewOutputRows()
    {   /* Add rows by calling the AddRow method on the member variable named "<Output Name>Buffer".
         * For example, call MyOutputBuffer.AddRow() if your output was named "MyOutput". */ }     }