If you've truncated the table, then a primary key violation must be coming from duplicate data in the file. Try bulk inserting into a new table without the PK constraint, and then check that table for duplicates (probably easier than writing a tool or script to parse the file directly). You can create a mimic table this way; SELECT ... INTO copies the column definitions but none of the constraints or indexes:
SELECT * INTO dbo.new_bulk_source
FROM dbo.old_source
WHERE 1 = 0;
Then change your package to reference this table, do the insert, then run:
SELECT [key], COUNT(*)
FROM dbo.new_bulk_source
GROUP BY [key]
HAVING COUNT(*) > 1;
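If duplicates do turn up and you want to keep one row per key, you could then load only the first row per key back into the real table. This is a sketch: [key] is a stand-in for your actual PK column, and col1/col2 stand in for your real column list.

```sql
;WITH numbered AS (
    SELECT col1, col2, [key],
           ROW_NUMBER() OVER (PARTITION BY [key] ORDER BY [key]) AS rn
    FROM dbo.new_bulk_source
)
INSERT INTO dbo.old_source (col1, col2, [key])
SELECT col1, col2, [key]
FROM numbered
WHERE rn = 1;
```

Which of the duplicate rows "wins" is arbitrary here; change the ORDER BY in the ROW_NUMBER() if you have a rule for picking one.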
I bet a donut the call is coming from inside the house (or the truncate is not succeeding).
You can use a script transformation to create the actual column mapping. In your flat file source, configure each input line to come through as one single, large column.
In your script transformation, create output column definitions for each column you need mapped / already know about. This assumes your first line contains headers.
Then, in the script transformation, split each row on your delimiter. Keep a boolean flag indicating whether the header line has been read yet. If it hasn't, parse the line, store the column positions in a Dictionary<string, int> such as ColumnMappings, and do not write to the output buffer. Once the column mappings have been read, you can set each output column from the correct position with something like:
OutputBuffer.MyMappedColumn = SplitRow[ColumnMappings["MyMappedColumn"]];
An example of an entire transformation with a single mapped column, "Day", is below:
using System;
using System.Collections.Generic;
using System.Data;
using System.Text;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;

[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{
    UTF8Encoding enc = new UTF8Encoding();
    char[] separator = { ',' };
    bool _isColumnMappingComplete = false;
    Dictionary<string, int> _columnMapping;

    // Set to true once the header row has been processed
    bool isColumnMappingComplete
    {
        get { return _isColumnMappingComplete; }
        set { _isColumnMappingComplete = value; }
    }

    // Holds the column-name-to-index mapping built from the header row
    Dictionary<string, int> columnMapping
    {
        get
        {
            if (_columnMapping == null)
            { _columnMapping = new Dictionary<string, int>(); }
            return _columnMapping;
        }
        set
        {
            _columnMapping = value;
            isColumnMappingComplete = true;
        }
    }

    public override void PreExecute()
    {
        base.PreExecute(); /* Add your code here for preprocessing or remove if not needed */
    }

    public override void PostExecute()
    {
        base.PostExecute();
        /* Add your code here for postprocessing or remove if not needed.
           You can set read/write variables here, for example: Variables.MyIntVar = 100 */
    }

    public override void Input0_ProcessInputRow(Input0Buffer Row)
    {
        //byte[] rawRowBytes = Row.Column0.GetBlobData(0, (int)Row.Column0.Length); //uncomment if using a Text input data type
        //string rawRow = enc.GetString(rawRowBytes); //uncomment if using a Text input data type
        string rawRow = Row.Column0; //comment out if using a Text input data type

        // Treat rows containing nothing but quotes, delimiters, and spaces as empty
        bool isEmptyRow = rawRow.Replace("\"", "").Replace(",", "").Replace(" ", "").Length == 0;
        if (isEmptyRow) { return; }

        string[] rowData = rawRow.Split(separator);
        if (!isColumnMappingComplete)
        {
            // First non-empty row is the header: record each column name's position
            Dictionary<string, int> firstColMap = new Dictionary<string, int>();
            for (int i = 0; i < rowData.Length; i++)
            { firstColMap.Add(rowData[i], i); }
            columnMapping = firstColMap;
        }
        else
        {
            Output0Buffer.AddRow();
            bool _rowError = false;
            //Day
            if (columnMapping.ContainsKey("Day"))
            { Output0Buffer.Day = rowData[columnMapping["Day"]]; }
            else
            {
                //Output0Buffer.Day = "-";
                _rowError = true; // expected column missing; flag the row and handle as needed
            }
        }
    }

    public override void CreateNewOutputRows()
    {
        /* Add rows by calling the AddRow method on the member variable named "<Output Name>Buffer".
         * For example, call MyOutputBuffer.AddRow() if your output was named "MyOutput". */
    }
}
Best Answer
If there are no transformations involved, you could build this fairly simply in SSIS with a Foreach Loop container in the control flow that iterates over a file directory and passes each file name as a variable to a data flow. The flat file connection for the CSV would be bound to that variable, and your destination could similarly be bound to the file name. Just use bulk (fast-load) mode in your target component and you should get pretty good efficiency. This works great if the folder is accessible to all users and you add a step that either checks that files in the directory are processed or moves them somewhere else once they've been processed. The SSIS process could be invoked from the SSIS catalog on the SQL Server directly, or scheduled on a regular basis.
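To sketch the binding described above (the variable name User::CurrentFileName is an assumption, not from the original): the Foreach File enumerator maps each file path into a package variable, and the flat file connection manager picks it up through a property expression.

```
Foreach Loop (Foreach File enumerator) -> Variable Mappings:
    Index 0 -> User::CurrentFileName

Flat File Connection Manager -> Properties -> Expressions:
    ConnectionString = @[User::CurrentFileName]
```

With that in place, each iteration of the loop retargets the same data flow at the next file in the folder.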
Alternatively, you could use the bcp utility for a command-line experience. This is probably a lot simpler in that it doesn't require writing and managing an SSIS package, but it demands more knowledge from the user invoking it. The tool doesn't require the files to be on the server: you provide a path to the file as the invoking user, along with credentials to authenticate, and bcp handles the data movement.
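As a rough sketch of such an invocation (MyServer, MyDb, dbo.TargetTable, and data.csv are all placeholders, not names from the original):

```shell
# All names below are placeholders: MyServer, MyDb, dbo.TargetTable, data.csv.
# -c  : character (text) mode
# -t, : comma field terminator
# -F 2: start at row 2, skipping a header line
# -T  : Windows integrated authentication (use -U/-P for SQL authentication)
BCP_CMD='bcp MyDb.dbo.TargetTable in data.csv -S MyServer -c -t, -F 2 -T'
echo "$BCP_CMD"   # printed here as a sketch; run it on a machine with the SQL command-line tools installed
```

Note that bcp does no transformation at all; the file's column order must match the table's, or you need a format file.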