Sql-server – Add Data from Columns Together

import sql-server

I am importing data from an old non-relational database into a new relational one. I am currently working on importing data into the following table:

Table: job_part
    job_id   INT NOT NULL --References ID Column of Job table
    part_id  INT NOT NULL --References ID Column of Part table
    quantity INT NOT NULL --Amount of parts required for the job
    PK = job_id & part_id

The data I have is in a .csv file which has the following columns:

Part Number – Value will be referenced with Part table to get part_id value

Job Number – Value will be referenced with Job table to get job_id value

Amount – Value to be stored in Quantity column

My problem is when there are data rows like the following:

CDDM 42, P0164, 1
CDDM 42, P0164, 2

As you can see, the Part Number and Job Number are the same, while the quantity value is different. Because the table I want to insert this data into has part_id and job_id as a composite primary key, these rows will cause an error when I try to import them.

What I want to do is take this data and, before I put it in my table, put it into a temporary table that combines all of the matching key pairs. So for my problem example above, instead of having the two rows that will cause an error, I want the result to be:

CDDM 42, P0164, 3

Essentially, find every matching pair of key values, and add their quantities together so they can be a single row of data.

However, I am unsure how to go about solving this problem. I am using SQL Server.

Best Answer

You answered your own question: import the data into a temporary or staging table first.

CREATE TABLE #TempTable(
    job_id    INT NOT NULL
    ,part_id  INT NOT NULL
    ,quantity INT NOT NULL);

--Fields are loaded by position, so the column order must match the file
BULK INSERT #TempTable
FROM 'C:\example.csv'
WITH
(FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n'); --add any other options you need, such as FIRSTROW

Then use the SUM() aggregate with GROUP BY when inserting into your actual table:

INSERT INTO job_part (job_id, part_id, quantity)
SELECT 
   job_id   
   ,part_id  
   ,quantity = SUM(quantity) 
FROM #TempTable
GROUP BY
   job_id   
   ,part_id;
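
The question's .csv holds part and job numbers rather than the integer IDs, so the staged rows may first need to be resolved against the Part and Job tables. The following is only a sketch of how that could look. The staging table name, its column names and lengths, the `dbo` schema, and the lookup columns `part_number` and `job_number` are all assumptions, since the question does not name them; only the `ID` columns and the `job_part` layout come from the question.

```sql
-- Staging table mirrors the .csv layout: Part Number, Job Number, Amount.
-- Table name, column names and VARCHAR lengths are assumptions for illustration.
CREATE TABLE #job_part_staging (
    part_number VARCHAR(50) NOT NULL,
    job_number  VARCHAR(50) NOT NULL,
    amount      INT         NOT NULL
);

BULK INSERT #job_part_staging
FROM 'C:\example.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');

-- Resolve the numbers to IDs, then collapse duplicate key pairs with SUM().
-- dbo.Part.part_number and dbo.Job.job_number are hypothetical lookup columns.
-- LTRIM/RTRIM strips the space that follows each comma in the sample rows.
INSERT INTO job_part (job_id, part_id, quantity)
SELECT j.ID, p.ID, SUM(s.amount)
FROM #job_part_staging AS s
INNER JOIN dbo.Part AS p ON p.part_number = LTRIM(RTRIM(s.part_number))
INNER JOIN dbo.Job  AS j ON j.job_number  = LTRIM(RTRIM(s.job_number))
GROUP BY j.ID, p.ID;
```

Because these are inner joins, staged rows whose part or job number has no match are silently skipped, so it may be worth checking for unmatched rows before running the insert.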

Related Solutions

Sql-server – SSIS Data Flow Task Excel to SQL table NULL value will not work for small INT datatype

My original answer (changing INT to NVARCHAR) worked, but I then ran into another column in the Excel source file that contained dates, where some cells held the string "NULL". I did not want to store dates in an NVARCHAR column. When SSIS reached this date column it raised an error because it could not convert the string "NULL" to a date: it was reading "NULL" as a string rather than as a NULL value. I resolved this by adding a Derived Column component to the package that replaces the date field using the following expression:

(DT_WSTR,255)Auth_from_date == "NULL" ? NULL(DT_WSTR,50) : (DT_WSTR,255)Auth_from_date

This is just a conditional (IF-style) expression that changes the string "NULL" to an actual NULL value. The date is then passed through a Data Conversion component and converted into DATETIME format.
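
If the Excel column is instead staged as text in SQL Server, the same replacement can be sketched in T-SQL. This is an alternative to the SSIS expression above, not what the answer used; the table variable and the sample values are made up for illustration, and `TRY_CONVERT` requires SQL Server 2012 or later.

```sql
-- Hypothetical staging data: the date arrives as text, sometimes as the literal string 'NULL'.
DECLARE @staging TABLE (Auth_from_date NVARCHAR(255) NULL);

INSERT INTO @staging (Auth_from_date)
VALUES (N'20150301'), (N'NULL');

-- NULLIF turns the string 'NULL' into a real NULL;
-- TRY_CONVERT then returns NULL instead of raising an error for unconvertible text.
SELECT Auth_from_date_converted =
       TRY_CONVERT(DATETIME, NULLIF(Auth_from_date, N'NULL'))
FROM @staging;
```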

Sql-server – Select values from 2 tables with same properties

DECLARE @sql NVARCHAR(MAX) = N'SELECT ';

SELECT @sql += N'
  ' + QUOTENAME(c.name) 
  + ' = COALESCE(es.' + QUOTENAME(c.name) 
       + ', s.' + QUOTENAME(c.name) + '),'
FROM sys.columns AS c
INNER JOIN sys.columns AS c2
ON c.name = c2.name
AND c.[object_id] = OBJECT_ID('dbo.SETTINGS')
AND c2.[object_id] = OBJECT_ID('dbo.EXTENDED_SETTINGS')
AND c.name NOT IN (N'name', N'ID' /* , ... potentially others ... */);

SET @sql += N' s.name, s.ID';

SELECT @sql += N', 
  s.' + QUOTENAME(c.name)
  FROM sys.columns AS c
  WHERE [object_id] = OBJECT_ID('dbo.SETTINGS')
  AND name NOT IN (N'name', N'ID') 
  AND NOT EXISTS 
  (
    SELECT 1 FROM sys.columns AS c2
      WHERE c2.[object_id] = OBJECT_ID('dbo.EXTENDED_SETTINGS')
      AND c2.name = c.name
  );

SET @sql += N'
  FROM dbo.SETTINGS AS s
  INNER JOIN dbo.EXTENDED_SETTINGS AS es
  ON s.ID = es.SETTINGS_ID
  WHERE es.ID = @extSetId AND s.ID = @settID;';

PRINT @sql;
--EXEC sp_executesql @sql;

Please use table aliases and schema prefixes; I also prefer COALESCE over ISNULL. And if those tables really aren't stored as upper case in a case-sensitive collation, do you really need all caps? Also, ID is a horrible and ambiguous name for a column. A column should have the same name across the schema and always describe exactly what it is, especially when you're doing lazy dynamic things like this.
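
The generated statement refers to `@extSetId` and `@settID`, so when it is executed those values have to be passed through `sp_executesql`'s parameter list. A minimal, self-contained sketch of that call pattern follows. The INT parameter types and the sample values are assumptions, and the one-line query here only stands in for the generated `@sql`.

```sql
-- Stand-in for the generated statement; it only echoes the two parameters back.
DECLARE @sql NVARCHAR(MAX) = N'SELECT ext_id = @extSetId, sett_id = @settID;';

EXEC sys.sp_executesql
    @sql,
    N'@extSetId INT, @settID INT',  -- parameter definitions (types are assumed)
    @extSetId = 1,                  -- sample values for illustration only
    @settID   = 1;
```

Passing the values as parameters, rather than concatenating them into the string, keeps the dynamic SQL limited to the column list that has to be built at run time.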
