SQL Server Bulk Insert – Insert into View and Find Missing Values

bulk-insertsql-server-2012view

I have 2 tables, Categories, and SubCategories:

CREATE TABLE [dbo].[tbCategories]
(
  [CategoryID] [int] IDENTITY(1,1) NOT NULL,
  [CategoryName] [varchar](250) NULL,
)

CREATE TABLE [dbo].[tbSubCategories]
(
  [SubCategoryID] [int] IDENTITY(1,1) NOT NULL,
  [CategoryID] [int] NULL,
  [SubCategory] [varchar](250) NULL
)

I want to Import Sub-Categories with a csv file, which contains the CategoryName and the SubCategory.

Here's an example:

enter image description here

So I get the CategoryName and the Sub-Category Name. But I need the CategoryID. The user won't know what that is, they will only know the Category Name. What would be this best way to do this?

I already have a bulk insert method (using c#) that works fine if I import a csv file with CategoryID and SubCategoryName. I thought I can insert into a view to achieve this. Something like:

CREATE VIEW [dbo].[vSubCategories]
AS

SELECT c.CategoryName, s.SubCategory
FROM dbo.tbCategories c 
INNER JOIN dbo.tbSubCategories s 
ON c.CategoryID = s.CategoryID

Then I can use a trigger with 'INSTEAD OF INSERT'. But how do I get SQL to automatically pick up the CategoryID which should be inserted into SubCategory table?

In short, I want to import the csv file above, and see something similar in the tbSubCategories table as below:

enter image description here

Best Answer

For the purpose of this answer, I am assuming your INSERT statement looks something like this:

INSERT INTO
  dbo.vSubCategories (CategoryName, SubCategory)
SELECT
  CategoryName,
  SubCategory
FROM
  ...  /* reading rows either from the CSV directly or
          from a staging table into which the CSV has been imported */
;

So, the first thing your trigger needs to do is to insert the (new) categories into tbCategories:

INSERT INTO
  dbo.tbCategories (CategoryName)
SELECT
  CategoryName
FROM
  inserted
EXCEPT  -- do not insert already existing category names
SELECT
  CategoryName
FROM
  dbo.tbCategories
;

Now all the category names in the imported dataset can be mapped to their IDs using the tbCategories table. And that is what the trigger should do when inserting the subcategories:

INSERT INTO
  dbo.tbSubCategories (CategoryID, SubCategory)
SELECT
  c.CategoryID,
  n.SubCategory
FROM
  inserted AS n
  INNER JOIN dbo.tbCategories AS c ON n.CategoryName = c.CategoryName
EXCEPT  -- exclude existing sub-categories, just in case
SELECT
  CategoryID,
  SubCategory
FROM
  dbo.tbSubCategories
;

For reference, here is a complete trigger script:

CREATE TRIGGER dbo.vSubCategories_InsteadOfInsert
ON dbo.vSubCategories
INSTEAD OF INSERT
AS
BEGIN
  -- insert categories
  INSERT INTO
    dbo.tbCategories (CategoryName)
  SELECT
    CategoryName
  FROM
    inserted
  EXCEPT  -- do not insert already existing category names
  SELECT
    CategoryName
  FROM
    dbo.tbCategories
  ;
  -- insert sub-categories
  INSERT INTO
    dbo.tbSubCategories (CategoryID, SubCategory)
   SELECT
    c.CategoryID,
    n.SubCategory
  FROM
    inserted AS n
    INNER JOIN dbo.tbCategories AS c ON n.CategoryName = c.CategoryName
  EXCEPT  -- exclude existing sub-categories, just in case
  SELECT
    CategoryID,
    SubCategory
  FROM
    dbo.tbSubCategories
  ;
END;