Exhaustive SQL stored proc validation make insert slow

optimizationstored-procedures

So basically our team is working on a module to retrieve rows value from excel file and insert it during batch job.

The requirement are as follow:
– Exhaustive validation should be done for each column, and the validation error message should be descriptive according to requirement specification
eg: Error in Column [x] due to invalid date format/exceed max length.
– Error in each column should be inserted into database (not log file)
– Insertion should be done ROW by ROW, no bulk insert. Any fail insertion description (due to invalid data) should be inserted into database.

The proposed solution is as follow:
So, first we will insert excel data row by row into excel file list. If it pass all validation, it will include all value except ErrorDesc. If it is failed it will include Error Desc and other excel row value except line will be null.

As what we observed over here, the validation (easily go to 30 to 40 depending on column number and validation requirement), as well as checking whether the memberclassname and membercountryname has corresponding ID in the database makes the insertion slow, ranging from 0.2 to 0.5s per insertion in ExcelContentList. We are currently performing select statement for each insertion to check whether memberclassname and membercountryname exist and has corresponding ID.

We appreciate if there is any suggestion/improvements.

enter image description here

Best Answer

Insert all the data into a staging table without verifying anything. This should go quickly, but your mileage may vary. This also assumes that you are only ever going to be doing one insert at a time. Multiple inserts may require multiple staging tables, but you might be able to use a temp table for it or have a session identifier in the staging table to indicate discrete insert operations.

In any case, once the data is loaded into the staging table, perform your validation checks. Something like this (I don't know your database structure so this is just an example).

UPDATE StagingTable
SET ErrorDesc = ISNULL(ErrorDesc,'') + 'MemberClassName mismatch'
WHERE NOT(MemberClassName IN (SELECT MemberClassName FROM MemberClass))

Repeat this for all your validation rules. This should go much faster than your row by row check.

Once done, move all rows that had no validation errors into the final tables, then output all the errors for whatever you want to do with them.