I am creating a script in SSIS where I want table A
to check if rows exist in table B
. If they exist, they must be deleted.
Table A
is an archive table namedCustomer_archive
Table B
is a table that will get updated frequently and is namedCustomer_Information
.
The idea is that once a row is stored in Customer_archive
, that row will not appear in Customer_information
, even after Customer_information
has been updated.
It has to check the entire row and not WHERE = 'something'.
Is it possible to do a WHERE
clause that is a combination of 2 columns?
Best Answer
I can't say anything about SSIS but in SQL you can check whether a whole row is identical to another (including checking for
NULL
values which can get rather complicated otherwise), using this technique, explained by @PaulWhite in his blog post: Undocumented Query Plans: Equality Comparisons .For example in you case. "For any identical rows of table B and table A, delete those B rows":
I have serious concerns though about efficiency when the tables are big - and an archive table by definition is going to be quite big. The
a.pk = b.pk
is not needed really, as the pk columns are obviously included in the row check of theEXISTS
but I kept them for efficiency. Assuming that the two tables have the same primary key and that after a period, most of the archived rows have been already deleted from table B, the PK indexes will have very few matching values so the join will be relatively fast - and using the row checks only for matching pk values.