Db2 – delete records in one table based on multiple column values in second table

db2

I need to insert and delete records in table 1 based on the data in multiple columns in table 2.

Example: Table1 and Table2 have identical columns: address, city, state, and zip.

Table1 is the master table with millions and millions of records.
Table2 (thousands of records) is a table based on data that changed in some way – although it may not be a change in data in this table.

Matching records means that the address, city, state, and zip are identical in value in both tables.

So I want to do 2 operations:

Insert records into Table1 that only exist in Table2
Delete records from Table1 where the record is not in Table2 but the city
is in Table2

Based on the size of Table1, I need a very efficient way to execute this.
I know some SQL – but not to the level this would require if done with SQL.

This is a DB2 database – version 10.5.

Best Answer

Operation 1

Insert records into Table1 that only exist in Table2:

INSERT INTO Table1
    (address, city, state, zip)
SELECT address, city, state, zip  FROM Table2
EXCEPT
SELECT address, city, state, zip  FROM Table1 ;

Operation 2

Delete records from Table1 where the record is not in Table2 but the city is in Table2.

(assuming that the columns ar enot nullable. If they allow nulls, it would be better to rewrite the NOT IN with a NOT EXISTS subquery):
```
DELETE FROM Table1 
WHERE (address, city, state, zip)
       NOT IN
      (SELECT address, city, state, zip  FROM Table2)
  AND city IN
      (SELECT city FROM Table2) ;
```

As for performance and efficiency, I have almost zero experience with DB2, so it would be better if you waited for some other answer. But an index on all the columns (address, city, state, zip) - on both tables - would probably be the best you can do for the queries, plus a single index on (city). Or perhaps combine the two suggestions in one index with all the 4 columns but choose city for the 1st part of the index: (city, address, state, zip).

You haven't provided the tables' structure but the 4 columns seem appropriate for a UNIQUE constraint so you may already have such an index.

Best Answer

Related Solutions

DB2 – Why RUNSTATS/REORG Cannot Be Used in Trigger Definitions

DB2 Replication – Options for Creating Read-Only Replica

Related Question