Db2 – Is SET INTEGRITY of any use after IMPORT statements, or just after LOAD statements

data integritydb2importload

I'm getting the feeling, both from documentation and empirically, that you can only put SET INTEGRITY to good use after a LOAD, not after an IMPORT. LOAD is mentioned for SET INTEGRITY, but not IMPORT.

If you try an IMPORT with INSERT_UPDATE, you may get some rows rejected, as you may with a LOAD. Whereas you can embed the use of an exception table directly in the syntax of the LOAD or in a SET INTEGRITY statement used directly after the LOAD statement, and the exception tables will "get" the rejected rows, but hat doesn't appear to be the case with IMPORT.

There's no ability to embed the use of the exception table in the IMPORT statement, and if you run the SET INTEGRITY statement after the IMPORT has rejected the rows, it's too late.

Am I right or wrong?

Best Answer

Short answer: You are right in that you cannot specify an exception table for the IMPORT command. Also, there is no need to run the SET INTEGRITY statement if you use IMPORT.

Here is why:

The biggest reason for SET INTEGRITY is based on how LOAD and IMPORT work. IMPORT uses normal INSERT statements to get the data into the tables. Doing this engages the database transaction logs, triggers, and referential integrity. If a row is rejected by referential integrity, you will know about it.

LOAD bypasses referential integrity, triggers, and database transaction logs, and puts the data directly into the table (though it cannot bypass unique constraints and identity constraints). The main reason why you want an exception table is to catch the data that doesn't load into the table (perhaps it was the wrong data type etc.)

Because you bypass referential integrity, you MUST do a SET INTEGRITY afterwards to make the table aware of the data it now contains to see if anything does violate referential integrity (otherwise tables are put into a CHECK PENDING state until you do so). (It is also a good idea to run a RUNSTATS, REORG, FLUSH PACKAGE CACHE DYNAMIC, and REBIND afterwards too to make sure that indexes and the optimizer know about the new data. Actually, it is a good idea to do that anyway if you do a mass IMPORT...but that aside...).

Note that you can use the MESSAGES option with both LOAD and IMPORT to capture errors that happen so you know how to deal with them (usually after the fact though). Note that if you use both MESSAGES for a message file and FOR EXCEPTION <table name> for an exception table, any rows in error that get put in the exception table don't get listed in the message file.

Here is a good comparison of LOAD vs IMPORT straight from IBM DB2 documentation (version 9.7)