Db2 – Immediate refresh materialized views (MQT) with OUTER JOIN under DB2

db2fast-refreshjoin;materialized-view

DB2 accepts a materialized view based on an "OUTER JOIN" query, as long it doesn't refresh automatically (REFRESH IMMEDIATE).

I cannot find a simple solution to get round this problem. Perhaps with (a lot of) triggers, but it isn't really nice…

As example, my database is

CREATE TABLE PHONEUSER (USERID INT NOT NULL PRIMARY KEY, NAME VARCHAR(30));
CREATE TABLE PHONECALL (CALLID INT NOT NULL PRIMARY KEY, USERID INT, HOUR TIMESTAMP);

INSERT INTO PHONEUSER VALUES (1, 'Olivier');
INSERT INTO PHONEUSER VALUES (2, 'Arthur');

INSERT INTO PHONECALL VALUES (1,1,CURRENT_TIMESTAMP);

and I want each call of each user. If an user has never called, I want a null value for him (it has sense in my application):

SELECT PU.USERID, CALLID, HOUR FROM PHONEUSER PU LEFT JOIN PHONECALL PC ON PC.USERID=PU.USERID;

USERID  CALLID  HOUR                   
------  ------  -----------------------
     2    null  null                   
     1       1  2013-04-09 16:36:08.952

This query runs well, but cannot be used for an MQT with refresh immediate:

CREATE TABLE CALLLIST AS (SELECT PU.USERID, CALLID, HOUR FROM PHONEUSER PU LEFT JOIN PHONECALL PC ON PC.USERID=PU.USERID) DATA INITIALLY DEFERRED REFRESH IMMEDIATE ;
The fullselect specified for the materialized query table CALLLIST is not valid. Reason code = "10".. SQLCODE=-20058, SQLSTATE=428EC, DRIVER=4.7.85

Best Answer

Unfortunately, as the error description for SQL20058N reason code 10 states, you can't use an outer join with a refresh immediate MQT:

Explanation

...
10
The REFRESH IMMEDIATE option was specified and the FROM clause referenced more than one table.

User Response

10
Create the materialized query table as REFRESH DEFERRED, or use an inner join without the explicit INNER JOIN syntax.

However, if the MQT definition you supply is really this simple (i.e., pre-computing the join), I wonder why you want to create an MQT in the first place? As you said, performance of the query is good, so why wouldn't you just retain the query as written in the application? Alternately, you could encapsulate the join logic in a view.

Although I would generally advise against this, if there is some serious performance issue with the join that can't be resolved with proper indexing and RUNSTATS, you could replicate this immediate update functionality through triggers on the PHONEUSER and PHONECALL tables that insert records into your CALLLIST table:

create trigger trg_PhoneUser
    after insert on PHONEUSER
    referencing new as n
    for each row
        insert into PHONECALL (userid) values (n.userid);

create trigger trg_PhoneCall
    after insert on PHONECALL
    referencing new as n
    for each row
    begin
        delete from PHONECALL where userid = n.userid and hour is null;--
        insert into PHONECALL values (n.userid, n.callid, n.hour);--
    end;

The first trigger is necessary to handle the outer join (i.e. where a user hasn't made any calls yet). The second trigger inserts the calls, and also deletes the records inserted by the first trigger (as they are no longer necessary once a record in PHONECALL exists). You would need to decide how you want to handle deletes in either table (either through triggers or foreign keys with cascading deletes).

Again, I want to stress that this is a possible solution, but I would only use it as a last resort. Databases are designed to perform joins efficiently, but this might require some proper design and tuning. The table structures above should absolutely not require this kind of solution, but I'm working on the assumption that this is not your actual schema. I am only describing this solution because I can imagine a few edge cases where something like this solution might be necessary.