Db2 – set order of query execution

db2performancequery-performance

Below query is executing in approx 5 min:

     SELECT COUNT(distinct B.CALL_ID), C.IVR_APPLN_NAME
      FROM PTDT.TDT_CALL_DATA AS A, PTDT.TDT_IVR_CALL_APPLICATIONS AS B,
           PTDT.TDT_IVR_APPLN_REF AS C
      WHERE A.CALL_ID = B.CALL_ID AND B.IVR_APPLN_ID = C.IVR_APPLN_ID
      GROUP BY C.IVR_APPLN_NAME;

By adding additional where clause (IVR_APPLN_NAME) it's executing in 120 min.

    SELECT COUNT(distinct B.CALL_ID), C.IVR_APPLN_NAME
            FROM PTDT.TDT_CALL_DATA AS A, PTDT.TDT_IVR_CALL_APPLICATIONS AS B,
            PTDT.TDT_IVR_APPLN_REF AS C
            WHERE A.CALL_ID = B.CALL_ID AND B.IVR_APPLN_ID = C.IVR_APPLN_ID
        AND C.IVR_APPLN_NAME in ('DSC','EMS','FAO','RET','YAS')
      GROUP BY C.IVR_APPLN_NAME;

So I tried to run it like below:

       select * from (
                SELECT COUNT(distinct B.CALL_ID), C.IVR_APPLN_NAME
                   FROM PTDT.TDT_CALL_DATA AS A, PTDT.TDT_IVR_CALL_APPLICATIONS AS B,
                   PTDT.TDT_IVR_APPLN_REF AS C
                WHERE A.CALL_ID = B.CALL_ID AND B.IVR_APPLN_ID = C.IVR_APPLN_ID
                GROUP BY C.IVR_APPLN_NAME) as D
       where D.IVR_APPLN_NAME in ('DSC','EMS','FAO','RET','YAS')

I assumed that internal query will run first (like before it was running in 5 min) and then I will filter results out of that (from outer filter D.IVR_APPLN_NAME in ('DSC','EMS','FAO','RET','YAS')).
But this was not the case, DB2 manipulated in such way that it applied outer filter (IVR_APPLN_NAME) to internal query and overall query executed again in 120 min.
Other option could be to store internal result in temporary table but temp table use in DB2 itself is not as simple as in SQL.

Any suggestions?

Best Answer

Try following change. It might work.

select d.count1 , d.IVR_APPLN_NAME  
from (
        SELECT COUNT(distinct B.CALL_ID) as count1, 
               C.IVR_APPLN_NAME , 
               C.IVR_APPLN_NAME || '' as  test_name
        FROM PTDT.TDT_CALL_DATA AS A, 
             PTDT.TDT_IVR_CALL_APPLICATIONS AS B,
             PTDT.TDT_IVR_APPLN_REF AS C
        WHERE A.CALL_ID = B.CALL_ID AND B.IVR_APPLN_ID = C.IVR_APPLN_ID
        GROUP BY C.IVR_APPLN_NAME
     ) as D
where D.test_NAME in ('DSC','EMS','FAO','RET','YAS')

This SQL is different from the SQL in the question. By using a derived column (test_name) instead of original column (IVR_APPLN_NAME) in where clause, we might be able to outsmart the optimizer. I tested this strategy on our system using different tables and it worked.