Db2 – Do I need to RUNSTATS after a REORG in DB2

db2db2-luwmaintenance

The REORG documentation says:

BUILD

Builds indexes. Updates index statistics. 

But I have always heard REORG/RUNSTATS as if they should be paired. Can anyone verify that "Updates index statistics" achieves the same thing as RUNSTATS?

Best Answer

REORG is used to help DB2 point to accurate data (ie, indexes should become aware of fresh data and no longer include deleted data), as well as "collapse" empty page space created by deletion of data and/or indexes. It can also help move data that is related closer to each other for more efficient access (especially true in the case of a cluster index).

RUNSTATS is used to help gather updated statistics on the volume and/or distribution of data within tables and indexes. This information is stored in the system tables and is used by many things including the optimizer to determine the optimal path to query the data. This is especially true as data grows. The data may distribute in a way that causes the optimizer to now include an index that it didn't before in its access path (or perhaps vice versa and choose not to use an index). RUNSTATS can also affect how a REORG runs - at least in older versions of DB2.

Based on what I have read from DB2 DBA's you generally want to run

  1. RUNSTATS
  2. REORG
  3. RUNSTATS

The first RUNSTATS helps the REORG to work appropriately and efficiently. The RUNSTATS after is more to make sure that stats are now accurate given all the data movement around on pages. Depending on where you read you may see DBA's mention the last RUNSTATS is no longer needed with newer versions of DB2. Since we haven't seen definitive answers on that and since most DB2 DBA's I've heard of follow the above order, our company has chosen to use that order (for the record we are on DB2 LUW 9.7 FP4 and we plan on migrating to V 10.1 sometime next year).

Also, just for the sake of completeness, generally when you update the statistics and you affect the optimizer you want all applications that call into DB2 to make sure they are taking advantage of those new optimizations. So a REBIND of packages will make sure that static packages know of the updated statistics and optimized query paths, and a FLUSH PACKAGE CACHE DYNAMIC cleans out dynamic queries from things like Hibernate, so that way the queries will be rebound with the updated query paths. (NOTE: on z/OS I see you have BIND and REBIND options. Perhaps the FLUSH PACKAGE CACHE DYNAMIC is under those or has a different name on z.)

So ultimately I would do the following:

  1. RUNSTATS
  2. REORG
  3. RUNSTATS
  4. REBIND packages
  5. FLUSH PACKAGE CACHE DYNAMIC (on LUW) or whatever the equivalent is on z/OS.