Let's run two hypothetical databases (H1 and H2) in parallel, with the same total amount of RAM for bufferpools (R).
- Let H1 have a single bufferpool of size R.
- Let H2 have two bufferpools: one of size I for index pages, the other of size D for data pages. (And D + I == R, of course.)
The question is:
- How do you determine the right values for I and D to make H2 perform better than H1?
My answer is that you can't in general. The database engine for H1 has more room for optimizing its cache than H2's. If there are times of day when more index pages would lead to better performance, it can discard unused data pages and have a "mostly index pages" cache. If, at a later time, data pages become hotter, it can evict more index pages and have a "mostly data pages" cache.
H2 can't do that. Once it has I pages' worth of index pages cached, it can't cache more even if that's what would be best right now. It's stuck there with a sub-optimal use of RAM.
The only way for H2 to perform just as well as H1 is if the D/I split initially chosen is ideal and the workload is very stable. That can certainly happen, but I'd wager it is not a very common database scenario. If that's not your case, think of H1 as being the same as H2, but with dynamic partitioning of the cache between data and index pages managed directly by the thing that knows the most about how it needs to optimize I/O (i.e. the database engine).
This is not to say that maintaining different bufferpools is never a good idea.
One scenario for isolating specific pages in a specific cache I've encountered is having a critical "report" that needed to run fast (obviously) at all times, and happened to use a few tables that were pretty much never used elsewhere. So those pages kept getting evicted, and that "report" had runtimes that varied a lot from run to run. Moving a specific set of tables (and their indexes) to a specific pool removed much of the "non-deterministic" performance part.
But that's sub-optimal for the database as a whole, more of a kludge and nearer Voodoo optimization IMO. (And that wasn't on DB2 but I believe that's irrelevant here.)
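For completeness, the mechanics of that kind of isolation on DB2 look roughly like the sketch below. All names and sizes are hypothetical, and the exact options you'd want depend on your version and storage setup:

```sql
-- Dedicated pool for the report's tables (name and size are made up)
CREATE BUFFERPOOL report_bp IMMEDIATE SIZE 25000 PAGESIZE 4K;

-- A tablespace bound to that pool, to hold the report's tables and indexes
CREATE TABLESPACE report_ts PAGESIZE 4K BUFFERPOOL report_bp;

-- Or rebind an existing tablespace to the new pool instead
ALTER TABLESPACE report_ts BUFFERPOOL report_bp;
```

You'd then place (or move) the report's tables and their indexes into that tablespace, so their pages compete only with each other for cache.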
So my advice is: if you have X GB of RAM available for cache, use a single bufferpool, put everything in it, and let the database engine do its magic in there.
If you run into a corner case that would appear to benefit from cache segregation, try it out, benchmark it, think about the overhead of having to maintain the magic numbers for each cache size, and go tune the queries, schema or disk layout instead :)
First of all, you must have a bufferpool that matches the page size of the tablespace. By that token, if the temporary tablespace is the only one with a 32K page size, it will have the bufferpool for its exclusive use.
If you have other tablespaces with a 32K page size, only monitoring system performance will tell you whether there may be a benefit from a separate temp space bufferpool.
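As a sketch (names and sizes are hypothetical), a 32K bufferpool dedicated to a system temporary tablespace would be created along these lines:

```sql
-- 32K pool reserved for temporary pages (size is just an example)
CREATE BUFFERPOOL temp_bp32k SIZE 10000 PAGESIZE 32K;

-- System temporary tablespace bound to that pool
CREATE SYSTEM TEMPORARY TABLESPACE tempts32k
  PAGESIZE 32K
  BUFFERPOOL temp_bp32k;
```

Remember that a bufferpool and its tablespaces must share the same page size, per the point above.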
You can use select * from sysibmadm.mon_bp_utilization to look at the bufferpool hit ratios and select * from table (mon_get_bufferpool(NULL,NULL)) to check page cleaner activity (POOL_NO_VICTIM_BUFFER and POOL_DRTY_PG_STEAL_CLNS should ideally be 0). If you see that a drop in the bufferpool hit ratio or high dirty page contention coincides with the temporary space use (a spike in POOL_TEMP_DATA_L_READS in select * from table (mon_get_tablespace(NULL,NULL))), creating a separate bufferpool of an appropriate size for the temporary tablespace in question might be useful.
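Rather than eyeballing the raw SELECT * output, you can compute the interesting numbers directly. A rough sketch using the MON_GET_BUFFERPOOL and MON_GET_TABLESPACE interfaces mentioned above (the hit-ratio formula is the usual logical-vs-physical-reads one; adapt to taste):

```sql
-- Per-pool hit ratio plus the two page-cleaner contention counters
SELECT bp_name,
       pool_data_l_reads + pool_index_l_reads AS logical_reads,
       pool_data_p_reads + pool_index_p_reads AS physical_reads,
       CASE WHEN pool_data_l_reads + pool_index_l_reads > 0
            THEN DECIMAL(100 - 100.0 *
                   (pool_data_p_reads + pool_index_p_reads) /
                   (pool_data_l_reads + pool_index_l_reads), 5, 2)
       END AS hit_ratio_pct,
       pool_no_victim_buffer,
       pool_drty_pg_steal_clns
FROM TABLE(mon_get_bufferpool(NULL, NULL));

-- Temporary-read activity per tablespace, to spot the spikes
SELECT tbsp_name, pool_temp_data_l_reads
FROM TABLE(mon_get_tablespace(NULL, NULL))
ORDER BY pool_temp_data_l_reads DESC;
```

If the hit ratio dips and the temp logical reads spike at the same time, that's the coincidence described above.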
Best Answer
You can't "estimate" the size of a buffer pool, because the answer is, "it depends".
Try using the AUTOCONFIGURE command to get started, and enable STMM. Together these will go a long way towards getting your database running more efficiently. I would also recommend you spend some time reading about DB2 performance tuning. There is a ton of information available on the web, starting with the entire Performance Tuning section of the manual.