Let's run two hypothetical databases (H1 and H2) in parallel, with the same total amount of RAM for bufferpools (R).
- Let H1 have a single bufferpool of size R.
- Let H2 have two bufferpools: one of size I for index pages, the other of size D for data pages. (And D + I == R, of course.)
The question is:
- How do you determine the right values for I and D to make H2 perform better than H1?
My answer is that, in general, you can't. The database engine for H1 has more room to optimize its cache than H2's does. If there are times of day when more index pages would lead to better performance, it can discard unused data pages and run with a "mostly index pages" cache. If, later on, data pages become hotter, it can evict more index pages and run with a "mostly data pages" cache.
H2 can't do that. Once it has I pages' worth of index pages cached, it can't cache any more, even if that is what would serve the workload best right now. It's stuck with a sub-optimal use of its RAM.
The only way for H2 to perform as well as H1 is if the D/I split initially chosen is ideal and the workload is very stable. That can certainly happen, but I'd wager it is not a very common database scenario. If that's not your case, think of H1 as being the same as H2, but with the partitioning of the cache between data and index pages managed dynamically by the thing that knows the most about how to optimize its I/O: the database engine itself.
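The dynamic-vs-static argument can be sketched with a toy simulation. This is pure illustration: the LRU policy, the page counts, and the morning/afternoon workload shift below are all invented for the example, not how any particular engine behaves.

```python
from collections import OrderedDict

class LRUCache:
    """A minimal LRU page cache that counts hits and misses."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()
        self.hits = self.misses = 0

    def access(self, page):
        if page in self.pages:
            self.pages.move_to_end(page)        # mark as most recently used
            self.hits += 1
        else:
            self.misses += 1
            if len(self.pages) >= self.capacity:
                self.pages.popitem(last=False)  # evict the least recently used
            self.pages[page] = True

def workload():
    # Morning: a hot set of 150 index pages; afternoon: 150 data pages.
    for i in range(5000):
        yield ("idx", i % 150)
    for i in range(5000):
        yield ("dat", i % 150)

R = 200
h1 = LRUCache(R)                                        # H1: one pool of size R
h2_index, h2_data = LRUCache(R // 2), LRUCache(R // 2)  # H2: I == D == R/2

for page in workload():
    h1.access(page)
    (h2_index if page[0] == "idx" else h2_data).access(page)

print("H1 hits:", h1.hits)
print("H2 hits:", h2_index.hits + h2_data.hits)
```

In this contrived run, H1's single 200-page pool holds each 150-page hot set entirely, while each fixed 100-page pool of H2 thrashes on a cyclic 150-page hot set and barely hits at all. Real engines and workloads are far messier, but the shape of the problem is the same: the fixed split can't follow the workload.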
This is not to say that maintaining different bufferpools is never a good idea.
One scenario I've encountered for isolating specific pages in a dedicated cache was a critical "report" that (obviously) needed to run fast at all times, and that happened to use a few tables that were pretty much never touched elsewhere. Because those pages were cold between runs, they kept getting evicted, and that "report" had runtimes that varied a lot from run to run. Moving that specific set of tables (and their indexes) to a dedicated pool removed most of the "non-deterministic" part of its performance.
But that's sub-optimal for the database as a whole; it's more of a kludge, closer to voodoo optimization IMO. (And that wasn't on DB2, but I believe that's irrelevant here.)
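That eviction-interference effect can be mimicked with another toy model. Again purely illustrative: the random-eviction cache, page counts, and traffic mix are invented, and real bufferpool behaviour is more subtle.

```python
import random

random.seed(42)

def report_hit_rate(cache_size, report_pages, background_pages, steps):
    """Interleave background churn with a report's small fixed working set;
    return the report's cache hit rate."""
    cache = set()

    def touch(page):
        if page in cache:
            return True
        if len(cache) >= cache_size:
            cache.discard(random.choice(sorted(cache)))  # evict a random page
        cache.add(page)
        return False

    hits = attempts = 0
    for _ in range(steps):
        # Background queries churn through many distinct pages.
        for _ in range(20 if background_pages else 0):
            touch(("bg", random.randrange(background_pages)))
        # The report re-reads its small, fixed set of pages.
        for p in range(report_pages):
            attempts += 1
            hits += touch(("rpt", p))
    return hits / attempts

# Shared pool: report pages compete with heavy background traffic.
shared = report_hit_rate(cache_size=500, report_pages=50,
                         background_pages=5000, steps=200)
# Dedicated pool: only the report's 50 pages occupy their own 50 slots.
dedicated = report_hit_rate(cache_size=50, report_pages=50,
                            background_pages=0, steps=200)
print(f"shared pool: {shared:.3f}  dedicated pool: {dedicated:.3f}")
```

The dedicated pool stays warm after its first pass, while the shared pool keeps losing report pages to background evictions — the same mechanism behind those wildly varying runtimes, and also the reason the split wastes RAM the rest of the day.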
So my advice is: if you have X GB of RAM available for cache, use a single bufferpool, put everything in it, and let the database engine do its magic in there.
If you run into a corner case that would appear to benefit from cache segregation, try it out, benchmark it, think about the overhead of having to maintain the magic numbers for each cache size, and go tune the queries, schema or disk layout instead :)
Through some offline conversations, Zane was able to determine that keeping a connection open would hasten the validation. To that end, I suggested they flip the RetainSameConnection property on the Connection Manager to True from its default.
If it works for you then it's a good practice. There are basically no hard rules for databases.