SQL Server's buffer pool is a wonderful thing. It's smart enough to handle all sorts of situations in a fairly intelligent way. Here are a couple of examples showing how, at first glance, the buffer pool's behavior seems strange but is actually fairly clever.
Consider a 400 GB clustered index on a server with 64 GB of memory available for the buffer pool. If a query scans the entire 400 GB index, how much of the buffer pool should the index use after the query is complete?
Probably not very much. If SQL Server aggressively cached all the data it accessed, maybe the last 50 GB read during the scan would be in memory. However, the next time you run the same query, that 50 GB is useless because SQL Server will have to read the first 350 GB from disk before it gets to the last 50 GB. This is a common enough scenario that SQL Server handles it intelligently: it is more likely that all of the intermediate-level index pages will be cached, along with a few of the leaf pages.
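If you're curious how that plays out on your own system, a query along these lines shows how much of each index is cached and at which level of the b-tree (a sketch against the standard buffer descriptor DMVs; run it in the database you care about):

```sql
-- Sketch: cached pages per index in the current database,
-- broken down by b-tree level (0 = leaf, higher = intermediate)
SELECT OBJECT_NAME(p.object_id)  AS table_name,
       i.name                    AS index_name,
       bd.page_level,
       COUNT(*) * 8 / 1024       AS cached_mb
FROM sys.dm_os_buffer_descriptors AS bd
JOIN sys.allocation_units AS au
  ON au.allocation_unit_id = bd.allocation_unit_id
 AND au.type IN (1, 3)            -- in-row and row-overflow data
JOIN sys.partitions AS p
  ON p.hobt_id = au.container_id
JOIN sys.indexes AS i
  ON i.object_id = p.object_id
 AND i.index_id  = p.index_id
WHERE bd.database_id = DB_ID()
GROUP BY p.object_id, i.name, bd.page_level
ORDER BY table_name, index_name, bd.page_level;
```

After a big scan you'd expect the higher page_level rows (the intermediate pages) to account for a disproportionate share of what stays cached.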
I've got a server with 512 GB of memory, but it never seems to use all of it, or it uses it... strangely?
SQL Server is NUMA-aware. On NUMA (Non-Uniform Memory Access) systems, some memory is closer/faster to a CPU than other memory. For example, a 4-socket system might have 4 memory controllers (1 per socket). It is faster for a CPU to access memory on the controller attached to it than to access memory through another CPU's controller (I've seen remote access run 1.4-1.7 times slower).
SQL Server knows this, and will prefer to allocate local-node memory before remote-node memory. Since requests can run on any CPU (though they don't typically jump between them) and hence allocate memory on any node, the first request to access an index will cache its pages in local memory. In most cases you'll end up with a fairly equal distribution of cached pages; however, each node is managed individually by SQL Server, so effectively your 512 GB is really four 128 GB buffer pools. You could have a situation where node 1 is using all 128 GB and discarding pages to make room, while node 2 is using only 80 GB and has lots of free space.
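You can watch the per-node split yourself, since the buffer descriptor DMV exposes which node each cached page lives on (a sketch):

```sql
-- Sketch: buffer pool usage per NUMA node
SELECT numa_node,
       COUNT(*) * 8 / 1024 AS cached_mb
FROM sys.dm_os_buffer_descriptors
GROUP BY numa_node
ORDER BY numa_node;
```

A heavily skewed result is what the 128 GB-vs-80 GB situation described above would look like.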
Why would my heavily-read indexes be cached less than indexes that only receive writes?
One possible reason is checkpoints. Depending on your recovery interval, it may be several minutes after a data change occurs before a checkpoint writes the page to the data file (it's already in the log file on disk). During this time, the page can't be discarded to make room in the buffer pool.
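You can estimate how much of the pool is pinned by dirty pages at any moment (a sketch; is_modified flags pages changed since they were last written out):

```sql
-- Sketch: dirty (modified, not yet checkpointed) pages per database
SELECT DB_NAME(database_id) AS database_name,
       COUNT(*) * 8 / 1024  AS dirty_mb
FROM sys.dm_os_buffer_descriptors
WHERE is_modified = 1
GROUP BY database_id
ORDER BY dirty_mb DESC;
```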
Besides setting the size, are there other ways to change how the buffer pool behaves?
There isn't a whole lot of control over the buffer pool and the decisions it makes. That's fine, because you typically don't need to tweak it. Watching the perfmon counters under "SQLServer:Buffer Manager" and "SQLServer:Buffer Node" can provide insight into what the buffer pool is doing, whether it is sufficient in size, and whether your IO subsystem can keep up with checkpoints.
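Those counters are also exposed inside SQL Server itself, so you can sample them with a query instead of perfmon (a sketch pulling a few of the commonly watched ones):

```sql
-- Sketch: Buffer Manager counters straight from the DMV
SELECT RTRIM(object_name)  AS object_name,
       RTRIM(counter_name) AS counter_name,
       cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%Buffer Manager%'
  AND counter_name IN ('Page life expectancy',
                       'Lazy writes/sec',
                       'Checkpoint pages/sec');
```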
Regarding a couple of items you mentioned:
- Fragmentation on disk doesn't affect memory usage directly. However, if an index has 50% free space on its pages due to deletes, it will take up twice as much room on disk and in memory, but that's related to free space, not fragmentation.
- PAGEIOLATCH waits measure the time it takes for the IO subsystem to load pages into memory. They are an indicator of disk performance, not memory (see the query after this list).
- You are correct, high page lock waits are due to the writes. The writes make pages dirty, and dirty pages have to stick around in the buffer pool until they get checkpointed to disk.
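For the PAGEIOLATCH point above, the waits are easy to check (a sketch; the numbers are cumulative since the instance last restarted):

```sql
-- Sketch: cumulative PAGEIOLATCH waits since the instance started
SELECT wait_type,
       waiting_tasks_count,
       wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type LIKE 'PAGEIOLATCH%'
ORDER BY wait_time_ms DESC;
```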
I addressed this issue 1.5 years ago: innodb_file_format Barracuda
Just from reading the documentation, I was able to figure out mathematically that the smaller you make the key_block_size, the less extra space you need to add to the InnoDB Buffer Pool.
Here is the chart I made up in that earlier post:

- You have a DB Server with an 8G Buffer Pool
- You ran compression with key_block_size=8
  - 8 is 50.00% of 16; 50.00% of 8G is 4G
  - raise innodb_buffer_pool_size to 12G (8G + 4G)
- You ran compression with key_block_size=4
  - 4 is 25.00% of 16; 25.00% of 8G is 2G
  - raise innodb_buffer_pool_size to 10G (8G + 2G)
- You ran compression with key_block_size=2
  - 2 is 12.50% of 16; 12.50% of 8G is 1G
  - raise innodb_buffer_pool_size to 9G (8G + 1G)
- You ran compression with key_block_size=1
  - 1 is 06.25% of 16; 06.25% of 8G is 0.5G (512M)
  - raise innodb_buffer_pool_size to 8704M (8G (8192M) + 512M)
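For reference, the kind of statement the chart assumes looks like this (a sketch with a hypothetical table mytable; on the MySQL versions this discussion covers, compression also requires innodb_file_per_table and the Barracuda file format):

```sql
-- Sketch: compress a hypothetical table with 8K pages (key_block_size=8)
SET GLOBAL innodb_file_format = 'Barracuda';  -- needed on MySQL 5.5/5.6
SET GLOBAL innodb_file_per_table = ON;
ALTER TABLE mytable ROW_FORMAT = COMPRESSED KEY_BLOCK_SIZE = 8;
```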
Given the following
- BPS = Buffer Pool Size
- KBS = key_block_size
- BPSFC = Buffer Pool Size For Compression
The ideal size for the Buffer Pool in a perfect world would be:
BPSFC = BPS * (16 + KBS) / 16
From this formula, less is more: the smaller KBS is, the smaller the increase in BPS needs to be. Additionally, it would appear that the CPU would work a little harder in response to tighter compression. How?
- For KBS=8, evicting one uncompressed page leaves room for 02 compressed pages
- For KBS=4, evicting one uncompressed page leaves room for 04 compressed pages
- For KBS=2, evicting one uncompressed page leaves room for 08 compressed pages
- For KBS=1, evicting one uncompressed page leaves room for 16 compressed pages
- For KBS=X, evicting one uncompressed page leaves room for 16/X compressed pages
I can see the CPU working harder to introduce more compressed pages under tighter compression. This could result in evicting the least-recently-used compressed pages more often.
If this is the case, it would appear that making BPS even bigger than the values suggested by these formulas would give the CPU some relief, as well as reduce the rate of eviction of compressed pages from the Buffer Pool.
EXAMPLE: From my first chart, an 8GB Buffer Pool would need to be 12GB for key_block_size=8. Making the Buffer Pool larger than 12GB would relieve the CPU of a lot of this heavy lifting (in the form of evicting uncompressed pages). Perhaps doubling the Buffer Pool size needs to be explored.
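If you do grow the pool, that part is a one-line change (a sketch for the key_block_size=8 case; innodb_buffer_pool_size is resizable online on MySQL 5.7.5 and later, while older versions need it set in my.cnf and a restart):

```sql
-- Sketch: grow the buffer pool to 12G for the key_block_size=8 case
-- (dynamic on MySQL 5.7.5+; otherwise set in my.cnf and restart)
SET GLOBAL innodb_buffer_pool_size = 12 * 1024 * 1024 * 1024;
```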
As for your question:
will enabling compression on tables without changing the buffer pool size hurt or benefit the buffer pool hit rate?
I would say it would hurt, not so much because of the hit rate against the compressed pages, but because of the additional housekeeping involved in managing uncompressed pages and the CPU usage required for it. This would lower InnoDB performance no matter how much tuning the rest of the storage engine gets.
Best Answer
This is the only really meaningful value in the situation that you are in... and that situation is that you are lucky enough to have a buffer pool with a perfect 100% hit rate. Don't over-analyze the rest of it, because there is nothing you need to change, unless the server OS is low on memory, causing swapping.
The young/not-young values aren't interesting in a case where there's zero pressure on the buffer pool. InnoDB is using it; it doesn't do anything without it. If the pool is too small, pages get evicted and new pages get read in, and the other stats help you understand that... but that is a problem you don't appear to have.
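If you want to watch those counters without eyeballing SHOW ENGINE INNODB STATUS, they are also exposed in information_schema (a sketch; the table is available on MySQL 5.6 and later):

```sql
-- Sketch: hit-rate, free-buffer, and young/not-young counters
SELECT pool_id,
       pool_size,
       free_buffers,
       database_pages,
       modified_database_pages,
       hit_rate,
       pages_made_young,
       pages_not_made_young
FROM information_schema.INNODB_BUFFER_POOL_STATS;
```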
Free "unused" space in the pool will never be neglected or left idle by InnoDB if it is needed for any reason at all, so the fact that it's free means only that you have some breathing room to expand into as the size of your working dataset grows.
That's all it means, unless, of course, you recently restarted the server, in which case it's incomplete. The server needs to run through a full period of "normal" usage (including full backups) before the stats tell the whole story... whether that's an hour, a day, a week, a month, or a year depends on your application.