Mysql – Is it a waste to set innodb_buffer_pool_instances greater than the # of CPUs

Tags: buffer-pool, innodb, mysql, mysql-5.5

I set innodb_buffer_pool_size to 20 GB on a server with 12 CPU cores. My full database is 11 GB, but most of it is archived tables that are almost never used. The total queried data is around 3 GB, and the frequently queried data is ~1.25 GB.

What should I set the innodb_buffer_pool_instances to?

innodb_buffer_pool_size / total queried data = 6 pool instances
innodb_buffer_pool_size / frequently queried data = 16 pool instances

Normally I'd pick option #2, but logically it seems the number of buffer pools that can be used at any one time is no more than the total number of CPU cores.

Is it a waste to set innodb_buffer_pool_instances to more than the # of CPU cores?

Best Answer

I don't think you will need many buffer pools for your queried data, because the size of the frequently queried data doesn't quite justify it. This does of course depend on the definition of "frequently".

The documentation you should be referencing is the page on multiple InnoDB buffer pools, here:

http://dev.mysql.com/doc/refman/5.5/en/innodb-multiple-buffer-pools.html

The number I'm focusing on here is

the frequently queried data is ~1.25 gb.

My rule of thumb for sizing InnoDB buffer pools is to keep each instance at or around 1 GB. That keeps each instance's list of blocks as short as possible without letting the number of instances grow out of hand. This will always depend on your actual needs, however.

This is in line with MySQL's recommendations:

For best efficiency, specify a combination of innodb_buffer_pool_instances and innodb_buffer_pool_size so that each buffer pool instance is at least 1 gigabyte.

The point of the multiple buffer pools is to ensure your CPU threads don't meet high contention in accessing the data. Or as they put it:

You might encounter bottlenecks from multiple threads trying to access the buffer pool at once. You can enable multiple buffer pools to minimize this contention.

However, this feature is meant more for larger amounts of frequently accessed data than the 1.25 GB your system uses frequently. Ultimately, if I were in your position, I wouldn't see a need for more buffer pools than the number of CPUs, assuming all CPUs are only performing MySQL-related tasks. I would also look into the effects of using innodb_old_blocks_time to prevent the occasional query of your archived tables from taking the place of a block of data that is used over and over again.
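
To make the arithmetic concrete, here is a minimal Python sketch of the two limits discussed above: the documentation's guideline of at least 1 GB per instance, and my suggestion not to exceed the number of CPU cores. The function name and its parameters are hypothetical, and the result is only an upper bound to test against, not a recommended setting.

```python
def instance_upper_bound(pool_size_gb, cpu_cores, min_instance_gb=1.0):
    """Largest instance count that keeps every instance >= min_instance_gb
    and does not exceed the number of CPU cores (both are rules of thumb)."""
    by_size = int(pool_size_gb // min_instance_gb)  # 1 GB-per-instance guideline
    return max(1, min(by_size, cpu_cores))          # never fewer than one instance


if __name__ == "__main__":
    # Figures from the question: 20 GB buffer pool, 12 CPU cores.
    print(instance_upper_bound(20, 12))  # prints 12
```

With the numbers from the question, the 1 GB guideline alone would allow up to 20 instances, and the core count caps that at 12. Option #2 (16 instances) exceeds this bound, while option #1 (6 instances) fits within it.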

I hope that helps, let me know how everything works out.

Related Solutions

Mysql – How much memory do I need for innodb buffer pool

If you go strictly by that rule of accommodating an additional 10%, here is my suggestion:

SELECT CONCAT(CEILING(RIBPS/POWER(1024,pw)),SUBSTR(' KMGT',pw+1,1))
Recommended_InnoDB_Buffer_Pool_Size FROM
(
    SELECT RIBPS,FLOOR(LOG(RIBPS)/LOG(1024)) pw
    FROM
    (
        SELECT SUM(data_length+index_length)*1.1*growth RIBPS
        FROM information_schema.tables AAA,
        (SELECT 1 growth) BBB
        WHERE ENGINE='InnoDB'
    ) AA
) A;

This will produce exactly what you need to set innodb_buffer_pool_size in /etc/my.cnf. If you want to account for a 25% increase in data and indexes over time, change (SELECT 1 growth) BBB to (SELECT 1.25 growth) BBB.
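
If you want to check the arithmetic outside MySQL, the following Python sketch mirrors what the query computes: total data plus index bytes, plus 10% headroom, times the growth factor, rounded up to the largest fitting unit. The function name and parameter names are hypothetical. The input is assumed to be the SUM(data_length+index_length) value you would read from information_schema.tables, and a loop replaces the LOG-based exponent to avoid floating-point rounding at exact powers of 1024.

```python
import math


def recommended_buffer_pool_size(data_plus_index_bytes, growth=1.0, headroom=1.1):
    """Mirror the query: add 10% headroom, apply growth, round up to a unit."""
    ribps = data_plus_index_bytes * headroom * growth
    # Find the largest power of 1024 that fits (0 = bytes ... 4 = terabytes).
    pw = 0
    while pw < 4 and ribps >= 1024 ** (pw + 1):
        pw += 1
    value = math.ceil(ribps / 1024 ** pw)
    return f"{value}{' KMGT'[pw].strip()}"


if __name__ == "__main__":
    eleven_gib = 11 * 1024 ** 3  # illustrative input: 11 GiB of InnoDB data + indexes
    print(recommended_buffer_pool_size(eleven_gib))               # prints 13G
    print(recommended_buffer_pool_size(eleven_gib, growth=1.25))  # prints 16G
```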

Recently, I answered another question like this in the DBA StackExchange.

Mysql – Perl – MySQL/MariaDB – slow with no identifiable bottleneck

Conclusion and a workaround

After exhausting all options on Windows, I decided to switch to Linux, mostly because I was frustrated with the inability to profile and debug in detail.

I have moved the whole setup to Ubuntu 14.04. I first tried XAMPP but gave up because of conflicts between XAMPP and MySQL and MySQL Workbench. Then I moved to vanilla MySQL (5.5, I think) and vanilla Apache 2.

However, I was still left with the same problem – no visible bottleneck and resources still underutilized. I suspected throttling in TCP sockets (used between Perl code and MySQL), but further profiling proved this not to be the case.

Then I turned my attention to the Perl DBI module DBD::SQL, thinking that it might be doing some throttling. I ran some tests where I replaced DBI calls in Perl with system calls (system("mysql -e'INSERT INTO blah blah …'")). The performance did not change, which absolved DBI as a culprit.

I need to add one important detail now: I was in fact always running a number of my Perl scripts in parallel. Given that the CPU has 8 cores, this is necessary to utilize all of them, of course. Further debugging showed that almost all my Perl processes, which were supposed to be working furiously, were sleeping most of the time. Ubuntu System Monitor showed them waiting on the wait channels wait_answer_interruptible or unix_stream_recvmsg. The CPU History graph in System Monitor showed all Perl processes jumping to 100% CPU utilization and then dropping to ~0% in unison. I suspected that the MySQL server was not configured for multithreading, but htop showed 17 mysqld threads active, confirming that all should be OK.

I suspected that all MySQL threads were waiting on the same semaphore and were locked out for most of the time. I dreaded delving into the dark bowels of MySQL trying to figure out what goes on inside. Instead, I decided to replace MySQL with MariaDB, even though MariaDB seems to have had the same issue originally when I was running it on Windows.

Lo and behold – this finally worked. My perl scripts were screaming.

One last problem remained: I had a very rudimentary method of parallelising the perl scripts: I would just run 10 or 20 with their respective loads and hope that they would utilize all the resources.

This has obvious drawbacks: if too many processes are spawned, the OS may spend too much time switching between them (not a serious issue with only 20 processes, but it becomes one with e.g. 1000). If not enough processes are spawned (e.g. fewer than 8, one per core), the CPU will certainly not be fully utilized. If too many processes exhaust RAM, Linux will turn to disk and start swapping. As soon as this starts happening, everything grinds to a halt.

I searched but could not find a Perl library/script/code which would spawn new processes only when CPU, memory and disk are underutilized. Hence I created my own: raspawn.pl (resource aware spawn), which I placed on GitHub. Raspawn.pl spawns a number of processes while trying to keep resource utilization just below the maximum. It constantly checks the CPU, memory and disk utilization and starts a new process only if all are less than ~90% utilized.
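
The idea can be sketched in a few lines of Python. This is not the code of raspawn.pl, only a minimal illustration of the same approach under stated assumptions: all function names are hypothetical, the CPU probe uses the 1-minute load average (Unix only), the memory probe reads /proc/meminfo (Linux, on a kernel that exposes MemAvailable), and a disk probe is left out.

```python
import os
import subprocess
import time


def cpu_utilization():
    """1-minute load average divided by core count (Unix only), capped at 1.0."""
    load1, _, _ = os.getloadavg()
    return min(load1 / (os.cpu_count() or 1), 1.0)


def memory_utilization(meminfo_path="/proc/meminfo"):
    """Fraction of RAM in use according to /proc/meminfo (Linux only)."""
    fields = {}
    with open(meminfo_path) as fh:
        for line in fh:
            key, _, rest = line.partition(":")
            fields[key] = int(rest.split()[0])  # values are reported in kB
    return 1.0 - fields["MemAvailable"] / fields["MemTotal"]


def spawn_when_idle(commands, probes, threshold=0.9, poll_seconds=1.0,
                    launch=subprocess.Popen, sleep=time.sleep):
    """Start each command only once every probe reports utilization below threshold."""
    started = []
    for command in commands:
        # Wait while any resource is at or above the threshold (~90% by default).
        while any(probe() >= threshold for probe in probes):
            sleep(poll_seconds)
        started.append(launch(command))
    return started
```

A call such as spawn_when_idle(jobs, [cpu_utilization, memory_utilization]), where jobs is a list of argument lists for the worker scripts, would then start the workers one by one as resources allow. The launch and sleep parameters exist only so the loop can be exercised without starting real processes.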

Finally, this worked. I can now process my whole load in around 7 days, instead of many months...
