MYSQL parallel query performance issue

MySQLmysql-5.1parallelismperformance

Im working on a project for my company which sells huge amounts of weather data to customers through a Web API. As part of that API, there is a backend service which needs to fetch data from MYSQL database (ver 5.1.1, InnoDB). The data set can reach amount of 50 000 rows and 12 columns if interval is set to whole year, which is very common. MYSQL database runs on Red Hat Linux 64bit, 4 core server computer.

Query which calculates, for example, daily averages on set of 50000 rows executes in 3.5 sec. However, if i run 4 of those queries in parallel, execution time for each of them reaches average 8sec per query. Im also monitoring client connection on DB and I can clearly see there are 4 threads executing each query. CPU usage on server during execution goes from 0 to cca 50% usage.

My question is, why execution time rises if I run same query in parallel? Im I missing something or this is expected behaviour?

More server info: CPU: Intel Core I5-4440 @ 3.1 GHz (4 cores), Memory: 8 GB

EDIT: Increase of innodb_buffer_pool_size from 530 MB to 5.5 GB solved problem!

Best Answer

  • Upgrade to 5.5, then 5.6, then 5.7. You will be able to run more simultaneous queries at each upgrade. Do you really mean "5.1.1"? That was a pre-release for 5.1 from about 2005.

  • Use summary tables for old, static, data. (It is static after it is INSERTed, correct?) Perhaps summarizing (eg AVG, MAX, MIN) daily readings can be done just after midnight?

  • Use InnoDB, not MyISAM. Much better sharing of the table. (And it gets better with each version upgrade.)

  • innodb_buffer_pool_size should be about 5500M for your 8G machine. Is the dataset, including indexes, bigger than that? If so, get more RAM and/or RAID with striping and/or SSDs.

  • And Summary tables are the main cure for Data Warehouse performance issues. Further discussion . It's not uncommon to speed up queries 10-fold, especially I/O-bound ones.

"How does one solve..." -- Most applications are not doing lots of 3.5-second queries. They are doing 'point queries', which might be 3.5 milliseconds.