MariaDB instance in cloud hit 95% CPU utilisation for 30 mins

mariadb monitoring performance

I've got a MariaDB instance in the cloud which I put live about 2 months ago. I'm new to the DBA business, and I get the impression that the people at the company I run it through are learning on the hoof as well.

All is going well except that I saw CPU utilisation spike to 95% on Tuesday for no apparent reason.

I requested performance stats after a user complained of a long running query. It turns out they were doing regular hourly queries on unindexed fields and the table scans involved suddenly took a lot longer. I am now looking at indexing this column so they can run their queries faster.
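As a rough sketch of what checking and indexing such a column might look like: the table name `orders` and column name `customer_ref` below are hypothetical stand-ins, since the post does not name the real ones.

```sql
-- Hypothetical names: table `orders`, column `customer_ref`.
-- Before indexing: a plan with type = ALL and a large "rows" estimate
-- indicates a full table scan.
EXPLAIN SELECT * FROM orders WHERE customer_ref = 'ABC123';

-- Add an index on the column used in the WHERE clause.
CREATE INDEX idx_orders_customer_ref ON orders (customer_ref);

-- After indexing: the plan should list the new index under "key"
-- if the optimizer chooses to use it.
EXPLAIN SELECT * FROM orders WHERE customer_ref = 'ABC123';
```

Comparing the two EXPLAIN outputs is a quick way to confirm that the hourly query no longer scans the whole table.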

I'd like to know whether their queries could be the cause of the CPU utilisation, or whether whatever caused the CPU utilisation had just grabbed all the cache and caused their table scanning query to suddenly take far longer.

What should I be doing to capture information that will show what is going on and why the CPU utilisation spiked, and if it's a concern?

Here's the 5-day chart of CPU utilisation, the 1-day chart, and the only other charts that correlated with it:

[Charts not preserved: 5-day CPU utilisation, 1-day CPU utilisation, and two further correlated charts.]

The scales on the chart are probably the key, right?

The other charts provided (database connections, write IOPS, disk queue depth) didn't show any correlation.

I'm also puzzled why the normal pattern has totally changed. The usage is around a million queries per day, and has been for 2 months now.

Best Answer

(Not yet an "Answer", but debugging help:)

High CPU almost always means a SELECT needs optimizing. To find out which query:

Run SHOW FULL PROCESSLIST; when the naughty query is running (10:00?). Run it a few times; hopefully, you can spot some query with a large "Time" or perhaps it is being run repeatedly.
Or use the slow log to find it. Here are tips on how to get the info and present it to us here for help.
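The two approaches above could be sketched roughly as follows. The thresholds are illustrative assumptions, and on a managed cloud instance the `SET GLOBAL` statements may not be permitted, in which case the same variables would have to be changed through the provider's configuration settings.

```sql
-- Active (non-sleeping) statements, longest-running first.
-- Run this several times while the CPU is high.
SELECT id, user, db, command, time, state, LEFT(info, 200) AS query_text
FROM information_schema.PROCESSLIST
WHERE command <> 'Sleep'
ORDER BY time DESC;

-- Turn on the slow query log (requires sufficient privileges).
SET GLOBAL slow_query_log = ON;
-- Assumed threshold: log statements slower than 1 second.
-- The new global value applies to connections opened after the change.
SET GLOBAL long_query_time = 1;
-- Optionally also log statements that use no index at all.
SET GLOBAL log_queries_not_using_indexes = ON;

-- Check where the log is written.
SHOW GLOBAL VARIABLES LIKE 'slow_query_log%';
```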

You show heavy I/O also. That could be a query that is doing a table scan. Again, find the query and ask for help.

Meanwhile, refresh graphs, and add one for "slow queries", if it exists.
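If the monitoring tool has no "slow queries" graph, one hedged alternative is to sample the server's own counter by hand and compare readings taken some minutes apart:

```sql
-- Cumulative count of statements that exceeded long_query_time
-- since the server started.
SHOW GLOBAL STATUS LIKE 'Slow_queries';

-- Seconds since server start, to turn two samples into a rate.
SHOW GLOBAL STATUS LIKE 'Uptime';
```

A jump in `Slow_queries` between two samples taken around the spike would point towards slow statements rather than a change in query volume.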

Related Solutions

Mysql – CPU usage on RDS instance monotonically increasing with no change to query volume

I have some queries for you regarding table sizes that you can run in MySQL during these spikes

Database size in terms of StorageEngine (MB)

SELECT IFNULL(B.engine,'Total') "Storage Engine", CONCAT(LPAD(REPLACE(FORMAT(
B.DSize/POWER(1024,pw),3),',',''),17,' '),' ',SUBSTR(' KMGTP',pw+1,1),'B') "Data Size",
CONCAT(LPAD(REPLACE(FORMAT(B.ISize/POWER(1024,pw),3),',',''),17,' '),' ',
SUBSTR(' KMGTP',pw+1,1),'B') "Index Size",CONCAT(LPAD(REPLACE(FORMAT(B.TSize/
POWER(1024,pw),3),',',''),17,' '),' ',SUBSTR(' KMGTP',pw+1,1),'B') "Table Size"
FROM (SELECT engine,SUM(data_length) DSize,
SUM(index_length) ISize,SUM(data_length+index_length) TSize FROM information_schema.tables
WHERE table_schema NOT IN ('mysql','information_schema','performance_schema') AND
engine IS NOT NULL GROUP BY engine WITH ROLLUP) B,(SELECT 2 pw) A ORDER BY TSize;

Database size in terms of Databases (MB)

SELECT DBName,CONCAT(LPAD(FORMAT(SDSize/POWER(1024,pw),3),17,' '),' ',
SUBSTR(' KMGTP',pw+1,1),'B') "Data Size",
CONCAT(LPAD(FORMAT(SXSize/POWER(1024,pw),3),17,' '),' ',
SUBSTR(' KMGTP',pw+1,1),'B') "Index Size",
CONCAT(LPAD(FORMAT(STSize/POWER(1024,pw),3),17,' '),' ',
SUBSTR(' KMGTP',pw+1,1),'B') "Total Size" FROM (SELECT
IFNULL(DB,'All Databases') DBName,SUM(DSize) SDSize,SUM(XSize) SXSize,
SUM(TSize) STSize FROM (SELECT table_schema DB,data_length DSize,
index_length XSize,data_length+index_length TSize FROM information_schema.tables
WHERE table_schema NOT IN ('mysql','information_schema','performance_schema')) AAA
GROUP BY DB WITH ROLLUP) AA,(SELECT 2 pw) BB ORDER BY (SDSize+SXSize);

Database size in terms of Database/StorageEngine (MB)

SELECT IF(ISNULL(B.table_schema)+ISNULL(B.engine)=2,"Storage for All Databases",
IF(ISNULL(B.table_schema)+ISNULL(B.engine)=1,CONCAT("Storage for ",B.table_schema),
CONCAT(B.engine," Tables for ",B.table_schema))) Statistic,CONCAT(LPAD(REPLACE(FORMAT(
B.DSize/POWER(1024,pw),3),',',''),17,' '),' ',
SUBSTR(' KMGTP',pw+1,1),'B') "Data Size",CONCAT(LPAD(REPLACE(FORMAT(
B.ISize/POWER(1024,pw),3),',',''),17,' '),' ',SUBSTR(' KMGTP',pw+1,1),'B') "Index Size",
CONCAT(LPAD(REPLACE(FORMAT(B.TSize/POWER(1024,pw),3),',',''),17,' '),' ',
SUBSTR(' KMGTP',pw+1,1),'B') "Table Size" FROM (SELECT table_schema,engine,
SUM(data_length) DSize,SUM(index_length) ISize,SUM(data_length+index_length) TSize
FROM information_schema.tables WHERE table_schema NOT IN
('mysql','information_schema','performance_schema') AND engine IS NOT NULL
GROUP BY table_schema,engine WITH ROLLUP) B,(SELECT 2 pw) A ORDER BY TSize;

Pay attention to certain markers

Innodb_buffer_pool_pages_dirty
Innodb_data_reads
Innodb_data_writes
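These counters can be read directly from the server. A minimal sketch, to be run repeatedly during a spike so that successive values can be compared:

```sql
-- Innodb_buffer_pool_pages_dirty is a current value;
-- Innodb_data_reads and Innodb_data_writes are cumulative since startup,
-- so it is the difference between two samples that matters.
SHOW GLOBAL STATUS
WHERE Variable_name IN ('Innodb_buffer_pool_pages_dirty',
                        'Innodb_data_reads',
                        'Innodb_data_writes');
```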

I recommend downloading MySQL Administrator (I know, it's old, but I still use it for quick and dirty "I WANNA SEE STATS NOW" moments of the day) and setting it up. I customized my own graphs to watch the size of the InnoDB Buffer Pool and its dirty pages. You could also just use the Connection Health tab.

MySQL CPU & Memory Spikes

I remember having the same problem, and it has something to do with how vBulletin is programmed. Please check the regular "cronjobs" that vBulletin runs when triggered by visiting users (not the system cronjobs, which you have already ruled out as the culprit).

You can find them in version 4 of vBulletin in your admin panel, just below the settings for RSS-Feeds.

Another point to check is whether your vBulletin tables are in MyISAM format. vBulletin (at least until version 4) is optimized for MyISAM. If you happen to use InnoDB, then some queries on vBulletin's tables (especially the post table) will literally grind your system to a halt.
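One way to check the storage engine of each table is to query `information_schema`. The schema name `vbulletin` below is an assumption; substitute the name of the database that actually holds the forum tables.

```sql
-- Hypothetical schema name 'vbulletin'.
-- table_rows is only an estimate for InnoDB tables.
SELECT table_name, engine, table_rows
FROM information_schema.tables
WHERE table_schema = 'vbulletin'
ORDER BY table_name;
```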

Hope this helps.
