Optimizing Amazon RDS – MySQLTuner Recommendations and Feedback

amazon-rdsdatabase-designMySQLperformancequery-performance

I'm currently trying to optimize a database I've inherited for speed.
The server is an Amazon RDS db.m3.large instance:

  • 2 VCPUs
  • 7.5 GB Memory
  • 500 GB HDD

My current strategy is to do the following:

  1. Remove all dev databases and push them to a different server
  2. Add unique index on aggr_id column in the problematic tables below (How to do this properly without interrupting production?)
  3. Switch MyISAM tables to InnoDB to prevent locking (What are the implications of doing this?)

I am making this post to get feedback on the current approach, and any additional ideas. I'm not a DBA by any means, just a web developer. Any and all feedback is welcomed, thanks in advance. I'll be sure to respond and approve answers! I will also edit this post with any insights gained and resources found.

There are a few problem tables all with the same format:

| xxxxx | CREATE TABLE `xxxxx` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `aggr_id` int(11) NOT NULL,
  `craft_id` int(11) NOT NULL,
  `task_year` smallint(8) unsigned NOT NULL,
  `hc1` mediumint(8) unsigned DEFAULT NULL,
  ......
  `hc365` mediumint(8) unsigned DEFAULT NULL,
  PRIMARY KEY (`id`,`task_year`),
  KEY `IDX_AGGRLABORDAY` (`task_year`,`aggr_id`,`craft_id`)
) ENGINE=MyISAM AUTO_INCREMENT=50480037 DEFAULT CHARSET=latin1
/*!50500 PARTITION BY RANGE  COLUMNS(task_year)
(PARTITION p1 VALUES LESS THAN (2012) ENGINE = MyISAM,
 PARTITION p2 VALUES LESS THAN (2013) ENGINE = MyISAM,
 PARTITION p3 VALUES LESS THAN (2014) ENGINE = MyISAM,
 PARTITION p4 VALUES LESS THAN (2015) ENGINE = MyISAM,
 PARTITION p5 VALUES LESS THAN (2016) ENGINE = MyISAM,
 PARTITION p6 VALUES LESS THAN (2017) ENGINE = MyISAM,
 PARTITION p7 VALUES LESS THAN (9999) ENGINE = MyISAM) */

Here is an explain of a problematic query:

explain DELETE FROM xxxxx WHERE aggr_id = 3000010;
+----+-------------+---------------------------+------+---------------+------+---------+------+----------+-------------+
| id | select_type | table                     | type | possible_keys | key  | key_len | ref  | rows     | Extra       |
+----+-------------+---------------------------+------+---------------+------+---------+------+----------+-------------+
|  1 | SIMPLE      | xxxxx                     | ALL  | NULL          | NULL | NULL    | NULL | 46611048 | Using where |
+----+-------------+---------------------------+------+---------------+------+---------+------+----------+-------------+
1 row in set (0.05 sec)

Below is MySQLTuner results:

[--] Assuming 7680 MB of physical memory
[!!] Assuming 0 MB of swap space (use --forceswap to specify)
[OK] Currently running supported MySQL version 5.6.21-log

-------- Storage Engine Statistics -------------------------------------------
[--] Status: +ARCHIVE +BLACKHOLE +CSV -FEDERATED +InnoDB +MRG_MYISAM 
[--] Data in MyISAM tables: 274G (Tables: 75)
[--] Data in PERFORMANCE_SCHEMA tables: 0B (Tables: 52)
[--] Data in InnoDB tables: 7G (Tables: 227)
[!!] Total fragmented tables: 35

-------- Security Recommendations  -------------------------------------------
[OK] All database users have passwords assigned

-------- Performance Metrics -------------------------------------------------
[--] Up for: 107d 15h 56m 50s (110M q [11.901 qps], 3M conn, TX: 14149B, RX: 108B)
[--] Reads / Writes: 58% / 42%
[--] Total buffers: 5.4G global + 1.5M per thread (604 max threads)
[OK] Maximum possible memory usage: 6.2G (83% of installed RAM)
[OK] Slow queries: 0% (17K/110M)
[OK] Highest usage of available connections: 29% (177/604)
[OK] Key buffer size / total MyISAM indexes: 16.0M/49.6G
[OK] Key buffer hit rate: 99.4% (25B cached / 155M reads)
[!!] Query cache is disabled
[OK] Sorts requiring temporary tables: 2% (133K temp sorts / 6M sorts)
[!!] Temporary tables created on disk: 62% (7M on disk / 11M total)
[OK] Thread cache hit rate: 74% (945K created / 3M connections)
[!!] Table cache hit rate: 0% (1K open / 674K opened)
[OK] Open file limit used: 3% (2K/65K)
[!!] Table locks acquired immediately: 78%
[!!] InnoDB  buffer pool / data size: 5.3G/7.7G
[!!] InnoDB log waits: 1
-------- Recommendations -----------------------------------------------------
General recommendations:
    Run OPTIMIZE TABLE to defragment tables for better performance
    When making adjustments, make tmp_table_size/max_heap_table_size equal
    Reduce your SELECT DISTINCT queries without LIMIT clauses
    Increase table_open_cache gradually to avoid file descriptor limits
    Read this before increasing table_open_cache over 64: http://bit.ly/1mi7c4C
    Optimize queries and/or use InnoDB to reduce lock wait
Variables to adjust:
    query_cache_type (=1)
    tmp_table_size (> 16M)
    max_heap_table_size (> 16M)
    table_open_cache (> 2000)
    innodb_buffer_pool_size (>= 7G)
    innodb_log_buffer_size (>= 8M)

Best Answer

(I'm writing this as I see your points; please read to the end before taking action. I 'develop' the best answer piece by piece -- hope you learn some things.)

DELETE FROM xxxxx WHERE aggr_id = 3000010;

must scan every row in every partition. That means it must do a lot of I/O, which will take a lot of time, regardless of the tuning.

If this is a common query, then add

INDEX(aggr_id)

Even better is change

KEY `IDX_AGGRLABORDAY` (`task_year`,`aggr_id`,`craft_id`)

Normally it is best to have the "partition key" (task_year) at the end of indexes, not the start. In this case, you may as well simply remove it. Any use of task_year will be for "pruning", then the KEY can take over. DROP that INDEX and ADD the following in a single ALTER:

KEY `IDX_AGGRLABORDAY` (`aggr_id`,`craft_id`)

Caution: It will lock the table for some time.

You ought to switch to InnoDB, which has online INDEX operations.

The tuner suggestions seem to be worse than usual:

  • Leave the query cache off
  • raising the tmp_table_size and max_heap_table_size is not necessarily useful
  • You seem to be using a mixture of InnoDB and MyISAM, so I recommend innodb_buffer_pool_size = 2500M and key_buffer_size = 500M. (The high buffer_pool setting was starving the data cache for MyISAM.)
  • As for the tmp tables; let's see your slowest query, plus the SHOW CREATE TABLE.
  • Ignore table fragmentation; don't bother OPTIMIZEing.
  • table_open_cache should be a few hundred. If open_files_limit is only 1K, then something in the OS is constraining the cache.

If aggr_id can be made UNIQUE, then perhaps it should be the PRIMARY KEY and eliminate id. Note that you cannot have a UNIQUEness constraint because of partitioning.

Note that there are two things intertwined here: Switching to InnoDB and making aggr_id the PK. If you do both, then do

PRIMARY KEY (aggr_id, craft_id, task_year)

and have no secondary keys. The rationale:

  • InnoDB really needs a PK.
  • task_year (the partition key) must be in every unique (that includes PK) key
  • It is usually best to put the partition key last
  • You apparently need craft_id for some other reason? Quite possibly it should be removed, especially since the PK coexists with the data in InnoDB. (Please show the SELECT(s) that wants craft_id; we can discuss this further.)

Switching to InnoDB is almost always "good". See my blog for the gotchas: http://mysql.rjweb.org/doc.php/myisam2innodb While you are at it, read about PARTITION issues here: http://mysql.rjweb.org/doc.php/partitionmaint It may conclude that partitioning is doing you no good.

If you go to all-InnoDB, then innodb_buffer_pool_size = 5G and key_buffer_size = 20M.