Always have an explicit PRIMARY KEY for any InnoDB table. If practical, use a "natural" key -- a column, or combination of columns, that is unique. Otherwise, add an AUTO_INCREMENT column.
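For example (table and column names here are hypothetical), either of these satisfies the rule:

    -- natural key: the combination of columns is already unique
    CREATE TABLE order_item (
        order_id INT UNSIGNED NOT NULL,
        line_no  SMALLINT UNSIGNED NOT NULL,
        qty      INT NOT NULL,
        PRIMARY KEY (order_id, line_no)
    ) ENGINE=InnoDB;

    -- no natural key: fall back to a surrogate
    CREATE TABLE event_log (
        id  INT UNSIGNED NOT NULL AUTO_INCREMENT,
        msg VARCHAR(255) NOT NULL,
        PRIMARY KEY (id)
    ) ENGINE=InnoDB;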
Learn about "composite indexes" -- this is using more than one column in an index. It is often a good optimization. It is not the same as having an index on each column.
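A quick sketch of the difference, using a hypothetical people table:

    -- two separate single-column indexes: the optimizer normally picks only one of them
    ALTER TABLE people ADD INDEX idx_last (last_name), ADD INDEX idx_first (first_name);

    -- one composite index: resolves WHERE last_name = ? AND first_name = ? in a single lookup
    ALTER TABLE people ADD INDEX idx_last_first (last_name, first_name);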
It is OK for the indexes to take up more space than the data. A quick survey showed 1/4 of my tables have that. Having good indexes improves performance far more than the bulkiness hurts performance.
Each index is updated for each row INSERTed, so having too many indexes hurts INSERTs (some).
You mentioned partitioning; did you mean "PARTITION BY ..."? Don't do it unless you have a particular purpose. I know of only 4 use cases; see my blog.
Do not bother to index low-cardinality columns (flags, enums, etc); they won't be used. If you have a status flag that is 10/90, then an index on just that column would be used for the 10, but not for the 90. A composite index starting with that column can still be useful when all of the columns in the index are in the WHERE clause; see the sketch below.
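For instance (hypothetical orders table):

    -- index on just the flag: useful only for the rare value
    ALTER TABLE orders ADD INDEX idx_status (status);

    -- composite index starting with the flag: also covers the other filter column
    ALTER TABLE orders ADD INDEX idx_status_created (status, created_at);

    SELECT id FROM orders
    WHERE status = 'pending'            -- the flag
      AND created_at > '2015-01-01';    -- the other column in the index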
The most important performance setting for InnoDB is to set innodb_buffer_pool_size to about 70% of available RAM.
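For example, on a dedicated server with 16GB of RAM (adjust the number to your machine), in my.cnf:

    [mysqld]
    innodb_buffer_pool_size = 11G    # roughly 70% of 16GB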
It is usually a bad idea to split DATE and TIME into two fields. Look at how messy your SELECTs have to be.
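Compare a range that spans midnight against a single DATETIME column versus split columns (hypothetical table and columns):

    -- single DATETIME column: simple and index-friendly
    SELECT * FROM readings
    WHERE taken_at >= '2015-06-01 22:00:00'
      AND taken_at <  '2015-06-02 02:00:00';

    -- split DATE + TIME columns: the same range needs an OR of two conditions
    SELECT * FROM readings
    WHERE (d = '2015-06-01' AND t >= '22:00:00')
       OR (d = '2015-06-02' AND t <  '02:00:00');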
How to create good indexes: http://mysql.rjweb.org/doc.php/index_cookbook_mysql
Show us some of the SELECTs; we can advise further.
Do you have Replication turned on? The first couple of items below might be causing extra I/O (and CPU):
( Binlog_cache_disk_use ) = 268,012 / 83387 = 3.2 /sec
-- Freq of binlog cache spilling to disk
-- Increase binlog_cache_size.
( Binlog_cache_disk_use / Binlog_cache_use ) = 268,012 / 268284 = 99.9%
-- Spill to disk
-- Increase binlog_cache_size
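A hedged starting point (the right value depends on your transaction sizes):

    [mysqld]
    binlog_cache_size = 1M    # up from the 32K default; then re-check Binlog_cache_disk_use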
( (Innodb_buffer_pool_reads + Innodb_buffer_pool_pages_flushed) ) = ((5484 + 32307952) ) / 83387 = 387 /sec
-- InnoDB I/O
-- Increase innodb_buffer_pool_size?
( Innodb_log_waits / Innodb_log_writes ) = 413 / 335941 = 0.12%
-- Frequency of needing to wait to write to log
-- Increase innodb_log_buffer_size.
( Uptime / 60 * innodb_log_file_size / Innodb_os_log_written ) = 83,387 / 60 * 128M / 534630468096 = 0.349
-- Minutes between InnoDB log rotations. Beginning with 5.6.8, this can be changed dynamically; be sure to also change my.cnf.
-- (The recommendation of 60 minutes between rotations is somewhat arbitrary.) Adjust innodb_log_file_size.
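Rough arithmetic from the numbers above (assuming the benchmark's write rate is representative of your real workload):

    534,630,468,096 bytes of redo / 83,387 sec   ≈ 6.4 MB/sec
    128M log file / 6.4 MB/sec                   ≈ 21 sec per rotation (the 0.349 min above)
    60 min of redo = 6.4 MB/sec * 3600 sec       ≈ 23 GB

Hitting the full 60-minute guideline would take an impractically large log file here; even raising innodb_log_file_size to 1G or 2G cuts the rotation churn by roughly 8-16x.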
( ( Innodb_pages_read + Innodb_pages_written ) / Uptime / innodb_io_capacity ) = ( 5483 + 32308192 ) / 83387 / 200 = 193.8%
-- If > 100%, need more io_capacity.
-- Increase innodb_io_capacity if the drives can handle it. (AWS can probably handle it.)
( Select_scan / Com_select ) = 61,056 / 132852 = 46.0%
-- % of selects doing full table scan. (May be fooled by Stored Routines.)
-- Add indexes / optimize queries
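To find them, run EXPLAIN on your frequent SELECTs (the query below is hypothetical):

    EXPLAIN SELECT id, name
    FROM customers
    WHERE status = 'active' AND created_at > '2015-01-01';
    -- "type: ALL" plus a large "rows" estimate means a full table scan;
    -- a suitable (often composite) index is the usual fix.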
( slave_pending_jobs_size_max / max_allowed_packet ) = 16M / 4M = 400.0%
-- For parallel slave threads
-- slave_pending_jobs_size_max must not be less than max_allowed_packet
( long_query_time ) = 10.000000 = 10
-- Cutoff (Seconds) for defining a "slow" query.
-- Suggest 2
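For example, in my.cnf:

    [mysqld]
    slow_query_log  = ON
    long_query_time = 2    # seconds; catches more of the real offenders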
You have the Query Cache half-off. You should set both query_cache_type = OFF and query_cache_size = 0. There is (according to a rumor) a 'bug' in the QC code that leaves some of the QC code running unless you turn off both of those settings.
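That is, in my.cnf:

    [mysqld]
    query_cache_type = OFF
    query_cache_size = 0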
Uptime is 23 hours. Were you running nothing but the benchmark for that whole time? I would prefer to analyze your workload, not a benchmark. (I usually find that benchmarks do not mirror reality.)
You asked about CPU... Some of the above issues relate directly to CPU; some relate indirectly. That is, the CPU is sometimes involved in things that otherwise seem like I/O. Perhaps the main example is when the code "spins the CPU" in hopes that a Mutex will be released or an I/O will finish. The presumption is that that is 'faster' than hitting the OS.
In any case, address whatever issues you can, then we can make another analysis pass.
Create a view v_ip that contains only the columns you want, then dump that view into a CSV file. Importing should be done with LOAD DATA.
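A minimal sketch (the base table, columns, and file paths are assumptions; adjust to your schema and to secure_file_priv):

    -- export only the wanted columns via the view
    CREATE VIEW v_ip AS
        SELECT ip, first_seen, hit_count
        FROM access_log;

    SELECT * FROM v_ip
    INTO OUTFILE '/tmp/v_ip.csv'
    FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
    LINES TERMINATED BY '\n';

    -- import on the target side
    LOAD DATA INFILE '/tmp/v_ip.csv'
    INTO TABLE ip_copy
    FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
    LINES TERMINATED BY '\n';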