I am running a large local database with MySQL and InnoDB. The size is around 30GB for data and 6GB for Keys. The machine has 16GB of RAM.
The table looks like this
symbol VARCHAR(10),
iso_datestamp VARCHAR(14),
-- a lot more columns with values with mostly DECIMAL(26,6)s
I have BTREE indexes on the symbol and the iso_datestamp.
The operations I run are mostly updates which are run like this:
DELETE FROM table WHERE symbol = 'XXX'
INSERT INTO table (columns) VALUES (...)
This needs to be done to avoid data inconsistencies as opposed to just appending data.
I also frequently run analysis as in
SELECT FROM table WHERE symbol in ('XXX', 'YYY', ...) [AND iso_datestamp > '2015...']
Per symbol I have around 3000 entries.
The performance problem: As soon as the table starts to grow over around 10GB, the DELETE
and INSERT
statements slow down substantially from usually less than 0.4 seconds to well over 1 second even up to 5 seconds and more as the table keeps growing. RAM is not limiting (according to top
).
I need the total of DELETE
and INSERT
to be around 1 second on average (of course faster is also fine)…
I am having 2 ideas right now:
- Using hash indexes instead of btree indexes, since I am only doing equality operation
- Partitioning the table using the symbol and the range
Are those viable strategies? Do both, just one? Other things I should do to improve the speed?
Best Answer
iso_datestamp
to a MySQLTIMESTAMP
so that it can be 5 bytes instead of 15.CHARACTER SET ascii
(orlatin1
) unless you need the multibyte utf8/utf8mb4.PRIMARY KEY(symbol, datestamp)
PARTITION BY RANGE(TO_DAYS(datestamp))
-- this will significantly speed up deletion of 'old' rows.DROP PARTITION
for very efficient deleting by date. (See link below) If you are deleting by symbol, let me know.DECIMAL(26,6)
takes 12 bytes. Consider shrinking it and/or moving to an 8-byteDOUBLE
. Caution: Double's ~16 significant digits and binary format may lead to rounding issues.MEDIUMINT INT
for asymbol_id
and normalize the actualsymbol
name into another (small) table.FLOATs
. It's 7 significant digits is overkill for such.innodb_buffer_pool_size
to about 70% of available memory (after accounting for apps, web server, etc.)These changes should help with your various issues.
Currently, performance is falling off a cliff for multiple reasons. The above tips will delay the cliff for about twice as long, and make the cliff less steep. In some cases, the cliff is eliminated.
Also, my tips will cut back on the I/O needed -- by clustering, by locality of reference, by better caching, etc.
http://mysql.rjweb.org/doc.php/partitionmaint