Mysql – Which MySQL table to use for read-only, bulk-loaded tables

bulkinnodbmyisamMySQLstorage-engine

So I have a workload on 2 key tables in my application with the following characteristics:

  • READ-ONLY DATA
    The data is bulk-loaded in a fully consistent manner from a single process on a schedule, there are no deletes nor updates. I am using INSERT IGNORE ...
  • Very Large Tables
    I am using temporal partitioning, but even with that my expected worse case on a single paritition is 10M-100M records per paritition (older partitions are eventually deleted).
  • No need for FK, transactions, consistent reads
    Theses table are used to seed very targeted time-sensitive queries and reporting views. There is no need for enforced FK constraints (there are no explicit FKs) and certainly no transactions are needed. There is also no need for clients issuing selects to have a fully consistent view of the data — so if I am in the middle of a bulkload and they only see 1/2 of the in-progress updates, that's totally cool.

I initially was leaning towards InnoDB and then went back to MyISAM after reading about concurrent inserts, which sounds like a perfect description of my use case (high number of inserts with no intervening updates) I am thinking of going back to MyISAM esp given how many other features that I just don't care about (e.g. Foreign Keys, transactions).

Could someone with real-world experience using MySQL for similar workloads offer a suggestion on what storage engine is the right choice here. Plan on doing some serious testing early next week but just would like to get a rough idea of what to expect/look out for here.

Best Answer

This is definitely a loaded question. Why ???

  • InnoDB caches data and index pages
  • InnoDB may lock Clustered Indexes during Inserts/Updates
  • InnoDB is multithreaded and can be tuned for accessing CPUs/Cores
  • MyISAM only caches index pages
  • MyISAM must always be read data from disk
  • MyISAM can have dedicated caches per tables

If you have plenty of RAM and intend to read the same data throughout a business day, some would go with InnoDB. If you are looking for high-speed reads from varying ranges of data, some would go with MyISAM. Even still, there are simply too many mitigating factors to pick one fully over the other such as

  • VARCHAR field processing
  • Row Format
  • Foreign Keys
  • Index Cardinality
  • Physical Disk (SSD, SATA, SAN, NAS)
  • Application Processing Needs

Here are some posts that compare and contrast MyISAM and InnoDB

Please read these before deciding ...