MySQL – innodb_thread_concurrency seems evil as it enlarges table size by a big factor

Tags: innodb, mariadb, MySQL

Environment: OS: MS Windows 2012 R2; CPU: Xeon, 12 logical processors; HDD block size: 8 KB; DB engine: MariaDB 10.1.23, MariaDB 10.2.6, or MySQL 5.7.18, with default settings.

Schema:

CREATE DATABASE IF NOT EXISTS `z` DEFAULT CHARACTER SET utf8;
CREATE TABLE `test` (
    `testId` BIGINT(20) NOT NULL AUTO_INCREMENT,
    PRIMARY KEY (`testId`)
)
COLLATE='utf8_general_ci'
ENGINE=InnoDB;
  1. Use the mysqlslap command below to insert 1 million records without concurrency (single thread, single connection):

mysqlslap -pPutPASS --create-schema=z --number-of-queries=1000000
--query="insert into test values(null)"

test.ibd size: 36.0 MB (37,748,736 bytes)

  2. Truncate/empty the table and perform a concurrent insert as below:

mysqlslap -pPutPASS --create-schema=z --concurrency=100
--number-of-queries=1000000 --query="insert into test values(null)" --commit=10000

test.ibd size: 56.0 MB (58,720,256 bytes)

  3. Set innodb_thread_concurrency=12 and repeat step 2:

test.ibd size: 76.0 MB (79,691,776 bytes), sometimes 72 MB. I also tested with --concurrency=500 and the table size became 88.0 MB.

  4. The actual optimized size is 28 MB, obtained by running:

OPTIMIZE TABLE z.test;
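
As a side check, the logical size can be compared with the .ibd file size through information_schema (a rough measure only: data_free merely approximates allocated-but-unused space, and slack inside B-tree pages is not visible at all):

-- Compare InnoDB's reported data/index sizes with the file on disk.
SELECT table_name,
       ROUND(data_length  / 1024 / 1024, 1) AS data_mb,
       ROUND(index_length / 1024 / 1024, 1) AS index_mb,
       ROUND(data_free    / 1024 / 1024, 1) AS free_mb
FROM information_schema.TABLES
WHERE table_schema = 'z' AND table_name = 'test';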

Update 1:

The test.ibd sizes above are valid for MariaDB 10.1.23.

For MariaDB 10.2.6 test.ibd size:

  1. 36 MB (without concurrency)

  2. 88 MB (concurrency insert with innodb_thread_concurrency=0)

  3. 92 MB (innodb_thread_concurrency=12)

  4. 40 MB (optimized)

For MySQL 5.7.18 test.ibd size:

  1. 36 MB (without concurrency)

  2. 68 MB (concurrency insert with innodb_thread_concurrency=0)

  3. 72 MB (innodb_thread_concurrency=12)

  4. 40 MB (optimized)

The difference of 10.2.6 and 5.7.18 from 10.1.23 is mostly due to the Barracuda innodb_file_format, which seems worse than the old format, as the tests above show!

Questions:

  1. Why does the InnoDB table size increase by a big factor when I use concurrent insertion?
  2. Why does innodb_thread_concurrency cause the table size to triple, with nothing mentioned in the official docs about it?!
  3. What config tweaks will let MariaDB/MySQL benefit from the concurrency performance boost WHILE keeping the table size smallest? I am planning to use innodb_thread_concurrency=0 as explained in Hyperthreading & MySQL InnoDB Thread Concurrency Performance, but that's not enough: my project will have billions of records, and keeping the table size as small as possible will be a big gain for performance and a saving of H/W resources. (A minimal way to toggle the setting while testing is sketched below.)
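
Since innodb_thread_concurrency is dynamic, each test run can change it without a restart (0 means InnoDB imposes no thread limit of its own):

-- Flip the InnoDB thread limit between benchmark runs; no restart needed.
SET GLOBAL innodb_thread_concurrency = 0;
SHOW GLOBAL VARIABLES LIKE 'innodb_thread_concurrency';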

Update 2:

Until I get an answer to the above questions, I will highlight my project requirements in order to get the best configuration:

Production Environment:

OS: MS Windows 2012 R2

SSD: HPE 1.92TB SATA 6G Mixed, block size 4 KB

RAM: 768 GB

CPU: 4 Xeon Processors (Sockets) E7-8890 V3 2.5 GHZ (total cores: 72, total logical processors: 144)

Data: 150 inserts per second, 24x7, from 50 to 150 connections, single-row inserts. Availability of data is 1 year (meaning any record older than 1 year will be deleted); no updates on the rows, just insert once, then selecting/querying through a web UI. One year of data is estimated at 3-4 billion rows. The DB schema consists of one big table and 10 tables holding metadata (e.g. country), the biggest of which will hold a maximum of 3000 rows during the whole project life cycle, so the main focus is the big table. (A sketch of this shape follows.)
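
For concreteness, a minimal sketch of that shape; every table and column name here is hypothetical, only the structure comes from the requirements above:

-- Hypothetical names: one small lookup table stands in for the 10
-- metadata tables, and big_table stands in for the main table.
CREATE TABLE country (
    countryId SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT,
    name      VARCHAR(100) NOT NULL,
    PRIMARY KEY (countryId)
) ENGINE=InnoDB;

CREATE TABLE big_table (
    id        BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    createdAt DATETIME NOT NULL,
    countryId SMALLINT UNSIGNED NOT NULL,
    PRIMARY KEY (id),
    KEY idx_createdAt (createdAt)
) ENGINE=InnoDB;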

So far I adjusted below config params:

max_connections=501
innodb_flush_log_at_trx_commit = 0
innodb_log_buffer_size = 256M
innodb_buffer_pool_size=512G
innodb_log_file_size=1G
max_allowed_packet = 1G
innodb_flush_method=async_unbuffered
innodb_flush_neighbors=0
innodb_io_capacity=400
innodb_io_capacity_max=2000
innodb_log_compressed_pages=off
innodb_write_io_threads = 12
innodb_doublewrite = off
innodb_read_io_threads = 12
skip_name_resolve=ON
# dump/restore buffer pool, faster buffer pool warmup
innodb_buffer_pool_dump_at_shutdown = ON
innodb_buffer_pool_load_at_startup  = ON
innodb_page_size=4k
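
To confirm the values actually took effect after the restart (note that innodb_page_size is static and only applies to a freshly initialized data directory):

-- Read back the effective values from the running server.
SHOW GLOBAL VARIABLES
WHERE Variable_name IN ('innodb_buffer_pool_size', 'innodb_page_size',
                        'innodb_flush_log_at_trx_commit', 'innodb_doublewrite');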

Please suggest (with explanation) further config tweaks I need in order to get the best out of InnoDB and MariaDB 10.2.6.

Best Answer

I suspect I will get down-voted for this Answer, but I think these things need to be said. (And, no, I don't know the specific reasons for the strange sizes.)

Two simplified replies:

  • InnoDB plays a lot of games in its attempts to trade off speed and space.
  • Benchmarks rarely reflect anything in the real world.

Some factors (including possible explanations):

  • When there are multiple inserts into one table, InnoDB takes effort to minimize the impact of one insert on the other -- this may involve premature block splits, duplication of rows awaiting commits, MVCC, etc.
  • AUTO_INCREMENT is a bit tricky. Your benchmark is relatively simpleminded. In real life, there might be "burned" ids, leading to gaps in the values. This is (I assume) a deliberate efficiency measure -- gain speed, but waste ids. Note that there are various AI modes (see innodb_autoinc_lock_mode, and the check sketched after this list).
  • BTrees are inherently sloppy. When a block is "split", it becomes two half-full blocks. In the single-threaded benchmark this is not a big issue since there is no contention for the 'last' block. And InnoDB has some optimizations for such.
  • BTrees are even more sloppy in the presence of MVCC -- since multiple copies of a row live for some time.
  • BTree blocks (in InnoDB) are not necessarily recombined promptly.
  • Allocation is in pages, extents, and segments. See here. This can lead to megabytes of un-recouped space. Furthermore, there is no 'visibility' into how much space is wasted at the page/extent/segment level.
  • Concurrency of 100 (especially with only 12 processors) is asking for a lot of thrashing of locks in InnoDB. Oracle, even with 5.7, does not brag about more than 64. I find that 100 leads to InnoDB stumbling over itself, not necessarily finishing the benchmark any sooner than with a smaller number. That is, excessive concurrency can lead to terrible latency without necessarily helping throughput.
  • Committing every 10K rows (--commit=10000) may lead to excessive use of the undo buffer (etc.), thereby actually being slower than committing sooner.
  • Your "project will have billions of records" -- but how fast will they be inserted, and from how many different connections, and will they be batched or single-row inserted, etc.?
  • My take on high speed ingestion.
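
A quick way to check for such burned ids against the test table above (innodb_autoinc_lock_mode is not settable at runtime; 1, "consecutive", is the default in these versions):

-- Check which auto-inc lock mode the server was started with.
SHOW GLOBAL VARIABLES LIKE 'innodb_autoinc_lock_mode';

-- After a concurrent load starting from an empty table, a max id well
-- above the row count means ids were "burned" (skipped).
SELECT MAX(testId) AS max_id,
       COUNT(*)    AS row_cnt,
       MAX(testId) - COUNT(*) AS burned_ids
FROM z.test;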

More (after Update 2)

  • innodb_buffer_pool_instances = 16
  • innodb_doublewrite = off -- this may be safe since the drive's block size and innodb_page_size are both 4 KB, so page writes are effectively atomic.
  • With SSDs, 150 individual inserts/sec is not a problem.
  • innodb_page_size=4k -- be aware that all tables will have to live with this. It limits rows to under 2KB. BTrees will be a little deeper, but I don't see that as a problem.
  • With the "delete after a year", I strongly recommend using PARTITION BY RANGE(TO_DAYS(...)) with 14 (monthly) or 55 (weekly) partitions. This will make the deletes have essentially no impact. A DELETE, even daily, would probably cause hiccups. More discussion here .
  • How fast did the mysqlslap run (inserts/sec)? Probably a lot faster than 150. If your 150 is reasonably steady, then you will never experience concurrency=100, and possibly only rarely =10. I don't see a problem ingesting your data.
  • If you need to 'normalize' (ex: country, etc) the data as you load it, see my ingestion link.
  • What are the SELECTs like? They will be competing with the INSERTs. I suspect 200 selects/sec, if they are point queries, could coexist with the inserts. If you expect 1000, we need to think further.
  • With partitioning, if the excessive size persists, here is a fix: REORGANIZE or REBUILD (not OPTIMIZE) a weekly partition after it is fully built. That way, the excessive disk usage can be freed. Be sure to use innodb_file_per_table before creating the table. (Sketched below.)
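
A minimal sketch of that partition lifecycle, reusing the hypothetical big_table/createdAt from the question update (names are illustrative; note that MySQL requires the partition key to appear in every unique key, and ALTER TABLE ... PARTITION BY cannot be combined with other alterations):

-- The partition key must be part of the primary key.
ALTER TABLE big_table DROP PRIMARY KEY, ADD PRIMARY KEY (id, createdAt);

-- Weekly ranges; keep an empty catch-all partition at the end.
ALTER TABLE big_table PARTITION BY RANGE (TO_DAYS(createdAt)) (
    PARTITION p20170626 VALUES LESS THAN (TO_DAYS('2017-06-26')),
    PARTITION p20170703 VALUES LESS THAN (TO_DAYS('2017-07-03')),
    PARTITION pFuture   VALUES LESS THAN MAXVALUE
);

-- Once a week is fully written, rebuild just that partition to reclaim
-- the slack (much cheaper than OPTIMIZE on the whole table):
ALTER TABLE big_table REBUILD PARTITION p20170626;

-- A year later, removing that week is a near-instant metadata
-- operation instead of a multi-million-row DELETE:
ALTER TABLE big_table DROP PARTITION p20170626;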