MySQL – How to INSERT into SQL without buffering, or force immediate buffer commit

innodb, mysql

I have a script I am running that lets the user add test junk to the MySQL DB for testing backups. The script offers the choice to add records of size X until the DB reaches a target size. The script works, but always overshoots the target size, because when you get the DB size with

SELECT Round(Sum(data_length + index_length) / 1024 / 1024, 1) "DB Size in MB"
FROM information_schema.tables 
WHERE table_schema='dbName' 
GROUP BY table_schema;

it doesn't seem to account for the buffer. So the script eventually detects I've reached the target size, but only when the buffer happens to flush (during debugging, I added 1MB text blocks one at a time, and the DB size reported as 2.5MB a few hundred times before jumping up to ~450MB).

I did try to FLUSH TABLES between each insert, but this just made the DB size constantly report as 0.0 even with 7000+ rows.

How can I ensure I am getting an accurate DB size after each insert (preferably without disabling InnoDB for the whole DB or restarting MySQL)?

Edit: I am using TEXT for the data columns.
COMMIT before pulling DB size doesn't work (still wind up at ~450MB).
SQL_NO_CACHE on SELECT statement above doesn't work (still wind up at ~450MB).

Best Answer

Plan A:

Shrink innodb_buffer_pool_size so as to force flushing to disk sooner (see the sketch after these caveats).

  • Be cautious -- too small a buffer_pool (a few MB) may lead to a hang or crash.
  • There is still some caching, so you still cannot get precisely what you want.
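
A minimal sketch of the resize, assuming MySQL 5.7 or later, where innodb_buffer_pool_size is a dynamic variable (older versions need the change in my.cnf plus a restart). Note that online resizing rounds to a multiple of innodb_buffer_pool_chunk_size (128MB by default), so going smaller than one chunk also requires a restart:

-- Check the current buffer pool size (bytes).
SELECT @@innodb_buffer_pool_size;

-- Shrink to one default chunk (128MB). A pool of only a few MB
-- risks the hang/crash mentioned above.
SET GLOBAL innodb_buffer_pool_size = 128 * 1024 * 1024;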

Plan B:

  1. Insert 1000 rows.
  2. Shut down MySQL (only this once).
  3. Do the math.
  4. Extrapolate to see how many rows to insert (see the sketch below).
  5. Finish to the computed limit.

Caveat: The math won't be quite right, so you might under- or overshoot.
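
The extrapolation in steps 3-4 is just a ratio. A sketch with made-up numbers -- say the shutdown shows 12.5MB on disk after the first 1000 rows and the target is 450MB -- computed in SQL for convenience:

-- Per-row on-disk cost from the 1000-row sample, scaled up to
-- the target size (all figures hypothetical).
SELECT CEILING(450 / (12.5 / 1000)) AS total_rows_needed;  -- 36000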

Plan C:

As with Plan B, but do a second shutdown and extrapolation after inserting half the number of rows computed from Plan B. This should give you a much more accurate target.
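
Continuing the made-up numbers from Plan B: insert up to half of the 36000 computed rows, shut down a second time, and re-derive the per-row cost from the larger sample before finishing:

-- Second checkpoint (hypothetical): 18000 rows inserted so far,
-- second shutdown shows 230MB on disk, target is 450MB.
SELECT CEILING((450 - 230) / (230 / 18000)) AS remaining_rows;  -- 17218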

Plan D:

From my experience, InnoDB tables are usually 2x to 3x larger than MyISAM tables with the same schema and same data. Since MyISAM table size is rather easy to compute, would computing that and then multiplying by 2 to 3 be good enough?
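
One way to get that MyISAM figure, sketched here with a hypothetical table junk in schema dbName -- MyISAM reports data_length immediately, without InnoDB's buffer-pool lag:

-- Build a MyISAM copy of the schema and load a sample of rows.
CREATE TABLE junk_myisam LIKE junk;
ALTER TABLE junk_myisam ENGINE=MyISAM;
INSERT INTO junk_myisam SELECT * FROM junk LIMIT 1000;

-- MyISAM sizes show up in information_schema right away.
SELECT Round((data_length + index_length) / 1024 / 1024, 1) AS "Sample MB"
FROM information_schema.tables
WHERE table_schema = 'dbName' AND table_name = 'junk_myisam';

Scale the per-row cost up to the target row count, then multiply by 2 to 3 for the InnoDB equivalent.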

If all columns are TEXT, that is a pretty messy schema. Storage will be inefficient, and SELECTs will have to work harder, especially for range tests on numeric values stored in TEXT columns.
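
A quick illustration of the range-test problem -- comparing numbers stored as strings uses string order, and casting to fix it gets in the way of index use (values hypothetical):

-- String comparison: '9' sorts after '10', so this returns 0.
SELECT '9' < '10' AS string_compare;

-- Casting gives the correct numeric answer (1), but a predicate
-- that casts a column this way cannot use a plain index on it.
SELECT CAST('9' AS UNSIGNED) < CAST('10' AS UNSIGNED) AS numeric_compare;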