MySQL – How to INSERT into SQL without buffering, or force immediate buffer commit

innodb, mysql

I have a script I am running that lets the user add test junk to the MySQL DB for testing backups. The script offers the choice to add records of size X until the DB reaches a target size. The script works, but always overshoots the target size, because when you get the DB size with

SELECT Round(Sum(data_length + index_length) / 1024 / 1024, 1) "DB Size in MB"
FROM information_schema.tables 
WHERE table_schema='dbName' 
GROUP BY table_schema;

it doesn't seem to account for the buffer. So the script eventually detects I've reached the target size, but only when the buffer happens to flush (during debugging, I added 1MB text blocks one at a time, and the DB size reported as 2.5MB a few hundred times before jumping up to ~450MB).

I did try to FLUSH TABLES between each insert, but this just made the DB size constantly report as 0.0 even with 7000+ rows.

How can I ensure I am getting an accurate DB size after each insert (preferably without disabling InnoDB for the whole DB or restarting MySQL)?

Edit: I am using TEXT for the data columns.
COMMIT before pulling DB size doesn't work (still wind up at ~450MB).
SQL_NO_CACHE on SELECT statement above doesn't work (still wind up at ~450MB).

Best Answer

Plan A:

Shrink innodb_buffer_pool_size so as to force flushing to disk sooner (see the sketch after these caveats).

  • Be cautious -- too small a buffer_pool (a few MB) may lead to a hang or crash.
  • There is still some caching, so you still cannot get precisely what you want.
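
A minimal sketch of the resize, assuming MySQL 5.7 or later, where innodb_buffer_pool_size is a dynamic variable (older versions need the change in my.cnf plus a restart). Note that online resizing rounds to a multiple of innodb_buffer_pool_chunk_size (128MB by default), so going smaller than one chunk also requires a restart:

-- Check the current buffer pool size (bytes).
SELECT @@innodb_buffer_pool_size;

-- Shrink to one default chunk (128MB). A pool of only a few MB
-- risks the hang/crash mentioned above.
SET GLOBAL innodb_buffer_pool_size = 128 * 1024 * 1024;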

Plan B:

  1. Insert 1000 rows.
  2. Shut down MySQL (only this once).
  3. Do the math.
  4. Extrapolate to see how many rows to insert (see the sketch below).
  5. Finish to the computed limit.

Caveat: The math won't be quite right, so you might under- or overshoot.
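
The extrapolation in steps 3-4 is just a ratio. A sketch with made-up numbers -- say the shutdown shows 12.5MB on disk after the first 1000 rows and the target is 450MB -- computed in SQL for convenience:

-- Per-row on-disk cost from the 1000-row sample, scaled up to
-- the target size (all figures hypothetical).
SELECT CEILING(450 / (12.5 / 1000)) AS total_rows_needed;  -- 36000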

Plan C:

As with Plan B, but do a second shutdown and extrapolation after inserting half the number of rows computed from Plan B. This should give you a much more accurate target.
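
Continuing the made-up numbers from Plan B: insert up to half of the 36000 computed rows, shut down a second time, and re-derive the per-row cost from the larger sample before finishing:

-- Second checkpoint (hypothetical): 18000 rows inserted so far,
-- second shutdown shows 230MB on disk, target is 450MB.
SELECT CEILING((450 - 230) / (230 / 18000)) AS remaining_rows;  -- 17218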

Plan D:

From my experience, InnoDB tables are usually 2x to 3x larger than MyISAM tables with the same schema and same data. Since MyISAM table size is rather easy to compute, would computing that and then multiplying by 2 to 3 be good enough?
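
One way to get that MyISAM figure, sketched here with a hypothetical table junk in schema dbName -- MyISAM reports data_length immediately, without InnoDB's buffer-pool lag:

-- Build a MyISAM copy of the schema and load a sample of rows.
CREATE TABLE junk_myisam LIKE junk;
ALTER TABLE junk_myisam ENGINE=MyISAM;
INSERT INTO junk_myisam SELECT * FROM junk LIMIT 1000;

-- MyISAM sizes show up in information_schema right away.
SELECT Round((data_length + index_length) / 1024 / 1024, 1) AS "Sample MB"
FROM information_schema.tables
WHERE table_schema = 'dbName' AND table_name = 'junk_myisam';

Scale the per-row cost up to the target row count, then multiply by 2 to 3 for the InnoDB equivalent.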

If all columns are TEXT, that is a pretty messy schema. Storage will be inefficient, and SELECTs will have to work harder, especially for range tests on numeric values stored in TEXT columns.
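
A quick illustration of the range-test problem -- comparing numbers stored as strings uses string order, and casting to fix it gets in the way of index use (values hypothetical):

-- String comparison: '9' sorts after '10', so this returns 0.
SELECT '9' < '10' AS string_compare;

-- Casting gives the correct numeric answer (1), but a predicate
-- that casts a column this way cannot use a plain index on it.
SELECT CAST('9' AS UNSIGNED) < CAST('10' AS UNSIGNED) AS numeric_compare;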