I have read pt-online-schema-change
documentation and understood that it works by creating triggers and copying data in chunks with in chunk-time seconds(0.5 seconds in default)
Let us consider this below case.
We have a table TBL1 of 5 GB and is being altered with PT-ONLINE-SCHEMA-CHANGE
PT-ONLINE-SCHEMA-CHANGE created new table with the desired change, added trigggers on
TBL1 and started copying.copied 10 chunks of data and started copying 11th chunk which is some 'X' MB in size.
As per tool, this chunk be copied in a max of 500 milliseconds or 0.5 secondsDuring the 11th chunk copying, application issued an UPDATE which has to modify effect
rows that are part of this 11th chunk of data.
My question is that which of the below things happen now.original table now ?
A) Update will be blocked till the chunk be copied and then applied on original table
which gets updated through triggers on new one.
B) Update will be performed on original table and data chunk will be re copied all
over again.
If Option A is the answer, how PT-ONLINE-SCHEMA-CHANGE can be lock free ?
If Option B is the answer, how it knows the chunk being copied is modified ?
Thanks in advance.
Best Answer
pt-online-schema-change takes advantage of the consistency nature of relational databases at table level. Let's see what happens when we write to the original table, one of three cases:
The chunk has already been copied. No problem, the trigger will overwrite the values of the accessory table within the transaction
The chunk has not yet been copied. The trigger will probably fail, but the original modification will be done thanks to the
IGNORE
option on theINSERT/UPDATE/DELETE
of the trigger. No problem, when the pt-osc process reaches to it, it will copy the newer values.Your question. The chunk is being copied. There are 2 subcases, depending on which process starts first.
a) If the modification of the original table starts first, either the whole table (in the case of MyISAM) or the set of rows (in the case of InnoDB), will be locked for other modifications, so pt-osc will have to wait for the write to finish. You should avoid very time-consuming write operations while this tool is working, or relying on it to pause if the load is too high, but in the unfortunate case that you are doing in concurrency another large update, the session variable
innodb_lock_wait_timout
(configured by the parameterlock-wait-timeout
, by default 1 second) for pt-osc should kick in and revert/abort the copy of that chunk. In that case, the tool will retry that chunk again--retries
times (default, 3), or the whole process will abort, in theory gracefully. In most cases, once the lock disappears (because the concurrent write finishes), you will be essentially on case 2.b) If pt-osc starts first, it will essentially lock again the whole table (in MyISAM) or those specific rows (in InnoDB) for writing (a READ LOCK), preventing concurrent writes. This will only be hold for, ideally, up to 0.5 seconds, as the chunk size has been calculated so that the chunk copy should not take more than that. Once it has finished, you are essentially on case 1.
So, yes, there is not 100% lock free tool, but no such a tool can exist if you want to guarantee consistency. Even the rename table process takes microseconds (after all, a/several file(s) must be renamed on disk), and for those microseconds, the table will be locked. The idea is to have a minimum amount of locking. For MyISAM, every time you write a row (except for concurrent inserts) the whole table is locked, and for InnoDB, at least that particular row (sometimes more). You cannot change that unless you sacrifice the atomicy and durability of changes. However, you can take into acount some tricks to minimize its problems:
Reading the code of the tool is not that hard, I invite you to read it for more information.