Mysql – SELECT gives different results with/without FOR UPDATE in same transaction

MySQLmysql-5.6select

I'm sorry but I havent't found a better wording for the title.

Given the table a:

CREATE TABLE a (
    id INT NOT NULL auto_increment PRIMARY KEY,
    ref VARCHAR(10) NOT NULL,
    val INT NOT NULL
);

INSERT INTO a(ref, val) VALUES ('ten', 10);

And two transactions Tx1 and Tx2 with default isolation level, whose queries are issued in the following order, is the result shown in step [*] of transaction Tx2expected?

Tx1: START TRANSACTION;

            Tx2: START TRANSACTION;

            [^] Tx2: SELECT * FROM a WHERE id='1';
            (without FOR UPDATE)
            ===>
                "id"    "ref"   "val"
                "1"     "ten"   "10"

Tx1: SELECT * FROM a WHERE id='1' FOR UPDATE;
===>
    "id"    "ref"   "val"
    "1"     "ten"   "10"

            Tx2: SELECT * FROM a WHERE id='1' FOR UPDATE;
            (waits)

Tx1: UPDATE a SET v=20 WHERE id;

Tx1: COMMIT;

            Tx2: (locked on 'SELECT * FROM a WHERE id='1' FOR UPDATE')
            (unlocks)
            ===>
                "id"    "ref"   "val"
                "1"     "ten"   "20"

            [*] Tx2: SELECT * FROM a WHERE id='1';
            (without FOR UPDATE)
            ===>
                "id"    "ref"   "val"
                "1"     "ten"   "10"

[*]: Shouldn't this give the updated row value 20 for column val?

UPDATE: If I remove the query [^], then [*] yields the value 20.

This is on 5.6.39; queries were issued using third party client – HeidiSQL – though I guess that shouldn't matter.

Best Answer

A guess:

"Repeatable Read" means that the identical SELECT will produce identical results throughout the transaction. But, apparently, the FOR UPDATE clause is enough to make the two selects not 'identical'.

Under the covers... There is a clock (sequence number) ticking away. Whenever a row is modified, it is copied over. Now there are two copies of that row, one with the old clock, one with the new. (See "History list".) Your first SELECT is associated with the older clock value; the second SELECT, not being identical to the first, has a newer clock value, hence linked to a different copy of the row.

After the transaction if finished (COMMIT or ROLLBACK), the extra copy of the row is tossed. (I think this is a background task for COMMIT, but a "do it now" task for ROLLBACK.) Meanwhile, other connections pick the appropriate row from the history list for each row based on their isolation_mode. Roughly speaking, READ_COMMITTED goes to the old end of the list; READ_UNCOMMITTED ("dirty read") goes to the new end.

Things are optimized on the "optimistic" assumption that COMMITs are more likely than ROLLBACKs. Hence, the comments in the previous paragraph, and the slowness of ROLLBACK.

Related Solutions

Mysql – Identical query, tables, but different EXPLAIN and performance

Just make a simple join. Sub-queries does not provide the best result quite often

EXPLAIN SELECT l.id, l.level_name, l.date_published, l.rating
FROM levels AS l
INNER JOIN users_favorites AS uf 
ON uf.level_id = l.id
WHERE l.user_id = 2;

Mysql – Adding index to large thesql tables

The activity of altering big tables are done in phases:

Create a new table with required fields and indexes say in test DB (just structure)
Dump the data from the existing table and load the same to the newly created table in test DB
Now announce your downtime :)
Swap the tables by renaming - RENAME table ur_db.table_name to test.temp, test.table_name to ur_db.table_name, test.temp to test.table_name; This is an atomic operation taking fraction of a second.
Load the extra records to the newly created table ( the records that came after dumping and then loading). This step can be done before Step: 3 also once to reduce your downtime.
And your system is back

Few notes:

You need not hit information schema directly like this, try using SHOW TABLE STATUS from db like 'table_name'
The speed of alter table is more or less linked with the I/O speed. Last time when we ran a direct alter table(without the above steps), we had 40GB+ table size it took around 4hours. If your 20GB data is taking years, you are working on some outdated machine.
Also drop of the unwanted indexes like index_bslt_ondate, index_base_schedule_line_items_loan_base_schedule as some other indexes has the left most column as the indexed column

Please let me know if you need any clarification on any of these steps.

Edit : A simple python script to automate the process https://github.com/georgecj11/hotswap_mysql_table.git

Best Answer

Related Solutions

Mysql – Identical query, tables, but different EXPLAIN and performance

Mysql – Adding index to large thesql tables

Related Question