Mysql – SELECT gives different results with/without FOR UPDATE in same transaction

MySQLmysql-5.6select

I'm sorry but I havent't found a better wording for the title.

Given the table a:

CREATE TABLE a (
    id INT NOT NULL auto_increment PRIMARY KEY,
    ref VARCHAR(10) NOT NULL,
    val INT NOT NULL
);

INSERT INTO a(ref, val) VALUES ('ten', 10);

And two transactions Tx1 and Tx2 with default isolation level, whose queries are issued in the following order, is the result shown in step [*] of transaction Tx2expected?

Tx1: START TRANSACTION;

            Tx2: START TRANSACTION;

            [^] Tx2: SELECT * FROM a WHERE id='1';
            (without FOR UPDATE)
            ===>
                "id"    "ref"   "val"
                "1"     "ten"   "10"

Tx1: SELECT * FROM a WHERE id='1' FOR UPDATE;
===>
    "id"    "ref"   "val"
    "1"     "ten"   "10"

            Tx2: SELECT * FROM a WHERE id='1' FOR UPDATE;
            (waits)

Tx1: UPDATE a SET v=20 WHERE id;

Tx1: COMMIT;

            Tx2: (locked on 'SELECT * FROM a WHERE id='1' FOR UPDATE')
            (unlocks)
            ===>
                "id"    "ref"   "val"
                "1"     "ten"   "20"

            [*] Tx2: SELECT * FROM a WHERE id='1';
            (without FOR UPDATE)
            ===>
                "id"    "ref"   "val"
                "1"     "ten"   "10"

[*]: Shouldn't this give the updated row value 20 for column val?

UPDATE: If I remove the query [^], then [*] yields the value 20.

This is on 5.6.39; queries were issued using third party client – HeidiSQL – though I guess that shouldn't matter.

Best Answer

A guess:

"Repeatable Read" means that the identical SELECT will produce identical results throughout the transaction. But, apparently, the FOR UPDATE clause is enough to make the two selects not 'identical'.

Under the covers... There is a clock (sequence number) ticking away. Whenever a row is modified, it is copied over. Now there are two copies of that row, one with the old clock, one with the new. (See "History list".) Your first SELECT is associated with the older clock value; the second SELECT, not being identical to the first, has a newer clock value, hence linked to a different copy of the row.

After the transaction if finished (COMMIT or ROLLBACK), the extra copy of the row is tossed. (I think this is a background task for COMMIT, but a "do it now" task for ROLLBACK.) Meanwhile, other connections pick the appropriate row from the history list for each row based on their isolation_mode. Roughly speaking, READ_COMMITTED goes to the old end of the list; READ_UNCOMMITTED ("dirty read") goes to the new end.

Things are optimized on the "optimistic" assumption that COMMITs are more likely than ROLLBACKs. Hence, the comments in the previous paragraph, and the slowness of ROLLBACK.