MySQL – Consistent Nonlocking Reads vs. INSERT … SELECT

concurrencyinnodbisolation-levellockingMySQL

So, from my reading of the MySQL 5.5 documentation and some experiments I did with colleagues, I understand that the following is the case:

  1. With InnoDB tables, a simple SELECT statement (one that does not use FOR UPDATE or LOCK IN SHARE MODE) will not grab any row locks on the tables that the SELECT statement reads. This is true at all transaction isolation levels.
  2. However, if your transaction isolation level is REPEATABLE READ or higher, an INSERT ... SELECT or CREATE TABLE AS SELECT statement will place row locks on tables that it reads from.

Sources:

If I understand this correctly (and correct me if I don't), then I'm puzzled by this difference. Why does reading a table require lock rows in the one case but not the other, when the transaction isolation level is the same? I'd like to understand the reason for this.

Best Answer

From the MySQL Documentation

The type of read varies for selects in clauses like INSERT INTO ... SELECT, UPDATE ... (SELECT), and CREATE TABLE ... SELECT that do not specify FOR UPDATE or LOCK IN SHARE MODE:

By default, InnoDB uses stronger locks and the SELECT part acts like READ COMMITTED, where each consistent read, even within the same transaction, sets and reads its own fresh snapshot.

To use a consistent read in such cases, enable the innodb_locks_unsafe_for_binlog option and set the isolation level of the transaction to READ UNCOMMITTED, READ COMMITTED, or REPEATABLE READ (that is, anything other than SERIALIZABLE). In this case, no locks are set on rows read from the selected table.

Evidently, you do not need stronger locks when just doing reads. Picture copying data into a table using INSERT ... SELECT or CREATE TABLE ... SELECT. The SELECT needs to be a frozen snapshot for loading the table. If it were a moving target, that would require a transaction within a transaction. If you wanted transaction within a transaction behavior, you would need to script that using SAVEPOINT. Otherwise, stronger locks allow for consistent SELECTs for an atomic INSERT.

Doing the simple SELECT allows your DB Session to trust other transactions not to play smoke-and-mirrors with your view of the data. If it has to violate that trust, it will politely lock every row it can to do its work, rather than deceive DB Sessions that are only doing SELECTs into thinking the data is stable when it is, in fact, changing.

No matter which isolation level you pick, none of them is so granular that a simple SELECT gets held up. That's why SELECT ... FOR UPDATE and SELECT ... FROM ... LOCK IN SHARE MODE were invented, so you can be that granular if politeless is not an option.

I wrote about this before : How can I make a select statement get blocked?