MySQL – Does using an explicit transaction affect query performance

MySQLmysql-5.6

In MySQL 5.6, using InnoDB, I have a rather large table, call it A; (about 4 million rows) in which I need to update a field. The condition regarding what to update is held in another table, B; and the relation between the two is also held in another table, C; so I created my update statement with two joins, like so :

UPDATE A a JOIN (B b, C c) ON (conditions) SET a.myField = 'myValue'

Now I tested this statement on a dump of my target database, on a equivalent machine. Because I needed this dump for other purposes, I tested the thing using a transaction in order to rollback afterwards :

START TRANSACTION;
UPDATE [...]
ROLLBACK;

Which gave me pretty good performance (I think) for my update, about 17 seconds and 8 for the rollback.

My question is this: If I were to run this statement as is, not in a transaction, on my target database, would I obtain the same performance ? In other words, does using explicit transactions affect performance, and if yes, how ?

I tried to look the documentation on this but it is rather sparse on this topic.

Best Answer

1: Using autocommit to automatically commit each transaction:

SET autocommit = ON;
UPDATE ...

2: Explicitly turning a single statement into a transaction. This is 'identical' to #1.

BEGIN;     (aka START TRANSACTION;)
UPDATE ...;
COMMIT;

3: Multiple statements in a transaction:

BEGIN;
UPDATE ...;
UPDATE ...;
COMMIT;

4: Don't use this; some day you will forget to COMMIT:

SET autocommit = OFF;
UPDATE ...;
...
eventually: COMMIT;

5: Not advised. It can lead to slowdowns due to overflowing the "log":

BEGIN;
a million statements
COMMIT;

Notes:

  • Each SQL has some small overhead (network transmission; parsing; etc)
  • Each transaction as a noticeable overhead. In particular there is disk I/O to make it crash-safe. Hence #3 is faster than two copies of #1 or #2.
  • See innodb_flush_log_at_trx_commit for a tradeoff between integrity and speed.

Advice:

  • First priority: Do what the application needs -- this is, put the appropriate number of statements in each transaction.
  • Lower priority: Throw together multiple statements when suitable, thereby amortizing the transaction overhead.
  • In the case of INSERT, it is even better to have a single INSERT statement with multiple rows. Davit's Comment show a 9x speedup by putting multiple INSERTs in a single transaction. That is similar to the effect of the single batch insert.
  • Use ROLLBACK only when appropriate. It is costly. This is because InnoDB optimizes for COMMIT and has to do extra effort to undo.
  • For performance improvements, this whole discussion is pretty far down the checklist. More important things involve indexing and query formulation.
  • Multi-thousand-row DELETE and UPDATE have a lot of overhead and stress on the log. It is better to break them up.