MySQL – Does using an explicit transaction affect query performance

MySQLmysql-5.6

In MySQL 5.6, using InnoDB, I have a rather large table, call it A; (about 4 million rows) in which I need to update a field. The condition regarding what to update is held in another table, B; and the relation between the two is also held in another table, C; so I created my update statement with two joins, like so :

UPDATE A a JOIN (B b, C c) ON (conditions) SET a.myField = 'myValue'

Now I tested this statement on a dump of my target database, on a equivalent machine. Because I needed this dump for other purposes, I tested the thing using a transaction in order to rollback afterwards :

START TRANSACTION;
UPDATE [...]
ROLLBACK;

Which gave me pretty good performance (I think) for my update, about 17 seconds and 8 for the rollback.

My question is this: If I were to run this statement as is, not in a transaction, on my target database, would I obtain the same performance ? In other words, does using explicit transactions affect performance, and if yes, how ?

I tried to look the documentation on this but it is rather sparse on this topic.

Best Answer

1: Using autocommit to automatically commit each transaction:

SET autocommit = ON;
UPDATE ...

2: Explicitly turning a single statement into a transaction. This is 'identical' to #1.

BEGIN;     (aka START TRANSACTION;)
UPDATE ...;
COMMIT;

3: Multiple statements in a transaction:

BEGIN;
UPDATE ...;
UPDATE ...;
COMMIT;

4: Don't use this; some day you will forget to COMMIT:

SET autocommit = OFF;
UPDATE ...;
...
eventually: COMMIT;

5: Not advised. It can lead to slowdowns due to overflowing the "log":

BEGIN;
a million statements
COMMIT;

Notes:

Each SQL has some small overhead (network transmission; parsing; etc)
Each transaction as a noticeable overhead. In particular there is disk I/O to make it crash-safe. Hence #3 is faster than two copies of #1 or #2.
See innodb_flush_log_at_trx_commit for a tradeoff between integrity and speed.

Advice:

First priority: Do what the application needs -- this is, put the appropriate number of statements in each transaction.
Lower priority: Throw together multiple statements when suitable, thereby amortizing the transaction overhead.
In the case of INSERT, it is even better to have a single INSERT statement with multiple rows. Davit's Comment show a 9x speedup by putting multiple INSERTs in a single transaction. That is similar to the effect of the single batch insert.
Use ROLLBACK only when appropriate. It is costly. This is because InnoDB optimizes for COMMIT and has to do extra effort to undo.
For performance improvements, this whole discussion is pretty far down the checklist. More important things involve indexing and query formulation.
Multi-thousand-row DELETE and UPDATE have a lot of overhead and stress on the log. It is better to break them up.

Related Solutions

Mysql – the performance benefit when using a temporary table to update a regular table

First, let's look at a different aspect. Is your table ENGINE=InnoDB? If so, and you have not done anything (yet) about "transactions"? Then 5 Updates is 5 transactions. There is a significant overhead for each transactions. Hence, this would be a lot better:

BEGIN;
UPDATE ...;
UPDATE ...;
...
COMMIT;

You had 5 statements versus 3. I turned the 5 into 7.

If, instead, there were 500 updates to do, that would be 502 versus 3. Now the overhead of extra round-trips to the server becomes significant. At this point (502 statements versus 3), creating the tmp table, etc, becomes significantly better.

(There is no need for BEGIN...COMMIT on the tmp approach; the UPDATE is a 'transaction' unto itself, assuming autocommit=1.)

Mysql – Should I expect better performance out of MySQL

I think "sending data" includes disk access also so probably the whole table is being read for some reason. I would test on the same machine if you're not already to remove the possibility of the network adding latency. You should make the id your primary key as well as adding an index (or make sure to add a unique index). I would be very suspicious of key_len since 138 bytes is very big. You should be using eg, a 64 bit integer and then it would be 8 bytes.

Can you post (at least some of) DESCRIBE Test_schema.Events;?

But to answer the original question, yes, you should be able to get much better performance.

Best Answer

Related Solutions

Mysql – the performance benefit when using a temporary table to update a regular table

Mysql – Should I expect better performance out of MySQL

Related Question