Your developer is mistaken. You need either SELECT ... FOR UPDATE or row versioning, not both.
Try it and see. Open three MySQL sessions (A), (B) and (C) to the same database. In (C), issue:
CREATE TABLE test(
    id INTEGER PRIMARY KEY,
    data VARCHAR(255) NOT NULL,
    version INTEGER NOT NULL
);
INSERT INTO test(id, data, version) VALUES (1, 'fred', 0);
BEGIN;
LOCK TABLES test WRITE;
In both (A) and (B), issue an UPDATE that tests and sets the row version, changing the winner text in each so you can see which session is which:
-- In (A):
BEGIN;
UPDATE test SET data = 'winnerA',
version = version + 1
WHERE id = 1 AND version = 0;
-- In (B):
BEGIN;
UPDATE test SET data = 'winnerB',
version = version + 1
WHERE id = 1 AND version = 0;
Now in (C), issue UNLOCK TABLES; to release the lock.
(A) and (B) will race for the row lock. One of them will win and get the lock; the other will block on it. The winner will proceed to change the row. Assuming (A) is the winner, you can now see the changed row (still uncommitted, so not visible to other transactions) with SELECT * FROM test WHERE id = 1.
Now COMMIT in the winner session, say (A). (B) will get the lock and proceed with its update. However, the version no longer matches, so it will change no rows, as reported by the row count result. Only one UPDATE had any effect, and the client application can clearly see which UPDATE succeeded and which failed. No further locking is necessary.
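The outcome of the race above can be reproduced in miniature (without the real concurrency, which needs two MySQL sessions) using Python's sqlite3 as a stand-in: two updates both test for version = 0, and only the first can succeed. The table and column names follow the demo above; SQLite is an assumption made purely so the sketch is runnable.

```python
import sqlite3

# In-memory SQLite stand-in for the MySQL demo above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test(id INTEGER PRIMARY KEY, "
             "data TEXT NOT NULL, version INTEGER NOT NULL)")
conn.execute("INSERT INTO test(id, data, version) VALUES (1, 'fred', 0)")

# Session A's update: the row still has version 0, so it changes one row.
a = conn.execute("UPDATE test SET data = 'winnerA', version = version + 1 "
                 "WHERE id = 1 AND version = 0")
print(a.rowcount)   # 1 -- A won

# Session B's update: the version is now 1, so the WHERE clause matches nothing.
b = conn.execute("UPDATE test SET data = 'winnerB', version = version + 1 "
                 "WHERE id = 1 AND version = 0")
print(b.rowcount)   # 0 -- B lost; its application sees the zero row count
```

The row count is the whole signal: the loser's application sees 0 rows changed and knows its update was based on stale data.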
See session logs at pastebin here. I used mysql --prompt="A> " etc. to make it easy to tell the sessions apart. I copied and pasted the output interleaved in time sequence, so it's not totally raw output and it's possible I made errors copying and pasting it. Test it yourself to see.
If you had not added a row version field, then you would need SELECT ... FOR UPDATE to reliably ensure ordering.
If you think about it, a SELECT ... FOR UPDATE is completely redundant if you're immediately doing an UPDATE without re-using data from the SELECT, or if you're using row versioning. The UPDATE will take a lock anyway. If someone else updates the row between your read and your subsequent write, your version won't match any more, so your update will fail. That's how optimistic locking works.
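The read-modify-write pattern that optimistic locking enables might be sketched as follows, again with sqlite3 as an assumed stand-in for MySQL; the helper name optimistic_update and the retry limit are hypothetical, not from the original answer:

```python
import sqlite3

def optimistic_update(conn, row_id, transform, max_retries=5):
    """Read, modify in the application, then write back with a version check."""
    for _ in range(max_retries):
        data, version = conn.execute(
            "SELECT data, version FROM test WHERE id = ?", (row_id,)).fetchone()
        cur = conn.execute(
            "UPDATE test SET data = ?, version = version + 1 "
            "WHERE id = ? AND version = ?", (transform(data), row_id, version))
        if cur.rowcount == 1:   # our version still matched: nobody beat us to it
            return
        # rowcount == 0 means another writer bumped the version; re-read and retry
    raise RuntimeError("still conflicting after %d attempts" % max_retries)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test(id INTEGER PRIMARY KEY, "
             "data TEXT NOT NULL, version INTEGER NOT NULL)")
conn.execute("INSERT INTO test(id, data, version) VALUES (1, 'fred', 0)")
optimistic_update(conn, 1, str.upper)
print(conn.execute("SELECT data, version FROM test WHERE id = 1").fetchone())  # ('FRED', 1)
```

Note that no lock is held between the SELECT and the UPDATE; the version check detects a conflict after the fact, which is exactly the optimistic part.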
The purpose of SELECT ... FOR UPDATE is:
- To manage lock ordering to avoid deadlocks; and
- To extend the span of a row lock for when you want to read data from a row, change it in the application, and write a new row based on the original one, without having to use SERIALIZABLE isolation or row versioning.
You do not need both optimistic locking (row versioning) and SELECT ... FOR UPDATE. Use one or the other.
I think that a simple update statement will work just fine:
UPDATE TableName
SET col = col + @user_input
WHERE key_col = @key ;
The UPDATE will take the necessary lock (probably on one row only), and any other update (by other users) will either wait for the lock to be released or fail.
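The point of the col = col + @user_input form is that the read and the write happen inside one statement, under one lock, so no other update can slip in between them. A runnable sketch, using sqlite3 in place of MySQL (the table and column names follow the statement above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableName(key_col INTEGER PRIMARY KEY, col INTEGER NOT NULL)")
conn.execute("INSERT INTO TableName VALUES (1, 100)")

# Two users add 10 and 25. Each UPDATE reads and rewrites the column in a
# single statement, so neither increment can overwrite the other's work.
for user_input in (10, 25):
    conn.execute("UPDATE TableName SET col = col + ? WHERE key_col = ?",
                 (user_input, 1))

print(conn.execute("SELECT col FROM TableName WHERE key_col = 1").fetchone()[0])  # 135
```

Contrast this with reading 100 into the application, adding to it there, and writing the sum back: that version has a window in which a concurrent writer's update would be lost.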
If you want the user to see the result of the update, you can enclose it in a transaction, something like this:
START TRANSACTION;
UPDATE table_name
SET col = col + @user_input
WHERE key_col = @key ;
SET @r =
( SELECT col
FROM table_name
WHERE key_col = @key
) ;
COMMIT ;
SELECT @r AS col_updated ;
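From application code, the same update-then-read-back transaction might look like the sketch below, assuming sqlite3 as a stand-in for MySQL; the helper name add_and_fetch is hypothetical:

```python
import sqlite3

def add_and_fetch(conn, key, user_input):
    """Apply the increment and read back the result inside one transaction."""
    with conn:  # sqlite3 commits on clean exit, rolls back on an exception
        conn.execute("UPDATE table_name SET col = col + ? WHERE key_col = ?",
                     (user_input, key))
        return conn.execute("SELECT col FROM table_name WHERE key_col = ?",
                            (key,)).fetchone()[0]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name(key_col INTEGER PRIMARY KEY, col INTEGER NOT NULL)")
conn.execute("INSERT INTO table_name VALUES (1, 100)")
print(add_and_fetch(conn, 1, 42))  # 142
```

Because the SELECT runs in the same transaction as the UPDATE, the value returned is the one this user's increment produced, not one changed by a later writer.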
Best Answer
Concurrency and isolation are the two topics you are dealing with here: the ability for more than one user to use the system simultaneously, and for each one's work to be unaffected by the others'. RDBMSs handle this through isolation levels and locks. It is how they are designed to work; it is what they do. Our job as application architects is to choose settings which meet the requirements and to build the system in a way that minimises the window in which conflicts can occur.
In this system you want consistency between rows in the Order table and other tables. For this you need locks. This is precisely what locks are for. Don't be scared of them. They are your friend and protect your data. For the reference tables you want to ensure you are taking shared locks by only reading and not writing, and preferably row locks. This way multiple concurrent activities can share the reference data.
Writing the Order will create an exclusive lock. Hopefully on just the new row but maybe on the table. It is therefore important to release that lock quickly by committing the transaction as soon after the insert as possible.
What you do not want is that these locks are held while the agents are typing. That is an eternity in computer terms. By all means validate input as they type but only as single statements, never in a transaction spanning the whole data entry session. Then at the end, when the order is saved, start a transaction, re-read the reference data, insert the Order and commit. On any platform that should complete in milliseconds. The other operators won't even notice it happening. If there is an error or a conflict the agent will get immediate feedback, which they can correct and re-submit.
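The save-time flow described above — start a transaction, re-read the reference data, insert the Order, commit — might be sketched as below. The table names (products, orders) and the helper save_order are hypothetical, and sqlite3 stands in for the real database:

```python
import sqlite3

def save_order(conn, product_id, qty):
    """Re-read reference data and insert the order in one short transaction."""
    with conn:  # the lock window is just these statements, not the typing session
        row = conn.execute("SELECT price FROM products WHERE id = ?",
                           (product_id,)).fetchone()
        if row is None:
            raise ValueError("unknown product")  # immediate feedback; agent re-submits
        conn.execute("INSERT INTO orders(product_id, qty, price) VALUES (?, ?, ?)",
                     (product_id, qty, row[0]))

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products(id INTEGER PRIMARY KEY, price REAL NOT NULL);
    CREATE TABLE orders(id INTEGER PRIMARY KEY, product_id INTEGER,
                        qty INTEGER, price REAL);
    INSERT INTO products VALUES (7, 9.99);
""")
save_order(conn, 7, 3)
print(conn.execute("SELECT product_id, qty, price FROM orders").fetchone())  # (7, 3, 9.99)
```

All the lock-holding work sits inside the with block, which is exactly the milliseconds-long window the answer recommends.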
There is an alternative, which I wouldn't recommend, but list here for interest's sake. You could architect the system on an asynchronous, message-processing design. When an agent saves data it is written to a single staging table which has one column per data value. No validation, no normalisation. One single insert and the data's persisted. With 7 operators the chances of there even being two operations in the DB at one time are almost zero. A second process polls this table to read one row at a time. It validates and writes to the normalised schema. The result - success or failure - is recorded. Then the next row in the staging table is processed, and so on. This guarantees that all inserts are fully serialised, at the expense of a significantly more complicated design, no immediate feedback to agents and the need for additional business processes and programming to correct errors. Also this will not protect against activity from other programs connecting to the database. Something has to be reading those orders after all. As I said, not recommended.
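For interest's sake, the staging-table design might be sketched as follows; every name here (staging, orders, the status column) is hypothetical, and sqlite3 again stands in for the real database. Agents do one raw insert each; a single worker serialises all validation and real inserts:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staging(id INTEGER PRIMARY KEY, raw_qty TEXT,
                         status TEXT DEFAULT 'new');
    CREATE TABLE orders(id INTEGER PRIMARY KEY, qty INTEGER NOT NULL);
""")

# Agents: one unvalidated insert each -- nothing to conflict over.
conn.execute("INSERT INTO staging(raw_qty) VALUES ('3')")
conn.execute("INSERT INTO staging(raw_qty) VALUES ('oops')")

# Worker: poll one 'new' row at a time; validate, write, record the result.
while (row := conn.execute(
        "SELECT id, raw_qty FROM staging WHERE status = 'new' "
        "ORDER BY id LIMIT 1").fetchone()) is not None:
    sid, raw = row
    try:
        conn.execute("INSERT INTO orders(qty) VALUES (?)", (int(raw),))
        status = 'done'
    except ValueError:
        status = 'error'   # needs a business process to correct and resubmit
    conn.execute("UPDATE staging SET status = ? WHERE id = ?", (status, sid))

print(conn.execute("SELECT status FROM staging ORDER BY id").fetchall())
# [('done',), ('error',)]
```

The single worker is what guarantees full serialisation, and the 'error' status illustrates the drawback: the agent gets no immediate feedback, so failures surface only later.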
Seven agents entering data at the same time is not a big number. You will be able to construct a system to handle this, I promise.