Advantages of single-leader (transactions) over multi-leader replication

I am reading the excellent book "Designing Data-Intensive Applications" which I wholeheartedly recommend, but I'm confused by a section comparing multi-leader (i.e. multi-writer) replication to single-leader replication. I understand the basic difference: In multi-leader, multiple leader nodes can accept writes, each leader sends its writes to the other leaders, and you have conflict-resolution rules to decide how to merge them. Single leader solves concurrency using transactions.

The following two paragraphs describe how multi-writer can be more challenging because conflicts aren't resolved right away. My question is afterward.

[This paragraph and the diagram are describing multi-leader.] For example, consider a wiki page that is simultaneously being edited
by two users as shown in Figure 5-7 [copied below]. User 1 changes the title of the
page from A to B, and user 2 changes the title from A to C at the same
time. Each user’s change is successfully applied to their local
leader. However, when the changes are asynchronously replicated, a
conflict is detected. This problem does not occur in a
single-leader database.

In a single-leader database, the second writer will either block and
wait for the first write to complete, or abort the second write
transaction, forcing the user to retry the write. On the other hand,
in a multi-leader setup, both writes are successful, and the conflict
is only detected asynchronously at some later point in time. At that
time, it may be too late to ask the user to resolve the conflict.

I see the difficulty with multi-writer here, but I'm skeptical that single-writer would be much better.

Consider the most likely chain of events when two people edit a Wikipedia page at roughly the same time: 1) Person 1 loads the edit page, takes 3-5 seconds to edit the title and submits. 2) Person 2 loads the edit page, takes 3-5 seconds to edit the title and submits. Each database transaction to apply the edit is only a few milliseconds, so it is far more likely that that the updates will happen one after the other, than that they will happen at the same time. Therefore if your concern is that one of these 2 people's updates will be lost, you need to address the potential for conflicts at the application level somehow; transactions won't really help you.

Furthermore in the case where the two transactions do overlap, it doesn't help the users to simply delay one of the transactions until the other is done. Once it resumes it will still overwrite the first user's data.

So my question is, is there some helpful transaction technique I'm missing that would actually be useful here? It's been a while since I tried to use transactions so my technique is rusty.

The best improvement I can think of: add AND title='A' to the end of both UPDATE statements, and add a second statement to the transaction that checks the number of affected rows, and rollback if that is equal to 0. The rollback would have no effect but it would indicate a failure to the client. But this is a bit hackish.

I don't think it would help to begin the transaction with a check (i.e. SELECT * FROM pages WHERE title='A' and make sure you get something back). Both transactions could possibly see 'A' at the beginning even though only one transaction would win out.

Best Answer

I came across your question while reading the same book.

I think the issue to consider is that in the single leader scenario, there is a natural total ordering as it can apply a sequence number per transaction. It always knows "which happened first". It's hard or impossible to do this correctly in multi leader due to issues like clock skew.

So I think it boils down to, does the system even know which happened first.

Your "write if" suggestion is mentioned/used many times and I think is agnostic to single/multi leader. Write if is application level (the user probably has to enter what they think the old value is) whereas single/multi leader is probably not exposed to the user. In single leader, the write if would unambiguously avoid conflicts; the total ordering determines which operation happens first, and whichever gets there second will safely fail due to the write if.

I think the advantage of multi is performance based; the advantage of single is "correctness/complexity" based.

Best Answer

Related Solutions

Sql-server – How Commit works in a transactional push replication with updatable subscribers

Sql-server – Replication of database used by two programs

Related Question