MySQL Replication – How to Read from Slaves Manually

MySQLreplication

MySQL Manual – Replication:

"Scale-out solutions – spreading the load among multiple slaves to
improve performance. In this environment, all writes and updates must
take place on the master server. Reads, however, may take place on one
or more slaves. "

I see how to setup replication which looks relatively simple, but I haven't seen how an application should communicate to slaves.

I assume the application would have to determine which slave to read from? The application will also have to know which server is doing writes? Or is it possible to have the application send all queries to the Master and have reads proxied to slaves?

Best Answer

I am adding an additional answer because Giovanni's one is completely right, but it only defines why scaling, how to do HA on the master and on the slaves, and how to split read queries among the slaves from an architectural point of view. I think OP question was more about how to do the read-write split itself.

Then answer depends on the software stack used, but many MySQL connectors/ORMs/frameworks, etc. allows you to define several connections as read-write or read-only. For example, the PHP connector for MySQL has a plugin for read-write splitting that can send read-only statements or those marked so to a different connection.

If you connector does not support it, you can always use a fabric-aware connector, a framework that can handle HA and read-write split for you, among other things.

Of course, you can always program it manually just by opening two connections, and use one for writing and another for reading, but managing connections at low level (reopening them if they fail, retrying statements, etc.) can be tricky.

The main problem here is that doing read-write split transparently can lead to many problems because the replication protocol is not synchronous. That means that if you perform a read just after a write, it may not have been replicated to the slave yet. So, for reads that have to be up-to-date, you will have to send those to the master. You may also want to monitor your replication lag so that it never goes over the maximum limit that your application requires. Or, in some cases (think, for example, banking) you may want to avoid asynchronous replication completely and use other protocols.

Think that there may be many processes that are heavy, such as reports or analytics that may not be affected by the replication lag, and that can safely be offloaded to the slaves. You code should be able to be "intelligent" enough to decide which service to use: in most cases you cannot do it transparently.