MySQL – How to scale MySQL transaction-wrapped queries

mysql, scalability, transaction

So I'm working on an app that wraps many of its SQL queries in transactions, and I'm trying to think of some ways to scale the DB layer up.

Up until now we've been throwing hardware at the problem, slowly upgrading our Rackspace cloud servers, but we're reaching the upper end of their cloud server offerings.

I'm thinking the standard R/W split won't work for us because each transaction gets pinned to a single server, eliminating the benefits of the R/W split. (Is that right?)

I've looked at dbShards and ScaleBase, both of which look like interesting solutions to the problem, but as a cash-strapped startup we'd prefer to use something OSS or devote some man-hours to creating a sharding strategy instead.

So I suppose the question boils down to, are Galera/MySQL Cluster and sharding my only two options for scaling this setup?

EDIT

To answer some more questions…

We manage a great deal of financial and inventory-related data; for this reason we wrap most of our SQL traffic in transactions so it's all or nothing.

We have ~100 tables in our DB, with ~30 of them getting a great deal of use. The most commonly used tables have anywhere from 15MM to 35MM rows.

Our transactions span 20-30 tables, with joins between ~10 of them. The queried tables are rather large, and we have made some inroads on query optimization and proper indexing, but no one in-house is a full-fledged DBA… closest is me, haha.

A lot of our DB traffic is machine-generated; we do a lot of syncing with third-party data sources. As such, most of the SQL queries are wrapped in transactions to ensure data consistency when performing all the calculations and syncing with the third parties (and to allow for a rollback if the link to a third party gets disrupted).
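To give a concrete picture, our sync jobs are roughly the following shape. This is a minimal sketch, not our actual code; the table names (inventory, sync_log), connection details, and the mysql-connector-python usage are just illustrative:

```python
# Minimal sketch of the all-or-nothing sync pattern described above.
# Table names and connection details are hypothetical stand-ins.
import mysql.connector

def sync_third_party(records):
    conn = mysql.connector.connect(
        host="db-master.example.com", user="app", password="...",
        database="app_db",
    )
    try:
        conn.start_transaction()
        cur = conn.cursor()
        for rec in records:
            cur.execute(
                "UPDATE inventory SET qty = %s WHERE sku = %s",
                (rec["qty"], rec["sku"]),
            )
        cur.execute("INSERT INTO sync_log (source) VALUES (%s)", ("vendor_x",))
        conn.commit()    # all writes become visible atomically
    except Exception:
        conn.rollback()  # e.g. the third-party link dropped mid-sync
        raise
    finally:
        conn.close()
```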

Based on our Cacti monitoring, it looks like a great deal of our SQL traffic is reads (~85% reads to ~15% writes).

Just as a note, we haven't hit much of a bottleneck with our current cloud server, but I'm doing my due diligence ahead of time to make sure we don't hit a wall without a plan.

Best Answer

Read/write splitting doesn't move all reads to a replica server; it only moves the reads away from the master that are sensible to move. Any reads done by your processes that are integrating third-party data would remain on the master, along with any reads related to transactions, but many other reads, such as report generation or compiling aggregate/summary views, could probably be offloaded to another machine.

How much this would help vs. how complicated it is to implement is very situation-specific. The 85%/15% split you're observing seems promising, but it doesn't tell us how many of the reads are directly related to the writes and thus need to stick with the master server.

Another concern with read/write splitting is the read-after-write consistency seen by your users. If I'm browsing my bank account transactions, that information could easily be read from a replica; on the other hand, if I just did a transfer between accounts, then go back to look at my transactions, the safest thing for the system to do is to read my transactions from the master to ensure that what I just did is immediately visible to me. Again, the nature of your application is the driving factor in determining the implications of this.
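To make that concrete, here's a minimal sketch of one common way to implement the split with read-your-writes stickiness: reads go to a replica by default, transaction-related reads can force the master, and a session's reads are pinned to the master for a short window after it writes. The connection setup and the five-second pin window are assumptions for illustration, not a prescription:

```python
# Sketch: read/write routing with a time-based pin after writes.
# Assumes replication lag stays under PIN_SECONDS; tune for your setup.
import time
import mysql.connector

PIN_SECONDS = 5.0  # assumed upper bound on replication lag

class RoutedSession:
    def __init__(self, master_cfg, replica_cfg):
        self.master = mysql.connector.connect(**master_cfg)
        self.replica = mysql.connector.connect(**replica_cfg)
        self._last_write = 0.0

    def execute_write(self, sql, params=()):
        cur = self.master.cursor()
        cur.execute(sql, params)
        self.master.commit()
        self._last_write = time.monotonic()  # start the pin window
        return cur

    def execute_read(self, sql, params=(), force_master=False):
        # Transaction-related reads pass force_master=True; recent writers
        # are pinned to the master so they see their own changes.
        pinned = (time.monotonic() - self._last_write) < PIN_SECONDS
        conn = self.master if (force_master or pinned) else self.replica
        cur = conn.cursor()
        cur.execute(sql, params)
        return cur.fetchall()
```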

Neither of these points should be taken as advice "against" read/write splitting. You almost certainly should do some of it... the questions are "how much can you safely do?" followed by "how much will that help?"

When you are integrating the third-party data, keep in mind that, ideally, database transactions should be as quick and as small as possible. From what you've described about using transactions to ensure consistency in the event of a lost connection to a third-party data source, it sounds like you might have room for improvement here. The best bet, probably, is to retrieve the data from the external source, save it locally, verify its integrity, and then proceed with the integration.
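A sketch of that "fetch first, integrate second" approach might look like the following; the feed URL, staging table, and validation check are all hypothetical:

```python
# Sketch: land external data in a staging table outside any transaction,
# verify it, then fold it into live tables in one short transaction.
import json
import urllib.request
import mysql.connector

def fetch_and_stage(conn):
    # Slow, failure-prone network I/O happens with no transaction open.
    with urllib.request.urlopen("https://vendor.example.com/feed.json") as r:
        rows = json.load(r)

    cur = conn.cursor()
    cur.execute("TRUNCATE TABLE inventory_staging")
    cur.executemany(
        "INSERT INTO inventory_staging (sku, qty) VALUES (%s, %s)",
        [(row["sku"], row["qty"]) for row in rows],
    )
    conn.commit()

    # Verify integrity before touching live data (a trivial check here).
    cur.execute("SELECT COUNT(*) FROM inventory_staging WHERE qty < 0")
    if cur.fetchone()[0]:
        raise ValueError("staged feed failed validation")

def integrate(conn):
    # The actual transaction is now short: one set-based statement.
    cur = conn.cursor()
    try:
        conn.start_transaction()
        cur.execute(
            "UPDATE inventory i JOIN inventory_staging s ON s.sku = i.sku "
            "SET i.qty = s.qty"
        )
        conn.commit()
    except Exception:
        conn.rollback()
        raise
```

This keeps the lock-holding window on the live tables down to a single statement, no matter how slow or flaky the third-party link is.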

If a large portion of your load is machine-generated, you probably also have options for making those processes less aggressive in their actions, more deterministic in their scheduling, and more cooperative with each other. Some of those processes could possibly be staged on another server, the calculations done there, and the final result sent to your master server... but like everything else, it depends on the nature of the processes.
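As one example of that staging idea, a rollup job could run its expensive aggregation against a replica and write only the compact result back to the master. The hostnames, tables, and query here are illustrative assumptions:

```python
# Sketch: heavy read-side work on a replica, compact writes to the master.
import mysql.connector

def rollup_daily_sales(replica_cfg, master_cfg, day):
    replica = mysql.connector.connect(**replica_cfg)
    cur = replica.cursor()
    # The expensive aggregation runs on the replica, off the master.
    cur.execute(
        "SELECT sku, SUM(qty) FROM sales WHERE sold_on = %s GROUP BY sku",
        (day,),
    )
    totals = cur.fetchall()
    replica.close()

    master = mysql.connector.connect(**master_cfg)
    mcur = master.cursor()
    # Only the small final result set is written back to the master.
    mcur.executemany(
        "REPLACE INTO daily_sales_rollup (day, sku, total) "
        "VALUES (%s, %s, %s)",
        [(day, sku, total) for sku, total in totals],
    )
    master.commit()
    master.close()
```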