MySQL - reducing memory footprint

amazon-rds, MySQL, sharding

Our InnoDB/MySQL 5.6 database is growing at a fairly steady pace, doubling in size every year, and is currently at 22 GB. This is sales data; we have a number of merchants, each generating millions of rows, all stored in one logical instance. The report queries are served by a few read replicas for load balancing and redundancy, with the queries distributed randomly.

The database is hosted in AWS/RDS. We outgrew the 4 GB instance, then 8 GB, and currently reside on a 16 GB node, which we expect to last until the end of the year. While AWS offers "memory optimized" instances, we feel this would just defer the problem rather than solve the bigger picture. Instead of scaling up, we would like to use a number of smaller instances, each serving merchant-specific data. We can partition the data the "hard way", where instance A only holds merchant A's data, instance B only holds merchant B's data, and so on. The biggest drawback is that in case of instance downtime we cannot simply route traffic to another instance, since no other instance holds that merchant's data.

So I want to ask if anyone can comment on "logical partitioning", which means that all servers still hold all the data (disk space is not a problem), but queries related to one account are "sticky", i.e. routed to the same server (or server pool), to maximize cache hits. That way, if a DB instance, zone, or region goes down, any other instance elsewhere can pick up the work of the downed instance, since the same data is distributed everywhere.

In your opinion, would logical partitioning based on repeated queries to the same set of instances reduce the need for memory, assuming all the queries use indexes where the account ID is the first field in every composite key?

Best Answer

Good explanation. Good idea. I'll add my 2 cents...

"Sharding" would have a few merchants on one server (plus its replicas); a few other merchants on another shard, etc.

You are suggesting that it is practical to have all the data everywhere, but "shard" the accesses to minimize RAM requirements.

First let me make a strong suggestion. Have all PRIMARY KEYs be composite, starting with the merchant_id. This will cluster all the rows for a given merchant together. This is in addition to having most of the indexes start with that same column.

If you currently have

PRIMARY KEY(id),   -- auto_increment
INDEX(merchant_id, ...)   -- a few of these

then change to

PRIMARY KEY(merchant_id, id),  -- merchant_id for clustering; id for uniqueness
INDEX(id)   -- to keep auto_increment happy
INDEX(merchant_id, ...)   -- fewer of these

Reconsider the secondary indexes; you may not need as many, since the rows are already clustered by merchant_id.
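
For example, here is a minimal sketch of that change, assuming a hypothetical table named sales whose id is an auto_increment BIGINT and whose merchant_id is NOT NULL (table and column names are illustrative only):

-- Do it in one ALTER so the auto_increment column is never left unindexed
ALTER TABLE sales
    DROP PRIMARY KEY,
    ADD PRIMARY KEY(merchant_id, id),   -- clusters each merchant's rows together
    ADD INDEX(id);                      -- keeps AUTO_INCREMENT happy

On a table of this size the rebuild will take a while, so test it on a replica first.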

Granted, inserting rows for merchants that are not for this shard will need to touch some blocks; but they will be transient and not cost too much room in innodb_buffer_pool_size.

If you are pseudo-sharding the slaves, then if a slave dies, all the other slaves are ready to take over for any Merchant's requests. The only problem is that queries will be slow due to the lack of cached data. I see the 'instant' failover as more of a benefit than the temporary slowness is a detriment. Anyway, the caching will catch up pretty soon.
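
One way to get that stickiness without changing the application much is a routing proxy in front of the replica pool. A rough sketch, assuming ProxySQL sits between the app and the replicas; the rule ids, hostgroup numbers, and merchant ids below are all made up for illustration:

-- Run against the ProxySQL admin interface (illustrative values only)
INSERT INTO mysql_query_rules (rule_id, active, match_pattern, destination_hostgroup, apply)
VALUES
  (10, 1, 'merchant_id\s*=\s*(101|102|103)', 20, 1),   -- these merchants stick to replica hostgroup 20
  (11, 1, 'merchant_id\s*=\s*(201|202|203)', 30, 1);   -- these merchants stick to replica hostgroup 30
LOAD MYSQL QUERY RULES TO RUNTIME;
SAVE MYSQL QUERY RULES TO DISK;

If one replica pool goes away, dropping (or repointing) its rules sends those merchants' queries to the surviving replicas, which already hold all the data; only the cache is cold.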

How much write traffic is there, versus read traffic?

  • A lot of writes: Then you need to shard the Masters (and Slaves). But would this mean that you are still doing all the writes to all Masters? That would be clumsy.

  • Mostly read traffic: (That's what I was commenting on above.)

Keep in mind that every Slave performs every write that happened on its Master. Corollary: You can't get write scaling without Sharding the Masters.

PK example

Using your specifics (from Comment): You currently have an 8-byte id BIGINT?

PRIMARY KEY(id),   -- auto_increment, 0 extra bytes (clustered with data)
INDEX(account_id, timestamp)   -- account_id INT (4) + timestamp (5) + implicit copy of id (8) bytes

-->

PRIMARY KEY(account_id, timestamp, id),   -- 0 extra bytes (clustered with data)
INDEX(id)   -- id (8) + implicit copy of account_id (4) + timestamp (5) bytes

That is, the disk footprint will be virtually identical. To elaborate:

  • The PK is "clustered" with the data; this makes the data ordered according to the PK. Hence, the PK takes zero extra bytes on top of the data. (OK, there is a small overhead for the non-leaf nodes of the BTree.)
  • Any secondary index has a copy of the PK in it. (This is how it gets to the actual row.)
  • So, in this pair, the secondary index has exactly the same fields, with exactly the same total space, just rearranged.
  • Here's the feature I am after: when looking up rows for one account, perhaps over a timestamp range, the clustered PK will now be used directly, instead of going through the secondary key and then jumping via id. And all the desired rows will be adjacent to each other. The result: such SELECTs will be faster (maybe 2x, but there are a lot of variables). See the sketch after this list.
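
To make that concrete, here is a hypothetical report query (the table name and the non-key columns are made up) that becomes one contiguous range scan over the clustered PK:

-- With PRIMARY KEY(account_id, timestamp, id), all matching rows sit next to
-- each other in the data; no secondary-index lookup, no random jumps via id.
SELECT  item_id, amount, timestamp
    FROM  sales
    WHERE account_id = 12345
      AND timestamp >= '2016-01-01'
      AND timestamp <  '2016-02-01';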

Back to the "old wives' tale" that you should keep the PK "small". Yeah, that is a general rule. But I have just explained an exception.

If you have several more secondary keys on the same table, then those extra secondary keys will be bulkier. But you want to skimp on RAM and don't need to worry about disk. Again, what I suggest fits well.