MySQL compression on Google Cloud SQL

compressiongoogle-cloud-platformgoogle-cloud-sqlMySQL

As the Google Cloud SQL is a managed instance of MySQL under the hood, limited to an absolute max on 2nd generation machines of 100GB of RAM, should I be compressing some of my table and index data ? I am designing a global system so eventually the working set will probably exceed the 100GB.

Previously i've achieved performance improvements by compressing tables and indexes in SQL Server and Oracle systems that were memory bound.

I'm new to GCP, and consequently MySQL. Would the usage of compression still apply in this case? are there any such functions or procedures (like compression_estimate in SQL Server) I can run to assess if a table or index will benefit from the compression?

Best Answer

Do you really need compressed? A multi-terabyte datasets may happily live on 100G-RAM machine. What is your data like? Text? Blobs? Lots of rows? Care to share the CREATE TABLEs?

I don't have any details on Cloud SQL. But...

Compression is usually not the first place to look when thinking of Scaling. It has overhead (usually CPU, sometimes complexity).

  • InnoDB's compression gives you about 2x.
  • Outside compression (zip, etc) give about 3x for text. That is not very significant for "scaling".
  • TokuDB and InfiniDB claim to get 10x, and I have heard supporting evidence for such.

"Scaling" usually involves Sharding or some other "distributed" technique.

RAM is not usually the limiting factor in Scaling; it is disk space. Or it could be any of several metrics - IOPs, network bandwidth, etc.

Scaling...

  • Read scaling is 'easily' handled via Replication.
  • Write scaling is not handled by any normal technique.
  • Galera Cluster gives some write scaling.
  • Sharding give a lot of write scaling, but at noticeable complexity in the application.
  • MySQL's NDB Cluster gives some write scaling, but it is usually limited to niche applications, and not for general use. (And the setup is quite complex.)
  • I am skeptical of any off-the-shelf database "scaling" -- beware of marketing hype. (Web servers are easy to scale; databases are not.)