MySQL – the optimal way to upgrade a production RDS instance

amazon-rds mysql

I have a small MySQL RDS instance as part of my production system, and I want to upgrade it to a medium instance with Provisioned IOPS.

As an old-school DBA I'm aware of the "add slave; promote to master; switch clients" method, but AWS promises a magic one-click upgrade path, i.e. "upgrade instance" and "add Provisioned IOPS".

I tried this on a test RDS instance, and the downtime is too long, IMHO: about 5 minutes for the small-to-medium upgrade, and 30 minutes (!!!) for switching to Provisioned IOPS.

  • Is this normal behavior?
  • Is there any way to run the upgrade on a production RDS instance without downtime?
  • Would you recommend the "stop; create a snapshot; restore from snapshot to a bigger instance" approach?

Best Answer

Upgrading the instance class in RDS means RDS physically migrates the database to a new instance, likely on a different physical host, so some downtime is unavoidable. Migrating to Provisioned IOPS likely means your data is copied to a new EBS volume, so downtime is again inevitable. That change might also move the server to a new instance, depending on whether machines that can access Provisioned IOPS volumes are physically segregated internally from machines that can't (for example, so that they can sit on a different class of network hardware).

There appears to be a way to avoid most of this disruption: a Multi-AZ deployment, which creates an invisible and inaccessible (to you) standby replica in another availability zone within the region. According to the AWS documentation:

"In the case of system upgrades like OS patching or DB Instance scaling, these operations are applied first on the standby, prior to the automatic failover. As a result, your availability impact is limited only to the time required for automatic failover to complete."

http://aws.amazon.com/rds/multi-az/

That should provide a quick and nearly seamless migration path, though I have not had occasion to test this capability. "Modify" in the console appears to allow you to convert an existing instance to Multi-AZ. Presumably this would cause a brief I/O freeze while the instance is cloned, so I would of course recommend testing all of this on a non-production instance before relying on it.
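
For anyone who prefers scripting this over clicking "Modify", here is a minimal sketch of the Multi-AZ route using boto3. It is an assumption-laden illustration, not a tested procedure: the instance identifier, instance class, and IOPS value in the usage note are hypothetical, and the polling helper assumes that "available" status plus an empty PendingModifiedValues means the change has finished. Provisioned IOPS may also require a minimum allocated storage size, which this sketch does not handle.

```python
import time


def multi_az_upgrade_requests(instance_id, new_class, iops):
    """Build the two modify_db_instance parameter sets, in the order to send them."""
    enable_multi_az = {
        "DBInstanceIdentifier": instance_id,
        "MultiAZ": True,            # create the standby first
        "ApplyImmediately": True,   # do not wait for the maintenance window
    }
    scale_up = {
        "DBInstanceIdentifier": instance_id,
        "DBInstanceClass": new_class,
        "Iops": iops,
        "ApplyImmediately": True,
    }
    return [enable_multi_az, scale_up]


def wait_until_settled(rds, instance_id, poll_seconds=30):
    """Poll until the instance is 'available' with no pending modifications."""
    while True:
        response = rds.describe_db_instances(DBInstanceIdentifier=instance_id)
        db = response["DBInstances"][0]
        if db["DBInstanceStatus"] == "available" and not db.get("PendingModifiedValues"):
            return db
        time.sleep(poll_seconds)


def apply_requests(rds, requests, poll_seconds=30):
    """Send each modification and wait for it to finish before sending the next."""
    for params in requests:
        rds.modify_db_instance(**params)
        wait_until_settled(rds, params["DBInstanceIdentifier"], poll_seconds)
```

With a real client this would be driven by something like apply_requests(boto3.client("rds"), multi_az_upgrade_requests("prod-db", "db.m1.medium", 1000)). Keeping the request builder separate from the API calls makes it possible to review exactly what will be sent before touching production.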

Alternatively, RDS supports an internal mechanism that should allow you to emulate the "add slave; promote to master; switch clients" operation, which should also let you achieve a near-zero-downtime conversion:

  • Create an actual RDS read replica of your database with the desired instance class
  • Wait for the replica to come online and catch up with the master
  • Modify the replica's configuration to add Provisioned IOPS
  • Wait again for the replica to come back online and catch up with the master
  • Verify that both systems have identical data using third-party tools
  • Disconnect your application from the old master
  • Verify matching binlog coordinates on master and replica to assure that all application writes have replicated
  • Split the systems with "Promote Read Replica" on the new replica in RDS
  • Connect your application to the new master
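
The binlog coordinate check in the list above can be sketched as a small comparison. This assumes you have already fetched one row of SHOW MASTER STATUS from the old master and one row of SHOW SLAVE STATUS from the replica as dictionaries keyed by column name (most MySQL drivers offer a dict cursor for this); the function name and the sample values are hypothetical.

```python
def replica_caught_up(master_status, replica_status):
    """Return True when the replica has executed everything the master has written.

    master_status:  one row of SHOW MASTER STATUS as a dict
    replica_status: one row of SHOW SLAVE STATUS as a dict
    """
    same_file = master_status["File"] == replica_status["Relay_Master_Log_File"]
    same_position = (
        int(master_status["Position"]) == int(replica_status["Exec_Master_Log_Pos"])
    )
    # Both replication threads should still be running at the moment of the check.
    threads_running = (
        replica_status.get("Slave_IO_Running") == "Yes"
        and replica_status.get("Slave_SQL_Running") == "Yes"
    )
    return same_file and same_position and threads_running


# Hypothetical sample rows, for illustration only.
master = {"File": "mysql-bin.000042", "Position": 1337}
replica = {
    "Relay_Master_Log_File": "mysql-bin.000042",
    "Exec_Master_Log_Pos": 1337,
    "Slave_IO_Running": "Yes",
    "Slave_SQL_Running": "Yes",
}
print(replica_caught_up(master, replica))
```

The comparison is only meaningful once the application has been disconnected from the old master; otherwise the master's position keeps moving while you check.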

http://aws.amazon.com/about-aws/whats-new/2012/10/11/amazon-rds-mysql-rr-promotion/
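
The RDS side of the replica procedure could be scripted roughly as follows with boto3. Treat this as a sketch under stated assumptions rather than a tested runbook: the identifiers and sizes in the usage note are hypothetical, and confirm_cutover is a placeholder callback of my own that stands in for the manual steps (data verification, disconnecting the application, checking binlog coordinates) which must happen before promotion.

```python
def replica_upgrade_plan(source_id, replica_id, new_class, iops):
    """Return the ordered RDS API calls for the replica-based upgrade."""
    return [
        ("create_db_instance_read_replica", {
            "DBInstanceIdentifier": replica_id,
            "SourceDBInstanceIdentifier": source_id,
            "DBInstanceClass": new_class,
        }),
        ("modify_db_instance", {
            "DBInstanceIdentifier": replica_id,
            "Iops": iops,
            "ApplyImmediately": True,
        }),
        ("promote_read_replica", {
            "DBInstanceIdentifier": replica_id,
        }),
    ]


def run_plan(rds, plan, confirm_cutover):
    """Execute the plan; promotion only happens if confirm_cutover() returns True."""
    waiter = rds.get_waiter("db_instance_available")
    for operation, params in plan:
        if operation == "promote_read_replica" and not confirm_cutover():
            return False  # stop before the irreversible split
        getattr(rds, operation)(**params)
        # Caveat: right after a modify call the status may briefly still read
        # 'available', so a real script should re-check before continuing.
        waiter.wait(DBInstanceIdentifier=params["DBInstanceIdentifier"])
    return True
```

A call might look like run_plan(boto3.client("rds"), replica_upgrade_plan("prod-db", "prod-db-new", "db.m1.medium", 1000), confirm_cutover=my_checks), where my_checks is whatever verification you trust. Promotion is gated on the callback because it permanently breaks replication, so it is the one step you do not want a script to run unattended.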