SQL Server RDS – How Many Read IOPS Are Too Many?

amazon-rds, sql-server

Amazon's best practices indicate you should have enough memory on your RDS instance to hold most of your dataset in memory, and that not having enough will result in more Read IOPS, as data has to be read from disk into memory.

Our instance (SQL Server, m3.medium) generally shows about 150 MB of "freeable memory," which doesn't seem like much of a margin. Our Read IOPS average 6-8, with momentary spikes to as much as 120 (or, in one isolated case, 224).
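
(For context, those figures are the FreeableMemory and ReadIOPS CloudWatch metrics for the instance. A minimal sketch of pulling them programmatically, assuming boto3, an assumed region, and a placeholder instance identifier:)

    import datetime
    import boto3

    cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")  # assumed region

    def rds_metric(metric_name, instance_id, hours=24):
        """Hourly Average/Maximum datapoints for one RDS CloudWatch metric."""
        now = datetime.datetime.utcnow()
        resp = cloudwatch.get_metric_statistics(
            Namespace="AWS/RDS",
            MetricName=metric_name,
            Dimensions=[{"Name": "DBInstanceIdentifier", "Value": instance_id}],
            StartTime=now - datetime.timedelta(hours=hours),
            EndTime=now,
            Period=3600,
            Statistics=["Average", "Maximum"],
        )
        return sorted(resp["Datapoints"], key=lambda d: d["Timestamp"])

    # "our-sqlserver-instance" is a placeholder identifier; "FreeableMemory" works the same way.
    for point in rds_metric("ReadIOPS", "our-sqlserver-instance"):
        print(point["Timestamp"], round(point["Average"], 1), round(point["Maximum"], 1))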

We are not unhappy with our performance right now (coming from a far-inferior hosting environment), but as we try to decide whether to purchase a reserved instance, we'd like to know we're "ok" on resources for the term of the lease.

Thanks for any insight!

Best Answer

we'd like to know we're "ok" on resources for the term of the lease.

This is a "feeling" now you'll have to put this feeling into tangible, can be tracked and trended. This is how you can make predictions about your workload.

What would this look like? There are three areas you'll need to document:

What are the limits of my hosted solution?

There may be more than these; the list is just to get you started thinking. (A sketch of recording them as a capacity baseline follows the list.)

  1. How much CPU (total) does the hosted environment have? (1 CPU = 100%)
  2. How much RAM (total)?
  3. How many disks, how much total space, IOPS per disk, and total IOPS?
  4. What is my network bandwidth (per interface and in total)?
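
What recording those limits could look like, as a minimal sketch: a capacity baseline captured alongside the instance description. This assumes boto3, an assumed region, a placeholder instance identifier, and illustrative (not measured) hardware figures.

    import boto3

    rds = boto3.client("rds", region_name="us-east-1")  # assumed region

    # Placeholder identifier; the vCPU/RAM figures below are illustrative, not authoritative.
    instance = rds.describe_db_instances(
        DBInstanceIdentifier="our-sqlserver-instance"
    )["DBInstances"][0]

    capacity_baseline = {
        "instance_class": instance["DBInstanceClass"],      # e.g. "db.m3.medium"
        "engine": instance["Engine"],
        "allocated_storage_gb": instance["AllocatedStorage"],
        "vcpus": 1,                                          # documented per instance class
        "ram_gib": 3.75,                                     # documented per instance class
        "provisioned_iops": instance.get("Iops"),            # absent for standard storage
    }
    print(capacity_baseline)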

What should I monitor to trend my workload appropriately?

Whatever the limits of the hosted environment are, you'll definitely want to be gathering those metrics and statistics. There may also be additional, application-defined items that should be included.

For example, if your application is a website that takes orders from users and then fulfills them, there may be some metrics you could add at the application layer (a sketch of publishing them follows the list):

  1. Orders/Hour
  2. Milliseconds to accept/place an order
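
One way to capture those two, sketched with CloudWatch custom metrics; the namespace and metric names are hypothetical:

    import boto3

    cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")  # assumed region

    def record_order_placed(elapsed_ms):
        """Publish both application-level metrics each time an order is accepted."""
        cloudwatch.put_metric_data(
            Namespace="OrderApp",  # hypothetical namespace
            MetricData=[
                # Summing this count over an hour gives Orders/Hour.
                {"MetricName": "OrdersPlaced", "Value": 1, "Unit": "Count"},
                # Milliseconds taken to accept/place the order.
                {"MetricName": "OrderPlacementTime", "Value": elapsed_ms, "Unit": "Milliseconds"},
            ],
        )

    record_order_placed(48.0)  # example call after a successful order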

Then the environment metrics, based on the first list above (not exhaustive, just an example; a sketch of checking the per-file IO numbers inside SQL Server follows this list):

  1. CPU usage
  2. Page file / memory
  3. Per-disk IOPS, average seconds per IO (transfer), etc.
  4. Network bandwidth
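
For item 3, SQL Server itself exposes cumulative per-file IO counts and stall times through sys.dm_io_virtual_file_stats. A sketch of reading them from the RDS endpoint, assuming pyodbc, the Microsoft ODBC driver, and placeholder connection details:

    import pyodbc  # assumes the Microsoft ODBC driver for SQL Server is installed

    # Placeholder endpoint and credentials.
    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 17 for SQL Server};"
        "SERVER=our-sqlserver-instance.example.us-east-1.rds.amazonaws.com,1433;"
        "DATABASE=master;UID=admin_user;PWD=secret"
    )

    # Cumulative per-file reads/writes and average stall per operation since SQL Server started.
    query = """
    SELECT DB_NAME(vfs.database_id) AS database_name,
           mf.physical_name,
           vfs.num_of_reads,
           vfs.num_of_writes,
           CASE WHEN vfs.num_of_reads = 0 THEN 0
                ELSE vfs.io_stall_read_ms * 1.0 / vfs.num_of_reads END AS avg_read_stall_ms,
           CASE WHEN vfs.num_of_writes = 0 THEN 0
                ELSE vfs.io_stall_write_ms * 1.0 / vfs.num_of_writes END AS avg_write_stall_ms
    FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
    JOIN sys.master_files AS mf
      ON mf.database_id = vfs.database_id AND mf.file_id = vfs.file_id
    ORDER BY vfs.io_stall_read_ms DESC;
    """

    for row in conn.cursor().execute(query):
        print(row.database_name, row.physical_name,
              row.num_of_reads, round(row.avg_read_stall_ms, 2))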

What are acceptable SLAs?

If an order takes 50 ms to place, is that bad? Is 2.5 seconds good? How many orders per hour must be accepted, minimally? Are there stored procedures associated with these orders that should be watched and have certain thresholds put on them?
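
Once an SLA number is agreed on, it can be enforced as a threshold. A sketch of an alarm on the hypothetical order-placement metric from above, assuming boto3 and an example (not a recommended) threshold of 2.5 seconds:

    import boto3

    cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")  # assumed region

    # Alarm if average order-placement time exceeds 2500 ms (example threshold only)
    # for three consecutive 5-minute periods.
    cloudwatch.put_metric_alarm(
        AlarmName="order-placement-too-slow",      # hypothetical alarm name
        Namespace="OrderApp",                      # matches the custom metric sketch above
        MetricName="OrderPlacementTime",
        Statistic="Average",
        Period=300,
        EvaluationPeriods=3,
        Threshold=2500.0,
        ComparisonOperator="GreaterThanThreshold",
        TreatMissingData="notBreaching",
    )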

Putting all of this information together allows you to forecast expected volume against current volume, and to see what you will need in order to meet your SLAs for the business.

How many places do I know that actually do this? You'd be surprised. It has taken a few of them years to put it in place, but many have these statistics readily viewable and available at a moment's notice. Is it easy to do? Yes, it's just time-consuming. If you don't want to put in the time, just buy a bigger/better hosted solution; if you want to build a repeatable diagnostic infrastructure that provides this information, be willing to customize it to your environment.

Summary

I can't speak for Amazon RDS, and in full transparency, I work for Microsoft, but there are other hosted solutions, such as Azure, that already monitor many of these system-level items (CPU, memory, IO, etc.) for you. That still leaves the onus on you to collect the information and create your own application-based metrics.

Each environment is different, and while the system-level attributes are a great start for monitoring, the whole picture is needed in order to make appropriate business decisions.