Postgresql – Why would there be varying response times for the same query

optimizationpostgresqlpostgresql-9.6

Is there a reason why the same query executed a number of times have huge variance in response times? from 50% – 200% what the projected response time is?
They range from 6 seconds to 20 seconds even though it is the only active query in the database.

Context:

  1. Database on Postgres 9.6 on AWS RDS (with provisioned IOPS)
  2. Contains one table comprising five numeric columns, indexed on id, holding 200 million rows

The query:

SELECT col1, col2 
  FROM calculations 
 WHERE id > 0 
   AND id < 100000;

The query's explain plan:

Bitmap Heap Scan on calculation  (cost=2419.37..310549.65 rows=99005 width=43)
  Recheck Cond: ((id > 0) AND (id <= 100000))
  ->  Bitmap Index Scan on calculation_pkey  (cost=0.00..2394.62 rows=99005 width=0)
        Index Cond: ((id > 0) AND (id <= 100000))

Is there any reasons why a simple query like this isn't more predictable in response time?

After doing a database dump to my own EC2 instance PostgreSQL server, and executing the above query continuously, I can see that the variance is quite small — within +/- 10% of each other. So now thinking that we are not correctly configuring (or don't don't understand fully how the Provisioned IOPS works on RDS). Will look into this more.

Best Answer

After investigation of the historical load, we have found out that the provisioned IOPS we originally configured had been exhausted during the last set of load tests performed on the environment.

According to Amazon's documentation @http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_Storage.html, after this point, Amazon does not guarantee consistency in execution times and the SLAs are no longer applicable.

We have confirmed that replicating the database onto a new instance of AWS RDS with same configuration yields consistent response times when executing the query multiple times.