We recently upgraded an Amazon RDS instance from PostgreSQL 9.3 to 9.4. There were no problems during the upgrade, and afterwards we ran VACUUM ANALYZE
. Performance is now very slow across the board. Here's an example of a query that took a fraction of a second in 9.3:
EXPLAIN ANALYZE SELECT room_name, id FROM common_activityinstance WHERE started_by_id = 1370408 AND room_name IN ('robcv28', 'foobartest', 'noroster', 'jscode', 'cv28', 'johansencouple', 'lv426', 'johansenfamily', 'johansen') AND end_time IS NULL;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on common_activityinstance (cost=55.63..59.65 rows=1 width=13) (actual time=1.082..1.082 rows=0 loops=1)
Recheck Cond: ((started_by_id = 1370408) AND (room_name = ANY ('{robcv28,foobartest,noroster,jscode,cv28,johansencouple,lv426,johansenfamily,johansen}'::text[])))
Filter: (end_time IS NULL)
Rows Removed by Filter: 925
-> BitmapAnd (cost=55.63..55.63 rows=1 width=0) (actual time=0.349..0.349 rows=0 loops=1)
-> Bitmap Index Scan on common_activityinstance_started_by_id (cost=0.00..5.53 rows=146 width=0) (actual time=0.122..0.122 rows=927 loops=1)
Index Cond: (started_by_id = 1370408)
-> Bitmap Index Scan on common_activityinstance_room_name_c1f9997a_like (cost=0.00..49.85 rows=1171 width=0) (actual time=0.171..0.171 rows=926 loops=1)
Index Cond: (room_name = ANY ('{robcv28,foobartest,noroster,jscode,cv28,johansencouple,lv426,johansenfamily,johansen}'::text[]))
Total runtime: 1.116 ms
(10 rows)
This same query is now much slower in 9.4:
EXPLAIN ANALYZE SELECT room_name, id FROM common_activityinstance WHERE started_by_id = 1370408 AND room_name IN ('robcv28', 'foobartest', 'noroster', 'jscode', 'cv28', 'johansencouple', 'lv426', 'johansenfamily', 'johansen') AND end_time IS NULL;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on common_activityinstance (cost=10157.60..14038.97 rows=45 width=36) (actual time=3923.011..3923.011 rows=0 loops=1)
Recheck Cond: ((started_by_id = 1370408) AND (end_time IS NULL))
Rows Removed by Index Recheck: 698
Filter: (room_name = ANY ('{robcv28,foobartest,noroster,jscode,cv28,johansencouple,lv426,johansenfamily,johansen}'::text[]))
Heap Blocks: exact=634
-> BitmapAnd (cost=10157.60..10157.60 rows=991 width=0) (actual time=3917.278..3917.278 rows=0 loops=1)
-> Bitmap Index Scan on common_activityinstance_started_by_id (cost=0.00..3282.60 rows=198155 width=0) (actual time=0.245..0.245 rows=927 loops=1)
Index Cond: (started_by_id = 1370408)
-> Bitmap Index Scan on end_time_id (cost=0.00..6874.73 rows=198155 width=0) (actual time=3915.565..3915.565 rows=1674860 loops=1)
Index Cond: (end_time IS NULL)
Planning time: 2.457 ms
Execution time: 3923.065 ms
(12 rows)
I don't understand why the query plans are different. And this is just one example—another query was running for 15 minutes before I just gave up and stopped it.
I thought maybe there was a problem with the indexes, so I also did REINDEX DATABASE
but it didn't help.
Any ideas?
Best Answer
Since the entire database seemed to be affected, I decided to try a
VACUUM (FULL, ANALYZE)
. It took 50 hours and 4 minutes to complete, but it was totally worth it. Our database disk usage went from 1259 GB down to 747 GB (reclaiming a whopping 512 GB of space), and performance across the board has returned back to the levels we were seeing before the upgrade.