PostgreSQL performance degradation over time on a write-intensive db

database-tuning, performance, postgresql, postgresql-9.1

I have observed a strange situation: over time, the performance of a query (a combination of queries, explained below) degrades. At the start of testing the query takes about 2 ms for a few minutes; the next day it is up to 15 ms, and the day after that 30 ms.

By query I refer here to a combination of either:

  • insert a row into table 2, select a row from table 2, select a row in table 3, update a row in table 3, commit
  • insert a row into table 1, select a row in table 3, update a row in table 3, commit

I wonder what the reason for this might be, and which settings in the configuration file I should consider changing, and how. I observed the problem on an Ubuntu machine where the database was set up and the primary keys were not added. On the other hand, on the Windows machine I develop on it was not observed (it ran constantly at an average of 3 ms per query for 7 days).

I noticed that in the new database (on Ubuntu) there were no primary keys on any table, as opposed to the one I develop on. Could the lack of primary keys have a negative impact on this sort of query?

I thought I would ask this question in the meantime, as I am moving my whole db from my development machine to the test one.

On development I used PostgreSQL 8.4 (CPU: Intel i7 740QM, RAM: 6GB), on test there is PostgreSQL 9.1 (CPU: Intel i3-2100, RAM: 3.8GB).


UPDATE: autovacuum related parameters:

#autovacuum = on        
#log_autovacuum_min_duration = -1   
#autovacuum_max_workers = 3     
#autovacuum_naptime = 1min      
#autovacuum_vacuum_threshold = 50   
#autovacuum_analyze_threshold = 50  
#autovacuum_vacuum_scale_factor = 0.2   
#autovacuum_analyze_scale_factor = 0.1  
#autovacuum_freeze_max_age = 200000000  
#autovacuum_vacuum_cost_delay = 20ms    
#autovacuum_vacuum_cost_limit = -1  
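All of these lines are commented out, so the compiled-in defaults apply (autovacuum is enabled by default in both 8.4 and 9.1). To see whether autovacuum is actually keeping up with the table, the standard statistics views can be queried — a sketch; only the table name `ais_track` comes from my schema, the rest is stock PostgreSQL:

```sql
-- Confirm autovacuum is enabled at runtime
SHOW autovacuum;

-- When did (auto)vacuum last touch the table, and how many dead tuples remain?
SELECT relname, last_vacuum, last_autovacuum, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'ais_track';
```

A steadily growing `n_dead_tup` with a stale `last_autovacuum` would point at vacuuming not keeping up with the write load.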

UPDATE2:
It appears that the problem occurs on the development machine as well, but I remember it running fine before. Nevertheless, I did some more testing and ran EXPLAIN ANALYZE on the query that takes the most time. It is an update (I have also seen selects take a while on the same table), presented below:

EXPLAIN ANALYZE UPDATE ais_track SET latest_dynamic = '2012-09-10 22:22:22.222' WHERE mmsi = 123456789 AND ais_system = 1;

The results below are from Windows, as the restore is still in progress on Ubuntu. At first I got this:

Index Scan using pk_track on ais_track  (cost=0.00..4.46 rows=1 width=36) (actual time=1.090..2.460 rows=1 loops=1)
  Index Cond: ((mmsi = 123456789) AND (ais_system = 1))
Total runtime: 8.681 ms

Then on the 2nd and further repeats of the same update query I get something of this form:

Index Scan using pk_track on ais_track  (cost=0.00..4.46 rows=1 width=36) (actual time=0.699..1.797 rows=1 loops=1)
  Index Cond: ((mmsi = 123456789) AND (ais_system = 1))
Total runtime: 1.850 ms

After a hundred repeats or so it got to over 2 ms.

EXPLAIN ANALYZE for Select:

EXPLAIN ANALYZE SELECT * FROM ais_track WHERE mmsi = 123456789 AND ais_system = 1;

1st run:

Index Scan using pk_track on ais_track  (cost=0.00..4.46 rows=1 width=38) (actual time=1.283..2.522 rows=1 loops=1)
  Index Cond: ((mmsi = 123456789) AND (ais_system = 1))
Total runtime: 2.560 ms

After a hundred or so runs:

Index Scan using pk_track on ais_track  (cost=0.00..4.46 rows=1 width=38) (actual time=0.027..1.357 rows=1 loops=1)
  Index Cond: ((mmsi = 123456789) AND (ais_system = 1))
Total runtime: 1.382 ms

The table used in query:

CREATE TABLE ais_track
(
  ais_system integer NOT NULL,
  mmsi integer NOT NULL,
  ext_id integer,
  latest_dynamic timestamp without time zone,
  latest_static timestamp without time zone,
  "name" character varying,
  CONSTRAINT pk_track PRIMARY KEY (mmsi, ais_system)
)

And two indexes:

CREATE INDEX ais_track_mmsi
  ON ais_track
  USING btree
  (mmsi);

CREATE INDEX ais_track_sys
  ON ais_track
  USING btree
  (ais_system);

NOTE: The table holds about 11,000 rows, and that number doesn't change.
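Since the row count is fixed but rows are updated constantly, I suspect dead row versions: under PostgreSQL's MVCC, every UPDATE leaves the old row version behind until vacuum reclaims it, so the table and its indexes bloat even though the logical size stays the same. A manual vacuum would test this hypothesis (the pre-9.0 syntax below works on both 8.4 and 9.1):

```sql
-- Reclaim dead row versions and refresh planner statistics;
-- if query times drop back afterwards, table/index bloat is the culprit
VACUUM VERBOSE ANALYZE ais_track;
```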

Best Answer

Primary and unique keys in all(?) RDBMSes are backed by indexes, so that the database can quickly determine whether a newly inserted value is indeed unique.

A side effect of this is that lookups via primary and unique key columns are usually fast.

Now if you haven't defined primary or unique keys on your tables,

  1. You don't have a relational table but you have junk (OK, this is a contentious opinion, but the relational model requires keys on all tables).
  2. Queries on those tables (in the absence of any other indexes) will become slower as more data is inserted, because every lookup has to scan the whole table.

So yes, the absence of primary keys will cause this!
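If the restored tables on the test machine really are missing their keys, they can be added after the fact. A sketch for the table shown in the question (the constraint name is copied from your development schema):

```sql
-- Recreate the composite primary key; this fails if duplicate
-- (mmsi, ais_system) pairs have already crept in -- which would
-- itself be worth knowing about
ALTER TABLE ais_track
  ADD CONSTRAINT pk_track PRIMARY KEY (mmsi, ais_system);
```

Adding the constraint builds the backing unique index in one step, so the fast index-scan plans you see on the development machine become available on the test machine as well.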