PostgreSQL Performance – Simple Query Takes Too Much Time to Execute

postgresql, postgresql-performance

A very simple Postgres query, a plain SELECT, takes five minutes to execute. It used to run in under a second, but at some point the execution time increased dramatically to five minutes. A SELECT against another table in the same database is fast; only this one table is affected, and it too was fast until something changed. The following query takes five minutes to execute:

SELECT * FROM some_table LIMIT 50 OFFSET 0;

Another odd thing: if I select by id, it works instantly, in well under a second:

SELECT * FROM some_table WHERE id = '33ae0b5f0d6a0435e36faf8d';

UPDATE 1

The table definition has 34 columns.

The id primary key column is indexed; it is a GUID, not an integer.

The table contains only 8,455 rows:

server_prod=# SELECT COUNT(*) FROM some_table;
 count 
-------
  8455
(1 row)

UPDATE 2

server_prod=# select current_setting('shared_buffers') AS shared_buffers, pg_size_pretty(pg_table_size('some_table')) AS table_size;
 shared_buffers | table_size 
----------------+------------
 128MB          | 62 GB
(1 row)

server_prod=# explain (analyze, buffers, timing) SELECT * FROM some_table LIMIT 10 OFFSET 0;
                                                             QUERY PLAN                                                             
------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..9677.70 rows=10 width=1379) (actual time=363665.384..363668.365 rows=10 loops=1)
   Buffers: shared hit=5701604 read=1443335
   ->  Seq Scan on some_table  (cost=0.00..8109910.80 rows=8380 width=1379) (actual time=363665.381..363668.358 rows=10 loops=1)
         Buffers: shared hit=5701604 read=1443335
 Planning time: 0.099 ms
 Execution time: 363668.414 ms
(6 rows)

UPDATE 3

PostgreSQL server version is 10.7:

server_prod=# SELECT version();
                                                                   version                                                                   
---------------------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 10.7 (Ubuntu 10.7-1.pgdg16.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609, 64-bit
(1 row)

Please advise.

Best Answer

When you use the LIMIT clause, no filtering happens before the rows are produced: LIMIT isn't a predicate filter, it just takes the top N rows from the result set feeding it, which is why it appears as the last operation (top of the plan) in your EXPLAIN. While 8,455 rows is tiny, the table occupies 62 GB on disk, and your buffer counts (5,701,604 hit + 1,443,335 read) show the sequential scan had to wade through nearly all of that before it could hand even 10 rows back to the client.
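
To put a number on it, each buffer in the EXPLAIN output is one block, 8 kB by default, so a quick back-of-the-envelope check (a sketch using the counts from your plan above) is:

SELECT pg_size_pretty((5701604 + 1443335)::bigint * 8192);
-- about 55 GB touched to return 10 rows, i.e. nearly the whole 62 GB table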

When you use the WHERE clause, the filtering is applied up front, and it can be done efficiently because your id field is indexed. Since the index is a B-tree, only a handful of pages need to be visited to locate the matching row, so there is far less data (in file size) to load and process.
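
You can confirm this on your side by running the indexed lookup with the same instrumentation you used above (nothing is assumed here beyond the query and table you already posted):

server_prod=# explain (analyze, buffers, timing) SELECT * FROM some_table WHERE id = '33ae0b5f0d6a0435e36faf8d';

You should see an Index Scan using the primary-key index touching only a few buffers, in stark contrast to the ~7 million buffers of the sequential scan.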

If you want a more efficient way to get only 50 records back, you'll need to find a way to use predicate filtering in a WHERE clause to reduce the data first, and then apply your LIMIT, e.g. if you know which range of ids to filter on (though since you're using GUIDs, that is probably hard to do). If there are other fields you can use to filter the data down with predicates, you should create indexes on those fields as well; see the sketch below. Which fields make sense will depend on your table definition.
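
A minimal sketch of that approach, assuming a hypothetical created_at timestamp column (substitute whatever column you can actually filter on):

-- hypothetical column name, for illustration only
CREATE INDEX idx_some_table_created_at ON some_table (created_at);

-- the index lets the planner locate candidate rows first,
-- so LIMIT only trims an already-small result set
SELECT *
FROM some_table
WHERE created_at >= now() - interval '7 days'
ORDER BY created_at
LIMIT 50;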