I am trying to improve the performance of a postgres(9.6) query. Here is my schema and table contains about 60 million rows.
Column | Type | Modifiers
--------------------------+-----------------------------+-----------
transaction_id | text | not null
network_merchant_name | text |
network_merchant_id | text |
network_merchant_mcc | integer |
network_merchant_country | text |
issuer_country | text |
merchant_id | text |
remapped_merchant_id | text |
created_at | timestamp without time zone |
updated_at | timestamp without time zone |
remapped_at | timestamp without time zone |
Indexes:
"mapped_transactions_pkey" PRIMARY KEY, btree (transaction_id)
"ix_mapped_transactions_remapped_at" btree (remapped_at NULLS FIRST)
Here is the query I am trying to execute.
SELECT *
FROM mapped_transactions
ORDER BY remapped_at ASC NULLS FIRST
LIMIT 10000;
Here is the query plan:
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.57..1511.67 rows=10000 width=146) (actual time=327049.374..327345.341 rows=10000 loops=1)
Buffers: shared hit=574937 read=210425 dirtied=356 written=4457
I/O Timings: read=146625.381 write=59.637
-> Index Scan using ix_mapped_transactions_remapped_at on mapped_transactions (cost=0.57..16190862.91 rows=107145960 width=146) (actual time=327049.364..327339.402 rows=10000 loops=1)
Buffers: shared hit=574937 read=210425 dirtied=356 written=4457
I/O Timings: read=146625.381 write=59.637
Planning time: 0.125 ms
Execution time: 327348.322 ms
(8 rows)
I don't understand why it takes so much time when there is an index on the remapped_at
column.
On the other hand if I order in reverse it is fast.
SELECT *
FROM mapped_transactions
ORDER BY remapped_at DESC NULLS LAST
LIMIT 10000;
and the plan is:
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.57..1511.67 rows=10000 width=146) (actual time=0.020..9.268 rows=10000 loops=1)
Buffers: shared hit=1802
-> Index Scan Backward using ix_mapped_transactions_remapped_at on mapped_transactions (cost=0.57..16190848.04 rows=107145866 width=146) (actual time=0.018..4.759 rows=10000 loops=1)
Buffers: shared hit=1802
Planning time: 0.080 ms
Execution time: 11.561 ms
(6 rows)
Can someone please help me how I can improve the performance of the first query?
Updates
-
I solved this by rebuilding the table and reindexing the data.
VACUUM FULL ANALYZE
was not an option because the table was being used and I did not want to lock it. -
The performance of the index is deteriorating very fast. I rebuilt the index 7 hours ago and the performance was good. Now the queries answer in about 10s. Please note that that this table is write heavy. How can I make the index fast? Do I have to re-index the table frequently? There are no deletions in this table but a lot of updates.
Best Answer
I can't confirm with PostgreSQL 9.5. So this may be a regression. Sample data,
Here we do
NULLS FIRST
and it works,Trying
DESC NULLS LAST
You may have to work up a test case. It's not evident what you're experiencing.