I'm having some strange issue of running the same query between prod and dev and receiving a bit of a different query plan and I'm not sure why. I'll try to cover everything I can, but if I forgot something of importance please let me know.
Postgresql: 11.6
GCE Prod specs:
- CPUS: 6
- RAM: 64gigs
- Storage: SSD
GCE Dev specs:
- CPUS: 4
- RAM: 24gigs
- Storage: HDD
Prod configuration:
shared_buffers = 16GB
seq_page_cost = 1.0
random_page_cost = 3.0
effective_cache_size = 48GB
work_mem = 512MB
maintenance_work_mem = 2GB
effective_io_concurrency = 200
dev configuration:
shared_buffers = 3840MB
seq_page_cost = 1.0
random_page_cost = 4.0
effective_cache_size = 11520MB
work_mem = 512MB
maintenance_work_mem = 960MB
effective_io_concurrency = 4
Table Definitions
Table "api.ms1_peak"
Column | Type | Collation | Nullable | Default
-----------------------+------------------+-----------+----------+---------------------------------------------------
ms1_peak_id | integer | | not null | nextval('api.ms1_peak_ms1_peak_id_seq'::regclass)
mzml_id | integer | | |
charge | integer | | |
mz | double precision | | |
adduct_m | double precision | | |
rt_corrected | double precision | | |
Indexes:
"ms1_peak_pkey" PRIMARY KEY, btree (ms1_peak_id)
"ms1_peak_ms1_peak_id_idx" btree (ms1_peak_id) INCLUDE (mz, rt_corrected)
"ms1_peak_mz_rt_ms1_peak_id_idx" btree (mz, rt_corrected) INCLUDE (ms1_peak_id, charge)
"ms1_peak_mzml_idx" btree (mzml_id)
Foreign-key constraints:
"ms1_peak_mzml_id_fkey" FOREIGN KEY (mzml_id) REFERENCES api.mzml(mzml_id) ON DELETE CASCADE
Table "api.molecular_structure"
Column | Type | Collation | Nullable | Default
------------------------+------------------+-----------+----------+-------------------------------------------------------------------------
molecular_structure_id | integer | | not null | nextval('api.molecular_structure_molecular_structure_id_seq'::regclass)
monoisotopic_mass | double precision | | |
mol | mol | | not null |
Indexes:
"molecular_structure_pkey" PRIMARY KEY, btree (molecular_structure_id)
"molecular_structure_unique_idx" UNIQUE, btree (md5(mol::text))
"molecular_structure_monoisotopic_mass_idx" btree (monoisotopic_mass, molecular_structure_id)
Referenced by:
TABLE "api.smallmolecule_fragment" CONSTRAINT "smallmolecule_molecular_structure_id_fkey" FOREIGN KEY (molecular_structure_id) REFERENCES api.molecular_structure(molecular_structure_id) ON DELETE CASCADE
TABLE "api.smallmolecule" CONSTRAINT "smallmolecule_molecular_structure_id_fkey" FOREIGN KEY (molecular_structure_id) REFERENCES api.molecular_structure(molecular_structure_id) ON DELETE SET NULL
Table "api.smallmolecule"
Column | Type | Collation | Nullable | Default
------------------------+--------------------------+-----------+----------+-------------------------------------------------------------
smallmolecule_id | integer | | not null | nextval('api.smallmolecule_smallmolecule_id_seq'::regclass)
created_date | timestamp with time zone | | | now()
original_smiles | text | | |
reference | json | | |
names | jsonb | | |
cas | text | | |
names_tsvector | tsvector | | |
molecular_structure_id | integer | | |
Indexes:
"smallmolecule_pkey" PRIMARY KEY, btree (smallmolecule_id)
"smallmolecule_molecular_structure_id_idx" btree (molecular_structure_id, smallmolecule_id)
"smallmolecule_names_idx" gist (names_tsvector)
Foreign-key constraints:
"smallmolecule_molecular_structure_id_fkey" FOREIGN KEY (molecular_structure_id) REFERENCES api.molecular_structure(molecular_structure_id) ON DELETE SET NULL
"smallmolecule_organization_id_fkey" FOREIGN KEY (organization_id) REFERENCES api.organization(organization_id) ON DELETE SET NULL
The query in question
select p.ms1_peak_id,
(
select coalesce(json_agg(sm.smallmolecule_id), '[]') from api.molecular_structure as ms
join api.smallmolecule sm using(molecular_structure_id)
where ms.monoisotopic_mass BETWEEN p.adduct_m - public.mz_tol(p.adduct_m, 5) AND
p.adduct_m + public.mz_tol(p.adduct_m, 5)
) as smallmolecules
from api.ms1_peak as p
where p.mzml_id = 20634
and adduct_m is not null
order by ms1_peak_id asc;
On the dev server, here is the query plan
Sort (cost=94236607.88..94236610.32 rows=973 width=36) (actual time=173.075..173.144 rows=1082 loops=1)
Output: p.ms1_peak_id, ((SubPlan 1))
Sort Key: p.ms1_peak_id
Sort Method: quicksort Memory: 106kB
Buffers: shared hit=58597
-> Bitmap Heap Scan on api.ms1_peak p (cost=39.56..94236559.59 rows=973 width=36) (actual time=17.735..172.456 rows=1082 loops=1)
Output: p.ms1_peak_id, (SubPlan 1)
Recheck Cond: (p.mzml_id = 20634)
Filter: (p.adduct_m IS NOT NULL)
Rows Removed by Filter: 1318
Heap Blocks: exact=399
Buffers: shared hit=58597
-> Bitmap Index Scan on ms1_peak_mzml_idx (cost=0.00..39.32 rows=1967 width=0) (actual time=0.180..0.180 rows=2400 loops=1)
Index Cond: (p.mzml_id = 20634)
Buffers: shared hit=12
SubPlan 1
-> Aggregate (cost=96843.52..96843.53 rows=1 width=32) (actual time=0.157..0.157 rows=1 loops=1082)
Output: COALESCE(json_agg(sm.smallmolecule_id), '[]'::json)
Buffers: shared hit=58186
-> Hash Join (cost=6062.96..96842.90 rows=246 width=4) (actual time=0.136..0.156 rows=1 loops=1082)
Output: sm.smallmolecule_id
Hash Cond: (ms.molecular_structure_id = sm.molecular_structure_id)
Buffers: shared hit=58186
-> Bitmap Heap Scan on api.molecular_structure ms (cost=816.07..91096.68 rows=26500 width=4) (actual time=0.019..0.111 rows=89 loops=1082)
Output: ms.molecular_structure_id, ms.molecular_weight, ms.monoisotopic_mass, ms.molecular_formula, ms.flat_smiles, ms.mol, ms.ecfp6, ms.fcfp6
Recheck Cond: ((ms.monoisotopic_mass >= (p.adduct_m - (('5'::double precision * p.adduct_m) / '1000000'::double precision))) AND (ms.monoisotopic_mass <= (p.adduct_m + (('5'::double precision * p.adduct_m) / '1000000'::double precision))))
Heap Blocks: exact=54237
Buffers: shared hit=57961
-> Bitmap Index Scan on molecular_structure_monoisotopic_mass_idx (cost=0.00..809.45 rows=26500 width=0) (actual time=0.012..0.012 rows=89 loops=1082)
Index Cond: ((ms.monoisotopic_mass >= (p.adduct_m - (('5'::double precision * p.adduct_m) / '1000000'::double precision))) AND (ms.monoisotopic_mass <= (p.adduct_m + (('5'::double precision * p.adduct_m) / '1000000'::double precision))))
Buffers: shared hit=3724
-> Hash (cost=4633.03..4633.03 rows=49108 width=8) (actual time=16.781..16.782 rows=49102 loops=1)
Output: sm.smallmolecule_id, sm.molecular_structure_id
Buckets: 65536 Batches: 1 Memory Usage: 2431kB
Buffers: shared hit=225
-> Index Only Scan using smallmolecule_molecular_structure_id_idx on api.smallmolecule sm (cost=0.41..4633.03 rows=49108 width=8) (actual time=0.030..7.495 rows=49108 loops=1)
Output: sm.smallmolecule_id, sm.molecular_structure_id
Heap Fetches: 0
Buffers: shared hit=225
Planning Time: 0.440 ms
Execution Time: 173.291 ms
On Production
Sort (cost=3158406.45..3158408.86 rows=965 width=36) (actual time=10773.364..10773.433 rows=1082 loops=1)
Output: p.ms1_peak_id, ((SubPlan 1))
Sort Key: p.ms1_peak_id
Sort Method: quicksort Memory: 106kB
Buffers: shared hit=358573
-> Index Scan using ms1_peak_mzml_idx on api.ms1_peak p (cost=0.56..3158358.61 rows=965 width=36) (actual time=14.920..10770.308 rows=1082 loops=1)
Output: p.ms1_peak_id, (SubPlan 1)
Index Cond: (p.mzml_id = 20634)
Filter: (p.adduct_m IS NOT NULL)
Rows Removed by Filter: 1318
Buffers: shared hit=358573
SubPlan 1
-> Aggregate (cost=3266.75..3266.76 rows=1 width=32) (actual time=9.924..9.924 rows=1 loops=1082)
Output: COALESCE(json_agg(sm.smallmolecule_id), '[]'::json)
Buffers: shared hit=357961
-> Hash Join (cost=1164.60..3266.13 rows=246 width=4) (actual time=8.142..9.913 rows=1 loops=1082)
Output: sm.smallmolecule_id
Inner Unique: true
Hash Cond: (sm.molecular_structure_id = ms.molecular_structure_id)
Buffers: shared hit=357961
-> Index Only Scan using smallmolecule_molecular_structure_id_idx on api.smallmolecule sm (cost=0.41..1973.03 rows=49108 width=8) (actual time=0.009..5.957 rows=49108 loops=1017)
Output: sm.molecular_structure_id, sm.smallmolecule_id
Heap Fetches: 0
Buffers: shared hit=322390
-> Hash (cost=834.29..834.29 rows=26392 width=4) (actual time=0.087..0.087 rows=89 loops=1082)
Output: ms.molecular_structure_id
Buckets: 32768 Batches: 1 Memory Usage: 262kB
Buffers: shared hit=35571
-> Index Only Scan using molecular_structure_monoisotopic_mass_idx on api.molecular_structure ms (cost=0.45..834.29 rows=26392 width=4) (actual time=0.032..0.071 rows=89 loops=1082)
Output: ms.molecular_structure_id
Index Cond: ((ms.monoisotopic_mass >= (p.adduct_m - (('5'::double precision * p.adduct_m) / '1000000'::double precision))) AND (ms.monoisotopic_mass <= (p.adduct_m + (('5'::double precision * p.adduct_m) / '1000000'::double precision))))
Heap Fetches: 0
Buffers: shared hit=35571
Planning Time: 2.329 ms
Execution Time: 10773.584 ms
The interesting thing is on dev I get a different query plan. It generally is using bitmap scans, while on production it is using index/index only scans.
Now, I can get the dev server to output a closer query plan to prod if I disable bitmapscans forcing more of an index only route
Here is the dev output after setting set enable_bitmapscan = off;
Sort (cost=107687464.30..107687466.74 rows=973 width=36) (actual time=166.059..166.129 rows=1082 loops=1)
Output: p.ms1_peak_id, ((SubPlan 1))
Sort Key: p.ms1_peak_id
Sort Method: quicksort Memory: 106kB
Buffers: shared hit=58988
-> Index Scan using ms1_peak_mzml_idx on api.ms1_peak p (cost=0.56..107687416.01 rows=973 width=36) (actual time=17.591..165.425 rows=1082 loops=1)
Output: p.ms1_peak_id, (SubPlan 1)
Index Cond: (p.mzml_id = 20634)
Filter: (p.adduct_m IS NOT NULL)
Rows Removed by Filter: 1318
Buffers: shared hit=58988
SubPlan 1
-> Aggregate (cost=110667.53..110667.54 rows=1 width=32) (actual time=0.150..0.150 rows=1 loops=1082)
Output: COALESCE(json_agg(sm.smallmolecule_id), '[]'::json)
Buffers: shared hit=58376
-> Hash Join (cost=5247.33..110666.91 rows=246 width=4) (actual time=0.123..0.149 rows=1 loops=1082)
Output: sm.smallmolecule_id
Hash Cond: (ms.molecular_structure_id = sm.molecular_structure_id)
Buffers: shared hit=58376
-> Index Only Scan using molecular_structure_monoisotopic_mass_idx on api.molecular_structure ms (cost=0.45..104920.69 rows=26500 width=4) (actual time=0.010..0.106 rows=89 loops=1082)
Output: ms.monoisotopic_mass, ms.molecular_structure_id
Index Cond: ((ms.monoisotopic_mass >= (p.adduct_m - (('5'::double precision * p.adduct_m) / '1000000'::double precision))) AND (ms.monoisotopic_mass <= (p.adduct_m + (('5'::double precision * p.adduct_m) / '1000000'::double precision))))
Heap Fetches: 96251
Buffers: shared hit=58151
-> Hash (cost=4633.03..4633.03 rows=49108 width=8) (actual time=16.941..16.941 rows=49102 loops=1)
Output: sm.smallmolecule_id, sm.molecular_structure_id
Buckets: 65536 Batches: 1 Memory Usage: 2431kB
Buffers: shared hit=225
-> Index Only Scan using smallmolecule_molecular_structure_id_idx on api.smallmolecule sm (cost=0.41..4633.03 rows=49108 width=8) (actual time=0.027..7.519 rows=49108 loops=1)
Output: sm.smallmolecule_id, sm.molecular_structure_id
Heap Fetches: 0
Buffers: shared hit=225
This query is a bit closer to production, and still faster (both cases on dev are faster than prod) However the query plan between the two is still slightly different. Notice the prod query plan uses an inner unique: true while places the index on 'molecular_structure_monoisotopix_mass_idx` at the bottom and loops 1082 times while on dev, this index usage is higher up in the explanation while not using an inner unique: true
Putting these query plans through explain.depesz breaks it down and shows that the production query planner takes 6s on the index only scan on api.smallmolecule, and 4 seconds on the hash join.
While on dev, the index only scan on api.smallmolecule is 8ms and the hash join is 28ms.
The data on molecular_structure and smallmolecule on dev and production is the same. Both are vacuumed and analyzed. So the dataset for these tables is the same.
about 5.2million rows on molecular_structure and 49000 on smallmolecule.
Playing around with configuration tweaks, I can manage to get the dev server to perform and output the exact same plan, when I set random_page_cost = 2.0
. This produces the exact same plan. So I'm starting to think this is definitely a tuning issue between the two. Tweaking with random_page_cost on prod seems to have no affect for this particular query.
These servers are on Google Cloud Engine, prod uses a persistent SSD so i'm aware tuning certain values may not be exact with dedicated SSD setups while dev uses an HDD.
Any ideas on how to tune or debug this on production to get better results would be great. Anything I'm missing let me know.
Edit:
Vacuum Analyze has been ran on production, this is new data so there are no new inserts/updates/deletes so I dont believe index fragmentation is the problem.
On production if I disable index scans set enable_indexscans = 'off'
I get bitmap heap scan plans and 180ms matching the performance of the dev machine.
Sort (cost=70758692.10..70758694.51 rows=966 width=36) (actual time=179.621..179.692 rows=1082 loops=1)
Output: p.ms1_peak_id, ((SubPlan 1))
Sort Key: p.ms1_peak_id
Sort Method: quicksort Memory: 106kB
Buffers: shared hit=60536
-> Bitmap Heap Scan on api.ms1_peak p (cost=42.52..70758644.20 rows=966 width=36) (actual time=29.350..179.233 rows=1082 loops=1)
Output: p.ms1_peak_id, (SubPlan 1)
Recheck Cond: (p.mzml_id = 20634)
Filter: (p.adduct_m IS NOT NULL)
Rows Removed by Filter: 1318
Heap Blocks: exact=399
Buffers: shared hit=60536
-> Bitmap Index Scan on ms1_peak_mzml_idx (cost=0.00..42.28 rows=1962 width=0) (actual time=0.169..0.169 rows=2400 loops=1)
Index Cond: (p.mzml_id = 20634)
Buffers: shared hit=12
SubPlan 1
-> Aggregate (cost=73243.01..73243.02 rows=1 width=32) (actual time=0.163..0.163 rows=1 loops=1082)
Output: COALESCE(json_agg(sm.smallmolecule_id), '[]'::json)
Buffers: shared hit=60125
-> Hash Join (cost=3970.76..73242.40 rows=246 width=4) (actual time=0.145..0.162 rows=1 loops=1082)
Output: sm.smallmolecule_id
Hash Cond: (ms.molecular_structure_id = sm.molecular_structure_id)
Buffers: shared hit=60125
-> Bitmap Heap Scan on api.molecular_structure ms (cost=576.97..69450.26 rows=26392 width=4) (actual time=0.019..0.106 rows=89 loops=1082)
Output: ms.molecular_structure_id, ms.molecular_weight, ms.monoisotopic_mass, ms.molecular_formula, ms.flat_smiles, ms.mol, ms.ecfp6, ms.fcfp6
Recheck Cond: ((ms.monoisotopic_mass >= (p.adduct_m - (('5'::double precision * p.adduct_m) / '1000000'::double precision))) AND (ms.monoisotopic_mass <= (p.adduct_m + (('5'::double precision * p.adduct_m) / '1000000'::double precision))))
Heap Blocks: exact=54237
Buffers: shared hit=57836
-> Bitmap Index Scan on molecular_structure_monoisotopic_mass_idx (cost=0.00..570.37 rows=26392 width=0) (actual time=0.012..0.012 rows=89 loops=1082)
Index Cond: ((ms.monoisotopic_mass >= (p.adduct_m - (('5'::double precision * p.adduct_m) / '1000000'::double precision))) AND (ms.monoisotopic_mass <= (p.adduct_m + (('5'::double precision * p.adduct_m) / '1000000'::double precision))))
Buffers: shared hit=3599
-> Hash (cost=2780.02..2780.02 rows=49102 width=8) (actual time=28.821..28.822 rows=49102 loops=1)
Output: sm.smallmolecule_id, sm.molecular_structure_id
Buckets: 65536 Batches: 1 Memory Usage: 2431kB
Buffers: shared hit=2289
-> Seq Scan on api.smallmolecule sm (cost=0.00..2780.02 rows=49102 width=8) (actual time=0.007..19.592 rows=49108 loops=1)
Output: sm.smallmolecule_id, sm.molecular_structure_id
Buffers: shared hit=2289
Planning Time: 0.461 ms
Execution Time: 179.824 ms
(40 rows)
Another thing I have noticed is that if I remove the join to the smallmolecule table
within the subquery, this query becomes very fast prod.
Edit 2:
I managed to do a pg_basebackup and sync the data fully between dev and production. The only table that had a few million more inserts was the ms1_peak table, the molecular_structure and smallmolecule table never changed between the two during these results.
The result set from the ms1_peak still returned the same results between outdated dev and prod, (1082 peaks).
But anyways I'm still not convinced a large peak table was the cause because that query is super fast. But now both dev and prod produce the exact same query plan and performance of 10s. Why is joining to the smallmolecule table super slow.
Best Answer
The problem comes from the bad row count estimate for the bitmap index scan on
api.molecular_structure
(26392 instead of the actual 89), and it must be a minor difference in the data that tilts the plan.It is probably difficult to improve the estimate.
How about rewriting the query?
Not sure if that will improve the speed, but it is worth a try.