I'm using a Postgres database and trying to optimize the following query:
SELECT DISTINCT catalogite1_.id
FROM cat_catalogitem catalogite1_
INNER JOIN cat_service service2_
ON catalogite1_.service_id=service2_.id
LEFT OUTER JOIN cat_entitlement_services entitledse6_
ON service2_.id=entitledse6_.service_id
LEFT OUTER JOIN cat_entitlement entitlemen7_
ON entitledse6_.entitlement_id=entitlemen7_.id
AND (entitlemen7_.id IN ('505e03e5-7370-42c2-a26e-bdb2df593934' ,
'508da3b6-7147-4b16-971f-6e6476b8ef44' ,
'6c68fbd2-7cc4-4b7c-85c1-617b69578ab9' ,
'6c9e5ff0-a073-4923-a5ec-b47f5e4c120a' ,
'961bee54-e9d6-402c-a763-c3937b03402f' ,
'2f113c9a-9e2f-47d8-beda-df0e05faa167' ,
'471bca1e-a112-4842-bdfc-252b8848b862' ,
'482ba515-2197-4fdb-a74b-37d9a0795c4e' ,
'872038e4-766a-4b93-bf95-aa2735e7f942' ,
'fd6345fc-8799-42a0-83d5-0234e450e397' ,
'7378e830-5271-482f-b73b-7ce4232b000d' ,
'6aeafe3b-aac9-4c67-8895-aa77da7a7d6b'))
LEFT OUTER JOIN cat_subtenant catalogsub3_
ON catalogite1_.subtenant_id=catalogsub3_.id
LEFT OUTER JOIN cat_entitlement_catalogitems entitledca4_
ON catalogite1_.id=entitledca4_.catalogitem_id
LEFT OUTER JOIN cat_entitlement entitlemen5_
ON entitledca4_.entitlement_id=entitlemen5_.id
AND (entitlemen5_.id IN ('505e03e5-7370-42c2-a26e-bdb2df593934' ,
'508da3b6-7147-4b16-971f-6e6476b8ef44' ,
'6c68fbd2-7cc4-4b7c-85c1-617b69578ab9' ,
'6c9e5ff0-a073-4923-a5ec-b47f5e4c120a' ,
'961bee54-e9d6-402c-a763-c3937b03402f' ,
'2f113c9a-9e2f-47d8-beda-df0e05faa167' ,
'471bca1e-a112-4842-bdfc-252b8848b862' ,
'482ba515-2197-4fdb-a74b-37d9a0795c4e' ,
'872038e4-766a-4b93-bf95-aa2735e7f942' ,
'fd6345fc-8799-42a0-83d5-0234e450e397' ,
'7378e830-5271-482f-b73b-7ce4232b000d' ,
'6aeafe3b-aac9-4c67-8895-aa77da7a7d6b'))
WHERE catalogite1_.is_requestable=true
AND catalogite1_.status='PUBLISHED'
AND catalogite1_.tenant_id='intel-1'
AND (
service2_.id=NULL
OR COALESCE(NULL) IS NULL)
AND service2_.status='ACTIVE'
AND service2_.tenant_id='intel-1'
AND ((entitlemen7_.id IS NOT NULL)
AND (catalogsub3_.id IS NULL
OR entitlemen7_.subtenant_id=catalogsub3_.id)
AND (entitlemen5_.id IS NULL
OR entitlemen5_.subtenant_id<>entitlemen7_.subtenant_id)
OR (entitlemen5_.id IS NOT NULL)
AND (catalogsub3_.id IS NULL
OR entitlemen5_.subtenant_id=catalogsub3_.id)
AND entitledca4_.is_hidden=false)
As you can see, the query is quite complex, with multiple joins and filtering conditions. My issue is that it takes over 60 seconds to complete! My target is to bring the runtime down to 2-3 seconds.
I've done some background research and ran this query through Postgres' EXPLAIN ANALYZE feature, which gave some great insights into where the bottleneck is. Here is a link to an easy-to-read representation of the EXPLAIN ANALYZE output:
https://explain.depesz.com/s/f3hn
As you can see, the main bottleneck seems to be the Merge Join, where the database is filtering out over 317 million rows! Another bottleneck is the sort that occurs several steps before the merge join. I'm not sure why this sort is happening, because there is no ORDER BY in my query. It appears to be an external disk sort, which could be why it is so costly.
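For readers following along, such a plan is produced by prefixing the query with EXPLAIN; a minimal sketch (the BUFFERS option is an optional extra not mentioned above, which adds I/O counts per node):

```sql
-- ANALYZE actually executes the query and reports real row counts
-- and timings per plan node; BUFFERS additionally reports I/O.
EXPLAIN (ANALYZE, BUFFERS)
SELECT DISTINCT catalogite1_.id
FROM cat_catalogitem catalogite1_
INNER JOIN cat_service service2_
    ON catalogite1_.service_id = service2_.id
-- ... remainder of the query as above ...
;
```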
Can someone point me in the right direction for optimizing this query? I think I've pinpointed the main performance bottlenecks and just need a push in the right direction and/or tips on how to improve the situation.
Best Answer
It is sorting so that it can do the merge join. Merge joins require sorted input.
No, the actual sorting shouldn't take much time at all (although you might want to increase work_mem anyway; a sort like that probably doesn't need to spill to disk. What is the current setting?). Once it has the sorted data, though, it has to re-probe that data again and again as part of the merge join. That is where the time is going, and some of that time gets attributed to the sort step. Also, with this kind of plan, the overhead of collecting the timings reported by EXPLAIN ANALYZE can be huge, making the query take several times longer than it would unmonitored. If you run EXPLAIN (ANALYZE, TIMING OFF), what do you get for the bottom-line execution time?
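A sketch of both suggestions (the 64MB value below is an arbitrary starting point for experimentation, not a recommendation):

```sql
-- See how much memory sorts are currently allowed (default is 4MB).
SHOW work_mem;

-- Raise it for this session only, so the sort can stay in memory.
SET work_mem = '64MB';

-- Re-run the plan with per-node timing disabled; row counts are
-- still collected, but the clock-reading overhead is mostly gone.
EXPLAIN (ANALYZE, TIMING OFF)
SELECT DISTINCT catalogite1_.id
FROM cat_catalogitem catalogite1_
-- ... rest of the original query ...
;
```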
If you were to get it to use a hash join instead of a merge join, it probably would not change anything because the re-probing would still have to happen, just through a different mechanism.
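If you want to verify that, you can steer the planner away from the merge join for a single session and compare the plans (a diagnostic trick only; don't leave this set in production):

```sql
-- Make merge joins look prohibitively expensive, so the planner
-- chooses another join method (typically a hash join) instead.
SET enable_mergejoin = off;
EXPLAIN (ANALYZE, TIMING OFF)
SELECT DISTINCT catalogite1_.id
FROM cat_catalogitem catalogite1_
-- ... rest of the original query ...
;
RESET enable_mergejoin;
```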
The likely problem is that the query executes as two sub-branches, one off catalogite1_ and one off service2_, which are then effectively cartesian-joined at the end. The filtering can't be done until the end, because some of the data needed for the comparisons comes from one branch and some from the other. And it is effectively a cartesian join because service2_ has only one qualifying row in it, meaning
catalogite1_.service_id=service2_.id
is not very selective. I would try changing this part of the query:
LEFT OUTER JOIN cat_entitlement_services entitledse6_
ON service2_.id=entitledse6_.service_id
to this:
LEFT OUTER JOIN cat_entitlement_services entitledse6_
ON catalogite1_.service_id=entitledse6_.service_id
(The rewrite just substitutes catalogite1_.service_id for service2_.id, which the inner join condition makes equal.)
This might allow the filtering to occur at a much lower place in the query. If this works, then it would be interesting to know why the planner didn't make this switch for you--it should be capable of it. What is your setting for join_collapse_limit?
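join_collapse_limit caps how many explicitly-JOINed relations the planner will consider reordering; this query joins seven, so the default of 8 should cover it, but it is worth checking (a sketch, with an arbitrary session-level value):

```sql
-- If this has been lowered below the number of joined tables
-- (seven here), the planner keeps the JOINs in their written
-- order and cannot reorder them on its own.
SHOW join_collapse_limit;

-- Session-level bump, in case it was lowered globally.
SET join_collapse_limit = 12;
```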
Also, things like this:
AND (service2_.id=NULL
OR COALESCE(NULL) IS NULL)
certainly don't help the planner make reasonable choices!
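For what it's worth, that condition is a constant: service2_.id=NULL never evaluates to true (any comparison with NULL yields NULL), while COALESCE(NULL) IS NULL is always true, so the whole parenthesized expression is always true and can simply be deleted. This pattern usually comes from an ORM emitting a placeholder for an unused optional filter; the Hibernate-style aliases (catalogite1_, service2_) suggest that is what happened here.

```sql
-- Comparisons with NULL yield NULL ("unknown"), never true:
SELECT NULL = NULL;             -- result: NULL
-- COALESCE of all-NULL arguments is NULL, so this is always true:
SELECT COALESCE(NULL) IS NULL;  -- result: true
```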