Postgresql – Postgres: Performance: Slow index scan

indexoptimizationpostgresqlquery-performance

I have a really big query, that queries data from various tables. I have tried various approaches to optimize the query.

  • Made possible LEFT JOIN as LEFT LATERAL JOIN, so that the lateral join queries are computed after fetching results from the main query
  • Removed GROUP BY since the aggregation took more time.

But even the final query that I had come up with is slow when I try to order it by an indexed column. Please suggest to me how I should get started on optimization.

Here is the query and the explain analyze detail
https://explain.depesz.com/s/kTBh

Here is the EXPLAIN (ANALYZE, BUFFERS) with indexscan ON:
https://explain.depesz.com/s/SyRu

Here is the EXPLAIN (ANALYZE, BUFFERS) with indexscan OFF:
https://explain.depesz.com/s/TPDv

Indexes on the tables:

CREATE INDEX idx_screens_name ON public.screens USING btree (name);
CREATE INDEX idx_screens_number ON public.screens USING btree (((data ->> 'number'::text)));
CREATE INDEX idx_screens_tags ON public.screens USING gin (tags);
CREATE INDEX idx_screens_theatre_id ON public.screens USING btree (theatre_id);

CREATE INDEX idx_theatre_companies ON public.theatres USING gin (company_ids);
CREATE INDEX idx_theatres_chain_name ON public.theatres USING btree ((((data -> 'chain'::text) ->> 'name'::text)));
CREATE INDEX idx_theatres_city_name ON public.theatres USING btree (((((data -> 'address'::text) -> 'city'::text) ->> 'name'::text)));
CREATE INDEX idx_theatres_country_name ON public.theatres USING btree (((((data -> 'address'::text) -> 'country'::text) ->> 'name'::text)));
CREATE INDEX idx_theatres_name ON public.theatres USING btree (name);
CREATE INDEX idx_theatres_province_name ON public.theatres USING btree (((((data -> 'address'::text) -> 'province'::text) ->> 'name'::text)));
CREATE INDEX idx_theatres_qwc_is_active ON public.theatres USING btree (qwc_is_active);
CREATE INDEX idx_theatres_tags ON public.theatres USING gin (tags);

CREATE UNIQUE INDEX company_screen_mappings_key ON public.company_screen_mappings USING btree (company_id, screen_id) WHERE (is_deleted = false);

CREATE UNIQUE INDEX idx_scheduling_times_company_id_screen_id_content_type_id ON public.scheduling_times USING btree (company_id, screen_id, content_type_id);
CREATE INDEX idx_screen_right_mappings_date_range ON public.screen_right_mappings USING gist (daterange(valid_from, valid_till));
CREATE INDEX idx_screen_right_mappings_time_range ON public.screen_right_mappings USING gist (public.tmrange(start_time, end_time));

Best Answer

Looking at the query: most of your LEFT JOINs are not really LEFT JOINs because of the predicate ON TRUE which makes them behave as CROSS JOINs. Your query is considering every row that comes out of:

FROM screens s
INNER JOIN theatres t ON s.theatre_id = t.id 
LEFT JOIN company_screen_mappings AS csm ON csm.screen_id = s.id AND csm.is_deleted = FALSE AND csm.company_id = 'f5003aa3-3621-498a-ab0f-526706fda88f'

against every combination of rows coming out of the sub-queries that get called r, p, c, ste, activation, and rc. This is going to be time-consuming if any or all of those parts return a significant number of rows. As it is needing to compare everything against everything with little or no constraining predicates, scans are unavoidable (it needs to read everything to do the comparisons).

To me that looks like a logic flaw which the GROUP BY existed to try work around (much like the common anti-pattern of fixing accidental cartesian products by adding DISTINCT), furthermore it looks like a query that has been generated by an ORM which has been given some incorrect joining options, though not knowing the application or what the query is trying to do I of course don't know either of those things with any certainty.

For how to proceed in such cases: first I'd ask the developer to look at what the application is asking for (if it is working through an ORM), it may be that the fix or simplification is to be done there. If this is a hand-written query (I doubt it would have been written in one go like this, but it may have grown over time as more and more features were added to the report/screen it supports) rather than ORM generated then my first thought would be to return to the spec and rewrite it to those requirements from scratch.