The CTE is not needed here and acts as an optimization barrier. A plain subquery generally performs better:
SELECT *
FROM  (
   SELECT id
        , rank() OVER w AS global_rank
        , lag(slug)  OVER w AS previous_slug
        , lead(slug) OVER w AS next_slug
   FROM   entries
   WHERE  competition_id = 'bdd94eee-25a4-481f-b7b5-37aaed953c6b'
   WINDOW w AS (ORDER BY total_votes DESC)
   ) entry_with_global_rank
WHERE  id = 'f2df68b7-d720-459d-8c4d-d11e28e0f0c0'
LIMIT  1;
As @Daniel commented, I removed the PARTITION BY clause from the window definition, since you are limiting to a single competition_id anyway.
Table layout
You could optimize your table layout to slightly reduce on-disk storage size, which makes everything a bit faster:
Column | Type | Modifiers
----------------+-----------------------------+-------------------------------------
id | uuid | not null default uuid_generate_v4()
competition_id | uuid | not null
user_id | uuid | not null
total_votes | integer | not null default 0
photos_count | integer | not null default 0
hidden | boolean | not null default false
slug | character varying(255) | not null
first_name | character varying(255) | not null
last_name | character varying(255) | not null
image | character varying(255) |
country | character varying(255) |
image_src | character varying(255) |
photo_id | uuid |
created_at | timestamp without time zone |
updated_at | timestamp without time zone |
featured_until | timestamp without time zone |
Also, do you actually need all those uuid columns? Wouldn't int or bigint work for you? That would make the table and indexes a bit smaller and everything faster.
And I would just use text for the character data, though that is not going to help the performance of this query. Aside: character varying(255) is almost always pointless in Postgres. Some other RDBMSs profit from restricting the length; for Postgres it's all the same (unless you actually need to enforce the unlikely maximum length of 255 characters).
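As an illustration only (hypothetical DDL, assuming bigint keys and plain text columns are acceptable for your data), a tighter layout might put fixed-width NOT NULL columns first and variable-length columns last, reducing alignment padding:

```sql
-- Sketch of a reordered layout; column order minimizes padding.
-- bigserial and text are assumptions, not requirements.
CREATE TABLE entries (
   id              bigserial PRIMARY KEY
 , competition_id  bigint  NOT NULL
 , user_id         bigint  NOT NULL
 , photo_id        bigint
 , created_at      timestamp
 , updated_at      timestamp
 , featured_until  timestamp
 , total_votes     integer NOT NULL DEFAULT 0
 , photos_count    integer NOT NULL DEFAULT 0
 , hidden          boolean NOT NULL DEFAULT false
 , slug            text    NOT NULL
 , first_name      text    NOT NULL
 , last_name       text    NOT NULL
 , image           text
 , country         text
 , image_src       text
);
```

The savings per row are small (a few bytes of padding), but they multiply across every row, index entry, and cache line.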
Special index
Finally, you could build a highly specialized index (only if index maintenance is worth the special casing):
CREATE INDEX entries_special_idx ON entries (competition_id, total_votes DESC, id, slug);
Adding (id, slug) to the index only makes sense if you can get index-only scans out of it. (Disabled autovacuum or lots of concurrent writes would negate that effort.) Otherwise, remove the last two columns.
While you're at it, audit your indexes. Are they all in use? There might be some dead freight here.
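To spot candidates for removal, a query against the cumulative statistics views can help. This is a sketch; a zero idx_scan since the last stats reset suggests, but does not prove, an unused index:

```sql
-- Indexes never scanned since the last stats reset, biggest first.
SELECT schemaname
     , relname       AS table_name
     , indexrelname  AS index_name
     , idx_scan
     , pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM   pg_stat_user_indexes
ORDER  BY idx_scan, pg_relation_size(indexrelid) DESC;
```

Be careful with indexes that back constraints (primary keys, UNIQUE) or are only used by rare but important jobs.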
I'm not sure why this sorting is happening because there is no ORDER BY operation in my query.
It is sorting so that it can do the merge join. Merge joins require sorted input.
The sort seems to be an external disk sort, which could be why it's proving to be so costly.
No, the actual sorting shouldn't take much time at all (although you might want to increase work_mem anyway, that kind of sort probably doesn't need to be on disk. What is the current setting?). Once it has the sorted data, though, it has to re-probe that data again and again as part of the merge join. That is where the time is going, and some of that time gets attributed to the sort step. Also, with this kind of plan, the overhead of collecting the times to report for an EXPLAIN ANALYZE can be huge, leading to the query taking several times longer than if it were not being monitored. If you do EXPLAIN (ANALYZE, TIMING OFF), what do you get for the bottom line execution time?
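For reference, checking and raising work_mem for the session, and re-running without per-row timing overhead, might look like this (the 256MB figure is purely illustrative; pick a value that fits your RAM and concurrency):

```sql
SHOW work_mem;                      -- current setting
SET work_mem = '256MB';             -- session-local bump for this experiment

-- Re-run with row counts but without per-node timing overhead:
EXPLAIN (ANALYZE, TIMING OFF) SELECT ...;   -- your query here
```

SET only affects the current session, so this is safe to experiment with before touching postgresql.conf.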
If you were to get it to use a hash join instead of a merge join, it probably would not change anything because the re-probing would still have to happen, just through a different mechanism.
The likely problem is that the query is executing as two sub-branches, one off from catalogite1_ and one off from service2_, which are then effectively cartesian joined at the end. The filtering can't be done until the end because some of the data needed for the comparisons comes from one branch and some from the other. And it is effectively a cartesian join because service2_ only has one qualifying row in it, meaning catalogite1_.service_id = service2_.id is not very selective.
I would try changing this part of the query:
ON service2_.id=entitledse6_.service_id
to this:
ON catalogite1_.service_id=entitledse6_.service_id
This might allow the filtering to occur at a much lower place in the query. If this works, then it would be interesting to know why the planner didn't make this switch for you--it should be capable of it. What is your setting for join_collapse_limit?
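If the limit turns out to be the culprit, you could experiment session-locally like this (a sketch; 12 is an arbitrary trial value, and higher limits increase planning time):

```sql
SHOW join_collapse_limit;       -- default is 8
SET join_collapse_limit = 12;   -- lets the planner reorder larger explicit-JOIN lists
```

With more items than the limit in an explicit JOIN list, the planner stops searching for better join orders and takes the written order largely as-is, which can prevent exactly the kind of switch described above.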
Also, things like this:
AND (
service2_.id=NULL
OR COALESCE(NULL) IS NULL)
certainly don't help the planner make reasonable choices!
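Note that COALESCE(NULL) IS NULL is constant true, so the whole OR branch is always satisfied and the condition is a no-op that could simply be dropped. A quick check:

```sql
SELECT COALESCE(NULL) IS NULL;   -- always true
-- hence:  AND (service2_.id = NULL OR COALESCE(NULL) IS NULL)
-- reduces to:  AND true         -- i.e. it can be removed entirely
```

(Meanwhile, service2_.id = NULL is never true; comparing with NULL yields NULL, not true, which is another reason the predicate looks like generated-ORM noise.)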
Best Answer
Your window function has to sort 10,453,164 rows. Look at cutting that down. Sorting 10.5 million rows in 158 seconds isn't too bad.

I also think there is something wrong with your partitioning.
Why does this query search a table called ep_2016_8_host_ts_a_ses when the timestamp clearly lies in 2017-02 and 2017-03? The query planner is supposed to know better. Look up constraint exclusion.

My guess here is that the query can't make use of the partitions' indexes to sort the table. It should be able to -- if all partitions were indexed by timestamp_, I would think it could walk through the indexes in parallel and get just the first rank fairly easily. I could be wrong, though; I may have to play with this later. In the meantime, try getting constraint exclusion working and giving it another go.
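A sketch of what that might involve. The partition name and date range below are guesses based on your ep_2016_8_host_ts_a_ses naming pattern, so adjust to your actual schema:

```sql
SHOW constraint_exclusion;              -- 'partition' (the default) or 'on'
SET constraint_exclusion = partition;

-- Each child partition needs a CHECK constraint the planner can use
-- to prune it, e.g. (hypothetical names and bounds):
ALTER TABLE ep_2017_2_host_ts_a_ses
  ADD CONSTRAINT ep_2017_2_ts_check
  CHECK (timestamp_ >= '2017-02-01' AND timestamp_ < '2017-03-01');
```

Once the constraints are in place, EXPLAIN should show partitions outside the queried range disappearing from the plan.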