The CTE is not needed here and acts as an optimization barrier. A plain subquery generally performs better:
SELECT *
FROM  (
   SELECT id
        , rank() OVER w AS global_rank
        , lag(slug)  OVER w AS previous_slug
        , lead(slug) OVER w AS next_slug
   FROM   entries
   WHERE  competition_id = 'bdd94eee-25a4-481f-b7b5-37aaed953c6b'
   WINDOW w AS (ORDER BY total_votes DESC)
   ) entry_with_global_rank
WHERE  id = 'f2df68b7-d720-459d-8c4d-d11e28e0f0c0'
LIMIT  1;
As @Daniel commented, I removed the PARTITION BY clause from the window definition, since you are limiting to a single competition_id anyway.
Table layout
You could optimize your table layout to slightly reduce on-disk storage size, which makes everything a bit faster:
Column | Type | Modifiers
----------------+-----------------------------+-------------------------------------
id | uuid | not null default uuid_generate_v4()
competition_id | uuid | not null
user_id | uuid | not null
total_votes | integer | not null default 0
photos_count | integer | not null default 0
hidden | boolean | not null default false
slug | character varying(255) | not null
first_name | character varying(255) | not null
last_name | character varying(255) | not null
image | character varying(255) |
country | character varying(255) |
image_src | character varying(255) |
photo_id | uuid |
created_at | timestamp without time zone |
updated_at | timestamp without time zone |
featured_until | timestamp without time zone |
Also, do you actually need all those uuid columns? Wouldn't int or bigint work for you? That would make the table and indexes a bit smaller and everything faster.
And I would just use text for the character data, though that is not going to help the performance of this query. Aside: character varying(255) is almost always pointless in Postgres. Some other RDBMSs profit from restricting the length; for Postgres it's all the same (unless you actually need to enforce the unlikely maximum length of 255 characters).
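As an illustration only (hypothetical DDL, assuming bigint keys and plain text columns are acceptable for your data), a tighter layout might put fixed-width NOT NULL columns first and variable-length columns last, reducing alignment padding:

```sql
-- Sketch of a reordered layout; column order minimizes padding.
-- bigserial and text are assumptions, not requirements.
CREATE TABLE entries (
   id              bigserial PRIMARY KEY
 , competition_id  bigint  NOT NULL
 , user_id         bigint  NOT NULL
 , photo_id        bigint
 , created_at      timestamp
 , updated_at      timestamp
 , featured_until  timestamp
 , total_votes     integer NOT NULL DEFAULT 0
 , photos_count    integer NOT NULL DEFAULT 0
 , hidden          boolean NOT NULL DEFAULT false
 , slug            text    NOT NULL
 , first_name      text    NOT NULL
 , last_name       text    NOT NULL
 , image           text
 , country         text
 , image_src       text
);
```

The savings per row are small (a few bytes of padding), but they multiply across every row, index entry, and cache line.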
Special index
Finally, you could build a highly specialized index (only if index maintenance is worth the special casing):
CREATE INDEX entries_special_idx ON entries (competition_id, total_votes DESC, id, slug);
Adding (id, slug) to the index only makes sense if you can get index-only scans out of it. (Disabled autovacuum or lots of concurrent writes would negate that effort.) Otherwise, remove the last two columns.
While you're at it, audit your indexes. Are they all in use? There might be some dead freight here.
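To spot candidates for removal, a query against the cumulative statistics views can help. This is a sketch; a zero idx_scan since the last stats reset suggests, but does not prove, an unused index:

```sql
-- Indexes never scanned since the last stats reset, biggest first.
SELECT schemaname
     , relname       AS table_name
     , indexrelname  AS index_name
     , idx_scan
     , pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM   pg_stat_user_indexes
ORDER  BY idx_scan, pg_relation_size(indexrelid) DESC;
```

Be careful with indexes that back constraints (primary keys, UNIQUE) or are only used by rare but important jobs.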
I'm not sure why this sorting is happening because there is no ORDER BY operation in my query.
It is sorting so that it can do the merge join. Merge joins require sorted input.
The sort seems to be an external disk sort, which could be why it's proving to be so costly.
No, the actual sorting shouldn't take much time at all (although you might want to increase work_mem anyway, that kind of sort probably doesn't need to be on disk. What is the current setting?). Once it has the sorted data, though, it has to re-probe that data again and again as part of the merge join. That is where the time is going, and some of that time gets attributed to the sort step. Also, with this kind of plan, the overhead of collecting the times to report for an EXPLAIN ANALYZE can be huge, leading to the query taking several times longer than if it were not being monitored. If you do EXPLAIN (ANALYZE, TIMING OFF), what do you get for the bottom line execution time?
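For reference, checking and raising work_mem for the session, and re-running without per-row timing overhead, might look like this (the 256MB figure is purely illustrative; pick a value that fits your RAM and concurrency):

```sql
SHOW work_mem;                      -- current setting
SET work_mem = '256MB';             -- session-local bump for this experiment

-- Re-run with row counts but without per-node timing overhead:
EXPLAIN (ANALYZE, TIMING OFF) SELECT ...;   -- your query here
```

SET only affects the current session, so this is safe to experiment with before touching postgresql.conf.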
If you were to get it to use a hash join instead of a merge join, it probably would not change anything because the re-probing would still have to happen, just through a different mechanism.
The likely problem is that the query is executing as two sub-branches, one off from catalogite1_ and one off from service2_, which are then effectively cartesian joined at the end. The filtering can't be done until the end because some of the data needed for the comparisons comes from one branch and some from the other. And it is effectively a cartesian join because service2_ only has one qualifying row in it, meaning catalogite1_.service_id = service2_.id is not very selective.
I would try changing this part of the query:
ON service2_.id=entitledse6_.service_id
to this:
ON catalogite1_.service_id=entitledse6_.service_id
This might allow the filtering to occur at a much lower place in the query. If this works, then it would be interesting to know why the planner didn't make this switch for you--it should be capable of it. What is your setting for join_collapse_limit?
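If the limit turns out to be the culprit, you could experiment session-locally like this (a sketch; 12 is an arbitrary trial value, and higher limits increase planning time):

```sql
SHOW join_collapse_limit;       -- default is 8
SET join_collapse_limit = 12;   -- lets the planner reorder larger explicit-JOIN lists
```

With more items than the limit in an explicit JOIN list, the planner stops searching for better join orders and takes the written order largely as-is, which can prevent exactly the kind of switch described above.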
Also, things like this:
AND (
service2_.id=NULL
OR COALESCE(NULL) IS NULL)
certainly don't help the planner make reasonable choices!
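Note that COALESCE(NULL) IS NULL is constant true, so the whole OR branch is always satisfied and the condition is a no-op that could simply be dropped. A quick check:

```sql
SELECT COALESCE(NULL) IS NULL;   -- always true
-- hence:  AND (service2_.id = NULL OR COALESCE(NULL) IS NULL)
-- reduces to:  AND true         -- i.e. it can be removed entirely
```

(Meanwhile, service2_.id = NULL is never true; comparing with NULL yields NULL, not true, which is another reason the predicate looks like generated-ORM noise.)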
Best Answer
Your window function has to sort 10,453,164 rows. Look at cutting that down. Sorting 10.5 million rows in 158 seconds isn't too bad.

I also think there is something wrong with your partitioning.
Why does this query search a table called ep_2016_8_host_ts_a_ses when the timestamp clearly lies in 2017-02 and 2017-03? The query planner is supposed to know better. Look up constraint exclusion.

My guess here is that the query can't make use of the partitions' indexes to sort the table. It should be able to -- if all partitions were indexed by timestamp_, I would think it could walk through the indexes in parallel and get just the first rank fairly easily. I could be wrong, though; I may have to play with this later. In the meantime, try getting constraint exclusion working and giving it another go.
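A sketch of what that might involve. The partition name and date range below are guesses based on your ep_2016_8_host_ts_a_ses naming pattern, so adjust to your actual schema:

```sql
SHOW constraint_exclusion;              -- 'partition' (the default) or 'on'
SET constraint_exclusion = partition;

-- Each child partition needs a CHECK constraint the planner can use
-- to prune it, e.g. (hypothetical names and bounds):
ALTER TABLE ep_2017_2_host_ts_a_ses
  ADD CONSTRAINT ep_2017_2_ts_check
  CHECK (timestamp_ >= '2017-02-01' AND timestamp_ < '2017-03-01');
```

Once the constraints are in place, EXPLAIN should show partitions outside the queried range disappearing from the plan.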