PostgreSQL – Postgres CTE query for a single row not optimized

Tags: cte, postgresql, view

From my database I am trying to get a weighted sum of a user's stats. I will only be querying the table for one or two users at a time, so I wrote it as a view.

Since it's a view, the query nominally calculates the sum for every single row in the table, and I was hoping the optimizer would notice when I ask for only a single row and plan accordingly. However, my query plan is massive, calculating 17 billion rows at its innermost point, where I think there should be at most a thousand.

Here is the query:

CREATE OR REPLACE VIEW weighted_stats AS
WITH
    clf AS (SELECT * FROM classifiers ORDER BY time_trained DESC LIMIT 1),
    weights AS (SELECT kv.key, kv.value FROM clf, each(clf.weights) AS kv),
    kvs AS (
        SELECT stats.player_id, kv.key, kv.value
        FROM stats, each(stats.hstore_column) AS kv)
SELECT
    kvs.player_id,
    SUM(kvs.value::numeric * weights.value::numeric) AS stats
FROM
    kvs JOIN weights USING (key)
GROUP BY kvs.player_id;

This is the query plan:

EXPLAIN ANALYZE SELECT * FROM weighted_stats WHERE player_id = 76561197960269296;

GroupAggregate  (cost=53645.35..299471.72 rows=1 width=72) (actual time=1014.016..1014.016 rows=0 loops=1)
   Group Key: kvs.id
   CTE clf
     ->  Limit  (cost=20.65..20.65 rows=1 width=84) (actual time=0.017..0.018 rows=1 loops=1)
           ->  Sort  (cost=20.65..22.43 rows=710 width=84) (actual time=0.014..0.014 rows=1 loops=1)
                 Sort Key: classifiers.time_trained
                 Sort Method: quicksort  Memory: 25kB
                 ->  Seq Scan on classifiers  (cost=0.00..17.10 rows=710 width=84) (actual time=0.003..0.005 rows=1 loops=1)
   CTE kvs
     ->  Seq Scan on stats  (cost=0.00..53572.18 rows=10318000 width=722) (actual time=0.037..530.337 rows=336036 loops=1)
   CTE weights
     ->  Nested Loop  (cost=0.00..20.02 rows=1000 width=64) (actual time=0.036..0.046 rows=2 loops=1)
           ->  CTE Scan on clf  (cost=0.00..0.02 rows=1 width=32) (actual time=0.020..0.023 rows=1 loops=1)
           ->  Function Scan on each kv  (cost=0.00..10.00 rows=1000 width=64) (actual time=0.011..0.013 rows=2 loops=1)
   ->  Hash Join  (cost=32.50..241344.73 rows=257950 width=72) (actual time=1014.012..1014.012 rows=0 loops=1)
         Hash Cond: (kvs.key = weights.key)
         ->  CTE Scan on kvs  (cost=0.00..232155.00 rows=51590 width=72) (actual time=0.044..1013.877 rows=62 loops=1)
               Filter: (id = 76561197960269296::bigint)
               Rows Removed by Filter: 335974
         ->  Hash  (cost=20.00..20.00 rows=1000 width=64) (actual time=0.060..0.060 rows=2 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 1kB
               ->  CTE Scan on weights  (cost=0.00..20.00 rows=1000 width=64) (actual time=0.040..0.054 rows=2 loops=1)
 Planning time: 0.286 ms
 Execution time: 1017.671 ms

This is still a lot slower than I'd expect. The optimization is partially working: the filter is applied before the join rather than after the GROUP BY, but the kvs CTE (which should itself be filtered) is still being computed for every player.

Best Answer

Common Table Expressions are treated as an “optimization fence” by PostgreSQL: it will never push predicates from the outer query down into a CTE, nor collapse joins across the CTE boundary. Instead, it generally evaluates the whole CTE as written and materializes the result; the outer query then reads from that temporary, materialized result.
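
To see the fence in isolation, here is a minimal sketch against the stats table from the question (42 stands in for a real player_id; this is the behaviour of the 9.x releases discussed here):

EXPLAIN ANALYZE
WITH t AS (SELECT * FROM stats)
SELECT * FROM t WHERE player_id = 42;   -- stats is scanned and materialized in full; the filter runs on the CTE scan

EXPLAIN ANALYZE
SELECT * FROM (SELECT * FROM stats) AS t
WHERE player_id = 42;                   -- the subquery is flattened and the predicate reaches the base scan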

So yes, your query would likely benefit from converting the CTE into a subquery.
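
Sketched against the same schema (untested, but the shape is the point), the view could be written with plain subqueries so the planner can flatten them and push player_id down to the scan of stats:

CREATE OR REPLACE VIEW weighted_stats AS
SELECT
    kvs.player_id,
    SUM(kvs.value::numeric * weights.value::numeric) AS stats
FROM
    (SELECT stats.player_id, kv.key, kv.value
     FROM stats, each(stats.hstore_column) AS kv) AS kvs
    JOIN
    (SELECT kv.key, kv.value
     FROM (SELECT *
           FROM classifiers
           ORDER BY time_trained DESC
           LIMIT 1) AS clf,
          each(clf.weights) AS kv) AS weights
    USING (key)
GROUP BY kvs.player_id;

With this form, a WHERE player_id = … on the view can be applied directly to the scan of stats (ideally via an index on player_id) instead of to a fully materialized intermediate result.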

Note that an actual view (created by CREATE VIEW) does not act as an optimization fence. The view’s definition is inlined into the query that uses it and then optimized as a whole. For CTEs, there has been discussion about making the optimization fence behaviour optional so that they could be used “just” to make queries more readable. However, as of version 9.5, this hasn’t been implemented yet.
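
That did change later: since PostgreSQL 12, a side-effect-free CTE that is referenced only once is inlined by default, and either behaviour can be requested explicitly. A sketch (PostgreSQL 12+ only; 42 is again a placeholder id):

WITH kvs AS NOT MATERIALIZED (   -- allow inlining and predicate pushdown
    SELECT stats.player_id, kv.key, kv.value
    FROM stats, each(stats.hstore_column) AS kv
)
SELECT *
FROM kvs
WHERE player_id = 42;

Writing WITH kvs AS MATERIALIZED (…) instead restores the old fence behaviour.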