Intro
In PostgreSQL 9.3: I am building a query that fetches data for a table that supports sorting, filtering and paging. Think of customers, for example: you want to show name, surname, some detail columns, and some information from associated tables (e.g. past purchases). A lot of data is being fetched, and in its simplest form the query looks like this:
```sql
SELECT
    customer.name,
    customer.surname,
    (SELECT aggregate_whatever(foo) FROM bar) AS past_purchases,
    baz.aaa AS bbb,
    gaz.ccc AS ddd,
    ...
FROM customers customer
LEFT JOIN baz ON ...
LEFT JOIN gaz ON ...
```
There are several thousand customers, and the additional data being fetched comes from most of the system.
Filtering, paging and sorting
In the end we will have to put past_purchases into the WHERE clause when building the query according to the filter. For this reason, the whole query is encapsulated as a CTE (common table expression), just like in this SO question. It looks like this:
```sql
WITH encapsulated AS (
    SELECT
        customer.name,
        customer.surname,
        (SELECT aggregate_whatever(foo) FROM bar) AS past_purchases,
        baz.aaa AS bbb,
        gaz.ccc AS ddd,
        ...
    FROM customers customer
    LEFT JOIN baz ON ...
    LEFT JOIN gaz ON ...
)
SELECT * FROM encapsulated
WHERE past_purchases = 5
  AND <other conditions>
```
That final SELECT also gets an ORDER BY clause for the necessary columns. For paging purposes, a LIMIT is added at the end, and the total number of rows is calculated as follows (note: `LIMIT 0, 10` is MySQL syntax; PostgreSQL uses LIMIT with OFFSET):
```sql
...
SELECT *, COUNT(*) OVER () AS total_row_count
FROM encapsulated
WHERE past_purchases = 5
  AND <other conditions>
ORDER BY surname, name
LIMIT 10 OFFSET 0
```
Problem
This solution works but quickly runs into performance problems. All the tables have proper indexes, and EXPLAIN (ANALYZE) shows a nice yet costly plan with index accesses, yet the whole query takes almost 20 seconds to finish.
My suspicion is that the CTE is materialized in full first: the server fetches all the data for all the thousands of customers, and only then applies the WHERE filtering, the ordering, and the LIMIT.
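You can check that suspicion in the plan itself: in PostgreSQL 9.3, a query over a CTE shows a CTE Scan node, and if the filter is attached to that node rather than to the scans inside the CTE, the whole CTE result was built before filtering. A minimal sketch (the plan fragment below shows the general shape to look for, not actual measured output):

```sql
EXPLAIN (ANALYZE, BUFFERS)
WITH encapsulated AS (
    SELECT customer.name, customer.surname, ...
    FROM customers customer
    LEFT JOIN baz ON ...
)
SELECT * FROM encapsulated
WHERE past_purchases = 5;

-- Telltale shape in the output:
--   CTE Scan on encapsulated  (...)
--     Filter: (past_purchases = 5)
--     Rows Removed by Filter: <large number>
-- A large "Rows Removed by Filter" on the CTE Scan means all rows
-- were computed first and discarded only afterwards.
```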
Question 1
Are the WHERE conditions outside of the CTE propagated into the CTE?
Question 2
Am I using the CTE correctly? Basically I am simplifying my life: my final WHERE clause is trivial because it does not need to repeat the expressions that form my columns. But did I break the performance in doing so?
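For contrast, this is what the filter looks like without the encapsulation: the column expression has to be repeated verbatim in the WHERE clause, because SQL evaluates WHERE before the SELECT list and therefore cannot reference the alias (a sketch using the placeholder names from above):

```sql
SELECT
    customer.name,
    customer.surname,
    (SELECT aggregate_whatever(foo) FROM bar) AS past_purchases
FROM customers customer
-- "WHERE past_purchases = 5" would fail here: the alias is not yet
-- in scope, so the whole expression must be repeated.
WHERE (SELECT aggregate_whatever(foo) FROM bar) = 5
```

Avoiding exactly this duplication is what the CTE buys, at the cost described below.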
Things I have tried
- Made sure all the table accesses use an index (this helped a bit, but the fetch still takes almost 20 seconds in total).
- Tried replacing all the views referenced from the query with their materialized versions (no improvement).
- Tried removing JOINs and sub-SELECTs from the query to see if the performance improves (only very little).
- Set shared_buffers to 1/4 of the available RAM (1 GB of 4 GB; no effect).
- VACUUMed the whole database (FULL FREEZE ANALYZE; no effect).
Question 3
Am I correct in thinking that my approach requires scanning basically the whole DB, building a specialized view of it (the CTE), and only then applying the filtering? Is there a better way to do this?
Best Answer
We have a similar issue with CTEs. From what I gathered researching the question, and from testing on our own queries, indexes that would have been used to filter the results inside the CTE are not used when the filter sits in a WHERE clause outside the CTE because, as mentioned here, the CTE acts as an optimization fence. This means that, for performance reasons, you will want to refactor queries using CTEs to use subqueries instead. We had a bunch of queries using CTEs where we gained an order of magnitude or two of performance when we refactored them into subqueries; in one case, we dropped query time from approximately 2 minutes to just under a second. So keep that in mind when building queries.
So, in your example, you would use the query inside the CTE as a subquery instead:
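A sketch of that rewrite, using the placeholder names from the question: the derived table is not an optimization fence, so the planner is free to push the outer WHERE conditions down into it and use the base-table indexes. (Whether it actually can for a given predicate depends on the query, e.g. scalar subqueries in the select list may still be evaluated per surviving row, so verify with EXPLAIN ANALYZE.)

```sql
SELECT *, COUNT(*) OVER () AS total_row_count
FROM (
    SELECT
        customer.name,
        customer.surname,
        (SELECT aggregate_whatever(foo) FROM bar) AS past_purchases,
        baz.aaa AS bbb,
        gaz.ccc AS ddd,
        ...
    FROM customers customer
    LEFT JOIN baz ON ...
    LEFT JOIN gaz ON ...
) AS encapsulated  -- a plain derived table, not a fenced CTE
WHERE past_purchases = 5
  AND <other conditions>
ORDER BY surname, name
LIMIT 10 OFFSET 0
```

The outer query still gets the trivial WHERE clause over named columns, so the encapsulation benefit is preserved.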
(In our case, we had abysmal performance on some of our queries because XPath column expressions, combined with scans of upwards of 30k rows in a table, cost us seconds computing those XPaths on rows that would ultimately be discarded anyway. Removing the CTEs and using subqueries sped things up considerably, as the XPath columns are then calculated only for the rows actually returned.)