PostgreSQL – Managing CTE Execution Order Under Uncertainty

cte postgresql unique-constraint

I am writing a large, multi-step CTE for performance reasons.

In one query, data must be moved from one table to another, but the quantity of rows moved is uncertain and could be zero.

In a subsequent query, the source rows of the previous query are deleted, but this must happen only after the previous query has completed.

Finally, rows must be written in place of the deleted rows after the second query above is completed.

In the first two queries, I am using RETURNING to enforce execution order.

In the second query, I'm determining that the first query is completed by this subquery

(SELECT COUNT(*) FROM first_query) >= 0

In the third query, I'm determining that the second query is completed by this subquery

SELECT EXISTS (SELECT 1 FROM second_query)

Is the subquery to determine that the first query has completed correct?

Is the subquery to determine that the second query, which must return rows, has completed optimal for accuracy, precision, and performance?

Using the above subqueries to enforce execution order is giving duplicate key value violations.

Query subsection

WITH copy_to_other_table AS (
    INSERT INTO other_table (column_a, column_b) 
        SELECT column_a, column_b 
            FROM main_table
        WHERE column_a = $1::bigint
        RETURNING *
),
main_table_deleted AS (
    DELETE FROM main_table WHERE column_a = $1::bigint 
        AND (SELECT COUNT(*) FROM copy_to_other_table) >= 0         
        RETURNING *
)
INSERT INTO main_table (column_a, column_b) 
        SELECT column_a, column_b 
            FROM another_table WHERE column_a = $1::bigint 
             AND EXISTS (SELECT 1 FROM main_table_deleted)

It is the final query that is violating the unique constraint.

Best Answer

This should work but I'm not really sure if it's the best regarding efficiency:

WITH copy_to_other_table AS (
    INSERT INTO other_table (column_a, column_b) 
        SELECT column_a, column_b 
            FROM main_table
        WHERE column_a = 1
),
main_table_deleted AS (
    DELETE 
    FROM main_table 
    WHERE column_a = 1 
      AND NOT EXISTS (SELECT 1 FROM another_table 
                      WHERE column_a = 1
                        AND column_b = main_table.column_b)               

)
INSERT INTO main_table (column_a, column_b) 
        SELECT column_a, column_b 
            FROM another_table WHERE column_a = 1
        EXCEPT 
        SELECT column_a, column_b 
            FROM main_table WHERE column_a = 1 ;

But what is wrong with the original query?

First, the (SELECT COUNT(*) ...) >= 0 is completely redundant. A count aggregate always returns a value of 0 or more, so that condition is always true.
Second, there is no need for any condition there at all, because all the rows from main_table that you want copied to the other table you also want deleted from main_table. There is no reason to "check" whether they have been copied before you delete them. All 3 sub-statements (the 2 CTEs and the main query) will "see" the same tables with exactly the same rows and data.
The third part is trickier. At first glance it might seem that no check is needed for the "interaction" between the 2nd (delete) and the 3rd (insert) part either. Both target the same table, but if the 2nd CTE is performed before the main query, all should be well.
Alas, the order of execution is not consecutive. From the Postgres docs:

Data-modifying statements in WITH are executed exactly once, and always to completion, independently of whether the primary query reads all (or indeed any) of their output. Notice that this is different from the rule for SELECT in WITH: as stated in the previous section, execution of a SELECT is carried only as far as the primary query demands its output.

The sub-statements in WITH are executed concurrently with each other and with the main query. Therefore, when using data-modifying statements in WITH, the order in which the specified updates actually happen is unpredictable. All the statements are executed with the same snapshot (see Chapter 13), so they cannot "see" each others' effects on the target tables. This alleviates the effects of the unpredictability of the actual order of row updates, and means that RETURNING data is the only way to communicate changes between different WITH sub-statements and the main query.
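The snapshot rule quoted above can be seen in a small experiment. This is only a sketch: snapshot_demo is a hypothetical scratch table, not part of the schema in the question.

```sql
-- Hypothetical scratch table, used only for this illustration.
CREATE TEMP TABLE snapshot_demo (column_a bigint, column_b text);
INSERT INTO snapshot_demo VALUES (1, 'x'), (1, 'y');

WITH deleted AS (
    DELETE FROM snapshot_demo WHERE column_a = 1
    RETURNING *
)
SELECT (SELECT count(*) FROM snapshot_demo WHERE column_a = 1) AS still_visible,
       (SELECT count(*) FROM deleted)                          AS returned_by_delete;
-- Both counts should be 2: the main query reads the table through the same
-- snapshot as the DELETE, so it still "sees" the rows, while the deleted
-- rows themselves are only reachable through the RETURNING output.
```

Running a plain SELECT on snapshot_demo afterwards, as a separate statement, should show the rows gone.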

As a test, you could alter the order of the 3 sub-statements. The result will be the same:

WITH main_table_deleted AS (
    DELETE 
    FROM main_table 
    WHERE column_a = 1 
      AND NOT EXISTS (SELECT 1 FROM another_table 
                      WHERE column_a = 1
                        AND column_b = main_table.column_b)               

),
 copy_to_other_table AS (
    INSERT INTO other_table (column_a, column_b) 
        SELECT column_a, column_b 
            FROM main_table
        WHERE column_a = 1
)
INSERT INTO main_table (column_a, column_b) 
        SELECT column_a, column_b 
            FROM another_table WHERE column_a = 1
        EXCEPT 
        SELECT column_a, column_b 
            FROM main_table WHERE column_a = 1 ;

The related issue is when the unique constraints are checked. I'm not 100% sure about the detail of those checks in combination with CTEs but unique constraints should be checked at the end of statements. It appears that they are also checked concurrently for each modifying cte.
(Note: This behaviour seems like a bug to be honest - unless I missed something in the documentation.)
Regarding your last question, setting isolation level to SERIALIZABLE would not have solved the issue as the whole operation is one statement, with 3 sub-statements. You could however, split the actions into 2 or 3 statements and then they would be executed one after the other. So, the 2nd would see the results of the 1st and the 3rd, the results of the first two. (If you do that, put the 2 or 3 statements inside a transaction, to isolate the operation from other executing statements.)
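A minimal sketch of that multi-statement variant, assuming the table and column names from the question and the literal 1 in place of the $1 parameter:

```sql
BEGIN;

-- 1st: copy the rows that are about to be replaced.
INSERT INTO other_table (column_a, column_b)
SELECT column_a, column_b
FROM   main_table
WHERE  column_a = 1;

-- 2nd: delete them; this statement sees the completed copy.
DELETE FROM main_table
WHERE  column_a = 1;

-- 3rd: write the replacements; the deleted rows are no longer visible,
-- so they cannot collide with the unique constraint.
INSERT INTO main_table (column_a, column_b)
SELECT column_a, column_b
FROM   another_table
WHERE  column_a = 1;

COMMIT;
```

This costs extra round trips compared to a single statement, but each step sees the effects of the previous one.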

Another way, closer to your original query, would be to use the RETURNING clause to force the execution of the sub-statements in a specific order, i.e. the 3rd after the 2nd (the 1st can stay without RETURNING and be executed concurrently). Test in SQLFiddle-3:

WITH copy_to_other_table AS (
    INSERT INTO other_table (column_a, column_b) 
        SELECT column_a, column_b 
            FROM main_table
        WHERE column_a = 1
),
main_table_deleted AS (
    DELETE FROM main_table WHERE column_a = 1        
        RETURNING *
)
INSERT INTO main_table (column_a, column_b) 
        SELECT column_a, column_b 
            FROM another_table WHERE column_a = 1
        EXCEPT 
        (TABLE main_table_deleted EXCEPT TABLE another_table) ;

or slightly improved by doing the delete (2nd) CTE first and then using its RETURNING output in both of the other two:

WITH main_table_deleted AS (
    DELETE FROM main_table WHERE column_a = 1        
        RETURNING *
),
copy_to_other_table AS (
    INSERT INTO other_table (column_a, column_b) 
        TABLE  main_table_deleted
)
INSERT INTO main_table (column_a, column_b) 
        SELECT column_a, column_b 
            FROM another_table WHERE column_a = 1
        EXCEPT 
        (TABLE main_table_deleted EXCEPT TABLE another_table) ;

"Simple" Solution

SELECT DISTINCT ON (1)
       n.number, p.code
FROM   num n
JOIN   prefix p ON right(n.number, -1) LIKE (p.code || '%')
ORDER  BY n.number, p.code DESC;

Key elements:

DISTINCT ON is a Postgres extension of the SQL standard DISTINCT. Find a detailed explanation for the used query technique in this related answer on SO.
ORDER BY p.code DESC picks the longest match, because '1234' sorts after '123' (in ascending order).
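To make the two key elements concrete, here is a tiny self-contained sketch. The tables num_demo and prefix_demo and their contents are made up for illustration; they only mirror the assumed layout of num and prefix (a text column each, with one leading noise character in number).

```sql
-- Hypothetical demo tables mirroring num(number) and prefix(code).
CREATE TEMP TABLE num_demo    (number text);
CREATE TEMP TABLE prefix_demo (code text);

INSERT INTO num_demo    VALUES ('x12345');          -- 'x' is the noise character
INSERT INTO prefix_demo VALUES ('12'), ('123'), ('9');

SELECT DISTINCT ON (1)
       n.number, p.code
FROM   num_demo n
JOIN   prefix_demo p ON right(n.number, -1) LIKE (p.code || '%')
ORDER  BY n.number, p.code DESC;
-- Both '12' and '123' match '12345'; sorting code descending puts '123'
-- first, and DISTINCT ON keeps only that first row per number.
```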

Simple SQL Fiddle.

Without an index, the query would run for a very long time (I didn't wait to see it finish). To make this fast, you need index support. The trigram indexes you mentioned, supplied by the additional module pg_trgm, are a good candidate. You have to choose between a GIN and a GiST index. The first character of the numbers is just noise and can be excluded from the index, making it a functional index in addition.
In my tests, a functional trigram GIN index won the race over a trigram GiST index (as expected):

CREATE INDEX num_trgm_gin_idx ON num USING gin (right(number, -1) gin_trgm_ops);

Advanced dbfiddle here.

All test results are from a local Postgres 9.1 test installation with a reduced setup: 17k numbers and 2k codes:

Total runtime: 1719.552 ms (trigram GiST)
Total runtime: 912.329 ms (trigram GIN)

Much faster yet

Failed attempt with `text_pattern_ops`

Once we ignore the distracting first noise character, it comes down to basic left anchored pattern match. Therefore I tried a functional B-tree index with the operator class text_pattern_ops (assuming column type text).

CREATE INDEX num_text_pattern_idx ON num(right(number, -1) text_pattern_ops);

This works excellently for direct queries with a single search term and makes the trigram index look bad in comparison:

SELECT * FROM num WHERE right(number, -1) LIKE '2345%'

Total runtime: 3.816 ms (trgm_gin_idx)
Total runtime: 0.147 ms (text_pattern_idx)

However, the query planner will not consider this index for joining two tables. I have seen this limitation before. I don't have a meaningful explanation for this, yet.
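One way to check which index the planner actually picks for the join is to look at the plan. A sketch, assuming the num and prefix tables and the indexes created above exist; the output depends on your data, statistics, and Postgres version, so none is shown here:

```sql
-- Inspect the plan for the join; look for the index names in the output.
EXPLAIN
SELECT n.number, p.code
FROM   num n
JOIN   prefix p ON right(n.number, -1) LIKE (p.code || '%');
```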

Partial / functional B-tree indexes

The alternative is to use equality checks on partial strings with partial indexes. This can be used in a JOIN.

Since we typically only have a limited number of different lengths for prefixes, we can build a solution similar to the one presented here with partial indexes.

Say, we have prefixes ranging from 1 to 5 characters. Create a number of partial functional indexes, one for every distinct prefix length:

CREATE INDEX prefix_code_idx5 ON prefix(code) WHERE length(code) = 5;
CREATE INDEX prefix_code_idx4 ON prefix(code) WHERE length(code) = 4;
CREATE INDEX prefix_code_idx3 ON prefix(code) WHERE length(code) = 3;
CREATE INDEX prefix_code_idx2 ON prefix(code) WHERE length(code) = 2;
CREATE INDEX prefix_code_idx1 ON prefix(code) WHERE length(code) = 1;

Since these are partial indexes, all of them together are barely larger than a single complete index.

Add matching indexes for numbers (taking the leading noise character into account):

CREATE INDEX num_number_idx5 ON num(substring(number, 2, 5)) WHERE length(number) >= 6;
CREATE INDEX num_number_idx4 ON num(substring(number, 2, 4)) WHERE length(number) >= 5;
CREATE INDEX num_number_idx3 ON num(substring(number, 2, 3)) WHERE length(number) >= 4;
CREATE INDEX num_number_idx2 ON num(substring(number, 2, 2)) WHERE length(number) >= 3;
CREATE INDEX num_number_idx1 ON num(substring(number, 2, 1)) WHERE length(number) >= 2;

While these indexes only hold a substring each and are partial, each covers most or all of the table. So they are much larger together than a single total index - except for long numbers. And they impose more work for write operations. That's the cost for amazing speed.

If that cost is too high for you (write performance is important / too many write operations / disk space an issue), you can skip these indexes. The rest is still faster, if not quite as fast as it could be ...

If numbers are never shorter than n characters, drop redundant WHERE clauses from some or all of these indexes, and also drop the corresponding WHERE clause from all following queries.

Recursive CTE

With all the setup so far, I was hoping for a very elegant solution with a recursive CTE:

WITH RECURSIVE cte AS (
   SELECT n.number, p.code, 4 AS len
   FROM   num n
   LEFT    JOIN prefix p
            ON  substring(number, 2, 5) = p.code
            AND length(n.number) >= 6  -- incl. noise character
            AND length(p.code) = 5

   UNION ALL 
   SELECT c.number, p.code, len - 1
   FROM    cte c
   LEFT   JOIN prefix p
            ON  substring(number, 2, c.len) = p.code
            AND length(c.number) >= c.len+1  -- incl. noise character
            AND length(p.code) = c.len
   WHERE    c.len > 0
   AND    c.code IS NULL
   )
SELECT number, code
FROM   cte
WHERE  code IS NOT NULL;

Total runtime: 1045.115 ms

However, while this query isn't bad (it performs about as well as the simple version with a trigram GIN index), it doesn't deliver what I was aiming for. The recursive term is planned once only, so it can't use the best indexes. Only the non-recursive term can.

UNION ALL

Since we are dealing with a small number of recursions, we can just spell them out iteratively. This allows optimized plans for each of them. (We lose the recursive exclusion of already successful numbers, though. So there is still some room for improvement, especially for a wider range of prefix lengths.):

SELECT DISTINCT ON (1) number, code
FROM  (
   SELECT n.number, p.code
   FROM   num n
   JOIN   prefix p
            ON  substring(number, 2, 5) = p.code
            AND length(n.number) >= 6  -- incl. noise character
            AND length(p.code) = 5
   UNION ALL 
   SELECT n.number, p.code
   FROM   num n
   JOIN   prefix p
            ON  substring(number, 2, 4) = p.code
            AND length(n.number) >= 5
            AND length(p.code) = 4
   UNION ALL 
   SELECT n.number, p.code
   FROM   num n
   JOIN   prefix p
            ON  substring(number, 2, 3) = p.code
            AND length(n.number) >= 4
            AND length(p.code) = 3
   UNION ALL 
   SELECT n.number, p.code
   FROM   num n
   JOIN   prefix p
            ON  substring(number, 2, 2) = p.code
            AND length(n.number) >= 3
            AND length(p.code) = 2
   UNION ALL 
   SELECT n.number, p.code
   FROM   num n
   JOIN   prefix p
            ON  substring(number, 2, 1) = p.code
            AND length(n.number) >= 2
            AND length(p.code) = 1
   ) x
ORDER BY number, code DESC;

Total runtime: 57.578 ms (!!)

A breakthrough, finally!

SQL function

Wrapping this into an SQL function removes the query planning overhead for repeated use:

CREATE OR REPLACE FUNCTION f_longest_prefix()
  RETURNS TABLE (number text, code text) LANGUAGE sql AS
$func$
SELECT DISTINCT ON (1) number, code
FROM  (
   SELECT n.number, p.code
   FROM   num n
   JOIN   prefix p
            ON  substring(number, 2, 5) = p.code
            AND length(n.number) >= 6  -- incl. noise character
            AND length(p.code) = 5
   UNION ALL 
   SELECT n.number, p.code
   FROM   num n
   JOIN   prefix p
            ON  substring(number, 2, 4) = p.code
            AND length(n.number) >= 5
            AND length(p.code) = 4
   UNION ALL 
   SELECT n.number, p.code
   FROM   num n
   JOIN   prefix p
            ON  substring(number, 2, 3) = p.code
            AND length(n.number) >= 4
            AND length(p.code) = 3
   UNION ALL 
   SELECT n.number, p.code
   FROM   num n
   JOIN   prefix p
            ON  substring(number, 2, 2) = p.code
            AND length(n.number) >= 3
            AND length(p.code) = 2
   UNION ALL 
   SELECT n.number, p.code
   FROM   num n
   JOIN   prefix p
            ON  substring(number, 2, 1) = p.code
            AND length(n.number) >= 2
            AND length(p.code) = 1
   ) x
ORDER BY number, code DESC
$func$;

Call:

SELECT * FROM f_longest_prefix();

Total runtime: 17.138 ms (!!!)

PL/pgSQL function with dynamic SQL

This plpgsql function is much like the recursive CTE above, but the dynamic SQL with EXECUTE forces the query to be re-planned for every iteration. Now it makes use of all the tailored indexes.

Additionally this works for any range of prefix lengths. The function takes two parameters for the range, but I prepared it with DEFAULT values, so it works without explicit parameters, too:

CREATE OR REPLACE FUNCTION f_longest_prefix2(_min int = 1, _max int = 5)
  RETURNS TABLE (number text, code text) LANGUAGE plpgsql AS
$func$
BEGIN
FOR i IN REVERSE _max .. _min LOOP  -- longer matches first
   RETURN QUERY EXECUTE '
   SELECT n.number, p.code
   FROM   num n
   JOIN   prefix p
            ON  substring(n.number, 2, $1) = p.code
            AND length(n.number) >= $1+1  -- incl. noise character
            AND length(p.code) = $1'
   USING i;
END LOOP;
END
$func$;

The final step cannot be wrapped into the one function easily. Either just call it like this:

SELECT DISTINCT ON (1)
       number, code
FROM   f_longest_prefix2() x
ORDER  BY number, code DESC;

Total runtime: 27.413 ms

Or use another SQL function as wrapper:

CREATE OR REPLACE FUNCTION f_longest_prefix3(_min int = 1, _max int = 5)
  RETURNS TABLE (number text, code text) LANGUAGE sql AS
$func$
SELECT DISTINCT ON (1)
       number, code
FROM   f_longest_prefix2($1, $2) x
ORDER  BY number, code DESC
$func$;

Call:

SELECT * FROM f_longest_prefix3();

Total runtime: 37.622 ms

A bit slower due to required planning overhead. But more versatile than SQL and shorter for longer prefixes.
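Since both parameters have defaults, the range can also be passed explicitly. A usage sketch; the values 2 and 4 are arbitrary and only illustrate the call:

```sql
-- Only consider prefixes of 2 to 4 characters (_min = 2, _max = 4).
SELECT * FROM f_longest_prefix3(2, 4);
```

Matching partial indexes for each length in that range need to exist for this to be fast.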

PostgreSQL – How to Delete Rows When Joining and Returning Data

Very simple indeed, but you do need to include the other WHERE clause as well:

DELETE FROM batch bp
USING  sender_log sl
WHERE  bp.log_id = sl.id
AND    bp.protocol = 'someprotocol'
RETURNING bp.*, sl.*;

And to actually return what your question outlines, you need to include both tables in the RETURNING clause.
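If both tables share column names, bp.* and sl.* return duplicate names in the result. A sketch that lists columns explicitly with aliases instead, using only the columns visible in the statement above (the alias names are made up):

```sql
DELETE FROM batch bp
USING  sender_log sl
WHERE  bp.log_id = sl.id
AND    bp.protocol = 'someprotocol'
RETURNING bp.log_id   AS batch_log_id,     -- from the deleted batch row
          bp.protocol AS batch_protocol,
          sl.id       AS sender_log_id;    -- from the joined sender_log row
```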