PostgreSQL – Reduce by Key if Key is Not Null

index-tuning, optimization, performance, postgresql, postgresql-performance

I need to select N rows with seq > X, keeping every row whose reduce_key is null, and for each group of rows with the same reduce_key keeping only the one with the biggest seq. The result should be strictly ordered by seq (with rows sharing a reduce_key filtered out except the one with the biggest seq) and limited to N.
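
For illustration, with hypothetical rows (seq, reduce_key) for one user:

 seq | reduce_key
-----+------------
  10 | NULL
  11 | a
  12 | a
  13 | NULL
  14 | b

the expected result keeps seq 10, 12 (the biggest seq for key a), 13 and 14, still ordered by seq.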

Here is a query which does that:

explain analyze SELECT
    x2."user_id",
    x2."seq",
    x2."timestamp",
    x2."reduce_key",
    x2."mapping"
FROM
    "user_sequence" x2
WHERE
    (
        (x2."user_id" = 860862404)
        AND (x2."seq" > 3562974)
    )
AND (
    (x2."reduce_key" IS NULL)
    OR (
        x2."seq" IN (
            SELECT
                MAX ("seq")
            FROM
                "user_sequence"
            WHERE
                ("user_id" = 860862404)
            AND (
                ("seq" >= x2."seq")
                AND ("reduce_key" = x2."reduce_key")
            )
            GROUP BY
                "reduce_key" 
        )
    )
)
ORDER BY
    x2."seq"
LIMIT 1000

The problem is that it takes ~70 ms to execute on a table with tens of millions of rows. Is there a more efficient way to reduce rows by key while keeping the rows with a null key?

Table structure:

actor=# \d user_sequence;
   Table "public.user_sequence"
   Column   |  Type   | Modifiers 
------------+---------+-----------
 user_id    | integer | not null
 seq        | integer | not null
 timestamp  | bigint  | not null
 mapping    | bytea   | not null
 reduce_key | text    | 
Indexes:
    "user_sequence_pkey" PRIMARY KEY, btree (user_id, seq)
    "user_sequence_user_id_reduce_key_seq_idx" btree (user_id, reduce_key, seq)

EXPLAIN ANALYZE:

 Limit  (cost=0.57..231389.06 rows=1000 width=103) (actual time=0.408..74.754 rows=308 loops=1)
   ->  Index Scan using user_sequence_pkey on user_sequence x2  (cost=0.57..24444343.74 rows=105642 width=103) (actual time=0.406..74.645 rows=308 loops=1)
         Index Cond: ((user_id = 860862404) AND (seq > 3562974))
         Filter: ((reduce_key IS NULL) OR (SubPlan 1))
         Rows Removed by Filter: 692
         SubPlan 1
           ->  GroupAggregate  (cost=0.57..447.59 rows=1 width=33) (actual time=0.096..0.097 rows=1 loops=730)
                 Group Key: user_sequence.reduce_key
                 ->  Index Only Scan using user_sequence_user_id_reduce_key_seq_idx on user_sequence  (cost=0.57..447.03 rows=110 width=33) (actual time=0.036..0.081 rows=68 loops=730)
                       Index Cond: ((user_id = 860862404) AND (reduce_key = x2.reduce_key) AND (seq >= x2.seq))
                       Heap Fetches: 49371
 Planning time: 0.808 ms
 Execution time: 74.890 ms
(13 rows)

I think the problem is with the filter, but how can I make it use an index?

I am using PostgreSQL 9.4.

Best Answer

Assuming you want what your current query does (which seems different from what your description says).

You ask:

I think that problem is with filter but how to make it use index?

Your query plan shows that Postgres is already using an index at every step. The added Filter step only removes rows that fail your additional predicates.

I posted a different idea at first that was missing ORDER BY seq. This query using a simple NOT EXISTS anti-semi-join should be a better equivalent:

SELECT user_id
     , seq
     , timestamp
     , reduce_key
     , mapping
FROM   user_sequence u
WHERE  user_id = 860862404
AND    seq > 3562974
AND    NOT EXISTS (
   SELECT 1
   FROM   user_sequence
   WHERE  user_id = u.user_id
   AND    reduce_key = u.reduce_key
   AND    seq > u.seq
   )
ORDER  BY seq
LIMIT  1000;

Rows with reduce_key IS NULL pass the NOT EXISTS test automatically, because reduce_key = u.reduce_key is never true when u.reduce_key is NULL, so they are all kept. Your existing indexes look good for it. The second one might be a bit more efficient with descending sort order for the last column:

user_sequence_user_id_reduce_key_seq_idx btree (user_id, reduce_key, seq DESC)
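
If you want to try it, a sketch of that index (the name here is made up):

CREATE INDEX user_sequence_user_id_reduce_key_seq_desc_idx
    ON user_sequence (user_id, reduce_key, seq DESC);

You could create it alongside the existing index and compare the plans before dropping either one.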

Normally, there would be a simpler alternative with DISTINCT ON, but that would fold all rows with reduce_key IS NULL into a single row while you want them all.
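
For comparison, a DISTINCT ON sketch (shown only to illustrate why it does not fit here: it folds all rows with reduce_key IS NULL into a single row and orders by reduce_key first instead of plain seq):

SELECT DISTINCT ON (reduce_key)
       user_id, seq, timestamp, reduce_key, mapping
FROM   user_sequence
WHERE  user_id = 860862404
AND    seq > 3562974
ORDER  BY reduce_key, seq DESC
LIMIT  1000;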

Aside: You have a lot of unnecessary parentheses and double-quotes. Maybe an inefficient ORM?