Postgresql – Varying query runtime in tag based reporting and aggregation system

performancepostgresqlquery-performance

I'm evaluating an approach for reporting in a tag based time tracking system. I'm struggling to understand varying query runtimes or coming up with a different system to produce the desired query results.

Schema:

CREATE TABLE intervals_tags (
    interval_id integer NOT NULL,
    interval_duration integer NOT NULL,
    tag_type character varying NOT NULL,
    tag character varying NOT NULL
);

CREATE INDEX index_interval_id ON intervals_tags(interval_id int4_ops);
CREATE INDEX index_tag_type ON intervals_tags(tag_type text_ops);
CREATE INDEX index_tag ON intervals_tags(tag text_ops);

At the planned full capacity the intervals_tags table holds around 2 million rows and the goal is to aggregate intervals based on different tag based criterions. One such query is "Duration per user for a specific client":

SELECT
  tag AS user,
  SUM(interval_duration)
FROM intervals_tags
INNER JOIN (
  SELECT
    interval_id
   FROM intervals_tags
   WHERE tag = 'client:60'
) tt1
ON intervals_tags.interval_id = tt1.interval_id
WHERE tag_type = 'user'
GROUP BY tag;

With my dataset the query returns in 20ms. When I instead pick the client tag client:84 the query takes over 500ms to complete.

Questions

  1. Why is there such a big difference? (The only difference is the amount of rows per tag. client:60 has around 2.000 rows in the table, and client:84 stands at 10.000 rows.)

  2. What can I do to improve performance? (The goals is a P99 of 500ms.)

  3. What schema would be better suited for such tag based queries?

Explain for client:60

GroupAggregate  (cost=7681.25..7695.20 rows=797 width=19) (actual time=18.695..19.373 rows=15 loops=1)
  Group Key: intervals_tags.tag
  ->  Sort  (cost=7681.25..7683.24 rows=797 width=15) (actual time=18.638..18.925 rows=1987 loops=1)
        Sort Key: intervals_tags.tag
        Sort Method: quicksort  Memory: 142kB
        ->  Nested Loop  (cost=21.99..7642.84 rows=797 width=15) (actual time=0.779..14.715 rows=1987 loops=1)
              ->  Bitmap Heap Scan on intervals_tags intervals_tags_1  (cost=21.57..2207.74 rows=663 width=4) (actual time=0.766..3.191 rows=1987 loops=1)
                    Recheck Cond: ((tag)::text = 'client:60'::text)
                    Heap Blocks: exact=1848
                    ->  Bitmap Index Scan on index_tag  (cost=0.00..21.40 rows=663 width=0) (actual time=0.458..0.458 rows=1987 loops=1)
                          Index Cond: ((tag)::text = 'client:60'::text)
              ->  Index Scan using index_interval_id on intervals_tags  (cost=0.43..8.19 rows=1 width=19) (actual time=0.005..0.005 rows=1 loops=1987)
                    Index Cond: (interval_id = intervals_tags_1.interval_id)
                    Filter: ((tag_type)::text = 'user'::text)
                    Rows Removed by Filter: 3
Planning time: 0.348 ms
Execution time: 19.427 ms

Explain for client:84

HashAggregate  (cost=61532.59..61544.24 rows=1165 width=19) (actual time=551.934..551.987 rows=15 loops=1)
  Group Key: intervals_tags.tag
  ->  Hash Join  (cost=41533.65..61466.55 rows=13208 width=15) (actual time=444.025..547.858 rows=9958 loops=1)
        Hash Cond: (intervals_tags_1.interval_id = intervals_tags.interval_id)
        ->  Bitmap Heap Scan on intervals_tags intervals_tags_1  (cost=289.58..14421.85 rows=10987 width=4) (actual time=4.059..22.508 rows=9958 loops=1)
              Recheck Cond: ((tag)::text = 'client:84'::text)
              Heap Blocks: exact=7210
              ->  Bitmap Index Scan on index_tag  (cost=0.00..286.83 rows=10987 width=0) (actual time=2.478..2.478 rows=9958 loops=1)
                    Index Cond: ((tag)::text = 'client:84'::text)
        ->  Hash  (cost=32230.19..32230.19 rows=490951 width=19) (actual time=439.674..439.674 rows=500000 loops=1)
              Buckets: 65536  Batches: 8  Memory Usage: 3449kB
              ->  Bitmap Heap Scan on intervals_tags  (cost=11813.30..32230.19 rows=490951 width=19) (actual time=77.849..258.581 rows=500000 loops=1)
                    Recheck Cond: ((tag_type)::text = 'user'::text)
                    Heap Blocks: exact=14280
                    ->  Bitmap Index Scan on index_tag_type  (cost=0.00..11690.56 rows=490951 width=0) (actual time=74.719..74.719 rows=500000 loops=1)
                          Index Cond: ((tag_type)::text = 'user'::text)
Planning time: 0.359 ms
Execution time: 552.223 ms

Best Answer

SELECT tag AS user,
       SUM(interval_duration)
  FROM intervals_tags
  WHERE interval_id in (
    select interval_id from intervals_tags
      WHERE tag = 'client:60' )
  and tag_type = 'user'
  GROUP BY tag;