Postgresql – Complex “activity” based query optimization

optimizationpostgresqlpostgresql-9.2

I'm working on a query in the same style as Facebook's feed on a table that currently has 500,000 rows. For now I am just looking to get the 20 newest items (new records created by a user's followees, records a user's followees have shown interest in) sorted by whichever action was the most recent (add, poke, etc.) with no duplicates from the main table. Eventually there with be other types of activity that will influence the results, some of which must a CTE. Being able to paginate the results is a requirement.

I have a query written that will do this, but when the user's followees have quite a bit of activity, the query becomes too slow to be usable (for instance, when a user's followees have added 9000 records, it take 2000 ms to run)

WITH
    my_followees AS (SELECT user_id, followee_id FROM user_followees WHERE user_id = 91884),
    new_events AS
        (SELECT
            events.*,
            events.date_added AS sort_date
        FROM
            events
            JOIN my_followees ON my_followees.followee_id = events.user_id
        WHERE
            active = true
            AND public = true),
    recently_rsvpd_events AS (
        SELECT DISTINCT ON (events.id)
            events.*,
            event_attendees.date_added AS sort_date
        FROM
            events
            JOIN event_attendees ON event_attendees.event_id = events.id
            JOIN my_followees ON my_followees.followee_id = event_attendees.user_id
        WHERE
            active = true
            AND public = true
        ORDER BY
            events.id,
            event_attendees.date_added DESC)
SELECT --DISTINCT ON (events.id)
    events.*
FROM
    ((SELECT * FROM recently_rsvpd_events) UNION (SELECT * FROM new_events)) AS events
ORDER BY
    --events.id,
    sort_date DESC
LIMIT 20;

Query plan:

 Limit  (cost=1058.01..1058.06 rows=20 width=1356) (actual time=1755.893..1756.513 rows=20 loops=1)
   CTE my_followees
     ->  Seq Scan on user_followees  (cost=0.00..1.06 rows=2 width=8) (actual time=0.118..0.218 rows=3 loops=1)
           Filter: (user_id = 91884)
           Rows Removed by Filter: 3
   CTE new_events
     ->  Nested Loop  (cost=5.30..1014.49 rows=214 width=986) (actual time=2.991..402.613 rows=9448 loops=1)
           ->  CTE Scan on my_followees  (cost=0.00..0.04 rows=2 width=4) (actual time=0.032..0.066 rows=3 loops=1)
           ->  Bitmap Heap Scan on events  (cost=5.30..506.16 rows=107 width=986) (actual time=1.364..52.863 rows=3149 loops=3)
                 Recheck Cond: (user_id = my_followees.followee_id)
                 Filter: (active AND public)
                 Rows Removed by Filter: 996
                 ->  Bitmap Index Scan on events_user_id_idx  (cost=0.00..5.28 rows=130 width=0) (actual time=1.022..1.022 rows=4145 loops=3)
                       Index Cond: (user_id = my_followees.followee_id)
   CTE recently_rsvpd_events
     ->  Unique  (cost=17.80..17.81 rows=2 width=994) (actual time=1.815..1.815 rows=0 loops=1)
           ->  Sort  (cost=17.80..17.80 rows=2 width=994) (actual time=1.754..1.754 rows=0 loops=1)
                 Sort Key: public.events.id, event_attendees.date_added
                 Sort Method: quicksort  Memory: 25kB
                 ->  Nested Loop  (cost=0.00..17.79 rows=2 width=994) (actual time=1.649..1.649 rows=0 loops=1)
                       Join Filter: (event_attendees.user_id = my_followees.followee_id)
                       Rows Removed by Join Filter: 3
                       ->  CTE Scan on my_followees  (cost=0.00..0.04 rows=2 width=4) (actual time=0.185..0.476 rows=3 loops=1)
                       ->  Materialize  (cost=0.00..17.69 rows=2 width=998) (actual time=0.171..0.264 rows=1 loops=3)
                             ->  Nested Loop  (cost=0.00..17.68 rows=2 width=998) (actual time=0.332..0.490 rows=1 loops=1)
                                   ->  Seq Scan on event_attendees  (cost=0.00..1.02 rows=2 width=16) (actual time=0.036..0.098 rows=2 loops=1)
                                   ->  Index Scan using events_pkey on events  (cost=0.00..8.32 rows=1 width=986) (actual time=0.049..0.066 rows=0 loops=2)
                                         Index Cond: (id = event_attendees.event_id)
                                         Filter: (active AND public)
                                         Rows Removed by Filter: 0
   ->  Sort  (cost=24.65..25.19 rows=216 width=1356) (actual time=1755.848..1756.058 rows=20 loops=1)
         Sort Key: recently_rsvpd_events.sort_date
         Sort Method: top-N heapsort  Memory: 50kB
         ->  HashAggregate  (cost=14.58..16.74 rows=216 width=1356) (actual time=1317.217..1642.807 rows=9448 loops=1)
               ->  Append  (cost=0.00..6.48 rows=216 width=1356) (actual time=5.046..1129.053 rows=9448 loops=1)
                     ->  CTE Scan on recently_rsvpd_events  (cost=0.00..0.04 rows=2 width=1356) (actual time=1.879..1.879 rows=0 loops=1)
                     ->  CTE Scan on new_events  (cost=0.00..4.28 rows=214 width=1356) (actual time=3.075..861.290 rows=9448 loops=1)
 Total runtime: 1760.640 ms

I'm not concerned about the sequential scans being done on the event_attendees or user_followees tables, since they only have 2 and 6 records respectively at the moment.

Using subqueries instead of CTE cuts the execution time almost in half, which is still not good enough:

 Limit  (cost=1053.69..1053.74 rows=20 width=1356) (actual time=1156.989..1157.601 rows=20 loops=1)
   CTE my_followees
     ->  Seq Scan on user_followees  (cost=0.00..1.06 rows=2 width=8) (actual time=0.031..0.065 rows=3 loops=1)
           Filter: (user_id = 91884)
           Rows Removed by Filter: 3
   ->  Sort  (cost=1052.63..1053.17 rows=216 width=1356) (actual time=1156.967..1157.171 rows=20 loops=1)
         Sort Key: event_attendees.date_added
         Sort Method: top-N heapsort  Memory: 50kB
         ->  HashAggregate  (cost=1042.56..1044.72 rows=216 width=986) (actual time=718.326..1042.587 rows=9448 loops=1)
               ->  Append  (cost=17.80..1034.46 rows=216 width=986) (actual time=1.609..569.446 rows=9448 loops=1)
                     ->  Unique  (cost=17.80..17.81 rows=2 width=994) (actual time=0.601..0.601 rows=0 loops=1)
                           ->  Sort  (cost=17.80..17.80 rows=2 width=994) (actual time=0.579..0.579 rows=0 loops=1)
                                 Sort Key: public.events.id, event_attendees.date_added
                                 Sort Method: quicksort  Memory: 25kB
                                 ->  Nested Loop  (cost=0.00..17.79 rows=2 width=994) (actual time=0.541..0.541 rows=0 loops=1)
                                       Join Filter: (event_attendees.user_id = my_followees.followee_id)
                                       Rows Removed by Join Filter: 3
                                       ->  CTE Scan on my_followees  (cost=0.00..0.04 rows=2 width=4) (actual time=0.054..0.154 rows=3 loops=1)
                                       ->  Materialize  (cost=0.00..17.69 rows=2 width=998) (actual time=0.054..0.087 rows=1 loops=3)
                                             ->  Nested Loop  (cost=0.00..17.68 rows=2 width=998) (actual time=0.114..0.168 rows=1 loops=1)
                                                   ->  Seq Scan on event_attendees  (cost=0.00..1.02 rows=2 width=16) (actual time=0.012..0.034 rows=2 loops=1)
                                                   ->  Index Scan using events_pkey on events  (cost=0.00..8.32 rows=1 width=986) (actual time=0.017..0.023 rows=0 loops=2)
                                                         Index Cond: (id = event_attendees.event_id)
                                                         Filter: (active AND public)
                                                         Rows Removed by Filter: 0
                     ->  Nested Loop  (cost=5.30..1014.49 rows=214 width=986) (actual time=0.978..327.947 rows=9448 loops=1)
                           ->  CTE Scan on my_followees  (cost=0.00..0.04 rows=2 width=4) (actual time=0.011..0.048 rows=3 loops=1)
                           ->  Bitmap Heap Scan on events  (cost=5.30..506.16 rows=107 width=986) (actual time=0.722..40.484 rows=3149 loops=3)
                                 Recheck Cond: (user_id = my_followees.followee_id)
                                 Filter: (active AND public)
                                 Rows Removed by Filter: 996
                                 ->  Bitmap Index Scan on events_user_id_idx  (cost=0.00..5.28 rows=130 width=0) (actual time=0.516..0.516 rows=4145 loops=3)
                                       Index Cond: (user_id = my_followees.followee_id)
 Total runtime: 1160.166 ms

Can a query like this be optimized in a meaningful way? Are there PostgreSQL features I'm not aware of that could help me out here? I don't see materialized views as being an option, since I am expecting a fair bit of activity to be happening throughout the day for some of the more active users.

Best Answer

The best I could come up with for optimizing this query is to limit the results up front by age. For the new_events CTE, an event that's over 3 months old isn't very new:

WITH
    my_followees AS (SELECT user_id, followee_id FROM user_followees WHERE user_id = 91884),
    new_events AS
        (SELECT
            events.*,
            events.date_added AS sort_date
        FROM
            events
            JOIN user_followees ON user_followees.followee_id = events.user_id
        WHERE
            active = true
            AND public = true
            AND events.date_added > now() - interval '3 months'),
    recently_rsvpd_events AS (
        SELECT
            events.*,
            event_attendees.date_added AS sort_date
        FROM
            events
            JOIN event_attendees ON event_attendees.event_id = events.id
            JOIN my_followees ON my_followees.followee_id = event_attendees.user_id
        WHERE
            active = true
            AND public = true
            AND event_attendees.date_added > now() - interval '3 months'),
    selected_events AS
        (SELECT DISTINCT ON (events.id)
            events.*
        FROM
            ((SELECT * FROM recently_rsvpd_events) UNION (SELECT * FROM new_events)) AS events
        ORDER BY
            events.id,
            events.sort_date DESC)
SELECT * FROM selected_events ORDER BY sort_date DESC LIMIT 20;

Followees would have to be exceptionally active for this query to be anywhere close to as slow as it was before (while the followees of this particular user have posted over 9000 events, they only posted 383 within a 3 month period). A few extra indexes helped, too.

New query plan:

 Limit  (cost=221.10..221.15 rows=20 width=1356) (actual time=225.795..226.409 rows=20 loops=1)
   CTE my_followees
     ->  Seq Scan on user_followees  (cost=0.00..1.07 rows=3 width=8) (actual time=0.046..0.115 rows=3 loops=1)
           Filter: (user_id = 91884)
           Rows Removed by Filter: 3
   CTE new_events
     ->  Nested Loop  (cost=0.01..51.14 rows=61 width=979) (actual time=0.356..0.356 rows=0 loops=1)
           ->  Seq Scan on user_followees  (cost=0.00..1.06 rows=6 width=4) (actual time=0.013..0.077 rows=6 loops=1)
           ->  Index Scan using events_user_id_active_public_date_added_desc_idx on events  (cost=0.01..8.34 rows=1 width=979) (actual time=0.016..0.016 rows=0 loops=6)
                 Index Cond: ((user_id = public.user_followees.followee_id) AND (date_added > (now() - '3 mons'::interval)))
   CTE recently_rsvpd_events
     ->  Nested Loop  (cost=0.10..156.12 rows=14 width=987) (actual time=0.861..127.241 rows=857 loops=1)
           ->  Hash Join  (cost=0.10..33.87 rows=17 width=12) (actual time=0.746..52.563 rows=1000 loops=1)
                 Hash Cond: (event_attendees.user_id = my_followees.followee_id)
                 ->  Seq Scan on event_attendees  (cost=0.00..28.91 rows=1252 width=16) (actual time=0.091..17.317 rows=1252 loops=1)
                       Filter: (date_added > (now() - '3 mons'::interval))
                 ->  Hash  (cost=0.06..0.06 rows=3 width=4) (actual time=0.428..0.428 rows=3 loops=1)
                       Buckets: 1024  Batches: 1  Memory Usage: 1kB
                       ->  CTE Scan on my_followees  (cost=0.00..0.06 rows=3 width=4) (actual time=0.092..0.315 rows=3 loops=1)
           ->  Index Scan using events_pkey on events  (cost=0.00..7.18 rows=1 width=979) (actual time=0.018..0.029 rows=1 loops=1000)
                 Index Cond: (id = event_attendees.event_id)
                 Filter: (active AND public)
                 Rows Removed by Filter: 0
   CTE selected_events
     ->  Unique  (cost=8.90..9.27 rows=75 width=1356) (actual time=197.618..211.242 rows=409 loops=1)
           ->  Sort  (cost=8.90..9.09 rows=75 width=1356) (actual time=197.586..202.127 rows=409 loops=1)
                 Sort Key: recently_rsvpd_events.id, recently_rsvpd_events.sort_date
                 Sort Method: quicksort  Memory: 448kB
                 ->  HashAggregate  (cost=5.06..5.81 rows=75 width=1356) (actual time=187.194..191.963 rows=409 loops=1)
                       ->  Append  (cost=0.00..2.25 rows=75 width=1356) (actual time=0.957..171.673 rows=857 loops=1)
                             ->  CTE Scan on recently_rsvpd_events  (cost=0.00..0.28 rows=14 width=1356) (actual time=0.914..150.859 rows=857 loops=1)
                             ->  CTE Scan on new_events  (cost=0.00..1.22 rows=61 width=1356) (actual time=0.378..0.378 rows=0 loops=1)
   ->  Sort  (cost=3.50..3.68 rows=75 width=1356) (actual time=225.762..225.960 rows=20 loops=1)
         Sort Key: selected_events.sort_date
         Sort Method: top-N heapsort  Memory: 43kB
         ->  CTE Scan on selected_events  (cost=0.00..1.50 rows=75 width=1356) (actual time=197.653..220.903 rows=409 loops=1)
 Total runtime: 227.675 ms