PostgreSQL – Why does a list of 10,000 IDs perform better than using the equivalent SQL to select them?

performance, postgresql, postgresql-performance, query-performance

I have a Rails application with a legacy query that I'd like to refurbish. The current implementation performs two SQL queries: one to get a large number of IDs and a second query that uses those IDs and applies some additional joins and filters to get the desired result.

I am trying to replace this with a single query that avoids the round trip, but doing so has incurred a large performance degradation in my local testing environment (which is a copy of the full production dataset). It appears that an index is not being used in the new query, leading to a full table scan. I had hoped the single query would keep the same performance as the original code, ideally improving on it due to not needing to send all the IDs around.

This is a fairly minimized version of my actual problem. A slightly larger version is discussed at Why does a list of 10,000 IDs perform better in a complicated query with multiple CTEs compared to the equivalent SQL to select them?.

Current Query

There's a query that takes ~6.5 seconds to calculate a list of 10,000+ IDs; it appears as the visible_projects CTE in the "Proposed Query" section below. Those IDs are then fed into this query:

EXPLAIN (ANALYZE, BUFFERS)
WITH visible_projects AS NOT MATERIALIZED (
    SELECT
        id
    FROM
        "projects"
    WHERE
        "projects"."id" IN (
            -- 10000+ IDs removed
)),
visible_tasks AS MATERIALIZED (
    SELECT
        tasks.id
    FROM
        tasks
    WHERE
        tasks.project_id IN (
            SELECT
                id
            FROM
                visible_projects))
SELECT
    COUNT(1)
FROM
    visible_tasks;

Query plan (depesz):

Aggregate  (cost=1309912.31..1309912.32 rows=1 width=8) (actual time=148.661..153.739 rows=1 loops=1)
   Buffers: shared hit=73107 read=22301
   CTE visible_tasks
     ->  Gather  (cost=43024.54..1308639.80 rows=56556 width=4) (actual time=46.337..137.260 rows=48557 loops=1)
           Workers Planned: 2
           Workers Launched: 2
           Buffers: shared hit=73107 read=22301
           ->  Nested Loop  (cost=42024.54..1301984.20 rows=23565 width=4) (actual time=28.871..120.682 rows=16186 loops=3)
                 Buffers: shared hit=73107 read=22301
                 ->  Parallel Bitmap Heap Scan on projects  (cost=42023.97..138877.16 rows=4378 width=4) (actual time=28.621..52.627 rows=3502 loops=3)
                       Recheck Cond: (id = ANY ('{ REMOVED_IDS }'::integer[]))
                       Heap Blocks: exact=3536
                       Buffers: shared hit=30410 read=9833
                       ->  Bitmap Index Scan on projects_pkey  (cost=0.00..42021.35 rows=10507 width=0) (actual time=35.642..35.642 rows=10507 loops=1)
                             Index Cond: (id = ANY ('{ REMOVED_IDS }'::integer[]))
                             Buffers: shared hit=30410 read=1111
                 ->  Index Scan using test_tasks_on_project on tasks  (cost=0.57..263.85 rows=182 width=8) (actual time=0.012..0.018 rows=5 loops=10507)
                       Index Cond: (project_id = projects.id)
                       Buffers: shared hit=42697 read=12468
   ->  CTE Scan on visible_tasks  (cost=0.00..1131.12 rows=56556 width=0) (actual time=46.339..144.641 rows=48557 loops=1)
         Buffers: shared hit=73107 read=22301
 Planning:
   Buffers: shared hit=10 read=10
 Planning Time: 8.857 ms
 Execution Time: 156.102 ms

Proposed Query

This is the same query structure, but instead of inserting the 10,000+ IDs directly into the visible_projects CTE, I've embedded the SQL that finds those IDs.

EXPLAIN (ANALYZE, BUFFERS)
WITH visible_projects AS NOT MATERIALIZED (
    SELECT
        id
    FROM
        "projects"
    WHERE
        "projects"."company_id" = 11171
        AND "projects"."state" < 6
        AND "projects"."is_template" = FALSE),
visible_tasks AS MATERIALIZED (
    SELECT
        tasks.id
    FROM
        tasks
    WHERE
        tasks.project_id IN (
            SELECT
                id
            FROM
                visible_projects))
SELECT
    COUNT(1)
FROM
    visible_tasks;

Query plan (depesz):

 Aggregate  (cost=2212223.53..2212223.54 rows=1 width=8) (actual time=40675.984..40686.708 rows=1 loops=1)
   Buffers: shared hit=118145 read=1567727
   CTE visible_tasks
     ->  Gather  (cost=279353.08..2208430.12 rows=168596 width=4) (actual time=7050.894..40666.025 rows=48557 loops=1)
           Workers Planned: 2
           Workers Launched: 2
           Buffers: shared hit=118145 read=1567727
           ->  Hash Join  (cost=278353.08..2190570.52 rows=70248 width=4) (actual time=7038.932..40650.430 rows=16186 loops=3)
                 Hash Cond: (tasks.project_id = projects.id)
                 Buffers: shared hit=118145 read=1567727
                 ->  Parallel Seq Scan on tasks  (cost=0.00..1828314.43 rows=31963043 width=8) (actual time=0.397..29372.029 rows=25572144 loops=3)
                       Buffers: shared read=1508684
                 ->  Hash  (cost=277961.56..277961.56 rows=31322 width=4) (actual time=6977.480..6977.481 rows=10507 loops=3)
                       Buckets: 32768  Batches: 1  Memory Usage: 626kB
                       Buffers: shared hit=118061 read=59031
                       ->  Index Scan using index_projects_on_company_id on projects  (cost=0.43..277961.56 rows=31322 width=4) (actual time=0.591..6970.696 rows=10507 loops=3)
                             Index Cond: (company_id = 11171)
                             Filter: ((NOT is_template) AND (state < 6))
                             Rows Removed by Filter: 63512
                             Buffers: shared hit=118061 read=59031
   ->  CTE Scan on visible_tasks  (cost=0.00..3371.92 rows=168596 width=0) (actual time=7050.896..40671.054 rows=48557 loops=1)
         Buffers: shared hit=118145 read=1567727
 Planning:
   Buffers: shared hit=2 read=18
 Planning Time: 9.528 ms
 Execution Time: 40687.524 ms

Even accounting for the combined time of the two queries in the current implementation, this single query takes roughly 6x as long.

I can see that the planner has chosen a Parallel Seq Scan on tasks, which is the main contributor to the run time. What I don't understand is why it was chosen, and what I should do to get back to using the index.

Through research, I've learned that Postgres doesn't offer query hints to force the use of an index, so I assume that a good solution will involve demonstrating to the query planner that using the index would be beneficial.
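
One diagnostic I know of (not a fix) is to discourage sequential scans for the current session and re-run the EXPLAIN, to check whether the index-based plan would actually be cheaper in practice:

SET enable_seqscan = off;   -- make sequential scans look prohibitively expensive to the planner
-- re-run the EXPLAIN (ANALYZE, BUFFERS) of the proposed query here
RESET enable_seqscan;       -- restore the default afterwards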

Meta

I'm using COUNT(1) combined with the AS MATERIALIZED / AS NOT MATERIALIZED controls in this question to produce a smaller example.

The larger query in the application does not use these, but it also performs some filtering on the tasks table before producing a number of other CTEs and some aggregate metrics as the final result.

Schema

                                                 Table "public.projects"
           Column           |             Type              | Collation | Nullable |               Default
----------------------------+-------------------------------+-----------+----------+--------------------------------------
 id                         | integer                       |           | not null | nextval('projects_id_seq'::regclass)
 name                       | character varying(255)        |           |          |
 description                | text                          |           |          |
 due                        | timestamp without time zone   |           |          |
 created_at                 | timestamp without time zone   |           | not null |
 updated_at                 | timestamp without time zone   |           | not null |
 client_id                  | integer                       |           |          |
 company_id                 | integer                       |           |          |
 repeat                     | boolean                       |           | not null | true
 end_date                   | timestamp without time zone   |           |          |
 prev_id                    | integer                       |           |          |
 next_id                    | integer                       |           |          |
 completed_tasks_count      | integer                       |           | not null | 0
 tasks_count                | integer                       |           | not null | 0
 done_at                    | timestamp without time zone   |           |          |
 state                      | integer                       |           |          |
 schedule                   | text                          |           |          |
 start_date                 | timestamp without time zone   |           |          |
 manager_id                 | integer                       |           |          |
 partner_id                 | integer                       |           |          |
 exschedule                 | text                          |           |          |
 extdue                     | timestamp without time zone   |           |          |
 is_template                | boolean                       |           | not null | false
 predicted_duration         | integer                       |           |          | 0
 budget                     | integer                       |           |          | 0
 cached_effective_due_date  | timestamp without time zone   |           |          |
 cached_manager_fullname    | character varying(255)        |           |          | ''::character varying
 cached_partner_fullname    | character varying(255)        |           |          | ''::character varying
 cached_staffs_fullnames    | text                          |           |          | ''::text
 cached_staffs_ids          | text                          |           |          | ''::text
 cached_label_ids           | character varying(255)        |           |          | ''::character varying
 date_in                    | timestamp without time zone   |           |          |
 cached_label_sum           | integer                       |           |          | 0
 date_out                   | timestamp without time zone   |           |          |
 turn_around_time           | integer                       |           |          | 0
 dues_calculated_at         | timestamp without time zone   |           |          |
 dues                       | timestamp without time zone[] |           |          |
 dues_rewind                | integer[]                     |           |          |
 quickbooks_item_id         | integer                       |           |          |
 perform_final_review       | boolean                       |           | not null | false
 quickbooks_desktop_item_id | integer                       |           |          |
 billing_model_type         | character varying             |           | not null | 'staff'::character varying
 series_id                  | integer                       |           |          |
 shared                     | boolean                       |           |          | false
Indexes:
    "projects_pkey" PRIMARY KEY, btree (id)
    "index_projects_on_cached_effective_due_date" btree (cached_effective_due_date)
    "index_projects_on_client_id" btree (client_id)
    "index_projects_on_company_id" btree (company_id)
    "index_projects_on_manager_id" btree (manager_id)
    "index_projects_on_next_id" btree (next_id)
    "index_projects_on_partner_id" btree (partner_id)
    "index_projects_on_series_id" btree (series_id)
    "index_projects_on_shared_and_is_template" btree (shared, is_template) WHERE shared = true AND is_template = true
Foreign-key constraints:
    "fk_rails_243d23cb48" FOREIGN KEY (quickbooks_desktop_item_id) REFERENCES quickbooks_desktop_items(id)
    "fk_rails_33ba8711de" FOREIGN KEY (quickbooks_item_id) REFERENCES quickbooks_items(id)
    "fk_rails_fcf0ca7614" FOREIGN KEY (series_id) REFERENCES series(id) NOT VALID
Referenced by:
    TABLE "tasks" CONSTRAINT "tasks_project_id_fkey" FOREIGN KEY (project_id) REFERENCES projects(id)

The projects table has 14,273,833 rows.

  • 124,005 are is_template = true

                                               Table "public.tasks"
         Column          |            Type             | Collation | Nullable |              Default
-------------------------+-----------------------------+-----------+----------+-----------------------------------
 id                      | integer                     |           | not null | nextval('tasks_id_seq'::regclass)
 name                    | character varying(255)      |           |          |
 description             | text                        |           |          |
 duedate                 | timestamp without time zone |           |          |
 created_at              | timestamp without time zone |           | not null |
 updated_at              | timestamp without time zone |           | not null |
 project_id              | integer                     |           | not null |
 done                    | boolean                     |           | not null | false
 position                | integer                     |           |          |
 done_at                 | timestamp without time zone |           |          |
 dueafter                | integer                     |           |          |
 done_by_user_id         | integer                     |           |          |
 predicted_duration      | integer                     |           |          |
 auto_predicted_duration | integer                     |           |          | 0
 assignable_id           | integer                     |           |          |
 assignable_type         | character varying           |           |          |
 will_assign_to_client   | boolean                     |           | not null | false
Indexes:
    "tasks_pkey" PRIMARY KEY, btree (id)
    "index_tasks_on_assignable_type_and_assignable_id" btree (assignable_type, assignable_id)
    "index_tasks_on_done_by_user_id" btree (done_by_user_id)
    "index_tasks_on_duedate" btree (duedate)
    "test_tasks_on_project" btree (project_id)
Foreign-key constraints:
    "tasks_project_id_fkey" FOREIGN KEY (project_id) REFERENCES projects(id)

The tasks table has 76,716,433 rows.

System specifications

  • PostgreSQL 13.1
  • 2.9 GHz 6-Core Intel Core i9
  • 32 GB RAM
  • macOS 10.15.7

Best Answer

The main reason for the different query plan is probably the increased number of rows that Postgres estimates to get back from projects:

(cost=0.00..42021.35 rows=10507 width=0) (actual time=35.642..35.642 rows=10507 loops=1)

vs.

(cost=0.43..277961.56 rows=31322 width=4) (actual time=0.591..6970.696 rows=10507 loops=3)

Overestimated by a factor of 3, which is not dramatic, but obviously enough to favor a different (inferior) query plan.
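
To see where that estimate comes from, you can look at the stored column statistics for company_id (just a diagnostic, using the standard pg_stats view):

SELECT null_frac, n_distinct, most_common_vals, most_common_freqs
FROM   pg_stats
WHERE  schemaname = 'public'
AND    tablename  = 'projects'
AND    attname    = 'company_id';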

Assuming projects.is_template is mostly false, I suggest these multicolumn indices:

CREATE INDEX ON projects(company_id, state);

Equality columns first, range columns later.
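
If the relevant queries always exclude templates, you could also test a partial variant; that's an extra option to benchmark, not something verified against this data:

CREATE INDEX ON projects (company_id, state) WHERE NOT is_template;  -- only indexes non-template rows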

You might also try increasing the statistics target on company_id and state, then ANALYZE the table to get better estimates.

And:

CREATE INDEX ON tasks (project_id, id);

Plus, increase the statistics target on tasks.project_id and ANALYZE.
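
Both of those statistics changes could look something like this (1000 is just a placeholder value; the default is 100):

ALTER TABLE projects ALTER COLUMN company_id SET STATISTICS 1000;
ALTER TABLE projects ALTER COLUMN state      SET STATISTICS 1000;
ALTER TABLE tasks    ALTER COLUMN project_id SET STATISTICS 1000;

ANALYZE projects;
ANALYZE tasks;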

In both cases, the multicolumn index can replace the existing one on just projects.company_id / tasks.project_id. Since all columns are integer, the size of the index would be about the same - except for the effect of index deduplication (added with Postgres 13), which shows strongly in your case for the highly duplicated tasks.project_id.
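
To see the deduplication effect and compare index sizes before dropping the old indexes, a quick look at the catalog views works (nothing here is specific to this schema beyond the table names):

SELECT relname, indexrelname,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM   pg_stat_user_indexes
WHERE  relname IN ('projects', 'tasks')
ORDER  BY relname, indexrelname;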

Finally, try this query with a direct join instead of the CTEs:

SELECT t.id
FROM   projects p
JOIN   tasks t ON t.project_id = p.id
WHERE  p.company_id = 11171
AND    p.state < 6
AND    p.is_template = FALSE;

The direct join should be faster.