PostgreSQL – Resolving Query Hang Issues with Parallel Execution

indexpostgresqlpostgresql-13

I have a Flask API that has 2 endpoints.

When the user clicks a button, the front-end first calls a /calculate route, which computes an expensive query and appends the results to a table called results.

This table is indexed as follows:

 create index if not exists results_tbl_idx on results
              (customer_id, result_id, team_id, team_name, territory_name, specialty, id);

When /calculate returns, it then gets a handle to the result_id, and then calls a /get_results endpoint, which runs the following query on the results table:

with distinct_users_team_tbl as (
    select team_id,
        count(distinct id) as distinct_users_team,
        count(distinct id) filter ( where targeted_users > 0 ) as targeted_distinct_users_team
    from tam_results_14
    where result_id = 201 and customer_id = '1'
    group by team_id
),
distinct_users_territory_tbl as (
    select team_id, territory_name,
        count(distinct id) as distinct_users_territory,
        count(distinct id) filter ( where targeted_users > 0 ) as targeted_distinct_users_territory
    from tam_results_14
    where result_id = 201 and customer_id = '1'
    group by team_id, territory_name
),
distinct_users_specialty_tbl as (
    select team_id, team_name, territory_name, specialty,
        sum(tam) as tam,
        sum(sam) as sam,
        count(distinct id) as  distinct_users_specialty,
        count(distinct id) filter ( where targeted_users > 0 ) as  targeted_distinct_users_specialty
    from tam_results_14
    where result_id = 201 and customer_id = '1'
    group by team_id, team_name, territory_name, specialty
)

select *
from distinct_users_specialty_tbl
join distinct_users_team_tbl
    on distinct_users_specialty_tbl.team_id = distinct_users_team_tbl.team_id
join distinct_users_territory_tbl
    on distinct_users_specialty_tbl.team_id = distinct_users_territory_tbl.team_id
    and distinct_users_specialty_tbl.territory_name = distinct_users_territory_tbl.territory_name

Sometimes, just after running /calculate, when fetching /get_results, the query hangs forever.

Typically, the /get_results query executes in < 4s. And even while it's hanging, I can paste the exact same query that is hanging into a console and run it in < 4s.

I have checked pg_stat_activity and can see the query running with no wait_event and state = active.

I then checked pg_locks and for that pid, I can see:

pg_locks

My current hunch

Is it that database hasn't finished indexing the new entries to the results table before I query it? But that it has finished adding items to the index by the time I paste the hung query into the terminal and run which can use the index?

Is there some way I can force the table to fully index before running the /get_results query? Is that even possible?

Thanks for any help!!!

Best Answer

Add an explicit

ANALYZE results;

at the end of \calculate so that the other function has accurate table statistics.