PostgreSQL – Query slowing down database performance

pgrouting, postgis, postgresql, postgresql-performance, query-performance

I have a query to find the maximum spacing between charging stations on any given route, using PostGIS and pgRouting. An example query is below:

select (max(delr) * st_length(line::geography) * 0.000621371) as max_spacing
from (select sq2.ratio - coalesce((lag(sq2.ratio) over ( order by sq2.ratio)), 0) as delr, line
      from (select ST_LineLocatePoint(line, sqa.points) as ratio, line
            from sp_od_ferry(98282, 98002) as line,
                 (select st_setsrid(st_makepoint(longitude, latitude), 4326) as points
                  from (select longitude,
                               latitude
                        from evses_now
                        where analysis_id = 565
                          and (connector_code = 1
                           or connector_code = 3)
                        union
                         select longitude,
                                latitude
                        from zipcode_record
                        where zip = '98282'
                           or zip = '98002') as sq) as sqa
            where st_dwithin(line::geography, sqa.points::geography, 16093.4)
            order by ratio asc) as sq2) as sq3
group by sq3.line;

Briefly, the logic is: find the points (charging stations) near the shortest path between origin and destination (returned by the user-defined function sp_od_ferry()), then compute the length of the longest segment between consecutive points along the path.
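The gap computation itself (the lag() window function plus max()) can be checked in isolation with literal ratios standing in for ST_LineLocatePoint results; this minimal sketch runs on plain PostgreSQL without PostGIS:

```sql
-- Literal ratios stand in for ST_LineLocatePoint output (values assumed for
-- illustration); lag() gives the gap to the previous point along the line,
-- and max() picks the widest gap.
select max(ratio - coalesce(lag(ratio) over (order by ratio), 0)) as max_gap
from (values (0.10), (0.35), (0.80)) as t(ratio);
-- max_gap = 0.45 (the segment from 0.35 to 0.80)
```

In the full query this fraction is then multiplied by st_length(line::geography) * 0.000621371 to convert the longest gap into miles.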

I have to run the above query for several origin-destination (OD) pairs, and users can launch several of these calculations in parallel. AWS RDS Performance Insights identified the above query as the slowest one and the cause of the database slowdown (100% CPU usage on the DB instance).


EXPLAIN ANALYZE shows the nested loop join to be the costliest step. The query execution plan generated with explain (analyze, buffers, format text) is below:

"GroupAggregate  (cost=3862.38..3864.90 rows=1 width=40) (actual time=199.637..199.642 rows=1 loops=1)"
"  Group Key: sq3.line"
"  Buffers: shared hit=2567 read=507"
"  I/O Timings: read=81.495"
"  ->  Sort  (cost=3862.38..3862.38 rows=1 width=40) (actual time=197.461..197.471 rows=78 loops=1)"
"        Sort Key: sq3.line"
"        Sort Method: quicksort  Memory: 1967kB"
"        Buffers: shared hit=2561 read=507"
"        I/O Timings: read=81.495"
"        ->  Subquery Scan on sq3  (cost=3862.32..3862.37 rows=1 width=40) (actual time=195.862..197.144 rows=78 loops=1)"
"              Buffers: shared hit=2555 read=507"
"              I/O Timings: read=81.495"
"              ->  WindowAgg  (cost=3862.32..3862.36 rows=1 width=48) (actual time=195.861..197.129 rows=78 loops=1)"
"                    Buffers: shared hit=2555 read=507"
"                    I/O Timings: read=81.495"
"                    ->  Sort  (cost=3862.32..3862.33 rows=1 width=40) (actual time=195.790..195.806 rows=78 loops=1)"
"                          Sort Key: (st_linelocatepoint(line.shortest_path, st_setsrid(st_makepoint(evses_now.longitude, evses_now.latitude), 4326)))"
"                          Sort Method: quicksort  Memory: 1967kB"
"                          Buffers: shared hit=2555 read=507"
"                          I/O Timings: read=81.495"
"                          ->  Nested Loop  (cost=105.66..3862.31 rows=1 width=40) (actual time=125.256..194.986 rows=78 loops=1)"
"                                Join Filter: st_dwithin((line.shortest_path)::geography, (st_setsrid(st_makepoint(evses_now.longitude, evses_now.latitude), 4326))::geography, '16093.4'::double precision, true)"
"                                Rows Removed by Join Filter: 71"
"                                Buffers: shared hit=2552 read=507"
"                                I/O Timings: read=81.495"
"                                ->  Function Scan on sp_od_ferry line  (cost=0.25..0.26 rows=1 width=32) (actual time=99.868..99.869 rows=1 loops=1)"
"                                      Buffers: shared hit=2458 read=481"
"                                      I/O Timings: read=66.655"
"                                ->  HashAggregate  (cost=105.41..106.88 rows=147 width=16) (actual time=5.061..5.110 rows=149 loops=1)"
"                                      Group Key: evses_now.longitude, evses_now.latitude"
"                                      Buffers: shared hit=8 read=8"
"                                      I/O Timings: read=4.909"
"                                      ->  Append  (cost=0.42..104.67 rows=147 width=16) (actual time=2.745..5.023 rows=149 loops=1)"
"                                            Buffers: shared hit=8 read=8"
"                                            I/O Timings: read=4.909"
"                                            ->  Index Scan using evses_now_pkey on evses_now  (cost=0.42..97.46 rows=145 width=16) (actual time=2.744..4.982 rows=147 loops=1)"
"                                                  Index Cond: (analysis_id = 565)"
"                                                  Filter: ((connector_code = 1) OR (connector_code = 3))"
"                                                  Rows Removed by Filter: 12"
"                                                  Buffers: shared hit=2 read=8"
"                                                  I/O Timings: read=4.909"
"                                            ->  Bitmap Heap Scan on zipcode_record  (cost=2.80..5.01 rows=2 width=16) (actual time=0.021..0.024 rows=2 loops=1)"
"                                                  Recheck Cond: ((zip = '98282'::text) OR (zip = '98002'::text))"
"                                                  Heap Blocks: exact=2"
"                                                  Buffers: shared hit=6"
"                                                  ->  BitmapOr  (cost=2.80..2.80 rows=2 width=0) (actual time=0.017..0.018 rows=0 loops=1)"
"                                                        Buffers: shared hit=4"
"                                                        ->  Bitmap Index Scan on zipcode_record_pkey  (cost=0.00..1.40 rows=1 width=0) (actual time=0.010..0.010 rows=1 loops=1)"
"                                                              Index Cond: (zip = '98282'::text)"
"                                                              Buffers: shared hit=2"
"                                                        ->  Bitmap Index Scan on zipcode_record_pkey  (cost=0.00..1.40 rows=1 width=0) (actual time=0.006..0.006 rows=1 loops=1)"
"                                                              Index Cond: (zip = '98002'::text)"
"                                                              Buffers: shared hit=2"
"Planning Time: 44.030 ms"
"Execution Time: 200.743 ms"

I understand one way to reduce the database load would be to provision a bigger RDS instance. I currently use a db.t3.small, which has 2 vCPUs and 2 GiB of RAM.

I used pgTune to adjust the default AWS RDS PostgreSQL 12.5 settings. The new config is below:

max_connections = 100
shared_buffers = 512MB
effective_cache_size = 1536MB
maintenance_work_mem = 128MB
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 100
random_page_cost = 1.1
effective_io_concurrency = 200
work_mem = 2621kB
min_wal_size = 2GB
max_wal_size = 8GB

Any suggestions regarding the query, or ideas about how I can manage the database load while keeping costs low, are appreciated.

Best Answer

The query's execution time is spent in two places:

  1. about 50% executing the function sp_od_ferry:

    I cannot say anything about optimizing that, since all we see is the function call.

  2. about 50% executing st_dwithin 149 times during the nested loop join

It is noticeable that a large share of that time is spent doing I/O (81.5 ms of read time out of about 200 ms), so caching more of the data, perhaps with more RAM, would help.
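Beyond caching, one option worth trying (a sketch under assumptions: the index name is illustrative, and the UNION in the question would need to be split so the table is filtered directly) is a GiST expression index, so that ST_DWithin can prune stations by index instead of being evaluated as a join filter for every point:

```sql
-- Illustrative: a GiST index over the same geography expression the query
-- builds from longitude/latitude. ST_DWithin(geography, geography, ...) can
-- use such an index when the table is scanned directly (not through a UNION).
create index evses_now_geog_idx on evses_now
    using gist ((st_setsrid(st_makepoint(longitude, latitude), 4326)::geography));
```

Whether the planner uses it depends on the query shape: in the current plan the points are materialized by a HashAggregate over the UNION before the distance filter runs, so the evses_now and zipcode_record branches would have to be tested against the route separately.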