I'm using PostgreSQL 9.6 on a 16 GB Ubuntu machine with 4 cores and an SSD, with the following settings:
max_connections = 200
shared_buffers = 6GB
work_mem = 256MB
maintenance_work_mem = 1GB
How can I improve the performance of this SQL query?
WITH ss AS (SELECT
time_noise, track, slow, distance, last_time
FROM
eco.noise
WHERE
(track, time_noise, slow) IN (
SELECT
DISTINCT ON (track)
track, time_noise, slow
FROM
eco.noise
WHERE
base_name='B001' AND slow >= 50 AND slow <= 90 AND distance <= 10000 AND time_noise >= '06-01-2019' AND time_noise <= '07-01-2019'
ORDER BY track, slow DESC
)
ORDER BY time_noise)
SELECT
COALESCE(to_char(ss.time_noise, 'YYYY-MM-DD HH24:MI:SS'), '') AS time_noise, ss.slow, ss.track, COALESCE(to_char(ss.last_time, 'YYYY-MM-DD HH24:MI:SS'), '') as last_time, ss.distance,
eco.tracks.callsign, eco.tracks.altitude, eco.tracks.speed, eco.tracks.angle, eco.tracks.latitude, eco.tracks.longitude, eco.tracks.vertical_speed,
eco.routes.from, eco.routes.to
FROM ss
LEFT JOIN eco.tracks ON ss.track = eco.tracks.track AND eco.tracks.time_track = ss.last_time
LEFT JOIN eco.routes ON eco.tracks.callsign = eco.routes.callsign
ORDER BY ss.time_noise ASC;
I have the following B-tree indexes (all in database postgres, schema eco):

test_unique_noise_3 on eco.noise (base_name, slow, distance, time_noise)
tracks_time_track_track_key on eco.tracks (time_track, track), unique
pr_routes on eco.routes (callsign), unique, primary key
test_unique_noise on eco.noise (track, time_noise, slow)
EXPLAIN ANALYZE:
Sort (cost=3739384.97..3742913.97 rows=1411600 width=131) (actual time=2377.640..2377.805 rows=3792 loops=1)
Sort Key: ss.time_noise
Sort Method: quicksort Memory: 1087kB
CTE ss
-> Sort (cost=525472.51..529001.51 rows=1411600 width=32) (actual time=2161.522..2162.140 rows=3792 loops=1)
Sort Key: noise.time_noise
Sort Method: quicksort Memory: 393kB
-> Nested Loop (cost=369723.59..381285.34 rows=1411600 width=32) (actual time=2026.531..2160.723 rows=3792 loops=1)
-> Unique (cost=369723.02..370160.00 rows=3978 width=20) (actual time=2026.498..2142.201 rows=3791 loops=1)
-> Sort (cost=369723.02..369941.51 rows=87396 width=20) (actual time=2026.497..2117.520 rows=568059 loops=1)
Sort Key: noise_1.track, noise_1.slow DESC
Sort Method: quicksort Memory: 68956kB
-> Index Scan using test_unique_noise_3 on noise noise_1 (cost=0.56..362549.87 rows=87396 width=20) (actual time=0.039..1751.161 rows=568059 loops=1)
Index Cond: ((base_name = 'B001'::text) AND (slow >= '50'::double precision) AND (slow <= '90'::double precision) AND (distance <= 10000) AND (time_noise >= '2019-06-01 00:00:00'::timestamp without time zone) AND (time_noise <= '2019-07-01 00:00:00'::timestamp without time zone))
-> Index Scan using test_unique_noise on noise (cost=0.56..2.78 rows=1 width=32) (actual time=0.004..0.004 rows=1 loops=3791)
Index Cond: ((track = noise_1.track) AND (time_noise = noise_1.time_noise) AND (slow = noise_1.slow))
-> Hash Left Join (cost=17792.76..3066196.28 rows=1411600 width=131) (actual time=2351.691..2376.080 rows=3792 loops=1)
Hash Cond: (tracks.callsign = routes.callsign)
-> Nested Loop Left Join (cost=0.56..3021978.00 rows=1411600 width=67) (actual time=2161.540..2179.800 rows=3792 loops=1)
-> CTE Scan on ss (cost=0.00..28232.00 rows=1411600 width=32) (actual time=2161.524..2162.558 rows=3792 loops=1)
-> Index Scan using tracks_time_track_track_key on tracks (cost=0.56..2.11 rows=1 width=51) (actual time=0.004..0.004 rows=1 loops=3792)
Index Cond: ((time_track = ss.last_time) AND (ss.track = track))
-> Hash (cost=10017.64..10017.64 rows=621964 width=15) (actual time=189.279..189.279 rows=610992 loops=1)
Buckets: 1048576 Batches: 1 Memory Usage: 37585kB
-> Seq Scan on routes (cost=0.00..10017.64 rows=621964 width=15) (actual time=0.017..65.022 rows=610992 loops=1)
Planning time: 0.646 ms
Execution time: 2380.094 ms
As I understand it, the bottleneck is the CTE (in 9.6 a WITH query is always materialized), so I tried a subquery instead, but got a similar result:
SELECT
sub.time_noise AS time_noise, sub.slow, sub.track, sub.last_time as last_time, sub.temperature, sub.humadity, sub.presure, sub.wind, sub.distance,
eco.tracks.callsign, eco.tracks.altitude, eco.tracks.speed, eco.tracks.angle, eco.tracks.latitude, eco.tracks.longitude, eco.tracks.vertical_speed,
eco.routes.from, eco.routes.to
FROM (SELECT
time_noise, track, slow, distance, last_time
FROM
eco.noise
WHERE
(track, time_noise, slow) IN (
SELECT
DISTINCT ON (track)
track, time_noise, slow
FROM
eco.noise
WHERE
base_name='B001' AND slow >= 50 AND slow <= 100 AND distance <= 10000 AND time_noise >= '01-01-2019' AND time_noise <= '07-01-2019'
ORDER BY track, slow DESC
)
ORDER BY time_noise) as sub
LEFT JOIN eco.tracks ON sub.track = eco.tracks.track AND eco.tracks.time_track = sub.last_time
LEFT JOIN eco.routes ON eco.tracks.callsign = eco.routes.callsign
ORDER BY sub.time_noise ASC;
EXPLAIN ANALYZE:
Nested Loop Left Join (cost=1066367.20..4077589.03 rows=1411652 width=83) (actual time=3943.936..3992.130 rows=9262 loops=1)
-> Sort (cost=1066366.64..1069895.77 rows=1411652 width=48) (actual time=3943.917..3945.328 rows=9262 loops=1)
Sort Key: noise.time_noise
Sort Method: quicksort Memory: 1108kB
-> Nested Loop (cost=907649.38..922173.77 rows=1411652 width=48) (actual time=3582.100..3941.579 rows=9262 loops=1)
-> Unique (cost=907648.81..911048.44 rows=3978 width=20) (actual time=3582.059..3891.212 rows=9253 loops=1)
-> Sort (cost=907648.81..909348.63 rows=679925 width=20) (actual time=3582.058..3826.172 rows=1450072 loops=1)
Sort Key: noise_1.track, noise_1.slow DESC
Sort Method: quicksort Memory: 162439kB
-> Index Scan using test_unique_noise_3 on noise noise_1 (cost=0.56..841781.02 rows=679925 width=20) (actual time=0.043..2864.183 rows=1450072 loops=1)
Index Cond: ((base_name = 'B001'::text) AND (slow >= '50'::double precision) AND (slow <= '100'::double precision) AND (distance <= 10000) AND (time_noise >= '2019-01-01 00:00:00'::timestamp without time zone) AND (time_noise <= '2019-07-01 00:00:00'::timestamp without time zone))
-> Index Scan using test_unique_noise on noise (cost=0.56..2.78 rows=1 width=48) (actual time=0.005..0.005 rows=1 loops=9253)
Index Cond: ((track = noise_1.track) AND (time_noise = noise_1.time_noise) AND (slow = noise_1.slow))
-> Index Scan using tracks_time_track_track_key on tracks (cost=0.56..2.11 rows=1 width=51) (actual time=0.005..0.005 rows=1 loops=9262)
Index Cond: ((time_track = noise.last_time) AND (noise.track = track))
Planning time: 0.782 ms
Execution time: 3994.348 ms
Should I use PL/pgSQL, for example?
Best Answer
The index scan can only use the first two columns (base_name and slow) for the scan itself; the remaining ones are used as a filter (before table access). But your query is probably slow because it has to access the table 1.5 million times to fetch the track column.

You should create an index like this:
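The index definition itself did not survive in this copy of the answer. Based on its reasoning (an index-only scan that covers every column the inner query reads, and that also satisfies ORDER BY track, slow DESC without a sort), a plausible reconstruction is the following covering index; the name noise_covering_idx is made up for illustration:

```sql
-- Hypothetical reconstruction of the suggested index: after the base_name
-- equality it delivers rows in (track, slow DESC) order, and it contains
-- time_noise and distance so the filters can be applied without heap access.
CREATE INDEX noise_covering_idx ON eco.noise
    (base_name, track, slow DESC, time_noise, distance);
```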
Then make sure that noise is vacuumed regularly; you should get an index-only scan, which is much faster, and you won't need an explicit sort either.
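Index-only scans depend on the table's visibility map, which only VACUUM maintains; that is why regular vacuuming matters here. A minimal sketch of how to apply and verify this (assuming a covering index as described above exists; exact plan output will vary):

```sql
-- Refresh the visibility map and planner statistics.
VACUUM (ANALYZE) eco.noise;

-- Re-check the inner query: the plan should ideally show
-- "Index Only Scan" with "Heap Fetches: 0" and no Sort node.
EXPLAIN (ANALYZE, BUFFERS)
SELECT DISTINCT ON (track) track, time_noise, slow
FROM eco.noise
WHERE base_name = 'B001'
  AND slow BETWEEN 50 AND 90
  AND distance <= 10000
  AND time_noise >= '2019-06-01'
  AND time_noise <= '2019-07-01'
ORDER BY track, slow DESC;
```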