I have a question on how to select best match in PostgreSQL for a given result/condition:
I have a setup containing 3 tables.
1 is a result overview table with reliability and status|
2 is a detailed segments overview with details per segment
3 is a general table with information on the targets.
Goal is to map segments onto targets based on results from table 1
Setup:
table1: Results_table
DROP TABLE IF EXISTS result_fcd;
CREATE TABLE result_fcd
(
source_geometry_id integer NOT NULL,
reliability double precision NOT NULL,
status text COLLATE pg_catalog."default" NOT NULL,
geometry geometry(LineString,4326) NOT NULL,
CONSTRAINT result_fcd_pkey PRIMARY KEY (source_geometry_id)
);
table2: segments (links to originals)
DROP TABLE IF EXISTS segments;
CREATE TABLE segments
(
source_geometry_id integer NOT NULL,
index integer NOT NULL,
target_link_id integer NOT NULL,
reversed_target_link boolean NOT NULL,
start_fraction double precision NOT NULL,
end_fraction double precision NOT NULL,
geometry geometry(LineString,4326) NOT NULL,
CONSTRAINT segments_pkey PRIMARY KEY (source_geometry_id, index)
);
table 3: Originals
DROP TABLE IF EXISTS originals;
CREATE TABLE originals
(
gid integer NOT NULL DEFAULT nextval('originals_gid_seq'::regclass),
linknr double precision,
name character varying(50) COLLATE pg_catalog."default",
direction double precision,
length numeric,
anode double precision,
bnode double precision,
namenr double precision,
geom geometry(LineString,28992),
CONSTRAINT originals_pkey PRIMARY KEY (gid)
)
Example of output (simple left join)
SELECT target_link_id,reversed_target_link,segs.source_geometry_id,reliability
FROM segments segs
LEFT JOIN result_fcd res ON res.source_geometry_id = segs.source_geometry_id
WHERE reliability > 80
AND status = 'match'
LIMIT 10
results:
target_link_id reversed_target_link source_geometry_id reliability
750 true 6156655 "96.90882243480968"
750 true 6156654 "96.3385845004076"
759 false 299963 "97.88820671302537"
759 false 299962 "90.81686447596023"
759 false 4339973 "83.83639328022115"
981 true 304625 "94.25089913251661"
981 true 304624 "96.521137788089"
981 true 304626 "94.21941014522639"
981 true 304621 "91.93785338486143"
981 true 304622 "95.27831743865612"
now the goal is to merge the results into:
750 true 6156655 "96.90882243480968"
-- 750 true 6156654 "96.3385845004076"
759 false 299963 "97.88820671302537"
-- 759 false 299962 "90.81686447596023"
-- 759 false 4339973 "83.83639328022115"
-- 981 true 304625 "94.25089913251661"
981 true 304624 "96.521137788089"
-- 981 true 304626 "94.21941014522639"
-- 981 true 304621 "91.93785338486143"
-- 981 true 304622 "95.27831743865612"
I tried using a MAX(reliability) and grouping by target_link_id,reversed_target_link,source_geometry_id but that did not really get the results.
Essentially i need the best performing source_geometry_id (max reliability) for each target_link_id
Anyone able to assist with this?
Best Answer
I've used this data as start point:
If you add a row_number() to your previous query you can get the desired result in this way:
db<>fiddle here