I'm using Postgres 9.5. I want to select some records, but only if they have at least one association in another table. I came up with this query:
select distinct m.name, m.id
from machines m, parts rt
where m.id = rt.machine_id
and m.name like '%Name%';
but I don't think this is particularly efficient, since if there are 50000 associations, I think they're all loaded and then "distinct"ed through, which seems a little inefficient. Is there a more efficient way to check load a record but only if it has one or more associations?
Best Answer
The way your query is structured, you can just write it using an
EXISTS
expression:If you want the query to be efficient, take into account that the most difficult part will probably be checking the machine
name
. Consider indexingmachines.name
using atrigram index
. Trigram indices allow for fast findingLIKE '%something%'
conditions.Most probably, the database planner will make an execution plan where it will first find the machines whose
name
is LIKE '%Name%' (using thename_like_idx
index), then, for every machine in this subset, it will check if there are parts associated with them; if so, they will be part of the result.For efficiency, it is also assumed that there is also an index on
parts.machine_id
, because this will then be how PostgreSQL will efficiently check for the existence of the part. Ifmachine_id
is a foreign key, it is always advised to index it.This query doesn't need
DISTINCT
(assuming that theid
s are already unique).