Postgresql – Pre-filtering on right table in a LEFT JOIN

join;postgresql

I have the following schema/query, where I basically need to filter the parent table by some criteria and then collect aggregated data from its child table(s):

create table parent (id integer primary key, name text not null);
create table child (id integer primary key, pid integer not null references parent(id));

select parent.id, parent.name, q.cnt from parent
left join (
  select pid, count(*) cnt from child
-- where pid in (select id from parent where name like '%xyz%')
  group by pid
) q on parent.id = q.pid
where name like '%xyz%'

(Actual schema/query is more convoluted but the gist is the same.)

The issue is that unless I uncomment the WHERE clause inside the subquery on the right table, the query takes significantly longer to execute. Having to specify the same filter in two places doesn't feel right. Am I doing something wrong? Should I transform the query somehow? Why is the inner filter even necessary; shouldn't the planner automatically discard rows from the right table that cannot be joined? The database is PostgreSQL, by the way.

Best Answer

For example you can do it this way:

select p.id, p.name, 
COALESCE((select count(*) from child where p.id = pid), 0) AS cnt 
from parent AS p
where name like '%xyz%'

or this:

select p.id, p.name, 
count(c.pid) AS cnt 
from parent AS p
left join child AS c ON p.id = c.pid
where name like '%xyz%'
group by p.id, p.name

if you need more columns from parent:

select p1.*, CTE.cnt
from parent p1
join (select p.id,
count(c.pid) AS cnt 
from parent AS p
left join child AS c ON p.id = c.pid
where name like '%xyz%'
group by p.id) as CTE on p1.id = CTE.id
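
As for why the inner filter makes a difference in the original query: as far as I know, PostgreSQL plans a derived table that contains GROUP BY as a separate subquery and does not push the outer condition on parent into it, so the whole child table is aggregated before the join. The queries above avoid that by joining or correlating before aggregating. Note that they return 0 for parents without children, where the original query returns NULL.

A further variant, not part of the answer above, is a LATERAL join (PostgreSQL 9.3 or later). This is only a sketch: the index name child_pid_idx is made up for illustration, and the index itself is an assumption, since PostgreSQL does not automatically create an index on the referencing column of a foreign key.

```sql
-- Assumption: an index on child(pid) so each per-parent lookup is cheap.
-- (IF NOT EXISTS for indexes needs PostgreSQL 9.5 or later.)
create index if not exists child_pid_idx on child (pid);

-- The filter on parent is written once; the subquery runs per matching parent row.
select p.id, p.name, q.cnt
from parent as p
left join lateral (
  select count(*) as cnt      -- an aggregate without GROUP BY always yields one row
  from child as c
  where c.pid = p.id
) q on true
where p.name like '%xyz%';
```

Comparing the plans with EXPLAIN ANALYZE on your real data is the only reliable way to tell which form is fastest.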

Related Solutions

Postgresql – Altering a parent table in Postgresql 8.4 breaks child table defaults

Your problem is that when you add a new column to temp_person, its child table temp_person_two gets that column appended at the end of its column list (4th position), that is, after the has_default column. See:

db=> \d temp_person
  Column   |       Type        |                            Modifiers                            
-----------+-------------------+-----------------------------------------------------------------
 person_id | integer           | not null default nextval('temp_person_person_id_seq'::regclass)
 name      | character varying | 
 foo       | text              | 

db=> \d temp_person_two 
   Column    |         Type         |                            Modifiers                            
-------------+----------------------+-----------------------------------------------------------------
 person_id   | integer              | not null default nextval('temp_person_person_id_seq'::regclass)
 name        | character varying    | 
 has_default | character varying(4) | not null default 'en'::character varying
 foo         | text                 |

So, when you execute this:

INSERT INTO temp_person_two VALUES ( NEW.* );

PostgreSQL will actually understand that you want to insert into the first three columns of temp_person_two (as NEW.* expands to three values), generating something similar to this:

INSERT INTO temp_person_two(person_id,name,has_default)
VALUES ( NEW.person_id, NEW.name, NEW.foo );

So, temp_person_two.has_default will get the value of NEW.foo, which is NULL in your case.

The solution is to simply expand the column names:

INSERT INTO temp_person_two(person_id,name,foo)
VALUES ( NEW.person_id, NEW.name, NEW.foo );

or, you could also use this:

INSERT INTO temp_person_two(person_id,name,foo)
VALUES ( NEW.* );

But this is fragile, as any change in the column order of NEW may break your statements, so I'd recommend the first one.
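
The column-position effect described above can be reproduced in isolation. The table definitions below are an assumption reconstructed from the \d output (a parent table with a child that INHERITS it), not the asker's actual DDL, and the inserted values are made up.

```sql
-- Assumed setup, reconstructed from the \d output above.
CREATE TABLE temp_person (
    person_id serial,
    name      varchar
);
CREATE TABLE temp_person_two (
    has_default varchar(4) NOT NULL DEFAULT 'en'
) INHERITS (temp_person);

-- Adding a column to the parent also adds it to the child,
-- but at the END of the child's column list: person_id, name, has_default, foo.
ALTER TABLE temp_person ADD COLUMN foo text;

-- Positional insert: the third value lands in has_default, not foo.
-- This statement is expected to fail with a NOT NULL violation on has_default.
INSERT INTO temp_person_two VALUES (1, 'Ann', NULL);

-- Explicit column list: foo receives NULL and has_default keeps its default.
INSERT INTO temp_person_two (person_id, name, foo) VALUES (1, 'Ann', NULL);

SELECT person_id, name, has_default, foo FROM temp_person_two;
```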

EDIT:

So the conclusion and the lesson learned here is:

Always explicitly type the names of the columns and the values when issuing an INSERT command, in fact, when issuing any SQL command at all... =D

This will save you a lot of time solving problems like this in the future.

Mysql – Simple left outer join leaving me with inner join

I think you need a UNION or an OR with EXISTS:

SELECT c.* 
FROM `change` AS c
WHERE c.project_id = 10
   OR ( c.project_id IS NULL
        AND EXISTS
            ( SELECT *
              FROM `change` AS x 
              WHERE x.project_id = 10
                AND x.object_id = c.object_id
            )
      ) ;
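
The answer mentions a UNION but only shows the OR form. A sketch of the UNION variant follows, using the same table and column names as the query above. Because the two branches cannot overlap (project_id is either 10 or NULL), UNION ALL is enough and should return the same rows as the OR form. The table name is quoted with backticks because CHANGE is a reserved word in MySQL.

```sql
-- Branch 1: rows that belong to project 10.
SELECT c.*
FROM `change` AS c
WHERE c.project_id = 10

UNION ALL

-- Branch 2: rows without a project whose object also appears in project 10.
SELECT c.*
FROM `change` AS c
WHERE c.project_id IS NULL
  AND EXISTS
        ( SELECT 1
          FROM `change` AS x
          WHERE x.project_id = 10
            AND x.object_id = c.object_id
        ) ;
```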
