I have following table:
create table test (
company_id integer not null,
client_id integer not null,
client_status text,
unique (company_id, client_id)
);
insert into test values
(1, 1, 'y'), -- company1
(2, 2, null), -- company2
(3, 3, 'n'), -- company3
(4, 4, 'y'), -- company4
(4, 5, 'n'),
(5, 6, null), -- company5
(5, 7, 'n')
;
Basically, there are 5 different companies, each one has one or more clients and each client has status: 'y' or 'n' (might be null as well).
What I have to do is to select all pairs (company_id, client_id)
for all companies for which there is at least one client whose status is not 'n' ('y' or null). So for the example data above, the output should be:
company_id;client_id
1;1
2;2
4;4
4;5
5;6
5;7
I tried something with window functions but I can't figure out how to compare the number of ALL clients with the number of clients with STATUS = 'n'
.
select company_id,
count(*) over (partition by company_id) as all_clients_count
from test
-- where all_clients_count != ... ?
I figured out how to do this, but I am not sure if it's the right way:
select sub.company_id, unnest(sub.client_ids)
from (
select company_id, array_agg(client_id) as client_ids
from test
group by company_id
having count(*) != count( (case when client_status = 'n' then 1 else null end) )
) sub
Best Answer
Basically you are looking for the expression:
The column
client_status
should really be data typeboolean
, nottext
, which would allow the simpler expression:The manual has details in the chapter Comparison Operators.
Assuming your actual table has a
UNIQUE
orPK
constraint, we arrive at:Queries
All of these do the same (what you asked), which is the fastest depends on data distribution:
Or:
Or:
Boolean values sort
FALSE
->TRUE
->NULL
in ascending sort order. SoFALSE
comes last in descending order. If there's is any other value available, then that one's picked first ...The added PK is implemented with a useful index for these queries. If you want faster, yet, add a partial index for query 1:
You could use window functions, too, but that would be slower. Example with
first_value()
:For lots of rows per
company_id
, one of these techniques may be faster, still: