I am trying to filter a table which has NON-UNIQUE transaction id's, joined to a products table on 1 common column. The filter needs to be as follows:
equal to :group1 AND
NOT equal to :group2 AND
NOT equal to :group3
I originally started with the below query but without the AND rk_group <> ____
conditions.
SELECT COUNT(DISTINCT txn_id)
FROM 1_txns
INNER JOIN 2_products USING (sku)
WHERE rk_group = :group1
AND rk_group <> :group2
AND rk_group <> :group3
;
I have also tried
SELECT COUNT(DISTINCT txn_id)
FROM 1_txns
INNER JOIN 2_products USING (sku)
WHERE rk_group NOT IN ( :group2, :group3)
;
I have also tried multiple combinations of joins and IN() and NOT IN() and it's still returning all teams ids including those where the NOT groups are present.
Can someone point me in the right direction please.
Schema info relevant for the query:
table 1_txns
(txn_id, sku)
table 2_products
(sku, rk_group)
Sample data
Txn_id, rk_group
------
1,group1
1,group2
2,group1
3,group1
3,group3
If the above were my data, and group1 is my = group, and groups2 and 3 are my != groups, then I should return a row count of 1. Which is txn id '2'. All methods I've tried so far will return a count of 3.
NOTE: there are only 3 distinct rk_groups in the 2_products table.
Best Answer
Your WHERE clause does not make much sense, because it is applied to each row individually and it is pointless to check if the same value is equal to
a
and at the same time not equal tob
orc
– of course, it will not be equal tob
orc
if it isa
. What you want instead, therefore, is for the conditions to be applied to a group of rows as a whole – more specifically, to each group of rows sharing the sametxn_id
.So, you need to use GROUP BY and, to apply conditions to groups of rows, HAVING. This query will give you the list of
txn_id
values matching your requirements:As you do not seem to want the list, only the number of its items, use the above as a derived table to count the rows:
As you can see,
COUNT(DISTINCT ...)
is not necessary: the derived table is grouping bytxn_id
and thus cannot return duplicates – so,COUNT(*)
is enough to get the correct result.In case you are not aware, the
OR NULL
bit lets the COUNT function count only matches and omit mismatches, as explained in detail in this answer: