Your unexpected results may be stemming from product_id
being assigned to multiple categories in the oc_product_to_category
table. If a product is in e.g. category 1 AND category 73, it will still be returned by the inner query and thus excluded from the outer query.
Produce a simple set of test data
INTO oc_product INSERT (product_id, model, tax_class_id) VALUES
(1, 'model5', 10),
(2, 'AA', 5),
(3, '12', 8),
(4, '14', 12),
(5, 'xx', 12)
INTO oc_product_to_category INSERT (product_id, category_id) VALUES
(1, 1), -- product_id is in a category that is not in that list
(1, 73), -- product_id is STILL in a category that is not in that list; as well as a value in that list
(2, 73), -- product_id is only in categories in that list
(3, 17), -- product_id is only in categories not in that list
(4, 33), -- product_id is only in categories not on the category list, is excluded because of tax category
(5, 73), -- product_id is only in categories in the list, is excluded because of tax category
(I probably have the syntax of the INSERT query backwards;) )
I believe your current query will return
(2, 'AA', 5),
You probably want something like
SELECT
op.`product_id`,
op.`model`,
op.`tax_class_id`
FROM
oc_product AS op
LEFT JOIN
(SELECT DISTINCT p2c.product_id
FROM oc_product_to_category as p2c
WHERE
p2c.category_id IN (73, 78, 116, 119, 120, 121, 122, 123, 125, 126, 127, 143, 159, 170, 176, 183, 211, 212, 213, 250, 260, 389, 433)
) as OnlyProductsBelongingToTheseIncluded
ON op.product_id = OnlyProductsBelongingToTheseIncluded.product_id
WHERE
( op.tax_class_id <> 12 ) AND
OnlyProductsBelongingToTheseIncluded.product_id IS NOT NULL
This should return
(1, 'model5', 10),
(2, 'AA', 5)
3 is excluded because it only exists in other groups;
4 and 5 are excluded because of the tax code
Changing the last line to
AllProductsBelongingToTheseExcluded.product_id IS NULL
(and the appropriate subquery alias)
should return only the (3, '12', 8)
datapoint
Best Answer
Just to provide some additional explanation to billinkc's answer.
If null is a trump card you might be wondering why doesn't
WHERE 2 IN (2,3, NULL)
exhibit the same behavior?That one works as expected because it evaluates to
(2=2) OR (2=3) OR (2=NULL)
.Under the rules of three valued logic for
Or
-ed conditions if any of them evaluate totrue
the expression istrue
. Otherwise if any of them evaluate tounknown
the expression isunknown
. The only other possibility is that all arefalse
in which case the expression evaluates tofalse
.In order for a row to be returned in SQL the
WHERE
clause must evaluate totrue
rather thanfalse
orunknown
. The above does that.The expression
1 NOT IN (2,3, NULL)
evaluates to(1 <> 2) AND (1 <> 3) AND (1 <> NULL)
. When conditions areAND
-ed all of them must evaluate totrue
in order for the expression to evaluate totrue
.The presence of the
NULL
in the list guarantees that there will be at least oneUNKNOWN
and that this will never be the case. Hence the reason for the " NULL pooches it all" behaviour in this context.To give an analogy here as to why the
NOT IN
behaviour makes sense.