Mysql – Query with WHERE NOT and NOT IN with subquery

MySQLmysql-5.6subquerywhere

MySQL Version: 5.6.27

I am using the following two tables

TABLE: oc_product
product_id      int(11)
model           varchar(64)
tax_class_id    int(11)

and

TABLE: oc_product_to_category
product_id      int(11)
category_id     int(11)

I want to change the tax_class_id for a few products on TABLE oc_product. So I am running a SELECT query before UPDATE

SELECT `product_id`, `model`, `tax_class_id` 
FROM `oc_product` 
WHERE 
NOT `tax_class_id` = 12 
AND
`product_id` 
NOT IN ( SELECT `product_id` 
        FROM  `oc_product_to_category` 
        WHERE  `category_id` 
        IN ( 73, 78, 116, 119, 120, 121, 122, 123, 125, 126, 127, 143, 159, 170, 176, 183, 211, 212, 213, 250, 260, 389, 433) 
        )

Unfortunately the results are not as expected.

All I want is to select all product_id from the table oc_product where tax_class_id is not 12 and product_id does not exist in the results of the following sub query.

SELECT `product_id` 
FROM  `oc_product_to_category` 
WHERE  `category_id` 
IN ( 73, 78, 116, 119, 120, 121, 122, 123, 125, 126, 127, 143, 159, 170, 176, 183, 211, 212, 213, 250, 260, 389, 433)

Did I miss something? Please advise!

Best Answer

Your unexpected results may be stemming from product_id being assigned to multiple categories in the oc_product_to_category table. If a product is in e.g. category 1 AND category 73, it will still be returned by the inner query and thus excluded from the outer query.

Produce a simple set of test data

INTO oc_product INSERT (product_id,  model, tax_class_id) VALUES
(1, 'model5', 10),
(2, 'AA', 5),
(3, '12', 8),
(4, '14', 12),
(5, 'xx', 12)

INTO oc_product_to_category INSERT (product_id, category_id) VALUES
(1, 1), -- product_id is in a category that is not in that list
(1, 73), -- product_id is STILL in a category that is not in that list; as well as a value in that list
(2, 73), -- product_id is only in categories in that list
(3, 17), -- product_id is only in categories not in that list
(4, 33), -- product_id is only in categories not on the category list, is excluded because of tax category
(5, 73), -- product_id is only in categories in the list, is excluded because of tax category

(I probably have the syntax of the INSERT query backwards;) )

I believe your current query will return

(2, 'AA', 5),

You probably want something like

SELECT 
    op.`product_id`,
    op.`model`,
    op.`tax_class_id` 
FROM
    oc_product AS op
  LEFT JOIN
    (SELECT DISTINCT p2c.product_id
     FROM oc_product_to_category as p2c
     WHERE
         p2c.category_id IN (73, 78, 116, 119, 120, 121, 122, 123, 125, 126, 127, 143, 159, 170, 176, 183, 211, 212, 213, 250, 260, 389, 433)
    ) as OnlyProductsBelongingToTheseIncluded
  ON op.product_id = OnlyProductsBelongingToTheseIncluded.product_id
WHERE 
    ( op.tax_class_id <> 12 ) AND
    OnlyProductsBelongingToTheseIncluded.product_id IS NOT NULL

This should return

(1, 'model5', 10),
(2, 'AA', 5)

3 is excluded because it only exists in other groups; 4 and 5 are excluded because of the tax code

Changing the last line to AllProductsBelongingToTheseExcluded.product_id IS NULL (and the appropriate subquery alias) should return only the (3, '12', 8) datapoint