Mysql – Selecting with multiple ands

MySQL

I am having a little trouble wrapping my head around filtering a query in the way that I need. My current sql:

SELECT
`t0`.`id` AS `t0_c0`, `t0`.`language_id` AS `t0_c1`, `t0`.`attribute_assignment_id` AS `t0_c2`, `t0`.`data` AS `t0_c3`, `t0`.`slug` AS `t0_c4`, `t1`.`id` AS `t1_c0`, `t1`.`attribute_id` AS `t1_c1`, `t1`.`entity_id` AS `t1_c2`, `t2`.`id` AS `t2_c0`, `t2`.`type` AS `t2_c1`, `t3`.`id` AS `t3_c0`, `t3`.`language_id` AS `t3_c1`, `t3`.`attribute_id` AS `t3_c2`, `t3`.`name` AS `t3_c3`, `t3`.`slug` AS `t3_c4`

FROM `attribute_assignment_i18n` AS `t0`
LEFT JOIN `attribute_assignment` AS `t1` ON (`t0`.`attribute_assignment_id` = `t1`.`id`)
LEFT JOIN `attribute` AS `t2` ON (`t1`.`attribute_id` = `t2`.`id`)
LEFT JOIN `attribute_i18n` AS `t3` ON (`t2`.`id` = `t3`.`attribute_id`)

WHERE
`t1`.`entity_id` IN ('product_336', 'product_337', 'product_338', 'product_339', 'product_340', 'product_341', 'product_342', 'product_343', 'product_344', 'product_345')

--There can be any number of these ANDs
AND `t3`.`slug` in ('country', 'type')

AND `t0`.`slug` IN ('rose', 'france')

Which produces a result like this:

+-------+-------+-------+--------+--------+-------+-------+-------------+-------+--------------+-------+-------+-------+---------+---------+
| t0_c0 | t0_c1 | t0_c2 | t0_c3  | t0_c4  | t1_c0 | t1_c1 | t1_c2       | t2_c0 | t2_c1        | t3_c0 | t3_c1 | t3_c2 | t3_c3   | t3_c4   |
+-------+-------+-------+--------+--------+-------+-------+-------------+-------+--------------+-------+-------+-------+---------+---------+
|     1 | en    |     1 | France | France |     1 | 1     | product_342 |     1 | default_text |     1 | en    |     1 | Country | country |
|    11 | en    |    11 | France | France |    11 | 1     | product_343 |     1 | default_text |     1 | en    |     1 | Country | country |
|    31 | en    |    31 | France | France |    31 | 1     | product_345 |     1 | default_text |     1 | en    |     1 | Country | country |
|    32 | en    |    32 | Rose   | Rose   |    32 | 2     | product_345 |     2 | default_text |     2 | en    |     2 | Type    | type    |
|    41 | en    |    41 | France | France |    41 | 1     | product_341 |     1 | default_text |     1 | en    |     1 | Country | country |
|    51 | en    |    51 | France | France |    51 | 1     | product_340 |     1 | default_text |     1 | en    |     1 | Country | country |
|    61 | en    |    61 | France | France |    61 | 1     | product_337 |     1 | default_text |     1 | en    |     1 | Country | country |
|    71 | en    |    71 | France | France |    71 | 1     | product_338 |     1 | default_text |     1 | en    |     1 | Country | country |
|    81 | en    |    81 | France | France |    81 | 1     | product_339 |     1 | default_text |     1 | en    |     1 | Country | country |
+-------+-------+-------+--------+--------+-------+-------+-------------+-------+--------------+-------+-------+-------+---------+---------+

I have also tried wrapping the end ANDs in a container AND as well as using ORs:

AND(`t3`.`slug` = 'country' AND `t0`.`slug` IN ('france'))
OR(`t3`.`slug` = 'type' AND `t0`.`slug` IN ('rose'))

What I need is to be able to return the product identifiers that match both the last two ANDs only.

+-------+-------+-------+--------+--------+-------+-------+-------------+-------+--------------+-------+-------+-------+---------+---------+
| t0_c0 | t0_c1 | t0_c2 | t0_c3  | t0_c4  | t1_c0 | t1_c1 | t1_c2       | t2_c0 | t2_c1        | t3_c0 | t3_c1 | t3_c2 | t3_c3   | t3_c4   |
+-------+-------+-------+--------+--------+-------+-------+-------------+-------+--------------+-------+-------+-------+---------+---------+
|    31 | en    |    31 | France | France |    31 | 1     | product_345 |     1 | default_text |     1 | en    |     1 | Country | country |
|    32 | en    |    32 | Rose   | Rose   |    32 | 2     | product_345 |     2 | default_text |     2 | en    |     2 | Type    | type    |
+-------+-------+-------+--------+--------+-------+-------+-------------+-------+--------------+-------+-------+-------+---------+---------+

The number of these ANDs can change, eg, there can be 1 or 9 of these extra ands.

Best Answer

These tables appear to be using the Entity-Attribute-Value model, which is often not the best model to use, the complexity of the queries being only one reason for this.

The problem here is not so much with your WHERE clause, it's in your JOIN logic.

The nature of the problem is that the WHERE clause applies per row. It does not look forward or backward so there's no way to express "oh, we need to return 'this' row only if we also find 'that' row."

What we need, then, is a query that can evaluate all of the necessary conditions for each product on one single row.

The query here is appears to be interested in finding only entities where there exists both a matching row (tuple) in the t0/t1/t2/t3 join where t3.slug = 'country' AND t0.slug IN ('france') ... and another row (tuple) in t0/t1/t2/t3 where t3.slug = 'type' AND t0.slug IN ('rose').

So, for each row we're building, the query needs to reference each of those 4 tables twice, and, I would suggest, should actually start with the "entity" table on the far left (presumably, that's the name of the table where t1.entity_id is the primary key). It could be done without this, though you'd end up joining t1y to t1x on entity_id which seems unintuitive to me and "entity" may also have some "entity type" column that should be included in the query. (speculation)

SELECT e.id as product_id
  FROM entity e

  JOIN `attribute_assignment` t1x ON `t1x`.`entity_id` = e.id 
  JOIN `attribute` `t2x` ON `t1x`.`attribute_id` = `t2x`.`id`
  JOIN `attribute_i18n` `t3x` ON `t2x`.`id` = `t3x`.`attribute_id`
                             AND `t3x`.`slug` = 'country'         
  JOIN `attribute_assignment_i18n` `t0x` ON `t0x`.`attribute_assignment_id` = `t1x`.`id`
                                        AND `t0x`.`slug` IN ('france')

  JOIN `attribute_assignment` t1y ON `t1y`.`entity_id` = e.id 
  JOIN `attribute` `t2y` ON `t1y`.`attribute_id` = `t2y`.`id`
  JOIN `attribute_i18n` `t3y` ON `t2y`.`id` = `t3y`.`attribute_id`
                             AND `t3y`.`slug` = 'type'
  JOIN `attribute_assignment_i18n` `t0y` ON `t0y`.`attribute_assignment_id` = `t1y`.`id`
                                        AND `t0y`.`slug` IN ('rose')


 WHERE e.id IN ('product_336', 'product_337', 'product_338', 'product_339', 'product_340', 'product_341', 'product_342', 'product_343', 'product_344', 'product_345');         

I've rewritten the left joins as inner joins, since we're not interested in any results where we can't match rows all the way across. This should cause a total of fewer rows to be evaluated by eliminating more rows, "earlier" (admittedly, that may be a slightly careless use of the term "earlier").

I've rearranged the ordering of the tables, since t1 is where entity_id is referenced and there's no point in looking for "france" until we know we're looking at "country". Of course, that's mostly for clarity, because with an inner JOIN, the optimizer is free to reorder the sequence in which it literally processes JOIN conditions into any order that will produce a valid result.

Also, the references to the 'attribute' table seem unnecessary since there's an apparent case of transitivity with t1.attribute_id = t2.id = t3.attribute_id.

But essentially, this query should find all the records in entity with matching values for id... and t3x.slug = 'country' and t0x.slug = 'france' and t3y.slug = 'type' and t0y.slug = 'rose'... which I think is what you're looking for.

The AND in the join conditions involving t0x.slug and t0y.slug could be moved to the WHERE clause if that's more intuitive, and the optimizer should realize that this construct us functionally equivalent.

Essentially, though, it seems to me that for each "attribute" you need to compare, you'll need to join t1/t2/t3/t0 repeatedly with different aliases each time, in order for your AND logic to do what you intend.