Mysql – find duplicate sets of data grouped by foreign key

join;MySQL

I have a large mysql table (500,000 records). I need to find sets of data that have all the same attributes (KEY, NAME AND VALUE). Some sets have 20 attributes others have 397 per each KEY.
The table looks like this. KEY is a foreign key.

ID, KEY, NAME, VALUE
1   87   Color  Red
2   87   Size   Big
3   87   Weight 6

4   85   Color  Red
5   85   Size   Big
6   85   Weight 6

7   96   Color  Red
8   96   Size   Small
8   96   Weight 7

I'm trying to write a query where if KEY=87 it finds KEY 85 matches all attributes and values like KEY 87. There are 397 different attributes rather than just the 3 I show here. I can do this in php but it's sloppy and I want to learn more about mysql but I can't get my head around it. Appreciate any help in advance.

Best Answer

  1. In a subquery, fetch all the attributes (I cringe at EAV, too).
  2. GROUP_CONCAT(... ORDER BY ...) as foo to combine them into a single string.
  3. In the outer query, SELECT GROUP_CONCAT(ky) ... GROUP BY foo HAVING COUNT(*) > 1 to get the duplicate kys. (Please don't use the keyword "KEY".)

Give that a try. If you get in trouble, show us SHOW CREATE TABLE and your failed attempt. (Or abandon EAV.)

Some code (not quite mimicking the above prose):

SELECT `key`, CONCAT(`name`, '=', `value`) AS kv FROM tbl

should give you

key kv
87  Color=Red
87  Size=Big
...

Then...

SELECT `key`,
       GROUP_CONCAT(kv SEPARATOR '|' ORDER BY kv) AS kvs
    FROM ( -- the above query -- ) AS t1

should give you

key kvs
87  Color=Red|Size=Big|...
85  ...

Now...

SELECT kvs,
       GROUP_CONCAT(`key`) AS keys
    FROM ( -- second select -- ) AS t2
    GROUP BY kvs
    HAVING COUNT(*) > 1

should give you

kvs                           keys
Color=Red|Size=Big|Weight=6   87,85
...  (perhaps not including 96)

I suggest you debug my code the way I wrote it -- test the first query to see if it gives you the first set of output, then second, then third.