I am trying to delete all the duplicates but keeping single record only (shorter id).
Following query deletes duplicates but take lot of iterations to delete all copies and keeping original ones.
DELETE FROM emailTable WHERE id IN (
SELECT * FROM (
SELECT id FROM emailTable GROUP BY email HAVING ( COUNT(email) > 1 )
) AS q
)
Its MySQL.
DDL
CREATE TABLE `emailTable` (
`id` mediumint(9) NOT NULL auto_increment,
`email` varchar(200) NOT NULL default '',
PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=298872 DEFAULT CHARSET=latin1
Best Answer
Try this:
The above worked for my test of 50 emails (5 different emails duplicated 10 times).
You might need to add an index on the 'email' column:
It might be a bit slow fro 250,000 rows. It was slow for me on a table that had 1.5million rows (properly indexed), which is how I came up with this strategy:
The benefit of the memory table is there's an index that is used (primary key on minID) that speeds up the process over a normal temporary table.