I have a big performance problem beetwen my local mysql instance and the server.
I have this request :
DELETE a
FROM a
LEFT OUTER JOIN
( SELECT MAX(id) as id,
`v_id`,
`v_statut`,
`c_controle`,
`c_d_controle`
FROM `a`
GROUP BY `v_id`,
`v_statut`,
`c_controle`,
`c_d_controle`
) AS t1 ON `a`.id = t1.id
WHERE t1.id IS NULL;
The purpose of the request is to remove duplicate identified by the GROUP BY. I have already created key for the group by.
In my local server (5.6.17) this request is about 15 seconds on 700 000 lines, on the server(5.5.43) this request is endless.
Is there any mysql tuning or SQL request optimization that can help me.
Thanks.
CREATE TABLE :
Create Table"
temp_import_vigipresse "CREATE TABLE `temp_import_vigipresse` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`date_creation` datetime DEFAULT NULL,
`vigipresse_id` int(11) NOT NULL,
`vigipresse_tele_id` varchar(8) DEFAULT NULL,
`inutile_date` date DEFAULT NULL,
`vigipresse_statut_depot` tinyint(4) NOT NULL,
`vigipresse_commentaire` varchar(255) DEFAULT NULL,
`vigipresse_type_parution` tinyint(4) DEFAULT NULL,
`vigipresse_admission_id` int(11) DEFAULT NULL,
`date_depot` datetime DEFAULT NULL,
`etablissement_code_regate` varchar(6) DEFAULT NULL,
`etat_date_film` tinyint(4) DEFAULT NULL,
`date_film` date DEFAULT NULL,
`vigipresse_code_hld` tinyint(4) DEFAULT NULL,
`vigipresse_lot_stp` tinyint(4) DEFAULT NULL,
`code_statut_resultat` tinyint(4) DEFAULT NULL,
`commentaire_resultat_controle` varchar(255) DEFAULT NULL,
`valeur_resultat_controle` varchar(255) DEFAULT NULL,
`code_critere_controle` tinyint(4) NOT NULL,
`code_departement_postal_controle` varchar(2) DEFAULT NULL,
`numero_oqs_saisi` smallint(6) DEFAULT NULL,
`vigipresse_numero_parution_constate` int(11) DEFAULT NULL,
`inutile_numero_parution` varchar(255) DEFAULT NULL,
`vigipresse_type_parution_declare` tinyint(4) DEFAULT NULL,
`vigipresse_date_depot_prevue` date DEFAULT NULL,
`vigipresse_quantite_articles` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `vigipresse_id_statut_code`
(`vigipresse_id`,`vigipresse_statut_depot`,`code_critere_controle`),
KEY `vigipresse_id_statut` (`vigipresse_id`,`vigipresse_statut_depot`),
KEY `vigi_unique`
(`vigipresse_id`,
`vigipresse_statut_depot`,
`code_critere_controle`,
`code_departement_postal_controle`)
) ENGINE=MyISAM AUTO_INCREMENT=730705 DEFAULT CHARSET=utf8"
EXPLAIN LOCAL (5.6):
id,select_type,table,type,possible_keys,key,key_len,ref,rows,Extra
1,PRIMARY,temp_import_vigipresse,ALL,NULL,NULL,NULL,NULL,228470,NULL
1,PRIMARY,,ref,,,5,qspresse.temp_import_vigipresse.id,10,"Using where"
2,DERIVED,temp_import_vigipresse,ALL,vigi_unique,NULL,NULL,NULL,228470,"Using temporary; Using filesort"
EXPLAIN SERVER (5.5)
Endless request
Best Answer
EXPLAIN
theSELECT
subquery. (This will workaround the point that jynus brought up.)Your KEY
vigi_unique
can be made somewhat better by addingid
onto the end of it. This will change the EXPLAIN to say "Using index" because it will be a "covering index".But the real reason for 5.6 to be faster is ...
Suggest using
EXPLAIN FORMAT=JSON SELECT ...
on 5.6 to see a little more.You should consider moving to InnoDB.
Another way to "dedup" is to create a new table (with a UNIQUE index to prevent future dups), populate it with
INSERT ... SELECT DISTINCT ...
, thenRENAME TABLE
to move it into place.