To remove duplicate rows in a table I did the following (about 200k rows)
table1:
id_a int
id_b int
val_a int
val_b int
pk (id_a, id_b), index id_a, index id_b
Algorithm:
created table2 as select from table1;
drop table1;
created table1 as select from table2 (with removed dublicates);
created indexes for table1;
drop table2 ;
After that, mysqld
started to use 40-140% CPU at VPS with any query that uses that table (0% idle). Strange, that CPU usage increased even if I query another tables, that were not changed. It all started after those changes.
SHOW PROCESSLIST
shows only "sleep" and "show processlist".
I've tried to run mysqlcheck
and tried to restart mysql-server – no errors are reported.
I enabled slow query logs, and they appeared there. Even those, which ran fast one day ago.
What could be wrong and what else can I do to diagnose the cause?
Best Answer
First thing I noticed is that you have a redundant index.
This is your original table1:
You do not need the index on id_a since it is already the first column of the primary key. That's 200k rows less of index entries to process. Just run this to get rid of it:
Another step you will need is to reload the table and disable non-unique indexes while reloading (This is MyISAM only). In addition, you need a cleaner algorithm. I have two algorithms. Here is the first one:
ALGORITHM #1
Since
DISABLE KEYS
leave the primary key open for updating, the primary key is loaded during theINSERT INTO table2
. TheENABLE KEYS
step rebuilds the id_b index.Now, here is the second algorithm:
ALGORITHM #2
A whole lot shorter, isn't it ? The second algorithm (using OPTIMIZE TABLE) does everything the first did plus recompute index statistics, do internal table repairs, and index reorganization.
UPDATE 2011-09-07 17:38 EDT
The reason these two algorithms should improve things is the fact MySQL indexes are now processed linearly.