MySQL hangs in killed state

hangmyisamMySQLoptimization

I have a strange situation where a mysql OPTIMIZE query will never end. After killinh it, it remains in the "killed" state.

Scenario to reproduce:

OS: CentOS 6.5
MySQL: 5.6
Engine: MyISAM (with at least one BLOB/LONGTEXT column)
mysql config 'myisam_repair_threads' = 2

I have a table with 40MB of data and 1500 records. With myisam_repair_threads set to 1, OPTIMIZE completes within a second. Settings myisam_repair_threads to 2 takes forever to complete. After killing the query, it remains in the 'killed' state (at moment of writing already 3 hours). Seems like this is a bug in mysql.

However, the big question: How can I force to kill this query? It currently holds a lock on the meta data of the table, so some processes can't execute because they are waiting for this lock.

Any advice is appreciated!

Best Answer

From http://www.dbasquare.com/2012/05/15/why-do-threads-sometimes-stay-in-killed-state-in-mysql/:

What actually happens when you run KILL?

The command sytnax is KILL [QUERY | CONNECTION] , where thread_id is the value from the first column of the process list output. The optional argument determines whether only running query or should the entire session be terminated. It defaults to the latter, so specifying CONNECTION is not required.

Running the command doesn’t actually do anything except for setting a special flag inside the selected thread. Therefore the kill operation doesn’t happen synchronously with the corresponding request. In many cases it takes some time for a thread or a query to stop. The flag is checked at various stages of statement execution. When it happens exactly, or how frequently, depends on the work a thread is actually doing.

...

No matter what the real cause is, it is impossible to get rid of such hanging threads without a database restart. They should, however, simply be allowed to go away on their own.

There is also no way of forcing a kill that would execute instantly like kill -9 in Unix systems.

Related Solutions

Mysql – Too Many database connections on Amazon RDS

Here is a Stored Procedure to kill long running SELECTs

DELIMITER $$

DROP PROCEDURE IF EXISTS `test`.`Kill_Long_Running_Selects` $$
CREATE PROCEDURE `test`.`Kill_Long_Running_Selects` (time_limit INT,display INT)
BEGIN

    DECLARE ndx,lastndx INT;

    DROP TABLE IF EXISTS test.LongRunningSelects;
    CREATE TABLE test.LongRunningSelects
    (
        id INT NOT NULL AUTO_INCREMENT,
        idtokill BIGINT,
        PRIMARY KEY (id)
    ) ENGINE=MEMORY;
    INSERT INTO test.LongRunningSelects (idtokill)
    SELECT id FROM information_schema.processlist
    WHERE user<>'system user' AND info regexp '^SELECT' AND time > time_limit;

    SELECT COUNT(1) INTO lastndx FROM test.LongRunningSelects;
    SET ndx = 0;
    WHILE ndx < lastndx DO
        SET ndx = ndx + 1;
        SELECT idtokill INTO @kill_id
        FROM test.LongRunningSelects WHERE id = ndx;
        CALL mysql.rds_kill(@kill_id);
    END WHILE;

    IF lastndx > 0 THEN
        IF display = 1 THEN
            SELECT GROUP_CONCAT(idtokill) INTO @idlist FROM test.LongRunningSelects;
            SELECT @idlist IDs_KIlled;
            SELECT CONCAT('Processes Killed : ',lastndx) Kill_Long_Running_Selects;
        END IF;
    END IF;

END $$

To kill SELECTs running longer than 30 seconds, you run this

CALL test.Kill_Long_Running_Selects(30,0);

If you want to see the connections being killed, you run this

CALL test.Kill_Long_Running_Selects(30,1);

Perhaps you can create a MySQL Event to call this Stored Procedure every minute.

If Amazon does not let you have the EVENT privilege, you will have to write an external shell script on the EC2 server to connect to the DB and run the Stored Procedure. That shell script can be put into a crontab.

If Amazon does not let you have the PROCESS and SUPER privileges, you may need to move the DB out of RDS and into another EC2 instance running MySQL to accomplish this. You could then create the MySQL Event without Amazon's hosting restrictions.

MySQL, optimize and replication

Your First Question

why does a OPTIMIZE query block the replication catching up ?

The SQL thread in MySQL Replication processes SQL commands from the Relay Logs as a FIFO queue (i.e. first-come, first-server basis). Once the SQL thread see a command, it processes it to completion. Meanwhile, the Seconds_Behind_Master just keeps creeping up.

While the SQL thread on the Slave is running OPTIMIZE TABLE, once you run STOP SLAVE; it SQL thread has complete the running OPTIMIZE TABLE before the SQL thread can terminate. That's why STOP SLAVE; hangs. To verify this, open another mysql session and run SHOW PROCESSLIST;, You should see a line that has OPTIMIZE TABLE with the status Killing slave....

Your Second Question

can I prevent those OPTIMIZE queries ? I didn't put any cron to do that, and as far as I can see, nothing in /etc/cron.{d,daily,hourly,weekly} does this.

Someone or something is running the OPTIMIZE TABLE. You have to hunt it down in all your jobs or PHP apps. Use the mysqlbinlog utility to dumps all the binary logs into a text file. Then, grep the text file for OPTIMIZE TABLE or optimize table. The timestamp of when it was submitted into the binlogs will be a few lines about the command.

When you do find that OPTIMIZE TABLE command in your jobs or PHP code, simply run this:

SET SQL_LOG_BIN=0; OPTIMIZE TABLE tblname; SET SQL_LOG_BIN=1;

This will not register the OPTIMIZE TABLE in the binlogs, thus preventing a Slave from replicating it.

As was just pointed out in the comments below, You can also replace

OPTIMIZE TABLE tblname;

with one of the following:

OPTIMIZE NO_WRITE_TO_BINLOG TABLE tblname;
OPTIMIZE LOCAL TABLE tblname;

to restrict the OPTIMIZE TABLE command from landing in the binlogs and thus prevent replication of it.

Best Answer

Related Solutions

Mysql – Too Many database connections on Amazon RDS

MySQL, optimize and replication

Related Question