Mysql – Large MySQL import stuck at “Waiting for table flush”

importMySQL

Two weeks ago (!) I started a large import (dump sql file is 540GB). Now the import is stuck at this state:

"show processlist" shows: Command = "Field List", State="Waiting for table flush".

Looking at the record count for a table I selected randomly, I see zero rows, whereas in the source DB there are millions of rows, so the import is clearly not done. The .ibd file for that table is tiny, whereas on the source it's several GB. For other tables this is not the case, i.e. the .ibd files are very large on the target DB.

The server shows heavy swap activity, with the load being around 5.

Following suggestions I got on this site I started the import with these commands:

set autocommit=0;
set unique_checks=0;
set foreign_key_checks=0;

…and ended the import with these commands:

commit;
set unique_checks=1;
set foreign_key_checks=1;

What is going on now? Must I wait?

Best Answer

I have always found large mysqldumps to be problematic. The best tactic I know is to reduce the size of the import. aka split the database dump into many dumps that let you run a per table restore this will let you make tangible progress and reduce the size and time each import takes.

https://github.com/kedarvj/mysqldumpsplitter

Related Solutions

MySQL Load Data Infile Slow Query Logging – How to Resolve

The query under process ID 180233 looks like it is in distress.

Here is the query itself

SELECT COUNT(DISTINCT A.`campaignid`)  INTO _c
FROM `ox_campaigns` A 
INNER JOIN `selfserving_users` B ON B.`user_id` = A.`uid`
INNER JOIN `v3_cam_date` C ON C.`campaignid` = A.`campaignid`
WHERE A.`revenue_type` = 5 AND A.`deleted` = 0
AND A.`expire` = DATE_ADD(CURRENT_DATE, INTERVAL 1 DAY) 
AND A.`isExpired` = 0
AND IF( NAME_CONST('_permitid',3) = -1, 1=1,
IF( NAME_CONST('_permitid',3) = 0, A.`uid` IN
(SELECT C.`user_id` FROM `selfserving_users` C
WHERE C.`groupid` =  NAME_CONST('_groupid',12) ) ,
A.`uid` =  NAME_CONST('userid',388)));

The scary part about the query is the self-reference

You have selfserving_users acting in a self serving manner against itself.

Sometimes, the MySQL Query Optimizer will play a bait-and-switch, smoke-and-mirrors games with data, especially with a self reference, in order to formulate the best EXPLAIN plan possible. While mysql is very capable of completing sub-SELECTs, it can be still be expensive.

However, this is just a symptom that manifested because of Process ID 97. What is really the issue here?

LOAD DATA INFILE against an InnoDB table could make mysqld a little punch drunk. I don't believe (or at least I don't exercise full confidence) you can encapsulate it as a normal transaction although this was addressed back in MySQL 5.0.

Just picture it:

You are hammering the InnoDB Buffer
Some memory swapping may be going on
Possible full table locking issues that are affecting data pages outside the v3_zone_date table (such was with the selfserving_users table)

There may be a way to throttle the LOAD DATA INFILE process on an InnoDB table. I cannot give you a solid answer on this one, but try this link from Baron Schwartz.

UPDATE 2012-02-22 12:00 EST

There is open bug report in MySQL 5.5.7 called Deadlock when DDL under LOCK TABLES WRITE, READ + PREPARE. At the bottom of the report, a person complained about a block problem cause by the explicit LOCK TABLES.

Launching a COMMIT on locked rows in a table would hang because of trying to unraveling MVCC data assocaited with the locked rows. Based on the InnoDB Status you have shown, there would exist 6933 row locks on the table you are importing. I know that in Oracle, when introducing new rows to a table, MVCC is still generated because the previous version of the newly inserted row is a nonexistent row. The same must be occurring for InnoDB.

UPDATE 2012-02-22 12:42 EDT

In your question you stated the following about your .NET process

LOCK TABLES;
SET autocommit=0;
SET unique_checks=0;
SET foreign_key_checks=0;
LOAD DATA;
COMMIT;
UNLOCK TABLES;
SET autocommit=1;
SET unique_checks=1;

All of these events are running within the same DB Session. This is also happening within one DB Connection. Thus, this is not a deadlock in the traditional sense. It is just a case of blocking your COMMIT within a given DB Connection/Session because the tables were locked within the same DB Connection/Session.

UPDATE 2012-02-23 19:00 EDT

I would change the sequence to be this:

SET autocommit=0;
SET unique_checks=0;
SET foreign_key_checks=0;
LOCK TABLES;
LOAD DATA;
UNLOCK TABLES;
COMMIT;
SET autocommit=1;
SET unique_checks=1;
SET foreign_key_checks=1;

Please remember, a COMMIT cannot proceed if you have the tables locked in serial fashion. Therefore, UNLOCK TABLES must precede COMMIT.

Mysql – Struggling to debug high CPU usage on Amazon RDS MySQL instance

Managed to solve this, these are the steps I followed:

Firstly, I contacted the Amazon RDS team by posting on their discussion forum, they confirmed it was the mysqld process taking up all this CPU - this eliminated a configuration fault with something else running on the physical server

Secondly I tracked down the source of the queries that were running:

SELECT `mytable`.* FROM `mytable` WHERE `mytable`.`foreign_key` = 231273 LIMIT 1

I originally overlooked this as the cause, because none of these queries seemed to be taking particularly long when I monitored the show processlist output. After exhausting other avenues, I decided it might be worth following up....and I'm glad I did.

As you can see in the show processlist output, these queries were coming from a utlility server, which runs some tactical utility jobs that exist outside of our main application code. This is why they were not showing up as slow or causing issues in our new relic monitoring, because the new relic agent is only installed on our main app server.

Loosely following this guide:

http://www.mysqlperformanceblog.com/2007/02/08/debugging-sleeping-connections-with-mysql/

I was able to trace these queries to a specific running process on our utility server box. This was a bit of ruby code that was very inefficiently iterating through around 70,000 records, checking some field values and using those to decide whether it needs to create a new record in 'mytable.' After doing some analysis I was able to determine, the process was no longer needed so could be killed.

Something that was making matters worse, there seemed to be 6 instances of this same process running at one time due to the way the cron job was configured and how long each one took! I killed off these processes, and incredibly our CPU usage fell from around 100% to around 5%!