MySQL – Manipulating MyISAM files directly brings down entire database

myisamMySQLmysql-5.5

Lately, we have noticed (seemingly) random slowdowns on our system, where an innocuous query like

SHOW CREATE TABLE tableName; or SHOW COLUMNS FROM tableName will take upwards of 20 minutes, being stuck in a state like waiting for table meta data lock.

Another common one is a simple select statement like SELECT * FROM someTable WHERE id=123 will take a similarly long time, being stuck in a state like sending data.

Our server is pretty busy, so something like this can easily cause us to exceed our 500 connections limit, which then makes the server crash.

After some headscratching, we have figured out that these issues seem to be coming up when the MyISAM files (*.MYI, *.MYD, *.frm) are being manipulated directly. This is a common occurence as we have several crons that move data back and forth between servers by copying the MyISAM files from one server directly to the /var/lib/mysql/dbname folder on another server as a temp table, then rename the temp table to the name of the main table, and then running a FLUSH TABLE tableName on the destination server. When the files are being manipulated directly on the command line or via a cron (ie, while a file is being copied or unzipped into the destination directory), this is when problems occur.

We have been able to test and recreate this repeatedly, where something like rming a file or even running ls can bring the system to a halt.

So, my main question is: Is this standard documented behavior? Is it always a no-no to manipulate the MyISAM files directly? (Not counting command line tools like myisamchk, but specifically standard file operations on the command line such as cp and rm). I don't remember seeing this in the documentation, and I have even seen this as a recommended way of transfering data (https://stackoverflow.com/questions/879176/how-to-recover-mysql-database-from-myd-myi-frm-files).

The secondary question is: What is the recommended way of sharing data between servers? One way would be to use mysqldump to export and mysql to import, but the reason we are moving the MyISAM directly is because it is (seemingly) a lot faster and doesn't require that extra space for dump files.

Another idea is too keep the process as is, but copy the MyISAM files to a seperate database, and then integrate them into the main database by using RENAME TABLE tempDB.tableName TO mainDB.tableName. Would that make a difference since the files being manipulated belong to another database?

Also noting that we have been doing this for over a year, but problems have been more noticeable in the last month.

EDIT: Note that the files copied from the origin server are guaranteed not to be written to while the copy takes place. That is not the case for the destination server, where the issue is happening. This question (Does a MyISAM table locks if I copy it while MySQL is running? (READ DISCLAIMER)) may be related to mine except it seems to be concerned with the data on the origin server, not the destination server.

Using MySQL 5.5.25 Planning to upgrade to 5.7 soon but can't yet

Best Answer

Is this standard documented behavior?

Is it always a no-no to manipulate the MyISAM files directly?

Yes, and yes, viz.:

normally you should not get corrupted tables unless one of the following happens:

[...]

  • Some external program is manipulating data files or index files at the same time as mysqld without locking the table properly.

http://dev.mysql.com/doc/refman/5.7/en/crashing.html

Manipulating the data files is not a standard technique.

Note that the files copied from the origin server are guaranteed not to be written to while the copy takes place

That actually doesn't matter, either... you can't safely copy from or to while the server has control of the files.

For MyISAM tables, this is the correct solution, for both source and target:

       -- obtain an exclusive lock on the table
mysql> LOCK TABLE t1 WRITE;
       -- force the server to evict the table from the open table cache,
       -- flush changes to disk, close any open filehandles, and block 
       -- other queries that try to access the table
mysql> FLUSH TABLE t1;
mysql> -- <<< wait for the previous query to return to the prompt
       -- then copy the file; this connection must remain established
       -- in order to hold the table lock.  When the copy is complete...
mysql> UNLOCK TABLES;

Note that the above is only valid for MyISAM. This does not work at all for InnoDB.

What is the recommended way of sharing data between servers

Replication is the most obvious candidate if you really need this functionality. If you need this data to feed into multiple targets, make them all slaves and the data source as master. It doesn't matter if they have slaves of their own -- any MySQL server can be both a master and slave, and this will work fine as long as there are no collisions among object names.

Federation is another solution, but it is not a magic bullet -- successfully using the FEDERATED storage engine requires an in-depth understanding of how it "thinks" so that you can outsmart it into behaving optimally.

Or simply an external process running code that connects to both servers and synchronizes the data by comparing the two tables and crafting insert/update/delete statements to bring one table into consistency with the other, when timing as isn't as critical.

Or abstracting this away so that the application doesn't need everything to be on one server.