Lately, we have noticed (seemingly) random slowdowns on our system, where an innocuous query like
SHOW CREATE TABLE tableName;
or SHOW COLUMNS FROM tableName
will take upwards of 20 minutes, being stuck in a state like waiting for table meta data lock.
Another common one is a simple select statement like SELECT * FROM someTable WHERE id=123
will take a similarly long time, being stuck in a state like sending data.
Our server is pretty busy, so something like this can easily cause us to exceed our 500 connections limit, which then makes the server crash.
After some headscratching, we have figured out that these issues seem to be coming up when the MyISAM
files (*.MYI, *.MYD, *.frm)
are being manipulated directly. This is a common occurence as we have several crons that move data back and forth between servers by copying the MyISAM files from one server directly to the /var/lib/mysql/dbname
folder on another server as a temp table, then rename the temp table to the name of the main table, and then running a FLUSH TABLE tableName
on the destination server. When the files are being manipulated directly on the command line or via a cron (ie, while a file is being copied or unzipped into the destination directory), this is when problems occur.
We have been able to test and recreate this repeatedly, where something like rm
ing a file or even running ls
can bring the system to a halt.
So, my main question is: Is this standard documented behavior? Is it always a no-no to manipulate the MyISAM files directly? (Not counting command line tools like myisamchk
, but specifically standard file operations on the command line such as cp
and rm
). I don't remember seeing this in the documentation, and I have even seen this as a recommended way of transfering data (https://stackoverflow.com/questions/879176/how-to-recover-mysql-database-from-myd-myi-frm-files).
The secondary question is: What is the recommended way of sharing data between servers? One way would be to use mysqldump
to export and mysql
to import, but the reason we are moving the MyISAM
directly is because it is (seemingly) a lot faster and doesn't require that extra space for dump files.
Another idea is too keep the process as is, but copy the MyISAM files to a seperate database, and then integrate them into the main database by using RENAME TABLE tempDB.tableName TO mainDB.tableName
. Would that make a difference since the files being manipulated belong to another database?
Also noting that we have been doing this for over a year, but problems have been more noticeable in the last month.
EDIT: Note that the files copied from the origin server are guaranteed not to be written to while the copy takes place. That is not the case for the destination server, where the issue is happening. This question (Does a MyISAM table locks if I copy it while MySQL is running? (READ DISCLAIMER)) may be related to mine except it seems to be concerned with the data on the origin server, not the destination server.
Using MySQL 5.5.25 Planning to upgrade to 5.7 soon but can't yet
Best Answer
Yes, and yes, viz.:
Manipulating the data files is not a standard technique.
That actually doesn't matter, either... you can't safely copy from or to while the server has control of the files.
For MyISAM tables, this is the correct solution, for both source and target:
Note that the above is only valid for MyISAM. This does not work at all for InnoDB.
Replication is the most obvious candidate if you really need this functionality. If you need this data to feed into multiple targets, make them all slaves and the data source as master. It doesn't matter if they have slaves of their own -- any MySQL server can be both a master and slave, and this will work fine as long as there are no collisions among object names.
Federation is another solution, but it is not a magic bullet -- successfully using the
FEDERATED
storage engine requires an in-depth understanding of how it "thinks" so that you can outsmart it into behaving optimally.Or simply an external process running code that connects to both servers and synchronizes the data by comparing the two tables and crafting insert/update/delete statements to bring one table into consistency with the other, when timing as isn't as critical.
Or abstracting this away so that the application doesn't need everything to be on one server.