Mysql – IMPORT TABLESPACE is hanging in the ‘System lock’ state

backupMySQLpercona-serverxtrabackup

We have a development database server with Percona-server 5.7.15-9 on it. It replicates two schemas from two different production servers using multi-source GTID replication. Lets call these schemas alice and bob.

On the dev server we clone this replicas of production databases to get databases for development. They are called 1_alice, 1_bob, 2_alice, 2_bob, etc. All of them use the same instance of MySQL.
For fast cloning we use Percona XtraBackup as described here https://www.percona.com/doc/percona-xtrabackup/2.4/innobackupex/restoring_individual_tables_ibk.html

In the past there were only one replica (alice) and we used binary log position for replication instead of GTID. These times everything worked fine and fast. One day (I'm not sure when exactly) it became broken.

Now when I perform ALTER TABLE 2_alice.access_group IMPORT TABLESPACE query, it hangs in the 'System lock' state. And could be hanging in this state from 1 min 'till 1 hour and more (then it works). There are no more active connections instead of two replicas, but they doesn't use 2_alice schema.

Why is IMPORT TABLESPACE query hanging and how could I debug this case?

Best Answer

I think your table must be very big. I have recently encountered the same problem when I trying to import a 700G table.

Check the error log, I found that the import is in progress stage 1

InnoDB: Phase I - Update all pages

Check the official documentation. https://dev.mysql.com/doc/refman/8.0/en/tablespace-copying.html

When ALTER TABLE ... IMPORT TABLESPACE is run on the destination instance, the import algorithm performs the following operations for each tablespace being imported:

  • Each tablespace page is checked for corruption.

  • The space ID and log sequence numbers (LSNs) on each page are updated

  • Flags are validated and LSN updated for the header page.

  • Btree pages are updated.

  • The page state is set to dirty so that it is written to disk.

I think it takes a lot of time to update the LSN of each page.

check the MySQL work log https://dev.mysql.com/worklog/task/?id=5522

Import algorithm
================
We scan the blocks in extents and modify individual blocks rather than using 
logical index structure.

foreach page in tablespace {
  1. Check each page for corruption.

  2. Update the space id and LSN on every page  --I think this is what "InnoDB: Phase I - Update all pages" does
     * For the header page
       - Validate the flags
       - Update the LSN

  3. On Btree pages
     * Set the index id
     * Update the max trx id
     * In a cluster index, update the system columns
     * In a cluster index, update the BLOB ptr, set the space id
     * Purge delete marked records, but only if they can be easily
       removed from the page
     * Keep a counter of number of rows, ie. non-delete-marked rows
     * Keep a counter of number of delete marked rows
     * Keep a counter of number of purge failure
     * If a page is stamped with an index id that isn't in the .cfg file
       we assume it is deleted and the page can be ignored.
     * We can't tell free pages from allocated paes (for now). Therefore
       the assumption is that the free pages are either empty or are logically
       consistent. TODO: Cache the extent bitmap and check free pages.

   4. Set the page state to dirty so that it will be written to disk.
}
Related Question