MySQL – IMPORT TABLESPACE is hanging in the ‘System lock’ state


We have a development database server running Percona Server 5.7.15-9. It replicates two schemas from two different production servers using multi-source GTID replication. Let's call these schemas alice and bob.

On the dev server we clone these replicas of the production databases to create databases for development. They are called 1_alice, 1_bob, 2_alice, 2_bob, etc. All of them live in the same MySQL instance.
For fast cloning we use Percona XtraBackup as described here: https://www.percona.com/doc/percona-xtrabackup/2.4/innobackupex/restoring_individual_tables_ibk.html

In the past there was only one replica (alice), and we used binary log positions for replication instead of GTIDs. Back then everything worked fine and fast. At some point (I'm not sure exactly when) it broke.

Now when I run the query ALTER TABLE 2_alice.access_group IMPORT TABLESPACE, it hangs in the 'System lock' state. It can stay in this state anywhere from 1 minute to an hour or more before it finally completes. There are no other active connections apart from the two replication channels, and they don't use the 2_alice schema.

Why does the IMPORT TABLESPACE query hang, and how can I debug this?

Best Answer

I think your table must be very big. I recently ran into the same problem when I was trying to import a 700G table.

Check the error log. In my case it showed that the import was still in phase I:

InnoDB: Phase I - Update all pages
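
To pick such progress messages out of a busy error log, a small filter can help. This is a minimal sketch, not part of any MySQL tooling: the function name import_phase_lines is hypothetical, and it only assumes that progress messages contain the text "InnoDB: Phase", as in the line above. The location of the error log depends on the installation.

```python
from typing import Iterable, List


def import_phase_lines(lines: Iterable[str], marker: str = "InnoDB: Phase") -> List[str]:
    """Return the log lines that contain the import progress marker."""
    return [line.rstrip("\n") for line in lines if marker in line]


if __name__ == "__main__":
    # Sample input. A real run would iterate over the server's error log
    # file instead (its path is installation-specific).
    sample = [
        "InnoDB: Phase I - Update all pages\n",
        "some unrelated log line\n",
    ]
    for line in import_phase_lines(sample):
        print(line)
```

If the most recent matching line is still the phase I message, the import is working through the pages rather than waiting on a lock.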

Check the official documentation: https://dev.mysql.com/doc/refman/8.0/en/tablespace-copying.html

When ALTER TABLE ... IMPORT TABLESPACE is run on the destination instance, the import algorithm performs the following operations for each tablespace being imported:

Each tablespace page is checked for corruption.
The space ID and log sequence numbers (LSNs) on each page are updated.
Flags are validated and LSN updated for the header page.
Btree pages are updated.
The page state is set to dirty so that it is written to disk.

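I think updating the LSN of each page is what takes so long: every page of the tablespace has to be read, modified, and written back, so the time grows with the size of the table.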
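
To get a feel for the scale, the page count and a rough duration can be estimated from the file size. The sketch below is back-of-the-envelope arithmetic only. It assumes the default InnoDB page size of 16 KiB, reads "700G" as 700 GiB, and uses a made-up sustained throughput of 200 MiB/s; the function name estimate_import is hypothetical. Because pages are both read and later written, a real import is likely to take longer than this single-pass figure.

```python
from typing import Tuple

GIB = 1024 ** 3
MIB = 1024 ** 2


def estimate_import(size_bytes: int,
                    throughput_bytes_per_s: float,
                    page_size: int = 16 * 1024) -> Tuple[int, float]:
    """Return (number of pages, seconds for one sequential pass over the file)."""
    pages = -(-size_bytes // page_size)  # ceiling division
    seconds = size_bytes / throughput_bytes_per_s
    return pages, seconds


if __name__ == "__main__":
    # 200 MiB/s is an assumed figure, not a measurement.
    pages, seconds = estimate_import(700 * GIB, 200 * MIB)
    print(pages, round(seconds / 60))  # pages, minutes
```

With these assumed numbers the table has about 45.9 million pages, and a single pass over it takes roughly an hour.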

See also the MySQL worklog: https://dev.mysql.com/worklog/task/?id=5522

Import algorithm
================
We scan the blocks in extents and modify individual blocks rather than using 
logical index structure.

foreach page in tablespace {
  1. Check each page for corruption.

  2. Update the space id and LSN on every page  --I think this is what "InnoDB: Phase I - Update all pages" does
     * For the header page
       - Validate the flags
       - Update the LSN

  3. On Btree pages
     * Set the index id
     * Update the max trx id
     * In a cluster index, update the system columns
     * In a cluster index, update the BLOB ptr, set the space id
     * Purge delete marked records, but only if they can be easily
       removed from the page
     * Keep a counter of number of rows, ie. non-delete-marked rows
     * Keep a counter of number of delete marked rows
     * Keep a counter of number of purge failure
     * If a page is stamped with an index id that isn't in the .cfg file
       we assume it is deleted and the page can be ignored.
     * We can't tell free pages from allocated pages (for now). Therefore
       the assumption is that the free pages are either empty or are logically
       consistent. TODO: Cache the extent bitmap and check free pages.

   4. Set the page state to dirty so that it will be written to disk.
}
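
The loop above can be modelled in a few lines to show why the cost is linear in the number of pages. This is a toy model under loud assumptions: the Page class and import_pages function are hypothetical, pages are plain Python objects rather than InnoDB's on-disk format, and only steps 1, 2 and 4 are modelled (the B-tree work in step 3 is left out).

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Page:
    """Toy stand-in for a tablespace page (not the real InnoDB layout)."""
    space_id: int
    lsn: int
    corrupt: bool = False
    dirty: bool = False


def import_pages(pages: Iterable[Page], new_space_id: int, new_lsn: int) -> int:
    """Visit every page once and return how many pages were touched."""
    touched = 0
    for page in pages:
        # Step 1: check each page for corruption.
        if page.corrupt:
            raise ValueError("corrupt page found, aborting import")
        # Step 2: update the space id and LSN on every page.
        page.space_id = new_space_id
        page.lsn = new_lsn
        # Step 4: mark the page dirty so that it gets written to disk.
        page.dirty = True
        touched += 1
    return touched


if __name__ == "__main__":
    pages = [Page(space_id=1, lsn=10), Page(space_id=1, lsn=11)]
    print(import_pages(pages, new_space_id=7, new_lsn=100))
```

Every page is visited and dirtied regardless of how many rows changed, so the work scales with the size of the .ibd file rather than with the amount of live data.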

Related Solutions

MySQL – Does the Amazon RDS backup/snapshot service lock tables

If there are any MyISAM tables present, as your wording implies, the RDS documentation explains how to handle them and what to expect.

According to the RDS documentation:

Automated Backups with Unsupported Storage Engines

Amazon RDS automated backups and DB Snapshots are currently supported for only the InnoDB storage engine. Use of these features with other MySQL storage engines, including MyISAM, may lead to unreliable behavior while restoring from backups. Specifically, since storage engines like MyISAM do not support reliable crash recovery, your tables can be corrupted in the event of a crash. For this reason, we encourage you to use the InnoDB storage engine.

If you choose to use MyISAM, you can attempt to manually repair tables that become damaged after a crash by using the REPAIR command (see: http://dev.mysql.com/doc/refman/5.5/en/repair-table.html). However, as noted in the MySQL documentation, there is a good chance that you will not be able to recover all your data.

If you want to take DB snapshots with MyISAM tables, follow these steps:

Launch Process

1. Stop all activity to your MyISAM tables (that is, close all sessions).

2. Lock and flush each of your MyISAM tables.

3. Issue a CreateDBSnapshot API call, or use the RDSCLI rds-create-db-snapshot command. When the snapshot has completed, release the locks and resume activity on the MyISAM tables. These steps force MyISAM to flush data stored in memory to disk, thereby ensuring a clean start when you restore from a DB snapshot.

Finally, if you would like to convert existing MyISAM tables to InnoDB tables, you can use the alter table command (for example, alter table TABLE_NAME engine=innodb;).

Believe me, large MyISAM tables have no place in Amazon RDS. InnoDB is far more accepted. Please, either convert them to InnoDB or live with doing your own locking and possible table crashes/repairs.

Restore Oracle tablespace to its state from 2 days ago

You probably tried to restore and then recover the database. When you recover the database all the files should be in sync before you can open the database for work. The error shown tells you that some of your data files are not in sync with the rest of the database.

What you really need in order to recover a tablespace or a set of tablespaces is Tablespace Point-in-Time Recovery (TSPITR). You can do it manually or with RMAN. In the latter case the command might look like:

recover tablespace app_data
until SCN 1234567
auxiliary destination '/tmp/auxdb';

With RMAN you don't need to configure the auxiliary database manually. With user-managed TSPITR, on the other hand, you have to configure the auxiliary database first, and then use the Transportable Tablespaces feature (see the section "Introduction to Transportable Tablespaces" in the Database Administrator's Guide) to move the data files along with their metadata to the target database.

For more details, read the "RMAN Tablespace Point-in-Time Recovery (TSPITR)" and "Performing User-Managed TSPITR" sections in the Database Backup and Recovery Advanced User's Guide, and also the "Tablespace Point-in-Time Recovery" section in Concepts.
