How to add a column to a big table in MySQL


I'm a PHP developer, so don't be too strict. I have a big table (~5.5 GB dump). Our PM decided to add a new column to it for a new feature. The table is InnoDB, so here is what I tried:

  1. ALTER TABLE in a screen session with a table lock. It ran for ~30 hours with nothing to show, so I just stopped it. The first time I made a mistake because I hadn't ended all transactions, but the second time there was no lock conflict; the status was "copy to tmp table".

  2. Since I also need to apply partitioning to this table, we decided to make a dump, rename the old table, and create a table with the same name and the new structure. But the dump is a strict copy (at least I didn't find an option to change that), so I added the new column to the dump with sed and ran it. Then strange errors began, which I believe were caused by the charset: the table is utf-8, but the file became us-ascii after sed, and I got errors (unknown command '\'') on 30% of the data. So this is also a bad way. (A possible charset workaround is sketched after this list.)
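An aside on the charset issue in step 2: the unknown command '\'' error on import usually means the mysql client is interpreting the dump in the wrong character set and misreading backslashes inside multibyte values, rather than sed having corrupted the data itself. A minimal sketch of a re-import that pins the connection charset, assuming the edited dump is named dump.sql (a hypothetical name):

    -- Pin the connection charset so multibyte UTF-8 values are not
    -- misread by the client (equivalent to starting the client with
    -- mysql --default-character-set=utf8).
    SET NAMES utf8;

    -- Replay the edited dump from inside the mysql client.
    SOURCE dump.sql;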

What are the other options to accomplish this and improve performance? (I could do it with a PHP script, but it would take ages.) And what would the performance of INSERT ... SELECT be in this case?

Thanks for any advice.

Best Answer

Use MySQL Workbench. You can right-click a table and select "Send to SQL Editor" --> "Create Statement". This way no table "properties" will be left out (including CHARSET and COLLATE).
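If Workbench isn't at hand, SHOW CREATE TABLE returns the same complete statement from any client; a minimal sketch, with old_table standing in for the real table name:

    -- Prints the full CREATE statement, including ENGINE,
    -- CHARSET/COLLATE and any partitioning clauses.
    SHOW CREATE TABLE old_table\G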
With this huge amount of data I'd recommend cleaning up either the table or the data structure you use (a good DBA comes in handy). If that's not possible:

  • rename the table (RENAME TABLE, or ALTER TABLE ... RENAME) and create a new one with the CREATE script you get from Workbench. You can also extend that statement with the new field you need (a sketch of this step follows at the end of this answer)
  • BULK LOAD the data from the old table to the new one:
    -- Disable per-row constraint checks and autocommit for the bulk copy.
    SET FOREIGN_KEY_CHECKS = 0;
    SET UNIQUE_CHECKS = 0;
    SET AUTOCOMMIT = 0;

    -- Copy everything in a single statement.
    INSERT INTO new_table (fieldA, fieldB, fieldC, ..., fieldN)
       SELECT fieldA, fieldB, fieldC, ..., fieldN
       FROM old_table;

    -- Re-enable the checks and commit the whole copy at once.
    SET UNIQUE_CHECKS = 1;
    SET FOREIGN_KEY_CHECKS = 1;
    COMMIT;
    

    This way you avoid running indexing etc. record by record. The "update" of the table will still be slow (the amount of data is huge), but this is the fastest way I can think of.

    EDIT: Read this article for details on the commands used in the sample query above ;)
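For completeness, a minimal sketch of the rename-and-recreate step before the bulk load, assuming the live table is called big_table and the new column is new_field (both names hypothetical); the column list should be pasted from the generated CREATE statement rather than typed from memory:

    -- Renaming is a quick metadata change; the data stays in place.
    RENAME TABLE big_table TO old_table;

    -- Recreate from the generated CREATE statement, appending the
    -- new column (and a PARTITION BY clause if needed).
    CREATE TABLE new_table (
        fieldA INT NOT NULL,
        -- ... remaining columns from the generated definition ...
        new_field VARCHAR(255) NULL
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Once the bulk load has finished and been verified, new_table can be renamed to big_table the same way.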