Mysql – How to normalize data en-masse

foreign keyMySQLnormalization

Goal:

Extract address fields from a users table into a separate addresses table
linked by foreign key.

I have created the structure below, and have millions of rows to deal with. I
need to do this once, but relatively quickly (ie, I can't loop over every row
in my Rails code)

The query below does work to extract the address fields into new rows in the
addresses table. But I can't figure out how to link the new rows back to the
users table that the data came out of.

Structure:

Before:

User
  login
  address_line_one
  address_line_two
  address_city
  address_state
  address_zip

After:

User
  login
  address_id

Address
  id
  line_one
  line_two
  city
  state
  postal_code
  country

What I have so far:

INSERT INTO addresses (line_one, line_two, city, state, postal_code, country)
SELECT address_line_one,
  address_line_two,
  address_city,
  address_state,
  address_zip AS zip,
  'US'
FROM users
WHERE
  address_line_one IS NOT NULL ||
  address_line_two IS NOT NULL ||
  address_city     IS NOT NULL ||
  address_state    IS NOT NULL ||
  address_zip      IS NOT NULL

PROBLEM

I don't know how to get the address_id back into the users table

Best Answer

This answer assumes that users.login is unique.

  • First, alter the users table to include an address_id column.

  • Next, if you add a addresses.login column you would then be able to do the following query after the insert into addresses:

    UPDATE users, addresses SET users.address_id = addresses.address_id 
    WHERE users.login = addresses.login
    
  • Finally drop the addresses.login column

An alternative would be to use an AFTER INSERT trigger on addresses to update the users table based on the unique address entries. Unfortunately, this assumes that all the fields of addresses combined represent a unique users.login, which for example would be invalid if a married couple each had logins but shared the same address.