Mysql – Merging tables, but ignoring duplicate columns

join;MySQL

I have two tables I want to merge:

city1(id, city, state, zip, lat, lon, county)

city2(id, city, state)

The tables contain overlapping city and state, but not primary keys. I can't figure out how to do this. UNION won't work, nor INSERT IGNORE due to the different column numbers and primary keys. So, I guess that leaves joins(left?), but I can't figure out the syntax to get it it excluding duplicates.

A NOTE

UNION will still display duplicate city/state combinations. From what I've read, UNION does prevent duplicates if the tables have different sizes.

Table Create Statements

Column and table names have changed

CREATE TABLE `cities` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `city` varchar(50) NOT NULL,
  `state_code` char(2) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_state_code` (`state_code`)
) ENGINE=InnoDB AUTO_INCREMENT=29739 DEFAULT CHARSET=latin1

CREATE TABLE `cities_extended` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `city` varchar(50) NOT NULL,
  `state_code` char(2) NOT NULL,
  `zip` int(5) unsigned zerofill NOT NULL,
  `latitude` double NOT NULL,
  `longitude` double NOT NULL,
  `county` varchar(50) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=41756 DEFAULT CHARSET=latin1

Best Answer

Proposed Solution #1

ALTER TABLE cities          ADD UNIQUE INDEX idx_city_state (city,state_code);
ALTER TABLE cities_extended ADD UNIQUE INDEX idx_city_state (city,state_code);
CREATE TABLE cities_new
    SELECT A.city,A.state_code
    FROM cities A LEFT JOIN cities_extended B USING (city,state_code)
    WHERE B.city IS NULL
;
INSERT INTO cities_extended (city,state_code)
SELECT city,state_code FROM cities_new;
DROP TABLE cities_new;

Explanation for Proposed Solution #1

Create Unique Indexes on (city,state_code) for both tables

ALTER TABLE cities          ADD UNIQUE INDEX idx_city_state (city,state_code);
ALTER TABLE cities_extended ADD UNIQUE INDEX idx_city_state (city,state_code);

Create Temp Table with rows from cities that's not in cities_extended

CREATE TABLE cities_new
    SELECT A.city,A.state_code
    FROM cities A LEFT JOIN cities_extended B USING (city,state_code)
    WHERE B.city IS NULL
;

Load the Temp Table into cities

INSERT INTO cities_extended (city,state_code)
SELECT city,state_code FROM cities_new;

Discard the Temp Table

DROP TABLE cities_new;

Proposed Solution #2

ALTER TABLE cities_extended ADD UNIQUE INDEX idx_city_state (city,state_code);
INSERT IGNORE INTO cities_extended
(city,state_code,zip,latitude,longitude,county)
SELECT city,state_code,0,0.0,0.0,'' FROM cities;

Explanation for Proposed Solution #2

INSERT IGNORE only works with unique indexes and primary keys. In your case, a unique index on city,state_code was needed for cities_extended.

After making such an index, you can insert all columns except id.

You could also include id like this

ALTER TABLE cities_extended ADD UNIQUE INDEX idx_city_state (city,state_code);
INSERT IGNORE INTO cities_extended
(id,city,state_code,zip,latitude,longitude,county)
SELECT NULL,city,state_code,0,0.0,0.0,'' FROM cities;

GIVE IT A TRY !!!