Mysql – Selecting rows from one table not in another table

MySQL

I have two tables

table_a_mod (2 million rows)
table_a (2 million rows + new live data)

table_a_mod is identical to table table_a except for two extra columns

table_a is a "live" table with data being written to it semi-regularly.

To keep table_a_mod up to date I would like to copy data from table_a to table_a_mod

By using a regular join, I am able to select the data:

SELECT *
FROM table_a t1
LEFT JOIN table_a_mod t2
USING(id)
WHERE t2.id IS NULL

However, this query takes about 1 minute to run.

Is there a more efficient query along the lines of:

SELECT  *
FROM table_a t1
LEFT JOIN 
(
  SELECT id 
  FROM table_a_mod
  ORDER BY id DESC
  LIMIT 1
) t2
USING (id)
WHERE t1.id > t2.id

ie. Get the largest id from table t2. Select all rows from table t1 with id greater than that number.

I've tried different permutations of the above query but I'm falling short.

Edit 1

In response to @danblack

EXPLAIN {slow_query}

"1" "SIMPLE"    "table_a"   "ALL"   \N  \N  \N  \N  "2353123"   ""
"1" "SIMPLE"    "table_a_mod"   "eq_ref"    "PRIMARY"   "PRIMARY"   "4" "db.table_a.id" "1" "Using where; Not exists"

I'm assuming id is a primary key in both

Yes. id is a primary key in both.

Are the tables innodb and is there sufficient innodb_buffer_pool_size to hold both tables?

Both tables are innodb. And innodb_buffer_pool_size is 8388608

Is "new live data" always of an id above the previous table?

Yes, because it's auto incrementing

Why are you keeping two nearly identical tables

It is a temporary situation. We will drop the older one and use the newer one.

Best Answer

How much RAM do you have? If you have more than 4GB, set innodb_buffer_pool_size to 70% of available RAM. This is the most important thing to tune. With it set to only 8M, you are definitely I/O-bound. Increasing the size will probably shrink the time to under 10 seconds.

Your second idea should be good (assuming the only new rows have higher ids), but let's get rid of the LEFT and use MAX:

SELECT  *
    FROM  table_a t1
    JOIN  
        ( SELECT  MAX(id) AS max_id
            FROM  table_a_mod ) t2
    WHERE  t1.id > t2.max_id