MySQL – How to optimize a SQL query on a large dataset

index-tuning, MySQL, performance, query-performance

I have two tables in MySQL with the following schema:

CREATE TABLE `open_log` (
  `delivery_id` varchar(30) DEFAULT NULL,
  `email_id` varchar(50) DEFAULT NULL,
  `email_activity` varchar(30) DEFAULT NULL,
  `click_url` text,
  `email_code` varchar(30) DEFAULT NULL,
  `on_date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=latin1;


CREATE TABLE `sent_log` (
  `email_id` varchar(50) DEFAULT NULL,
  `delivery_id` varchar(50) DEFAULT NULL,
  `email_code` varchar(50) DEFAULT NULL,
  `delivery_status` varchar(50) DEFAULT NULL,
  `tries` int(11) DEFAULT NULL,
  `creation_ts` varchar(50) DEFAULT NULL,
  `creation_dt` varchar(50) DEFAULT NULL,
  `on_date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB  DEFAULT CHARSET=latin1;

The email_id and delivery_id columns in both tables make up a unique key.

The open_log table has 2.5 million records, whereas the sent_log table has 0.25 million records.

I want to filter the records in the open_log table based on the unique key (email_id and delivery_id), keeping only those that also appear in sent_log.

I'm writing the following query.

SELECT * FROM open_log
WHERE CONCAT(email_id,'^',delivery_id)
IN (
SELECT DISTINCT CONCAT(email_id,'^',delivery_id) FROM sent_log
)

The problem is that the query takes too long to execute. I waited an hour for it to complete, but it never finished.

I've tried making email_id and delivery_id a composite key, but that didn't help.

Kindly suggest what I can do to make it faster, given the large amount of data in the tables.

Thanks,
Faisal Nasir

Best Answer

  • First, if email_id and delivery_id together form a unique key, add a composite primary key on (email_id, delivery_id) to both tables.
  • Second, the CONCAT is unnecessary, and because it wraps the columns in an expression, it prevents that key from being used. Try:

    SELECT ol.* 
    FROM open_log ol 
    JOIN sent_log sl 
    ON (ol.email_id, ol.delivery_id) = (sl.email_id, sl.delivery_id)
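
As a rough sketch of the first point, the composite keys could be added along the following lines. This assumes that neither table contains NULLs or duplicate (email_id, delivery_id) pairs, since a primary key requires both. If that does not hold, a plain composite index on the same columns is a possible fallback. The index name idx_email_delivery is hypothetical and not from the question.

```sql
-- Assumes (email_id, delivery_id) is unique and never NULL in both tables.
ALTER TABLE sent_log
  MODIFY email_id    VARCHAR(50) NOT NULL,
  MODIFY delivery_id VARCHAR(50) NOT NULL,
  ADD PRIMARY KEY (email_id, delivery_id);

ALTER TABLE open_log
  MODIFY email_id    VARCHAR(50) NOT NULL,
  MODIFY delivery_id VARCHAR(30) NOT NULL,
  ADD PRIMARY KEY (email_id, delivery_id);

-- Fallback if the pairs turn out not to be unique: a non-unique composite index.
-- (idx_email_delivery is a hypothetical name.)
-- ALTER TABLE open_log ADD INDEX idx_email_delivery (email_id, delivery_id);

-- The same join with the column comparisons spelled out,
-- prefixed with EXPLAIN to inspect the chosen plan.
EXPLAIN
SELECT ol.*
FROM open_log ol
JOIN sent_log sl
  ON  ol.email_id    = sl.email_id
  AND ol.delivery_id = sl.delivery_id;
```

If the key is being used, the `key` column of the EXPLAIN output should name it for at least one of the tables, rather than showing a full scan of both.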