Mysql – Slow query with JOIN

join;MySQLoptimizationperformancequery-performancestored-procedures

I have a stored procedure which runs pretty slow (6 sec) using the following JOIN:

JOIN tariffs t ON 
(LEFT(`cdrs`.`cnumber`,7) = t.numberrange 
OR LEFT(`cdrs`.`cnumber`,8) = t.numberrange)

Without the above JOIN the query runs at 1.5 sec.
Any way how to improve the performance? Full stored procedure below:

CREATE DEFINER=`xxx`@`%` PROCEDURE `GetHourStats`(IN _ID INT, IN _YEAR INT, IN _MONTH INT, IN _DAY INT)
BEGIN

set @_START = UNIX_TIMESTAMP(date(_YEAR * 10000 + _MONTH * 100 + _DAY * 1)); 
set @_END = UNIX_TIMESTAMP(date_add(date(_YEAR * 10000 + _MONTH * 100 + _DAY * 1), interval 1 day));

SELECT h.idhour, h.`hour` as 'hour', innumber, count(*) as `count`, sum(talktime) as `duration` FROM (
     SELECT 
        `cdrs`.`dcustomer` AS `dcustomer`,
        (CASE
            WHEN (LEFT(`cdrs`.`cnumber`, 2) = "01" OR LEFT(`cdrs`.`cnumber`, 2) = "02") THEN '01-02'
            WHEN (LEFT(`cdrs`.`cnumber`, 2) = "03") THEN '03'
            WHEN (LEFT(`cdrs`.`cnumber`, 2) = "05") THEN '05'
            WHEN (LEFT(`cdrs`.`cnumber`, 2) = "06") THEN '06'
            WHEN (LEFT(`cdrs`.`cnumber`, 2) = "07") THEN '07'
            WHEN (LEFT(`cdrs`.`cnumber`, 3) = "080") THEN '080'
            WHEN (LEFT(`cdrs`.`cnumber`, 3) = "084") THEN '084'
            WHEN (LEFT(`cdrs`.`cnumber`, 3) = "087") THEN '087'
            WHEN (LEFT(`cdrs`.`cnumber`, 2) = "09") THEN '09'
        END) AS 'innumber',
        FROM_UNIXTIME(`cdrs`.`start`) AS `start`,
         (`cdrs`.`end` - `cdrs`.`start`) AS `duration`,
         `cdrs`.`cnumber` AS `calling`,
         `cdrs`.`talktime` AS `talktime`
    FROM `cdrs`
    JOIN tariffs t ON (LEFT(`cdrs`.`cnumber`,7) = t.numberrange OR LEFT(`cdrs`.`cnumber`,8) = t.numberrange)
    WHERE `cdrs`.`start` >= @_START and `cdrs`.`start` < @_END
    AND `cdrs`.`stype` = _LATIN1'external'
    AND `cdrs`.`talktime` >= 5 
    AND `cdrs`.`status` = 'answer'
    AND CHAR_LENGTH(`cdrs`.`cnumber`) = 11
    GROUP BY callid
   ) cdr 

   JOIN customers c ON c.id = cdr.dcustomer
   LEFT JOIN hub.hours h ON HOUR(cdr.`start`) = h.idhour

    WHERE (c.parent = _ID or cdr.dcustomer = _ID or c.parent IN 
        (SELECT id FROM customers WHERE parent = _ID))

   GROUP BY h.idhour, cdr.innumber
   ORDER BY h.idhour;

END

Q: How can I make the above stored procedure run faster?

Best Answer

I agree with the comment about table definitions, but in a few moments you will see this does not depend on the tables.

You use SELECT from derived table cdr for JOIN and for WHERE

JOIN customers c ON c.id = cdr.dcustomer
   LEFT JOIN hub.hours h ON HOUR(cdr.`start`) = h.idhour

    WHERE (c.parent = _ID or cdr.dcustomer = _ID or c.parent IN 
        (SELECT id FROM customers WHERE parent = _ID))

Derived tables do not have indexes and so these operations always will be full-scan. It may take 6 seconds. That is not too much.

The other JOIN uses derived columns:

FROM `cdrs`
    JOIN tariffs t ON (LEFT(`cdrs`.`cnumber`,7) = t.numberrange OR LEFT(`cdrs`.`cnumber`,8) = t.numberrange)

This is not the best idea.

Again, group by does not use indexed data:

GROUP BY h.idhour, cdr.innumber

cdr - does not have indexes.

Possibly faster will be to store intermediate data in a table with indexes.