Mariadb – How to optimize the execution of query that takes too long

index-tuningmariadbmariadb-10.1

I am working with a query that is taking around 53 minutes to execute, which I feel is excessive.

Maria DB version: Server version: 10.1.38-MariaDB-1~stretch mariadb.org

The declaration of the relevant base table is as follows:

CREATE TABLE `callinfo` (
  `uniqueid` varchar(60) NOT NULL DEFAULT '',
  `customer_id` int(11) DEFAULT '0',
  `type` tinyint(1) NOT NULL DEFAULT '0',
  `callerid` varchar(30) NOT NULL,
  `sip_from_user` varchar(10) NOT NULL,
  `sip_from_host` varchar(15) NOT NULL,
  `sip_from_uri` varchar(100) NOT NULL,
  `sip_via_host` varchar(20) NOT NULL,
  `callednum` varchar(30) NOT NULL DEFAULT '',
  `billseconds` smallint(6) NOT NULL DEFAULT '0',
  `trunk_id` smallint(6) NOT NULL DEFAULT '0',
  `trunkip` varchar(15) NOT NULL DEFAULT '',
  `callerip` varchar(15) NOT NULL DEFAULT '',
  `disposition` varchar(45) NOT NULL DEFAULT '',
  `date` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `debit` decimal(20,6) NOT NULL DEFAULT '0.000000',
  `cost` decimal(20,6) NOT NULL DEFAULT '0.000000',
  `provider_id` int(11) NOT NULL DEFAULT '0',
  `pricelist_id` smallint(6) NOT NULL DEFAULT '0',
  `package_id` int(11) NOT NULL DEFAULT '0',
  `pattern` varchar(20) NOT NULL,
  `notes` varchar(80) NOT NULL,
  `invoiceid` int(11) NOT NULL DEFAULT '0',
  `rate_cost` decimal(20,6) NOT NULL DEFAULT '0.000000',
  `reseller_id` int(11) NOT NULL DEFAULT '0',
  `reseller_code` varchar(20) NOT NULL,
  `reseller_code_destination` varchar(80) DEFAULT NULL,
  `reseller_cost` decimal(20,6) NOT NULL DEFAULT '0.000000',
  `provider_code` varchar(20) NOT NULL,
  `provider_code_destination` varchar(80) NOT NULL,
  `provider_cost` decimal(20,6) NOT NULL DEFAULT '0.000000',
  `provider_call_cost` decimal(20,6) NOT NULL,
  `call_direction` enum('outbound','inbound') NOT NULL,
  `calltype` enum('STANDARD','DID','FREE','CALLINGCARD') NOT NULL DEFAULT 'STANDARD',
  `profile_start_stamp` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `answer_stamp` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `bridge_stamp` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `progress_stamp` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `progress_media_stamp` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `end_stamp` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `billmsec` int(11) NOT NULL DEFAULT '0',
  `answermsec` int(11) NOT NULL DEFAULT '0',
  `waitmsec` int(11) NOT NULL DEFAULT '0',
  `progress_mediamsec` int(11) NOT NULL DEFAULT '0',
  `flow_billmsec` int(11) NOT NULL DEFAULT '0',
  `is_recording` tinyint(1) NOT NULL DEFAULT '1' COMMENT '0 for On,1 for Off',
  `call_reward_amount` decimal(20,6) NOT NULL DEFAULT '0.000000',
  UNIQUE KEY `uniqueid` (`uniqueid`),
  KEY `customer_id` (`customer_id`),
  KEY `pattern` (`pattern`,`notes`),
  KEY `callerid` (`callerid`),
  KEY `sipuri` (`sip_from_uri`),
  KEY `call_direction` (`call_direction`),
  KEY `callednum` (`callednum`),
  KEY `date` (`date`) USING BTREE,
  KEY `group_by` (`date`,`customer_id`,`pattern`,`notes`,`callerid`,`sip_from_uri`,`call_direction`,`callednum`) USING BTREE,
  KEY `date_user_callednum` (`date`,`customer_id`,`callerid`,`callednum`) USING BTREE,
  KEY `date_callednum` (`date`,`callednum`,`callerid`,`sip_from_uri`) USING BTREE,
  KEY `date_user` (`date`,`customer_id`,`callerid`,`sip_from_uri`) USING BTREE,
  KEY `date_direction` (`date`,`call_direction`,`callednum`) USING BTREE,
  KEY `callinfo_report` (`date`,`type`,`reseller_id`) USING BTREE,
  KEY `select_callinfo_report` (`uniqueid`,`billseconds`,`calltype`,`debit`,`cost`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='callinfo';

This table contains over 10 million rows with ~2k added every hour.

The query at issue is the one below:

SELECT customer_id, COUNT(uniqueid) AS attempts, 
              AVG(billseconds) AS acd, MAX(billseconds) AS mcd,
              SUM(billseconds) AS duration, 
              SUM(CASE WHEN calltype !='free' THEN billseconds ELSE 0 END) as billable, 
              SUM(CASE WHEN billseconds > 0 THEN 1 ELSE 0 END) as completed, 
              SUM(debit) AS debit, SUM(cost) AS cost, 
               (SUM(debit)-SUM(cost)) as profit, 
              round(SUM(CASE WHEN billseconds > 0 THEN 1 ELSE 0 END)/COUNT(uniqueid), 2) AS asr 
    FROM (`callinfo`) 
    WHERE `date` >=  '2017-03-01 00:00:00' 
      AND `date` <=  '2019-04-01 23:59:59' 
      AND `reseller_id` =  0 
      AND `type` IN ('0', '3')  
    GROUP BY `customer_id` 
    ORDER BY `customer_id` desc 
    LIMIT 10;

I know, I have set up too much indexes here, but I am not sure which one is working.

Below is explain extended output for the above query:

+------+-------------+-------+-------+--------------------------------------------------------------------------------------------+-------------+---------+------+--------+----------+-------------+
| id   | select_type | table | type  | possible_keys                                                                              | key         | key_len | ref  | rows   | filtered | Extra       |
+------+-------------+-------+-------+--------------------------------------------------------------------------------------------+-------------+---------+------+--------+----------+-------------+
|    1 | SIMPLE      | cdrs  | index | date,group_by,date_user_callednum,date_callednum,date_user,date_direction,callinfo_report  | customer_id | 7       | NULL | 690765 |   100.00 | Using where |
+------+-------------+-------+-------+--------------------------------------------------------------------------------------------+-------------+---------+------+--------+----------+-------------+

Does anyone like to point out any root cause for the slow execution?

Best Answer

You need an index like (customer_id, reseller_id, type, date) because those columns are used for result restriction, grouping and sorting. Keep in mind that the same query with different timeranges and/or type lists can require the different index.