MySQL Performance – Slow Query on Large Table Without Unique Keys

indexmariadbMySQL

I have a table that has approximately 20 million records and I'm having a very hard time improving the performance of the following query. It's taking well over a minute to run.

SELECT  a.batch_date, a.batch_number, sum(a.gross_amt) AS gross_amount,
        (sum(a.trans_fee)*-1) as trans_fee, sum(a.net_amt) AS net_amount
    FROM  cc_detail a
    WHERE  a.company_id='1'
      AND  a.account_number='999999'
      AND  (a.batch_date BETWEEN '2015-10-01' AND '2016-01-31')
    GROUP BY  a.batch_date, a.batch_number
    ORDER BY  a.batch_date DESC, a.batch_number
    LIMIT  0, 10

Here is the table structure:

CREATE TABLE `cc_detail` ( 
    `company_id` int(11) NOT NULL, 
    `account_number` int(12) NOT NULL, 
    `location_number` int(12) NOT NULL, 
    `batch_date` date NOT NULL, 
    `batch_number` varchar(10) COLLATE utf8_unicode_ci NOT NULL, 
    `gross_amt` decimal(15,6) NOT NULL, 
    `trans_fee` decimal(15,6) NOT NULL, 
    `net_amt` decimal(15,6) NOT NULL, 
    KEY `company_id` (`company_id`), 
    KEY `account_number` (`account_number`),
    KEY `location_number` (`location_number`),  
    KEY `batch_date` (`batch_date`),
    KEY `batch_number` (`batch_number`), 
    KEY `ca` (`company_id`,`account_number`), 
    KEY `cab` (`company_id`,`account_number`,`batch_date`), 
    KEY `cb` (`company_id`,`batch_date`) USING BTREE 
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

The server has 3.5 GB RAM and is running MariaDB 10.1. One issue with this table is that it does not have a primary or unique key. Due to the source of this data not having a unique key, I can not create one (unless someone has suggestions on ho to do this safely). This table is updated multiple times a day by deleting records within a date range and inserting data within that same date range.

Here is the result of the Explain statement using the above query:

id: 1
select_type: SIMPLE
table: a
type: ref
possible_keys: company_id,account_number,batch_date,ca,cab,cb
key: ca
key_len: 8
ref: const, const
rows: 1
Extra: Using where; Using temporary; Using filesort

Best Answer

INDEX(a) is unnecessary when you also have INDEX(a,b). So get rid of these:

KEY `company_id` (`company_id`),
KEY `ca` (`company_id`,`account_number`), 

ca is a good index, but cab is better; removing ca will encourage the optimizer to use cab. How many rows are in the result set?

Was the SELECT slow during one of those big DELETE + INSERT times? If so, please provide those queries for further discussion.