I asked this on StackOverflow, but there are not many hints. So:
SHOW CREATE TABLE cashadv_txn
:
CREATE TABLE `cashadv_txn` (
`txn_id` int(11) NOT NULL AUTO_INCREMENT,
`cashadv_id` int(11) NOT NULL,
`txn_type` enum('LI','MI','CI','LD','MD','LDA','SRDA','SRD','NRF') DEFAULT NULL,
`datetime` datetime NOT NULL,
`amount` float(20,2) NOT NULL,
`status` varchar(16) DEFAULT NULL,
`usaepay_invoice_id` int(11) DEFAULT NULL,
`link_uid` int(11) DEFAULT NULL,
`link_date` date DEFAULT NULL,
`link_txn_id` int(11) DEFAULT NULL,
PRIMARY KEY (`txn_id`),
KEY `link_txn_id` (`link_txn_id`),
KEY `type_date_uid` (`cashadv_id`,`txn_type`,`link_date`,`link_uid`),
KEY `usaepay_invoice_id` (`usaepay_invoice_id`),
KEY `idx1` (`cashadv_id`,`link_uid`)
) ENGINE=InnoDB AUTO_INCREMENT=4586015 DEFAULT CHARSET=utf8
relevant temporary table variables. I know this is an overkill, but I can reduce it later. This is just to show that they are high enough:
SHOW GLOBAL VARIABLES LIKE 'max_heap_table_size';
SHOW GLOBAL VARIABLES LIKE 'tmp_table_size' ;
This query takes around 0.8-1 seconds:
SELECT SQL_NO_CACHE
link_uid AS lender_uid,
cashadv_id,
SUM(amount) AS lenderTotalCredit
FROM
cashadv_txn
WHERE
cashadv_txn.txn_type='LDA'
GROUP BY
cashadv_id,
link_uid;
However, this query (which is in fact the previous one saving results into a temporary table) takes 40 seconds:
CREATE TEMPORARY TABLE IF NOT EXISTS tLenderTotalCredits ENGINE=MEMORY AS (SELECT
link_uid AS lender_uid,
cashadv_id,
SUM(amount) AS lenderTotalCredit
FROM
cashadv_txn
WHERE
cashadv_txn.txn_type='LDA'
GROUP BY
cashadv_id,
link_uid
);
SELECT @@version
: 5.6.22-1+deb.sury.org~precise+1
UPDATE: EXPLAIN
of the SELECT
statement:
You can see all the indexes above in SHOW CREATE TABLE
UPDATE: It was my mistake in the measurements, actually. I'm using SQLyog for query debugging and it limits by default by 1000 records. I'm aware of it and tested it with limit rows option unchecked. The problem is: on the first attempt, once I uncheck the option it still says it takes about 2 seconds (in reality it takes good 40 s):
However, if i restart the query with the option already unchecked, then it reports the time right. My mistake was that I thought the time on option uncheck was right and the rest ~30 s it was just rendering all the rows in a grid.
As another question, I'm trying to optimize this query (which is crazy, I know) and which is in turn a subquery in a bigger one. But I think if I optimize this, the whole bigger query will run really faster. At the moment this query takes 1.5 min to execute:
SELECT
cashadv_id,
link_date AS DATE,
link_uid AS lender_uid,
amount,
'Approved' AS STATUS
FROM
`cashadv_txn`
WHERE
txn_type='LD'
AND STATUS='Approved'
UNION
ALL SELECT
tLenderTotalCredits.cashadv_id,
DATE(NOW()) AS DATE,
tLenderTotalCredits.lender_uid,
IFNULL(tLenderTotalCredits.lenderTotalCredit,0) - IFNULL(tLenderTotalPay.lenderTotalPay,0) AS amount,
'Ready' AS STATUS
FROM
( SELECT
link_uid AS lender_uid,
cashadv_id,
SUM(amount) AS lenderTotalCredit
FROM
cashadv_txn
WHERE
cashadv_txn.txn_type='LDA'
GROUP BY
link_uid,
cashadv_id ) tLenderTotalCredits
LEFT JOIN
(
SELECT
cashadv_id,
link_uid AS lender_uid,
SUM(amount) AS lenderTotalPay
FROM
cashadv_txn
WHERE
txn_type='LD'
AND STATUS='Approved'
GROUP BY
cashadv_id,
link_uid
) tLenderTotalPay
ON tLenderTotalPay.cashadv_id=tLenderTotalCredits.cashadv_id
AND tLenderTotalPay.lender_uid=tLenderTotalCredits.lender_uid -- check if there is current day payment
LEFT JOIN
cashadv_txn cashadv_txn3
ON cashadv_txn3.txn_type='LD'
AND cashadv_txn3.cashadv_id=tLenderTotalCredits.cashadv_id
AND cashadv_txn3.link_uid=tLenderTotalCredits.lender_uid
AND cashadv_txn3.link_date = DATE(NOW())
AND cashadv_txn3.status='Approved'
WHERE
cashadv_txn3.cashadv_id IS NULL
As you see, there are two subquery joins that I was hoping to speed up by first getting their resultsets into an in-memory temp table and then join on them.
I have to say, I'm not yet 100% faimilar with the app logic, so I'm afraid of breaking something, I don't know if we really need all 23k rows in these two sub-joins or we can limit them. But apart of that, do you have any ideas on how to optimize this? The cashadv_txn
structure is above.
Thanks
Best Answer
How much RAM do you have? Do you have 2GB to spare? That is what you are allowing for that MEMORY table.
Remove the parentheses
CREATE ... AS ( SELECT ... )
.Add
It is dangerous to have tmp_table_size = 2G. If multiple connections each needed a tmp table, you could quickly run out of RAM. At that point, mysql gets veeery slow.
This construct optimizes poorly, hence runs slowly:
DATE(NOW())
-->CURRENT_DATE()
float(20,2)
-->FLOAT
orDECIMAL(20,2)
(probably 20 is excessive)Consider changing
status varchar(16) DEFAULT NULL,
to anENUM
.