I have three table with more than 1 millions of rows and one of my query take more than 30 seconds with a date range of 3 months. I have tried multiples index and I think I have respect index orders. I don't know what to do go get a better performance.
SELECT
MAX(table2.id) as item_id,
MAX(table2.sampled_inspected) as sample_size,
table1.group_col AS ass_group_id
FROM
table1
INNER JOIN table2 ON table1.id = table2.assignment_id
LEFT OUTER JOIN table3 ON table1.po_id = table3.id
LEFT JOIN table4 ON table4.id = table3.supplier_id
LEFT JOIN table5 ON table5.id = table3.project_id
WHERE table1.recycled=0
AND table2.inspection_status_id > 0
AND table2.inspection_status_id != 50
AND table3.importer_id = 215
AND table5.recycled = 0
AND table3.project_id IN ( '2062','2063','2064','2065' )
AND table2.inspection_completed_date >= DATE('2019-10-01')
AND table2.inspection_completed_date <= DATE('2020-01-29')
GROUP BY ass_group_id
and here my 5 tables
CREATE TABLE `table1` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`recycled` int(1) NOT NULL DEFAULT '0',
`group_col` varchar(25) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`po_id` int(11) unsigned DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `index_assignment_po` (`po_id`),
KEY `recycled` (`recycled`,`group_col`),
KEY `group_col` (`group_col`)
) ENGINE=InnoDB AUTO_INCREMENT=3324705 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
CREATE TABLE `table2` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`assignment_id` int(10) unsigned DEFAULT NULL,
`sampling_size` int(11) DEFAULT NULL,
`notes` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`photo_report` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`inspection_status_id` int(11) DEFAULT '0',
`inspection_result_id` int(1) NOT NULL DEFAULT '0',
`inspection_completed_date` datetime DEFAULT NULL,
`recycled` int(1) NOT NULL DEFAULT '0',
`qty_to_inspect` decimal(11,2) DEFAULT NULL,
`defect_rate` decimal(16,13) DEFAULT NULL,
`sampled_inspected` int(11) unsigned DEFAULT NULL,
`inspection_report_id` int(11) unsigned DEFAULT NULL,
`poline_id` int(11) unsigned DEFAULT NULL,
`item_status_id` int(11) NOT NULL DEFAULT '0',
`production_line` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `FK_wsxix_assignments` (`assignment_id`),
KEY `index_inspection_report_reportid` (`inspection_report_id`),
KEY `index_ai_inspection_completed_date` (`inspection_completed_date`),
KEY `index_item_poline` (`poline_id`),
KEY `index_item_poline_assignment` (`poline_id`,`assignment_id`),
KEY `wsxix_assignments_items_inspection_status_id` (`inspection_status_id`),
KEY `wsxix_assignments_items_inspection_result_id` (`inspection_result_id`),
KEY `inspection_completed_date` (`inspection_completed_date`,`inspection_result_id`,`inspection_status_id`)
) ENGINE=InnoDB AUTO_INCREMENT=3324705 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
CREATE TABLE `table3` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`buyer_id` int(11) unsigned DEFAULT NULL,
`client_id` int(11) unsigned DEFAULT NULL,
`merchandiser_user_id` int(11) unsigned DEFAULT NULL,
`factory_id` int(11) unsigned DEFAULT NULL,
`dc_id` int(11) unsigned DEFAULT NULL,
`parent_id` int(10) unsigned DEFAULT NULL,
`supplier_id` int(11) unsigned NOT NULL,
`importer_id` int(11) unsigned NOT NULL,
`project_id` int(11) unsigned NOT NULL,
`tier_two_id` int(11) unsigned DEFAULT NULL,
`store_id` int(11) unsigned DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `buyer_id` (`buyer_id`),
KEY `client_id` (`client_id`),
KEY `merchandiser_user_id` (`merchandiser_user_id`),
KEY `FK_po_factory_companies_id` (`factory_id`),
KEY `FK_po_dc_companies_id` (`dc_id`),
KEY `index_po_parentid` (`parent_id`),
KEY `index_purchaseorders_cie` (`supplier_id`),
KEY `index_poimporter_cie` (`importer_id`),
KEY `index_purchaseorders_project` (`project_id`),
KEY `facory_importer_id` (`factory_id`,`importer_id`),
KEY `posu` (`project_id`,`supplier_id`),
KEY `store_id` (`store_id`),
KEY `tier_two_id` (`tier_two_id`)
) ENGINE=InnoDB AUTO_INCREMENT=34959 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
CREATE TABLE `table4` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`parent_company_id` int(11) DEFAULT NULL,
`name` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `parent_company_id` (`parent_company_id`)
) ENGINE=InnoDB AUTO_INCREMENT=272 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
CREATE TABLE `table5` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
`recycled` int(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
KEY `recycled` (`recycled`)
) ENGINE=InnoDB AUTO_INCREMENT=2066 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
UPDATE:
Made change from @Rick Martin comments
Here my Explain result avec your change
My table size:
Table1: 3.3 Millions
Table2: 3.5 Millions
Table3: 375
Table4 ( Not used anymore and removed in my query): 57
Table5: 5
@Willem Renzema
Best Answer
In addition to changes you already made based on other answers and comments, I would add the following 2 indexes:
Trying to get a
MAX
value for two different columns at the same time, from the same table, is unfortunately going to be something that is more difficult to optimize.With these indexes it should hopefully try to do
table1
first, and so potentially avoid the filesort, which is likely much of the performance problem.Provide the
EXPLAIN
output after these changes, and how quickly the query itself runs. We'll see if this made progress before I suggest more changes.Forcing table1 to be first
Well, it appears that the query optimizer is refusing to try running the query with
table1
being the first table. I believe havingtable1
go first may be more efficient, so here's your query with it modified to accomplish that goal. Basically, I replaced theLEFT JOIN
s withSTRAIGHT_JOIN
s.This query is equivalent logically because all of your
LEFT JOIN
s were already being converted intoINNER JOIN
s already, because you haveWHERE
clauses that reference those tables. ASTRAIGHT_JOIN
is the same as anINNER JOIN
, but just forces the tables to be joined in specific order.Run that and let me know how it performs. Also, try this query like you did before, by doing just one of the
MAX
statements at a time. See if either of those are better.Last of the ideas
The only other thing I can think of to help, and this may be why the optimizer doesn't want to do
table1
first, is that the column you are grouping by is 100 bytes wide.Is
varchar(25)
withutf8mb4
charset required? That charset is a good default, but it is 4 bytes wide, and if you are only holding letters and numbers you can reduce that by a factor of 4 by switching to something like the latin1 charset, which is only 1 byte wide. Also, if it was possible to switch that column to be aINT
that would be even better, as that is 4 bytes, which is up to 25 times smaller than what you currently have. I don't expect that is practical though.Warning: Be careful about changing the charset on your production data, as if the charset does matter you'll incur data loss upon conversion.
Either way, definitely try my modified query before you bother with this column.