basically i got 2 tables: header
, details tables
.
CREATE TABLE `header` (
`ID` int(11) NOT NULL AUTO_INCREMENT,
`RECORD_DATE` datetime DEFAULT NULL,
`TICKER_ID` int(11) DEFAULT NULL,
`CURR_TIMESTAMP` datetime DEFAULT NULL,
PRIMARY KEY (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
CREATE TABLE `detail` (
`ID` int(11) NOT NULL AUTO_INCREMENT,
`HEADER_ID` int(11) DEFAULT NULL,
`BROKER_ID` int(11) DEFAULT NULL,
`AMOUNT` decimal(26,0) DEFAULT NULL,
PRIMARY KEY (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
There're about 4800 ticker need to be update everyday, ticker_id (A00001, B00032…) and each ticker_id contain numbers of record everyday and i store in detail table
it works fine at the beginning, after years, header become 2.4 million row and detail table got 250 millions row, its take an hour to with simple select,
SELECT h.ticker_id, h.record_date , d.broker_id, d.broker_id, d.amount
FROM DETAIL
INNER JOIN herder h
ON h.id = d.header_id
where h.ticker = 'A00001'
so i create a 'link' table to join reference them together.
CREATE TABLE `linkA00001` (
`ID` int(11) NOT NULL AUTO_INCREMENT,
`HEADER_ID` int(11) DEFAULT NULL,
`DETAIL_ID` int(11) DEFAULT NULL,
PRIMARY KEY (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
and the result become
SELECT h.ticker_id, h.record_date , d.broker_id, d.broker_id, d.amount
FROM DETAIL
INNER JOIN linkA00001 l
INNER JOIN herder h
ON h.id = l.header_id and d.id = l.detail_id
its stunning fast, from hour to few seconds, but this approach solve 1 ticker selection only, so i create 4800 link table for better performance, its do solve the selection problem, but i using java spring for my application, building over 4800 entity require 10 minutes to start the application and very hard to maintain the link table.
so my question is:
- this approach works?
- if not, how can i improve or just redesign this?
- according to this link, its require timestamp but i never insert at detail table, is it good practices add back?
Thank you very much!
Best Answer
To improve the query:
(ignoring that
ticket_id
is an INT in the table);The aspect of the query that will limit the results most is the value of
h.ticket_id
. To find this quickly there need to be an index on this in tableheader
.The database then need to find the HEADER_ID quickly in the
detail
table:What is unclear in the question here is why
detail
has an auto_increment primary key. If this isn't used in any queries, and ifHEADER_ID,BROKER_ID
is unique, then perhaps those are a better primary key for the table and as an alternate toCREATE INDEX header
above:This will also speed up the query as primary keys are quicker to search than secondary keys (where non-index values like
d.amount
are also needed).