Let's say I have database "DB" and it has many tables.
Question 1.
If table A has 100 million rows of data and table B has 100000 rows, will database size as a whole impact performance if I do a select query on table B?
Question 2.
Table A` has 100 million rows of data for around 8000 products. Searching data for particular product on this table is very slow. Will performance increase if I instead create 8000 tables, 1 for each product and put respective data on respective table and make a search on that particular table? Does this have any caveats in a long run?
More information:
- It's an InnoDB Table
- Ram size is 27GB
- Table A size (the largest table)
- Rows=100.43 Million
- Data Size= 13.62GB
- Index Size= 7.21GB
- Total Size = 20.83GB
- Table B size
- Table B could be various other tables but their size does not exceed 500MB
Table A is indexed and it has three indices on id
(Pk), ProductCode
, and AddedDate
columns.
UPDATE
Table A
CREATE TABLE `ProductDetails` (
`id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
`DepartmentID` INT(10) DEFAULT NULL,
`SellerID` INT(10) DEFAULT NULL,
`ProductCode` VARCHAR(20) DEFAULT NULL,
`WholesalerType` VARCHAR(5) DEFAULT NULL,
`Remarks` TEXT DEFAULT NULL,
`AddedDate` DATETIME DEFAULT NULL,
`InsertedTimestamp` INT(10) UNSIGNED NOT NULL,
`WholesalerID` TINYINT(3) UNSIGNED DEFAULT NULL,
`ProductType` TINYINT(4) DEFAULT NULL,
`ArrivalDate` DATETIME DEFAULT NULL,
`SerialNumber` VARCHAR(4) DEFAULT NULL,
`MessageID` INT(11) DEFAULT NULL,
`Model` VARCHAR(10) NOT NULL DEFAULT 'NA'
PRIMARY KEY (`id`),
KEY `ProductCode` (`ProductCode`),
KEY `AddedDate` (`AddedDate`)
) ENGINE=INNODB DEFAULT CHARSET=utf8 COMMENT='Engine should be InnoDB for performance'
Table B
CREATE TABLE `Products` (
`ID` INT(11) NOT NULL AUTO_INCREMENT,
`ProductCode` VARCHAR(20) NOT NULL DEFAULT '00A',
`Created` DATETIME DEFAULT NULL,
`LastEdited` DATETIME DEFAULT NULL,
`Model` VARCHAR(15) CHARACTER SET utf8 DEFAULT NULL,
`ProductType` VARCHAR(15) CHARACTER SET utf8 DEFAULT NULL,
`StockItem` INT(11) NOT NULL DEFAULT '0',
`StockThreshold` TINYINT(2) NOT NULL DEFAULT '10',
`CostPrice` DOUBLE DEFAULT NULL,
`SellingPrice` DOUBLE DEFAULT NULL,
`ShortDescription` MEDIUMTEXT CHARACTER SET utf8,
`Specification` TEXT CHARACTER SET utf8,
`Description` TEXT CHARACTER SET utf8,
`BrandID` INT(11) NOT NULL DEFAULT '0',
`ManufacturerID` INT(11) NOT NULL DEFAULT '0',
`ProductStatus` TINYINT(1) UNSIGNED NOT NULL DEFAULT '1',
PRIMARY KEY (`ID`),
KEY `ProductCode` (`ProductCode`)
) ENGINE=INNODB DEFAULT CHARSET=latin1
Example Query
SELECT * FROM ProductDetails WHERE (ProductCode = 'PR1920A'
AND AddedDate > FROM_UNIXTIME(1519841700)
AND AddedDate < FROM_UNIXTIME(1523934207))
AND ((ProductType in('14','17')
AND MessageID = 28
AND Model LIKE 'MBA-1201')
OR (WholesalerType in('PARAM1','PARAM2','PARAM3','PARAM4','PARAM5')
AND UPPER(Model) IN('RAGL21','BVRK10','TPL001','LTV2960','DEN7YZ','6YU90Q','47GER8','GB3049')))
AND DepartmentID != 0 AND SellerID != 0
ORDER BY AddedDate;
UPDATE 2
Here are the result of the EXPLAIN
id = 1
select_type = SIMPLE
table = ProductDetails
type = ref
possible_keys = ProductCode, AddedDate
key = ProductCode
key_len = 63
ref = const
rows = 104456
Extra = Using index condition; Using where; Using filesort
And here is the SHOW INDEX
results for ProductDetails
table
Best Answer
Q1: A "database" is a collection of "tables". The performance of a query on one table (or a few tables
JOINed
together) is entirely on the table(s) in the query. Not other tables; not other databases.Q2: Do not make a bunch of tables. If you have an index starting with
ProductCode
, then queries withWHERE ProductCode = 123 AND ...
will be efficient. (This question has been asked and answered many times - always the same way.)innodb_buffer_pool_size
should be set to about 20G.If you usually fetch by
ProductCode
, then changeto
That will 'cluster' all 12K (avg) rows for a given product together, making many
SELECTs
run faster.Does Table B have 8K rows? Or 100K rows? If there is one row per product, then get rid of
id
and changeto
BUG! You have inconsistent
CHARACTER SETs
between the tables. This can cause comparisons onProductCode
to not use an index! (And hence be sloooow.) (SeeALTER TABLE .. CONVERT TO ..
)UPPER(Model)
-- Don't useUPPER
. You have a case-foldingCOLLATION
, so you will get the desired test without need forUPPER()
.When I suggested
PRIMARY KEY(ProductCode, AddedDate, id)
above, includingAddedDate
was not important. But looking at theSELECT
, it is important, both in theWHERE
and in theORDER BY
.