I have a MySQL 5.7 table that holds activities that users can search through. It will end up holding less than 100,000 rows and inserts will be less than 50/day. This activities
table has a structure of:
CREATE TABLE activities (
activityID int AUTO_INCREMENT,
createdBy char, -- a unique AWS ID. eg: "us-east-1:128d0a74-c82f-4553-916d-90053e4a8b0f"
active bool,
city varchar,
length int,
price decimal,
latitude decimal,
longitude decimal,
CONSTRAINT Activities_PK PRIMARY KEY (activityID)
);
The queries I will be making are constructed depending on the query string sent to the API endpoint. There are 3 'paths' the API can be searched by to return activities.
- If
createdBY
is included the query string,Select ... where createdBy = ~some id~
- If
city
is included, Select bycity
, andlength
if included, and mininmumprice
if included, and maximumprice
if included - If
latitude
andlongitude
are included, search between the 2 points
Indexes I have come up with so far:
-
createdBy
, as a B-Tree. Should this be a HASH index because it's a unique identifier? -
(city, length, price)
. Since this is a covering index, if the constructed Select query is only searching bycity
, would it still use the index even though it containslength
andprice
?Also, since covering indexes start on the left, should I also include a
(city, price, length)
index? -
(latitude, longitude)
Are these acceptable indexes? Should I include an index for the active column? All queries should only return active activities. My guess will be ~90% of rows would be active, while ~10% would be inactive. This answer suggests no.
Thanks so much! Any other suggestions are welcome!
Best Answer
InnoDB does not support
HASH
indexes; onlyB-TREE
. You could certainly implement your own with another column calledcreatedByHash
. Use something likeCRC32
as your hashing function to make it integer-based and save a few bytes per row. Index this column instead and make your queriesSELECT .. FROM .. WHERE createdByHash = CRC32(<input>) AND createdBy = <input>
(you have to have both parts because there could be collisions with CRC32)Yes, MySQL uses multi-column indexes in left-to-right order.
SELECT city, length, price FROM .. WHERE city = x
will still use the index. If you have any range-based queries, the index should put those range'd columns to the right. An index on(A, B, C)
andWHERE A=4 AND B>5 AND C=3
will only use the first two columns of that index.If you are doing a simple greater-than/less-than on the lat/long, what you have will probably be OK. This one is tough since they are both ranged predicates. You may have to experiment on (lat, long) vs (long, lat) in your application.