Should an Index Be Added for a Monthly Query in MySQL?

indexMySQLoptimization

Concretely, is it worth it to add an index to a query that scans the table of a mysql database,
even if that index would only be used once a month?

For example, having a users a table and then querying for all the users of a certain type.

EXPLAIN SELECT type FROM users WHERE type in (1,3);

+----+-------------+------------------+------+---------------+------+---------+------+--------+-------------+
| id | select_type | table            | type | possible_keys | key  | key_len | ref  | rows   | Extra       |
+----+-------------+------------------+------+---------------+------+---------+------+--------+-------------+
|  1 | SIMPLE      | users            | ALL  | NULL          | NULL | NULL    | NULL | 1000685 | Using where |
+----+-------------+------------------+------+---------------+------+---------+------+--------+-------------+

Best Answer

Here is one sure way to know if an index is worth it:

Run these queries:

SELECT type,COUNT(1) indtypecount FROM users GROUP BY type WITH ROLLUP; 
SELECT COUNT(DISTINCT type) typecount FROM users GROUP BY type WITH ROLLUP; 

This query will give you the count of each type (indtypecount aka individual typecount) and the table count as the last row.

If indtypecount is more than 5% of the rowcount of the whole table, the MySQL Query Optimizer will ignore any index no matter how many times you run ANALYZE TABLE.

If your table is huge and the indtypecount for a type is less than 5%, an index scan is better than a full table scan anyday.

If typecount (cardinality of type) is something ridiculously small, this is all a moot point. Indexing values such as gender (M,F), day of week, active status, and marital status would produce indexes that resemble monster link lists of types that the MySQL Query Optimizer would find totally useless and, thus, space wasted.

If the running time of once-a-month queries is not a concern, indexing or not is strictly your personal choice.