How Functional Index in MySQL 8.0.13+ Improves Query Performance

MySQLmysql-8.0

Quoting from 13.1.15 CREATE INDEX Syntax:

MySQL 8.0.13 and higher supports functional key parts that index expression values rather than column or column prefix values.

See "Functional Key Parts" section of that manual page for detail. Basically, it enables us to create an index consisting of complex expression values derived from columns rather than mere column values (or their prefixes).

However, on a following part, that page also states:

Functional indexes are implemented as hidden virtual generated columns […]

In 13.1.20.8 CREATE TABLE and Generated Columns, it says:

The VIRTUAL or STORED keyword indicates how column values are stored, which has implications for column use:

  • VIRTUAL: Column values are not stored, but are evaluated when rows are read, immediately after any BEFORE triggers. A virtual column takes no storage.

    […]

  • STORED: Column values are evaluated and stored when rows are inserted or updated. A stored column does require storage space and can be indexed.

So it looks like (at least to me) the functional index is not stored at all, and only built on the fly during executing a query. Is this how functional index is used? If this is true, how could this "virtual" index improve query performance (I mean even the index itself has to be built during query)?

Best Answer

So it looks like (at least to me) the functional index is not stored at all, and only built on the fly during executing a query. Is this how functional index is used?

Please notice that columns are virtual, the indexes are as real ("materialized") as any other index. That means that indeed, if you request data from the table from a virtual column, it is generated at the moment, but if you use the index, it will be pregenerated on creation, and updated on every insert/update/delete.

That means that if you have a query such as this one:

SELECT * FROM revision WHERE substr(rev_timestamp, 5, 2) == '09'

(get all rows generated in september of any year)

You don't need to create a separate real column with substr(rev_timestamp, 5, 2), you can create a virtual one (which will take no more table space) and generate a real index on it, so you can run:

SELECT * FROM revision WHERE virtual_col == '09'

which should use the new index. Of course the index will take some space, but any index does (and modifying non-unique indexes after UPDATEs is very fast on MySQL). Also think that in most cases you don't need to index the whole column- this is very useful and optimized to index optional fields on JSON type columns without duplicating the whole, large content field.