MySQL – When to Use Views for Optimization

MySQL, optimization, view

When creating tables from multiple joins for use in analysis, when is it preferred to use views versus creating a new table?

One reason I would prefer views is that our administrator developed the database schema from within Ruby, and I am not familiar with Ruby. I can request that tables be created, but that requires an additional step, and I would like more flexibility when developing and testing new joins.

I started using views following the answer to a related question on SO (When to use R, when to use SQL). The top-voted answer begins "do the data manipulations in SQL until the data is in a single table, and then do the rest in R."

I have started using views, but I have run into a few issues:

  1. Queries are much slower.
  2. Views do not get dumped from the production database to the backup database that I use for analysis.

Are views appropriate for this use? If so, should I expect a performance penalty? Is there a way to speed up queries on views?

Best Answer

Views in MySQL are handled using one of two algorithms: MERGE or TEMPTABLE. MERGE is simply a query expansion with appropriate aliases. TEMPTABLE is just what it sounds like: the view's results are put into a temporary table before the outer query's WHERE clause is applied, and that temporary table has no indexes.
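As a rough illustration of the two algorithms, here is a minimal sketch. The tables `customers` and `orders` and the view names are hypothetical and not taken from the question; only the `ALGORITHM` clause of `CREATE VIEW` is standard MySQL syntax.

```sql
-- Hypothetical tables for illustration; names and columns are assumptions.
CREATE TABLE IF NOT EXISTS customers (
  customer_id INT PRIMARY KEY,
  name        VARCHAR(100)
);

CREATE TABLE IF NOT EXISTS orders (
  order_id    INT PRIMARY KEY,
  customer_id INT NOT NULL,
  amount      DECIMAL(10,2) NOT NULL,
  KEY idx_orders_customer (customer_id)
);

-- MERGE: the view body is folded into the query that references it.
CREATE OR REPLACE ALGORITHM = MERGE VIEW v_customer_orders AS
SELECT c.customer_id, c.name, o.order_id, o.amount
FROM customers AS c
JOIN orders AS o ON o.customer_id = c.customer_id;

-- TEMPTABLE: the same join is first materialized into a temporary table.
CREATE OR REPLACE ALGORITHM = TEMPTABLE VIEW v_customer_orders_tmp AS
SELECT c.customer_id, c.name, o.order_id, o.amount
FROM customers AS c
JOIN orders AS o ON o.customer_id = c.customer_id;

-- With MERGE, the filter is applied against the base tables,
-- so their indexes remain usable.
SELECT * FROM v_customer_orders WHERE customer_id = 42;

-- With TEMPTABLE, the whole join result is built first and the
-- filter is applied to the temporary table afterwards.
SELECT * FROM v_customer_orders_tmp WHERE customer_id = 42;
```

On a large `orders` table, the second query would be expected to be the slower of the two, since the filter cannot narrow the join before it is materialized.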

The 'third' option is UNDEFINED, which tells MySQL to select the appropriate algorithm. MySQL will attempt to use MERGE because it is more efficient. Main Caveat:

If the MERGE algorithm cannot be used, a temporary table must be used instead. MERGE cannot be used if the view contains any of the following constructs:

  • Aggregate functions (SUM(), MIN(), MAX(), COUNT(), and so forth)

  • DISTINCT

  • GROUP BY

  • HAVING

  • LIMIT

  • UNION or UNION ALL

  • Subquery in the select list

  • Refers only to literal values (in this case, there is no underlying table)

[src]
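To make the list above concrete, here is a small sketch of a view that falls outside MERGE because it aggregates. The `orders` table and the view name are hypothetical. As far as I know, MySQL accepts the `ALGORITHM = MERGE` request here but raises a warning and records the algorithm as UNDEFINED, so checking the warnings and the stored definition is a quick way to spot the problem.

```sql
-- Hypothetical table for illustration; names and columns are assumptions.
CREATE TABLE IF NOT EXISTS orders (
  order_id    INT PRIMARY KEY,
  customer_id INT NOT NULL,
  amount      DECIMAL(10,2) NOT NULL,
  KEY idx_orders_customer (customer_id)
);

-- SUM(), COUNT() and GROUP BY each rule out the MERGE algorithm.
CREATE OR REPLACE ALGORITHM = MERGE VIEW v_customer_totals AS
SELECT customer_id,
       SUM(amount) AS total_amount,
       COUNT(*)    AS order_count
FROM orders
GROUP BY customer_id;

-- Expect a warning that the view cannot use the merge algorithm.
SHOW WARNINGS;

-- Shows the algorithm MySQL actually stored with the definition.
SHOW CREATE VIEW v_customer_totals;
```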

I would venture to guess that your views require the TEMPTABLE algorithm, which is causing the performance issues.
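One way to test that guess is to run EXPLAIN on a query against the view. The sketch below uses a hypothetical `orders` table and view; substitute your own view name. A row with `select_type` of `DERIVED` in the plan suggests the view is being materialized rather than merged.

```sql
-- Hypothetical table and view for illustration; names are assumptions.
CREATE TABLE IF NOT EXISTS orders (
  order_id    INT PRIMARY KEY,
  customer_id INT NOT NULL,
  amount      DECIMAL(10,2) NOT NULL,
  KEY idx_orders_customer (customer_id)
);

CREATE OR REPLACE VIEW v_customer_totals AS
SELECT customer_id, SUM(amount) AS total_amount
FROM orders
GROUP BY customer_id;

-- Plan for the query through the view: look for select_type = DERIVED.
EXPLAIN
SELECT * FROM v_customer_totals WHERE customer_id = 42;

-- For comparison, the same filter written directly against the base
-- table, where the index on customer_id can be used before aggregating.
EXPLAIN
SELECT customer_id, SUM(amount) AS total_amount
FROM orders
WHERE customer_id = 42
GROUP BY customer_id;
```

If the two plans differ sharply in the number of rows examined, the view's algorithm is the likely cause of the slowdown.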

Here is a really old blog post on the performance of views in MySQL, and the situation doesn't seem to have gotten better since.

There might, however, be some light at the end of the tunnel on this issue of temporary tables not containing indexes (which causes full table scans). In MySQL 5.6:

For cases when materialization is required for a subquery in the FROM clause, the optimizer may speed up access to the result by adding an index to the materialized table. ... After adding the index, the optimizer can treat the materialized derived table the same as a usual table with an index, and it benefits similarly from the generated index. The overhead of index creation is negligible compared to the cost of query execution without the index.
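As a sketch of how to observe this, assuming MySQL 5.6 or later and the hypothetical `customers` and `orders` tables below, EXPLAIN on a join against a materialized derived table may show a generated key (displayed with a name like `<auto_key0>`) on the derived-table row. Whether the optimizer adds the index depends on its cost estimates, so the plan on real data may differ.

```sql
-- Hypothetical tables for illustration; names and columns are assumptions.
CREATE TABLE IF NOT EXISTS customers (
  customer_id INT PRIMARY KEY,
  name        VARCHAR(100)
);

CREATE TABLE IF NOT EXISTS orders (
  order_id    INT PRIMARY KEY,
  customer_id INT NOT NULL,
  amount      DECIMAL(10,2) NOT NULL,
  KEY idx_orders_customer (customer_id)
);

-- The GROUP BY forces the subquery in FROM to be materialized.
-- Check the key column of the derived-table row in the plan for
-- an automatically generated index.
EXPLAIN
SELECT c.name, t.total_amount
FROM customers AS c
JOIN (
  SELECT customer_id, SUM(amount) AS total_amount
  FROM orders
  GROUP BY customer_id
) AS t ON t.customer_id = c.customer_id;
```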

As @ypercube points out, MariaDB 5.3 has added the same optimization. This article has an interesting overview of the process:

The optimization is applied when the derived table could not be merged into its parent SELECT, which happens when the derived table doesn't meet the criteria for a mergeable VIEW.