I use a pretty common setup, some wordpress sites running on Debian 10 (Buster) with mariadb 10.3.27 on ZFS, on a decent server with SSDs and lots of RAM, pretty smooth operation with low CPU utilization. There is this query wordpress makes,
SELECT DISTINCT meta_key
FROM wp_postmeta
INNER JOIN wp_posts ON post_id = ID
WHERE post_type <> 'product'
AND post_type <> 'product_variation';
For some reason I cannot possibly imagine, on database A (InnoDB tables) it runs in about 3 seconds, and on database B (MyISAM tables) it takes over 2 minutes. wp_postmeta
has 198938 records and wp_posts
25189 in database B, where on database A they both have many more records. I've tried dumping, deleting the database and recreating it to no avail. If I move the database to another similar server with EXT4, the query on MyISAM tables finishes in about 4 seconds. If I alter the tables to InnoDB, query execution finishes in about 4 seconds. Also, when running the query on MyISAM tables on ZFS I noticed the disk reads spike to about 50MB/s and up to 80MB/s while the query is running. 2 minutes of reads at 50MB/s is 6GB of data, in a database with a total size less than 500MB.
This difference in execution time between MyISAM and InnoDB was noticed on Debian systems with ZFS 0.6 and 0.7. The exact same databases on a FreeBSD/ZFS server display no such difference, with queries executed within a a few seconds for both MyISAM and InnoDB tables.
Does anybody know what could make such a huge difference between MyISAM and InnoDB on ZFS on Linux, and why ZFS on FreeBSD does not show this behavior?
The tables created as:
CREATE TABLE `wp_posts` (
`ID` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`post_author` bigint(20) unsigned NOT NULL DEFAULT 0,
`post_date` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`post_date_gmt` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`post_content` longtext COLLATE utf8mb4_unicode_ci NOT NULL,
`post_title` text COLLATE utf8mb4_unicode_ci NOT NULL,
`post_excerpt` text COLLATE utf8mb4_unicode_ci NOT NULL,
`post_status` varchar(20) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT 'publish',
`comment_status` varchar(20) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT 'open',
`ping_status` varchar(20) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT 'open',
`post_password` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '',
`post_name` varchar(200) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '',
`to_ping` text COLLATE utf8mb4_unicode_ci NOT NULL,
`pinged` text COLLATE utf8mb4_unicode_ci NOT NULL,
`post_modified` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`post_modified_gmt` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`post_content_filtered` longtext COLLATE utf8mb4_unicode_ci NOT NULL,
`post_parent` bigint(20) unsigned NOT NULL DEFAULT 0,
`guid` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '',
`menu_order` int(11) NOT NULL DEFAULT 0,
`post_type` varchar(20) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT 'post',
`post_mime_type` varchar(100) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '',
`comment_count` bigint(20) NOT NULL DEFAULT 0,
PRIMARY KEY (`ID`),
KEY `post_name` (`post_name`(191)),
KEY `type_status_date` (`post_type`,`post_status`,`post_date`,`ID`),
KEY `post_parent` (`post_parent`),
KEY `post_author` (`post_author`)
) ENGINE=MyISAM AUTO_INCREMENT=33318 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
CREATE TABLE `wp_postmeta` (
`meta_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`post_id` bigint(20) unsigned NOT NULL DEFAULT 0,
`meta_key` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`meta_value` longtext COLLATE utf8mb4_unicode_ci DEFAULT NULL,
PRIMARY KEY (`meta_id`),
KEY `post_id` (`post_id`),
KEY `meta_key` (`meta_key`(191))
) ENGINE=MyISAM AUTO_INCREMENT=279035 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
Best Answer
Short answer
ZFS, etc, are not the main problems. (See Comments for a rebuttal on ZFS.)
Use InnoDB, not MyISAM.
Fix the indexes in postmeta, as discussed here: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#speeding_up_wp_postmeta
Avoid index prefixing. (The above link discusses that, too.)
(If you must use MyISAM, then the indexes recommended there are not adequate. Let me know if you need more help.)
Long Answer
Several dimensions to this question:
Version -- MariaDB 10.3 versus other versions of MySQL/MariaDB. This is probably not significant.
OS -- Linux vs FreeBSD vs Windows -- The OS is just a middleman; all are quite mature and have mostly caught up with each other.
HDD vs SDD -- SDD drives are inherently faster for any access. Plus they are faster for random accesses due to the lack of a physical "seek".
FileSystem -- ZFS versus many others. "Journaling" and other advanced techniques will have some impact on the effective performance.
RAID, and especially "battery-backed write cache" has some impact on reads and possibly a big impact on writes. This applies across all disk operations in all applications.
InnoDB vs MyISAM -- These differ significantly in locking, caching, the organization of data. Indexes differ a lot. In MyISAM, fetching a block of data uses the FS's blocksize and caches it in the OS. A MyISAM index block is 1KB, which is likely to be smaller than the FS block slide, thereby leading perhaps to an extra level of caching. For InnoDB, all blocks are 16KB. The interactions between these MariaDB block size and the FS block is hard to predict. Peformance at many levels depends on whether the data or index needs to be scanned linearly or probed randomly.
The
INDEXes
-- Above, I focused on this topic because it is where the main improvement can be had. In the situation here (accessing WP'spostmeta
), will have a significant and far-reaching performance impact.Impact of INDEX
(post_id, meta_key)
which is optimal for your slow query. Furthermore...PRIMARY KEY
is "clustered" with the data, thereby eliminating the extra random fetch. Some of my recommendations take advantage of this efficiency.post_id
, helps in this respect. Furthermore, InnoDB avoids the random seek that MyISAM requires.(191)
from any indexes having it. Second best is if you can assure that all keys will always be under 191 characters, then changing the definition ofmeta_key
is a simple answer.The specific Query
posts
table. There is no getting around this. This will be a linear scan through the "data". The engines do this task quite differently, but neither engine has a strong advantage here.postmeta
. This is where the engine and indexes make a significant difference.meta_key
values -- This may be done in RAM or on disk, at the whim of the Optimizer. You could change a setting to influence that decision, but it would have the side effect of potentially impacting the performance of all other queries, so I won't discuss it.