Your Predicate is different to your Seek Predicate.
A Seek Predicate is used to search the ordered data in the index. In this case, it'll be doing three seeks, one for each ItemState that you're interested in. Beyond that, the data is in ItemPriority order, so no further "Seek" operation can be done.
But before the data is returned, it checks every row using the Predicate, which I refer to as the Residual Predicate. It's done on the results of the Seek Predicate.
Any included column is not part of the ordered data, but can be used to satisfy the Residual Predicate, without having to do the extra Lookup.
You can see material I've written on this around Sargability. Check for a session at SQLBits in particular, at http://bit.ly/Sargability
Edit: To show the impact of Residuals better, run the query using the undocumented OPTION (QUERYTRACEON 9130)
, which will separate out the Residual into a separate Filter operator (which is actually an earlier version of the plan before the residual gets moved into the Seek operator). It clearly shows the impact of an ineffective Seek, by the number of rows being passed left to the Filter.
It's also worth noting that because of the IN clause on ItemState, the data being passed left is actually in ItemState order, not in ItemPriority order. A composite index on ItemState followed by one of the dates (eg (ItemState, LastAccessTime)) could be used to have three Seeks (notice the Seek Predicate shows three seeks within the one Seek operator), each against two levels, producing data that is still in ItemState order (eg, ItemState=3 and LastAccessTime less than something, then ItemState=9 and LastAccessTime less than something, and then ItemState=10 and LastAccessTime less than something).
An index on (ItemState, LastAccesTime, CreationTime) would be no more useful than one on (ItemState, LastAccessTime) because the CreationTime level is only useful if your Seek is for a particular ItemState and LastAccessTime combination, not a range. Like how the phone book isn't in FirstName order if you are interested in Surnames beginning in F.
If you want a composite index but you are never going to be able to use the later columns in Seek Predicates because of the way you use the earlier columns, then you may as well have them as included columns, where they take less space in the index (because they're only stored at the leaf level of the index, not the higher levels) but can still avoid lookups and get used in Residual predicates.
As per the term Residual Predicate - that's my own term for this property of a Seek. A Merge Join explicitly calls it its equivalent a Residual Predicate, and the Hash Match calls its one a Probe Residual (which you might get from TSA if you match for hash). But in a Seek they just call it Predicate which makes it seem less bad than it is.
Best Answer
What has to be adhered to is the query you know you are going to make.
Let go back to the original question's query
Look carefully at the WHERE clause. I see two static values
blogposts.deleted = 0
blogposts.visible = 1
These cannot be indexes by themselves because of their implied cardinality. Think about it:
deleted
is either 0 or 1 (That's a Cardinality of 2)visible
is either 0 or 1 (That's a Cardinality of 2)All Query Optimizers (MySQL, Oracle, PostgreSQL, MSSQL, etc) would take one look at those indexes and decide not to use the index.
Look at the
ORDER BY
clause :ORDER BY blogposts.date_created DESC
. This shows a predictable ordering schema. This may help bypass the need for sorting.Combining these three columns into a single index gives the Query Optimizer some relief by gathering data with delete=0,visible=1,date_created already merged.
Although MySQL is perfectly capable of doing Index Merging, you should never expect the Query Optimizer to do index merging on indexes with very low cardinalities. Even if it did, just expect poor performance. Query Optimizers will choose full table scans, full index scans, and range scans over lopsided index merges any day. Creating covering indexes can bypass having to do index merges, but it can become a burden or a big waste of time and diskspace if the cardinality of the multiple-column index is still too low. Thus, the Query Optimizer would still choose not use it. Therefore, you must know your key distribution well in order to formulate multiple column indexes that you will write queries to use properly.
Here are some nice Links on Covering Indexes and When to Make Them
As far the
date_published
column goes, it would be alright to add it to the index. It would bring that much more relief to the Query Optimizer. It will perform an index scan to test for<=NOW()
but that OK. That's will covering indexes are for.