Your ix_hugetable looks quite useless because:

- it duplicates the clustered index (the PK)
- the INCLUDE makes no difference, because a clustered index already INCLUDEs all non-key columns (non-key values at the lowest leaf level = INCLUDEd = what a clustered index is)

In addition:

- added or fk should be the leading column
- with id first, the index is of little use

Try changing the clustered key to (added, fk, id) and dropping ix_hugetable. You've already tried (fk, added, id). If nothing else, you'll save a lot of disk space and index maintenance.
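The shape of this argument can be sketched outside SQL Server. Below is a minimal Python/SQLite stand-in (not the asker's environment; the data and names mirror the question's hugetable but are otherwise hypothetical): a WITHOUT ROWID table stores rows in the primary-key B-tree itself, roughly like a clustered index, so once the key leads with the filtered column, a range query is answered straight from that B-tree with no secondary index needed.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# WITHOUT ROWID stores the rows inside the PRIMARY KEY B-tree itself,
# roughly analogous to a SQL Server clustered index on (added, fk, id).
con.execute("""
    CREATE TABLE hugetable (
        added INTEGER,
        fk    INTEGER,
        id    INTEGER,
        value INTEGER,
        PRIMARY KEY (added, fk, id)
    ) WITHOUT ROWID
""")

# A range filter on the leading key column becomes a seek on the
# clustered B-tree; no duplicate ix_hugetable is required.
plan = con.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT fk, value FROM hugetable WHERE added BETWEEN 1 AND 10"
).fetchall()
print(plan[0][3])  # e.g. SEARCH hugetable USING PRIMARY KEY (added>? AND added<?)
```

The exact plan wording varies by SQLite version, but the point is that the seek happens on the primary key, not on an extra index.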
Another option might be to try the FORCE ORDER hint with the table order both ways and no JOIN/INDEX hints. I try not to use JOIN/INDEX hints personally, because they remove options from the optimiser. Many years ago I was told (at a seminar with a SQL guru) that the FORCE ORDER hint can help when you join a huge table to a small table: YMMV 7 years later...
Oh, and let us know where the DBA lives so we can arrange for some percussion adjustment
Edit, after 02 Jun update
The 4th column (value) is not part of the non-clustered index, so the engine has to fall back to the clustered index to fetch it. Try changing the NC index to INCLUDE the value column, so the query doesn't have to go to the clustered index for it at all:
create nonclustered index ix_hugetable on dbo.hugetable (
fk asc, added asc
) include(value)
Note: if value is not nullable, then COUNT(value) is semantically the same as COUNT(*). But SUM needs the actual value, not mere existence. As an example, if you change COUNT(value) to COUNT(DISTINCT value) without changing the index, it should break the query again, because it has to process value as a value rather than as existence.
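The NULL semantics behind that note are easy to check. A quick sketch using SQLite from Python (a stand-in for SQL Server; the aggregate semantics are the same, and the sample data is hypothetical):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (value INTEGER)")
con.executemany("INSERT INTO t (value) VALUES (?)",
                [(1,), (1,), (2,), (None,)])

# COUNT(*) counts rows, COUNT(value) skips NULLs, and
# COUNT(DISTINCT value) must examine each actual value to de-duplicate.
counts = con.execute(
    "SELECT COUNT(*), COUNT(value), COUNT(DISTINCT value) FROM t"
).fetchone()
print(counts)  # (4, 3, 2)
```

With no NULLs in the column, the first two counts would be identical, which is exactly why the optimiser can sometimes answer COUNT(value) from existence alone.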
The query needs 3 columns: added, fk and value. The first 2 are filtered/joined on, so they are key columns; value is only used in the aggregate, so it can be INCLUDEd. A classic use of a covering index.
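The covering behaviour can be seen in miniature with SQLite from Python. SQLite has no INCLUDE clause, so appending value to the index key is used here as an approximation; the plan reports a covering index when the query never has to touch the base table:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE hugetable (
        id INTEGER PRIMARY KEY, fk INTEGER, added INTEGER, value INTEGER
    )
""")
# SQLite lacks INCLUDE, so value is appended to the key as a rough
# equivalent of: create index ix_hugetable (fk, added) include(value)
con.execute("CREATE INDEX ix_hugetable ON hugetable (fk, added, value)")

plan = con.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT SUM(value) FROM hugetable WHERE fk = 1 AND added >= 5"
).fetchall()
print(plan[0][3])  # e.g. SEARCH hugetable USING COVERING INDEX ix_hugetable ...
```

"COVERING INDEX" in the plan is SQLite's way of saying the same thing as a SQL Server covering index: every column the query needs lives in the index leaf, so there are no lookups back to the clustered data.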
Ever is a big word, but, in general, no, I wouldn't put a varchar(2000) field into an INCLUDE.
And yeah, the way that data is stored at the page level can seriously impact performance of the index, depending on how the index is used.
The thing is, the more rows of data you can cram into a page, the fewer pages have to be accessed, and the faster your system is, for the most part. Adding a really large column means less information is stored on each page, so in the event of range seeks or scans, more pages have to be read to retrieve the data, seriously slowing things down.
To know for sure if this is an issue on your query, or on your system, you'd have to monitor the reads, especially the number of pages that the query uses.
Best Answer
There are 2 bits of information. This means 2 fields. It is that simple.
In practical terms, selecting 2 columns is zero extra complexity. Having to read a large field to parse one bit out is unnecessary complexity and performance overhead.
A couple of SO questions on storing salts with hashes
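For illustration, here is a minimal sketch of the two-column approach in Python (the PBKDF2 parameters and schema are hypothetical, not taken from any of those questions): the salt and the hash each get their own column, so verification never has to parse one value out of the other.

```python
import hashlib
import hmac
import os
import sqlite3

def hash_password(password, salt=None):
    # Two pieces of information -> two values; generate a fresh salt
    # when none is supplied (i.e. when creating the user).
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE users (name TEXT PRIMARY KEY, salt BLOB, pw_hash BLOB)")

salt, digest = hash_password("s3cret")
con.execute("INSERT INTO users VALUES (?, ?, ?)", ("alice", salt, digest))

# Verification: read both columns, re-hash with the stored salt.
# No string splitting, no delimiter conventions, no parsing overhead.
stored_salt, stored_hash = con.execute(
    "SELECT salt, pw_hash FROM users WHERE name = ?", ("alice",)
).fetchone()
ok = hmac.compare_digest(hash_password("s3cret", stored_salt)[1], stored_hash)
print(ok)  # True
```

Storing the pair in one concatenated field forces every consumer to agree on an encoding and split it apart on every login, which is exactly the unnecessary complexity the answer is talking about.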