Sql-server – difference between having many non clustered index with single columns and with combination of many columns

indexsql serversql-server-2008-r2

Folks

I am a newbie to DB design stuffs. I am wondering what would be the difference and performance difference between having designed

many non clustered index with single columns and non clustered index with many columns combined.
Or ideally under what scenario I can combine columns to make a single non clustered index and if so, how many columns should I combine ?
What is the criteria to choose the column for non clustered index ? (I am aware that its applied only to not frequent updation table)….
Whats the difference between 2005/2008 and 2012 SQL Server in terms of non clustered index with regard to my question and also is it 249 is the max we can create ?

Thank you in advance.

Best Answer

Having many indices requires SQL Server to maintain and update many indices for each insert, delete or update statement. So having less indices is generally better.

However, when you have a compound index (made up of multiple columns), that only helps if you use/specify the n left-most columns in your query.

So if you have an index on (City, LastName, FirstName) for an address table, that index might be used if

you have all three columns in your WHERE clause
you specify WHERE City = 'London' in your query (using the 1 left-most column)
you specify WHERE City = 'London' and LastName = 'Smith' in your query (using the 2 left-most columns)

However, such an index cannot ever be used for:

finding all rows with a WHERE FirstName = 'Joe' clause - you're not using the n left-most columns in that index
finding all rows with a WHERE LastName = 'Brown' AND FirstName = 'Charlie' clause - again: you're not using the n left-most columns in that index

A compound index cannot (in many cases) replace all one-column indices. Be very careful when designing and creating compound indices to make sure you design them in a way they'll really be useful!

So while compound indices do have their reason to be, and they're generally preferable - you also need to make sure they'll really be able to be used. An index that is just being maintained (and causes effort on SQL Server's side), but can never be used in any queries, is the worst index ever - no gain, only maintenance overhead.

You should also read Kimberly The Queen of Indexing Tripp's excellent blog posts on SQL Server indexing - most notably Indexes: just because you can, doesn't mean you should!

Related Solutions

Sql-server – Non-Clustered Index key columns to cover a variable where clause

That's a great question.
And there are good answers.

The engine definitely will use the index even if you don't use every key column.
That's especially so if they are in order, as you are talking about.
(can anyone else speak to different orders of key columns?)

You will benefit just fine from selecting just on the first column alone as a key, or multiple columns.
What will make a difference - for any index - is staying inside the INCLUDEd columns.
No matter how many key columns you use in your Where, the performance hit for having to go back to the primary key for additional columns can be huge as it doubles the "operations".

When it comes to dealing with performance vs. size, you have the same problem as with any index.
Since you know you want the same columns returned in all cases, if you are READ focused, you will probably want to the index with all 6, if you INCLUDE everything.
It will certainly save you db size compared to making both indexes.

On WRITE, you obviously have a bigger burden with a larger index. That is a significant additional amount of sorting.
If you do just one row inserted at a time, maybe it won't hardly matter at all.
If you do bulk inserts, you'll definitely want to test the two indexes to see the write performance for your actual inserts.

Sql-server – Improve query performance when selecting almost all rows with many “group by” columns

The non-clustered index you have tested is not the best for this query. It can be used for the WHERE clause and for doing an index scan instead of a full table scan but it cannot be used for the GROUP BY.

The best possible index would have to be a partial index (to filter the unwanted rows from the WHERE clause), then have all the columns used in the GROUP BY and then INCLUDE all the other columns used in the SELECT:

CREATE INDEX special_ix 
  ON dbo.Commissions_Output
    ( company, location, account, 
      salesroute, employee, producttype, 
      item, loadjdate, commissionrate ) 
INCLUDE 
  ( [Extended Sales Price], [Delivered Qty] ) 
WHERE 
  ( [Extended Sales Price] <> 0 ) ;

Best Answer

Related Solutions

Sql-server – Non-Clustered Index key columns to cover a variable where clause

Sql-server – Improve query performance when selecting almost all rows with many “group by” columns

Related Question