Sql-server – Consolidating indexes

index performance sql-server sql-server-2005

I have one big table that is used to generate a business intelligence cube. Currently it has around 40M rows and 55 columns. Many of the cube dimensions are generated by running 'select distinct' on a column of the big table.

Currently I have one index for each 'select distinct', 16 indexes in total. Is it better to merge some of those indexes? My thinking is that the cube needs to process all rows anyway, and the indexes speed up the process because each one is much smaller than the entire table; if I put two or three columns in one index, the index size will not grow very much. Maybe the good candidates are columns that have low specificity.

Am I going in the right direction? Can the query make use of the second column (or third, fourth, …) of the index?

I'm using SQL Server 2005 Standard Edition.

Best Answer

If the "cube" needs to process all rows, indexes won't speed up the process as the querying engine would just do a table scan.

I think the best advice I could give is to familiarize yourself with query plan output so that you can see how the database is satisfying the query requests. This will allow you to see what access methods are used to retrieve data. It would also allow you to answer your own question: if you create the proposed index, does SQL Server use it to fulfill your query?
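As a rough sketch of what that looks like in practice, the batches below ask SQL Server for the estimated plan of one dimension query and then run it with I/O and timing statistics switched on. The table name dbo.BigTable and the column name columnB are placeholders, not names from the question, and GO is the batch separator understood by Management Studio and sqlcmd rather than a T-SQL statement.

```sql
-- Assumed names: dbo.BigTable and columnB are hypothetical placeholders.

-- 1) Show the estimated plan as text. While SHOWPLAN_TEXT is ON the query
--    is not executed; SQL Server only returns the plan it would use.
SET SHOWPLAN_TEXT ON;
GO
SELECT DISTINCT columnB FROM dbo.BigTable;
GO
SET SHOWPLAN_TEXT OFF;
GO

-- 2) Run the query for real and report logical/physical reads and CPU time
--    in the Messages output.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
GO
SELECT DISTINCT columnB FROM dbo.BigTable;
GO
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
GO
```

In the plan text, look at which operator reads the data (a table or clustered index scan versus an index scan on one of your nonclustered indexes) and which index name it mentions.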

I think the answer to your question is going to be "No". Suppose your "cube" is being materialized by executing "SELECT DISTINCT columnB FROM table", and you currently have separate indexes on "columnA" and "columnB". "Consolidating" those indexes into a single index on "(columnA, columnB)" is unlikely to help your "cube" queries on "columnB": the combined index is wider than the index on "columnB" alone and is not ordered by "columnB". Your mileage may vary; without access to the schema or the query source, these kinds of questions are difficult to answer definitively.
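One way to check this on your own data, ideally against a test copy since building an index on 40M rows is not cheap, is to create the consolidated index next to the existing one and compare the reads each reports. This is only a sketch: dbo.BigTable, columnA, columnB and both index names are made up for illustration, and it assumes a single-column index named IX_BigTable_columnB already exists.

```sql
-- Assumed names: the table, columns and index names are hypothetical.
-- IX_BigTable_columnB is assumed to be the existing index on (columnB).

-- Build the candidate "consolidated" index.
CREATE NONCLUSTERED INDEX IX_BigTable_columnA_columnB
    ON dbo.BigTable (columnA, columnB);
GO

SET STATISTICS IO ON;
GO
-- Force each index in turn with a table hint so the logical reads
-- reported for the two statements can be compared side by side.
SELECT DISTINCT columnB
FROM dbo.BigTable WITH (INDEX(IX_BigTable_columnB));

SELECT DISTINCT columnB
FROM dbo.BigTable WITH (INDEX(IX_BigTable_columnA_columnB));
GO
SET STATISTICS IO OFF;
GO

-- Remove the experimental index if it did not earn its keep.
DROP INDEX IX_BigTable_columnA_columnB ON dbo.BigTable;
GO
```

The hints are there only to make the comparison explicit. Running the query without them shows which index the optimizer would pick on its own.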

Having said that, I think the correct response is to point you towards the SQL Server data warehousing functionality instead of rolling your own solution.