I tested the performance of all 3 methods, and here's what I found:
- 1 record: No noticeable difference
- 10 records: No noticeable difference
- 1,000 records: No noticeable difference
- 10,000 records: The `UNION` subquery was a little slower. The `CASE WHEN` query was a little faster than the `UNPIVOT` one.
- 100,000 records: The `UNION` subquery was significantly slower, but the `UNPIVOT` query became a little faster than the `CASE WHEN` query.
- 500,000 records: The `UNION` subquery was still significantly slower, but `UNPIVOT` became much faster than the `CASE WHEN` query.
So the end result seems to be:
With smaller record sets there doesn't seem to be enough of a difference to matter. Use whatever is easiest to read and maintain.
Once you start getting into larger record sets, the `UNION ALL` subquery begins to perform poorly compared to the other two methods.
The `CASE` statement performs the best up until a certain point (in my case, around 100k rows), at which point the `UNPIVOT` query becomes the best-performing query.
The actual number at which one query becomes better than another will probably change as a result of your hardware, database schema, data, and current server load, so be sure to test with your own system if you're concerned about performance.
I also ran some tests using Mikael's answer; however, it was slower than all 3 of the other methods tried here for most recordset sizes. The only exception was that it did better than the `UNION ALL` query for very large recordset sizes. I do like that it shows the column name in addition to the smallest value, though.
I'm not a DBA, so I may not have optimized my tests and may have missed something. I was testing with actual live data, so that may have affected the results. I tried to account for that by running each query a few different times, but you never know. I would definitely be interested if someone wrote up a clean test of this and shared their results.
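For reference, the three methods compared above can be sketched as follows. The table and column names (`dbo.MyTable` with an `id` key and value columns `c1`-`c3`) are hypothetical placeholders, since the question's actual schema isn't shown here; adjust them to your own table. Each query returns the smallest of the three column values per row:

```sql
-- Method 1: UNION ALL subquery
SELECT T.id,
       (SELECT MIN(v)
        FROM (SELECT T.c1 AS v
              UNION ALL SELECT T.c2
              UNION ALL SELECT T.c3) AS X) AS smallest
FROM dbo.MyTable AS T;

-- Method 2: CASE WHEN
SELECT T.id,
       CASE WHEN T.c1 <= T.c2 AND T.c1 <= T.c3 THEN T.c1
            WHEN T.c2 <= T.c3 THEN T.c2
            ELSE T.c3
       END AS smallest
FROM dbo.MyTable AS T;

-- Method 3: UNPIVOT
SELECT U.id, MIN(U.v) AS smallest
FROM dbo.MyTable
UNPIVOT (v FOR col IN (c1, c2, c3)) AS U
GROUP BY U.id;
```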
Not sure that this is simpler than a cursor solution, but here is what I would do:
Before we begin we need a few tables in a SQL Fiddle:
MS SQL Server 2012 Schema Setup
CREATE TABLE dbo.tbl1(c1 INT, c2 INT, c3 INT);
CREATE TABLE dbo.tbl2(c4 INT, c5 INT, c6 INT);
CREATE TABLE dbo.tbl3(c7 INT, c8 INT, c9 INT);
First we need the list of all user tables. For that we can use the `sys.tables` catalog view. The `OBJECT_SCHEMA_NAME()` function gets us the schema name, the `OBJECT_NAME()` function the table name. The `QUOTENAME()` function quotes the names correctly, in case some of them contain special characters, keywords or spaces. (In this example the use of `OBJECT_NAME()` is not strictly necessary, as `sys.tables` has a `name` column, but I left it in as you can use this pattern with any catalog view that has an `object_id` column.)
Query 1:
SELECT
QUOTENAME(OBJECT_SCHEMA_NAME(T.object_id))+'.'+
QUOTENAME(OBJECT_NAME(T.object_id)) AS quoted_table_name,
T.object_id
FROM sys.tables AS T;
Results:
| QUOTED_TABLE_NAME | OBJECT_ID |
|-------------------|-----------|
| [dbo].[tbl1] | 245575913 |
| [dbo].[tbl2] | 261575970 |
| [dbo].[tbl3] | 277576027 |
The next step is to get the list of column names, again quoted. We can use the `sys.columns` catalog view for that.
Query 2:
SELECT C.name,C.column_id
FROM sys.columns AS C
WHERE C.object_id = OBJECT_ID('dbo.tbl1');
Results:
| NAME | COLUMN_ID |
|------|-----------|
| c1 | 1 |
| c2 | 2 |
| c3 | 3 |
The next hurdle is to get those columns into a comma-separated list. There is no built-in string concatenation aggregate function, so we have to use a trick:
Query 3:
SELECT STUFF((
SELECT ','+QUOTENAME(name)
FROM sys.columns AS C
WHERE C.object_id = OBJECT_ID('dbo.tbl1')
ORDER BY C.column_id
FOR XML PATH(''),TYPE
  ).value('.','NVARCHAR(MAX)'),1,1,'') AS column_list;
Results:
| COLUMN_LIST    |
|----------------|
| [c1],[c2],[c3] |
With that, all the pieces are in place and we just have to put them together:
Query 4:
SELECT 'SELECT ' +
CL.column_list +
' FROM ' +
QUOTENAME(OBJECT_SCHEMA_NAME(T.object_id)) + '.' +
QUOTENAME(OBJECT_NAME(T.object_id)) +
';' AS select_statement
FROM sys.tables AS T
CROSS APPLY (
SELECT STUFF((
SELECT ','+QUOTENAME(name)
FROM sys.columns AS C
WHERE C.object_id = T.object_id
ORDER BY C.column_id
FOR XML PATH(''),TYPE
).value('.','NVARCHAR(MAX)'),1,1,'') AS column_list
)CL;
Results:
| SELECT_STATEMENT |
|------------------------------------------|
| SELECT [c1],[c2],[c3] FROM [dbo].[tbl1]; |
| SELECT [c4],[c5],[c6] FROM [dbo].[tbl2]; |
| SELECT [c7],[c8],[c9] FROM [dbo].[tbl3]; |
This example will work in SQL 2005 and later, assuming you are on the latest service pack.
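As a side note: on SQL Server 2017 and later, the `FOR XML PATH` concatenation trick in Query 3 can be replaced with the built-in `STRING_AGG` aggregate, which is easier to read. A minimal sketch against the same `dbo.tbl1` table from the setup above:

```sql
SELECT STRING_AGG(QUOTENAME(C.name), ',')
         WITHIN GROUP (ORDER BY C.column_id) AS column_list
FROM sys.columns AS C
WHERE C.object_id = OBJECT_ID('dbo.tbl1');
```

This returns the same `[c1],[c2],[c3]` list without the `STUFF`/`.value()` plumbing, but it is not an option if you have to support older versions.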
There is no clean solution to achieve this with SQL Server 2000. In that case you need to go back to your cursor solution.
Best Answer
Of course, the indexes need to be prioritized. You can create only those indexes that would impact the largest number of users, the most critical users, or have the greatest impact on the system. The following blog post describes an easy way to identify missing indexes: http://blogs.msdn.com/b/bartd/archive/2007/07/19/are-you-using-sql-s-missing-index-dmvs.aspx
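The blog post linked above is based on SQL Server's missing-index DMVs. A sketch of that kind of query (the `TOP (25)` cutoff and the `user_seeks * avg_user_impact` ranking are arbitrary choices for illustration, not part of the blog's exact method):

```sql
-- List the most promising missing indexes suggested by the optimizer,
-- ranked by how often they would have been used and their estimated impact.
SELECT TOP (25)
       DB_NAME(D.database_id)   AS database_name,
       D.statement              AS table_name,
       D.equality_columns,
       D.inequality_columns,
       D.included_columns,
       S.user_seeks,
       S.avg_user_impact
FROM sys.dm_db_missing_index_details AS D
JOIN sys.dm_db_missing_index_groups AS G
  ON G.index_handle = D.index_handle
JOIN sys.dm_db_missing_index_group_stats AS S
  ON S.group_handle = G.index_group_handle
ORDER BY S.user_seeks * S.avg_user_impact DESC;
```

Note that these DMVs are reset when the instance restarts, so treat the suggestions as hints to investigate, not as indexes to create blindly.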