I tested the performance of all 3 methods, and here's what I found:
- 1 record: No noticeable difference
- 10 records: No noticeable difference
- 1,000 records: No noticeable difference
- 10,000 records: The `UNION` subquery was a little slower. The `CASE WHEN` query was a little faster than the `UNPIVOT` one.
- 100,000 records: The `UNION` subquery was significantly slower, but the `UNPIVOT` query became a little faster than the `CASE WHEN` query.
- 500,000 records: The `UNION` subquery was still significantly slower, but `UNPIVOT` became much faster than the `CASE WHEN` query.
So the end result seems to be:
With smaller record sets there doesn't seem to be enough of a difference to matter. Use whatever is easiest to read and maintain.
Once you start getting into larger record sets, the `UNION ALL` subquery begins to perform poorly compared to the other two methods.
The `CASE` statement performs the best up until a certain point (in my case, around 100k rows), at which point the `UNPIVOT` query becomes the best-performing query.
The actual number at which one query becomes better than another will probably change as a result of your hardware, database schema, data, and current server load, so be sure to test with your own system if you're concerned about performance.
I also ran some tests using Mikael's answer; however, it was slower than all 3 of the other methods tried here for most recordset sizes. The only exception was that it did better than the `UNION ALL` query for very large recordset sizes. I do like that it shows the column name in addition to the smallest value, though.
I'm not a DBA, so I may not have optimized my tests and may have missed something. I was testing with actual live data, so that may have affected the results. I tried to account for that by running each query a few different times, but you never know. I would definitely be interested if someone wrote up a clean test of this and shared their results.
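For reference, the three methods compared above can be sketched as follows. The table and column names (`dbo.MyTable` with an `id` key and value columns `c1`-`c3`) are hypothetical placeholders, since the question's actual schema isn't shown here; adjust them to your own table. Each query returns the smallest of the three column values per row:

```sql
-- Method 1: UNION ALL subquery
SELECT T.id,
       (SELECT MIN(v)
        FROM (SELECT T.c1 AS v
              UNION ALL SELECT T.c2
              UNION ALL SELECT T.c3) AS X) AS smallest
FROM dbo.MyTable AS T;

-- Method 2: CASE WHEN
SELECT T.id,
       CASE WHEN T.c1 <= T.c2 AND T.c1 <= T.c3 THEN T.c1
            WHEN T.c2 <= T.c3 THEN T.c2
            ELSE T.c3
       END AS smallest
FROM dbo.MyTable AS T;

-- Method 3: UNPIVOT
SELECT U.id, MIN(U.v) AS smallest
FROM dbo.MyTable
UNPIVOT (v FOR col IN (c1, c2, c3)) AS U
GROUP BY U.id;
```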
Not sure that this is simpler than a cursor solution, but here is what I would do:
Before we begin we need a few tables in a SQL Fiddle:
MS SQL Server 2012 Schema Setup
CREATE TABLE dbo.tbl1(c1 INT, c2 INT, c3 INT);
CREATE TABLE dbo.tbl2(c4 INT, c5 INT, c6 INT);
CREATE TABLE dbo.tbl3(c7 INT, c8 INT, c9 INT);
First we need the list of all user tables. For that we can use the `sys.tables` catalog view. The `OBJECT_SCHEMA_NAME()` function gets us the schema name, the `OBJECT_NAME()` function the table name. The `QUOTENAME()` function quotes the names correctly, in case some of them contain special characters, keywords or spaces. (In this example the use of `OBJECT_NAME()` is not strictly necessary, as `sys.tables` has a `name` column, but I left it in as you can use this pattern with any catalog view that has an `object_id` column.)
Query 1:
SELECT
QUOTENAME(OBJECT_SCHEMA_NAME(T.object_id))+'.'+
QUOTENAME(OBJECT_NAME(T.object_id)) AS quoted_table_name,
T.object_id
FROM sys.tables AS T;
Results:
| QUOTED_TABLE_NAME | OBJECT_ID |
|-------------------|-----------|
| [dbo].[tbl1] | 245575913 |
| [dbo].[tbl2] | 261575970 |
| [dbo].[tbl3] | 277576027 |
The next step is to get the list of column names, again quoted. We can use the `sys.columns` catalog view for that.
Query 2:
SELECT C.name,C.column_id
FROM sys.columns AS C
WHERE C.object_id = OBJECT_ID('dbo.tbl1');
Results:
| NAME | COLUMN_ID |
|------|-----------|
| c1 | 1 |
| c2 | 2 |
| c3 | 3 |
The next hurdle is to get those columns into a comma-separated list. There is no built-in string concatenation aggregate function, so we have to use a trick:
Query 3:
SELECT STUFF((
SELECT ','+QUOTENAME(name)
FROM sys.columns AS C
WHERE C.object_id = OBJECT_ID('dbo.tbl1')
ORDER BY C.column_id
FOR XML PATH(''),TYPE
  ).value('.','NVARCHAR(MAX)'),1,1,'') AS column_list;
Results:
| COLUMN_LIST    |
|----------------|
| [c1],[c2],[c3] |
With that, all the pieces are in place and we just have to put them together:
Query 4:
SELECT 'SELECT ' +
CL.column_list +
' FROM ' +
QUOTENAME(OBJECT_SCHEMA_NAME(T.object_id)) + '.' +
QUOTENAME(OBJECT_NAME(T.object_id)) +
';' AS select_statement
FROM sys.tables AS T
CROSS APPLY (
SELECT STUFF((
SELECT ','+QUOTENAME(name)
FROM sys.columns AS C
WHERE C.object_id = T.object_id
ORDER BY C.column_id
FOR XML PATH(''),TYPE
).value('.','NVARCHAR(MAX)'),1,1,'') AS column_list
)CL;
Results:
| SELECT_STATEMENT |
|------------------------------------------|
| SELECT [c1],[c2],[c3] FROM [dbo].[tbl1]; |
| SELECT [c4],[c5],[c6] FROM [dbo].[tbl2]; |
| SELECT [c7],[c8],[c9] FROM [dbo].[tbl3]; |
This example will work in SQL 2005 and later, assuming you are on the latest service pack.
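As a side note: on SQL Server 2017 and later, the `FOR XML PATH` concatenation trick in Query 3 can be replaced with the built-in `STRING_AGG` aggregate, which is easier to read. A minimal sketch against the same `dbo.tbl1` table from the setup above:

```sql
SELECT STRING_AGG(QUOTENAME(C.name), ',')
         WITHIN GROUP (ORDER BY C.column_id) AS column_list
FROM sys.columns AS C
WHERE C.object_id = OBJECT_ID('dbo.tbl1');
```

This returns the same `[c1],[c2],[c3]` list without the `STUFF`/`.value()` plumbing, but it is not an option if you have to support older versions.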
There is no clean solution to achieve this with SQL Server 2000. In that case you need to go back to your cursor solution.
Best Answer
Of course, the indexes need to be prioritized. You can create only those indexes that would impact the largest number of users, the most critical users, or have the greatest impact on the system. The following blog post describes an easy way to identify missing indexes: http://blogs.msdn.com/b/bartd/archive/2007/07/19/are-you-using-sql-s-missing-index-dmvs.aspx
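The blog post linked above is based on SQL Server's missing-index DMVs. A sketch of that kind of query (the `TOP (25)` cutoff and the `user_seeks * avg_user_impact` ranking are arbitrary choices for illustration, not part of the blog's exact method):

```sql
-- List the most promising missing indexes suggested by the optimizer,
-- ranked by how often they would have been used and their estimated impact.
SELECT TOP (25)
       DB_NAME(D.database_id)   AS database_name,
       D.statement              AS table_name,
       D.equality_columns,
       D.inequality_columns,
       D.included_columns,
       S.user_seeks,
       S.avg_user_impact
FROM sys.dm_db_missing_index_details AS D
JOIN sys.dm_db_missing_index_groups AS G
  ON G.index_handle = D.index_handle
JOIN sys.dm_db_missing_index_group_stats AS S
  ON S.group_handle = G.index_group_handle
ORDER BY S.user_seeks * S.avg_user_impact DESC;
```

Note that these DMVs are reset when the instance restarts, so treat the suggestions as hints to investigate, not as indexes to create blindly.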