Sql-server – the most efficient way to get the minimum of multiple columns on SQL Server 2005

aggregateperformancesql serversql-server-2005

I'm in a situation where I want to get the minimum value from of 6 columns.

I've found three ways so far to accomplish this, but I have concerns with the performance of these methods and would like to know which would be better for performance.

The first method is to use a big case statement. Here's an example with 3 columns, based on the example in the link above. My case statement would be much longer since I will be looking at 6 columns.

Select Id,
       Case When Col1 <= Col2 And Col1 <= Col3 Then Col1
            When Col2 <= Col3 Then Col2 
            Else Col3
            End As TheMin
From   MyTable

The second option is to use the UNION operator with multiple select statements. I would put this in an UDF that accepts an Id parameter.

select Id, dbo.GetMinimumFromMyTable(Id)
from MyTable

and

select min(col)
from
(
    select col1 [col] from MyTable where Id = @id
    union all
    select col2 from MyTable where Id = @id
    union all
    select col3 from MyTable where Id = @id
) as t

And the 3rd option I found was to use the UNPIVOT operator, which I didn't even know existed until just now

with cte (ID, Col1, Col2, Col3)
as
(
    select ID, Col1, Col2, Col3
    from TestTable
)
select cte.ID, Col1, Col2, Col3, TheMin from cte
join
(
    select
        ID, min(Amount) as TheMin
    from 
        cte 
        UNPIVOT (Amount for AmountCol in (Col1, Col2, Col3)) as unpvt
    group by ID
) as minValues
on cte.ID = minValues.ID

Because of the table size, and frequency in which this table is queried and updated, I am concerned about the performance impact these queries would have on the database.

This query will actually be used in a join to a table with a few million records, however the records returned will be reduced to around a hundred records at a time. It will get run many times throughout the day, and the 6 columns I am querying are frequently updated (they contain daily stats). I do not think there are any indexes on the 6 columns I am querying.

Which of these methods is better for performance when trying to get the minimum of multiple columns? Or is there another better method that I don't know of?

I am using SQL Server 2005

Sample Data & Results

If my data contained records like this:

Id    Col1    Col2    Col3    Col4    Col5    Col6
1        3       4       0       2       1       5
2        2       6      10       5       7       9
3        1       1       2       3       4       5
4        9       5       4       6       8       9

The end result should be

Id    Value
1        0
2        2
3        1
4        4

Best Answer

I tested the performance of all 3 methods, and here's what I found:

  • 1 record: No noticeable difference
  • 10 records: No noticeable difference
  • 1,000 records: No noticeable difference
  • 10,000 records: UNION subquery was a little slower. The CASE WHEN query is a little faster than the UNPIVOT one.
  • 100,000 records: UNION subquery is significantly slower, but UNPIVOT query becomes a little faster than the CASE WHEN query
  • 500,000 records: UNION subquery still significantly slower, but UNPIVOT becomes much faster than the CASE WHEN query

So the end results seems to be

  • With smaller record sets there doesn't seem to be enough of a difference to matter. Use whatever is easiest to read and maintain.

  • Once you start getting into larger record sets, the UNION ALL subquery begins to perform poorly compared to the other two methods.

  • The CASE statement performs the best up until a certain point (in my case, around 100k rows), and which point the UNPIVOT query becomes the best-performing query

The actual number at which one query becomes better than another will probably change as a result of your hardware, database schema, data, and current server load, so be sure to test with your own system if you're concerned about performance.

I also ran some tests using Mikael's answer; however, it was slower than all 3 of the other methods tried here for most recordset sizes. The only exception was it did better than a the UNION ALL query for very large recordset sizes. I like the fact it shows the column name in addition to the smallest value though.

I'm not a dba, so I may not have optimized my tests and missed something. I was testing with the actual live data, so that may have affected the results. I tried to account for that by running each query a few different times, but you never know. I would definitely be interested if someone wrote up a clean test of this and shared their results.