I'm in a situation where I want to get the minimum value from of 6 columns.
I've found three ways so far to accomplish this, but I have concerns with the performance of these methods and would like to know which would be better for performance.
The first method is to use a big case statement. Here's an example with 3 columns, based on the example in the link above. My case statement would be much longer since I will be looking at 6 columns.
Select Id,
Case When Col1 <= Col2 And Col1 <= Col3 Then Col1
When Col2 <= Col3 Then Col2
Else Col3
End As TheMin
From MyTable
The second option is to use the UNION
operator with multiple select statements. I would put this in an UDF that accepts an Id parameter.
select Id, dbo.GetMinimumFromMyTable(Id)
from MyTable
and
select min(col)
from
(
select col1 [col] from MyTable where Id = @id
union all
select col2 from MyTable where Id = @id
union all
select col3 from MyTable where Id = @id
) as t
And the 3rd option I found was to use the UNPIVOT operator, which I didn't even know existed until just now
with cte (ID, Col1, Col2, Col3)
as
(
select ID, Col1, Col2, Col3
from TestTable
)
select cte.ID, Col1, Col2, Col3, TheMin from cte
join
(
select
ID, min(Amount) as TheMin
from
cte
UNPIVOT (Amount for AmountCol in (Col1, Col2, Col3)) as unpvt
group by ID
) as minValues
on cte.ID = minValues.ID
Because of the table size, and frequency in which this table is queried and updated, I am concerned about the performance impact these queries would have on the database.
This query will actually be used in a join to a table with a few million records, however the records returned will be reduced to around a hundred records at a time. It will get run many times throughout the day, and the 6 columns I am querying are frequently updated (they contain daily stats). I do not think there are any indexes on the 6 columns I am querying.
Which of these methods is better for performance when trying to get the minimum of multiple columns? Or is there another better method that I don't know of?
I am using SQL Server 2005
Sample Data & Results
If my data contained records like this:
Id Col1 Col2 Col3 Col4 Col5 Col6 1 3 4 0 2 1 5 2 2 6 10 5 7 9 3 1 1 2 3 4 5 4 9 5 4 6 8 9
The end result should be
Id Value 1 0 2 2 3 1 4 4
Best Answer
I tested the performance of all 3 methods, and here's what I found:
UNION
subquery was a little slower. TheCASE WHEN
query is a little faster than theUNPIVOT
one.UNION
subquery is significantly slower, butUNPIVOT
query becomes a little faster than theCASE WHEN
queryUNION
subquery still significantly slower, butUNPIVOT
becomes much faster than theCASE WHEN
querySo the end results seems to be
With smaller record sets there doesn't seem to be enough of a difference to matter. Use whatever is easiest to read and maintain.
Once you start getting into larger record sets, the
UNION ALL
subquery begins to perform poorly compared to the other two methods.The
CASE
statement performs the best up until a certain point (in my case, around 100k rows), and which point theUNPIVOT
query becomes the best-performing queryThe actual number at which one query becomes better than another will probably change as a result of your hardware, database schema, data, and current server load, so be sure to test with your own system if you're concerned about performance.
I also ran some tests using Mikael's answer; however, it was slower than all 3 of the other methods tried here for most recordset sizes. The only exception was it did better than a the
UNION ALL
query for very large recordset sizes. I like the fact it shows the column name in addition to the smallest value though.I'm not a dba, so I may not have optimized my tests and missed something. I was testing with the actual live data, so that may have affected the results. I tried to account for that by running each query a few different times, but you never know. I would definitely be interested if someone wrote up a clean test of this and shared their results.