I run a report in which I have a situation where based on a column value which is basically a Key or Id, I need to fetch corresponding value from the mapping Id table. Something like below:
SELECT
(case when [column1='A'] then (select value from Table1)
when [column1='B'] then (select value from Table2)
when [column1='C'] then (select value from Table3)
and so on uptil 35 more 'when' conditions ...
ELSE column1 end) Value
from Table1
More precisely:
SELECT
(case when [A.column1='1']
then (select value from B where B.clientId=100 and A.column1=B.Id)
when [A.column1='2']
then (select value from C where C.clientId=100 and A.column1=C.Id)
when [A.column1='3']
then (select value from D where D.clientId=100 and A.column1=D.Id)
...
and so on uptil 30 more 'when' conditions
...
ELSE column1 end)
FROM A
In Tables B,C,D..so on, we maintain data for all clients. Each client has a specific ClientId
and these tables B,C,D etc have index in place on Id
and ClientId
columns.
Is there an efficient way to handle this in SQL Server?
Best Answer
I'm assuming that you have appropriate indexes on the tables in the subqueries. I mocked up some quick test data and put 10 million rows in table
A
. I wasn't game to create 30 tables so I just created 3 for theCASE
expression. I think that 3 is enough to show the general principles.I disabled result sets and ran the following query in SSMS:
It took around 0.723 seconds. I'm doing pretty unscientific tests because I don't know anything your data. In any case, with serial queries we can't expect a better result than 0.7 seconds. That's our baseline.
The most efficient way to write this query is without joins at all. The key is that the
CASE
expression is only ever going to return 3 (or 30) unique values if it finds a match. You can save off the results into local variables and just use those in the query. The query below finishes in around 1.044 seconds:The plan is very simple:
For another option you can write the query with joins (where we can rewrite the
CASE
expression in a more compact form, usingCOALESCE()
. This finished in around 2.314 seconds:Here's the plan:
You can get a nearly identical runtime and query plan by writing the query like this:
The original query in the question has an issue: SQL Server is doing a useless sort before the nested loop join. That query finishes in around 5.838 seconds on my machine.
Trace flag 8690 eliminates the sort as well as the table spools. The query runs in about 7.479 seconds with trace flag 8690, so I suppose that the spools are helpful for this query.