Sql-server – How to avoid lengthy CASE statement

sql serversql-server-2017

Below is my simplified scenario. We have a AdCampaign table with the below columns:

ID   Provider     Campaign         AdFlag
 1   Instagram    Influencer123    1

I have been updating the AdFlag based on a lookup table that contained the list of Providers, like so

update t
set AdFlag = 1
from AdCampaign t
join lookup.AdFlag s on t.provider = s.provider

Now the requirements have changed and the logic is something like this:

set AdFlag = 1 according to:

when Provider = 'instagram'
       and campaign = 'InstagramCampaign'
when Provider = 'instagram' 
       and keyword like '%foobar%'

when Provider = 'google'
        and keyword like '%foobar%'

when Provider = 'bing'
        and campaign like '%abc123%'
        and keyword like '%foobar%'

… and so on for about five more providers with slightly different logic

Before this requirement change, it was data-driven and clean. Now, it looks like I'll need to replace my lookup table with a lengthy and harder to read CASE statement.

My question is: is there a better approach?

Best Answer

You can still make it data driven - you just need more data!

First some data setup:

drop table if exists #Lookup;
drop table if exists #AdCampaign;

with LookupData as
(
    select * from
    (
        values
        ('instagram', 'InstagramCampaign', NULL, NULL),
        ('instagram', NULL, NULL, '%foobar%'),
        ('google', NULL, NULL, '%foobar%'),
        ('bing', NULL, '%abc123%', '%foobar%')
    ) as T(Provider, CampaignEqual, CampaignLike, KeywordLike)
)
select * into #Lookup from LookupData;


with CampaignData as
(
    select * from
    (
        values
        (1, 'Instagram', 'InstagramCampaign', NULL),
        (2, 'bing', 'Contains abc123 internally', 'bazfoobarbix')
    ) as T(ID, Provider, Campaign, Keyword)
)select * Into #AdCampaign from CampaignData;

There's no magic in there. It's all just a convenient (for me) way to populate some tables without worrying about datatypes, constraints, keys and whatnot all. The pertinent bit is the columns in #Lookup. I've inferred that Provider is mandatory. Campaign is split in two depending on how the values are to be compared. I'll get to that in a minute.

I use NULL in #Lookup to represent "not applicable" or "not a part of this comparison".

And here's the query that does the matching:

select
   a.*,
   '|',           -- just a visual separator for SSMS
   l.*
from #AdCampaign as a
inner join #Lookup as l
on a.Provider = l.Provider      -- I'm taking this as mandatory.
and (l.CampaignEqual is NULL or l.CampaignEqual = a.Campaign)
and (l.CampaignLike is NULL or a.Campaign like l.CampaignLike)
and (l.KeywordLike is NULL or a.Keyword like l.KeywordLike);

The idea is to have one column in #Lookup per value-operator pair. These then translate quite directly into the WHERE clause of the actual update. So Lookup.CampaignEqual compares to AdCampaign.Campaign using and "equals" operator. That's why Lookup contains both CampaignEqual and CampaignLike - so it's obvious what to compare to and how. This design can be extended indefinitely with columns such as ValueEqual, ValueLessThan, ValueStartsWith etc.

You could have fewer columns in #Lookup at the cost of more involved predicates. For example if we have just Lookup.Campaign then the predicate becomes

.. and (l.Campaign is NULL
     or a.Campaign = l.Campaign
     or a.Campaign like l.Campaign
     ... etc.
   )

The meaning's the same. I think the performance will be the same. It's a matter of style and what you're comfortable with maintaining.

It may be a little easier to get the lookup data correct with just a single column. Putting a wildcard in a ..Equal column or missing them from a ..Like would cause the matching to break.

OK, so more data and a somewhat lengthy where clause.