SQL Server Query Performance – Fastest Query to Filter Product by Countries

query-performancesql server

I have a large Product table:

ProductID ProductName
1 Product 1
2 Product 2
n Product n

whereas n is around 2,000,000.

To limit product availability to a certain country, I have a second ProductCountryRestriction table:

ProductID CountryCode IncludeExclude
1 USA 0
1 UK 0
2 AUS 1
2 NZ 1

If a product is restricted to one or more countries, I will then add product- included countries to this table with IncludeExclude = 0. For example Product 1 is available for USA and UK only.

If a product is valid for all countries except some countries, I will then add product – excluded countries to this table with IncludeExclude = 1. For example Product 2 is available for all countries except AUS and NZ.

All country codes corresponding to a product must either be set as IncludeExclude = 0 or as IncludeExclude = 1. The mixing of 0 and 1 is not allowed. For example, there are records with IncludeExclude = 0 for Product 1, I can't add a record with ProductID = 1 and IncludeExclude = 1 to this table.

An user of the system can select several countries to work with. The user preference is stored in UserCountry table:

UserID CountryCode
1 USA
1 NZ

So the User 1 can see Product 1 because it is available for USA. He can also see Product 2 because the Product 2 is also available for USA (although it is excluded from NZ).
Here is my query to get all available products for an user:

DECLARE @UserID int = 1;

SELECT P.*
FROM Product P
WHERE 
    EXISTS
    (
        SELECT * FROM ProductCountryRestriction PCR 
        WHERE 
            PCR.ProductId = P.ProductId
            AND PCR.IncludeExclude = 0 
            AND PCR.CountryCode IN (SELECT CountryCode FROM UserCountry WHERE UserID = @UserID)
    )
    OR EXISTS 
    (
        SELECT * FROM UserCountry UC
        WHERE UserID = @UserID
            AND UC.CountryCode NOT IN (SELECT CountryCode FROM ProductCountryRestriction PCR WHERE PCR.ProductId = P.ProductId AND PCR.IncludeExclude = 1)
    )

This query works as expected but it introduces poor performance. What can I do to improve it? I don't mind to change the DB design.
Thank you for reading my question! I will be grateful for any help you can provide.

Edited 12/31/2020 – Added execution plan as suggested by @J.D.
Please check this link for the plan: https://www.brentozar.com/pastetheplan/?id=BywbFaqaw

Best Answer

Since wrote you won't mind changing the design... Change the design.

Instead of that complicated logic with the flag and implicit inclusions/exclusions, just have a table that maps products to countries. Let's simply call it productcountry. If, and only if, a record for a country exists for a product in productcountry the product is available in that country.

The query then just uses some JOINs and a WHERE.

SELECT DISTINCT 
       p.*
       FROM product p
            INNER JOIN productcountry pc
                       ON pc.productid = p.productid
            INNER JOIN usercountry uc
                       ON uc.countrycode = pc.countrycode
       WHERE uc.userid = @userid;

For that you should try indexes on usercountry (userid, countrycode), productcountry (countrycode, productid) and product (id).

It will need a DISTINCT though because a product can be available in more than one country a user uses. (I silently assume here that the products are distinct by themselves, i.e. they have a key.) You can experiment, if you yield a better plan, if you instead use EXISTS and a correlated subquery.

SELECT p.*
       FROM product p
       WHERE EXISTS (SELECT *
                            FROM productcountry pc
                                 INNER JOIN usercountry uc
                                            ON uc.countrycode = pc.countrycode
                            WHERE uc.userid = @userid
                                  AND pc.productid = p.productid);

Here you can try indexes on usercountry (userid, countrycode) and productcountry (productid, countrycode).