Sql-server – How to efficiently handle querying for different combinations of optional parameters

functionsjoin;sql serverstored-procedureswhere

Consider the following Tables:

Table: Parts
    part_id INT PK
    part_number VARCHAR

Table: Jobs
    job_id INT PK
    job_number VARCHAR

Table: Job_Parts
    part_id INT FK PK
    job_id INT FK PK
    quantity INT

Running the following statement:

SELECT
    p.part_number AS [Part],
    j.job_number AS [Job],
    jp.quantity AS [Quantity]
FROM
    dbo.parts AS p
    LEFT OUTER JOIN
        dbo.job_parts AS jp
        ON jp.part_id = p.part_id
    Left OUTER JOIN
        dbo.jobs AS j
        ON j.job_id = jp.job_id

Will give a table that can give results like the following:

Part        |Job        |Quantity
Part1234     Job1        10
Part1234     Job2        5
Part1234     Job3        7

Now let's pretend I have that above statement inside of a Function:

CREATE FUNCTION JobParts(
    @PartNum VARCHAR = NULL,
    @JobNum VARCHAR = NULL
) RETURNS TABLE AS (
    --SELECT STATEMENT
)

Is there a way for me to modify the table join in the select statement such that:

  1. If @PartNum or @JobNum have a value, it joins where part_number or job_number equal the provided value
  2. If @PartNum or @JobNum are NULL, it will only return the last record put in for a part.

To give an example, if I enter @JobNum='Job2' I want the result I get to be:

Part        |Job        |Quantity
Part1234     Job2        5

Otherwise, if @JobNum is NULL is defaults to just showing the last entry in the table, so my result would be:

Part        |Job        |Quantity
Part1234     Job3        7

Best Answer

I highly recommend that you don't go down that path. You are mixing business logic with relational querying, and that's a sure recipe for many headaches later on, especially in terms of performance.

Keep in mind the rule of '1 query for 1 task', and the simpler and more granular, the better. You are trying to write 1 query, to handle different tasks.

Use a business control layer either on the client side, a dedicated business tier, or even in a stored procedure on the server, and execute different dedicated, simple SQL queries based on your logic.

For example, in your case you could write a procedure that will accept both parameters and execute the right query based on the values provided. The procedure has negligible overhead, and you get the huge advantage of having separate, optimized query plans created for every combination.

CREATE PROCEDURE SomeProc 
@PartNum VARCHAR,
@JobNum VARCHAR
AS
BEGIN
IF @PartNum IS NULL AND @JobNum IS NULL
-- Specific Query 1 (or execute sub-procedure) goes here
ELSE IF @PartNum IS NOT NULL AND @JobNum IS NOT NULL
-- Specific Query 2 (or execute sub-procedure) goes here
ELSE IF @PartNum IS NULL AND @JobNum IS NOT NULL
-- Specific Query 3 ... etc.
ELSE -- 
-- Generic query, not optimized
END

UPDATE: If you need this to work for a large number of parameters, I highly recommend that you read this article by Erland Sommarskog. It is the best article that I know of for this type of challenge, and will offer you several solutions, with thorough analysis of the pros and cons of each. It's long, it's not an easy read, but it will be worth every minute you spend on it.

What worked well for me in previous similar cases, is a hybrid approach. I start with a generic solution, you will find a few options in the article. Then, I set up a trace, or a log table to record actual usage stats by the users and let it run for some time. Although product and marketing will say that all combinations are equally important, you will find that in real life, most users tend to use only a few and repeat the same pattern. Then, take the top used combinations, and write specific queries just for those, and use the generic query for the few instances that a unique combination of parameters is used. I've added the line of code to the example. This way you get the best of both worlds - for most executions, you will get excellent performance, and still support all possible combinations.

BTW - An interesting side effect of this approach, is that over time users start to notice that it takes much longer when they use the non-standard combinations, and then they either complain about it, so you can write a specific query for that case and make them happy, but usually they tend to just use the common combinations more, that are way faster :-)

HTH