Sql-server – Restricting multiple records to one by preference

aggregatecasegroup bysql-server-2005

Below is a query I'm using to extract the best Price1 amongst several distributor's pricing tables.

SELECT 
    products.UPC, 
    T2.Price1, 
    vProducts_PriceQty_Union.Price2, 
    vProducts_PriceQty_Union.DBID AS PricingDBID, 
    T2.VendorCount
FROM         
    vProducts_PriceQty_Union 
INNER JOIN
    products ON 
        vProducts_PriceQty_Union.UPC = products.UPC 
INNER JOIN
    (
    SELECT 
        UPC, 
        MIN(Price1) AS Price1, 
        SUM(Quantity) AS Quantity, 
        COUNT(DBID) AS VendorCount
    FROM 
        vProducts_PriceQty_Union
    GROUP BY 
        UPC
    ) AS T2 ON 
        T2.UPC = vProducts_PriceQty_Union.UPC 
        AND 
        vProducts_PriceQty_Union.Price1 = T2.Price1 
WHERE
    vProducts_PriceQty_Union.UPC = '082442818986'
ORDER BY 
    products.UPC

"vProducts_PriceQty_Union" is a view with a UNION ALL of three tables, which looks like this:

SELECT     UPC, Price1, Price2, Quantity, DBID FROM Table1
    UNION ALL
SELECT     UPC, Price1, Price2, Quantity, DBID FROM Table2
    UNION ALL
SELECT     UPC, Price1, Price2, Quantity, DBID FROM Table3

Notes:

  1. I realize column name "DBID" is a reserved name, but it's too late in the game to change, and does not seem to affect any query results in an adverse way
  2. DBID is simply a numeric reference to a table containing the data.
  3. There are no duplicates of UPCs within any of the tables referenced in the union query, only duplicate UPCs when they are unioned
  4. The WHERE clause is only shown in the first query to return example results and not used in the production query

Normally, the first query shown above returns one row for each UPC, showing the lowest Price1 from the three available. However, in cases where Price1 is the same amongst more than one table, the query returns more than one row for a given UPC, as follows:

UPC             Price1      Price2      Quantity    Pricing DBID    VendorCount
082442818986    597.0000    569.0000    103.00      1               2
082442818986    597.0000    569.0000    103.00      2               2

My question is:

How do I list only one row for this type of instance (more than one returned), with a preference of using PricingDBID = 1 (the main store, not an offsite warehouse), when one of the results contains PricingDBID = 1. If one of the records does NOT contain PricingDBID = 1, then select only the first one (or random, if you feel up to it) that is returned (all PricingDBIDs above 1 are offsite warehouses and we don't care which one gets chosen, just choose ONE).

I've tried this with CASE statements, but cannot for the life of me figure it out. I suspect it has something to do with the "T2.UPC = Products_PriceQty_Union.UPC AND vProducts_PriceQty_Union.Price1 = T2.Price1" bindings, but not exactly sure what the syntax might be to accomplish this. Over 20 hours on this and no resolution so far, so I'm reaching out to the community. Any help is greatly appreciated!

A follow up question…:

Not sure if this is where this supplemental question should go. If this should be posted as new question, let me know (I'm new to this group as far as posting goes).

When I attempt to paste the working query (see ypercube's answer) into view designer in SSMS and attempt to view the results (ctrl+r), I get "unhandled win32 exception occurred in sqlwb.exe [14240]". Then, when I debug, I'm presented with the dialog "A buffer overrun has occurred in SqlWb.exe which has corrupted the programs's internal state. Press Break to debug". On pressing "Break", the offending line is "0x7813BD93 = 2014559635 and dword ptr ds:[781C4920h],0" (if that means anything to anyone). On stopping code in Visual Studio debugger SSMS closes and I have to start over. I've searched far and wide for a solution to this. However, when I run the query in the query window (not builder), the query runs just fine!

A workaround is to script the creation of the view using Create view, which works just fine for querying. As long as I don't want to edit the query in the view designer, it works just fine. Weird.

Does anyone have any input or experience with this, where queries pasted into view designer cause the program to crash? Ah, Microsoft, it's always something with you…

Best Answer

I think this will get what you want. It may not be the best regarding efficiency (due to the unioned view):

WITH LowestPrices AS
    ( SELECT 
          UPC, 
          Price1, 
          SUM(Quantity) OVER (PARTITION BY UPC) AS Quantity, 
          DBID AS PricingDBID, 
          COUNT(*) OVER (PARTITION BY UPC) AS VendorCount,
          ROW_NUMBER() OVER (PARTITION BY UPC 
                             ORDER BY Price1,
                                      DBID         -- to resolve ties
                            ) AS Rn
      FROM         
          vProducts_PriceQty_Union  
     -- WHERE
     --     UPC = '082442818986'
    )
SELECT 
    UPC,
    Price1,
    Quantity,
    PricingDBID,
    VendorCount
FROM
    LowestPrices
WHERE
    Rn = 1
ORDER BY 
    UPC ;

If the join to Products is really needed, you can add it, either inside the CTE or in the external query.