Group by with joins

join;

So I have 2 tables with identical structures that represent data from 2 vendors…..I would have set this up in one table but I lost the argument with the data guy…anyway….keep getting an error here:

Column 'Venor2.DealID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

Trying to simply get an average "MIA" by DealID

Select A1.DealID as A1_DealID, 
        Avg(A1.Trade_Date_Current_Day_MIA) as Average_MIA,
        A2.DealID, 
        Avg(A2.Trade_Date_Current_Day_MIA) as Average_MIA
From  Vendor1_Aggregate_File_Stage as A1
left join Vendor2_Aggregate_File_Stage as A2 on A1.DealID = A2.DealID
Group By A1.DealID

Best Answer

Vendor 2 is indeed invalid in the select list as the error suggests, since you don't group by it, it may have multiple values per group of vendor 1 which you do group by. You didn't provide enough details, but from what you describe I guess what you are really after is something along the lines of:

WITH unioned_vendors AS
(
 SELECT DealID, Trade_Date_Current_Day_MIA 
 FROM   Vendor1_Aggregate_File_Stage
 UNION 
 SELECT DealID, Trade_Date_Current_Day_MIA 
 FROM   Vendor2_Aggregate_File_Stage
)
SELECT DealID, Avg(Trade_Date_Current_Day_MIA) as Average_MIA
FROM   unioned_vendors
GROUP BY DealID;

*Note - you may need UNION ALL instead of UNION depending how your data is structured, and if you want to eliminate duplicates if they exist.

I would suggest creating a view for the CTE definition so you can reuse it as most likely other queries will require this as well:

CREATE VIEW unioned_vendors 
AS
 SELECT DealID, Trade_Date_Current_Day_MIA 
 FROM   Vendor1_Aggregate_File_Stage
 UNION 
 SELECT DealID, Trade_Date_Current_Day_MIA 
 FROM   Vendor2_Aggregate_File_Stage;

And then:

SELECT DealID, Avg(Trade_Date_Current_Day_MIA) as Average_MIA
FROM   unioned_vendors
GROUP BY DealID;

HTH