T-sql – Merging data from multiple tables

merget-sql

I'm trying to merge data from multiple tables in different databases on the same server however the result is incorrect (i.e. the number of rows returned is more than what it should be). Here's the code:

merge into factOrders fo
using (
    select CustomerKey
        ,ProductKey
        ,CategoryKey
        ,dt1.TimeKey as [OrderDateKey]
        ,dt2.TimeKey as [RequiredDateKey]
        ,dt3.TimeKey as [ShippedDateKey]
        ,o.OrderID
        ,Freight
        ,od.UnitPrice as [UnitPrice]
        ,od.Quantity as [Quantity]
        ,od.Discount as [Discount]
        ,od.UnitPrice * od.Quantity * (1 - Discount) as [TotalPrice]
        ,sh.CompanyName as [ShipperName]
        ,sh.Phone as [ShipperPhone]
    from db1.dbo.Orders o
        ,db1.dbo.[Order Details] od
        ,db1.dbo.Shippers sh
        ,db1.dbo.Products p
        ,dimCustomers dc
        ,dimProducts dp
        ,dimCategories dct
        ,dimTime dt1
        ,dimTime dt2
        ,dimTime dt3
    where o.OrderID = od.OrderID
        and dc.CustomerID = o.CustomerID
        and dp.ProductID = od.ProductID
        and dct.CategoryID = p.CategoryID
        and sh.ShipperID = o.ShipVia
        and dt1.date = o.OrderDate
        and dt2.date = o.RequiredDate
        and dt3.date = o.ShippedDate
    ) o
    on (
            fo.ProductKey = o.ProductKey
            and fo.CustomerKey = o.CustomerKey
            and fo.CategoryKey = o.CategoryKey
            and fo.OrderDateKey = o.OrderDateKey
            )
when matched
    then
        update
        set......
when not matched
    then
        insert.....
        values (......)

I can tell the issue is somewhere in the WHERE condition and I've tried multiple options but not getting the correct result.

Best Answer

You don't mention what engine this is going against so I'm assuming it's Microsoft SQL Server. One word of advice is that you put the join predicate and specify the type of join in the FROM clause, it will make errors like this much more visible as well as making reading the query easier. You forgot to specify the join type for the db1.dbo.Products table, which I think makes it a cross join, so vastly multiplying the number of rows you would get back. See below.

merge into factOrders fo
using (
    select CustomerKey
        ,ProductKey
        ,CategoryKey
        ,dt1.TimeKey as [OrderDateKey]
        ,dt2.TimeKey as [RequiredDateKey]
        ,dt3.TimeKey as [ShippedDateKey]
        ,o.OrderID
        ,Freight
        ,od.UnitPrice as [UnitPrice]
        ,od.Quantity as [Quantity]
        ,od.Discount as [Discount]
        ,od.UnitPrice * od.Quantity * (1 - Discount) as [TotalPrice]
        ,sh.CompanyName as [ShipperName]
        ,sh.Phone as [ShipperPhone]
    from db1.dbo.Orders o
        INNER JOIN db1.dbo.[Order Details] od ON o.OrderID = od.OrderID
        INNER JOIN db1.dbo.Shippers sh ON sh.ShipperID = o.ShipVia
        INNER JOIN dimCustomers dc ON dc.CustomerID = o.CustomerID
        INNER JOIN dimProducts dp ON dp.ProductID = od.ProductID
        INNER JOIN dimCategories dct ON dct.CategoryID = p.CategoryID
        INNER JOIN db1.dbo.Products p ON 
        INNER JOIN dimTime dt1 ON dt1.date = o.OrderDate
        INNER JOIN dimTime dt2 ON dt2.date = o.RequiredDate
        INNER JOIN dimTime dt3 ON dt3.date = o.ShippedDate
    ) o
    on (
            fo.ProductKey = o.ProductKey
            and fo.CustomerKey = o.CustomerKey
            and fo.CategoryKey = o.CategoryKey
            and fo.OrderDateKey = o.OrderDateKey
            )