Aggregating spend by customer where accounts can be re-assigned to different customers

join;teradata

I want to join unique customer ids with their spends.

Problem: The account_no can be assigned to a different customer (A,B,C etc) at different point in time (assume an employee A leaving the organization and his account no is assigned to B).

Eg from the tables below: If a new customer, say B, is assigned an account number 123 (which was earlier assigned to A) on 1 Mar2015 then spend post 1 Mar2015 must fall in Bs remit til his account 123 is assigned to the next customer.

Tables to Join: Table 1 and Table 2

Desired Result: Table 3

Table 1: Assignment of account to different customers at different dates

Customer_ID   Account_no     Date 
A              123           1Feb2015   
B              123           1Mar2015
A              124           5Mar2015
C              123           1May2015
B              124           20May2015

Table 2: Daily level spend info

Account_no      Spend        Date
123             100         2Feb2015
123             200         2Mar2015
124             200         7Mar2015
123             200         17May2015
124             150         25May2015

Desired Output:

Customer_ID     Spend
A               100+200=300
B               150+200=350
C               200

The account No can be assigned twice, thrice or more to a different customer in case it is closed by multiple customers.

How do I aggregate the spend customer Id wise?

Best Answer

You need a kind of best-match join like @TypoCubeᵀᴹ's approach, but depending on the actual data (i.e. number of rows per Account_no) this might create a huge intermediate result before aggregation (and needs a uniqueness contraint on table2(Account_no, Date)).

The same logic logic can be achieved by a different approach using UNION ALL and LAST_VALUE:

SELECT Customer_ID, Sum(spend)
FROM 
 (
   SELECT
      -- assign the "previous" Customer_ID to the current row
      Last_Value(Customer_ID IGNORE NULLS) 
          -- within a date first rows from table1, followed by table2
      Over (ORDER BY Date, x) AS Customer_ID,
      Spend
   FROM
    (
      SELECT 
         1 AS x, -- flag indicating source table
         Customer_ID,
         Account_no,
         dt,
         -- need to cast to datatype of table2.Spend
         Cast(NULL AS ...) AS Spend
      FROM cust
      UNION ALL
      SELECT
         2 AS x,
         -- need to cast to datatype of table1.Customer_ID 
         Cast(NULL AS ...), 
         Account_no,
         dt,
         spend
      FROM spend
    ) AS dt
   QUALIFY -- only rows from table2
           x = 2
       AND -- only if table1 and table2 Account_no match
           Last_Value(CASE WHEN x = 1 THEN Account_no END IGNORE NULLS) 
           Over (ORDER BY dt, Spend) = Account_no

 ) AS dt
GROUP BY Customer_ID
;
Related Question