Selecting Only One Record with Most Recent Date Using Table Join

datejoin;teradata

I have a list of 463 CARRIER_CD. I am attempting to retrieve the most current BILL_DT for each of the 463 CARRIER_CD.

I have uploaded the 463 unique CARRIER_CD to a database named edw_sbx_bt the table name is jjenkins_carriers.

The BILL_DT is in a database named edw_ar_fl and table named ARTBASE. I wrote the query below in hopes of returning a single row of data including CARRIER_CD, BILL_DT, and BILL_AMT for the most recent BILL_DT associated with each of the unique CARRIER_CD. However, the query returns 408+ million rows. Any help would be appreciated.

SELECT edw_ar_fl.ARTBASE.CARRIER_CD,
    edw_ar_fl.ARTBASE.BILL_DT,
    edw_ar_fl.ARTBASE.BILL_AMT
FROM   edw_ar_fl.ARTBASE
       JOIN edw_sbx_bt.jjenkins_carriers
         ON edw_ar_fl.ARTBASE.CARRIER_CD = edw_sbx_bt.jjenkins_carriers.CARRIER_CD
AND edw_ar_fl.ARTBASE.BILL_DT = 
     (select max(edw_ar_fl.ARTBASE.BILL_DT)
      where edw_ar_fl.ARTBASE.CARRIER_CD = edw_sbx_bt.jjenkins_carriers.CARRIER_CD)

Best Answer

The simplest way to select the last row (by bill_dt) in each group of rows with the same carrier_cd in Teradata would probably be using QUALIFY:

QUALIFY 1 = ROW_NUMBER() OVER (PARTITION BY carrier_cd ORDER BY bill_dt DESC)

If you're sure there are no ties (i.e. there's only one row with bill_dt = MAX(bill_dt) for each carrier_cd), you could calculate the max date for each carrier you're interested in:

sel carrier_cd, max(bill_dt) as max_bill_dt
from ARTBASE join CARRIERS on ARTBASE.carrier_cd = CARRIERS.carrier_cd
group by 1

...and then join it back to the original table:

sel ARTBASE.carrier_cd, ARTBASE.bill_dt, ARTBASE.bill_amt
from ARTBASE
join ( the above select here ) CARRIERS_STAT
  on ARTBASE.bill_dt = CARRIERS_STAT.max_bill_dt

In general I try to avoid the correlated subqueries, especially in a join condition. I'm not even sure how it works at all in the example you provided.