How to join the latest previous record with SQL

informixjoin;

I have a table whose schema is like this:

CREATE TABLE product_shipping(
    product_id CHAR(10),
    product_name CHAR(10),
    deal_dt DATETIME,
    deal_reason_no SMALLINT,
    cancel_dt DATETIME,
    cancel_reason_no SMALLINT
);

Once the shipping date is arranged, the deal_dt and deal_reason_no will be set and cancel_dt and cancel_reason_no is null. However product_id is not unique in the table since the every product shipping can be cancelled. Once cancelled cancel_dt and cancel_reason_no will be set. If the cancelled product is bought by other customer then the shipping is recorded in the new row, and product_name can be changed in the new shipping.

I want to query the product_id and deal_dt whose deal_reason_no is 1, 3 or 5 and deal_dt is between '2014-04-01 00:00:00' and '2014-04-11 00:00:00' and the latest previous deal_dt and latest previous cancel_dt whose cancel_reason_no is 2 or 4 for the product_id.

I think the SQL could be like this:

SELECT 
C.product_id as product_id,
C.deal_dt as deal_dt,
R.deal_dt as previous_deal_dt,
R.cancel_dt as previous_cancel_dt,
FROM product_shipping C
LEFT JOIN product_shipping R ON (???)
WHERE TO_CHAR(C.deal_reason_no ) in ('1', '3', '5')
AND C.deal_dt between '2014-04-01 00:00:00' and '2014-04-11 00:00:00'

But I'm not sure how to join to get latest previous record.

I'm using Informix 11.70.

Edit: Add my idea. Is this reasonable?

SELECT 
C.product_id as product_id,
C.deal_dt as deal_dt,
MAX(R.deal_dt) as previous_deal_dt,
MAX(R.cancel_dt) as previous_cancel_dt,
FROM product_shipping C
LEFT JOIN product_shipping R ON (C.product_id = R.product_id and TO_CHAR(R.cancel_reason_no) MATCHES ('2', '4') and R.cancel_dt <= C.deal_dt)
WHERE TO_CHAR(C.deal_reason_no ) in ('1', '3', '5')
AND C.deal_dt between '2014-04-01 00:00:00' and '2014-04-11 00:00:00'
GROUP BY 1,2

Best Answer

I suggest you join to previous products that match criteria and (if any exist) narrow down to the one where no later row exists:

SELECT P.product_id AS product_id,
      ,P.deal_dt    AS deal_dt
      ,C.deal_dt    AS previous_deal_dt
      ,C.cancel_dt  AS previous_cancel_dt
FROM   product_shipping p
LEFT   JOIN product_shipping C ON C.product_id = P.product_id
                              AND C.cancel_reason_no IN (2, 4)
                              AND C.cancel_dt < P.deal_dt
WHERE  P.deal_reason_no IN (1, 3, 5)
AND    P.deal_dt BETWEEN '2014-04-01 00:00:00' AND '2014-04-11 00:00:00'
AND    NOT EXISTS (
   SELECT 1
   FROM   product_shipping C1
   WHERE  C1.product_id = C.product_id
   AND    C1.cancel_reason_no IN (2, 4)
   AND    C1.cancel_dt < P.deal_dt
   AND    C1.cancel_dt > C.deal_dt
   );

Assuming (product_id,cancel_dt) to be UNIQUE, else you need tiebreaker criteria to be unambiguous. This way you do not need GROUP BY, because only one (or no) row is left-joined.

Also assuming deal_dt and cancel_dt are supposed to come from the same cancelled deal.

This would be simpler with the window function lag(), but your version does not seem to support that according to the discussion under the duplicate post.