How to Get Unique Records from Last 6 Months in MySQL

I am facing the issue below.

Table structure :

CREATE TABLE `some_table` ( 
`wallet_id` int(11) NOT NULL AUTO_INCREMENT, 
`wallet_amount` double(10,2) NOT NULL, 
`actual_wallet_amount` double(10,2) NOT NULL, 
`customer_id` int(11) NOT NULL, 
`last_update_on` datetime NOT NULL, 
`redeemption_date` date NOT NULL, 
`expiry_date` date NOT NULL, 
`wallet_status` varchar(200) NOT NULL, 
`order_id` varchar(250) NOT NULL, 
PRIMARY KEY (`wallet_id`) 
) ENGINE=InnoDB;

Query:
SELECT *
FROM some_table
WHERE redeemption_date <= CURDATE( ) - INTERVAL 6 MONTH
AND wallet_status != "EXPIRED"

Desired Result:

  1. I need all the unique customer_id values, each with its latest last_update_on.
  2. I need all the records in the table that have only one entry from the last 6 months and above.
  3. When I add a GROUP BY clause on customer_id, I do not get the latest
    updated row for each customer_id.
  4. I need the latest row for every unique customer_id.

Please help. Thanks in advance.

snag

Best Answer

Queries of this type are sometimes called "greatest-n-per-group" (there's even a tag for these questions at SO!), and there are various ways to write them.

Assuming that (customer_id, last_update_on) is unique, i.e. no two rows for the same customer share the same last_update_on timestamp, you can first GROUP BY customer_id in a derived table to get each customer's maximum timestamp, and then join that derived table back to the original table. Like this:
-- query 1

SELECT s.* 
FROM 
    ( SELECT customer_id,
             MAX(last_update_on) AS last_update_on
      FROM some_table
      WHERE redeemption_date > CURDATE() - INTERVAL 6 MONTH 
        AND wallet_status <> 'EXPIRED'
      GROUP BY customer_id
    ) AS m
  JOIN
    some_table AS s
      ON  s.customer_id = m.customer_id
      AND s.last_update_on = m.last_update_on 
WHERE s.redeemption_date > CURDATE() - INTERVAL 6 MONTH 
  AND s.wallet_status <> 'EXPIRED';
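
Query 1 depends on the uniqueness assumption stated above. A minimal sketch for checking it, using the same filters as query 1 (the alias cnt is only illustrative): if this returns any rows, query 1 can return more than one row for those customers.

```sql
-- uniqueness check (sketch): list (customer_id, last_update_on) pairs
-- that occur more than once among the rows query 1 considers
SELECT customer_id,
       last_update_on,
       COUNT(*) AS cnt
FROM some_table
WHERE redeemption_date > CURDATE() - INTERVAL 6 MONTH
  AND wallet_status <> 'EXPIRED'
GROUP BY customer_id, last_update_on
HAVING COUNT(*) > 1;
```

An empty result means the assumption holds for the current data; it does not guarantee that future inserts will keep it true.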

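Another variation is to first find the distinct customer ids (similarly to above) and then join to the table using a correlated subquery. This is often more efficient when the distinct customer ids can be found easily (so, perhaps not in your case). Customers with no qualifying row drop out of the result, because the subquery returns NULL for them and the join condition never matches. Here is how it can be done, so you can test both: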
-- query 2

SELECT s.* 
FROM 
    ( SELECT customer_id
      FROM some_table
      GROUP BY customer_id
    ) AS d
  JOIN
    some_table AS s
      ON  s.wallet_id =                   -- the PRIMARY KEY 
          ( SELECT si.wallet_id           -- of the table
            FROM some_table AS si         
            WHERE si.customer_id = d.customer_id
              AND si.redeemption_date > CURDATE() - INTERVAL 6 MONTH 
              AND si.wallet_status <> 'EXPIRED'
            ORDER BY si.last_update_on DESC
              LIMIT 1
          );
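
A third way to write a greatest-n-per-group query, not part of the original answer, is a window function. The sketch below assumes MySQL 8.0 or later, since earlier versions do not support ROW_NUMBER(). The aliases ranked and rn are illustrative, and the wallet_id tie-breaker is an assumption about which row to prefer when two rows share the same timestamp.

```sql
-- query 3 (sketch, assumes MySQL 8.0+)
SELECT ranked.*
FROM
    ( SELECT s.*,
             ROW_NUMBER() OVER (
                 PARTITION BY s.customer_id          -- one group per customer
                 ORDER BY s.last_update_on DESC,     -- latest row first
                          s.wallet_id DESC           -- assumed tie-breaker
             ) AS rn
      FROM some_table AS s
      WHERE s.redeemption_date > CURDATE() - INTERVAL 6 MONTH
        AND s.wallet_status <> 'EXPIRED'
    ) AS ranked
WHERE ranked.rn = 1;
```

Unlike query 1, this returns exactly one row per customer even when timestamps are tied. The result also contains the extra rn column; list the wanted columns explicitly in the outer SELECT to leave it out.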