Mysql – Select first instance of a set of items

greatest-n-per-groupMySQLselect

I'm attempting to modify an existing query that pulls the most recent order from a set of customers and provided order_type.

The current query works fine (albeit a little slow, all columns below are indexed):

SELECT 
    DISTINCT(td1.product_id), 
    td1.customer_id, 
    td1.date_test_added,
    tdmi.product_type_id
FROM 
    order_history AS td1
    LEFT JOIN order_history td2     
    ON td1.customer_id = td2.customer_id AND td1.date_test_added < td2.date_test_added
    LEFT JOIN order_types tdmi      
    ON tdmi.product_id   = td1.product_id 
WHERE
   td2.date_test_added IS NULL
   AND tdmi.product_type_id = 31
   AND td1.customer_id IN (91000, 91001, 91002, 91003)

If I change the less-than operator in the first join to greater than, no results are returned.

What is the best way to change this to pull the earliest order (based on date_test_added)?

Best Answer

You query says:

Pull the most recent order from a set of customers and only if that order has product with type 31.

When you change the < to > it says:

Pull the oldest order from a set of customers and only if that order has product with type 31.

So it is possible that the modified query to return no results, even the first query, with same parameters returns something. That would happen when the oldest order has product with type different than 31.


If that wasn't the intention of the query but you want the most recent (or oldest) order among the orders that have a product of type 31, you need to modify it, so the LEFT JOIN / IS NULL antijoin considers this restriction.

Some other minor notes:

  • the LEFT JOIN order_types tdmi works as an INNER join really, due to the tdmi.product_type_id = 31 condition in WHERE.

  • DISTINCT is not a function. It modifies the whole SELECT and not just one column, so the parentheses are redundant and can only bring confusion.

  • In a not so minor note, the DISTINCT looks suspicious. It can probably be removed.

The query becomes:

SELECT -- DISTINCT
    td1.product_id, 
    td1.customer_id, 
    td1.date_test_added,
    tdmi.product_type_id
FROM 
    order_history AS td1
    LEFT JOIN order_history td2   
        JOIN order_types tdmi2      
            ON  tdmi2.product_id = td2.product_id
            AND tdmi2.product_type_id = 31   
        ON  td1.customer_id = td2.customer_id 
        AND td1.date_test_added < td2.date_test_added
    JOIN order_types tdmi      
        ON  tdmi.product_id = td1.product_id
        AND tdmi.product_type_id = 31 
WHERE
   td2.date_test_added IS NULL
   AND td1.customer_id IN (91000, 91001, 91002, 91003) ;