SQL Server Subquery – Retrieve Customer with Highest Orders Using ANY/ALL

greatest-n-per-groupsql serversubquery

I have this tables, and i would like to retrieve the customer who placed the highest number of orders.

i wrote this query:

select * from customers where cust_id=(
select cust_id from orders
group by cust_id
having count(*)>=all(select count(cust_id)
                     from orders
                     group by cust_id))

I know that maybe there is a better way to do it, but I was surprised to find that 'ALL' works with '> ='

In my understanding, 'ALL' checks whether the current row is greater or less than all rows in the sub-query, but I never thought it was possible to use it with '='.

If i use it with '=' or '>' the query doesn't return me any rows as i was expecting.

But if i use them together '>=' the query gives me the correct results.

Isn't it strange?

Anyway, at the end i wrote this query:

SELECT *
FROM Orders, customers
WHERE orders.cust_id=customers.cust_id
and orders.cust_id IN
            (SELECT TOP (1) o.cust_id
            FROM Orders AS O
            GROUP BY O.cust_id
            ORDER BY COUNT(*) DESC);

Do you have a better or more elegant solution?

And isn't it strange the use of '>=' in 'ALL' clause?

Thank you.

Best Answer

Various ways to write. Your query, slightly rewritten to use JOIN and corrected to return all customers in case of ties:

select c.*, o.*
from customers as c
    join orders as o
      on o.cust_id = c.cust_id
where c.cust_id in 
      ( select cust_id 
        from orders
        group by cust_id
        having count(*) >=  
               all ( select count(*)
                     from orders
                     group by cust_id
                   )
      ) ;

Similar, without using >= ALL:

select c.*, o.*
from customers as c
    join orders as o
      on o.cust_id = c.cust_id
where c.cust_id in 
      ( select cust_id 
        from orders
        group by cust_id
        having count(*) =  
                  ( select top (1) count(*) as cnt
                     from orders
                     group by cust_id
                     order by cnt desc
                  )
      ) ;

The subquery simplified using TOP WITH TIES:

select c.*, o.* 
from customers as c
    join orders as o
      on o.cust_id = c.cust_id
where c.cust_id in 
      ( select top (1) with ties m.cust_id 
        from orders as m
        group by m.cust_id
        order by count(*) desc
      ) ;

All of the above could be rewritten to use a derived table or CTE and JOIN as the subquery is uncorrelated.

Another way would be to use window functions in combination with TOP:

with 
  d as
  ( select c.*, o.*,
           cnt = count(*) over (partition by o.cust_id)  
    from customers as c
        join orders as o
          on o.cust_id = c.cust_id
  )
select top (1) with ties
    d.*
from d
order by d.cnt desc ;

or just window functions:

with 
  d as
  ( select m.cust_id,
           cnt = count(*)  
           max_cnt = max(count(*)) over ()  
    from orders as m
    group by m.cust_id
  )
select c.*, o.* 
from customers as c
    join orders as o
      on o.cust_id = c.cust_id
    join d 
      on d.cust_id = c.cust_id
where d.cnt = d.max_cnt ;

Best Answer

Related Solutions

Sql-server – which one is more efficient query

Does Detach/Attach or Offline/Online Clear Buffer Cache for Database?

Related Question