MySQL – How to improve this query? EXPLAIN attached

explainMySQLperformancequery-performance

I have a query that I feel should go much faster then it does. Right now it'll execute around 1,5xx secs each time. I don't have any advanced knowledge of looking at the EXPLAIN and draw conclusions. Maybe someone can help?

The query has all its ID-fields indexed and unsigned.

"reservations" contains around 100,000 records and "reservation_nights" around 180,000 records. All other tables have under 1000 records each.

"profiles" is LEFT JOIN:ed because not all reservations has profiles.

Here's the query:

SELECT DISTINCT(reservations.id), `reservations`.`referenceId`, `reservations`.`referenceStatus`, `reservations`.`status`, `reservations`.`firstname`, `reservations`.`lastname`, `reservations`.`dateArrival`, `reservations`.`daysToCheckIn`, `reservations`.`rateCode`, `reservations`.`roomCode`, `reservations`.`totalNights`, `reservations`.`rateRevenue`, `reservations`.`dailyRate`, `reservations`.`totalRevenue`, `reservations`.`foodBeverageRevenue`, `reservations`.`additionalRevenue`, `channels`.`name` AS `channelName`, `channelgroups`.`name` AS `channelGroupName`, `markets`.`name` AS `marketName`, `profiles`.`totalReservations` 
FROM `reservation_nights` 
INNER JOIN `reservations` ON reservations.id = reservation_nights.reservationId 
INNER JOIN `channels` ON channels.id = reservations.channelId 
INNER JOIN `channelgroups` ON channelgroups.id = channels.channelgroupId 
INNER JOIN `markets` ON markets.id = reservations.marketId 
LEFT JOIN `profiles` ON profiles.id = reservations.profileId 
ORDER BY `reservation_nights`.`dateStayed` DESC 
LIMIT 100

Attached is the EXPLAIN from PHPMYADMIN. Where should I start optimizing this query?

enter image description here

Best Answer

  • First, please remove the parentheses from DISTINCT(reservations.id).
    DISTINCT is not a function.

  • About performance, why do you join reservation_nights? You only seem to use that table in the ORDER BY. Assuming that a reservation can span several nights, you probably mean to use the last date (from the many that a reservation has) for the ordering and nothing else. This however, requires a join on reservation_nights and thus many rows for each reservation are created in the result set. So, you had to use DISTINCT to get rid of the duplicate rows. DISTINCT is often inefficient, especially when performed for big results sets and not in small/indexed tables.

Try this rewriting, that gets rid of the DISTINCT and pushes the ORDER BY / LIMIT 100 in a derived table where only base table is used. This technique which I call "LIMIT first, JOIN later" can be more efficient as the order by is done on less data (single table vs a large join) and then the joins are done on less data as well (max 100 rows from each of the joined tables):

SELECT 
    r.id, 
    r.referenceId, r.referenceStatus, r.status, 
    r.firstname, r.lastname, 
    r.dateArrival, r.daysToCheckIn, 
    r.rateCode, r.roomCode, 
    r.totalNights, r.rateRevenue, 
    r.dailyRate, r.totalRevenue, 
    r.foodBeverageRevenue, r.additionalRevenue, 
    c.name AS channelName, 
    cg.name AS channelGroupName, 
    m.name AS marketName, 
    p.totalReservations 
FROM 
    ( SELECT reservationId, MAX(dateStayed) AS lastDateStayed
      FROM reservation_nights
      GROUP BY reservationId
      ORDER BY MAX(dateStayed) DESC 
      LIMIT 100 
    ) AS rn
  INNER JOIN reservations AS r ON r.id = rn.reservationId 
  INNER JOIN channels AS c ON c.id = r.channelId 
  INNER JOIN channelgroups AS cg ON cg.id = c.channelgroupId 
  INNER JOIN markets AS m ON m.id = r.marketId 
  LEFT JOIN profiles AS p ON p.id = r.profileId 
ORDER BY 
    rn.lastDateStayed DESC ;
  • If you don't already have one, add an index on reservation_nights (reservationId, dateStayed) to improve the efficiency of the subquery.