I need to select the oldest(min) due datetime for each InternalOrder and then filter these against some datetime for example NOW().
As you can see in schema image, I designed the schema as one InternalOrder has multiple Deliveries and each Delivery has multiple DeliveryRev(revisions), as I need to have whole trace of inputed data. I always look at the latest revision, if that one is deleted the whole delivery should not be considered.
Table rows:
- InternalOrder: 2970
- Delivery: 3258
- DeliveryRev: 12272
I select the newest DeliveryRev via joins and filter these, which have due datetime field filled and also are not deleted.
I have working query (first one), but it is not exactly, what I need, since I can get the result with InternalOrder, which have oldest(min) due under specified threshold(not selected) and another one, which is over (in the result). In this case I do not want that InternalOrder in the result, which I solved using having instead of where, but this query is taking ages.
Am I doing something wrong with HAVING, or how can I optimize/change to get right result?
This query is fast (40ms), but not getting right result.
SELECT
s9_.internal_order_id,
MIN(s10_.due) AS minDue
FROM sales_internal_delivery s9_
LEFT JOIN sales_internal_deliveryrev s10_
ON s9_.id = s10_.delivery_id
AND (s10_.due IS NOT NULL AND s10_.deleted = 0)
LEFT JOIN sales_internal_deliveryrev s11_
ON s10_.delivery_id = s11_.delivery_id
AND s10_.timestamp < s11_.timestamp
WHERE
s11_.delivery_id IS NULL
AND s10_.due > NOW()
GROUP BY s9_.internal_order_id
ORDER BY minDue ASC
This query is slow (25s), but the result is alright.
SELECT
s9_.internal_order_id,
MIN(s10_.due) AS minDue
FROM sales_internal_delivery s9_
LEFT JOIN sales_internal_deliveryrev s10_
ON s9_.id = s10_.delivery_id
AND (s10_.due IS NOT NULL AND s10_.deleted = 0)
LEFT JOIN sales_internal_deliveryrev s11_
ON s10_.delivery_id = s11_.delivery_id
AND s10_.timestamp < s11_.timestamp
WHERE
s11_.delivery_id IS NULL
GROUP BY s9_.internal_order_id
HAVING minDue > NOW()
ORDER BY minDue ASC
Best Answer
I don't know that the query could be refactored much and still be logically correct. And since I don't have your dataset it is difficult to test possible solutions for performance.
This query should still be logically the same, but may be executed in a more efficient way.
Edit
As per discussion about the proper way to handle revisions with
deleted = 0
this is an alternate version which does not consider a revision withdeleted = 1
when looking for the latest revision.The
WHERE deleted = 0
has been moved into the inner subquery.