Mysql – Compare row value when records are not successive

MySQL

I have a table wich holds the records for all the products purchases made. What I'm trying to do is to get a list of all the purchases made that contain one specific product to get a price variation history. I also want to display an icon showing if the price increased, decreased or is the same value as the previous record.

I almost made it work, however, the way I found to do it is based on the unique ID of each rows. The problem with this approach is that they are not sequential, so I can't rely on this method. What I need to compare is the date, but I'm finding it difficult to do.

First of all, here is my table structures:

**tb_purchase**
id | value_purchase | value_delivery | date
1  | 80.90          | 0.00           | 2019-07-01
2  | 24.90          | 5.00           | 2019-07-03
3  | 18.90          | 5.00           | 2019-07-03
[...continue]

**tb_purchase_product**
id | id_purchase | id_product | id_provider | price | quantity | total
1  | 27          | 33         | 2           | 16.90 | 1        | 16.90
2  | 27          | 87         | 2           | 34.90 | 1        | 34.90
3  | 28          | 64         | 2           | 19.90 | 1        | 19.90
4  | 29          | 33         | 3           | 15.80 | 3        | 47.40
5  | 30          | 48         | 3           | 18.90 | 1        | 18.90
6  | 31          | 33         | 1           | 15.80 | 1        | 15.80
7  | 32          | 33         | 3           | 17.90 | 2        | 35.80
[...continue]

**tb_provider**
id | name
1  | 'Mega Food'
2  | 'Best Market'
3  | 'Amazing Sales Store'
[...continue]

This is the query I did:

SELECT
    a.price,
    DATE_FORMAT(b.date, '%d/%m/%Y') as 'date',
    c.name as 'provider',

    CASE
        WHEN a.price > d.price THEN 1
        WHEN a.price < d.price THEN -1
    ELSE 0 END AS 'variation'

FROM
    tb_purchase_product a,
    tb_purchase b,
    tb_provider c,
    tb_purchase_product d

WHERE a.id_product = :id_product -- {ex.: 33}
AND a.id_purchase = b.id
AND b.id_provider = c.id
AND b.date BETWEEN :date_initial AND :date_final
AND d.id = a.id + 1

ORDER BY b.date DESC

The expected result for id_product = 33 is as follow, but I'm getting inconsistent results since it's comparing the value from the previous id and not previous record of the id_product. I'm also having trouble to fix this since I don't know how to mach the date when it's the first smaller date.

price | date       | provider              | variation
15.80 | 05/07/2019 | 'Amazing Sales Store' | -1
17.90 | 04/07/2019 | 'Mega Food'           | 1
15.80 | 03/07/2019 | 'Amazing Sales Store' | -1
16.90 | 02/07/2019 | 'Best Market'         | 0

Note: :id_product, :date_initial and :date_final are passed via php array argument when running the application online.

Best Answer

As already suggested by Rick James in the comments, if you are working with MySQL 8+, you can make this much easier by using the LAG() analytic function to return the previous row's price. You can also get the variation using the SIGN() function instead of the CASE expression, as demonstrated in the following query:

SELECT
  pprod.price,
  DATE_FORMAT(purch.date, '%d/%m/%Y')                                   AS date,
  prov.name                                                             AS provider,
  SIGN(pprod.price
     - LAG(pprod.price, 1, pprod.price) OVER (ORDER BY purch.date ASC)) AS variation
FROM
  tb_purchase_product AS pprod
  INNER JOIN tb_purchase AS purch ON pprod.id_purchase = purch.id
  INNER JOIN tb_provider AS prov ON purch.id_provider = prov.id
WHERE
  pprod.id_product = :id_product
  AND purch.date BETWEEN :date_initial AND :date_final
ORDER BY
  purch.date DESC
;

In this query I have also rewritten the joins using the JOIN syntax. Using the explicit JOIN syntax separates joining conditions from filter conditions more clearly, which, in my estimation, makes the code more understandable and easier to maintain (but your mileage may vary).

And with the use of the LAG() function there is no need to join a second instance of tb_purchase_product, which is another obvious difference of this query from the original.

One important note regarding this query is that, as written, it works for a single id_product, which should be all right based on your filter conditions. But if you ever need to run it for multiple products simultaneously, you need to introduce partitioning to the LAG function, like this:

LAG(pprod.price, 1, pprod.price) OVER (PARTITION BY pprod.id_product
                                       ORDER BY purch.date ASC)

The PARTITION BY subclause ensures that the previous pprod.price value is for the same product as the current row.