I need to store prices with their history. A price can have different specificity. It can refer to a single user (most specific), group (less specific) and product price (global – least specific). There are also two different types of prices (product and delivery price). Now I am having trouble with this. I made it to work, but I somehow feel bad about it when I write queries (they tend to get long and complicated).
I designed two tables like this (they are same):
base_prices
| id | referent_id | product_id | valid_from | valid_until | amount | pieces_per_lot |
delivery_prices
| id | referent_id | product_id | valid_from | valid_until | amount | pieces_per_lot |
For a user specific price referent id would be a positive integer, matching users id.
For a group specific price referent id would be a negative integer, matching group id (I made group ids negative integers).
For global product price, referent id will be NULL.
This way, when I query price for a specific item and specific user and his group all I need to do is filter table by item_id and order results by referent_id column in descending order. This way, user specific are on top, group after and global are last.
Date ranges valid_from-valid_until cannot overlap for two rows.
Now a query to get current prices would look something like this:
SELECT
*
FROM
base_prices AS outer
WHERE
id = (
SELECT
id
FROM
base_prices AS inner
WHERE
NOW() BETWEEN valid_from AND valid_until
AND inner.item_id = outer.item_id
AND (
inner.referent_id = # HERE GOES USER ID
OR inner.referent_id = # HERE GOES GROUP ID
OR inner.referent_id IS NULL
)
ORDER BY
inner.referent_id DESC # FOR SPECIFICITY
Thing that bugs me is that I always have to write this kind of sub-query to get items, and this doesn't seem as a good design to me. When data starts to build up, this queries will cost much more. Also, I had to write a few triggers to keep data integrity during inserts and updates.
My opinion is that current prices and historical prices need to be separated, but I was in a hurry, so this is what came up. I was thinking about scheduling events during inserts based on valid_from column that will move data from historical prices table to (new) current prices table.
I am inexperienced, so this seems kind of weird to me. It might be OK, though, but I would like to see some suggestions and opinions.
If I left out something or there is not enough info, leave a comment.
Best Answer
There are a couple of ways you could separate the current values from the historic ones: you could simply include a boolean field that is true for the latest price and filter on that, or you could keep current prices in a separate table. Both options involve a little extra work to maintain integrity but would make a query for current prices more efficient. This doesn't remove the need for the sub-query to pick the first row that matches the user/group/NULL condition, though fewer rows will be scanned in processing both the inner and outer queries.
The following might be more efficient as a straight query if only the current prices are in the base table (add the
is_latest=true
check or your current date filter to thewhere
andon
clauses if you keep the historic values in the same table):This will only work if you always have a row for each product where referent_id is null, otherwise you could try:
In both cases be careful using this construct in a view because further filtering by the fields brought out via those case statements will not be able to use any indexes you may have defined.
Obviously the case statement per field is a bit of faf but it should stop the query runner searching the table once for every row returned as it might with the sub-query arrangement.
In other databases you might be able to use windowing functions (like
ROW_NUMBER
) to do this more conveniently and efficiently, but mySQL does not support those to my knowledge.