I am not familiar with accounting, but I have solved some similar problems in inventory-type environments. I store running totals in the same row as the transaction, and I use constraints so that the data is never wrong, even under high concurrency. I wrote the following solution back in 2009:
Calculating running totals is notoriously slow, whether you do it with a cursor or with a triangular join. It is very tempting to denormalize and store running totals in a column, especially if you select them frequently. However, as usual when you denormalize, you need to guarantee the integrity of your denormalized data. Fortunately, you can guarantee the integrity of running totals with constraints: as long as all your constraints are trusted, all your running totals are correct. This way you can also easily ensure that the current balance (the running total) is never negative, which can be very slow to enforce by other means. The following script demonstrates the technique.
CREATE TABLE Data.Inventory(InventoryID INT NOT NULL IDENTITY,
ItemID INT NOT NULL,
ChangeDate DATETIME NOT NULL,
ChangeQty INT NOT NULL,
TotalQty INT NOT NULL,
PreviousChangeDate DATETIME NULL,
PreviousTotalQty INT NULL,
CONSTRAINT PK_Inventory PRIMARY KEY(ItemID, ChangeDate),
CONSTRAINT UNQ_Inventory UNIQUE(ItemID, ChangeDate, TotalQty),
CONSTRAINT UNQ_Inventory_Previous_Columns
UNIQUE(ItemID, PreviousChangeDate, PreviousTotalQty),
CONSTRAINT FK_Inventory_Self FOREIGN KEY(ItemID, PreviousChangeDate, PreviousTotalQty)
REFERENCES Data.Inventory(ItemID, ChangeDate, TotalQty),
CONSTRAINT CHK_Inventory_Valid_TotalQty CHECK(
TotalQty >= 0
AND (TotalQty = COALESCE(PreviousTotalQty, 0) + ChangeQty)
),
CONSTRAINT CHK_Inventory_Valid_Dates_Sequence CHECK(PreviousChangeDate < ChangeDate),
CONSTRAINT CHK_Inventory_Valid_Previous_Columns CHECK(
(PreviousChangeDate IS NULL AND PreviousTotalQty IS NULL)
OR (PreviousChangeDate IS NOT NULL AND PreviousTotalQty IS NOT NULL)
)
);
-- beginning of inventory for item 1
INSERT INTO Data.Inventory(ItemID,
ChangeDate,
ChangeQty,
TotalQty,
PreviousChangeDate,
PreviousTotalQty)
VALUES(1, '20090101', 10, 10, NULL, NULL);
-- cannot begin the inventory for the second time for the same item 1
INSERT INTO Data.Inventory(ItemID,
ChangeDate,
ChangeQty,
TotalQty,
PreviousChangeDate,
PreviousTotalQty)
VALUES(1, '20090102', 10, 10, NULL, NULL);
Msg 2627, Level 14, State 1, Line 10
Violation of UNIQUE KEY constraint 'UNQ_Inventory_Previous_Columns'.
Cannot insert duplicate key in object 'Data.Inventory'.
The statement has been terminated.
-- add more
DECLARE @ChangeQty INT;
SET @ChangeQty = 5;
INSERT INTO Data.Inventory(ItemID,
ChangeDate,
ChangeQty,
TotalQty,
PreviousChangeDate,
PreviousTotalQty)
SELECT TOP 1 ItemID, '20090103', @ChangeQty, TotalQty + @ChangeQty, ChangeDate, TotalQty
FROM Data.Inventory
WHERE ItemID = 1
ORDER BY ChangeDate DESC;
SET @ChangeQty = 3;
INSERT INTO Data.Inventory(ItemID,
ChangeDate,
ChangeQty,
TotalQty,
PreviousChangeDate,
PreviousTotalQty)
SELECT TOP 1 ItemID, '20090104', @ChangeQty, TotalQty + @ChangeQty, ChangeDate, TotalQty
FROM Data.Inventory
WHERE ItemID = 1
ORDER BY ChangeDate DESC;
SET @ChangeQty = -4;
INSERT INTO Data.Inventory(ItemID,
ChangeDate,
ChangeQty,
TotalQty,
PreviousChangeDate,
PreviousTotalQty)
SELECT TOP 1 ItemID, '20090105', @ChangeQty, TotalQty + @ChangeQty, ChangeDate, TotalQty
FROM Data.Inventory
WHERE ItemID = 1
ORDER BY ChangeDate DESC;
-- try to violate chronological order
SET @ChangeQty = 5;
INSERT INTO Data.Inventory(ItemID,
ChangeDate,
ChangeQty,
TotalQty,
PreviousChangeDate,
PreviousTotalQty)
SELECT TOP 1 ItemID, '20081231', @ChangeQty, TotalQty + @ChangeQty, ChangeDate, TotalQty
FROM Data.Inventory
WHERE ItemID = 1
ORDER BY ChangeDate DESC;
Msg 547, Level 16, State 0, Line 4
The INSERT statement conflicted with the CHECK constraint
"CHK_Inventory_Valid_Dates_Sequence".
The conflict occurred in database "Test", table "Data.Inventory".
The statement has been terminated.
SELECT ChangeDate,
ChangeQty,
TotalQty,
PreviousChangeDate,
PreviousTotalQty
FROM Data.Inventory ORDER BY ChangeDate;
ChangeDate ChangeQty TotalQty PreviousChangeDate PreviousTotalQty
----------------------- ----------- ----------- ----------------------- ----------------
2009-01-01 00:00:00.000 10 10 NULL NULL
2009-01-03 00:00:00.000 5 15 2009-01-01 00:00:00.000 10
2009-01-04 00:00:00.000 3 18 2009-01-03 00:00:00.000 15
2009-01-05 00:00:00.000 -4 14 2009-01-04 00:00:00.000 18
-- try to change a single row, all updates must fail
UPDATE Data.Inventory SET ChangeQty = ChangeQty + 2 WHERE InventoryID = 3;
UPDATE Data.Inventory SET TotalQty = TotalQty + 2 WHERE InventoryID = 3;
-- try to delete not the last row, all deletes must fail
DELETE FROM Data.Inventory WHERE InventoryID = 1;
DELETE FROM Data.Inventory WHERE InventoryID = 3;
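Deleting from the end of the history does succeed, because the most recent row for an item is not referenced by any other row through FK_Inventory_Self. A minimal sketch, assuming the sample data above:

```sql
-- The newest row for an item is never referenced by FK_Inventory_Self,
-- so it can be deleted; repeat to roll back history from the end.
DELETE FROM Data.Inventory
WHERE ItemID = 1
  AND ChangeDate = (SELECT MAX(ChangeDate)
                    FROM Data.Inventory
                    WHERE ItemID = 1);
```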
-- the right way to update
DECLARE @IncreaseQty INT;
SET @IncreaseQty = 2;
UPDATE Data.Inventory
SET
ChangeQty = ChangeQty
+ CASE
WHEN ItemID = 1 AND ChangeDate = '20090103'
THEN @IncreaseQty
ELSE 0
END,
TotalQty = TotalQty + @IncreaseQty,
PreviousTotalQty = PreviousTotalQty +
CASE
WHEN ItemID = 1 AND ChangeDate = '20090103'
THEN 0
ELSE @IncreaseQty
END
WHERE ItemID = 1 AND ChangeDate >= '20090103';
SELECT ChangeDate,
ChangeQty,
TotalQty,
PreviousChangeDate,
PreviousTotalQty
FROM Data.Inventory ORDER BY ChangeDate;
ChangeDate ChangeQty TotalQty PreviousChangeDate PreviousTotalQty
----------------------- ----------- ----------- ----------------------- ----------------
2009-01-01 00:00:00.000 10 10 NULL NULL
2009-01-03 00:00:00.000 7 17 2009-01-01 00:00:00.000 10
2009-01-04 00:00:00.000 3 20 2009-01-03 00:00:00.000 17
2009-01-05 00:00:00.000 -4 16 2009-01-04 00:00:00.000 20
As you've identified, storing the price on the order makes the technical implementation easier. There are also a number of business reasons why this may be beneficial.
In addition to web transactions, many businesses support sales through other channels, e.g.:
- Over the phone
- Sales agents "on the road"
- At a physical location (e.g. shop, office)
In these cases the order may be entered into the system some time after the transaction took place. In these circumstances it can be difficult, or even impossible, to correctly identify which historical price record should be used; storing the unit price directly on the order is the only feasible option.
Multiple channels often bring another challenge - different prices for the same product. Surcharges for phone orders are common - and some customers may negotiate themselves a discount. You may be able to represent all possible prices for all channels in your product schema, but incorporating this into your order tables can become (very) complex.
Anywhere that negotiation is allowed it becomes very difficult to link price history to the price agreed on the order (unless agents have very narrow negotiation limits). You need to store the price on the order itself.
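To make this concrete, here is a minimal sketch of what capturing the agreed price on the order line might look like. The table and column names are illustrative, not from any particular schema:

```sql
CREATE TABLE OrderLine(
    OrderID    INT           NOT NULL,
    LineNo     INT           NOT NULL,
    ProductID  INT           NOT NULL,
    Qty        INT           NOT NULL,
    -- the price actually agreed for this line, captured at order time;
    -- it need not match any row in the product price history
    UnitPrice  DECIMAL(19,4) NOT NULL,
    CONSTRAINT PK_OrderLine PRIMARY KEY(OrderID, LineNo)
);
```

With this shape, reports and invoices read UnitPrice straight from the order, and the product price history remains purely a reference for quoting new prices.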
Even if you only support web transactions and have a relatively simple pricing structure, there is still an interesting problem to overcome: how should price increases be handled for in-flight transactions? Does the business insist that the customer must pay the increase, or does it honour the original price (from when the product was added to the basket)? If it's the latter, the technical implementation gets more complicated - you need to find a way to maintain the correct price version in the session.
Finally, many businesses are starting to use highly dynamic pricing. There may not be one fixed price for a given product; instead it is calculated at runtime based on factors such as time of day, demand for the product, and so on. In these cases the price may not be stored against the product in the first place!
Yes, logs tend to become big. But that is not necessarily a problem. (More in a minute.)
I would be tempted to treat View History as if it were another page, and log it in the same table. The log does not really need an id. PRIMARY KEY(user_id, view_id, timestamp) (all the columns) serves two purposes: it provides a PK, and it facilitates View History, because the PK starts with user_id. One drawback: if the user views the same page twice in one second, the INSERT would get a "duplicate key" error. So I would use INSERT IGNORE to avoid the error (and lose the record of the extra page load).
The best way to handle purging old log entries (should you decide to do that) is to PARTITION BY RANGE using the timestamp. The PK is already prepped for that. If you want a 30-day retention period, use 32 partitions and do a DROP PARTITION nightly. If you want longer retention, use weekly partitions. The details of "why", plus sample code for doing the DROP and REORGANIZE, are in my Partition Maintenance blog.
Assuming that you use InnoDB for the log, it would be beneficial to have enough RAM so that you can set innodb_buffer_pool_size bigger than the size of one partition.
(All InnoDB and MyISAM indexes are BTrees, so the comment about Heap is irrelevant.)
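A minimal sketch of the log table and retention scheme described above. The table name, column types, and the specific partition dates are illustrative assumptions, not from the original answer:

```sql
CREATE TABLE page_view_log (
    user_id INT UNSIGNED NOT NULL,
    view_id INT UNSIGNED NOT NULL,  -- the page that was viewed
    ts      TIMESTAMP    NOT NULL,
    PRIMARY KEY(user_id, view_id, ts)  -- all columns: PK + fast View History
) ENGINE=InnoDB
PARTITION BY RANGE (UNIX_TIMESTAMP(ts)) (
    PARTITION p20240101 VALUES LESS THAN (UNIX_TIMESTAMP('2024-01-02')),
    PARTITION p20240102 VALUES LESS THAN (UNIX_TIMESTAMP('2024-01-03')),
    PARTITION pFuture   VALUES LESS THAN MAXVALUE
);

-- A duplicate (user, page, second) row is silently skipped instead of erroring:
INSERT IGNORE INTO page_view_log(user_id, view_id, ts)
VALUES (42, 7, NOW());

-- Nightly purge: drop the oldest partition once it ages out.
ALTER TABLE page_view_log DROP PARTITION p20240101;
```

UNIX_TIMESTAMP() is the partitioning function MySQL permits on a TIMESTAMP column, and because ts is part of the primary key the partitioning-key rule for unique keys is satisfied.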