MySQL – Schema for Payment History and Viewing History

database-designMySQLschema

On some websites I have seen two history of events: Payment History and View/Watched History.

I want to have both payment and viewing histories in my app, but I got stuck when I started thinking about the database schema.

As a basic example, the easiest way to do this for the view history would be:

  • Have one table called view_history
  • Have these four columns, at least: id, user_id, timestamp, view_id

So when a user with an ID of 80 visits a view with an ID of 22, this should be added to the table.

When this user_id 80 views the View History view, it will loop through the view_history table and displays only the rows that match the user_id = 80.

This is the easiest way of doing this, but I'm afraid it is also the worst. Log tables tend to get big pretty quickly and I assume looping through this whole table wouldn't be ideal performance wise.

With that in mind, can you suggest the best approach to this?

Thank you!

Edit: I apologize if this wasn't the right place to post this question.

Best Answer

Yes, logs tend to become big. But that is not necessarily a problem. (More in a minute.)

I would be tempted to treat View History as if it were another page, and log it in the same table.

The log does not really need an id. PRIMARY KEY(user_id, view_id, timestamp) (all the columns) serves two purposes: Having a PK; and facilitating View History because the PK starts with user_id. One drawback: If the user views the same page twice in one second, the INSERT would get "duplicate key" error. So I would do INSERT IGNORE to avoid the error (and lose the record of the extra page load).

The best way to handle purging old log entries (should you decide to do such), is to PARTITION BY RANGE using the timestamp. The PK is already prepped for that. If you want a 30 day retention period, use 32 partitions and do DROP PARTITION nightly. If you want a bigger retention, then use weekly partitions. The details of "why", plus sample code for doing the DROP and REORGANIZE are in my Partition Maintenance blog.

Assuming that you use InnoDB for the log, it would be beneficial to have enough RAM so you can set innodb_buffer_pool_size bigger than the size of one partition.

(All InnoDB and MyISAM indexes are BTrees, so the comment about Heap is irrelevant.)