Mysql – What are some good methods of structuring a MySQL audit trail to be both immutable, and perform well for usage

auditdatabase-engineinnodbMySQLperformance

In one application, we have an audit table that uses the ArchiveEngine in conjunction with a few triggers to capture 100% of the changes that occur. The upside to this is that 20+ million rows occupies 145MB, not to mention that ArchiveEngine only supports INSERT and SELECT, rendering the trail immutable.

The downside comes when you want to access that data in order to give the user audit information. Since the data is stored in compressed form and there are no indices, selection time can be ridiculous:

mysql> select * from AUDIT_LOG WHERE Table_Name = 'myTable' AND Row_ID = 9024;
+------------+--------+-------------+---------------------+---------------------+----------------+-------------+---------------------+
| Table_Name | Row_ID | Field_Name  | Old_Value           | New_Value           | DB_User        | modified_by | date_modified       |
+------------+--------+-------------+---------------------+---------------------+----------------+-------------+---------------------+
| myTable    |   9024 | Flags       | 0                   | 1                   | user@localhost |         826 | 2011-06-10 22:05:11 |
| myTable    |   9024 | Status      | Unavailable         | Pending             | user@localhost |         826 | 2011-07-08 22:41:45 |
| myTable    |   9024 | Status      | Pending             | Processing          | user@localhost |         826 | 2011-07-08 22:41:49 |
| myTable    |   9024 | Status      | Processing          | Calculated          | user@localhost |         826 | 2011-07-08 22:41:54 |
| myTable    |   9024 | Fee_Paid    | NULL                | 185.00              | user@localhost |         826 | 2011-07-08 22:41:54 |
| myTable    |   9024 | Status      | Calculated          | Approved            | user@localhost |         826 | 2011-07-08 22:47:04 |
| myTable    |   9024 | Approved    | NULL                | 2011-07-08 22:47:06 | user@localhost |         826 | 2011-07-08 22:47:04 |
| myTable    |   9024 | Approved_TZ | America/Yellowknife | America/Vancouver   | user@localhost |         826 | 2011-07-08 22:47:04 |
| myTable    |   9024 | Approved_By | NULL                | 826                 | user@localhost |         826 | 2011-07-08 22:47:04 |
| myTable    |   9024 | Invoice_ID  | NULL                | 2372                | user@localhost |         825 | 2011-10-17 19:32:59 |
+------------+--------+-------------+---------------------+---------------------+----------------+-------------+---------------------+
**10 rows in set (13.54 sec)**

After conversion to InnoDB and adding a PK and indexing Row_ID and Table_Name, this same operation takes roughly 1/10th of a second.

What I'm considering is a leaving the triggers + Archive, and then using a cron job to mirror the Archive in InnoDB.

  • Are there other ways this has been dealt with?
  • Any drawbacks or trade-offs that are apparent?

Best Answer

Are there other ways this has been dealt with?

If you use MySQL Replication, you could have a Master/Slave that has the following:

  • AUDIT_LOG as InnoDB on the Master
  • AUDIT_LOG as ARCHIVE on the Slave

This would provide fast writing of the AUDIT_LOG on the Master and makes the data available for reads almost instantly. The Slave would have the backup as an ARCHIVE with a small disk footprint.

DO NOT SEND ANY ALTER TABLE SQL AGAINST AUDIT_LOG ONCE REPLICATION IS SET UP. Otherwise, you lose the small disk footprint feature of the ARCHIVE Storage Engine.

Any drawbacks or trade-offs that are apparent?

You must configure InnoDB