I'm designing a schema to support logging of user activity, where users must be able to search:
- Across all events (events of any type), with datetime range, by username;
- Across events of one type, with same as above and additionally with parameters of that module.
I created this schema:
And designed queries to search:
-
across all events:
SELECT extract(epoch from log_time) * 1000, u.username, CASE WHEN l.event_type IN (0, 2) THEN e.template WHEN l.event_type = 1 THEN format(e.template, ecs.field) WHEN l.event_type = 3 THEN format(e.template, ess.field) WHEN l.event_type = 4 THEN format(e.template, ect.ip, ect.port) END FROM logs l JOIN users u ON u.id = l.user_id JOIN event_def e ON e.id = l.event_type LEFT JOIN event_client_search ecs ON ecs.log_id = l.id LEFT JOIN event_switch_search ess ON ess.log_id = l.id LEFT JOIN event_cable_test ect ON ect.log_id = l.id
-
across same-type events:
SELECT extract(epoch from log_time) * 1000, u.username, format(e.template, ect.ip, ect.port) FROM logs l JOIN users u ON u.id = l.user_id JOIN event_def e ON e.id = l.event_type JOIN event_cable_test ect ON ect.log_id = l.id
This is what the event_def table looks like:
id | template
----+----------------------------------------
0 | Logged in
2 | Logged out
1 | Searched in clients %s
3 | Searched in switches %s
4 | Cable test switch %s port %s
But what I didn't like, when I have about 100 events (now I have about 50, but haven't implemented them yet), it is going to be a performance issue.
So I thought to merge events with same parameters, like this:
I don't think that will help a lot (it'll cut the number of events by half at most), maybe I'm thinking in the wrong direction?
Best Answer
One of the assumptions that I would ask you to reconsider is whether you really want to have different tables for each event type, or if rather these event types are rows in a table instead of distinct tables.
If you had an
EVENT_TYPE
table containing your list of event types (about 100 of these) you could take the columns from your current various event tables and make them rows in anEVENT_PARAMETER
table.This type of design would turn your schema changes into data changes when you add new event types. That would save you code maintenance issues and also avoid making your queries more complex and potentially slower.
The instances of events would similarly be kept in two tables, for example a
LOG
table and aLOG_DETAIL
table.LOG
would be an intersection between user and event type.LOG_DETAIL
would be an intersection between log and event parameter.Your data model might look something like this:
Note that one objection some people may have to this approach is that you end up saving the log parameter values in a string format, rather than in a native format. This is clearly a trade-off. You have to ask yourself is it a good trade-off for your situation.