Understanding a notification system

database-design

I have been looking into how to build a notification system on SE and elsewhere and found myself drawn to the solution that is the accepted answer here: https://stackoverflow.com/questions/9735578/building-a-notification-system
which uses this structure:

╔═════════════╗      ╔═══════════════════╗      ╔════════════════════╗
║notification ║      ║notification_object║      ║notification_change ║
╟─────────────╢      ╟───────────────────╢      ╟────────────────────╢
║ID           ║—1:n—→║ID                 ║—1:n—→║ID                  ║
║userID       ║      ║notificationID     ║      ║notificationObjectID║
╚═════════════╝      ║object             ║      ║verb                ║
                     ╚═══════════════════╝      ║actor               ║
                                                ╚════════════════════╝

A notification is about something (object = event, friendship..) being changed (verb = added, requested..) by someone (actor) and reported to the user (subject). Here is a normalized data structure (though I've used MongoDB). You need to notify certain users about changes. So it's per-user notifications.. meaning that if there were 100 users involved, you generate 100 notifications.

I thought at first that I understood this approach, but when I started getting ready to implement it I realized I apparently don't understand it particularly well. The last few comments on the answer are questions from other users who also have had trouble understanding the solution.

I'm not sure if this is the model I'll end up following, but given the number of upvotes it has, I'm sure it would benefit me to understand it, and I'd certainly like to learn more. I hope this will also be of use to others who have had trouble grasping this solution (incidentally, I don't have enough internet points to leave a comment on that answer directing to this question, anyone else please do!)

Questions

If I understand it right, notificationObjectID is a foreign key pointing to the notification_object table, and notificationID is a foreign key pointing to the notification table. It seems like object should be a foreign key referring to the ID of the database entry the notification is about (e.g. a specific event or post), but don't we then need another field to indicate which table that ID belongs to?

The author wrote

notification_object.object identifies change type, like a string "friendship" The actual reference to changed object with its extra data that I talk about is in notification_change.notificationObjectID

which doesn't seem to make sense to me. Object is a string (enum?) and notificationObjectID is a foreign key referring to the object the notification is about? Then how are the middle and right tables connected at all?

It seems that the middle table specifies what object (or type of object) the notification is about, e.g. an event or post. We can then have many entries in notification_change that point to the same object type, which allows us to bundle notifications (like "25 users posted on X's wall) – hence the 1:n relationship between the middle and right tables.

But why is there a 1:n relationship between the left and middle tables? Are we going to give "25 users posted on Sam's wall" and "Mary updated her "Friday Picnic" event the same notification ID? If all notifications for the same user have the same notification ID, why do we even need the table on the left?

A performance question – say John posts a comment on Mary's picnic event. It seems like we'd need to do a lookup to see if a notification_object already exists for Mary's Picnic before we created the notification_change entry. Is this going to negatively impact performance, or is it a non-issue? Continuing the questions from the previous paragraph, how would we know which notification entry to point the notification_object to?

Best Answer

Thanks for such thorough questions, and sorry for all of confusion - commenting 1 year after initial answer is hard and now 3 years after.. initial thoughts fade and confuse me too, but I fear to edit the hole thing because I'm not working on storing notifications on backend right now and not sure if I make good judgements without practical application

I think at the time of writing:

Yes and no, notificationID was a foreign key, notificationObjectID was not. You do need another FK field to tie tables together. I blame my mongo experience in not being so clear about it :(
Yes and no, notification_object.object is vague because you can have it just as a string, or something complex (JSON, or FK). In my case it was just a noun.

So it all depends on how your notifications look. For simple case, you just want to link entire notification to some URL, like friends page - thats why having object (entityType) as a string is useful - you tie URLs to it.

Notification "you have 3 friend requests added" can be stored differently.

If you want to show them one at a time - you'll have 3 notifications, 3 notification objects (friend_request) and 3 entries in notification_change that link to specific friend user.

If you want to show one notification - you'll have 1 notification, 1/more objects and 3/more actions. So in this complex case, like «you have 3 friend requests from user A, user B, user C» - you use notificationObjectIDs for each user and have several links in your notification text.

Should you use 1 or 3 friend_request objects? It depends on

What is the object and what is the action. Is «article liked/commented»? Or is it «like/comment added» and object hierarchy gets linked only during display? Here is facebook distinguishing «photo commenting» from «user mentioning» which semantically seem to be very close — you have same photo, same actor but different notifications that could have been merged together

enter image description here

Can you remove a notification or do you need a history? So if I send a friend request and then cancel it, or comment and then delete an article, like and then unlike something - should It show this history (as another action) to the end user as two different actions? Probably not. In addition its techincally more complex - you need to search if there is an existing notification_object, add new notification_change to it (if not - add a new object) so that later I would need to search through notification_change to remove it. Instead I would just add another notification_object to the same notification and remove it if its gone with actions getting cascade-deleted.

On the other hand, there could be cases, where it might be useful to have grouping of actions where nothing gets wiped from the history.

I think thats why at the time of writing, 1:n relation between left and middle tables were done - so that you could group notifications not only by actors on the same entity (middle-right tables) but by several object/entities too.

But sure, you can simplify entire case and optimize storage by reversing left-middle relationship to n:1, so that you have per-user notifications generated for one event.

So that it would it look more like this..

╔═════════════╗      ╔═══════════════════╗      ╔════════════════════╗
║notification ║      ║notification_object║      ║notification_change ║
╟─────────────╢      ╟───────────────────╢      ╟────────────────────╢
║ID           ║←—n:1—║ID                 ║—1:n—→║ID                  ║
║noteObjFK    ║      ║entityType         ║      ║noteObjFK           ║
║viewerUserID ║      ║entityID           ║      ║actionOnEntity      ║
╚═════════════╝      ╚═══════════════════╝      ║actorUserID         ║
                                                ╚════════════════════╝

Hope this helped

Disclaimer

This is experimental and only tested rudimentarily. Proceed at your own risk. I would not use it myself and just drop / recreate constraints with standard DDL commands. If you break entries in the catalog tables you could easily mess up your database.

For all I know, there are only two differences between a PRIMARY KEY and a UNIQUE constraint in the catalog tables (the index itself is identical):

pg_index.indisprimary:
For PRIMARY KEY constraint ... TRUE
For UNIQUE constraint ... FALSE

pg_constraint.contype:
PRIMARY KEY constraint ... 'p'
UNIQUE constraint ... 'u'

You could convert constraint and index in place, from PRIMARY KEY constraint to UNIQUE constraint, my_idx being the (optionally schema-qualified) index name:

UPDATE pg_index SET indisprimary = FALSE WHERE indexrelid = 'my_idx'::regclass
UPDATE pg_constraint SET contype = 'u' WHERE conindid = 'my_idx'::regclass;

Or upgrade from UNIQUE to PRIMARY KEY:

UPDATE pg_index SET indisprimary = TRUE WHERE indexrelid = 'my_idx'::regclass;
UPDATE pg_constraint SET contype = 'p' WHERE conindid = 'my_idx'::regclass;

Questions

Best Answer

Related Solutions

Designing An ACL Based Permission System

Postgresql – What happens to the index of a primary key after a DROP CONSTRAINT

Disclaimer

Related Question