How to build a table for a private messaging system that supports replies

database-designdatabase-recommendationhierarchy

I'm writing a private messaging web application in PHP, the application allows replies such that when you view a message, you also get to see to what was that a reply to, and to what was that a reply to and to what was that a reply to and so on and so on.

I'm trying to find a right database structure that would avoid redundancy, so I'm wondering how can I link a single message to all other messages that it is replying to?

I thought about basically having a field called reply_id which would be a serialized array holding the IDs of all messages to be presented as replies

Does anyone have a suggestion on how to do this efficiently? Is my thought a good practice?

Best Answer

Hierarchic Self-Referential Data

Syntax shown using the free and open source database, PostgreSQL

You need a self-referential table, which creates hierarchy in the database. This is how it looks.

CREATE TABLE messages (
  ts_entered  timestamp DEFAULT now(),
  reply_to    int       REFERENCES messages,
  id          int       PRIMARY KEY
                        GENERATED BY DEFAULT AS IDENTITY,
  content     text
);
CREATE INDEX ON messages (reply_to, id);

Notice that reply_id reference an id on the very table. In a more enterprisey example, we'd probably make it EXCLUDE the same id.

Then we enter test data like this.

INSERT INTO messages ( reply_to, id, content )
VALUES
  ( null, 1, 'SYN'),
  ( 1,    2, 'SYN-ACK'),
  ( 2,    3, 'ACK'),
  ( null, 4, 'We should give free advertising to PLAN EXPLORER'),
  ( 4,    5, 'Admins should disclose their affiliations with products'),
  ( 5,    6, 'BANNNN'),
  ( 4,    7, 'Plan Explorer only works on one database'),
  ( 7,    8, 'BANNNN'),
  ( 4,    9, 'Plan Explorer does not support Linux'),
  ( 9,   10, 'BANNNN'),
  ( 4,   11, 'Plan Explorer only does what decent databases already do'),
  ( 11,  12, 'BANNNN')
; 

Now in order to query this, we need a RECURSIVE CTE.

WITH RECURSIVE t(reply_to, id, content, root, level)
AS (
  SELECT reply_to, id, content, ARRAY[id], 0
  FROM messages
  WHERE reply_to IS NULL
  UNION ALL
    SELECT messages.reply_to, messages.id, messages.content, root + ARRAY[messages.id], t.level+1
    FROM t
    JOIN messages
      ON (messages.reply_to = t.id)
)
SELECT * FROM t
ORDER BY root;

And you're done..

 reply_to | id |                         content                          |   root    | level 
----------+----+----------------------------------------------------------+-----------+-------
          |  1 | SYN                                                      | {1}       |     0
        1 |  2 | SYN-ACK                                                  | {1,2}     |     1
        2 |  3 | ACK                                                      | {1,2,3}   |     2
          |  4 | We should give free advertising to PLAN EXPLORER         | {4}       |     0
        4 |  5 | Admins should disclose their affiliations with products  | {4,5}     |     1
        5 |  6 | BANNNN                                                   | {4,5,6}   |     2
        4 |  7 | Plan Explorer only works on one database                 | {4,7}     |     1
        7 |  8 | BANNNN                                                   | {4,7,8}   |     2
        4 |  9 | Plan Explorer does not support Linux                     | {4,9}     |     1
        9 | 10 | BANNNN                                                   | {4,9,10}  |     2
        4 | 11 | Plan Explorer only does what decent databases already do | {4,11}    |     1
       11 | 12 | BANNNN                                                   | {4,11,12} |     2
(12 rows)