Single Table vs Multiple Table Design for Distinct Records Queried Together in PostgreSQL

database-designnormalizationpostgresqlpostgresql-9.3

Let's say I have the following types of recurring data (all types periodically come in).

  • Post (Title, Summary, Text, Timestamp)
  • Image (URL, Caption, Timestamp)
  • Event (StartDate, EndDate, Title, Description, Location)

As the style of the data suggests, we can split them into their own separate tables, Post, Image and Event respectively. However the majority of the queries on these data are of the "timeline" variety (Return me all posts, images and events within a given timeframe sorted by time), though there are also some queries that work on a single type alone.

Someone suggested that we combine all three into a single table and encode the unique attributes into a json field (We're using Postgres 9.3+) rather than use a UNION of the three tables.

So according to that model, we'd have

  • Info (Type, Timestamp, Attributes)

where Attributes is a JSON field which will have different keys based on the Type value.

Which is the better approach here?

P.S. This is a simple example for explaining the question but in reality there are closer to 9 different "types" of data and the number of unique attributes is much larger. We might need to join on one of those unique attributes as well in the future (not sure if Postgres supports this).

Best Answer

I won't put all of them together. Even if you query them together, it may occur that your application will be able to post multiple images per post or multiple events per post or multiple posts per event.

In this case you would be able to save space. The JSON may be a good idea, but it would be better (in view of indices) to use multiple tables. This way you can provide indices which may improve the query process.

Another idea could be to use permutation. Which means you have a global table which holds many different types with the same structure. For example:

CREATE TABLE types(
    id int,
    type nvarchar(50),
    property1 int,
    property2 nvarchar(50)
)

This way you can store a value with Type = 'Image' which holds a property1 (order number) and a property2 (a path). And additionally you can hold an event like Type = 'Event' which holds the date in property1 and the name in property2. Just as an example.