Postgresql – How to efficiently clone database records together with all child elements

database-designpostgresql

I have an insert-only PostgreSQL database with a Book table, a Page table and an Element table.
Books can have multiple pages and pages can have multiple elements.
The database is insert-only because any time a page is modified, a new version of that page is created in a new record for auditing reasons.

-------------
Book
-------------
id [PK]

-------------
Page
-------------
id [PK]
book_id [FK]
version
created

-------------
Element
-------------
id [PK]
page_id [FK]
version
created

Currently, Element has a Foreign Key pointing to Page.id and Page has a Foreign Key pointing to Book.id.

A requirement is to have independent page drafts (versions) that can contain any number of elements. Every time a new page draft (version) is created, the new draft should be a clone of the previous version and it should contain all elements in the previous version. Elements can be added to or removed from a specific version of the Page, although Elements records are never deleted.

For scalibility reasons, I would avoid cloning all the elements together with the page, since the elements could be a very large number. I would also what to avoid many-to-many relationships between pages and elements.

I was thinking about replacing the Element.page_id FK with an Element.book_id FK so to have a single bucket of elements and pages more loosely coupled with each other. This way, when cloning a page, I wouldn't have to clone all the elements, but I would need a way to associate each element with a particular version of a Page.

Ideally, I should be able to retrieve a page version with all the elements associated to that version with a single SQL query.

I am still in a design/prototyping phase, so any aspect of the databse can potentially be changed to fulfill the requirements.

What could be the best way to associate Pages and Elements according to the requirements above? I hope the requirements are clear, if not, please ask any questions and I'll try to clarify.

Best Answer

a new version of that page is created in a new record for auditing reasons.

How do you know which "version" is the latest, you're going to process on version and select the last one? That'll work, but I think you're barking up a pretty steep tree. Essentially you're building your own revision-control system in a database. Not saying everyone hasn't tackled that task before, but issues likely to come up are

  • Merging (other users)
  • Rollbacks (without which, why would you keep old versions)
  • Diff-views
  • Delta stores (why store the whole element when you only change one character)
  • etc, etc.

This gets super complex very fast. If you're wanting an out of the box solution, checkout gitlab. If you have speced to just stop here and not feature creep -- I think this will work just fine.