Cross-posting the answer as well:
I think I nailed it. Basically, you add a (unique) draft field to the relevant tables, and you work on the drafts as if they were a new post/tag/etc.:
posts (
id pkey,
public,
created_at stamptz,
updated_at stamptz,
updated_by int,
draft int fkey posts (id) unique
)
post_revs (
id,
public,
created_at,
created_by,
deleted_at,
pkey (id, created_at)
)
tags (
id pkey,
public,
created_at,
updated_at,
updated_by,
draft fkey tags (id) unique
)
tag_revs (
id,
public,
created_at,
created_by,
deleted_at,
pkey (id, created_at)
)
post2tag (
post_id fkey posts(id),
tag_id fkey tags(id),
public,
created_at,
updated_at,
updated_by,
pkey (post_id, tag_id)
)
post2tag_revs (
post_id,
tag_id,
public,
created_at,
created_by,
deleted_at,
pkey (post_id, tag_id, created_at)
)
You can and will be better served by removing created_by
from your primary key and creating a separate index for it.
This will allow those queries which use created_by
in the where clause (e.g. where created_by = ?
), to be able to use that index to service the query.
There may be a slight, and probably not noticeable, performance hit from managing the primary key and the index on an insert. However this price is nothing to pay for not doing full table scans when executing such queries needing it to be be the primary search in the where clause.
Best Answer
Your proposed design is completely viable.
Couple of things to consider:
Workload
Assuming InnoDB engine you want primary key to be a continuously incrementing id so INSERT won't be random causing too much page split and fragment the tablespace.
Benefits of
AUTO_INCREMENT
:Downsides of
AUTO_INCREMENT
:Timestamp would be a great candidate for PK if you can keep it
UNIQUE
since it's most likely will participate in most of your queries and most like as a range. (PK is at the end of every secondary key).You could also experiment with the an
id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY
instead. Use whichever gives better performance.Size
Try to keep the row size as small as possible to avoid sharding/partitioning/archiving as long as possible (eventually you will need to but MySQL is quite good with tens of millions of rows even).
To achieve this use only IDs in this table and have lookup tables where necessary. For example: instead of
action VARCHAR(12)
useaction_id TINYINT UNSIGNED
.TINYINT
is 1 byte whereas action will be always at least 2 assumingASCII
,latin1
or similar small character set but longer the string becomes the more byte it will use obviously.UTF8
can use up to 4 bytes / character.Only put index on columns where you really need to. Daily, monthly, etc aggregation tables will serve you better than heavy indexes on this table.
A possible schema