MySQL Database Design – Efficient Way to Update Multiple Tags or Categories

database-designMySQL

What's the efficient or best way on updating a multiple tags or categories?

Currently I have posts and posts_meta tables.

I have this structure on posts_meta table:

+----+---------+----------+------------+---------------------+---------------------+
| id | post_id | meta_key | meta_value | created_at          | last_update_at      |
+----+---------+----------+------------+---------------------+---------------------+
| 94 |     111 | tags     | art        | 2016-07-05 15:01:05 | 2016-07-05 15:01:05 |
| 95 |     111 | tags     | books      | 2016-07-05 15:01:05 | 2016-07-05 15:01:05 |
| 96 |     111 | tags     | comics     | 2016-07-05 15:01:05 | 2016-07-05 15:01:05 |
+----+---------+----------+------------+---------------------+---------------------+

Currently what I was thinking is when I update the tags on a single post, I delete the current tags and
then insert the new ones.

What I think the problem on that is when I update a post without changing the tags, the function will still
going to happen(deleting the current tags and inserting new ones). And I think that is not efficient.

Another solution I thought was just updating the current tags row by just changing the meta_value.

So from:

+----+---------+----------+------------+---------------------+---------------------+
| id | post_id | meta_key | meta_value | created_at          | last_update_at      |
+----+---------+----------+------------+---------------------+---------------------+
| 94 |     111 | tags     | art        | 2016-07-05 15:01:05 | 2016-07-05 15:01:05 |
+----+---------+----------+------------+---------------------+---------------------+

to:

+----+---------+----------+------------+---------------------+---------------------+
| id | post_id | meta_key | meta_value | created_at          | last_update_at      |
+----+---------+----------+------------+---------------------+---------------------+
| 94 |     111 | tags     | magazine   | 2016-07-05 15:01:05 | 2016-07-05 15:01:05 |
+----+---------+----------+------------+---------------------+---------------------+

But another problem here is that, what if the current post have 3 tags and the update has 4 new tags or vice versa.
I know we can do some workaround for that but, not sure if that's efficient or worth to do it for just
a simple update.

Now I'm really not sure if this is how I should do it.

Any suggestions?

Thanks

Best Answer

Usage?

You have not listed what kind of queries will hit this table. You can't really design a table, much less optimize it, until the SELECTs are tentatively written. More later.

First, some simplification

Get rid of id; make PRIMARY KEY(post_id, meta_key, meta_value). After all, that has to be unique. The id is clutter and slows things down for the normal operations which involve WHERE post_id = constant AND meta_key = constant.

Because of the 3 tags vs 4 tags you mentioned, the only reasonable way to do it is

DELETE FROM meta WHERE post_id = ? AND meta_key = ?;
INSERT INTO meta (post_id, ... )
    VALUES
    (...), (...), (...), (...);

Then some more simplification

The EAV schema (which you have) sucks. You may as well have a table called Tags. Now you can get rid of meta_key, and everything gets simpler and shorter.

You may choose to keep the EAV stuff for where there is only one value for a meta_key. That is a separate discussion. I'm arguing that it is wrong to fit multi-valued "tags" into that mold.

Further

Do you really need those timestamps? WP may be omnipresent, but it is not necessarily 'right'.

Ordering?

When you fetch everything for some tag, don't you want to put them is some order? And might you paginate it? So, maybe you need one of those timestamps for this?

Bottom line

CREATE TABLE `tags` (
    post_id ... NOT NULL,
    tag varchar(55) NOT NULL,
    sort_order ... NOT NULL,
  PRIMARY KEY (tag, sort_order, post_id),  -- for the main select, plus dedup
  INDEX(post_id)  -- for maintenance
) ENGINE=InnoDB;

Further discussion