My design contains a table where almost every entry has a parent it has to point to, within the same table (differently put, self references).
/* scenario executions */
CREATE TABLE `boteval`.`scenario_executions` (
`scenario_id` BIGINT NOT NULL COMMENT 'connects to a scenario id',
`id` BIGINT NOT NULL AUTO_INCREMENT COMMENT 'execution id',
`parent_id` BIGINT NULL COMMENT 'execution id of parent executor (null would mean no parent, i.e. a top level execution)',
`started` DATETIME(6) NOT NULL,
`ended` DATETIME(6) NULL COMMENT 'can be null while the execution has not yet ended',
PRIMARY KEY (`id`),
UNIQUE INDEX `id_UNIQUE` (`id` DESC));
The use case being one where tree structures need to be incrementally recorded into the table, first the parent, than children, children of children, etc, as the program unfolds.
Using an auto-incrementing key for this table, I have reason to believe it would be extruciatingly slow if every insert needed to happen only after its predecessor had already returned its database-assigned auto incremented key.
On the other hand, I think that batching this workload would be way beyond naive, as a sql that strings together the returned ids would become extremely nested for each buffered batch, not to mention the dependencies between batches.
I have several solutions to this, but I wonder if there are very specific design patterns you would recommend. Maybe one of my design constraints would need to be relaxed: maybe the auto-incrementing, if I can safely supply a unique key from my code. In a way, this might not even really be a good fit for a relational database like MySQL.
Thanks in advance for any comments on good patterns for this kind of scenario!
Best Answer
DESC
is ignored. (This will be rectified in version 8.0.)DROP
that redundantUNIQUE
index.That said, I don't see that anything you have described so far could be cause "excruciatingly slow" inserts.
To further elaborate. You have 2 BTrees:
ID
ID
and ordered (ASC) by it.When inserting a new row (without explicitly specifying
ID
), the server is optimized to insert it into the "last" block of the BTree and bumping the A_I value.Meanwhile, the
UNIQUE
index is acting much the same. But it is doubling the amount of work to do, with no benefit.On the other hand, if you have multiple connections doing the inserts, this may be worth noting -- How the inset works:
UNIQUE
indexes for duplicates. This may be slow, since another connection may be locking the adjacent ID.Also worth noting in InnoDB:
INSERT
? That may literally run 10 times as fast.Do you really need
BIGINT
? It takes 8 bytes.INT UNSIGNED
takes only 4 bytes, and allows 4 billion values.