Mysql – auto-incrementing key in self-referencing table, performance

auto-incrementdatabase-designMySQL

My design contains a table where almost every entry has a parent it has to point to, within the same table (differently put, self references).

    /* scenario executions */
    CREATE TABLE `boteval`.`scenario_executions` (
      `scenario_id` BIGINT NOT NULL COMMENT 'connects to a scenario id',
      `id` BIGINT NOT NULL AUTO_INCREMENT COMMENT 'execution id',
      `parent_id` BIGINT NULL COMMENT 'execution id of parent executor (null would mean no parent, i.e. a top level execution)',
      `started` DATETIME(6) NOT NULL,
      `ended` DATETIME(6) NULL COMMENT 'can be null while the execution has not yet ended',
      PRIMARY KEY (`id`),
      UNIQUE INDEX `id_UNIQUE` (`id` DESC));

The use case being one where tree structures need to be incrementally recorded into the table, first the parent, than children, children of children, etc, as the program unfolds.

Using an auto-incrementing key for this table, I have reason to believe it would be extruciatingly slow if every insert needed to happen only after its predecessor had already returned its database-assigned auto incremented key.

On the other hand, I think that batching this workload would be way beyond naive, as a sql that strings together the returned ids would become extremely nested for each buffered batch, not to mention the dependencies between batches.

I have several solutions to this, but I wonder if there are very specific design patterns you would recommend. Maybe one of my design constraints would need to be relaxed: maybe the auto-incrementing, if I can safely supply a unique key from my code. In a way, this might not even really be a good fit for a relational database like MySQL.

Thanks in advance for any comments on good patterns for this kind of scenario!

Best Answer

  • In indexes, DESC is ignored. (This will be rectified in version 8.0.)
  • MySQL is happy to use an ASC index (that is, any index) but scan it in descending order.
  • So DROP that redundant UNIQUE index.

That said, I don't see that anything you have described so far could be cause "excruciatingly slow" inserts.

To further elaborate. You have 2 BTrees:

  • One with all the columns, and ordered (ASC) by ID
  • One with just ID and ordered (ASC) by it.

When inserting a new row (without explicitly specifying ID), the server is optimized to insert it into the "last" block of the BTree and bumping the A_I value.

Meanwhile, the UNIQUE index is acting much the same. But it is doubling the amount of work to do, with no benefit.

On the other hand, if you have multiple connections doing the inserts, this may be worth noting -- How the inset works:

  1. Get a new A_I value (well optimized)
  2. Check all UNIQUE indexes for duplicates. This may be slow, since another connection may be locking the adjacent ID.
  3. Insert the row
  4. Throw info into the "Change buffer" for later updating of non-unique indexes. (This does not apply.)

Also worth noting in InnoDB:

  • Each transaction has some overhead -- writing to the undo log, writing the "double write" buffer, etc.
  • Hence, batched insert run much faster because they spread out this overhead over multiple rows. Can you insert, say, 100 rows in a single INSERT? That may literally run 10 times as fast.

Do you really need BIGINT? It takes 8 bytes. INT UNSIGNED takes only 4 bytes, and allows 4 billion values.