Mysql – Should I in general partition tables by tenant IDs in a multi-tenant system

We're building a system where the data in 10 tables is associated with Accounts. A typical table looks like this:

create table Things(
    accountId varchar(64) not null,
    internalId varchar(64) not null,
    externalId varchar(256) as (concat(accountId, '-', internalId)) stored,
    ...
    primary key (accountId, sourcedId),
    unique (externalId),
    foreign key (accountId) references Accounts (id)
);

All queries either have an accountId in the where clause, or they use externalId. There are no cross-account queries.

We're expecting to have a total of 200 accounts. The sizes of the other tables (like Things) vary from 5 rows per account for some tables (1000 rows total) to 225K rows per account for some other tables (45M rows total). (These are the numbers we use for performance testing – they're the max numbers)

The DB size is ~150 GB. 95% of the scenarios are reads.

The RDBMS is Mysql 8.0.16 (AWS RDS).

We don't have any performance issues at the moment and we're not trying to make anything work faster. But I'm wondering if NOT partitioning the tables like Things by accountId is a "premature pessimization"?

Best Answer

If you are using InnoDB, the clustered index will (generally) already organize the data by AccountId. If the number of writes is generally low/fairly balanced, there should be no need to partition by every AccountId.

Partitioning might help in the following scenarious:

You need to drop an entire account's data quickly or restore just a single account's data.
You have a lot of accounts with very few writes, and other accounts with tons of writes. If you partition the high-volume accounts to their own partition, it's easier to perform things like defragmentation/index rebuilds without affecting other accounts.
You need to keep data accessible for inactive accounts, but want to keep the "active" data as small/manageable as possible.

If you do end up partitioning, doing so by every AccountId will probably be overkill - it's better to determine some criteria for how/when you'd partition.

Edit: As RickJames helpfully points out, MySQL can't/won't enforce FK constraints with partitioned tables. So you'd need to create an alternate method of enforcing those, which is extra code overhead and the risk of invalid data. So if you do end up partitioning things, might make sense to upgrade to a database without that limitation (SQL Server would be the next step up in the "Supports clustered indexes, can implement partitions without breaking" category.

Best Answer

Related Solutions

Sql-server – Which of these table designs is better for performance

MySQL optimization – year column grouping – using temporary table, filesort

Related Question