PostgreSQL – Limits of table partitioning

Tags: partitioning, postgresql, sharding

I have the following use case (PostgreSQL, currently v9 but I'm planning to upgrade to the latest version).

  • I have a data warehouse application with a central table that has a sizeable number of columns, two of these being JSONB columns.
  • As common for data warehouse applications, I have a number of indices on the central table, roughly one for each query type that is running.
  • The central table has a case column that contains an integer value and identifies one subset of data.
  • After the import, the records for a particular case are immutable (but I might need to purge all records for a case and re-import it once in a while).
  • For each case, I have ~100k to 3M records (a pretty much bimodal distribution, with ~70% of cases at around 100k records and ~30% at around 3M).
  • I currently have one thousand cases and would expect this to grow by one thousand cases a year.
  • 95% of all queries go to a single case. 5% of all queries go to a dozen or maybe 200 cases.

Now my questions:

  • Would it make sense to partition by case ID for performance improvements? I'm currently not so much concerned about query performance (Postgres is really awesome) but more about import times.
  • If I'm partitioning by case, would I hash the case ID integer or would I start a new one-element range for each case? (See the sketch after this list for what I mean by the latter.)
  • If I'm getting 10k or 100k cases, would Postgres scale to this number of partitions?
  • I'm considering using something like Citus for sharding over multiple physical servers. Is there anything to keep in mind here?
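
To make the "one partition per case" option concrete, this is roughly what I have in mind (a minimal sketch; the table and column names are just placeholders for my actual schema):

```sql
-- Minimal sketch; "central_table", "case_id", "payload", "details" are placeholder names.
CREATE TABLE central_table (
    case_id  integer NOT NULL,
    payload  jsonb,
    details  jsonb
    -- ... remaining warehouse columns ...
) PARTITION BY LIST (case_id);

-- One partition per case (the "one-element range" variant would instead use
-- PARTITION BY RANGE (case_id) with FOR VALUES FROM (42) TO (43)).
CREATE TABLE central_table_case_42 PARTITION OF central_table
    FOR VALUES IN (42);

-- Purging and re-importing a case would then be a detach/drop plus a fresh partition:
ALTER TABLE central_table DETACH PARTITION central_table_case_42;
DROP TABLE central_table_case_42;
```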

I did quite a bit of online searching on this topic but did not find much beyond table partitioning for log files.
I'd be more than happy if you could point me to good resources for further research.

Best Answer

100k partitions is an awful lot. I would not try that without doing some serious testing first. Include in the testing things like your backup scripts, and pg_upgrade runs (you don't want to design a database schema that cannot be feasibly upgraded to the next major version). Or just don't do it.

You would probably get most of the advantages of partitions without resorting to one partition per case. You could use hash partitioning (new in v11) or range partitioning, for example, to have a couple hundred partitions with a few hundred cases per partition.
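
For illustration only, a hash-partitioned layout along those lines could look roughly like this on v11 or later (the table and column names are assumed from your description, and the modulus of 256 is just an example):

```sql
-- Hash partitioning on the case ID (available from PostgreSQL 11).
-- "central_table" / "case_id" are assumed names; 256 partitions is an example.
CREATE TABLE central_table (
    case_id  integer NOT NULL,
    payload  jsonb
    -- ... remaining warehouse columns ...
) PARTITION BY HASH (case_id);

CREATE TABLE central_table_p000 PARTITION OF central_table
    FOR VALUES WITH (MODULUS 256, REMAINDER 0);
CREATE TABLE central_table_p001 PARTITION OF central_table
    FOR VALUES WITH (MODULUS 256, REMAINDER 1);
-- ... remainders 2 through 255, usually generated by a small script ...
```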

5% of all queries go to a dozen or maybe 200 cases.

How does that get expressed? An app-generated IN-list of literal case_id values? A join to a parent case table with a WHERE clause that identifies the dozen to 200 cases?
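
The reason I ask is that the two shapes behave differently with respect to partition pruning. Sketched with the same assumed names as above (the parent table "cases" and its "region" column are hypothetical):

```sql
-- (a) App-generated IN-list of literal case IDs: the planner can prune
--     untouched partitions at plan time.
SELECT *
FROM   central_table
WHERE  case_id IN (101, 217, 340 /* ... up to ~200 literals ... */);

-- (b) Join against a parent "cases" table with a WHERE clause: whether
--     partitions get pruned then depends on the plan shape, e.g. run-time
--     pruning under a parameterized nested loop (PostgreSQL 11+).
SELECT t.*
FROM   cases c
JOIN   central_table t ON t.case_id = c.case_id
WHERE  c.region = 'EU';
```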