Ok, PostgreSQL does not directly support combining foreign keys and partitions. You can create a foreign key on each individual partition, but there is no way to have a foreign key against a set of partitions as a whole without either coding the constraint triggers yourself or maintaining a key management table.
There are several ways to resolve this, but they require some understanding of the underlying database implementation concepts to make them perform well.
The first is to do your referential integrity enforcement manually through triggers. You can write constraint triggers, which can be deferred if needed, that check for the existence of the referenced row. Note that this is subject to ALTER TABLE ... DISABLE TRIGGER ALL issues, but it should generally work.
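As a minimal sketch of the constraint-trigger approach, assuming a partitioned table `parent` with an `id` column and a referencing table `child` with a `parent_id` column (all names here are illustrative, not from the original):

```sql
-- Hypothetical example: check that child.parent_id exists somewhere
-- in the partitioned table "parent" on every insert/update.
CREATE OR REPLACE FUNCTION check_parent_exists() RETURNS trigger AS $$
BEGIN
    IF NOT EXISTS (SELECT 1 FROM parent WHERE id = NEW.parent_id) THEN
        RAISE EXCEPTION 'parent_id % not found in parent', NEW.parent_id;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- A constraint trigger can be made DEFERRABLE, so the check can be
-- postponed to transaction commit when needed.
CREATE CONSTRAINT TRIGGER child_parent_fk
    AFTER INSERT OR UPDATE ON child
    DEFERRABLE INITIALLY IMMEDIATE
    FOR EACH ROW EXECUTE PROCEDURE check_parent_exists();
```

Note that, unlike a real foreign key, this only covers the child side; deletes from `parent` would need a symmetric trigger there.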
The second is to have a separate table which stores just the keys and is maintained through a combination of triggers and bulk data operations. Real foreign keys can then point at that table, which can help a great deal.
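A sketch of that key side-table approach, with illustrative names: every `id` in the partitioned table is mirrored into one plain table (kept in sync by triggers on the partitions or by bulk loads), and ordinary foreign keys reference the mirror.

```sql
-- Side table holding every key from the partitioned parent table.
CREATE TABLE parent_keys (
    id bigint PRIMARY KEY
);

-- Referencing tables get a real foreign key against the side table.
CREATE TABLE child (
    id        bigint PRIMARY KEY,
    parent_id bigint NOT NULL REFERENCES parent_keys (id)
);
```

The trade-off is the extra storage and write amplification of maintaining `parent_keys`, in exchange for genuine declarative integrity on the child side.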
So it is possible to enforce things but it requires a bit more work.
I don't know enough about MySQL's table partitioning capabilities to comment on it though.
First, you should consider solving the problem in another way.
Upgrade to MySQL 5.6, where OPTIMIZE TABLE works without blocking on InnoDB tables, as it is supported by InnoDB Online DDL.
If you can't upgrade, try using Percona Toolkit's pt-online-schema-change, which can perform the table rebuild without blocking.
$ pt-online-schema-change h=localhost,D=mydatabase,t=mytable --alter="ENGINE=InnoDB" --execute
If you're set on using partitioning, then yes, you must make id the partition key in the table you show. You can convert the table to a partitioned one with ALTER TABLE. If you need the conversion operation to be non-blocking, use pt-online-schema-change.
There's no way to partition to fixed-size partitions. You have to partition by values. But is it really that important to hit a specific size per partition?
Re your comment about partition size:
When using RANGE partitioning, what I do is set up a schedule to run ALTER TABLE and split the last partition from time to time. If you have a regular rate of growth, this is easy; but if your growth pattern is irregular, you might instead set up a periodic check that examines the number of rows per partition (using the INFORMATION_SCHEMA.PARTITIONS table) and emails you if a partition is getting full.
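Such a check can be a simple query against INFORMATION_SCHEMA.PARTITIONS; the schema and table names below are placeholders for your own:

```sql
-- Per-partition row counts for a partitioned MySQL table.
-- TABLE_ROWS is an estimate for InnoDB, but close enough
-- to decide when the last partition is filling up.
SELECT PARTITION_NAME,
       PARTITION_DESCRIPTION,   -- the VALUES LESS THAN bound
       TABLE_ROWS
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_SCHEMA = 'mydatabase'
  AND TABLE_NAME = 'mytable'
ORDER BY PARTITION_ORDINAL_POSITION;
```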
For example, let's set up a table partitioned by range on id.
CREATE TABLE `mytable` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`transactionid` int(11) NOT NULL,
`parent` int(11) NOT NULL,
`headers` longtext,
`creator` int(11) NOT NULL,
`created` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `id` (`id`),
KEY `transactionid` (`transactionid`,`parent`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
/*!50100 PARTITION BY RANGE (id)
(PARTITION p0 VALUES LESS THAN (0) ENGINE = InnoDB,
PARTITION p1 VALUES LESS THAN (1000) ENGINE = InnoDB,
PARTITION p2 VALUES LESS THAN (2000) ENGINE = InnoDB,
PARTITION p3 VALUES LESS THAN (3000) ENGINE = InnoDB,
PARTITION p4 VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */
As the MAX(id) approaches 3000, the table is getting close to filling up p3 and spilling over into p4. So it's time to reorganize. It's good to do this before any data spills over into p4, because the reorg will then affect only the last, empty partition and will therefore be very quick.
ALTER TABLE mytable REORGANIZE PARTITION p4 INTO
(PARTITION p4 VALUES LESS THAN (4000), PARTITION p5 VALUES LESS THAN MAXVALUE);
Even if you miss a day and some data gets into the old p4, chances are it's not much data. But if you neglect this for a month or two and p4 fills up with a lot of data, then the REORGANIZE will take longer.
Best Answer
100k partitions is an awful lot. I would not try that without doing some serious testing first. Include in the testing things like your backup scripts, and pg_upgrade runs (you don't want to design a database schema that cannot be feasibly upgraded to the next major version). Or just don't do it.
You would probably get most of the advantages of partitions without resorting to one partition per case. You could use hash partitioning (new in v11) or range partitioning, for example, to have a couple hundred partitions with a few hundred cases per partition.
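As a sketch of the hash-partitioning option (PostgreSQL 11+), with illustrative table and column names: a fixed number of partitions is declared up front, and rows are routed by a hash of the key, so many cases share each partition.

```sql
-- Spread cases across a fixed set of partitions instead of
-- one partition per case. Names here are hypothetical.
CREATE TABLE case_data (
    case_id bigint NOT NULL,
    payload text
) PARTITION BY HASH (case_id);

-- Four partitions shown for brevity; the answer suggests a
-- couple hundred in practice.
CREATE TABLE case_data_p0 PARTITION OF case_data
    FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE case_data_p1 PARTITION OF case_data
    FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE case_data_p2 PARTITION OF case_data
    FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE case_data_p3 PARTITION OF case_data
    FOR VALUES WITH (MODULUS 4, REMAINDER 3);
```

With this layout, a query filtered on `case_id` is pruned to a single partition, while the partition count stays small enough not to strain backups or pg_upgrade.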
How does that get expressed? An app-generated IN list of literal case_id values? A join to a parent case table with a WHERE clause that identifies the dozen to 200 cases?