MySQL – InnoDB – Partitioning a table with uniqueness

innodbMySQLpartitioning

Following a previous post I posted: date-based-partitioning

This is my solution for partitioning a table with primary key.

I wanted to add partitioning by months and delete old record.

CREATE TABLE `filesoi` (
  `uniqueness` binary(32) NOT NULL DEFAULT '0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0',
  `sha256` binary(32) NOT NULL,
  `datein` timestamp NOT NULL DEFAULT current_timestamp(),
  `lastSeen` timestamp NOT NULL DEFAULT current_timestamp() ON UPDATE current_timestamp(),
  PRIMARY KEY (`uniqueness`),
  KEY `lastSeen` (`lastSeen`)
) ENGINE=InnoDB AUTO_INCREMENT=4102 DEFAULT CHARSET=utf8

Is this a good solution to create the table like so?:

CREATE TABLE `filesoi` (
  `uniqueness` binary(32) NOT NULL DEFAULT '0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0',
  `sha256` binary(32) NOT NULL,
  `datein` timestamp NOT NULL DEFAULT current_timestamp(),
  `lastSeen` timestamp NOT NULL DEFAULT current_timestamp() ON UPDATE current_timestamp(),
  `__month` tinyint(4) NOT NULL DEFAULT month(curdate()),
  PRIMARY KEY (`uniqueness`,`__month`),
  KEY `lastSeen` (`lastSeen`)
) ENGINE=InnoDB AUTO_INCREMENT=4102 DEFAULT CHARSET=utf8
 PARTITION BY RANGE  COLUMNS(`__month`)
(PARTITION `p0` VALUES LESS THAN (3) ENGINE = InnoDB,
 PARTITION `p1` VALUES LESS THAN (5) ENGINE = InnoDB,
 PARTITION `p2` VALUES LESS THAN (7) ENGINE = InnoDB,
 PARTITION `p3` VALUES LESS THAN (9) ENGINE = InnoDB,
 PARTITION `p4` VALUES LESS THAN (11) ENGINE = InnoDB,
 PARTITION `p5` VALUES LESS THAN (MAXVALUE) ENGINE = InnoDB)

This way I can still insert with Insert Ignore.. and the duplicates will be per partition.

Best Answer

In my answer to the other Question, I was assuming you wanted to add PARTITION BY RANGE (TO_DAYS(updated_at)) and then DROP PARTITION to rapidly delete "old" records.

The Question here talks about using PARTITION BY COLUMNS(_month). There is essentially no difference. However, what do you expect to happen after you finish with month #12? Move on to 13?

I don't see a way to get a one-step uniqueness check and have partitioning. I do provide some non-partition DELETE optimizations here . That will let you use a UNIQUE key and have some performance in the deletion process.

PRIMARY KEY (uniqueness, __month) provides a uniqueness constraint on the pair, but not uniqueness by itself. One failing of PARTITIONing is the inability to declare something unique across all partitions (unless the partition key is part of it).

If you ever need to search across a range of months, you can't do it across December to January.