Mysql – Saving data for a colossal maze

innodbMySQL

I want to build a labyrinth in which players can interact with others through leaving messages for one another, so in its most basic form each "tile" of the map is a forum thread or chat room.

However, I intend to have the maze continue to be generated for a very long time (the process is deterministic and will eventually terminate, but only if the 50% chance of each branch continuing fails for every single branch simultaneously. Interestingly a seed value of 42 results in a maze of size "3 tiles", so clearly 42 is not the answer here :D) and each room will have some data, such as which directions are open and which one points to the origin (can be useful for pathfinding later – find the path to the origin and they'll meet somewhere, thus you find the path between the two; this works because there are no loops, only branches).

Anyway, here's the table structure I have in mind for saving rooms:

CREATE TABLE `rooms` (
    `x` INT NOT NULL,
    `y` INT NOT NULL,
    PRIMARY KEY (`x`,`y`),
    `open_left` TINYINT(1) UNSIGNED NOT NULL,
    `open_right` TINYINT(1) UNSIGNED NOT NULL,
    `open_up` TINYINT(1) UNSIGNED NOT NULL,
    `open_down` TINYINT(1) UNSIGNED NOT NULL,
    `origin_direction` ENUM('isorigin','left','right','up','down') NOT NULL
) ENGINE=InnoDB;

Then, the chat messages can be saved like so:

CREATE TABLE `messages` (
    `id` BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    `x` INT NOT NULL,
    `y` INT NOT NULL,
    FOREIGN KEY (`x`,`y`) REFERENCES `rooms` (`x`,`y`)
        ON DELETE CASCADE ON UPDATE CASCADE,
    `timestamp` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    ...
) ENGINE=InnoDB;

This seems fine, but there's one problem I can forsee:

The maze will be absolutely massive if given enough time to be generated. Based on how it is generated, its size will be exponential and could easily reach millions, perhaps billions of rooms. And past experience has taught me, tables with millions of rows are trouble.

I imagine I will probably need to break the world down into chunks. Say, each chunk can be 1000×1000 tiles in size, giving a maximum capacity of 1,000,000 rooms (assuming somehow the maze generation fills every tile in the area). This seems simple enough, and I can procedurally create tables or even whole databases of the form zone_X_Y with X and Y being the respective coordinates of that sector.

I guess mostly this question is to voice my thought process and see if anyone has any suggestions to make for better alternatives. My DB experience is pretty much just picking up what I can as I go along and learning from past mistakes.

Best Answer

Use UNSIGNED to squeeze out an extra bit. A 3-byte MEDIUMINT UNSIGNED gives you 0..16M; a 4-byte INT UNSIGNED gives you 0..4 billion. Since we are talking about X and Y, what is the max width/height of the maze? 16M or 64K seems more realistic than 4B.

Consider using a TINYINT UNSIGNED or SET to handle the 4 "open" flags. That would take 1 byte instead of 4. Don't you want to go for 3 dimensions? Why not 4? Or string-theory with 10 or 11. (That would take 2 bytes for all the possible openings.)

Smaller --> faster, especially since this may become I/O bound.

Do you really need FOREIGN KEY and cascading delete?

I don't think MySQL needs for you to break it into 'chunks', but if you do, use suitably sized ints, such as the 2-byte SMALLINT UNSIGNED (0..65535).

Perhaps you are optimistic to think there will be more than 4 billion "messages"? Do you have a terabyte-sized disk?

I can't really advise on the structure, nor the indexes, without seeing the queries. I would "guess" that you always say WHERE x=... AND y=...; that is, specifying both?

Is someone (a human, that is) going to wander through this maze? Even at one 'step' per second, a human cannot realistically touch more than 10M rooms in a year. And such a person would probably need a padded cell after that.