Your issues are probably less about performance, assuming reasonable equipment and a good indexing structure, and more about other impacts that you face.
Approach 1 means some additional programming complexity to deal with filtering rows in each table to restrict access to a single gameworld. It also means that if you need to restore a gameworld, you would have to (a) restore all gameworlds (since they are in the same database) or (b) restore the one database to a recovery database and then script the appropriate gameworld data from Recovery to your Gameworlds database.
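To make the filtering point concrete, here is a minimal sketch of Approach 1 using SQLite for portability (the `gameworld_id` tenant column and table names are hypothetical, not from your schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE players (
        player_id    INTEGER PRIMARY KEY,
        gameworld_id INTEGER NOT NULL,  -- tenant key: every row belongs to one gameworld
        name         TEXT NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO players (gameworld_id, name) VALUES (?, ?)",
    [(1, "alice"), (1, "bob"), (2, "carol")],
)

# Every query must restrict itself to one gameworld; forgetting this
# filter is exactly the extra programming burden (and data-leak risk)
# of the single-database approach.
rows = conn.execute(
    "SELECT name FROM players WHERE gameworld_id = ?", (1,)
).fetchall()
print([r[0] for r in rows])  # ['alice', 'bob']
```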
Approach 2 means that you need to be able to manage schema changes across many databases. This can be done by scripting as needed. It also makes it easier to choose different schemas for some gameworlds, but that adds management overhead. However, if a single gameworld crashes, you can restore the latest good backup(s) and get it going again.
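The "scripting as needed" part of Approach 2 can be as simple as replaying the same DDL against every database. A sketch with SQLite standing in for the separate gameworld databases (names are made up for illustration):

```python
import sqlite3

# One connection per gameworld database; in production these would be
# separate MySQL/Postgres databases reached over the network.
worlds = {name: sqlite3.connect(":memory:") for name in ("world1", "world2", "world3")}

for conn in worlds.values():
    conn.execute("CREATE TABLE players (player_id INTEGER PRIMARY KEY, name TEXT)")

# A schema migration is the same statement applied to every database.
migration = "ALTER TABLE players ADD COLUMN guild TEXT"
for name, conn in worlds.items():
    conn.execute(migration)

# Verify the new column exists in one of them.
cols = [row[1] for row in worlds["world2"].execute("PRAGMA table_info(players)")]
print(cols)  # ['player_id', 'name', 'guild']
```

In practice you would also record which migration each database has reached, so a half-finished run can be resumed.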
Some years ago some Microsoft folks wrote up their take on Multi-Tenanted databases. See: http://msdn.microsoft.com/en-us/library/aa479086.aspx
Having more tables should not, in itself, cause performance problems within the size range you describe.
Choosing your approach should take into account leveraging your strengths so that you have a system you (and your project members) can readily manage.
If a session can only belong to a single user at any given time, the more appropriate model would be without a join table (`user_sessions`). What's the problem with the column `user_id` being `NULL`?

If you need it to be `NOT NULL` and you don't always know the user at creation time, just insert a user "unknown" (with `user_id` `0` in my example). It can serve as the `DEFAULT` value, too ...
```sql
CREATE TABLE sessions (
  session_id   serial PRIMARY KEY
, user_id      int NOT NULL DEFAULT 0 REFERENCES users(user_id)
, session_date timestamp
...
);
```
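The effect of that default, sketched with SQLite for a runnable illustration (`serial` becomes `INTEGER PRIMARY KEY` there, and foreign keys are enforced only when the pragma is on):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY, user_name TEXT)")
conn.execute("INSERT INTO users VALUES (0, 'unknown')")  # the placeholder user

conn.execute("""
    CREATE TABLE sessions (
        session_id   INTEGER PRIMARY KEY,
        user_id      INTEGER NOT NULL DEFAULT 0 REFERENCES users(user_id),
        session_date TEXT
    )
""")

# User not known yet at creation time: the DEFAULT kicks in instead of NULL,
# and the FK to the 'unknown' user still holds.
conn.execute("INSERT INTO sessions (session_date) VALUES ('2024-01-01')")
(user_id,) = conn.execute("SELECT user_id FROM sessions").fetchone()
print(user_id)  # 0
```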
Note on indexing strategy
Postgres can combine multiple indices with excellent speed. Typically, your original scenario would be executed with a nested loop join (quoting the Postgres docs):

> **nested loop join**: The right relation is scanned once for every row found in the left relation. This strategy is easy to implement but can be very time consuming. (However, if the right relation can be scanned with an index scan, this can be a good strategy. It is possible to use values from the current row of the left relation as keys for the index scan of the right.)
Say, you have these indices:
- `sessions(session_id)`
- `user_sessions(session_id, user_id)`

Both should be covered automatically by the PK constraints in your original model. However, the second one could go either way, `(user_id, session_id)` or `(session_id, user_id)`, and that matters:
Is a composite index also good for queries on the first field?
For best performance in a query like:
```sql
SELECT u.user_name, s.session_date
FROM   users u
JOIN   user_sessions us USING (user_id)
JOIN   sessions s USING (session_id)
WHERE  u.user_name = 'foo';
```
You would need the index (or PK constraint) to be `user_sessions(user_id, session_id)`, or have an additional index with `user_id` as the first column.
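The same column-order rule can be exercised with SQLite (plan details differ from Postgres, but leading with `user_id` is what lets the join table be probed from the user side):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, user_name TEXT);
    CREATE TABLE sessions (session_id INTEGER PRIMARY KEY, session_date TEXT);
    CREATE TABLE user_sessions (user_id INTEGER, session_id INTEGER);

    -- user_id first: usable when the lookup starts from a known user
    CREATE INDEX us_user_first ON user_sessions (user_id, session_id);

    INSERT INTO users VALUES (1, 'foo'), (2, 'bar');
    INSERT INTO sessions VALUES (10, '2024-01-01'), (11, '2024-01-02');
    INSERT INTO user_sessions VALUES (1, 10), (2, 11);
""")

rows = conn.execute("""
    SELECT u.user_name, s.session_date
    FROM users u
    JOIN user_sessions us USING (user_id)
    JOIN sessions s USING (session_id)
    WHERE u.user_name = 'foo'
""").fetchall()
print(rows)  # [('foo', '2024-01-01')]
```

With the index leading on `session_id` instead, the planner would have to scan `user_sessions` for a query like this.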
Best Answer
Relational databases in general, and MySQL in particular, are designed with a static structure in mind. Let me elaborate a bit on what static entails (as it doesn't necessarily mean immutable):
Only when requirements change (be they functional requirements or things like performance improvements) will you perform an application "migration" to adapt the logic and data, and add, delete, or `ALTER` existing tables. If your table structure changes frequently (compared to the number of read and write operations), that is normally a sign of bad design. In particular, new users or application state (sessions) should not create new sets of tables, but add rows to existing ones.
As an exception, schema-based multi-tenancy is a pattern that would fit the "creation of a new database" for each "application user group", but normally the scope of that is new organizations sharing the same software, or separate communities that require separate configuration, usually with things like security or cost in mind. For sessions, table-based multi-tenancy is probably the most common pattern.
Dynamically creating and deleting tables usually has a high cost compared to inserting and deleting rows, especially on MySQL (creation of new tablespaces, invalidation of large regions of memory, extra metadata locking, etc.).
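To make the contrast concrete: a new session should be a single `INSERT` against a table created once at migration time, never a `CREATE TABLE`. A sketch with SQLite (table and column names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The schema is created once, at deploy/migration time ...
conn.execute("""
    CREATE TABLE sessions (
        session_id INTEGER PRIMARY KEY,
        user_id    INTEGER NOT NULL,
        started_at TEXT NOT NULL
    )
""")

# ... while new application state is just rows, however many users show up.
for user_id in range(1, 1001):
    conn.execute(
        "INSERT INTO sessions (user_id, started_at) VALUES (?, datetime('now'))",
        (user_id,),
    )

(count,) = conn.execute("SELECT count(*) FROM sessions").fetchone()
print(count)  # 1000
```

A thousand sessions is a thousand cheap row inserts; a thousand `CREATE TABLE` statements would pay the metadata and tablespace costs described above each time.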