Your issues are probably less about performance, assuming reasonable equipment and a good indexing structure, and more about other impacts that you face.
Approach 1 means some additional programming complexity to deal with filtering rows in each table to restrict access to a single gameworld. It also means that if you need to restore a gameworld, you would have to (a) restore all gameworlds (since they are in the same database) or (b) restore the one database to a recovery database and then script the appropriate gameworld data from Recovery to your Gameworlds database.
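To make the filtering point concrete, here is a minimal sketch of Approach 1 using SQLite for portability (the `gameworld_id` tenant column and table names are hypothetical, not from your schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE players (
        player_id    INTEGER PRIMARY KEY,
        gameworld_id INTEGER NOT NULL,  -- tenant key: every row belongs to one gameworld
        name         TEXT NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO players (gameworld_id, name) VALUES (?, ?)",
    [(1, "alice"), (1, "bob"), (2, "carol")],
)

# Every query must restrict itself to one gameworld; forgetting this
# filter is exactly the extra programming burden (and data-leak risk)
# of the single-database approach.
rows = conn.execute(
    "SELECT name FROM players WHERE gameworld_id = ?", (1,)
).fetchall()
print([r[0] for r in rows])  # ['alice', 'bob']
```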
Approach 2 means that you need to be able to manage schema changes across many databases. This can be done by scripting as needed. It also makes it easier to choose different schemas for some gameworlds, but that adds management overhead. However, if a single gameworld crashes, you can restore the latest good backup(s) and get it going again.
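The "scripting as needed" part of Approach 2 can be as simple as replaying the same DDL against every database. A sketch with SQLite standing in for the separate gameworld databases (names are made up for illustration):

```python
import sqlite3

# One connection per gameworld database; in production these would be
# separate MySQL/Postgres databases reached over the network.
worlds = {name: sqlite3.connect(":memory:") for name in ("world1", "world2", "world3")}

for conn in worlds.values():
    conn.execute("CREATE TABLE players (player_id INTEGER PRIMARY KEY, name TEXT)")

# A schema migration is the same statement applied to every database.
migration = "ALTER TABLE players ADD COLUMN guild TEXT"
for name, conn in worlds.items():
    conn.execute(migration)

# Verify the new column exists in one of them.
cols = [row[1] for row in worlds["world2"].execute("PRAGMA table_info(players)")]
print(cols)  # ['player_id', 'name', 'guild']
```

In practice you would also record which migration each database has reached, so a half-finished run can be resumed.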
Some years ago some Microsoft folks wrote up their take on Multi-Tenanted databases. See: http://msdn.microsoft.com/en-us/library/aa479086.aspx
Having more tables should not, in itself, cause performance problems within the size range you describe.
Choosing your approach should take into account leveraging your strengths so that you have a system you (and your project members) can readily manage.
If a session can only belong to a single user at any given time, the more appropriate model would be without a join table (`user_sessions`). What's the problem with the column `user_id` being `NULL`?

If you need it to be `NOT NULL` and you don't always know the user at creation time, just insert a user "unknown" (with `user_id` `0` in my example). It can serve as the `DEFAULT` value, too ...
```sql
CREATE TABLE sessions (
  session_id   serial PRIMARY KEY
, user_id      int NOT NULL DEFAULT 0 REFERENCES users(user_id)
, session_date timestamp
...
);
```
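The effect of that default, sketched with SQLite for a runnable illustration (`serial` becomes `INTEGER PRIMARY KEY` there, and foreign keys are enforced only when the pragma is on):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY, user_name TEXT)")
conn.execute("INSERT INTO users VALUES (0, 'unknown')")  # the placeholder user

conn.execute("""
    CREATE TABLE sessions (
        session_id   INTEGER PRIMARY KEY,
        user_id      INTEGER NOT NULL DEFAULT 0 REFERENCES users(user_id),
        session_date TEXT
    )
""")

# User not known yet at creation time: the DEFAULT kicks in instead of NULL,
# and the FK to the 'unknown' user still holds.
conn.execute("INSERT INTO sessions (session_date) VALUES ('2024-01-01')")
(user_id,) = conn.execute("SELECT user_id FROM sessions").fetchone()
print(user_id)  # 0
```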
Note on indexing strategy
Postgres can combine multiple indices with excellent speed. Typically, your original scenario would be executed with a nested loop join (quoting the Postgres docs):

> **nested loop join**: The right relation is scanned once for every row found in the left relation. This strategy is easy to implement but can be very time consuming. (However, if the right relation can be scanned with an index scan, this can be a good strategy. It is possible to use values from the current row of the left relation as keys for the index scan of the right.)
Say, you have these indices:
- `sessions(session_id)`
- `user_sessions(session_id, user_id)`

Both should be covered automatically by the PK constraints in your original model. However, the second one could go either way, `(user_id, session_id)` or `(session_id, user_id)`, and that matters:
Is a composite index also good for queries on the first field?
For best performance in a query like:
```sql
SELECT u.user_name, s.session_date
FROM   users u
JOIN   user_sessions us USING (user_id)
JOIN   sessions s USING (session_id)
WHERE  u.user_name = 'foo';
```
You would need the index (or PK constraint) to be `user_sessions(user_id, session_id)`, or have an additional index with `user_id` as the first column.
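The same column-order rule can be exercised with SQLite (plan details differ from Postgres, but leading with `user_id` is what lets the join table be probed from the user side):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, user_name TEXT);
    CREATE TABLE sessions (session_id INTEGER PRIMARY KEY, session_date TEXT);
    CREATE TABLE user_sessions (user_id INTEGER, session_id INTEGER);

    -- user_id first: usable when the lookup starts from a known user
    CREATE INDEX us_user_first ON user_sessions (user_id, session_id);

    INSERT INTO users VALUES (1, 'foo'), (2, 'bar');
    INSERT INTO sessions VALUES (10, '2024-01-01'), (11, '2024-01-02');
    INSERT INTO user_sessions VALUES (1, 10), (2, 11);
""")

rows = conn.execute("""
    SELECT u.user_name, s.session_date
    FROM users u
    JOIN user_sessions us USING (user_id)
    JOIN sessions s USING (session_id)
    WHERE u.user_name = 'foo'
""").fetchall()
print(rows)  # [('foo', '2024-01-01')]
```

With the index leading on `session_id` instead, the planner would have to scan `user_sessions` for a query like this.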
Best Answer
Relational databases in general, and MySQL in particular, are designed with a static structure in mind. Let me elaborate a bit on what static entails (as it doesn't necessarily mean immutable):
Only when requirements change (be they functional requirements or things like performance improvements) will you perform an application "migration" to adapt the logic and data, and add, delete, or `ALTER` existing tables. If your table structure changes frequently (compared to the number of read and write operations), that is normally a sign of bad design. In particular, new users or application state (sessions) should not create new sets of tables, but add rows to existing ones.
As an exception, schema-based multi-tenancy is a pattern that would fit the "creation of a new database" for each "application user group", but normally the scope of that is new organizations sharing the same software, or separate communities that require separate configuration, usually with things like security or cost in mind. For sessions, table-based multi-tenancy is probably the most common pattern.
Dynamically creating and deleting tables usually has a high cost compared to inserting and deleting rows, especially on MySQL (creation of new tablespaces, invalidation of large regions of memory, extra metadata locking, etc.).
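To make the contrast concrete: a new session should be a single `INSERT` against a table created once at migration time, never a `CREATE TABLE`. A sketch with SQLite (table and column names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The schema is created once, at deploy/migration time ...
conn.execute("""
    CREATE TABLE sessions (
        session_id INTEGER PRIMARY KEY,
        user_id    INTEGER NOT NULL,
        started_at TEXT NOT NULL
    )
""")

# ... while new application state is just rows, however many users show up.
for user_id in range(1, 1001):
    conn.execute(
        "INSERT INTO sessions (user_id, started_at) VALUES (?, datetime('now'))",
        (user_id,),
    )

(count,) = conn.execute("SELECT count(*) FROM sessions").fetchone()
print(count)  # 1000
```

A thousand sessions is a thousand cheap row inserts; a thousand `CREATE TABLE` statements would pay the metadata and tablespace costs described above each time.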