IMHO, you do not need to physically split it up. Still, it would be nice to cache it.
If the users table uses the MyISAM storage engine, you have a nice advantage.
Since the MyISAM key cache holds only index blocks (never table data), you could do two things:
- You could create a custom key cache dedicated to the MyISAM indexes of the users table
- You could index the username and password so that the query is served from that custom key cache only
Make sure the following indexes exist for users
ALTER TABLE users ADD UNIQUE INDEX username_ndx (username);
ALTER TABLE users ADD UNIQUE INDEX username_password_ndx (username,password);
There are two major reasons for the two indexes.
REASON for index #1
The index username_ndx prevents a username from having multiple passwords, and also prevents multiple users from having the same name.
REASON for index #2
The index username_password_ndx provides a covering index. Thus, your query will look up the username and password in the custom MyISAM key cache only, instead of reading the table.
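One way to check that the second index really covers the login lookup is to ask MySQL for the query plan. This is only a sketch: the shape of the login query, the literal values, and the schema name mydb are assumptions for illustration.

```sql
-- Hypothetical login lookup; 'alice' and the hash are placeholder values.
EXPLAIN
SELECT username, password
FROM mydb.users
WHERE username = 'alice'
  AND password = 'placeholder-hash';
```

If the Extra column of the output contains "Using index", the query is being answered from the index alone, without reading the table rows. If the key column shows username_ndx instead, the optimizer picked the single-column index, and adding FORCE INDEX (username_password_ndx) after the table name is one way to compare the two plans.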
The next step is to actually create that custom key cache. Here are the commands to create an 8MB key cache and load the table's indexes into it (example: the table is mydb.users):
SET GLOBAL authentication_cache.key_buffer_size = 1024 * 1024 * 8;
CACHE INDEX mydb.users IN authentication_cache;
LOAD INDEX INTO CACHE mydb.users;
You should place these three lines in the file /var/lib/mysql/startup.sql
Add this to /etc/my.cnf
[mysqld]
init-file=/var/lib/mysql/startup.sql
This will load the cache every time mysqld starts up.
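After a restart, it may be worth confirming that the named cache actually exists. A minimal check, assuming the cache name authentication_cache used above:

```sql
-- Read back the size of the named key cache.
-- With the settings above this should report 8388608 (1024 * 1024 * 8).
SELECT @@GLOBAL.authentication_cache.key_buffer_size;
```

If a different value comes back, the init file was probably not executed, so check the MySQL error log for problems reading startup.sql.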
Give it a try!
UPDATE 2011-12-30 17:25 EDT
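If you would like to get the exact size to set the cache, use the following query (mydb is the example database name; rounding up keeps the cache from coming out smaller than the index):
SELECT CONCAT('1024 * 1024 * ',CEILING(index_length/POWER(1024,2))) RecommendedCacheSize
FROM information_schema.tables WHERE table_schema='mydb' AND table_name='users';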
UPDATE 2011-12-30 23:21 EDT
Here is a method based on InnoDB
You still need the indexes
ALTER TABLE users ADD UNIQUE INDEX username_ndx (username);
ALTER TABLE users ADD UNIQUE INDEX username_password_ndx (username,password);
You have to make sure the InnoDB Buffer Pool has the usernames and passwords available. You may have to resort to doing a full index scan upon mysql startup:
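Step 1) Create ReadUserPass.sql. Qualify the table with its database name (mydb in this example), because the init file does not run with a default database selected:
echo "SELECT username,password FROM mydb.users;" > /var/lib/mysql/ReadUserPass.sql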
Step 2) Add that script to /etc/my.cnf
[mysqld]
init-file=/var/lib/mysql/ReadUserPass.sql
Step 3) Perform one of the following
$ service mysql restart
mysql> source /var/lib/mysql/ReadUserPass.sql
Because both of these columns (username and password) reside in the username_password_ndx index, all the index pages making up this index are loaded into the InnoDB Buffer Pool. This is necessary because there is the possibility of the index pages being evicted over time. To minimize the chance of that happening, increase the Buffer Pool size and restart mysql (one time).
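To judge whether the Buffer Pool is large enough, you can compare its current size with the footprint of the table. A rough sketch, assuming the table lives in the example schema mydb; the sizes reported by information_schema are estimates for InnoDB:

```sql
-- Current buffer pool size in MB.
SELECT @@GLOBAL.innodb_buffer_pool_size / POWER(1024, 2) AS buffer_pool_mb;

-- Approximate data and index footprint of the table in MB
-- (mydb is an example schema name, not taken from your setup).
SELECT ROUND(data_length / POWER(1024, 2), 1)  AS data_mb,
       ROUND(index_length / POWER(1024, 2), 1) AS index_mb
FROM information_schema.tables
WHERE table_schema = 'mydb'
  AND table_name = 'users';
```

If index_mb is a small fraction of buffer_pool_mb, the index pages have a good chance of staying resident between logins.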
The first path is certainly the more conventional of the two, absent a compelling reason to break the system down into multiple schemas. Something along these lines:
CREATE TABLE user (
    user_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    user_name VARCHAR(50) NOT NULL,
    password VARCHAR(32) NOT NULL,
    status_code INT NOT NULL
) ENGINE=InnoDB;
CREATE TABLE user_file (
    file_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    user_id INT NOT NULL,
    path VARCHAR(2000) NOT NULL,
    INDEX (user_id),
    CONSTRAINT FOREIGN KEY (user_id)
        REFERENCES user(user_id)
) ENGINE=InnoDB;
CREATE TABLE user_settings (
    user_id INT NOT NULL PRIMARY KEY,
    email VARCHAR(150),
    color VARCHAR(10),
    etc VARCHAR(255),
    CONSTRAINT FOREIGN KEY (user_id)
        REFERENCES user(user_id)
) ENGINE=InnoDB;
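With a layout like this, everything belonging to one user comes back from a single query rather than from a per-user table or schema. A sketch of such a lookup; the user name 'alice' is a placeholder:

```sql
-- List the settings and files for one user; 'alice' is a placeholder value.
SELECT u.user_id, s.email, f.path
FROM user AS u
LEFT JOIN user_settings AS s ON s.user_id = u.user_id
LEFT JOIN user_file AS f ON f.user_id = u.user_id
WHERE u.user_name = 'alice';
```

The LEFT JOINs keep the user in the result even if no settings or files exist yet. An index on user_name (not part of the definitions above) would likely be wanted if lookups by name are frequent.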
Best Answer
Check this out, and links within (1, 2). InnoDB has a limit of 4 billion tables. There is no compelling reason to use the file system to store data that is better stored in a database.
Take a look here to see that virtually every form of social networking software uses an underlying database, normally MySQL or PostgreSQL (which I would recommend). 2 million rows is very small in today's database terms, so there is no need to have 40,000 tables (1 for each user) either, although you may need to look at some form of partitioning - take a look at the answer here.
Anyway, with 40,000 users, shouldn't you perhaps be thinking of using more than 1 machine? One guy here tried putting > 32,000 schemas onto a MySQL instance - and couldn't (follow links there also). With 40K users, put 20K in a separate schema on each server if you want to - see here also. Implicit assumption - not all of your schemas will be very active simultaneously.
Databases are designed for searching and sorting and aggregating data - file systems aren't - are you going to grep huge files? Not that it's impossible, but your time is probably better used doing other things.
Final suggestion, download some of the Open Source stuff out there and check to see what they've done - keep what you like and throw away the rest.
[EDIT in response to OP's questions]
You asked:
I would still hold them in the database - you never know when you're going to need them - file systems aren't generally designed for lots and lots of small files. Plus, in a file system scenario, how do you store who is replying to whom? By having yet more files? Or by putting in a header? Uggghh! A BLOB for the message with fields for replying_to_message_ID &c. - much easier!
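To make the "BLOB plus a replying-to field" idea concrete, here is one possible shape for such a table. It is only a sketch: the table name and every column except the reply pointer are assumptions, not something taken from your application.

```sql
-- Hypothetical table; all names other than the reply pointer are assumptions.
CREATE TABLE message (
    message_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    user_id INT NOT NULL,                 -- author of the message
    replying_to_message_id INT NULL,      -- NULL for a top-level message
    body BLOB NOT NULL,
    created_at DATETIME NOT NULL,
    INDEX (user_id),
    INDEX (replying_to_message_id),
    CONSTRAINT FOREIGN KEY (replying_to_message_id)
        REFERENCES message(message_id)
) ENGINE=InnoDB;

-- All direct replies to message 42 (placeholder id).
SELECT message_id, user_id, created_at
FROM message
WHERE replying_to_message_id = 42;
```

The self-referencing foreign key records who is replying to whom, which is exactly the part that gets awkward with one file per message.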
You also asked:
See the post by Bill Karwin here. Bill Karwin is a training manager with Percona. Anybody with Percona is likely to know a thing or two about MySQL.
People argue about the performance of MyISAM vs. InnoDB - Bill Karwin again. MyISAM is old and will soon be deprecated. Don't use it unless you have a compelling reason to do so (my guess is that you won't).