MySQL – How to optimize a log process in MySQL

MySQL, optimization

In my project, I have about 100,000 users and can't control their behavior. I would like to log their activity in a certain task. Each activity is one record with columns such as user_id and a few tag_id values.

The problem is that, in some cases, these activities can reach 1,000,000 per user per year. If I stored all of them in one table, it would obviously become very big (and therefore slow).

What is the best approach here? Should I create one table per user (so 100,000 log tables) or put all activities in one table? And which storage engine should I use?

One important thing to note: I have simplified the situation a bit, so the following may not look normal for a log, but users can also change values in these tables (such as the tag_id values).

Best Answer

Some simple arithmetic:

100k users * 1M events per user per year * 100 bytes per event ≈ 10 TB per year
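The same estimate can be checked directly in MySQL. This is only a rough sketch: the 100 bytes per event is an assumed average row size, not something measured on a real table, and index overhead is ignored.

```sql
-- Back-of-the-envelope storage estimate.
-- 100 bytes per event is an assumed average row size (no indexes included).
SELECT
    100000 * 1000000                      AS events_per_year,
    100000 * 1000000 * 100                AS bytes_per_year,
    100000 * 1000000 * 100 / POW(1000, 4) AS tb_per_year;  -- decimal terabytes
```

The product is 10^13 bytes, which is where the figure of roughly 10 TB per year comes from. Real rows with indexes would likely take more space than this.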

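In MySQL, the maximum size of a single table depends on the storage engine and on the file system's maximum file size. The official documentation lists about 4TB for some setups (for example, ext3 on Linux), while InnoDB with the default 16KB page size allows a tablespace of up to 64TB. At roughly 10 TB per year, either limit is reached quickly.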

In other words, you would effectively be forced to split or partition your huge table into a set of smaller ones.
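As a minimal sketch of what such a split could look like, MySQL's built-in partitioning can spread one logical table over many physical pieces instead of maintaining 100,000 separate tables by hand. The table name, column names, and the partition count of 64 below are all hypothetical, and InnoDB is assumed here because rows may be updated after they are written.

```sql
-- Hypothetical schema: names and the partition count are illustrative only.
CREATE TABLE activity_log (
    id         BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    user_id    INT UNSIGNED    NOT NULL,
    tag_id     INT UNSIGNED    NOT NULL,
    created_at DATETIME        NOT NULL,
    -- Every unique key on a partitioned table must include the partitioning column.
    PRIMARY KEY (user_id, id),
    -- InnoDB needs the AUTO_INCREMENT column to lead some index.
    KEY idx_id (id)
) ENGINE=InnoDB
PARTITION BY HASH (user_id)
PARTITIONS 64;

-- Reads and updates that filter on user_id only touch that user's partition.
SELECT id, tag_id, created_at
FROM activity_log
WHERE user_id = 42;

UPDATE activity_log
SET tag_id = 7
WHERE user_id = 42 AND id = 1001;
```

Hashing on user_id keeps all of one user's rows in the same partition, so per-user queries avoid scanning the rest. Partitioning by a date range on created_at would be another reasonable choice if old data needs to be archived or dropped in bulk. Note that all partitions still live on one server, so at roughly 10 TB per year this may eventually need to be combined with splitting the data across several servers.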

Alternatively, you can use a NoSQL system such as Hadoop or MongoDB, but that can be painful if you are used to doing everything in SQL.