Mysql – Redesign basic Mysql storage architecture

ArchitectureMySQLpartitioningsharding

We are a small Internet provider and we are storing all our clients traffic data in a MySQL database (MyISAM). We insert data every few minutes in three tables like this:

capture_table
----------
client_id
date
time
data1
...
datan
---------

The data is always queried by date.

After the tables were getting too big, we decided to use several tables to divide the data.
We created three PHP scripts that ran every few minutes dividing the original data into 3 tables (MyISAM) per month and creating an index table.

index_table
data_table_2014-01-1
data_table_2014-01-2
data_table_2014-01-3
...
...

Currently we have this structure and about 2.5 million of rows per table for our current number of clients, and growing (DB weights about 15GB).
The queries are done by joining the data from all the tables involved between two dates using UNION ALL, always searching for date and client. Example, search between 2014-01-01 and 2014-01-15:

(SELECT * FROM data_table_2014-01_1 WHERE client_id=2 AND date >= '2014-01-01' AND date <= '2014-01-15')
UNION ALL (SELECT * FROM data_table_2014-01_2 WHERE client_id=2 AND date >= '2014-01-01' AND date <= '2014-01-15')
UNION ALL (SELECT * FROM data_table_2014-01_3 WHERE client_id=2 AND date >= '2014-01-01' AND date <= '2014-01-15')

Is this the better approach? Could this be achieved in a better way? I was thinking of table partitioning for example…

Thanks a lot.

Best Answer

In my opinion, you should use an SSD if you are facing performance problems. Here is a 120 GB drive for approx. 64 Euro (< 100$ US). This will keep your application ticking over for a couple of years without any need for a redesign and performance should substantially improve for minimum expense.

Down the road, if you are ultimately deciding to move to a DW scenario, may I recommend PostgreSQL as your DW server over MySQL? MySQL is good for read-heavy web-facing apps, but not for DW type work. PostgreSQL has CHECK CONSTRAINTs, SET operators (INTERSECT, EXCEPT), CTEs (Common Table Expressions) and Windowing functions (all of which MySQL (incredibly) lacks).

Traditionally PostgreSQL was not strong on the replication, clustering and connection pooling front, but that has changed.