Mysql – Best architecture solution for heavy data mining system, that can run on cloud

clusteringMySQL

We are currently hosting our MySQL database on a dedicated server with lots of memory and processors. We would like to push it to a different architecture that would allow us to scale it easily, specially on the part of computing. The service that we are currently testing analyses mobile data. We would like to be able to use cloud system to perform all the computations depending on the current need. Can you point us in a right direction in terms of system architecture. What kind of server scenario would be best for such tasks.

Best Answer

The question is quite open to discussion with such few details. What kind of calculations are performed with what density? What kind of an application server are you using? What amount and what kind of data comes in and goes out? What's the acceptable latency between the application server and the database / clients and application server? Thus, are there any requirements to keep the application server in-house / will it need to be clustered? Will there be any business logic implemented in the database as stored procedures and stored functions? If so, will it be in the native SQL dialect, or will it be in any external language like C/C++/Python/Java etc?

I won't be able to single out and suggest a provider or strategy, due to the lack of details and due to the fact that (for the last few years) I'm generally managing DB servers deployed in-house or on dedicated servers on the cloud. But first of all, I strongly suggest you phase out MySQL.

Having the power of Oracle or the ease of use of MSSQL would be nice, of course, but I won't push such a proprietary system with (possibly) high cost of licencing.

Instead I strongly suggest moving towards PostgreSQL. Especially for stability and scalability issues. But the strategy to pick (will it be rented service, a platform or rented dedicated servers? Will there be a cluster, and if yes, which kind? What should be the population of the cluster, and what kind of network topology should be implemented) is all up to the details. It's kind of hard to suggest more than saying "build a cluster of PostgreSQL DB servers, consisting of a couple of dedicated hardware, within the compounds of a trusted infrastructure provider" with so few details. And maybe: "Keep an acceptable amount of business logic in stored functions and procedures if the application server will be residing in-house and the network latency between it and the DB cluster will be an issue." Maybe you need more, or maybe you don't even need all that hassle.

And if by an unlucky decision, you end up sticking with MySQL, do not even think about using MyISAM.