Optimal Solution for Storing Real-Time Time-Series in MariaDB/MySQL

mariadbMySQLpercona

Use case: A measurement creates a given number of images. For each image we need to store a small set of quality indicators (floats, doubles) along with an image integer [1 …N], a timestamp and one or two foreign key values. This should then be plotted in "real time" in a web application (PHP) for users to evaluate.

Each web client polls the database every 5s. Storage + retrieval of each set of quality indicators should ideally take < 2s (approximately). In a worst case scenario there can be ~30 simultaneous web clients polling and around 10 measurements could be writing simultaneously, leading to write bursts of approx. 1000 sets of quality indicators per second.

In a programming language, this sort of data would probably be stored in arrays or lists. As I'm not aware of anything similar in the MariaDB / MySQL world I'm just using a regular InnoDB table with a column for each of the values mentioned above. This already has 90+ million rows and is expected to grow faster in the coming months.

Is InnoDB overall the best storage engine for this, or should I consider others? Is it best practice to archive data after a while, perhaps once all the measurements' images have been processed? Would it help to enable compression, or would that have very negative impacts on performance?

Best Answer

With just MySQL/MariaDB, I would employ:

Your last paragraph throws in the kitchen sink of questions (Toku, MyRocks, archive, compression, history table). I'm surprised the posting has not be killed for being "too broad". Please elaborate on what your data and queries are like; otherwise, all we can do is throw a kitchen sink full of solutions.

You say "realtime", yet you need "thousands/sec". Can you allow 1 minute delay in realtime? 1 sec? You cannot get 1ms; 1s will be hard to achieve. How long does a burst last? What is a per-minute burst? 1K/sec will probably spill into the next couple of seconds. 6K/minute is not much trouble.

How many clients are storing data? Some solutions work well with a single client; different solutions are needed for multiple clients.

Keep in mind that benchmarks are tuned to show off one thing, and rarely match real life.