Best Database specifications for the IOT solution

database-designqueuestored-procedures

i'm working on a IOT environment that basically is a sensor that can measure more then 300 times a second. We receive this measurements remotely using MQTT protocol. I want to set up a painel where i can watch in real time the measures my sensor is doing, i can do this with MQTT with no problem at all. But someone have ever done this using some database function? I know there are something called "Queue watch" for database that can do something similar.

The problem starts when i need to store those measurements, because i need to do analysis on them, draw graphics, and things like that.

As i'm not familiar with databases, there is a lot of data to store ( i imagine ) what would be the best database environment for this kind of thing?


—-UPDATE1 —–

How many do you need to store? Last hour? Last day? Last decade? – Basil Bourque

As it's a service for clients, i would say that the lest month of measurements would be "hot data", i want to make them easily accessible and as fast as possible and anything older then that i can store as "cold data" just for eventual analysis, or any request for old measurements from clients.


Best Answer

You have two question, let's start with the easiest.

Math: rows / hours is: 300 Hz * 3600 s / hour = 1.08 Million rows

Which database can handle ~1 Million rows / hour?

Just about all of them.

You just need to use the correct technique in order to achieve those numbers. Single row insert then commit can potential kill performance.

This blog describes the different Oracle method to insert 30 seconds worth of data (10000 rows). The fastest method is done in 6ms.

https://geraldonit.com/2012/03/31/loading-data-fast-regular-insert-vs-bulk-insert/

MySQL can insert 3 seconds worth of data (1000 rows) in 10ms.

http://brian.pontarelli.com/2011/06/21/jdbc-batch-vs-multi-row-inserts/

Don't pick your RDBMS based off of those posts. The point of those blogs is to show you that a performance difference exists based on HOW you insert the data.

What you really need to look for in an RDBMS is to see if it supports partitioning (and still fits in your budget ).

Partitioning allows you to break up the data into smaller chunks. You can then toss those smaller chunks away ( truncate ) in milliseconds when you no longer need them.

Can I use a Database Queue instead of MQTT

You could try.

The only one I'm familiar with uses actual tables under-the-hood. That means you'll need to commit after every row is added in order for you clients to see the data. As mentioned before, this could be bad for performance.

Personally, I'd stick with the MQTT method for the Live Stream of data. You got it to work. No sense in trying other methods.

You'll need to write a subscriber to the MQTT so that it can buffer the inserts using one of the methods mentioned in the blogs.