Highly Concurrent Storage System

Tags: architecture, concurrency, nosql, storage

Imagine the requirement: you have 3 huge tables of structured data with, say, 30 billion rows in each (total size of 4 TB), and many concurrent users (parallel OS threads on remote LAN machines) need to read portions of the data through SELECT ... WHERE ... GROUP BY queries at high concurrency, say 10,000 concurrent reads at the same time. Users also need to insert (no update) data into these tables at high concurrency, say 2,000 concurrent writers, all over the data center LAN. The users want to read and insert as fast as possible from this storage, where each read and write should complete in the ms to 1 second range.

What technologies do you recommend to satisfy such requirement? Is there any data storage or key value store that could do this? Cloud is NOT an option.

Some Clarifications:

The users do NOT have to see the data right away and eventual consistency is acceptable.
The data is accessed through whatever driver the storage can provide, and the users are again just threads running on remote machines in the data center. The queries are mostly of the SELECT ... WHERE ... GROUP BY form.

The data is in tabular format and each row is about 60 bytes.

Since cloud is not an option, I cannot use DynamoDB or similar solutions. I have to be able to host it internally in the data center.

All of the data in the tables can be read at any time, and the usage pattern is unpredictable. There are no joins or very long queries. No DR is required, but reasonable HA is, though it does not have to be fancy. Every reader gets a batch of rows based on its WHERE clause, and the rows are not really related to each other. We could probably use a fixed length for each row, but I am hoping the storage layer will worry about that.

Also, my biggest concern is all those concurrent writes happening alongside the concurrent reads.

Your insights into this are highly appreciated.

One more thing: I have three of those tables, each with 30 billion rows, holding different object types.

Best Answer

If eventual consistency is acceptable and all your queries are aggregates, then a low-latency OLAP system might work for you. Your requirement sounds a bit like an algorithmic trading platform. This type of architecture is often used in trading-floor systems that need to run aggregate statistical computations over up-to-date data.

If you can partition your data by date and old rows don't get updated, then you can build a hybrid OLAP system using a conventional OLAP server such as Microsoft SQL Server Analysis Services backed by an ordinary RDBMS platform. It should be possible to make this cope with ~4 TB of data, and both SQL Server and SSAS support shared-disk clusters. Similar OLAP systems (e.g. Oracle/Hyperion Essbase) are available from other vendors.

OLAP servers work by persisting data in a native store, along with aggregates. Most support partitioned data. In addition, most will also work in a ROLAP mode, where they issue queries against the underlying database. The important thing to note is that the storage strategy can be managed on a per-partition basis, and you can switch a partition from one mode to the other programmatically.

In this model, historical data is stored in MOLAP partitions with aggregates of the data also persisted. If a query can be satisfied from the aggregates then the server will use them. Aggregates can be tuned to suit the queries, and correct aggregates will dramatically reduce the amount of computation needed to resolve the query. Very responsive aggregate queries are possible with this type of system.
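To make the aggregate idea concrete, here is a toy, in-memory Python sketch (not SSAS, all names invented for illustration): the aggregates are computed once up front, and the GROUP BY-style query is then answered from the small aggregate table instead of rescanning the base rows.

```python
from collections import defaultdict

# Toy base data: (date_partition, group_key, value) rows.
rows = [
    ("2023-01", "A", 10), ("2023-01", "A", 5),
    ("2023-01", "B", 7),  ("2023-02", "A", 3),
]

# Precompute the aggregate: SUM(value) grouped by (partition, key).
# An OLAP server persists structures like this alongside the raw data.
agg = defaultdict(int)
for part, key, val in rows:
    agg[(part, key)] += val

def query_sum(key):
    """Answer SELECT SUM(value) ... WHERE group_key = key from the
    aggregate table, never touching the base rows."""
    return sum(v for (part, k), v in agg.items() if k == key)

print(query_sum("A"))  # 18
```

The saving is proportional to the ratio of base rows to aggregate cells; with 30 billion rows collapsing into a few million aggregate cells, that ratio is what makes sub-second responses plausible.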

Real-time data can be handled by maintaining a small leading partition covering the current month, day, or even hour if necessary. The OLAP server issues queries for this partition against the database; if the partition is small enough, the DBMS will be able to respond quickly. A regular process creates new leading partitions and converts closed historical periods to MOLAP. Older partitions can be merged, allowing the historical data to be managed at whatever grain is desired.
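That partition lifecycle can be sketched in a few lines of Python. This is illustrative only (the class and method names are invented, not an actual SSAS/AMO API): a small ROLAP leading partition takes live writes, and a periodic job closes it into MOLAP and opens a fresh one.

```python
class PartitionSet:
    """Models one cube's partitions: a single ROLAP leading partition
    plus a list of closed, MOLAP-processed historical partitions."""

    def __init__(self, period):
        self.leading = period   # current period; queries hit the RDBMS directly
        self.molap = []         # closed periods, stored natively with aggregates

    def roll_over(self, new_period):
        # Close the current leading partition (in the real system this is
        # where it would be reprocessed as MOLAP) and open a new one.
        self.molap.append(self.leading)
        self.leading = new_period

    def storage_mode(self, period):
        return "ROLAP" if period == self.leading else "MOLAP"

parts = PartitionSet("2024-06")
parts.roll_over("2024-07")          # run by the regular maintenance process
print(parts.storage_mode("2024-06"), parts.storage_mode("2024-07"))
# MOLAP ROLAP
```

The key design point is that readers never need to know which mode a partition is in; the OLAP server routes each query to the right store.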

The clients writing to the database just write straight to the underlying RDBMS. If historical data remains static, they will only be writing to the leading partition. 4 TB is a practical volume for SSDs if you need extra DBMS performance; even mainstream vendors have SSD-based offerings, with faster SLC units as an option.
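A minimal sketch of that write path, using a throwaway SQLite table to stand in for the RDBMS behind the leading partition (table and column names are invented): writers batch their inserts, and readers run the same SELECT ... WHERE ... GROUP BY shape the question describes.

```python
import sqlite3

# Stand-in for the base table backing the leading partition.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (period TEXT, key TEXT, value INTEGER)")

# Writers insert straight into the RDBMS; batching keeps per-row
# overhead low when there are thousands of concurrent writers.
batch = [("2024-07", "A", 1), ("2024-07", "B", 2), ("2024-07", "A", 3)]
conn.executemany("INSERT INTO events VALUES (?, ?, ?)", batch)
conn.commit()

# A reader's query against the leading partition.
total = conn.execute(
    "SELECT SUM(value) FROM events WHERE key = 'A' GROUP BY key"
).fetchone()[0]
print(total)  # 4
```

In the real deployment the inserts and the OLAP queries would hit the same base table, with the OLAP server handling everything outside the current period from its own MOLAP store.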