Cassandra – Database Latency vs Throughput

cassandra

I'm a bit confused by the terms latency and throughput in the context of databases.

My understanding is that latency is the time needed to perform a single request (e.g. an insert or a select), while throughput is the number of such operations completed in a given amount of time (e.g. how many inserts per second).

If my understanding is correct, shouldn't the two always be correlated? If latency decreases, throughput should increase, and vice versa? So a database with low latency should have high throughput?

What comes to mind is that many requests can be performed in parallel. So even if a single request is slow (high latency), the database as a whole can still perform well, because it processes a large number of requests at the same time, especially if we are talking about a scalable distributed database. Is that the point?

I started thinking about this by reading about Cassandra:

Cassandra also places a high value on performance. In 2012, University of Toronto researchers studying NoSQL systems concluded that "In terms of scalability, there is a clear winner throughout our experiments. Cassandra achieves the highest throughput for the maximum number of nodes in all experiments" although "this comes at the price of high write and read latencies."

Is it the case that individual reads and writes are slow, but it achieves high throughput because it is distributed?

Best Answer

Your intuition is close, but it disregards concurrency.

A system can have medium or high latency and still deliver high throughput if it is massively concurrent. Any individual request may take longer on Cassandra than on some other systems, but you can run far more requests in parallel, so total throughput ends up higher. In Cassandra's case, the SEDA (staged event-driven architecture) design allows many concurrent requests per machine (typically on the order of 128 or 256 read and write threads per machine), and its distributed nature lets you scale out across many machines (single clusters on the order of thousands of machines).
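
To put rough numbers on this, here is a minimal sketch based on Little's Law (requests in flight = throughput × latency), which holds for a system in steady state. The function name `throughput` and all the figures below are hypothetical, chosen only for illustration. They are not Cassandra measurements, and the cluster line assumes ideal linear scaling, which real clusters only approximate.

```python
def throughput(concurrency: int, latency_s: float) -> float:
    """Steady-state requests per second, from Little's Law:
    in-flight requests = throughput * latency."""
    if latency_s <= 0:
        raise ValueError("latency must be positive")
    return concurrency / latency_s


# Hypothetical system A: very fast requests, but only one at a time.
fast_serial = throughput(concurrency=1, latency_s=0.001)

# Hypothetical system B: each request is 10x slower, but 128 run concurrently.
slow_parallel = throughput(concurrency=128, latency_s=0.010)

# Ten such nodes, assuming perfectly linear scaling (an idealization).
cluster = 10 * slow_parallel

print(f"A: {fast_serial:.0f} req/s")
print(f"B: {slow_parallel:.0f} req/s per node")
print(f"B cluster: {cluster:.0f} req/s")
```

With these made-up inputs, system B has ten times the latency of system A and still serves roughly 12,800 requests per second per node against A's 1,000. Latency and throughput are only inversely tied when concurrency is held fixed. Once concurrency can grow, whether through more threads per machine or more machines, throughput can rise even as per-request latency gets worse.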