Cassandra query response time degradation

cassandra, performance

I am experimenting with Cassandra on 4 sites with 2 nodes each. There is a small network latency between the sites. I always query the data in the DC where I inserted it. I see very strange behavior: as I increase the number of nodes, the query response time slowly increases, while the insert time improves. I expected the query time to improve as well. Any suggestions on which configuration settings to check?

Best Answer

Response time depends heavily on the queries you're executing and how you're executing them. Linear scalability is achieved only if you read by the full primary key, or at least by the partition key. If you're doing something like SELECT * FROM table WHERE non_pk_column = 'something' ALLOW FILTERING or SELECT * FROM table, then Cassandra needs to fetch data from all your nodes, so latency increases as you add nodes.
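To make the scaling argument concrete, here is a toy simulation in plain Python. It is not a Cassandra benchmark. It assumes each node answers in an exponentially distributed time with a 5 ms mean (an invented figure), that a partition-key read waits on a single replica, and that a full scan fans out to every node in parallel and waits for the slowest one. If the sub-queries ran one after another instead, the scan would be slower still.

```python
import random


def simulate(num_nodes, trials=2000, mean_ms=5.0, seed=42):
    """Return (mean point-read latency, mean full-scan latency) in ms.

    Toy model only: per-node latency is exponential with mean `mean_ms`,
    a point read waits on one node, a scan waits on the slowest of all nodes.
    """
    rng = random.Random(seed)
    point_total = 0.0
    scan_total = 0.0
    for _ in range(trials):
        # One sampled response time per node for this request.
        latencies = [rng.expovariate(1.0 / mean_ms) for _ in range(num_nodes)]
        point_total += latencies[0]      # partition-key read: one replica answers
        scan_total += max(latencies)     # scan: must wait for the slowest node
    return point_total / trials, scan_total / trials


if __name__ == "__main__":
    for n in (2, 4, 8):
        point, scan = simulate(n)
        print(f"{n} nodes: point read ~{point:.1f} ms, full scan ~{scan:.1f} ms")
```

Under these assumptions the point-read latency stays roughly flat as nodes are added, while the scan latency keeps growing, which matches the behavior described in the question.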

Also, when executing queries, make sure you're using a token-aware load balancing policy, which sends each request to a coordinator node that holds one of the replicas for the requested partition.
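As a rough illustration of that setup, here is a sketch using the DataStax Python driver (the cassandra-driver package). The question does not say which driver is in use, so treat the driver choice as an assumption. The contact points, the data center name, the keyspace and the users table with its user_id partition key are all hypothetical placeholders.

```python
def build_session(contact_points, local_dc, keyspace):
    """Connect with a token-aware policy restricted to the local data center."""
    # Imported inside the function so this file loads without the driver installed.
    from cassandra.cluster import EXEC_PROFILE_DEFAULT, Cluster, ExecutionProfile
    from cassandra.policies import DCAwareRoundRobinPolicy, TokenAwarePolicy

    profile = ExecutionProfile(
        # TokenAwarePolicy prefers replicas of the queried partition;
        # the wrapped DC-aware policy keeps requests in `local_dc`.
        load_balancing_policy=TokenAwarePolicy(
            DCAwareRoundRobinPolicy(local_dc=local_dc)
        ),
    )
    cluster = Cluster(
        contact_points=contact_points,
        execution_profiles={EXEC_PROFILE_DEFAULT: profile},
    )
    return cluster.connect(keyspace)


def prepare_lookup(session):
    """Prepare a read by partition key (hypothetical table and column names)."""
    # Prepared statements let the driver compute the routing key from the
    # bound values, which token-aware routing relies on.
    return session.prepare("SELECT * FROM users WHERE user_id = ?")
```

A call might look like session = build_session(["10.0.0.1"], "dc1", "my_keyspace"), followed by session.execute(prepare_lookup(session), (some_id,)). The statement should be prepared once and reused rather than prepared per request.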

But it's a really broad question; please provide more information about your queries, etc.