I'm aware that I can read from any Cassandra node and that it acts as the coordinator, reading from the node that holds a specific partition. But can I read only the data stored on the node I'm connecting to?
In other words, when a Cassandra cluster has 10 nodes, the data is split into 10 parts, one on each node (and maybe replicas on other nodes, depending on the RF). When I send a SELECT * FROM TABLE, I would like to get only the 1/10th of the total data that is actually stored on that specific node, without any traffic to other nodes.
Thank you so much!
Best Answer
You can do it as follows (class names, etc. are for driver 3.x and could be slightly different in 4.x):
1. Get the token ranges owned by the given host via the Metadata class, as cluster.getMetadata().getTokenRanges(keyspace, host) (see doc);
2. Generate a query for each token range, such as select * from table where token(part_keys) > rangeStart AND token(part_keys) <= rangeEnd (but this may not work all the time, as you need to handle the case where the node's token range is split between the end and the beginning of the token ring - see linked code);
3. Create a Statement for each query (for example, as SimpleStatement), and set the consistency level to LOCAL_ONE and the host to which the query should be sent (via the setHost function);
4. Run the statements with the execute or executeAsync functions and process the data.

Source code that performs a full cluster scan can be found here - you can reuse pieces of it for generating queries on token ranges, etc.
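
To illustrate the query-generation step, including the case where a range wraps around the end of the token ring, here is a minimal sketch in plain Java. It is not the linked code: the class TokenRangeQueries, the method queriesFor, the table ks.tbl and the key column id are all hypothetical names, and it assumes Murmur3Partitioner, so tokens are plain longs. With driver 3.x the start and end values would come from the TokenRange objects returned by getTokenRanges, and each generated query would then be sent with LOCAL_ONE and setHost as described above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: builds the CQL needed to scan one token range (start, end].
// Assumes Murmur3Partitioner, whose tokens are signed 64-bit longs.
final class TokenRangeQueries {

    // Smallest token on the Murmur3 ring; used here only to skip a pointless query.
    static final long MIN_TOKEN = Long.MIN_VALUE;

    static List<String> queriesFor(String table, String partitionKeys, long start, long end) {
        if (start == end) {
            // A range whose bounds coincide needs separate handling, omitted here.
            throw new IllegalArgumentException("degenerate range is not handled in this sketch");
        }
        String select = "SELECT * FROM " + table + " WHERE ";
        String token = "token(" + partitionKeys + ")";
        List<String> queries = new ArrayList<>();
        if (start < end) {
            // Ordinary range: a single query with both bounds.
            queries.add(select + token + " > " + start + " AND " + token + " <= " + end);
        } else {
            // Wrapping range: one query up to the end of the ring...
            queries.add(select + token + " > " + start);
            // ...and one from the beginning of the ring, unless that part is empty.
            if (end != MIN_TOKEN) {
                queries.add(select + token + " <= " + end);
            }
        }
        return queries;
    }

    public static void main(String[] args) {
        // Illustrative values only; real bounds come from the cluster metadata.
        queriesFor("ks.tbl", "id", -100L, 200L).forEach(System.out::println);
        queriesFor("ks.tbl", "id", 900L, -900L).forEach(System.out::println);
    }
}
```

Splitting a wrapping range into two queries keeps every statement a simple token comparison. If I remember the 3.x API correctly, TokenRange also offers an unwrap() method that performs a similar split on the driver side.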