Each node holds 50% of data on 2 nodes with replication_factor=1

cassandra

Description: Linux, Cassandra 3.9, 2 nodes created with CCM, consistency level = ONE

replication = {'class': 'SimpleStrategy', 'replication_factor': '1'} 

I use the write script (*) below (Cassandra Python driver) to insert data into the cluster.

from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

# Connect to the two CCM nodes and use the 'luan' keyspace
cluster = Cluster(['127.0.0.1', '127.0.0.2'])
session = cluster.connect('luan')

# Build the statement once; it is executed at consistency level ONE
query = SimpleStatement(
    "INSERT INTO t1 (a, b, c) VALUES (%s, %s, %s)",
    consistency_level=ConsistencyLevel.ONE)

count = 1
while count < 500000:
    a = count
    b = 'luan'
    c = '2016-01-01 01:01:01'
    session.execute(query, (a, b, c))
    # This insert uses the session's default consistency level
    session.execute("INSERT INTO t2 (a, b) VALUES (%s, %s)", (1, a))
    count = count + 1

While (*) was running, I stopped node1 or node2, and (*) then stopped with this error:

cassandra.cluster.NoHostAvailable: ('Unable to complete the operation
against any hosts', {: Unavailable('Error
from server: code=1000 [Unavailable exception] message="Cannot achieve
consistency level ONE" info={'required_replicas': 1,
'alive_replicas': 0, 'consistency': 'ONE'}',)})

According to the Cassandra Calculator, with this configuration (cluster size = 2, replication factor = 1, write level = ONE), each node holds 50% of your data.

With replication_factor = 2, (*) keeps running when I stop node1 or node2.

Questions:

1/ With replication_factor = 1, why does (*) stop when I stop 1 of the 2 nodes?

2/ Why does "Each node holds 50% of your data" apply to this configuration?

Best Answer

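Replication factor 1, assuming that the cluster is balanced: there is 1 copy of your data (1 replica) in the cluster. Since you have 2 nodes and 1 replica, each node holds 1/2 of the data, i.e. half a replica each. When you shut down 1 node, you lose access to 1/2 of the data. Any write whose partition key belongs to the stopped node has no live replica, which is why the error reports 'alive_replicas': 0 and consistency level ONE cannot be achieved.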
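The 50% figure is plain arithmetic for a balanced cluster. A minimal sketch, where data_share_per_node is a hypothetical helper written for this answer and not part of Cassandra or the Python driver:

```python
def data_share_per_node(num_nodes, replication_factor):
    """Fraction of the total data set each node stores in a balanced cluster."""
    # A node stores at most one copy of a row, so cap replicas at the node count.
    return min(replication_factor, num_nodes) / num_nodes


if __name__ == "__main__":
    for rf in (1, 2):
        share = data_share_per_node(num_nodes=2, replication_factor=rf)
        print(f"RF={rf}: each node holds {share:.0%} of the data")
```

With 2 nodes this gives one half for replication factor 1 and the full data set for replication factor 2, which matches what the calculator reports.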

Replication factor 2: there are 2 replicas of the data in the cluster. With 2 nodes and 2 replicas, each node holds 1 full replica. When you shut down 1 node, the other node still has a complete replica, so writes at consistency level ONE still succeed.
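The difference between the two cases can be illustrated with a small simulation that needs no running cluster. This is only a rough sketch under stated assumptions: the node names are hypothetical, md5 stands in for Cassandra's real partitioner, and "owner plus the next nodes on the ring" is a simplified picture of how SimpleStrategy places replicas.

```python
import hashlib

NODES = ["node1", "node2"]  # hypothetical node names for the 2-node cluster


def replicas_for(key, replication_factor):
    """Return the nodes that store `key`: the owner plus the next RF-1 nodes."""
    # Assumption: md5 replaces Cassandra's partitioner, for illustration only.
    token = int(hashlib.md5(str(key).encode()).hexdigest(), 16)
    owner = token % len(NODES)
    count = min(replication_factor, len(NODES))
    return [NODES[(owner + i) % len(NODES)] for i in range(count)]


def writable_fraction(replication_factor, down_node, keys=range(1, 10001)):
    """Fraction of keys that still have a live replica (enough for level ONE)."""
    keys = list(keys)
    ok = sum(
        1 for k in keys
        if any(n != down_node for n in replicas_for(k, replication_factor))
    )
    return ok / len(keys)


if __name__ == "__main__":
    print("RF=1, node1 down:", writable_fraction(1, "node1"))
    print("RF=2, node1 down:", writable_fraction(2, "node1"))
```

With replication factor 1, roughly half of the keys have no live replica once a node is down, so the insert loop fails as soon as it reaches one of them. With replication factor 2, every key keeps a live replica.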