Cassandra Cluster storage

cassandra scalability

I want to try out Cassandra Cluster. My main question is about scalability.

As I understand it, every node in a Cassandra cluster holds the same full copy of the data. So if I have 1 TB in total and 5 nodes, that makes 5 TB of storage.

At some point this will get huge. How do I scale Cassandra so that the storage is distributed across the nodes? Do I have to shard manually?

Best Answer

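The data is automatically distributed across the cluster based on the value of the partition key of your tables, so each node stores only part of the data, not a full copy. How many copies of each partition exist is set by the keyspace's replication factor, not by the number of nodes. You do need to take care to design a correct data model, for example by avoiding partitions that hold hundreds of thousands of rows.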
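To make the placement idea concrete, here is a minimal toy sketch in Python. It is not Cassandra's actual code: the function names (`token`, `build_ring`, `replicas`, `average_storage_per_node_tb`) are made up for illustration, MD5 stands in for the Murmur3 hash that Cassandra's default partitioner uses, and the replica walk is a simplification that ignores racks and datacenters.

```python
import hashlib
from bisect import bisect_right


def token(value: str) -> int:
    # Stand-in hash for illustration; Cassandra's default partitioner uses Murmur3.
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


def build_ring(nodes, vnodes=16):
    # Each node owns several positions (virtual nodes) on the ring.
    return sorted((token(f"{node}#{i}"), node) for node in nodes for i in range(vnodes))


def replicas(ring, partition_key, replication_factor):
    # Walk clockwise from the key's token and collect distinct nodes.
    distinct_nodes = {node for _, node in ring}
    if replication_factor > len(distinct_nodes):
        raise ValueError("replication factor exceeds number of nodes")
    tokens = [t for t, _ in ring]
    i = bisect_right(tokens, token(partition_key))
    owners = []
    while len(owners) < replication_factor:
        node = ring[i % len(ring)][1]
        if node not in owners:
            owners.append(node)
        i += 1
    return owners


def average_storage_per_node_tb(data_tb, replication_factor, node_count):
    # Total stored = data * RF, spread over all nodes (assuming even distribution).
    return data_tb * replication_factor / node_count


ring = build_ring(["n1", "n2", "n3", "n4", "n5"])
print(replicas(ring, "user:42", 3))            # three distinct nodes, not all five
print(average_storage_per_node_tb(1.0, 3, 5))  # average TB per node for 1 TB at RF 3
```

Under these assumptions, 1 TB of data with a replication factor of 3 on 5 nodes takes about 3 TB in total, roughly 0.6 TB per node, rather than 5 TB.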

If the data model is correct, you can scale Cassandra simply by adding new nodes; the data is then redistributed among the nodes.
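As a rough illustration of that redistribution, the toy model below compares which node owns each key before and after a sixth node joins. It is only a sketch under simplifying assumptions: the helper names (`token`, `build_ring`, `owner`, `moved_keys`) are hypothetical, MD5 stands in for Cassandra's Murmur3 partitioner, and only the primary owner of each key is tracked (no replication).

```python
import hashlib
from bisect import bisect_right


def token(value: str) -> int:
    # Stand-in hash for illustration; Cassandra's default partitioner uses Murmur3.
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


def build_ring(nodes, vnodes=32):
    # Returns parallel lists: sorted ring tokens and the node owning each token.
    pairs = sorted((token(f"{node}#{i}"), node) for node in nodes for i in range(vnodes))
    return [t for t, _ in pairs], [n for _, n in pairs]


def owner(ring, key):
    # The primary owner is the node at the next token clockwise from the key.
    tokens, nodes = ring
    return nodes[bisect_right(tokens, token(key)) % len(tokens)]


def moved_keys(old_nodes, new_node, keys):
    # Map each key whose owner changed to its (old_owner, new_owner) pair.
    old_ring = build_ring(old_nodes)
    new_ring = build_ring(old_nodes + [new_node])
    moved = {}
    for key in keys:
        before, after = owner(old_ring, key), owner(new_ring, key)
        if before != after:
            moved[key] = (before, after)
    return moved


keys = [f"key-{i}" for i in range(2000)]
moved = moved_keys(["n1", "n2", "n3", "n4", "n5"], "n6", keys)
print(f"{len(moved)} of {len(keys)} keys changed owner")
print({new for _, new in moved.values()})  # every moved key lands on the new node
```

In this model only a fraction of the keys change owner, and all of them move to the newly added node; the existing nodes do not reshuffle data among themselves, which is why no manual resharding is needed.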