Apache Cassandra – How to Query Properties of an Entity

cassandra nosql

I am new to Apache Cassandra, and I would like to know how to create a table in which I can filter rows by non-key columns.

In a relational database, I would simply create a table with an ID as the primary key and then run a SELECT that filters on non-key columns. In Cassandra, however, I cannot do that.

I have already read about secondary indexes in Cassandra. I know that I could add indexes to the property columns and then run a simple CQL SELECT, as in a relational database, to get the desired rows, but that would be inefficient. Isn't there a better way?

Best Answer

As far as I know, Cassandra gives you only three options for a SELECT filtered by non-key attributes:

(1) Application-side filtering. That is, run a CQL SELECT and let your application filter the results it gets back. For all but the smallest data sets, this is ill-advised.

(2) Bite the bullet, and create those secondary indices.

(3) Probably the most common option: duplicate your data in lookup rows keyed by the filter value. That is, for each filter condition you want to apply, create a new entry in your database that stores the keys of all the rows matching that filter.
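
To make options (2) and (3) concrete, here is a rough CQL sketch. The table, column, and index names (users, users_by_city, city, users_city_idx) are made up for illustration and are not from the question, and a keyspace is assumed to be selected already.

```cql
-- Hypothetical base table: one row per entity, keyed by its ID.
CREATE TABLE users (
    user_id uuid PRIMARY KEY,
    name    text,
    city    text
);

-- Option (2): a secondary index on the non-key column,
-- after which the relational-style filter is accepted.
CREATE INDEX users_city_idx ON users (city);
SELECT user_id, name FROM users WHERE city = 'Berlin';

-- Option (3): a separate lookup table whose partition key is the
-- filter value and whose rows hold the keys of the matching users.
CREATE TABLE users_by_city (
    city    text,
    user_id uuid,
    PRIMARY KEY ((city), user_id)
);

-- Step 1: read the keys of all users in the city (one partition).
SELECT user_id FROM users_by_city WHERE city = 'Berlin';
-- Step 2: fetch the full rows from users by those keys.
```

With option (3) the application has to write to both tables itself. You could also copy the columns you need (such as name) into the lookup table to skip the second read, at the cost of more duplication.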

Note that while the third option is the most common, you will almost inevitably develop some data inconsistency due to the inherent denormalization. Apache Cassandra is not a cure-all; it simply handles some applications very well.
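
One way to narrow that window, sketched under assumptions: group the writes to the base table and the lookup table in a logged batch. The schema here is hypothetical (a users table keyed by user_id with a city column, and a users_by_city table keyed by city and user_id), and the UUID is just a placeholder value.

```cql
-- Assumed (hypothetical) schema:
--   users         (user_id uuid PRIMARY KEY, name text, city text)
--   users_by_city (city text, user_id uuid, PRIMARY KEY ((city), user_id))

-- Move one user from Berlin to Hamburg, touching both tables together.
BEGIN BATCH
    UPDATE users SET city = 'Hamburg'
        WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;
    DELETE FROM users_by_city
        WHERE city = 'Berlin'
        AND user_id = 123e4567-e89b-12d3-a456-426614174000;
    INSERT INTO users_by_city (city, user_id)
        VALUES ('Hamburg', 123e4567-e89b-12d3-a456-426614174000);
APPLY BATCH;
```

A logged batch means that if any of the statements is applied, the others eventually are too. It does not give you isolation, so a reader can still briefly see the two tables out of step, and batches spanning several partitions cost extra coordination. Treat it as a mitigation, not a fix.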

Good luck!

P.S.: Here are a couple of decent blog entries that explain a bit of the theory at the logical data model level: Part 1 & Part 2.