Cassandra – Query a column with collection type

cassandra

I am pretty new to cassandra, so pardon me if this turns out to be a silly question.

I have a table structure as below

CREATE TABLE data_points (
  id text PRIMARY KEY,
  created_at timestamp,
  previous_event_id varchar,
  properties map<text,text>
);

I wanted to know, whether I can execute a query which gives matching records from the map type fields.

For instance, if I insert values in the table as below

INSERT INTO datapoints (id, properties) VALUES ('1', { 'fruit' : 'apple', 'band' : 'Beatles' });

Will I be able to fetch it as

SELECT * from data_points WHERE properties.band='Beatles';

Please Help.

Best Answer

You can index collection types in cassandra 2.1 and later. You are after:
SELECT * FROM <table> WHERE <field> CONTAINS <value_in_list/map/set>

Detailed example:

cqlsh> USE ks;
cqlsh:ks> CREATE TABLE data_points (
            id text PRIMARY KEY,
            created_at timestamp,
            previous_event_id varchar,
            properties map<text,text>
         );
cqlsh:ks> create index on data_points (properties);
cqlsh:ks> INSERT INTO data_points (id, properties) VALUES ('1', { 'fruit' : 'apple', 'band' : 'Beatles' });
cqlsh:ks> INSERT INTO data_points (id, properties) VALUES ('2', { 'fruit' : 'cherry', 'band' : 'Beatles' });
cqlsh:ks> SELECT * FROM data_points WHERE properties CONTAINS 'Beatles';

 id | created_at | previous_event_id | properties
----+------------+-------------------+----------------------------------------
  2 |       null |              null | {'band': 'Beatles', 'fruit': 'cherry'}
  1 |       null |              null |  {'band': 'Beatles', 'fruit': 'apple'}

(2 rows)

Word of warning, secondary indexes don't scale out well as they use a scatter/gather algorithm to find what you need, if you plan to use them for heavy tagging it might be better to denormalize the properties field int a separate table and carry out multiple queries.

Further reading: