I'm trying to leverage Cassandra's strong support for time series data to build a commenting system.
So far I'm thinking that each row would be a particular topic, and each comment would be a new column, stored in event_time descending order such that I could get the latest replies.
CREATE TABLE topic (
topic_id text,
user_id text,
event_time timestamp,
data text,
PRIMARY KEY (topic_id,event_time),
) WITH CLUSTERING ORDER BY (event_time DESC);
However, if I want to get the last 10 comments made by 'John' for example, this seems to be difficult with the current model.
Could anyone give any advice on how I could retrieve 'John's comments efficiently, and if I would have to change the data model?
Best Answer
In Cassandra you need to take a query-based modeling approach. In your case, I would create a similarly-structured query table like this:
And then, if you wanted to query for the last 10 comments made by John, this would work:
You could also add an
ORDER BY
, but since you have that in your column family definition, you shouldn't need it. Also, you should note that this query table does not replace the originaltopic
table. It works in conjunction with it.