NoSQL Database Design – Commenting System Data Model

cassandradatabase-designnosql

I'm trying to leverage Cassandra's strong support for time series data to build a commenting system.

So far I'm thinking that each row would be a particular topic, and each comment would be a new column, stored in event_time descending order such that I could get the latest replies.

CREATE TABLE topic (
    topic_id text,
    user_id text,
    event_time timestamp,
    data text,
    PRIMARY KEY (topic_id,event_time),
) WITH CLUSTERING ORDER BY (event_time DESC);

However, if I want to get the last 10 comments made by 'John' for example, this seems to be difficult with the current model.

Could anyone give any advice on how I could retrieve 'John's comments efficiently, and if I would have to change the data model?

Best Answer

In Cassandra you need to take a query-based modeling approach. In your case, I would create a similarly-structured query table like this:

CREATE TABLE topicbyuser (
    topic_id text,
    user_id text,
    event_time timestamp,
    data text,
    PRIMARY KEY (user_id,event_time),
) WITH CLUSTERING ORDER BY (event_time DESC);

And then, if you wanted to query for the last 10 comments made by John, this would work:

SELECT * FROM topicbyuser WHERE user_id='John' LIMIT 10;

You could also add an ORDER BY, but since you have that in your column family definition, you shouldn't need it. Also, you should note that this query table does not replace the original topic table. It works in conjunction with it.