NoSQL Database Design – Commenting System Data Model

cassandra, database-design, nosql

I'm trying to leverage Cassandra's strong support for time series data to build a commenting system.

So far I'm thinking that each row would be a particular topic, and each comment would be a new column, stored in event_time descending order such that I could get the latest replies.

CREATE TABLE topic (
    topic_id text,
    user_id text,
    event_time timestamp,
    data text,
    PRIMARY KEY (topic_id,event_time),
) WITH CLUSTERING ORDER BY (event_time DESC);

However, if I want to get the last 10 comments made by 'John' for example, this seems to be difficult with the current model.

Could anyone advise on how I could retrieve John's comments efficiently, and whether I would have to change the data model?

Best Answer

In Cassandra you need to take a query-based modeling approach. In your case, I would create a similarly-structured query table like this:

CREATE TABLE topicbyuser (
    topic_id text,
    user_id text,
    event_time timestamp,
    data text,
    PRIMARY KEY (user_id,event_time),
) WITH CLUSTERING ORDER BY (event_time DESC);

And then, if you wanted to query for the last 10 comments made by John, this would work:

SELECT * FROM topicbyuser WHERE user_id='John' LIMIT 10;

You could also add an ORDER BY, but since the table definition already specifies CLUSTERING ORDER BY (event_time DESC), you shouldn't need it. Note also that this query table does not replace the original topic table; it works in conjunction with it, so each comment is written to both.
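As a rough sketch of how the two tables might be kept in step, each new comment could be written to both in a single logged batch. The topic id, timestamp, and comment text below are made-up values for illustration, and the second query is just one possible way to fetch John's older comments:

```cql
-- Hypothetical values: write the same comment to both tables.
-- A logged batch ensures that both inserts are eventually applied together.
BEGIN BATCH
  INSERT INTO topic (topic_id, event_time, user_id, data)
  VALUES ('cassandra-modeling', '2015-06-01 12:00:00+0000', 'John', 'First comment');
  INSERT INTO topicbyuser (user_id, event_time, topic_id, data)
  VALUES ('John', '2015-06-01 12:00:00+0000', 'cassandra-modeling', 'First comment');
APPLY BATCH;

-- Next 10 comments by John that are older than a given time,
-- returned newest first because of the clustering order.
SELECT * FROM topicbyuser
WHERE user_id = 'John' AND event_time < '2015-06-01 12:00:00+0000'
LIMIT 10;
```

The trade-off is duplicated storage and an extra write per comment in exchange for a single-partition read for each query pattern.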

Related Solutions

How to link Similar-but-distinct Models of Medical Imaging Data

Since you said that there are always two types of data, I would do it like this:

This might use the wrong terms since I don't know what exactly is available, but I think you will get the idea.

image (the data that is common to both scans/images)
---------------------------
    image_id
    date


coordinator (the one that is inserted by hand)
---------------------
    image_id (PrimaryKey and ForeignKey)
    comment
    description
    order
    [...]

metadata (that you get programmatically)
----------------------
   image_id (PrimaryKey and ForeignKey)
   {whatever data that is}

This way, you have both images connected but have the freedom to model the details differently.

Additionally, you could move the metadata into a separate table if the data is similar. This might be useful for later comparison, collection, and so on.

metadata (that you get programmatically)
----------------------
    image_id (PrimaryKey and ForeignKey)
    metadata_id


metadata_description
-------------------------
    metadata_id
    description
    name
    {whatever the metadata is}

Difference between Cassandra NoSQL data model and SQL data model

Cassandra is a wide-column NoSQL store, which goes beyond simple key-value pairs.

A column family in Cassandra is a container for rows, like a table in an RDBMS.

Rows are often depicted in a JSON-like notation in documentation and examples.

Below are some links where you can get more info:

http://wiki.apache.org/cassandra/DataModel

http://www.datastax.com/docs/0.8/ddl/index

http://www.javageneration.com/?p=70
