Cassandra – How to Alter Table and Change Compaction Strategy to DTCS

I have cassandra table defined as:

CREATE TABLE db.table (
    value text,
    time timestamp,
    sid text,
    PRIMARY KEY (sid, time)
) WITH CLUSTERING ORDER BY (time ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';

Table is pretty big, here are data files:

root@server:/home# ls -l /var/lib/cassandra/data/db/table/*-Data.db -h
-rw-r--r-- 1 cassandra cassandra 8.7G Aug 16 00:14 /var/lib/cassandra/data/db/table/db-table-ka-200785-Data.db
-rw-r--r-- 1 cassandra cassandra 2.2G Nov  5 21:28 /var/lib/cassandra/data/db/table/db-table-ka-208079-Data.db
-rw-r--r-- 1 cassandra cassandra  20G Nov 20 20:55 /var/lib/cassandra/data/db/table/db-table-ka-208702-Data.db
-rw-r--r-- 1 cassandra cassandra  18G Dec 17 17:31 /var/lib/cassandra/data/db/table/db-table-ka-210089-Data.db
-rw-r--r-- 1 cassandra cassandra 455M Dec 18 14:15 /var/lib/cassandra/data/db/table/db-table-ka-210153-Data.db
-rw-r--r-- 1 cassandra cassandra 4.4G Dec 27 09:27 /var/lib/cassandra/data/db/table/db-table-ka-210477-Data.db
-rw-r--r-- 1 cassandra cassandra 3.9G Jan  7 14:50 /var/lib/cassandra/data/db/table/db-table-ka-210834-Data.db
-rw-r--r-- 1 cassandra cassandra 4.1G Jan 16 04:47 /var/lib/cassandra/data/db/table/db-table-ka-211203-Data.db
-rw-r--r-- 1 cassandra cassandra 997M Jan 19 08:56 /var/lib/cassandra/data/db/table/db-table-ka-211292-Data.db
-rw-r--r-- 1 cassandra cassandra 1.1G Jan 20 23:24 /var/lib/cassandra/data/db/table/db-table-ka-211389-Data.db
-rw-r--r-- 1 cassandra cassandra 1.1G Jan 22 00:03 /var/lib/cassandra/data/db/table/db-table-ka-211479-Data.db
-rw-r--r-- 1 cassandra cassandra 285M Jan 22 11:33 /var/lib/cassandra/data/db/table/db-table-ka-211495-Data.db
-rw-r--r-- 1 cassandra cassandra  51M Jan 22 12:16 /var/lib/cassandra/data/db/table/db-table-ka-211500-Data.db
-rw-r--r-- 1 cassandra cassandra  16M Jan 22 12:31 /var/lib/cassandra/data/db/table/db-table-ka-211501-Data.db
-rw-r--r-- 1 cassandra cassandra  16M Jan 22 12:46 /var/lib/cassandra/data/db/table/db-table-ka-211502-Data.db

Table is user to store some metrics as time series. Every row inserted is set with TTL of one month.

First thing I noticed here is existance of sstable much older than TTL, possibly because of not optimal compaction strategy for time series data. Is it safe to alter the table and change compaction strategy to DTCS now? What compaction subproperties should be best regarding TTL of one month?
Since all data is inserted with the same TTL, I guess it's better to set default TTL on table. According to Datastax:

Default TTL on a table is another natural fit
for time-series data. It allows you to avoid costly compactions that
run solely for the purpose of dropping tombstone and/or TTLed cells.
Actually, it allows the compaction logic to be much more efficient,
simply dropping the sstable when the newest cell timestamps are old
enough to indicate all of the sstable is past its shelf life. This
data is kept in the table metadata when it is written, making it easy
to check. When default TTL is used in conjunction with a time-series
workload, you can further reduce the compaction load associated with
data density. When used in conjunction with DTCS, you can almost
eliminate the geometric loads associated with high density on the
compaction side of the system.

Is it safe to do it now?
Direction of clustering order is not optimal for getting last value.

select * from table where sid='XXX' limit 1

gives the first entry, not last.
Is it possible to change clustering order to DESC, at this moment?

Cassandra – How to Alter Table and Change Compaction Strategy to DTCS

Best Answer

Related Question

Best Answer

Related Solutions

NoSQL – What is Unstructured Data?

Cassandra query performs differently at different times

Related Question