Cassandra nodetool repair open files issue

cassandra

I am using Cassandra 3.6. After running nodetool repair, Cassandra takes far too long to start. The message is:

ViewManager.java:226 - Not submitting build tasks for views in keyspace system_schema as storage service is not initialized

The system is stuck on this message for hours. The number of open files has also risen significantly, from 100k to 1.5 million. Any suggestions, please?

Best Answer

Since SSTables can contain tokens from multiple token ranges, and repair is performed by token range, Cassandra needs to be able to separate repaired data from unrepaired data. That process is called anticompaction.

Leveled Compaction Strategy (LCS) is a very intensive strategy in which SSTables get compacted far more often than with STCS and TWCS. LCS creates fixed-size SSTables, which can easily lead to thousands of SSTables for a single table. Because of the way streaming works in Apache Cassandra during repair, overstreaming of LCS tables can create tens of thousands of small SSTables in L0, which can ultimately bring nodes down and affect the whole cluster. This is particularly true when the nodes use a large number of vnodes. I have seen this happen on several customer clusters, and it then takes a lot of operational expertise to bring the cluster back to a sane state.

A safety measure has been put in place to prevent SSTables that are going through anticompaction from being compacted, for valid reasons. The problem is that it also prevents those SSTables from going through validation compaction, so a repair session will fail if it needs an SSTable that is being anticompacted. Given that anticompaction also occurs with full repairs, this creates the following limitation: you cannot run a repair on more than one node at a time without risking failed sessions due to concurrency on SSTables.

The only way to perform repair without anticompaction in “modern” versions of Apache Cassandra is subrange repair, which skips anticompaction entirely. To perform a subrange repair correctly, you have three options:

  1. Compute valid token subranges yourself and script the repairs accordingly.
  2. Use the Cassandra range_repair.py script, which performs subrange repair.
  3. Use Cassandra Reaper, which also performs subrange repair. Search for it by name, as the repository location may change.
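
For option 1, here is a minimal sketch of what computing subranges might look like. It assumes you already know the bounds of one non-wrapping token range owned by the node (for example from nodetool describering), and it only prints the commands rather than running them. The function names and the keyspace name my_keyspace are made up for illustration; -st and -et are the start-token and end-token options of nodetool repair.

```python
def split_token_range(start: int, end: int, parts: int) -> list[tuple[int, int]]:
    """Split the token range (start, end] into up to `parts` contiguous subranges.

    Assumption: the range does not wrap around the ring (start < end) and is
    fully contained in a single range owned by the node being repaired.
    """
    if start >= end:
        raise ValueError("expected a non-wrapping range with start < end")
    if parts < 1:
        raise ValueError("parts must be at least 1")
    width = end - start
    parts = min(parts, width)  # never produce an empty subrange
    # Integer arithmetic keeps the boundaries exact for 64-bit tokens.
    bounds = [start + (width * i) // parts for i in range(parts + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def repair_commands(keyspace: str, start: int, end: int, parts: int) -> list[str]:
    """Build one `nodetool repair` command line per subrange (not executed here)."""
    return [
        f"nodetool repair -full -st {s} -et {e} {keyspace}"
        for s, e in split_token_range(start, end, parts)
    ]


# Example: split a hypothetical range into 4 subranges and print the commands.
for command in repair_commands("my_keyspace", 0, 1000, 4):
    print(command)
```

Run the printed commands one at a time and check that each finishes before starting the next. A range that wraps around the end of the ring would have to be split at the ring boundary first, which this sketch does not attempt.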

To reduce the number of open files and shorten the restart time, run: nodetool cleanup; nodetool compact
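
To see which tables are responsible for the open files, and to check whether the compaction helped, you can count SSTables per table before and after. This is a rough sketch and not part of Cassandra: the function name is made up, and it assumes the usual on-disk layout of data_dir/keyspace/table_dir/ with one file ending in -Data.db per SSTable.

```python
from collections import Counter
from pathlib import Path


def count_sstables(data_dir: str) -> Counter:
    """Count SSTables per (keyspace, table directory) under a Cassandra data directory.

    Assumption: each SSTable has exactly one file ending in "-Data.db" located
    directly inside its table directory. Snapshots and backups live in deeper
    subdirectories, so the two-level glob below does not count them.
    """
    counts = Counter()
    for data_file in Path(data_dir).glob("*/*/*-Data.db"):
        table_dir = data_file.parent
        counts[(table_dir.parent.name, table_dir.name)] += 1
    return counts


# Usage (the path is an assumption; use the data_file_directories value
# from your cassandra.yaml):
#
#   for (keyspace, table), n in count_sstables("/var/lib/cassandra/data").most_common(10):
#       print(f"{n:8d}  {keyspace}.{table}")
```

A table showing tens of thousands of SSTables is the likely source of the open file count.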