Instana

How to reduce PVC size for Cassandra nodes

By Oleg Samoylov posted Thu February 26, 2026 06:05 PM

  

The fastest and most logical way to fix "not enough space" issues is to extend the PVCs for Cassandra. This can lead to a situation where the PVCs grow too large to administer and manage. It is recommended not to extend a Cassandra PVC beyond 1 TB, which means that ideally data should take 500 to 700 GB on one node, as free space is crucial for Cassandra operation. This instruction covers the case where the PVCs have been extended too far and adding another Cassandra node would lead to significant additional storage consumption. The Cassandra operator does not support reducing storage.
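Before starting, it is worth checking how much data each node actually holds, so the new, smaller PVC size can be chosen with enough headroom. A quick check (assuming the `instana-cassandra` namespace and pod names used throughout this article):

```shell
### current PVC sizes
oc -n instana-cassandra get pvc

### actual data volume per Cassandra node (see the "Load" column)
oc -n instana-cassandra exec instana-cassandra-default-sts-0 -c cassandra -- nodetool status
```

The new PVC size should comfortably exceed the "Load" value per node, keeping in mind that compactions and repairs temporarily need extra free space.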

Normally, Instana keeps short-term metric history in Cassandra: the chart data served from Cassandra consists of points representing 1-second or 5-second intervals, which Instana keeps for up to 24 hours by default. So if it is acceptable to work with 10-second granularity for the past 24 hours until new data accumulates, the easiest way would be to delete and recreate the Cassandra cluster. Some Instana deployments do not have BeeInstana, which means all metric intervals (1s, 5s, 10s, 1m, 5m, 1h) are kept in Cassandra.

This instruction is the workaround for deployments where data loss is unacceptable. In short, the approach is to ensure the cluster can survive the unavailability of one node, and then recreate the nodes one by one with an adjusted PVC size. The procedure has been tested on an OCP cluster with the DataStax Cassandra that ships with Instana.



Using tmux

It is recommended to run the procedure inside a tmux session because a few steps are very time-consuming, and they must keep running in one terminal that stays active for a long time. Here are a few useful commands for working with tmux:
  • if tmux is not installed, install it on the bastion/jump server or on a console that is constantly connected to the OCP cluster:
dnf install tmux
  • start tmux session with name "cassandra" (create if not existing and attach if exists), run in Bash:
tmux new -A -s cassandra
  • Shortcut to detach from the tmux session while keeping it running in the background (from inside the tmux session):
Press "Ctrl+b"
and then "d" to detach
  • To look at currently running tmux sessions, run in Bash:
tmux ls
  • to attach to the session named "cassandra", run from Bash again:
tmux new -A -s cassandra
  • to finish the current tmux session, attach to it with the command above and:
## press Ctrl+D
## or type:
exit



Step 1. Ensure/configure a replication factor of 2 or more

Increasing the replication factor will increase storage consumption, but for our procedure it is important to ensure the replication factor is 2 or more. Since we plan to delete nodes one by one together with their data, we rely on Cassandra's ability to restore the data from the remaining nodes. Another reason is performance: Instana keeps running during the procedure, and a higher replication factor lets Cassandra serve reads faster. Let's open a tmux session and configure aliases that will simplify the commands:
### start tmux session
tmux new -A -s cassandra

### define aliases
CASSPASS=`oc -n instana-cassandra get secret instana-superuser --template='{{ index .data "password" | base64decode }}'`
alias cass="oc -n instana-cassandra exec instana-cassandra-default-sts-0 -c cassandra -- "
alias cs="oc -n instana-cassandra exec instana-cassandra-default-sts-0 -c cassandra -- cqlsh --username instana-superuser --password $CASSPASS -e"
alias csnodetool="oc -n instana-cassandra exec instana-cassandra-default-sts-0 -c cassandra -- nodetool --username instanaadmin --password $CASSPASS "

### list keyspaces
cs "select * from system_schema.keyspaces;"

The output shows the replication factor for the Instana keyspaces shared and onprem_tenant0_unit0, which we need to configure:
 keyspace_name        | durable_writes | replication
----------------------+----------------+-------------------------------------------------------------------------------------
 shared               |           True | {'cassandra': '1', 'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy'}
 system_auth          |           True | {'class': 'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor': '1'}
 system_schema        |           True | {'class': 'org.apache.cassandra.locator.LocalStrategy'}
 onprem_tenant0_unit0 |           True | {'cassandra': '1', 'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy'}
 system_distributed   |           True | {'class': 'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor': '3'}
 system               |           True | {'class': 'org.apache.cassandra.locator.LocalStrategy'}
 system_traces        |           True | {'class': 'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor': '2'}
Also ensure the system_auth keyspace has a replication factor equal to the number of Cassandra nodes. This is important because with a replication factor of 1, the unavailability of the node holding the `system_auth` keyspace would make it impossible to authenticate against Cassandra, which is critical in our case. Let's fix that too (assuming the number of Cassandra nodes is 3):
cs "ALTER KEYSPACE system_auth WITH REPLICATION = {'class': 'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor': '3'};"

Run the following commands to increase the replication factor to 2 for the shared and onprem_tenant0_unit0 keyspaces:
cs "ALTER KEYSPACE shared WITH REPLICATION = {'class': 'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor': '2'};"
cs "ALTER KEYSPACE onprem_tenant0_unit0 WITH REPLICATION = {'class': 'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor': '2'};"
Ensure the configuration has been changed:
cs "select * from system_schema.keyspaces;"

Run a full repair to redistribute the data across the cluster based on the new replication factor:
cass nodetool repair --full
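The full repair can run for a long time. As a convenience, a small helper function (hypothetical, not part of Instana tooling) can be defined in a second tmux window to inspect repair activity; it reuses the same `oc exec` pattern as the aliases above:

```shell
### repair_progress: show streaming sessions (netstats) and pending/active
### compactions (compactionstats) on node 0; run it repeatedly to watch progress
repair_progress() {
  oc -n instana-cassandra exec instana-cassandra-default-sts-0 -c cassandra -- nodetool netstats
  oc -n instana-cassandra exec instana-cassandra-default-sts-0 -c cassandra -- nodetool compactionstats
}
```

The repair is finished when the `nodetool repair --full` command itself returns in the first window.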



Step 2. Adjust the Cassandra manifest to remove the storage size

Starting with v22+, cass-operator considers the storage size an immutable parameter; however, there is an annotation cassandra.datastax.com/allow-storage-changes: "true" that allows changing the value: https://docs.datastax.com/en/cassandra-operator/reference/feature-flags.html#storage-modification.

Let's use it: set the annotation, remove the storage resources from the Cassandra manifest, and stop the cluster:
oc -n instana-cassandra patch cassdc cassandra --type=merge -p '{"metadata":{"annotations": {"cassandra.datastax.com/allow-storage-changes": "true"}}}'
oc -n instana-cassandra patch cassdc cassandra --type json -p '[{ "op": "remove", "path": "/spec/storageConfig/cassandraDataVolumeClaimSpec/resources" }]'
oc -n instana-cassandra patch cassdc cassandra --type=merge -p '{"spec":{"stopped": true}}'

Force the statefulset to be recreated so it picks up the changes:
oc scale deployment cass-operator -n instana-cassandra --replicas=0
oc delete sts instana-cassandra-default-sts -n instana-cassandra
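Before moving on, it may be worth verifying that the annotation is set and the resources block is really gone. The jsonpath expressions below are an assumption based on the manifest paths patched above:

```shell
### should print "true"
oc -n instana-cassandra get cassdc cassandra -o jsonpath='{.metadata.annotations.cassandra\.datastax\.com/allow-storage-changes}'

### should print nothing now that the resources block was removed
oc -n instana-cassandra get cassdc cassandra -o jsonpath='{.spec.storageConfig.cassandraDataVolumeClaimSpec.resources}'
```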



Step 3. Adjust the Cassandra manifest to set a lower storage size

Now we can set a lower storage size. Pods will still attach the old, larger PVCs for as long as those exist:
oc scale deployment cass-operator -n instana-cassandra --replicas=1

oc -n instana-cassandra patch cassdc cassandra --type=merge -p '{"spec":{"storageConfig": {"cassandraDataVolumeClaimSpec": {"resources": {"requests": {"storage": "150Gi"}}}}}}'

oc delete sts instana-cassandra-default-sts -n instana-cassandra

Start the Cassandra cluster and wait until all pods are up and running:
oc -n instana-cassandra patch cassdc cassandra --type=merge -p '{"spec":{"stopped": false}}'
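Bringing all nodes back can take a while. One way to watch progress (press Ctrl+C to stop the watch); the `cassandraOperatorProgress` status field is an assumption about the cass-operator CassandraDatacenter status and should show "Ready" once the operator has finished reconciling:

```shell
### watch pods come up
oc -n instana-cassandra get pods -w

### operator's own view of the rollout
oc -n instana-cassandra get cassdc cassandra -o jsonpath='{.status.cassandraOperatorProgress}'
```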

At this point, the Cassandra manifest has the lower storage size, but since the old PVCs are still intact, they continue to be used by the pods.



Step 4. Delete nodes one by one and wait for new nodes to sync the data

Wait until all pods are up and running. Step 4 must be repeated for each pod, one at a time. Check the pod names:
oc get pods -owide -n instana-cassandra

Set the name of a pod in a variable; it will be used to get the PVC and PV names. We will start with pod `instana-cassandra-default-sts-2`:
POD_NAME=instana-cassandra-default-sts-2
PVC_NAME=`oc -n instana-cassandra get -o jsonpath='{.spec.volumes[?(@.name=="server-data")].persistentVolumeClaim.claimName}' pod ${POD_NAME}`
PV_NAME=`oc -n instana-cassandra get -o jsonpath='{.spec.volumeName}' pvc ${PVC_NAME}`
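Before decommissioning anything, it is worth echoing the resolved names to make sure all three variables are set and refer to the same node, since deleting the wrong PV is not recoverable:

```shell
### sanity check: all three values must be non-empty and belong to the same node
echo "Pod: ${POD_NAME}, PVC: ${PVC_NAME}, PV: ${PV_NAME}"
```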

Decommission Cassandra node from the selected pod:
oc -n instana-cassandra exec ${POD_NAME} -c cassandra -- nodetool decommission --force
This process will take some time; wait for it to finish. The current status can be checked from a separate terminal using the command below; make sure you run it against a pod other than the one being decommissioned:
oc -n instana-cassandra exec instana-cassandra-default-sts-0 -c cassandra -- nodetool status

Once the node is decommissioned, delete the pod and its PVC/PV. This will trigger a new pod to come up, provisioning a new PVC with the configured size:
oc -n instana-cassandra patch pvc ${PVC_NAME} --type json -p '[{ "op": "remove", "path": "/metadata/finalizers" }]'
oc patch pv ${PV_NAME} --type json -p '[{ "op": "remove", "path": "/metadata/finalizers" }]'
oc -n instana-cassandra delete --force pod/${POD_NAME} pvc/${PVC_NAME} pv/${PV_NAME}

Wait for the new node to join the cluster and reach the "UN" state:
oc -n instana-cassandra exec instana-cassandra-default-sts-0 -c cassandra -- nodetool status

Start the data redistribution. The process will take some time to copy all data between nodes:
cass nodetool repair --full

When the repair command has finished, repeat this step with the next pod.



Close tmux session

When everything is done, you can close the tmux session:
## press Ctrl+D
## or type:
exit

---
#Administration
#Kubernetes
#Self-Hosted
