CRDB-45670: helm: automate the statefulset update involving new PVCs #443

Open · wants to merge 1 commit into base: master
22 changes: 3 additions & 19 deletions build/templates/README.md
@@ -203,26 +203,10 @@ $ helm upgrade my-release cockroachdb/cockroachdb \

Kubernetes will carry out a safe [rolling upgrade](https://kubernetes.io/docs/tutorials/stateful-application/basic-stateful-set/#updating-statefulsets) of your CockroachDB nodes one-by-one.

However, the upgrade will fail if it involves adding new Persistent Volume Claim (PVC) to the existing pods (e.g. enabling WAL Failover, pushing logs to a separate volume, etc.). In such cases, kindly repeat the following steps for each pod:
1. Delete the statefulset
```shell
$ kubectl delete sts my-release-cockroachdb --cascade=orphan
```
The statefulset name can be found by running `kubectl get sts`. Note the `--cascade=orphan` flag used to prevent the deletion of pods.

2. Delete the pod
```shell
$ kubectl delete pod my-release-cockroachdb-<pod_number>
```

3. Upgrade Helm chart
```shell
$ helm upgrade my-release cockroachdb/cockroachdb
```
Kindly update the values.yaml file or provide the necessary flags to the `helm upgrade` command. This step will recreate the pod with the new PVCs.
However, the upgrade will fail if it involves adding a new Persistent Volume Claim (PVC) to the existing pods (e.g. enabling WAL failover or pushing logs to a separate volume).
In such cases, run the `scripts/upgrade_with_new_pvc.sh` script to upgrade the cluster.

Note that the above steps need to be repeated for each pod in the CockroachDB cluster. This will ensure that the cluster is upgraded without any downtime.
Given the manual process involved, it is likely to cause network churn as cockroachdb will try to rebalance data across the other nodes. We are working on an automated solution to handle this scenario.
Run `./scripts/upgrade_with_new_pvc.sh -h` for help on how to run the script.
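
For example, for a 3-node cluster deployed with release name `my-release` in the `default` namespace (these values mirror the script's own help text and are illustrative; substitute your own release, namespace, statefulset name, and chart version):

```shell
$ ./scripts/upgrade_with_new_pvc.sh my-release 15.0.0 default my-release-cockroachdb 3
```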

Monitor the cluster's pods until all have been successfully restarted:

22 changes: 3 additions & 19 deletions cockroachdb/README.md
@@ -204,26 +204,10 @@ $ helm upgrade my-release cockroachdb/cockroachdb \

Kubernetes will carry out a safe [rolling upgrade](https://kubernetes.io/docs/tutorials/stateful-application/basic-stateful-set/#updating-statefulsets) of your CockroachDB nodes one-by-one.

However, the upgrade will fail if it involves adding new Persistent Volume Claim (PVC) to the existing pods (e.g. enabling WAL Failover, pushing logs to a separate volume, etc.). In such cases, kindly repeat the following steps for each pod:
1. Delete the statefulset
```shell
$ kubectl delete sts my-release-cockroachdb --cascade=orphan
```
The statefulset name can be found by running `kubectl get sts`. Note the `--cascade=orphan` flag used to prevent the deletion of pods.

2. Delete the pod
```shell
$ kubectl delete pod my-release-cockroachdb-<pod_number>
```

3. Upgrade Helm chart
```shell
$ helm upgrade my-release cockroachdb/cockroachdb
```
Kindly update the values.yaml file or provide the necessary flags to the `helm upgrade` command. This step will recreate the pod with the new PVCs.
However, the upgrade will fail if it involves adding a new Persistent Volume Claim (PVC) to the existing pods (e.g. enabling WAL failover or pushing logs to a separate volume).
In such cases, run the `scripts/upgrade_with_new_pvc.sh` script to upgrade the cluster.

Note that the above steps need to be repeated for each pod in the CockroachDB cluster. This will ensure that the cluster is upgraded without any downtime.
Given the manual process involved, it is likely to cause network churn as cockroachdb will try to rebalance data across the other nodes. We are working on an automated solution to handle this scenario.
Run `./scripts/upgrade_with_new_pvc.sh -h` for help on how to run the script.
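
For example, for a 3-node cluster deployed with release name `my-release` in the `default` namespace (these values mirror the script's own help text and are illustrative; substitute your own release, namespace, statefulset name, and chart version):

```shell
$ ./scripts/upgrade_with_new_pvc.sh my-release 15.0.0 default my-release-cockroachdb 3
```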

Monitor the cluster's pods until all have been successfully restarted:

60 changes: 60 additions & 0 deletions scripts/upgrade_with_new_pvc.sh
@@ -0,0 +1,60 @@
#!/bin/bash

Help()
{
   # Display Help
   echo "This script performs a Helm upgrade involving new PVCs. Run it from the root of the repository."
   echo
   echo "usage: ./scripts/upgrade_with_new_pvc.sh <release_name> <chart_version> <namespace> <sts_name> <num_replicas> [kubeconfig]"
   echo
   echo "arguments:"
   echo "release_name: Helm release name, e.g. my-release"
   echo "chart_version: Helm chart version to upgrade to, e.g. 15.0.0"
   echo "namespace: Kubernetes namespace, e.g. default"
   echo "sts_name: Statefulset name, e.g. my-release-cockroachdb"
   echo "num_replicas: Number of replicas in the statefulset, e.g. 3"
   echo "kubeconfig (optional): Path to the kubeconfig file. Default is $HOME/.kube/config."
   echo
   echo "example: ./scripts/upgrade_with_new_pvc.sh my-release 15.0.0 default my-release-cockroachdb 3"
Comment on lines +11 to +18 (Collaborator):
We should also take an input for a values.yaml file. The user could have a custom values.yaml file they created for cockroachdb and could be passing it to the `helm upgrade` command with the `-f` option.

Reply (Contributor, Author):
At L59, I'm currently using `./cockroachdb` as the chart location to upgrade to. I'm guessing this would default to using the values file at `./cockroachdb/values.yaml`, right?
Once we do provide a flag for the values file (`-f`), I'm wondering whether it's okay to keep the chart path as `./cockroachdb`.
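
One way to accommodate this (purely illustrative, not part of this diff; the optional seventh positional argument is hypothetical) would be to accept a values-file path and forward it to `helm upgrade` via `-f`:

```shell
# Hypothetical extension: take an optional values file as a 7th argument and
# pass it through to helm only when it is provided.
values_file=${7:-}

extra_args=()
if [ -n "$values_file" ]; then
   extra_args+=(-f "$values_file")
fi

helm upgrade "$release_name" ./cockroachdb --kubeconfig="$kubeconfig" --namespace "$namespace" \
   --version "$chart_version" --wait --timeout 1m --debug "${extra_args[@]}"
```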

   echo
}

while getopts ":h" option; do
   case $option in
      h) # display Help
         Help
         exit;;
      \?) # incorrect option
         echo "Error: Invalid option"
         exit;;
   esac
done

release_name=$1
chart_version=$2
namespace=$3
sts_name=$4
num_replicas=$5
kubeconfig=${6:-$HOME/.kube/config}

# For each replica, do the following:
# 1. Delete the statefulset
# 2. Delete the pod replica
# 3. Upgrade the Helm chart

for i in $(seq 0 $((num_replicas-1))); do
   echo "========== Iteration $((i+1)) =========="

   echo "$((i+1)). Deleting sts"
   kubectl --kubeconfig="$kubeconfig" -n "$namespace" delete statefulset "$sts_name" --cascade=orphan --wait=true

   echo "$((i+1)). Deleting replica"
   kubectl --kubeconfig="$kubeconfig" -n "$namespace" delete pod "$sts_name-$i" --wait=true

   echo "$((i+1)). Upgrading Helm"
   # The "--wait" flag ensures the deleted pod replica and STS are back up and running.
   # However, at times, the STS fails to recognize that all replicas are running and the upgrade gets stuck.
   # The "--timeout 1m" short-circuits the upgrade in that case. Even if an upgrade times out, it is
   # harmless, and the final upgrade will succeed once all pod replicas have been updated.
   helm upgrade "$release_name" ./cockroachdb --kubeconfig="$kubeconfig" --namespace "$namespace" --version "$chart_version" --wait --timeout 1m --debug
Comment (Collaborator):
If one replica is not upgraded and has not properly rejoined the cockroachdb cluster and we move on to the next one, wouldn't that affect the quorum of the cluster if there are 3 nodes and 2 of them are inaccessible at that moment?

We should verify that the replica we have just updated has rejoined the cluster before moving on to the next one.

done
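
Regarding the comment above, a per-iteration readiness check could ensure the recreated replica is back before the loop advances. A minimal sketch (not part of this diff; the 10-minute timeout is an assumption) that could be appended to the loop body:

```shell
# Hypothetical check: block until the recreated pod reports Ready so that at
# most one CockroachDB node is down at any point during the upgrade.
echo "$((i+1)). Waiting for pod $sts_name-$i to become Ready"
kubectl --kubeconfig="$kubeconfig" -n "$namespace" wait \
   --for=condition=Ready "pod/$sts_name-$i" --timeout=10m
```

A stricter variant could additionally run `cockroach node status` from inside the pod to confirm the node is live and has rejoined the cluster.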