Upgrade from 1.3.x

This guide describes how to upgrade Burrito Aster to Burrito Begonia.

I assume Burrito Aster 1.3.x is already installed and running. This guide will show you how to upgrade it to Burrito Begonia 2.0.9.

Here is the example node IP address table.

Hostname	Management IP
control1	192.168.21.111
control2	192.168.21.112
control3	192.168.21.113
compute1	192.168.21.114
compute2	192.168.21.115

KeepAlived VIP: 192.168.21.110

This table compares the Kubernetes cluster component versions.

Components	Aster 1.3.x	Begonia 2.0.9
containerd	v1.7.7	v1.7.13
kubernetes	v1.28.3	v1.29.2
kubeadm	v1.28.3	v1.29.2
etcd	v3.5.9	v3.5.10
calico	v3.26.3	v3.26.4
coredns	v1.10.1	v1.11.1
helm	v3.13.1	v3.14.2
nerdctl	1.6.0	1.7.1
crictl	v1.28.0	v1.29.0
nodelocaldns	1.22.20	1.22.28
cert-manager	v1.11.1	v1.13.2
metallb	v0.13.9	v0.13.9
registry	2.8.2	2.8.3

Modify kube-apiserver manifest

First, we need to modify the kube-apiserver manifest before upgrading kubernetes.

Set --anonymous-auth to true in /etc/kubernetes/manifests/kube-apiserver.yaml on every control node.:

$ sudo vi /etc/kubernetes/manifests/kube-apiserver.yaml
...
    - --anonymous-auth=true

Wait until kube-apiserver is restarted on each control node.

Check if we can connect to each kube-apiserver.:

$ curl -sk https://192.168.21.111:6443/healthz
ok
$ curl -sk https://192.168.21.112:6443/healthz
ok
$ curl -sk https://192.168.21.113:6443/healthz
ok
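
The three checks above can also be run as a loop that keeps retrying while a kube-apiserver is still restarting. This is a minimal sketch and not part of Burrito: the function names, the retry count, and the delay are my own choices, and the IP addresses are the ones from the example node table.

```shell
# Control node IPs from the example table; adjust for your cluster.
CONTROL_IPS="192.168.21.111 192.168.21.112 192.168.21.113"

# wait_apiserver IP [RETRIES] [DELAY_SECONDS]
# Returns 0 once https://IP:6443/healthz answers "ok",
# or 1 after RETRIES unsuccessful attempts.
wait_apiserver() {
    ip=$1
    retries=${2:-30}
    delay=${3:-5}
    i=0
    while [ "$i" -lt "$retries" ]; do
        if [ "$(curl -sk "https://${ip}:6443/healthz")" = "ok" ]; then
            echo "${ip}: ok"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "${ip}: not healthy after ${retries} attempts" >&2
    return 1
}

# check_all_apiservers [RETRIES] [DELAY_SECONDS]
# Stops at the first control node that never becomes healthy.
check_all_apiservers() {
    for node in $CONTROL_IPS; do
        wait_apiserver "$node" "$@" || return 1
    done
}
```

After sourcing the functions, run check_all_apiservers from any machine that can reach the management network. It returns non-zero if one of the control nodes does not answer in time.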

Prepare Begonia iso

We will use Burrito Begonia 2.0.9 iso to upgrade the existing Burrito Aster cluster.

Mount burrito-2.0.9_8.9.iso in /mnt.:

$ sudo mount -o loop,ro burrito-2.0.9_8.9.iso /mnt

Unarchive burrito-2.0.9 tarball from the iso.:

$ tar xzf /mnt/burrito-2.0.9.tar.gz

Back up localrepo.cfg and registry.cfg in /etc/haproxy/conf.d/.:

$ sudo mv /etc/haproxy/conf.d/localrepo.cfg \
            /etc/haproxy/conf.d/localrepo.cfg.bak
$ sudo mv /etc/haproxy/conf.d/registry.cfg \
            /etc/haproxy/conf.d/registry.cfg.bak

Reload haproxy.service on the first control node.:

$ sudo systemctl reload haproxy.service

Run prepare.sh script.:

$ cd burrito-2.0.9
$ ./prepare.sh offline

Copy files from the existing burrito dir (e.g. $HOME/burrito-1.3.x).:

$ cp $HOME/burrito-1.3.x/.vaultpass .
$ cp $HOME/burrito-1.3.x/group_vars/all/vault.yml group_vars/all/
$ cp $HOME/burrito-1.3.x/.offline_flag .

Edit hosts and vars.yml for your environment.:

$ vi hosts
$ vi vars.yml

Edit group_vars/all/*.yml if you modified them for your environment.:

$ vi group_vars/all/netapp_vars.yml
$ vi group_vars/all/ceph_vars.yml

Check the node connectivity.:

$ ./run.sh ping

Check if keepalived_vip (192.168.21.110) is on the first control node.:

$ ip -br a s dev eth1
eth1             UP             192.168.21.111/24 192.168.21.110/32 fe80::5054:ff:feeb:2b8b/64
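
If you want to script this check, the address list can be matched exactly instead of read by eye. The helper below is a sketch under my own assumptions (the name has_vip is hypothetical); it only parses the brief output format shown above, where the addresses start at the third column.

```shell
# has_vip VIP
# Reads `ip -br a s dev <iface>` output on stdin and succeeds if one of
# the listed addresses equals VIP, whatever its prefix length is.
has_vip() {
    awk -v vip="$1" '
        {
            # Columns 3..NF are addresses such as 192.168.21.110/32.
            for (i = 3; i <= NF; i++) {
                split($i, a, "/")
                if (a[1] == vip) found = 1
            }
        }
        END { exit !found }'
}
```

For example, `ip -br a s dev eth1 | has_vip 192.168.21.110 && echo "VIP is here"` prints the message only on the node that currently holds the VIP. Comparing the whole address avoids a false match on a prefix such as 192.168.21.11.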

If it is not, move keepalived_vip to the first control node by restarting the keepalived service on the node that currently holds it. For example, if keepalived_vip is on the second control node, restart the keepalived service on the second control node.:

$ sudo systemctl restart keepalived.service

Then the keepalived_vip will be moved to the first control node.

Remove registry and localrepo pods.:

$ sudo kubectl delete deploy registry localrepo -n kube-system
deployment.apps "registry" deleted
deployment.apps "localrepo" deleted

These pods will be recreated while upgrading.

Disable the gnsh service on all kubernetes nodes. It will be replaced by the Asklepios service.:

$ sudo systemctl disable gnsh.service

Run preflight playbook.:

$ ./run.sh preflight

Run ha playbook.:

$ ./run.sh ha

You are now ready to upgrade the kubernetes cluster.

Upgrade kubernetes

Run k8s playbook with upgrade_cluster_setup=true.:

$ ./run.sh k8s -e upgrade_cluster_setup=true

It will take a long time. It took about 52 minutes in my VM environment.

Check if the kubernetes version is v1.29.2.:

$ kubectl version
Client Version: v1.29.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.2
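
To use this check in a script, the server version can be pulled out of the output shown above. This is a small sketch; the function name server_version is my own, and it assumes the plain-text output format printed by this kubectl version.

```shell
# server_version
# Reads `kubectl version` output on stdin and prints the value of the
# "Server Version:" line.
server_version() {
    awk -F': ' '$1 == "Server Version" { print $2 }'
}
```

For example, `[ "$(kubectl version | server_version)" = "v1.29.2" ]` succeeds only after the control plane has been upgraded.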

Run patch playbook.:

$ ./run.sh patch

Run registry playbook.:

$ ./run.sh registry

Check that the new images (e.g. kube-apiserver:v1.29.2) have been added to the local registry.:

$ curl -s 192.168.21.110:32680/v2/kube-apiserver/tags/list
{"name":"kube-apiserver","tags":["v1.28.3","v1.29.2"]}

Run landing playbook.:

$ ./run.sh landing

Check that the new images (e.g. kube-apiserver:v1.29.2) have been added to the genesis registry.:

$ curl -s 192.168.21.110:6000/v2/kube-apiserver/tags/list
{"name":"kube-apiserver","tags":["v1.28.3","v1.29.2"]}
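
The same tag check works for both the local registry (port 32680) and the genesis registry (port 6000), so it can be wrapped in one helper. This is a sketch with a hypothetical function name, has_tag; it does a plain string match on the quoted tag rather than parsing the JSON.

```shell
# has_tag TAG
# Reads a registry tags/list reply on stdin and succeeds if TAG appears
# in it as a complete quoted string, e.g. "v1.29.2".
has_tag() {
    grep -qF "\"$1\""
}
```

For example, `curl -s 192.168.21.110:6000/v2/kube-apiserver/tags/list | has_tag v1.29.2` succeeds once the landing playbook has pushed the new image.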

Kubernetes upgrade is done!

Upgrade OpenStack

We will upgrade each openstack component one by one.

Here is a version table.

Components	Aster 1.3.x	Begonia 2.0.9
ingress	v1.1.3	v1.8.2
mariadb	10.6.16	10.11.7
rabbitmq	3.11.28	3.12.11
memcached	1.6.17	1.6.22
libvirt	6.0.0	8.0.0
keystone	21.0.2.dev3	23.0.2.dev10
glance	24.2.2.dev1	26.0.1.dev2
placement	7.0.1	9.0.1
neutron	20.5.1.dev28	22.1.1.dev110
nova	25.3.1.dev1	27.2.1.dev19
cinder	20.3.3.dev2	22.1.2.dev10
horizon	22.1.0	23.1.1.dev14
btx	1.2.3	2.0.2

Before upgrading openstack

Before upgrading, we need to do the following tasks.

Stop all VM instances.:

root@btx-0:/# o server stop <VM_NAME> [<VM_NAME> ...]

Get the ID of every compute node.:

root@btx-0:/# o hypervisor list
+--------------------------------------+---------------------+-----------------+----------------+-------+
| ID                                   | Hypervisor Hostname | Hypervisor Type | Host IP        | State |
+--------------------------------------+---------------------+-----------------+----------------+-------+
| 5febbf97-71dc-4ae0-a902-2217ee97cd3b | aster-compute       | QEMU            | 192.168.21.114 | up    |
+--------------------------------------+---------------------+-----------------+----------------+-------+

Create /var/lib/nova/compute_id on each compute node.:

$ echo 5febbf97-71dc-4ae0-a902-2217ee97cd3b | sudo -u nova tee /var/lib/nova/compute_id
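
With several compute nodes, each node must get its own ID, so it helps to turn the hypervisor table into hostname and ID pairs first. The helper below is a sketch under my own assumptions (the name hypervisor_ids is hypothetical); it only parses the table layout shown above.

```shell
# hypervisor_ids
# Reads the `o hypervisor list` table on stdin and prints
# "<hostname> <id>" for every data row. Border rows contain no "|"
# and the header row is recognised by its "ID" cell.
hypervisor_ids() {
    awk -F'|' 'NF > 2 {
        id = $2
        host = $3
        gsub(/ /, "", id)
        gsub(/ /, "", host)
        if (id != "ID") print host, id
    }'
}
```

Save the table to a file in the btx shell, feed it to hypervisor_ids, and then run the tee command above on each compute node with the ID printed next to its hostname.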

(For netapp nfs) Preserve nova-instances PVC if NetApp NFS is the default storage backend.

Patch nova-instances PVC.:

$ NOVA_INSTANCES_PVC=$(kubectl get pvc nova-instances -n openstack \
    -o jsonpath='{.spec.volumeName}')
$ echo $NOVA_INSTANCES_PVC
pvc-cc0d533d-eaaf-4a8f-81a0-3e11d9720944
$ sudo kubectl patch pv $NOVA_INSTANCES_PVC -p \
    '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
persistentvolume/pvc-cc0d533d-eaaf-4a8f-81a0-3e11d9720944 patched
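
Since nova is uninstalled in the next step, it is worth confirming that the reclaim policy really changed before going on. This is a minimal sketch; the function name pv_is_retained is my own, and it assumes kubectl works for your user (prefix it with sudo if it does not).

```shell
# pv_is_retained PV_NAME
# Succeeds if the reclaim policy of the given PersistentVolume is "Retain".
pv_is_retained() {
    policy=$(kubectl get pv "$1" \
        -o jsonpath='{.spec.persistentVolumeReclaimPolicy}')
    [ "$policy" = "Retain" ]
}
```

For example, `pv_is_retained "$NOVA_INSTANCES_PVC" || echo "do not uninstall nova yet"` warns you if the patch did not take effect.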

Uninstall the following components. These components cannot be upgraded while they are running.:

$ ./scripts/burrito.sh uninstall nova
$ ./scripts/burrito.sh uninstall ingress
$ ./scripts/burrito.sh uninstall mariadb
$ ./scripts/burrito.sh uninstall rabbitmq

(For netapp nfs) Patch nova-instances PVC to nullify claim.:

$ sudo kubectl patch pv $NOVA_INSTANCES_PVC -p \
    '{"spec":{"claimRef": {"resourceVersion": null, "uid": null}}}'
persistentvolume/pvc-cc0d533d-eaaf-4a8f-81a0-3e11d9720944 patched

(For netapp nfs) Add volumeName in openstack-helm/nova/templates/pvc-instances.yaml.:

spec:
  accessModes: [ "ReadWriteMany" ]
  resources:
    requests:
      storage: {{ .Values.volume.size }}
  storageClassName: {{ .Values.volume.class_name }}
  volumeName: pvc-cc0d533d-eaaf-4a8f-81a0-3e11d9720944
{{- end }}

(For netapp nfs) Detach all volumes that ingress, mariadb, and rabbitmq are using.

Run the remove_volumeattachment.sh script.:

$ chmod +x remove_volumeattachment.sh
$ ./remove_volumeattachment.sh
ingress-0 pvc: pvc-28fb7360-cef8-4d64-9edb-08636f6c2e6b
ingress-0 volumeattachment id: csi-63fe882673981ce326e6eb7fbf1da194300e1bed9a30a2a5f7366172f5247887
volumeattachment.storage.k8s.io "csi-63fe882673981ce326e6eb7fbf1da194300e1bed9a30a2a5f7366172f5247887" deleted
...
mariadb-0 pvc: pvc-9b95c69f-2e8e-47ec-ad79-2bc1ca5c0dde
mariadb-0 volumeattachment id: csi-1d8d1834daf738654f4f79f587745f9c5469a3f7c0329b235b368f8b46f5c529
volumeattachment.storage.k8s.io "csi-1d8d1834daf738654f4f79f587745f9c5469a3f7c0329b235b368f8b46f5c529" deleted
...
rabbitmq-2 pvc: pvc-c201cc59-63fb-4cb2-8149-bfafe91d0d14
rabbitmq-2 volumeattachment id: csi-b52b24c2da084ea89f816b2d9bb9417836937784ef9c65d0c36292b3302288bf
volumeattachment.storage.k8s.io "csi-b52b24c2da084ea89f816b2d9bb9417836937784ef9c65d0c36292b3302288bf" deleted
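
The script itself is not listed here, so the following is not its source. It is only a sketch of the lookup that the output above suggests: finding the volumeattachment that belongs to a given PV. The function name attachments_for_pv and the two-column input format are my own assumptions.

```shell
# attachments_for_pv PV_NAME
# Reads "<attachment-name> <pv-name>" rows on stdin and prints the names
# of the attachments that reference PV_NAME.
attachments_for_pv() {
    awk -v pv="$1" '$2 == pv { print $1 }'
}
```

Rows in that format can be produced with `kubectl get volumeattachment --no-headers -o custom-columns=NAME:.metadata.name,PV:.spec.source.persistentVolumeName`. Each name that attachments_for_pv prints could then be passed to `kubectl delete volumeattachment`.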

(For netapp nfs) Unmount /var/lib/nova/instances on every compute node.:

$ sudo umount /var/lib/nova/instances

Run burrito playbook with system tag.:

$ ./run.sh burrito --tags=system

Upgrade openstack infra components

Upgrade ingress (v1.1.3 -> v1.8.2).:

$ ./scripts/burrito.sh install ingress

Check if ingress pods are running and ready.:

root@btx-0:/# k get po -l application=ingress,component=server
NAME        READY   STATUS    RESTARTS   AGE
ingress-0   1/1     Running   0          2m
ingress-1   1/1     Running   0          96s
ingress-2   1/1     Running   0          57s
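
The "running and ready" check is repeated for every component below, so it can be scripted once. This is a sketch under my own assumptions: the function name all_pods_ready is hypothetical, and it expects the default output with a header line, as shown above.

```shell
# all_pods_ready
# Reads `kubectl get po` output (with header) on stdin. Succeeds only if
# at least one pod is listed and every pod is Running with READY n/n.
all_pods_ready() {
    awk 'NR > 1 {
        seen = 1
        split($2, r, "/")
        if ($3 != "Running" || r[1] != r[2]) bad = 1
    }
    END { exit ((seen && !bad) ? 0 : 1) }'
}
```

For example, `k get po -l application=ingress,component=server | all_pods_ready` succeeds for the output above and fails while any pod is still in an Init state or shows 0/1 in the READY column.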

Upgrade mariadb (10.6.16 -> 10.11.7).:

$ ./scripts/burrito.sh install mariadb

Check if mariadb server pods are running and ready.:

root@btx-0:/# k get po -l application=mariadb,component=server
NAME               READY   STATUS    RESTARTS   AGE
mariadb-server-0   1/1     Running   0          4m53s
mariadb-server-1   1/1     Running   0          4m53s
mariadb-server-2   1/1     Running   0          4m53s

Upgrade rabbitmq (3.11.28 -> 3.12.11).:

$ ./scripts/burrito.sh install rabbitmq

Check if rabbitmq server pods are running and ready.:

root@btx-0:/# k get po -l application=rabbitmq,component=server
NAME                  READY   STATUS    RESTARTS      AGE
rabbitmq-rabbitmq-0   1/1     Running   0             4m25s
rabbitmq-rabbitmq-1   1/1     Running   0             4m25s
rabbitmq-rabbitmq-2   1/1     Running   1 (60s ago)   4m25s

Upgrade memcached (1.6.17 -> 1.6.22).:

$ ./scripts/burrito.sh install memcached

Check if memcached pod is running and ready.:

root@btx-0:/# k get po -l application=memcached
NAME                                   READY   STATUS    RESTARTS   AGE
memcached-memcached-5cd8fc7496-cpx5m   1/1     Running   0          22s

Upgrade libvirt (6.0.0 -> 8.0.0).:

$ ./scripts/burrito.sh install libvirt

Check if libvirt pods are running and ready.:

root@btx-0:/# k get po -l application=libvirt
NAME                            READY   STATUS    RESTARTS   AGE
libvirt-libvirt-default-rq85p   1/1     Running   0          74s

Upgrade openstack components

Upgrade keystone (21.0.2.dev3 -> 23.0.2.dev10).:

$ ./scripts/burrito.sh install keystone

Check if keystone pods are running and ready.:

root@btx-0:/# k get po -l application=keystone,component=api
NAME                            READY   STATUS    RESTARTS   AGE
keystone-api-786c7866d6-crzsz   1/1     Running   0          2m9s
keystone-api-786c7866d6-gzc9g   1/1     Running   0          2m9s

If glance-api is a statefulset (i.e. if the default storage is not netapp), uninstall glance first.:

$ k get po -l application=glance,component=api
glance-api-0                 2/2     Running     0          72m
glance-api-1                 2/2     Running     0          72m
$ ./scripts/burrito.sh uninstall glance

Upgrade glance (24.2.2.dev1 -> 26.0.1.dev2).:

$ ./scripts/burrito.sh install glance

One of the glance-api pods can get stuck in the Init state (netapp nfs only).:

root@btx-0:/# k get po -l application=glance,component=api
NAME                          READY   STATUS     RESTARTS   AGE
glance-api-596d7cf6c8-5srzp   2/2     Running    0          7m22s
glance-api-596d7cf6c8-z4lnr   0/2     Init:0/2   0          7m22s

Delete the pod that is stuck in the Init state. Its replacement will start normally.
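
The stuck pod can also be picked out of the listing by its STATUS column. The helper below is a sketch with a hypothetical name, init_pods; it assumes the default output with a header line.

```shell
# init_pods
# Reads `kubectl get po` output (with header) on stdin and prints the
# names of the pods whose STATUS starts with "Init".
init_pods() {
    awk 'NR > 1 && $3 ~ /^Init/ { print $1 }'
}
```

For example, `k get po -l application=glance,component=api | init_pods` prints glance-api-596d7cf6c8-z4lnr for the output above. Pass the printed name to `k delete po`, and do nothing if the list is empty. The same helper applies to the cinder-volume pods later in this guide.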

Check if glance pods are running and ready.

If glance-api is a deployment (i.e. if the default storage is netapp):

root@btx-0:/# k delete po glance-api-596d7cf6c8-z4lnr
root@btx-0:/# k get po -l application=glance,component=api
NAME                          READY   STATUS    RESTARTS   AGE
glance-api-596d7cf6c8-5srzp   2/2     Running   0          9m23s
glance-api-596d7cf6c8-g7s7c   2/2     Running   0          94s

If glance-api is a statefulset (i.e. if the default storage is not netapp):

root@btx-0:/# k get po -l application=glance,component=api
NAME           READY   STATUS    RESTARTS   AGE
glance-api-0   2/2     Running   0          25m
glance-api-1   2/2     Running   0          25m

Upgrade placement (7.0.1 -> 9.0.1).:

$ ./scripts/burrito.sh install placement

Check if placement pods are running and ready.:

root@btx-0:/# k get po -l application=placement,component=api
NAME                             READY   STATUS    RESTARTS   AGE
placement-api-6d7948d754-9lfrg   1/1     Running   0          2m41s

Upgrade neutron (20.5.1.dev28 -> 22.1.1.dev110).:

$ ./scripts/burrito.sh install neutron

Some pods (neutron-{dhcp,l3,metadata}-agent) will be in the Init state. That’s okay since they are waiting for the nova pods.

Before upgrading nova, we need to preserve the value of the images_type variable. It is qcow2 in Aster 1.3.x while it is rbd in Begonia.

Set images_type to qcow2 in roles/burrito.openstack/templates/osh/nova.yml.j2.:

libvirt:
  volume_use_multipath: {{ enable_multipath }}
  connection_uri: "qemu+tcp://127.0.0.1/system"
  images_type: "qcow2"

Upgrade nova (25.3.1.dev1 -> 27.2.1.dev19).:

$ ./scripts/burrito.sh install nova

Check if nova pods are running and ready.:

root@btx-0:/# k get po -l application=nova,component=compute
NAME                         READY   STATUS    RESTARTS        AGE
nova-compute-default-2vn7r   1/1     Running   0               10m

If cinder-volume is a statefulset (i.e. if the default storage is not netapp), uninstall cinder first.:

$ k get po -l application=cinder,component=volume
cinder-volume-0                   1/1     Running     0          3h16m
cinder-volume-1                   1/1     Running     0          3h16m
$ ./scripts/burrito.sh uninstall cinder

Upgrade cinder (20.3.3.dev2 -> 22.1.2.dev10).:

$ ./scripts/burrito.sh install cinder

Check if cinder pods are running and ready.

If cinder-volume is a deployment (i.e. if the default storage is netapp):

root@btx-0:/# k get po -l application=cinder,component=volume
NAME                            READY   STATUS     RESTARTS   AGE
cinder-volume-86cf778db-2469r   1/1     Running    0          6m2s
cinder-volume-86cf778db-mhg7b   0/1     Init:0/4   0          6m2s

One of the cinder-volume pods is stuck in the Init state. This is the same problem as with glance-api. Delete the stuck cinder-volume pod and its replacement will start normally.:

root@btx-0:/# k delete po cinder-volume-86cf778db-mhg7b
root@btx-0:/# k get po -l application=cinder,component=volume
NAME                            READY   STATUS    RESTARTS   AGE
cinder-volume-86cf778db-2469r   1/1     Running   0          7m33s
cinder-volume-86cf778db-pxf5v   1/1     Running   0          28s

If cinder-volume is a statefulset (i.e. if the default storage is not netapp):

root@btx-0:/# k get po -l application=cinder,component=volume
NAME              READY   STATUS    RESTARTS   AGE
cinder-volume-0   1/1     Running   0          5m20s
cinder-volume-1   1/1     Running   0          5m20s

Upgrade horizon (22.1.0 -> 23.1.1.dev14).:

$ ./scripts/burrito.sh install horizon

Check if horizon pods are running and ready.:

root@btx-0:/# k get po -l application=horizon,component=server
NAME                      READY   STATUS    RESTARTS   AGE
horizon-bfdcc7bd6-4n655   1/1     Running   0          84s
horizon-bfdcc7bd6-w9wqh   0/1     Running   0          84s

Sometimes one of the horizon pods does not become ready. Delete it and its replacement will become ready.:

root@btx-0:/# k delete po horizon-bfdcc7bd6-w9wqh
root@btx-0:/# k get po -l application=horizon,component=server
NAME                      READY   STATUS    RESTARTS   AGE
horizon-bfdcc7bd6-4n655   1/1     Running   0          5m59s
horizon-bfdcc7bd6-pg589   1/1     Running   0          76s

Last but not least, upgrade btx (1.2.3 -> 2.0.1).:

$ ./scripts/burrito.sh uninstall btx
$ ./scripts/burrito.sh install btx

Check if btx is running.:

$ kubectl get po btx-0 -n openstack
NAME    READY   STATUS    RESTARTS   AGE
btx-0   1/1     Running   0          79s

Go to btx shell and check openstack services.:

$ bts
root@btx-0:/# o compute service list
root@btx-0:/# o volume service list
root@btx-0:/# o network agent list

If everything is okay, start the previously stopped VM instances.:

root@btx-0:/# o server start <VM_NAME> [<VM_NAME> ...]

OpenStack upgrade is done!