Good old forgotten commit from 8 months ago.

# Description
Very much what the title says.
0. Search.
1. Create a Proxmox VM and install an OS on it.
2. Install the cluster thingies on the VM.
3. Back up the cluster / master node.
4. Stop the old master node.
5. Restore the cluster on the new master node.
6. Update the new master node's IP to use the old master node's IP.
7. Rejoin all nodes to the "new cluster".
# Notes
## Possible issues?
- The master node name might cause some discrepancies; will need to test.
- Once the cluster is restored on the new master node, remember to grant that node client access on the NFS server (see the sketch below).
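A minimal sketch of what that NFS grant could look like, assuming a Linux NFS server, a hypothetical export path `/srv/nfs/k8s`, and the new master ending up on 192.168.1.9 after the IP swap (all of these are assumptions, not values taken from this cluster):
```shell
# On the NFS server: allow the new master (hypothetical path and IP) as a client.
echo '/srv/nfs/k8s 192.168.1.9(rw,sync,no_subtree_check)' | sudo tee -a /etc/exports
sudo exportfs -ra   # re-export without restarting the NFS server
```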
## Virtual Master Hardware
- 2 CPU Cores
- 8 GB of RAM
# Procedure
- [x] VM created
- [x] OS (Debian) installed
- [x] Edit the cluster-setup Ansible installer script so it can stop after installing the necessary packages/stuff, without proceeding further.
- [x] Install the guest agent on all the VMs (I kinda forgot about that; see the sketch after this list).
- [x] Back up the VM.
- [x] Follow the guide linked below.
- [ ] Perform another backup of the control plane VM.
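For the guest-agent item above, a minimal sketch for a Debian guest on Proxmox (it assumes the "QEMU Guest Agent" option is also enabled in the VM's options in the Proxmox UI):
```shell
# Inside each Debian VM: install and start the QEMU guest agent.
sudo apt-get update
sudo apt-get install -y qemu-guest-agent
sudo systemctl enable --now qemu-guest-agent
```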
# Links
I'm going to be following this:
https://serverfault.com/questions/1031093/migration-of-kubernetes-master-node-from-1-server-to-another-server
Also relevant, the official etcd backup/restore docs:
https://kubernetes.io/docs/tasks/administer-cluster/configure-upgrade-etcd/
# Verify your etcd data directory
SSH into the current master node (pi4).
```shell
kubectl get pods -n kube-system etcd-pi4.filter.home -oyaml | less
```
```yaml
...
    volumeMounts:
    - mountPath: /var/lib/etcd
      name: etcd-data
    - mountPath: /etc/kubernetes/pki/etcd
      name: etcd-certs
...
  volumes:
  - hostPath:
      path: /etc/kubernetes/pki/etcd
      type: DirectoryOrCreate
    name: etcd-certs
  - hostPath:
      path: /var/lib/etcd
      type: DirectoryOrCreate
    name: etcd-data
```
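The same host paths can be double-checked directly on the node:
```shell
# Confirm the etcd data and certificate directories exist on the host.
sudo ls -l /var/lib/etcd /etc/kubernetes/pki/etcd
```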
# Copy from old_master to new_master
> Why **bakup** instead of ba**ck**up? Because I want to use the K as Kubernetes.
## On new_master
```shell
mkdir bakup
```
## on OLD_master
```shell
sudo scp -r /etc/kubernetes/pki master2@192.168.1.173:~/bakup/
```
```console
healthcheck-client.key 100% 1679 577.0KB/s 00:00
server.crt 100% 1216 1.1MB/s 00:00
server.key 100% 1679 1.1MB/s 00:00
peer.crt 100% 1216 440.5KB/s 00:00
ca.crt 100% 1094 461.5KB/s 00:00
healthcheck-client.crt 100% 1159 417.8KB/s 00:00
ca.key 100% 1679 630.8KB/s 00:00
peer.key 100% 1679 576.4KB/s 00:00
front-proxy-client.crt 100% 1119 859.7KB/s 00:00
front-proxy-ca.key 100% 1679 672.4KB/s 00:00
ca.crt 100% 1107 386.8KB/s 00:00
sa.pub 100% 451 180.7KB/s 00:00
front-proxy-client.key 100% 1679 1.4MB/s 00:00
apiserver-etcd-client.key 100% 1675 1.3MB/s 00:00
apiserver.crt 100% 1294 819.1KB/s 00:00
ca.key 100% 1679 1.3MB/s 00:00
sa.key 100% 1679 1.5MB/s 00:00
apiserver-kubelet-client.crt 100% 1164 908.2KB/s 00:00
apiserver-kubelet-client.key 100% 1679 1.2MB/s 00:00
apiserver-etcd-client.crt 100% 1155 927.9KB/s 00:00
apiserver.key 100% 1675 1.4MB/s 00:00
front-proxy-ca.crt 100% 1123 939.7KB/s 00:00
```
## Remove "OLD" certs from the backup created
### on new_master
```shell
rm -v ~/bakup/pki/{apiserver.*,etcd/peer.*}
```
```console
removed '~/bakup/pki/apiserver.crt'
removed '~/bakup/pki/apiserver.key'
removed '~/bakup/pki/etcd/peer.crt'
removed '~/bakup/pki/etcd/peer.key'
```
## Copy the backed-up PKI into the Kubernetes directory (new_master)
```shell
cp -rv ~/bakup/pki /etc/kubernetes/
```
```console
'~/bakup/pki' -> '/etc/kubernetes/pki'
'~/bakup/pki/etcd' -> '/etc/kubernetes/pki/etcd'
'~/bakup/pki/etcd/healthcheck-client.key' -> '/etc/kubernetes/pki/etcd/healthcheck-client.key'
'~/bakup/pki/etcd/server.crt' -> '/etc/kubernetes/pki/etcd/server.crt'
'~/bakup/pki/etcd/server.key' -> '/etc/kubernetes/pki/etcd/server.key'
'~/bakup/pki/etcd/ca.crt' -> '/etc/kubernetes/pki/etcd/ca.crt'
'~/bakup/pki/etcd/healthcheck-client.crt' -> '/etc/kubernetes/pki/etcd/healthcheck-client.crt'
'~/bakup/pki/etcd/ca.key' -> '/etc/kubernetes/pki/etcd/ca.key'
'~/bakup/pki/front-proxy-client.crt' -> '/etc/kubernetes/pki/front-proxy-client.crt'
'~/bakup/pki/front-proxy-ca.key' -> '/etc/kubernetes/pki/front-proxy-ca.key'
'~/bakup/pki/ca.crt' -> '/etc/kubernetes/pki/ca.crt'
'~/bakup/pki/sa.pub' -> '/etc/kubernetes/pki/sa.pub'
'~/bakup/pki/front-proxy-client.key' -> '/etc/kubernetes/pki/front-proxy-client.key'
'~/bakup/pki/apiserver-etcd-client.key' -> '/etc/kubernetes/pki/apiserver-etcd-client.key'
'~/bakup/pki/ca.key' -> '/etc/kubernetes/pki/ca.key'
'~/bakup/pki/sa.key' -> '/etc/kubernetes/pki/sa.key'
'~/bakup/pki/apiserver-kubelet-client.crt' -> '/etc/kubernetes/pki/apiserver-kubelet-client.crt'
'~/bakup/pki/apiserver-kubelet-client.key' -> '/etc/kubernetes/pki/apiserver-kubelet-client.key'
'~/bakup/pki/apiserver-etcd-client.crt' -> '/etc/kubernetes/pki/apiserver-etcd-client.crt'
'~/bakup/pki/front-proxy-ca.crt' -> '/etc/kubernetes/pki/front-proxy-ca.crt'
```
## ETCD snapshot on OLD_master
### From kubectl
Check the etcd API version.
```shell
kubectl exec -it etcd-pi4.filter.home -n kube-system -- etcdctl version
```
```console
etcdctl version: 3.5.10
API version: 3.5
```
### Create snapshot through etcd pod
```shell
kubectl exec -it etcd-pi4.filter.home -n kube-system -- etcdctl \
  --endpoints https://127.0.0.1:2379 \
  --cacert /etc/kubernetes/pki/etcd/ca.crt \
  --cert /etc/kubernetes/pki/etcd/server.crt \
  --key /etc/kubernetes/pki/etcd/server.key \
  snapshot save /var/lib/etcd/snapshot1.db
```
```console
{"level":"info","ts":"2024-03-10T04:38:23.909625Z","caller":"snapshot/v3_snapshot.go:65","msg":"created temporary db file","path":"/var/lib/etcd/snapshot1.db.part"}
{"level":"info","ts":"2024-03-10T04:38:23.942816Z","logger":"client","caller":"v3@v3.5.10/maintenance.go:212","msg":"opened snapshot stream; downloading"}
{"level":"info","ts":"2024-03-10T04:38:23.942946Z","caller":"snapshot/v3_snapshot.go:73","msg":"fetching snapshot","endpoint":"https://127.0.0.1:2379"}
{"level":"info","ts":"2024-03-10T04:38:24.830242Z","logger":"client","caller":"v3@v3.5.10/maintenance.go:220","msg":"completed snapshot read; closing"}
{"level":"info","ts":"2024-03-10T04:38:25.395294Z","caller":"snapshot/v3_snapshot.go:88","msg":"fetched snapshot","endpoint":"https://127.0.0.1:2379","size":"19 MB","took":"1 second ago"}
{"level":"info","ts":"2024-03-10T04:38:25.395687Z","caller":"snapshot/v3_snapshot.go:97","msg":"saved","path":"/var/lib/etcd/snapshot1.db"}
Snapshot saved at /var/lib/etcd/snapshot1.db
```
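Optionally, the snapshot can be sanity-checked before transferring it (etcdctl 3.5 still supports `snapshot status`, though it points you towards `etcdutl` nowadays):
```shell
# Show revision, hash, total keys and size of the snapshot file.
kubectl exec -it etcd-pi4.filter.home -n kube-system -- etcdctl \
  --write-out=table snapshot status /var/lib/etcd/snapshot1.db
```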
### Transfer snapshot to the new_master node
### on the OLD_master
```shell
sudo scp /var/lib/etcd/snapshot1.db master2@192.168.1.173:~/bakup
```
```text
snapshot1.db 100% 19MB 44.0MB/s 00:00
```
### Update kubeadm.config
### on the OLD_master
```shell
kubectl get cm -n kube-system kubeadm-config -oyaml
```
```yaml
apiVersion: v1
data:
  ClusterConfiguration: |
    apiServer:
      extraArgs:
        authorization-mode: Node,RBAC
      timeoutForControlPlane: 4m0s
    apiVersion: kubeadm.k8s.io/v1beta3
    certificatesDir: /etc/kubernetes/pki
    clusterName: kubernetes
    controllerManager: {}
    dns: {}
    etcd:
      local:
        dataDir: /var/lib/etcd
    imageRepository: registry.k8s.io
    kind: ClusterConfiguration
    kubernetesVersion: v1.28.7
    networking:
      dnsDomain: cluster.local
      serviceSubnet: 10.96.0.0/12
    scheduler: {}
kind: ConfigMap
metadata:
  creationTimestamp: "2024-02-22T21:45:42Z"
  name: kubeadm-config
  namespace: kube-system
  resourceVersion: "234"
  uid: c56b87b1-691d-4277-b66c-ab6035cead6a
```
### on the new_master
#### Create kubeadm-config.yaml
```shell
touch kubeadm-config.yaml
```
I used the information from the previously displayed ConfigMap to create the following file (basically filling in the default kubeadm config file).
Note that the token used differs.
```yaml
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.abcdef0123456789
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.1.9
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/containerd/containerd.sock
  imagePullPolicy: IfNotPresent
  name: masterk
  taints: null
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.k8s.io
kind: ClusterConfiguration
kubernetesVersion: 1.29.0
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
scheduler: {}
```
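For reference, kubeadm can print the defaults this file is based on, which makes it easy to compare against the hand-written config:
```shell
# Print the default InitConfiguration/ClusterConfiguration kubeadm would use.
kubeadm config print init-defaults
```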
### Install etcdctl
https://github.com/etcd-io/etcd/releases/tag/v3.5.12
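A sketch of fetching `etcdctl` from that release page, following the project's install snippet; the version and architecture are assumptions, and the extraction directory matches the `/tmp/etcd-download-test` path used below:
```shell
# amd64 assumed; pick the matching archive for other architectures.
ETCD_VER=v3.5.12
curl -L "https://github.com/etcd-io/etcd/releases/download/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz" \
  -o "/tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz"
mkdir -p /tmp/etcd-download-test
tar xzvf "/tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz" -C /tmp/etcd-download-test --strip-components=1
/tmp/etcd-download-test/etcdctl version
```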
### Restore from snapshot into new_master
This time I will be using the `etcdctl` CLI tool directly.
```shell
mkdir /var/lib/etcd
```
```shell
ETCDCTL_API=3 /tmp/etcd-download-test/etcdctl \
  --endpoints https://127.0.0.1:2379 \
  snapshot restore './bakup/snapshot1.db' \
  && mv ./default.etcd/member/ /var/lib/etcd/
```
```console
Deprecated: Use `etcdutl snapshot restore` instead.
2024-03-10T06:09:17+01:00 info snapshot/v3_snapshot.go:260 restoring snapshot {"path": "./bakup/snapshot1.db", "wal-dir": "default.etcd/member/wal", "data-dir": "default.etcd", "snap-dir": "default.etcd/member/snap"}
2024-03-10T06:09:17+01:00 info membership/store.go:141 Trimming membership information from the backend...
2024-03-10T06:09:18+01:00 info membership/cluster.go:421 added member {"cluster-id": "cdf818194e3a8c32", "local-member-id": "0", "added-peer-id": "8e9e05c52164694d", "added-peer-peer-urls": ["http://localhost:2380"]}
2024-03-10T06:09:18+01:00 info snapshot/v3_snapshot.go:287 restored snapshot {"path": "./bakup/snapshot1.db", "wal-dir": "default.etcd/member/wal", "data-dir": "default.etcd", "snap-dir": "default.etcd/member/snap"}
```
### Do shenanigans to replace the OLD_node with the new_node
I.e., the "swap the IP" maneuvers: shut down the old master and reuse its IP on the new one.
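A rough sketch of that maneuver, assuming the old master's address was 192.168.1.9 (the `advertiseAddress` above), classic Debian networking via `/etc/network/interfaces`, an interface called `ens18`, and a gateway at 192.168.1.1; all of those are assumptions about this particular setup:
```shell
# On the OLD master: shut it down so its IP becomes free.
sudo shutdown -h now

# On the new master: take over the old address (interface name, prefix and
# gateway are assumptions; adjust to the actual network).
sudo tee /etc/network/interfaces.d/ens18 >/dev/null <<'EOF'
auto ens18
iface ens18 inet static
    address 192.168.1.9/24
    gateway 192.168.1.1
EOF
sudo systemctl restart networking
```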
### Start new node
```shell
kubeadm init --ignore-preflight-errors=DirAvailable--var-lib-etcd --config kubeadm-config.yaml
```
```console
[init] Using Kubernetes version: v1.29.0
[preflight] Running pre-flight checks
[WARNING DirAvailable--var-lib-etcd]: /var/lib/etcd is not empty
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
W0310 06:42:10.268972 1600 checks.go:835] detected that the sandbox image "registry.k8s.io/pause:3.6" of the container runtime is inconsistent with that used by kubeadm. It is recommended that using "registry.k8s.io/pause:3.9" as the CRI sandbox image.
[certs] Using certificateDir folder "/etc/kubernetes/pki"
```
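Once init finishes, the usual kubeadm post-init step applies so kubectl on the new master talks to the new control plane:
```shell
# Standard kubeadm post-init kubeconfig setup.
mkdir -p "$HOME/.kube"
sudo cp -i /etc/kubernetes/admin.conf "$HOME/.kube/config"
sudo chown "$(id -u):$(id -g)" "$HOME/.kube/config"
```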
## Join "old nodes" into the "new masterk"
To my surprise, I didn't need to rejoin the nodes; I only had to remove the old control plane.
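Had rejoining been necessary, the standard kubeadm flow would have been roughly the following (token and hash below are placeholders):
```shell
# On the new control plane: generate a fresh join command.
kubeadm token create --print-join-command

# On each worker that needs rejoining: reset its old state, then run the printed command.
sudo kubeadm reset -f
sudo kubeadm join 192.168.1.9:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>
```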
```shell
kubectl get nodes
```
```console
NAME STATUS ROLES AGE VERSION
masterk.filter.home Ready control-plane 4m59s v1.29.2
pi4.filter.home NotReady control-plane 16d v1.29.2
slave01.filter.home Ready <none> 10d v1.29.2
slave02.filter.home Ready <none> 16d v1.29.2
slave03.filter.home Ready <none> 16d v1.29.2
slave04.filter.home Ready <none> 16d v1.29.2
```
```shell
kubectl delete node pi4.filter.home
```
```console
node "pi4.filter.home" deleted
```
```shell
kubectl get nodes
```
```console
NAME STATUS ROLES AGE VERSION
masterk.filter.home Ready control-plane 5m20s v1.29.2
slave01.filter.home Ready <none> 10d v1.29.2
slave02.filter.home Ready <none> 16d v1.29.2
slave03.filter.home Ready <none> 16d v1.29.2
slave04.filter.home Ready <none> 16d v1.29.2
```
So, pretty much done. Since I didn't need to rejoin anything, I will be paying extra attention to the nodes for a while.
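A couple of plain kubectl commands I'll be leaning on to keep an eye on things:
```shell
# Watch node status live, and list any pods that aren't Running or Succeeded.
kubectl get nodes -w
kubectl get pods -A --field-selector=status.phase!=Running,status.phase!=Succeeded
```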