Our company's Kubernetes cluster certificates are about to expire, and the production cluster has now been running continuously for over a year, so many of its components have grown somewhat dated as upstream releases moved on. The official guidance is to keep a running Kubernetes cluster as close to the current release as possible, to avoid serious vulnerabilities and compatibility problems. So we decided to upgrade the cluster and, since an upgrade renews the certificates automatically, refresh the working certificate chain at the same time.
Basic Workflow
- Upgrade the first control plane node
- Upgrade the remaining control plane nodes
- Upgrade the worker nodes
Prerequisites
- You need a Kubernetes cluster created by kubeadm.
- Swap must be disabled.
- The cluster should use static control plane and etcd Pods, or an external etcd.
- Upgrades can only step through consecutive minor versions (e.g. v1.17 → v1.18 → v1.19; skipping a minor is not supported), so read the release notes carefully.
- Back up all important components beforehand (a minimal sketch follows this list).
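On the backup point specifically: the exact procedure depends on your environment, but for a kubeadm cluster with stacked etcd a minimal sketch looks like the following. The `/backup` paths are placeholders, and the certificate paths are kubeadm's defaults; adjust both to your setup.

```bash
# Snapshot etcd through a local member, using kubeadm's default
# certificate locations (adjust if your layout differs).
ETCDCTL_API=3 etcdctl snapshot save /backup/etcd-snapshot.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key

# Archive the certificates, kubeconfigs and static Pod manifests.
tar czf /backup/etc-kubernetes.tar.gz /etc/kubernetes
```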
Upgrade Steps
The following is a record of the procedure as run in a test environment.
First, confirm the kubeadm version currently in use:
[root@test01 ~]# kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.6", GitCommit:"dff82dc0de47299ab66c83c626e08b245ab19037", GitTreeState:"clean", BuildDate:"2020-07-15T16:56:34Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
Also confirm the kubectl version:
[root@test01 ~]# kubectl version
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.6", GitCommit:"dff82dc0de47299ab66c83c626e08b245ab19037", GitTreeState:"clean", BuildDate:"2020-07-15T16:58:53Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.6", GitCommit:"dff82dc0de47299ab66c83c626e08b245ab19037", GitTreeState:"clean", BuildDate:"2020-07-15T16:51:04Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
Check the status and versions of the cluster nodes:
[root@test01 ~]# kubectl get nodes
NAME            STATUS   ROLES    AGE   VERSION
test01.anyran   Ready    master   61d   v1.18.6
test02.anyran   Ready    master   61d   v1.18.6
test03.anyran   Ready    master   61d   v1.18.6
The test environment here is on v1.18.6. On the first control plane node, list the kubeadm versions available from the repository:
yum list --showduplicates kubeadm --disableexcludes=kubernetes
Running the command above should produce output like the following (excerpt):
Installed Packages
kubeadm.x86_64    1.18.6-0    @kubernetes
Available Packages
....................
kubeadm.x86_64    1.18.0-0    kubernetes
kubeadm.x86_64    1.18.1-0    kubernetes
kubeadm.x86_64    1.18.2-0    kubernetes
kubeadm.x86_64    1.18.3-0    kubernetes
kubeadm.x86_64    1.18.4-0    kubernetes
kubeadm.x86_64    1.18.4-1    kubernetes
kubeadm.x86_64    1.18.5-0    kubernetes
kubeadm.x86_64    1.18.6-0    kubernetes
kubeadm.x86_64    1.18.8-0    kubernetes
kubeadm.x86_64    1.18.9-0    kubernetes
kubeadm.x86_64    1.19.0-0    kubernetes
kubeadm.x86_64    1.19.1-0    kubernetes
kubeadm.x86_64    1.19.2-0    kubernetes
This shows both the version currently running on the node and the latest versions available. Here we pick the latest version as the upgrade target for this test.
Install the selected version of kubeadm:
[root@test01 ~]# yum install kubeadm-1.19.2-0
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * base: mirrors.bfsu.edu.cn
 * epel: hk.mirrors.thegigabit.com
 * extras: mirror.bit.edu.cn
 * updates: mirror.bit.edu.cn
Resolving Dependencies
--> Running transaction check
---> Package kubeadm.x86_64 0:1.18.6-0 will be updated
---> Package kubeadm.x86_64 0:1.19.2-0 will be an update
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
 Package          Arch           Version            Repository           Size
================================================================================
Updating:
 kubeadm          x86_64         1.19.2-0           kubernetes          8.3 M

Transaction Summary
================================================================================
Upgrade  1 Package

Total download size: 8.3 M
Is this ok [y/d/N]: y
Run a version check to confirm kubeadm has been updated to the desired version:
[root@test01 ~]# kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.2", GitCommit:"f5743093fd1c663cb0cbc89748f730662345d44d", GitTreeState:"clean", BuildDate:"2020-09-16T13:38:53Z", GoVersion:"go1.15", Compiler:"gc", Platform:"linux/amd64"}
With the version confirmed, drain the Pods from the current node and cordon it so nothing new gets scheduled onto it:
[root@test01 ~]# kubectl drain test01.anyran --ignore-daemonsets
node/test01.anyran already cordoned
WARNING: ignoring DaemonSet-managed Pods: ingress-nginx/ingress-nginx-controller-g5j22, kube-system/calico-node-hn5xh, kube-system/kube-proxy-cbddp
evicting pod default/nginx-5449d4cdc5-c2nkx
evicting pod kube-system/calico-kube-controllers-578894d4cd-npb2z
evicting pod kube-system/coredns-66bff467f8-2fd9v
evicting pod kube-system/coredns-66bff467f8-kdw8r
pod/calico-kube-controllers-578894d4cd-npb2z evicted
pod/nginx-5449d4cdc5-c2nkx evicted
pod/coredns-66bff467f8-kdw8r evicted
pod/coredns-66bff467f8-2fd9v evicted
node/test01.anyran evicted
Run the upgrade plan check:
[root@test01 ~]# kubeadm upgrade plan
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks.
[upgrade] Running cluster health checks
[upgrade] Fetching available versions to upgrade to
[upgrade/versions] Cluster version: v1.18.6
[upgrade/versions] kubeadm version: v1.19.2
[upgrade/versions] Latest stable version: v1.19.2
[upgrade/versions] Latest stable version: v1.19.2
[upgrade/versions] Latest version in the v1.18 series: v1.18.9
[upgrade/versions] Latest version in the v1.18 series: v1.18.9
The above is an excerpt; in general the plan lists several versions you can upgrade to, including patch releases within the current minor version and the next minor version. Apply the upgrade to the desired version:
[root@test01 ~]# kubeadm upgrade apply v1.19.2
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks.
[upgrade] Running cluster health checks
[upgrade/version] You have chosen to change the cluster version to "v1.19.2"
[upgrade/versions] Cluster version: v1.18.6
[upgrade/versions] kubeadm version: v1.19.2
[upgrade/confirm] Are you sure you want to proceed with the upgrade? [y/N]: y
kubeadm backs up the configuration files automatically during the upgrade, but it is still strongly recommended to manually back up important application and cluster configuration beforehand (see the sketch under Prerequisites). Once everything checks out, let the upgrade run; how long it takes depends mostly on how quickly the images are pulled. The following output marks the completion of the control plane upgrade on this node:
[upgrade/successful] SUCCESS! Your cluster was upgraded to "v1.19.2". Enjoy!
[upgrade/kubelet] Now that your control plane is upgraded, please proceed with upgrading your kubelets if you haven't already done so.
Manually install the matching versions of kubelet and kubectl:
[root@test01 ~]# yum install kubelet-1.19.2-0 kubectl-1.19.2-0
Once installed, reload systemd and restart the kubelet:
[root@test01 ~]# systemctl daemon-reload
[root@test01 ~]# systemctl restart kubelet
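The original session did not include this, but a quick sanity check after the restart costs nothing: confirm the service is active and peek at its recent log lines if it is not.

```bash
# Supplementary check (not part of the original transcript):
systemctl status kubelet --no-pager
journalctl -u kubelet -n 50 --no-pager
```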
Check the component versions after the upgrade:
[root@test01 ~]# kubectl version
Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.2", GitCommit:"f5743093fd1c663cb0cbc89748f730662345d44d", GitTreeState:"clean", BuildDate:"2020-09-16T13:41:02Z", GoVersion:"go1.15", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.2", GitCommit:"f5743093fd1c663cb0cbc89748f730662345d44d", GitTreeState:"clean", BuildDate:"2020-09-16T13:32:58Z", GoVersion:"go1.15", Compiler:"gc", Platform:"linux/amd64"}
Check the cluster nodes' status and versions after the upgrade:
[root@test01 ~]# kubectl get nodes
NAME            STATUS                     ROLES    AGE   VERSION
test01.anyran   Ready,SchedulingDisabled   master   61d   v1.19.2
test02.anyran   Ready                      master   61d   v1.18.6
test03.anyran   Ready                      master   61d   v1.18.6
Uncordon the node to allow scheduling again:
[root@test01 ~]# kubectl uncordon test01.anyran
node/test01.anyran uncordoned
Node status after the upgrade:
[root@test01 ~]# kubectl get nodes
NAME            STATUS   ROLES    AGE   VERSION
test01.anyran   Ready    master   61d   v1.19.2
test02.anyran   Ready    master   61d   v1.18.6
test03.anyran   Ready    master   61d   v1.18.6
Next, upgrade the second control plane node. Note that the command is different here:
[root@test02 ~]# kubeadm upgrade node
[upgrade] Reading configuration from the cluster...
[upgrade] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[upgrade] Upgrading your Static Pod-hosted control plane instance to version "v1.19.2"...
Static pod: kube-apiserver-test02.anyran hash: 007cd47d52fb6b0cb8c8bb6004cf29ef
Static pod: kube-controller-manager-test02.anyran hash: 5f53a41ec77571f272bf23c529b24c44
Static pod: kube-scheduler-test02.anyran hash: f543c94683059cb32a4441e29fbdb238
[upgrade/etcd] Upgrading to TLS for etcd
[upgrade/etcd] Non fatal issue encountered during upgrade: the desired etcd version "3.4.13-0" is not newer than the currently installed "3.4.13-0". Skipping etcd upgrade
[upgrade/staticpods] Writing new Static Pod manifests to "/etc/kubernetes/tmp/kubeadm-upgraded-manifests568562993"
[upgrade/staticpods] Preparing for "kube-apiserver" upgrade
[upgrade/staticpods] Current and new manifests of kube-apiserver are equal, skipping upgrade
[upgrade/staticpods] Preparing for "kube-controller-manager" upgrade
[upgrade/staticpods] Current and new manifests of kube-controller-manager are equal, skipping upgrade
[upgrade/staticpods] Preparing for "kube-scheduler" upgrade
[upgrade/staticpods] Current and new manifests of kube-scheduler are equal, skipping upgrade
[upgrade] The control plane instance for this node was successfully updated!
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[upgrade] The configuration for this node was successfully updated!
[upgrade] Now you should go ahead and upgrade the kubelet package using your package manager.
As with the previous node, once the upgrade finishes you need to manually update and restart the kubelet.
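For reference, these are the same commands used on the first node, now run on test02:

```bash
# Same as on test01: update kubelet/kubectl, then restart kubelet.
yum install kubelet-1.19.2-0 kubectl-1.19.2-0
systemctl daemon-reload
systemctl restart kubelet
```

After the restart, check the cluster node status (the cordons here were applied manually):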
[root@test02 ~]# kubectl get nodes
NAME            STATUS                     ROLES    AGE   VERSION
test01.anyran   Ready                      master   61d   v1.19.2
test02.anyran   Ready,SchedulingDisabled   master   61d   v1.19.2
test03.anyran   Ready,SchedulingDisabled   master   61d   v1.18.6
Upgrade the remaining control plane nodes the same way. In the final state, every node is Ready and running the new version:
[root@test03 ~]# kubectl get nodes
NAME            STATUS   ROLES    AGE   VERSION
test01.anyran   Ready    master   61d   v1.19.2
test02.anyran   Ready    master   61d   v1.19.2
test03.anyran   Ready    master   61d   v1.19.2
Worker nodes are upgraded in much the same way as the second and third control plane nodes; a condensed per-node sketch follows.
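Pulled together from the steps above, the per-worker sequence looks roughly like this; `<worker-node>` and the package versions are placeholders, and on workers `kubeadm upgrade node` only refreshes the local kubelet configuration rather than any control plane components.

```bash
# From a node with admin credentials: evacuate and cordon the worker.
kubectl drain <worker-node> --ignore-daemonsets

# On the worker itself:
yum install kubeadm-1.19.2-0
kubeadm upgrade node          # on workers this updates the kubelet config
yum install kubelet-1.19.2-0 kubectl-1.19.2-0
systemctl daemon-reload
systemctl restart kubelet

# From the admin node again: allow scheduling back onto the worker.
kubectl uncordon <worker-node>
```

Once every node is upgraded, check the cluster's certificate information: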
[root@test01 ~]# kubeadm alpha certs check-expiration
[check-expiration] Reading configuration from the cluster...
[check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'

CERTIFICATE                EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   EXTERNALLY MANAGED
admin.conf                 Sep 27, 2021 05:45 UTC   364d                                    no
apiserver                  Sep 27, 2021 05:45 UTC   364d            ca                      no
apiserver-etcd-client      Sep 27, 2021 05:45 UTC   364d            etcd-ca                 no
apiserver-kubelet-client   Sep 27, 2021 05:45 UTC   364d            ca                      no
controller-manager.conf    Sep 27, 2021 05:45 UTC   364d                                    no
etcd-healthcheck-client    Sep 27, 2021 05:44 UTC   364d            etcd-ca                 no
etcd-peer                  Sep 27, 2021 05:44 UTC   364d            etcd-ca                 no
etcd-server                Sep 27, 2021 05:44 UTC   364d            etcd-ca                 no
front-proxy-client         Sep 27, 2021 05:45 UTC   364d            front-proxy-ca          no
scheduler.conf             Sep 27, 2021 05:45 UTC   364d                                    no

CERTIFICATE AUTHORITY   EXPIRES                  RESIDUAL TIME   EXTERNALLY MANAGED
ca                      Jul 26, 2030 05:32 UTC   9y              no
etcd-ca                 Jul 26, 2030 05:32 UTC   9y              no
front-proxy-ca          Jul 26, 2030 05:32 UTC   9y              no
As you can see, all of the certificates have been renewed (except the CAs), and we can happily carry on.
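As a closing aside (not part of the session above): if the certificates are nearing expiry but you are not ready for a version upgrade, kubeadm can also renew them on its own. On v1.19 this still lives under the alpha subcommand; newer releases promote it to `kubeadm certs renew`.

```bash
# Renew all leaf certificates in place; the control plane components
# must be restarted afterwards to pick up the new certificates.
kubeadm alpha certs renew all
```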