1.1 Node maintenance

kube-controller-manager --pod-eviction-timeout=5m0s

So whenever a node goes offline, the master node waits for up to 5 minutes before considering the node dead.
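For reference, this flag is passed to the controller manager binary. Here is a minimal sketch of where it is set, assuming a kubeadm-style cluster where the controller manager runs as a static pod (the manifest path below is the kubeadm default; your setup may differ):

# /etc/kubernetes/manifests/kube-controller-manager.yaml (excerpt)
spec:
  containers:
  - command:
    - kube-controller-manager
    - --pod-eviction-timeout=5m0s   # wait 5 minutes before evicting pods from an unreachable node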

Suppose you have maintenance tasks to perform on a node. If you know the workloads running on the node have other replicas, and it's okay for them to go down for a short period of time, and you're sure the node will come back online within five minutes, you can make a quick upgrade and reboot.

1.1.1 Drain

When you drain a node, the pods on it are gracefully terminated and recreated on other nodes (provided they are managed by a controller such as a ReplicaSet or Deployment). The node is also cordoned, or marked as unschedulable, meaning no new pods can be scheduled on it until you specifically remove the restriction.

kubectl drain node-1

Now that the pods are safe on the other nodes, you can reboot the node. When it comes back online, it is still cordoned.
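In practice, drain often refuses to proceed without extra flags; which ones you need depends on what is running on the node. A sketch of the common cases:

kubectl drain node-1 --ignore-daemonsets           # DaemonSet pods cannot be evicted, so skip them
kubectl drain node-1 --ignore-daemonsets --force   # also delete pods not managed by a controller (they will not be recreated)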

Reference:

https://kubernetes.io/docs/tasks/administer-cluster/safely-drain-node/

1.1.2 Cordon/Uncordon

When the node under maintenance comes back online, you then need to uncordon it so that pods can be scheduled on it again.

kubectl uncordon node-1

Remember, the pods that were moved to the other nodes don't automatically come back. Only if any of those pods were deleted, or if new pods were created in the cluster, would they be scheduled on this node.

Cordon simply marks a node as unschedulable. Unlike drain, it does not terminate or move the pods already running on the node. It only makes sure that no new pods are scheduled there.
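A quick way to see the difference (the node name is an assumption):

kubectl cordon node-2
kubectl get nodes    # node-2 now shows Ready,SchedulingDisabled, but its existing pods keep running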

1.2 Kubernetes Upgrade

1.2.1 Kubernetes Version

To see the current version, run

kubectl get nodes

[Screenshot: output of kubectl get nodes showing each node's version]
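The output looks roughly like this (node names, ages and versions are illustrative):

NAME     STATUS   ROLES    AGE   VERSION
master   Ready    master   20d   v1.11.3
node-1   Ready    <none>   20d   v1.11.3
node-2   Ready    <none>   20d   v1.11.3

The VERSION column shows the kubelet version reported by each node.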

Reference:

https://kubernetes.io/docs/concepts/overview/kubernetes-api/

Here are links to the Kubernetes documentation if you want to learn more about this topic (you don't need them for the exam, though):

https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md

https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api_changes.md

1.2.2 Cluster Upgrade Process

Is it mandatory for all of these Kubernetes components to be at the same version? No.

Since the kube-apiserver is the primary component in the control plane, and it is the component that all other components talk to, none of the other components (except kubectl) should ever be at a version higher than the kube-apiserver.

The following diagram shows the permissible versions of the other components based on the kube-apiserver version.

[Diagram: permissible component versions relative to a kube-apiserver at version X]
  • controller-manager, kube-scheduler: X-1 up to X
  • kubelet, kube-proxy: X-2 up to X
  • kubectl: X-1 up to X+1

At any time, Kubernetes supports only the three most recent minor versions. So when v1.13 is released, only v1.13, v1.12 and v1.11 are supported. If you are running an older version, just before the release of v1.13 would be a good time to upgrade your cluster to the next release.

Say you want to go from v1.10 to v1.13. The recommended approach is to upgrade one minor version at a time: v1.10 to v1.11, then v1.11 to v1.12, and then v1.12 to v1.13.

The upgrade process depends on how your cluster is set up.

  • If your cluster is a managed Kubernetes cluster deployed on a cloud service provider, like Google Kubernetes Engine for instance, the provider lets you upgrade your cluster easily with just a few clicks.
  • If you deployed your cluster using tools like kubeadm, then the tool can help you plan and upgrade the cluster.
  • If you deployed your cluster from scratch, then you manually upgrade the different components of the cluster yourself.

1.2.2.1 Kubeadm Upgrade

Upgrading a cluster involves two major steps. First you upgrade your master nodes, and then upgrade the worker nodes.

1.2.2.1.1 Concept
1.2.2.1.1.1 Update Master Node

While the master is being upgraded, the control plane components such as the api-server, scheduler, and controller manager go down briefly. The master going down does not mean your worker nodes and the applications on the cluster are impacted. All workloads hosted on the worker nodes continue to serve users as normal.

Since the master is down, all management functions are down. You cannot access the cluster using kubectl or the Kubernetes API. You cannot deploy new applications or delete or modify existing ones. The controller managers don't function either: if a pod were to fail, a new pod would not be automatically created.

But as long as the nodes and the pods are up, your applications should be up and users will not be impacted. Once the upgrade is complete and the cluster is back up, it should function normally.

We now have the master and the master components at v1.11, and the worker nodes at v1.10 (X-1). As we saw earlier, this is a supported configuration.

1.2.2.1.1.2 Update Worker Nodes

There are different strategies available to upgrade the worker nodes.

  • One is to upgrade all of them at once. But then your pods are down and users are no longer able to access the applications. Once the upgrade is complete, the nodes are back up, new pods are scheduled, and users can resume access. That strategy requires downtime.
  • The second strategy is to upgrade one node at a time. We first upgrade the first node, while the workloads move to the second and third nodes and users are served from there. Once the first node is upgraded, we update the second node, whose workloads move to the first and third nodes. And finally the third node, until all nodes are upgraded to the newer version (see the sketch after this list).
  • A third strategy is to add new nodes to the cluster, nodes with the newer software version. This is especially convenient if you are in a cloud environment where you can easily provision new nodes and decommission old ones. Move the workloads over to a new node and remove an old node, until you finally have only new nodes with the new software version.
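As a rough sketch, the second strategy boils down to repeating a drain/upgrade/uncordon cycle per node (the node names are assumptions):

kubectl drain node-1 --ignore-daemonsets    # evict workloads and cordon node-1
# ...upgrade node-1's software and reboot it, then:
kubectl uncordon node-1                     # allow scheduling on node-1 again
# repeat for node-2, then node-3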

We then follow the same procedure to upgrade the nodes from 1.11 to 1.12, and then from 1.12 to 1.13.

1.2.2.1.2 Update Practice

So we are at 1.11, and we want to go to 1.13, but remember we can only go one minor version at a time. So we first go to 1.12.

  • First, run kubeadm upgrade plan to get general upgrade information.
  • Upgrade the master node.
  • Upgrade the worker nodes.
1.2.2.1.2.1 kubeadm upgrade plan

Run the kubeadm upgrade plan command and it will give you a lot of good information.

[Screenshot: output of kubeadm upgrade plan]

  • It lists the current cluster version, the kubeadm tool version, and the latest stable version of Kubernetes.
  • It also tells you that after you upgrade the control plane components, you must manually upgrade the kubelet.
  • Then it lists all the control plane components and their versions, and what version they can be upgraded to.
  • Finally, it gives you the command to upgrade the cluster.
1.2.2.1.2.2 Upgrade Master Node

Then upgrade the kubeadm tool itself to version 1.12.

apt-get install -y kubeadm=1.12.0-00

Then upgrade the cluster using the command from the upgrade plan output.

kubeadm upgrade apply v1.12.0

Update the kubelet on the master node

apt-get install -y kubelet=1.12.0-00

systemctl restart kubelet

kubectl get nodes

You will see that the version of the master node has been updated. (Note that the VERSION column in kubectl get nodes reports each node's kubelet version, not the kubectl version.)
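For example, the output would now look roughly like this (node names and ages are illustrative):

NAME     STATUS   ROLES    AGE   VERSION
master   Ready    master   20d   v1.12.0
node-1   Ready    <none>   20d   v1.11.3
node-2   Ready    <none>   20d   v1.11.3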

1.2.2.1.2.3 Upgrade Worker Nodes

Run the following command on the master node,

kubectl drain node-1

Then, on node-1,

apt-get install -y kubeadm=1.12.0-00

kubeadm upgrade node config --kubelet-version v1.12.0

apt-get install -y kubelet=1.12.0-00

systemctl restart kubelet

Run the following command on the master node,

kubectl uncordon node-1

But remember, the pods do not necessarily come right back to this node.

It is only marked as schedulable again. Only when pods are deleted from the other nodes, or when new pods are created in the cluster, do they really come back to this first node.

That will happen soon enough, when we take down the next node. Perform the same steps on each remaining node until all nodes are upgraded.
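Putting the worker steps together, the per-node cycle looks roughly like this shell sketch (the node names, and running the upgrade over ssh, are assumptions about your environment):

for n in node-2 node-3; do
  kubectl drain $n --ignore-daemonsets    # move workloads off and cordon the node
  # ssh to $n and run the kubeadm/kubelet upgrade commands shown above
  kubectl uncordon $n                     # allow scheduling on the node again
done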
