
Manual Kubernetes TLS certificate renewal procedure

Intro

Kubernetes uses TLS certificates to secure internal and external cluster communication at different levels. This includes internal components such as the apiserver, kubelet, scheduler and controller-manager.

These TLS certificates are created during the initial cluster installation and are usually valid for 12 months. The cluster internal certificate authority (CA) certificate is valid for ten years.

There are options available to automate certificate renewals, but they are not always utilised and these certs can become out of date. Updating certain certificates may require restarts of K8s components, which may not be fully automated either.

If any of these certificates is outdated or expired, parts or all of your cluster will stop functioning correctly. Obviously this scenario should be avoided - especially in production environments.

This blog entry focuses on manual renewals / re-creation of Kubernetes certificates.

For example, the api-server certificate below expires in June 2022.

# cat /etc/kubernetes/pki/apiserver.crt | openssl x509 -dates -noout
notBefore=Jun  8 07:09:29 2021 GMT
notAfter=Jun  8 07:09:29 2022 GMT

Scope

Below is a list of Kubernetes (v1.21) internal files (on each master node) which include SSL / TLS certificates.

/etc/kubernetes/admin.conf
/etc/kubernetes/controller-manager.conf
/etc/kubernetes/kubelet.conf
/etc/kubernetes/pki/apiserver-etcd-client.crt
/etc/kubernetes/pki/apiserver-kubelet-client.crt
/etc/kubernetes/pki/apiserver.crt
/etc/kubernetes/pki/ca.crt
/etc/kubernetes/pki/etcd/ca.crt
/etc/kubernetes/pki/etcd/healthcheck-client.crt
/etc/kubernetes/pki/etcd/peer.crt
/etc/kubernetes/pki/etcd/server.crt
/etc/kubernetes/pki/front-proxy-ca.crt
/etc/kubernetes/pki/front-proxy-client.crt
/etc/kubernetes/scheduler.conf
There are also some certificates on each worker node, mainly used by kubelet.

/etc/kubernetes/kubelet.conf
/etc/kubernetes/pki/ca.crt
/var/lib/kubelet/pki/kubelet.crt
/var/lib/kubelet/pki/kubelet-client-current.pem
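
To get a quick overview of the expiry dates of the certificate files themselves, a small shell loop can print the notAfter date of each one (a minimal sketch; adjust the file list to your environment, and note that the .conf kubeconfig files embed their certificates base64-encoded and are not covered here):

for CERT in /etc/kubernetes/pki/*.crt /etc/kubernetes/pki/etcd/*.crt /var/lib/kubelet/pki/kubelet.crt; do
  echo -n "$CERT: "
  openssl x509 -enddate -noout -in "$CERT"
done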

Status Check

Kubernetes conveniently offers kubeadm command line options to verify certificate expiration. As you can see below, all certificates are still valid for almost a year in this cluster.

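On a v1.21 cluster the check looks like this (output abridged and dates illustrative; on older releases the same functionality lived under kubeadm alpha certs):

# kubeadm certs check-expiration
CERTIFICATE                EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   EXTERNALLY MANAGED
admin.conf                 Jun 08, 2022 07:09 UTC   364d                                    no
apiserver                  Jun 08, 2022 07:09 UTC   364d            ca                      no
apiserver-etcd-client      Jun 08, 2022 07:09 UTC   364d            etcd-ca                 no
apiserver-kubelet-client   Jun 08, 2022 07:09 UTC   364d            ca                      no
front-proxy-client         Jun 08, 2022 07:09 UTC   364d            front-proxy-ca          no
[...]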
Alternatively, you can use openssl to check the expiry time of the certificate actually served at the api-server endpoint (this is also a handy way to verify that the api-server has been restarted since the certificate was renewed):

# echo | openssl s_client -showcerts -connect 127.0.0.1:6443 -servername api 2>/dev/null | openssl x509 -noout -enddate
notAfter=Apr  5 19:17:16 2021 GMT

Manual Renewal Process

It is best practice to back up the /etc/kubernetes/pki folder on each master before renewing certificates.
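
For example (a minimal sketch; the backup location is just an example), copy the pki directory and the kubeconfig files somewhere safe before touching anything:

mkdir -p /root/k8s-cert-backup-$(date +%Y%m%d)
cp -a /etc/kubernetes/pki /etc/kubernetes/*.conf /root/k8s-cert-backup-$(date +%Y%m%d)/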

All Kubernetes certificates can be re-created via kubeadm. Some certificates are specific to the individual master node (they contain its node name), while others are shared by the same service across all master nodes (in multi-master setups).

These commands will update / overwrite the corresponding certificate files as described above:

kubeadm certs renew apiserver-kubelet-client
kubeadm certs renew apiserver
kubeadm certs renew front-proxy-client
kubeadm certs renew apiserver-etcd-client
kubeadm certs renew controller-manager.conf
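
If all certificates on the node are kubeadm-managed, you can also renew everything in one go and verify the result afterwards (shown here for v1.21; older releases used the kubeadm alpha certs prefix):

kubeadm certs renew all
kubeadm certs check-expiration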

Renewing a certificate also requires the corresponding Kubernetes containers to be restarted. In most cases just deleting the pod (such as kubectl delete pod -n kube-system kube-scheduler-master1) or restarting kubelet will cause the containers / pods to be restarted and to read the new certificates.
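
For example (the pod names include the node name, so master1 here is just a placeholder taken from the example above):

kubectl delete pod -n kube-system kube-scheduler-master1
kubectl delete pod -n kube-system kube-controller-manager-master1
systemctl restart kubelet    # alternative: restart kubelet on the node itself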

The kube-apiserver process / pod uses many different TLS certificates, so it should ideally be restarted whenever any of them is renewed.

We have experienced problems with deleting the api-server pods using kubectl delete pod -n kube-system kube-apiserver-master1. The command completes as you would expect: the pod's age resets to 0 seconds and the status temporarily transitions to Pending before returning to Running. However, the docker container does not actually get restarted, because these are static pods in some installations!

Non-restarting / static api-server pods can be identified by the pod name (it includes the node name) and by running docker ps | grep kube-apiserver on the master. If the docker container uptime has not been reset after deleting the pod, the container can be killed via:

docker rm -f `docker ps | grep k8s_kube-apiserver | cut -d" " -f1`
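
An alternative way to force a restart of a static pod is to temporarily move its manifest out of the kubelet manifest directory and back again; kubelet then tears down and recreates the container. A sketch, assuming the default kubeadm manifest path:

mv /etc/kubernetes/manifests/kube-apiserver.yaml /tmp/
sleep 20    # give kubelet time to stop the old container
mv /tmp/kube-apiserver.yaml /etc/kubernetes/manifests/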

We have run into this problem in the past: the Kubernetes API server (and also the scheduler and controller-manager) failed to communicate with the metrics-server, so kubectl top and the HPA stopped working. The processes were just logging Unable to authenticate the request due to an error: x509: certificate has expired or is not yet valid. The root cause was an expired front-proxy-client certificate - it had recently been renewed, but the kube-apiserver containers were never explicitly restarted, so they kept using the old certificate.

Other Certs

It is also quite common to have TLS certificates stored inside the cluster itself, typically in Secrets (and sometimes ConfigMaps) consumed by ingress controllers and other workloads. These also need to be renewed in a timely manner.
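
To check the expiry of such an in-cluster certificate, you can decode it directly with kubectl and openssl. A sketch, assuming a hypothetical TLS secret named my-ingress-tls in the default namespace:

kubectl get secret my-ingress-tls -o jsonpath='{.data.tls\.crt}' | base64 -d | openssl x509 -enddate -noout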

Conclusion

Kubernetes is quite complex and there are many moving parts which need to be maintained. Even though I have operated production Kubernetes clusters for years, I am still constantly learning and experiencing interesting challenges in this ever-changing space.

