Intro
Kubernetes uses TLS certificates to secure different levels of internal and external cluster communication. This includes internal components such as the apiserver, kubelet, scheduler and controller-manager.
These TLS certificates are created during the initial cluster installation and are usually valid for 12 months. The cluster-internal certificate authority (CA) certificate is valid for ten years.
There are options available to automate certificate renewals, but they are not always used, and these certs can become out of date. Updating certain certificates may also require restarts of Kubernetes components, which may not be fully automated either.
If any of these certificates is outdated or expired, parts or all of your cluster will stop functioning correctly. Obviously this scenario should be avoided, especially in production environments.
This blog entry focuses on manual renewals / re-creation of Kubernetes certificates.
For example, the api-server certificate below expires in June 2022.
# cat /etc/kubernetes/pki/apiserver.crt | openssl x509 -dates -noout
notBefore=Jun 8 07:09:29 2021 GMT
notAfter=Jun 8 07:09:29 2022 GMT
Scope
Below is a list of Kubernetes (v1.21) internal files (on each master node) which include SSL / TLS certificates.
/etc/kubernetes/admin.conf
/etc/kubernetes/controller-manager.conf
/etc/kubernetes/kubelet.conf
/etc/kubernetes/pki/apiserver-etcd-client.crt
/etc/kubernetes/pki/apiserver-kubelet-client.crt
/etc/kubernetes/pki/apiserver.crt
/etc/kubernetes/pki/ca.crt
/etc/kubernetes/pki/etcd/ca.crt
/etc/kubernetes/pki/etcd/healthcheck-client.crt
/etc/kubernetes/pki/etcd/peer.crt
/etc/kubernetes/pki/etcd/server.crt
/etc/kubernetes/pki/front-proxy-ca.crt
/etc/kubernetes/pki/front-proxy-client.crt
/etc/kubernetes/scheduler.conf
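To quickly check the expiry dates of all certificate files under /etc/kubernetes/pki at once, a small shell loop along these lines can be used (a sketch; the kubeconfig files in the list embed their client certificates as base64 and are not covered by it):
for crt in /etc/kubernetes/pki/*.crt /etc/kubernetes/pki/etcd/*.crt; do
  echo -n "$crt: "
  openssl x509 -enddate -noout -in "$crt"
done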
kubelet
/etc/kubernetes/kubelet.conf
/etc/kubernetes/pki/ca.crt
/var/lib/kubelet/pki/kubelet.crt
/var/lib/kubelet/pki/kubelet-client-current.pem
Status Check
Kubernetes conveniently offers kubeadm command line options to verify certificate expiration. In the cluster shown here, this check reported all certificates as still valid for almost a year.
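With v1.21 the check is run on a master node as follows; it lists each certificate kubeadm manages together with its expiry date and residual time:
kubeadm certs check-expiration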
Alternatively you can use openssl to verify the expiry time when connecting to the api-server endpoint (this can also be used to verify that the api-server has been restarted since renewing the certificate):
# echo | openssl s_client -showcerts -connect 127.0.0.1:6443 -servername api 2>/dev/null | openssl x509 -noout -enddate
notAfter=Apr 5 19:17:16 2021 GMT
Manual Renewal Process
It is best practice to back up the /etc/kubernetes/pki folder on each master before renewing certificates.
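For example (a minimal sketch; the backup location is arbitrary):
cp -a /etc/kubernetes/pki /etc/kubernetes/pki.bak-$(date +%F)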
All Kubernetes certificates can be re-created via kubeadm. Some certificates are specific to the individual master node name, and some are shared across all master servers (in multi-master setups).
These commands will update / overwrite the corresponding certificate files as described above:
kubeadm certs renew apiserver-kubelet-client
kubeadm certs renew apiserver
kubeadm certs renew front-proxy-client
kubeadm certs renew apiserver-etcd-client
kubeadm certs renew controller-manager.conf
kubeadm certs renew scheduler.conf
kubeadm certs renew admin.conf
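Alternatively, all kubeadm-managed certificates (including the etcd server / peer / healthcheck-client certificates not listed individually above) can be renewed in one go:
kubeadm certs renew all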
Renewing a certificate also requires the corresponding Kubernetes containers to be restarted. In most cases just deleting the pod (such as kubectl delete pod -n kube-system kube-scheduler-master1) or restarting the kubelet will cause the containers / pods to be restarted and to read the new certificates.
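On a kubeadm-based master the kubelet runs as a systemd service, so the restart is simply:
systemctl restart kubelet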
The kube-apiserver process / pod uses many different TLS certificates, so it should ideally be restarted whenever any certificate changes / gets updated.
We have experienced problems with deleting the api-server pods using kubectl delete pod -n kube-system kube-apiserver-master1. The command completes as you would expect: the pod age resets to 0 seconds and the status temporarily transitions to Pending before returning to Running. However, the docker container does not actually get restarted, as these are static pods in some installations!
Non-restarting / static api-server pods can be identified by the pod name (it includes the node name) and by running docker ps | grep kube-apiserver on the master. If the docker container uptime has not been reset, the container can be killed via:
docker rm -f `docker ps | grep k8s_kube-apiserver | cut -d" " -f1`
The kubelet will then automatically re-create the static pod container, which picks up the renewed certificates.
We have run into this exact problem in the past, when the Kubernetes API server (and also the scheduler and controller-manager) failed to communicate with the metrics-server (kubectl top and the HPA stopped working). The process was just logging Unable to authenticate the request due to an error: x509: certificate has expired or is not yet valid as the error message. The root cause was an expired front-proxy-client certificate: it had recently been renewed on disk, but because the kube-apiserver containers were never explicitly restarted, the running processes were still using the old, expired certificate.
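After renewing the certificate and restarting the api-server, both the certificate file and the container uptime can be verified (assuming a docker-based installation as above):
openssl x509 -enddate -noout -in /etc/kubernetes/pki/front-proxy-client.crt
docker ps --format '{{.RunningFor}}\t{{.Names}}' | grep kube-apiserver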
Other Certs
It is quite common to also have TLS certificates in Kubernetes secrets and config maps, which are used by ingress controllers etc. These also need to be renewed in a timely manner.
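The expiry time of such a certificate can be checked much like the file-based examples above; for a TLS secret, for instance (my-tls-secret is a placeholder name, adjust the namespace to your setup):
kubectl get secret my-tls-secret -o jsonpath='{.data.tls\.crt}' | base64 -d | openssl x509 -enddate -noout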
Conclusion
Kubernetes is quite complex and there are many moving parts which need to be maintained. Even though I have operated production Kubernetes clusters for years, I am still constantly learning and experiencing interesting challenges in this ever-changing space.
More information here:
- https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-certs/
- https://kubernetes.io/docs/setup/best-practices/certificates/
- https://github.com/kelseyhightower/kubernetes-the-hard-way/blob/master/docs/04-certificate-authority.md