
Posts

Docker Container Size Quota within Kubernetes

Disclaimer: I have published this post on my work blog https://reece.tech previously. Intro: We are running an on-premise Kubernetes cluster on Red Hat Linux 7.5 (in VMware). The /var/lib/docker file-system is a separate partition, formatted with ext4, and we used overlay as the storage driver for Docker, which was recommended for earlier RHEL 7 releases. What happened: One fine day, one of our containers started creating core dumps - about 1 GB per minute worth - causing /var/lib/docker (100 GB in size) to fill up in less than 90 minutes. Existing pods crashed, and new pods could not pull their image or start up. We deleted the existing pods on one of the Kubernetes worker nodes manually, however the container in question migrated to a different worker and continued its mission. Investigation: We believed there was a 10 GB size limit for each running container by default, however this did not seem to be the case. After consulting the relevant d...
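As a quick illustration of the kind of check involved, here is a minimal sketch for inspecting the Docker storage driver and its per-container size option. This is not from the original post: the daemon.json snippet is a hypothetical example, and with overlay2 the size storage option generally requires an xfs backing filesystem mounted with pquota, so it would not apply directly to the ext4 setup described above.

    # Show the active storage driver and its backing filesystem
    docker info --format '{{.Driver}}'
    docker info | grep -i 'backing filesystem'

    # Hypothetical daemon.json to cap each container's writable layer at 10 GB;
    # for overlay2 this only works when /var/lib/docker is xfs mounted with pquota
    cat /etc/docker/daemon.json
    # {
    #   "storage-driver": "overlay2",
    #   "storage-opts": ["overlay2.size=10G"]
    # }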

How to check if the Kubernetes control plane is healthy

Disclaimer: I have published this post on my work blog https://reece.tech previously. Why is this important: We are running an on-premise Kubernetes cluster (currently version 1.11.6) on Red Hat Linux 7.5 (in VMware). Most documentation (especially when it comes to master version upgrades) mentions checking that the control plane is healthy prior to performing any cluster changes. Obviously this is an important step to ensure consistency and repeatability - and it is also important during day-to-day management of your cluster - but how exactly do we do this? Our approach: Our (multi-master) Kubernetes control plane consists of a few different services / parts such as etcd, kube-apiserver, scheduler, controller-manager and so on. Each component should be verified during this process. Starting simple: Run kubectl get nodes -o wide to ensure all nodes are Ready. Also check that the master servers have the master role. Also running kubectl get cs wi...
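A minimal sketch of the "starting simple" checks mentioned above, plus one extra probe that is not from the original excerpt: querying the apiserver health endpoint directly. The port and path assume a fairly standard setup and may differ per cluster.

    # Node and component status as described in the post
    kubectl get nodes -o wide
    kubectl get cs

    # Additional check (assumption, run from a master node):
    # the apiserver health endpoint should return "ok"
    curl -k https://localhost:6443/healthz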

Upgrading Kubernetes to 1.16 and decommissioned API versions

Disclaimer: I have published this post on my work blog https://reece.tech previously. Overview: I like to upgrade our Kubernetes clusters quite frequently. Recently I started the upgrade journey to 1.16. Some upgrades are rather uneventful and completed within a few minutes (we run 5 master nodes per cluster), however this particular upgrade was different. Preparation: The biggest change in 1.16 is that certain (and commonly used) API versions have been removed completely. Yes, there were mentions and deprecation warnings here and there in the past, but now it's for real. For example, you will not be able to create or upgrade deployments or daemonsets created with the extensions/v1beta1 API version without changing your resource manifests. We upgraded the API versions of Kubernetes internal services like Grafana, Prometheus, dashboards and our logging services prior to upgrading our clusters to 1.16. API version changes: Here is a list of all changes (removed APIs in Kube...
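A small sketch of how one might find manifests still using the removed API versions before upgrading. The directory path is only an example, and the apps/v1 note reflects the general migration path rather than the full list from the post.

    # Find manifests that still reference the removed API group/version
    # (./manifests/ is a placeholder for wherever your YAML lives)
    grep -rl 'extensions/v1beta1' ./manifests/

    # Check which API versions the cluster currently serves
    kubectl api-versions | grep -E 'extensions|apps'

    # Deployments and daemonsets move to apps/v1 before the 1.16 upgrade,
    # e.g. change "apiVersion: extensions/v1beta1" to "apiVersion: apps/v1"
    # and add the now-required spec.selector.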

Hosting CentOS7 and CentOS8 yum repositories in AWS S3

Disclaimer: I have published this post on my work blog https://reece.tech previously. Overview: We are utilising compute instances in different cloud environments as well as traditional data centres. On-premise virtual machines usually run RHEL 7/8 and CentOS 7/8. Scope: This post explains how to create and host your own yum repositories in an S3 bucket and how to maintain secure, consistent and reliable server builds. This method also allows for a controlled package version and patch level life-cycle across environments. The problem: Using externally hosted yum repositories or mirrors is very convenient and easy for end users installing and updating a single workstation, however it is not the best option in an enterprise environment where many new identical virtual machines could be built every day in an automated fashion. Issues: The main problems with publicly hosted repositories are: Security (who has access to the mirror or DNS and can alter packages?); Consistency (package...
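As a rough sketch of the overall approach (not taken verbatim from the post), the steps below build repository metadata locally, sync it to S3, and point clients at the bucket. The directory, bucket name and baseurl are placeholders; the actual endpoint depends on region and on whether the bucket is public or fronted by a proxy.

    # Build repository metadata for a directory of RPMs and upload it
    createrepo /srv/repos/centos7/
    aws s3 sync /srv/repos/centos7/ s3://example-yum-bucket/centos7/

    # Example client-side .repo definition pointing at the bucket
    cat /etc/yum.repos.d/internal.repo
    # [internal-centos7]
    # name=Internal CentOS 7 repo
    # baseurl=https://example-yum-bucket.s3.amazonaws.com/centos7/
    # enabled=1
    # gpgcheck=1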

Manual Kubernetes TLS certificate renewal procedure

Intro: Kubernetes utilizes TLS certificates to secure different levels of internal and external cluster communication. This includes internal services like the apiserver, kubelet, scheduler and controller-manager. These TLS certificates are created during the initial cluster installation and are usually valid for 12 months. The cluster internal certificate authority (CA) certificate is valid for ten years. There are options available to automate certificate renewals, but they are not always utilised, and these certs can become out of date. Updating certain certificates may require restarts of K8s components, which may not be fully automated either. If any of these certificates are outdated or expired, parts or all of your cluster will stop functioning correctly. Obviously this scenario should be avoided - especially in production environments. This blog entry focuses on manual renewal / re-creation of Kubernetes certificates. For example, the api-server certificate below...
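A minimal sketch for checking certificate expiry before any renewal work, assuming a kubeadm-style layout under /etc/kubernetes/pki (adjust paths for your cluster).

    # Print the expiry date of the apiserver certificate
    openssl x509 -noout -enddate -in /etc/kubernetes/pki/apiserver.crt

    # On kubeadm-built clusters (1.15+) there is also a built-in overview
    kubeadm alpha certs check-expiration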