
Docker Container Size Quota within Kubernetes


Disclaimer

I previously published this post on my work blog, https://reece.tech.

Intro

We run an on-premises Kubernetes cluster on Red Hat Enterprise Linux 7.5 (in VMware).

The /var/lib/docker file system is a separate partition, formatted with ext4. We used overlay as the Docker storage driver, which was recommended for earlier RHEL 7 releases.

What happened

One fine day, one of our containers started creating core dumps, about 1 GB per minute, which filled /var/lib/docker (100 GB in size) in less than 90 minutes. Existing pods crashed, and new pods could not pull their images or start up. We manually deleted the existing pods on one of the Kubernetes worker nodes, but the container in question was rescheduled onto a different worker and continued its mission.

Investigation

We believed that each running container had a 10 GB size limit by default, but this was not the case. The relevant documentation made it clear that the overlay storage driver on ext4 does not support container size limits and is no longer the recommended setup. At the time of writing, xfs and overlay2 are recommended; combined with xfs project quotas, they can enforce a size limit per container.

Resolution

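We reformatted the disk with xfs and updated /etc/fstab so that the partition is mounted with quotas enabled (prjquota is the option the per-container limit relies on):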

/dev/sdb /var/lib/docker xfs defaults,quota,prjquota,pquota,gquota 0 0
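Before restarting Docker, it can be worth confirming that the mounted file system really carries the project quota option. The following is a minimal sketch, not part of the original setup: has_project_quota is a hypothetical helper name, and it assumes findmnt (from util-linux) is available on the node.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): succeeds if a
# comma-separated mount option string enables XFS project quotas.
# "pquota" is an alias for "prjquota" on XFS.
has_project_quota() {
    case ",$1," in
        *,prjquota,*|*,pquota,*) return 0 ;;
        *) return 1 ;;
    esac
}

# Read the live mount options of the Docker data directory.
# If findmnt is missing or the path is not a mount point, opts stays empty.
opts=$(findmnt -n -o OPTIONS /var/lib/docker 2>/dev/null || true)

if has_project_quota "$opts"; then
    echo "project quota enabled on /var/lib/docker"
else
    echo "project quota NOT enabled on /var/lib/docker"
fi
```

The same function can be pointed at the option field of an fstab line to catch a typo before the next reboot.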

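Updating /etc/systemd/system/docker.service.d/override.conf. The line below belongs in the [Service] section of the drop-in, after an empty ExecStart= line that clears the command from the packaged unit: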

ExecStart=/usr/bin/dockerd --storage-driver=overlay2 --exec-opt native.cgroupdriver=systemd --log-driver=journald --storage-opt overlay2.override_kernel_check=true --storage-opt overlay2.size=10G

The overlay2.override_kernel_check=true option is required for older (3.10.x) kernels.
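Whether a given node needs the override can be decided from its kernel release string. This is a rough sketch under one assumption: that the overlay2 kernel check passes on kernels with major version 4 or higher, which is how I read the Docker storage driver documentation. kernel_needs_override is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): succeeds if the given
# kernel release string has a major version below 4.
# Assumption: overlay2 only needs the override on pre-4.0 kernels.
kernel_needs_override() {
    major=${1%%.*}          # "3.10.0-862.el7.x86_64" -> "3"
    [ "$major" -lt 4 ]
}

if kernel_needs_override "$(uname -r)"; then
    echo "kernel $(uname -r): overlay2.override_kernel_check=true is needed"
else
    echo "kernel $(uname -r): no override needed"
fi
```

The check looks only at the major version, so it does not distinguish between individual 3.10.x builds.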

Testing the new setup

# docker info | egrep "Backing Filesystem|Storage Driver"
Storage Driver: overlay2
 Backing Filesystem: xfs
 
# mount | grep '/dev/sdb on /var/lib/docker'
/dev/sdb on /var/lib/docker type xfs (rw,relatime,seclabel,attr2,inode64,usrquota,prjquota,grpquota)
 
# inside a container: disk space is limited to 10 GB
root@aaba31936b78:/# dd if=/dev/zero of=out bs=4096k
dd: error writing 'out': No space left on device
2560+0 records in
2559+0 records out
10737352704 bytes (11 GB, 10 GiB) copied, 7.11036 s, 1.5 GB/s
 
# xfs_quota -x -c 'report -h' /var/lib/docker
Project quota on /var/lib/docker (/dev/sdb)
                        Blocks             
Project ID   Used   Soft   Hard Warn/Grace  
---------- ---------------------------------
...
#197          16K    10G    10G  00 [------]
#198           8K    10G    10G  00 [------]
#199        10.0G    10G    10G  00 [------]  # <---- this container uses 10 GB max.
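The quota report can also be filtered to spot containers that are about to hit their limit. The sketch below is an illustration, not part of the original setup. It assumes the non-human-readable report format (usage and limits in 1 KiB blocks, with the columns project ID, used, soft, hard). near_limit and the 95% threshold are hypothetical choices.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): reads xfs_quota project
# report lines on stdin and prints the IDs of projects that use at least
# 95% of their hard limit.
# Assumed columns: ID, used, soft, hard (all in 1 KiB blocks).
near_limit() {
    awk '$1 ~ /^#/ && $4 > 0 && $2 >= $4 * 0.95 { print $1 }'
}

# Intended usage on a worker node (run as root):
#   xfs_quota -x -c 'report -p -N' /var/lib/docker | near_limit
```

Projects without a hard limit (hard = 0) are skipped, so the filter reports only containers that actually have a size quota applied.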
