
Hosting CentOS7 and CentOS8 yum repositories in AWS S3

 


Disclaimer

I have published this post on my work blog https://reece.tech previously.

Overview

We are utilising compute instances in different cloud environments as well as in traditional data centres. On-premises virtual machines usually run RHEL 7/8 or CentOS 7/8.

Scope

This post explains how to create and host your own yum repositories in an S3 bucket and how to maintain secure, consistent and reliable server builds. This method also allows for a controlled package version and patch level life-cycle across environments.

The problem

Using externally hosted yum repositories or mirrors is very convenient for end users installing and updating a single workstation. It is not the best option in an enterprise environment, however, where many new, identical virtual machines may be built every day in an automated fashion.

Issues

The main problems with publicly hosted repositories are:

  • Security (who has access to the mirror or its DNS and could alter packages?)
  • Consistency (packages are updated all the time; how can we ensure that the packages on production servers are identical to those in non-production or development?)
  • Reliability (what happens if the source repo or mirror is down while we are building or updating a server?)
  • Speed and cost (having packages locally reduces latency, improves speed and lowers bandwidth / data usage)

For RHEL there are solutions available to address these issues (e.g. Satellite); for CentOS, however, there are not many options.

The solution

CentOS comes with tools to download or sync RPM packages to a local folder. Once this has been done, the createrepo command will build all the metadata needed to host your own yum repository via HTTP(S).

We could run a web server in front of this folder or sync the files to an S3 bucket. Each repo can be versioned (e.g. with a timestamp) to promote changes across environments (dev > test > staging > production). As we automatically rebuild most of our servers, updating the repo URL in our source build repository is all that needs to be done; alternatively, just changing the yum repo bucket name allows us to upgrade or downgrade packages to the desired patch level.
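As an illustration, promotion can be as simple as pointing each environment's repo file at a different timestamped bucket (the bucket names and dates below are hypothetical):

# /etc/yum.repos.d/myrepo.repo in dev - tracks the newest snapshot
baseurl = https://my-yum-repository-bucket-20200501.s3-ap-southeast-2.amazonaws.com

# /etc/yum.repos.d/myrepo.repo in production - stays on the tested snapshot
baseurl = https://my-yum-repository-bucket-20200420.s3-ap-southeast-2.amazonaws.com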

CentOS 7

For CentOS 7, we need to perform the following steps:

Create a new, empty S3 bucket in your desired region. This bucket is used for publicly available files and may allow read access for everyone. There are options to further restrict access to the repository files, but these go beyond the scope of this blog entry.
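As a minimal sketch (the bucket name is an example), the bucket can be created with the AWS CLI:

# create the bucket; outside us-east-1, a LocationConstraint is required
# (depending on your account settings, the bucket's public access block
# may also need to be relaxed before objects can be made publicly readable)
aws s3api create-bucket \
    --bucket my-yum-repository-bucket-20200420 \
    --region ap-southeast-2 \
    --create-bucket-configuration LocationConstraint=ap-southeast-2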

# install necessary binaries
yum -y install yum-utils createrepo

# create repository location (ensure there is sufficient space)
mkdir /repository

# download all packages of repos we are interested in
reposync -g -l -p /repository -r base
reposync -g -l -p /repository -r extras
reposync -l -p /repository -r docker-ce-stable

# create the repository metadata (this will create a repodata sub-folder with a repomd.xml file)
createrepo /repository

# sync whole folder to your S3 bucket
cd /repository
aws --region ap-southeast-2 s3 sync . s3://my-yum-repository-bucket-20200420/ \
    --cache-control "max-age=3600,public" \
    --grants "read=uri=http://acs.amazonaws.com/groups/global/AllUsers"

Once the repo has been synced, simply create a repository file in /etc/yum.repos.d/myrepo.repo:

[myrepo]
baseurl = https://my-yum-repository-bucket-20200420.s3-ap-southeast-2.amazonaws.com
enabled = 1
gpgcheck = 0
name = myrepo
repo_gpgcheck = 0

and test the repo:

yum clean all
yum repolist
yum check-update

Some packages in this repo contain special characters (like the plus sign) in their file names and therefore in their download URLs. Unfortunately, the regular AWS S3 endpoint encodes these URLs, so such packages will not download unless website hosting is enabled on the bucket. The corresponding base URL changes accordingly.
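Enabling website hosting is a one-liner with the AWS CLI (a sketch; the index document is just a placeholder, as yum never requests it):

# enable static website hosting on the bucket
aws s3 website s3://my-yum-repository-bucket-20200420/ --index-document index.html

The baseurl in the repo file then becomes the website endpoint, e.g. http://my-yum-repository-bucket-20200420.s3-website-ap-southeast-2.amazonaws.com (note that website endpoints support HTTP only).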

In the above example, we are merging different source repositories (base, extras and docker-ce) into a single repository. These could also be kept separate (see the sketch below), however in our case merging simplifies updates.
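A sketch of the separate variant: since reposync creates one sub-folder per repo id, you can run createrepo against each sub-folder and use one repo file per source, with each baseurl pointing at the corresponding sub-folder of the bucket:

# one repodata set per upstream repository
createrepo /repository/base
createrepo /repository/extras
createrepo /repository/docker-ce-stable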

CentOS 8

For CentOS 8, this process is a bit more involved due to the addition of modular repositories. Program names and parameters have changed somewhat as well.

There are also many more packages which contain special characters, so the website hosting option needs to be enabled on your S3 bucket from the start.

# ensure the tools are installed
yum -y install yum-utils createrepo_c

# sync latest rpms from original repo
reposync --destdir /repository --repo BaseOS
reposync --destdir /repository --repo AppStream
reposync --destdir /repository --repo extras

# scan all rpms and create repodata
createrepo_c /repository
repo2module --module-name=myrepo --module-stream=stable /repository /tmp/modules.yaml
modifyrepo_c --mdtype=modules /tmp/modules.yaml /repository/repodata

# sync files to S3 (similar to CentOS 7)
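# a minimal sketch, mirroring the CentOS 7 command above
# (bucket name is hypothetical; website hosting must already be enabled)
cd /repository
aws --region ap-southeast-2 s3 sync . s3://my-centos8-yum-repository-bucket/ \
    --cache-control "max-age=3600,public" \
    --grants "read=uri=http://acs.amazonaws.com/groups/global/AllUsers"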

The repo2module command is a small Python script which can be downloaded from the URL below.

Upgrades

Just replace the URL in /etc/yum.repos.d/myrepo.repo with the new path and run yum clean all.
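For example (bucket names are hypothetical), switching a server to a newer snapshot could look like this:

# point the repo at the new snapshot and flush the cached metadata
sed -i 's/my-yum-repository-bucket-20200420/my-yum-repository-bucket-20200501/' /etc/yum.repos.d/myrepo.repo
yum clean all
yum check-update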

Conclusion

Hosting your own yum repositories in S3 is a relatively easy process. The cost is minimal and the whole process can be automated in your favourite CI/CD pipeline.

More information here:
