Building Local Repositories for Various Linux Distributions

2021-02-01 10:36

Background

Container image builds often spend a lot of time downloading packages (and sometimes hit network issues, so a single version takes several build attempts to succeed), yet most dependency packages rarely change. That suggested an optimization for building images:

Use a minimal container to pre-download the required packages locally, then build those packages into local repositories. In the containers that need to be built, swap the software sources for the local repositories, cutting container build time by orders of magnitude.

Another reason I’m writing this post: although everything used here is an existing tool, beyond the man pages and --help output, helpful documentation is really scattered. And in my experience these tools always have small “pitfalls” that the documentation never mentions and that rarely even show up on sites like Stack Overflow - working around those “pitfalls” is what takes the most time.

TL;DR

I plan to open source part of this on GitHub; the link will go here (TODO)

Currently tested compatible distributions:

  • centos 6 / 7 / 8
  • fedora 31 / 32 / 33
  • amazonlinux 1 / 2
  • ubuntu trusty (14.04) / xenial (16.04) / bionic (18.04) / focal (20.04)
  • debian jessie (8) / stretch (9) / buster (10)
  • opensuse leap 15

Process

  1. Start from a minimal container for the distribution and add any software sources that will be needed.
  2. For each distribution, use the corresponding package manager to download every package in the required list, plus its dependencies.
  3. Place the packages in per-distribution directories and run the repository-creation commands inside the container to build local repositories.
  4. Run a simple static web server listening on a local port; this turns the directories into a local HTTP package repository.
  5. Add the local repository to the Dockerfile of containers that need frequent rebuilds. Note that local repositories generally have neither signature verification nor HTTPS, so trust must be added manually.
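Step 4 can be as simple as pointing any static file server at the repository root. Below is a minimal sketch; the path, the helper name serve_repo, and the default port 4891 are placeholders for this post's setup (openresty is used later, but Python's built-in server is enough for testing):

```shell
# Serve the repository root over plain HTTP (sketch).
# REPO_ROOT is a placeholder: set it to the directory that will
# contain the per-distribution repositories built below.
REPO_ROOT=${REPO_ROOT:-/path/to/repo-root}

serve_repo() {
    port=${1:-4891}
    # serve everything under REPO_ROOT on all interfaces
    cd "$REPO_ROOT" && python3 -m http.server "$port" --bind 0.0.0.0
}
```

Run serve_repo in the background (or under a supervisor) while images build.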

Below I’ll only walk through the more tedious steps.

0x02. Package Download

yum / dnf

$ cd /path/to/dir \
    && yumdownloader --resolve pkg-1 pkg-2 ...
  • yumdownloader is preferred here. I previously tried dnf install --downloadonly and ran into several undocumented pitfalls; one is that already-downloaded packages occasionally get deleted afterwards, presumably because dnf / yum applies some cache-cleanup strategy.
  • The --resolve option tells yumdownloader to also download the dependencies of the specified packages
  • --installroot: not recommended for specifying the download path. With this option, macros (variables) in repo config files are no longer resolved automatically; the common $releasever variable, for example, would then have to be set manually.
  • yumdownloader downloads packages straight into the working directory, so just cd into the target directory first
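The points above can be folded into a small helper that reads a newline-separated package list and downloads everything into a per-distribution directory. This is a sketch; the helper name and the list-file convention are my own, not part of yumdownloader:

```shell
# Download a package list plus all dependencies into dest (sketch).
# Relies on yumdownloader writing into the working directory, so we
# cd inside a subshell instead of using --installroot.
download_rpms() {
    dest=$1; list=$2
    mkdir -p "$dest"
    # word-split the list file into a single yumdownloader invocation
    ( cd "$dest" && yumdownloader --resolve $(tr '\n' ' ' < "$list") )
}
```

For example, download_rpms /srv/repo/base/centos-7/x86_64 pkglist.txt (hypothetical paths).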

apt-get

$ cd /path/to/dir \
    && apt-get download \
    $(apt-cache depends --recurse --no-recommends --no-suggests \
        --no-conflicts --no-breaks --no-replaces --no-enhances \
        pkg-1 pkg-2 ... | grep "^\w")
  • If you download packages with apt-get install --download-only --reinstall, dependency packages that are already installed in the current container won’t be downloaded again.

For example, if the downloader container already has the ca-certificates and openssl packages installed, executing the following command gives this result: because of the --reinstall option, ca-certificates will be downloaded, but openssl, as a dependency of ca-certificates, will be skipped.

$ apt-get download \
        $(apt-cache depends --recurse --no-recommends --no-suggests \
        --no-conflicts --no-breaks --no-replaces --no-enhances \
        ca-certificates | grep "^\w")
  • Here we use apt-get download rather than apt-get install --download-only, mainly because apt-cache depends reports both preferred and alternative choices for a dependency, and the two often conflict. Even with --download-only, apt-get install fails to download because it can’t resolve the conflict.

Below is an example of apt-cache depends output, where pinentry-curses is preferred over the virtual package <pinentry:i386>. A detailed explanation can be found at https://www.thecodeship.com/gnu-linux/understanding-apt-cache-depends-output/

$ apt-cache depends --recurse --no-recommends \
    --no-suggests --no-conflicts --no-breaks \
    --no-replaces --no-enhances --no-pre-depends \
    gnupg2 | grep -E '^gnupg-agent:i386' -A10

gnupg-agent:i386
 |Depends: pinentry-curses:i386
  Depends: <pinentry:i386>
    mew-beta-bin:i386
    mew-bin:i386
    pinentry-curses:i386
    pinentry-gnome3:i386
    pinentry-gtk2:i386
    pinentry-qt:i386
    pinentry-tty:i386
  Depends: libassuan0:i386
  • apt-get download also downloads packages directly to the current directory, so just use the cd command to switch working directory first
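The depends-plus-grep pipeline above can be wrapped as a reusable helper. grep '^\w' does the real filtering: it keeps only the top-level package names and drops the indented alternatives and virtual packages such as <pinentry:i386>. A sketch around the exact flags used in this post (the helper name resolve_deps is my own):

```shell
# Flatten the recursive dependency closure to plain package names.
# Lines starting with a word character are real packages; indented
# lines (Depends: markers, alternatives, <virtual> packages) are dropped.
resolve_deps() {
    apt-cache depends --recurse --no-recommends --no-suggests \
        --no-conflicts --no-breaks --no-replaces --no-enhances \
        "$@" | grep '^\w' | sort -u
}
```

Combined with the download step: apt-get download $(resolve_deps pkg-1 pkg-2 ...).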

zypper

$ zypper --no-gpg-checks --non-interactive \
    --pkg-cache-dir /path/to/dir \
    install -y -f --download-only \
    pkg-1 pkg-2 ...
  • --non-interactive is mainly for scripting, so zypper doesn’t sit waiting for user input until it times out
  • --pkg-cache-dir specifies the download directory
  • -f forces re-downloading already-installed packages. This still hits the same problem as apt-get install --download-only: dependencies that are already installed won’t be downloaded. For now I write it this way and add the missing base packages manually.
  • zypper distinguishes global options from subcommand options: in this command, everything before install is global, and everything after it belongs to the subcommand

0x03. Directory Structure

yum

The yum repository directory structure is as follows:

base/
├── amazonlinux-1
│   └── x86_64
│       ├── audit-libs-2.6.5-3.28.amzn2.i686.rpm
│       ├── ...
│       └── repodata
...

Note: the yum repository structure is relatively simple. Under each distribution subdirectory and CPU-architecture directory, store the downloaded rpm packages, then create the local repository index in that same directory.

Command to create yum repository index:

cd /path/to/dir \
    && createrepo --update ./

There’s also a C implementation, createrepo_c, which is faster with identical usage. Recommended for newer distributions such as centos 8 / fedora 31+ / amazonlinux

cd /path/to/dir \
    && createrepo_c --update ./

Newer distributions build some packages with Modularity. If you want these packages in a local repository, extra commands are needed:

Documentation at: https://docs.fedoraproject.org/en-US/modularity/hosting-modules/

cd /path/to/dir \
    && createrepo_c --update ./ \
    && repo2module -s stable -n REPO_NAME -d ./ ./repodata/modules \
    && modifyrepo_c --mdtype=modules ./repodata/modules.yaml ./repodata

Where REPO_NAME is the local repository name

A noteworthy command here is repo2module (from https://github.com/rpm-software-management/modulemd-tools), because the above documentation doesn’t mention how to generate the modules.yaml file.

On fedora, or centos 8 (which additionally needs the epel repository), the repo2module command can be installed via dnf install -y python3-gobject modulemd-tools

apt

The apt repository directory structure is as follows:

ubuntu/
├── dists
│   ├── bionic
│   │   └── base
│   │       └── main
│   │           └── binary-amd64
│   ...
└── pool
    ├── bionic
    │   └── base
    │       └── main
    │           └── binary-amd64
    ...

Note: an apt repository has two subdirectories, dists/ and pool/. The dists/ subdirectory stores the indexes; the pool/ subdirectory stores the packages.

Command to create apt repository index:

The local repository here won’t use a GPG-signed Release file; the full procedure is at: https://medium.com/sqooba/create-your-own-custom-and-authenticated-apt-repository-1e4a4cf0b864#35dd

cd /path/to/dir

apt-ftparchive --arch amd64 packages \
    pool/bionic/base/main/binary-amd64 \
    > dists/bionic/base/main/binary-amd64/Packages

gzip -k -c \
    -f dists/bionic/base/main/binary-amd64/Packages \
    > dists/bionic/base/main/binary-amd64/Packages.gz

apt-ftparchive release dists/bionic/base > dists/bionic/Release

Here base is a custom repository subdirectory, left in to make future expansion easier. The apt-ftparchive command can be installed via apt-get install -y dpkg-dev.
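The three indexing commands can be wrapped into one function per suite. A sketch, with suite bionic, subdirectory base, and arch amd64 hard-coded to mirror this post's layout; note the Packages index must sit under dists/bionic/base/... so the release step on dists/bionic/base picks it up:

```shell
# Build the Packages / Packages.gz / Release indexes for one suite (sketch).
index_apt_repo() {
    suite=$1   # e.g. bionic
    mkdir -p "dists/$suite/base/main/binary-amd64"
    # scan the pool and write the package index
    apt-ftparchive --arch amd64 packages \
        "pool/$suite/base/main/binary-amd64" \
        > "dists/$suite/base/main/binary-amd64/Packages"
    # compressed copy alongside the plain index
    gzip -k -f "dists/$suite/base/main/binary-amd64/Packages"
    # top-level Release for the suite's base subdirectory
    apt-ftparchive release "dists/$suite/base" > "dists/$suite/Release"
}
```

Run it from the repository root (the directory containing dists/ and pool/), e.g. index_apt_repo bionic.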

0x05. Adding Local Repository

The host.docker.internal below is a hostname added via docker build’s --add-host option; 4891 is the port openresty listens on locally

yum

printf "[local-base]\n\
name=Local Base Repo\n\
baseurl=http://host.docker.internal:4891/base/centos-7/x86_64/\n\
skip_if_unavailable=True\n\
gpgcheck=0\n\
repo_gpgcheck=0\n\
enabled=1\n\
enabled_metadata=1" > /etc/yum.repos.d/local-base.repo

zypper

printf "[local-base]\n\
name=Local Base Repo\n\
baseurl=http://host.docker.internal:4891/base/sles-12/x86_64/\n\
skip_if_unavailable=True\n\
gpgcheck=0\n\
repo_gpgcheck=0\n\
enabled=1\n\
enabled_metadata=1" > /root/local-base.repo \
    && zypper -n ar --check --refresh -G file:///root/local-base.repo \
    && zypper -n mr --gpgcheck-allow-unsigned-repo local-base \
    && zypper -n mr --gpgcheck-allow-unsigned-package local-base \
    && rm -f /root/local-base.repo

apt

echo "deb [trusted=yes] http://host.docker.internal:4891/ubuntu bionic/base main" > /etc/apt/sources.list

Unless otherwise stated, articles on this blog are licensed under the Creative Commons Attribution 4.0 International License. Please credit the original author and source when sharing.

