Docker is a software platform that allows you to quickly build, test, and deploy applications. Docker packs software into standardized units called containers that contain everything the software needs to run, including libraries, system tools, code, and runtime.
It has been a few years now since the container explosion, and the ecosystem is starting to settle down. After all the big names in the industry jumped on the containerization wagon, very often proposing their own solutions, it seems like Docker-based platforms are finally here to stay.
After Docker became the de facto standard, many things have evolved, most often under the hood. The Open Container Initiative (OCI) was created with the objective of standardizing runtimes and image formats, and Docker changed its internal plumbing to accommodate them, for instance by adopting runc and containerd-shim.
Now, after years of Docker freezing its user-facing APIs, we are starting to see movement again on this front, and many of the common practices we are used to seeing when building and running containers might have better alternatives. Other things have just been deprecated or fallen out of favor.
Dockerfiles can now be built with the BuildKit backend. This backend is smarter than the classic one: it speeds up the build process through smarter caching and parallelized build steps, and it also accepts new Dockerfile options. To quote the official documentation:
> Starting with version 18.09, Docker supports a new backend for executing your builds that is provided by the moby/buildkit project. The BuildKit backend provides many benefits compared to the old implementation. For example, BuildKit can:
>
> - Detect and skip executing unused build stages
> - Parallelize building independent build stages
> - Incrementally transfer only the changed files in your build context between builds
> - Detect and skip transferring unused files in your build context
> - Use external Dockerfile implementations with many new features
> - Avoid side-effects with rest of the API (intermediate images and containers)
> - Prioritize your build cache for automatic pruning
In order to select this backend, we export the environment variable `DOCKER_BUILDKIT=1`. If you miss the detailed output, just add `--progress=plain`.
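Put together, a BuildKit build looks something like this (the image name is just a placeholder):

```shell
# enable the BuildKit backend for this shell session
export DOCKER_BUILDKIT=1

# build as usual; --progress=plain restores the classic detailed output
docker build --progress=plain -t myimage .
```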
We are used to seeing a lot of trickery to try to keep each layer size to a minimum. Very often we use a `RUN` statement to download the source and build an artifact, for instance a `.deb` file or a compiled binary, and try to clean everything up in the same statement to keep the layer tidy.
```Dockerfile
RUN \
apt-get update ; \
DEBIAN_FRONTEND=noninteractive apt-get install --no-install-recommends -y git make gcc libc-dev ; \
git clone https://github.com/some_repo ; \
cd some_repo ; \
./configure ; \
make ; \
make install; \
cd -; \
rm -rf some_repo; \
apt-get purge -y make gcc libc-dev git; \
apt-get autoremove -y; \
apt-get clean; \
find /var/lib/apt/lists -type f | xargs rm; \
find /var/log -type f -exec rm {} \;; \
rm -rf /usr/share/man/*; \
rm -rf /usr/share/doc/*; \
rm -f /var/log/alternatives.log /var/log/apt/*; \
rm /var/cache/debconf/*-old
```
This wastes time on every build and is ugly for what it does. It is better to use multi-stage builds.
```Dockerfile
FROM debian:stretch-slim AS some_bin

WORKDIR /root
RUN apt-get update; \
    DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends git make gcc libc-dev; \
    git clone https://github.com/some_repo; \
    cd some_repo; \
    ./configure; \
    make

FROM debian:stretch-slim

COPY --from=some_bin /root/some_repo/some_bin /usr/local/bin/
```
In some cases, in order to avoid all this, I have seen people adding a binary blob to the git repository just so it can be copied in. It is much better to use this approach.
While it doesn’t always make sense for every Dockerfile, we can often squash our first layer and achieve a more maintainable Dockerfile. `--squash` is an experimental argument that needs to be enabled as we explained here. Instead of
```Dockerfile
RUN apt-get update; \
    DEBIAN_FRONTEND=noninteractive apt-get install --no-install-recommends -y mariadb; \
    apt-get autoremove -y; \
    apt-get clean; \
    find /var/lib/apt/lists -type f | xargs rm; \
    find /var/log -type f -exec rm {} \;; \
    rm -rf /usr/share/man/*; \
    rm -rf /usr/share/doc/*; \
    rm -f /var/log/alternatives.log /var/log/apt/*; \
    rm /var/cache/debconf/*-old
```
we can do
```Dockerfile
RUN apt-get update
RUN DEBIAN_FRONTEND=noninteractive apt-get install --no-install-recommends -y mariadb
RUN apt-get autoremove -y
RUN apt-get clean
RUN find /var/lib/apt/lists -type f | xargs rm
RUN find /var/log -type f -exec rm {} \;
RUN rm -rf /usr/share/man/*
RUN rm -rf /usr/share/doc/*
RUN rm -f /var/log/alternatives.log /var/log/apt/*
RUN rm /var/cache/debconf/*-old
```
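Assuming the daemon has experimental features enabled, the separate layers can then be collapsed at build time (the tag is a placeholder):

```shell
# merge all newly built layers into a single layer
docker build --squash -t myimage .
```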
We covered using `qemu-user-static` for creating builds for different architectures in a past article. When building different versions of the same image for different architectures, it is very common to see three exact copies of the same Dockerfile where the only thing that changes is the `FROM` line.

```Dockerfile
FROM debian:stretch-slim
# the rest of the file is duplicated
```

```Dockerfile
FROM armhf/debian:stretch-slim
# the rest of the file is duplicated
```

```Dockerfile
FROM arm64v8/debian:stretch-slim
# the rest of the file is duplicated
```
Just have one single file like so

```Dockerfile
ARG arch
FROM ${arch}/debian:stretch-slim
```

and build with
```shell
docker build . --build-arg arch=amd64
docker build . --build-arg arch=armhf
docker build . --build-arg arch=arm64v8
```
So we now have three builds with one Dockerfile, and we tag them like this

```shell
docker build . --build-arg arch=amd64   -t ownyourbits/example-x86
docker build . --build-arg arch=armhf   -t ownyourbits/example-armhf
docker build . --build-arg arch=arm64v8 -t ownyourbits/example-arm64
```
That’s all good, but we can now simplify the instructions the user has to type by creating a multi-arch manifest. This is still an experimental CLI feature, so we need to export `DOCKER_CLI_EXPERIMENTAL=enabled` to be able to access it.
```shell
export DOCKER_CLI_EXPERIMENTAL=enabled

docker manifest create --amend ownyourbits/example \
    ownyourbits/example-x86 \
    ownyourbits/example-armhf \
    ownyourbits/example-arm64

docker manifest annotate ownyourbits/example ownyourbits/example-x86   --os linux --arch amd64
docker manifest annotate ownyourbits/example ownyourbits/example-armhf --os linux --arch arm
docker manifest annotate ownyourbits/example ownyourbits/example-arm64 --os linux --arch arm64 --variant v8

docker manifest push -p ownyourbits/example
```
Now your users of any architecture only have to do

```shell
docker pull ownyourbits/example
```

and they will receive the correct version of the image.
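We can verify what the registry will serve to each platform with `docker manifest inspect` (also behind the experimental CLI flag):

```shell
# list the platforms included in the multi-arch manifest
docker manifest inspect ownyourbits/example
```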
Even if we run with `--restart=unless-stopped`, the only clue Docker or Docker Swarm has that things are OK is that the container has not crashed. If the container is unresponsive or returning errors, it won’t be restarted properly.
It is more robust to add a `HEALTHCHECK` statement to the Dockerfile

```Dockerfile
HEALTHCHECK CMD curl --fail http://localhost:8080/status || exit 1
```
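The check can be tuned further; for instance (the intervals and the `/status` endpoint are illustrative, and `curl` must be present in the image):

```Dockerfile
# probe every 30s, fail a probe after 3s, mark unhealthy after 3 failed probes
HEALTHCHECK --interval=30s --timeout=3s --retries=3 \
    CMD curl --fail http://localhost:8080/status || exit 1
```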
Even though containers are theoretically isolated, it is not good security practice to run processes as root inside them, in the same way you don’t run your web server as root.
Towards the end of your build you should add something like
```Dockerfile
RUN \
# other stuff
    useradd nonroot

USER nonroot
```
Also, if possible avoid relying on `sudo`, and if you don’t control the Dockerfile, at least run as a different, unprivileged user with `docker run -u nonroot:nonroot`.
Use `docker build --pull` in your scripts so you are always on the latest base image.
`MAINTAINER` is deprecated. Instead of

```Dockerfile
MAINTAINER nachoparker (nacho@ownyourbits.com)
```

use a `LABEL`, which can be inspected just like any other metadata

```Dockerfile
LABEL maintainer="nachoparker (nacho@ownyourbits.com)"
```
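For instance, a label set this way can be read back with `docker inspect` (the image name is a placeholder):

```shell
# print the maintainer label of a built image
docker inspect --format '{{ index .Config.Labels "maintainer" }}' myimage
```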
`ENV` variables remain in the container at run time and pollute its environment. For values that are only needed during the build, use `ARG` instead.
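A sketch of the difference (the package names and port are just examples): a build argument is gone once the container runs, while an `ENV` value stays.

```Dockerfile
# available only while building; not visible in `docker run ... env`
ARG BUILD_DEPS="git make gcc"
RUN apt-get update && \
    apt-get install -y --no-install-recommends ${BUILD_DEPS}

# this, in contrast, remains in the final image's environment
ENV APP_PORT=8080
```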
Speed up your builds by providing a cache for your package manager, ccache, git, and so on. This needs BuildKit (`DOCKER_BUILDKIT=1`) and the experimental Dockerfile syntax.
```Dockerfile
# syntax=docker/dockerfile:experimental

# FROM and the rest

RUN --mount=type=cache,target=/var/cache/apt --mount=type=cache,target=/var/lib/apt \
    apt-get install -y --no-install-recommends mongodb-server
```
If you require SSH credentials for your build, don’t copy `~/.ssh`, because the credentials will stay in the layer even if you remove them later. Set up an SSH agent and use the experimental feature for SSH mounts
```Dockerfile
RUN --mount=type=ssh git clone git@github.com:private_repo/repo.git
```
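On the host side, the agent is forwarded into the build with `--ssh` (the key path is illustrative):

```shell
# start an agent, load the key, and forward it to the build
eval "$(ssh-agent)"
ssh-add ~/.ssh/id_rsa
DOCKER_BUILDKIT=1 docker build --ssh default .
```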
If your build needs sensitive files that should not be made public, use secrets. This way, those files will only be visible to that `RUN` command during its execution, and their contents will disappear without a trace from all layers after that.

```Dockerfile
RUN --mount=type=secret,id=signing_key,dst=/tmp/sign.cert signing_command
```
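The secret itself is supplied from the host at build time; something like (the file name is illustrative):

```shell
# expose ./sign.cert to the build under the id "signing_key"
DOCKER_BUILDKIT=1 docker build --secret id=signing_key,src=./sign.cert .
```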
While I have pointed out what are, for me, the biggest offenders, it is always good to go back and review the officially recommended best practices.
Most of them we are familiar with: write small containers, rearrange layers to take advantage of the build cache, don’t add unnecessary packages; but it’s not uncommon to revisit them and discover something we missed before.
*Originally published by nachoparker at ownyourbits.com*
#docker #web-development