Creating efficient Docker images with a Dockerfile is very important when pushing images into production. We need production images to be as small as possible for faster downloads and a smaller attack surface.
In this article, we will see how to build images efficiently with Docker multi-stage builds, and we will also explore what the options were before multi-stage builds.
Reducing the size of Docker images is something we should know how to do to keep our applications secure and to stick with proper industry standards and guidelines.
There are a lot of ways to do this, including:
Use a .dockerignore file to remove unnecessary content from the build context
Try to avoid installing unnecessary packages and dependencies
Keep the layers in the image to a minimum
Use Alpine-based images wherever possible
Use Multi-Stage Builds, which I am going to talk about in this article.
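As a quick illustration of the tips about unnecessary packages and layer count, here is a minimal sketch. It assumes a Debian/Ubuntu-based image, and curl is only a placeholder for whatever package your application really needs.

```dockerfile
# Assumption: ubuntu:16.04 is used only because it appears later in this article
FROM ubuntu:16.04

# One RUN instruction produces one layer, so update, install, and clean up together.
# --no-install-recommends skips optional packages; removing the apt lists in the
# same instruction keeps them out of the layer entirely.
RUN apt-get update \
 && apt-get -y install --no-install-recommends curl \
 && rm -rf /var/lib/apt/lists/*
```

The cleanup has to happen in the same instruction as the install. Deleting the files in a later instruction would only hide them, because the earlier layer would still contain them.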
Let’s move to Multi-Stage Builds
Multi-stage builds are a feature introduced in Docker 17.05. They are a way to reduce the image size, organize Docker commands better, and improve performance while keeping the Dockerfile easy to read and understand.
A multi-stage build divides the Dockerfile into multiple stages, passes the required artifact from one stage to the next, and delivers the final artifact in the last stage. This way, our final image won’t have any unnecessary content, only the required artifact.
Previously, when we didn’t have the multi-stage builds feature, it was very difficult to minimize the image size. We used to clean up every artifact that wasn’t required before moving to the next instruction, as every instruction in a Dockerfile adds a layer to the image. We also used to write bash/shell scripts and apply hacks to remove the unnecessary artifacts.
Let’s look at an example:
This is just one instruction of the Dockerfile, in which we need to download the abc.tar.gz file from the http://xyz.com website, extract the content, and run make install.
In the same instruction, we stored the output of the make install command in the /tmp directory and removed the remaining data, like the downloaded tar file and the extracted tar contents, so that we keep only the output of make install, which is required for our further processing.
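The instruction described above might look roughly like the sketch below. The base image, the install step for make and curl, the exact download path under http://xyz.com, and the use of DESTDIR=/tmp are assumptions made for illustration. xyz.com is only a placeholder, so the build will not succeed as written.

```dockerfile
# Assumption: base image and build tools borrowed from the ubuntu:16.04 example in this article
FROM ubuntu:16.04
RUN apt-get update && apt-get -y install make curl

# The single instruction: download, extract, build, keep only the
# install output under /tmp, then delete everything else
RUN curl -fsSL -O http://xyz.com/abc.tar.gz \
 && tar -xzf abc.tar.gz \
 && cd abc \
 && make DESTDIR=/tmp install \
 && cd .. \
 && rm -rf abc abc.tar.gz
```

Everything is chained with && inside one RUN so that the tarball and the extracted sources never end up in a committed layer.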
That’s everything we have to pack into one instruction to reduce the size of the final image. Now imagine the complexity of a Dockerfile with n such instructions.
Ohh wait…wait…wait…!!! Now we have the power of multi-stage builds with which we can reduce the size of the image without compromising the readability of the Dockerfile.
Here in this Dockerfile, we use ubuntu:16.04 as the base image, name this stage stage1, and execute some instructions as follows:
Run apt-get update to update the package lists
Run apt-get -y install make curl to install the make and curl packages
We downloaded the abc.tar.gz file from http://xyz.com using curl
Untar the abc.tar.gz file and change the directory to abc
Run the make DESTDIR=/tmp install command to store the output in the /tmp directory
Rather than removing the unnecessary artifacts, we created another stage, stage2, with alpine:3.10 as the base image because it is lighter
We copied the content of the /tmp directory in stage1 to the /abc directory in stage2 by simply running the COPY --from=stage1 /tmp /abc command
Finally, we added the path of the binary to the ENTRYPOINT to run it
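Put together, the steps above could be sketched as the following Dockerfile. The download path under http://xyz.com and the binary path in the ENTRYPOINT are hypothetical, since the article does not spell them out. The two apt-get steps are merged into one RUN, which is a small deviation from the list.

```dockerfile
# Stage 1: build on Ubuntu, where make and curl are available
FROM ubuntu:16.04 AS stage1
RUN apt-get update && apt-get -y install make curl
# Hypothetical download path; xyz.com is the article's placeholder site
RUN curl -fsSL -O http://xyz.com/abc.tar.gz
# cd does not persist across RUN instructions, so extract and build in one step
RUN tar -xzf abc.tar.gz && cd abc && make DESTDIR=/tmp install

# Stage 2: start from a small base image and take only the build output
FROM alpine:3.10 AS stage2
COPY --from=stage1 /tmp /abc
# Hypothetical binary location inside the copied tree
ENTRYPOINT ["/abc/usr/local/bin/abc"]
```

One thing to watch for: a binary that is dynamically linked against glibc on Ubuntu generally will not run on Alpine, which uses musl. This pattern therefore assumes a statically linked binary, or a final base image that matches the build image.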
This way, we copied the required artifact from stage1 to stage2 without compromising the readability of the Dockerfile and ended up with a much smaller image. Similarly, we can use multi-stage builds to create a static build of the frontend files and pass the static files to a second stage, where we can use an nginx base image to host them without keeping the large, bulky node_modules directory, which is of no use after the static build.
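For the frontend case, a minimal sketch could look like this. It assumes a project whose package.json defines a build script that writes static files to a dist directory. The image tags, the /app working directory, and the dist folder name are illustrative choices, not something prescribed by the article.

```dockerfile
# Stage 1: install dependencies and produce the static build
FROM node:18-alpine AS build
WORKDIR /app
# Copy the manifests first so the dependency layer is cached between builds
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
# Assumption: the build script outputs to /app/dist
RUN npm run build

# Stage 2: serve only the static files; node_modules stays behind in stage 1
FROM nginx:alpine
COPY --from=build /app/dist /usr/share/nginx/html
```

The final image contains nginx and the built files only, so the Node.js runtime and node_modules never reach production.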
We can also use an external Docker image as a stage, and we can stop at a specific build stage. This is not always useful, though: the intermediate layers of earlier stages are not kept in the final image, so we may not be able to leverage the Docker build cache for them. Read more about multi-stage builds in the official Docker docs.
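Both ideas can be sketched in a few lines. The stage contents and the image tag in the comment are placeholders. COPY --from accepts an image name as well as a stage name, and docker build has a --target flag for stopping at a named stage.

```dockerfile
# Stage 1: stands in for a real build stage
FROM ubuntu:16.04 AS stage1
RUN echo "build steps go here"

# Final stage
FROM alpine:3.10
# Copy a file straight out of an external image instead of a local stage
COPY --from=nginx:latest /etc/nginx/nginx.conf /nginx.conf

# To stop at stage1 instead of building every stage (image tag is hypothetical):
#   docker build --target stage1 -t abc:stage1 .
```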
Thanks for reading!