As public Docker registries like Docker Hub and TreeScale grow in popularity, it has become common, in all but the most restrictive environments, for admins and developers to casually download an image built by an unknown entity. It often comes down to convenience outweighing the perceived risk. When a Docker image is made publicly available, the Dockerfile is sometimes also provided, either directly in the listing, in a git repository, or through an associated link, but this is not always the case. Even when the Dockerfile is available, we have few assurances that the published image was actually built from it, or that it is safe to use.
Maybe security vulnerabilities aren't your concern. Perhaps one of your favorite images is no longer being maintained, and you would like to update it so that it runs on the latest version of Ubuntu. Or perhaps a compiler for another distribution has an exclusive feature that makes it better optimized to produce binaries during compile time, and you have an uncontrollable compulsion to release a similar image that's just a little more optimized.
Whatever the reason, if you wish to recover a Dockerfile from an image, there are options. Docker images aren't a black box. Often, you can retrieve most of the information you need to reconstruct a Dockerfile. In this article, we will explore exactly how to do that by looking inside a Docker image so that we can very closely reconstruct the Dockerfile that built it.
In this article, we will show how it's possible to reconstruct a Dockerfile from an image using two tools,
[Dedockify](https://github.com/mavenshark/Dedockify), a customized Python script provided for this article, and
[dive](https://github.com/wagoodman/dive). The basic process flow used will be as follows.
To get some quick, minimal-effort intuition regarding how images are composed, we will introduce ourselves to various advanced and potentially unfamiliar Docker concepts using Dive. Dive is an image exploration tool that allows examination of each layer of a Docker image.
First, let us create a simple, easy-to-follow Dockerfile that we can explore for testing purposes.
In an empty directory, enter the following snippet directly into the command line:
    cat > Dockerfile << EOF ; touch testfile1 testfile2 testfile3
    FROM scratch
    COPY testfile1 /
    COPY testfile2 /
    COPY testfile3 /
    EOF
By entering the above and pressing Enter, we've just created a new `Dockerfile` and populated three zero-byte test files in the same directory.

    $ ls
    Dockerfile  testfile1  testfile2  testfile3
So now, let's build an image using this Dockerfile and tag it as `example1`:

    $ docker build -t example1 .

Building the `example1` image should produce the following output:
    Sending build context to Docker daemon  3.584kB
    Step 1/4 : FROM scratch
     --->
    Step 2/4 : COPY testfile1 /
     ---> a9cc49948e40
    Step 3/4 : COPY testfile2 /
     ---> 84acff3a5554
    Step 4/4 : COPY testfile3 /
     ---> 374e0127c1bc
    Successfully built 374e0127c1bc
    Successfully tagged example1:latest
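The build log pairs each Dockerfile step with the ID of the layer it produced. As a rough illustration, the Python sketch below extracts those pairs from the text of a log. It assumes the classic (pre-BuildKit) builder output format shown in this article; the function name `parse_build_log` is hypothetical and not part of Docker or Dedockify.

```python
import re
from typing import List, Optional, Tuple

# Assumed line shapes, taken from the classic builder output in this article.
STEP_RE = re.compile(r"^Step (\d+)/(\d+) : (.+)$")
ID_RE = re.compile(r"^--->\s*([0-9a-f]+)?$")


def parse_build_log(log: str) -> List[Tuple[str, Optional[str]]]:
    """Return (instruction, layer_id) pairs; layer_id is None when no ID is printed."""
    steps: List[List[Optional[str]]] = []
    for raw in log.splitlines():
        line = raw.strip()
        step = STEP_RE.match(line)
        if step:
            # A new step starts; its layer ID (if any) follows on a later line.
            steps.append([step.group(3), None])
            continue
        layer = ID_RE.match(line)
        if layer and steps and layer.group(1):
            # The last hexadecimal ID printed for a step is the layer it produced.
            steps[-1][1] = layer.group(1)
    return [(instruction, layer_id) for instruction, layer_id in steps]


if __name__ == "__main__":
    sample = """Step 1/4 : FROM scratch
 --->
Step 2/4 : COPY testfile1 /
 ---> a9cc49948e40
"""
    for instruction, layer_id in parse_build_log(sample):
        print(layer_id, instruction)
```

Note that `FROM scratch` has no layer ID of its own, since it starts from an empty image, while every `COPY` step yields a new ID.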
The following zero-byte `example1` image should now be available:

    $ docker images
    REPOSITORY   TAG      IMAGE ID       CREATED          SIZE
    example1     latest   374e0127c1bc   31 seconds ago   0B
Note that since there's no binary data, this image won't be functional. We are only using it as a simplified example of how layers can be viewed in Docker images.
We can see from the size of the image that there is no source image. Instead of a source image, we used `scratch`, which instructed Docker to use a zero-byte blank image as the source image. We then modified the blank image by copying three additional zero-byte test files onto it, and then tagged the changes as `example1`.
Now, let us explore our new image with Dive.
    docker run --rm -it \
        -v /var/run/docker.sock:/var/run/docker.sock \
        wagoodman/dive:latest example1
Executing the above command should automatically pull `wagoodman/dive` from Docker Hub, and produce the output of Dive's polished interface.

    Unable to find image 'wagoodman/dive:latest' locally
    latest: Pulling from wagoodman/dive
    89d9c30c1d48: Pull complete
    5ac8ae86f99b: Pull complete
    f10575f61141: Pull complete
    Digest: sha256:2d3be9e9362ecdcb04bf3afdd402a785b877e3bcca3d2fc6e10a83d99ce0955f
    Status: Downloaded newer image for wagoodman/dive:latest
    Image Source: docker://example-image
    Fetching image... (this can take a while for large images)
    Analyzing image...
    Building cache...
Scroll through the three layers of the image in the list to find the three files in the tree displayed on the right.
We can see the contents on the right change as we scroll through each layer. As each file was copied to a blank Docker `scratch` image, it was recorded as a new layer.
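To make the layering idea concrete, here is a deliberately simplified Python model of the example image. It assumes that each layer only adds files. Real layers can also modify or delete paths, which this sketch ignores. The names `LAYERS`, `filesystem_at`, and `image_size` are hypothetical and exist only for this illustration.

```python
from typing import Dict, List

# Each layer is modelled as {path: size_in_bytes} for the files it adds.
# Assumption: layers only add files (no modifications or deletions).
LAYERS: List[Dict[str, int]] = [
    {"/testfile1": 0},  # COPY testfile1 /
    {"/testfile2": 0},  # COPY testfile2 /
    {"/testfile3": 0},  # COPY testfile3 /
]


def filesystem_at(layers: List[Dict[str, int]], index: int) -> Dict[str, int]:
    """Merge layers 0..index into the filesystem view visible at that layer."""
    view: Dict[str, int] = {}
    for layer in layers[: index + 1]:
        view.update(layer)
    return view


def image_size(layers: List[Dict[str, int]]) -> int:
    """Total size is the sum of everything stored across all layers."""
    return sum(size for layer in layers for size in layer.values())


if __name__ == "__main__":
    for i in range(len(LAYERS)):
        print(i, sorted(filesystem_at(LAYERS, i)))
    print("size:", image_size(LAYERS))
```

Stepping through the indices mirrors scrolling through the layer list in Dive: each layer's view contains its own file plus everything added by the layers beneath it, and the total size stays at zero bytes.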
Notice also that we can see the commands that were used to produce each layer. We can also see the hash value of the source file and the file that was updated.
If we take note of the items in the `Command:` section, we should see the following:

    #(nop) COPY file:e3c862873fa89cbf2870e2afb7f411d5367d37a4aea01f2620f7314d3370edcc in /
    #(nop) COPY file:2a949ad55eee33f6191c82c4554fe83e069d84e9d9d8802f5584c34e79e5622c in /
    #(nop) COPY file:aa717ff85b39d3ed034eed42bc1186230cfca081010d9dde956468decdf8bf20 in /
Each command provides solid insight into the original command used in the Dockerfile to produce the image. However, the original filename is lost. It appears that the only way to recover this information is to make observations about the changes to the target filesystem, or perhaps to infer based on other details. More on this later.
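As a minimal sketch of how such recorded commands could be turned back into approximate Dockerfile lines, the Python function below strips the `#(nop)` marker and rewrites the `COPY <source> in <destination>` form. This is an illustration, not Dedockify's actual code. The name `to_dockerfile_line` is hypothetical, and the handling of an optional `/bin/sh -c` prefix (and treating anything without `#(nop)` as a `RUN`) assumes the way the classic builder records image history.

```python
import re

# "#(nop)" marks an instruction that did not run a shell command.
# Assumption: the string may or may not carry a leading "/bin/sh -c ".
NOP_RE = re.compile(r"^(?:/bin/sh -c )?#\(nop\)\s+(.*)$")
COPY_RE = re.compile(r"^(COPY|ADD) (\S+) in (\S+)\s*$")
SHELL_PREFIX = "/bin/sh -c "


def to_dockerfile_line(created_by: str) -> str:
    """Convert one recorded layer command into an approximate Dockerfile line."""
    text = created_by.strip()
    nop = NOP_RE.match(text)
    if not nop:
        # No "#(nop)" marker: assume the command was executed by RUN.
        if text.startswith(SHELL_PREFIX):
            text = text[len(SHELL_PREFIX):]
        return "RUN " + text
    instruction = nop.group(1).strip()
    copy = COPY_RE.match(instruction)
    if copy:
        verb, source, destination = copy.groups()
        # The source is a checksum placeholder; the original filename is gone.
        return f"{verb} {source} {destination}"
    return instruction


if __name__ == "__main__":
    recorded = (
        "#(nop) COPY file:e3c862873fa89cbf2870e2afb7f411d5367d37a4"
        "aea01f2620f7314d3370edcc in /"
    )
    print(to_dockerfile_line(recorded))
```

The result is syntactically close to the original instruction, but the source still appears as `file:<hash>` rather than `testfile1`, which is exactly the information loss described above.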