This is the Docker Image Deep Dive and Build Best Practices Tech Talk.
Table of Contents
- Intro (0:05)
- Understanding unioned filesystems (1:59)
- Union filesystem terminology (4:52)
- Creating an image manually (6:29)
- Visualizing the layers (8:44)
- Image building best practices (10:21)
- Image goals (12:40)
- The build cache (15:50)
- Identifying where the cache breaks (17:47)
- Multi-stage builds (21:41)
- Latest Build and Dockerfile features (24:59)
- Multi-architecture (26:57)
- Docker Desktop GUI Build View (29:44)
- Learn more
Intro (0:05)
Docker images are similar to shipping containers. A shipping container can have all types of things inside of it. It can have food, it can have furniture, it can have building supplies. A Docker image is the same way, except for software. You can have a web server in a Docker image, or your application, or a database, or a message queue, or any number of other things. So the image needs to be flexible enough to handle all these different types of software that could be inside of it. Now, way back in 2015, Docker created the Open Container Initiative. The Open Container Initiative is the set of standards around containers. It's currently owned by the Linux Foundation, and there are three specifications. There is the Image Specification, which describes the image structure and manifest. There's the Runtime Specification, which defines how an OCI image executes, and there's the Distribution Specification, which defines the API protocol to push, pull, and discover content. Before we get into the image details, there is one question that often gets asked, which is why there are different architectures for images. Because a container is a process that's tied to a particular kernel and a particular chipset, there have to be different versions of an image to support different chipsets and different OSes. This is why you're going to see an architecture tag underneath images. Now, when you pull an image down to your particular machine, Docker will automatically detect which architecture you need and pull the correct image down, so you can execute it without having to worry about this. But if you are building images, you do have to think about which architecture the image you're building will actually execute on. We'll go into more detail on that later in the presentation.
Understanding unioned filesystems (1:59)
All right, so now let's understand the unioned file system, which makes up images and how they're structured. Here we see an example of pulling the nginx image. If we look on the right-hand side, we'll see something like this if we go pull it ourselves: several rows of letters and numbers, each one pulling until it says 'Pull complete'. Sometimes it'll say 'already exists' on some of these. And so the question is, what are these things that are actually being pulled, and how do they become a container?
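For reference, here's roughly what that output looks like if you pull the image yourself; the layer IDs and digest below are illustrative placeholders, so yours will differ:

```console
$ docker pull nginx
Using default tag: latest
latest: Pulling from library/nginx
2d429b9e73a6: Pull complete
8cbc2e6b1c1d: Pull complete
f3e9d9468714: Already exists
...
Digest: sha256:<image-digest>
Status: Downloaded newer image for nginx:latest
```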
Well, each one of those things is a layer, and a layer is a set of filesystem changes; the layers get stacked on top of each other. So if we look here, we can see layer 1, layer 2, and a merged layer. Each layer can add, modify, or delete files. With the unioned file system, all of these get pulled together. We can see down here in layer 1, we're adding file 1, file 2, file 3, and file 4. Layer 2 adds a new version of file 2, and adds file 5 as well. So up here in the merged version, which is what would exist if this image were executed as a container, it would be file 1, file 2 from layer 2, file 3, file 4, and file 5. It's taking each one of these layers and pulling them together into the merged version. There are a number of advantages to doing it this way that we'll get into as we go further.
Now, what about deleted files? What if I need to remove a file in a later layer? Well, let's see what that would look like. Here we have layer 3, and we need to remove something from layer 1. Now, each layer, once it's written, is immutable. You cannot change a layer once it's actually written to the file system. So in layer 3, if I need to get rid of file 4, what I'm going to do is very similar to having a piece of paper with ink on it that I need to write over. I would take a bottle of white-out, cover up that text on the piece of paper, and then write on top of it. That's exactly what we're doing here. We're creating a whiteout file that exists in layer 3, so that in the merged file system, in the running container, we don't see file 4 anymore. Now, the incredibly important part to understand here is that file 4 still exists in layer 1. Layer 1 is immutable as of the time it was written to the file system. So just because you can't see it in the merged version doesn't mean it's not still there. We'll come back to the security concerns around that in just a minute.
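Here's a tiny, hypothetical Dockerfile just to make that whiteout behavior concrete. If you build it and then export the image with docker save, the first RUN's layer tarball still contains the file, and the second RUN's layer only contains a whiteout marker (a .wh.-prefixed entry in the OCI layer format):

```dockerfile
FROM alpine
# Layer A: the file is written here and is permanently part of this layer
RUN echo "super-secret" > /file4.txt
# Layer B: this does NOT shrink the image; it just records a whiteout entry
# (.wh.file4.txt) so the merged filesystem no longer shows the file
RUN rm /file4.txt
```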
Union filesystem terminology (4:52)
Okay, let’s look at the terminology here and understand what we’re talking about. The layers that have been written are the lower directories. As I’ve said, they are immutable. They are read-only. The upper directory is the writable space. So if I am actively building this particular image or running in the container and actually making a modification, it goes to that upper directory. And then we have the merged directory, which is the representation of the file system that the container is actually using. Now, if we want to see all this information, what we would do is a docker inspect on the container.
Let's go do that. Here I am in the GUI of Docker Desktop. You can see down here, as of version 4.35, we have a terminal. So I'm going to open up this terminal, and what I'm going to do is a docker inspect on this particular container. I'm going to copy the container ID and drop it in here. Now, inspect gives you all kinds of information about this particular container. We don't need all of it. The part we're looking for is what's called the graph driver, and we see it right here. The graph driver gives us the locations of each layer. We can see the lower directories, the merged directory, the upper directory, and the working directory. All of those are laid out here, and this shows us where each one of these layers is actually being stored on the file system. This is all part of the container information.
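If you want to jump straight to that part of the output, a format string like this works; the container name and the overlay2 paths are illustrative, and the output is shown pretty-printed for readability:

```console
$ docker inspect --format '{{ json .GraphDriver }}' my-nginx
{
  "Data": {
    "LowerDir": "/var/lib/docker/overlay2/<layer-2>/diff:/var/lib/docker/overlay2/<layer-1>/diff",
    "MergedDir": "/var/lib/docker/overlay2/<container>/merged",
    "UpperDir": "/var/lib/docker/overlay2/<container>/diff",
    "WorkDir": "/var/lib/docker/overlay2/<container>/work"
  },
  "Name": "overlay2"
}
```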
Creating an image manually (6:29)
Did you know you can create an image manually? You can do a docker run to start a new container, use docker exec and docker cp to make changes to that running container (again, in the upper directory), and then do a docker commit to save that file system as an image. Now, I want to be very clear that this is not the recommended way of creating an image. This is something that can be done as a one-off if you're debugging, that kind of thing. The vast majority of the time, what you're going to want to do is use a Dockerfile. A Dockerfile is the repeatable, standardized way of creating a Docker image. The Dockerfile has a number of instructions that describe how to build that particular image. We're going to go through each one of those.
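As a rough sketch (the container name, image name, and file are made up), that manual flow looks like this:

```console
# Start a container to use as scratch space
$ docker run -d --name scratchpad ubuntu sleep infinity

# Make changes to the running container; these land in the upper (writable) directory
$ docker exec scratchpad mkdir -p /opt/demo
$ docker cp ./my-app.conf scratchpad:/opt/demo/my-app.conf

# Freeze the container's current filesystem as a new image
$ docker commit scratchpad my-manual-image:v1
```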
The first one is the FROM statement. The FROM statement gives us the base to start from, what's known as the base image. You can see in the example on the right, we have FROM ubuntu as our base. The next thing we're going to do is set a WORKDIR (working directory) in the image. Now, because a container and a container image are an isolated file system, that working directory can be anything you want it to be. In this case it's /usr/local/app. It could be /app. It could be /opt. It could be anything you want, because again, this is an isolated file system. The next things are COPY commands and RUN commands, and you can have any number of these. You could do several RUNs or several COPYs, or go back and forth, depending on what you need to do. The COPYs are copying from the host file system, or some other location, into the container image file system. The RUNs are executing things within that image file system. That could be compiling code, installing tools, whatever that might be. Finally, we're going to set a CMD (command). The CMD is the default command that executes when we start this image up as a container. And then to build this, we would use the docker build command. This is the standardized way of building an image.
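Putting those instructions together, a Dockerfile along the lines of what's being described might look like this; the copy source, setup script, and start command are hypothetical stand-ins for whatever your application needs:

```dockerfile
FROM ubuntu
WORKDIR /usr/local/app
COPY . .
# Hypothetical setup step: install or compile whatever the app requires
RUN ./setup.sh
CMD ["./start.sh"]
```

And then it gets built with:

```console
$ docker build -t my-app:v1 .
```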
Visualizing the layers (8:44)
Now, the next thing is, what about actually seeing these layers? If I have an image, can I see these layers and get some details about them? Yes, there are a number of ways of doing this. There is docker image history, which gives details for each layer. We can go into the GUI and see information. And there are other open source tools, like dive, that will show you the file system layer by layer. So let's go look at a couple of examples with an image. If we go back over to Docker Desktop, I will clear off my terminal, go to my images, and find my mongo image; here it is. And I will just do a docker image history mongo. It's going to show me each of the different layers for this particular image, and it's going to show me the commands that created each layer as well. I can look at that and see the size of each layer. I can see some of these are essentially nothing, and some of them are significantly more than that, depending on what's actually happening in that particular layer. I can also simply come in here in the GUI, click on the image, and go in and see all the layers, all 26 of them. And I'll be able to see the vulnerabilities for this particular image, what the vulnerabilities are, and which layer each vulnerability is in as well.
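From the command line, that looks something like this (output trimmed and illustrative):

```console
$ docker image history mongo
IMAGE          CREATED       CREATED BY                                      SIZE
608c50e27ba7   3 weeks ago   CMD ["mongod"]                                  0B
<missing>      3 weeks ago   EXPOSE map[27017/tcp:{}]                        0B
<missing>      3 weeks ago   RUN /bin/sh -c set -x && apt-get update ...     210MB
<missing>      3 weeks ago   COPY docker-entrypoint.sh /usr/local/bin/ ...   14kB
...
```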
Image building best practices (10:21)
All right, so now that we understand a little bit about the image structure, what it is and why it is, let's start talking about some image best practices. The first one, and many of you have probably already guessed this, is: never include secrets in your image. Even if you delete a secret out of the image at a later point, it's always there, because image layers are immutable; as soon as you write it to the image, it's there forever. Never, ever copy in a secret with the intention of deleting it later. You always want to provide secrets to the container at runtime, not bake them into the image itself.
The next is using base images. If you're using a base image, you need to trust where that base image came from. You want to make sure that base image is something where you know where it came from and who created it. This is where Docker trusted content, such as Docker Official Images, Docker Verified Publisher images, or other managed images, can help with this particular piece. There are other vendors out there who provide that as well, or you can create your own base images; that's also an option. But whichever way you do it, you need to know where that base image is coming from.
The next is the latest tag. The latest tag is the default tag, and it's very easy to use. You can just say FROM node or FROM nginx, that type of thing, and many, many people do that. What I'm going to tell you is that if you are using the latest tag, there will come a day when it breaks your build. Things will be going great for a while, and then suddenly the version will change, something will be different that you didn't expect, and suddenly you've got an error to deal with. So instead of using the latest tag, you should be pinning to a specific version. That way it doesn't change, and you know exactly which version of that particular image you're working with.
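Concretely, it's the difference between these two FROM lines (the pinned tag below is just an example; pin whatever version you've actually tested against):

```dockerfile
# Avoid: this is implicitly node:latest and will change underneath you someday
FROM node

# Prefer: pinned to a specific version, so every build starts from the same base
FROM node:20-alpine
```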
The last one is that, by default, images run as the root user. You should set a non-root user instead. The USER instruction allows you to specify a user. There's a very nice blog post that goes into the USER instruction as well; I highly recommend you take a look at that.
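A minimal sketch of what that looks like in a Dockerfile (the user and group names are arbitrary):

```dockerfile
FROM node:20-alpine
WORKDIR /app
COPY . .
RUN npm ci
# Create an unprivileged user and switch to it; everything from here on,
# including the running container, uses this user instead of root
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser
CMD ["node", "index.js"]
```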
Image goals (12:40)
All right, so let's talk about some goals when building our images. The first is that we want to reduce the size of the image wherever possible; we don't need extra stuff in the image just to have it there. The next is that we want to structure layers to promote reuse; we want the layers to help us build images faster and make them easier to work with. And then we only want to include what's needed to run the application, so we don't carry extraneous tools that were important during development into our production environment.
How are we going to do each of these things? Well, the first is we're going to think about what we're copying in each layer and how we can structure it so that we're only adding the things we really meant to add. All right, so let's look at this example. What we want to do is clean up as we go. Here we have an example on the left starting FROM ubuntu. We're going to do an apt update, an apt install of python3-pip, a pip install of flask, and then we're going to remove all of that build tooling in the last two commands. All right, this looks pretty good, right? We're doing our stuff, and we're making sure to clean up after ourselves, that kind of thing.
But let's think about this for a second. Each one of these lines creates a new layer, and each layer is immutable as soon as it's written. So instead of actually cleaning things up, those last two commands are adding to the size of the image, because they're adding whiteout files for the files they can't truly remove. Instead of doing that, if we run all those commands as a single RUN command, in a single layer, the output of all of those things together is what gets written to the file system as the layer. So if we go to the example on the right, you can see we have one RUN command, and we're using the ampersands and backslashes to combine all these different commands into one. What this means is that all of it will execute together, and only the result gets written to the file system. The image created from the commands on the left would be 460 MB; the functionally identical image on the right is only 157 MB. That's a 66% reduction just by changing this one command! So this is what we're talking about when we say think about what you're doing with the layers and clean up as you go, so that you're being efficient with how each layer of the image gets written.
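Reconstructed from the talk, the two Dockerfiles being compared look roughly like this; the package names are illustrative, and these are two separate files shown back to back:

```dockerfile
# Before: each line is its own immutable layer, so the "cleanup" lines
# only add whiteout entries on top of the layers that installed everything
FROM ubuntu
RUN apt-get update
RUN apt-get install -y python3-pip
RUN pip install flask
RUN apt-get remove -y python3-pip
RUN rm -rf /var/lib/apt/lists/*
```

```dockerfile
# After: one RUN, one layer -- only the final state of the filesystem is written
FROM ubuntu
RUN apt-get update && \
    apt-get install -y python3-pip && \
    pip install flask && \
    apt-get remove -y python3-pip && \
    rm -rf /var/lib/apt/lists/*
```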
The build cache (15:50)
All right, our next goal is to structure the layers to promote reuse: reduce the number of changed layers on each build, and reduce the bandwidth needed to push and pull images. So we've talked a little bit about layers at this point, and I really haven't explained why we're doing this layer thing. One of the reasons is what's being described here: because each layer is immutable and because layers can be reused, when we push, pull, and build images we can simply use a cached version of a layer instead of having to rebuild or re-transfer it every single time. This is a huge benefit not only when building images, but also when you push and pull images, depending on which layers you need. The build cache becomes very important when you're actually building images. The build cache tries to reuse layers wherever it can. The cache rules state that for each ADD and COPY instruction, a checksum of the contents is created; if the checksum changes, the cache is invalidated. What this means is that if you're copying, for instance, your source directory into your image and you change something within that source directory, the checksum will change and the cache is invalidated, because you changed what needs to be copied. If a RUN command changes, the cache is invalidated. And at any point during the build, if the cache gets invalidated, it's invalidated for the rest of that build, for all the remaining layers. What this means is that if we want really fast builds, we want the cache to be invalidated as late as it possibly can be during the build process. Now, it's important to note that just because caching works one way for a particular technology stack, like Java, doesn't mean it works the same way for every technology. The caching rules you need to look at will depend on your particular technology stack.
Identifying where the cache breaks (17:47)
Okay, so the first thing to look at is really trying to understand: where does the cache break? One way of doing this is simply to run the build and look at the output. We can see in this example on the right, if we look through the lines, we'll see something like 'CACHED [2/4] WORKDIR'. That means that step was cached; it was not actually executed, it was simply pulled from a layer that was already sitting on the local file system. The COPY and RUN statements below it were not cached, because you don't see CACHED next to them. So we can look at the layers and understand which ones were actually reused and which ones were not, and this is an easy way to see where the cache break is happening. I'm going to show another way of doing that a little later in the presentation.
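With BuildKit, the build output marks the reused steps for you; it looks something like this (timings and step contents are illustrative):

```console
$ docker build -t my-app .
 => [1/4] FROM docker.io/library/node                              0.0s
 => CACHED [2/4] WORKDIR /usr/local/app                            0.0s
 => [3/4] COPY . .                                                 0.1s
 => [4/4] RUN npm install                                         14.2s
```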
All right, one of the key things to understand when you're building an image is what actually needs to change on a regular basis. Certainly your application code will change on a regular basis. But does the application setup need to change on a regular basis? Probably not. Once you've set up the application and got it configured properly, you're probably not going to change that part very often. So if we look at the example on the left, we have FROM node, we're copying everything from the current directory into the image, we're running npm install, we're exposing a port, and we're setting the entrypoint. Now, what that means is that any change you make, any source code change at all, is going to invalidate the cache at the second step, which means that npm install will have to execute every single time. Whereas on the right-hand side, if we take the couple of files that npm install needs and copy them in first, then run the npm install, then copy the rest of the source into the image, we've moved the point where the cache breaks further down. The copy of package.json and package-lock.json and the npm install can stay cached, and that's going to speed up your builds every time you build that particular image.
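Those two Dockerfiles, reconstructed from the description (the entrypoint and port are assumptions), look like this:

```dockerfile
# Left: any source change invalidates the COPY, so npm install re-runs on every build
FROM node
COPY . .
RUN npm install
EXPOSE 3000
ENTRYPOINT ["node", "index.js"]
```

```dockerfile
# Right: dependency manifests are copied first, so the npm install layer stays cached
# until package.json or package-lock.json actually changes
FROM node
COPY package.json package-lock.json ./
RUN npm install
COPY . .
EXPOSE 3000
ENTRYPOINT ["node", "index.js"]
```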
Now, let's take that one step further. Let's say the source I'm copying in is purely static content, HTML files or whatever it might be, and that particular layer is not going to interact with any other layer. I'm not going to compile it later in another step or anything like that; it's simply a set of files going in there. I can use the --link flag to basically say this layer is independent of any other layer: even if this particular step changes, don't invalidate the cache for the layers that come after it. What this allows me to do is make source code changes and copy those changes in on a subsequent build without invalidating the rest of the cache, so the cache remains in place for the exposing of the port and for the command at the end. This only works for COPY and ADD instructions, for obvious reasons, but if you are copying in things that are not going to be part of a later step in the build, you can use the --link flag to keep your cache going.
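As a sketch of what that looks like (the paths here are assumptions), --link just gets added to the COPY instruction:

```dockerfile
# syntax=docker/dockerfile:1
FROM nginx:alpine
# These files aren't used by any later build step, so link the layer:
# changing them rebuilds only this layer without invalidating the steps below
COPY --link ./public/ /usr/share/nginx/html/
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
```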
The last one of these is: include only what's needed to run the application. If we're running something in production, we do not want a bunch of extraneous stuff that is potentially exploitable in that production system. We want that production system to be as clean and bare as possible, to simply run the application that needs to be run. So how are we going to do that while still having an image that is helpful to us in development?
Multi-stage builds (21:41)
Well, we have this concept of multi-stage builds. A multi-stage build allows you to create a mini pipeline within the Dockerfile, so you can actually separate your build-time image from your runtime image, and do even more beyond that. You can do COPY --from=<stage> to copy from a previous stage. The last stage is going to be the default target of the build, but you can override it, which we'll do here in just a minute. Looking at the example down at the bottom, we have FROM <image> AS stage1, and then we're doing a bunch of copies and runs, compiling things and all that. Then later on, we have a FROM <image2>, so a different base image, AS stage2. And then we're going to copy from stage1 the output of all that compilation and copying we did earlier into a directory in stage2, which is going to be our runtime environment. So we're just copying the output, and not all the other pieces that we don't need in our production environment.
Here's a React example to see what this actually looks like. Let's walk through it. We're going to use Node to build the React code and then copy the output into a static web server. We have FROM node:lts AS build; this is our build stage. We set a working directory. We copy in package.json and package-lock.json. We run npm install. We copy in the public directory and the source directory. We do a build. We've done all these steps to actually build our application. Then we have FROM nginx:alpine, a completely different base image. We're going to copy in the nginx configuration from the host to this new stage, and then we're going to COPY --from=build the output of the build stage into the nginx HTML directory. So now we have a production image that only has the pieces we actually need to run in production.
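Reconstructed from that walkthrough, the Dockerfile looks roughly like this; the exact paths (nginx config location, React build output directory) are assumptions:

```dockerfile
FROM node:lts AS build
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm install
COPY public ./public
COPY src ./src
RUN npm run build

FROM nginx:alpine
COPY nginx.conf /etc/nginx/conf.d/default.conf
COPY --from=build /app/build /usr/share/nginx/html
```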
Now, we had two stages up to this point, but why not more? We could have multiple stages, and that means we could actually build dev images and production images from the same Dockerfile. Let's look at this example. We have FROM node:lts AS base, and all we're doing there is setting the working directory. Then we have FROM base AS dev, and we set a command. Now, at this point, we could do a build with a target of dev, like you see on the right-hand side, do a bind mount of our directory on the host into the container, set up a port, and now I have a development image, an image that's ready for me to work with as a developer. I can make changes on my host system and see them reflected in the running container; that's great. Then we have FROM base AS build, and we go through all those commands we went through before. And finally we have our production version with FROM nginx:alpine. So what this gives us is the ability to have the dev image and the production image in the same Dockerfile, so we can see exactly what's going on with both versions of the image.
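Putting those stages together, a sketch of the full Dockerfile and the two builds looks like this; the dev command, port, and paths are assumptions beyond what the talk spells out:

```dockerfile
FROM node:lts AS base
WORKDIR /app

FROM base AS dev
CMD ["npm", "run", "dev"]

FROM base AS build
COPY package.json package-lock.json ./
RUN npm install
COPY public ./public
COPY src ./src
RUN npm run build

FROM nginx:alpine
COPY --from=build /app/build /usr/share/nginx/html
```

```console
# Development: stop at the dev stage, bind-mount the source, publish a port
$ docker build --target dev -t my-app:dev .
$ docker run -p 3000:3000 -v "$(pwd)":/app my-app:dev

# Production: build the whole file; the last stage is the default target
$ docker build -t my-app:prod .
```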
Latest Build and Dockerfile features (24:59)
Now let's talk about some of the latest Build and Dockerfile features. The first one here is mounts. We talked earlier about not including secrets inside of images; that's a very bad idea. So what should we do instead? What if our build needs a secret like an SSH key or something else, or if we simply need to bring another file system into the build? This is where a mount comes in. A mount gives the build access to files while it's running, but it does not add those files to the image's file system; they are simply attached temporarily during the build process. We could mount host directories into the build. We could mount cache directories, if we've built up a cache someplace else and want to use it. We could mount secrets, SSH keys, all those types of things.
Looking at the example down here: FROM node, we're going to copy in the package and yarn files, and we're going to run the yarn install. But we're going to mount in a secret with the npm credentials, and we're going to mount in a cache as well for that install to use. These are mounted in temporarily, but they're not going to be part of the image.
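That example corresponds roughly to the following Dockerfile and build command; the secret id and the cache target path are assumptions about how it was set up:

```dockerfile
# syntax=docker/dockerfile:1
FROM node
WORKDIR /app
COPY package.json yarn.lock ./
# The secret and the cache exist only while this RUN executes;
# neither one is written into the image's layers
RUN --mount=type=secret,id=npmrc,target=/root/.npmrc \
    --mount=type=cache,target=/usr/local/share/.cache/yarn \
    yarn install --frozen-lockfile
```

```console
$ docker build --secret id=npmrc,src=$HOME/.npmrc -t my-app .
```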
HEREDOC support is next. We had the example earlier where we used the ampersands and backslashes to combine commands together. Heredocs give you the ability to have multiple lines together without having to do all those ampersands and backslashes. You can see the example on the left, where I have the RUN and then the EOT bash, and then a series of commands that will all execute as one step. I can even, as on the right-hand side, specify an interpreter and do things like print statements and that kind of stuff in there. So heredoc support gives you a lot of flexibility and helps clean up your Dockerfiles as well.
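As a sketch, the combined RUN from earlier could be written with a heredoc like this (package names illustrative; the set -e makes the build fail if any line fails):

```dockerfile
# syntax=docker/dockerfile:1
FROM ubuntu
RUN <<EOT bash
  set -e
  apt-get update
  apt-get install -y python3-pip
  pip install flask
  apt-get remove -y python3-pip
  rm -rf /var/lib/apt/lists/*
EOT
```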
Multi-architecture (26:57)
We talked about this at the beginning; let's talk about it now. If I'm building an image, by default that image is going to be built for the architecture I'm currently running on. I'm coming to you from a Mac M2 machine, so if I build an image natively, it's going to build an ARM64 image. If I need that image to run on AMD64, for example, then I need to be able to build an image for that architecture. Multi-architecture builds give you that capability. So if we look at the example here, we've got a couple of statements. We have FROM --platform=$BUILDPLATFORM node AS build. BUILDPLATFORM here is the native platform that we're building on; in my case, that's ARM64. We see all these different statements where we're building all these different things together. Then we have FROM --platform=$TARGETPLATFORM nginx. Now, where does that target platform come from? If we look down at the bottom, you'll see we're specifying it on the command line where we're executing the build command: --platform=linux/amd64,linux/arm64. This means we're going to produce two different versions of this final stage, as you can see on the right-hand side. The build stage ran on the native architecture where we're building; the final image is going to have both an AMD64 and an ARM64 version. This is how we do multi-architecture builds. Now, it's important to note that if I am building for an architecture that's not my native architecture, I'm going to have to emulate to make that happen, which means it's going to take longer to build and take more processing time as well. There are ways around that, though. Things like Docker Build Cloud allow you to do both ARM64 and AMD64 builds natively in a cloud environment. We'll talk a little bit more about that later.
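Reconstructed, the Dockerfile and the build command look roughly like this; the application specifics are assumptions, but BUILDPLATFORM and TARGETPLATFORM are built-in build arguments:

```dockerfile
# Build stage runs on the machine's native architecture, so no emulation is needed here
FROM --platform=$BUILDPLATFORM node:lts AS build
WORKDIR /app
COPY . .
RUN npm install && npm run build

# Final stage is produced once per requested target platform
FROM --platform=$TARGETPLATFORM nginx:alpine
COPY --from=build /app/build /usr/share/nginx/html
```

```console
$ docker buildx build --platform=linux/amd64,linux/arm64 -t my-app:latest --push .
```

The --push at the end is one common way to handle the result, since a multi-platform build can't always be loaded straight into the local image store.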
Build checks are one of the newer features in the builder at this point. What build checks do is basically provide you information on things in your Dockerfile that you may not be aware of. You may have a Dockerfile that seems to execute properly, but there may be things in it that are incorrect or that you'd want to know about. Build checks compare what you're doing in your Dockerfile against best practices, and you get a list of warnings along with guidance on how to resolve them. You can see these either as you are doing the build or in the Docker Desktop Builds view, which we're going to come back to in just one second.
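On recent Docker versions you can also run the checks by themselves, without executing the full build; the output below is illustrative:

```console
$ docker build --check .
WARNING: FromAsCasing - https://docs.docker.com/go/dockerfile/rule/from-as-casing/
'as' and 'FROM' keywords' casing do not match (line 7)
```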
Docker Desktop GUI Build View (29:44)
So we've recently added a build view within the Docker Desktop GUI. This build view gives you a lot of information about the builds you're doing: build times, cache usage, dependencies, build source, Dockerfiles, build logs, and the history by image. Let's go take a look at what that looks like. Here I am in my Docker Desktop GUI. I can go to Builds, and I'm going to see builds for all my different images. I can see some of these took very little time because the cache took care of everything and it could just regenerate the image. I can see some of these took longer, and I can get information about those. So for instance, if I come in here to the Spring Pet Clinic, I can see that the build check says there are some warnings in here, so I can go and check those particular warnings. This also gives me information about build times, so I can see how long it took to build, how much the cache was actually used, and the dependencies as well; I can see there are actually multiple dependencies in this particular one. If I come over to the source, I get the actual Dockerfile for this build, and I can see some of the warnings it has in here, about the FROM and AS casing being different, for example. Again, these are not things that necessarily cause errors, but they are things you may want to be aware of as you're working on your Dockerfile. I can see all the information about my particular Dockerfile. Then I can come over to the logs, and I can see not only the individual logs for each step, but, if you notice at the top as I'm scrolling, there's a little line moving that shows how far along we are in the build, so I can see where I am in the build process as I'm going through it. This is a particularly long Java build, so it's a little hard to see, but that little line really is moving as we progress through the build. Then, if we go over to history, I'll be able to see, for this particular image, how long it has taken to build, what the duration has been, how many build steps there have been in each build, and how many of those steps were cached. So this gives me an idea of whether I'm getting better or worse over time as I build this particular image, and I can get more information about those previous builds as well. Now, notice here that several of these builds used the default desktop builder, and there's also a cloud builder here.
This is the Docker Build Cloud capability. You'll notice that my local builds were almost three minutes, around 2:45 to 3:00, and this one was 1:23 on the cloud builder. That's what the cloud builder gets you: faster builds, plus the ability to do ARM64 and AMD64 builds natively in the cloud, so you don't have to worry about emulation.
I hope you've enjoyed this Docker Image Deep Dive and Build Best Practices presentation. If you want more information, there's a ton more about builds in our Build manual and our documentation. Please go and check it out. Thank you so much.
Learn more
- Check out the Docker Build Manual for more info.
- New to Docker? Get started.
- Deep dive into Docker products with featured guides.
- Subscribe to the Docker Newsletter.
- Get the latest release of Docker Desktop.
- Have questions? The Docker community is here to help.