Transcript
Hello, and welcome to this session, in which we’re going to be diving into advanced build topics. The focus area is basically everything between the Dockerfile and the actual built image: what actually happens during the build, what tools are involved, how we maximize build performance, what options exist, and so on. So again, everything between the Dockerfile and an actually built image. To do this, we’re going to follow this agenda. We’re going to start off with image builders and talk a little bit about contexts. We’re going to talk about some of the advanced capabilities that come along with buildx, including multi-architecture images and cache management. We’ll also talk about some of the new developments in this space, including Docker Bake, and when you might or might not want to use Docker in Docker, and we’ll talk about some other tools that we provide that make it so you don’t even need to worry about Docker in Docker.
Table of Contents
- Image builders and contexts (0:52)
- Using buildx (6:54)
- Multi-architecture builds (13:22)
- Cache management (19:37)
- Builds with bake (24:19)
- Docker in Docker (DinD) (27:00)
- Learn More
Image builders and contexts (0:52)
Let’s start off by talking about image builders and contexts. Again, what happens when you actually press enter on the docker build command? It’s important to start with a little bit of information about the legacy builder: basically, where did we come from? This is the original builder that was included with the Docker engine, so when you ran docker build, it used this legacy builder. For the most part, it just followed the script of the Dockerfile, going one instruction to the next to the next, to produce an image. But it was also pretty basic. It had very basic support for multi-stage builds. Since it was reading step by step, it read every instruction in sequential order, and it basically did that until it reached the target of the multi-stage build. So if you’ve got multiple stages, it may even build intermediate stages that weren’t needed for the final build. Again, those were just some of the limitations of the legacy builder. The legacy builder also isn’t able to produce multi-architecture images. So if you need ARM and AMD64, it just wasn’t able to do that. Again, it was very sequential: here’s the Dockerfile, here’s the script, now build the image.
So because of that, back in 2017, now quite a few years ago, there was a new effort and a new open source project called BuildKit that was put together. And BuildKit was designed to replace and fix a lot of these issues. And the way that it works is, at the end of the day, it’s a very low-level build definition format. Basically, a graph of steps and nodes and executions and configuration and everything is put together. And with this, an execution plan can be created to build the final output. And so what this means is, where builds previously could only go sequentially, now builds can run in parallel. And again, since this is a very low-level tool, you’ll hardly ever interact with BuildKit directly. You’re going to most likely interact with a tool that builds on top of BuildKit. But BuildKit also provides lots of support for new capabilities such as caching, distributed workers, new frontends, and a lot more. So again, where the legacy builder was basically just build and run sequentially, think of BuildKit much like a graph, where all the steps are outlined as nodes in the graph.
Now, in order to run BuildKit, you have to have it running somewhere. You can run BuildKit standalone, but most of the time you’re not going to do that. Docker Desktop does include a BuildKit daemon, but that one currently has some limitations because of some of the legacy systems in Docker Desktop that are still being worked on and pulled out. For example, the legacy image store that’s built into Docker Desktop can’t support multi-architecture images. You can only build one architecture and store that; it can’t handle multiple architectures. So that built-in BuildKit daemon is bound by that same limitation. There is an effort to move that legacy image store to containerd, and that will solve a lot of these issues and also allow images to have attached attestations, for example SBOMs, build provenance, etc. But that work is still in progress.
Docker buildx is the tool that you’ll typically interact with, which, again, builds on top of BuildKit. It’s a CLI command that exposes a lot of the BuildKit features. For example, you can create BuildKit builders and do multi-architecture builds. It can also handle advanced cache management and SBOM and provenance creation as well. Now, since Docker Desktop 4.19, docker build has actually been aliased to docker buildx build. So you’ve probably been using BuildKit without even knowing it, but it was still using the Docker driver, which we’ll talk about in a few minutes. And this Docker driver was basically taking the exported image and making it work with the legacy image store. We’ll dive into drivers a little bit more in just a moment. Now, one of the other things that buildx provides is the ability to use remote builders. So it doesn’t even have to use the local BuildKit or anything on your local machine. In order to do that, it leverages what are called Docker contexts. Contexts go beyond the scope of just builds; they basically allow the local CLI to operate against other engines. At the end of the day, the Docker CLI is just a REST client that’s interacting with the API exposed by the Docker engine. And that engine can be local, or it can be remote. The default context simply uses the engine that’s on your local machine, but you can create contexts that point to other machines.
So for example, docker context create will create a new context, you can list out the contexts, and you can switch to a new context. In the two examples here, the first one creates a context named with an SSH example: this is a Docker context, and here’s how to connect to the remote machine, in this case over SSH. The second example is a TLS example, and it’s going to interact with a Docker engine that’s exposing its engine API through a TLS endpoint. In this case, we’re doing cert-based authentication, so here are the details about the CA certificate, private key, etc. What that looks like graphically is that the default context points to the local Docker engine, whereas if I start running commands using, for example, the SSH context, instead of querying the local engine, it’ll query an engine on a remote machine. But before doing so, it’ll actually SSH into that machine, query the engine, and then send the results back, and you’ll still see them on your local CLI. So it’ll feel like it’s local, even though it’s actually SSH’d into a remote machine. The TLS example will just talk to the API that’s exposed directly on the remote machine using the certificates that were configured. Okay, so that’s a little of the background here.
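As a hedged sketch of those two context examples (the context names, hosts, user, and certificate paths are placeholders, not the exact values from the slides):

```bash
# SSH example: the CLI tunnels over SSH and talks to the engine on the remote host.
docker context create ssh-example --docker "host=ssh://user@remote-host"

# TLS example: talk to an engine API exposed over TCP, using cert-based authentication.
docker context create tls-example \
  --docker "host=tcp://remote-host:2376,ca=~/ca.pem,cert=~/cert.pem,key=~/key.pem"

# List the available contexts, then switch the CLI over to one of them.
docker context ls
docker context use ssh-example
```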
Using buildx (6:54)
Now, buildx: again, some of the deeper examples and features that come along with it. With buildx, I can create various BuildKit daemons. As I mentioned earlier, the one that’s bundled with Docker Desktop is currently bound to the limitations that Docker Desktop has with the legacy image store, etc. But I can create another builder and basically just have it sitting there running in a container. That builder then has the full feature set of a BuildKit daemon, so it can do multi-architecture builds, SBOMs, etc. So for example, I can do docker buildx create and specify a platform, in this case two platforms. This will create a multi-architecture builder with a generated name, and this builder will be able to execute builds for multiple architectures. Now, for the non-native architectures, it’ll just use QEMU emulation, so it’s not going to run at native speeds by any means, but it will be able to do builds for those other architectures. I can list builders, and I can also say I want to use a particular builder, which will set it as the default builder for future builds.
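A hedged sketch of those commands (the platform list follows the example; the builder name is generated unless you pass --name):

```bash
# Create a builder that can target both platforms; non-native ones run under QEMU.
docker buildx create --platform linux/amd64,linux/arm64

# List the builders, then make one of them the default for future builds.
docker buildx ls
docker buildx use <generated-builder-name>
```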
Now, as I mentioned earlier, there are other drivers that I can use, and these drivers change how BuildKit operates in the back end a little bit. The docker driver uses the BuildKit library bundled into the Docker daemon, the one that’s bundled with Docker Desktop, and, again, exports the image and stores it into the image store that’s bundled with Docker Desktop. The docker-container driver is one in which BuildKit runs in a container. With the kubernetes driver, it’ll create BuildKit pods in a Kubernetes cluster; so every time I do a build, it’s basically going to launch new pods in a Kubernetes cluster. And remote will let me connect to remote builders as well. You can see the feature matrix down at the bottom of the slide and the various features that come along with each of them. The docker driver has fewer features, but it does automatically load the image back into the image store, again, single architecture without attestations, etc. Now, you can load images with the other drivers as well; you just have to specify it when you do the build.
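Hedged examples of picking a driver at builder-creation time (the builder names are placeholders, and the Kubernetes driver options shown are illustrative; check the driver documentation for your setup):

```bash
# docker-container driver: BuildKit runs in a container on the local engine.
docker buildx create --name container-builder --driver docker-container --use

# kubernetes driver: BuildKit runs as pods in a cluster.
docker buildx create --name k8s-builder --driver kubernetes \
  --driver-opt namespace=buildkit,replicas=2

# Non-docker drivers don't load the result into the local image store by default;
# ask for it explicitly when building.
docker buildx build --load -t my-app:dev .
```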
Now, there are a couple of things I can do when it comes to exporting as well, and this is actually one of the neat features. When I normally just do a docker build, it’s going to build an image and store it in the image store. But this opens up some other opportunities where I can do other things with the builds. What if I want to do a build and actually save the image as a tarball, because I’m going to take this build, put it on a USB drive, move it to an air-gapped network, and load the image there, as an example? Well, I can use the OCI exporter to export to the local file system in the OCI image layout format, which will expand it out on disk, or maybe I want a tarball instead, just to make it easier to move around. Again, there are lots of different examples here. And in fact, I’m going to switch over to a terminal and just demonstrate some of it.
So if I do a docker build and let’s do --output type=oci, and we’ll just export it to a tar here. What this is going to do is trigger my build, and we’ll see that I’m actually using a cloud builder here. This is coming from Docker Build Cloud, which we’ll talk about in just a moment. This build is just building a very simple React application. When the build is completed, we’ll see that it sends a tarball down. It’s not actually saving it as an image, but I can see the oci-export.tar here. Let’s make a new directory, copy that tar into it, and expand it out. Using jq, we’ll see the manifest that comes along with it, as well as all the blobs for the individual layers and whatnot. So again, it’s exported as a tarball in OCI format. Now, there’s something else I can do here too. This Dockerfile, the default one, is going to build the React application and put it into an nginx container. But maybe, for whatever reason, instead of putting the built results into an nginx container, I just want to put them on the filesystem. Maybe I’ve got a different deployment process, or maybe I want to use Docker to do the build, but since these are static HTML assets, I’m actually going to put them in an S3 bucket and distribute them with a CloudFront CDN. Well, I can do that. So let’s specify my other Dockerfile. I’m going to use --output type=local and set the destination to a local folder called export. Now when this runs, it re-uploads the context, which now includes my OCI output, so it takes a little bit longer to upload everything. But in this case, I see that it’s exporting to the client directory; it copied about 151 kilobytes. And now if I look in the export directory, I see the built output there: my index.html from my React app, as well as the other JavaScript and CSS assets. So again, with these exporters, I can change what the output of the build is. And just to show what that Dockerfile looks like: in this case, I started from scratch, and I’m taking the files that I wanted to persist from the previous stage and putting them at the root of the file system. So any files that are at the root follow the same file system structure when they’re exported to that local directory. Again, there are quite a few different exporters that I can leverage depending on my various needs.
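A hedged recreation of the commands from that demo (the tar and directory names follow the transcript, while Dockerfile.export is an assumed name for the scratch-based Dockerfile described at the end):

```bash
# Export the build result as a tarball in OCI image layout format,
# instead of loading it into an image store.
docker buildx build --output type=oci,dest=oci-export.tar .

# Unpack and inspect it: an index, manifests, and the layer blobs.
mkdir oci-out
tar -xf oci-export.tar -C oci-out
jq . oci-out/index.json

# Export only the files from the final stage straight to a local directory.
docker buildx build -f Dockerfile.export --output type=local,dest=./export .
ls ./export   # index.html plus the JS/CSS assets
```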
Multi-architecture builds (13:22)
So let’s talk about multi-architecture builds for a moment. Why are they important? Why are they necessary? What value do we get from them? The answers here are obviously very dependent on your organization. But when we think about binaries, the zeros and ones are encoded for a particular CPU architecture, and every CPU architecture has a different instruction set. For example, ARM has a very different instruction set than AMD64. Even operating systems vary; Windows and Linux binaries are very different. So I can’t just take one image and build once to run anywhere. I still have to make that image for each of the different CPU architectures and operating systems I want this application to run on. Now, we’re seeing more and more developers shifting to ARM-based machines, especially with Apple laptops; a lot of them are on Apple Silicon, which is ARM-based. But we’re also seeing a lot more Windows machines starting to make their way into the fleet, and many of those are ARM-based as well. So we’re starting to see this mixture of environments for developer workstations. When applications are deployed, a lot of folks are starting to use ARM-based machines out in the cloud because they’re more energy efficient and they cost less to run, and that bottom line matters. So if there’s an easy way to build ARM-based images and you can save money on your infrastructure, all the better. And finally, multi-architecture images give you a choice. I’m running on a Mac here; my machine can run AMD64 images, but of course it’s not going to run them at native speed, and there’s going to be a performance penalty there. If that same image is available in multiple architectures, then I’ve got a choice and it works out better for me. And if you’re an organization that’s creating software that other downstream users are going to pull, again, it gives them the choice as well.
So again, we see lots of reasons that people want to do multi-architecture builds. Now, there are ways you can do that build. Built in, BuildKit will use QEMU to emulate the non-native architectures. In this example, you do a docker buildx create to create a builder and specify the platforms. Then when the build runs, the native architecture will run at native speed, but the non-native architectures will use QEMU emulation. So again, it works, but it’s often quite slow. I run into this quite a bit with my CI pipelines, because my job may be running in a single pipeline, which is obviously bound to a single architecture, but I want to build for multiple architectures. We’ll talk about that in just a minute. So again, QEMU emulation: it’s built in, it’s not the best, but it’s there.
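A hedged end-to-end example of that emulated approach (the builder and image names are placeholders):

```bash
# Create (and switch to) a builder that can target both platforms.
docker buildx create --name multiarch-builder \
  --platform linux/amd64,linux/arm64 --use

# Build for both platforms in one go; the non-native one runs under QEMU.
docker buildx build --platform linux/amd64,linux/arm64 \
  -t my-repo/my-app:latest --push .
```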
For those that want to go a little more of a DIY approach, there are ways to do native multi-architecture builds, and some of this leverages the Docker contexts we talked about earlier. In this case, I’m going to create a context called arm-machine that points to an ARM-based machine, and it’s going to use SSH to connect to it. Then we create a builder that says: for the ARM64 platform, use the arm-machine context. Then what we can do is append another node to this builder for a different architecture, using the default context; in this case, maybe this particular machine is an AMD64 machine. So if I were to do this multi-architecture build now, it’ll delegate the AMD64 build to the local machine and the ARM64 build to the remote machine, and BuildKit will just take care of it all for you. Now, again, you can set this up and run it yourself, but it’s complicated, and of course now you’re having to maintain all these different machines and builders and whatnot. So I do want to put a plug in here for Docker Build Cloud, where you get managed builders that run in the cloud, and they come with native multi-architecture support. There’s no setup, no managing the machines or the different nodes and contexts. It’s just super easy.
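A hedged sketch of wiring that up yourself (the host, user, and builder names are placeholders):

```bash
# Context that points at a remote ARM machine over SSH.
docker context create arm-machine --docker "host=ssh://user@arm-host.example.com"

# Builder whose arm64 node is that remote machine...
docker buildx create --name native-builder --platform linux/arm64 arm-machine

# ...and whose amd64 node is the local engine (the default context).
docker buildx create --name native-builder --append --platform linux/amd64 default

# BuildKit farms each platform out to its native node.
docker buildx build --builder native-builder \
  --platform linux/amd64,linux/arm64 -t my-repo/my-app:latest --push .
```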
Let’s talk about some advanced Dockerfile support that comes with multi-architecture builds. One of the cool things you can do is, in each of the different stages, actually specify the platform that the stage should run on. For example, in this particular build we’re going to do a multi-architecture build, AMD64 and ARM64, and for this React app, the final assets, the HTML, CSS, and JavaScript, are platform independent. So does it really make sense to build the HTML, CSS, and JavaScript on both an ARM machine and an AMD64 machine when they’re static assets? Those assets aren’t dependent on the platform. So in this case, what we can say is: only run the build stage using the native build platform, but then have the final stage use the target platform. What it ends up doing is something like this, where the build stage runs with the native architecture, and again only runs once, but the final stage runs once per target platform. This is really powerful for assets that aren’t platform dependent: in this case the React app, or Java applications where the final jar file is platform independent. You can do the build using the native build platform, and then the final stage, based on the target platform, pulls in a JRE for AMD64 and a JRE for ARM64, and it just works. So again, there are some really cool things you can do with these platform flags.
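Here’s a hedged sketch of what that Dockerfile pattern can look like for the React app described above (the base images, paths, and npm commands are assumptions, not the exact demo Dockerfile):

```dockerfile
# Build stage: pinned to the machine doing the build, so it runs only once.
FROM --platform=$BUILDPLATFORM node:20-alpine AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Final stage: built once per target platform (e.g. linux/amd64 and linux/arm64).
FROM nginx:alpine
COPY --from=build /app/dist /usr/share/nginx/html
```

Run it with docker buildx build --platform linux/amd64,linux/arm64 and the npm build happens a single time on the native architecture, while only the nginx stage is produced per target platform.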
Cache management (19:37)
All right, let’s talk about cache management. I’m sure you’re probably already aware of the build cache, but there are some other things to consider when we think about cache. So first, why cache management? Builders use the build cache to speed up builds, and the more you can leverage the build cache, the better. The more you can structure your Dockerfile in a way that allows you to reuse layers, the faster your builds will go, and the less you have to push and pull, etc. But the problem is that these caches, by default, are local to the builder. So if I build on my local machine here, nobody else on my team can leverage the caches I just populated. And especially in my CI pipelines, builds are often running in an ephemeral environment, where every time my CI pipeline starts up, it’s starting a job on a brand new machine that has no cache from the previous runs.
So again, how can we leverage these caches? Well, there are a couple of ways you can do this on your own, and BuildKit and buildx have various cache backends. For example, you can embed the cache into the image itself. I wouldn’t recommend it, but you can certainly do it. You can push the cache to your registry, and we see this one used pretty often, where you push the image but also push the cache as well; it includes additional manifests so that cache lookups can occur in the future. You can write the cache to a local directory, if maybe you’ve got network-attached storage that’s shared across builds. There’s a GitHub Actions integration in which the cache is pushed into the GitHub Actions cache, there’s S3, and so on. Now, there are two different cache modes that I want to mention here. The first, min, means that only the exported layers are cached, meaning only the final stage. If you’re doing a multi-stage build, this won’t cache the previous stages, and that may or may not be what you want; it just depends on your needs. If it’s not what you want, then maybe max is, and that will cache all the intermediate layers as well. So again, it depends on your needs; there are different cache modes there.
So, two examples that we’ll show here. In this cache example, it’s going to do the normal build, but --cache-to and --cache-from say that I’m going to use a registry, and the location in my registry is going to be, in this case, my-repo/my-cache. Obviously just a made-up name here, and in this case it’ll pull from Docker Hub. If you’re pulling from your own internal registry, you’ll just follow the same naming pattern as normal image names, so registry.company.com/my-repo/my-cache, whatever. In this case, mode=max, so again, it’s going to push all the intermediate layers. It’ll use the cache from there, and then once it’s done, it’ll push the cache back there. The other example here uses S3 buckets. In this example, the caches, instead of being stored in a registry, are stored in an S3 bucket in this region, with this bucket name and this key prefix in the bucket. So again, there are lots of different ways you can configure this.
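Hedged versions of those two commands might look like this (the repository name follows the transcript’s made-up example; the S3 bucket, region, and cache name are placeholders, and the exact S3 parameter names for key prefixes should be checked against the cache backend docs):

```bash
# Registry-backed cache: reuse cache from the registry and push it back,
# including intermediate layers (mode=max).
docker buildx build \
  --cache-from type=registry,ref=my-repo/my-cache \
  --cache-to type=registry,ref=my-repo/my-cache,mode=max \
  -t my-repo/my-app:latest --push .

# S3-backed cache: store cache blobs and manifests in a bucket instead of a registry.
docker buildx build \
  --cache-from type=s3,region=us-east-1,bucket=my-build-cache,name=my-app \
  --cache-to type=s3,region=us-east-1,bucket=my-build-cache,name=my-app,mode=max \
  -t my-repo/my-app:latest --push .
```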
What this looks like in GitHub Actions: if you’re using our docker/build-push-action, it’s just additional fields to specify the cache-from and cache-to, and then it just plugs in. So again, there are options. You can certainly manage all of this yourself, but then you have to manage the garbage collection and cleanup and whatnot. So I’ll put another plug in here for Docker Build Cloud, one of our products, where you get a shared cache among all your builds, and it automatically takes care of garbage collection and ensures that you don’t spend all your disk space on the caches.
And now with this, when you run builds, let’s jump over to a GitHub workflow here. This particular build isn’t using the cache, and we see that it’s having to pull the image layers every single time; we don’t see any cache being used. So a lot of it is having to push and pull all the dependencies every time. Once I’m using Docker Build Cloud, we’ll see that all the images are already pulled and layers are cached across various workflows. So again, it just makes it really easy, and very minimal change is needed here.
Builds with bake (24:19)
All right, let’s talk about Bake for a second. Bake is a newer tool that we’ve been working on that helps solve the needs of pretty sophisticated builds. Let’s start off with this complex CLI build. Now, I recognize not everybody has one like this, but this one has a --cache-from and a --cache-to, it’s doing a multi-architecture build, it’s specifying various labels, it’s going to have multiple tags, and it’s specifying a file, so it’s not using the default Dockerfile. Again, not everybody’s build looks like this, but we do see a lot of builds adding lots of labels, etc. So how do we make this easier? What Bake does, it’s a higher-level tool: abstraction, orchestration, declarative format, lots of terms you can put to it. But it allows you to basically codify your build commands, and then you can publish different variants of your images. Where BuildKit can do the individual steps within a Dockerfile in parallel, Bake can let you do parallel image builds and do some really cool stuff, linking them together and everything. It uses an orchestration file, and a lot of times you’ll see it using HCL, so it’ll look kind of like Terraform, though obviously with a different structure, but you can also write it in JSON or YAML as well.
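As a rough sketch of the kind of command being described (the registry path, tags, label, and Dockerfile name are made up for illustration):

```bash
docker buildx build \
  --cache-from type=registry,ref=myregistry.example.com/webapp:buildcache \
  --cache-to type=registry,ref=myregistry.example.com/webapp:buildcache,mode=max \
  --platform linux/amd64,linux/arm64 \
  --label org.opencontainers.image.source=https://github.com/myorg/webapp \
  -t myregistry.example.com/webapp:latest \
  -t myregistry.example.com/webapp:1.2.3 \
  -f Dockerfile.webapp \
  --push .
```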
In many ways, you can think of Bake like this: Bake is to build as Compose is to run. So, taking the previous CLI command, I could convert it to this Bake file, where now I specify a target with the outputs, the caches, the Dockerfile, the platforms, labels, etc. And then from here, I can just run docker buildx bake and it will do the job for me. So now that complex CLI command is codified in this HCL file, and I can easily reproduce the build. Now, this is a fairly simple example, although there are lots of arguments here. I can start to do parallel execution, where I want to build three different container images in parallel, each of them using a different Dockerfile and inheriting from a shared configuration. Again, you can do some pretty sophisticated things, and we’re not going to try to teach you everything here, but go check out the docs. There are a lot of really good examples in the docs around Docker Bake as well.
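As a hedged sketch, the CLI command above might map to a Bake file along these lines (the target name, tags, label, and registry paths are the same illustrative placeholders, not values from the slides):

```hcl
target "webapp" {
  dockerfile = "Dockerfile.webapp"
  platforms  = ["linux/amd64", "linux/arm64"]
  tags = [
    "myregistry.example.com/webapp:latest",
    "myregistry.example.com/webapp:1.2.3",
  ]
  labels = {
    "org.opencontainers.image.source" = "https://github.com/myorg/webapp"
  }
  cache-from = ["type=registry,ref=myregistry.example.com/webapp:buildcache"]
  cache-to   = ["type=registry,ref=myregistry.example.com/webapp:buildcache,mode=max"]
}
```

Then the whole thing runs with docker buildx bake webapp.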
Docker in Docker (DinD) (27:00)
Now to wrap up, we’re going to talk about Docker in Docker. Okay, Docker in Docker is a tricky subject, and it’s one that comes up quite a bit with builds, especially in CI pipelines, because if you’re using docker build, it requires a Docker engine, and many CI jobs run inside a container. For example, we talk to a lot of folks whose CI pipelines, whether in Jenkins or GitLab or wherever it might be, run every new job as a pod in a cluster or just as another container. So for that container to do a build, it needs the ability to run new containers. There are basically two options: either I mount the host’s socket into that CI job container, or I allow it to run nested containers, and that’s where Docker in Docker comes from. So many CI jobs run inside a container, and the idea is: I want to keep that CI job sandboxed, so it can’t break out and can’t do anything on the host, but it can still launch new containers; those are nested containers, Docker in Docker.
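As a hedged illustration of those two options (the image tags are the official docker images; the exact invocation will depend on your CI setup):

```bash
# Option 1 (Docker in Docker): run a nested daemon inside the job's container.
# This needs --privileged, which is exactly the permission many organizations block.
docker run -d --privileged --name dind docker:dind

# Option 2: mount the host's Docker socket into the job container.
# Every command inside now drives the HOST engine, so treat this as a security gap.
docker run --rm -v /var/run/docker.sock:/var/run/docker.sock docker:cli docker ps
```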
But this comes along with some challenges, because in order to do that, the CI job container needs extra permissions: it needs privileged mode access to be able to create those nested namespaces and launch containers, etc. Privileged mode may be blocked by your organization’s security configuration, and it may also conflict with Linux security modules on the machine it’s running on. You also start to run into some complicated file system issues, where if I’m trying to launch a nested container, the mount paths may not be accurate, and it can get into some pretty tricky situations.

So there are really two alternatives here if you’re doing this yourself. The first one is to mount the Docker socket into the container, and this comes with a very big warning: it can be considered a security gap. If I do this, I’m sharing my host’s socket with the container, so any commands inside that container are really operating on the host’s engine. That means the job could say, “I want to docker run and mount the entire host file system into a new container,” and then it has access to anything on the host. So you want to be very careful with that, but it is an option. It’s there, but recognize the security gap and the security posture you’re putting yourself into. Another option is to use Sysbox, and we’ve got a couple of examples of this, in which you can run nested containers safely. There are some trade-offs that come along with it, but there are ways to set it up so that, with Sysbox, you can run nested containers while controlling a bit of what those nested containers can and can’t do: they can’t mount the root file system, for example. So it’s complicated to set up, but it’s an option as well. And then finally, with Docker Build Cloud, there’s no need for Docker in Docker at all; your job is really only a client to remote builders, and again, you get native support for both ARM and AMD64 machines. I want to hit this middle point extra hard: imagine I’ve got a Kubernetes pod that’s starting up in a pipeline. That pod only needs the ability to communicate with Docker Build Cloud. It doesn’t need the ability to launch nested containers or really do anything else. And it allows you to still use all the same tooling you’re already used to: just docker build, and magic happens. And it’s going to leverage the shared caches and all the other things we’ve talked about before. So again, lots of advantages there.

And with that, we’ve talked about lots of different things when it comes to builds, buildx, multi-architecture builds, and cache management. So I say, thank you. If you’ve got any questions, feel free to reach out. If you’re a customer and you’ve got support needs, feel free to submit support tickets, and feel free to reach out to us in the community Slack and other locations as well. As always, we appreciate you, and if you’ve got any questions, again, feel free to reach out and keep learning. Thanks all.
Learn more
- New to Docker? Get started.
- Deep dive into Docker products with featured guides.
- Subscribe to the Docker Newsletter.
- Get the latest release of Docker Desktop.
- Have questions? The Docker community is here to help.