DockerCon
Streamlining CI/CD with Modern Container Builds
Kevin Alvarez, Sr. Software Engineer, Docker
Transcript
Hi, everyone, I’m Kevin from Docker. I work on the build team, and today’s presentation is about continuous integration (CI), focusing exclusively on our build tooling.
Overview
The first thing I want to talk about is the CI provider, GitHub Actions. This is a popular provider, so we will focus on it, starting with a short introduction. The main part will be about the Docker actions we provide; today we have six of them in the marketplace.
To get started, we will look at a basic workflow, so you can understand how we can evolve from it and how to optimize it using the build cache. As you know, Docker images support multiple platforms; this is a popular trend nowadays, so you will see how to create optimized container images for different architectures. After that, we will look at publishing your image using automated tags and labels, also using our official actions.
Next, we will see how to simplify your workflow using Bake. This is another build tool, not widely known yet, but I want to spend some time on it in this presentation so you can see the advantage of using its definition format to describe your builds.
Then there is another short but, I think, really important part, because we still see bad patterns being used on CI today: secrets. They deserve their own chapter. Finally, we will see how to use container-based workflows to achieve greater portability between your inner loop and your CI.
GitHub Actions
Let’s get started with GitHub Actions. What are GitHub Actions? First, there is the workflow itself; this is where your pipeline is defined, with events, runners, jobs, and so on. I will not go through the full specification, just enough so you have a better understanding of how it works, because I will show many workflows during this talk.
You also have events. An event is a specific activity on the repository that triggers a workflow run. There are many of them, like push and pull requests. Workflow dispatch is commonly used to manually trigger your workflow. There is also the runner: the server that runs your workflows, like ubuntu-latest, macos-latest, or self-hosted runners if you have them. Each runner runs a single job at a time.
There is the job: the set of steps inside your workflow that is executed on the same runner instance. And the action is, I would say, a custom application you can build. It can be a container action, a JavaScript action, or a composite action that runs some commands. For the purpose of this presentation, it will be about the actions we are providing; there are six of them.
Let’s see what the current state of our actions is today. We want to ease your pipeline, and to do this we provide these GitHub Actions for a seamless integration in the workflow. The most popular is probably the build-push-action. This one builds and pushes your image. It uses our build tooling, Buildx and BuildKit, with full support for things like multi-platform builds, secrets, and cache export. You can also use different builders and more advanced options, thanks to BuildKit. We are going to talk about the other actions as well. There are more satellite actions: for example, you need the login-action to be able to push your image. You first trigger the login-action, and then you can use the build-push-action.
There is also, as you can see, the actions-toolkit. It’s not an action on its own; it’s a library we use for all our actions. It provides some utilities and common logic around our build tooling. It is a minimal wrapper, if you want, that gives us an easier API to interact with the tooling. This toolkit is consumed by all our actions today.
Basic workflow
Let’s see a basic workflow. To get started, this workflow is triggered on a workflow_dispatch event; it is just manually invoked from your repository, without any other specific event. It uses two of our official actions: the login-action and the build-push-action. The login-action logs in to Docker Hub. After that, we can use the build-push-action to tag and push the image. This is really simple. I use a simple Go project, because I want to talk about multi-platform builds later, so it is easier to have a project like this. It is just a simple web server; there is nothing fancy behind it.
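As a reference, here is a minimal sketch of what such a workflow could look like. The image name and secret names are placeholders, and the action version pins are only indicative.

```yaml
name: ci

on:
  workflow_dispatch:

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      # Authenticate against Docker Hub before pushing.
      - name: Login to Docker Hub
        uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}
      # Build the repository (Git context by default) and push the image.
      - name: Build and push
        uses: docker/build-push-action@v5
        with:
          push: true
          tags: user/app:latest
```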
Run the build
Let’s run this build. I would have liked to do a live demo to see this in action, but it will be a screenshot for this presentation. As you can see, it took approximately 24 seconds. That’s not bad, but we can do way better if we leverage caching. If we look at the build logs, we can see that the build stage is never cached; it runs every time you run this workflow, and none of its instructions are cached. So we can leverage caching to fix this.
Optimizing cache
We can do better by optimizing the workflow with cache. Let’s first see what BuildKit offers in terms of cache export. There are many backends available in BuildKit today. As you may know, when BuildKit does a build, it automatically caches the build result in its own internal cache. But GitHub runners are ephemeral, so this internal cache does not persist between runs. The idea with BuildKit is to export the build cache to an external location, making it possible to import it in future builds. Several cache backends are available for this, and there is one in particular we are going to use today.
This is the gha backend, the GitHub Actions cache backend, which uses the GitHub cache API. It is basically the same API used by the official GitHub cache action. To enable this cache in your workflow, you first need to add the setup-buildx-action. This action creates a container builder to leverage all the BuildKit features available today: it uses the latest stable BuildKit, which is ahead of the BuildKit embedded in the Docker Engine.
Now we can set the cache instructions to say: I want to export this cache to the gha backend, and I also want to read the cache from the GitHub Actions backend. If we build now, the cache, both metadata and layers, is saved to the GitHub cache service. It is Azure Blob Storage under the hood, with GitHub’s own API on top, and it is the same backend used by the official GitHub cache action.
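A sketch of the two steps described above; the image name is a placeholder, and mode=max is optional (see the note about cache size below).

```yaml
      # Create a container builder running the latest stable BuildKit.
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Build and push
        uses: docker/build-push-action@v5
        with:
          push: true
          tags: user/app:latest
          # Read from and write to the GitHub Actions cache backend.
          cache-from: type=gha
          cache-to: type=gha,mode=max
```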
If we build this project again, let’s see how it performs with the new options. As you can see, it takes eight seconds once the cache is populated, so this is way better: we save about 67% of the build time. If we look at the logs, we can see that the instructions from the build stage are now cached. The GitHub Actions cache backend is the easiest remote cache you can use when you are on this CI. If you are using another CI provider, you can use, for example, an S3 backend, an Azure Blob Storage backend, or another storage backend, but then you need to handle the authentication layer yourself. With GitHub Actions, this is available right away, and you don’t need anything else.
Before going further, there is something you need to be aware of with the cache and GitHub Actions. The cache has a size limit of 10 GB that is shared across your repository. GitHub will save your cache but will begin evicting caches if you exceed that size, and losing cache entries can result in slower runs overall, so you need to be careful about what you export. By default, only the layers of the final stage of your Dockerfile are exported; if you export everything, for example with mode=max, keep an eye on the cache size.
There is also the scope key. By default the scope is named buildkit; this is the key your cache belongs to. It can be useful, for example, if you have a mono repo with many projects inside it: you can put one cache in a foo scope and the other in a bar scope, so they are not mixed together and each project keeps its own cache. Also, by default BuildKit does not export cache mounts to the GitHub Actions cache. If you wish to do this, there is a workaround using a separate GitHub Action, so you can export your cache mounts as well.
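For example, in a hypothetical mono repo, each project could keep its own cache entry by setting a scope:

```yaml
      # Hypothetical mono repo layout: the "foo" project gets its own cache scope.
      - name: Build foo
        uses: docker/build-push-action@v5
        with:
          context: ./foo
          cache-from: type=gha,scope=foo
          cache-to: type=gha,scope=foo
```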
Multi-platform builds
Now let’s talk a bit about multi-platform builds. As you know, Docker images support multiple platforms, which means a single image can contain variants for multiple architectures and operating systems. This is a popular trend, especially on CI, so it’s something we want to show today.
To do a multi-platform build, you basically just need to provide a single input. In this case, I say I want an x86_64 and an Arm build, so I specify those platforms, as sketched below. In some cases, depending a lot on how your Dockerfile is written, you also need to install emulators; for this, you can use the setup-qemu-action we provide, which installs the emulators for your build. If we run this build, as we can see, it’s three minutes. That’s not good.
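Here is a sketch of those inputs, assuming the same placeholder image name as before:

```yaml
      # Install emulators so non-native platforms can be built.
      - name: Set up QEMU
        uses: docker/setup-qemu-action@v3
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Build and push
        uses: docker/build-push-action@v5
        with:
          platforms: linux/amd64,linux/arm64
          push: true
          tags: user/app:latest
```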
That’s a lot just to build for one extra platform. Does someone have an idea why it takes so much time? No? Okay, maybe the logs will give us an answer. Let’s take a look. Yes, I think we found the culprit: the Arm build takes about three minutes on its own, while the other one takes 30 seconds. The emulation is guilty. As you may know, on the GitHub-hosted infrastructure the ubuntu-latest runner is an x86_64 machine, so the Arm build goes through emulation, and emulation is a huge penalty when you do a build; that’s why it takes so much time. How can we solve this? There are two solutions, and what we are going to see is how to use cross-compilation in our Dockerfile to remove the emulation penalty.
Let’s get back to the Dockerfile first. As we have seen in the logs, the build stages are built for their respective target platforms. In this case, cross-compilation with Go is quite easy: we just need a set of arguments so we can use the target architecture. The Dockerfile frontend provides these arguments in the global scope; when you use them, you get the specific target platform being built. For cross-compilation, we want to run the build stage on the native platform and use the target arguments to drive the Go compiler.
But how do we pin the build stage to the native platform? For that, we can use the --platform flag on the FROM instruction in the Dockerfile and set it to the build platform, the one where your build runs, and then use the target OS and target architecture arguments: we set GOOS and GOARCH so they match the target platform you define. Now let’s run the build and see how much time it takes. It’s 47 seconds; that’s better, because we are using cross-compilation.
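A minimal sketch of that Dockerfile pattern, assuming a Go project; the base image tags and paths are illustrative.

```dockerfile
# The build stage always runs on the native platform of the builder.
FROM --platform=$BUILDPLATFORM golang:alpine AS build
# Provided automatically by BuildKit for the platform being targeted.
ARG TARGETOS
ARG TARGETARCH
WORKDIR /src
COPY . .
# Cross-compile for the requested target platform.
RUN GOOS=$TARGETOS GOARCH=$TARGETARCH go build -o /out/app .

FROM alpine
COPY --from=build /out/app /usr/local/bin/app
ENTRYPOINT ["app"]
```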
If we go to the logs now, we can see the amd64 and arm64 builds take about the same time. There is still some penalty: if you recall, the single-platform build took approximately 24 seconds, and now each build takes around 31 seconds, because two builds are happening on the same runner, on the same instance. What we could do, for example, is distribute the builds across multiple runners, so there is less contention, and merge the results together afterwards; we will see that later. Keep in mind, of course, that cross-compilation might not be suitable depending on the programming language. With Go, it’s really simple; with Rust, it can be simple as well. But if you can’t do it, the only viable solution is to use native nodes, a native Arm node and a native amd64 node, and build on each of them. That requires a more advanced setup in your workflow, so there are some constraints with this.
Manage image tags
Now that we have seen how to do a multi-platform build, let’s see how to manage our image tags and labels when we push the image. This can be a bit complicated. In CI, we often see big chunks of scripts at the beginning of a GitHub Actions workflow just to compute the list of tags to push, depending on whether a semver tag was pushed to the repository, or, if you do edge releases from main, you may want the name of your branch used as a tag, with extra conditions around it. This is a bit of a pain. To save you the trouble, we developed a GitHub Action that lets you automate the creation of tags and labels in your workflow.
Let’s take a look. I just need to define an extract step before building, using the metadata-action. Only the images input is required; in this case, it’s a single repository, but you can list several images if you want to push to GHCR and Docker Hub at the same time. It generates the tags and labels that then need to be applied in the build-push-action; here I just set the tags input with the output from the previous step. For example, when I push to the main branch, it generates this list of tags: there is only one, called main, and there is also a list of OCI image specification labels that are generated automatically. We don’t currently have a way to opt out of this. As you can see, there is the description label, taken from the GitHub API; it’s the description of your repository. The title of your repository is also picked up, so you can track what is in your image using this action. This is pretty useful, I think: registries, for example, can take advantage of the OCI labels and display more meaningful information alongside your image README.
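A sketch of the extract step and how its outputs feed the build, with a placeholder image name:

```yaml
      # Generate tags and labels from the Git context.
      - name: Extract metadata
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: user/app
      # Apply the generated tags and labels to the build.
      - name: Build and push
        uses: docker/build-push-action@v5
        with:
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
```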
And if I push a v1.0.0 tag, it generates latest and the version tag as well. It can be a bit cryptic, so let me explain what’s going on under the hood and how the action handles this. By default, if you don’t specify the tags input, the default rules are type=schedule and type=ref with event set to branch, tag, or PR. The type=schedule rule is used when your workflow runs on a schedule event and produces a tag named nightly by default; the ref rules use the Git reference, like the branch name, the tag, or the pull request. There are more rules for specific use cases; for example, a common pattern in Git flow is to release your project by pushing a semver tag.
In that case, you might not want just the version tag and latest; maybe you also want major.minor, or just major, because some users don’t care about patch releases and only pin the major.minor of your image. To do this, you specify the rules you want to apply to your tags. Here, for example, I say I want the full version, but I also want major.minor. With these type=semver rules, when I push v1.0.0, it produces 1.0.0, 1.0, latest, and a sha tag with the commit SHA.
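For instance, the rules described above could be expressed like this; the patterns follow the metadata-action tag rule syntax, and the sha rule is the one producing the commit-based tag:

```yaml
      - name: Extract metadata
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: user/app
          tags: |
            type=semver,pattern={{version}}
            type=semver,pattern={{major}}.{{minor}}
            type=ref,event=pr
            type=sha
```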
This is all generated for you; you don’t need to care about it. And for a pull request, for example, it generates a pr-prefixed tag, in this case for PR 196, and the sha tag is set every time you push something, so this is pretty useful. We also have other rules: there are specific ones for Python-style versioning, and others for edge cases, like adding a flavor with a prefix or suffix. Let’s say you want an Alpine and a Debian variant of your image; you can add an alpine or debian prefix or suffix to the tags, something like that. So there are many rules, and there is extensive documentation in the repository if you want to take a look.
Bake
Now I want to talk about Bake. As you have seen, the workflow has become pretty big, and I want to drastically reduce its size; Bake will help us with this. Before getting started, you might not know what Bake is, so I will go back to the roots with the build command. As you know, you can end up with something like this with the build command today.
It’s quite long: there are many flags available on the build command, and we want to simplify this, because porting it from your local environment to CI means copy-pasting things and maintaining both, which is not great. In your workflow you end up with something similar, because you need to translate your local command into inputs of the build-push-action. That’s quite big, it’s not ideal, we have to maintain both, and I don’t like it. You also need to keep the contributing documentation up to date so people know how to build the project. With Bake, we can do something like `docker buildx bake --push` instead: the definition file describes what to build, and the --push flag says I want to push it.
Before going further, I want to explain what a Bake definition is, because it’s the starting point for using Bake. With Bake, we want to let you define projects, specifically reusable build flows that can easily be invoked by anyone using your definition file. This definition file can be an HCL file, a JSON file, or a Compose file. You can specify multiple files, and they are merged together in a specific order; if you don’t specify anything, Bake falls back to default file names, so, like Compose itself, it looks for the default files in the directory where you execute the bake command.
As you see, I put a small heart next to HCL, because that’s the one I’m going to talk about in this talk; HCL is very powerful for this kind of thing. The HCL format supports block definitions, like Terraform: it’s almost the same thing. You can use variables as build arguments in your Dockerfile or interpolate them in attribute values in your Bake file. There is also a set of general-purpose functions: you can use the functions provided by go-cty, and you can also define your own user-defined functions.
Let’s see an example of an HCL definition. This is how it looks: it basically adds support for custom build groups, which gives you better code reuse across your project with different target groups. A target, for example image or image-all, represents a single docker build invocation, with the same options you would specify on the docker build command line, and a group lets you run several targets together.
As you see, the group called default specifies its list of targets with the targets option. A target can also inherit build options by setting the inherits option to a list of targets; for example, image-all inherits from image, so it reuses the options of the image target. Variables are defined in a way very similar to Terraform: the HCL format supports variable blocks, like this one with TAG, and their values can be provided by the current environment as well, so if an environment variable with the same name is set, it overrides the default value.
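A minimal sketch of such a definition, with illustrative names and default values:

```hcl
variable "TAG" {
  # Overridden by an environment variable named TAG if one is set.
  default = "latest"
}

group "default" {
  targets = ["image-all"]
}

target "image" {
  dockerfile = "Dockerfile"
  tags       = ["user/app:${TAG}"]
}

target "image-all" {
  # Reuse the options of "image" and add platforms.
  inherits  = ["image"]
  platforms = ["linux/amd64", "linux/arm64"]
}
```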
Now if I want, for example, to build the image-all target, I just need to run `docker buildx bake image-all`, and that’s it: it builds exactly what I want. This is quite nice to streamline the inner loop and carry it over to other environments.
Let’s take a look with a simple definition now; the previous workflow was a big one, and you can see roughly how it maps. Getting back to the basic workflow from the beginning, with a Bake definition in place we now just need the Bake action. We also provide an action for Bake, and this step doesn’t define any tags input at all; it just sets push, and by default it uses the image name defined in the Bake file, the one you see up there, and tags the image with it.
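A sketch of that step, assuming the Bake definition lives at the repository root and already carries the image name and tags:

```yaml
      - name: Checkout
        uses: actions/checkout@v4
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      # Build the default group from the Bake file and push the result.
      - name: Build and push
        uses: docker/bake-action@v4
        with:
          push: true
```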
Now that we have seen this, the action also shows the canonical representation of the definition, so you can see what’s going on and what is taken into account, like the variables being passed. You can see it inside the Bake action output; it uses the --print flag to show this canonical representation.
Now I want to talk about Bake with secrets. We see too many bad practices around CI, like people using build arguments to pass sensitive information such as credentials. That’s not good, because build arguments can end up inside the image itself, not only at build time. Build secrets are what you should use in such cases, and they are simple to use.
Before going further, I want to add a new stage in my Dockerfile to test my project, and also add a new target for it in my Bake definition; before building and pushing, I want to test this project. But there is an issue: my tests require access to the GitHub API, otherwise they are skipped.
I need to somehow pass the GitHub token to this build. How can I do this? I can add a build-time secret. In my Bake definition, I just declare the secret I want to expose: a secret with the id GITHUB_TOKEN, sourced from the GITHUB_TOKEN environment variable. This is the same as using the --secret flag with the build command; there is nothing different. My Dockerfile can then mount this secret when running the tests, and the tests read it as an environment variable. Finally, in my workflow, I set GITHUB_TOKEN from the secrets provided to the workflow, so it is passed through and can be used inside the container directly. This is pretty useful and secure, because the secret is only available to that RUN instruction; it will not be in your final image when you push it, it is only used for this stage.
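A sketch of how this could look under those assumptions (ids, stage names, and the test command are illustrative): first the Bake target, then the Dockerfile stage that consumes the secret.

```hcl
target "test" {
  inherits = ["image"]
  target   = "test"
  # Expose the GITHUB_TOKEN environment variable as a build secret.
  secret   = ["id=GITHUB_TOKEN,env=GITHUB_TOKEN"]
}
```

```dockerfile
FROM build AS test
# The secret is mounted only for this RUN instruction and never stored in a layer.
RUN --mount=type=secret,id=GITHUB_TOKEN \
    GITHUB_TOKEN=$(cat /run/secrets/GITHUB_TOKEN) go test ./...
```

In the workflow, the step running Bake would then simply set GITHUB_TOKEN in its env block from the repository secrets.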
Container-based workflow
With that in mind, we can explore the potential of using containers in your CI and how to achieve greater portability between the inner loop and your CI environments. The idea is to use containers precisely because you get better portability. If I add, for example, a new target, let’s call it lint, my Bake file starts to get big, but this is useful: I can reuse this target in the future. I also create a lint stage in my Dockerfile that just runs golangci-lint to check that my code is okay. And in my workflow, I now add a new lint step.
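The lint stage could look something like this, assuming golangci-lint as the linter; the image tag is illustrative:

```dockerfile
FROM golangci/golangci-lint:latest AS lint
WORKDIR /src
COPY . .
# Fails the build if linting reports issues.
RUN golangci-lint run ./...
```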
If we take a look at the logs now, we can see my lint stage passes. It’s okay, but there is something strange: it exports an image, and I don’t want that, I just want to run the lint part. By default, the build command writes the image to the local store every time you do a build, but you can build without exporting anything. To do this, you can set this target to use the cacheonly output type. This output type discards the build result but still writes the cache for this specific target. I should do the same for the test target; I forgot to do it before, so the test target doesn’t output anything either. If, in the future, we want to export the test results to publish them somewhere, such as a code coverage service, we could export the build result as well. So this is useful if you don’t want to waste time, or if your stage doesn’t need to output anything and just runs checks.
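In the Bake definition this could be expressed like so (target names are illustrative); the same output attribute would go on the test target as well:

```hcl
target "lint" {
  inherits = ["image"]
  target   = "lint"
  # Discard the build result but keep writing cache for this target.
  output   = ["type=cacheonly"]
}
```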
As you can see, there is a common pattern: every time we add something, we add a new step to the workflow, so the workflow is likely to grow quite big. Here, for example, I can group test and lint, because the purpose of both steps is to validate my code. What I can do is merge them into a single validate group, call validate instead, and the separate steps go away.
This adds more abstraction around your local environment, so you can tell users to just run validate and it does everything for them; you don’t need to tell them to run test, run lint, and so on, just validate, and that’s it. Of course, the targets run in parallel thanks to BuildKit; unlike a Makefile, there is no sequential ordering here. As you can also see in the canonical representation, it calls both targets.
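The grouping described above is just a Bake group (names as before); `docker buildx bake validate` then runs both targets in parallel.

```hcl
group "validate" {
  targets = ["test", "lint"]
}
```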
I did not plan to jump into this topic, but I think it could be interesting. Previously, the test and lint targets ran on the same runner. That is useful, but if you know that a step is resource constrained, taking too much CPU or RAM, you may want to split the work between runners to take advantage of parallel GitHub-hosted runners. To solve this, you can distribute the targets of a group, the validate group in our case, across runners, so they are not part of the same job but of two jobs. We can distribute group targets to multiple runners while keeping the same Bake groups, so the local flow stays exactly the same; only the workflow changes a bit. There is a pattern with GitHub Actions to set up a matrix dynamically.
It’s a bit cryptic. We want to support this directly in the Bake action in the future, so you won’t need to do it yourself; it will fan the targets out to multiple runners automatically. What it does, in this case: I use the bake command with the validate group and the --print flag to list the targets available in this group, put that list in an output of a prepare job, and then use a matrix over these target values. The list contains test and lint, so the validate group is split across runners: test on one runner and lint on another.
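A sketch of that dynamic matrix pattern, assuming the validate group from the Bake definition above; the jq path and job names are illustrative:

```yaml
jobs:
  prepare:
    runs-on: ubuntu-latest
    outputs:
      targets: ${{ steps.generate.outputs.targets }}
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      # Resolve the Bake definition and extract the targets of the validate group.
      - name: List targets of the validate group
        id: generate
        run: |
          echo "targets=$(docker buildx bake validate --print | jq -cr '.group.validate.targets')" >> "$GITHUB_OUTPUT"

  validate:
    runs-on: ubuntu-latest
    needs: prepare
    strategy:
      matrix:
        target: ${{ fromJson(needs.prepare.outputs.targets) }}
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      # Each matrix job runs one target (test or lint) on its own runner.
      - name: Run ${{ matrix.target }}
        uses: docker/bake-action@v4
        with:
          targets: ${{ matrix.target }}
```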
To put it another way: before, a single validate job ran both test and lint in parallel on one runner; now they are split between runners, so instead of 57 seconds it takes less time, because each target runs on its own runner. This is one way to distribute builds. We have examples in our documentation for distributing multi-platform image builds the same way: let’s say you build for five platforms, you can distribute them across five runners, because doing it all on the same runner can be quite resource-intensive. Another use case: many people build in order to push an image to a registry, but you may also want to produce the binary itself and attach it to a GitHub release, for example. You can do this with BuildKit; you just need to change the output and say: I don’t want to output to a registry, I want to output locally on my client. In this case, it outputs the binary from the stage called binary to a local bin directory. So I create a dedicated workflow for this, triggered when a tag is pushed, and once the binary is built, the workflow creates a GitHub release and uploads the binary.
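As a sketch, the binary target could export to a local directory, and a tag-triggered workflow could then attach the result to a release. The release action shown here (softprops/action-gh-release) is just one commonly used option, not something prescribed in the talk; paths and names are illustrative.

```hcl
target "binary" {
  inherits = ["image"]
  target   = "binary"
  # Write the build result to the local client instead of the image store.
  output   = ["type=local,dest=./bin"]
}
```

```yaml
name: release

on:
  push:
    tags:
      - "v*"

jobs:
  release:
    runs-on: ubuntu-latest
    permissions:
      contents: write
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      # Build only the binary target and export it to ./bin.
      - name: Build binary
        uses: docker/bake-action@v4
        with:
          targets: binary
      - name: Create GitHub release
        uses: softprops/action-gh-release@v1
        with:
          files: bin/*
```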
You can do the same for other use cases, like multi-platform binaries: do a multi-platform build, extract all the binaries, and push them to GitHub releases directly. You just add a final action for the release step. We could even do that part inside the Dockerfile itself, but it requires more work because you need to talk to the GitHub API, so it’s easier to use a GitHub Action that already exists.
Conclusion
What have we seen today? We used cache exporters, created efficient images for different architectures, and looked at pipeline strategies for automating tags. We talked a lot about Bake to simplify the workflow and reduce the overhead in your CI, and about using build secrets with Bake. And we saw how to use containers to achieve better portability between your local environment and your CI environment.
Before closing: we have projects already using this container-based pattern. For example, buildx itself uses it and can do many things this way: cross-compilation, updating and checking vendored dependencies, linting. There are many things you can do with this pattern, including using a matrix. Bake also supports a matrix inside the definition; I did not talk about it in this presentation, but we have great documentation about it, so the GitHub Actions matrix I showed today can be expressed inside Bake itself. This is something useful, and we are doing it in this project. So that’s it. Thank you.
Learn more
- GitHub Actions: https://docs.docker.com/build/ci/github-actions/