DockerCon
Docker Rocks in Node.js, 2023 Edition
Bret Fisher, Docker Captain, DevOps Dude
Transcript
Hello. All right, this is the audience participation part. I like to be as not boring as possible. So, my name is Bret. This will be the third time at DockerCon that I’ve done a Node.js talk. So this is a refresh. And I always think, I’ll just update five slides, it’ll be a five-slide change. But everything in this is different, because there’s so much new Docker stuff. Node has had a few changes too; it’s changed a few commands, a few things.
So we’re going to have an adventure together, and I want you to yell out. Again, audience participation. And then I’m going to leave plenty of time for questions. Because we can make this a three-hour conversation, right? I have a whole course on Node.js for Docker on YouTube, and it’s not 45 minutes. I want to make sure that if you have questions that I don’t address in this talk, go to a previous year, and they’re all online on YouTube. And then watch that to answer that specific question. Because I keep adding new stuff, and I can’t fit it all in here.
Table of Contents
New and latest
I want to focus on things that are new and latest, so that you get all the new stuff for this year. And the design here is, who is it for? Right? So you know some Node, you know some Docker, and you want to be awesome at it. And I work with a lot of teams as an advisor, not just an implementer, so I’m looking at their stuff and advising on how to improve it: streamline things, make them simpler, also more secure, stuff like that. So we’re going to go for awesome sauce mode today. And we’re going to go over four main things, and then we’re going to end with a production checklist.
So we’re going to start with Node Dockerfile best practices. This isn’t your basic 101 stuff you get on the internet; this is more than that. And then we’re going to talk a little bit about base images, because we’ve got new exciting things to talk about, and then there’s also the Valley of Despair of, like, it’s hard, right? Going from day one to true production, enterprise-grade Node images is harder than you think it is. Some of you probably have great experiences and wonderful tales to tell about all the things you’ve tried.
We’re going to talk about Node process start up and shut down. Although most of that is actually going to be me referring you to previous years, because I went on for like 15 minutes about it to give you all the nitty gritty, but we’ll cover the basics of that. Then we’re going to talk about some new Compose stuff, because I love Compose. It’s still, even with all the teams I work with on Kubernetes and all the fancy tools, Compose is still the place we keep coming back to for local development optimization and simplifying the dev setup. I’m an ops guy at heart, but I want development to be as simple as we can make it, right? So we all want this beautiful idea of a magical development stack that looks just like production, but also extremely easy to use and fast. That’s hard to do, right? So I still love Compose because it’s local, it’s quick, it’s simple. The file is easy to understand, and there’s a lot of new updates in the last couple of years, especially if you didn’t see my talk last year or the one in 2019.
Dockerfile
So let’s just jump right into the Dockerfile, because that tends to be the first place that people start making ill-advised decisions, and the internet gets a lot of things wrong. In fact, in this talk I used to ask, you know, who’s seen this 101 on the internet? Raise your hands, right? Every blog post of the last 10 years has been, this is how you do Node.
Can anyone yell out what they see, anything they see wrong with this file? There’s a dozen things wrong, probably. Copying in everything. Yeah, yeah, down there at the bottom. There’s two COPY commands, which is technically correct. What else? Anyone see anything else? The base image, right? We definitely have room for improvement there. That’s every Docker 101 example, but it’s probably not the image that anyone on the internet should ever be using. I have opinions. There have actually been some changes to WORKDIR in the last three years, and now WORKDIR does correctly assign permissions. But you’ll notice we’re still using root, because the node image, just like all Docker images, defaults to easy mode, which doesn’t mean the most secure mode.
Back in 2013, when they made these images, they were going for simplification and ease of use, and we’ll get into that, and why we want to change all that. If we just revamped that and said, what if this was your day one? This isn’t necessarily production ready, but a day-one image. You’ve got Tier 1 supported builds, meaning, if you didn’t know, in Node land the Node project supports different compilations and different platforms, essentially, for Node, and they have tier rankings. Tier 1 is the best; it’s for production. You can get support contracts, stuff like that, for Tier 1. Tier 2 means they do their best; it’s not as important, but they try. And then Experimental is beta: it may work, it may not, they don’t guarantee anything. Weirdly, or maybe ironically, even Docker sometimes recommends the Alpine image, which you will hear me talk about today in a not nice way. Alpine the project is a fantastic project. I just never recommend Node Alpine in production.
I’ve worked for 15 years with Node, and 10 years with Docker, and eventually every project that I’m on, if they’re doing large-scale Node stuff in production, will eventually have problems that are Alpine specific, mostly due to musl, the alternative C library that everything in Alpine is compiled against. And there are other reasons; BusyBox is sometimes a problem. So, you won’t see me recommending Alpine today, but don’t worry, I’ve got lots of recommendations for you that are even better. Remember the tiers: you want an image that’s a Tier 1 supported build of Node, and Alpine is the one that’s rated Experimental.
Next up, we’re pinning the image. So, in case you didn’t know about pinning of images, it’s been around a while, but it’s how I can guarantee that I’m getting the exact same base image, because tags can be reused, right? In this case, I’m using Node 20, so I’m not pinning to the patch-level version, but I am pinning the SHA hash. You can get those hashes from the `docker images` command; there’s a `--digests` flag that shows them. Now, technically, when you pin an image like this, Docker ignores the tag, but the tag is for the humans, so we know what we pinned in the file. So, when you pin a SHA hash, the tag is just a friendly label to know what the heck it’s from.
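For reference, this is the shape of a pinned base image line; the digest below is a placeholder for whatever your lookup returns:

```dockerfile
# Find the digest first, e.g. with one of:
#   docker images --digests node:20-bookworm-slim
#   docker buildx imagetools inspect node:20-bookworm-slim
# The tag stays as a human-readable label; the digest is what's actually enforced.
FROM node:20-bookworm-slim@sha256:<digest-from-the-command-above>
```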
So, the Node 20 bookworm means that’s the version of Debian this is based on, which is the latest Debian, and then Slim: always use Slim images. For every programming language on Docker Hub, if you can get official images, always use the Slim variant. You never want the non-Slim variant of Debian, and you’ll see why in a minute. We’re running as non-root. Now, something changed in the last three years so that I can now put `USER node` in there, and that user is built into the official Node images by default. The user already exists; I can put that there to run as non-root. This is key. A lot of Kubernetes clusters, especially in sectors like government and financial, will not allow a container to run as root, so you have to do that. And if I do that before WORKDIR, and I actually learned this earlier this year, that it changed about two or three years ago maybe, they updated WORKDIR so that it assigns the proper permissions based on the USER above it.
So, you put the user in first, you make it non-root, and then when you create that WORKDIR, it will properly give the node user permissions, so you don’t have to assign those manually. If you’re someone who’s taken my Node courses, the way I used to teach it, you had to run `mkdir` and do all these things in a RUN command. You don’t have to do that anymore; it’s easier. Then there’s the copying. We’re copying with the right permissions, because we’re now a regular user, not root, so we have to use the `--chown` flag any time we do a COPY. And you see the multiple layers there: we copy the package files first, and then the rest of the source code.
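Putting those pieces together, a minimal sketch of that day-one Dockerfile might look like this (the digest is elided, `index.js` stands in for your real entrypoint, and it uses the corrected `npm ci` flag covered next):

```dockerfile
FROM node:20-bookworm-slim@sha256:<digest>

ENV NODE_ENV=production

# Non-root first, so the WORKDIR below is created with node's permissions
USER node
WORKDIR /app

# Copy with the right ownership, since we're no longer root
COPY --chown=node:node package*.json ./
RUN npm ci --omit=dev
COPY --chown=node:node . .

CMD ["node", "index.js"]
```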
npm ci
And then what’s next? We’re doing `npm ci`. This is actually incorrect, because I just learned today that `npm ci` has changed three times now, or twice; this is the third rendition. So, you’ll see in future slides that technically it should be `npm ci --omit=dev`; that’s what you want. You’ll see it in the slides, and it’s also in this repo. In case you didn’t see, I’ll put it again at the end of the slides: there’s a whole repo with all this stuff in it, example Dockerfiles, and I keep updating that repo every year. So, you can get all these notes in a lot more detail. It’s a tome of information, really.
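Roughly, the renditions of that flag map out like this:

```console
$ npm ci --only=production   # older npm versions
$ npm ci --production        # later alias, since deprecated
$ npm ci --omit=dev          # current form (npm 8+)
```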
So, you’re doing the `npm ci` that skips dev dependencies, and we’re not running npm or any other process manager in the CMD. We don’t ever want to use npm in production to start the thing that’s going to be the long-running process, and you’ll learn why in a little bit. So, I’ve actually changed my opinion about some things as tools change in the industry. One of them was, I used to teach people that a stage in a multi-stage Dockerfile may be a great way to do your npm audits, your Trivy scans, any of your CVE security scans. And I’m now saying, no, we’re not going to do that anymore.
One of my hopes in the industry was that the CI tools out there would start looking at build stages and Docker commands, essentially each Docker step, as a thing they could light up in their CI solution, so that we could basically use a Dockerfile to do a lot of our CI and a lot of the automation, you know, testing and all the other things we need to do. That didn’t happen in the industry, so even though I kept advocating for Docker build as the way to do this, I don’t recommend that anymore. All the modern CIs, GitHub Actions, you know, GitLab, all of them, have better support natively to do your npm audits and your CVE scans and stuff like that. So, I don’t recommend these stages anymore, which is great. It simplifies our Dockerfile. We don’t have to do it.
Docker init
Let’s talk about Docker init. So, this is a new thing. You heard about it in the keynote. Docker init allows you to start a project from scratch, which means it comes with an opinionated Dockerfile. And this is kind of what it would look like if you ran that command. So if you’re brand-new, day one to Docker, Docker now at least has this init option, like every other package manager in the world. So, it’s great. It’s great for new people. I have opinions. It’s one of those things where nothing’s ever perfect or universal for everyone, right? It asks you a bunch of questions, you give it a bunch of answers, and then it creates three files, which is pretty great to start with: the .dockerignore, the Dockerfile, and a Compose file. By the way, for Compose files, the standard is now `compose.yaml`, not `docker-compose.yml`, which is what all of us that have been doing this for 10 years have been typing. It still supports all the file names, but this is the new convention: `compose.yaml`. So, it creates those files for you, and then recommends, you know, that you do a `docker compose up`.
So, let’s just take a quick look at that while we have time. So, over here, if I look at the Dockerfile, it’s actually pretty fancy. We have the syntax line at the top, in case you weren’t familiar: because BuildKit is now the default builder, we have this thing called a frontend that allows BuildKit to update dynamically and support a lot of new features. So, there’s a lot of new stuff happening in the Dockerfile, but it’s not necessarily in the OCI spec; it’s happening in BuildKit, through something called the frontend. If you put that syntax line in, what it basically guarantees is that when all of your team members build images, or your CI builds images, assuming they’re all using BuildKit as they should be, because it’s still the best container builder, they will all have the same support for the same feature set inside the building of your image, which is important if you start to use advanced features. And we’ll talk about a few of those eventually. In fact, we’ll talk about one of them right now.
This is a heavily documented file, all generated by Docker, and you’ll see things like it mounting files in at build time, right? This is a fancy BuildKit frontend feature that’s been around for years. I don’t always use or recommend this. I have to look at the team and decide: are they installing 200 megs worth of node modules, or 1,000 megs of node modules? Because I’ve seen both. And the bigger your node modules, the more this caching helps. It isn’t caching node_modules itself; it’s caching the installation of those node modules, so the downloaded archives before they’re expanded into node_modules. So, it’ll save you internet trips on rebuilds. It does require you to use BuildKit, because I do believe this is very specific to the Docker builder with BuildKit. Again, that’s fine, because it’s the best one.
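The pattern is roughly this (a sketch, not the verbatim generated file; the cache path assumes npm runs as root and caches to /root/.npm):

```dockerfile
# syntax=docker/dockerfile:1
FROM node:20-bookworm-slim

WORKDIR /usr/src/app

# Bind-mount the manifests for the install step, and cache npm's
# download directory so rebuilds skip re-downloading tarballs
RUN --mount=type=bind,source=package.json,target=package.json \
    --mount=type=bind,source=package-lock.json,target=package-lock.json \
    --mount=type=cache,target=/root/.npm \
    npm ci --omit=dev

COPY . .
CMD ["node", "index.js"]
```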
So, they give you this nice little file, where it helps you optimize that, and it has a bunch of stages where it copies the files. And this is all fine. I do find with teams I work with that if they’re new to Node in Docker, this is a lot to take in. I try to teach and help teams crawl before they walk, and walk before they run. This is a little bit closer to a fast walk, so maybe not the day-one Dockerfile, but hey, it’s great that Docker gives it to us. The one thing I don’t agree with is that they default to Alpine. Again, as I was telling several people at the conference, after 10 years of trying to help teams use Alpine in production, I’ve kind of given up and don’t recommend it.
What else does it do? It gives you the Compose file, which we’re going to talk about in a little bit. And it lights up some pretty sweet new features in Compose files, because we now have the Compose Spec. So, you probably know, if you’re an avid Compose user, we no longer have Compose versions in the file; we now support all the features of v2 and v3. Even in the keynote, someone showed a file with a v3.4 in it, which is considered legacy. No versions needed. All features are available in the Compose command line, as long as you remove the Compose version. And we’re going to talk about health checks and other stuff here in a minute, so I’m not going to go into much more of that.
The right base image
So, the perfect base image. This is my favorite part of the talk, because I could either talk for five minutes or 50 minutes on this topic. But a lot of time gets spent in most teams I work with on finding the right base image that has all the things they need, and then none of the vulnerabilities they don’t want; it’s small; it meets all these three or four different metrics. Right? And it doesn’t exist, right? This is very specific to your team, and almost every team I work with chooses a different path based on their culture, based on their requirements, based on the security team’s involvement. And it’s a balancing act. The more secure and smaller you make it, the more advanced you’re going to have to be to use it. So, it just depends on what you need in your team. And some teams are perfectly fine with one of the default official images.
Let’s start with the first three. Never use the top one. Never, ever in your life use it. There’s zero reason to use it. One of the big negatives, besides size: you can see all these CVEs. So, this is me using multiple scanners in the industry, two open source, two commercial. You’ll notice a trend that the commercial ones tend to have fewer false positives. This is something I actually discovered recently, and I’ve had multiple conversations this week about it: there’s definitely a huge difference on some images between the open source scanners and the commercial ones, which are specifically triaging issues in each image that might be false positives. They tend to fix them faster. I’m not really sure what’s going on in the background. It doesn’t mean you can’t use an open source scanner. And, you know, Docker Scout is working great, and it’s really new. It’s not perfect yet, but the team’s taking a lot of feedback.
You’ll see here that we actually have a couple of images that Docker Scout’s not perfect on; it doesn’t scan them correctly. But I always start people with Slim, the second one. Right? Slim is way smaller; it’s less than a fourth the size, with a huge difference in the CVE count. And it has everything you need to run Node. The problem with the base image, the original one at the top, shows up with teams that need OS package manager dependencies, so they need apt or yum. What happens with the first one is, it has tons of stuff in it, so when people get their first image working, it’ll work. But then if they try to use Slim, the build will fail, because there’s a missing dependency they didn’t specify, because they didn’t realize in their day-one Docker experience that the default Node image has Mercurial in it, it has ImageMagick, it has so many things we usually don’t need in a Node image. It also has all the build tools, so you can do binary builds. Sometimes you need that. But usually you want to specify those things yourself by putting in your own RUN line. Right? So you get the point there. So I don’t recommend that one. The Alpine one has great CVE numbers, but for various reasons that you can hear in my talks over the years, and in the repo, where I’ve put lots of detail on the pros and cons, I generally don’t recommend Alpine, even though it is nice and small. We can get smaller than Alpine without the negative side effects of musl and BusyBox.
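So on Slim, declare what you actually need yourself. A hypothetical example, with imagemagick standing in for whatever your app really depends on:

```dockerfile
FROM node:20-bookworm-slim

# Install only the OS packages the app genuinely needs, then clean up
RUN apt-get update \
    && apt-get install -y --no-install-recommends imagemagick \
    && rm -rf /var/lib/apt/lists/*
```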
Next I’m going to show the Debian images, just to give you a comparison. These official Node images from Docker Hub are based on Debian, which means that a lot of the vulnerabilities, if we’re focused on that, come from the base image; Node can’t do anything about them. So the Debian 12 Slim, for example, has a few fewer, but it still has some vulnerabilities. And then you can see Ubuntu.
Ubuntu
So Ubuntu is going to be one of my recommended images for you. I’m going to give you three recommendations in a minute, but we’ve got to take the journey to get there. Traditionally, for those of us who are sysadmins, when we think of Ubuntu, we think of LTSes like 20.04 and 22.04; those are the long-term support releases of Ubuntu. You don’t necessarily have to do that in a container image, because if you need to change from 22.04 to 23.04, it’s just a one-line change. And in theory, you’re getting newer dependencies. And in this case, you can actually see that there are fewer vulnerabilities in 23.04.
You sacrifice a little bit of that long-term availability of package manager stuff, but that’s getting into the weeds a little bit; I’m not going to talk about that today. So you might have an opinion in your company. I see companies that say, well, we only use Ubuntu LTS images. And I know multiple companies on AWS and Azure whose approach to all their own base images is to start with Ubuntu and build their own images from there. And I’ll show you a couple of ways to do that.
One way you can do it, Node-specifically, is to start from Ubuntu, which is a very small image, right? Ubuntu 22.04 is smaller than all the others, 69 megs. And it’s great because it has the classic Ubuntu enterprise support built in, it has the long-term apt package manager stuff built in, and it’s well supported on the internet, well documented. And you can add in the official Node binaries with NodeSource. Is anyone here familiar with NodeSource? If you’ve ever installed or built Node, you probably know about NodeSource. So you can make a Dockerfile, which is in the repo for this talk, that shows you how to build this, and you just install Node from NodeSource. One negative of that method is that NodeSource, even though I’ve complained, requires Python in order to install Node. So the Node image in this case now has Python and all its dependencies, which will bring vulnerabilities to your Node image. I don’t like that. That’s why I’m using Node, not Python. So I don’t like that option.
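For the curious, the NodeSource approach looks roughly like this, per their documented apt setup script (details vary by Node version):

```dockerfile
FROM ubuntu:22.04

# NodeSource's setup script adds their apt repo -- and, as noted,
# the install path drags Python along as a dependency
RUN apt-get update \
    && apt-get install -y --no-install-recommends ca-certificates curl \
    && curl -fsSL https://deb.nodesource.com/setup_20.x | bash - \
    && apt-get install -y --no-install-recommends nodejs \
    && rm -rf /var/lib/apt/lists/*
```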
The next option is what I’m calling a side load, where you use the COPY command in your Dockerfile to simply copy all the binaries out of the Node image into the Ubuntu image. Now you don’t need apt for it, you don’t need all that extra stuff; you get just what you want. It’s a smaller image, you can tell here, 225 megs, so it’s leaner than the others. And it has a small vulnerability count; there are no highs or criticals in any of the scanners. One negative of this approach, though, is that the binaries are probably not going to be picked up by your CVE scanner, except for Snyk. Yeah, Snyk detects those binaries even though they’re not installed by apt, and it reported that there were no vulnerabilities. I’m hoping that Docker Scout will do that one day, and I’m going to let them know that they should. Then after that we have the idea of moving Ubuntu to 23.04, and you can see the results in the scans there. It’s a couple fewer CVEs, because 23.04 has newer dependencies in Ubuntu than 22.04.
Distroless
Then the last two we’re going to talk about: Distroless. Who here, is anyone using Distroless? Got one up front. So Distroless is a cool idea. I have issues with it, and you’ll see the little footnote dots, the three and the four; those refer to the GitHub repo where all of this is, which you’ll see on the last slide again (it was on the first slide too). But Distroless has side effects. You can’t pin a lot of things; it doesn’t keep versions over time the way I wish it would. And because of the way it’s designed, it doesn’t have apt installed or anything, so it has to be the last stage, which means you inherently have to have an advanced, multi-stage Dockerfile: build images, and then production images where you copy everything into this Distroless image.
So I consider it an advanced solution, but it still has vulnerabilities in it. In fact, in some cases it might have more vulnerabilities than the Ubuntu one. So why would I use it? The point of Distroless was to be small and secure, distro-less, that is. And it’s not always the best choice.
Chainguard
The new one out there is Chainguard. Who here has heard of Chainguard? Anyone? Okay, we got a couple. So Chainguard is a software supply chain security company. They’ve been on my show; in case you didn’t know, I do a YouTube livestream about this stuff every week, with guests, so join me there. We are live every Thursday, and I had Chainguard on last year, and I loved it. In my opinion, how I would describe what they’re doing with Wolfi is: they’re taking Docker official images, going back and redesigning them from scratch, and then maintaining them themselves to get to zero CVEs across the board. And they’re very public about it. These are free images. They have a paid plan that allows you to do a few more things with these images, but you get a lot out of the box for free. They have their own registry. I highly recommend them for any team.
This would be one of my top three, if not number one, images that I would have a team trying to use. It is a little advanced. It does require a little bit of understanding, because when these images get really small like this, they don’t have shells, right? They don’t necessarily have all the packages you need. So it does get a little harder. And that leads me to this slide. These are the prime recommendations. They’re not in any order; they depend on your team and what it might need.
So if you want to use an official image, which is easy and comes out of the box, that’s the Node Slim, right? According to Snyk and Docker Scout, it currently has no critical or high vulnerabilities, only lows and mediums. I should have defined that at the beginning, sorry; it’s defined on the website. And then the second image there is the one we built where I side-loaded. I don’t know if that’s the right term; I’m making up the term side load. And if you want to see what that looks like, it’s pretty simple. So in this file, a regular Ubuntu image, the way I get Node into it is I use the `COPY --from`. And this is a legitimate approach; I see multiple teams out there doing this for other types of images.
So I define both of my images at the top. I say, here’s my Node image, and here’s my Ubuntu image. I’m going to use the Node one later, but I want to define them all at the top so I can track versions. I should be SHA pinning these so that I have the hash to guarantee I get that exact image each time. And then I’m giving them an alias. And then we’ll talk about tini here in a little bit. But here I am side loading Node in by copying it from one image to another, because I trust the official Node image for building the correct Node version. And since I can specify the Node version coming from Docker Hub, I know exactly what binaries I’m getting. I just need to get them in here without the side effects of NodeSource or Python being loaded, or apt package dependencies that I don’t really need. And I’ve tested this in production. This is an example I’ve been giving out for like four years, and I haven’t had any negative effects so far. So there’s that.
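A sketch of that side-load pattern (digests omitted here for brevity; the repo’s version pins them):

```dockerfile
# Define both images up top so versions are tracked in one place
FROM node:20-bookworm-slim AS node
FROM ubuntu:22.04

# Side-load the official Node binaries: node itself, plus npm/npx,
# which are symlinks into /usr/local/lib/node_modules
COPY --from=node /usr/local/bin/ /usr/local/bin/
COPY --from=node /usr/local/lib/node_modules/ /usr/local/lib/node_modules/

# tini as PID 1 -- more on that shortly
RUN apt-get update \
    && apt-get install -y --no-install-recommends tini \
    && rm -rf /var/lib/apt/lists/*
ENTRYPOINT ["/usr/bin/tini", "--"]
```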
So those three options are for you. And then you’ve got Chainguard there at the bottom, right? The Node latest image. If you want to pin versions on Chainguard, they changed their policy recently because of their increased success, and if you want pinned versions in the tag, you need to sign up for one of their paid plans. But you can always pin the SHA hash like I’m recommending, and that’s essentially getting you the same thing, because they’re always going to make those SHA hashes available, and you can depend on those.
Process management
Let’s move on. All right, process management. How many people here know about an init process or use tini or any of these things in Node, right? Do we have a couple people? Okay, like half the people. Great. So you know about this problem. And I also have opinions. So I spent years working with teams on managing Node processes in Docker and Swarm and then Kubernetes. And trying to figure out the init problem and process shut down. And basically zero downtime deploys, never miss a connection, never miss essentially an HTTP ping, and this caused me to go down the rabbit hole of signals and what init processes are really doing, and what does zombie reaping really look like in the wild, and does Node even have these problems.
So I came up with a slide. I ended up making this really complicated decision tree for today, and then realized I could just help you with two questions that tell you whether or not you need this. In most cases you want to add tini as the thing that starts Node in your container, so not npm. I prefer tini because that’s what ships built into Docker. But the first question: does your app create subprocesses? A lot of Node apps don’t. They might do calls out to the file system, but they don’t necessarily spawn curl or some other binary on the machine. Or, if you’re in Kubernetes in production, there’s an option you can turn on, which unfortunately isn’t on by default: the shared process namespace. If you do that, Kubernetes has this neat trick where the pause container — who knows about the pause container?
Pause container
So the pause container is the first thing in every Kubernetes pod. It’s always there. It’s super small, like a hundred lines or 50 lines of code or something. And that thing will do the zombie reaping and the protection and the signal processing that you need. It will do that for you, but only if it’s sharing the process namespace with the rest of the containers in the pod, which, unfortunately, back in like Kubernetes 1.12 or something, they decided to not do by default. So if you set that to true in Kubernetes, all the containers in your pods will be in the same namespaces, and essentially Kubernetes gives you a free init manager known as pause. So you don’t need tini in that case. You can avoid it.
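In the pod spec, it’s one line. A minimal sketch, with hypothetical names:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-node-app
spec:
  # Lets the pause container be PID 1 for every container in the pod,
  # reaping zombies and handling signals for free
  shareProcessNamespace: true
  containers:
    - name: app
      image: my-node-app:latest
```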
The other case is if your app listens for signals in the code. If you have questions about how to do that, I have a link to my previous talk, where I went into the weeds with code examples and counting connections on HTTP, so that you can make sure you shut them down properly with FIN packets. If you get into networking and nerdy stuff like that, I’ll give you the link at the end to go find that video. But I’m not going to be able to go through all that today. If both of these are true, then you don’t need tini, and you can save yourself, not really a hassle, but you can avoid an unnecessary layer of encapsulation.
So for everyone else, we should have tini in there. And you shouldn’t just have it here; you should probably also use it in any exec probes or health checks. If you’re actually calling out to the filesystem, you should be using it there as well. This is the talk from 2019, actually. It’s still all relevant when it comes to process management and signal processing. And the Dockerfile wiring for tini is tiny.
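A sketch, with `index.js` standing in for your entrypoint (on plain `docker run`, the `--init` flag gets you the same effect with Docker’s bundled tini):

```dockerfile
# tini runs as PID 1, forwards signals, and reaps zombies;
# node stays its single child process
ENTRYPOINT ["/usr/bin/tini", "--"]
CMD ["node", "index.js"]
```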
I personally like to write all of this into my Node app, though, so my Node app will see the shutdown signals. The way you know if this works, in case you’re not familiar with all this stuff, is if you try to stop a Node container and it takes 10 seconds. In Docker this timeout value is 10 seconds, and in Kubernetes it’s 30. If it takes that whole 10 seconds, you have an init problem. What’s happening is that Node is not aware of the signals coming from Linux, the kernel saying, you need to shut down now. Node by default, and this is true of Python and a lot of other programming languages, is not trapping those signals, so it ignores them, and then Docker has to kill it. That’s what the 10-second wait is. So if you watch a lot of online demos of Node app samples, you’ll notice that when you Ctrl-C or docker stop or whatever, it just sits there for 10 seconds. That’s because it’s not hearing the signals. You can fix all that with an init, or in the app itself.
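Handling it in the app looks something like this minimal sketch (port 3000 is hypothetical):

```js
const http = require('http');

const server = http.createServer((req, res) => res.end('ok'));
server.listen(3000);

// Without these handlers, Node ignores SIGTERM/SIGINT and Docker
// force-kills the container after the 10-second timeout
function shutdown(signal) {
  console.log(`${signal} received, closing server`);
  server.close(() => process.exit(0)); // stop accepting, drain, exit
}

process.on('SIGTERM', () => shutdown('SIGTERM'));
process.on('SIGINT', () => shutdown('SIGINT'));
```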
Compose updates
Compose updates. Let’s talk about compose — my favorite developer tool. We’ve had changes in the last three or four years. So if you haven’t been to a previous virtual DockerCon, then you maybe haven’t been aware of all these changes.
So I’ll give you some brief examples of what’s changed. We don’t have the versions. I mentioned that. Yay, unless you’re on Swarm. If you’re still on Swarm, which is great. I have a growing community of Swarm fans that we’re actually going to meet in hallway track tomorrow. On Swarm, you still need to have the v3, because it’s still on an older version of the Compose specification, or technically doesn’t even use it. But for the rest of us, we could get rid of that, which gives us a bunch of features we had over the last 10 years of Compose that we didn’t get to use together. If you’ve been around a while, you knew that we had days where you had to decide on v2 versus v3, because features in v2 didn’t come into v3. So there was a fork in the road. It was a little complicated.
Now all the features we had in v2 and all the features we had in v3 are together again as a happy family. And one of my favorites, which a lot of teams I work with did not know about: people have heard about `depends_on`, but then they realize it doesn’t really do what they thought. I wanted it to wait for my database to have its schema loaded before the Node app starts. Well, you can do that; you just have to do it this specific way. You put in a `depends_on`, you define the service it depends on, like the database, and then you set the condition `service_healthy`.
And I’ll just show you the YAML of that real quick, so you know what I’m looking at. Right here is what I would do for my Node app: I say it `depends_on` the DB, and the condition must be healthy. So you can do this. I’ve seen Compose files that have 30 different microservices in them, and they use the new profiles feature to put those into chunks so they can load them at separate times. And they depend on Redis and Postgres and a backend worker, and all of these things have to be running first. So you add that to all of your services that depend on something else, and then on the services being depended on, you add a health check.
And on a database, this is actually a pretty easy one. If I just go into the health check for Postgres, I can go in here, and it does a SQL query and looks for a specific record, so it knows that I’ve seeded the database. It’s a simple Docker health check, the same kind of health check you would put in Kubernetes. And as long as I have that on my database or my Redis or whatever my backend thing is, when I do a `docker compose up`, it will sit there and wait until that health check goes green before it starts the services. And you can chain these: the backend API waits on the database, and then the frontend waits on the API. You can chain them all the way up. And it’s, you know, 10 lines of YAML to put it all in there. So we get that now in the latest versions.
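The whole pattern in one sketch of a compose.yaml (service names are hypothetical, and `pg_isready` stands in for the seeded-record query):

```yaml
services:
  api:
    build: .
    depends_on:
      db:
        condition: service_healthy

  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: example   # dev-only value
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 3s
      retries: 10
```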
The next ones here: include, extends, and CLI overrides. I don’t know if you’ve all scaled up your Compose work, but `include` is the brand-new feature that allows you, at the root of a compose file, to say: here are other files I want you to bring in. Extends is a very similar feature that we’ve had for a long time; it’s a little more flexible, and I like it better. And then CLI overrides are honestly what I use the most, because they allow me to give a whole team a compose file, and then each of them can make another file called `compose.override.yaml`. That file will change any of the settings, including environment variables, for their development setup, so if they want different ports or different environment variables or different passwords. And then we ignore that file in the .gitignore, so that everyone has their own custom setup without needing a different compose file.
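A hypothetical `compose.override.yaml` is just the deltas; `docker compose up` merges it over the base file automatically:

```yaml
# compose.override.yaml -- git-ignored, one per developer
services:
  api:
    ports:
      - "8081:3000"        # this dev wants a different host port
    environment:
      LOG_LEVEL: debug
```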
You can also do overrides for things like CI testing. You can have an override that puts in all the CI testing values, and a base compose file that’s simple, and then override the customizations; that’s called overrides. You can look all of those up in the docs. In fact, I was going to click on one and bring it up, but you get the point. There was a blog article recently; Nicholas put up a great post on improving Docker Compose. It walks through all the different ways to create many different YAML files that build into a single Compose setup, and it’s a pretty great walkthrough of all the ways, the pros and cons of each, and why you might want to use one over the other.
Next up: develop. Did they show this in the keynote? I can’t remember. So, `develop` is a brand-new section for watch, and we’ll talk about watch in a second. Watch is my favorite new feature this year. You also get new features most people don’t know about, again not related to Node, like `docker compose ls`. If you have multiple projects all running, you can actually see them all in one command. It’s pretty handy. You can see if you have stuff running in some other directory that you forgot about.
And then `docker compose alpha publish` just launched in the last month, and it’s something I’ve been asking for for about five years. With that command, your compose files automatically get put in, essentially, an image and pushed to a registry, so you can share compose files without code, as a deployable object or an artifact. We had this for Kubernetes, for Helm, you know, Kubernetes manifests, Kustomize and stuff, but we didn’t really have it for Compose until just this last month. So, that’s there.
So compose watch looks like this: at the very top, I’m typing `docker compose watch`. It requires some extra YAML, which I will show you in a minute, but once you’ve added that extra YAML, in most cases you no longer need bind mounts for development. Who here struggles with npm install performance or build performance on their local machine with bind mounts; has anybody suffered with that over the years? You’ve tried Mutagen, you’ve tried docker-sync. If you go hardcore, you might try rsync. You might do all sorts of crazy stuff.
Well, now with compose watch, in a lot of cases, the people I’m talking to and working with and showing the examples to are saying they can avoid bind-mounting their source code. It simply watches for file changes on the host and then copies them in the background into the container, or it rebuilds the image, based on your configuration. So it avoids the cross-OS-boundary bind mount that those of us on Mac especially, and a little bit on the Windows side, have to deal with.
The last thing here is this section. If you were steely-eyed earlier, you might have seen this and thought, what the heck is that? So this is my Node app, and I’m using this to tell it: if I change the package.json or package-lock.json file, automatically rebuild the image every time I’m running `docker compose watch` (which is like an up). And then: watch for anything else in my directory, and if it changes, sync that file into the container while it’s still running.
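That section of YAML looks roughly like this (assuming the app lives at /app in the container):

```yaml
services:
  app:
    build: .
    develop:
      watch:
        # Dependency manifest changed: rebuild the image
        - action: rebuild
          path: package-lock.json
        # Anything else changed: sync it into the running container
        - action: sync
          path: .
          target: /app
          ignore:
            - node_modules/
```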
Now, you would probably still run this with something like nodemon, because nodemon would be in the container, see the changes in the container, and restart the app in the container, which is a little faster than completely restarting the container. So you can add that in there for your Node apps; it’s not Node-specific, but it is very handy for Node developers. Then, whenever you run this, you can actually see what it’s doing. It replaces `docker compose up`, because it pulls the images, builds the images, and then spins them all up as their services, and at the very bottom, in small text, it says “watching” and gives the path on my host where it’s watching for changes. So it’s like nodemon or one of the other file-watch utilities, but it’s happening across the container boundary without a bind mount. Pretty cool.
Production checklist
All right. My last thing here, and then I’ll have a couple of minutes for questions, is a quick checklist for going to production. These are the things I think about mentally when I work with teams, regardless of their status with Node: are they doing these things before we go into production? It’s not all Node-specific. Obviously, a `.dockerignore` file. It’s amazing how many teams I work with, maybe in their first year or two of containers, that didn’t realize they needed a .dockerignore. I make a copy of the .gitignore file, add node_modules to it, and that usually solves the problem.
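A starter .dockerignore along those lines:

```
# ...everything from your .gitignore, plus:
node_modules
npm-debug.log
.git
```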
Next, they’re always running as node or another non-root user. They’re using tini or another init process. They’re calling node directly, without using pm2 or nodemon in production, or npm or yarn or any other tool; we want to call node directly, or at least have tini call node directly. We want to have a health check. You’re going to need those probes in Kubernetes anyway, but in Docker you can just type the HEALTHCHECK in the Dockerfile. I find that if you have that, then you can use the `depends_on` to wait for the database, right? Imagine you had an API backend in Node: if you add the health check in the Dockerfile, then the other developers can easily set up that `depends_on` dependency without having to add a manual health check.
So if you put it in the Dockerfile, you can avoid it in the compose file, is essentially what I’m saying. We also want to use `--omit=dev` in our `npm ci` commands; that’s always how we want to build for production.
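Such a Dockerfile health check can be one line. A sketch that hits a hypothetical /health endpoint on port 3000, using node itself so the image doesn’t need curl:

```dockerfile
HEALTHCHECK --interval=30s --timeout=3s --retries=3 \
  CMD node -e "require('http').get('http://localhost:3000/health', (res) => process.exit(res.statusCode === 200 ? 0 : 1)).on('error', () => process.exit(1))"
```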
In your source code, these are the things that I would have you do, assuming your team controls the Node source code. I would ask you to capture the SIGTERM and SIGINT signals and handle a proper shutdown. If you’re a website or a web system and you’re looking for zero-downtime deployments, then at some layer of your system you’re probably going to need to track HTTP connections, send FIN packets to the frontend browser or whatever your client is, and have clients automatically route to another healthy container, because this container is shutting down. And you can look up projects like stoppable, a Node.js npm package out there that will properly count connections and send FIN packets, which are the way you do a healthy shutdown of a Node container without basically cutting people off and giving them a hard connection reset.
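A sketch with stoppable (the grace period and port are hypothetical):

```js
const http = require('http');
const stoppable = require('stoppable');

// stoppable wraps the server so stop() finishes in-flight requests
// (proper FINs) instead of hard-resetting open connections
const server = stoppable(
  http.createServer((req, res) => res.end('ok')).listen(3000),
  10000 // ms to wait before destroying any remaining sockets
);

process.on('SIGTERM', () => server.stop(() => process.exit(0)));
```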
If you’re doing file I/O, a thing that I’ve learned in the last three or four years working with teams that still do a lot of file I/O, like uploading images and storing them in some sort of file system, is that permissions will end up biting you in the butt in production at some point, especially if you’re using something like NFS on the network. So I have them put code in to look for the proper permissions during Node startup, and if the app doesn’t see the proper permissions in the places it expects, it crashes. Because a lot of times what happens is, we end up in production, and then days later someone does a unique thing, like uploading a PDF report in whatever their app is, and there’s a permissions problem because someone changed the AWS EC2 setup, and suddenly we have an outage, or at least the user gets a really bad experience.
So we’ve learned to just check for file permissions during Node startup if we’re going to write to disk. If you’re listening on HTTP, provide a common, standard health endpoint so that Docker, Compose, Swarm, Kubernetes, and all the things can monitor your apps. If your apps don’t use HTTP and don’t have a listening port, then typically the way we do it is: every 30 seconds, we write a health status to a file on disk, and then we have the probe or the health check look at the timestamp of that file, or maybe look inside the file for whatever data we gave it. So that’s how we deal with non-listening services.
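Both ideas fit in a few lines of Node. A sketch, with the paths as stand-ins for your real ones:

```js
const fs = require('fs');

// Startup: crash fast if the write path isn't usable, instead of
// failing days later on the first upload
const uploadDir = process.env.UPLOAD_DIR || '/data/uploads'; // hypothetical
try {
  fs.accessSync(uploadDir, fs.constants.W_OK);
} catch (err) {
  console.error(`No write access to ${uploadDir}, exiting`, err);
  process.exit(1);
}

// Non-listening worker: touch a heartbeat file every 30 seconds so an
// exec probe can check the file's timestamp
setInterval(() => {
  fs.writeFileSync('/tmp/healthy', new Date().toISOString());
}, 30000);
```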
And then lastly, on Kubernetes pods, because it’s not just about Docker: I have a recommended pod spec that I use with all my consulting clients and all my students, so go grab that. I’m not going to go through it because I’ve got 30 seconds. That example gives you all the security features and all the stuff you should have in your pod spec that you may not have today. It covers probes and listeners, setting terminationGracePeriodSeconds, disabling privileged mode and privilege escalation, making sure you’re running as a non-privileged user and enforcing that so your security team is happy with you, and finally enabling seccomp profiles, which Docker uses by default, but Kubernetes disables by default unless you set it on every pod or at a cluster level.
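As a rough sketch of those settings (the full recommended spec in the repo is more complete, and the names here are hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-node-app
spec:
  terminationGracePeriodSeconds: 30
  securityContext:
    seccompProfile:
      type: RuntimeDefault     # on by default in Docker, off in Kubernetes unless set
  containers:
    - name: app
      image: my-node-app:latest
      securityContext:
        runAsNonRoot: true
        allowPrivilegeEscalation: false
        privileged: false
      livenessProbe:
        httpGet:
          path: /health
          port: 3000
```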
So that’s it. I’m going to hang around for questions since I ran out of time. Thank you.
Learn more
- How to Use the Node Docker Official Image
- Docker Init: Initialize Dockerfiles and Compose files with a single CLI command
- What are containers?
- Try Docker Desktop
- Docker 101 Tutorial