Introduction to heredocs in Dockerfiles

Guest post by Docker Community Member Justin Chadwell. This post originally appeared here.

As of a couple of weeks ago, Docker’s BuildKit tool for building Dockerfiles now supports heredoc syntax! With these new improvements, we can do all sorts of things that were difficult before, like multiline RUNs without needing all those pesky backslashes at the end of each line, or the creation of small inline configuration files.

In this post, I’ll cover the fundamentals of what these heredocs are, and more importantly what you can use them for, and how to get started with them! 🎉

BuildKit (a quick refresher)

From BuildKit’s own GitHub:

BuildKit is a toolkit for converting source code to build artifacts in an efficient, expressive and repeatable manner.

Essentially, it’s the next-generation builder for docker images, neatly separated from the rest of the main docker runtime; you can use it for building docker images, or images for other OCI runtimes.

It comes with a lot of useful (and pretty) features beyond what the basic builder supports, including neater build log output, faster and more cache-efficient builds, concurrent builds, as well as a very flexible architecture to allow easy extensibility (I’m definitely not doing it justice).

You’re most likely either using it already, or you probably want to be! You can enable it locally by setting the environment variable DOCKER_BUILDKIT=1 when running docker build, or switch to using the new(ish) docker buildx command.

At a slightly more technical level, buildkit allows easy switching between multiple different “builders”, which can be local or remote, in the docker daemon itself, in docker containers or even in a Kubernetes pod. The builder itself is split up into two main pieces, a frontend and a backend: the frontend produces an intermediate low-level build (LLB) definition, which is then built into an image by the backend.

You can think of LLB as being to BuildKit what LLVM IR is to Clang.

Part of what makes buildkit so fantastic is its flexibility – these components are completely detached from each other, so you can use any frontend in any image. For example, you could use the default Dockerfile frontend, or compile your own self-contained buildpacks, or even develop your own alternative file format like Mockerfile.

Getting set up

To get started with using heredocs, first make sure you’re set up with buildkit. Switching to buildkit gives you a ton of out-of-the-box improvements to your build setup, and should have complete compatibility with the old builder (and you can always switch back if you don’t like it).

With buildkit properly set up, you can create a new Dockerfile: at the top of this file, we need to include a # syntax= directive. This directive tells the parser to use a specific frontend – in this case, the one located at docker/dockerfile:1.3-labs on Docker Hub.

# syntax=docker/dockerfile:1.3-labs

With this line (which has to be the very first line), buildkit will find and download the right image, and then use it to build the image.

We then specify the base image to build from (just like we normally would):

FROM ubuntu:20.04

With all that out of the way, we can use a heredoc, executing two commands in the same RUN!

RUN <<EOF
echo "Hello" >> /hello
echo "World!" >> /hello
EOF
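For comparison, here is roughly the same idea in a plain shell, outside of Docker. This is only a sketch of the heredoc mechanism, not a description of how BuildKit runs the step internally; the `out` variable and the temporary file are made-up stand-ins for /hello so that the snippet can run without root.

```shell
# Hypothetical scratch file standing in for /hello inside the image.
out="$(mktemp)"

# Everything between <<EOF and the closing EOF is fed to a fresh `sh` on stdin.
# Because EOF is unquoted, the outer shell substitutes $out before `sh` reads it.
sh <<EOF
echo "Hello" >> "$out"
echo "World!" >> "$out"
EOF

# Show the two lines the inner script appended.
cat "$out"
```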

Why?

Now that heredocs are working, you might be wondering – why all the fuss? Well, this feature has kind of, until now, been missing from Dockerfiles.

See moby/moby#34423 for the original issue that proposed heredocs in 2017.

Let’s suppose you want to build an image that requires a lot of commands to set up. For example, a fairly common pattern in Dockerfiles involves wanting to update the system, and then to install some additional dependencies, i.e. apt update, upgrade and install all at once.

Naively, we might put all of these as separate RUNs:

RUN apt-get update
RUN apt-get upgrade -y
RUN apt-get install -y ...

But sadly, like too many intuitive solutions, this doesn’t quite do what we want. It certainly works – but we create a new layer for each RUN, making our image much larger than it needs to be (and making builds take much longer).

So, we can squish this into a single RUN command:

RUN apt-get update && \
    apt-get upgrade -y && \
    apt-get install -y ...

And that’s what most Dockerfiles do today, from the official docker images down to the messy ones I’ve written for myself. It works fine, images are small and quick to build… but it does look a bit ugly. And if you accidentally forget the line continuation symbol \, well, you’ll get a syntax error!

Heredocs are the next step to improve this! Now, we can just write:

RUN <<EOF
apt-get update
apt-get upgrade -y
apt-get install -y ...
EOF

We use <<EOF to introduce the heredoc (just like in sh/bash/zsh/your shell of choice), and EOF at the end to close it. In between those, we put all our commands as the content of our script to be run by the shell!
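One thing to keep in mind when swapping an && chain for a heredoc: in an ordinary shell script, a failing command does not stop the commands after it unless you ask for that with `set -e`. The sketch below shows this in a plain shell rather than in a Dockerfile, using `false` as a stand-in for a failing command. It seems reasonable to assume that a heredoc script run by RUN behaves the same way, since the content is run by the shell, but that is an assumption worth checking against your own build.

```shell
# Without `set -e`: the script carries on after the failure, and the overall
# exit status is that of the last command (the echo), so it looks like success.
status_plain=0
sh <<'EOF' || status_plain=$?
false
echo "still ran"
EOF

# With `set -e`: the script stops at the first failing command and
# reports a non-zero exit status.
status_strict=0
sh <<'EOF' || status_strict=$?
set -e
false
echo "never printed"
EOF

echo "plain=$status_plain strict=$status_strict"
```

If you want the heredoc version to fail in the same way the && chain does, putting `set -e` on the first line of the script is one option.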

More ways to run…

So far, we’ve seen some basic syntax. However, the new heredoc support doesn’t just allow simple examples; there are lots of other fun things you can do.

For completeness, the hello world example using the same syntax we’ve already seen:

RUN <<EOF
echo "Hello" >> /hello
echo "World!" >> /hello
EOF

But let’s say your setup scripts are getting more complicated, and you want to use another language – say, Python. Well, no problem, you can connect heredocs to other programs!

RUN python3 <<EOF
with open("/hello", "w") as f:
    print("Hello", file=f)
    print("World", file=f)
EOF

In fact, you can use commands as complex as you like with heredocs, simplifying the above to:

RUN python3 <<EOF > /hello
print("Hello")
print("World")
EOF

If that feels like it’s getting a bit fiddly or complicated, you can also always just use a shebang:

RUN <<EOF
#!/usr/bin/env python3
with open("/hello", "w") as f:
    print("Hello", file=f)
    print("World", file=f)
EOF

There are lots of different ways to connect heredocs to RUN, and hopefully some more ways and improvements to come in the future!
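A detail worth knowing when piping scripts in other languages through a heredoc: in a shell, an unquoted delimiter means `$name` in the body is expanded before the program ever sees it, while a quoted delimiter (`<<'EOF'`) passes the body through untouched. The sketch below shows that in a plain shell, using `cat` in place of `python3` so that it runs anywhere; `name` is a made-up variable. The Dockerfile syntax guide describes quoting rules for its own heredocs, so treat the exact behaviour inside a Dockerfile as something to verify there rather than take from this snippet.

```shell
name="World"

# Unquoted delimiter: the shell substitutes $name before cat reads the body.
expanded="$(cat <<EOF
Hello $name
EOF
)"

# Quoted delimiter: the body is passed through literally, dollar sign and all.
literal="$(cat <<'EOF'
Hello $name
EOF
)"

echo "$expanded"
echo "$literal"
```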

…and some file fun!

Heredocs in Dockerfiles also let us mess around with inline files! Let’s suppose you’re building an nginx site, and want to create a custom index page:

FROM nginx
COPY index.html /usr/share/nginx/html

And then in a separate file index.html, you put your content. But if your index page is really simple, it feels frustrating to have to separate everything out: heredocs let you keep everything in the same place if you want!

FROM nginx
COPY <<EOF /usr/share/nginx/html/index.html
(your index page goes here)
EOF

You can even copy multiple files at once, in a single layer:

COPY <<robots.txt <<humans.txt /usr/share/nginx/html/
(robots content)
robots.txt
(humans content)
humans.txt

Finishing up

Hopefully, I’ve managed to persuade you to give heredocs a try when you can! For now, they’re still only available in the staging frontend, but they should be making their way into a release very soon – so make sure to take a look and give your feedback!

If you’re interested, you can find out more from the official buildkit Dockerfile syntax guide.
