If you’ve seen me demonstrate Replicated, you’ve probably noticed that I nearly always use SlackerNews as my demo application. Replicated built SlackerNews as a showcase application for our Platform, so my team uses it every time we do a demo or experiment.
When we launched SecureBuild, I knew I needed to adopt it for SlackerNews immediately. We had a Postgres image available, so I dropped that into my values file right away.
images:
  postgres:
    registry: images.shortrib.io
    repository: proxy/slackernews-mackerel/cve0.io/postgres
    tag: 16
    pullPolicy: IfNotPresent
I made the equivalent change to my HelmChart object and added SecureBuild as a private registry that the proxy registry could access. Being the first person from Replicated to sign up publicly for SecureBuild, I snagged the replicated username.
$ replicated registry add other \
    --username replicated \
    --password 4kVS48FrbcBATEsfIPEX4MPL3iiitLndSk0B \
    --registry https://cve0.io
Then we needed an NGINX image. But there wasn’t one in the catalog.
Basing the SlackerNews image on a SecureBuild image
The team was busy moving the product forward and building images that our customers needed, so I wasn’t sure I’d get an NGINX image right away (I’m not as important as you folks are). But I couldn’t sit still.
I needed to figure out what else I could do. Then I heard @grant share something on a call—we’re providing access to one “language” image for every Replicated customer. I wanted to check out how that works, so I decided to use the Node.js image as the base for the application image in SlackerNews.
At first, I thought it would mean adopting a new toolchain. After all, we use a different approach to building images with SecureBuild than the Dockerfile I was using. But that was just me trying to get a gold star. The tools don’t matter as long as the foundation is secure.
So I updated my Dockerfile, swapping out the base image like this:
FROM cve0.io/node:20.19
Then I ran a build.
# the output you'd expect...
Dockerfile.web:5
--------------------
3 |
4 | # Install dependencies that might be necessary
5 | >>> RUN apk add --no-cache libc6-compat
6 | WORKDIR /app
7 |
Well, that’s not so great.
It makes sense, though. We don’t have the apk command because we’re working with a distroless SecureBuild image. Maybe I don’t need it, since it’s just installing a C library compatibility package. I tried removing it, but the build failed a little later because I had some shell logic and the SecureBuild image has no shell.
But I noticed that I specify three different images in the build: base, deps, and runner. I care most about using the SecureBuild image for the runner image. Let’s try that.
FROM node:20-alpine AS base
# set the working directory and install the library
FROM base AS deps
# build application
FROM cve0.io/node:20.19 AS runner
# run the application
That got me as far as building the runner, but that stage failed on its first RUN command.
Dockerfile.web:39
--------------------
37 |
38 | # Create a group and user for running the application
39 | >>> RUN addgroup --system --gid 1001 nodejs
40 | RUN adduser --system --uid 1001 nextjs
41 |
--------------------
I hate to admit how long this took me to figure out. The parts of my brain that may once have known that RUN behaves differently with a string (shell form) vs. an array (exec form) just did not light up. Eventually Claude realized I had to use an array with my RUN command to avoid calling a shell (remember, the image doesn’t have one!).
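Here is a minimal sketch of the difference between the two forms. It assumes the node binary lives at /usr/bin/node in the image, and the --version call is purely illustrative, not something from the SlackerNews Dockerfile.

```dockerfile
FROM cve0.io/node:20.19 AS runner

# Shell form (a plain string): Docker wraps the command as
#   ["/bin/sh", "-c", "node --version"]
# which cannot work in an image that has no /bin/sh.
# RUN node --version

# Exec form (a JSON array): Docker runs the binary directly, with no shell.
# Illustrative only; assumes /usr/bin/node exists in the image.
RUN ["/usr/bin/node", "--version"]
```

The exec form only removes the need for a shell. The binary being called still has to exist in the image.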
I got past that, then realized that addgroup and adduser aren’t on the image either. That’s fine, since COPY can set ownership and both COPY and USER can use IDs instead of names.
# Copy the necessary files from the builder stage
COPY --from=builder --chown=1001:1001 /app/public ./public
COPY --from=builder --chown=1001:1001 /app/.next/standalone ./
COPY --from=builder --chown=1001:1001 /app/.next/static ./.next/static
# Switch to the nextjs user
USER 1001
With those changes I had an image that built, but I couldn’t get it to run.
Why won’t it run?
There’s not much more frustrating than wrestling through an image build only to find out your image won’t run. Here’s what it did:
node:internal/modules/cjs/loader:1215
throw err;
^
Error: Cannot find module '/app/node'
at Module._resolveFilename (node:internal/modules/cjs/loader:1212:15)
at Module._load (node:internal/modules/cjs/loader:1043:27)
at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:164:12)
at node:internal/main/run_main_module:28:49 {
code: 'MODULE_NOT_FOUND',
requireStack: []
}
This was another one that left me scratching my head and begging Claude for help. It even led me to turn to ChatGPT after Claude didn’t get me past it. Eventually, with all my searching and lots of trial and error, I realized exactly what the message meant (I’m not a Node developer, after all). Here was the culprit:
CMD ["node", "server.js"]
Remember that CMD is the command that gets run when the container starts. Another long-forgotten detail surfaced in my flailing and pleading with AIs: CMD behaves differently when the image defines an ENTRYPOINT. But I wasn’t defining an ENTRYPOINT. Was my base image?
$ crane config cve0.io/node:20.19
{
"architecture": "amd64",
"author": "github.com/chainguard-dev/apko",
"created": "1970-01-01T00:00:00Z",
"history": [
{
"author": "apko",
"created": "1970-01-01T00:00:00Z",
"created_by": "apko",
"comment": "This is an apko single-layer image"
}
],
"os": "linux",
"rootfs": {
"type": "layers",
"diff_ids": [
"sha256:6442be140a6047e57cda66b2bff50ee19c3207c5ef4c13a1f01e0ae4f6bbed7c"
]
},
"config": {
"Entrypoint": [
"/usr/bin/node"
],
"Env": [
"PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin",
"SSL_CERT_FILE=/etc/ssl/certs/ca-certificates.crt"
],
"Labels": {
"org.opencontainers.image.created": "1970-01-01T00:00:00Z"
}
}
}
There it is! The ENTRYPOINT is /usr/bin/node. Because CMD is appended to the ENTRYPOINT as arguments, my original CMD was telling node to run a script named node. I had to change my CMD to ["server.js"] so that only the script name gets passed to the node command.
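As a rough sketch of how the two instructions combine, assuming the working directory is /app (as the error message suggests) and leaving out the COPY steps:

```dockerfile
FROM cve0.io/node:20.19 AS runner
WORKDIR /app

# The base image already declares ENTRYPOINT ["/usr/bin/node"].
# Docker appends CMD to ENTRYPOINT to form the final command line:
#   CMD ["node", "server.js"]  ->  /usr/bin/node node server.js
#                                  (node tries to load /app/node and fails)
#   CMD ["server.js"]          ->  /usr/bin/node server.js
CMD ["server.js"]
```

Another option would be to override ENTRYPOINT in the Dockerfile, but leaving the base image's entrypoint alone and passing only the script name is the smaller change.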
Returning to NGINX
I got through all of this and the team still hadn’t been able to get to my request for an NGINX image. After a bit of persistence on Slack, I was able to deputize myself onto the SecureBuild team to build the image for myself. It was quite an adventure.
A SecureBuild image is a multi-architecture image with a distroless base image, with all packages built from source code. The image can be “dropped in” in place of an “official” image. In this case, I wanted to swap out the Docker Hub “Official” NGINX image with my SecureBuild image. That meant I needed to understand what was in that image and build each and every component from source.
I initially tried to build a maximal NGINX installation (though I did leave out the perl module) since I thought that might be the best bet. That led me down the road of building lots of libraries, which meant I needed to build tools like cmake and the GNU Autotools. And that last one meant I needed to build Perl. I felt like I had woken up in 1995, when I was building all my packages from source and a big part of my job was writing CGI scripts in Perl[1].
Eventually I realized that I could scope down my NGINX build to the same modules and configuration used in the official images, and that all of the libraries I needed for that were already built[2].
I’m glad that, most of the time, I have the SecureBuild team to take care of all that for me.
Finalizing the SlackerNews image
After a quick pairing session with a SecureBuild engineer, we had an NGINX image ready to go. I updated my values file and KOTS HelmChart to use it, and SlackerNews was all set.
Only one thing left to do: I’d been ignoring those SlackerNews Dependabot alerts for a long time. SlackerNews itself had a ton of CVEs! There was no point in basing my image on a zero-CVE Node.js image if I had dozens of CVEs in my code. Fortunately, I wrestled most of them to the ground by removing an old dependency that brought in lots of vulnerabilities and wasn’t even used in the code anymore. That was a quick fix.
A few packages in the application still didn’t have a fix for their CVEs, but I got rid of dozens of CVEs overall. That’s on top of the ones SecureBuild eliminated: 744 in Node, 141 in NGINX, and 221 in Postgres.
Not bad for what amounted to a few hours of work.
[1] And installing the patches that led to “a patchy server.”
[2] I still finished up all those dependent packages because the team needed them.