---
title: "The Redemption of Slirp and Snapshotter"
date: 2022-11-29T15:15:15+11:00
draft: false
showSummary: true
summary: "Part two in our journey of getting rootless containers working on Alpine. Join us as we fix slirp and our \
  snapshotter."
series:
- "Rootless Containers on Alpine Linux"
series_order: 2
---

# Part Two: Fixing Things

## The Story So Far
> **(Ashe)** So where were we?
>
> **(Tammy)** We're illiterate, and so forth?
>
> **(Ashe)** Right, yes.

Last time, we did a bunch of prep-work for rootless containers on Alpine, but got stuck with `slirp4netns` not working
with BusyBox's `ip` program, and `rootlesskit` not supporting the devmapper snapshotter. So where are we now?

## Fixing Slirp4netns
On the former, you'll hopefully recall that we
[raised an issue](https://github.com/rootless-containers/slirp4netns/issues/304) with the slirp devs.

They suggested we install the full iproute2 in place of BusyBox's `ip` applet (Alpine provides it in the `iproute2`
package), and this neatly solved the issue! `rootlesskit` will now happily start with `--net=slirp4netns`. Well, that
was remarkably painless. We're kinda annoyed we didn't think of it ourselves.

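If you want to check which `ip` you actually have before and after installing the package, here's a minimal sketch. It assumes BusyBox applets are installed as symlinks to a binary named `busybox`, as on a stock Alpine install; `ip_is_busybox` is a helper name we made up for illustration, not part of any tool.

```sh
# Hypothetical helper: succeeds if the given path resolves to a binary named "busybox".
# Assumption: BusyBox applets are symlinks to the busybox binary, as on stock Alpine.
ip_is_busybox() {
    target=$(readlink -f "$1") || return 1
    [ "$(basename "$target")" = "busybox" ]
}

# Report which implementation the first `ip` on PATH points at.
ip_path=$(command -v ip || true)
if [ -n "$ip_path" ] && ip_is_busybox "$ip_path"; then
    echo "ip is the BusyBox applet; try: doas apk add iproute2"
else
    echo "ip is not the BusyBox applet (or is not installed)"
fi
```

After `doas apk add iproute2`, the same check should take the second branch.
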
## Picking a supported snapshotter

With that done, let's look at some snapshotters. The
[list of supported snapshotters](https://github.com/containerd/nerdctl/blob/main/docs/rootless.md#snapshotters) gives
us a few options.

Of the presented options, overlayfs is the default, and will happily run on our Alpine instance (which is currently
running kernel 5.15).

To get that working, we just need to remove our devmapper config, and containerd will fall back to overlayfs. Let's go
ahead and disable the devmapper snapshotter while we're at it, since we know it won't work in a rootless context.

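Before leaning on overlayfs, it can be worth confirming the running kernel actually has it registered. A rough sketch, assuming the usual `/proc/filesystems` layout (an optional `nodev` column followed by the filesystem name); `has_overlayfs` is our own illustrative helper:

```sh
# Hypothetical helper: succeeds if "overlay" is listed in a /proc/filesystems-style file.
has_overlayfs() {
    # Each line is "[nodev]<TAB>name"; compare the last field against the exact name.
    awk '$NF == "overlay" { found = 1 } END { exit !found }' "${1:-/proc/filesystems}" 2>/dev/null
}

if has_overlayfs; then
    echo "overlay filesystem is registered with the kernel"
else
    echo "overlay is not listed (the module may simply not be loaded yet)"
fi
```

A missing entry doesn't necessarily mean the kernel lacks support; the module may only be loaded on first use.
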
For those following along, our `config.toml` currently looks like:
```toml
version = 2
root = "/home/tammy/.local/share/containerd"
state = "/tmp/1000-runtime-dir/containerd"

disabled_plugins = ["io.containerd.grpc.v1.cri", "io.containerd.snapshotter.v1.devmapper"]

[plugins]

[grpc]
  address = "/tmp/1000-runtime-dir/containerd/containerd.sock"
```

With that done, we finally have containerd running. Our final command is:
```sh
rootlesskit --net=slirp4netns --copy-up=/etc --copy-up=/run \
    --state-dir=/tmp/1000-runtime-dir/rootlesskit-containerd --disable-host-loopback \
    sh -c "rm -f /run/containerd; exec containerd -c config.toml"
```

### Cleaning up devmapper

> **(Doll)** Oh! Miss! What about the devmapper partition we created?
>
> **(Ashe)** Good catch, Doll. Let's fix that up, too.


#### Out with the old, in with the new

[If you recall]({{< ref "rootless-containers-alpine/#creating-our-nerdctl-thin-pool" >}}), we created an
{{<hover "Logical Volume" >}}LV{{< /hover >}} called `scratch` for our devmapper setup.
Since we're no longer using that, we can safely delete that LV. Let's do that with `doas lvremove /dev/data/scratch`.

We'll still need somewhere to put container images and non-persistent data, so let's recreate `scratch` as
a regular volume. We can do this with `doas lvcreate -n scratch -l 100%FREE data`, since everything else in the Volume
Group is already allocated and `100%FREE` simply hands `scratch` whatever is left. If you wanted a particular size,
you could use `--size` instead of `-l`.

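To make the two sizing options concrete, here's a small sketch that only prints the `lvcreate` command for each case rather than running it. `lvcreate_cmd` is a made-up helper, and `20G` is an arbitrary example size, not something from our setup.

```sh
# Hypothetical helper: print the lvcreate invocation for an LV name, a VG, and an optional size.
# With no size given, it falls back to claiming all remaining free extents in the VG.
lvcreate_cmd() {
    name=$1
    vg=$2
    size=${3:-}
    if [ -n "$size" ]; then
        printf 'lvcreate -n %s --size %s %s\n' "$name" "$size" "$vg"
    else
        printf 'lvcreate -n %s -l 100%%FREE %s\n' "$name" "$vg"
    fi
}

lvcreate_cmd scratch data        # prints: lvcreate -n scratch -l 100%FREE data
lvcreate_cmd scratch data 20G    # prints: lvcreate -n scratch --size 20G data
```

Prefix whichever line you want with `doas` to actually create the volume.
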
#### Formatting the new LV

Next up, let's format the LV as ext4 with `doas mkfs.ext4 /dev/data/scratch`:
```sh
mke2fs 1.46.5 (30-Dec-2021)
Discarding device blocks: done
Creating filesystem with 7863296 4k blocks and 1966080 inodes
Filesystem UUID: b6686e80-0eae-4316-b1ac-e8544be2cd87
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000

Allocating group tables: done
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done
```

Looks good.

## Getting rootless containerd to start automatically

As we noted last time, Alpine Linux uses OpenRC, which uses plain old shell scripts
for service automation. So getting containerd running rootless should be relatively painless.
The [Service Script Guide](https://github.com/OpenRC/openrc/blob/master/service-script-guide.md) tells us what
we need to do. First, we'll need to define `command`, `command_args`, and `pidfile`. Sure. Easy enough.
```sh
command="rootlesskit"
command_args="--net=slirp4netns --copy-up=/etc --copy-up=/run \
    --state-dir=/tmp/1000-runtime-dir/rootlesskit-containerd --disable-host-loopback \
    sh -c \"rm -f /run/containerd; exec containerd -c /etc/conf.d/rootless-containerd/config.toml\""
pidfile="/run/${RC_SVCNAME}.pid"
```
> **(Tammy)** Huh. Interesting config path.
>
> **(Ashe)** Mm. OpenRC will automatically source any file in `/etc/conf.d/` whose name matches the service, but we
> don't want that here, since the config file isn't a shell script. Instead, we'll create a folder of matching name,
> and stash the config file there.
>
> **(Octavia)** I'm still deeply annoyed that containerd uses TOML instead of YAML, or something not written by assholes.
>
> **(Ashe)** Even pedigree aside, it's way clunkier than YAML too. Sucks on all fronts.
>
> **(Selene)** This may be one of those little things we could work on.
>
> Ashe sighs.
>
> **(Ashe)** Perhaps. The backlog continues to grow.

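The folder trick Ashe describes can be sketched as below. `install_config` is a helper name we invented, and the optional second argument is just a path prefix so the sketch can be tried in a scratch directory; on the real system the same three commands need to run as root (for example via `doas`).

```sh
# Hypothetical helper: copy a containerd config into <prefix>/etc/conf.d/rootless-containerd/.
# Because the destination is a directory rather than a file named after the service,
# OpenRC has nothing to source there.
install_config() {
    src=$1
    prefix=${2:-}
    dest_dir="$prefix/etc/conf.d/rootless-containerd"
    mkdir -p "$dest_dir" \
        && cp "$src" "$dest_dir/config.toml" \
        && chmod 0644 "$dest_dir/config.toml"
}
```

For a dry run, something like `install_config config.toml /tmp/staging` puts the file under `/tmp/staging/etc/conf.d/rootless-containerd/`.
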
We also need to tell OpenRC what our service depends on:
```sh
depend () {
    use net dns
    need cgroups sysctl
}
```

We'll also need to tell the {{< hover "Start-Stop Daemon" >}}ssd{{< /hover >}} that we want the process started as a
non-root user, and that `rootlesskit` doesn't automatically background itself.

```sh
command_user="tammy:tammy"
command_background=true
```

Finally, let's add some documentation and niceness:
```sh
name="rootless containerd $SVCNAME"
describe () {
    echo "This service auto-starts rootless containerd as tammy (UID 1000) when the system starts."
}
```

Our (semi-)final script looks like:
```sh
#!/sbin/openrc-run
depend() {
    use net dns
    need cgroups sysctl
}


describe () {
    echo "This service auto-starts rootless containerd as tammy (UID 1000) when the system starts."
}

name="rootless containerd $SVCNAME"
command="rootlesskit"
command_user="tammy:tammy"
command_background=true
command_args="--net=slirp4netns --copy-up=/etc --copy-up=/run \
    --state-dir=/tmp/1000-runtime-dir/rootlesskit-containerd --disable-host-loopback \
    sh -c \"rm -f /run/containerd; exec containerd -c /etc/conf.d/rootless-containerd/config.toml\""
pidfile="/run/${RC_SVCNAME}.pid"
```

Our final step is to place the script somewhere OpenRC can find it.
```sh
doas chown root:root rootless-containerd
doas chmod 0755 rootless-containerd
doas mv rootless-containerd /etc/init.d/
```

We can now `doas rc-service rootless-containerd start` and... it starts!

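Starting the service by hand doesn't by itself make it come up at boot; OpenRC starts services that belong to a runlevel. A sketch of that last step, assuming the `default` runlevel is the one you want. `enable_at_boot` and the `RUN` variable are our own illustrative names; `rc-update add` is the stock OpenRC command.

```sh
# Hypothetical helper: add a service to an OpenRC runlevel.
# RUN is the privilege-escalation prefix; set RUN=echo to preview the command instead of running it.
enable_at_boot() {
    service=$1
    runlevel=${2:-default}
    ${RUN:-doas} rc-update add "$service" "$runlevel"
}
```

Calling `enable_at_boot rootless-containerd` would then run `doas rc-update add rootless-containerd default`.
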
## Finally Testing nerdctl

Okay. Finally here. Let's try something simple like `nerdctl run -it --rm alpine`:

```sh
FATA[0000] rootless containerd not running? (hint: use `containerd-rootless-setuptool.sh install` to start rootless containerd): stat /tmp/1000-runtime-dir/containerd-rootless: no such file or directory
```

Wat.

OH.

```sh
~ ❯ ls /tmp/1000-runtime-dir
containerd rootlesskit-containerd
```

`nerdctl` expects the rootlesskit state folder to be called `containerd-rootless`, not `rootlesskit-containerd`.
That's fine. Let's stop the rootless containerd service with `doas rc-service rootless-containerd stop`, and then we
can just modify the `command_args` of our service script:
```sh
command_args="--net=slirp4netns --copy-up=/etc --copy-up=/run \
    --state-dir=/tmp/1000-runtime-dir/containerd-rootless --disable-host-loopback \
    sh -c \"rm -f /run/containerd; exec containerd -c /etc/conf.d/rootless-containerd/config.toml\""
```

And then `doas rc-service rootless-containerd start`.

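Before retrying, a quick sanity check that the state directory now sits where `nerdctl` looks for it. This is a sketch under the assumption (suggested by the error message) that the lookup path is `$XDG_RUNTIME_DIR/containerd-rootless`, and that rootlesskit writes a `child_pid` file there; `rootless_state_ready` is a name we made up.

```sh
# Hypothetical helper: succeeds if a non-empty child_pid file exists in the expected state dir.
# Assumption: the state dir lives at <runtime dir>/containerd-rootless.
rootless_state_ready() {
    runtime_dir=${1:-${XDG_RUNTIME_DIR:-}}
    [ -n "$runtime_dir" ] && [ -s "$runtime_dir/containerd-rootless/child_pid" ]
}

if rootless_state_ready; then
    echo "state dir found; nerdctl should be able to locate the daemon"
else
    echo "no containerd-rootless/child_pid under the runtime dir"
fi
```
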
Let's try that `nerdctl run -it --rm alpine` again.

```sh
~ ❯ nerdctl run -it --rm alpine
:applet not found
```

I. What. What applet.

Okay let's try this the manual way:

```sh
~ ❯ nsenter -U --preserve-credentials -m -n -t $(cat /tmp/1000-runtime-dir/containerd-rootless/child_pid) with tammy@tammy at 13:15:41
/ ❯ export CONTAINERD_ADDRESS=/tmp/1000-runtime-dir/containerd/containerd.sock with root@tammy at 13:16:16
/ ❯ export CONTAINERD_SNAPSHOTTER=overlayfs with root@tammy at 13:16:43
/ ❯ ctr images pull docker.io/library/alpine:latest with root@tammy at 13:16:48
docker.io/library/alpine:latest: resolved |++++++++++++++++++++++++++++++++++++++|
index-sha256:8914eb54f968791faf6a8638949e480fef81e697984fba772b3976835194c6d4: done |++++++++++++++++++++++++++++++++++++++|
manifest-sha256:c0d488a800e4127c334ad20d61d7bc21b4097540327217dfab52262adc02380c: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:c158987b05517b6f2c5913f3acef1f2182a32345a304fe357e3ace5fadcad715: done |++++++++++++++++++++++++++++++++++++++|
config-sha256:49176f190c7e9cdb51ac85ab6c6d5e4512352218190cd69b08e6fd803ffbf3da: done |++++++++++++++++++++++++++++++++++++++|
elapsed: 4.9 s total: 3.2 Mi (671.5 KiB/s)
unpacking linux/amd64 sha256:8914eb54f968791faf6a8638949e480fef81e697984fba772b3976835194c6d4...
done: 177.071479ms
/ ❯ ctr run -t --rm --fifo-dir /tmp/foo-fifo --cgroup "" docker.io/library/alpine:latest foo with root@tammy at 13:17:58
ctr: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process:
unable to apply cgroup configuration: rootless needs no limits + no cgrouppath when no permission is granted for cgroups
mkdir /sys/fs/cgroup/foo: permission denied: unknown
/ ❯ with root@tammy at 13:18:02
```

Huh. Alright. We recently found out that cgroups for rootless containers might only work with systemd, and it seems
that's the case.

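The `mkdir /sys/fs/cgroup/foo: permission denied` line suggests our user simply isn't allowed to create cgroups. A rough way to poke at that is sketched below; both helper names are ours, and the check is only a heuristic (a `cgroup.controllers` file at the mount root indicates cgroup v2, and a writable directory is a precondition for creating child cgroups, not a guarantee).

```sh
# Hypothetical helpers for inspecting the cgroup setup; names are illustrative only.

# Prints "v2" if the directory looks like a cgroup v2 (unified) mount, else "v1-or-hybrid".
cgroup_mode() {
    if [ -f "${1:-/sys/fs/cgroup}/cgroup.controllers" ]; then
        echo "v2"
    else
        echo "v1-or-hybrid"
    fi
}

# Succeeds if the current user has write access to the given cgroup directory.
can_create_cgroup() {
    cg_root=${1:-/sys/fs/cgroup}
    [ -d "$cg_root" ] && [ -w "$cg_root" ]
}

echo "cgroup mode: $(cgroup_mode)"
if can_create_cgroup; then
    echo "this user can write to /sys/fs/cgroup"
else
    echo "this user cannot write to /sys/fs/cgroup"
fi
```
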
### Some Bad News

A few hours of pain later, here's what I have.

Attempting to run containerd as our user, and then manually using `nsenter`, does not work:
```sh
~ ❯ nsenter -U --preserve-credentials -m -n -t $(cat /tmp/1000-runtime-dir/containerd-rootless/child_pid)
nsenter: setns(): can't reassociate to namespace 'net': Operation not permitted
```
If we run it via `rc-service`, it works. Likely something to do with the exact CLI options we're using. Won't think too
hard about this one.

From there, nerdctl breaks with an applet error, and manually `nsenter`-ing the daemon and attempting to `ctr run` breaks
because creating the cgroup directory fails with permission denied. This occurs even with cgroups disabled.

I'm truly out of my depth here. I knew going in that this was an unsupported configuration, since most rootless
implementations rely on systemd, but I wanted to try it anyway.

In my digging, I did find a guide on [rootless Docker](https://virtualzone.de/posts/alpine-docker-rootless/), and while
we really don't want to use Docker, we did try using `containerd-rootless.sh`, but this results in the same errors.

For now, we're going to have to put this aside, since we really need to get to actually migrating our workloads. We'll
keep digging in the background and see if we can discover anything, and ask some questions on the #containerd-dev Slack
channel.

## More Tea Please

With that, I heave a sigh while Doll is having us make some tea to calm down. Hope we'll have more for you all soon.

Excuse us while we turn into a jellyfish and swim away.