MicroShift – Part 24: Build and Install the Kata Containers Runtime on Raspberry Pi 4 with Manjaro

By Alexei Karve posted Sat August 06, 2022 06:24 AM

  

Build and Install the Kata Containers Runtime with MicroShift on Raspberry Pi 4 with Manjaro

Introduction

MicroShift is a research project that is exploring how the OpenShift OKD Kubernetes distribution can be optimized for small form factor devices and edge computing. In Part 23, we installed and used Kata Containers with MicroShift on multiple editions of Fedora 36. In this Part 24, we will compile Kata v2 from source code on Manjaro. We already installed MicroShift on Manjaro in Part 18, so we will work off the same setup. We will install and test Kata Containers with pods for the alpine, busybox and nginx images. We will also run the InfluxDB sample with a deployment containing multiple Kata containers and show SenseHat metrics in Grafana.


Kata Containers is an open source project and community working to build a standard implementation of lightweight Virtual Machines (VMs) that feel and perform like containers but provide the workload isolation and security advantages of VMs (Source: Kata Containers Website). Traditional containers use crun/runc as the container runtime, which relies on kernel features such as cgroups and namespaces to provide isolation on a shared kernel, whereas Kata Containers isolates containers further by running them in their own lightweight VM. With Kata Containers, each container or pod is launched into a lightweight VM with its own unique kernel instance. Since each container/pod now runs in its own VM, malicious code can no longer exploit the shared kernel to access neighboring containers. Kubernetes CRI (Container Runtime Interface) implementations allow using any OCI-compatible runtime with Kubernetes, such as the Kata Containers runtime. Kata Containers supports both the CRI-O and CRI-containerd CRI implementations. RuntimeClass is a Kubernetes feature for selecting the container runtime configuration that is used to run a Pod's containers. The picture below shows the Kubernetes integration with containerd-shim-kata-v2.

Kata v2


In the following sections, we start by building the kata containers runtime, followed by the initrd and the rootfs images and finally the kata containers kernel.

Building Kata v2 on Manjaro

Download and install updates

pacman --noconfirm -Syu
pacman --noconfirm -S inetutils # For hostname

Install golang

wget https://golang.org/dl/go1.18.3.linux-arm64.tar.gz
rm -rf /usr/local/go && tar -C /usr/local -xzf go1.18.3.linux-arm64.tar.gz
rm -f go1.18.3.linux-arm64.tar.gz

export PATH=$PATH:/usr/local/go/bin
export GOPATH=/root/go
cat << EOF >> /root/.bash_profile
export PATH=\$PATH:/usr/local/go/bin
export GOPATH=/root/go
EOF
#export GO111MODULE=off
go env -w GO111MODULE=auto

Build and install the Kata Containers runtime

go get -d -u github.com/kata-containers/kata-containers
cd /root/go/src/github.com/kata-containers/kata-containers/src/runtime/
make
make install

Output:

[root@rpi runtime]# make
     GENERATE data/kata-collect-data.sh
     GENERATE pkg/katautils/config-settings.go
kata-runtime - version 2.5.0-rc0 (commit 0b4a91ec1a508604d687ec55da9a6732144e7917)

• architecture:
	Host:
	golang:
	Build: arm64

• golang:
	go version go1.18.3 linux/arm64

• hypervisors:
	Default: qemu
	Known: acrn cloud-hypervisor firecracker qemu
	Available for this architecture: cloud-hypervisor firecracker qemu

• Summary:

	destination install path (DESTDIR) : /
	binary installation path (BINDIR) : /usr/local/bin
	binaries to install :
	 - /usr/local/bin/kata-runtime
	 - /usr/local/bin/containerd-shim-kata-v2
	 - /usr/local/bin/kata-monitor
	 - /usr/local/bin/data/kata-collect-data.sh
	configs to install (CONFIGS) :
	 - config/configuration-clh.toml
 	 - config/configuration-fc.toml
 	 - config/configuration-qemu.toml
	install paths (CONFIG_PATHS) :
	 - /usr/share/defaults/kata-containers/configuration-clh.toml
 	 - /usr/share/defaults/kata-containers/configuration-fc.toml
 	 - /usr/share/defaults/kata-containers/configuration-qemu.toml
	alternate config paths (SYSCONFIG_PATHS) :
	 - /etc/kata-containers/configuration-clh.toml
 	 - /etc/kata-containers/configuration-fc.toml
 	 - /etc/kata-containers/configuration-qemu.toml
	default install path for qemu (CONFIG_PATH) : /usr/share/defaults/kata-containers/configuration.toml
	default alternate config path (SYSCONFIG) : /etc/kata-containers/configuration.toml
	qemu hypervisor path (QEMUPATH) : /usr/bin/qemu-system-aarch64
	cloud-hypervisor hypervisor path (CLHPATH) : /usr/bin/cloud-hypervisor
	firecracker hypervisor path (FCPATH) : /usr/bin/firecracker
	assets path (PKGDATADIR) : /usr/share/kata-containers
	shim path (PKGLIBEXECDIR) : /usr/libexec/kata-containers

     BUILD    /root/go/src/github.com/kata-containers/kata-containers/src/runtime/kata-runtime
     GENERATE config/configuration-qemu.toml
     GENERATE config/configuration-clh.toml
     GENERATE config/configuration-fc.toml
     BUILD    /root/go/src/github.com/kata-containers/kata-containers/src/runtime/containerd-shim-kata-v2
     BUILD    /root/go/src/github.com/kata-containers/kata-containers/src/runtime/kata-monitor

[root@rpi runtime]# make install
kata-runtime - version 2.5.0-rc0 (commit 0b4a91ec1a508604d687ec55da9a6732144e7917)

• architecture:
	Host:
	golang:
	Build: arm64

• golang:
	go version go1.18.3 linux/arm64

• hypervisors:
	Default: qemu
	Known: acrn cloud-hypervisor firecracker qemu
	Available for this architecture: cloud-hypervisor firecracker qemu

• Summary:

	destination install path (DESTDIR) : /
	binary installation path (BINDIR) : /usr/local/bin
	binaries to install :
	 - /usr/local/bin/kata-runtime
	 - /usr/local/bin/containerd-shim-kata-v2
	 - /usr/local/bin/kata-monitor
	 - /usr/local/bin/data/kata-collect-data.sh
	configs to install (CONFIGS) :
	 - config/configuration-clh.toml
 	 - config/configuration-fc.toml
 	 - config/configuration-qemu.toml
	install paths (CONFIG_PATHS) :
	 - /usr/share/defaults/kata-containers/configuration-clh.toml
 	 - /usr/share/defaults/kata-containers/configuration-fc.toml
 	 - /usr/share/defaults/kata-containers/configuration-qemu.toml
	alternate config paths (SYSCONFIG_PATHS) :
	 - /etc/kata-containers/configuration-clh.toml
 	 - /etc/kata-containers/configuration-fc.toml
 	 - /etc/kata-containers/configuration-qemu.toml
	default install path for qemu (CONFIG_PATH) : /usr/share/defaults/kata-containers/configuration.toml
	default alternate config path (SYSCONFIG) : /etc/kata-containers/configuration.toml
	qemu hypervisor path (QEMUPATH) : /usr/bin/qemu-system-aarch64
	cloud-hypervisor hypervisor path (CLHPATH) : /usr/bin/cloud-hypervisor
	firecracker hypervisor path (FCPATH) : /usr/bin/firecracker
	assets path (PKGDATADIR) : /usr/share/kata-containers
	shim path (PKGLIBEXECDIR) : /usr/libexec/kata-containers

     INSTALL  install-scripts
     INSTALL  install-completions
     INSTALL  install-configs
     INSTALL  install-configs
     INSTALL  install-bin
     INSTALL  install-containerd-shim-v2
     INSTALL  install-monitor

The build creates the following:

  • runtime binary: /usr/local/bin/kata-runtime and /usr/local/bin/containerd-shim-kata-v2
  • configuration file: /usr/share/defaults/kata-containers/configuration.toml

Output:

[root@rpi runtime]# ls /usr/local/bin/kata-runtime /usr/local/bin/containerd-shim-kata-v2 /usr/share/defaults/kata-containers/configuration.toml
/usr/local/bin/containerd-shim-kata-v2	/usr/local/bin/kata-runtime  /usr/share/defaults/kata-containers/configuration.toml

Check hardware requirements

sudo kata-runtime check --verbose # This returns an error at this stage because vmlinux.container does not exist yet

pacman --noconfirm -S which
`which kata-runtime` --version
`which containerd-shim-kata-v2` --version

Output:

[root@microshift runtime]# sudo kata-runtime check --verbose
ERRO[0000] /usr/share/defaults/kata-containers/configuration-qemu.toml: file /usr/share/kata-containers/vmlinux.container does not exist  arch=arm64 name=kata-runtime pid=9296 source=runtime
/usr/share/defaults/kata-containers/configuration-qemu.toml: file /usr/share/kata-containers/vmlinux.container does not exist 

[root@rpi src]# which kata-runtime
/usr/local/bin/kata-runtime
[root@rpi src]# `which kata-runtime` --version
kata-runtime  : 2.5.0-rc0
   commit   : 0b4a91ec1a508604d687ec55da9a6732144e7917
   OCI specs: 1.0.2-dev

[root@rpi src]# which containerd-shim-kata-v2
/usr/local/bin/containerd-shim-kata-v2
[root@rpi src]# `which containerd-shim-kata-v2` --version
Kata Containers containerd shim: id: "io.containerd.kata.v2", version: 2.5.0-rc0, commit: 0b4a91ec1a508604d687ec55da9a6732144e7917

Kata creates a VM in which to run one or more containers by launching a hypervisor. Kata supports multiple hypervisors. We use QEMU. The hypervisor needs two assets for this task: a Linux kernel and a small root filesystem image to boot the VM. The guest kernel is passed to the hypervisor and used to boot the VM. The default kernel provided in Kata Containers is highly optimized for kernel boot time and minimal memory footprint, providing only those services required by a container workload. The hypervisor uses an image file which provides a minimal root filesystem used by the guest kernel to boot the VM and host the Kata Container. Kata Containers supports both initrd and rootfs based minimal guest images. The default packages provide both an image and an initrd, both of which are created using the osbuilder tool.

The initrd image is a compressed cpio(1) archive, created from a rootfs which is loaded into memory and used as part of the Linux startup process. During startup, the kernel unpacks it into a special instance of a tmpfs mount that becomes the initial root filesystem.

  • The runtime will launch the configured hypervisor.
  • The hypervisor will boot the mini-OS image using the guest kernel.
  • The kernel will start the init daemon as PID 1 (the agent) inside the VM root environment.
  • The agent will create a new container environment, setting its root filesystem to that requested by the user.
  • The agent will then execute the command inside the new container.

The rootfs image, sometimes referred to as the mini O/S, is a highly optimized container bootstrap system. With this,

  • The runtime will launch the configured hypervisor.
  • The hypervisor will boot the mini-OS image using the guest kernel.
  • The kernel will start the init daemon as PID 1 (systemd) inside the VM root environment.
  • systemd, running inside the mini-OS context, will launch the agent in the root context of the VM.
  • The agent will create a new container environment, setting its root filesystem to that requested by the user.
  • The agent will then execute the command inside the new container.

We will see the output of the qemu commands later.

Configure to use initrd image

Since Kata Containers can run with either an initrd image or a rootfs image, we will build both images but initially use the initrd. We will switch to the rootfs image in a later section. Copy the default configuration file /usr/share/defaults/kata-containers/configuration.toml to /etc/kata-containers/, where it can be edited, and comment out the default image line with the following:

sudo mkdir -p /etc/kata-containers/
sudo install -o root -g root -m 0640 /usr/share/defaults/kata-containers/configuration.toml /etc/kata-containers
sudo sed -i 's/^\(image =.*\)/# \1/g' /etc/kata-containers/configuration.toml

The initrd line is not added by default, so add the initrd line in /etc/kata-containers/configuration.toml so that it looks as follows:

initrd = "/usr/share/kata-containers/kata-containers-initrd.img"
# image = "/usr/share/kata-containers/kata-containers.img"
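
The manual edit above can also be scripted. The following is a minimal sketch, assuming GNU sed and a configuration file with a single line starting with "kernel = " in the hypervisor section. The function name use_initrd is hypothetical and not part of Kata Containers.

```shell
# Hypothetical helper: switch a Kata configuration file to boot from an initrd.
# Assumes GNU sed and one "kernel = ..." line in the hypervisor section.
use_initrd() {
  local conf="$1" initrd="$2"
  # Comment out any active "image = ..." line (same substitution as above).
  sed -i 's/^\(image =.*\)/# \1/' "$conf"
  # Append the initrd line after the kernel line unless one is already set.
  if ! grep -q '^initrd = ' "$conf"; then
    sed -i "/^kernel = /a initrd = \"${initrd}\"" "$conf"
  fi
}

# Example (run as root):
# use_initrd /etc/kata-containers/configuration.toml \
#   /usr/share/kata-containers/kata-containers-initrd.img
```

Running the function a second time leaves the file unchanged, so it should be safe to call from a provisioning script.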

Next, we create the initrd image and the rootfs image. Exactly one of the initrd and image options in the Kata runtime configuration file must be set, not both. The main difference between the options is that the initrd (10MB+) is significantly smaller than the rootfs image (100MB+).

Create an initrd image

Create a local rootfs for initrd image

# yes Y | pacman -S podman buildah skopeo # If not already installed
sudo ln -s `which podman` /usr/bin/docker #  or run later commands with USE_PODMAN instead of USE_DOCKER
export ROOTFS_DIR="${GOPATH}/src/github.com/kata-containers/kata-containers/tools/osbuilder/rootfs-builder/rootfs"
sudo rm -rf ${ROOTFS_DIR}
cd $GOPATH/src/github.com/kata-containers/kata-containers/tools/osbuilder/rootfs-builder
./rootfs.sh -l

Let’s use distro=ubuntu

export distro=ubuntu
script -fec 'sudo -E GOPATH=$GOPATH AGENT_INIT=yes USE_DOCKER=true ./rootfs.sh ${distro}'

This will download and compile numerous crates and build the kata-agent (in Rust). When complete, you will see the rootfs directory.

[root@rpi rootfs-builder]# ls rootfs
bin  boot  dev	etc  home  lib	lib64  media  mnt  opt	proc  root  run  sbin  srv  sys  tmp  usr  var

If you get errors such as the following, just run the above command again after some time.

Could not connect to ports.ubuntu.com:80 (185.125.190.39), connection timed out

You may alternatively use distro=debian

export distro=debian
script -fec 'sudo -E GOPATH=$GOPATH AGENT_INIT=yes USE_DOCKER=true ./rootfs.sh ${distro}'

When I tried with Debian, it resulted in the following error:

/kata-containers/tools/osbuilder/rootfs-builder/ubuntu/rootfs_lib.sh: line 22: multistrap: command not found
Failed at 22: multistrap -a "$DEB_ARCH" -d "$rootfs_dir" -f "$multistrap_conf"

This was easily fixed by editing debian/Dockerfile.in to install the missing multistrap package

    wget \
    multistrap
# aarch64 requires this name -- link for all
RUN ln -s /usr/bin/musl-gcc "/usr/bin/$(uname -m)-linux-musl-gcc"

and running the script command again

script -fec 'sudo -E GOPATH=$GOPATH AGENT_INIT=yes USE_DOCKER=true ./rootfs.sh ${distro}'

Build an initrd image

cd $GOPATH/src/github.com/kata-containers/kata-containers/tools/osbuilder/initrd-builder
script -fec 'sudo -E AGENT_INIT=yes USE_DOCKER=true ./initrd_builder.sh ${ROOTFS_DIR}'

Output:

[root@rpi rootfs-builder]# cd $GOPATH/src/github.com/kata-containers/kata-containers/tools/osbuilder/initrd-builder
[root@rpi initrd-builder]# script -fec 'sudo -E AGENT_INIT=yes USE_DOCKER=true ./initrd_builder.sh ${ROOTFS_DIR}'
Script started, output log file is 'typescript'.
[OK] init is installed
[OK] Agent is installed
INFO: Creating /root/go/src/github.com/kata-containers/kata-containers/tools/osbuilder/initrd-builder/kata-containers-initrd.img based on rootfs at /root/go/src/github.com/kata-containers/kata-containers/tools/osbuilder/rootfs-builder/rootfs
135141 blocks
Script done.

Install the initrd image

commit=$(git log --format=%h -1 HEAD)
date=$(date +%Y-%m-%d-%T.%N%z)
image="kata-containers-initrd-${date}-${commit}"
sudo install -o root -g root -m 0640 -D kata-containers-initrd.img "/usr/share/kata-containers/${image}"
(cd /usr/share/kata-containers && sudo ln -sf "$image" kata-containers-initrd.img)

Output:

[root@rpi initrd-builder]# commit=$(git log --format=%h -1 HEAD)
[root@rpi initrd-builder]# date=$(date +%Y-%m-%d-%T.%N%z)
[root@rpi initrd-builder]# image="kata-containers-initrd-${date}-${commit}"
[root@rpi initrd-builder]# sudo install -o root -g root -m 0640 -D kata-containers-initrd.img "/usr/share/kata-containers/${image}"
[root@rpi initrd-builder]# (cd /usr/share/kata-containers && sudo ln -sf "$image" kata-containers-initrd.img)

Create a rootfs image

Create a local rootfs for rootfs image

export ROOTFS_DIR=${GOPATH}/src/github.com/kata-containers/kata-containers/tools/osbuilder/rootfs-builder/rootfs
sudo rm -rf ${ROOTFS_DIR}
cd $GOPATH/src/github.com/kata-containers/kata-containers/tools/osbuilder/rootfs-builder
script -fec 'sudo -E GOPATH=$GOPATH USE_DOCKER=true ./rootfs.sh ${distro}'

Build a rootfs image

cd $GOPATH/src/github.com/kata-containers/kata-containers/tools/osbuilder/image-builder
script -fec 'sudo -E USE_DOCKER=true ./image_builder.sh ${ROOTFS_DIR}'

Install the rootfs image

commit=$(git log --format=%h -1 HEAD)
date=$(date +%Y-%m-%d-%T.%N%z)
image="kata-containers-${date}-${commit}"
sudo install -o root -g root -m 0640 -D kata-containers.img "/usr/share/kata-containers/${image}"
(cd /usr/share/kata-containers && sudo ln -sf "$image" kata-containers.img)
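
The initrd and rootfs install steps follow the same pattern: copy the image under a dated, commit-tagged name and point a stable symlink at it. A minimal sketch of that pattern as one function follows. The name install_versioned and its argument order are hypothetical, and the ownership flags (-o root -g root) are left out so that the sketch can be tried in a scratch directory.

```shell
# Hypothetical helper: install an image under a dated, commit-tagged name and
# point a stable symlink at it. Add "-o root -g root" for a real install.
install_versioned() {
  local src="$1" destdir="$2" link="$3" tag="$4"
  local stamp target
  stamp=$(date +%Y-%m-%d-%T.%N%z)
  # kata-containers.img -> kata-containers-<date>-<tag>
  target="${link%.img}-${stamp}-${tag}"
  install -m 0640 -D "$src" "${destdir}/${target}"
  # A relative symlink keeps the directory relocatable.
  (cd "$destdir" && ln -sf "$target" "$link")
}

# Example (run as root from the image-builder directory):
# install_versioned kata-containers.img /usr/share/kata-containers \
#   kata-containers.img "$(git log --format=%h -1 HEAD)"
```

Keeping older dated files next to the symlink makes it possible to roll back by repointing the link.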

Build Kata Containers Kernel

The process to build a kernel for Kata Containers is automated.

pacman -S flex bison bc
go env -w GO111MODULE=auto
go get github.com/kata-containers/packaging
cd $GOPATH/src/github.com/kata-containers/packaging/kernel

The script ./build-kernel.sh tries to apply the patches from ${GOPATH}/src/github.com/kata-containers/packaging/kernel/patches/ when it sets up a kernel. If you want to add a source modification, add a patch to this directory. The script also copies or generates a kernel config file from ${GOPATH}/src/github.com/kata-containers/packaging/kernel/configs/ to .config in the kernel source code. You can modify it as needed. We will use the defaults.

./build-kernel.sh setup

After the kernel source code is ready, we build the kernel

cp /root/go/src/github.com/kata-containers/packaging/kernel/configs/fragments/arm64/.config kata-linux-5.4.60-92/.config
./build-kernel.sh build
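
The kernel source directory name above (kata-linux-5.4.60-92) is tied to the kernel version that the setup step downloaded. As a hedged sketch, the directory could be looked up instead of typed by hand. The function name latest_kernel_tree is hypothetical, and the kata-linux-* naming pattern is an assumption based on the directory shown above.

```shell
# Hypothetical helper: print the most recently modified kata-linux-* directory
# under the given directory (default: current directory).
latest_kernel_tree() {
  local base="${1:-.}"
  # -d lists the directories themselves, -t sorts newest first;
  # the trailing slash added by the glob is stripped at the end.
  ls -1dt "$base"/kata-linux-*/ 2>/dev/null | head -n 1 | sed 's:/*$::'
}

# Example (from the packaging/kernel directory):
# cp configs/fragments/arm64/.config "$(latest_kernel_tree)/.config"
```

The function prints nothing when no matching directory exists, so the result is worth checking before it is used in a copy command.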

Install the kernel to the default Kata containers path (/usr/share/kata-containers/)

./build-kernel.sh install

The /etc/kata-containers/configuration.toml has the following:

# Path to vhost-user-fs daemon.
virtio_fs_daemon = "/usr/libexec/virtiofsd"

So, create a symbolic link as follows:

ln -s /usr/lib/qemu/virtiofsd /usr/libexec/virtiofsd
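
Rather than hard-coding the link target, the path can be read from the configuration file. This is a minimal sketch with hypothetical function names (virtiofsd_path, link_virtiofsd). It assumes the virtio_fs_daemon line is uncommented and quoted as shown above.

```shell
# Hypothetical helper: print the uncommented virtio_fs_daemon path from a
# Kata configuration file.
virtiofsd_path() {
  sed -n 's/^virtio_fs_daemon *= *"\(.*\)"/\1/p' "$1" | head -n 1
}

# Hypothetical helper: create the configured path as a symlink to an existing
# virtiofsd binary, unless something already exists at that path.
link_virtiofsd() {
  local conf="$1" src="$2" dest
  dest=$(virtiofsd_path "$conf")
  [ -n "$dest" ] || return 1
  [ -e "$dest" ] || ln -s "$src" "$dest"
}

# Example (run as root):
# link_virtiofsd /etc/kata-containers/configuration.toml /usr/lib/qemu/virtiofsd
```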

Cgroup v2 on the host is not yet supported, so we need to switch to cgroup v1. Append the following to the end of the existing line (do not add a new line) in /boot/cmdline.txt

systemd.unified_cgroup_hierarchy=0 systemd.legacy_systemd_cgroup_controller
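
Because the arguments must stay on the single existing line, a scripted edit needs some care. The following is a minimal sketch, assuming the file holds exactly one line (anything after the first line would be dropped). The function name append_cmdline_args is hypothetical.

```shell
# Hypothetical helper: append kernel arguments to the single line of a
# cmdline file, skipping arguments that are already present.
append_cmdline_args() {
  local file="$1"; shift
  local line arg
  line=$(head -n 1 "$file")
  for arg in "$@"; do
    case " $line " in
      *" $arg "*) ;;                 # already present, nothing to do
      *) line="$line $arg" ;;
    esac
  done
  printf '%s\n' "$line" > "$file"
}

# Example (run as root, keep a backup first):
# cp /boot/cmdline.txt /boot/cmdline.txt.bak
# append_cmdline_args /boot/cmdline.txt \
#   systemd.unified_cgroup_hierarchy=0 systemd.legacy_systemd_cgroup_controller
```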

Then, reboot the Raspberry Pi 4 and log back in as root.

mount | grep cgroup

Output:

[root@rpi ~]# mount | grep cgroup
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,size=4096k,nr_inodes=1024,mode=755)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio)
cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids)
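
A shorter way to confirm which hierarchy is active is to look at the filesystem type of /sys/fs/cgroup, which GNU stat reports as cgroup2fs for cgroup v2 and tmpfs for cgroup v1. The sketch below wraps that mapping in a function. The name cgroup_mode is hypothetical.

```shell
# Map the filesystem type of /sys/fs/cgroup to a cgroup mode.
# cgroup2fs indicates the unified (v2) hierarchy; tmpfs indicates v1.
cgroup_mode() {
  case "$1" in
    cgroup2fs) echo v2 ;;
    tmpfs)     echo v1 ;;
    *)         echo unknown ;;
  esac
}

# On the host:
# cgroup_mode "$(stat -fc %T /sys/fs/cgroup)"
```

With the mount output shown above (tmpfs on /sys/fs/cgroup), the function reports v1.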

Alternatively, instead of appending the kernel arguments with unified_cgroup_hierarchy=0, you may run the following after every reboot

mkdir /sys/fs/cgroup/systemd
mount -t cgroup -o none,name=systemd cgroup /sys/fs/cgroup/systemd

Check the output of kata-runtime check:

[root@rpi src]# sudo kata-runtime check --verbose
WARN[0000] Not running network checks as super user      arch=arm64 name=kata-runtime pid=3952 source=runtime
INFO[0000] Unable to know if the system is running inside a VM  arch=arm64 source=virtcontainers/hypervisor
INFO[0000] kernel property found                         arch=arm64 description="Kernel-based Virtual Machine" name=kvm pid=3952 source=runtime type=module
INFO[0000] kernel property found                         arch=arm64 description="Host kernel accelerator for virtio" name=vhost pid=3952 source=runtime type=module
INFO[0000] kernel property found                         arch=arm64 description="Host kernel accelerator for virtio network" name=vhost_net pid=3952 source=runtime type=module
INFO[0000] kernel property found                         arch=arm64 description="Host Support for Linux VM Sockets" name=vhost_vsock pid=3952 source=runtime type=module
System is capable of running Kata Containers
INFO[0000] device available                              arch=arm64 check-type=full device=/dev/kvm name=kata-runtime pid=3952 source=runtime
INFO[0000] feature available                             arch=arm64 check-type=full feature=create-vm name=kata-runtime pid=3952 source=runtime
INFO[0000] kvm extension is supported                    arch=arm64 description="Maximum IPA shift supported by the host" id=165 name=KVM_CAP_ARM_VM_IPA_SIZE pid=3952 source=runtime type="kvm extension"
INFO[0000] IPA limit size: 44 bits.                      arch=arm64 name=KVM_CAP_ARM_VM_IPA_SIZE pid=3952 source=runtime type="kvm extension"
System can currently create Kata Containers

Check the hypervisor.qemu section in configuration.toml:

[root@rpi kata]# cat /etc/kata-containers/configuration.toml | awk -v RS= '/\[hypervisor.qemu\]/'
[hypervisor.qemu]
path = "/usr/bin/qemu-system-aarch64"
kernel = "/usr/share/kata-containers/vmlinuz.container"
initrd = "/usr/share/kata-containers/kata-containers-initrd.img"
# image = "/usr/share/kata-containers/kata-containers.img"
machine_type = "virt"
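
Since only one of initrd and image may be active, a quick check of the configuration file can catch a misconfiguration before any pod is started. This is a minimal sketch, assuming a configuration file with a single hypervisor section. The function name check_guest_image is hypothetical.

```shell
# Hypothetical helper: succeed only if exactly one of "initrd" and "image"
# is set (uncommented) in the given Kata configuration file.
check_guest_image() {
  local conf="$1" n
  # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
  n=$(grep -cE '^(initrd|image) *=' "$conf" || true)
  [ "$n" -eq 1 ]
}

# Example:
# check_guest_image /etc/kata-containers/configuration.toml && echo "config OK"
```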

Check the initrd image (kata-containers-initrd.img), the rootfs image (kata-containers.img), and the kernel in the /usr/share/kata-containers directory:

[root@rpi kata]# ls -las /usr/share/kata-containers
total 339588
     4 drwxr-xr-x   2 root root      4096 Jul 23 19:41 .
     4 drwxr-xr-x 122 root root      4096 Jul 12 16:07 ..
    68 -rw-r--r--   1 root root     68526 Jul 12 16:31 config-5.4.60
131076 -rw-r-----   1 root root 134217728 Jul 12 15:47 kata-containers-2022-07-12-15:47:42.838238693-0400-d8d6998
131072 -rw-r-----   1 root root 134217728 Jul 23 19:41 kata-containers-2022-07-23-19:41:18.888912761-0400-0b4a91ec1
     4 lrwxrwxrwx   1 root root        60 Jul 23 19:41 kata-containers.img -> kata-containers-2022-07-23-19:41:18.888912761-0400-0b4a91ec1
 37124 -rw-r-----   1 root root  38013194 Jul 12 16:46 kata-containers-initrd-2022-07-12-16:46:00.687387798-0400-d8d6998
 25992 -rw-r-----   1 root root  26612417 Jul 23 18:50 kata-containers-initrd-2022-07-23-18:50:58.452932320-0400-0b4a91ec1
     4 lrwxrwxrwx   1 root root        67 Jul 23 18:50 kata-containers-initrd.img -> kata-containers-initrd-2022-07-23-18:50:58.452932320-0400-0b4a91ec1
  9708 -rw-r--r--   1 root root  10179072 Jul 12 16:31 vmlinux-5.4.60-92
     0 lrwxrwxrwx   1 root root        17 Jul 12 16:31 vmlinux.container -> vmlinux-5.4.60-92
  4532 -rw-r--r--   1 root root   4638904 Jul 12 16:31 vmlinuz-5.4.60-92
     0 lrwxrwxrwx   1 root root        17 Jul 12 16:31 vmlinuz.container -> vmlinuz-5.4.60-92

The kernel file is called vmlinuz-version. vmlinuz is the name of the Linux kernel executable. vmlinuz is a compressed Linux kernel, and it can load the operating system into memory so that the computer becomes usable and application programs can be run. When virtual memory was developed for easier multitasking abilities, “vm” was put at the front of the file to show that the kernel supports virtual memory. For a while the Linux kernel was called vmlinux, but the kernel grew too large to fit in the available boot memory, so the kernel image was compressed, and the ending x was changed to a z to show it was compressed with zlib compression. This same compression isn’t always used, often replaced with LZMA or BZIP2, and some kernels are simply called zImage.

At the head of this kernel image (vmlinuz) is a routine that does a minimal amount of hardware setup, then decompresses the kernel contained within the kernel image and places it into high memory. If an initial RAM disk image (initrd) is present, this routine moves it into memory (that is, it extracts the compressed RAM disk image into real memory) and notes it for later use. The routine then calls the kernel, and the kernel boot begins. The initial RAM disk (initrd) is an initial root file system that is mounted before the real root file system is available. The initrd is bound to the kernel and loaded as part of the kernel boot procedure. The kernel then mounts this initrd as part of the two-stage boot process to load the modules that make the real file systems available and get at the real root file system. The initrd contains a minimal set of directories and executables to achieve this, such as the insmod tool to install kernel modules into the kernel.

Many Linux distributions ship a single, generic Linux kernel image – one that the distribution's developers create specifically to boot on a wide variety of hardware. The device drivers for this generic kernel image are included as loadable kernel modules because statically compiling many drivers into one kernel causes the kernel image to be much larger, perhaps too large to boot on computers with limited memory.

Create the file /etc/crio/crio.conf.d/50-kata

cat > /etc/crio/crio.conf.d/50-kata << EOF
[crio.runtime.runtimes.kata]
  runtime_path = "/usr/local/bin/containerd-shim-kata-v2"
  runtime_root = "/run/vc"
  runtime_type = "vm"
  privileged_without_host_devices = true
EOF
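
The runtime name in this drop-in ("kata" in [crio.runtime.runtimes.kata]) is what a Kubernetes RuntimeClass refers to through its handler field. The kata-runtimeclass.yaml used in the samples section is not reproduced in this article, so the following is only a sketch of an equivalent manifest under that assumption. The function name runtimeclass_manifest is hypothetical.

```shell
# Print a RuntimeClass manifest whose handler matches a CRI-O runtime name.
# Assumption: the RuntimeClass is given the same name as the handler.
runtimeclass_manifest() {
  local name="${1:-kata}"
  cat << EOF
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: ${name}
handler: ${name}
EOF
}

# Example (once MicroShift is running):
# runtimeclass_manifest kata | oc apply -f -
```

Pods then select this runtime by setting runtimeClassName to the same name in their spec.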

Restart crio and start microshift

systemctl restart crio

systemctl start microshift
export KUBECONFIG=/var/lib/microshift/resources/kubeadmin/kubeconfig

Running some Kata samples

After MicroShift is started, you can apply the kata runtimeclass and run the samples.

cd ~
git clone https://github.com/thinkahead/microshift.git
cd ~/microshift/raspberry-pi/kata/
oc apply -f kata-runtimeclass.yaml
# Start three kata pods
oc apply -f kata-nginx.yaml -f kata-alpine.yaml  -f kata-busybox.yaml

watch "oc get nodes;oc get pods -A;crictl stats -a"

Output:

NAME              STATUS   ROLES    AGE     VERSION
rpi.example.com   Ready    <none>   6m30s   v1.21.0

NAMESPACE                       NAME                                  READY   STATUS    RESTARTS   AGE
default                         busybox-1                             1/1     Running   0          87s
default                         kata-alpine                           1/1     Running   0          87s
default                         kata-nginx                            1/1     Running   0          4m13s
kube-system                     kube-flannel-ds-q72px                 1/1     Running   0          6m29s
kubevirt-hostpath-provisioner   kubevirt-hostpath-provisioner-zllj5   1/1     Running   0          6m29s
openshift-dns                   dns-default-rgcgl                     2/2     Running   0          6m30s
openshift-dns                   node-resolver-wbxjv                   1/1     Running   0          6m30s
openshift-ingress               router-default-85bcfdd948-cpbsf       1/1     Running   0          6m34s
openshift-service-ca            service-ca-7764c85869-tppsk           1/1     Running   0          6m35s

CONTAINER           CPU %               MEM                 DISK                INODES
22e6c795dc826       0.00                1.372MB             89B                 10
3e98d8dcf5892       0.00                0B                  6.961kB             15
62b79de9bf9c8       0.00                335.9kB             12B                 15
66296acecd982       0.00                0B                  12B                 18
90f1c9544f85d       0.00                0B                  0B                  7
9cefcde983a44       0.00                0B                  138B                17
a1dd48420b598       0.00                0B                  12.3kB              25
a41c53578f43d       0.00                0B                  12B                 17
b3fa22e8384b7       0.00                0B                  0B                  8
e31b7a624d532       0.00                2.343MB             1.225kB             23

We can see that the kernel used in the Kata containers (5.4.60, with the initrd created from the Debian image) is different from the host kernel 5.15.52-1-MANJARO-ARM-RPI (5.15.56-1-MANJARO-ARM-RPI on the latest install)

[root@rpi kata]# kata-runtime kata-env
[Kernel]
  Path = "/usr/share/kata-containers/vmlinuz-5.4.60-92"
  Parameters = "scsi_mod.scan=none" 

[root@rpi kata]# oc exec -it kata-nginx -- uname -a
Linux kata-nginx 5.4.60 #1 SMP Tue Jul 12 16:09:33 EDT 2022 aarch64 GNU/Linux
[root@rpi kata]# oc exec -it kata-alpine -- uname -a
Linux kata-alpine 5.4.60 #1 SMP Tue Jul 12 16:09:33 EDT 2022 aarch64 Linux
[root@rpi kata]# oc exec -it busybox-1 -- uname -a
Linux busybox-1 5.4.60 #1 SMP Tue Jul 12 16:09:33 EDT 2022 aarch64 GNU/Linux

[root@rpi kata]# uname -a
Linux rpi.example.com 5.15.52-1-MANJARO-ARM-RPI #1 SMP PREEMPT Tue Jul 5 12:49:52 UTC 2022 aarch64 GNU/Linux

If you add "NET_RAW" to the default_capabilities in /etc/crio/crio.conf as shown in Part 23, you can check that ping works from the kata-alpine container

[root@rpi /]# oc exec -it kata-alpine -- ping -c2 google.com
PING google.com (142.250.80.110): 56 data bytes
64 bytes from 142.250.80.110: seq=0 ttl=117 time=3.702 ms
64 bytes from 142.250.80.110: seq=1 ttl=117 time=3.977 ms

--- google.com ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 3.702/3.839/3.977 ms

When done, we can delete the sample pods

[root@rpi kata]# oc delete -f kata-alpine.yaml -f kata-busybox.yaml -f kata-nginx.yaml
pod "kata-alpine" deleted
pod "busybox-1" deleted
pod "kata-nginx" deleted 

Influxdb Sample

We execute runall-balena-dynamic.sh for Manjaro (instead of runall-fedora-dynamic.sh, because the Fedora workaround with sense_hat.py.new is not required) after updating the deployment YAML files to use runtimeClassName: kata.

cd ~
git clone https://github.com/thinkahead/microshift.git
cd ~/microshift/raspberry-pi/influxdb/

Update influxdb-deployment.yaml, telegraf-deployment.yaml and grafana/grafana-deployment.yaml to use runtimeClassName: kata. With Kata containers, we do not get direct access to the host devices, so we run the measure container as a runc pod. In runc, '--privileged' for a container means that all the /dev/* block devices from the host are mounted into the container. This allows the privileged container to mount any block device from the host.

# sed -i '/^    spec:/a \ \ \ \ \ \ runtimeClassName: kata' influxdb-deployment.yaml telegraf-deployment.yaml grafana/grafana-deployment.yaml

sed -i '/^    spec:/a \ \ \ \ \ \ runtimeClassName: kata' influxdb-deployment.yaml
sed -i '/^    spec:/a \ \ \ \ \ \ runtimeClassName: kata' telegraf-deployment.yaml
sed -i '/^    spec:/a \ \ \ \ \ \ runtimeClassName: kata' grafana/grafana-deployment.yaml
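
Running the sed commands above twice would insert the runtimeClassName line twice. A minimal sketch of an idempotent variant follows, assuming GNU sed and the same 4-space "    spec:" indentation that the commands above target. The function name add_runtime_class is hypothetical.

```shell
# Hypothetical helper: add runtimeClassName under the pod template spec of a
# deployment manifest, skipping files that already contain one.
add_runtime_class() {
  local file="$1" class="${2:-kata}"
  if grep -q '^ *runtimeClassName:' "$file"; then
    return 0
  fi
  # Same sed expression as above; each "\ " yields one leading space.
  sed -i "/^    spec:/a \ \ \ \ \ \ runtimeClassName: ${class}" "$file"
}

# Example:
# for f in influxdb-deployment.yaml telegraf-deployment.yaml grafana/grafana-deployment.yaml; do
#   add_runtime_class "$f" kata
# done
```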

Now, get the nodename

[root@rpi influxdb]# oc get nodes
NAME              STATUS   ROLES    AGE   VERSION
rpi.example.com   Ready    <none>   12h   v1.21.0

Replace the annotation kubevirt.io/provisionOnNode with the above nodename rpi.example.com and execute the runall-balena-dynamic.sh. This will create a new project influxdb.

nodename=rpi.example.com
sed -i "s| kubevirt.io/provisionOnNode:.*| kubevirt.io/provisionOnNode: $nodename|" influxdb-data-dynamic.yaml
sed -i "s| kubevirt.io/provisionOnNode:.*| kubevirt.io/provisionOnNode: $nodename|" grafana/grafana-data-dynamic.yaml

./runall-balena-dynamic.sh
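
The node name was set by hand above. On a single-node setup like this one it could also be taken from the oc get nodes output, as in the following sketch. The function name first_node_name is hypothetical; oc get nodes -o jsonpath='{.items[0].metadata.name}' is another way to get the same value.

```shell
# Print the name of the first node from `oc get nodes` output read on stdin
# (the header row is skipped).
first_node_name() {
  awk 'NR > 1 { print $1; exit }'
}

# Example:
# nodename=$(oc get nodes | first_node_name)
```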

Let’s watch the stats (CPU%, Memory, Disk and Inodes) of the kata container pods:

watch "oc get nodes;oc get pods;crictl stats"

Output:

NAME              STATUS   ROLES    AGE     VERSION
rpi.example.com   Ready    <none>   4h33m   v1.21.0
NAME                                   READY   STATUS    RESTARTS   AGE
grafana-855ffb48d8-bxrbb               1/1     Running   0          4m22s
influxdb-deployment-6d898b7b7b-vtnrt   1/1     Running   0          4m59s
measure-deployment-58cddb5745-dzpqd    1/1     Running   0          4m48s
telegraf-deployment-d746f5c6-f6h55     1/1     Running   0          4m38s
CONTAINER           CPU %               MEM                 DISK                INODES
67dea04a753ab       1.50                24.48MB             265B                13
89bf5e9fbe2ab       4.07                12.06MB             186kB               13
89e0ed1ad3557       0.05                26.06MB             4.021MB             77

We can look at the RUNTIME_CLASS using custom columns:

[root@rpi influxdb]# oc get pods -o custom-columns=NAME:metadata.name,STATUS:.status.phase,RUNTIME_CLASS:.spec.runtimeClassName,IP:.status.podIP,IMAGE:.status.containerStatuses[].image -A
NAME                                   STATUS    RUNTIME_CLASS   IP              IMAGE
busybox-1                              Running   kata            10.85.0.117     docker.io/library/busybox:latest
kata-alpine                            Running   kata            10.85.0.115     docker.io/karve/alpine-sshclient:arm64
kata-nginx                             Running   kata            10.85.0.118     docker.io/library/nginx:latest
grafana-855ffb48d8-f5cp7               Running   kata            10.85.0.122     docker.io/grafana/grafana:5.4.3
influxdb-deployment-6d898b7b7b-wf4rx   Running   kata            10.85.0.119     docker.io/library/influxdb:1.7.4
measure-deployment-58cddb5745-p57k7    Running   <none>          10.85.0.120     docker.io/karve/measure:latest
telegraf-deployment-d746f5c6-2nnlv     Running   kata            10.85.0.121     docker.io/library/telegraf:1.10.0
kube-flannel-ds-6spzn                  Running   <none>          192.168.1.228   quay.io/microshift/flannel:4.8.0-0.okd-2021-10-10-030117
kubevirt-hostpath-provisioner-7mp5f    Running   <none>          10.85.0.2       quay.io/microshift/hostpath-provisioner:4.8.0-0.okd-2021-10-10-030117
dns-default-ll4jc                      Running   <none>          10.85.0.4       quay.io/microshift/coredns:4.8.0-0.okd-2021-10-10-030117
node-resolver-84z6x                    Running   <none>          192.168.1.228   quay.io/microshift/cli:4.8.0-0.okd-2021-10-10-030117
router-default-85bcfdd948-wkg9r        Running   <none>          192.168.1.228   quay.io/microshift/haproxy-router:4.8.0-0.okd-2021-10-10-030117
service-ca-7764c85869-sgzs5            Running   <none>          10.85.0.3       quay.io/microshift/service-ca-operator:4.8.0-0.okd-2021-10-10-030117
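
The custom-columns output is plain whitespace-separated text, so it can be filtered further. The sketch below assumes the column order used above (NAME, STATUS, RUNTIME_CLASS first); the helper name kata_pods is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read "NAME STATUS RUNTIME_CLASS ..." rows on stdin
# and print the names of the pods whose RUNTIME_CLASS column is "kata".
kata_pods() {
    awk 'NR > 1 && $3 == "kata" { print $1 }'
}

# Two sample rows copied from the output above.
kata_pods <<'EOF'
NAME                                   STATUS    RUNTIME_CLASS   IP              IMAGE
kata-nginx                             Running   kata            10.85.0.118     docker.io/library/nginx:latest
measure-deployment-58cddb5745-p57k7    Running   <none>          10.85.0.120     docker.io/karve/measure:latest
EOF
```

With the two sample rows, only kata-nginx is printed. Against a live cluster, the output of the oc get pods command shown above can be piped into kata_pods in the same way.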

Check the qemu process. We used the initrd image and we can see that in the parameters:

ps -ef | grep qemu

Output:

root       94205       1 12 13:54 ?        00:00:02 /usr/bin/qemu-system-aarch64 -name sandbox-a9f5ae9072018e67e56f9a8582319664d34df5dab6ec45f84e499c4cc369e4a4 -uuid a074427c-40ba-4291-8a4c-787ab4c7e4ac -machine virt,usb=off,accel=kvm,gic-version=host -cpu host,pmu=off -qmp unix:/run/vc/vm/a9f5ae9072018e67e56f9a8582319664d34df5dab6ec45f84e499c4cc369e4a4/qmp.sock,server=on,wait=off -m 2048M,slots=10,maxmem=7810M -device pci-bridge,bus=pcie.0,id=pci-bridge-0,chassis_nr=1,shpc=off,addr=2,io-reserve=4k,mem-reserve=1m,pref64-reserve=1m -device virtio-serial-pci,disable-modern=false,id=serial0 -device virtconsole,chardev=charconsole0,id=console0 -chardev socket,id=charconsole0,path=/run/vc/vm/a9f5ae9072018e67e56f9a8582319664d34df5dab6ec45f84e499c4cc369e4a4/console.sock,server=on,wait=off -device virtio-scsi-pci,id=scsi0,disable-modern=false -object rng-random,id=rng0,filename=/dev/urandom -device virtio-rng-pci,rng=rng0 -device vhost-vsock-pci,disable-modern=false,vhostfd=3,id=vsock-3352093150,guest-cid=3352093150 -chardev socket,id=char-e2e46bed4a955485,path=/run/vc/vm/a9f5ae9072018e67e56f9a8582319664d34df5dab6ec45f84e499c4cc369e4a4/vhost-fs.sock -device vhost-user-fs-pci,chardev=char-e2e46bed4a955485,tag=kataShared -netdev tap,id=network-0,vhost=on,vhostfds=4,fds=5 -device driver=virtio-net-pci,netdev=network-0,mac=3e:07:15:1d:a5:f9,disable-modern=false,mq=on,vectors=4 -rtc base=utc,driftfix=slew,clock=host -global kvm-pit.lost_tick_policy=discard -vga none -no-user-config -nodefaults -nographic --no-reboot -daemonize -object memory-backend-file,id=dimm1,size=2048M,mem-path=/dev/shm,share=on -numa node,memdev=dimm1 -kernel /usr/share/kata-containers/vmlinux-5.4.60-92 -initrd /usr/share/kata-containers/kata-containers-initrd-2022-08-04-19:44:07.258170129-0400-587c0c5e5 -append iommu.passthrough=0 console=hvc0 console=hvc1 quiet panic=1 nr_cpus=4 scsi_mod.scan=none -pidfile /run/vc/vm/a9f5ae9072018e67e56f9a8582319664d34df5dab6ec45f84e499c4cc369e4a4/pid -smp 1,cores=1,threads=1,sockets=4,maxcpus=4
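
The QEMU command line is long, and the part of interest here is which kernel and initrd were passed to the VM. As a sketch (the helper name qemu_boot_files is hypothetical, and the initrd path in the example is a placeholder), the relevant options can be pulled out like this:

```shell
#!/bin/sh
# Hypothetical helper: given a QEMU command line on stdin, print the
# -kernel and -initrd options together with the value that follows each.
qemu_boot_files() {
    tr ' ' '\n' | awk '
        prev == "-kernel" || prev == "-initrd" { print prev, $0 }
        { prev = $0 }
    '
}

# Shortened example command line; /path/to/initrd is a placeholder.
echo "qemu-system-aarch64 -kernel /usr/share/kata-containers/vmlinux-5.4.60-92 -initrd /path/to/initrd -append quiet" \
    | qemu_boot_files
```

On the host, something like ps -ef | grep '[q]emu-system' | qemu_boot_files would apply the same filter to the running sandbox. The bracket in the grep pattern keeps grep from matching its own process.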

Add the line "<RaspberryPiIPAddress> grafana-service-influxdb.cluster.local" to /etc/hosts on your laptop and log in to http://grafana-service-influxdb.cluster.local/login using admin/admin. You will need to change the password on first login. Go to the Dashboards list (left menu > Dashboards > Manage). Open the Analysis Server dashboard to display monitoring information for MicroShift. Open the Balena Sense dashboard to show the temperature, pressure, and humidity from the SenseHat.
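
The hosts entry can be added by hand. If it is scripted, it helps to make the step idempotent so that repeated runs do not pile up duplicate lines. A minimal sketch, with a hypothetical helper name (add_host_entry), could look like this:

```shell
#!/bin/sh
# Hypothetical helper: append "IP HOSTNAME" to a hosts file unless a line
# mentioning that hostname is already present.
add_host_entry() {
    ip=$1 name=$2 file=$3
    if ! grep -qwF -- "$name" "$file" 2>/dev/null; then
        printf '%s %s\n' "$ip" "$name" >> "$file"
    fi
}

# Usage on the laptop (needs root for /etc/hosts); replace the placeholder
# with the address of your Raspberry Pi:
# add_host_entry <RaspberryPiIPAddress> grafana-service-influxdb.cluster.local /etc/hosts
```

The check only looks for the hostname, so an existing entry that points to an outdated IP address is left untouched and has to be corrected manually.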

Finally, after you are done working with this sample, you can run deleteall-balena-dynamic.sh:

./deleteall-balena-dynamic.sh

Deleting the persistent volume claims automatically deletes the persistent volumes.

Configure to use the rootfs image

We have been using the initrd image when running the samples above. Now let’s switch to the rootfs image instead of the initrd by changing the following lines in /etc/kata-containers/configuration.toml:

#initrd = "/usr/share/kata-containers/kata-containers-initrd.img"
image = "/usr/share/kata-containers/kata-containers.img"

Also disable the image nvdimm by setting the following:

disable_image_nvdimm = true # Default is false
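
Before restarting CRI-O, it can be worth listing which of these settings are actually active (uncommented) in the file. The sketch below uses a hypothetical helper name (kata_boot_settings) and assumes the settings appear as simple key = value lines, as in the snippets above.

```shell
#!/bin/sh
# Hypothetical helper: list the uncommented initrd, image and
# disable_image_nvdimm settings from a Kata configuration file.
kata_boot_settings() {
    grep -E '^[[:space:]]*(initrd|image|disable_image_nvdimm)[[:space:]]*=' "$1"
}

conf=/etc/kata-containers/configuration.toml
if [ -r "$conf" ]; then
    kata_boot_settings "$conf" || echo "no matching settings found in $conf"
fi
```

After the change described above, the expectation is that image and disable_image_nvdimm are listed and initrd is not, since the initrd line is commented out.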

Restart crio and test with the kata-alpine sample

systemctl restart crio
cd ~/microshift/raspberry-pi/kata/
oc apply -f kata-alpine.yaml

Output of qemu process when we use rootfs image with disable_image_nvdimm=true

root       90939       1 13 13:37 ?        00:00:02 /usr/bin/qemu-system-aarch64 -name sandbox-892c3bd11a5e478d7760765a5fe384a722c40d8fbf8e9f9b89b8ef41467e498b -uuid 6abac702-3b7b-4268-97b7-9a7ef14cec1b -machine virt,usb=off,accel=kvm,gic-version=host -cpu host,pmu=off -qmp unix:/run/vc/vm/892c3bd11a5e478d7760765a5fe384a722c40d8fbf8e9f9b89b8ef41467e498b/qmp.sock,server=on,wait=off -m 2048M,slots=10,maxmem=7810M -device pci-bridge,bus=pcie.0,id=pci-bridge-0,chassis_nr=1,shpc=off,addr=2,io-reserve=4k,mem-reserve=1m,pref64-reserve=1m -device virtio-serial-pci,disable-modern=false,id=serial0 -device virtconsole,chardev=charconsole0,id=console0 -chardev socket,id=charconsole0,path=/run/vc/vm/892c3bd11a5e478d7760765a5fe384a722c40d8fbf8e9f9b89b8ef41467e498b/console.sock,server=on,wait=off -device virtio-blk-pci,disable-modern=false,drive=image-4c9c03c6b770408f,scsi=off,config-wce=off,share-rw=on,serial=image-4c9c03c6b770408f -drive id=image-4c9c03c6b770408f,file=/usr/share/kata-containers/kata-containers-2022-08-05-12:53:58.236244980-0400-587c0c5e5,aio=threads,format=raw,if=none,readonly=on -device virtio-scsi-pci,id=scsi0,disable-modern=false -object rng-random,id=rng0,filename=/dev/urandom -device virtio-rng-pci,rng=rng0 -device vhost-vsock-pci,disable-modern=false,vhostfd=3,id=vsock-4175752710,guest-cid=4175752710 -chardev socket,id=char-f64cd1bfb913eecc,path=/run/vc/vm/892c3bd11a5e478d7760765a5fe384a722c40d8fbf8e9f9b89b8ef41467e498b/vhost-fs.sock -device vhost-user-fs-pci,chardev=char-f64cd1bfb913eecc,tag=kataShared -netdev tap,id=network-0,vhost=on,vhostfds=4,fds=5 -device driver=virtio-net-pci,netdev=network-0,mac=da:2c:9f:7a:2e:21,disable-modern=false,mq=on,vectors=4 -rtc base=utc,driftfix=slew,clock=host -global kvm-pit.lost_tick_policy=discard -vga none -no-user-config -nodefaults -nographic --no-reboot -daemonize -object memory-backend-file,id=dimm1,size=2048M,mem-path=/dev/shm,share=on -numa node,memdev=dimm1 -kernel /usr/share/kata-containers/vmlinux-5.4.60-92 -append iommu.passthrough=0 root=/dev/vda1 rootflags=data=ordered,errors=remount-ro ro rootfstype=ext4 console=hvc0 console=hvc1 quiet systemd.show_status=false panic=1 nr_cpus=4 systemd.unit=kata-containers.target systemd.mask=systemd-networkd.service systemd.mask=systemd-networkd.socket scsi_mod.scan=none -pidfile /run/vc/vm/892c3bd11a5e478d7760765a5fe384a722c40d8fbf8e9f9b89b8ef41467e498b/pid -smp 1,cores=1,threads=1,sockets=4,maxcpus=4

If we try to use the rootfs image with disable_image_nvdimm=false, the Kata container does not start and we get the following error:

0s          Warning   FailedCreatePodSandBox   pod/kata-alpine   Failed to create pod sandbox: rpc error: code = Unknown desc = CreateContainer failed: failed to launch qemu: exit status 1, error messages from qemu log: qemu-system-aarch64: -device nvdimm,id=nv0,memdev=mem0: memory hotplug is not enabled: missing acpi-ged device...

Output of qemu process when we use rootfs image with the disable_image_nvdimm=false

root       77890       1  0 13:13 ?        00:00:00 /usr/bin/qemu-system-aarch64 -name sandbox-e43561d381112254278ae2d1cd604a7a2b8d03c1065ab54b3bbdbc6af48cf9a9 -uuid c27a22de-4421-437e-ac2a-10decd17ecd2 -machine virt,usb=off,accel=kvm,gic-version=host,nvdimm=on -cpu host,pmu=off -qmp unix:/run/vc/vm/e43561d381112254278ae2d1cd604a7a2b8d03c1065ab54b3bbdbc6af48cf9a9/qmp.sock,server=on,wait=off -m 2048M,slots=10,maxmem=7810M -device pci-bridge,bus=pcie.0,id=pci-bridge-0,chassis_nr=1,shpc=off,addr=2,io-reserve=4k,mem-reserve=1m,pref64-reserve=1m -device virtio-serial-pci,disable-modern=false,id=serial0 -device virtconsole,chardev=charconsole0,id=console0 -chardev socket,id=charconsole0,path=/run/vc/vm/e43561d381112254278ae2d1cd604a7a2b8d03c1065ab54b3bbdbc6af48cf9a9/console.sock,server=on,wait=off -device nvdimm,id=nv0,memdev=mem0 -object memory-backend-file,id=mem0,mem-path=/usr/share/kata-containers/kata-containers-2022-08-05-12:53:58.236244980-0400-587c0c5e5,size=134217728 -device virtio-scsi-pci,id=scsi0,disable-modern=false -object rng-random,id=rng0,filename=/dev/urandom -device virtio-rng-pci,rng=rng0 -device vhost-vsock-pci,disable-modern=false,vhostfd=3,id=vsock-19022696,guest-cid=19022696 -chardev socket,id=char-68ffbbd4f7c62079,path=/run/vc/vm/e43561d381112254278ae2d1cd604a7a2b8d03c1065ab54b3bbdbc6af48cf9a9/vhost-fs.sock -device vhost-user-fs-pci,chardev=char-68ffbbd4f7c62079,tag=kataShared -netdev tap,id=network-0,vhost=on,vhostfds=4,fds=5 -device driver=virtio-net-pci,netdev=network-0,mac=ce:6a:b5:ee:c4:6e,disable-modern=false,mq=on,vectors=4 -rtc base=utc,driftfix=slew,clock=host -global kvm-pit.lost_tick_policy=discard -vga none -no-user-config -nodefaults -nographic --no-reboot -daemonize -object memory-backend-file,id=dimm1,size=2048M,mem-path=/dev/shm,share=on -numa node,memdev=dimm1 -kernel /usr/share/kata-containers/vmlinux-5.4.60-92 -append iommu.passthrough=0 root=/dev/pmem0p1 rootflags=dax,data=ordered,errors=remount-ro ro rootfstype=ext4 console=hvc0 console=hvc1 quiet systemd.show_status=false panic=1 nr_cpus=4 systemd.unit=kata-containers.target systemd.mask=systemd-networkd.service systemd.mask=systemd-networkd.socket scsi_mod.scan=none -pidfile /run/vc/vm/e43561d381112254278ae2d1cd604a7a2b8d03c1065ab54b3bbdbc6af48cf9a9/pid -smp 1,cores=1,threads=1,sockets=4,maxcpus=4

We can also run MicroShift Containerized as shown in Part 18 and execute the Jupyter Notebook samples for Digit Recognition, Object Detection and License Plate Recognition with Kata containers as shown in Part 23.

Errors

QMP command failed: The feature 'query-hotpluggable-cpus' is not enabled

This error occurs if you add resource limits to the containers. The dedicated legacy cpu-add QMP command was removed in QEMU v5.2, so vCPU hotplug needs to go through the device_add interface instead.

Conclusion

In this Part 24, we looked at building Kata Containers from source and running MicroShift with Kata Containers on Manjaro. We ran multiple samples using the Kata Containers runtime from MicroShift and viewed the metrics from the containers. In Part 25, we will install and use MicroShift with KubeVirt and Kata Containers on Raspberry Pi 4 with Pop!_OS.

Hope you have enjoyed the article. Share your thoughts in the comments or engage in the conversation with me on Twitter @aakarve. I look forward to hearing about your use of MicroShift, KubeVirt and Kata Containers on ARM devices and if you would like to see something covered in more detail.
