Exec in VM
Almost everyone these days relies on continuous integration. And it seems, once you got accustomed to it, you never want to work without it again. Unfortunately, most CI systems lack cross-architecture capabilities. As a systems engineer with lots of C projects, I was desperately looking for a solution to run my tests on little-endian, big-endian, 32bit, and 64bit machines. So far, without any luck. Hence, I patched together qemu, docker, fedora, and some bash scripts to get a tool that allows me to execute scripts from the command-line in a VM ad-hoc.
My ultimate goal is to type
vmrun make as replacement for
and it spawns a virtual machine, mounts the current directory into the machine,
make inside of it, returning the exit-code to my shell. Of
course, it could be extended to support selecting the target architecture
and/or OS image to us. So eventually, it might look something like:
vmrun \ --image fedora-ci \ --architecture armv7hl \ -- \ meson setup build && ninja -C build
As a developer, I would love having this at hand. I can easily compile and run projects in foreign architectures, without the requirement of setting up non-volatile VMs, moving data in and out of the machine, and also getting automation and scripting support.
Containers already allow this kind of setup. Using docker or systemd-nspawn you can get something similar already:
docker run \ --interactive \ --rm \ --tty \ --volume $PWD:/mnt/cwd \ --workdir /mnt/cwd \ fedora-ci \ meson setup build && ninja -C build systemd-nspawn \ --bind $PWD:/mnt/cwd \ --chdir /mnt/cwd \ --ephemeral \ --image fedora-ci \ meson setup build && ninja -C build
This, however, has one major drawback: This can only run native binaries. If you want to run code in a foreign architecture, you need a kernel for that architecture as well. There are options like qemu-user, though they cannot provide perfect compatibility. They only get you so far.
Hence, you need some machine emulator. So how about we execute the image inside of qemu, rather than in a container? Sounds easier than it is:
Needs to Boot: Unlike in a container, the virtual machine needs to boot a kernel, user-space, and prepare the execution environment. This means, we cannot simply specify a script or binary to execute by qemu. We must actually boot the image and instruct the image to execute a given binary.
One way to get this to work on Fedora is to craft a special
.servicefile and pull it in after boot is done. Make the service file execute your binary and then poweroff the machine when done, or on failure.
No Exit-Code Propagation: The qemu emulator does not propagate the exit-code of the code executed in the virtual machine. Hence, we need a side-channel to detect whether the script executed successfully. This is easily done by hooking up a separate serial-line and making your OS write
successinto it, once everything succeeded.
Maybe someone wants to hook up a qemu extension to propagate Exit-Codes?
No Bind Mounts: The biggest issue is, we cannot simply bind-mount the directory of the caller into the virtual machine. This is particularly bad, because there is no simple alternative solution. The closest possible solution I am aware of is to share the directory via NFS or 9pfs.
Maybe someone can figure out a way to do this. All my attempts failed. While I successfully shared the directory, either performance suffered, or random features failed, which were expected by some development tools (e.g., file-locks or mmap failed). I am not saying the tools are broken, but just that I couldn’t make it work. Help welcome!
(Also be aware that you suddenly run into UID and permission issues. The entire qemu machine runs as an unprivileged user, so it will only be able to access/write files as that user. But inside of the VM, you are free to use
sudoand friends. There is no way to propagate this to the outside. This might be fine, but it is a source of confusion.)
No Image Hubs: While docker gave us image stores for free (e.g., Docker Hub, Quay.io, etc.), there is nothing like it for virtual machine images. Companies seem unwilling to provide the world with free terrabytes of storage.
Solution: Use docker.
While docker stores images in a format unsuitable to qemu, we can still use its storage. I simply took my XFS-qcow2 image-file and threw it into a docker container. While at it, I threw in a qemu binary with all its dependencies as well. This combined image can now be pushed to docker repositories and be hosted on Docker Hub and friends. As a consumer, you simply fetch the docker image and execute the qemu-binary inside of it, including its embedded OS image.
I went forth and threw together all the bits and pieces. But, sadly, I cannot
provide you the
vmrun tool as I described it above. I simply ran into too
many issues around sharing a directory. However, I did end up with something
docker run \ --rm \ -it \ -v $PWD/myscript.sh:/mnt/cherryimages/input/main:ro \ cherrypick/cherryimages-fedora-vmrun:ci-x86_64-to-armv7hl-20180110-1
This command executes
$PWD/myscript.sh inside of a fedora armv7hl image,
hosted by an x86_64 qemu. For reproducability, I tagged the image at the time
of this blog-post as
cherrypick/cherryimages-fedora-vmrun:ci-x86_64-to-armv7hl-20180110-1. If you
want the latest image, use
cherrypick/cherryimages-fedora-vmrun:ci-x86_64-to-armv7hl-latest. Other tags
exist as well. Just check out the repository, if interested. The Dockerfile
sources as of the time of this post can be found on
vmrun tool I described above, this shares the input script
read-only. Furthermore, it shares the input as FAT16 volume (qemu can create
this on-the-fly via the vvfat driver), so its size is quite limited, and
filesystem attributes are mostly discarded. In the end, its only use is to push
a script into the machine to execute (alternatively, you can push an entire
directory into the machine, but the entrypoint must be named
For my personal use, I now added a script that fetches a git-repository, runs
the embedded tests, and returns.
In combination with this docker-qemu-image, I can easily run my CI on
foreign architectures. Maybe some day I will pick this up again and get a
vmrun tool (or maybe someone else does?). Until then, I will stick
to the reduced version, as it serves my needs. Sadly, there are still too many
variables that cannot be auto-detected (How many memory to give to the VM?
Which devices to forward? Which CPU features to enable?), and too many hacks
required (String-conversions required between different command-lines… Getting
a distribution to boot fast in these containers… Sharing data correctly into
and out of the VM…).
In the end, I think it is just too much a hassle to turn into a project I can
maintain and support. The tools I needed do not provide proper APIs, but
require me to lump together command-lines, PID-files, and magic configurations.
Maybe some day we will get there? Until then, lets make use of
avoid system integration tests…