Ponyhof

Dysfunctional Programming

Jekyll 2019-02-12T15:52:25+00:00 https://dvdhrm.github.io/ dvdhrm Dysfunctional Programming David Rheinsberg Goodbye Gnuefi 2019-01-31T00:00:00+00:00 2019-01-31T00:00:00+00:00 https://dvdhrm.github.io/2019/01/31/goodbye-gnuefi <p>The recommended way to link <a href="https://www.uefi.org/">UEFI</a> applications on linux was until now through <em>GNU-EFI</em>, a toolchain provided by the <em>GNU Project</em> that bridges from the <em>ELF</em> world into <em>COFF/PE32+</em>. But why don’t we compile directly to native UEFI? A short dive into the past of <em>GNU Toolchains</em>, their remnants, and a surprisingly simple way out.</p> <p>The <em>Linux World</em> (and many <em>UNIX Derivatives</em> for that matter) is modeled around <a href="https://en.wikipedia.org/wiki/Executable_and_Linkable_Format">ELF</a>. With statically linked languages becoming more prevalent, the impact of the ABI diminishes, but it still defines properties far beyond just how to call functions. The ABI your system uses also affects how the compiler and linker interact, how binaries export information (especially symbols), and what features application developers can make use of. We have become used to ELF, and require its properties in places we didn’t expect.</p> <p>UEFI does not use ELF. For all practical purposes, <a href="https://www.uefi.org/uefi">UEFI follows Microsoft Windows</a>. This means UEFI uses <a href="https://www.microsoft.com/whdc/system/platform/firmware/PECOFF.mspx"><em>COFF/PE32+</em></a> (or <em>PE+</em> for short). If we compile binaries for UEFI, they must target <em>PE+</em>. And the <em>GNU Compiler Collection</em> can do this… somewhat.</p> <p>Conceptually, <em>GCC</em> supports many languages, ABIs, targets, and architectures in a single code-base. Technically, though, every compiled instance of <em>GCC</em> compiles from one language to one target. Your compiler that takes <em>C</em> and produces <em>x86-64</em> is actually specific to <em>x86_64-pc-linux-gnu</em>. 
You cannot tweak it to compile UEFI binaries. Instead, you need another instance of <em>GCC</em>, one that takes <em>C</em> and produces <em>x86_64-w64-mingw32</em>. You probably know this combination under the name <em>MinGW</em>.</p> <p>But this is not what <em>GNU</em> went for. Instead, for reasons that still puzzle me to this day, the <em>GNU</em> project decided against using its own software and produced something named <a href="https://sourceforge.net/projects/gnu-efi/"><em>GNU-EFI</em></a>. The goal of <em>GNU-EFI</em> is to allow writing UEFI applications using the common <em>GNU Toolchain</em> (meaning you compile <em>ELF</em> binaries for <em>Linux</em>). It achieves this by linking a <em>PE+ Stub</em>, which at runtime performs the required relocations and parameter translations, and then jumps into the <em>ELF</em> application. You effectively write a free-standing <em>Linux Application</em>, add a wrapping layer and then execute it on <em>UEFI</em>. It works, but is needlessly complex.</p> <p><em>Is this really the best way to compile for UEFI?</em> <strong>Not anymore!</strong></p> <p>The <em>LLVM</em> toolchain (<em>clang</em> compiler plus <em>lld</em> linker) combines all supported targets in a single toolchain, offering a target selector <code class="highlighter-rouge">--target</code> to let <em>LLVM</em> know what to compile for. 
So as long as you have <em>clang</em> and <em>lld</em> installed, you can compile native UEFI binaries just like a normal local compilation:</p> <div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Normal local compile+link</span> <span class="nv">$ </span>clang <span class="se">\</span> <span class="nv">$CFLAGS</span> <span class="se">\</span> <span class="nt">-o</span> OBJECT <span class="se">\</span> <span class="nt">-c</span> <span class="o">[</span>SOURCES…] <span class="nv">$ </span>clang <span class="se">\</span> <span class="nv">$LDFLAGS</span> <span class="se">\</span> <span class="nt">-o</span> BINARY <span class="se">\</span> <span class="o">[</span>OBJECTS…] </code></pre></div></div> <p>To make this compile for UEFI targets, you simply set:</p> <div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>CFLAGS+<span class="o">=</span> <span class="se">\</span> <span class="nt">--target</span> x86_64-unknown-windows <span class="se">\</span> <span class="nt">-ffreestanding</span> <span class="se">\</span> <span class="nt">-fshort-wchar</span> <span class="se">\</span> <span class="nt">-mno-red-zone</span> LDFLAGS+<span class="o">=</span> <span class="se">\</span> <span class="nt">--target</span> x86_64-unknown-windows <span class="se">\</span> <span class="nt">-nostdlib</span> <span class="se">\</span> <span class="nt">-Wl</span>,-entry:efi_main <span class="se">\</span> <span class="nt">-Wl</span>,-subsystem:efi_application <span class="se">\</span> <span class="nt">-fuse-ld</span><span class="o">=</span>lld-link </code></pre></div></div> <p>The two special things are <code class="highlighter-rouge">--target &lt;TRIPLE&gt;</code> and <code class="highlighter-rouge">-fuse-ld=&lt;LINKER&gt;</code>. 
The former instructs both compiler and linker to produce <em>COFF/PE32+</em> objects compatible with the <em>Microsoft Windows Platform</em> (which matches the UEFI platform). The latter selects the linker to use. Mind you, using the default linker will very likely fail (the default being <em>ld</em> or <em>ld-gold</em>). Currently, you either have to use <em>lld-link</em> (the <em>PE+</em> backend of the <em>LLVM</em> linker), or you need a version of <em>GNU-ld</em> compiled for a <em>PE+</em> toolchain. I recommend <em>LLVM lld</em>.</p> <p>Voilà! No need for <em>GNU-EFI</em>, no need to mess with separate toolchains. With <em>LLVM</em> you get all this through your local toolchain.</p> <p>If you use <em>Meson Build</em>, the <a href="https://c-util.github.io/c-efi"><strong>c-efi</strong></a> project even provides you with an example <em>cross-file</em>. A native meson C project can then be compiled for UEFI by nothing more than passing <code class="highlighter-rouge">--cross-file x86_64-unknown-uefi</code> to <code class="highlighter-rouge">meson</code>. See its <a href="https://github.com/c-util/c-efi/blob/master/src/x86_64-unknown-uefi.mesoncross.ini">sources</a> for details.</p> <p>The <a href="https://c-util.github.io/c-efi"><strong>c-efi</strong></a> project also provides the protocol constants and definitions from the UEFI specification, so you don’t have to extract them yourself.</p> David Rheinsberg Exec In Vm 2018-01-10T00:00:00+00:00 2018-01-10T00:00:00+00:00 https://dvdhrm.github.io/2018/01/10/exec-in-vm <p>Almost everyone these days relies on continuous integration. And it seems, once you have grown accustomed to it, you never want to work without it again. 
Unfortunately, most CI systems lack cross-architecture capabilities. As a systems engineer with lots of C projects, I was desperately looking for a solution to run my tests on little-endian, big-endian, 32-bit, and 64-bit machines. So far, without any luck. Hence, I patched together qemu, docker, fedora, and some bash scripts to get a tool that allows me to execute scripts from the command-line in a VM ad-hoc.</p> <p>My ultimate goal is to type <strong><code class="highlighter-rouge">vmrun make</code></strong> as a replacement for <strong><code class="highlighter-rouge">make</code></strong>, and have it spawn a virtual machine, mount the current directory into the machine, execute <strong><code class="highlighter-rouge">make</code></strong> inside of it, and return the exit-code to my shell. Of course, it could be extended to support selecting the target architecture and/or OS image to use. So eventually, it might look something like:</p> <div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>vmrun <span class="se">\</span> <span class="nt">--image</span> fedora-ci <span class="se">\</span> <span class="nt">--architecture</span> armv7hl <span class="se">\</span> <span class="nt">--</span> <span class="se">\</span> meson setup build <span class="o">&amp;&amp;</span> ninja <span class="nt">-C</span> build </code></pre></div></div> <p>As a developer, I would love having this at hand. I could easily compile <strong>and run</strong> projects on foreign architectures, without having to set up non-volatile VMs or move data in and out of the machine, while also getting automation and scripting support.</p> <p>Containers already allow this kind of setup. 
Using <em>docker</em> or <em>systemd-nspawn</em> you can get something similar already:</p> <div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker run <span class="se">\</span> <span class="nt">--interactive</span> <span class="se">\</span> <span class="nt">--rm</span> <span class="se">\</span> <span class="nt">--tty</span> <span class="se">\</span> <span class="nt">--volume</span> <span class="nv">$PWD</span>:/mnt/cwd <span class="se">\</span> <span class="nt">--workdir</span> /mnt/cwd <span class="se">\</span> fedora-ci <span class="se">\</span> meson setup build <span class="o">&amp;&amp;</span> ninja <span class="nt">-C</span> build systemd-nspawn <span class="se">\</span> <span class="nt">--bind</span> <span class="nv">$PWD</span>:/mnt/cwd <span class="se">\</span> <span class="nt">--chdir</span> /mnt/cwd <span class="se">\</span> <span class="nt">--ephemeral</span> <span class="se">\</span> <span class="nt">--image</span> fedora-ci <span class="se">\</span> meson setup build <span class="o">&amp;&amp;</span> ninja <span class="nt">-C</span> build </code></pre></div></div> <p>This, however, has one major drawback: This can only run native binaries. If you want to run code in a foreign architecture, you need a kernel for that architecture as well. There are options like <strong>qemu-user</strong>, though they cannot provide perfect compatibility. They only get you so far.</p> <p>Hence, you need some machine emulator. So how about we execute the image inside of qemu, rather than in a container? Sounds easier than it is:</p> <ul> <li> <p><strong>Needs to Boot</strong>: Unlike in a container, the virtual machine needs to boot a kernel, user-space, and prepare the execution environment. This means, we cannot simply specify a script or binary to execute by qemu. 
We must actually boot the image and instruct the image to execute a given binary.</p> <p>One way to get this to work on Fedora is to craft a special <code class="highlighter-rouge">.service</code> file and pull it in after boot is done. Make the service file execute your binary and then poweroff the machine when done, or on failure.</p> </li> <li> <p><strong>No Exit-Code Propagation</strong>: The qemu emulator does not propagate the exit-code of the code executed in the virtual machine. Hence, we need a side-channel to detect whether the script executed successfully. This is easily done by hooking up a separate serial-line and making your OS write <em><code class="highlighter-rouge">success</code></em> into it, once everything succeeded.</p> <p>Maybe someone wants to hook up a qemu extension to propagate Exit-Codes?</p> </li> <li> <p><strong>No Bind Mounts</strong>: The biggest issue is, we cannot simply bind-mount the directory of the caller into the virtual machine. This is particularly bad, because there is no simple alternative solution. The closest possible solution I am aware of is to share the directory via <strong>NFS</strong> or <strong>9pfs</strong>.</p> <p>Maybe someone can figure out a way to do this. All my attempts failed. While I successfully shared the directory, either performance suffered, or random features failed, which were expected by some development tools (e.g., file-locks or mmap failed). I am not saying the tools are broken, but just that I couldn’t make it work. Help welcome!</p> <p>(Also be aware that you suddenly run into UID and permission issues. The entire qemu machine runs as an unprivileged user, so it will only be able to access/write files as that user. But inside of the VM, you are free to use <code class="highlighter-rouge">sudo</code> and friends. There is no way to propagate this to the outside. 
This might be fine, but it is a source of confusion.)</p> </li> <li> <p><strong>No Image Hubs</strong>: While docker gave us image stores for free (e.g., Docker Hub, Quay.io, etc.), there is nothing like it for virtual machine images. Companies seem unwilling to provide the world with free terabytes of storage.</p> <p>Solution: Use docker.</p> <p>While docker stores images in a format unsuitable for qemu, we can still use its storage. I simply took my XFS-qcow2 image-file and threw it into a docker container. While at it, I threw in a qemu binary with all its dependencies as well. This combined image can now be pushed to docker repositories and be hosted on Docker Hub and friends. As a consumer, you simply fetch the docker image and execute the qemu-binary inside of it, including its embedded OS image.</p> </li> </ul> <p>I went forth and threw together all the bits and pieces. But, sadly, I cannot provide you the <strong><code class="highlighter-rouge">vmrun</code></strong> tool as I described it above. I simply ran into too many issues around sharing a directory. However, I did end up with something close:</p> <div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker run <span class="se">\</span> <span class="nt">--rm</span> <span class="se">\</span> <span class="nt">-it</span> <span class="se">\</span> <span class="nt">-v</span> <span class="nv">$PWD</span>/myscript.sh:/mnt/cherryimages/input/main:ro <span class="se">\</span> cherrypick/cherryimages-fedora-vmrun:ci-x86_64-to-armv7hl-20180110-1 </code></pre></div></div> <p>This command executes <em><code class="highlighter-rouge">$PWD/myscript.sh</code></em> inside of a fedora armv7hl image, hosted by an <em>x86_64</em> qemu. For reproducibility, I tagged the image at the time of this blog-post as <em>cherrypick/cherryimages-fedora-vmrun:ci-x86_64-to-armv7hl-20180110-1</em>. 
If you want the latest image, use <em>cherrypick/cherryimages-fedora-vmrun:ci-x86_64-to-armv7hl-latest</em>. Other tags exist as well. Just check out the repository, if interested. The Dockerfile sources as of the time of this post can be found on <a href="https://github.com/cherry-pick/cherry-images/tree/a176d7feb95bc3cf89dd8071e304f30f763820cb">github</a>.</p> <p>Unlike the <strong><code class="highlighter-rouge">vmrun</code></strong> tool I described above, this shares the input script read-only. Furthermore, it shares the input as a FAT16 volume (qemu can create this on-the-fly via the vvfat driver), so its size is quite limited, and filesystem attributes are mostly discarded. In the end, its only use is to push a script into the machine to execute (alternatively, you can push an entire directory into the machine, but the entrypoint must be named <strong><code class="highlighter-rouge">main</code></strong>).</p> <p>For my personal use, I now added a script that fetches a git-repository, runs the embedded tests, and returns. In combination with this docker-qemu-image, I can easily run my CI on foreign architectures. Maybe some day I will pick this up again and get a proper <strong><code class="highlighter-rouge">vmrun</code></strong> tool (or maybe someone else does?). Until then, I will stick to the reduced version, as it serves my needs. Sadly, there are still too many variables that cannot be auto-detected (How much memory to give to the VM? Which devices to forward? Which CPU features to enable?), and too many hacks required (String-conversions required between different command-lines… Getting a distribution to boot fast in these containers… Sharing data correctly into and out of the VM…).</p> <p>In the end, I think it is just too much of a hassle to turn into a project I can maintain and support. The tools I needed do not provide proper APIs, but require me to lump together command-lines, PID-files, and magic configurations. Maybe some day we will get there? 
Until then, let’s make use of <code class="highlighter-rouge">qemu-user</code> and avoid system integration tests…</p> David Rheinsberg Cross Bootstrap Fedora 2018-01-09T00:00:00+00:00 2018-01-09T00:00:00+00:00 https://dvdhrm.github.io/2018/01/09/cross-bootstrap-fedora <p>I recently had to assemble linux distribution images to be run in containers and virtual machines. While most package managers provide tools to bootstrap an entire distribution into a target directory (e.g., <code class="highlighter-rouge">debootstrap</code>, <code class="highlighter-rouge">dnf --installroot</code>, <code class="highlighter-rouge">zypper</code>, <code class="highlighter-rouge">pacstrap</code>, …), I needed to do that for foreign architectures. Fortunately, Fedora has me covered!</p> <p>If you use <em><code class="highlighter-rouge">dnf --installroot=/path</code></em>, dnf will perform the given operations in a separate directory tree, rather than your file-system root. It is easy to use this with <em><code class="highlighter-rouge">dnf install</code></em> to install an entire Fedora distribution into some custom directory. Unfortunately, RPM allows scripts to be run as part of the installation process of packages. Those scripts might invoke binaries of the target architecture as part of the installation. 
Hence, before we can cross bootstrap Fedora, we need one more tool: qemu-user-static</p> <p>The qemu project provides two kinds of emulators:</p> <ul> <li> <p><strong>System Emulators</strong>: These are the commonly known emulators used to emulate an entire machine of a given architecture. They can be used to run virtual machines of any kind. The binaries are usually called <strong><code class="highlighter-rouge">qemu-system-&lt;arch&gt;</code></strong>.</p> </li> <li> <p><strong>User Emulators</strong>: These emulators are much less known. They emulate the linux user-space of your target architecture of choice. That is, they execute binaries of foreign architectures on your machine, translating on the syscall boundary. Hence, you can run <code class="highlighter-rouge">MIPS</code> binaries on your <code class="highlighter-rouge">x86_64</code> machine running a normal <code class="highlighter-rouge">x86_64</code> kernel, as long as you use the qemu-user-mips emulator. The binaries are usually called <strong><code class="highlighter-rouge">qemu-&lt;arch&gt;</code></strong>.</p> </li> </ul> <p>Fedora provides a package called <code class="highlighter-rouge">qemu-user-static</code>, which provides statically linked qemu user-space emulators and hooks them up with the kernel-binfmt configuration. Hence, with the package installed, you can directly execute binaries of foreign architectures, and the kernel will use the qemu emulators to run the binaries. Since the qemu emulators are statically linked, they will work just fine in chroots as well.</p> <p>With this in mind, you can simply add <code class="highlighter-rouge">--forcearch=&lt;arch&gt;</code> to dnf to bootstrap Fedora in a foreign architecture. 
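</p> <p>Before running such a bootstrap, it can be worth confirming that the kernel really has the qemu user emulators hooked up. The following is only a minimal sketch under stated assumptions: it assumes binfmt_misc is mounted at <code class="highlighter-rouge">/proc/sys/fs/binfmt_misc</code> and that the handlers are registered under names starting with <code class="highlighter-rouge">qemu-</code>, which may differ on your system. The function name <code class="highlighter-rouge">list_qemu_binfmt</code> is made up for this example.</p>

```sh
#!/bin/sh
# Sketch: print the names of enabled qemu handlers found in a binfmt_misc
# directory. The "qemu-" name prefix is an assumption about how the
# handlers were registered; adjust it for your distribution.
list_qemu_binfmt() {
    # Optional first argument overrides the directory (handy for testing).
    dir="${1:-/proc/sys/fs/binfmt_misc}"
    for entry in "$dir"/qemu-*; do
        # Skip the unexpanded glob if nothing matched.
        [ -f "$entry" ] || continue
        # The first line of a handler file is "enabled" or "disabled".
        if [ "$(head -n 1 "$entry")" = "enabled" ]; then
            basename "$entry"
        fi
    done
}
```

<p>Calling <code class="highlighter-rouge">list_qemu_binfmt</code> without arguments prints one handler name per line. Empty output suggests the emulators are not registered, in which case the package scripts run by dnf would likely fail to execute.</p> <p>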
For instance, this bootstraps just <code class="highlighter-rouge">bash</code> and all its dependencies as 32bit ARM targets:</p> <div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>dnf \ -y \ --repo=fedora \ --repo=updates \ --releasever=27 \ --forcearch=armv7hl \ --installroot=/some/path \ install \ bash </code></pre></div></div> <p>For more information, have a look at Nathaniel McCallum’s <a href="https://npmccallum.gitlab.io/post/cross-architecture-roots-with-dnf/">introduction</a> of the <code class="highlighter-rouge">--forcearch</code> argument to dnf.</p> David Rheinsberg Rethinking The Dbus Message Bus 2017-08-23T00:00:00+00:00 2017-08-23T00:00:00+00:00 https://dvdhrm.github.io/rethinking-the-dbus-message-bus <p>Later this year, on November 21, 2017, D-Bus will see its <a href="https://cgit.freedesktop.org/dbus/dbus/commit/?id=93cff3d69fb705806d2af4fd6f29c497ea3192e0">15th birthday</a>. An impressive age, only shy of the KDE and GNOME projects, whose collaboration inspired the creation of this independent IPC system. While still relied upon by the most recent KDE and GNOME releases, D-Bus is not free of criticism. Despite its age and mighty advocates, it never gained traction outside of its origins. On the contrary, it has long been criticized as bloated, over-engineered, and orphaned. Though, when looking into those claims, you’re often left with unsubstantiated ranting about the environment D-Bus is used in. 
If you would rather have a glimpse into the deeper issues, the best place to look is the <a href="https://bugs.freedesktop.org/buglist.cgi?bug_status=__open__&amp;product=dbus">D-Bus bug-tracker</a>, including the assessments of the D-Bus developers themselves. The bugs range from uncontrolled memory usage, through silent dropping of messages, to deadlocks by design, unsolved for up to 7 years. Looking closer, most of them simply cannot be solved without breaking guarantees long given by <em>dbus-daemon(1)</em>, the reference implementation. Hence, workarounds have been put in place to keep them under control.</p> <p>Nevertheless, these issues still bugged us! Which is why we rethought some of the fundamental concepts behind the shared Message Buses defined by the D-Bus Specification. We developed a new architecture that is designed particularly for the use-cases of modern D-Bus, and it allows us to solve several long-standing issues with <em>dbus-daemon(1)</em>. With this in mind, we set out to implement an alternative D-Bus Message Bus. Half a year later, we hereby announce the <a href="https://www.github.com/bus1/dbus-broker/wiki"><strong>dbus-broker project</strong></a>!</p> <p>But before we dive into the project, let’s first have a look at some of the long-standing open bug reports on D-Bus. A selection:</p> <ul> <li> <p><a href="https://bugs.freedesktop.org/show_bug.cgi?id=33606#c11">Bug #33606</a>: <em>“stop dbus-daemon memory usage ballooning if a client is slow to read”</em></p> <p>The bug-report describes a situation where the memory-usage of <em>dbus-daemon(1)</em> grows in an uncontrolled manner, if inflight messages keep piling up in the incoming and outgoing queues of the daemon. Despite being reported more than 6 years ago, there is no satisfying solution to the issue.</p> <p>What it boils down to is the fact that <em>dbus-daemon(1)</em> does not judge messages based on their message type. 
Hence, whether a message was triggered by a peer itself (e.g., a method call), or triggered by another peer (e.g., a method reply), the message is always accounted on the sender of the message. Hence, if those messages are piled up in outgoing queues in <em>dbus-daemon(1)</em>, the sender of those messages is accounted and punished for them. This can be misused by malicious applications that simply trigger a target peer to send messages (like method replies and signals), but never read those messages and instead leave them queued. As a result, there is still no agreed-upon way to decide who to punish for excessive buffering.</p> </li> <li> <p><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80817">Bug #80817</a>: <em>“messages with abusive recursion are silently dropped”</em></p> <p>Depending on the linux kernel you use, consecutively queued unix-domain-sockets may be rejected by <strong>sendmsg(2)</strong>. This can have the effect of <em>dbus-daemon(1)</em> being unable to forward a message. The message will be silently dropped, without notifying anyone.</p> <p>There is no known workaround for this issue, since the time of <strong>sendmsg(2)</strong> might be too late for proper error-handling, due to output buffering or short writes.</p> <p>Similarly, <a href="https://bugs.freedesktop.org/show_bug.cgi?id=52372">Bug #52372</a> describes another situation where messages are silently dropped, if they are queued on an activatable name but their sender disconnects before the destination is activated.</p> <p>Lastly, <em>dbus-daemon(1)</em> might fail any message and reply with an error message. That is, method-calls but also method-replies, signals, and error-messages can all be rejected for arbitrary reasons by <em>dbus-daemon(1)</em> and trigger an error-reply. Nearly no application is prepared for asynchronous error-replies to its attempt to send a method reply or signal. Again, this stems from <em>dbus-daemon(1)</em> never judging messages by their type. 
Despite method-transactions being stateful, there is no reliable way for a peer to cancel a message transaction. Any attempt to do so might fail. The same is true for a signal-subscription.</p> <p>There are some more similar scenarios where <em>dbus-daemon(1)</em> has to silently drop messages, or unexpectedly rejects messages, thus breaking the rule of reliability. This is not about catching errors in client libraries, but about messages being either silently discarded or asynchronously rejected.</p> </li> <li> <p><a href="https://bugs.freedesktop.org/show_bug.cgi?id=28355">Bug #28355</a>: <em>“dbus-daemon hangs while starting if users are in LDAP/NIS/etc.”</em></p> <p>In addition to client-side policies, <em>dbus-daemon(1)</em> implements a mandatory access control mechanism, based on uids, gids, and message content. This, however, requires D-Bus to resolve user-names and group-names to IDs, which will involve NSS, and as such LDAP/NIS/etc. This has long been a source of deadlocks, when using D-Bus to implement those NSS modules themselves. Workarounds are available, but the problem itself is not solved.</p> </li> <li> <p><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83938">Bug #83938</a>: <em>“improve data structures for pending replies”</em></p> <p>This bug-report concerns the method-call tracking in <em>dbus-daemon(1)</em>, which is used to allow exactly one reply per method-call, but not more. A list of <em>open reply windows</em> is kept to track pending method-calls. In <em>dbus-daemon(1)</em>, this is a global, linked list, searched whenever a reply is sent. By queuing up too many replies on too many connections, lookups on this list will consume a considerable amount of time, slowing down the entire bus.</p> <p>While the issue at hand can be solved, and has been solved, there remain many similar global data-structures in <em>dbus-daemon(1)</em>, that are shared across all users. 
Some of them can be fixed, some cannot, since D-Bus defines some global behavior (like broadcast matching and name-ownership/handover). This prevents D-Bus from scaling nicely with more processors being added to a system.</p> <p>In fact, the name-registry of D-Bus, and the atomic hand-over of queued name owners, require huge global state-tracking without any known efficient, parallel solution.</p> <p>Furthermore, many of the employed workarounds simply introduce per-peer limits for those global resources. By setting them low enough, their scope has been kept under control. However, history shows that those limits have violated application expectations <a href="https://bugs.freedesktop.org/show_bug.cgi?id=50264">several</a> <a href="https://github.com/NetworkManager/NetworkManager/commit/2c299ba65c51e9c407090dc83929d692c74ee3f2">times</a>.</p> </li> </ul> <p>None of the issues mentioned here is critical enough for D-Bus to become unbearable. On the contrary, D-Bus is still popular and no serious replacement is even close to being considered a contender. Furthermore, suitable workarounds have often been put in place to control those issues.</p> <p>But we kept being annoyed by these fundamental problems, so we set forth to solve them in <em>dbus-broker(1)</em>. What we came up with is a set of theoretical rules and concepts for a different message bus:</p> <ol> <li> <p><strong>No Shared Medium</strong></p> <p>This is a rather theoretical change. Previously, the D-Bus Message Bus followed the model of actual physically wired buses, where peers place messages on a shared medium for others to fetch. The problem here is to guarantee fairness, and to make peers accountable for excessive use. In D-Bus the problem can be reduced to outgoing queues in the message broker. Whenever many peers send messages to the same destination, they fill the same message queue. If that queue runs full, someone needs to be held accountable. 
Was the destination too slow reading messages and should be disconnected? Did a sender flood the destination with an unreasonable amount of messages? Or did an innocent 3rd party just send a single message, but happened to be the final straw?</p> <p>We decided to overcome this by throwing the model of a shared medium overboard. We no longer consider a D-Bus Message Bus a global medium that all peers are connected to and submit messages to. We rather consider a bus a set of distinct peers with no global state. Whenever a peer sends a message, we consider this a transaction between the sender and the destination (or multiple destinations in case of multicasts). We try to avoid any global state or context. We want every action taken by a peer to only affect the source and target of the action, but nothing else.</p> <p>While nice in theory, D-Bus does not allow this. There is global state, and it is hard-coded in the D-Bus specification with many existing applications relying on it. However, we still tried to stick to this as closely as possible. In particular, this means:</p> <ul> <li> <p>Whenever a peer creates an object in the bus manager, it must be linked and indexed on a specific peer. There must not be any global lists or maps. Whenever the bus manager performs a transaction, it must be able to collect all objects that affect it by just looking at the involved peers.</p> <p>This rule is, in some corner-cases, violated to keep compatibility with the specification. That is, if applications rely on global behavior, it will still work. However, anything that can be indexed is indexed, and as long as applications don’t rely on obscure D-Bus features, they will never end up in those global data-structures.</p> </li> <li> <p>We now judge messages by their message types. We implement proper message transactions and always know whom to account for inflight messages. Moreover, every peer now has a limited incoming queue, which every other peer gets a fair share of. 
Whenever a peer exceeds its share on another peer’s queue, one of the two has exceeded its configured limits and the message must be rejected. Details on how we dynamically adjust those shares can be found in the online <a href="https://github.com/bus1/dbus-broker/wiki/Accounting">documentation</a>.</p> <p>We still need to decide who is at fault. Is the sender to blame or the receiver? Our solution is to base this on the question whether a message is unsolicited. That is, for unsolicited messages, the sender is to blame. For solicited messages, the receiver is to blame. Effectively, this means whenever you send a method call, you are to blame if you did not account for the reply. In case of signals, we simply treat a subscription as the intention of the subscriber to receive an unlimited stream of signals, thus making subscribed signals solicited.</p> <p>Lastly, in case of unsolicited messages, we reply with an error, and expect every peer to be able to deal with asynchronous errors to unsolicited messages. By contrast, solicited messages never yield an error. Instead, we always consider the receiver of solicited messages to be at fault, and thus throw them off the bus.</p> </li> </ul> </li> <li> <p><strong>No IPC to implement IPC</strong></p> <p>D-Bus is an IPC mechanism to allow other processes to communicate. We strictly believe that the implementation of an IPC mechanism should not use IPC itself. Otherwise, deadlocks are a steady threat.</p> <p>This means, the transaction of a message (whatever kind) should not depend on any other means but local data. We do not read files, we do not invoke NSS, we do not call into D-Bus. Instead, the operation of the bus manager regarding message transactions is a self-contained process without any external hooks or callbacks.</p> </li> <li> <p><strong>User-based Accounting</strong></p> <p>Any resource and any object allocated in the bus must be accounted on a user. 
We do not account based on peers, but always account based on users.</p> <p>In particular, this means we never have stacked accounting. We have limits for specific resources, but all those limits only ever affect the user accounting. That is, you can no longer exceed limits by simply connecting multiple times to the bus, or by creating objects that have separate accounting. Instead, whenever an action is accounted, it will be accounted on the calling user, regardless of which peer or object the action is performed through.</p> </li> <li> <p><strong>Reliability</strong></p> <p>Never ever shall a message be silently dropped! Any error condition must be caught and handled, and must never put peers into unexpected situations.</p> <p>If a situation arises where we cannot gracefully handle an error condition, we exit. We never put the burden on the peers, nor do we silently ignore it.</p> </li> </ol> <p>With these in mind, we implemented an independent D-Bus Message Bus and named it <strong>dbus-broker</strong>. It is available on <a href="https://www.github.com/bus1/dbus-broker">GitHub</a> and is already capable of booting a full Fedora Desktop System. Some of its properties are:</p> <ul> <li> <p><strong>Pure Bus Implementation</strong></p> <p>One of our aims was to make the bus manager a pure implementation with as little policy as possible. Furthermore, following our rule of <em>“No IPC to implement IPC”</em>, we stripped all external communication from it. The result is a standalone program we call <em>dbus-broker</em>, which implements a Message Bus as defined by the D-Bus specification. The only external control channel is a private socketpair that must be passed down by the parent process that spawns <em>dbus-broker(1)</em>. This channel is used to control the broker at runtime, as well as to get notified about specific events like name activation.</p> <p>On top of this, we implemented a launcher compatible with <em>dbus-daemon(1)</em>, employing <em>dbus-broker(1)</em>. 
This <em>dbus-broker-launch(1)</em> program implements the <em>dbus-daemon(1)</em> semantics of a system and session/user message bus.</p> </li> <li> <p><strong>Local Only</strong></p> <p>We only implement local IPC. No remote transports are supported. We believe that this is beyond the realm of D-Bus. You can always employ ssh tunneling to get remote D-Bus working, just like most projects do already.</p> </li> <li> <p><strong>No Legacy</strong></p> <p>We do not implement legacy D-Bus features. Anything that is marked as deprecated was dropped, as long as it is not relied upon by crucial infrastructure. We are compatible with <em>dbus-daemon(1)</em>, so use it if your system still requires those legacy features.</p> <p>All those deviations are documented in our online <a href="https://github.com/bus1/dbus-broker/wiki/Deviations">wiki</a>. Each case comes with a rationale why we decided to drop support for it.</p> </li> <li> <p><strong>Linux Only</strong></p> <p>A lot of functionality we rely on is simply not available on other operating systems. <em>dbus-daemon(1)</em> is still around (and will stay around), so there will always be a working D-Bus Message Bus for other operating systems.</p> <p>Note that we rely on several peculiar features of the Linux kernel to implement a secure message broker (including its accounting for inflight FDs, its output queueing on <em>AF_UNIX</em> including the <em>SIOCOUTQ</em> ioctl, edge-triggered event notification, the <em>SO_PEERGROUPS</em> socket option, and more). We fixed <a href="https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/net/core/sock.c?id=28b5ba2aa0f55d80adb2624564ed2b170c19519e">several</a> <a href="https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/commit/net/unix/af_unix.c?id=27eac47b00789522ba00501b0838026e1ecb6f05">bugs</a> upstream just a few weeks ago, and we will continue to do so. 
But we are not in a position to review other kernels for the same guarantees.</p> </li> <li> <p><strong>Pipelining</strong></p> <p>We support SASL pipelining for fast connection attempts. This means that all SASL D-Bus authentication requests can be queued up without waiting for their replies, including any following <em>Hello()</em> call or other D-Bus Message.</p> <p>This allows connecting to the message broker without waiting for a single roundtrip.</p> </li> <li> <p><strong>No Spec-Deviation</strong></p> <p>We do not intend to add features not standardized in the D-Bus Specification, nor do we intend to deviate from it. However, we do sometimes deviate from the behavior of the reference implementation. All those deviations are carefully considered and <a href="https://github.com/bus1/dbus-broker/wiki/Deviations">documented</a>.</p> <p>Our intention is to base this implementation on the ideas described above, and thus fix some of the fundamental issues we see in D-Bus. We report all our findings back and recommend solutions to upstream <em>dbus-daemon(1)</em>. Discussion and development of the D-Bus specification still happens upstream. We are not the people to contact for extensions of the specification, but we will happily collaborate on the upstream mailing-list and bug-tracker with whoever wants to discuss D-Bus.</p> </li> <li> <p><strong>Runtime Broker Control</strong></p> <p>The message broker process provides a control API to its parent process via a private connection. It allows feeding the initial state, but also controlling the broker at runtime.</p> <p>While it can be, and is, used to implement compatibility with the dbus-daemon configuration files, it is also possible to modify the broker at runtime, if necessary. This includes adding and removing listener sockets and activatable names at runtime. Thus, the appearance of activatable names can now be scheduled arbitrarily.</p> </li> </ul> <p>Please be aware that the <em>dbus-broker</em> project is still experimental. 
While we successfully use it on our machines to run the system and session/user bus, we do not recommend deploying it on production machines at this time. We are not aware of any critical bugs, but we do want more testing before recommending its deployment.</p> <p>If you are curious and want to try it out, there are packages available for Fedora and Arch Linux. Other distributions will follow. The online documentation also contains information on how to compile and deploy it manually.</p> <ul> <li><a href="https://github.com/bus1/dbus-broker/wiki">Project Wiki</a></li> <li><a href="https://github.com/bus1/dbus-broker/issues">Issue Tracker</a></li> <li>Last Release: <a href="https://github.com/bus1/dbus-broker/archive/v3/dbus-broker-v3.tar.gz">v3</a></li> <li>Fedora Packages in <a href="https://copr.fedorainfracloud.org/coprs/g/bus1/dbus/package/dbus-broker/">Copr</a></li> <li>Arch Linux Packages in <a href="https://aur.archlinux.org/packages/dbus-broker">AUR</a></li> </ul> David Rheinsberg Later this year, on November 21, 2017, D-Bus will see its 15th birthday. An impressive age, only shy of the KDE and GNOME projects, whose collaboration inspired the creation of this independent IPC system. While still relied upon by the most recent KDE and GNOME releases, D-Bus is not free of criticism. Despite its age and mighty advocates, it never gained traction outside of its origins. On the contrary, it has long been criticized as bloated, over-engineered, and orphaned. However, when looking into those claims, you’re often left with unsubstantiated ranting about the environment D-Bus is used in. If you rather want a glimpse into the deeper issues, the best place to look is the D-Bus bug-tracker, including the assessments of the D-Bus developers themselves. The bugs range from uncontrolled memory usage, through silent dropping of messages, to deadlocks by design, unsolved for up to 7 years. 
Looking closer, most of them simply cannot be solved without breaking guarantees long given by dbus-daemon(1), the reference implementation. Hence, workarounds have been put in place to keep them under control. Welcome 2017-07-17T00:00:00+00:00 2017-07-17T00:00:00+00:00 https://dvdhrm.github.io/2017/07/17/welcome <p>The Ponyhof blog was moved over from WordPress to here. Let’s see how this will work out!</p> <p>Thanks to <a href="https://github.com/barryclark">Barry Clark</a> for the nice Jekyll guides and examples.</p> David Rheinsberg