===============================================
|
|
Release notes for the Genode OS Framework 14.05
|
|
===============================================
|
|
|
|
Genode Labs
|
|
|
|
|
|
|
|
With Genode version 14.05, we address two problems that are fundamental for
|
|
the scalability of the framework. The first problem is the way Genode
|
|
interoperates with existing software. A new concept for integrating 3rd-party
|
|
source code with the framework makes the porting and use of software that
|
|
is maintained outside the Genode source tree easier and more robust than ever.
|
|
The rationale and the new concept are explained in Section
|
|
[Management of ported 3rd-party source code].
|
|
The second problem concerns how programs that are built atop a C
|
|
runtime (as is the case for most 3rd-party software) interact with the Genode
|
|
world. Section [Per-process virtual file systems] describes how we
|
|
consolidated many special-purpose solutions into one coherent design of using
|
|
process-local virtual file systems.
|
|
|
|
In line with our road map, we advanced our storage-related agenda by enabling
|
|
the use of NetBSD's cryptographic device driver (CGD) on Genode. Thereby, we
|
|
continue our engagement with the rump kernel that we started to embrace with
|
|
version 14.02. Section [Block-level encryption using CGD] explains the
|
|
use of CGD as a Genode component.
|
|
|
|
Apart from those infrastructural improvements, the release cycle has focused
|
|
on the NOVA and base-hw platforms. On NOVA, we are happy to have enabled
|
|
static real-time priorities, which make the kernel much more appealing
|
|
for its designated use in a general-purpose OS. Furthermore, we intensified
|
|
our work on VirtualBox on NOVA by enabling guest-addition support and
|
|
improving stability and performance. The NOVA-related improvements are
|
|
covered by Sections [VirtualBox on NOVA] and [NOVA microhypervisor].
|
|
|
|
The development of our custom base-hw kernel platform for the ARM architecture
|
|
goes full steam ahead. With the added support for multiple processors, base-hw
|
|
can finally leverage the CPU resources of modern ARM platforms. Furthermore,
|
|
we largely redesigned the memory management to avoid the need to maintain
|
|
identity mappings, which makes the kernel more robust. Section
|
|
[Execution on bare hardware (base-hw)] explains those developments in detail.
|
|
|
|
Finally, we enhanced the driver support for x86-based platforms by enabling
|
|
USB 3.0 in our Linux device-driver environment.
|
|
Section [USB 3.0 for x86-based platforms] outlines the steps we had to take.
|
|
|
|
|
|
Management of ported 3rd-party source code
|
|
##########################################
|
|
|
|
Without the wealth of existing open-source software, Genode would be of little
|
|
use. We regularly combine the work of more than 70 open-source projects with
|
|
the framework. The number is steadily growing because each Genode user longs
|
|
for different features.
|
|
|
|
Since version 11.08, we employed a common way of integrating 3rd-party
|
|
software with Genode, which came in the form of a makefile per source-code
|
|
repository. Each of those makefiles offered "prepare" and "clean" rules
|
|
that automated the downloading and integration of 3rd-party code.
|
|
The automation introduced this way was a big relief for our workflows.
|
|
Since then, the amount of 3rd-party code ported to Genode has been steadily
|
|
increasing. It eventually reached a complexity that became hard to manage
|
|
using the original mechanism.
|
|
In order to make Genode easier to conquer for new users and more
|
|
enjoyable for regular developers, we had to reconsider how
|
|
3rd-party code is integrated with the framework.
|
|
|
|
We identified the following limitations of the existing approach:
|
|
|
|
* From the viewpoint of Genode users, the most inconvenient limitation was
|
|
the lack of proper error messages when a port was not prepared beforehand.
|
|
Instead, the build system produced confusing error messages when unable to
|
|
find the source code. According to the trouble-shooting requests on our
|
|
mailing list, the missing preparation of 3rd-party code seems to be the
|
|
most prominent road block for new users.
|
|
|
|
* Even when all required 3rd-party ports have been prepared, the prepared
|
|
version may become outdated when using Genode over time. Eventually the
|
|
build process will expect a different version of the 3rd-party code than the
|
|
one prepared. This happens particularly when switching between branches. In
|
|
some cases the version of the 3rd-party code is updated quite often (e.g.,
|
|
base-nova). The build system could not detect such inconsistencies and
|
|
consequently responded with arcane error messages, or even worse, produced
|
|
binaries with unexpected runtime behaviour.
|
|
|
|
* There are many source-code repositories that deal with downloading and
|
|
integrating 3rd-party code in different ways, namely libports, ports,
|
|
ports-foc, base-<kernel>, dde_ipxe, dde_rump, dde_linux, dde_oss, qt4. Even
|
|
though all makefiles contained in those repositories used to contain the
|
|
"prepare" and "clean" rules, they were not consistent with regard to the
|
|
handling of corner cases, to the updating of packages, and with the use of
|
|
additional arguments ("PKG="). Moreover, the individual port-description
|
|
files (_<repository>/ports/*.mk_) found in the ports and libports
|
|
repositories contained a lot of boiler-plate content such as the rules for
|
|
downloading files via wget, or the rules for checking signatures. Such
|
|
duplicated code tends to degrade in quality and consistency over time,
|
|
affecting the user experience and maintenance costs in a negative way.
|
|
|
|
* The downloaded archives and the extracted 3rd-party code used to reside
|
|
within the respective repositories (in the _download/_ and _contrib/_
|
|
subdirectories). This made the use of search tools like grep very inefficient
|
|
when attempting to search in Genode's source code while excluding 3rd-party
|
|
sources. For this reason, most regular Genode developers have crafted some
|
|
special shell aliases for filtered search operations. But this should not be
|
|
the way to go.
|
|
|
|
* During the "make prepare" step, most ports of libraries used to create a
|
|
bunch of symlinks within _<rep-dir>/include/_ that pointed to the respective
|
|
header files within _<rep-dir>/contrib/_. Effectively, this step touched
|
|
Genode's source tree, which was bad in two ways. First, the portions of the
|
|
source tree installed by the "make prepare" mechanism had to be blacklisted
|
|
in Genode's .gitignore file. And second, executing the port-specific "make
|
|
clean" rules was quite dangerous because those rules operated on the source
|
|
tree.
|
|
|
|
|
|
The way forward
|
|
===============
|
|
|
|
The points above made the need for a changed source-tree structure apparent.
|
|
Traditionally, all of Genode's source-code repositories alongside the _tool/_
|
|
and _doc/_ directories were located at the root of the tree structure:
|
|
|
|
! tool/
! doc/
! base/
! base-okl4/Makefile
!           download/
!           include/
!           lib/
!           src/
! os/
! ...
|
|
|
|
Repositories that incorporated 3rd-party code (e.g., base-okl4 as depicted
|
|
above) hosted a makefile for the preparation, a _download/_ directory for the
|
|
downloaded 3rd-party source code, and a _contrib/_ directory for the extracted
|
|
source code. There was no notion of common tools that would work across
|
|
repositories.
|
|
|
|
With Genode 14.05, we move all repositories to a _repos/_ directory:
|
|
|
|
! tool/
! doc/
! repos/
!   base/
!   base-okl4/
!   os/
!   ...
! contrib/
|
|
|
|
Downloaded 3rd-party source code resides outside of the actual repository at
|
|
the central _contrib/_ directory. By using this structure, we achieve
|
|
the following:
|
|
|
|
* Searching the repositories with grep is very efficient because
downloaded and extracted 3rd-party code is no longer in the way. It
resides next to the repositories.
|
|
|
|
* In contrast to the original situation where we had no convention about
|
|
the location of source-code repositories, tools can rely on a convention
|
|
now. Being located at a known position within the tree, the tools for
|
|
creating build directories and for managing ports become aware of the
|
|
location of the repositories as well as the central _contrib/_ directory.
|
|
|
|
* Adding a supplemental repository is pretty intuitive: Just clone a git
|
|
repository into _repos/_.
|
|
|
|
* Tutorials that describe the use of Genode could benefit from the introduced
|
|
convention as they could suggest creating build directories at the top
|
|
level, which no longer interferes with the location of the source-code
|
|
repositories. This would make those tutorials a bit easier to follow.
|
|
|
|
* The create_builddir tool can create build directories at sensible default
|
|
locations. E.g., when 'create_builddir' is called with nova_x86_64 as
|
|
argument but with no BUILD_DIR argument, the tool will create a build
|
|
directory _build/nova_x86_64/_ by default. This way, we reinforce a useful
|
|
convention about the naming and location of build directories that will ease
|
|
the support of Genode users.
|
|
|
|
* Storing all build directories and downloaded 3rd-party source code somewhere
|
|
outside the Genode source tree, let's say on different disk partitions, can
|
|
be easily accomplished by creating a symbolic link for each of the _build/_
|
|
and _contrib/_ directories.
|
|
|
|
Of course, changing the source-tree structure at the top level was not a
decision we took lightly. In particular, it raised the question of how to
|
|
deal with topic branches that were branched off a Genode version with the
|
|
old layout. During the transition, we observed the following patterns
|
|
to deal with that problem:
|
|
|
|
* Git can deal well with patches that change existing files, even if the
|
|
file location has changed. For simple patches, e.g., small bug fixes,
|
|
cherry-picking those individual commits to a current branch works quite
|
|
well.
|
|
|
|
* If a commit adds new files, the files will naturally end up at the
|
|
location specified in the patch, i.e., somewhere outside of the _repos/_
|
|
directory. You will have to manually move them to the correct location using
|
|
'git mv' and squash the resulting rename commit onto the original commit
|
|
using 'git rebase -i'.
|
|
|
|
* For migrating a series of complex commits to the new layout, we use
|
|
'git format-patch' to obtain a patch series for the topic branch, prefix
|
|
the original pathnames with "repos/" using 'sed', and apply the result
|
|
using 'git am'.
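
The path-rewriting step of the last pattern can be sketched as follows. The snippet is an illustration in Python rather than the 'sed' invocation mentioned above. The function name 'prefix_patch_paths' is hypothetical, and the sketch assumes the default "a/" and "b/" path prefixes emitted by 'git format-patch'. It rewrites only the "diff --git", "---", and "+++" header lines and leaves the "/dev/null" entries of added or deleted files untouched.

```python
import re


def prefix_patch_paths(patch_text, prefix="repos/"):
    """Prefix the file paths in the headers of a patch with 'prefix'."""
    result = []
    for line in patch_text.splitlines():
        if line.startswith("diff --git "):
            # "diff --git a/<path> b/<path>": rewrite both path arguments
            line = re.sub(r" ([ab])/",
                          lambda m: f" {m.group(1)}/{prefix}", line)
        elif line.startswith(("--- a/", "+++ b/")):
            # keep the six-character marker, insert the prefix after it
            line = line[:6] + prefix + line[6:]
        result.append(line)
    return "\n".join(result) + "\n"


# made-up patch for illustration only
EXAMPLE = (
    "diff --git a/os/src/main.cc b/os/src/main.cc\n"
    "--- a/os/src/main.cc\n"
    "+++ b/os/src/main.cc\n"
    "@@ -1 +1 @@\n"
    "-old\n"
    "+new\n"
)

if __name__ == "__main__":
    print(prefix_patch_paths(EXAMPLE), end="")
```

Note that a removed line whose content happens to start with "-- a/" would be mistaken for a header by this simple matching, so the rewritten series is worth reviewing before it is applied with 'git am'.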
|
|
|
|
|
|
Unification of the ports management
|
|
===================================
|
|
|
|
With the new source-tree layout in place, we could pursue a new take on
|
|
unifying the management of ported 3rd-party source code. The new solution,
|
|
which is very much inspired by the fabulous
|
|
[https://nixos.org/nix - Nix package manager], comes in the form of new tools to
|
|
be found at 'tool/ports/'.
|
|
|
|
Note that even though the port mechanism described herein looks a bit like
|
|
"package management", it covers a different problem. The problem covered here
|
|
is the integration of existing 3rd-party source code with the Genode source
|
|
tree. Packaging, on the other hand, would provide a means to distribute
|
|
self-contained portions of the Genode source tree including their respective
|
|
3rd-party counterparts as separate packages. Package management is not
|
|
addressed yet.
|
|
|
|
The new tools capture all ports present in the repositories located under
|
|
_repos/_. Using them is as simple as follows:
|
|
|
|
:Obtain a list of available ports:
|
|
! tool/ports/list
|
|
|
|
:Download and install a port:
|
|
! tool/ports/prepare_port <port-name>
|
|
|
|
The prepare_port tool will scan all repositories for the specified port and
|
|
install the port into _contrib/_. Each version of an installed
|
|
port resides in a dedicated subdirectory within the _contrib/_ directory.
|
|
This port-specific directory is called the port directory. It is named
|
|
_<port-name>-<fingerprint>_. The _<fingerprint>_ uniquely identifies
|
|
the version of the port (it is a SHA1 hash of the ingredients of the
|
|
port). If two versions of the same port are installed, each of them will
|
|
have a different fingerprint. So they end up in different directories.
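
To illustrate why two versions of a port never collide, the following Python sketch derives a directory name from the ingredients of a port. How the port tools actually compute the fingerprint is not spelled out here, so the hashing scheme below (SHA1 over sorted name/content pairs) as well as the names 'fingerprint' and 'port_dir_name' are assumptions made for illustration only.

```python
import hashlib


def fingerprint(ingredients):
    """Return a SHA1 hex digest over a dict that maps names to bytes.

    Assumption: the ingredients are hashed in sorted order with NUL
    separators. The real tools may combine them differently.
    """
    sha1 = hashlib.sha1()
    for name in sorted(ingredients):
        sha1.update(name.encode())
        sha1.update(b"\0")
        sha1.update(ingredients[name])
        sha1.update(b"\0")
    return sha1.hexdigest()


def port_dir_name(port_name, ingredients):
    """Compose the directory name <port-name>-<fingerprint>."""
    return f"{port_name}-{fingerprint(ingredients)}"


if __name__ == "__main__":
    # made-up port descriptions that differ only in their version
    old = {"libfoo.port": b"VERSION := 1.0\n"}
    new = {"libfoo.port": b"VERSION := 1.1\n"}
    print(port_dir_name("libfoo", old))
    print(port_dir_name("libfoo", new))
```

Because any change of an ingredient yields a different digest, both versions can coexist below _contrib/_.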
|
|
|
|
Within a source-code repository, a port is represented by two files, a
|
|
_<port-name>.port_ and a _<port-name>.hash_ file. Both files reside in the
|
|
_ports/_ subdirectory of the corresponding repository. The
|
|
_<port-name>.port_ file is the port description, which declares the
|
|
ingredients of the port, e.g., the archives to download and the patches to apply.
|
|
The _<port-name>.hash_ file contains the fingerprint of the corresponding
|
|
port description, thereby uniquely identifying a version of the port
|
|
as expected by the checked-out Genode version.
|
|
|
|
So how does Genode's build system find the source code for a given port?
|
|
If the build system encounters a target that incorporates
|
|
ported source code, it looks up the respective _<port-name>.hash_ file in the
|
|
repositories as specified in the build configuration. The fingerprint found in
|
|
the hash file is used to construct the path to the port directory under
|
|
_contrib/_. If that lookup fails, a meaningful error is printed. Any number of
|
|
versions of the same port can be installed at the same time. I.e., when
|
|
switching Git branches that use different versions of the same port, the build
|
|
system automatically finds the right port version as expected by the currently
|
|
active branch.
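
The lookup described above can be sketched in a few lines of Python. This is not the build system's actual code. The function 'find_port_dir', the exception type, and the wording of the error message are hypothetical, and the sketch assumes that a hash file contains nothing but the fingerprint.

```python
from pathlib import Path


class PortNotPrepared(Exception):
    """Raised when the port expected by the checked-out tree is unavailable."""


def find_port_dir(genode_dir, repositories, port_name):
    """Return the port directory below contrib/ for 'port_name'.

    'repositories' lists the repository names (below repos/) that are
    enabled in the build configuration.
    """
    genode_dir = Path(genode_dir)
    for repo in repositories:
        hash_file = genode_dir / "repos" / repo / "ports" / f"{port_name}.hash"
        if not hash_file.is_file():
            continue
        # the hash file names the version expected by the current branch
        fingerprint = hash_file.read_text().strip()
        port_dir = genode_dir / "contrib" / f"{port_name}-{fingerprint}"
        if port_dir.is_dir():
            return port_dir
        raise PortNotPrepared(
            f"port '{port_name}' is not prepared or outdated, "
            f"run: tool/ports/prepare_port {port_name}")
    raise PortNotPrepared(
        f"no repository provides a hash file for port '{port_name}'")
```

Since the fingerprint is read from the checked-out tree on each lookup, switching to a branch with a different hash file selects a different directory below _contrib/_ without any further bookkeeping.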
|
|
|
|
For step-by-step instructions on how to add a port using the new mechanism,
|
|
please refer to the updated porting guide:
|
|
|
|
:Genode Porting Guide:
|
|
|
|
[https://genode.org/documentation/developer-resources/porting]
|
|
|
|
|
|
:Known limitations:
|
|
|
|
* There is no garbage collection of stale ports yet. Each time a port
|
|
gets updated, a new version will be created within the _contrib/_ directory.
|
|
However, the subdirectories can safely be deleted manually to regain
|
|
disk space. In the worst case, if you deleted a port that is in use,
|
|
the build system will let you know.
|
|
|
|
* Even though some port files are equipped with information about
|
|
cryptographic signatures, those signatures are not checked yet. However,
|
|
each downloaded archive is checked against a known-good hash value declared
|
|
in the port description so that the integrity of downloaded files is
|
|
checked. But as illustrated by the signature declarations in the
|
|
port descriptions, we plan to increase the confidence by enabling
|
|
signature checks in addition to the hash-sum checks.
|
|
|
|
* Dependencies between ports are not covered by port descriptions, yet.
|
|
|
|
|
|
:Transition to the new mechanism:
|
|
|
|
We have migrated the majority of the more than 70 existing ports to the new
|
|
mechanism. The only ports not covered so far are base-codezero, qt5, gcc, gdb,
|
|
and qt4. During the next release cycle, we will keep the original "make
|
|
prepare" mechanism as a front end intact. So the "make prepare" instructions
|
|
as found in many tutorials will still work. But under the hood, "make prepare"
|
|
just invokes the new _tool/ports/prepare_port_ tool.
|
|
|
|
|
|
Block-level encryption using CGD
|
|
################################
|
|
|
|
The need to protect personal data is becoming generally accepted in the
information age, especially against the background of ubiquitous storage
devices in smart phones, notebooks, and tablet computers, which may easily
go missing.
|
|
|
|
There are several different approaches to prevent unauthorized access
|
|
to data storage. For example, data could be encrypted on a per-file
basis (e.g., EncFS or PEFS). Thereby, each file is encrypted using a
cipher but stored on a regular file system alongside unencrypted files.
|
|
Beyond this approach, it is also common to encrypt data on the lower
|
|
block-device layer. With block-level encryption, each block on the
|
|
storage device is encrypted when written and decrypted when read
(e.g., TrueCrypt, FreeBSD's geli(8), Linux LUKS). On
|
|
top of this cryptographic storage device, a regular file system may be
|
|
used.
|
|
|
|
Additionally, it is desirable to access the
|
|
encrypted data from various operating systems. In our case, we want to
|
|
use the data from Genode as well as from our current development
|
|
platform Linux.
|
|
|
|
In Genode 14.02, we introduced a port of the NetBSD-based rump kernels
to leverage file-system implementations, e.g., ext2. Besides file
systems, NetBSD itself also offers block-level encryption in the form of
its cryptographic disk driver _cgd(4)_. In line with our roadmap, we
|
|
enabled the cryptographic-device driver in our rump-kernels port as a
|
|
first step to explore block-level encryption on Genode.
|
|
|
|
:[https://www.netbsd.org/docs/guide/en/chap-cgd.html]:
|
|
NetBSD cryptographic-device driver (CGD)
|
|
|
|
The heart of our CGD port is the _rump_cgd_ server, which encapsulates
|
|
the rump kernels and the cgd device. The server uses a block session to
|
|
get access to an existing block device and, in return, provides a
|
|
block session to its client. Each block written or read by the client
|
|
is transparently encrypted or decrypted by the server with a given
|
|
key. This enables us to seamlessly integrate CGD into Genode's existing
|
|
infrastructure.
|
|
|
|
To ease the use, the server interface is modelled after the interface
|
|
of _cgdconfig(8)_. This implies that the key must have the same format
|
|
as used by _cgdconfig_, which means the key is a base64-encoded
|
|
string. The first 4 bytes of the key string denote the actual length
|
|
of the key in bits (these 4 bytes are stored in big endian order). For
|
|
now, we only support the use of a stored key. However, we plan to add
|
|
the use of passphrases in conjunction with keys later.
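
The key format can be made concrete with a short Python sketch. It decodes the example key that appears in the configuration snippet further below. The helper names 'parse_cgd_key' and 'make_cgd_key' are hypothetical and not part of any Genode or NetBSD tool.

```python
import base64
import struct

# example key taken from the rump_cgd configuration snippet in this document
EXAMPLE_KEY = "AAABAJhpB2Y2UvVjkFdlP4m44449Pi3A/uW211mkanSulJo8"


def parse_cgd_key(encoded):
    """Split a base64-encoded key string into (length in bits, key bytes)."""
    raw = base64.b64decode(encoded)
    # the first 4 bytes hold the key length in bits, big endian
    (bits,) = struct.unpack(">I", raw[:4])
    key = raw[4:]
    if len(key) * 8 != bits:
        raise ValueError("length prefix does not match the key material")
    return bits, key


def make_cgd_key(key):
    """Encode raw key bytes in the same format."""
    return base64.b64encode(struct.pack(">I", len(key) * 8) + key).decode()


if __name__ == "__main__":
    bits, key = parse_cgd_key(EXAMPLE_KEY)
    print(bits, len(key))   # 256 bits, i.e., 32 bytes of key material
```

The length prefix of the example key is 256, which matches the key size of the hard-coded cipher.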
|
|
|
|
Currently, _rump_cgd_ is only able to _configure_ a _cgd_ device but
|
|
cannot generate the configuration itself. A configuration or rather a
|
|
working key may be generated by using the new _tool/rump_ script. The
|
|
used cipher is hard-coded to _aes-cbc_ with a key size of 256 bits at
|
|
the moment. Note that the server serves only one client as it
|
|
transparently encrypts/decrypts one back-end block session. Though
|
|
_rump_cgd_ is currently limited with regard to the used cipher and the
|
|
way key input is handled, we plan to extend this
|
|
rump-kernel-based component step by step in the future.
|
|
|
|
If you want to get hands-on experience with CGD, the first step is to
|
|
prepare a raw encrypted and ext2-formatted partition image by using
|
|
the 'tool/rump' script:
|
|
|
|
! dd if=/dev/urandom of=/path/to/image bs=1M count=128
! rump -c /path/to/image # key is printed to stdout
! rump -c -k <key> -f -F ext2fs /path/to/image
|
|
|
|
To use this disk image, the following config snippet can be used:
|
|
|
|
! <start name="rump_cgd">
!   <resource name="RAM" quantum="8M"/>
!   <provides><service name="Block"/></provides>
!   <config action="configure">
!     <params>
!       <method>key</method>
!       <key>AAABAJhpB2Y2UvVjkFdlP4m44449Pi3A/uW211mkanSulJo8</key>
!     </params>
!   </config>
!   <route>
!     <service name="Block"> <child name="ahci"/> </service>
!     <any-service> <parent/> <any-child/> </any-service>
!   </route>
! </start>
|
|
|
|
Note that we explicitly route the block-session requests for the
|
|
underlying block device to the AHCI driver.
|
|
|
|
The block service provided by _rump_cgd_, in turn, is used by a file-system
|
|
server.
|
|
|
|
! <start name="rump_fs">
!   <resource name="RAM" quantum="16M"/>
!   <provides><service name="File_system"/></provides>
!   <config fs="ext2fs">
!     <policy label="" root="/" writeable="yes"/>
!   </config>
!   <route>
!     <service name="Block"> <child name="rump_cgd"/> </service>
!     <any-service> <parent/> <any-child/> </any-service>
!   </route>
! </start>
|
|
|
|
Currently, the key to access the cryptographically secured device must
|
|
be specified before using the device. Implementing a mechanism which
|
|
asks for the key on the first attempt is in the works.
|
|
|
|
By using the rump kernels and the cryptographic-device driver, we are
|
|
able to use block-level encryption on Genode and on Linux.
|
|
In the Linux case, we depend on _rumprun_, which can
|
|
run unmodified NetBSD userland tools on top of the rump kernels to
|
|
manage the cgd device. To ease this task, we provide the
|
|
aforementioned _rump_ wrapper script.
|
|
|
|
:[https://github.com/rumpkernel/rumprun]: Rumprun
|
|
|
|
Since the rump script covers the most common use cases for the tools,
it is comparatively extensive. Hence, a short tutorial is in order.
|
|
|
|
|
|
:Format a disk image with Ext2:
|
|
|
|
First, prepare the actual image file:
|
|
|
|
! dd if=/dev/zero of=/path/to/image bs=1M count=128
|
|
|
|
Second, use _tool/rump_ to format the disk image:
|
|
|
|
! rump -f -F ext2fs /path/to/image
|
|
|
|
Afterwards the file system just created may be populated with the
|
|
contents of another directory by executing:
|
|
|
|
! rump -F ext2fs -p /path/to/source /path/to/image
|
|
|
|
To list the contents of the image, run:
|
|
|
|
! rump -F ext2fs -l /path/to/image
|
|
|
|
|
|
:Create an encrypted disk image:
|
|
|
|
Creating a cryptographic-disk image based on cgd(4) is done by
|
|
executing the following command:
|
|
|
|
! rump -c /path/to/image
|
|
|
|
This will generate a key that may be used to decrypt the image later
|
|
on. Since this command will only generate a key and _not_ initialize
|
|
the disk image, it is highly advised to prepare the disk image by
|
|
using _/dev/urandom_ instead of _/dev/zero_. In other words, only new
|
|
blocks later written to the disk image are encrypted on the fly. In
|
|
addition, while generating the key, a temporary configuration file will
|
|
be created. Although this file has proper permissions, it may leak the
|
|
generated key if it is created on persistent storage. To specify a
|
|
more secure directory, the '-t' option can be used:
|
|
|
|
! rump -c -t /path/to/secure/directory /path/to/image
|
|
|
|
It is advised to carefully select an empty directory because the specified
|
|
directory is removed after completion.
|
|
|
|
Decrypting the disk image requires the key generated in the previous
|
|
step:
|
|
|
|
! rump -c -k <key> /path/to/image
|
|
|
|
For now, this key has to be specified as a command-line argument. This is
an issue if the shell in use maintains a history of executed commands.
|
|
|
|
For the sake of completeness let us put all examples together by creating an
|
|
encrypted ext2 image that will contain all files of Genode's _demo_
|
|
scenario:
|
|
|
|
! dd if=/dev/urandom of=/tmp/demo.img bs=1M count=16
! rump -c /tmp/demo.img # key is printed to stdout
! rump -c -k <key> -f -F ext2fs -d /dev/rcgd0a /tmp/demo.img
! rump -c -k <key> -F ext2fs -p $(BUILD_DIR)/var/run/demo /tmp/demo.img
|
|
|
|
To check if the image was populated successfully, execute the
|
|
following:
|
|
|
|
! rump -c -k <key> -F ext2fs -l /tmp/demo.img
|
|
|
|
More detailed information about the options and arguments of
|
|
this tool can be obtained by running:
|
|
|
|
! rump -h
|
|
|
|
Since _tool/rump_ just utilizes the rump kernels running on the host
|
|
system to do its duty, there is a script called _tool/rump_cgdconf_
|
|
that extracts the key from a 'cgdconfig(8)' generated configuration
|
|
file and is also able to generate such a file from a given key.
|
|
This way, we aim for interoperability between the general
|
|
rump-kernel-based tools and the _rump_cgd_ server used on Genode.
|
|
|
|
|
|
Per-process virtual file systems
|
|
################################
|
|
|
|
Our C runtime served us quite well over the years. At its core, it has a
|
|
flexible plugin architecture that allows us to combine different back ends
|
|
such as the lwIP socket API (using libc_lwip_nic_dhcp), using LOG as stdout
|
|
(via libc_log), or using a ROM dataspace as a file (via libc_rom). Recently
|
|
however, the original design has started to show its limitations:
|
|
|
|
Although there is the libc_fs plugin that allows a program to access files
|
|
from a file-system server, there is no way to allow a program to access
|
|
two different file-system servers. For example, a web server cannot
obtain its configuration and the website content from two different file
systems.
|
|
|
|
Besides the lack of features of individual libc plugins, there are
|
|
problems stemming from combining multiple plugins.
|
|
For example, there is the libc_block plugin that makes a block session
|
|
accessible as a
|
|
pseudo block device named "/dev/blkdev". However, when combined with the
|
|
libc_fs plugin, it is not defined which of the two plugins will respond to
|
|
requests for a file with this name.
|
|
As a quick and dirty work-around, the libc_fs plugin
|
|
explicitly black-lists "/dev/blkdev". The need for such a work-around
|
|
hints at a deficiency of the overall design.
|
|
In general, if multiple plugins are combined, there is no consistent
|
|
virtual file-system structure exposed via getdirentries.
|
|
|
|
Another inconvenience is a missing concept for handling standard input
|
|
and output. Most programs use
|
|
libc_log to direct stdout to the LOG service. But what if we want to
|
|
direct the output of such a program to a terminal? Granted, there
|
|
exists the terminal_log server to translate a LOG session to a
|
|
terminal session but it would be much nicer to have this flexibility
|
|
at the C-runtime level.
|
|
|
|
Finally, when looking at the implementation of the plugins, it becomes
|
|
apparent that many of them look similar. We have to admit that there are quite
|
|
a few dusty corners where duplicated code has been accumulated over the years.
|
|
That said, the semantic details (e.g., the quality of error handling) differ
|
|
from plugin to plugin. Seeing the number of file systems (and thereby the
|
|
number of added libc plugins) grow, it became clear that our original
|
|
design would make the situation even worse.
|
|
|
|
On the other hand, we have had very positive experiences with the
|
|
virtual file-system implementation of our Noux runtime, which is an
|
|
environment for running Unix software on Genode. The VFS as implemented for
|
|
Noux supports stacked file systems (similar to union mounts) of various
|
|
types. It is stable and complete enough to run our tool chain to build Genode
|
|
on Genode. Wouldn't it be a good idea to reuse the Noux VFS for the normal
|
|
libc? With the current release cycle, we pursued this line of thought.
|
|
|
|
The first step was transplanting the VFS code from the Noux runtime to a
|
|
free-standing library. The most substantial
|
|
change was the decoupling of the VFS interfaces from the types provided by
|
|
Noux. All those types were moved to the VFS library. In the process
|
|
of reshaping the Noux VFS into a library, several existing pseudo file systems
|
|
received a welcome clean-up, and some new ones were added. In particular,
|
|
there is a new "log" file system for writing data to a LOG session, a "rom"
|
|
file system for reading ROM modules, and an "inline" file system for
|
|
reading data defined within the VFS configuration.
|
|
|
|
The second step was the addition of a new libc_vfs plugin to the C runtime.
|
|
This plugin makes the VFS library available to libc-using programs via the
|
|
original libc plugin interface. It translates the types and functions of the
|
|
VFS library to the types and functions of the C library. At this point, it was
|
|
an optional plugin. As the VFS was meant to replace the various existing plugins
|
|
instead of accompanying them, the next challenge was to revisit all the
|
|
users of the various libc plugins and adapt them to use the libc_vfs
plugin instead. This was, by far, the most laborious step. More than 50
|
|
programs and their respective run scripts had to be adapted and tested.
|
|
However, this process was very satisfying because we could see how the
|
|
new VFS plugin satisfies all the use cases formerly accommodated by a zoo
|
|
of special plugins.
|
|
|
|
As the last step, we could retire several libc plugins such as libc_rom,
|
|
libc_block, libc_log, and libc_fs and merge the libc_vfs into the libc.
|
|
Technically, it is still a plugin, but it is always present.
|
|
|
|
|
|
:How has the libc changed?:
|
|
|
|
Each libc-using program can be configured with a program-local virtual
|
|
file system as illustrated by the following example:
|
|
|
|
! <config>
!   ...
!   <libc stdin="/dev/null" stdout="/dev/log" stderr="/dev/log">
!     <vfs>
!       <dir name="dev">
!         <log/>
!         <null/>
!       </dir>
!       <dir name="etc">
!         <dir name="lighttpd">
!           <inline name="lighttpd.conf">
!             ...
!           </inline>
!         </dir>
!       </dir>
!       <dir name="website">
!         <tar name="website.tar"/>
!       </dir>
!     </vfs>
!   </libc>
! </config>
|
|
|
|
Here you see a lighttpd server that serves a website coming from a TAR
|
|
archive (which is obtained from a ROM module named "website.tar"). There
|
|
are two pseudo devices "/dev/log" and "/dev/null", to which the
|
|
"stdin", "stdout", and "stderr" attributes refer. The "log" file system
|
|
consists of a single node that represents a LOG session. The web server
|
|
configuration is supplied inline as part of the configuration. (BTW, you can
|
|
try out a very similar scenario using the 'ports/genode_org.run' script.)
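
To make the relation between the nested nodes and the resulting paths explicit, the following Python sketch walks a configuration like the one above and lists the paths it defines. It is an illustration only and does not reflect how the VFS library is implemented. The functions 'vfs_paths' and 'dangling_stdio' are hypothetical, and the naming rules noted in the comments are assumptions derived from the example.

```python
import xml.etree.ElementTree as ET

CONFIG = """<config>
  <libc stdin="/dev/null" stdout="/dev/log" stderr="/dev/log">
    <vfs>
      <dir name="dev"> <log/> <null/> </dir>
      <dir name="etc">
        <dir name="lighttpd">
          <inline name="lighttpd.conf">...</inline>
        </dir>
      </dir>
      <dir name="website"> <tar name="website.tar"/> </dir>
    </vfs>
  </libc>
</config>"""


def vfs_paths(node, prefix=""):
    """Return (path, file-system type) pairs for all leaves below 'node'."""
    result = []
    for child in node:
        if child.tag == "dir":
            result += vfs_paths(child, prefix + "/" + child.get("name"))
        elif child.tag == "tar":
            # assumption: the archive content appears in the enclosing dir
            result.append((prefix or "/", "tar"))
        else:
            # assumption: nodes without "name" are named after their type
            name = child.get("name", child.tag)
            result.append((prefix + "/" + name, child.tag))
    return result


def dangling_stdio(config_xml):
    """Return the stdio attributes that refer to a path not defined in the VFS."""
    libc = ET.fromstring(config_xml).find("libc")
    paths = {path for path, _ in vfs_paths(libc.find("vfs"))}
    return [attr for attr in ("stdin", "stdout", "stderr")
            if libc.get(attr) is not None and libc.get(attr) not in paths]


if __name__ == "__main__":
    libc = ET.fromstring(CONFIG).find("libc")
    for path, fs_type in vfs_paths(libc.find("vfs")):
        print(path, fs_type)
```

A check like 'dangling_stdio' can help to spot a "stdout" attribute that points to a pseudo device missing from the '<vfs>' node.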
|
|
|
|
The VFS implementation resides at 'os/include/vfs/'. This is where you
|
|
can see the file-system types that are available (look for
|
|
_*_file_system.h_ files). Because the same code is used by Noux, we have
|
|
one unified and coherent VFS implementation throughout the framework now.
|
|
|
|
There are two things needed to adapt your work to the change.
|
|
|
|
* Remove the use of the libc_{rom, block, log, fs} plugins from your
|
|
target description files. Those plugins are no more. As of now,
|
|
the VFS is still internally a plugin, but it is always included with
|
|
the libc.
|
|
|
|
* Configure the VFS of your libc-using program in your run script. For
|
|
most former users of the sole libc_log plugin, this configuration
|
|
looks like this:
|
|
|
|
! <config>
!   <libc stdout="/dev/log" stderr="/dev/log">
!     <vfs> <dir name="dev"> <log/> </dir> </vfs>
!   </libc>
! </config>
|
|
|
|
For former users of other plugins, there are the 'block', 'rom',
|
|
and 'fs' file-system types available.
|
|
|
|
|
|
:Feature set and limitations:

As of now, the following file-system types are supported:

:dir: represents a directory, which, in turn, can host multiple file
  systems.

:block: accesses a block session. The label of the session can be configured
  via the "label" attribute.

:fs: accesses a file-system server via a file-system session. The session
  label can be defined via the "label" attribute.

:inline: provides the content of the configuration node as the content of
  a read-only file.

:log: represents a pseudo device for writing to a LOG session. This type
  is useful for redirecting stdout to a LOG service such as the one provided
  by core.

:null and zero: represent pseudo devices similar to _/dev/null_ and
  _/dev/zero_ on Unix.

:rom: makes a ROM module available as a read-only file. If the name of
  the ROM module differs from the node name, the module name can be
  expressed by the "label" attribute.

:tar: obtains a TAR archive as ROM module and makes its content available
  as a file system. The name of the ROM module corresponds to the
  name of the tar node.

:terminal: is a pseudo device that accesses a terminal session. The
  session can be labeled using the "label" attribute.
|
|
|
|
There are still two major limitations. First, select is not supported yet.
That means that programs cannot block for I/O (such as reading from a
terminal). Because of this limitation, we still keep the libc_terminal plugin
around, which supports select. Second, the VFS interface performs
read and write operations as synchronous requests. This is inherited from the
Noux implementation. We plan to change the interface to support non-blocking
operations, but this step has not been taken yet.
|
|
|
|
|
|
Revised session interfaces
==========================

The session interfaces for framebuffer and file-system access underwent
the following minor changes.

:Framebuffer session:

We simplified the framebuffer-session interface by removing the
'Framebuffer::Session::release()' method. This step makes the mode-change
protocol consistent with the way the ROM-session interface handles
ROM-module changes. That is, the client acknowledges the release of its
current dataspace by requesting a new dataspace via the
'Framebuffer::Session::dataspace()' method.

To enable framebuffer clients to synchronize their operations with the
display frequency, the session interface received the new 'sync_sigh'
function. Using this function, a client can register a handler for
receiving display-synchronization events. As of now, no framebuffer
service implements this feature in a useful way. But this will change
in the upcoming release cycle when we overhaul Genode's GUI stack.

:File-system session:

Until now, there was no exception type for the attempt to create a symbolic
link on a file system without symlink support, e.g., FAT. The
corresponding file-system server (ffat_fs) used to return a negative handle
as a work-around. Hence, we added 'Permission_denied' to the list of
exceptions thrown by 'File_system::Session::symlink' to handle this case in
a clean way.
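
The following sketch illustrates how a client might deal with the new
exception. It is a self-contained toy model and not taken from the Genode
sources: the types 'Fat_like_session', 'Ram_like_session', and
'Symlink_handle' as well as the helper 'try_symlink' are hypothetical
stand-ins, and the 'symlink' signature is simplified compared to the real
session interface.

```cpp
#include <cassert>
#include <string>

/* stand-ins for the session interface, not the Genode headers */
struct Permission_denied { };

struct Symlink_handle { int value; };

/* models a file system without symlink support, e.g., FAT */
struct Fat_like_session
{
	Symlink_handle symlink(std::string const &)
	{
		/* reject the request instead of returning a negative handle */
		throw Permission_denied();
	}
};

/* models a file system that supports symbolic links */
struct Ram_like_session
{
	Symlink_handle symlink(std::string const &)
	{
		return Symlink_handle { 1 };
	}
};

/* client-side handling, returns true if the link could be created */
template <typename SESSION>
bool try_symlink(SESSION &session, std::string const &name)
{
	try {
		session.symlink(name);
		return true;
	}
	catch (Permission_denied const &) {
		return false;
	}
}
```

With the exception in place, a client no longer needs to interpret the value
of the returned handle to detect the missing symlink support.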
|
|
|
|
|
|
Ported 3rd-party software
#########################

VirtualBox on NOVA
==================

With Genode 14.02, we successfully executed more than seven
guest operating systems, including MS Windows 7, on top of Genode/NOVA. Based
on this proof of concept, we invested significant efforts to stabilize
and extend our port of VirtualBox during the last three months. We
also paid attention to user friendliness (i.e., features) by enabling
support for guest additions.

Regarding stability, one issue we encountered has been occasional
synchronization problems during the early VMM bootstrap phase. Several
internal threads in the VMM are started concurrently, like the timer
thread, emulation thread (EMT), virtual CPU handler thread, hard-disk
thread, and user-interface front-end thread. Some of these threads are
favored over others according to their importance. VirtualBox expresses
this by host-specific mechanisms like priorities and nice levels of the
host operating system. For Genode, we implemented this specific part
accordingly by using multiple Genode CPU sessions.

The next area of work was the emulation code and the code for
handling VM exits, which had been executed by two different threads.
We chose this structure in the original port to satisfy the following
specific characteristics of the underlying NOVA kernel. The emulation
code is provided by VirtualBox and is started as a pthread (EMT).
In contrast, the hardware-accelerated vCPU thread is running
solely in the context of the VM in guest mode. When a VM exit happens,
the exit is reflected by an IPC message sent through a NOVA portal and
received by a vCPU handler thread running in our port of the
VirtualBox VMM. This thread must be a NOVA _worker_ thread, one which
has no scheduling context (SC) associated. The emulation thread,
however, is a _global_ thread with an associated SC.

Using two separate threads and synchronization points between them
enabled us to make quick progress in the first release of the port,
which led to the successful execution of Windows guests. Now, one goal was
to merge both threads in order to avoid thread-context switching costs
between them. Also, we wanted to get rid of transferring the state
between vCPU handler and emulation thread back and forth including all
that ugly synchronization code. For that purpose, we changed the
startup of the emulation code: We first set up the vCPU handler thread
and then start the vCPU in the VM. Thereafter, the VM exits immediately
via a NOVA-specific vCPU startup exception and the vCPU handler thread
takes control. The vCPU handler thread then actually starts
executing the VirtualBox-specific emulation code (originally executed
by the EMT). Now the vCPU handler thread and the VirtualBox EMT
are physically one execution context. Whenever the emulation
code decides to switch to hardware-accelerated mode, the vCPU handler
thread can directly set up the transfer of the VM state from the
VirtualBox emulation mode into the state fields of the vCPU of the
guest.

Additionally, we had to re-adjust the memory management of our port to
meet requirements expected by VirtualBox. For some internal data
structures, VirtualBox saves a pointer to a memory location not just
as absolute pointer, but instead splits this pointer into a
process-absolute base and a base-local offset. These structures can
thereby be shared over different protection domains where the base
pointer typically differs (shared memory attached at different
addresses). For the Genode port, we don't need this shared-memory
feature. However, we had to recognize that the offset is stored as a
signed 32-bit integer (int32_t). On a 64-bit host, this
caused trouble if the distance between two memory pointers was
larger than 31 bit (2 GiB). Fortunately, each memory-allocation
request for such data structures comes with a type field, which we can
use to make sure that all allocations per type are located within a 2
GiB virtual range.
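
The offset problem and the per-type remedy can be modeled in a few lines.
The following sketch is not code from the port: 'offset_fits' and
'Type_arena' are hypothetical names, addresses are represented as plain
integers, and the bump allocator merely stands in for whatever allocation
strategy is used per type.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

/*
 * Check whether 'target' can be expressed as 'base' plus a signed
 * 32-bit offset.
 */
bool offset_fits(std::uint64_t base, std::uint64_t target)
{
	std::int64_t const distance = static_cast<std::int64_t>(target - base);

	return distance >= std::numeric_limits<std::int32_t>::min()
	    && distance <= std::numeric_limits<std::int32_t>::max();
}

/* hypothetical arena: one 2 GiB virtual window per allocation type */
class Type_arena
{
	private:

		std::uint64_t const _base;
		std::uint64_t       _used = 0;

	public:

		static constexpr std::uint64_t SIZE = 1ULL << 31;  /* 2 GiB */

		explicit Type_arena(std::uint64_t base) : _base(base) { }

		/* bump allocation, returns 0 if the window is exhausted */
		std::uint64_t alloc(std::uint64_t size)
		{
			if (size == 0 || size > SIZE - _used)
				return 0;

			std::uint64_t const addr = _base + _used;
			_used += size;
			return addr;
		}
};
```

Because any two addresses handed out by the same arena are less than 2 GiB
apart, their distance always fits into the signed 32-bit offset, regardless
of where the window is located in the 64-bit address space.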
|
|
|
|
Finally, we optimized the VM exits marginally and now try to avoid
entering the emulation mode during a recall VM exit. If the VMM models
signal a pending IRQ during the recall VM-exit
handling, we inject the IRQ directly into the VM instead of changing
into the VirtualBox emulation mode by default.

Regarding our keen endeavor to enable VirtualBox's guest additions, we
started by enabling the VMMDev PCI pseudo device, which is the basis
for VMM-specific hypercalls executed by guest systems. Besides basic
functions (e.g., software version reporting from host to guest and
vice versa), complex communication protocols can also be implemented by
storing request structures in guest-physical memory and passing their
addresses to the VMMDev request I/O port. The communication mechanism
in VirtualBox is called host-guest-communication manager (HGCM) and
provides host services to the enlightened guest operating system.
Among the available services, the most interesting service for us was
support for _shared folders_ to exchange data between Genode and the
guest OS. Now, we are able to configure shares in VirtualBox, which
are mapped to VFS directories. For example

! <start name="virtualbox">
!   ...
!   <config>
!     ...
!     <libc> <vfs> <dir name="ram"> <fs label="ram" /> </dir> </vfs> </libc>
!     <share host="/ram/miezekatze" guest="miezekatze" />
!     ...
!   </config>
!   <route>
!     <service name="File_system">
!       <if-arg key="label" value="ram" /> <child name="ram_fs"/>
!     </service>
!     ...
!   </route>
! </start>

configures one shared folder _miezekatze_, which is backed by a VFS
mount to a pre-populated RAM file system.

Furthermore, we integrated the guest-pointer device with the
Nitpicker pointer and connected the real-time clock VMM model to our
RTC-device driver. Both features are enabled by default and need no
further configuration. Currently, both Nitpicker and the guest OS
draw their mouse pointers on screen. We will improve this in the future
because the guest communicates GUI state via distinct pointer shapes.

During our development, we updated our port to VirtualBox 4.2.24 with the
rough plan to go for 4.3 during the rest of the year.
|
|
|
|
|
|
Ported libraries
================

We updated OpenSSL to version 1.0.1g, which contains a fix for the
Heartbleed bug.
Furthermore, we enabled OpenSSL and curl for the ARM architecture.
|
|
|
|
|
|
Device drivers
##############

USB 3.0 for x86-based platforms
===============================

Having supported USB 3.0 (XHCI) host controllers on the Exynos 5 platform
since mid-2013, we decided it was about time to enable USB 3.0 on x86
platforms. Because XHCI is a standardized interface, which is also exposed by
the Exynos 5 host controller, the enablement was relatively straightforward.
The major open issue for x86 was the missing connection of the USB controller
to the PCI bus. For this, we ported the XHCI-PCI part from Linux and connected
it with the internal-PCI driver of our _dde_linux_ environment. This step
enabled basic XHCI support for x86 platforms. Unfortunately, there seems to
be not a single USB 3.0 controller without quirks. Thus, we tested some PCI
cards and notebooks and added controller-specific quirks as needed. These
quirks may not cover all current production chips though.

We also enabled and tested the HID, storage, and network profiles for USB 3.0,
where the supported network chip is, as for Exynos 5, the ASIX AX88179
Gigabit-Ethernet Adapter.
|
|
|
|
|
|
Platforms
#########

Execution on bare hardware (base-hw)
====================================

Multi-processor support
~~~~~~~~~~~~~~~~~~~~~~~

When we started to contemplate the support for symmetric multiprocessing
within the base-hw kernel, plenty of fresh influences on this subject
floated around in our minds. Most notably, the NOVA port of Genode recently
obtained SMP support in the course of a prototypical comparison of different
models for inter-processor communication. In addition to the very insightful
conclusions of this evaluation, our knowledge about other kernel projects and
their paths to SMP went into our considerations. In general, this showed us
that the subject - if addressed too ambitiously - may entail lots of complex
stabilization problems, and coping with them easily diminishes SMP efficiency
in the aftermath.

Against this backdrop, we decided - as so often in the evolution of the base-hw
kernel - to pick the easiest-to-reach and easiest-to-grasp solution first while
disregarding, for the time being, secondary requirements like scalability. As
the base-hw kernel is single-threaded on uniprocessor systems, the obvious
choice was to maintain one kernel thread per SMP processor and, as far as
possible, let them all work in a similar way. Moreover, to keep the code base
of the kernel as unmodified as possible while introducing SMP, access to
kernel objects gets fully serialized by one global spin lock. With this, we had
a very minimalistic starting point for what shall emerge on the kernel side.
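
The principle of serializing every kernel pass by one global spin lock can be
sketched as follows. The sketch is a user-level model and not the kernel's
code: 'Spin_lock', 'Kernel_data', and 'kernel_pass' are hypothetical names
chosen for illustration only.

```cpp
#include <atomic>
#include <cassert>

/* hypothetical stand-in for the single global kernel lock */
class Spin_lock
{
	private:

		std::atomic<bool> _taken { false };

	public:

		/* returns true if the lock was free and is now held by the caller */
		bool try_lock()
		{
			return !_taken.exchange(true, std::memory_order_acquire);
		}

		/* busy-wait until the lock could be acquired */
		void lock() { while (!try_lock()) { } }

		void unlock() { _taken.store(false, std::memory_order_release); }
};

/* hypothetical kernel state that every kernel pass may touch */
struct Kernel_data { unsigned long passes = 0; };

static Spin_lock   global_lock;
static Kernel_data kernel_data;

/* one kernel pass as executed by any processor */
unsigned long kernel_pass()
{
	global_lock.lock();
	unsigned long const result = ++kernel_data.passes;
	global_lock.unlock();
	return result;
}
```

Since all kernel data is accessed only between 'lock' and 'unlock', code that
was written for the uniprocessor case needs no finer-grained synchronization.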
|
|
|
|
Likewise, we started with a feature set narrowed to only the essentials on the
user side, prohibiting thread migration, any kind of inter-processor
communication, and also the unmapping of dataspaces, as this would have
raised the need for synchronization of TLBs. While thread migration
is still an open issue, means of inter-processor communication and TLB
synchronization were added successively once the basics worked stably.

First of all, the startup code of the kernel had to be adapted. The simple
uniprocessor instantiation was split into three phases: At the very beginning,
the primary processor runs alone and initializes everything that is needed for
calling a simple C function, which then prepares and performs the activation of
the other processors. For each processor, the program provides a dedicated
piece of memory for the local kernel stack to live in. Now each processor
goes through the second (the asynchronous multiprocessor) phase, initializing
its local caches and its memory-management unit. This is a basic prerequisite
for spin locks to behave coherently across processors, which also implies that
memory accesses at this stage can't be synchronized. Therefore, the first
initialization phase prepares everything in such a way that the second phase
can be done without writing to global memory. As soon as the processors are
done with the second phase, they acquire the global spin lock that protects all
kernel data. This way, all processors consecutively pass the third
initialization phase that handles all remaining drivers and kernel objects.
This is the last time the primary processor plays a special role by doing all
the work that isn't related to processor-local resources. Afterwards the
processors can proceed to the main function that is called on every kernel
pass.

Another main challenge was the mode-transition assembler code path that
performs both transitions, from a processor exception to the call of the
kernel-main function and from the return of the kernel-main function back to
user space. As this can't be synchronized, all corresponding data must be
provided per processor. This brought in additional offset calculations, which
were a little tricky to achieve without polluting the user state. But after we
managed to do so, the kernel was already able to handle user threads on
different processors as long as they didn't interact with each other.

When it came to synchronous and asynchronous inter-processor communication,
we enjoyed a big benefit of our approach. Due to fully serializing all kernel
code paths, none of the communication models had to change with SMP. Thanks to
the cache coherence of ARM hardware, even shared memory amongst processors
isn't a problem. The only difference is that now a processor may change the
schedule of another processor by unblocking one of its threads on communication
feedback. This may rescind the current scheduling choice of the other
processor. To avoid lags in this case, we let the unaware processor trap into
an IPI. As the IPI sender doesn't have to wait for an answer, this isn't a big
deal, neither conceptually nor in terms of performance.

The last problem we had to solve for common Genode scenarios was the coherency
of the TLBs. When unmapping a dataspace at one processor, the corresponding
TLB entries must be invalidated on all processors, which - at least on
ARM systems - can be done processor-locally only. Thus we needed a protocol to
broadcast the operation. At first, we decided to leave it to the userland to
reserve a worker thread at each processor and synchronize between them. This
way, we didn't have to modify the kernel back end that was responsible for
updating the caches back in uniprocessor mode. Unfortunately, the revised
memory management explained in Section [Sparsely populated core address space]
relies on unmap operations at the startup of user threads, which led us into a
chicken-and-egg situation. Therefore, the broadcasting was moved from the
userland into the kernel. If a user thread now asks the kernel to update the
TLBs, the kernel blocks the thread and informs all processors. The last
processor that completes the operation unblocks the user thread. If this
unblocking happens remotely, the kernel acts exactly the same as described
above in the user-communication model. This way, the kernel never blocks itself
but only the thread that requests a TLB update.
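
The bookkeeping behind this protocol boils down to a counter of outstanding
processors. The following toy model only illustrates that idea under
simplifying assumptions: the class 'Tlb_update' and its methods are
hypothetical, and no locking is shown because the model assumes that each
call happens during a kernel pass, which is serialized by the global kernel
lock.

```cpp
#include <cassert>

/* toy model of one broadcast TLB update, all names are hypothetical */
class Tlb_update
{
	private:

		unsigned _pending;                   /* processors still to flush */
		bool     _requester_blocked = true;  /* state of the calling thread */

	public:

		explicit Tlb_update(unsigned processors) : _pending(processors) { }

		/*
		 * Called by each processor after it invalidated its local TLB
		 * entries. Returns true for the processor that finished last
		 * and thereby unblocked the requesting thread.
		 */
		bool processor_done()
		{
			if (_pending == 0)
				return false;

			if (--_pending > 0)
				return false;

			_requester_blocked = false;
			return true;
		}

		bool requester_blocked() const { return _requester_blocked; }
};
```

In this model, only the requesting thread is ever in a blocked state, and
each processor performs a short, non-blocking step.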
|
|
|
|
Given that all kernel operations are lightweight non-blocking operations, we
assume that there is little contention for the global kernel lock. So we hope
that the simple SMP model will perform well for the foreseeable future where
we will have to accommodate only a handful of processors. If this assumption
turns out to be wrong, or if the kernel should scale to large-scale SMP
systems one day, we still have the choice to advance to a more sophisticated
approach without much backpedaling.
|
|
|
|
|
|
Sparsely populated core address space
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

As the base-hw platform started as an experiment, its memory management was
built in a pretty straightforward manner. All physical memory of the
corresponding hardware was mapped to the virtual memory-address space of
the kernel/core one-to-one. This approach comes with several limitations:

* The amount of physical memory that can be used is limited to a maximum
  of 4 GB on 32-bit ARM platforms.
* Several classes of potential memory bugs within base-hw's core may remain
  undetected (e.g., dangling pointers).
* A static mapping of the core/kernel code within a dedicated, restricted area
  of the address space of all tasks is impossible, although this might be
  valuable to minimize the runtime overhead of interrupts and page faults.
* As all physical RAM is mapped into core/kernel's address space as
  cacheable memory, in general it is impossible to map a portion of RAM with
  other caching attributes, as the cache is working with physical addresses
  on ARM. This caused problems when dealing with DMA memory, or when sharing
  uncached memory between TrustZone's secure and normal world in the past.

These limitations are resolved as only memory actually used by base-hw's
core/kernel is mapped on demand now. Moreover, the mapping from physical to
virtual isn't necessarily one-to-one anymore.
|
|
|
|
|
|
NOVA microhypervisor
====================

In line with most L4 kernels, the NOVA microhypervisor supports
priority-based round-robin scheduling. However, on Genode we did not
leverage this feature. The reason was simple: We had no use for
priorities on NOVA until now. This changes when we are heading towards
using Genode on a daily basis to perform our work. On live Genode
systems, we want to prioritize particular workloads over others.
Admittedly, there was a second reason: besides merely enabling the
configuration of priorities, one challenging technical issue had to be
solved, which we preferred to postpone.

The NOVA kernel supports the creation of threads with and without a
scheduling context attached. Scheduling contexts define a time
quantum, a budget, and a priority. The scheduler uses contexts to
decide which activity runs next on the CPU. Therefore, a thread
without a scheduling context attached can be executed only if a thread
with a scheduling context transfers the context during IPC or during
an exception implicitly for the time of the request. The transfer of
the scheduling context implicitly defines the thread's current
priority level. As a consequence, entrypoint threads inherit the
priority of client threads and may run on completely different
priority levels than other threads in the same process. Unfortunately,
the described behavior violates an invariant that is
required by Genode's yielding spinlock implementation: All threads of one
process are running at the same priority level. Otherwise, the system
may end up in a live lock. Although the user-level yielding spinlock
implementation is used solely to protect a few instructions in the
lock implementation, the live lock poses a high risk for the system.

To overcome this issue in base-nova, we replaced the generic yielding
spinlock implementation with a NOVA-specific helping lock. So,
lower-priority threads potentially holding the helping lock get lent
the scheduling context of a higher-priority lock applicant and thereby
can finish the critical section. The core idea is to store the identity of the
lock holder in the form of an execution-context capability in the lock
variable. Other lock applicants use the stored capability and instruct
the kernel to help the lock holder with their own scheduling context.
Consequently, the lock-holder thread will run on the budget of the
scheduling context obtained by the helping thread and, therefore,
implicitly at the inherited priority level. The lock holder will
instruct the kernel to pass back the lent scheduling context to the
applicant when leaving the critical section.
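
The effect of helping on priorities can be modeled without any kernel
involvement. The sketch below is a strongly simplified illustration and not
the base-nova implementation: 'Thread' and 'Helping_lock' are hypothetical
types, the stored pointer stands in for the execution-context capability, and
lending a scheduling context is reduced to raising the holder's effective
priority.

```cpp
#include <algorithm>
#include <cassert>

/* toy model, the real mechanism is implemented by the kernel */
struct Thread
{
	int const own_priority;
	int       effective_priority;

	explicit Thread(int prio)
	: own_priority(prio), effective_priority(prio) { }
};

class Helping_lock
{
	private:

		Thread *_holder = nullptr;  /* models the stored EC capability */

	public:

		/* returns true if 'applicant' obtained the lock */
		bool lock(Thread &applicant)
		{
			if (!_holder) {
				_holder = &applicant;
				return true;
			}

			/* lock is taken: help the holder with the applicant's context */
			_holder->effective_priority =
				std::max(_holder->effective_priority,
				         applicant.effective_priority);
			return false;
		}

		void unlock(Thread &holder)
		{
			if (_holder != &holder)
				return;

			/* pass back the lent scheduling context */
			holder.effective_priority = holder.own_priority;
			_holder = nullptr;
		}
};
```

In the model, a low-priority lock holder temporarily runs at the priority of
a high-priority applicant and falls back to its own priority when it leaves
the critical section, which is the property that avoids the live lock.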
|
|
|
|
We had to extend the NOVA syscall interface to express that a thread
wants to pass its current scheduling context explicitly to another
thread if and only if both threads belong to the same process and CPU.
On reschedule, the context implicitly returns to the lending thread.
Additionally, a thread may request an explicit reschedule in order to
return a lent scheduling context obtained from another thread.

The current solution enables Genode to make use of NOVA's static priorities.

Another unrelated NOVA extension is the ability for a thread to yield
the CPU. The context gets enqueued at the end of the run queue without
refreshing the remaining budget.
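
The semantics of yielding can be illustrated by a toy run queue. The sketch
is an assumption-laden model and not NOVA code: 'Sched_context' and
'Run_queue' are hypothetical types, and a single queue is used although the
kernel distinguishes priority levels.

```cpp
#include <cassert>
#include <deque>

/* hypothetical scheduling context with its remaining time budget */
struct Sched_context
{
	int      id;
	unsigned budget_left;
};

class Run_queue
{
	private:

		std::deque<Sched_context> _queue;

	public:

		void enqueue(Sched_context sc) { _queue.push_back(sc); }

		/* context that runs next, the queue must not be empty */
		Sched_context const &current() const { return _queue.front(); }

		/* move the current context to the end, keep its budget as is */
		void yield()
		{
			if (_queue.size() < 2)
				return;

			Sched_context const sc = _queue.front();
			_queue.pop_front();
			_queue.push_back(sc);
		}
};
```

A context that yields in this model resumes later with exactly the budget it
had left, rather than receiving a fresh time slice.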
|
|
|
|
|
|
Build system and tools
######################

Build system
============

Sometimes software requires custom tools that are used to generate source
code or other ingredients for the build process, for example IDL compilers.
Such tools won't be executed on top of Genode but on the host platform
during the build process. Hence, they must be compiled with the tool chain
installed on the host, not the Genode tool chain. The Genode build system
received new support for building such host tools as a side effect of building
a library or a target.

Even though it is possible to add the tool compilation step to a regular build
description file, it is recommended to introduce a dedicated pseudo library
for building such tools.
This way, the rules for building host tools are kept separate from rules that
refer to Genode programs. By convention, the pseudo library should be named
_<package>_host_tools_ and the host tools should be built at
_<build-dir>/tool/<package>/_. With _<package>_, we refer to the name of the
software package the tool belongs to, e.g., qt5 or mupdf. To build a tool
named _<tool>_, the pseudo library contains a custom make rule like the
following:

! $(BUILD_BASE_DIR)/tool/<package>/<tool>:
!   $(MSG_BUILD)$(notdir $@)
!   $(VERBOSE)mkdir -p $(dir $@)
!   $(VERBOSE)...build commands...

To let the build system trigger the rule, add the custom target to the
'HOST_TOOLS' variable:

! HOST_TOOLS += $(BUILD_BASE_DIR)/tool/<package>/<tool>

Once the pseudo library for building the host tools is in place, it can be
referenced by each target or library that relies on the respective tools via
the 'LIBS' declaration. The tool can be invoked by referring to
'$(BUILD_BASE_DIR)/tool/<package>/<tool>'.

For an example of using custom host tools, please refer to the mupdf package
found within the libports repository. During the build of the mupdf library,
two custom tools, fontdump and cmapdump, are invoked. The tools are built via
the _lib/mk/mupdf_host_tools.mk_ library-description file. The actual mupdf
library (_lib/mk/mupdf.mk_) has the pseudo library 'mupdf_host_tools' listed
in its 'LIBS' declaration and refers to the tools relative to
'$(BUILD_BASE_DIR)'.
|
|
|
|
|
|
Rump-kernel tools
=================

During our work on porting the cryptographic-device driver to Genode,
we identified the need for tools to process block-device and
file-system images on our development machines. For this purpose, we
added the rump-kernel-based tools, which are used for preparing and
populating disk images as well as creating cgd(4)-based cryptographic
disk devices.

The rump tool chain can be built (similar to building GCC for Genode)
by executing _tool/tool_chain_rump build_. Afterwards, the tools can
be installed via _tool/tool_chain_rump install_ to the default install
location _/usr/local/genode-rump_. As mentioned in Section
[Block-level encryption using CGD], we added the wrapper shell script
_tool/rump_ so that the tools need not be used directly.
|