2011-12-22 16:19:25 +01:00
|
|
|
|
|
|
|
|
|
|
|
===============================================
|
|
|
|
Release notes for the Genode OS Framework 10.02
|
|
|
|
===============================================
|
|
|
|
|
|
|
|
Genode Labs
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
After the release of the feature-packed version 9.11, we turned our attention
|
|
|
|
to improving the platform support of the framework. The current release 10.02
|
|
|
|
bears fruit of these efforts on several levels.
|
|
|
|
|
|
|
|
First, we are proud to announce the support for two new base platforms, namely
|
|
|
|
the NOVA hypervisor and the Codezero microkernel. These new kernels complement
|
|
|
|
the already supported base platforms Linux, L4/Fiasco, L4ka::Pistachio, and
|
|
|
|
OKL4. So why do we address so many different kernels instead of focusing our
|
|
|
|
efforts to one selected platform? Our observation is that different applications
|
|
|
|
pose different requirements on the kernel. Most kernels have a specific profile
|
|
|
|
with regard to security, hardware support, complexity, scheduling, resource
|
|
|
|
management, and licensing that may make them fit well for one application area
|
|
|
|
but not perfectly suited for a different use case. There is no single perfect
|
|
|
|
kernel and there doesn't need to be one. By using Genode, applications
|
|
|
|
developed for one kernel can be ported to all the other supported platforms with
|
|
|
|
a simple recompile. We believe that making Genode available on a new kernel is
|
|
|
|
beneficial for the kernel developers, application developers, and users alike.
|
|
|
|
For kernel developers, Genode brings additional workloads to stress-test their
|
|
|
|
kernel, and it extends the application area of the kernel. Application
|
|
|
|
developers can address several kernel platforms at once instead of tying their
|
|
|
|
programs to one particular platform. Finally, users and system integrators can
|
|
|
|
pick their kernel of choice for the problem at hand. Broadening the platform
|
|
|
|
support for Genode helps to make the framework more relevant.
|
|
|
|
|
|
|
|
Second, we introduced a new way for managing real-time priorities, which fits
|
|
|
|
perfectly with the recursive system structure of Genode. This clears the way to
|
|
|
|
multi-media and other real-time workloads that we target with our upcoming
|
|
|
|
work. We implemented the concept for the L4ka::Pistachio and OKL4 platforms.
|
|
|
|
With real-time priorities on OKL4, it is possible to run multiple instances of
|
|
|
|
the OKLinux kernel at the same time, each instance at a different priority.
|
|
|
|
|
|
|
|
Third, we vastly improved the existing framework, extended the ARM architecture
|
|
|
|
support to cover dynamic loading and the C runtime, introduced a new
|
|
|
|
thread-context management, added a plugin-concept to our C runtime, and
|
|
|
|
improved several device drivers.
|
|
|
|
|
|
|
|
Even though platform support is the main focus of this release, we introduced a
|
|
|
|
number of new features, in particular the initial port of the Python 2.6 script
|
|
|
|
interpreter.
|
|
|
|
|
|
|
|
|
|
|
|
NOVA hypervisor as new base platform
|
|
|
|
####################################
|
|
|
|
|
|
|
|
When we started the development of Genode in 2006 at the OS Group of the
|
|
|
|
Technische Universität Dresden, it was originally designated to be the user
|
|
|
|
land of a next-generation and to-be-developed new kernel called NOVA. Because
|
|
|
|
the kernel was not ready at that time, we had to rely on intermediate solutions
|
|
|
|
as kernel platform such as L4/Fiasco and Linux during development. These
|
|
|
|
circumstances led us to the extremely portable design that Genode has today and
|
|
|
|
motivated us to make Genode available on the whole family of L4 microkernels.
|
|
|
|
In December 2009, the day we waited for a long time had come. The first version
|
|
|
|
of NOVA was publicly released:
|
|
|
|
|
|
|
|
:Official website of the NOVA hypervisor:
|
|
|
|
[http://hypervisor.org]
|
|
|
|
|
|
|
|
Besides the novel and modern kernel interface, NOVA has a list of features that
|
|
|
|
sets it apart from most other microkernels, in particular support for
|
|
|
|
virtualization hardware, multi-processor support, and capability-based
|
|
|
|
security.
|
|
|
|
|
|
|
|
|
|
|
|
Why bringing Genode to NOVA?
|
|
|
|
============================
|
|
|
|
|
|
|
|
NOVA is an acronym for NOVA OS Virtualization Architecture. It stands for a
|
|
|
|
radically new approach of combining full x86 virtualization with microkernel
|
|
|
|
design principles. Because NOVA is a microkernelized hypervisor, the term
|
|
|
|
microhypervisor was coined. In its current form, it successfully addresses
|
|
|
|
three main challenges. First, how to consolidate a microkernel system-call API
|
|
|
|
with a hypercall API in such a way that the API remains orthogonal? The answer
|
|
|
|
to this question lies in NOVA's unique IPC interface. Second, how to implement
|
|
|
|
a virtual machine monitor outside the hypervisor without spoiling
|
|
|
|
performance? The Vancouver virtual machine monitor that runs on top NOVA proves
|
|
|
|
that a decomposition at this system level is not only feasible but can yield
|
|
|
|
high performance. Third, being a modern microkernel, NOVA set out to pursue a
|
|
|
|
capability-based security model, which is a challenge on its own.
|
|
|
|
|
|
|
|
Up to now, the NOVA developers were most concerned about optimizing and
|
|
|
|
evaluating NOVA for the execution of virtual machines, not so much about
|
|
|
|
running a fine-grained decomposed multi-server operating system. This is where
|
|
|
|
Genode comes into play. With our port of Genode to NOVA, we contribute the
|
|
|
|
workload to evaluate NOVA's kernel API against this use case. We are happy to
|
|
|
|
report that the results so far are overly positive.
|
|
|
|
|
|
|
|
At this point, we want to thank the main developers of NOVA Udo Steinberg and
|
|
|
|
Bernhard Kauer for making their exceptional work and documentation publicly
|
|
|
|
available, and for being so responsive to our questions. We also greatly
|
|
|
|
enjoyed the technical discussions we had and look forward to the future
|
|
|
|
evolution of NOVA.
|
|
|
|
|
|
|
|
|
|
|
|
Challenges
|
|
|
|
==========
|
|
|
|
|
|
|
|
From all currently supported base platforms of Genode, the port to NOVA was the
|
|
|
|
most venturesome effort. It is the first platform with kernel support for
|
|
|
|
capabilities and local names. That means no process except the kernel has
|
|
|
|
global knowledge. This raises a number of questions that seem extremely hard
|
|
|
|
to solve at the first sight. For example: There are no global IDs for threads
|
|
|
|
and other kernel objects. So how to address the destination for an IPC message?
|
|
|
|
Or another example: A thread does not know its own identity per se and there is
|
|
|
|
no system call similar to 'getpid' or 'l4_myself', not even a way to get a
|
|
|
|
pointer to a thread's own user-level thread-control block (UTCB). The UTCB,
|
|
|
|
however, is needed to invoke system calls. So how can a thread obtain its UTCB
|
|
|
|
in order to use system calls? The answers to these questions must be provided by
|
|
|
|
user-level concepts. Fortunately, Genode was designed for a capability kernel
|
|
|
|
right from the beginning so that we already had solutions to most of these
|
|
|
|
questions. In the following, we give a brief summary of the specifics of Genode
|
|
|
|
on NOVA:
|
|
|
|
|
|
|
|
* We maintain our own system-call bindings for NOVA ('base-nova/include/nova/')
|
|
|
|
derived from the NOVA specification. We put the bindings under MIT license
|
|
|
|
to encourage their use outside of Genode.
|
|
|
|
|
|
|
|
* Core runs directly as roottask on the NOVA hypervisor. On startup, core
|
|
|
|
maps the complete I/O port range to itself and implements debug output via
|
|
|
|
comport 0.
|
|
|
|
|
|
|
|
* Because NOVA does not allow rootask to have a BSS segment, we need a slightly
|
|
|
|
modified linker script for core (see 'src/platform/roottask.ld').
|
|
|
|
All other Genode programs use Genode's generic linker script.
|
|
|
|
|
|
|
|
* The Genode 'Capability' type consists of a portal selector expressing the
|
|
|
|
destination of a capability invocation and a global object ID expressing
|
|
|
|
the identity of the object when the capability is specified as an invocation
|
|
|
|
argument. In the latter case, the global ID is needed because of a limitation
|
|
|
|
of the current system-call interface. In the future, we are going to entirely
|
|
|
|
remove the global ID.
|
|
|
|
|
|
|
|
* Thread-local data such as the UTCB pointer is provided by the new thread
|
|
|
|
context management introduced with the Genode release 10.02. It enables
|
|
|
|
each thread to determine its thread-local data using the current stack
|
|
|
|
pointer.
|
|
|
|
|
|
|
|
* NOVA provides threads without time called local execution contexts (EC).
|
|
|
|
Local ECs are intended as server-side RPC handlers. The processing time
|
|
|
|
needed to perform RPC requests is provided by the client during the RPC call.
|
|
|
|
This way, RPC semantics becomes very similar to function call semantics with
|
|
|
|
regard to the accounting of CPU time. Genode already distinguishes normal
|
|
|
|
threads (with CPU time) and server-side RPC handlers ('Server_activation')
|
|
|
|
and, therefore, can fully utilize this elegant mechanism without changing the
|
|
|
|
Genode API.
|
|
|
|
|
|
|
|
* On NOVA, there are no IPC send or IPC receive operations. Hence, this part
|
|
|
|
of Genode's IPC framework cannot be implemented on NOVA. However, the
|
|
|
|
corresponding classes 'Ipc_istream' and 'Ipc_ostream' are never used directly
|
|
|
|
but only as building blocks for the actually used 'Ipc_client' and
|
|
|
|
'Ipc_server' classes. Compared with the other Genode base platforms, Genode's
|
|
|
|
API for synchronous IPC communication maps more directly onto the NOVA
|
|
|
|
system-call interface.
|
|
|
|
|
|
|
|
* The Lock implementation utilizes NOVA's semaphore as a utility to let a
|
|
|
|
thread block in the attempt to get a contended lock. In contrast to the
|
|
|
|
intuitive way of using one kernel semaphore for each user lock, we use only
|
|
|
|
one kernel semaphore per thread and the peer-to-peer wake-up mechanism we
|
|
|
|
introduced in the release 9.08. This has two advantages: First, a lock does
|
|
|
|
not consume a kernel resource, and second, the full semantics of the Genode
|
|
|
|
lock including the 'cancel-blocking' semantics are preserved.
|
|
|
|
|
|
|
|
* NOVA does not support server-side out-of-order processing of RPC requests.
|
|
|
|
This is particularly problematic in three cases: Page-fault handling, signal
|
|
|
|
delivery, and the timer service.
|
|
|
|
|
|
|
|
A page-fault handler can receive a page fault request only if the previous
|
|
|
|
page fault has been answered. However, if there is no answer for a
|
|
|
|
page-fault, the page-fault handler has to decide whether to reply with a
|
|
|
|
dummy answer (in this case, the faulter will immediately raise the same page
|
|
|
|
fault again) or block until the page-fault can be resolved. But in the latter
|
|
|
|
case, the page-fault handler cannot handle any other page faults. This is
|
|
|
|
unfeasible if there is only one page-fault handler in the system. Therefore,
|
|
|
|
we instantiate one pager per user thread. This way, we can block and unblock
|
|
|
|
individual threads when faulting.
|
|
|
|
|
|
|
|
Another classical use case for out-of-order RPC processing is signal
|
|
|
|
delivery. Each process has a signal-receiver thread that blocks at core's
|
|
|
|
signal service using an RPC call. This way, core can selectively deliver
|
|
|
|
signals by replying to one of these in-flight RPCs with a zero-timeout
|
|
|
|
response (preserving the fire-and-forget signal semantics). On NOVA however,
|
|
|
|
a server cannot have multiple RPCs in flight. Hence, we use a NOVA semaphore
|
|
|
|
shared between core and the signal-receiver thread to wakeup the
|
|
|
|
signal-receiver on the occurrence of a signal. Because a semaphore-up
|
|
|
|
operation does not carry payload, the signal has to perform a non-blocking
|
|
|
|
RPC call to core to pick up the details about the signal. Thanks to Genode's
|
|
|
|
RPC framework, the use of the NOVA semaphore is hidden in NOVA-specific stub
|
|
|
|
code for the signal interface and remains completely transparent at API
|
|
|
|
level.
|
|
|
|
|
|
|
|
For the timer service, we currently use one thread per client to avoid the need
|
|
|
|
for out-of-order RPC processing.
|
|
|
|
|
|
|
|
* Because NOVA provides no time source, we use the x86 PIT as user-level time
|
|
|
|
source, similar as on OKL4.
|
|
|
|
|
|
|
|
* On the current version of NOVA, kernel capabilities are delegated using IPC.
|
|
|
|
Genode supports this scheme by being able to marshal 'Capability' objects as
|
|
|
|
RPC message payload. In contrast to all other Genode base platforms where
|
|
|
|
the 'Capability' object is just plain data, the NOVA version must marshal
|
|
|
|
'Capability' objects such that the kernel translates the sender-local name to
|
|
|
|
the receiver-local name. This special treatment is achieved by overloading
|
|
|
|
the marshalling and unmarshalling operators of Genode's RPC framework. The
|
|
|
|
transfer of capabilities is completely transparent at API level and no
|
|
|
|
modification of existing RPC stub code was needed.
|
|
|
|
|
|
|
|
|
|
|
|
How to explore Genode on NOVA?
|
|
|
|
==============================
|
|
|
|
|
|
|
|
The Genode release 10.02 supports the NOVA pre-release version 0.1. You can
|
|
|
|
download the archive here:
|
|
|
|
|
|
|
|
:Download NOVA version 0.1:
|
|
|
|
[http://os.inf.tu-dresden.de/~us15/nova/nova-hypervisor-0.1.tar.bz2]
|
|
|
|
|
|
|
|
For building NOVA, please refer to the 'README' file contained in the archive.
|
|
|
|
Normally, a simple 'make' in the 'build/' subdirectory is all you need to
|
|
|
|
get a freshly baked 'hypervisor' binary.
|
|
|
|
|
|
|
|
The NOVA platform support for Genode resides in the 'base-nova/' repository.
|
|
|
|
To create a build directory prepared for compiling Genode for NOVA, you can use
|
|
|
|
the 'create_builddir' tool. From the top-level Genode directory, issue the
|
|
|
|
following command:
|
|
|
|
|
|
|
|
! ./tool/builddir/create_builddir nova_x86 GENODE_DIR=. BUILD_DIR=<dir>
|
|
|
|
|
|
|
|
This tool will create a fresh build directory at the location specified
|
|
|
|
as 'BUILD_DIR'. Provided that you have installed the
|
|
|
|
[http://genode.org/download/tool-chain - Genode tool chain], you can now build
|
|
|
|
Genode by using 'make' from within the new build directory.
|
|
|
|
|
|
|
|
Note that in contrast to most other kernels, the Genode build process does not
|
|
|
|
need to know about the source code of the kernel. This is because Genode
|
|
|
|
maintains its own system-call bindings for this kernel. The bindings reside in
|
|
|
|
'base-nova/include/nova/'.
|
|
|
|
|
|
|
|
NOVA supports multi-boot boot loaders such as GRUB, Pulsar, or gPXE. For
|
|
|
|
example, a GRUB configuration entry for booting the Genode demo scenario
|
|
|
|
with NOVA looks as follows, whereas 'genode/' is a symbolic link to the
|
|
|
|
'bin/' subdirectory of the Genode build directory and the 'config' file
|
|
|
|
is a copy of 'os/config/demo'.
|
|
|
|
|
|
|
|
! title Genode demo scenario
|
|
|
|
! kernel /hypervisor noapic
|
|
|
|
! module /genode/core
|
|
|
|
! module /genode/init
|
|
|
|
! module /config/demo/config
|
|
|
|
! module /genode/timer
|
|
|
|
! module /genode/ps2_drv
|
|
|
|
! module /genode/pci_drv
|
|
|
|
! module /genode/vesa_drv
|
|
|
|
! module /genode/launchpad
|
|
|
|
! module /genode/nitpicker
|
|
|
|
! module /genode/liquid_fb
|
|
|
|
! module /genode/nitlog
|
|
|
|
! module /genode/testnit
|
|
|
|
! module /genode/scout
|
|
|
|
|
|
|
|
Please note the 'noapic' argument for the NOVA hypervisor. This argument
|
|
|
|
enables the use of ordinary PIC IRQ numbers, as relied on by our current
|
|
|
|
PIT-based timer driver.
|
|
|
|
|
|
|
|
|
|
|
|
Limitations
|
|
|
|
===========
|
|
|
|
|
|
|
|
The current NOVA version of Genode is able to run the complete Genode demo
|
|
|
|
scenario including several device drivers (PIT, PS/2, VESA, PCI) and the GUI.
|
|
|
|
At version 0.1, however, NOVA is not yet complete and misses some features
|
|
|
|
needed to make Genode fully functional. The current limitations are:
|
|
|
|
|
|
|
|
* No real-time priority support: NOVA supports priority-based scheduling
|
|
|
|
but, in the current version, it allows each thread to create scheduling
|
|
|
|
contexts with arbitrary scheduling parameters. This makes it impossible
|
|
|
|
to enforce priority assignment from a central point as facilitated with
|
|
|
|
Genode's priority concept.
|
|
|
|
|
|
|
|
* No multi-processor support: NOVA supports multi-processor CPUs through
|
|
|
|
binding each execution context (ECs) to a particular CPU. Because everyone
|
|
|
|
can create ECs, every process could use multiple CPUs. However, Genode's API
|
|
|
|
devises a more restrictive way of allocating and assigning resources. In
|
|
|
|
short, physical resource usage should be arbitrated by core and the creation
|
|
|
|
of physical ECs should be performed by core only. However, Remote EC creation
|
|
|
|
is not yet supported by NOVA. Even though, multiple CPU can be used with
|
|
|
|
Genode on NOVA right now by using NOVA system calls directly, there is no
|
|
|
|
support at the Genode API level.
|
|
|
|
|
|
|
|
* Missing revoke syscall: NOVA is not be able to revoke memory mappings or
|
|
|
|
destroy kernel objects such as ECs and protection domains. In practice, this
|
|
|
|
means that programs and complete Genode subsystems can be started but not
|
|
|
|
killed. Because virtual addresses cannot be reused, code that relies on
|
|
|
|
'unmap' will produce errors. This is the case for the dynamic loader or
|
|
|
|
programs that destroy threads at runtime.
|
|
|
|
|
|
|
|
Please note that these issues are known and worked on by the NOVA developers.
|
|
|
|
So we expect Genode to become more complete on NOVA soon.
|
|
|
|
|
|
|
|
|
|
|
|
Codezero kernel as new base platform
|
|
|
|
####################################
|
|
|
|
|
|
|
|
Codezero is a microkernel primarily targeted to ARM-based embedded systems.
|
|
|
|
It is developed as an open-source project by a British company called B-Labs.
|
|
|
|
|
|
|
|
:B-Labs website:
|
2011-12-23 14:04:29 +01:00
|
|
|
[http://b-labs.com]
|
2011-12-22 16:19:25 +01:00
|
|
|
|
|
|
|
The Codezero kernel was first made publicly available in summer 2009. The
|
|
|
|
latest version, documentation, and community resources are available at the
|
|
|
|
project website:
|
|
|
|
|
|
|
|
:Codezero project website:
|
|
|
|
[http://l4dev.org]
|
|
|
|
|
|
|
|
As highlighted by the name of the project website, the design of the kernel is
|
|
|
|
closely related to the family of L4 microkernels. In short, the kernel provides
|
|
|
|
a minimalistic set of functionality for managing address spaces, threads, and
|
|
|
|
communication between threads, but leaves complicated policy and device access
|
|
|
|
to user-level components.
|
|
|
|
|
|
|
|
To put Codezero in relation to other L4 kernels, here is a quick summary on the
|
|
|
|
most important design aspects as implemented with the version 0.2, and how
|
|
|
|
our port of Genode relates to them:
|
|
|
|
|
|
|
|
* In the line of the original L4 interface, the kernel uses global name spaces
|
|
|
|
for kernel objects such as threads and address spaces.
|
|
|
|
|
|
|
|
* For the interaction between a user thread and the kernel, the concept of
|
|
|
|
user-level thread-control blocks (UTCB) is used. A UTCB is a small
|
|
|
|
thread-specific region in the thread's virtual address space, which is
|
|
|
|
always mapped. The access to the UTCB can never raise a page fault,
|
|
|
|
which makes it perfect for the kernel to access system-call arguments,
|
|
|
|
in particular IPC payload copied from/to user threads. In contrast to other
|
|
|
|
L4 kernels, the location of UTCBs within the virtual address space is managed
|
|
|
|
by the user land.
|
|
|
|
|
|
|
|
On Genode, core keeps track of the UTCB locations for all user threads.
|
|
|
|
This way, the physical backing store for the UTCB can be properly accounted
|
|
|
|
to the corresponding protection domain.
|
|
|
|
|
|
|
|
* The kernel provides three kinds of synchronous inter-process communication
|
|
|
|
(IPC): Short IPC carries payload in CPU registers only. Full IPC copies
|
|
|
|
message payload via the UTCBs of the communicating parties. Extended IPC
|
|
|
|
transfers a variable-sized message from/to arbitrary locations of the
|
|
|
|
sender/receiver address spaces. During an extended IPC, page faults may
|
|
|
|
occur.
|
|
|
|
|
|
|
|
Genode solely relies on extended IPC, leaving the other IPC mechanisms to
|
|
|
|
future optimizations.
|
|
|
|
|
|
|
|
* The scheduling of threads is based on hard priorities. Threads with the
|
|
|
|
same priority are executed in a round-robin fashion. The kernel supports
|
|
|
|
time-slice-based preemption.
|
|
|
|
|
|
|
|
Genode does not support Codezero priorities yet.
|
|
|
|
|
|
|
|
* The original L4 interface leaves the question on how to manage and account
|
|
|
|
kernel resources such as the memory used for page tables unanswered.
|
|
|
|
Codezero makes the accounting of such resources explicit, enables the
|
|
|
|
user-land to manage them in a responsible way, and prevent kernel-resource
|
|
|
|
denial-of-service problems.
|
|
|
|
|
|
|
|
* In contrast to the original L4.v2 and L4.x0 interfaces, the kernel provides
|
|
|
|
no time source in the form of IPC timeouts to the user land. A time source
|
|
|
|
must be provided by a user-space timer driver. Genode employs such a timer
|
|
|
|
services on all platforms so that it is not constricted by this limitation.
|
|
|
|
|
|
|
|
In several ways, Codezero goes beyond the known L4 interfaces. The most
|
|
|
|
noticeable addition is the support of so-called containers. A container is
|
|
|
|
similar to a virtual machine. It is an execution environment that holds a set
|
|
|
|
of physical resources such as RAM and devices. The number of containers and the
|
|
|
|
physical resources assigned to them are static and have to be defined at build
|
|
|
|
time. The code executed inside a container can roughly be classified by two
|
|
|
|
categories. First, there are static programs that require strong isolation from the
|
|
|
|
rest of the system but no classical operating-system infrastructure, for
|
|
|
|
example special-purpose telecommunication stacks or cryptographic functionality
|
|
|
|
of an embedded device. Second, there are kernel-like workloads, which use the L4
|
|
|
|
interface to substructure the container into address spaces, for example a
|
|
|
|
paravirtualized Linux kernel that uses Codezero address spaces to protect Linux
|
|
|
|
processes. Genode runs inside a container and facilitates Codezero's L4
|
|
|
|
interface to implement its multi-server architecture.
|
|
|
|
|
|
|
|
The second major addition is the use of a quite interesting flavor of a
|
|
|
|
capability concept to manage the authorization of processes to access system
|
|
|
|
resources and system calls. In contrast to most current approaches, Codezero
|
|
|
|
does not attempt to localize the naming of physical objects such as
|
|
|
|
address-space IDs and thread ID. So a capability is not referred to via a local
|
|
|
|
name but a global name. However, for delegating authorization throughout the
|
|
|
|
system, the capability approach is employed. A process that possesses a capability
|
|
|
|
to an object can deal with the object. It can further delegate this access
|
|
|
|
right to another party (to which it holds a capability). In a way, this
|
|
|
|
approach keeps the kernel interface true to the original L4 interface but
|
|
|
|
provides a much stronger concept for access control. However, it is important
|
|
|
|
to point out that the problem of ambient authority is not (yet) addressed by
|
|
|
|
this concept. If a capability is not used directly but specified as an argument
|
|
|
|
to a remote service, this argument is passed as a plain value not
|
|
|
|
protected by the kernel. Because the identity of the referenced object can be
|
|
|
|
faked by the client, the server has to check the plausibility of the argument.
|
|
|
|
For the server, however, this check is difficult/impossible because it has no
|
|
|
|
way to know whether the client actually possesses the capability it is talking
|
|
|
|
about.
|
|
|
|
|
|
|
|
The current port of Genode to Codezero does not make use of the capability
|
|
|
|
concept for fine-grained communication control, yet. As with the other L4
|
|
|
|
kernels, each object is identified by a unique ID allocated by a core service.
|
|
|
|
There is no mechanism in place to prevent faked object IDs.
|
|
|
|
|
|
|
|
|
|
|
|
:Thanks:
|
|
|
|
We want to thank the main developer of Codezero Bahadir Balban for his great
|
|
|
|
responsiveness to our feature requests and questions. Without his help, the
|
|
|
|
port would have taken much more effort. We hope that our framework will be of
|
|
|
|
value to the Codezero community.
|
|
|
|
|
|
|
|
|
|
|
|
Using Genode with Codezero
|
|
|
|
==========================
|
|
|
|
|
|
|
|
The port of Genode is known to work with the devel branch of Codezero version
|
|
|
|
0.2 as of 2010-02-19.
|
|
|
|
|
|
|
|
To download the Codezero source code from the official source-code repository,
|
|
|
|
you can use the following commands:
|
|
|
|
|
|
|
|
!git clone git://git.l4dev.org/codezero.git
|
|
|
|
!git checkout -b devel --track origin/devel
|
|
|
|
|
|
|
|
In addition to downloading the source code, you will need to apply the small
|
|
|
|
patch 'base-codezero/lcd.patch' to the Codezero kernel to enable the device
|
|
|
|
support for the LCD display. Go to the 'codezero.git/' directory and issue:
|
|
|
|
|
|
|
|
!patch -p1 < <genode-dir>/base-codezero/lcd.patch
|
|
|
|
|
|
|
|
For a quick start with Codezero, please follow the "Getting Started with the
|
|
|
|
Codezero Development" guide, in particular the installation of the tool chain:
|
|
|
|
|
|
|
|
:Getting started with Codezero:
|
|
|
|
[http://www.l4dev.org/getting_started]
|
|
|
|
|
|
|
|
The following steps guide you through building and starting Genode on Codezero
|
|
|
|
using the Versatilepb platform as emulated by Qemu.
|
|
|
|
|
|
|
|
# Create a Genode build directory for the Codezero/Versatilepb platform.
|
|
|
|
Go to the Genode directory and use the following command where '<build-dir>'
|
|
|
|
is the designated location of the new Genode build directory and
|
|
|
|
'<codezero-src-dir>' is the 'codezero.git/' directory with the Codezero
|
|
|
|
source tree, both specified as absolute directories.
|
|
|
|
! ./tool/builddir/create_builddir codezero_versatilepb \
|
|
|
|
! GENODE_DIR=. \
|
|
|
|
! BUILD_DIR=<genode-build-dir> \
|
|
|
|
! L4_DIR=<codezero-src-dir>
|
|
|
|
|
|
|
|
With the build directory created, Genode targets can immediately be
|
|
|
|
compiled for Codezero. For a quick test, go to the new build directory and
|
|
|
|
issue:
|
|
|
|
! make init
|
|
|
|
|
|
|
|
In addition to being a Genode build directory, the directory is already
|
|
|
|
prepared to be used as Codezero container. In particular, it holds a
|
|
|
|
'SConstruct' file that will be called by the Codezero build system. In this
|
|
|
|
file, you will find the list of Genode targets to be automatically built when
|
|
|
|
executing the Codezero build process. Depending on your work flow, you may
|
|
|
|
need to adapt this file.
|
|
|
|
|
|
|
|
# To import the Genode container into the Codezero configuration system,
|
|
|
|
go to the 'codezero.git/' directory and use the following command:
|
|
|
|
|
|
|
|
! ./scripts/baremetal/baremetal_add_container.py \
|
|
|
|
! -a -i Genode -s <genode-build-dir>
|
|
|
|
|
|
|
|
# Now, we can add and configure a new instance of this container via the
|
|
|
|
Codezero configuration system:
|
|
|
|
! ./configure.py
|
|
|
|
|
|
|
|
Using the interactive configuration tool, select to use a single container
|
|
|
|
and set up the following values for this bare-metal container, choose a
|
|
|
|
sensible 'Container Name' (e.g., 'genode0') and select the 'Genode' entry in
|
|
|
|
the 'Baremetal Project' menu.
|
|
|
|
|
|
|
|
:Default pager parameters:
|
|
|
|
! 0x40000 Pager LMA
|
|
|
|
! 0x100000 Pager VMA
|
|
|
|
These values are important because they are currently hard-wired in the
|
|
|
|
linker script used by Genode. If you need to adopt these values, make
|
|
|
|
sure to also update the Genode linker script located at
|
|
|
|
'base-codezero/src/platform/genode.ld'.
|
|
|
|
|
|
|
|
:Physical Memory Regions:
|
|
|
|
! 1 Number of Physical Regions
|
|
|
|
! 0x40000 Physical Region 0 Start Address
|
|
|
|
! 0x4000000 Physical Region 0 End Address
|
|
|
|
We only use 64MB of memory. The physical memory between 0 and 0x40000 is
|
|
|
|
used by the kernel.
|
|
|
|
|
|
|
|
:Virtual Memory Regions:
|
|
|
|
! 1 Number of Virtual Regions
|
|
|
|
! 0x0 Virtual Region 0 Start Address
|
|
|
|
! 0x50000000 Virtual Region 0 End Address
|
|
|
|
It is important to choose the end address such that the virtual memory
|
|
|
|
covers the thread context area. The context area is defined at
|
|
|
|
'base/include/base/thread.h'.
|
|
|
|
|
|
|
|
:Container Devices (Capabilities):
|
|
|
|
Enable the LCD display in the 'CLCD Menu'.
|
|
|
|
|
|
|
|
The configuration system will copy the Genode container template to
|
|
|
|
'codezero.git/conts/genode0'. Hence, if you need to adjust the container's
|
|
|
|
'SConscript' file, you need to edit 'codezero.git/conts/genode.0/SConscript'.
|
|
|
|
The original Genode build directory is only used as template when creating
|
|
|
|
a new Codezero container but it will never be looked at by the Codezero build
|
|
|
|
system.
|
|
|
|
|
|
|
|
# After completing the configuration, it is time to build both Codezero and
|
|
|
|
Genode. Thanks to the 'SConscript' file in the Genode container, the Genode
|
|
|
|
build process is executed automatically:
|
|
|
|
! ./build.py
|
|
|
|
|
|
|
|
You will find the end result of the build process at
|
|
|
|
! ./build/final.elf
|
|
|
|
|
|
|
|
# Now you can try out Genode on Qemu:
|
|
|
|
! qemu-system-arm -s -kernel build/final.elf \
|
|
|
|
! -serial stdio -m 128 -M versatilepb &
|
|
|
|
|
|
|
|
The default configuration starts the nitpicker GUI server and the launchpad
|
|
|
|
application. The versatilepb platform driver is quite limited. It does
|
|
|
|
support the LCD display as emulated by Qemu but no user input, yet.
|
|
|
|
|
|
|
|
|
|
|
|
Limitations
|
|
|
|
===========
|
|
|
|
|
|
|
|
At the current stage, the Genode version for Codezero is primarily geared
|
|
|
|
towards the developers of Codezero as a workload to stress their kernel. It
|
|
|
|
still has a number of limitations that would affect the real-world use:
|
|
|
|
|
|
|
|
* Because the only platform supported out of the box by the official Codezero
|
|
|
|
source tree is the ARM-based Versatilebp board, Genode is currently tied to
|
|
|
|
this hardware platform. When Codezero moves beyond this particular platform,
|
|
|
|
we will add a modular concept for platform support packages to Genode.
|
|
|
|
|
|
|
|
* The current timer driver at 'os/src/drivers/timer/codezero/' is a dummy
|
|
|
|
driver that just yields the CPU time instead of blocking. It is not
|
|
|
|
suitable as time source.
|
|
|
|
|
|
|
|
* The versatilepb platform driver at 'os/src/drivers/platform/versatilepb/'
|
|
|
|
does only support the LCD display as provided by Qemu but it was not tested on
|
|
|
|
real hardware. Because Codezero does not yet allow the assignment of the
|
|
|
|
Versatilepb PS/2 controller to a container, the current user-input driver is
|
|
|
|
just a dummy.
|
|
|
|
|
|
|
|
* The lock implementation is based on a simple spinlock using an atomic
|
|
|
|
compare-exchange operation, which is implemented via Codezero's kernel mutex.
|
|
|
|
The lock works and is safe but it has a number of drawbacks with regard to
|
|
|
|
fairness, efficiency, and its interaction with scheduling.
|
|
|
|
|
|
|
|
* Core's IRQ service is not yet implemented because the IRQ-handling interface
|
|
|
|
of Codezero is still in flux.
|
|
|
|
|
|
|
|
* Because we compile Genode with the same tool chain (Codesourcery ARM tool
|
|
|
|
chain) as used for Codezero, there are still subtle differences in the
|
|
|
|
linker scripts, making Genode's dynamic linker not yet functional on
|
|
|
|
Codezero.
|
|
|
|
|
|
|
|
* Even though Codezero provides priority-based scheduling, Genode does not
|
|
|
|
allow assigning priorities to Codezero processes, yet.
|
|
|
|
|
|
|
|
* Currently, all Genode boot modules are linked as binary data against core,
|
|
|
|
which is then loaded as single image into a container. For this reason, core
|
|
|
|
must be build after all binaries. This solution is far from being convenient
|
|
|
|
because changing the list of boot modules requires changes in core's
|
|
|
|
'platform.cc' and 'target.mk' file.
|
|
|
|
|
|
|
|
|
|
|
|
New thread-context management
|
|
|
|
#############################
|
|
|
|
|
|
|
|
With the current release, we introduced a new stack management concept that is
|
|
|
|
now consistently used on all Genode base platforms. Because the new concept
|
|
|
|
does not only cover the stack allocation but also other thread-specific context
|
|
|
|
information, we speak of thread-context management. The stack of a Genode
|
|
|
|
thread used to be a member of the 'Thread' object with its size specified as
|
|
|
|
template argument. This stack-allocation scheme was chosen because it was easy
|
|
|
|
to implement on all base platforms and is straight-forward to use. But there
|
|
|
|
are two problems with this approach.
|
|
|
|
|
|
|
|
First, the implementation of thread-local storage (TLS) is either platform
|
|
|
|
dependent or costly. There are kernels with support for TLS, mostly by the
|
|
|
|
means of a special register that holds a pointer to a thread-local data
|
|
|
|
structure (e.g., the UTCB pointer). But using such a facility implicates
|
|
|
|
platform-specific code on Genode's side. For kernels with no TLS support, we
|
|
|
|
introduced a unified TLS concept that registers stacks alongside with
|
|
|
|
thread-local data at a thread registry. To access the TLS of a thread, this
|
|
|
|
thread registry can be queried with the current stack pointer of a caller.
|
|
|
|
This query, however, is costly because it traverses a data structure. Up to
|
|
|
|
now, we accepted these costs because native Genode code did not use TLS. TLS
|
|
|
|
was only needed for code ported from the Linux kernel. However, with NOVA,
|
|
|
|
there is now a kernel that requires the user land to provide a fast TLS
|
|
|
|
mechanism to look up the current thread's UTCB in order to perform system
|
|
|
|
calls. On this kernel, a fast TLS mechanism is important.
|
|
|
|
|
|
|
|
The second disadvantage of the original stack allocation scheme is critical
|
|
|
|
to all base platforms: Stack overflows could not be detected. For each stack,
|
|
|
|
the developer had to specify a stack size. A good estimation for this value
|
|
|
|
is hard, in particular when calling functions of library code with unknown
|
|
|
|
stack usage patterns. If chosen too small, the stack could overflow, corrupting
|
|
|
|
the data surrounding the 'Thread' object. Such errors are extremely cumbersome
|
|
|
|
to detect. If chosen too large, memory gets wasted.
|
|
|
|
|
|
|
|
For storing thread-specific data (called thread context) such as the stack and
|
|
|
|
thread-local data, we have now introduced a dedicated portion of the virtual address
|
|
|
|
space. This portion is called thread-context area. Within the thread-context
|
|
|
|
area, each thread has a fixed-sized slot, a thread context. The layout of each
|
|
|
|
thread context looks as follows
|
|
|
|
|
2012-01-06 20:44:30 +01:00
|
|
|
[image thread_context]
|
2011-12-22 16:19:25 +01:00
|
|
|
|
|
|
|
; lower address
|
|
|
|
; ...
|
|
|
|
; ============================ <- aligned at 'CONTEXT_VIRTUAL_SIZE'
|
|
|
|
;
|
|
|
|
; empty
|
|
|
|
;
|
|
|
|
; ----------------------------
|
|
|
|
;
|
|
|
|
; stack
|
|
|
|
; (top) <- initial stack pointer
|
|
|
|
; ---------------------------- <- address of 'Context' object
|
|
|
|
; additional context members
|
|
|
|
; ----------------------------
|
|
|
|
; UTCB
|
|
|
|
; ============================ <- aligned at 'CONTEXT_VIRTUAL_SIZE'
|
|
|
|
; ...
|
|
|
|
; higher address
|
|
|
|
|
|
|
|
On some platforms, a user-level thread-control block (UTCB) area contains
|
|
|
|
data shared between the user-level thread and the kernel. It is typically
|
|
|
|
used for transferring IPC message payload or for system-call arguments.
|
|
|
|
The additional context members are a reference to the corresponding
|
|
|
|
'Thread_base' object and the name of the thread.
|
|
|
|
|
|
|
|
The thread context is a virtual memory area, initially not backed by real
|
|
|
|
memory. When a new thread is created, an empty thread context gets assigned
|
|
|
|
to the new thread and populated with memory pages for the stack and the
|
|
|
|
additional context members. Note that this memory is allocated from the RAM
|
|
|
|
session of the process environment and gets not accounted for when using the
|
|
|
|
'sizeof()' operand on a 'Thread_base' object.
|
|
|
|
|
|
|
|
This way, stack overflows are immediately detected because the corresponding
|
|
|
|
thread produces a page fault within the thread-context area. Data corruption
|
|
|
|
can never occur.
|
|
|
|
|
|
|
|
We implemented this concept for all base platforms and thereby made the
|
|
|
|
stack-overflow protection and the fast TLS feature available to all platforms.
|
|
|
|
On L4ka::Pistachio, OKL4, L4/Fiasco, Codezero, and NOVA, the thread-context
|
|
|
|
area is implemented as a managed dataspace. This ensures that the unused
|
|
|
|
virtual memory of the sparsely populated thread-context area is never selected
|
|
|
|
for attaching regular dataspaces into the process' address space. On Linux, the
|
|
|
|
thread-context area is implemented via a fixed offset added to the local
|
|
|
|
address for the 'mmap' system call. So on this platform, there is no protection
|
|
|
|
in place to prevent regular dataspaces from being attached within the
|
|
|
|
thread-context area.
|
|
|
|
|
|
|
|
Please note that in contrast to the original 'Thread' object, which contained
|
|
|
|
the stack, the new version does not account for the memory consumed by the
|
|
|
|
stack when using the 'sizeof()' operator. This has to be considered for
|
|
|
|
multi-threaded servers that want to account client-specific threads to the
|
|
|
|
memory donated by the corresponding client.
|
|
|
|
|
|
|
|
|
|
|
|
Real-time priorities
|
|
|
|
####################
|
|
|
|
|
|
|
|
There are two application areas generally regarded as predestined for
|
|
|
|
microkernels, high security and real time. Whereas the development of Genode
|
|
|
|
was primarily focused on the former application area so far, we observe growing
|
|
|
|
interest in using the framework for soft real-time applications, in particular
|
|
|
|
multi-media workload. Most of Genode's supported base platforms already provide
|
|
|
|
some way of real-time scheduling support, hard priorities with round-robin
|
|
|
|
scheduling of threads with the same priority being the most widely used
|
|
|
|
scheduling scheme. What has been missing until now was a way to access these
|
|
|
|
facilities through Genode's API or configuration interfaces. We deferred the
|
|
|
|
introduction for such interfaces for a very good reason: It is hard to get
|
|
|
|
right. Even though priority-based scheduling is generally well understood, the
|
|
|
|
combination with dynamic workload where differently prioritized processes are
|
|
|
|
started and deleted at runtime and interact with each other is extremely hard
|
|
|
|
to manage. At least, this had been our experience with building complex
|
|
|
|
scenarios with the Dresden real-time operating system (DROPS). Combined with
|
|
|
|
optimizations such as time-slice donating IPC calls, the behaviour of complex
|
|
|
|
scenarios tended to become indeterministic and hardly possible to capture.
|
|
|
|
|
|
|
|
Genode imposes an additional requirement onto all its interfaces. They have to
|
|
|
|
support the recursive structure of the system. Only if any subsystem of
|
|
|
|
processes is consistent on its own, it is possible to replicate it at an arbitrary
|
|
|
|
location within Genode's process tree. Assigning global priorities to single
|
|
|
|
processes, however, would break this condition. For example, non-related
|
|
|
|
subsystems could interfere with each other if both used the same range of
|
|
|
|
priorities for priority-based synchronization within the respective subsystem.
|
|
|
|
If executed alone, each of those subsystems would run perfectly but integrated
|
|
|
|
into one setup, they would interfere with each other, yielding unpredictable
|
|
|
|
results. We have now found a way to manage real-time priorities such that the
|
|
|
|
recursive nature Genode is not only preserved but actually put to good use.
|
|
|
|
|
|
|
|
|
|
|
|
Harmonic priority-range subdivision
|
|
|
|
===================================
|
|
|
|
|
|
|
|
We call Genode's priority management concept harmonic priority-range
|
|
|
|
subdivision. Priorities are not assigned to activities as global values but
|
|
|
|
they can be virtualized at each node in Genode's process tree. At startup time,
|
|
|
|
core assigns the right to use the complete range of priorities to the init
|
|
|
|
process. Init is free to assign those priorities to any of the CPU sessions it
|
|
|
|
creates at core, in particular to the CPU sessions it creates on behalf its
|
|
|
|
children and their grandchildren. Init, however, neither knows nor is it
|
|
|
|
interested in the structure of its child subsystems. It only wants to make sure
|
|
|
|
that one subsystem is prioritized over another. For this reason, it uses the
|
|
|
|
most significant bits of the priority range to express its policy but leaves
|
|
|
|
the lesser significant bits to be defined by the respective subsystems. For
|
|
|
|
example, if init wants to enforce that one subsystem has a higher priority than
|
|
|
|
all others, it would need to distinguish two priorities. For each CPU-session
|
|
|
|
request originating from one of its clients, it would diminish the supplied
|
|
|
|
priority argument by shifting the argument by one bit to the right and
|
|
|
|
replacing the most significant bit with its own policy. Effectively, init
|
|
|
|
divides its own range of priorities into two subranges. Both subranges, in
|
|
|
|
turn, can be managed the same way by the respective child. The concept works
|
|
|
|
recursively.
|
|
|
|
|
|
|
|
|
|
|
|
Implementation
|
|
|
|
==============
|
|
|
|
|
|
|
|
The implementation consists of two parts. First, there is the actual management
|
|
|
|
implemented as part of the parent protocol. For each CPU session request,
|
|
|
|
the parent evaluates the priority argument and supplements its own policy.
|
|
|
|
At this management level, a logical priority range of 0...2^16 is used to pass
|
|
|
|
the policy arguments from child to parent. A lower value represents a higher
|
|
|
|
priority. The second part is the platform-specific code in core that translates
|
|
|
|
priority arguments into kernel priorities and assigns them to physical
|
|
|
|
threads. Because the typical resolution for priority values is lower than 2^16,
|
|
|
|
this quantization can lead to the loss of the lower-significant priority bits.
|
|
|
|
In this case, differently prioritized CPU sessions can end up using the same
|
|
|
|
physical priority. For this reason, we recommend to not use priorities for
|
|
|
|
synchronization purposes.
|
|
|
|
|
|
|
|
|
|
|
|
Usage
|
|
|
|
=====
|
|
|
|
|
|
|
|
The assignment of priorities to subsystems is done via two additional tags in
|
|
|
|
init's 'config' file. The '<priolevels>' tag specifies how many priority levels
|
|
|
|
are distinguished by the init instance. The value must be a power of two. Each
|
|
|
|
'<start>' node can contain an optional '<priority>' declaration, which holds a
|
|
|
|
value between -priolevels + 1 and 0. This way, priorities can only be lowered,
|
|
|
|
never alleviated above init's priority. If no '<priority>' tag is specified,
|
|
|
|
the default value of 0 (init's own priority) is used. For an example, here is a
|
|
|
|
'config' file starting several nested instances of the init process using
|
|
|
|
different priority subranges.
|
|
|
|
|
|
|
|
! <config>
|
|
|
|
! <!--
|
|
|
|
! divides priority range 1..128 into
|
|
|
|
! 65..128 (prio 0)
|
|
|
|
! 1..64 (prio -1)
|
|
|
|
! -->
|
|
|
|
! <priolevels>2</priolevels>
|
|
|
|
! <start>
|
|
|
|
! <filename>init</filename>
|
|
|
|
! <priority>0</priority>
|
|
|
|
! <ram_quota>5M</ram_quota>
|
|
|
|
! <config>
|
|
|
|
! <!--
|
|
|
|
! divides priority range 65..128 into
|
|
|
|
! 113..128 (prio 0)
|
|
|
|
! 97..112 (prio -1)
|
|
|
|
! 81..96 (prio -2)
|
|
|
|
! 65..80 (prio -3)
|
|
|
|
! -->
|
|
|
|
! <priolevels>4</priolevels>
|
|
|
|
! <start>
|
|
|
|
! <filename>init</filename>
|
|
|
|
! <!-- results in platform priority 112 -->
|
|
|
|
! <priority>-1</priority>
|
|
|
|
! <ram_quota>512K</ram_quota>
|
|
|
|
! </start>
|
|
|
|
! <start>
|
|
|
|
! <filename>init</filename>
|
|
|
|
! <!-- results in platform priority 96 -->
|
|
|
|
! <priority>-2</priority>
|
|
|
|
! <ram_quota>2M</ram_quota>
|
|
|
|
! <config>
|
|
|
|
! <start>
|
|
|
|
! <filename>init</filename>
|
|
|
|
! <ram_quota>768K</ram_quota>
|
|
|
|
! </start>
|
|
|
|
! </config>
|
|
|
|
! </start>
|
|
|
|
! </config>
|
|
|
|
! </start>
|
|
|
|
! <start>
|
|
|
|
! <filename>init</filename>
|
|
|
|
! <!-- results in platform priority 64 -->
|
|
|
|
! <priority>-1</priority>
|
|
|
|
! <ram_quota>6M</ram_quota>
|
|
|
|
! <config></config>
|
|
|
|
! </start>
|
|
|
|
! </config>
|
|
|
|
|
|
|
|
On kernels that support priorities and where priority 128 is used as priority
|
|
|
|
limit (this is the case for OKL4 and Pistachio), this configuration should
|
|
|
|
result in the following assignments of physical priorities to process-tree
|
|
|
|
nodes:
|
|
|
|
|
2012-01-06 20:44:30 +01:00
|
|
|
[image priorities]
|
2011-12-22 16:19:25 +01:00
|
|
|
|
|
|
|
The red marker shows the resulting priority of the corresponding process.
|
|
|
|
|
|
|
|
; 128 : core
|
|
|
|
; 128 : core->init
|
|
|
|
; 128 : core->init->init
|
|
|
|
; 112 : core->init->init->init
|
|
|
|
; 98 : core->init->init->init.2
|
|
|
|
; 98 : core->init->init->init.2->init
|
|
|
|
; 64 : core->init->init.2
|
|
|
|
|
|
|
|
With Genode 10.02, we implemented the described concept for the OKL4 and
|
|
|
|
L4ka::Pistachio base platforms first. On both platforms, a priority range of 0
|
|
|
|
to 128 is used.
|
|
|
|
|
|
|
|
On L4/Fiasco, we were not yet able to apply this concept because on this
|
|
|
|
kernel, the used lock implementation is based on a yielding spinlock.
|
|
|
|
If a thread at a high priority would attempt to acquire a contended lock,
|
|
|
|
it would infinitely yield the CPU to itself, letting all other threads in
|
|
|
|
the system starve. In order to make real-time priorities usable on L4/Fiasco
|
|
|
|
we would need to change the lock first.
|
|
|
|
|
|
|
|
|
|
|
|
Base framework
|
|
|
|
##############
|
|
|
|
|
|
|
|
Read-only dataspaces
|
|
|
|
====================
|
|
|
|
|
|
|
|
Until now, we have not handled ROM dataspaces any different from RAM dataspaces
|
|
|
|
in core except for their predefined content. With the Genode workload becoming
|
|
|
|
more complex, ROM files tend to get shared between different processes and need
|
|
|
|
protection. Now, dataspaces of ROM modules are always mapped read-only.
|
|
|
|
|
|
|
|
Enabled the use of super pages by default
|
|
|
|
=========================================
|
|
|
|
|
|
|
|
Since release 9.08, we support super pages as an experimental feature. Now,
|
|
|
|
this feature is enabled by default on L4/Fiasco, L4ka::Pistachio, and NOVA.
|
|
|
|
|
|
|
|
Enabled managed dataspaces by default
|
|
|
|
=====================================
|
|
|
|
|
|
|
|
We originally introduced managed dataspaces with the release 8.11. However,
|
|
|
|
because we had no pressing use cases, it remained a experimental feature
|
|
|
|
until now. The new thread-context management introduced with this release
|
|
|
|
prompted us to promote managed dataspaces to become a regular feature.
|
|
|
|
Originally there was one problem holding us back from this decision, which
|
|
|
|
was the handling of cyclic references between nested dataspaces. However,
|
|
|
|
we do now simply limit the number of nesting levels to a fixed value.
|
|
|
|
|
|
|
|
Streamlined server framework
|
|
|
|
============================
|
|
|
|
|
|
|
|
We removed the 'add_activation()' functionality from the server and pager
|
|
|
|
libraries because on all platforms server activations and entry points have
|
|
|
|
a one-to-one relationship. This API was originally intended to support
|
|
|
|
platforms that are able to trigger one of many worker threads via a single
|
|
|
|
entry point. This was envisioned by an early design of NOVA. However, no
|
|
|
|
kernel (including NOVA) supports such a feature as of today.
|
|
|
|
|
|
|
|
Furthermore, we added a dedicated 'Pager_capability' type. On most
|
|
|
|
platforms, a pager is simply a thread. So using a 'Thread_capability' as type
|
|
|
|
for the 'Pager_capability' was sufficient. On NOVA, however, a pager is not
|
|
|
|
necessarily a thread. So we need to reflect this difference in the types.
|
|
|
|
|
|
|
|
PD session interface
|
|
|
|
====================
|
|
|
|
|
|
|
|
To support capability kernels with support for local names, it is not
|
|
|
|
sufficient to provide the parent capability to a new child by passing a plain
|
|
|
|
data argument to the new child during ELF loading anymore. We also need to tell
|
|
|
|
the kernel about the delegated right of the child to talk to its parent. This is
|
|
|
|
achieved using the new 'assign_parent' function of the PD session interface.
|
|
|
|
This function allows the creator of a new process to register the parent
|
|
|
|
capability.
|
|
|
|
|
|
|
|
Singleton services
|
|
|
|
==================
|
|
|
|
|
|
|
|
There are services, in particular device drivers, that support only one session
|
|
|
|
at a time. This characteristic was not easy to express in the framework.
|
|
|
|
Consequently, such services tended to handle the case of a second session
|
|
|
|
request inconsistently. We have now enhanced the 'Root_component' template with
|
|
|
|
a policy parameter to 'Root_component' that allows the specification of a
|
|
|
|
session-creation policy. The most important policy is whether a service can
|
|
|
|
have a single or multiple clients.
|
|
|
|
[http://genode.org/documentation/api/static_content/code/base/include/root/component - See the improved template...]
|
|
|
|
|
|
|
|
Out-of-order RPC replies
|
|
|
|
========================
|
|
|
|
|
|
|
|
In the previous release, we introduced a transitional API for supporting
|
|
|
|
out-of-order RPC replies. This API is currently used by the timer and
|
|
|
|
signal services but is declared deprecated. The original implementation
|
|
|
|
used a blocking send operation to deliver replies, which is not desired
|
|
|
|
and can cause infinite blocking times in the presence of misbehaving clients.
|
|
|
|
Therefore, we changed the implementation to send explicit replies with no
|
|
|
|
timeout. Thanks to Frank Kaiser for pointing out this issue.
|
|
|
|
|
|
|
|
|
|
|
|
Operating-system services and libraries
|
|
|
|
#######################################
|
|
|
|
|
|
|
|
Python scripting
|
|
|
|
================
|
|
|
|
|
|
|
|
We have ported a minimal Python 2.6.4 interpreter to Genode. The port is
|
|
|
|
provided with the 'libports' repository. It is based on the official
|
|
|
|
Python code available from the website:
|
|
|
|
|
|
|
|
:Python website:
|
|
|
|
[http://www.python.org]
|
|
|
|
|
|
|
|
To fetch the upstream Python source code, call 'make prepare' from within the
|
|
|
|
'libports' directory. To include Python in your build process, add 'libports'
|
|
|
|
to your 'build.conf' file.
|
|
|
|
|
|
|
|
A test program for the script interpreter is provided at
|
|
|
|
'libports/src/test/python'. When building this test program, a shared library
|
|
|
|
'python.lib.so' will be generated. A sample Genode configuration
|
|
|
|
('config_sample') file that starts a Python script can be found within this
|
|
|
|
directory. If you are not using Linux as a Genode base platform, do not forget
|
|
|
|
to add 'python.lib.so' to your boot module list.
|
|
|
|
|
|
|
|
We regard this initial port as the first step to make a complete Python
|
|
|
|
runtime. At the current stage, there is support for 'Rom_session' Python
|
|
|
|
scripts to serve basic scripting needs, currently geared towards automated
|
|
|
|
testing. Modules and standard modules are not yet supported.
|
|
|
|
|
|
|
|
|
|
|
|
Plugin-interface for the C library
|
|
|
|
==================================
|
|
|
|
|
|
|
|
The recent addition of the lwIP stack to Genode stimulated our need to make the
|
|
|
|
C runtime extensible by providing multiple back ends, lwIP being one of them.
|
|
|
|
Therefore, we introduced a libc-internal plugin interface, which is able to
|
|
|
|
dispatch libc calls to one of potentially many plugins. The plugin interface
|
|
|
|
covers the most used file operations and a few selected networking functions.
|
|
|
|
By default, if no plugin is used, those functions point to dummy
|
|
|
|
implementations. If however, a plugin is linked against a libc-using program,
|
|
|
|
calls to 'open' or 'socket' are directed to the registered plugins, resulting
|
|
|
|
in plugin-specific file handles. File operations on such a file handle are then
|
|
|
|
dispatched by the corresponding plugin.
|
|
|
|
|
|
|
|
The first functional plugin is the support for lwIP. This makes it possible to
|
|
|
|
compile BSD-socket based network code to use lwIP on Genode. Just add the
|
|
|
|
following declaration in your 'target.mk':
|
|
|
|
|
|
|
|
! LIBS += libc libc_lwip lwip
|
|
|
|
|
|
|
|
The 'libc' library is the generic C runtime, 'lwip' is the raw lwIP stack, and
|
|
|
|
'libc_lwip' is the lwip plugin for the C runtime - the glue between 'lwip' and
|
|
|
|
'libc'. The initialization of lwip is not yet part of the 'lwip' plugin.
|
|
|
|
|
|
|
|
:Limitations:
|
|
|
|
We expand the libc-plugin interface on a per case basis. Please refer to
|
|
|
|
'libc/include/libc-plugin/plugin.h' to obtain the list of currently supported
|
|
|
|
functions. Please note that 'select' is not yet supported.
|
|
|
|
|
|
|
|
|
|
|
|
ARM architecture support for the C library
|
|
|
|
==========================================
|
|
|
|
|
|
|
|
We enhanced our port of the FreeBSD libc with support for the ARM
|
|
|
|
architecture. In the ARM version, the following files are excluded:
|
|
|
|
:libm: 'e_acosl.c', 'e_asinl.c', 'e_atan2l.c', 'e_hypotl.c', 's_atanl.c',
|
|
|
|
's_cosl.c', 's_frexpl.c', 's_nextafterl.c', 's_nexttoward.c',
|
|
|
|
's_rintl.c', s_scalbnl.c', 's_sinl.c', 's_tanl.c', 's_fmal.c',
|
|
|
|
|
|
|
|
:libc-gen: 'setjmp.S'
|
|
|
|
|
|
|
|
Atomic operation on ARM are not supported. Although these operations are
|
|
|
|
defined in 'machine/atomic.h', their original FreeBSD implementations are
|
|
|
|
not functional because we do not emulate the required FreeBSD environment
|
|
|
|
(see: 'sysarch.h'). However, these functions are not a regular part of
|
|
|
|
the libc anyway and are not referenced from any other libc code.
|
|
|
|
|
|
|
|
|
|
|
|
Light-weight IP stack
|
|
|
|
=====================
|
|
|
|
|
|
|
|
After introducing LwIP support with our last release, we stabilized the port
|
|
|
|
and combined it with our libc implementation. Moreover, we upgraded the lwIP
|
|
|
|
library to the latest stable version 1.3.2. For convenience reasons, we
|
|
|
|
added initialization code, setting up the LwIP stack, the NIC session back end,
|
|
|
|
and optionally DHCP.
|
|
|
|
|
|
|
|
The example programs 'http_srv' and 'loopback' within the 'libports' repository
|
|
|
|
show how to use the LwIP stack directly or as a plugin with the libc. The
|
|
|
|
first one makes direct use of the LwIP library and demonstrates how to deal
|
|
|
|
with the new initialization routine, to setup the session to the NIC driver
|
|
|
|
and to request an IP address via DHCP. The second example doesn't use the
|
|
|
|
socket interface of the LwIP library directly but uses the libc variant instead.
|
|
|
|
It doesn't initialize the NIC session back end but uses the loopback
|
|
|
|
device provided by the LwIP library itself.
|
|
|
|
|
|
|
|
|
|
|
|
Device-driver environment kit
|
|
|
|
=============================
|
|
|
|
|
|
|
|
The basis for Genode's legacy driver emulation environment was granted some
|
|
|
|
maintenance. DDE kit now utilizes the thread registry and is able to adopt
|
|
|
|
alien threads. This unimpressive feature permits the execution of driver code
|
|
|
|
directly from server activations, i.e., adds support for single-threaded
|
|
|
|
drivers.
|
|
|
|
|
|
|
|
|
|
|
|
Dynamic linker
|
|
|
|
==============
|
|
|
|
|
|
|
|
We added dynamic linking support for OKL4 on the ARM architecture.
|
|
|
|
Because of the tool chain used on this platform, we had to revisit our
|
|
|
|
linker scripts (one warning is left because of 'gc-sections') and removed the
|
|
|
|
dependency on gcc builtin functions (with the exception of 'alloca').
|
|
|
|
|
|
|
|
To ease debugging on Linux, we revised the handling of registrations of
|
|
|
|
libraries and dynamic binaries, and thereby, made gdb debugging of
|
|
|
|
dynamically linked programs possible.
|
|
|
|
|
|
|
|
Furthermore, we prepared the future support for the 'dl' API ('dlopen', 'dlsym'
|
|
|
|
etc.) calls by enabling the linker to register exported linker symbols at startup.
|
|
|
|
This is achieved by emulating '.hash', '.dynsym', and '.dynstr' sections within
|
|
|
|
the linker object.
|
|
|
|
|
|
|
|
|
|
|
|
Misc
|
|
|
|
====
|
|
|
|
|
|
|
|
* Prevent running over the XML data on sub node identification. This
|
|
|
|
change fixes a problem with parsing the 'config' file on OKL4.
|
|
|
|
|
|
|
|
* C Runtime: Disable definition of 'pthread_cancel' symbol because it
|
|
|
|
collides with a weak implementation provided (and relied on) by the C++
|
|
|
|
support library.
|
|
|
|
|
|
|
|
|
|
|
|
Device drivers
|
|
|
|
##############
|
|
|
|
|
|
|
|
PIT timer driver
|
|
|
|
================
|
|
|
|
|
|
|
|
We use the x86 Programmable Interval Timer (PIT) on kernels that provide no
|
|
|
|
time source via their kernel APIs, i.e., OKL4 and NOVA.
|
|
|
|
|
|
|
|
Up to now, the accuracy of the timer implementation was not a big concern
|
|
|
|
because we wanted to satisfy the use cases of blocking for a short amount
|
|
|
|
of time (ca. 10ms) as needed by many periodic processes such as interactive
|
|
|
|
GUI applications, DDE device drivers, and the OKLinux timer loop. Achieving
|
|
|
|
exact wake-up times with a user-level timing service that get preemptively
|
|
|
|
scheduled alongside an unknown number of other threads is impossible anyway.
|
|
|
|
However, with the introduction of real-time priorities in the current release,
|
|
|
|
real-time workload and the accuracy of the timer driver becomes important.
|
|
|
|
For this reasons we improved the timer implementation.
|
|
|
|
|
|
|
|
* Corrected programming of one-shot timer IRQs. In the function for assigning
|
|
|
|
the next timeout, the specified argument was not taken over to the
|
|
|
|
corresponding member variable. This way, the timer implementation was not
|
|
|
|
operating in one-shot mode but it periodically triggered at a high rate.
|
|
|
|
This change should take off load from the CPU.
|
|
|
|
|
|
|
|
* Replaced counter-latch command with read-back in PIT timer. We use the PIT
|
|
|
|
status byte to detect counter wrap arounds and adjust our tick count
|
|
|
|
accordingly. This fixes problems with long single timeouts.
|
|
|
|
|
|
|
|
Thanks to Frank Kaiser for investigating these timer-accuracy issues and
|
|
|
|
providing us with suggestions to fix them.
|
|
|
|
|
|
|
|
|
|
|
|
NIC driver for Linux
|
|
|
|
====================
|
|
|
|
|
|
|
|
Genode provides the NIC session interface and a DDE Linux 2.6 based
|
|
|
|
driver for AMD PCnet32 devices since release 9.11. The NIC driver adds
|
|
|
|
networking support for all Genode base platforms except Linux. With the current
|
|
|
|
release we filled that gap with the TAP-based 'nic_drv'. The driver
|
|
|
|
itself accesses '/dev/net/tun' for 'tap0' and needs no super-user
|
|
|
|
privileges. Therefore, the device has to be configured prior to
|
|
|
|
running Genode like the following.
|
|
|
|
|
|
|
|
! sudo tunctl -u $$USER -t tap0
|
|
|
|
! sudo ip link set tap0 up
|
|
|
|
! sudo ip address add 10.0.0.1/24 brd + dev tap0
|
|
|
|
|
|
|
|
Give it a try with the
|
|
|
|
[http://genode.org/documentation/release-notes/9.11#section-17 - lwIP example scenario].
|
|
|
|
Please note that lwIP is configured for DHCP and does not assign a
|
|
|
|
static IP configuration to its end of the wire. Hence, you should run
|
|
|
|
a DHCP server on tap0, e.g.
|
|
|
|
|
|
|
|
! sudo /usr/sbin/dhcpd3 -d -f -cf /tmp/dhcpd.conf \
|
|
|
|
! -pf /tmp/dhcpd.pid -lf /tmp/dhcpd.lease tap0
|
|
|
|
|
|
|
|
An example 'dhcpd.conf' may look like
|
|
|
|
|
|
|
|
! subnet 10.0.0.0 netmask 255.255.255.0 {
|
|
|
|
! range 10.0.0.16 10.0.0.31;
|
|
|
|
! }
|
|
|
|
|
|
|
|
The DHCP server's log will show you that the driver fakes an ethernet
|
|
|
|
NIC with the MAC address 02-00-00-00-00-01.
|
|
|
|
|
|
|
|
|
|
|
|
VESA driver
|
|
|
|
===========
|
|
|
|
|
|
|
|
Our VESA driver used to set a default resolution of 1024x768 at 16 bit color
|
|
|
|
depth, which could be changed by specifying session arguments. However, most
|
|
|
|
of the time, clients are able to adapt itself to the framebuffer resolution and
|
|
|
|
do not want to implement the policy of defining the screen mode. Now we made the
|
|
|
|
VESA driver configurable, taking the burden of choosing a screen mode from the
|
|
|
|
client. A client can still request a particular resolution but for the common
|
|
|
|
case, it is policy free.
|
|
|
|
|
|
|
|
If no configuration is supplied, the driver sets up a resolution of 1024x768 at
|
|
|
|
16 bit color depth. This behaviour can be overridden by supplying the following
|
|
|
|
arguments via Genode's config mechanism:
|
|
|
|
|
|
|
|
! <config>
|
|
|
|
! <!-- initial screen width -->
|
|
|
|
! <width>1024</width>
|
|
|
|
!
|
|
|
|
! <!-- initial screen height -->
|
|
|
|
! <height>768</height>
|
|
|
|
!
|
|
|
|
! <!-- initial color depth (bits per pixel) -->
|
|
|
|
! <depth>16</depth>
|
|
|
|
! </config>
|
|
|
|
|
|
|
|
Note that only VESA modes but no arbitrary combination of values are supported.
|
|
|
|
To find out which graphics modes exist on your platform, you might use the
|
|
|
|
'vbeprobe' command of the GRUB boot loader. Also, the driver will print a list
|
|
|
|
of supported modes if the specified values are invalid.
|
|
|
|
|
|
|
|
|
|
|
|
Paravirtualized Linux refinements
|
|
|
|
#################################
|
|
|
|
|
|
|
|
The para-virtualized Linux port introduced in Genode Release 9.11 has been
|
|
|
|
refined, especially the block driver providing a root file system for Linux
|
|
|
|
has been completely reworked. Also the configuration facilities changed a bit.
|
|
|
|
Moreover, few problems that occurred when using multiple Linux instances, or
|
|
|
|
when using one instance under heavy load have been fixed. At this point, we
|
|
|
|
like to thank Sven Fülster for providing information and a fix for a bug
|
|
|
|
triggered by a race condition.
|
|
|
|
|
|
|
|
:Repository structure:
|
|
|
|
We rearranged the structure of the 'oklinux' repository. The downloaded
|
|
|
|
archive and the original OKLinux code are now stored under 'download'
|
|
|
|
and respectively 'contrib' analog to the 'libports' repository
|
|
|
|
structure.
|
|
|
|
|
|
|
|
:Rom-file block driver:
|
|
|
|
The block driver using a ramdisk as backing store as contained in the original
|
|
|
|
OKLinux port has been replaced by a new implementation that uses a dataspace
|
|
|
|
provided by the ROM session interface to provide a read-only block driver.
|
|
|
|
|
|
|
|
The read-only block driver can be used together with UnionFS (stackable
|
|
|
|
file system) or the Cowloop driver (copy-on-write block device) for Linux to
|
|
|
|
obtain a writeable root-file system, like it is done in many Linux Live-CDs.
|
|
|
|
|
|
|
|
To use the new rom-file block driver you first need to specify what file to use
|
|
|
|
as block device. This can be done by adding a 'rom_file' section in the XML
|
|
|
|
configuration of your Linux instance:
|
|
|
|
|
|
|
|
! <config>
|
|
|
|
! <rom_file>rootfs.img</rom_file>
|
|
|
|
! </config>
|
|
|
|
|
|
|
|
Of course, you need to add this file to your list of boot modules.
|
|
|
|
|
|
|
|
The block device is named 'gda' within the Linux Kernel (e.g., take
|
|
|
|
a look at '/proc/partitions'). When using it as root-file system, you
|
|
|
|
might specify the following in your configuration:
|
|
|
|
|
|
|
|
! <config>
|
|
|
|
! <commandline>root=/dev/gda1</commandline>
|
|
|
|
! <rom_file>rootfs.img</rom_file>
|
|
|
|
! </config>
|
|
|
|
|
|
|
|
Assuming the rom-file contains a valid partition table and the root file system
|
|
|
|
is located in the first partition.
|
|
|
|
|
|
|
|
|
|
|
|
Distribution changes
|
|
|
|
####################
|
|
|
|
|
|
|
|
Starting with the release 10.02, we will no longer distribute our slightly
|
|
|
|
customized version of the L4/Fiasco kernel together with the official Genode
|
|
|
|
distribution but instead will provide this kernel as a separate archive. Our
|
|
|
|
original intention with packaging L4/Fiasco with Genode was to give newcomers
|
|
|
|
a convenient way to start working with Genode on a real microkernel without the
|
|
|
|
need to download the whole TUDOS source tree where the main-line development of
|
|
|
|
L4/Fiasco is hosted. In the meanwhile, the number of supported base platforms
|
|
|
|
greatly increased to 6 different kernels. There are now plenty of opportunities
|
|
|
|
to get started with real microkernels so that the special case of hosting
|
|
|
|
L4/Fiasco with Genode is no longer justified. We want to leave it up to you to
|
|
|
|
pick the kernel that suits your needs best, and provide assistance via our wiki
|
|
|
|
and mailing list.
|
|
|
|
|