genode/repos/base-okl4/doc/notes.txt


           ===================================================
           Bringing the Genode OS Framework to the OKL4 kernel
           ===================================================

                              Norman Feske


This article documents the process of bringing the Genode OS Framework to a new
kernel platform, namely the OKL4 kernel developed by OK-Labs. OKL4 is an
industry-grade kernel that is deployed in millions of mobile phones.

For our work, we went for the OKL4 version 2.1 for two reasons. First,
whereas this version officially supports the x86 architecture, the later
version 3 is pretty much focused on the ARM architecture. At present, the x86
architecture is our primary platform for Genode development.  Second, we like
to follow the evolution of OKL4 from its genesis (L4ka::Pistachio) to the
capability-based kernel design as pursued with the later versions. On this
path, the version 2.1 is an important milestone, which we would not like to miss.
Despite having chosen version 2.1 to begin with, we plan to bring
Genode to later versions of OKL4 as well.

In the article, we face numerous challenges such as integrating OKL4 support
into Genode's build system, exploring the OKL4 kernel interface and the
boot procedure, adapting Genode's framework libraries to the feature set
provided by the new kernel, and accessing interrupts and other hardware
resources.

The intended audience are developers interested in exploring
the realms of the L4-microkernel world and kernel developers who consider
running Genode as user-land infrastructure on top of their kernel.
For the latter group, we laid out the article as a rough step-by-step
guide providing our proposed methodology for approaching the port of
Genode to a new kernel platform. At many places, the article refers
to the source code of Genode, in particular the 'base-okl4' repository.
You can read the code online via our subversion repository:

[http://genode.svn.sourceforge.net/viewvc/genode/trunk/ - Browse the Genode subversion repository...]


Build-system support
####################

The first step is to create a simple hello-world program that can be executed
directly on the OKL4 kernel as a roottask replacement. This program does not rely
on any kernel features but uses port I/O to output some characters to the
serial interface. We need to understand the following things:

* We need a program that outputs some characters to the serial interface.
  This program can be developed on a known kernel platform. Once we have a
  working hello program, we only need to port it to the new kernel platform
  but can assume that the test program itself is correct.

* How must the OKL4 roottask be linked in order to be executed by the kernel?

* How does the OKL4 boot procedure work? OKL4 relies on a tool called elfweaver,
  which creates a bootable ELF-image (often called single image) from multiple
  binaries, in particular the kernel and roottask. We need to create a
  minimalist elfweaver configuration file that just starts the kernel and our
  hello example.

The result of this first step can be found in 'src/test/okl4_01_hello_raw':

:'crt0': is the assembly startup code taken from the L4/Fiasco version of
  Genode. This code defines the initial stack and contains the entry point of
  the hello program, which calls a C function called '_main'.

:'hello.cc': is the implementation of the '_main' function, which outputs
  some characters directly via the serial interface of a PC. It does not
  contain any kernel-specific code nor does it depend on any include files.

:'genode.ld': is the linker script that we already use for Genode programs
  on other base platforms.

:'weaver.xml': is the description file of the single image to be created
  by OKL4's elfweaver tool. It is useful to take a close look at this file. The
  most important bits are the filename of the kernel specified in the
  '<kernel>' tag and the filename of the hello program specified in the
  '<rootprogram>' tag.

:'Makefile': contains the steps needed to compile the hello program and
  invoke elfweaver to create the bootable single image.

To boot the single image, you can use your favorite boot loader such as
Grub. The single-image file must be specified as kernel. When booted, the
program should print a message over the serial line.
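
A minimal sketch of such a polling serial-output routine is shown below. It is
not taken from the test's 'hello.cc'. It assumes the first serial interface of
a PC at I/O base 0x3f8, which was already configured by the firmware or boot
loader. The names 'Port_io', 'serial_putc', and 'serial_puts' are made up for
this illustration. The port access is passed in as an argument so that the
output logic can also be exercised on a development host without I/O
privileges.

```cpp
#include <stdint.h>

enum {
  COM1_BASE = 0x3f8,          /* I/O base of the first PC serial interface */
  LSR       = COM1_BASE + 5,  /* line-status register */
  THR_EMPTY = 0x20            /* LSR bit: transmit-holding register empty */
};

#if defined(__i386__) || defined(__x86_64__)
/* real port access, usable only with I/O privileges (e.g., as roottask) */
struct Port_io
{
  void outb(uint16_t port, uint8_t value) {
    asm volatile ("outb %0, %1" : : "a" (value), "Nd" (port)); }

  uint8_t inb(uint16_t port) {
    uint8_t value;
    asm volatile ("inb %1, %0" : "=a" (value) : "Nd" (port));
    return value; }
};
#endif

/* wait until the transmitter is idle, then output one character */
template <typename IO>
void serial_putc(IO &io, char c)
{
  while (!(io.inb(LSR) & THR_EMPTY)) ;
  io.outb(COM1_BASE, (uint8_t)c);
}

/* output a null-terminated string, expanding '\n' to "\r\n" */
template <typename IO>
void serial_puts(IO &io, char const *s)
{
  for (; *s; s++) {
    if (*s == '\n') serial_putc(io, '\r');
    serial_putc(io, *s);
  }
}
```

On the target, '_main' would create a 'Port_io' object and pass it to
'serial_puts'.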

The next step is the proper integration of the hello example into the
Genode build system. For this, we create a new source-code repository called
'base-okl4' with the following structure:
! base-okl4/lib/mk/x86/startup.mk
! base-okl4/mk/spec-okl4.mk
! base-okl4/mk/spec-okl4_x86.mk
! base-okl4/src/test/okl4_02_hello/target.mk
! base-okl4/src/test/okl4_02_hello/hello.cc
! base-okl4/src/platform/x86/_main.cc
! base-okl4/src/platform/x86/crt0.s
! base-okl4/src/platform/genode.ld
! base-okl4/etc/specs.conf

The OKL4-specific build-system support is contained in the files 'specs.conf',
'spec-okl4.mk', and 'spec-okl4_x86.mk'. The 'specs.conf' file steers the build
process once the 'base-okl4' repository is specified in the 'REPOSITORIES'
declaration in the 'etc/build.conf' file in the build directory.
The 'spec-okl4_x86.mk' file describes the build specifics via the mechanism
described in Genode's getting-started documentation:
! SPECS = genode okl4_x86

Driven by the content of this 'SPECS' declaration, the build system first
includes the 'spec' files for 'spec-genode.mk' (found in the 'base/' repository)
and 'spec-okl4_x86.mk' (found in the 'base-okl4/' repository).
The latter file contains all build options for OKL4 on the x86 architecture,
extends the 'SPECS' declaration by the platform specifics 'x86_32' and 'okl4'
(which both apply for 'okl4_x86'), and aggregates the corresponding 'spec'
files:
! SPECS += x86_32 okl4
!
! LD_SCRIPT    ?= $(call select_from_repositories,src/platform/genode.ld)
! CXX_LINK_OPT += -Wl,-T$(LD_SCRIPT) -Wl,-Ttext=0x01000000
!
! include $(call select_from_repositories,mk/spec-x86_32.mk)
! include $(call select_from_repositories,mk/spec-okl4.mk)

The 'spec' file for 'x86_32' is contained in the 'base/'
repository. The one for 'okl4' is provided by 'base-okl4/'. It contains
all build options that are independent of the hardware platform on which
OKL4 is deployed:
! -include $(call select_from_repositories,etc/okl4.conf)
! -include $(BUILD_BASE_DIR)/etc/okl4.conf
!
! INC_DIR += $(OKL4_DIR)/build/iguana/include
! INC_DIR += $(REP_DIR)/include
!
! PRG_LIBS += startup
!
! CC_OPT_NOSTDINC += -nostdinc
! CXX_LINK_OPT    += -static -nostdlib -Wl,-nostdlib
! EXT_OBJECTS     += $(shell $(CUSTOM_CXX_LIB) -print-file-name=libsupc++.a) \
!                    $(shell $(CUSTOM_CXX_LIB) -print-file-name=libgcc_eh.a) \
!                    $(shell $(CUSTOM_CXX_LIB) -print-libgcc-file-name)
!
! EXT_OBJECTS += $(OKL4_DIR)/build/iguana/lib/libl4.a

The most interesting point is that this file reads an OKL4-specific config
file from the 'etc/' subdirectory of the build directory. From this file,
it obtains the location of the OKL4 distribution via the 'OKL4_DIR'
declaration. The 'spec-okl4.mk' file above adds the 'build/iguana/include'
path to the default include search locations. We need this path for including
the headers from the 'l4/' subdirectory. Unfortunately, 'build/iguana/include/'
contains a lot of further includes, which we don't want to use. On the contrary,
these includes pollute our include-search space. This is particularly problematic
for headers such as 'stdio.h', which will inevitably collide with Genode's own
libC headers. Hence, we need to find a way to isolate the 'l4/' headers from
the remaining Iguana headers. One elegant way is to shadow the 'build/iguana/include/l4'
directory in our local Genode build directory. This can be accomplished either
manually by creating a symbolic link within our Genode build directory that
points to OKL4's 'build/iguana/include/l4' directory, or by letting 'make' create
such a link automatically. The corresponding rules for this approach can be
found in the 'spec-okl4.mk' file.

On Genode, the startup code is encapsulated in a library called 'startup',
which is linked to each program by default. This library essentially consists
of a little snippet of assembly startup code 'crt0.s', which calls a platform-
independent C startup function called '_main' implemented in '_main.cc'. The
library-description file for the startup library is called 'startup.mk'
and has the following content:
! REQUIRES = okl4 x86
! SRC_S    = crt0.s
! SRC_CC   = _main.cc
!
! vpath crt0.s   $(REP_DIR)/src/platform/x86
! vpath _main.cc $(REP_DIR)/src/platform/x86

We will use a '_main.cc' from another platform as template for the OKL4-
specific startup code but strip it down to an absolute minimum (leaving
out everything except the call to the actual 'main' function). Note that
for this simple setup, we need to explicitly reference a symbol of 'crt0.s'
from '_main.cc' to prevent the linker from discarding the otherwise
unreferenced object file (which only contains our entry point). The easiest
way is to reference the '__dso_handle' variable, which is defined in
'crt0.s'. However, this is an intermediate work-around, which we will
remove in the next step. Alternatively, we could rely on the '-u' option
of the linker to prevent the entry symbol ('_start') from being discarded.

The implementation of the hello program equals the version of
'okl4_01_hello_raw' except that the main function is actually called
'main' rather than '_main'. The corresponding target description file
'target.mk' is straightforward:
! TARGET   = hello
! REQUIRES = okl4
! SRC_CC   = hello.cc


Creating dummy versions of the 'env' and 'cxx' libraries
########################################################

So far, the hello program relies neither on OKL4-specific nor on
Genode-specific code. The goal of the next step is to remove the
differences between the '_main.cc' file in our repository and the
'_main.cc' file of the other base platforms. We will add proper
C++ initialization, the calling of static constructors, and a
proper console implementation.

The first step is to add the 'cxx' library to our target.
This is a Genode-specific C++ support library, which contains
functions used as back end of GCC's 'libsupc++' and 'libgcc_eh'.
To include the 'cxx' library for building our hello program, we
add the following declaration to the 'target.mk' file:

! LIBS = cxx

On a rebuild, the build system will try to compile the 'cxx' library,
which, in turn, depends on a number of Genode header files. Most
of these header files are generic and hence contained in the 'base/'
repository. However, the following header files are specific for
the actual base platform and, therefore, must be provided by ourselves:

:'base/capability.h': This file defines the representation of an object
  capability on the actual platform. For now, we can use the following
  version, which we will expand later on (at the current stage, the
  Capability class is not actually used but we need its definition for
  successful compilation). The OKL4-specific 'capability.h' file must
  be placed in 'include/base/' of the 'base-okl4/' repository.
  ! #ifndef _INCLUDE__BASE__CAPABILITY_H_
  ! #define _INCLUDE__BASE__CAPABILITY_H_
  !
  ! namespace Genode {
  !   class Capability {
  !     public: bool valid() const { return false; }
  !   };
  !   typedef int Connection_state;
  ! }
  !
  ! #endif /* _INCLUDE__BASE__CAPABILITY_H_ */

:'base/native_types.h': This file defines platform representations of
  thread IDs, locks etc. Please take a look at the 'native_types.h' file
  of another platform to get an overview on these types. For now, the
  following simple version suffices:
  ! #ifndef _INCLUDE__BASE__NATIVE_TYPES_H_
  ! #define _INCLUDE__BASE__NATIVE_TYPES_H_
  !
  ! namespace Genode {
  !   typedef volatile int Native_lock;
  !   typedef          int Native_thread_id;
  !   typedef          int Native_thread;
  ! }
  !
  ! #endif /* _INCLUDE__BASE__NATIVE_TYPES_H_ */

  In fact, at this point, the types are just dummies, which we will
  replace later when porting further parts of the framework.

:'base/ipc.h': This is a platform-specific wrapper for Genode's
  IPC API. Usually, this file just includes 'base/ipc_generic.h'.
  Optionally, it can host platform-specific IPC functionality.
  ! #ifndef _INCLUDE__BASE__IPC_H_
  ! #define _INCLUDE__BASE__IPC_H_
  !
  ! #include <base/ipc_generic.h>
  !
  ! #endif /* _INCLUDE__BASE__IPC_H_ */

:'base/ipc_msgbuf.h': This file defines the IPC message-buffer layout.
  Naturally, it is highly platform specific. For now, the following dummy
  message-buffer layout will do:
  ! #ifndef _INCLUDE__BASE__IPC_MSGBUF_H_
  ! #define _INCLUDE__BASE__IPC_MSGBUF_H_
  !
  ! namespace Genode {
  !   class Msgbuf_base { };
  !
  !   template <unsigned BUF_SIZE>
  !   class Msgbuf : public Msgbuf_base { };
  ! }
  !
  ! #endif /* _INCLUDE__BASE__IPC_MSGBUF_H_ */

Once we have created these platform-specific header files, the 'cxx' library
should compile successfully. However, there are a number of unresolved
symbols when linking the hello program.  The 'cxx' library uses Genode's
'env()->heap()' as back end for its local malloc implementation. But so far,
we have not ported Genode's 'env' library. Furthermore, there are
unresolved references to 'Genode::printf' as provided by Genode's console
implementation and some functions of the IPC framework.

Let us first resolve the 'Genode::printf' references by creating an
OKL4-specific version of Genode's console library. For this, we create
a new back end in 'src/base/console/okl4_console.cc' that uses the
serial output mechanism that we employed for our first 'hello_raw' program.
The corresponding library description file 'lib/mk/printf_okl4.mk' looks
as follows:
! SRC_CC = okl4_console.cc
! LIBS   = cxx console
!
! vpath %.cc $(REP_DIR)/src/base/console

Now, we can add 'printf_okl4' to the 'LIBS' declaration of hello's 'target.mk'
file. When recompiling the hello program, the new 'printf_okl4' library will
be built and resolve the 'Genode::printf' symbols. There remain the unresolved
references to 'Genode::env()' and parts of the IPC framework.

The IPC implementation in 'src/base/ipc/ipc.cc' is not straightforward
and we defer it for now. Hence, we place only the following dummy functions
into the 'ipc.cc' file:

! #include <base/ipc.h>
!
! using namespace Genode;
!
! Ipc_ostream::Ipc_ostream(Capability dst, Msgbuf_base *snd_msg) :
!   Ipc_marshaller(0, 0) { }
!
! void Ipc_istream::_wait() { }
!
! Ipc_istream::Ipc_istream(Msgbuf_base *rcv_msg) :
!   Ipc_unmarshaller(0, 0) { }
!
! Ipc_istream::~Ipc_istream() { }
!
! void Ipc_client::_call() { }
!
! Ipc_client::Ipc_client(Capability &srv, Msgbuf_base *snd_msg,
!                                         Msgbuf_base *rcv_msg) :
!   Ipc_istream(rcv_msg), Ipc_ostream(srv, snd_msg), _result(0) { }
!
! void Ipc_server::_wait() { }
!
! void Ipc_server::_reply() { }
!
! void Ipc_server::_reply_wait() { }
!
! Ipc_server::Ipc_server(Msgbuf_base *snd_msg,
!                        Msgbuf_base *rcv_msg) :
!   Ipc_istream(rcv_msg), Ipc_ostream(Capability(), snd_msg) { }

The corresponding library-description file 'lib/mk/ipc.mk' looks as
follows:
! SRC_CC = ipc.cc
! vpath ipc.cc $(REP_DIR)/src/base/ipc

By adding 'ipc' to the 'LIBS' declaration in hello's 'target.mk' file, the
IPC-related linker errors should disappear and only the reference to
'Genode::env()' remains. To resolve this symbol, we add the following dummy
function directly into the code of 'hello.cc'.
! namespace Genode {
!   void *env() { return 0; }
! }

Before we can use the Genode framework, which is written in C++, we need to
make sure that all static constructors are executed in the startup code
('_main'). Therefore, we add the following code to the '_main' function:
! void (**func)();
! for (func = &_ctors_end; func != &_ctors_start; (*--func)());

The referenced symbols '_ctors_start' and '_ctors_end' are created by the
linker script. The corresponding declarations are provided by
'base/include/base/crt0'.
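
The effect of this loop can be tried out on any platform with the following
sketch. It replaces the linker-provided symbols by an ordinary array and walks
the array from its end to its start, just like the loop above. The table, the
demo functions, and the name 'call_ctors' are made up for this illustration.

```cpp
typedef void (*Ctor)();

/* log filled by the demo constructors to make the call order observable */
static int ctor_log[4];
static int ctor_count;

static void ctor_a() { ctor_log[ctor_count++] = 1; }
static void ctor_b() { ctor_log[ctor_count++] = 2; }
static void ctor_c() { ctor_log[ctor_count++] = 3; }

/* stand-in for the table between '_ctors_start' and '_ctors_end' */
static Ctor ctor_table[] = { ctor_a, ctor_b, ctor_c };

/* call each entry of [start, end), beginning with the last one */
static void call_ctors(Ctor *start, Ctor *end)
{
  for (Ctor *func = end; func != start; (*--func)());
}
```

Calling 'call_ctors(ctor_table, ctor_table + 3)' executes 'ctor_c' first and
'ctor_a' last.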

Now, it's time to replace the direct I/O port access in 'hello.cc' by
Genode's 'printf' implementation. Just add the following line to the main
function of 'hello.cc' and make sure to include '<base/printf.h>':
! Genode::printf("This is Genode's printf\n");

When starting the resulting program, this message should appear via the
serial interface comport 0.


Initializing the C++ exception handling
#######################################

The Genode OS Framework makes use of C++ exceptions. Hence, we need to
make sure to properly initialize the 'libsupc++'. This initialization
comes down to calling the function
! __register_frame(__eh_frame_start__);
which is performed by the function 'init_exception_handling' as provided
by the generic 'cxx' library. Normally, 'init_exception_handling' is called
from '_main'. It is important to know that the initialization code does
use 'malloc', which is mapped to Genode's 'env()->heap()' by the 'cxx'
library. Consequently, we need a working heap to successfully initialize
the exception handling.

Therefore, we have to replace the dummy 'env()' function in our hello
program with something more useful. The header file 'src/test/minimal_env.h'
provides the heap functionality by using a minimalistic custom environment,
which contains a heap with a static pool of memory. With such an environment
in place, we can safely call 'init_exception_handling' from the '_main'
startup code. The test 'okl4_02_hello' is the result of this step. It
first prints some text via Genode's 'printf' implementation and then triggers
a C++ exception.
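
To give an impression of how little is needed for such a heap, the following
sketch shows a bump allocator operating on a static pool. It is an assumption
about how one could realize the idea, not a copy of 'minimal_env.h'. The class
name 'Static_heap' and its interface are hypothetical. Memory is handed out
with 16-byte alignment and never reused.

```cpp
#include <cstddef>

/* hypothetical heap that carves allocations out of a fixed memory pool */
template <std::size_t POOL_SIZE>
class Static_heap
{
  private:

    alignas(16) char _pool[POOL_SIZE];
    std::size_t      _used = 0;

  public:

    /* return 16-byte-aligned memory or a null pointer if the pool is exhausted */
    void *alloc(std::size_t size)
    {
      std::size_t const aligned = (size + 15) & ~std::size_t(15);

      /* reject overflowing sizes and requests exceeding the remaining pool */
      if (aligned < size || aligned > POOL_SIZE - _used)
        return nullptr;

      void *ptr = _pool + _used;
      _used += aligned;
      return ptr;
    }

    /* freed memory is not reused, which suffices for a short-lived test */
    void free(void *, std::size_t) { }

    std::size_t avail() const { return POOL_SIZE - _used; }
};
```

An object of this kind could be returned by the 'heap()' function of a custom
environment, provided that the pool is large enough for the allocations
performed during the exception-handling initialization.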


Thread creation
###############

So far, we have not performed any OKL4 system call. The first system call that
we will explore is the 'L4_ThreadControl' to create a thread. A corresponding
test for this functionality is implemented in the 'test/okl4_03_thread'
example. This example creates a new thread with the thread number 1. Note that
the matching L4 thread ID uses the lowest 14 bits as version number, which is
always set to 1. Hence, the L4 thread ID of thread number 1 will be 0x4001. If
you happen to need to look up this thread in OKL4's kernel debugger, you will
find its thread control block (TCB) via this number.
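
The relation between thread number and thread ID can be written down as a
pair of small helpers. The function names are chosen for this illustration
only. The bit layout is the one described above, with the version number
occupying the lowest 14 bits.

```cpp
enum { VERSION_BITS = 14, VERSION_MASK = (1 << VERSION_BITS) - 1 };

/* compose the raw thread ID from thread number and version */
inline unsigned long thread_id(unsigned long thread_no, unsigned long version = 1)
{
  return (thread_no << VERSION_BITS) | (version & VERSION_MASK);
}

/* extract the thread number from a raw thread ID */
inline unsigned long thread_no(unsigned long id)
{
  return id >> VERSION_BITS;
}
```

For thread number 1, 'thread_id(1)' yields 0x4001.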

Another important thing to note is that roottask's main thread runs initially
at the priority of 255 whereas newly created threads get assigned a default
priority of 100. To make OKL4's preemptive scheduling work as expected, we
need to assign the same priority to both threads by calling 'L4_Set_Priority'.


IPC framework
#############

Now that we can start multiple threads, we can fill Genode's IPC framework with
life.

However, before we can get started with communication between threads, the
communication partners must have a way to get to know each other. In particular,
a receiver of IPC communication needs a way to make its communication address
known to a sender. OKL4 uses 'L4_ThreadId_t' as communication address. The
thread's ID is assigned to each thread by its creator. The thread itself, however,
does not know its own identity when started up. In contrast to other L4 kernels
that provide a way for a thread to determine its own identity via an 'L4_Myself'
call, this functionality is not supported on OKL4. Therefore, the creator of
a new thread must communicate the assigned thread ID to the new thread via
a startup protocol. We use OKL4's 'UserDefinedHandle' for this purpose. This
is an entry of the thread's UTCB that can be remotely accessed by the creating
thread. Before starting the new thread, the creator writes the assigned thread
ID to the new thread's user-defined handle. In turn, the startup code of the
new thread copies the supplied value from the user-defined handle to a
thread-local entry of the UTCB (a designated 'ThreadWord'). In the following,
the thread can always determine its own global ID by reading this 'ThreadWord'
from its UTCB. We declare the convention about which 'ThreadWord' to use for
this purpose in Genode's 'base/native_types.h' ('UTCB_TCR_THREAD_WORD_MYSELF').
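
The startup protocol can be summarized by the following model. It deliberately
does not use the OKL4 API but represents the UTCB as a plain structure. The
type 'Utcb_model', the function names, and the index value of
'THREAD_WORD_MYSELF' are assumptions made for this sketch.

```cpp
/* simplified stand-in for the parts of the UTCB used by the protocol */
struct Utcb_model
{
  unsigned long user_defined_handle = 0;   /* writeable by the creator */
  unsigned long thread_word[4]      = { }; /* thread-local words */
};

enum { THREAD_WORD_MYSELF = 0 }; /* hypothetical index */

/* creator side: pass the assigned ID before starting the new thread */
inline void prepare_thread(Utcb_model &utcb, unsigned long assigned_id)
{
  utcb.user_defined_handle = assigned_id;
}

/* new thread: first action of the startup code */
inline void thread_bootstrap(Utcb_model &utcb)
{
  utcb.thread_word[THREAD_WORD_MYSELF] = utcb.user_defined_handle;
}

/* new thread: determine the own global ID at any later time */
inline unsigned long myself(Utcb_model const &utcb)
{
  return utcb.thread_word[THREAD_WORD_MYSELF];
}
```

In this model, the result of 'myself' stays valid even if the user-defined
handle is overwritten after the bootstrap step.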


IPC send and wait
=================

The test program 'okl4_04_ipc_send_wait' sends an IPC message via Genode's
'Ipc_istream' and 'Ipc_ostream' framework. To make this example functional,
we have to work on the following parts of the 'base-okl4/' repository.

:'include/base/capability.h':
  Genode uses the 'Capability' class to address both an IPC communication
  partner and a referenced object. Therefore, we must provide a valid
  representation of both pieces of information. Because all IPC operations on
  OKL4 always address threads, we use 'L4_ThreadId_t' as the representation of
  the communication address. There are no
  kernel objects representing user-level objects in OKL4 (version 2). So we
  need to manage object identities on the user level, unprotected by the
  kernel. For now, we simply use a globally unique object ID for this purpose.

:'include/base/ipc_msgbuf.h':
  The message-buffer representation used for OKL4 does not use any
  kernel-specific layout because IPC payload is always transferred through the
  communicating threads' UTCBs. Hence, the 'Msgbuf' template only needs to
  provide some space for storing messages but no control information.

:'src/base/ipc/ipc.cc':
  For the send-and-wait test, we need to implement the 'Ipc_istream' and
  'Ipc_ostream' class functions: the constructors of 'Ipc_istream' and
  'Ipc_ostream', the '_wait' function, and the '_send' function. It is useful
  to take a look at the other platforms' implementations for reference.
  Because the Genode IPC framework provides the functionality for marshalling
  and unmarshalling of messages, we skip OKL4's 'message.h' convenience
  abstraction in favor of addressing the UTCB message registers directly
  via 'ipc.h'.
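
Because the payload is already marshalled into a byte buffer, the remaining
work of the send and wait functions is mostly a copy between this buffer and
word-sized message registers. The following sketch models this copy with a
plain array standing in for the UTCB message registers. All names are
hypothetical and the real register-access functions of OKL4 are not used here.

```cpp
#include <cstddef>
#include <cstring>

typedef unsigned long Word; /* stand-in for one message register */

/* number of registers needed to hold 'len' bytes of payload */
inline std::size_t regs_needed(std::size_t len)
{
  return (len + sizeof(Word) - 1) / sizeof(Word);
}

/* sender side: copy the marshalled message into the registers */
inline bool copy_to_regs(char const *src, std::size_t len,
                         Word *regs, std::size_t max_regs)
{
  std::size_t const num = regs_needed(len);
  if (num > max_regs)
    return false; /* message does not fit */

  /* zero the used registers so that padding bytes have a defined value */
  for (std::size_t i = 0; i < num; i++)
    regs[i] = 0;

  std::memcpy(regs, src, len);
  return true;
}

/* receiver side: copy the payload from the registers to the message buffer */
inline void copy_from_regs(Word const *regs, char *dst, std::size_t len)
{
  std::memcpy(dst, regs, len);
}
```

The limit passed as 'max_regs' would correspond to the number of message
registers available on the platform.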


IPC call
========

The test program 'okl4_05_ipc_call' performs IPC communication using Genode's
'Ipc_client' and 'Ipc_server' classes. To make this test work, the corresponding
functions in 'src/base/ipc/ipc.cc' must be implemented, in particular the
functions '_reply_wait' and '_call'.


Address-space creation and page-fault handling
##############################################

There are the following peculiarities of OKL4 with regard to address spaces.

OKL4 does not use IPC to establish memory mappings but an independent
system call 'L4_MapControl' to configure the local or a remote address
space. In line with other L4 kernels, page faults are handled via
an IPC-based pager protocol. The typical mode of operation of a pager
looks like:
# A page fault occurs and the kernel translates the page fault into a
  page-fault message delivered to the pager of the faulting thread.
# The pager receives a page-fault message, decodes the page-fault
  address, the fault type (read, write, execute), and the instruction
  pointer of the faulter from the page-fault message.
# The pager resolves the page fault by populating the faulter's
  address space with valid pages via 'L4_MapControl'.
# The pager answers the page-fault message with an empty IPC to
  resume the operation of the faulter.
In contrast to L4/Fiasco and L4ka::Pistachio, which incorporate the
memory mapping into the reply message, this procedure involves
an additional system call. However, it is more flexible and allows
the construction of a fully populated address space without employing
an IPC-based protocol. Furthermore, the permissions for establishing
memory mappings are well separated from IPC-communication rights.

In contrast to the L4/Fiasco and L4ka::Pistachio kernels, which take
a virtual address of the mapper as argument, the OKL4 map operation
always refers to a physical page. This enables the configuration of a
remote address space without having all the used pages locally mapped
as well. For specifying a local virtual address for a mapping, we
can use the 'L4_ReadFpage' function to look up a physical-memory
descriptor for a given virtual address.

The test 'okl4_06_pager' creates an address space to be one-to-one
mapped with roottask. In the new address space, a thread is created.
For the new thread, we use the roottask thread as pager. Once started,
the new thread raises a number of page faults:
# Reading the first instruction of the entry point
# Accessing the first stack element
# Reading data
# Writing data
The pager receives the corresponding page-fault messages, prints
the decoded information, and resolves the page faults accordingly.
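
The fault-resolution step of such a one-to-one pager can be modelled as
follows. The sketch assumes a page size of 4 KiB and uses a map as stand-in
for the faulter's address space, where the real test would call
'L4_MapControl'. The types 'Page_fault' and 'Address_space_model' as well as
the function names are made up for this illustration.

```cpp
#include <stdint.h>
#include <map>

enum Fault_type { READ, WRITE, EXECUTE };

/* information decoded from a page-fault message */
struct Page_fault
{
  uintptr_t  addr; /* faulting address */
  uintptr_t  ip;   /* instruction pointer of the faulter */
  Fault_type type;
};

static uintptr_t const PAGE_BYTES = 4096; /* assumed page size */

/* model of an address space: virtual page -> physical page */
struct Address_space_model
{
  std::map<uintptr_t, uintptr_t> mappings;
};

/* round an address down to the start of its page */
inline uintptr_t page_base(uintptr_t addr)
{
  return addr & ~(PAGE_BYTES - 1);
}

/* resolve a fault by mapping the affected page one-to-one */
inline void handle_fault(Address_space_model &as, Page_fault const &pf)
{
  uintptr_t const page = page_base(pf.addr);
  as.mappings[page] = page; /* stands in for 'L4_MapControl' */
}
```

After 'handle_fault' returns, the pager would reply with an empty IPC to
resume the faulter.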


Determining the memory configuration and boot modules
#####################################################

OKL4 provides its boot information to roottask via a boot-info structure, which
is located at the address provided in roottask's UTCB message register 1. This
structure is created by OKL4's elfweaver during the creation of the boot image.
It has no fixed layout but it contains a batch of operations such as "add
memory pool" or "create protection domain". In short, it (loosely) resembles
the content of the elfweaver XML config file in binary form. Most of
elfweaver's features will remain unused when running Genode on OKL4. However,
there are some important bits of information we need to know:
* Memory configuration
* Information on the boot modules
For parsing the boot-info structure, there exists a convenient library located
in the OKL4 source tree at 'libs/bootinfo'. The test program
'okl4_07_boot_info' uses this library to obtain the information we are
interested in.

Note that we link the library directly to the test program by using the
'EXT_OBJECTS' declaration in the 'target.mk' file. We are not adding this
library to the global 'spec-okl4.mk' file because we need the bootinfo-library
only at a very few places (this test program and core).

We obtain the memory configuration by assigning a callback function to the
'init_mem' entry of the 'bi_callbacks_t' structure supplied to the parser
library. There are indeed two such callbacks, called 'init_mem' and
'init_mem2'. The second one is called during a second parsing stage.
However, both functions seem to be called with the same values. So we just
disregard the values supplied to 'init_mem2' at this point.

To include modules other than the 'rootprogram' in the boot image, we use the
help of elfweaver's '<pd>' declaration. We create a pseudo protection domain as
a container for several memory sections, each section loaded with the content
of a file. An example declaration for including the files 'init' and 'config'
into the boot image looks like this:
!<pd name="modules">
!  <memsection name="init"   file="init"   direct="true" />
!  <memsection name="config" file="config" direct="true" />
!</pd>
The 'direct="true"' attribute has the effect that the memory section will
have equal physical and virtual addresses.

When observing the output of 'okl4_07_boot_info', the relevant pieces of
information are the 'new_ms' (new memory section) lines with owner != 0
(another PD than roottask) and virtpool != 1. These memory sections correspond to
the files. However, the association of the memory sections with their file
names is still missing at this point. To resolve this problem, we also observe
the 'export_object'  calls. For each memory section, 'export_object' gets
called with the type parameter set to 'BI_EXPORT_MEMSECTION_CAP' and the key
parameter set to the name of the file. Note that the file name is converted to
upper case.  For associating memory sections with file names, we assume that
the order of 'new_ms' calls corresponds to the order of matching
'export_object' calls.
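
The order-based association can be sketched as follows. The class below
collects the arguments of both kinds of calls and pairs them by their index.
The callback signatures of the bootinfo library are not reproduced here. The
member functions take simplified, hypothetical argument lists, and all type
names are made up for this illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct Mem_section { unsigned long base, size; };

struct Boot_module
{
  std::string name;
  Mem_section section;
};

class Module_collector
{
  private:

    std::vector<Mem_section> _sections;
    std::vector<std::string> _names;

  public:

    /* to be called for each 'new_ms' operation */
    void new_ms(unsigned long base, unsigned long size, int owner, int virtpool)
    {
      /* skip memory sections that do not correspond to files */
      if (owner == 0 || virtpool == 1)
        return;

      _sections.push_back(Mem_section{base, size});
    }

    /* to be called for each exported object of the memory-section type */
    void export_memsection(std::string const &key) { _names.push_back(key); }

    /* pair the n-th recorded memory section with the n-th recorded name */
    std::vector<Boot_module> modules() const
    {
      std::vector<Boot_module> result;
      for (std::size_t i = 0; i < _sections.size() && i < _names.size(); i++)
        result.push_back(Boot_module{_names[i], _sections[i]});
      return result;
    }
};
```

Note that the resulting names are the upper-case keys, so a lookup of the
module 'init' must be performed with 'INIT' or in a case-insensitive manner.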


Interrupt handling and time source
##################################

In contrast to most of the classical L4 kernels, OKL4 provides no means
for accessing wall-clock time from the user land. Internally, OKL4 uses
a scheduling timer to perform preemptive scheduling but it does not expose
a time source to the user land via IPC timeouts. Hence, we need an alternative
way to obtain a user-level time source. We follow the same path as Iguana
by driving the programmable interval timer (PIT) directly from a
user-level service. Because OKL4 uses the more modern APIC timer, which is
completely independent of the PIT, both the kernel and the user land
can use entirely different timer devices as their respective time source.

The PIT is connected to the interrupt line 0 of the programmable interrupt
controller (PIC). The test program 'okl4_08_timer_pit' switches the PIT
into one-shot mode and waits for timer interrupts. Each time a timer
interrupt occurs, the next one-shot is scheduled. The program tests two
important things: how interrupt handling works on OKL4 and
how to provide a user-level time source.
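
Programming a one-shot comes down to three port writes. The following sketch
uses the standard PC values for the PIT (input clock of 1193182 Hz, command
port 0x43, data port 0x40 for channel 0) and PIT mode 0 as the one-shot mode.
Whether the test program uses exactly this mode is an assumption, and the
function names are made up for this illustration. As the counter is only
16 bits wide, a single one-shot cannot exceed roughly 55 milliseconds.

```cpp
#include <stdint.h>

enum {
  PIT_TICKS_PER_SECOND = 1193182,
  PIT_MAX_COUNT        = 65535,
  PIT_DATA_PORT_0      = 0x40, /* counter of channel 0 */
  PIT_CMD_PORT         = 0x43,
  PIT_CMD_ONE_SHOT     = 0x30  /* channel 0, low/high byte access, mode 0 */
};

/* convert microseconds to a counter value, clamped to the 16-bit range */
inline uint16_t pit_count_for_usec(uint32_t usec)
{
  uint64_t const ticks = (uint64_t)usec * PIT_TICKS_PER_SECOND / 1000000;

  if (ticks < 1)             return 1;
  if (ticks > PIT_MAX_COUNT) return PIT_MAX_COUNT;
  return (uint16_t)ticks;
}

/* arm the timer, 'io' must provide 'outb(port, value)' */
template <typename IO>
void pit_start_one_shot(IO &io, uint32_t usec)
{
  uint16_t const count = pit_count_for_usec(usec);

  io.outb(PIT_CMD_PORT,    PIT_CMD_ONE_SHOT);
  io.outb(PIT_DATA_PORT_0, (uint8_t)(count & 0xff)); /* low byte first */
  io.outb(PIT_DATA_PORT_0, (uint8_t)(count >> 8));   /* then high byte */
}
```

Longer timeouts must be composed of multiple consecutive one-shots.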

The following things are worth mentioning with regard to IRQ handling:

* By default, no one (roottask included) has the right to handle interrupts.
  We have to explicitly grant ourselves the right to handle a particular
  interrupt by calling 'L4_AllowInterruptControl'.
* When calling 'L4_RegisterInterrupt', the kernel expects a real global
  thread ID, not the magic ID returned by 'L4_Myself()'.
* Interrupts are delivered in an asynchronous fashion by using OKL4's
  notification mechanism. To block for incoming asynchronous messages,
  the corresponding notification bit must be unmasked and notifications
  must be accepted.
* The interrupt-handler loop invokes two system calls per interrupt,
  'L4_ReplyWait' for blocking for the next interrupt and 'L4_AcknowledgeInterrupt'
  for interrupt acknowledgement. Both syscalls could be consolidated into a
  call of 'L4_AcknowledgeWaitInterrupt'.


Porting core
############

Now that we have discovered most of the functional prerequisites for running
Genode on OKL4, we can start porting Genode's core. I suggest taking
another platform's core version as a template. For OKL4, the 'base-pistachio'
version comes in handy. First, make a copy of 'src/core' to the 'base-okl4/'
repository. Then we revisit all individual files and remove all
platform-specific code with the goal to create a skeleton of core that
compiles successfully. Thereby, we can already apply some simple type
substitutions, for example by using the types declared in 'native_types.h'
we can avoid using platform-specific types such as 'L4_ThreadId_t'.

By trying to compile core, we will see that there are still a few framework
libraries missing, namely 'pager', 'lock', and 'raw_signal'. For resolving the
dependency on the _lock library_, we can use a simple spinlock implementation
as an intermediate step. The implementation at 'src/base/lock/lock.cc' looks
like this:
!#include <base/cancelable_lock.h>
!#include <cpu/atomic.h>
!
!using namespace Genode;
!
!Cancelable_lock::Cancelable_lock(Cancelable_lock::State initial)
!: _native_lock(UNLOCKED)
!{
!  if (initial == LOCKED)
!    lock();
!}
!
!void Cancelable_lock::lock()
!{
!  while (!cmpxchg(&_native_lock, UNLOCKED, LOCKED));
!}
!
!void Cancelable_lock::unlock()
!{
!  _native_lock = UNLOCKED;
!}
Note that this implementation does not fully implement the 'Cancelable_lock'
semantics but it is useful to get things started. The corresponding 'lib/mk/lock.mk'
can be based on another platform's variant:
!SRC_CC   = lock.cc
!vpath lock.cc $(REP_DIR)/src/base/lock
The OKL4-specific _signal library_ can be taken almost unmodified from
'base-pistachio/'. The _pager library_ is a bit more complicated because
it depends on 'ipc_pager.h' and the corresponding part of the ipc library,
which we have not implemented yet. However, based on the knowledge
gained from the 'okl4_06_pager' test, the adaptation of another platform's
implementation of 'src/base/ipc/pager.cc' becomes straightforward. For now,
it actually suffices to leave the functions in 'pager.cc' blank.

Once we get the skeleton of core linked, we can work on the OKL4-specific
code, starting with core's platform initialization in 'platform.cc'.
The first task is to configure core's memory allocators:

:'region_alloc': This is the allocator containing the virtual address
  regions that are usable within core. The boot-info parser reports these
  regions via the callbacks 'init_mem' and 'add_virt_mem'.
:'ram_alloc': This is the allocator containing the available physical
  memory pages. It must be initialized with the physical-memory ranges
  provided via the 'init_mem' and 'add_phys_mem' callbacks.
:'core_mem_alloc': This is an allocator for available virtual address
  ranges within core. In contrast to 'region_alloc' and 'ram_alloc', which
  both operate at page granularity, 'core_mem_alloc' can be used to
  allocate arbitrarily sized memory objects. The implementation uses
  'region_alloc' and 'ram_alloc' as back ends. The core-local mapping
  of physical memory pages to core's virtual address space is done in a
  similar way as practiced in the 'okl4_06_pager' test program.

When implementing the allocators, special care must be taken to make their
interfaces thread safe as they may be used concurrently by different core
threads. With the memory configuration in place, core passes the first
initialization steps and tries to initialize 'Core_env', which is a
core-specific variant of the Genode environment. A part of 'Core_env' is a
server activation, which is in fact a thread. Upon the creation of this thread,
the main thread of core will stop executing until the new thread's startup
protocol is finished. So we have to implement core's thread-creation facility,
which is 'platform_thread.cc'.
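
The thread-safety requirement for the allocators can be met by serializing
all calls through a lock. The following is a minimal sketch under stated
assumptions, not core's actual code: 'Bump_allocator', 'Spin_guard', and
'Synchronized' are hypothetical names, and 'std::atomic_flag' merely stands
in for a Genode lock to keep the example self-contained:
!#include <atomic>
!#include <cstddef>
!
!/* hypothetical stand-in for a simple allocator (not Genode's API) */
!struct Bump_allocator
!{
!  std::size_t next = 0, limit;
!
!  explicit Bump_allocator(std::size_t limit) : limit(limit) { }
!
!  /* on success, store the offset of the new block in 'out' */
!  bool alloc(std::size_t size, std::size_t *out)
!  {
!    if (size > limit - next) return false;
!    *out  = next;
!    next += size;
!    return true;
!  }
!};
!
!/* spin until the flag is acquired, release it at scope exit */
!class Spin_guard
!{
!  std::atomic_flag &_flag;
!
!  public:
!
!    explicit Spin_guard(std::atomic_flag &flag) : _flag(flag)
!    { while (_flag.test_and_set(std::memory_order_acquire)) { } }
!
!    ~Spin_guard() { _flag.clear(std::memory_order_release); }
!};
!
!/* wrapper that serializes all calls to the wrapped allocator */
!template <typename ALLOC>
!class Synchronized
!{
!  std::atomic_flag _flag = ATOMIC_FLAG_INIT;
!  ALLOC            _alloc;
!
!  public:
!
!    template <typename... ARGS>
!    explicit Synchronized(ARGS... args) : _alloc(args...) { }
!
!    bool alloc(std::size_t size, std::size_t *out)
!    {
!      Spin_guard guard(_flag);
!      return _alloc.alloc(size, out);
!    }
!};
A core thread would then use 'Synchronized<Bump_allocator>' in place of the
bare allocator. The wrapper leaves the allocation policy untouched and only
adds mutual exclusion around each call.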

After core successfully creates its secondary threads (called 'activation' and
'pager') and finishes the initialization of 'Core_env', it starts executing
the 'main' function, which uses plain Genode APIs such as 'env()->heap()'.
The heap, however, relies on a working 'env()->rm_session()' and
'env()->ram_session()'. As soon as core starts using its 'Env', it will use
'env()->rm_session()' to attach dataspaces to its local address space.
Therefore, we need a core-specific implementation of the 'Rm_session'
interface, which we call 'Core_rm_session'. Its 'attach()' function uses the
OKL4 kernel API to map the physical pages of a dataspace into core's local
address space.

With the working core environment, core will look for the binary of the init
process. Init is supplied to core as a boot module via the elfweaver mechanism
we just explored with the 'okl4_07_boot_info' test. Within core, all boot
modules are registered at an instance of the 'Rom_fs' class. Hence, we need to
call OKL4's boot-info parser with the right callback functions supplied and put
the collected information into 'Rom_fs'. It is useful to take the other
platforms as reference.


Starting init
#############

To enable core to successfully load and start the init process, we first need
to build the init binary. For compiling 'init', we have to implement the still
missing functionality of determining the parent capability in the startup code.
The needed function is called 'parent_cap()' and should be implemented in the
'_main' function. For OKL4, the implementation looks exactly like the Pistachio
version. On both kernels, the parent capability is supplied at predefined
locations declared in the linker script. The corresponding symbols are called
'_parent_cap_thread_id' and '_parent_cap_local_name'.
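
The lookup of the parent capability can be sketched as follows. This is a
simplified illustration, not the actual startup code: the type
'Parent_capability_stub' is hypothetical, the symbol types are an assumption,
and the two symbols are defined with made-up values only to keep the example
self-contained. In the real startup code, they are provided by the linker
script:
!/* provided by the linker script in reality, defined here for the sketch */
!unsigned long _parent_cap_thread_id  = 0x1234;  /* made-up value */
!unsigned long _parent_cap_local_name = 7;       /* made-up value */
!
!/* hypothetical minimal capability: destination thread plus local name */
!struct Parent_capability_stub
!{
!  unsigned long tid;
!  unsigned long local_name;
!};
!
!/* assemble the parent capability from the two predefined locations */
!static Parent_capability_stub parent_cap()
!{
!  Parent_capability_stub cap = { _parent_cap_thread_id,
!                                 _parent_cap_local_name };
!  return cap;
!}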

After having successfully started init, we can proceed with starting further
instances of init as children of the first instance. This can be achieved by
the following config file:

!<config>
!  <parent-provides>
!    <service name="ROM"/>
!    <service name="RAM"/>
!    <service name="CAP"/>
!    <service name="PD"/>
!    <service name="RM"/>
!    <service name="CPU"/>
!    <service name="LOG"/>
!  </parent-provides>
!  <default-route>
!    <any-service> <parent/> </any-service>
!  </default-route>
!  <start name="init.1">
!    <binary name="init"/>
!    <resource name="RAM" quantum="5M"/>
!  </start>
!  <start name="init.2">
!    <binary name="init"/>
!    <resource name="RAM" quantum="5M"/>
!    <config>
!      <parent-provides>
!        <service name="ROM"/>
!        <service name="RAM"/>
!        <service name="CAP"/>
!        <service name="PD"/>
!        <service name="RM"/>
!        <service name="CPU"/>
!        <service name="LOG"/>
!      </parent-provides>
!      <default-route>
!        <any-service> <parent/> </any-service>
!      </default-route>
!      <start name="init.2.1">
!        <binary name="init"/>
!        <resource name="RAM" quantum="2M"/>
!      </start>
!      <start name="init.2.2">
!        <binary name="init"/>
!        <resource name="RAM" quantum="2M"/>
!      </start>
!    </config>
!  </start>
!</config>

To successfully execute the creation of this nested process tree, we need
a correct implementation of the 'unmap' functionality within core.
Furthermore, when starting multiple processes, we will soon run into the
problem of starting too many threads in core. This is caused by the default
implementation of Genode's signal API. Within core, each
'Rm_session_component' is a signal transmitter, used for signalling
address-space faults. With the default implementation, each signal transmitter
employs one thread. Because OKL4's roottask is limited to 8 threads, the
number of RM sessions becomes quite limited. Therefore, we disable signal
support on OKL4 for now by means of a dummy implementation of the signal
interface. Later, we can create an OKL4-specific signal implementation, which
will hopefully be able to utilize OKL4's asynchronous notification mechanism.


Hardware access and the Genode demo scenario
############################################

The default demo scenario of Genode requires hardware access performed by the
following components:

* The timer driver needs access to a hardware timer. On x86, the programmable
  interval timer (PIT) is available for this use case.
  However, for the first version of Genode on OKL4, we can use a simple dummy
  driver that ignores the argument of 'msleep' and just returns.

* The PS/2 driver and the timer driver rely on interrupts. We already exercised
  interrupt handling in 'okl4_08_timer_pit'. So it is relatively straightforward
  to implement the IRQ service in core, taking the other platforms such as
  Pistachio as reference.

* The VESA driver requires several hardware facilities, in particular access
  to the VGA registers via I/O ports, the frame buffer via memory-mapped I/O,
  and other resources such as the PIT (at least some VESA BIOSes rely on the
  PIT to implement proper delays during the PLL initialization).
  However, with a working implementation of the I/O-port service and
  I/O-memory service in core, these requirements are satisfied.
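
When the dummy timer driver is later replaced by one that uses the PIT, a
requested delay has to be converted into PIT ticks. The following sketch shows
only this arithmetic. The function name 'pit_ticks_for_ms' is hypothetical. The
constants are the PIT's input frequency of 1193182 Hz and the limit of its
16-bit counter, which bounds a single programmed interval to roughly 54 ms:
!#include <cstdint>
!
!enum { PIT_HZ = 1193182, PIT_MAX_COUNT = 65535 };
!
!/* convert milliseconds to PIT ticks, clamped to the 16-bit counter range */
!static uint16_t pit_ticks_for_ms(unsigned ms)
!{
!  uint64_t ticks = (uint64_t)PIT_HZ * ms / 1000;
!  return ticks > PIT_MAX_COUNT ? (uint16_t)PIT_MAX_COUNT : (uint16_t)ticks;
!}
Longer delays would have to be composed of several counter periods, which is
left out of this sketch.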

If all the hardware-access services within core are in place, we should be able
to start 'fb_drv', 'ps2_drv', 'nitpicker', and 'launchpad'. Furthermore,
starting and killing an additional 'testnit' process via the launchpad should
work. However, we will observe that starting another instance of testnit after
killing it will not work. In order to fully support restartable components,
we have to implement thread destruction and the cancel-blocking mechanism
within core. The interesting bits about thread destruction are
'Platform_thread::unbind' and 'Platform_pd::_destroy_pd'. For implementing the
cancel-blocking mechanism, we have to revisit core's
'Platform_thread::cancel_blocking', the IPC framework ('src/base/ipc/ipc.cc'),
and the lock implementation ('src/base/lock/lock.cc').

With this work done, we are able to run the full Genode demonstration scenario
including the Scout tutorial browser, user-level device drivers for PS/2
input and video, and the dynamic creation and destruction of process trees.


Outlook
#######

We consider the result of the porting work as described in this article as the
first working version of Genode on OKL4. Of course, there are several areas
for possible improvements, which we will address in a demand-driven way.
The following list gives some hints:

* Exploring OKL4's kernel mutex for Genode's lock implementation,
  paying special attention to the cancel-blocking semantics
* Increasing the flexibility of the UTCB allocator in core. Right now, the UTCB
  area of each PD is equally sized, defined by the 'THREAD_BITS' definition.
  In the future, we could support differently sized UTCB areas to tailor the
  number of threads per protection domain.
* Checking the privileges of non-core tasks
* Supporting RM faults and nested region-manager sessions
* Replacing the dummy timer implementation with a proper PIT-based
  timer
* Virtualizing the PIT in the VESA frame-buffer driver. Otherwise,
  the PIT-based timer service won't be usable because both
  components need access to the PIT. Fortunately, the VESA BIOS of Qemu
  does not access the PIT but we are aware that other BIOSes do.
* Optimizing I/O port access. Right now, we perform an RPC
  to core for each I/O port access, which is acceptable for the other
  platforms because I/O ports are rarely used (mostly for the PS/2 driver,
  but at a low rate). On OKL4, however, we provide the user-level time source
  via the timer driver that accesses the PIT via I/O ports. We could
  optimize these accesses by lazily mapping the I/O ports from core to
  the timer driver the first time an RPC to the I/O-port service is
  performed.
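
The lazy I/O-port mapping mentioned in the last item can be sketched as
follows. This only illustrates the intended policy, not an existing
implementation: the class 'Lazy_io_port' and its helpers are hypothetical, and
the RPC to core and the direct port access are replaced by stand-ins that
merely count the round trips:
!/* hypothetical client-side I/O-port accessor illustrating lazy mapping */
!class Lazy_io_port
!{
!  bool     _mapped    = false;
!  unsigned _rpc_calls = 0;  /* number of round trips to core */
!
!  /* stand-in for the RPC to core, which would also map the ports */
!  unsigned char _rpc_inb(unsigned short)
!  {
!    ++_rpc_calls;
!    _mapped = true;
!    return 0;
!  }
!
!  /* stand-in for a direct 'in' instruction on the mapped port */
!  unsigned char _direct_inb(unsigned short) { return 0; }
!
!  public:
!
!    /* only the first access takes the slow path through core */
!    unsigned char inb(unsigned short port)
!    {
!      return _mapped ? _direct_inb(port) : _rpc_inb(port);
!    }
!
!    unsigned rpc_calls() const { return _rpc_calls; }
!};
With this policy, a timer driver that polls the PIT would pay for a single RPC
and perform all subsequent accesses locally.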