mirror of
https://github.com/genodelabs/genode.git
synced 2025-01-06 22:08:46 +00:00
1143 lines
51 KiB
Plaintext
1143 lines
51 KiB
Plaintext
|
||
|
||
=========================================================
|
||
Genode on seL4 - Building a simple root task from scratch
|
||
=========================================================
|
||
|
||
|
||
Norman Feske
|
||
|
||
|
||
This document is a loose collection of notes about the exploration of the
|
||
seL4 and the port of the Genode system to this kernel. The seL4 kernel is
|
||
a modern microkernel jointly developed by NICTA and General Dynamics.
|
||
|
||
:[http://sel4.systems]:
|
||
|
||
Website of the seL4 project
|
||
|
||
|
||
A fresh Genode source-code repository
|
||
#####################################
|
||
|
||
Following the convention to host each kernel in its dedicated _base-<kernel>_
|
||
source repository, the seL4-specific code will go to _base-sel4_. This way,
|
||
we can cleanly hold the seL4-specific code apart from generic Genode code.
|
||
|
||
For the start, the new repository will contain two things: This notes
|
||
document at _doc/_ and the port-description file for downloading the seL4
|
||
kernel _(ports/sel4.port)_ accompanied with the corresponding hash file
|
||
_(ports/sel4.hash)_.
|
||
|
||
Since seL4 is hosted on GitHub, writing a port-description file is easy.
|
||
We can simply use _base-nova/ports/nova.port_ as a template and adapt it:
|
||
|
||
! LICENSE := GPLv2
|
||
! VERSION := git
|
||
! DOWNLOADS := sel4.git
|
||
!
|
||
! URL(sel4) := https://github.com/seL4/seL4.git
|
||
! # experimental branch
|
||
! REV(sel4) := b6fbb78cb1233aa8549ea3acb90524306f49a8d2
|
||
! DIR(sel4) := src/kernel/sel4
|
||
|
||
There are two branches of seL4. The master branch is the stable branch
|
||
that contains the formally verified version of the kernel. The experimental
|
||
branch contains features that are not yet ready to be included in the
|
||
master branch. Among those features is the support for virtualization.
|
||
To anticipate current developments of the seL4 community, I am going with
|
||
the experimental branch. For the ports file, I pick the commit ID of the
|
||
most recent commit of the experimental branch at the time of porting. Once,
|
||
the _ports/sel4.port_ file is in place, we can generate the matching
|
||
port hash as follows (from the base directory of the Genode source tree):
|
||
|
||
! # create an empty hash file
|
||
! touch repos/base-sel4/ports/sel4.hash
|
||
!
|
||
! # update the hash for the current version of the sel4.port file
|
||
! ./tool/ports/update_hash sel4
|
||
|
||
With the _sel4.port_ file in place, we can download the seL4 kernel via:
|
||
! ./tool/ports/prepare_port sel4
|
||
|
||
After invoking this command, the kernel source will be located at
|
||
_contrib/sel4-<hash>/src/kernel/sel4_.
|
||
|
||
|
||
Building the kernel for the first time
|
||
######################################
|
||
|
||
For getting acquainted with the code base, the README.md file provides a
|
||
good starting point. It seems to contain exactly the information that I need
|
||
at this point. As a first test, I am going to build the kernel for the pc99
|
||
platform using the Genode tool chain. According to the README, the following
|
||
command should work
|
||
|
||
! make TOOLPREFIX=/usr/local/genode-gcc/bin/genode-x86- \
|
||
! ARCH=ia32 PLAT=pc99
|
||
|
||
On the first attempt, the following error occurs:
|
||
|
||
! Traceback (most recent call last):
|
||
! File "./tools/invocation_header_gen.py", line 18, in <module>
|
||
! import tempita
|
||
! ImportError: No module named tempita
|
||
! make: *** [arch/api/invocation.h] Error 1
|
||
|
||
This problem could be easily solved by installing the 'python-tempita'
|
||
package. However, further down the road, the build process stops with
|
||
the following message:
|
||
|
||
! src/arch/ia32/kernel/boot_sys.c:75:26: error: ‘CONFIG_MAX_NUM_IOAPIC’ undeclared here (not in a function)
|
||
! src/plat/pc99/machine/hardware.c: In function ‘maskInterrupt’:
|
||
! src/plat/pc99/machine/hardware.c:36:9: error: implicit declaration of function ‘pic_mask_irq’ [-Werror=implicit-function-declaration]
|
||
! src/plat/pc99/machine/hardware.c:36:9: error: nested extern declaration of ‘pic_mask_irq’ [-Werror=nested-externs]
|
||
|
||
A 'grep' for 'MAX_NUM_IOAPIC' reveals that this definition is normally
|
||
generated by the kernel configuration program as there is a match in
|
||
_src/plat/pc99/Kconfig_. At this point, I am wondering if I am on the wrong
|
||
track altogether. On the seL4 download page, the kernel is mentioned as just
|
||
one of several "projects". But since it is the only one of the list of
|
||
projects that I am actually interested in, I wanted to focus on only this one.
|
||
But apparently, attempts to configure the kernel via 'make menuconfig' are not
|
||
supposed to work if checking out the repository in a free-standing fashion.
|
||
|
||
Sooner or later, I would have to look behind the curtain of the seL4 build
|
||
system. So why not now? Passing 'BUILD_VERBOSE=1 V=1' to 'make' is quite
|
||
helpful in this situation. A look into _include/config.h_ reveals that the
|
||
use of the autoconf-generated header _autoconf.h_ is optional. This is
|
||
nice. So I can leave the kernel-configuration magic out of the loop and just
|
||
manually specify the config definitions at the build command line. After
|
||
a few iterations, I came up with the following command-line arguments:
|
||
|
||
! make TOOLPREFIX=/usr/local/genode-gcc/bin/genode-x86- \
|
||
! ARCH=ia32 PLAT=pc99 BUILD_VERBOSE=1 V=1 \
|
||
! LDFLAGS+=-Wl,-melf_i386 \
|
||
! LDFLAGS+=-nostdlib \
|
||
! LDFLAGS+=-Wl,-nostdlib \
|
||
! CFLAGS+=-m32 \
|
||
! CONFIG_KERNEL_EXTRA_CPPFLAGS+=-DCONFIG_MAX_NUM_IOAPIC=1 \
|
||
! CONFIG_KERNEL_EXTRA_CPPFLAGS+=-DCONFIG_IRQ_IOAPIC=1
|
||
|
||
The 'CONFIG_IRQ_IOAPIC' flag is needed to specify whether the legacy PIC
|
||
or the IOPIC should be used. It is picked up by for conditionally compiling
|
||
the code of _src/plat/pc99/machine/hardware.c_.
|
||
|
||
As the result of the build process, we get a freshly baked 'kernel.elf'
|
||
file.
|
||
|
||
Of course, we don't want Genode users to manually build the kernel in this
|
||
way. So we add a "kernel" target to our _base-sel4_ repository. The kernel
|
||
target comes in the form of a _src/kernel/target.mk_ file and a library
|
||
_lib/mk/x86_32/kernel.mk_. The _target.mk_ file is just a pseudo target
|
||
with a dependency on the _kernel.mk_ library. There may be multiple versions
|
||
of the _kernel.mk_ library. The build-system configuration determines the
|
||
version that is used. E.g., when we set up the build directory for x86_32,
|
||
the _lib/mk/x86_32/kernel.mk_ one will be used. The _kernel.mk_ file looks
|
||
as follows:
|
||
|
||
! SEL4_DIR = $(call select_from_ports,sel4)/src/kernel/sel4
|
||
!
|
||
! ifeq ($(called_from_lib_mk),yes)
|
||
! all: build_kernel
|
||
! endif
|
||
!
|
||
! LINKER_OPT_PREFIX := -Wl,
|
||
!
|
||
! build_kernel:
|
||
! $(VERBOSE)$(MAKE) TOOLPREFIX=$(CROSS_DEV_PREFIX) \
|
||
! ARCH=ia32 PLAT=pc99 \
|
||
! LDFLAGS+=-nostdlib LDFLAGS+=-Wl,-nostdlib \
|
||
! $(addprefix LDFLAGS+=$(LINKER_OPT_PREFIX),$(LD_MARCH)) \
|
||
! $(addprefix CFLAGS+=,$(CC_MARCH)) \
|
||
! CONFIG_KERNEL_EXTRA_CPPFLAGS+=-DCONFIG_MAX_NUM_IOAPIC=1 \
|
||
! CONFIG_KERNEL_EXTRA_CPPFLAGS+=-DCONFIG_IRQ_IOAPIC=1 \
|
||
! SOURCE_ROOT=$(SEL4_DIR) -f$(SEL4_DIR)/Makefile
|
||
|
||
The pseudo target _base-sel4/src/kernel/target.mk_ exists merely for making
|
||
the result from the seL4 build directory visible in the install directory
|
||
_(bin/)_.
|
||
|
||
! TARGET = sel4
|
||
! LIBS = kernel
|
||
!
|
||
! $(INSTALL_DIR)/$(TARGET):
|
||
! $(VERBOSE)ln -sf $(LIB_CACHE_DIR)/kernel/kernel.elf $@
|
||
|
||
Genode's build system works in two stages. At the first (light-weight) stage,
|
||
it merely determines library dependencies. At the second stage, the actual
|
||
build steps are performed. The condition around the 'all' target ensures that
|
||
the 'build_kernel' target will be visited only at the second build stage with
|
||
the current working directory set to the library location and all build
|
||
variables (such as CROSS_DEV_PREFIX) defined. Fortunately, the seL4 build
|
||
system supports out-of-tree builds by defining the SOURCE_ROOT variable.
|
||
|
||
To test drive the kernel target, we first need to create Genode build
|
||
directory prepared for seL4. Later, we will add seL4 support to the regular
|
||
'create_builddir' tool. But for now, it suffices to create one manually:
|
||
|
||
# Create a new directory with an _etc/_ subdirectory.
|
||
|
||
# Add an _etc/specs.conf_ file with the following content:
|
||
! SPECS = sel4 x86_32
|
||
This tells the build system the so-called build specification.
|
||
|
||
# Add an _etc/build.conf_ file with the following content:
|
||
! GENODE_DIR := /path/to/your/genode.git
|
||
! BASE_DIR := $(GENODE_DIR)/repos/base
|
||
! REPOSITORIES := $(GENODE_DIR)/repos/base-sel4 \
|
||
! $(GENODE_DIR)/repos/base
|
||
! CONTRIB_DIR := $(GENODE_DIR)/contrib
|
||
|
||
GENODE_DIR must point to the root of Genode's source tree.
|
||
BASE_DIR is used by the build system to find the build-system scripts.
|
||
The CONTRIB_DIR is needed to enable the build system to find the location
|
||
of the seL4 source code. The REPOSITORIES declaration lists all source
|
||
repositories that should be included in the build. Note that the order
|
||
is important. By listing 'base-sel4' before 'base', we allow 'base-sel4'
|
||
to override the content of the 'base' repository.
|
||
|
||
# Symlink the _<genode-dir>/too/builddir/build.mk_ file to a _Makefile_.
|
||
! ln -s <genode-dir>/tool/builddir/build.mk Makefile
|
||
|
||
With the build directory, we can trigger the kernel build via 'make kernel'.
|
||
The seL4 build process will be invoked from within the kernel library. Hence,
|
||
the library's directory _var/libcache/kernel_ will be used as the build
|
||
directory of the kernel.
|
||
|
||
|
||
Starting the kernel in Qemu
|
||
###########################
|
||
|
||
To test-drive the kernel, we need to create a bootable image including a boot
|
||
loader, the boot-loader configuration, and the kernel. To spare us the
|
||
manual work, Genode comes with a so-called run tool that automates this
|
||
procedure. We merely need to supplement the run tool with a script snippet
|
||
that describes how a boot image is assembled for the seL4 base platform.
|
||
This snippet has to reside at _tool/run/boot_dir/sel4_. To create it, we can
|
||
take inspiration from the versions for the other platforms such as NOVA. It
|
||
comes down to implementing the function 'run_boot_dir' according to our needs.
|
||
For the most part, the function contains the generation of the boot-loader
|
||
configuration, i.e., GRUB's _menu.lst_ file. At the current stage, we simply
|
||
load the seL4 ELF image as multi-boot kernel. For using the run tool for the
|
||
initial test, the following small run script at _base-sel4/run/test.run_ does
|
||
the trick:
|
||
|
||
! create_boot_directory
|
||
! build_boot_image ""
|
||
! append qemu_args " -nographic -m 64 "
|
||
! run_genode_until forever
|
||
|
||
To invoke it, all we have to do is issuing 'make run/test' from the build
|
||
directory. All we see, however, is, well, nothing.
|
||
|
||
Normally we would expect from the kernel to print a life sign when booting
|
||
up. This is suggested by a 'printf' in _arch/arm/kernel/boot.c_. So the lack
|
||
of output is most certainly a configuration issue. A look in
|
||
_include/plat/pc99/plat/machine/io.h_ indicates that 'kernel_putchar', which
|
||
is used as the back end of 'printf' _(src/machine/io.c)_ is compiled-in only
|
||
if DEBUG mode is enabled. Adding 'DEBUG=1' to the sel4 build command (in our
|
||
_lib/mk/x86_32/kernel.mk_ file) enables this mode.
|
||
|
||
But enabling the DEBUG option comes with a strange surprise. The kernel build
|
||
would fail with the following error:
|
||
|
||
! kernel.o: In function `insert_dev_p_reg':
|
||
! kernel_final.c:(.boot.text+0x131a): undefined reference to `putchar'
|
||
|
||
The message is strange because the seL4 source code is void of any call to
|
||
'putchar'. It turned out that the compiler automatically turned a 'printf'
|
||
with a one-character string ("/n") into a call of 'putchar'. Fortunately,
|
||
this built-in heuristic can be disabled by passing '-fno-builtin-printf'
|
||
to the CFLAGS when compiling seL4. With this fix, we can successfully compile
|
||
the kernel in debug mode, boot it in Qemu, and observe the first life sign:
|
||
|
||
! Boot config: parsing cmdline '/sel4'
|
||
! Boot config: console_port of node #0 = 0x3f8
|
||
! Boot config: debug_port of node #0 = 0x3f8
|
||
! Boot config: max_num_nodes = 1
|
||
! Boot config: num_sh_frames = 0x0
|
||
! seL4 failed assertion '_ndks_end - _ndks_start <= NDKS_SIZE'
|
||
! at src/arch/ia32/kernel/boot_sys.c:424 in function try_boot_sys
|
||
|
||
The NDKS section apparently contains those parts of the image that should
|
||
always be mapped at the upper part of the virtual address space. The assertion
|
||
triggers because the section of the binary is larger (40900 bytes) than the
|
||
assumed limit NDKS_SIZE (12288 bytes). According to the linker script
|
||
_(src/plat/pc99/linker.lds)_, the section includes the kernel stack, the BSS,
|
||
and "COMMON" symbols. When looking at the list of those symbols using 'nm |
|
||
sort', the 'ksReadyQueues' symbol raises my eyebrows because it is quite
|
||
large (32768 bytes) compared to all others. This symbol belongs to an array,
|
||
which is dimensioned as CONFIG_NUM_DOMAINS * CONFIG_NUM_PRIORITIES. As we are
|
||
using the default configuration values of _include/config.h_,
|
||
CONFIG_NUM_DOMAINS is 16 and CONFIG_NUM_PRIORITIES is 256. Hence, the array
|
||
has 4096 entries with 8 bytes for each entry ('tcb_queue_t' with two
|
||
pointers). Interestingly, the default value of NUM_DOMAINS in the _Kconfig_ is
|
||
1, which would result in a much smaller 'ksReadyQueues' array. In fact,
|
||
passing the following argument to the kernel build fixes the assertion:
|
||
|
||
! CONFIG_KERNEL_EXTRA_CPPFLAGS+=-DCONFIG_NUM_DOMAINS=1
|
||
|
||
Now, the kernel bootstraps successfully, detects one CPU and tries to
|
||
obtain information about the boot modules, which we don't have provided yet.
|
||
Hence, the kernel backs out with the following message:
|
||
|
||
! Boot loader did not provide information about boot modules
|
||
! seL4 called fail at src/arch/ia32/kernel/boot_sys.c:711
|
||
! in function boot_sys, saying "boot_sys failed for some reason :(
|
||
|
||
Well, this is expected. So it is time to take the first baby step of providing
|
||
a root task to the system.
|
||
|
||
|
||
A root task for exercising the kernel interface
|
||
###############################################
|
||
|
||
At this point, there are two options. We could either start with a very
|
||
minimalistic hello-world program that is completely unrelated to Genode. Such
|
||
a root-task from scratch could be used to explore the individual system calls
|
||
before jumping into Genode. The second option would be to go directly for
|
||
a Genode program, which includes a proper C++ runtime and Genode's generic
|
||
linker script.
|
||
|
||
I went for the latter option and created a simple test program at
|
||
_base-sel4/src/test/sel4/_. The _target.mk_ file looks as follows:
|
||
|
||
! TARGET = test-sel4
|
||
! SRC_CC = main.cc
|
||
! LIBS = cxx startup
|
||
|
||
The 'main' function of the _main.cc_ file does not much except for writing
|
||
at an illegal (yet known) address:
|
||
|
||
! int main()
|
||
! {
|
||
! *(int *)0x1122 = 0;
|
||
! return 0;
|
||
! }
|
||
|
||
When attempting to build the program by issuing 'make test/sel4' from within
|
||
our build directory, the build system will attempt to compile the C++ runtime,
|
||
which, in turn, depends on a few Genode headers. Some headers are readily
|
||
provided by the _base/_ repository but others are expected to be supplied
|
||
by the _base-<kernel>_ repository. The compile errors look like this:
|
||
|
||
! COMPILE malloc_free.o
|
||
! In file included from repos/base/include/parent/capability.h:17:0,
|
||
! from repos/base/include/base/env.h:20,
|
||
! from repos/base/src/base/cxx/malloc_free.cc:17:
|
||
! repos/base/include/base/capability.h:21:31:
|
||
! fatal error: base/native_types.h: No such file or directory
|
||
|
||
For now, we can supply a dummy version of this header, which contain
|
||
preliminary type definitions. To see, which types are expected from which
|
||
header file, we can take cues from the other _base-<kernel>_ platforms.
|
||
|
||
At link time, we will observe plenty of undefined references originating
|
||
from the startup code. Most of those references concern basic data
|
||
structures such as the AVL tree. Normally, those symbols are provided by
|
||
the so-called _base-common_ library, which is used both core and non-core
|
||
programs. We can take a copy of a _base-common.mk_ file from one of the
|
||
other base platforms as a starting point. I decided to go for the version
|
||
provided by _base-nova_. With the base-common library present, we can
|
||
replace the 'cxx' and 'startup' libs by the base-common library in the
|
||
'LIBS' declaration of our test-sel4 program. Because base-common already
|
||
depends on both 'cxx' and 'startup', there is no need to specify those
|
||
libraries twice.
|
||
|
||
On the attempt to compile the base-common library, we will stumble over
|
||
further missing headers such as _ipc_msgbuf.h_. Here the _base-linux_
|
||
version of this file becomes handy because it is free-standing.
|
||
|
||
When compiling _lock.cc_, the compiler complaints about the missing
|
||
_lock_helper.h_ file. This platform-specific file that is internally used by
|
||
the lock implementation is normally expected
|
||
at _base-<kernel>/src/base/lock/_. For now, just for getting the binary
|
||
linked, we provide empty dummies:
|
||
|
||
! static inline void thread_yield() { }
|
||
! static inline void thread_switch_to(Genode::Thread_base *thread_base) { }
|
||
! static inline void thread_stop_myself() { }
|
||
!
|
||
! static inline bool thread_check_stopped_and_restart(Genode::Thread_base *)
|
||
! {
|
||
! return false;
|
||
! }
|
||
|
||
The next missing piece is the platform-specific implementation of the IPC
|
||
API _(ipc/ipc.cc and ipc/pager.cc)_, which are expected to reside at
|
||
_base-sel4/src/base/ipc/_. We just take the simplest version of the other
|
||
base platforms (in this case _base-codezero/src/base/ipc/_) and strip it down
|
||
to become a mere dummy.
|
||
|
||
By adding the base-common library, the number of unresolved references
|
||
decreased by a great amount. Some functions are still unresolved.
|
||
There are many references to 'printf', which is not part of the base-common
|
||
library because core uses a different back end (typically a kernel debugger)
|
||
than non-core processes (using a session to core's LOG service). Because
|
||
our test plays the role of a root task, we can include core's version by
|
||
adding _base/src/base/console/core_printf.cc_ as source to our target
|
||
description file.
|
||
|
||
! SRC_CC += base/console/core_printf.cc
|
||
! vpath %.cc $(BASE_DIR)/src
|
||
|
||
The back end of _core_printf.cc_ has to be provided by a header at
|
||
_base-sel4/src/base/console/core_console.h_. For now, we just provide
|
||
an empty implementation of '_out_char'. To let the compiler find the
|
||
header, we need to extend the include search path as follows:
|
||
|
||
! INC_DIR += $(REP_DIR)/src/base/console
|
||
|
||
For the unresolved references of 'env_context_area_ram_session' and
|
||
'env_context_area_rm_session', the following dummy will do:
|
||
|
||
! #include <ram_session/ram_session.h>
|
||
! #include <rm_session/rm_session.h>
|
||
!
|
||
! namespace Genode {
|
||
! Rm_session *env_context_area_rm_session() { return nullptr; }
|
||
! Ram_session *env_context_area_ram_session() { return nullptr; }
|
||
! }
|
||
|
||
The remaining piece of the puzzle is the 'Genode::env()' function, which
|
||
is an accessor to the Genode environment. Because the Genode environment does
|
||
not exist yet, we provide a dummy 'env' called _mini_env_. Initially, this
|
||
implementation of the 'Genode::Env' interface merely returns dummy values
|
||
(null pointers and invalid capabilies).
|
||
|
||
After this step, the test-sel4 target links successfully. To use it as
|
||
roottask, we have to modify our _test.run_ script by
|
||
|
||
# Adding a build step
|
||
! build { test/sel4 }
|
||
# Specifying the test-sel4 binary as boot module so that the our run
|
||
environment includes it in the boot image and appends the file as module
|
||
to the boot-loader configuration.
|
||
! build_boot_image "test-sel4"
|
||
|
||
When issuing 'make run/test' now, we get the following messages from the
|
||
kernel:
|
||
|
||
! Detected 1 boot module(s):
|
||
! module #0: start=0x169000 end=0x186754 size=0x1d754
|
||
! name='/genode/test-sel4'
|
||
! ELF-loading userland images from boot modules:
|
||
! module #0 for node #0: size=0x27000 v_entry=0x136cc
|
||
! v_start=0x0 v_end=0x27000
|
||
! p_start=0x187000 p_end=0x1ae000
|
||
! Moving loaded userland images to final location:
|
||
! from=0x187000 to=0x15e000 size=0x27000
|
||
!
|
||
! Starting node #0
|
||
! Caught cap fault in send phase at address 0x0
|
||
! while trying to handle:
|
||
! vm fault on data at address 0x9090c3fb with status 0x4
|
||
! in thread 0xe0189880 at address 0x314a
|
||
|
||
Assuming that 'v_start' and 'v_end' stand for the virtual address range
|
||
of the loaded binary, those numbers look weird. Normally, Genode binaries
|
||
are linked at a much higher address, e.g., 0x1000000. By inspecting the
|
||
binary via the readelf command, it turns out that we haven't yet declared the
|
||
link address for the seL4 platform. So its time to introduce a so-called spec
|
||
file for the "sel4" build-spec value. The new file _base-sel4/mk/spec-sel4.mk_
|
||
will be incorporated into the build process whenever the 'SPEC' declaration of
|
||
the build directory has the value "sel4" listed. To define the default link
|
||
address for the platform, the file has the following content:
|
||
|
||
! LD_TEXT_ADDR ?= 0x01000000
|
||
|
||
When issuing 'make test/sel4 VERBOSE=', we can see the link address specified
|
||
at the linker command line. Another look at the binary via readelf confirms
|
||
that the location of the text segment looks good now. Re-executing the run
|
||
script produces the same result as before though. But the last message is
|
||
quite helpful:
|
||
|
||
! vm fault on data at address 0x0 with status 0x4
|
||
! in thread 0xe0189880 at address 0x100312b
|
||
|
||
The last address lies well within the text segment of our binary. It seems
|
||
that the kernel has kicked off the execution of our root task (presumably
|
||
at the entry point address as found in the ELF binary). At some point, our
|
||
"root task" de-references a null pointer. Given that several of the dummy
|
||
functions that we just created, return null pointers, this is hardly a
|
||
surprise. So let us have a look how far we have come by inspecting the
|
||
fault address 0x100312b using
|
||
! objdump -lSd bin/test-sel4 | less
|
||
In less, we search for the pattern "100312b". We see that the surrounding
|
||
code belongs to the heap implementation ('Heap::_allocate_dataspace'). It
|
||
seems that someone is trying to use 'Genode::env()->heap()'. To confirm
|
||
this assumption, we can use Qemu's GDB stub to obtain a backtrace. We have
|
||
to take the following steps:
|
||
|
||
# We want to halt the execution at the point where the fault would happen
|
||
instead of triggering a page fault. The easiest way to accomplish that
|
||
is to insert an infinite loop at the right spot. The infinite loop serves
|
||
us as a poor man's break point. In our case, we add a 'for (;;);' statement
|
||
right at the beginning of the '_allocate_dataspace' function in
|
||
_repos/base/src/base/heap/heap.cc_. When re-executing the run script after
|
||
this change, we can see that the fault message won't appear.
|
||
|
||
# We cancel the execution of the run script and start Qemu manually using
|
||
the command-line arguments that are found in the log output (the message
|
||
"spawn qemu ...". To enable Qemu's GDB stub, we have to append "-s" as
|
||
argument.
|
||
! qemu-system-i386 -nographic -m 64 -cdrom var/run/test.iso -s
|
||
|
||
# When qemu is running, we start GDB from another shell (changing to
|
||
our build directory) as follows:
|
||
! gdb bin/test-sel4 -ex "target remote :1234"
|
||
|
||
# Listing the backtrace via the 'bt' command is quite revealing (output
|
||
slightly edited for brevity):
|
||
|
||
! (gdb) bt
|
||
! #0 Genode::Heap::_allocate_dataspace
|
||
! at genode/repos/base/src/base/heap/heap.cc:57
|
||
! #1 0x010030b9 in Genode::Heap::_unsynchronized_alloc
|
||
! at genode/repos/base/src/base/heap/heap.cc:181
|
||
! #2 0x01003155 in Genode::Heap::alloc
|
||
! at genode/repos/base/src/base/heap/heap.cc:199
|
||
! #3 0x01010732 in malloc
|
||
! #4 0x0100f539 in start_fde_sort
|
||
! #5 init_object
|
||
! at ../../../../../../contrib/gcc-4.7.2/libgcc/unwind-dw2-fde.c:768
|
||
! #6 search_object
|
||
! at ../../../../../../contrib/gcc-4.7.2/libgcc/unwind-dw2-fde.c:958
|
||
! #7 0x01010375 in _Unwind_Find_registered_FDE
|
||
! at ../../../../../../contrib/gcc-4.7.2/libgcc/unwind-dw2-fde.c:1022
|
||
! #8 _Unwind_Find_FDE
|
||
! at ../../../../../../contrib/gcc-4.7.2/libgcc/unwind-dw2-fde-dip.c:440
|
||
! #9 0x0100d4c3 in uw_frame_state_for
|
||
! #10 0x0100dff8 in uw_init_context_1
|
||
! at ../../../../../../contrib/gcc-4.7.2/libgcc/unwind-dw2.c:1500
|
||
! #11 0x0100e3ea in _Unwind_RaiseException
|
||
! #12 0x01011f51 in __cxa_throw ()
|
||
! #13 0x01013751 in init_main_thread ()
|
||
! at genode/repos/base/src/platform/init_main_thread.cc:119
|
||
! #14 0x010134fe in _start ()
|
||
! at genode/repos/base/src/platform/x86_32/crt0.s:47
|
||
|
||
We can see that Genode's startup code tries to throw the first C++ exception.
|
||
There is a lengthy comment at the corresponding code portion that explains
|
||
the rationale behind throwing an exception right from the startup code.
|
||
The exception handling, in particular the stack unwinding) is done by the
|
||
GCC support library, which eventually calls 'malloc' to allocate some
|
||
backing store for metadata. Genode has no 'malloc' function. But for making
|
||
the GCC support library happy (which expects a C runtime to be present), our
|
||
Genode-specific C++ support code in the form of the 'cxx' library comes with
|
||
a library-local version of malloc _(base/src/base/cxx/malloc_free.cc)_. This
|
||
malloc implementation uses a separate 'Genode::Heap' instance. The instance
|
||
has an initial capacity of 512 bytes. If the allocations exceed this capacity,
|
||
this heap will try to expand using 'env()->ram_session()' as backing store.
|
||
This is why the '_allocate_dataspace' function was called. At the current
|
||
stage, we don't have an implementation of 'env()->ram_session()'. To work
|
||
around this early allocation issue, we can simply increase the capacity of
|
||
the 'initial_block', let's say by factor 10. When re-executing the run script
|
||
now, '_allocate_dataspace' won't be called. Hence, we execution won't get
|
||
stuck in our infinite loop (we can remove the loop either way now). Instead,
|
||
we get another null-pointer dereference:
|
||
|
||
! vm fault on data at address 0x0 with status 0x4
|
||
! in thread 0xe0189880 at address 0x1003ab7
|
||
|
||
The procedure to investigate the reason for this page fault is exactly the
|
||
same as for the first one, using objdump, infinite loops, and Qemu's GDB stub.
|
||
|
||
This time, the null-pointer dereference occurs in
|
||
_base/src/base/thread/thread.cc_ on the attempt to allocate so-called
|
||
thread context for the main thread. Such thread contexts contain the
|
||
stacks and meta data of threads. They are placed in a dedicated virtual
|
||
memory area that is manually managed by the process.
|
||
|
||
For our minimalistic root task, we don't have Genode's address-space
|
||
management facilities at our disposal yet. So we have to side-step the
|
||
'_alloc_context' function somehow. This can be accomplished by using
|
||
custom version of _thread.cc_ instead of the one provided by the _base/_
|
||
repository. So we remove _thread.cc_ from the base-common library for now
|
||
and add a new _thread.cc_ file to our test-sel4 target. The following
|
||
dummy stub will do for now:
|
||
|
||
! static Thread_base::Context *main_context()
|
||
! {
|
||
! enum { STACK_SIZE = sizeof(long)*4*1024 };
|
||
!
|
||
! static long buffer[STACK_SIZE/sizeof(long)];
|
||
!
|
||
! /* the context is located beyond the top of the stack */
|
||
! addr_t const context_addr = (addr_t)buffer + sizeof(buffer)
|
||
! - sizeof(Thread_base::Context);
|
||
!
|
||
! Thread_base::Context *context = (Thread_base::Context *)context_addr;
|
||
! context->stack_base = (addr_t)buffer;
|
||
! return context;
|
||
! }
|
||
!
|
||
! Thread_base *Thread_base::myself() { return nullptr; }
|
||
!
|
||
! Thread_base::Thread_base(const char *name, size_t stack_size, Type type,
|
||
! Cpu_session *cpu_session)
|
||
! :
|
||
! _cpu_session(cpu_session), _context(main_context())
|
||
! {
|
||
! strncpy(_context->name, name, sizeof(_context->name));
|
||
! _context->thread_base = this;
|
||
!
|
||
! _init_platform_thread(type);
|
||
! }
|
||
!
|
||
! Thread_base::Thread_base(const char *name, size_t stack_size, Type type)
|
||
! : Thread_base(name, stack_size, type, nullptr) { }
|
||
!
|
||
! Thread_base::~Thread_base() { _deinit_platform_thread(); }
|
||
|
||
Genode's startup code will change the stack of the main thread prior calling
|
||
the main function. This way, the stack can be placed at a dedicated area of
|
||
the virtual address space (thread-context area). By not placing the stack
|
||
close to the program image, stack overflows won't silently corrupt data but
|
||
trigger a page fault. The code above, however, allocates a dummy context for
|
||
the main thread in the BSS segment (the 'buffer' used as backing store for the
|
||
context is a static variable). This code above is intended as an interim
|
||
solution for the initialization of the main thread only.
|
||
|
||
When executing our run script again, we will get the following message:
|
||
|
||
! vm fault on data at address 0x1122 with status 0x6
|
||
! in thread 0xe0189880 at address 0x1000190
|
||
|
||
This message is caused by the main function of our test program! In this
|
||
function, we deliberately triggered a fault at the address 0x1122 via
|
||
the statement '*(int *)0x1122 = 0;'.
|
||
|
||
|
||
Issuing the first system call
|
||
#############################
|
||
|
||
We have successfully started our custom root task but we have not interacted
|
||
with the kernel yet. So it is time to take a look at seL4's system-call
|
||
interface. The interfaces comes in the form of several header files within
|
||
sel4's _libsel4/_ directory. At the first glance, the directory layout looks
|
||
straight forward. The generic parts reside in _libsel4/include/_ whereas
|
||
the architecture-depending parts are located at _libsel4/arch_include/_.
|
||
However, when skimming over the headers, it becomes apparent that some of
|
||
them are generated from XML files. Also, some headers are including
|
||
a top-level header '<autoconf.h>'.
|
||
|
||
To make sel4's kernel interface visible to the Genode world, we use a pseudo
|
||
library called _platform.mk_. The platform library is built before all other
|
||
libraries and targets and thereby gives us a hook to populate the build
|
||
directory with a custom include-directory structure. Because the selection of
|
||
the kernel-interface header depends on the architecture, we place the
|
||
_platform.mk_ file at _lib/mk/x86_32_. To create the platform library, we can
|
||
take OKL4's version as a blue print. When the library gets built, the include
|
||
directory structure will be created as a side effect. However, we cannot
|
||
implement the removal of the _include/_ directory of a side effect of a clean
|
||
rule because library description files have no clean rule (the build system
|
||
just wipes the respective library directory when cleaning). To complement the
|
||
creation of the _include/_ directory structure with a corresponding clean
|
||
rule, the _base-sel4/mk/spec-sel4.mk_ file can be extended with such a rule,
|
||
which will be globally visible.
|
||
|
||
The platform library is used as a mere hook to create the include directory
|
||
structure within the build directory. To allow a program to actually use
|
||
those headers, we'd need to extend the include-search path accordingly.
|
||
One way would be to have each target specify the build-directory's local
|
||
include path via 'INC_DIR += $(BUILD_BASE_DIR)/include'. Not too bad. However,
|
||
to make the use of the syscall headers more convenient, we introduce just
|
||
another library called _syscall_. The library is solely used for providing
|
||
a so-called library-import file to all targets that use the library. The
|
||
import file contains the extension of the include-search path. Additionally,
|
||
we extend the 'REP_INC_DIR' with the value "include/sel4". This way, we can
|
||
place custom headers (such as a version of _autoconf.h_) within one of the
|
||
Genode source repositories under _include/sel4/_. Those headers will appear
|
||
to be located at the inlude root scope for such targets. Speaking of
|
||
_autoconf.h_, this header is expected by some sel4 includes to distinguish
|
||
the debug mode from the non-debug mode. As we want to enable the debug
|
||
functionality, we supply our version of the _autoconf.h_ file at
|
||
_base-sel4/include/sel4/autoconf.h_ with the definition:
|
||
|
||
! #define SEL4_DEBUG_KERNEL 1
|
||
|
||
Besides the _autoconf.h_ file, the kernel-interface headers also require
|
||
an _stdint.h_ to be present at the include root scope. So we place a version
|
||
of this file at _base-sel4/include/sel4/stdint.h_.
|
||
|
||
! #include <base/fixed_stdint.h>
|
||
!
|
||
! typedef genode_uint32_t uint32_t;
|
||
!
|
||
! #ifndef NULL
|
||
! #define NULL ((void *)0)
|
||
! #endif
|
||
|
||
For trying out the access to the kernel-interface headers, we let our target
|
||
use the syscall library by extending the
|
||
_target.mk_ file with 'LIBS += syscall'. Then, we change the main program
|
||
as follows.
|
||
|
||
! /* Genode includes */
|
||
! #include <util/string.h>
|
||
!
|
||
! /* seL4 includes */
|
||
! #include <sel4/arch/syscalls.h>
|
||
!
|
||
! int main()
|
||
! {
|
||
! char const *string = "\nMessage printed via the kernel\n";
|
||
! for (unsigned i = 0; i < Genode::strlen(string); i++)
|
||
! seL4_DebugPutChar(string[i]);
|
||
!
|
||
! *(int *)0x1122 = 0;
|
||
! return 0;
|
||
! }
|
||
|
||
Here we see three things. First, we can use Genode's usual utilities such as
|
||
the string functions, which is quite convenient. Second, we include one of
|
||
seL4 headers. And third, we try to invoke the 'seL4_DebugPutChar' system call
|
||
to print a string before triggering the page fault at address 0x1122. Before
|
||
we can run the program, through, we just have to overcome yet another problem.
|
||
On the attempt to build it, we get the following message:
|
||
|
||
! COMPILE main.o
|
||
! In file included from include/sel4/arch/syscalls.h:15:0,
|
||
! from genode/repos/base-sel4/src/test/sel4/main.cc:18:
|
||
! include/sel4/types.h:16:28: fatal error: sel4/types_gen.h: No such file or directory
|
||
|
||
A look at the top-level _Makefile_ of seL4 reveals that this file is generated
|
||
from a "types.bf" file using a python script called _tools/bitfield_gen.py_.
|
||
So we have to add a rule for generating this file to our platform library.
|
||
|
||
! $(BUILD_BASE_DIR)/include/sel4/types_gen.h: $(LIBSEL4_DIR)/include/sel4/types.bf
|
||
! $(VERBOSE)python $(LIBSEL4_DIR)/tools/bitfield_gen.py \
|
||
! --environment libsel4 "$<" $@
|
||
|
||
The next missing header that we stumble over is _assert.h_. So we have to add
|
||
a simple version of _assert.h_ to _base-sel4/include/sel4/_. This version
|
||
uses Genode's PDBG facility to print error messages, which is, of course, not
|
||
expected to work yet.
|
||
|
||
On the next attempt to build the program, the compilation fails because
|
||
of a missing header again:
|
||
|
||
! In file included from include/sel4/arch/syscalls.h:15:0,
|
||
! from genode/repos/base-sel4/src/test/sel4/main.cc:18:
|
||
! include/sel4/types.h:17:26: fatal error: sel4/syscall.h: No such file or directory
|
||
|
||
The _libsel4/Makefile_ contains the rule to generate this file. At this point,
|
||
I am wondering whether to use this Makefile to add those rules to our
|
||
platform library. I decided for the latter because there are not too many
|
||
files to generate, I will need to look behind the scenes sooner or later
|
||
anyway, and I would need to supply some some additional environment (such
|
||
as providing a _common.mk_ file in order to invoke the original Makefile.
|
||
The rules for generating headers like _syscall.h_ look similar to the rule
|
||
for _types_gen.h_ above.
|
||
|
||
With _sel4/syscall.h_ in place, we get confronted with another problem:
|
||
|
||
! include/sel4/arch/syscalls.h: In function ‘void seL4_Send(seL4_CPtr, seL4_MessageInfo_t)’:
|
||
! include/sel4/arch/syscalls.h:32:26: error: ‘seL4_GetMR’ was not declared in this scope
|
||
|
||
Fortunately, this error is simply caused by a missing include directive of
|
||
_syscalls.h_. The 'seL4_GetMR' function is provided by _sel4/arch/functions.h_
|
||
but this file is never included except for _sel4/sel4.h_. I assume that seL4
|
||
users are expected to always include the _sel4/sel4.h_ file instead of
|
||
including individual kernel headers. By prepending the include of
|
||
'<sel4/arch/functions.h>' to the include of '<sel4/arch/syscalls.h>, the
|
||
problem goes away and clears the path for the next one:
|
||
|
||
! include/sel4/arch/syscalls.h: In function ‘void seL4_DebugPutChar(char)’:
|
||
! include/sel4/arch/syscalls.h:478:16: error: ‘seL4_SysDebugPutChar’ was not declared in this scope
|
||
|
||
This message is accompanied with similar errors for "seL4_SysDebugHalt",
|
||
"seL4_SysDebugSnapshot", and "seL4_SysDebugCapIdentify".
|
||
A look in the generated _include/sel4/syscall.h_ reveals that those
|
||
declarations exists only when building in DEBUG mode. Hence, we need
|
||
to add the following line to our _autoconf.h_ file:
|
||
|
||
! #define DEBUG 1
|
||
|
||
The next problem is more tricky:
|
||
|
||
! include/sel4/arch/syscalls.h: In function ‘int main()’:
|
||
! include/sel4/arch/syscalls.h:481:6: error: impossible register constraint in ‘asm’
|
||
|
||
The error refers to the system-call binding for 'seL4_DebugPutChar'. After
|
||
twiddling with the asm constraints, it turns out that the error is caused by
|
||
the use of an enum value as input argument. The C++ compiler is free to pick
|
||
any integer type that it sees fit for representing an enum value. Even though
|
||
the seL4 developers use a helper macro (SEL4_FORCE_LONG_ENUM) to force a
|
||
certain minimum bit width for the used type, the C++ compiler complains. By
|
||
explicitly casting the input argument to 'int', this ambiguity can be resolved
|
||
and the compiler becomes happy. Unfortunately, this means that I will have to
|
||
patch the system-call bindings to make them fit for the use with C++. But
|
||
looking at the bindings, I think that a patch won't be avoidable anyway
|
||
because the bindings clobber the EBX register. This means that we won't be
|
||
able to use the headers for compiling position-independent code (as is the
|
||
case for Genode). For now, we have are not compiling with '-fPIC' enabled but
|
||
this issue is clear in front of us.
|
||
|
||
Patches for the seL4 code will be reside at _base-sel4/src/kernel/_. E.g.,
|
||
we just added the current modification of the _syscalls.h_ header by
|
||
copying a git diff to the file _base-sel4/src/kernel/syscalls.patch_.
|
||
To apply the patch automatically when preparing the seL4 port, we need
|
||
to modify the _base-sel4/ports/sel4.port_ file by adding the following
|
||
lines:
|
||
|
||
! PATCHES := src/kernel/syscalls.patch
|
||
! PATCH_OPT := -p1 -d src/kernel/sel4
|
||
|
||
Since we modified the port-description file, we need to update the
|
||
accompanied hash via './tool/ports/update_hash sel4'.
|
||
|
||
_Edit: After consulting the seL4 mailing list, Adrian Danis pointed out that_
|
||
_the actual issue is an off-by-one bug in the SEL4_FORCE_LONG_ENUM macro._
|
||
_So instead of explicitly casting all opcodes to integers, the macro_
|
||
_can be fixed at one place. Hence, I replaced the syscalls.patch by a_
|
||
_macros.patch until the fix appears upstream._
|
||
|
||
Anyway, after all the steps, our test-sel4 program can be successfully
|
||
built. Executing the run script produces the result that we longed for:
|
||
|
||
! Message printed via the kernel
|
||
! Caught cap fault in send phase at address 0x0
|
||
! while trying to handle:
|
||
! vm fault on data at address 0x1122 with status 0x6
|
||
! in thread 0xe0189880 at address 0x10001d8
|
||
|
||
The first line is produced by our test program. Knowing how to print
|
||
characters using the kernel's debug interface, filling out the empty
|
||
stub function 'Genode::Core_console::_out_char' in _core_console.h_
|
||
is easy. We can replace the main program with this version:
|
||
|
||
! #include <base/printf.h>
|
||
!
|
||
! int main()
|
||
! {
|
||
! PDBG("a message printed via Genode's PDBG");
|
||
!
|
||
! *(int *)0x1122 = 0;
|
||
! return 0;
|
||
! }
|
||
|
||
When running it via 'make run/test', it produces the expected result:
|
||
|
||
! int main(): a message printed via Genode's PDBG
|
||
|
||
|
||
Exploration of the kernel interface
|
||
###################################
|
||
|
||
Now that we have a nice playground in place, it is time to explore the
|
||
actual kernel interface step by step. The goal is to get a tangible
|
||
feeling for the kernel and to exercise the functionality that is needed
|
||
by Genode. Those functionalities are:
|
||
|
||
* Access to boot information such as the memory layout and the boot modules
|
||
* Multi-threading
|
||
* Process-local synchronization
|
||
* Synchronous inter-process communication between threads
|
||
* Address-space creation
|
||
* Page-fault handling
|
||
|
||
At this point, it is useful to take a look at the excellent seL4 reference
|
||
manual to learn the concepts behind the kernel interface:
|
||
|
||
:[http://sel4.systems/Docs/seL4-manual.pdf]:
|
||
|
||
seL4 reference manual
|
||
|
||
|
||
Obtaining boot information
|
||
==========================
|
||
|
||
The manual mentions the function 'seL4_GetBootInfo' to obtain a pointer to
|
||
a boot-info structure. The function implementation, however, requires a
|
||
prior call of 'seL4_InitBootInfo' from the startup code. According to the
|
||
manual, the startup code gets the pointer passed in a CPU register. It
|
||
presumably registers this pointer via 'seL4_InitBootInfo'. Of course, we
|
||
don't want to change Genode's kernel-agnostic startup code by inserting
|
||
a call to the 'seL4_InitBootInfo' function. Fortunately, this is not the
|
||
first time, Genode has to pick up a register value passed by the kernel to
|
||
root task via a CPU register. The startup code already saves the initial
|
||
stack pointer, EAX and EDI. The seL4 manual does not state, which register
|
||
is used. In the kernel code ('create_initial_thread'), the register is denoted
|
||
at "capRegister". A look into _ia32/arch/machine/registerset.h_ reveals that
|
||
this register is actually EBX on the x86 architecture. Since EBX is not
|
||
saved by Genode's startup code yet, we need to enhance the startup code
|
||
a bit to save the initial EBX value in the variable '_initial_bx'.
|
||
|
||
The boot-info type is declared in _sel4/bootinfo.h_. When including the
|
||
header, we have to expand our _sel4/stdint.h_ version by a definition
|
||
of 'uint8_t'. Also, the header expects CONFIG_MAX_NUM_BOOTINFO_UNTYPED_CAPS
|
||
to be defined. We are just using the kernel's default config value, which
|
||
we copy to our _sel4/autoconf.h_ file.
|
||
|
||
! #define CONFIG_MAX_NUM_BOOTINFO_UNTYPED_CAPS 800
|
||
|
||
With those changes in place, we can access the boot info with a utility like
|
||
this:
|
||
|
||
! #include <sel4/bootinfo.h>
|
||
!
|
||
! static seL4_BootInfo const *boot_info()
|
||
! {
|
||
! extern Genode::addr_t __initial_bx;
|
||
! return (seL4_BootInfo const *)__initial_bx;
|
||
! }
|
||
|
||
While writing a simple 'dump_boot_info' function, I noticed that some boot-info
|
||
fields mentioned in the manual and present on the master branch, have
|
||
disappeared in the experimental branch, i.e., 'numDeviceRegions'.
|
||
|
||
Anyway, the output of 'dump_boot_info' function looks like this:
|
||
|
||
! --- boot info at 102c000 ---
|
||
! ipcBuffer: 102b000
|
||
! initThreadCNodeSizeBits: 12
|
||
! untyped: [38,4d)
|
||
! [38] [00100000,00107fff]
|
||
! [39] [00108000,00109fff]
|
||
! [3a] [001a0000,001bffff]
|
||
! [3b] [001c0000,001fffff]
|
||
! [3c] [00200000,003fffff]
|
||
! [3d] [00400000,007fffff]
|
||
! [3e] [00800000,00ffffff]
|
||
! [3f] [01000000,01ffffff]
|
||
! [40] [02000000,02ffffff]
|
||
! [41] [03000000,037fffff]
|
||
! [42] [03800000,03bfffff]
|
||
! [43] [03c00000,03dfffff]
|
||
! [44] [03e00000,03efffff]
|
||
! [45] [03f00000,03f7ffff]
|
||
! [46] [03f80000,03fbffff]
|
||
! [47] [03fc0000,03fdffff]
|
||
! [48] [03fe0000,03feffff]
|
||
! [49] [03ff0000,03ff7fff]
|
||
! [4a] [03ff8000,03ffbfff]
|
||
! [4b] [03ffc000,03ffdfff]
|
||
! [4c] [00189000,001897ff]
|
||
|
||
|
||
Creating a thread
|
||
=================
|
||
|
||
On seL4, kernel objects are created by turning untyped memory into kernel
|
||
memory using the 'seL4_Untype_Retype' function. As a first test. I am going to
|
||
manually define the parameters for this function using the information from
|
||
the boot info.
|
||
|
||
But before I can start using this function, we first need to generate some
|
||
stub code. We need to generate _sel4/invocation.h_ and its corresponding
|
||
_sel4/arch/invocation.h_. The rules in the platform library are quickly added,
|
||
taking the seL4 Makefile as inspiration. We also need to generate the
|
||
_interfaces/sel4_client.h_ stub code. In the stub code, we find the function
|
||
'seL4_Untyped_RetypeAtOffset', which pretty much matches the signature of
|
||
'seL4_Untype_Retype' explained in the manual. This is just another difference
|
||
from the master branch.
|
||
|
||
I decided to proceed with invoking 'seL4_Untyped_RetypeAtOffset' by
|
||
manually specifying its parameters. At this point, I am having a hard time
|
||
with wrapping my mind around seL4's kernel-resource management, in particular
|
||
the CNode addressing. My first attempt looked like this:
|
||
|
||
! {
|
||
! seL4_Untyped const service = 0x38; /* untyped */
|
||
! int const type = seL4_TCBObject;
|
||
! int const offset = 0;
|
||
! int const size_bits = 0;
|
||
! seL4_CNode const root = seL4_CapInitThreadCNode;
|
||
! int const node_index = 0;
|
||
! int const node_depth = 32;
|
||
! int const node_offset = 0x100;
|
||
! int const num_objects = 1;
|
||
!
|
||
! int const ret = seL4_Untyped_RetypeAtOffset(service,
|
||
! type,
|
||
! offset,
|
||
! size_bits,
|
||
! root,
|
||
! node_index,
|
||
! node_depth,
|
||
! node_offset,
|
||
! num_objects);
|
||
! PDBG("seL4_Untyped_RetypeAtOffset returned %d", ret);
|
||
! }
|
||
|
||
Admittedly, I feel a bit flabbergasted by the amount of arguments and not 100%
|
||
certain what I am doing here. I put the individual arguments into named
|
||
constants instead of directly supplying them to the function to make their
|
||
meaning easier to memorize. The 'service' argument refers to one of the
|
||
untyped memory capabilities reported by the boot info. This memory will be
|
||
turned into a thread control block (TCB). The 'node_offset' is a presumably
|
||
free capability slot of the root CScope that is supposed to be free. This is
|
||
where we want to store the capability for the newly created thread.
|
||
|
||
When executing the code, the kernel reports an error like this:
|
||
! vm fault on data at address 0x1e8 with status 0x6
|
||
! in thread 0xe0189880 at address 0x10002ed
|
||
|
||
The fault is triggered by the function 'seL4_SetCap', more precisely by the
|
||
instruction:
|
||
! mov %eax,%gs:0x1e8(,%ecx,4)
|
||
|
||
It appears that the seL4 bindings rely on a thread-local-storage facility
|
||
via GS-relative addressing. When the kernel switches thread contexts, it
|
||
loads a segment with a thread-specific memory location. Since we have not
|
||
initialized the GS register with a particular value, we end up addressing
|
||
the first page, which is not mapped. The issue could be resolved by
|
||
initializing the GS register as follows:
|
||
|
||
! static inline void init_ipc_buffer()
|
||
! {
|
||
! asm volatile ("movl %0, %%gs" :: "r"(IPCBUF_GDT_SELECTOR) : "memory");
|
||
! }
|
||
|
||
The IPCBUF_GDT_SELECTOR is defined by the seL4 headers. On the next attempt
|
||
to execute the code, we get a much nicer kernel message:
|
||
|
||
! <<seL4 [decodeUntypedInvocation/136 Te0189880 @100035a]:
|
||
! Untyped Retype: Destination cap invalid or read-only.>>
|
||
|
||
In reality, the message looks even much better - it is in color!
|
||
The message tells us that the kernel has actually received our request
|
||
for retyping untyped memory but the arguments are messed up.
|
||
The message comes from _src/object/untyped.c_:
|
||
|
||
! /* Lookup the destination CNode (where our caps will be placed in). */
|
||
! if (nodeDepth == 0) {
|
||
! nodeCap = extraCaps.excaprefs[0]->cap;
|
||
! } else {
|
||
! cap_t rootCap = extraCaps.excaprefs[0]->cap;
|
||
! lu_ret = lookupTargetSlot(rootCap, nodeIndex, nodeDepth);
|
||
! if (lu_ret.status != EXCEPTION_NONE) {
|
||
! userError("Untyped Retype: Invalid destination address.");
|
||
! return lu_ret.status;
|
||
! }
|
||
! nodeCap = lu_ret.slot->cap;
|
||
! }
|
||
|
||
Apparently, by specifying the value 32 for the depth argument, we entered
|
||
the code path for traversing a CNode tree instead of just inserting a
|
||
capability into the root CScope. By changing 'node_depth' to 0, system
|
||
call returns successfully:
|
||
! int main(): seL4_Untyped_RetypeAtOffset returned 0
|
||
|
||
Even though the new thread has no valid register set and no defined space,
|
||
let us see what happens when we try to start it anyway. This can be done
|
||
via the 'seL4_TCB_Resume' call with our just created TCB capability as
|
||
argument.
|
||
|
||
! seL4_TCB_Resume(0x100);
|
||
|
||
This results in the following exciting output:
|
||
|
||
! Caught cap fault in send phase at address 0x0
|
||
! while trying to handle:
|
||
! vm fault on data at address 0x1122 with status 0x6
|
||
! in thread 0xe0189880 at address 0x10003a5
|
||
! Caught cap fault in send phase at address 0x0
|
||
! while trying to handle:
|
||
! vm fault on data at address 0x0 with status 0x4
|
||
! in thread 0xe0100080 at address 0x0
|
||
|
||
We see two threads faulting! The main thread faults at our "breakpoint"
|
||
0x1122. But there is also another thread, which faults at 0x0. We can see
|
||
that the TCB creation via 'seL4_Untyped_RetypeAtOffset' was successful!
|
||
|
||
Now, turning the situation into something useful seems like a walk in the
|
||
park: We need to allocate a stack for the new thread and set up the initial
|
||
program counter and stack pointer of the new thread. At this point, I decide
|
||
to give the number 0x100 a proper name "SECOND_THREAD_CAP" because it will be
|
||
repeatedly used.
|
||
|
||
! enum { SECOND_THREAD_CAP = 0x100 };
|
||
|
||
Following the manual, we have to call 'seL4_TCB_WriteRegisters' and
|
||
'seL4_TCB_SetSpace'. The following code snippet allocates the stack for
|
||
the new thread on the stack of the main thread, initializes the stack
|
||
pointer and program counter of the new thread, assigns the new thread to
|
||
the same address space as the main thread, and kicks off the execution of
|
||
the new thread.
|
||
|
||
! long stack[0x1000];
|
||
! {
|
||
! seL4_UserContext regs;
|
||
! Genode::memset(®s, 0, sizeof(regs));
|
||
!
|
||
! regs.eip = (uint32_t)&second_thread_entry;
|
||
! regs.esp = (uint32_t)&stack[0] + sizeof(stack);
|
||
! int const ret = seL4_TCB_WriteRegisters(SECOND_THREAD_CAP, false,
|
||
! 0, 2, ®s);
|
||
! PDBG("seL4_TCB_WriteRegisters returned %d", ret);
|
||
! }
|
||
!
|
||
! {
|
||
! seL4_CapData_t no_cap_data = { { 0 } };
|
||
! int const ret = seL4_TCB_SetSpace(SECOND_THREAD_CAP, 0,
|
||
! seL4_CapInitThreadCNode, no_cap_data,
|
||
! seL4_CapInitThreadPD, no_cap_data);
|
||
! PDBG("seL4_TCB_SetSpace returned %d", ret);
|
||
! }
|
||
!
|
||
! seL4_TCB_Resume(SECOND_THREAD_CAP);
|
||
|
||
The entry function of the new thread is supposed to produce a page fault at
|
||
the predefined address 0x2244:
|
||
|
||
! void second_thread_entry()
|
||
! {
|
||
! *(int *)0x2244 = 0;
|
||
! }
|
||
|
||
When executing the code, we get the desired result:
|
||
|
||
! vm fault on data at address 0x1122 with status 0x6
|
||
! ...
|
||
! vm fault on data at address 0x2244 with status 0x6
|
||
|
||
From these messages, we can see that both the main thread and the second
|
||
thread are faulting at their designated fault addresses.
|
||
|
||
With two threads in place, we can test-drive the preemptive scheduling of
|
||
the kernel by changing the 'second_thread_entry' function to:
|
||
|
||
! static int volatile cnt = 0;
|
||
!
|
||
! void second_thread_entry()
|
||
! {
|
||
! for (;;)
|
||
! cnt++;
|
||
! }
|
||
|
||
At the end of the main function, we repeatedly print the counter value:
|
||
|
||
! for (;;)
|
||
! PDBG("cnt = %d", cnt);
|
||
|
||
When executing the code, the counter values surprisingly stays at the value
|
||
0. This is because the just-created new thread has a lower priority than the
|
||
main thread. By explicitly assigning the maximum priority to the second
|
||
thread, we can enable the preemptive round-robin scheduling:
|
||
|
||
! seL4_TCB_SetPriority(SECOND_THREAD_CAP, 0xff);
|
||
|
||
Now, we can see the counter value nicely increasing:
|
||
|
||
! int main(): cnt = 0
|
||
! ...
|
||
! int main(): cnt = 0
|
||
! int main(): cnt = 2908738
|
||
! ...
|
||
! int main(): cnt = 2908738
|
||
! int main(): cnt = 5876191
|
||
! ...
|
||
! int main(): cnt = 5876191
|
||
! ...
|
||
|
||
Each thread consumes its entire time slice. This way, the second thread has
|
||
the chance to increment the counter circa 3 million times per time slice after
|
||
which the main thread has the chance to print the counter about 50 times
|
||
before being preempted again.
|
||
|
||
|