=======================================
                  Genode on seL4 - IPC and virtual memory
                  =======================================


                              Norman Feske


This is the second part of a series of hands-on articles about bringing Genode
to the seL4 kernel.
[http://genode.org/documentation/articles/sel4_part_1 - Read the previous part here...]

After having created a minimalistic root task consisting of two threads, we
can move forward with exercising the functionality provided by the kernel,
namely inter-process communication and the handling of virtual memory.
Once we have tested those functionalities in our minimalistic root task
environment, we will be able to apply the gained knowledge to the actual
porting effort of Genode's core process.


Inter-process communication
###########################

In the L4 universe, the term IPC (inter-process communication) usually stands
for synchronous communication between two threads. In seL4, IPC has two uses.
First, it enables threads of different protection domains (or the same
protection domain) to exchange messages. So information can be transferred
across protection-domain boundaries. Second, IPC is the mechanism used to
delegate access rights throughout the system. This is accomplished by sending
capabilities as message payload. When a capability is part of a message, the
kernel translates the local name of the capability in the sender's protection
domain to a local name in the receiver's protection domain.

In Genode, IPC is realized as a two thin abstractions that build upon each
other:

# At the low level, the IPC library _src/base/ipc/ipc.cc_ is responsible
  for sending and receiving messages using the kernel mechanism. It has a
  generic interface _base/include/base/ipc.h_, which supports the marshalling
  and un-marshalling of message arguments and capabilities using C++ streaming
  operators. Genode users never directly interact with the IPC library.

# Built on top the IPC library, the so-called RPC framework adds the notion
  of RPC functions and RPC objects. RPC interfaces are declared using
  abstract C++ base classes with a few annotations. Under the hood, the
  RPC framework uses C++ meta-programming techniques to turn RPC definitions
  into code that transfers messages via the IPC library. In contrast to
  the IPC library, the RPC library is platform-agnostic.

To enable Genode's RPC mechanism on seL4, we merely have to provide a
seL4-specific IPC library implementation. To warm up with seL4's IPC
mechanism, however, we first modify our test program to let the main thread
perform an IPC call to the second thread.

To let the second thread receive IPC messages, we first need to create a
synchronous IPC endpoint using the 'seL4_Untyped_RetypeAtOffset' function
with 'seL4_EndpointObject' as type, an offset that skips the already allocated
TCB (the TCB object has a known size of 1024 bytes) and the designated
capability number, let's call it EP_CAP. Of course, we have to create the
entrypoint before starting the second thread.

As a first test, we want the second thread to receive an incoming message.
So we change the entry function as follows:

! PDBG("call seL4_Wait");
! seL4_MessageInfo_t msg_info = seL4_Wait(EP_CAP, nullptr);
! PDBG("returned from seL4_Wait, call seL4_Reply");
! seL4_Reply(msg_info);
! PDBG("returned from seL4_Reply");

At the end of the main function, we try call the second thread via 'seL4_Call':

! PDBG("call seL4_Call");
! seL4_MessageInfo_t msg_info = seL4_MessageInfo_new(0, 0, 0, 0);
! seL4_Call(EP_CAP, msg_info);
! PDBG("returned from seL4_Call");

When executing the code, we get an error as follows:

! int main(): call seL4_Call
! void second_thread_entry(): call seL4_Wait
! Caught cap fault in send phase at address 0x0
! while trying to handle:
! vm fault on data at address 0x4 with status 0x6
! in thread 0xe0100080 at address 0x10002e1

By looking at the output of 'objdump -lSd', we see that fault happens at the
instruction
! mov %edi,%gs:0x4(,%ebx,4)
The issue is the same as the one we experienced for the main thread - we
haven't initialized the GS register with a proper segment, yet. This can be
easily fixed by adding a call to our 'init_ipc_buffer' function right at the
start of the second thread's entry function. Still, the program does not work
yet:

! vm fault on data at address 0x4 with status 0x6
! in thread 0xe0100080 at address 0x10002e8

Looking at the objdump output, we see that the fault still happens at the same
instruction. So what is missing? The answer is that we haven't equipped the
second thread with a proper IPC buffer. The attempt to call 'seL4_Wait',
however, tries to access the IPC buffer of the calling thread. The IPC buffer
can be configured for a thread using the 'seL4_TCB_SetIPCBuffer' function. But
wait - what arguments do we need to pass? In addition to the TCB capability,
there are two arguments a pointer to the IPC buffer and a page capability,
which contains the IPC buffer. Well, I had hoped to get away without dealing
with the memory management at this point. I figure that setting up the IPC
buffer for the second thread would require me to create a seL4_IA32_4K page
object via 'seL4_Untyped_RetypeAtOffset' and insert a mapping of the page
within the roottask's address space, and possibly also create and install a
page table object.

To avoid becoming side-tracked by those memory-management issues, I decide
to assign the IPC buffer of the second thread right at the same page as
the one for the initial thread. Both the local address and the page
capability for the initial thread's IPC buffer are conveniently provided by
seL4's boot info structure. So let's give this a try:

! /* assign IPC buffer to second thread */
! {
!   static_assert(sizeof(seL4_IPCBuffer) % 512 == 0,
!                 "unexpected seL4_IPCBuffer size");
!
!   int const ret = seL4_TCB_SetIPCBuffer(SECOND_THREAD_CAP,
!                                         (seL4_Word)(bi->ipcBuffer + 1),
!                                         seL4_CapInitThreadIPCBuffer);
!
!   PDBG("seL4_TCB_SetIPCBuffer returned %d", ret);
! }

With the initialization of the IPC buffer in place, we finally get our
desired output:

! int main(): call seL4_Call
! void second_thread_entry(): call seL4_Wait
! void second_thread_entry(): returned from seL4_Wait, call seL4_Reply
! int main(): returned from seL4_Call
! void second_thread_entry(): returned from seL4_Reply


Delegation of capabilities via IPC
==================================

The seL4 kernel supports the delegation of capabilities across address-space
boundaries by the means of synchronous IPC. As Genode fundamentally relies
on such a mechanism, I decide to give it a try by extending the simple IPC
test. Instead of letting the main thread call the second thread without any
arguments, the main thread will pass the thread capability of the second
thread as argument. Upon reception of the call, the second thread will find
a capability in its IPC buffer. To validate that the received capability
corresponds to the thread cap, the second thread issues a 'seL4_TCB_Suspend'
operation on the received cap. It is supposed to stop it execution right
there. This experiment requires the following steps:

# At the caller side, we need to supply a capability as argument to the
  'seL4_Call' operation by specifying the number of capabilities to transfer
  at the 'extraCaps' field of the 'seL4_MessageInfo', and marshalling the
  index of the capability via the 'seL4_SetCap' function (specifying
  SECOND_THREAD_CAP as argument).

# At the callee side, we need to define where to receive an incoming
  capability. First, we have to reserve a CNode slot designated for the
  new capability. For the test, a known-free index will do:

  ! enum { RECV_CAP = 0x102 };

  Second, we have to configure the IPC buffer of the second thread to
  point to the RECV_CAP:

  ! seL4_SetCapReceivePath(seL4_CapInitThreadCNode, RECV_CAP, 32);

  We specify 32 as receive depth because the CNode of the initial thread has a
  size of 2^12 and a guard of 20.

At this point I am wondering that there is apparently no way to specify a
*receive window* rather than an individual CNode for receiving capabilities.
After revisiting Section 4.2.2 of the manual, I came to the realization that
*seL4 does not support delegating* *more than one capability in a single IPC*.
From Genode's perspective, this could become an issue because Genode's RPC
framework generally allows for the delegation of multiple capabilities via a
single RPC call.

That said, the simple capability-delegation test works as expected.

When repeatedly performing an IPC call with a delegated capability, the
RECV_CAP index will be populated by the first call. Subsequent attempts to
override the RECV_CAP capability do not work (the 'extraCaps' field of the
received message info remains 0). The receiver has to make sure that the
specified 'CapReceivePath' is an empty capability slot. I.e., by calling
'seL4_CNode_Delete' prior 'seL4_Wait'.


Translation of capabilities aka "unwrapping"
============================================

In addition to receiving delegated capabilities under a new name, seL4's IPC
mechanism allows the recipient of a capability that originated from the
recipient to obtain a custom defined value instead of receiving a new name.
In seL4 terminology, this mechanism is called "unwrapping".

Capability unwrapping is supposed to happen if the transferred capability is
a minted endpoint capability and the recipient is the original creator of
the capability. For testing the mechanism, we replace the 'SECOND_THREAD_CAP'
argument of the 'seL4_Call' by a minted endpoint capability derived from
the endpoint used by the second thread.

For creating a minted endpoint capability, we allocate a new index for the
minted capability (EP_MINTED_CAP) and use the 'seL4_CNode_Mint' operation
as follows:

! seL4_CNode     const service    = seL4_CapInitThreadCNode;
! seL4_Word      const dest_index = EP_MINTED_CAP;
! uint8_t        const dest_depth = 32;
! seL4_CNode     const src_root   = seL4_CapInitThreadCNode;
! seL4_Word      const src_index  = EP_CAP;
! uint8_t        const src_depth  = 32;
! seL4_CapRights const rights     = seL4_Transfer_Mint;
! seL4_CapData_t const badge      = seL4_CapData_Badge_new(111);
!
! int const ret = seL4_CNode_Mint(service,
!                                 dest_index,
!                                 dest_depth,
!                                 src_root,
!                                 src_index,
!                                 src_depth,
!                                 rights,
!                                 badge);

The badge is set to the magic value 111.
When specifying the resulting EP_MINTED_CAP as IPC argument for a 'seL4_Call',
the kernel will translate the capability to the badge value. The callee
observes the reception of such an "unwrapped" capability via the
'capsUnwrapped' field of the 'seL4_MessageInfo' structure returned by the
'seL4_Wait' operation. The badge value can be obtained from the IPC buffer via
'seL4_GetBadge(0)'. This simple experiment shows that the mechanism works
as expected.