The distinction between 'ipc.h' and 'ipc_generic.h' is no more. The only
use case for platform-specific extensions of the IPC support was the
marshalling of capabilities. However, this case is accommodated by a
function interface ('_marshal_capability', '_unmarshal_capability'). By
moving the implementation of these functions from the headers into the
respective ipc libraries, we can abandon the platform-specific 'ipc.h'
headers.
The cleanup call must be performed already during the _dissolve function
shortly after the object at the cap_session is freed up. Otherwise there
is the chance that an in-flight IPC will find the to be dissolved function
again.
Bomb test triggered the case, that a already dissolved rpc_object was found
by a in-flight IPC. If the rpc_object was already freed up by alloc->destroy
the thread using this stale rpc_object pointer cause page-faults in core.
Fixes partly #549
As first step the rpc object must be freed up so that the kernel object
(portal) vanishes. Then the object must be removed from the internal object
pool list so that the object can't be obtained anymore. And then the cleanup
call can be performed (_leave_server_object) since now all names to the
rpc_object are gone.
Doing it in different order (as before the commit) there is a very very little
chance (but the bomb test triggers it occasionally) that the rpc_object can be
obtained again by an incoming IPC - even it is already scheduled for removal.
Fixes partly #549
Since we have now more than a handful patches to the vanilla kernel, we
better switch to a separate git repository in order to review and to maintain
the patches more effectively.
Remove the patches, they are already in the kernel branch.
Fixes#394
If we ran out of capabilities indexes, the bit allocator throws an exception.
If this happens the code seems to hang and nothing happens.
Instead one could catch the exception and print some diagnostic message.
This would be nice, but don't work. Printing some diagnostic message itself
tries to do potentially IPC and will allocate new capability indexes at
least for the receive window.
So, catch the exception and let the thread die, so at least the instruction
pointer is left as trace to identify the reason of the trouble.
Fixes#625
If an exception is thrown the lock is released automatically, so that
other callers may get a capability index if in between some are freed. Fixes
some deadlocks if Genode is short on capability indexes.
Related to #625
Remove signal context object from signal source component list (_signal_queue)
before destruction, otherwise we get a dangling pointer.
On native hardware for base-nova, the signal source thread triggered page
faults in the Signal_source_component::wait_for_signal() method when the signal
context got freed up in Signal_session_component::free_context but was still
enqueued in Signal_source_component::_signal_queue.
Fixes#600
Add functionality to lookup an object and lock it. Additional the case is
handled that a object may be already in-destruction and the lookup will deny
returning the object.
The object_pool generalize the lookup and lock functionality of the rpc_server
and serve as base for following up patches to fix dangling pointer issues.
The CPU session interfaces comes with the ability to install an
exception handler per thread. This patch enhances the feature with the
provision of a default signal handler that is used if no thread-specific
handler is installed. The default signal handler can be set by
specifying an invalid thread capability and a valid signal context
capability.
Furthermore, this patch relaxes the requirement of the order of the
calls of 'exception_handler' and 'set_pager'. Originally, the exception
handler could be installed not before setting a pager. Now, we remember
the installed exception handler in the 'Cpu_thread' and propagate to to
the platform thread at a later time.
This patch reflects eventual allocation errors in a more specific way to
the caller of 'alloc_aligned', in particular out-of-metadata and
out-of-memory are considered as different conditions.
Related to issue #526.
Revert the core-local mapping created in 'Ram_session_component::_clear_ds()'
and free the virtual memory region allocated for this mapping when a
RAM dataspace gets freed.
Fixes#416.
'Bender' can detect serial ports accessible via PCI and writes the I/O ports
to the Bios Data area (BDA).
Usage together with the PXE bootloader ease life running Genode/NOVA on native
hardware, where a standard serial device isn't available anymore anywhere.
We don't can use map_local_one_to_one for boot modules because it happens
that boot modules can be at addresses above physical 3G boundary for x86_32.
Defer the mapping of modules until the point where the core allocators
are set up properly and then remap the physical pages to virtual addresses
below 3G.
If the I/O ports are non default (3f8), we had to specify manually the correct
I/O ports. With this commit the BDA is read and the I/O port of the first
serial interface (COM) is taken. If no serial interface is available no device
configuration will be undertaken.
On Linux, we want to attach additional attributes to processes, i.e.,
the chroot location, the designated UID, and GID. Instead of polluting
the generic code with such Linux-specific platform details, I introduced
the new 'Native_pd_args' type, which can be customized for each
platform. The platform-dependent policy of init is factored out in the
new 'pd_args' library.
The new 'base-linux/run/lx_pd_args.run' script can be used to validate
the propagation of those attributes into core.
Note that this patch does not add the interpretation of the new UID and
PID attributes by core. This will be subject of a follow-up patch.
Related to #510.
Using the new 'join()' function, the caller can explicitly block for the
completion of the thread's 'entry()' function. The test case for this
feature can be found at 'os/src/test/thread_join'. For hybrid
Linux/Genode programs, the 'Thread_base::join()' does not map directly
to 'pthread_join'. The latter function gets already called by the
destructor of 'Thread_base'. According to the documentation, subsequent
calls of 'pthread_join' for one thread may result in undefined behaviour.
So we use a 'Genode::Lock' on this platform, which is in line with the
other platforms.
Related to #194, #501
The IPC-server object exists solely on the stack of the entrypoint
thread and, therefore, would never be destructed as the thread is just
killed. Now, the object is explicitly destructed in the entrypoint
destructor. An alternative solution could instruct the entrypoint thread
the terminate, which would automatically cleanup its stack.
The object pool is assumed to be empty on destruction of the entrypoint.
If not, we warn and at least dissolve all RPC objects.
Extend tracking of delegated and of translated items. The additional
information is used to solely free up unused/unwanted mapped capabilities and
to avoid unnecessary revokes on capability indexes where nothing have been
received.
Fixes#430
After this commit "make prepare" uses HTTP, HTTPS, or FTP where possible
fvor downloading third-party source codes. This prevents problems with
strict firewall rules where only selected ports are usable.
Unfortunately, git.l4android.org does not support Git via HTTP and,
therefore, the sources need a working Git port (9418).
Fixes#443.
By now all services in core where created, and registered in the generic
main routine. Although there exists already a x86-specific service (I/O ports)
there was no possibility to announce core-services for certain platforms only.
This commit introduces a hook function in the 'Platform' class, that enables
registration of platform-specific services. Moreover, the io-port service
is offered on x86 platforms only now.
Areas of an attached dataspace which have never been accessed cannot get
unmapped. With this patch this case is not treated as error anymore.
Fixes#398.
Implement shared IRQs using 'Irq_proxy' class.
Nova: Added global worker 'Irq_thread' support in core and adapted Irq_session.
FOC: Adapted IRQ session code, x86 has shared IRQ support, ARM uses the old
model. Read and set 'mode' argument (from MADT) in 'Irq_session'.
OKL4: Use generic 'Irq_proxy'
Fixes issue #390
Unify handling of UTCBs. The utcb of the main thread is with commit
ea38aad30e at a fixed location - per convention.
So we can remove all the ugly code to transfer the utcb address during process
creation.
To do so also the UTCB of the main thread of Core must be inside Genode's
thread context area to handle it the same way. Unfortunately the UTCB of the
main thread of Core can't be chosen, it is defined by the kernel.
Possible solutions:
- make virtual address of first thread UTCB configurable in hypervisor
- map the utcb of the first thread inside Core to the desired location
This commit implements the second option.
Kernel patch: make utcb map-able
With the patch the Utcb of the main thread of Core is map-able.
Fixes#374
Noux actually uses the sp variable during thread creation and expects to be
set accordingly. This wasn't the case for the main thread, it was ever set
to the address of the main thread UTCB.
Move the context area close to the end of the virtual user available address,
so that Vancouver can obtain as much as possible of the lower virtual address
range for VMs.
Use virtual regions for memory used during core initialization behind context
area. Enables us to start Vancouver VMs up to 1280 MiB, which requires
large virtual regions of contiguous aligned memory.
Exclude used virtual regions of echo and of pager thread in core.
This patch introduces the functions 'affinity' and 'num_cpus' to the CPU
session interface. The interface extension will allow the assignment of
individual threads to CPUs. At this point, it is just a stub with no
actual platform support.
The cpu_session interface fails to be virtualized by gdb_monitor because
platform-nova uses an extended nova_cpu_session interface.
The problem was that threads have been created directly at core without
knowledge of gdb_monitor. This lead to the situation that gdb_monitor didn't
know of all threads to be debugged.
Tunnel the additional parameters required on base-nova through the state()
call of the cpu_session interface before the thread actual is started.
The kernel provides a "recall" feature issued on threads to force a thread into
an exception. In the exception the current state of the thread can be obtained
and its execution can be halted/paused.
However, the recall exception is only delivered when the next time the thread
would leave the kernel. That means the delivery is asynchronous and Genode has
to wait until the exception triggered.
Waiting for the exception can either be done in the cpu_session service or
outside the service in the protection domain of the caller.
It turned out that waiting inside the cpu_service is prone to deadlock the
system. The cpu_session interface is one of many session interfaces handled by
the same thread inside Core.
Deadlock situation:
* The caller (thread_c) to pause some thread_p manages to establish the call
to the cpu_session thread_s of Core but get be interrupted before issuing
the actual pause (recall) command.
* Now the - to be recalled thread_p - is scheduled and tries to invoke another
service of Core, like making log output.
* Since the Core thread_s is handling the session request of thread_c, the
kernel uses the timeslice of thread_p to help to finish the request handled
by thread_s.
* Thread_s issues the actual pause/recall on thread_p and blocks inside Core
to wait for the recall exception to be issued.
* thread_p will leave not the kernel before finishing it actual IPC with
thread_s which is blocked waiting for thread_p.
That is the reason why the waiting/blocking for the recall exception taking
place must be done on NOVA in the context of the caller (thread_1).
Introduce a pause_sync call to the cpu_session which returns a semaphore
capability to the caller. The caller blocks on the semaphore and is woken up
when the pager of thread_p receives the recall exception with the state of
thread_p.
Multiple calls to get the dataspace capability on NOVA lead to the situation
that the caller gets each time a new mapping of the same capability at
different indexes.
The client/caller assumes to get every time the very same index, e.g. in
Noux the index is used to look up structures.
Cache the dataspace capability returned via a rm_session for base-nova.
Since no kernel objects can be created anymore outside Genode::core,
the Vancouver port must be adjusted to use solely the Genode interfaces.
The Vcpu_dispatcher creates all portals via the cpu_session interface and
uses the feature to setup a specific receive window during a IPC (the
cap_session::alloc IPC) to place to be received/to be mapped capability
(virtualization exception portal) at the designed indexes.
The actual vCPU thread extends from a normal Genode::Thread and extends it
by specific vCPU requirements, which are a larger exception base window and
the need by Vancouver to place the SM and EC cap at indexes next to each other.
Fixes#316
Extend Native_capability type to hold a specific selector index where the to
be received cap during a IPC should be mapped to. This feature is required to
place created caps by the cap_session at specific indexes. This feature is
used by Vancouver to setup the virtualization exception portals (created by
the cap_session) at the intended indexes.
Patch prevents following bugs:
* In sleep_forever the thread return from semaphore down if cap is revoked
during destruction of a thread. This causes an endless loop consuming time
not available for other threads.
* In lock_helper and cap_sel_alloc the thread return from the lock() method
even if the semaphore down call failed because of an revoked semaphore.
This lead to the situation that a thread subject to de-construction returns
from the lock method, but not holding the lock, entering the critical section
and modifying state inside the critical section. Another thread in parallel
already in the critical section or entering the critical section also
modifies the state. This lead to curious bugs ...
* thread_nova, thread_start, irq_session
Detect early bugs if the SM is gone unexpectedly where it should never
happen.
Vancouver recalls the vCPU in the vCPU dispatcher code. Enable the right bit
in the mapped native cap so that Vancouver actually is able to perform this
operation.
It now can hold a right bit used during IPC to demote rights of the to be
transfered capability.
The local_name field in the native_capability type is not needed anymore
in NOVA. Simplify the class, remove it from constructors and adapt all
invocations in base-nova.
Unfortunately local_name in struct Raw is still used in generic base code
(process.cc, reload_parent_cap.cc), however has no effect in base-nova.
MsgBuf has to keep the number of received capabilities in order
to free/know correctly unused and unwanted capabilities. Explicitly
call rcv_msg->post_ipc to store this information in a MsgBuf.
Don't reset rcv_msg in ipc.cc, since this is used during
un-marshalling of caps in ipc.h afterwards. The MsgBuf is reseted when its
de-constructor is called.
With this patch solely the local ids are used, no global unique ids
are transfered anymore during IPC.
demo.run, signal.run, noux_tool_chain.run works up to the same
point as before the patches for issue #268.
Fixes#268
Unfortunately, another kernel patch is required for Genode/NOVA to get rid
of global unique ids for objects (issue #268).
Kernel patch:
If a translate of a object capability item inside the same PD
(receiver/sender in same PD) is not successful then he very same item is
returned instead of the null item.
Genode:
Some code in Genode try to map/translate the "root" (the first instance of a)
object capability within the same PD. The translate fails since it is the
first cap and was not delegated beforehand. Instead the cap gets mapped to a
new capability index due to xlt_rcv kernel item patch.
The new local object capability index is used to lookup manged objects
in lists, which however fails because the object is only known by the original
object capability index.
Unfortunately, this happens not only once. Below one example trace and
description is attached.
There are several possible solutions possible:
* Find all places in Genode and replace normal function calls between objects
with IPC calls, such that all capabilities can be translated during IPC.
** Time consuming to find all spots
** Rather platform specific issue requires re-adjustments in generic Genode
code
** Not trivial to ever remember this fact during development of new components
[other platforms have not such a issue, however have global object ids]
** Neither good in terms of performance.
* Use some special system call to the kernel to be able to translate a given
capability index as long until you find the requested original index.
(Obviously ... no comment).
* Kernel patch as this one.
* <your proposal>
Example trace + code description showing the behavior above:
int main(): --- create local services ---
int main(): --- start init ---
[0] DEL OBJ PD:0xc000aa80->0xc000aa80 SB:0x000000aa RB:0x000000ac O:0x00 A:0x1f
int main(): transferred 42 MB to init
[0] DEL OBJ PD:0xc000aa80->0xc000aa80 SB:0x00000120 RB:0x0000013c O:0x00 A:0x1f
[0] DEL OBJ PD:0xc000aa80->0xc000aa80 SB:0x0000016c RB:0x00000168 O:0x00 A:0x1f
Setup ELF failed
[0] XLT OBJ PD:0xc000aa80->0xc000aa80 SB:0x00000168 RB:0x0000016c O:0x00
unknown exception?
int main(): --- init created, waiting for exit condition ---
thread - file - line - text
-------------------------------------------------------------------------------
thread A - [ 0] - 228 - new Core_child(... rom_session.dataspace() ...)
thread A - [ 1] - 27 - IPC call - ask for dataspace cap
thread B - [ 2] - 49 - function - return dataspace cap index 0x120
thread A - [ 1] - 27 - IPC returned - map 0x120 -> 0x13c, translate failed
thread A - ...
thread A - [ 3] - 231 - call _setup_elf()
thread A - [ 3] - 60 - call env->rm_session()->attach()
thread A - [ 4] - 35 - do dataspace object lookup (0x13c)
thread A - [ 4] - 36 - lookup failed (object known as 0x120), throw Exception
thread A - [ 3] - 61 - catch Exception -> return error code "0"
thread A - [ 3] - 233 - "Setup ELF failed" - because error code "0"
File legend:
[0] base/src/core/main.cc
[1] base/include/rom_session/client.h
[2] base-nova/src/core/include/core_rm_session.h
[3] base/src/base/process/process.cc
[4] base-nova/src/core/core_rm_session.cc
Kernel patch:
Introduce a transfer item type to express that a cap should be translated
and if this fails to map it instead.
It would be possible without this combined transfer item type however
with additional overhead. In this case Genode/NOVA would
have to map and translate all caps used as parameter in IPC. It would look
like this:
* If the map and translation succeed, the cap at the new cap index
would have to be revoked. Then the translated cap index can be used.
* If the map succeeds and the translation fails then the mapped cap index
can be used.
* It would become complicated when multiple caps are mapped and translated
and only some of the translation succeed. In such cases Genode would have
to figure out the right relation of translated/mapped and not
translated/mapped caps. It would require to make some assumption about the
order how translated/mapped caps are reported at the UTCB by the kernel.
All the points above lead to the decision to create a separate transfer item
type for that.
Genode:
Most the times the translation succeeds, mapping of caps happens either
seldom. This takes now a bit the pressure of not enough aligned receive
cap windows as described in issue #247.
The patch mainly adds adjustments to handle the
translated and mapped caps correctly especially during freeing of the
receive window (don't free translated cap indexes).
Fixes#268