Prior this patch the startup lock was not released if the call of
'_associate()' failed. In this condition, the caller of the constructor
was infinitely blocked.
During a ram_session->free call in 'core' the lock in core_env.h is taken.
Then in the ram_session::_free_ds implementation the dissolve function for the
dataspace is called. base-nova tries to make sure that the ds is not
accessible anymore by any kind of parallel incoming IPC by performing a
cleanup IPC. Unfortunately the dataspace_session implementation uses the very
same allocator in 'core' and may require to obtain the same lock as taken in
ram_session->free. This leads to a spurious deadlock on base-nova.
The actual free_ds implementation is mostly thread safe, since all used objects
inside there are already locked. The only missing piece is the _payload
variable. By changing the _payload variable in a atomic fashion there is no
need to lock the whole ram_session->free call which avoids deadlocks on
base-nova.
Fixes#549
The cleanup call must be performed already during the _dissolve function
shortly after the object at the cap_session is freed up. Otherwise there
is the chance that an in-flight IPC will find the to be dissolved function
again.
Bomb test triggered the case, that a already dissolved rpc_object was found
by a in-flight IPC. If the rpc_object was already freed up by alloc->destroy
the thread using this stale rpc_object pointer cause page-faults in core.
Fixes partly #549
As first step the rpc object must be freed up so that the kernel object
(portal) vanishes. Then the object must be removed from the internal object
pool list so that the object can't be obtained anymore. And then the cleanup
call can be performed (_leave_server_object) since now all names to the
rpc_object are gone.
Doing it in different order (as before the commit) there is a very very little
chance (but the bomb test triggers it occasionally) that the rpc_object can be
obtained again by an incoming IPC - even it is already scheduled for removal.
Fixes partly #549
If page faults are handled concurrently (as for base-nova) the traverse lookup
call in rm_session_component must be thread safe, which it isn't.
If the faulting area is backed by nested dataspaces which are managed by
various rm_sessions then a race happens under following circumstances
(triggered occasionally by the bomb test).
The traverse lookup may return a pointer to a rm_session of a nested dataspace.
If the rm_session is in parallel subject to destruction it happened that faults
got enqueued to the faulters list of the deleted rm_session and internally to
a list of the current rm_session of the Rm_client.
During destruction of the faulting Rm_client the associated rm_session will
be dissolved from the Rm_client, which leads to dereferencing the
dangling pointer of the already destructed rm_session.
On base-nova the memory of the rm_session object get unmapped eventually, so
that the de-referencing of the dangling pointer caused page faults in core.
The memory on other kernels inside core never get unmapped so that the
bug doesn't trigger visible faults.
The patch replace the keeping of a rm_session pointer by keeping a
capability instead. The rm_session object must be looked up now explicitly in
the Object_pool implementation, which implements proper reference counting on
the rm_session object.
Issue #549
Since we have now more than a handful patches to the vanilla kernel, we
better switch to a separate git repository in order to review and to maintain
the patches more effectively.
Remove the patches, they are already in the kernel branch.
Fixes#394
Warnings like the following:
warning: narrowing conversion of ‘((Genode::Platform_pd*)this)->Genode::Platform_pd::_space_id’ from ‘int’ to ‘Codezero::l4id_t {aka unsigned int}’ inside { } is ill-formed in C++11 [-Wnarrowing]
First make the clients inaccessible and dissolve them from the entrypoint. If
this isn't the first step the clients may be obtained again between
the unlock and lock steps in the destructor.
Additionally the clients may be removed in between the unlock and call
sequence, which renders such client pointers dangling and causes spurious page
faults. Keep instead a lock as long as possible and when it is required to
release a lock, then the pointer to the objects must be revalidated.
Replace the dissolve function with a remove_client implementation as suggested
by #13, which avoids that the cpu_session may call dissolve with a dangling
pointer of a already removed rm_client object. Instead the pager must be
released explicitly.
Related to issue #549
Related to issue #394
Related to issue #13
This patch removes the 'soname' link option for building the host
library for the 'lx_hybrid_ctors' test. Without this option, the
library's absolute path at build time gets hardcoded into the
application, which should be okay for this simple test case.
Fixes#638.
If we ran out of capabilities indexes, the bit allocator throws an exception.
If this happens the code seems to hang and nothing happens.
Instead one could catch the exception and print some diagnostic message.
This would be nice, but don't work. Printing some diagnostic message itself
tries to do potentially IPC and will allocate new capability indexes at
least for the receive window.
So, catch the exception and let the thread die, so at least the instruction
pointer is left as trace to identify the reason of the trouble.
Fixes#625
If an exception is thrown the lock is released automatically, so that
other callers may get a capability index if in between some are freed. Fixes
some deadlocks if Genode is short on capability indexes.
Related to #625
Currently, the hello run script of the hello_tutorial misses some services the
timer driver needs on various platforms. The hello_tutorial is meant for
educational purposes only. So it's desireable to keep it simple. Instead of
complexifying the configuration, this commit just removes the timer from the
example.
By now, the memcmp implementation of Genode's basic string utilities just
returned whether two memory blocks are equal or differ. It gave no hint which
block is greater, or lesser than the other one. This isn't the behaviour
anticipated by implementations that rely on the C standard memcmp, e.g. GCC's
libsupc++, or the nic_bridge's AVL tree implementation.
With this patch, the 'Signal_receiver::dissolve()' function does not return
as long as the signal context to be dissolved is still referenced by one
or more 'Signal' objects. This is supposed to delay the destruction of the
signal context while it is still in use.
Fixes#594.
With this change, init becomes able to respond to config changes by
restarting the scenario with the new config. To make this feature useful
in practice, init must not fail under any circumstances. Even on
conditions that were considered as fatal previously and led to the abort
of init (such as ambiguous names of the children or misconfiguration in
general), init must stay alive and responsive to config changes.
This patch improves the config handling by falling back to a static
string (empty "<config />") if no valid config ROM module could be
found. This can happen initially, but also at runtime when the ROM
module dissapears, e.g., a ROM module accessed via fs_rom where the
corresponding file gets unlinked.