If a local thread is attempted to be 'pause'd via cpu_session, don't wait
until it gets into the recalled state. If the caller is lucky it is, if not
return only the stack pointer.
Avoids deadlocking of the gdb when attached to a process running a server.
Issue #478
The 'pause' call on base-nova assumes that a thread can solely block in its
associated semaphore. Main reason is that so core can unblock a thread in order
that the recall exception gets delivered and the register state can be
obtained.
Unfortunately the signal session implementation creates a semaphore, which is
unknown by the pager code. Instead create the semaphore via the pager of the
thread, so that the pager can unblock the signal thread when a pause is issued.
Issue #478
If a thread caused a page fault and later on get be paused, then it left
the recall handler immediately due to the pause call instead of staying
in this handler.
Add some (complicated) state machine to detect and handle the case. Still not
waterproof, especially server threads may never get recalled if they never get
a IPC from the outside.
Fixes#478
This patch introduces new types for expressing CPU affinities. Instead
of dealing with physical CPU numbers, affinities are expressed as
rectangles in a grid of virtual CPU nodes. This clears the way to
conveniently assign sets of adjacent CPUs to subsystems, each of them
managing their respective viewport of the coordinate space.
By using 2D Cartesian coordinates, the locality of CPU nodes can be
modeled for different topologies such as SMP (simple Nx1 grid), grids of
NUMA nodes, or ring topologies.
The cleanup call must be performed already during the _dissolve function
shortly after the object at the cap_session is freed up. Otherwise there
is the chance that an in-flight IPC will find the to be dissolved function
again.
Bomb test triggered the case, that a already dissolved rpc_object was found
by a in-flight IPC. If the rpc_object was already freed up by alloc->destroy
the thread using this stale rpc_object pointer cause page-faults in core.
Fixes partly #549
The cpu_session interface fails to be virtualized by gdb_monitor because
platform-nova uses an extended nova_cpu_session interface.
The problem was that threads have been created directly at core without
knowledge of gdb_monitor. This lead to the situation that gdb_monitor didn't
know of all threads to be debugged.
Tunnel the additional parameters required on base-nova through the state()
call of the cpu_session interface before the thread actual is started.
It now can hold a right bit used during IPC to demote rights of the to be
transfered capability.
The local_name field in the native_capability type is not needed anymore
in NOVA. Simplify the class, remove it from constructors and adapt all
invocations in base-nova.
Unfortunately local_name in struct Raw is still used in generic base code
(process.cc, reload_parent_cap.cc), however has no effect in base-nova.
Allocate exc_pt_sel inside Thread_base object
instead of pager object, since it is a thread
specific characteristic.
Same for freeing of the thread capabilities:
- ec, sc, rs, exc_pt_sel is thread specific
and has nothing to do in server nor pager object.
The UTCB of the thread cleaning up thread objects has been unmapped.
However the UTCB of the destroyed thread must be unmapped.
Objects must explicitly be made unreachable before cleaning up. The
server and pager objects must be unreachable before they can be freed.
Both object types are threads. Revoking the thread(EC) cap on NOVA
doesn't mean that the thread stops executing. All portals pointing to a
thread are still reachable by clients even if the last EC cap is gone in
user land. So it must be taken care that no portals are pointing anymore
to a thread when the associated objects are getting destroyed. This
commit handles this.
Additionally, even if the last portal is gone - there can be still an
ongoing request handled by such server/pager object/threads. For each
such object an additional portal is created. This object is called
'cleanup portal' and is only local to the object. After all portals are
revoked the cleanup portal is called. When the call returns we know that
nobody is anymore handled by the object since all remotely available
portals are gone.
Fixes#20