Following deadlock happens when a Rm_client/Pager_object handles a page-fault
and concurrently the same object is dissolved (triggered by parent killing
the client).
The situation is as follows:
Page fault handling :
base-nova/src/base/pager/pager.cc : pf_handler() - lock pf_lock
base/.../core/rm_session_component.cc: pager() - lock rm_session
(in reverse_lookup())
Dissolve of Rm_client:
base/src/core/rm_session_component.cc: dissolve() - lock rm_session
base-nova/src/base/pager/pager.cc : dissolve() - lock pf_lock
The pf_lock is not required here during normal page fault handling,
since this pager object @NOVA is executed only by one and the same
thread and all critical operations inside the rm_session_object itself
are locked anyway. The only critical point is the destruction of the
Pager_object which is already handled in the both dissolve functions
of the rm-session_component (locking) and the pager_object (finalize
in-flight page faults).
Allocate exc_pt_sel inside Thread_base object
instead of pager object, since it is a thread
specific characteristic.
Same for freeing of the thread capabilities:
- ec, sc, rs, exc_pt_sel is thread specific
and has nothing to do in server nor pager object.
The invalid thread is specified as 0,0,-1 (ec cap, sc cap, sem cap).
The main thread is specified as 0,0,0.
The comparator identified "tid_main == tid_invalid" as equal,
which is obviously wrong.
The patch compares at least ec and sem cap.
Use semaphore down feature of NOVA to set the counter to zero.
If the semaphore was up()ed more than one time by impatient callers
(e.g. guys calling cancel_blocking) we make sure that the thread
really stops.
Don't allocate ec cap twice, in pager.cc and thread_start.cc.
Unmap of utcb has to be done in destructor of thread class, not
in pager class. Free capability selectors of ec and rs.
Invoke cancel_blocking before calling the
cleanup portal of the rpc_entrypoint. If a rpc_entrypoint
is blocked in a semaphore the cleanup call gets
stuck forever.
If nobody is blocked in a semaphore, nothing can be dequeued. If
the semaphore is used for signalling, there can be somebody in the queue,
but not necessarily.
This patch replaces the first attempt to resolve the ambiguity of using
the size_t type that occurred when 'loader_session.h' was included
alongside libc headers. Instead of explicitly qualifying each occurrence
of the type, the new solution defines 'size_t' within the 'Loader' namespace.
Fixes#253
The CML2 configuration system calls 'evn python' and expects version
2.x. So we check if python2 is installed when preparing Pistachio and
use the found version instead.
Fixes#264.
Some type size tests in the findutils source code expect the 'time_t' type
to be of the same size as the 'long' type, whereas the Genode libc defines
it as '__int64_t' for ARM. This patch disables these tests.
Fixes#262.
Eliminate prints to stderr for normal messages, because it leads to exceptional
returns in TCL-scripts e.g. when run-script is triggered by the autopilot even
if the script's return code itself will be zero.
In the create_builddir script the foc_x86_64 platform was missing
when adding x86-drivers to the etc/build.conf file. This lead to
failed run-scripts initiated by the autopilot tool.
The compiler complained about ambigous references when compiling a
lx_hybrid program using the loader session. Here are some error
messages:
genode/os/include/loader_session/loader_session.h:72: error: reference to 'size_t' is ambiguous
/usr/lib/gcc/i486-linux-gnu/4.4.5/include/stddef.h:211: error: candidates are: typedef unsigned int size_t
genode/base/include/base/stdint.h:25: error: typedef unsigned int Genode::size_t
genode/os/include/loader_session/loader_session.h:72: error: reference to 'size_t' is ambiguous
/usr/lib/gcc/i486-linux-gnu/4.4.5/include/stddef.h:211: error: candidates are: typedef unsigned int size_t
genode/base/include/base/stdint.h:25: error: typedef unsigned int Genode::size_t
...
This commit qualifies size_t using the Genode namespace which fixes
the compilation.
Make pxe optional and use by default grub.
For that to work we use objcopy to repack the elf64
file into elf32.
With this commit more tests succeed. Most
tests use 64M and with that pulsar even does not start
the hypervisor. With 96M more test run however that would
mean to adjust most of the run scripts ...
Without this patch the compilation failed with:
/usr/bin/ld: main.o: relocation R_X86_64_32S against
`vtable for Genode::Dataspace' can not be used when making a shared object;
recompile with -fPIC
main.o: could not read symbols: Bad value
collect2: ld returned 1 exit status
make[6]: *** [init] Error 1
For this patch the use of the hardening tool chain must be indicated
using the "hardening_tool_chain" SPECS entry within the file
<build>/etc/specs.conf
Fixes#79
Check that there is enough room for a typed item on the
UTCB. Otherwise deny to add the item and return false.
Enable explicitly a return unused warning to get the right
attention.
The UTCB of the thread cleaning up thread objects has been unmapped.
However the UTCB of the destroyed thread must be unmapped.
Objects must explicitly be made unreachable before cleaning up. The
server and pager objects must be unreachable before they can be freed.
Both object types are threads. Revoking the thread(EC) cap on NOVA
doesn't mean that the thread stops executing. All portals pointing to a
thread are still reachable by clients even if the last EC cap is gone in
user land. So it must be taken care that no portals are pointing anymore
to a thread when the associated objects are getting destroyed. This
commit handles this.
Additionally, even if the last portal is gone - there can be still an
ongoing request handled by such server/pager object/threads. For each
such object an additional portal is created. This object is called
'cleanup portal' and is only local to the object. After all portals are
revoked the cleanup portal is called. When the call returns we know that
nobody is anymore handled by the object since all remotely available
portals are gone.
Fixes#20