The 'Timer::Session::msleep' function is one of the last occurrences of
long-blocking RPC calls. Synchronous blocking RPC interfaces turned out
to be constant source of trouble and code complexity. I.e., a timer
client that also wants to respond to non-timer events was forced to be a
multi-threaded process. This patch replaces the blocking 'msleep' call
by a mechanism for programming timeouts and receiving wakeup signals in
an asynchronous fashion. Thereby signals originating from the timer can
be handled along with signals from other signal sources by a single
thread.
The changed interface has been tested on Linux, L4/Fiasco, OKL4, NOVA,
L4ka::Pistachio, Codezero, Fiasco.OC, and hw_pbxa9. Furthermore, this
patch adds the timer test to autopilot.
Fixes#1
Cap_sessions and portals created via the sessions are nowadays freed up during
c++ object destruction. Because of that the exception portals for a vCPU thread
get be revoked as soon as the cap_session object leaves its scope.
Keep one cap_session for the whole lifetime of the vmm to avoid disappearing
exception portals.
Related to #582.
The setup now uses nitpicker and nit_fb to display several instances of
vancouver. The guest OS binaries must be supplied in the
'<build-dir>/bin' directory manually.
Furthermore, the patch lets launchpad pass Block, Nic, and Rtc to the
parent.
Vancouver can now assign block devices to guests using the Block
interface. The machine has to be configured to use a specified drive,
which could be theoretically routed to different partitions or services
via policy definitions. Currently the USB driver only supports one
device. Genode's AHCI driver is untested.
If the session quota is too low, random pagefaults can occur on the
stack.
According to @Nils-TUD, it is necessary to protect the DiskCommit
messages with a lock against deadlocking with the timer. Observations
showed that this mitigates some problems with Gentoo on real hardware.
Vancouver is now able to use the Intel 82576 device model from NUL to
give VMs access to the network via the nic_bridge service. In order to
integrate the device model, it had to be renamed to i82576 due to XML
limitations. This is done by a patch applied via the 'make prepare'
mechanism.
Although current network card models in Vancouver panic if they can't
get a MAC address, the OP_GET_MAC hostop now fails gracefully in the
case where no nic_drv or nic_bridge is available.
The guest VM can now be provided with a framebuffer and keyboard input.
Mouse positioning of the guest is a problem. Because the PS2 model applies
some calculations to the movement values, it can happen that overflows mess
with the cursor. Therefore the handling was changed and only movements of 1
and -1 are sent. Since absolute positioning is not possible with PS2, we
have to live with this limitation until USB HID is implemented.
For the framebuffer size in Vancouver the configuration value in the machine
XML node is used. It is possible to map the corresponding memory area
directly to the guest, regardless if it is from nitpicker,
liquid_framebuffer or vesa_drv. The guest is provided with two modes (text
mode 3 and graphics mode 0x114 (0x314 in Linux).
Pressing LWIN+END while a VM has focus resets the virtual machine. Also,
RESET and DEBUG key presses will not be forwarded to the VM anymore.
It is possible to dump a VM's state by pressing LWIN+INS keys.
The text console is able to detect idle mode, unmaps the buffer from the
guest and stops interpreting. Upon the next pagefault in this area, it
resumes operation again. The code uses a simple checksum mechanism instead
of a large buffer and memcmp to detect an idle text console. False
positives don't matter very much.
When an EPT/NPT fault occurs during IDT vectoring, the original event must
be reinjected. Additionally we may have to inject an IRQ window if another
event is already pending.
core does not use POSIX threads when built for the 'lx_hybrid_x86'
platform, so we need to reserve the thread-context area via a segment in
the program to prevent clashes with vdso and shared libraries.
Fixes#639.
The default constructor didn't initialize all members, some of them holding
pointers. In the de-constructor the _name pointer was tried to free up, even
when it was not initialized.
Avoid any hassle for uninitialized members and just initialize it. Fixes
sporadic page fault on x86_64 base-nova.
Issue #155
Some shared libraries of the host system contain search paths for finding
other needed shared libraries. These paths get evaluated only by a native
linker. To find all needed shared libraries, with this patch, the host
linker is used to link hybrid applications.
Fixes#645.
reverts 68156918ee
"base: apply thread.cc fix of foc to base"
Depending on the context area a fixed location is calculated where the
memory for the stack is attached to. If the context area is released before the
detach call, the very same context area can be reused and memory for the new
stack is attached for a new thread. The detach of the old thread would then
revoke the mapping for the new thread which will cause a un-handled page fault.
Issue #549
Prior this patch the startup lock was not released if the call of
'_associate()' failed. In this condition, the caller of the constructor
was infinitely blocked.
During a ram_session->free call in 'core' the lock in core_env.h is taken.
Then in the ram_session::_free_ds implementation the dissolve function for the
dataspace is called. base-nova tries to make sure that the ds is not
accessible anymore by any kind of parallel incoming IPC by performing a
cleanup IPC. Unfortunately the dataspace_session implementation uses the very
same allocator in 'core' and may require to obtain the same lock as taken in
ram_session->free. This leads to a spurious deadlock on base-nova.
The actual free_ds implementation is mostly thread safe, since all used objects
inside there are already locked. The only missing piece is the _payload
variable. By changing the _payload variable in a atomic fashion there is no
need to lock the whole ram_session->free call which avoids deadlocks on
base-nova.
Fixes#549
The cleanup call must be performed already during the _dissolve function
shortly after the object at the cap_session is freed up. Otherwise there
is the chance that an in-flight IPC will find the to be dissolved function
again.
Bomb test triggered the case, that a already dissolved rpc_object was found
by a in-flight IPC. If the rpc_object was already freed up by alloc->destroy
the thread using this stale rpc_object pointer cause page-faults in core.
Fixes partly #549
As first step the rpc object must be freed up so that the kernel object
(portal) vanishes. Then the object must be removed from the internal object
pool list so that the object can't be obtained anymore. And then the cleanup
call can be performed (_leave_server_object) since now all names to the
rpc_object are gone.
Doing it in different order (as before the commit) there is a very very little
chance (but the bomb test triggers it occasionally) that the rpc_object can be
obtained again by an incoming IPC - even it is already scheduled for removal.
Fixes partly #549
If page faults are handled concurrently (as for base-nova) the traverse lookup
call in rm_session_component must be thread safe, which it isn't.
If the faulting area is backed by nested dataspaces which are managed by
various rm_sessions then a race happens under following circumstances
(triggered occasionally by the bomb test).
The traverse lookup may return a pointer to a rm_session of a nested dataspace.
If the rm_session is in parallel subject to destruction it happened that faults
got enqueued to the faulters list of the deleted rm_session and internally to
a list of the current rm_session of the Rm_client.
During destruction of the faulting Rm_client the associated rm_session will
be dissolved from the Rm_client, which leads to dereferencing the
dangling pointer of the already destructed rm_session.
On base-nova the memory of the rm_session object get unmapped eventually, so
that the de-referencing of the dangling pointer caused page faults in core.
The memory on other kernels inside core never get unmapped so that the
bug doesn't trigger visible faults.
The patch replace the keeping of a rm_session pointer by keeping a
capability instead. The rm_session object must be looked up now explicitly in
the Object_pool implementation, which implements proper reference counting on
the rm_session object.
Issue #549
Since we have now more than a handful patches to the vanilla kernel, we
better switch to a separate git repository in order to review and to maintain
the patches more effectively.
Remove the patches, they are already in the kernel branch.
Fixes#394
Warnings like the following:
warning: narrowing conversion of ‘((Genode::Platform_pd*)this)->Genode::Platform_pd::_space_id’ from ‘int’ to ‘Codezero::l4id_t {aka unsigned int}’ inside { } is ill-formed in C++11 [-Wnarrowing]
First make the clients inaccessible and dissolve them from the entrypoint. If
this isn't the first step the clients may be obtained again between
the unlock and lock steps in the destructor.
Additionally the clients may be removed in between the unlock and call
sequence, which renders such client pointers dangling and causes spurious page
faults. Keep instead a lock as long as possible and when it is required to
release a lock, then the pointer to the objects must be revalidated.
Replace the dissolve function with a remove_client implementation as suggested
by #13, which avoids that the cpu_session may call dissolve with a dangling
pointer of a already removed rm_client object. Instead the pager must be
released explicitly.
Related to issue #549
Related to issue #394
Related to issue #13
This patch removes the 'soname' link option for building the host
library for the 'lx_hybrid_ctors' test. Without this option, the
library's absolute path at build time gets hardcoded into the
application, which should be okay for this simple test case.
Fixes#638.
If we ran out of capabilities indexes, the bit allocator throws an exception.
If this happens the code seems to hang and nothing happens.
Instead one could catch the exception and print some diagnostic message.
This would be nice, but don't work. Printing some diagnostic message itself
tries to do potentially IPC and will allocate new capability indexes at
least for the receive window.
So, catch the exception and let the thread die, so at least the instruction
pointer is left as trace to identify the reason of the trouble.
Fixes#625
If an exception is thrown the lock is released automatically, so that
other callers may get a capability index if in between some are freed. Fixes
some deadlocks if Genode is short on capability indexes.
Related to #625