This patch replaces the former 'Pd_session::bind_thread' function by a
PD-capability argument of the 'Cpu_session::create_thread' function, and
removes the ancient thread-start protocol via 'Rm_session::add_client' and
'Cpu_session::set_pager'. Threads are now bound to PDs at their creation
time and implicitly paged according to the address space of the PD.
Note the API change:
This patch changes the signature of the 'Child' and 'Process' constructors.
There is a new 'address_space' argument, which represents the region map
representing the child's address space. It is supplied separately to the
PD session capability (which principally can be invoked to obtain the
PD's address space) to allow the population of the address space
without relying on an 'Pd_session::address_space' RPC call.
Furthermore, a new (optional) env_pd argument allows the explicit
overriding of the PD capability handed out to the child as part of its
environment. It can be used to intercept the interaction of the child
with its PD session at core. This is used by Noux.
Issue #1938
This patch integrates three region maps into each PD session to
reduce the session overhead and to simplify the PD creation procedure.
Please refer to the issue cited below for an elaborative discussion.
Note the API change:
With this patch, the semantics of core's RM service have changed. Now,
the service is merely a tool for creating and destroying managed
dataspaces, which are rarely needed. Regular components no longer need a
RM session. For this reason, the corresponding argument for the
'Process' and 'Child' constructors has been removed.
The former interface of the 'Rm_session' is not named 'Region_map'. As a
minor refinement, the 'Fault_type' enum values are now part of the
'Region_map::State' struct.
Issue #1938
The return code of assign_parent remained unused. So this patch
removes it.
The bind_thread function fails only due to platform-specific limitations
such as the exhaustion of ID name spaces, which cannot be sensibly
handled by the PD-session client. If occurred, such conditions used to
be reflected by integer return codes that were used for diagnostic
messages only. The patch removes the return codes and leaves the
diagnostic output to core.
Fixes#1842
Besides unifying the Msgbuf_base classes across all platforms, this
patch merges the Ipc_marshaller functionality into Msgbuf_base, which
leads to several further simplifications. For example, this patch
eventually moves the Native_connection_state and removes all state
from the former Ipc_server to the actual server loop, which not only
makes the flow of control and information much more obvious, but is
also more flexible. I.e., on NOVA, we don't even have the notion of
reply-and-wait. Now, we are no longer forced to pretend otherwise.
Issue #1832
This patch unifies the CPU session interface across all platforms. The
former differences are moved to respective "native-CPU" interfaces.
NOVA is not covered by the patch and still relies on a custom version of
the core-internal 'cpu_session_component.h'. However, this will soon be
removed once the ongoing rework of pause/single-step on NOVA is
completed.
Fixes#1922
This commit introduces the new `Component` interface in the form of the
headers base/component.h and base/entrypoint.h. The os/server.h API
has become merely a compatibilty wrapper and will eventually be removed.
The same holds true for os/signal_rpc_dispatcher.h. The mechanism has
moved to base/signal.h and is now called 'Signal_handler'.
Since the patch shuffles headers around, please do a 'make clean' in the
build directory.
Issue #1832
This commit replaces the stateful 'Ipc_client' type with the plain
function 'ipc_call' that takes all the needed state as arguments.
The stateful 'Ipc_server' class is retained but it moved from the public
API to the internal ipc_server.h header. The kernel-specific
implementations were cleaned up and simplified. E.g., the 'wait'
function does no longer exist. The badge and exception code are no
longer carried in the message buffers but are handled in kernel-specific
ways.
Issue #610
Issue #1832
This patch moves details about the stack allocation and organization
the base-internal headers. Thereby, I replaced the notion of "thread
contexts" by "stacks" as this term is much more intuitive. The fact that
we place thread-specific information at the bottom of the stack is not
worth introducing new terminology.
Issue #1832
By moving the stub implementation to rm_session_client.cc, we can use
the generic base/include/rm_session/client.h for base-linux and
base-nova and merely use platform-specific implementations.
Issue #1832
This patch establishes a common organization of header files
internal to the base framework. The internal headers are located at
'<repository>/src/include/base/internal/'. This structure has been
choosen to make the nature of those headers immediately clear when
included:
#include <base/internal/lock_helper.h>
Issue #1832
This patch integrates the functionality of the former CAP session into
the PD session and unifies the approch of supplementing the generic PD
session with kernel-specific functionality. The latter is achieved by
the new 'Native_pd' interface. The kernel-specific interface can be
obtained via the Pd_session::native_pd accessor function. The
kernel-specific interfaces are named Nova_native_pd, Foc_native_pd, and
Linux_native_pd.
The latter change allowed for to deduplication of the
pd_session_component code among the various base platforms.
To retain API compatibility, we keep the 'Cap_session' and
'Cap_connection' around. But those classes have become mere wrappers
around the PD session interface.
Issue #1841
This patch removes the SIGNAL service from core and moves its
functionality to the PD session. Furthermore, it unifies the PD service
implementation and terminology across the various base platforms.
Issue #1841
Previously, ports that were needed for a scenario and that were not
prepared or outdated, triggered one assertion each during the second
build stage. The commit slots a mechanism in ahead that gathers all
these ports during the first build stage and reports them in form of a
list before the second build stage is entered. This list can be used
directly as argument for tool/ports/prepare_port to prepare respectively
update the ports. If, however, this mechanism is not available, for
example because a target is build without the first build stage, the old
assertion still prevents the target from running into troubles with a
missing port.
Fixes#1872
Now, the right PCI bus:device:function (BDF) is reported to the kernel
during assign_pci syscall - beforehand it was ever 0:0.0. The BDF is
needed to lookup the correct DMAR unit the kernel has to configure. This
was revealed as the DMAR unit for Intel graphics on x201 is not the same
as for all other PCI devices we have drivers for on this platform.
Fixes#1848
Use kernel branch which is more accurate in accounting memory, which avoids
kernel messages of following form:
[0] warning: insufficient resources ...
Fixes#1830
Accidentally removed by #1658. We need to make the cleanup call for server
objects - otherwise we may get in capability identifier re-use trouble.
Issue #1778
- Align implementation to the current generic implementation
- Document NOVA-specific implementation of dataspace() (as in the
original commit message)
Don't skip the cleanup call if a pager object is marked as blocked.
It happens that the pager_object is in destruction but it is also used
concurrently by the pager thread. The pager thread handling code may set the
pager object to blocked but still uses the pointer to the pager object. Avoid
locking at the state of the pager object and make the cleanup call everytime.
Error output looks like this, where the pf_ip is within
void Pager_object::_page_fault_handler(addr_t pager_obj)
method and the pf_addr is the stale pointer to the already released pager_object.
no RM attachment (READ pf_addr=xxx pf_ip=xxx from 00 <NULL>)
static void Genode::Pager_object::_page_fault_handler(Genode::addr_t): page fault, thread '<NULL>', cpu x, ip=xxx, fault address=xxx
PAGE-FAULT IN CORE (READ pf_addr=b10e0090 pf_ip=132dbc from 00 <NULL>)
Too less memory quota for a PD may be calculated, which leads to too early
punishment for a Genode process.
Discovered during Turmvilla scenario #1552 and issue #1733.
Additionally print warnings about unavailable CPUs if they are tried to be
used during pager object setup.
Discovered during Turmvilla scenario #1552 and issue #1733.
threads with prio 0 will not be started and would fail silently.
Happened on Turmvilla for the USBProxy thread in virtualbox.
Discovered during Turmvilla scenario #1552 and issue #1733.
Reduces kernel log message noise when running on kernel-debug branch.
Additionally add a more verbose core message.
Discovered during Turmvilla scenario #1552 and issue #1733.
Addressing must be PC-relative, so adapt the approach from the other
nova_x86_32 syscall bindings (description by @ssumpf):
Use call to push the current IP on the stack and add the distance of
label 0 and label 1 in order to determine the return address, which
NOVA requires in edx.
The bug only showed up with "-O0" in libc.lib.so in form of a unwanted
text relocation.
Fixes#1721
Destroying an object within the scope of a lambda/functor executed
in the object pool's apply function leads potentially to memory corruption.
Within the scope the corresponding object is locked and unlocked when
leaving the scope. Therefore, it is illegal to free the object's memory meanwhile.
This commit eliminates several places in core that destroyed wrongly in
the object pool's scope.
Fix#1713
* Move the Synced_interface from os -> base
* Align the naming of "synchronized" helpers to "Synced_*"
* Move Synced_range_allocator to core's private headers
* Remove the raw() and lock() members from Synced_allocator and
Synced_range_allocator, and re-use the Synced_interface for them
* Make core's Mapped_mem_allocator a friend class of Synced_range_allocator
to enable the needed "unsafe" access of its physical and virtual allocators
Fix#1697
Instead of holding SPEC-variable dependent files and directories inline
within the repository structure, move them into 'spec' subdirectories
at the corresponding levels, e.g.:
repos/base/include/spec
repos/base/mk/spec
repos/base/lib/mk/spec
repos/base/src/core/spec
...
Moreover, this commit removes the 'platform' directories. That term was
used in an overloaded sense. All SPEC-relative 'platform' directories are
now named 'spec'. Other files, like for instance those related to the
kernel/architecture specific startup library, where moved from 'platform'
directories to explicit, more meaningful places like e.g.: 'src/lib/startup'.
Fix#1673
Instead of returning pointers to locked objects via a lookup function,
the new object pool implementation restricts object access to
functors resp. lambda expressions that are applied to the objects
within the pool itself.
Fix#884Fix#1658
For most platforms except of NOVA a distinction between pager entrypoint
and pager activation is not needed, and only exists due to historical
reasons. Moreover, the pager thread's execution path is almost identical
between most platforms excluding NOVA, HW, and Fisco.OC. Therefore,
this commit unifies the pager loop for the other platforms, and removes
the pager activation class.
The reference count get increase to use 2 bytes, so we need the double amount
of selectors as before.
Additionally print a message if we run out of capabilities in a server. Since
our rpc framework is now clever enough to detect that for a printf we don't
need to setup a receive window, we may use a printf instead of a die call.
Eases debugging.
Issue #1601
Showcasing the out of memory kernel issue.
One test triggers oom during memory delegation when talking to core pager
thread. Two other test trigger oom during capability delegation in a
server/client scenario for send and reply phase separately.
Issue #1601
Moves the Bios Data Area header from base-hw to base. Modifies the
base-nova core console that it uses the header as replacement for
the previous BDA bit logic.
Ref #1625
This commit eliminates the mutual interlaced taking of destruction lock,
list lock and weak pointer locks that could lead to a dead-lock situation
when a lock pointer was tried to construct while a weak object is in
destruction progress.
Now, all weak pointers are invalidated and dequeued at the very
beginning of the weak object's destruction. Moreover, before a weak pointer
gets invalidated during destruction of a weak object, it gets dequeued, and
the list lock is freed again to avoid the former dead-lock.
Fix#1607
- free up kernel memory of empty slabs (if already one empty slab is in
place)
- free up more page table entries
- handle CPUs with invariant TSCs gracefully
Genode/Nova running on CPUs without the invariant TSC feature may seem
to 'hang'. The referenced commit of the nova branch fixes the issue
for some older Intel CPUs.
Fixes#1615
Bomb and any server may generate references to capabilities exceeding 256 -
use a 16bit counter until the cap handling in Genode gets unified.
Additionally try to print a warning, instead of dying, if we get cap reference
count under or overflow.
Issue #1615
This patch enable clients of core's TRACE service to obtain the
execution times of trace subjects (i.e., threads). The execution time is
delivered as part of the 'Subject_info' structure.
Right now, the feature is available solely on NOVA. On all other base
platforms, the returned execution times are 0.
Issue #813
Avoids the need to have per IRQ a thread that blocks synchronously for next
interrupt. Now a thread may wait for multiple IRQs as other signals
simultaneously.
In core no threads are required anymore for IRQs/MSI - the clients (either
the pci_drv or in case of MSI the driver) gets the IRQ delivered directly as
a ordinary Genode signal.
Useful since #1216 and #1487 is now available.
Commit applies feature of #1446 also to IRQ/MSIs.
On NOVA, a Genode thread currently cannot destroy itself by destroying its
own 'Thread' object, because in 'Thread_base::_deinit_platform_thread()'
it cannot call 'Cpu_session::kill_thread()' anymore after it has revoked
its own UTCB.
As solution, the revocation of the UTCB can be delayed until its location
in the context area is needed by a new thread.
Fixes#1505
On seL4, we need to convert untyped memory to page frames before being
able to use it as normal memory. There already exists the hook function
'_export_ds' that is principally suitable for such tasks. It is
currently solely used on Linux where we have to create a file for each
dataspace. To make the hook useful also for seL4, we need to call
_export_ds prior _clear_ds. Otherwise, we would try to clear memory that
is still untyped.
The thread library (thread.cc) in base-foc shared 95% of the code with
the generic implementation except myself(). Therefore, its
implementation is now separated from the other generic sources into
myself.cc, which allows base-foc to use a foc-specific primitive to
enable our base libraries in L4Linux.
Issue #1491
Physical CPU quota was previously given to a thread on construction only
by directly specifying a percentage of the quota of the according CPU
session. Now, a new thread is given a weighting that can be any value.
The physical counter-value of such a weighting depends on the weightings
of the other threads at the CPU session. Thus, the physical quota of all
threads of a CPU session must be updated when a weighting is added or
removed. This is each time the session creates or destroys a thread.
This commit also adapts the "cpu_quota" test in base-hw accordingly.
Ref #1464
* Instead of using local capabilities within core's context area implementation
for stack allocation/attachment, simply do both operations while stack gets
attached, thereby getting rid of the local capabilities in generic code
* In base-hw the UTCB of core's main thread gets mapped directly instead of
constructing a dataspace component out of it and hand over its local
capability
* Remove local capability implementation from all platforms except Linux
Ref #1443
The global capability ID counter is not used by NOVA and Fiasco.OC
and in the future not needed by base-hw too. Thereby, remove the static
counter variable from the generic code base and add it where appropriated.
Ref #1443
Enable platform specific allocations and ram quota accounting for
protection domains. Needed to allocate object identity references
in the base-hw kernel when delegating capabilities via IPC.
Moreover, it can be used to account translation table entries in the
future.
Ref #1443
The bindings for 32bit did not consider that in the syscall_3 function
edx changes due to the assembly instructions and that in the syscall_4
function edx and ecx change. So, the compiler wrongly assumed that the
content of these registers stayed unchanged.
Fixes#1447
If running multiple VBox VMMs with Windows as guest concurrently then it may
happen that the system seem to hang. It turned out that actually
a VM-exit storm (vmx_exception->handle_exc_nm) causes a endless loop between
kernel and vCPU. Nothing gets scheduled nor interrupts are received anymore.
The referenced kernel commit fixes this issue.
Issue #1343
The linker scripts use to fill alignment gaps within the text section
with the magic value 0x90909090, which correponds to the opcodes of four
nop instructions on x86. This patch removes this value because it
apparently solves no problem. If, for some reason (e.g., due to a dangling
pointer) a thread executes instructions within alignment paddings, NOP
instructions are not any better than any other instruction. The program
will eventually execute the instructions after the padding, which is
most likely fatal. It would be more reasonable to fill the padding with
the opcode of an illegal instruction so that such an error can be
immediately detected. That said, I cannot remember a single instance,
where the fill value has helped us during debugging.
Even if the mechanism served a purpose on x86, it is still better to
remove it because it does not equally work on the other architectures
where the linker scripts are used. I.e., on ARM, the opcode 0x90909090
is not a NOP instruction.
Workaround for issue #1343. By disabling the 'vpid' feature of the nova
kernel several VMs can be used concurrently. Applies for Seoul and VirtualBox.
Issue #1343
The commit uses a fixed kernel branch (r8), which fixes a caching bug
observable in the Genode host. The quirk detecting the circumstance in the
timer service is obsolete now and is removed.
Fixes#1338
The commit
- fixes the syscall bindings for using portal permissions
- revokes PT_CTRL permission after pager in core set local badge name
- revokes PT_CTRL permission after server entrypoint code set local badge name
Fixes#1335
If the debug branch of the nova kernel is used, following messages are printed
by the kernel during vCPU setup phase:
[0] overmap attempt OBJ - tree - ...
Fixes#1324
In the init configuration one can configure the donation of CPU time via
'resource' tags that have the attribute 'name' set to "CPU" and the
attribute 'quantum' set to the percentage of CPU quota that init shall
donate. The pattern is the same as when donating RAM quota.
! <start name="test">
! <resource name="CPU" quantum="75"/>
! </start>
This would cause init to try donating 75% of its CPU quota to the child
"test". Init and core do not preserve CPU quota for their own
requirements by default as it is done with RAM quota.
The CPU quota that a process owns can be applied through the thread
constructor. The constructor has been enhanced by an argument that
indicates the percentage of the programs CPU quota that shall be granted
to the new thread. So 'Thread(33, "test")' would cause the backing CPU
session to try to grant 33% of the programs CPU quota to the thread
"test". By now, the CPU quota of a thread can't be altered after
construction. Constructing a thread with CPU quota 0 doesn't mean the
thread gets never scheduled but that the thread has no guaranty to receive
CPU time. Such threads have to live with excess CPU time.
Threads that already existed in the official repositories of Genode were
adapted in the way that they receive a quota of 0.
This commit also provides a run test 'cpu_quota' in base-hw (the only
kernel that applies the CPU-quota scheme currently). The test basically
runs three threads with different physical CPU quota. The threads simply
count for 30 seconds each and the test then checks wether the counter
values relate to the CPU-quota distribution.
fix#1275
The memory barrier prevents the compiler from changing the program order
of memory accesses in such a way that accesses to the guarded resource
get outside the guarded stage. As cmpxchg() defines the start of the
guarded stage it also represents an effective memory barrier.
On x86, the architecture ensures to not reorder writes with older reads,
writes to memory with other writes (except in cases that are not
relevant for our locks), or read/write instructions with I/O
instructions, locked instructions, and serializing instructions.
However on ARM, the architectural memory model allows not only that
memory accesses take local effect in another order as their program
order but also that different observers (components that can access
memory like data-busses, TLBs and branch predictors) observe these
effects each in another order. Thus, a correct program order isn't
sufficient for a correct observation order. An additional architectural
preservation of the memory barrier is needed to achieve this.
Fixes#692
The weak implementation was added for quite special purposes years ago
and is no longer needed. On the other hand, the weak attribute does not
help if the implementation ends up in a shared library, which first
resolves symbols locally before asking ldso (that includes the acutal
thread library) *shiver*
This provides bootable disk images for x86 platforms via
! RUN_OPT="--target disk"
The resulting disk image contains one ext2 partition with binaries from
the GRUB2 boot loader and the run scenario. The default disk size fits
all binaries, but is configurable via
! --disk-size <size in MiB>
in RUN_OPT.
The feature depends on an grub2-head.img, which is part of the commit,
but may also be generated by executing tool/create_grub2. The script
generates a disk image prepared for one partition, which contains files
for GRUB2. All image preparation steps that need superuser privileges
are conducted by this script.
The final step of writing the entire image to a disk must be executed
later by
sudo dd if=<image file> of=<device> bs=8M conv=fsync
Fixes#1203.
When a page fault cannot be resolved, the GDB monitor can get a hint about
which thread faulted by evaluating the thread state object returned by
'Cpu_session::state()'. Unfortunately, with the current implementation,
the signal which informs GDB monitor about the page fault is sent before
the thread state object of the faulted thread has been updated, so it
can happen that the faulted thread cannot be determined immediately
after receiving the signal.
With this commit, the thread state gets updated before the signal is sent.
At least on base-nova it can also happen that the thread state is not
accessible yet after receiving the page fault notification. For this
reason, GDB monitor needs to retry its query until the state is
accessible.
Fixes#1206.
On ARM it's relevant to not only distinguish between ordinary cached memory
and write-combined one, but also having non-cached memory too. To insert the
appropriated page table entries e.g.: in the base-hw kernel, we need to preserve
the information about the kind of memory from allocation until the pager
resolves a page fault. Therefore, this commit introduces a new Cache_attribute
type, and replaces the write_combined boolean with the new type where necessary.
On ARM, when machine instructions get written into the data cache
(for example by a JIT compiler), one needs to make sure that the
instructions get written out to memory and read from memory into
the instruction cache before they get executed. This functionality
is usually provided by a kernel syscall and this patch adds a generic
interface for Genode applications to use it.
Fixes#1153.
This patch changes the top-level directory layout as a preparatory
step for improving the tools for managing 3rd-party source codes.
The rationale is described in the issue referenced below.
Issue #1082