* Instead of always re-load page-tables when a thread context is switched
only do this when another user PD's thread is the next target,
core-threads are always executed within the last PD's page-table set
* remove the concept of the mode transition
* instead map the exception vector once in bootstrap code into kernel's
memory segment
* when a new page directory is constructed for a user PD, copy over the
top-level kernel segment entries on RISCV and X86, on ARM we use a designated
page directory register for the kernel segment
* transfer the current CPU id from bootstrap to core/kernel in a register
to ease first stack address calculation
* align cpu context member of threads and vms, because of x86 constraints
regarding the stack-pointer loading
* introduce Align_at template for members with alignment constraints
* let the x86 hardware do part of the context saving in ISS, by passing
the thread context into the TSS before leaving to user-land
* use one exception vector for all ARM platforms including Arm_v6
Fix#2091
* introduce new syscall (core-only) to create privileged threads
* take the privilege level of the thread into account
when doing a context switch
* map kernel segment as accessable for privileged code only
Ref #2091
* introduces central memory map for core/kernel
* on 32-bit platforms the kernel/core starts at 0x80000000
* on 64-bit platforms the kernel/core starts at 0xffffffc000000000
* mark kernel/core mappings as global ones (tagged TLB)
* move the exception vector to begin of core's binary,
thereby bootstrap knows from where to map it appropriately
* do not map boot modules into core anymore
* constrain core's virtual heap memory area
* differentiate in between user's and core's main thread's UTCB,
which now resides inside the kernel segment
Ref #2091
In the past, a signal context, that was chosen for handling by
'Signal_receiver::pending_signal and always triggered again before
the next call of 'pending_signal', caused all other contexts behind
in the list to starve. This was the case because 'pending_signal'
always took the first pending context in its context list.
We avoid this problem now by handling pending signals in a round-robin
fashion instead.
Ref #2532
Ensure that the timer does not handle timeouts again within 1000
microseconds after the last handling of timeouts. This makes denial of
service attacks harder. This commit does not limit the rate of timeout
signals handled inside the timer but it causes the timer to do it less
often. If a client continuously installs a very small timeout at the
timer it still causes a signal to be submitted to the timer each time
and some extra CPU time to be spent in the internal handling method. But
only every 1000 microseconds this internal handling causes user timeouts
to trigger.
If we would want to limit also the call of the internal handling method
to ensure that CPU time is spent beside the RPCs only every 1000
microseconds, things would get more complex. For instance, on NOVA
Time_source::schedule_timeout(0) must be called each time a new timeout
gets installed and becomes head of the scheduling queue. We cannot
simply overwrite the already running timeout with the new one.
Ref #2490
This patch merges two similar rules, which create content at 'include'
into a single rule. This prevents a possible race condition when
creating archives in parallel.
We moved the stack-area segment 128 MiB behind text and data to comply
with assumptions in the kernel ELF loader.
This commit also reenables static binaries on linux and removes the
unused stack_area.stdlib.ld script.
Fixes#2521
In nested scenarios like driver_manager.run, the initial session quota
for IO_PORT, IO_PORT, and IRQ sessions is expectedly insufficient.
However, the condition is properly handled by re-attemping the request
with a slightly increased quota. Still, core prints a warning each time
the request is denied for quota reasons, which spams the log. This patch
removes the non-critical message.
This should actually never happen. However if it happens, be a bit robuster
and don't provide the memory for re-use (which causes tons of other trouble
afterwards).
Issue #2505
There are hardware timers whose frequency can't be expressed as
ticks-per-microsecond integer-value because only a ticks-per-millisecond
integer-value is precise enough. We don't want to use expensive
floating-point values here but nonetheless want to translate from ticks
to time with microseconds precision. Thus, we split the input in two and
translate both parts separately. This way, we can raise precision by
shifting the values to their optimal bit position. Afterwards, the results
are shifted back and merged together again.
As this algorithm is not so trivial anymore and used by at least three
timer drivers (base-hw/x86_64, base-hw/cortex_a9, timer/pit), move it to a
generic header to avoid redundancy.
Ref #2400
When building Genode on a Linux system running in a Xen Dom0, the 'xen'
run target can run a Genode scenario in a Xen DomU.
Usage: in build/x86_*/etc/build.conf, define:
RUN_OPT = --include boot_dir/$(KERNEL) --include image/iso --include power_on/xen --include log/xen --include power_off/xen
The Xen DomU runs in HVM mode and loads Genode from an ISO image. Serial
log output is printed to the console and graphical output is shown in an
SDL window.
The Xen DomU ist managed using the 'xl' command line tool and it is
possible to add configuration options in the 'xen_args' variable in a run
script. Common options are:
- disabling the graphical output:
append xen_args { sdl="0" }
- configuring a network device:
append xen_args { vif=\["model=e1000,mac=02:00:00:00:01:01,bridge=xenbr0"\] }
- configuring USB input devices:
append xen_args { usbdevice=\["mouse","keyboard"\] }
Note: the 'xl' tool requires super-user permissions and interactive
password input can be troublesome in combination with 'expect' and is not
practical for automatic tests. For this reason, the current implementation
assumes that no password input is needed when running 'sudo xl', which can
be achieved by creating a file '/etc/sudoers.d/xl' with the content
'user ALL=(root) NOPASSWD: /usr/sbin/xl'
(where 'user' is the Linux user name).
Fixes#2504
When running core as the kernel inside every component, a separate
stack area for core is needed that is different from the user-land
component's one.
Ref #2091
Acquire Signal_context objects locks via Object_pool::apply() in the
context of the entrpyoint thread, instead in the context of the calling
thread.
Fixes#2485
- Fix fatal exception handling so that stack traces are dumped
- Add 'include/nim' directories to Nim module search path
- Enable release optimizations for release builds
Fix#2493
This patch removes the assertion about the unexpected call of
'block_for_signal' within core. On Linux, this call is actually
expected because of the handling of SIGCHLD signals by core.
A boot module with size 0 previously made Core crash with a page fault in
Region_map_component::attach. This patch prevents the creation of ROM-FS
entries for such modules.
Ref #2490
For most base platforms (except linux and sel4), the initialization of
boot modules is the same. Thus, merge this default implementation in the
new unit base/src/core/platform_rom_modules.cc.
Ref #2490
In Region_map_component::attach, storing the metadata for a region may
throw an exception. Catch it and throw an Invalid_dataspace exception.
Ref #2490
This is helpful for disabling messages in etc/tools.conf by
setting it to e.g.
MSG_LINK = @true ""
This results in much shorter and less cluttered logs in automatic
builds.
- factor out Rm_client::pager lambda code into utility
Region_map_component::create_map_item
- use utility to find/lookup physical addresses to be mapped eagerly
Issue #2209
Platform_pd "_pd" uses a allocator for, which relies on the mapped RAM
dataspace within core. Unfortunately the RAM dataspaces are already freed up
during _ram_ds_factory destruction, which may lead to trouble if accessed
afterwards.
Issue #2451
This patch sets the -march complile flag in spec/arm_v7a.mk, which
enables us to build depot archives for the 'arm_v7a' architecture.
It also removes copy-pasted comments that offer no valuable insights but
contain grammar errors.
This patch decouples the error handling of the quota transfers
and the actual session creation. In the previous version, an error in
the 'initiate_request' phase would leave the local scope via an
exception without disarming the transfer guard objects. This way,
the guard destructors would attempt the returning of session quota in
addition to the explicit call of '_revert_quota_and_destroy' as done in
the error handling of the 'initiate_request' operation.
In the presence of a session-creation error in the 'initiate_request'
phase, session quota would eventually be returned twice. This patch
removes the intertwined error handling of both phases in a way that the
guards of the first phase (quota transfer) are no longer present in the
second phase (initiate_request).
This patch makes sure that the initial PD session limit (as defined by
the client-provided session quota) is preserved over the entire lifetime
of the PD session. That means, it cannot be transferred to other PD
sessions. Otherwise, it may be impossive to hand back all the static
session quota to the PD-session client at session-destruction time
because parts of the initial quota would no longer belong to the
session.
Note that the initial limit can still be used for allocations within the
PD session as those allocations are automatically reverted at
session-destruction time.
The implementations of the lock and C++ guards tests depend on
thread-execution priorities, which produces false negatives of the whole
thread test on platforms without priority support.
The recently implemented capability resource trading scheme unfortunately
broke the automated capability memory upgrade mechanism needed by base-hw
kernel/core. This commit splits the capability memory upgrade mechanism
from the PD session ram_quota upgrade, and moves that functionality
into a separate Pd_session::Native_pd interface.
Ref #2398
A dataspace capability request to a ROM service may invalidate any
previously issued dataspace. Therefor no requests should be made while a
session dataspace is mapped. Reducing calls to the session also improves
performance where servicing a ROM request has a significant cost.
Fix#2418
The 'Stack_area_ram_session' is now a 'Stack_area_ram_allocator', which
simplifies the code and remove a dependency from the 'Ram_session'
interface, which we want to remove after all.
Issue #2407
By supplying a statically allocated initial block to the slab allocator
for signal contexts, we become able to construct a 'Signal_broker' (the
back end for the PD's signalling API) without any dynamic memory
allocation. This is a precondition for using the PD as meta-data
allocator for its contained signal broker (meta data allocations must
not happen before the PD construction is complete).
Issue #2407
By separating the session-interface concerns from the mechanics of the
dataspace creation, the code becomes simpler to follow, and the RAM
session can be more easily merged with the PD session in a subsequent
step.
Issue #2407
This patch allows core's 'Signal_transmitter' implementation to sidestep
the 'Env::Pd' interface and thereby adhere to a stricter layering within
core. The 'Signal_transmitter' now uses - on kernels that depend on it -
a dedicated (and fairly freestanding) RPC proxy mechanism for signal
deliver, instead of channeling signals through the 'Pd_session::submit'
RPC function.
Previously, the Genode::Timer::curr_time always used the
Timer_session::elapsed_ms RPC as back end. Now, Genode::Timer reads
this remote time only in a periodic fashion independently from the calls
to Genode::Timer::curr_time. If now one calls Genode::Timer::curr_time,
the function takes the last read remote time value and adapts it using
the timestamp difference since the remote-time read. The conversion
factor from timestamps to time is estimated on every remote-time read
using the last read remote-time value and the timestamp difference since
the last remote time read.
This commit also re-works the timeout test. The test now has two stages.
In the first stage, it tests fast polling of the
Genode::Timer::curr_time. This stage checks the error between locally
interpolated and timer-driver time as well as wether the locally
interpolated time is monotone and sufficiently homogeneous. In the
second stage several periodic and one-shot timeouts are scheduled at
once. This stage checks if the timeouts trigger sufficiently precise.
This commit adds the new Kernel::time syscall to base-hw. The syscall is
solely used by the Genode::Timer on base-hw as substitute for the
timestamp. This is because on ARM, the timestamp function uses the ARM
performance counter that stops counting when the WFI (wait for
interrupt) instruction is active. This instruction, however is used by
the base-hw idle contexts that get active when no user thread needs to
be scheduled. Thus, the ARM performance counter is not a good choice for
time interpolation and we use the kernel internal time instead.
With this commit, the timeout library becomes a basic library. That means
that it is linked against the LDSO which then provides it to the program it
serves. Furthermore, you can't use the timeout library anymore without the
LDSO because through the kernel-dependent LDSO make-files we can achieve a
kernel-dependent timeout implementation.
This commit introduces a structured Duration type that shall successively
replace the use of Microseconds, Milliseconds, and integer types for duration
values.
Open issues:
* The timeout test fails on Raspberry PI because of precision errors in the
first stage. However, this does not render the framework unusable in general
on the RPI but merely is an issue when speaking of microseconds precision.
* If we run on ARM with another Kernel than HW the timestamp speed may
continuously vary from almost 0 up to CPU speed. The Timer, however,
only uses interpolation if the timestamp speed remained stable (12.5%
tolerance) for at least 3 observation periods. Currently, one period is
100ms, so its 300ms. As long as this is not the case,
Timer_session::elapsed_ms is called instead.
Anyway, it might happen that the CPU load was stable for some time so
interpolation becomes active and now the timestamp speed drops. In the
worst case, we would now have 100ms of slowed down time. The bad thing
about it would be, that this also affects the timeout of the period.
Thus, it might "freeze" the local time for more than 100ms.
On the other hand, if the timestamp speed suddenly raises after some
stable time, interpolated time can get too fast. This would shorten the
period but nonetheless may result in drifting away into the far future.
Now we would have the problem that we can't deliver the real time
anymore until it has caught up because the output of Timer::curr_time
shall be monotone. So, effectively local time might "freeze" again for
more than 100ms.
It would be a solution to not use the Trace::timestamp on ARM w/o HW but
a function whose return value causes the Timer to never use
interpolation because of its stability policy.
Fixes#2400
This patch make sure that a once managed parent RPC object will always be
dissolved if an exception during the remaining child construction
occurs. The original version would miss the dissolve call if one of the
subsequent members throws an exception at construction time.
This patch eases the debugging of situations where a session-object
constructor wrongly throws an exception type not specified in the
'Local_service::Factory' interface.
This patch reduces the number of exception types by facilitating
globally defined exceptions for common usage patterns shared by most
services. In particular, RPC functions that demand a session-resource
upgrade not longer reflect this condition via a session-specific
exception but via the 'Out_of_ram' or 'Out_of_caps' types.
Furthermore, the 'Parent::Service_denied', 'Parent::Unavailable',
'Root::Invalid_args', 'Root::Unavailable', 'Service::Invalid_args',
'Service::Unavailable', and 'Local_service::Factory::Denied' types have
been replaced by the single 'Service_denied' exception type defined in
'session/session.h'.
This consolidation eases the error handling (there are fewer exceptions
to handle), alleviates the need to convert exceptions along the
session-creation call chain, and avoids possible aliasing problems
(catching the wrong type with the same name but living in a different
scope).
This patch mirrors the accounting and trading scheme that Genode employs
for physical memory to the accounting of capability allocations.
Capability quotas must now be explicitly assigned to subsystems by
specifying a 'caps=<amount>' attribute to init's start nodes.
Analogously to RAM quotas, cap quotas can be traded between clients and
servers as part of the session protocol. The capability budget of each
component is maintained by the component's corresponding PD session at
core.
At the current stage, the accounting is applied to RPC capabilities,
signal-context capabilities, and dataspace capabilities. Capabilities
that are dynamically allocated via core's CPU and TRACE service are not
yet covered. Also, the capabilities allocated by resource multiplexers
outside of core (like nitpicker) must be accounted by the respective
servers, which is not covered yet.
If a component runs out of capabilities, core's PD service prints a
warning to the log. To observe the consumption of capabilities per
component in detail, the PD service is equipped with a diagnostic
mode, which can be enabled via the 'diag' attribute in the target
node of init's routing rules. E.g., the following route enables the
diagnostic mode for the PD session of the "timer" component:
<default-route>
<service name="PD" unscoped_label="timer">
<parent diag="yes"/>
</service>
...
</default-route>
For subsystems based on a sub-init instance, init can be configured
to report the capability-quota information of its subsystems by
adding the attribute 'child_caps="yes"' to init's '<report>'
config node. Init's own capability quota can be reported by adding
the attribute 'init_caps="yes"'.
Fixes#2398
This patch reworks the implementation of core's RAM service to make use
of the 'Session_object' and to remove the distinction between the
"metadata" quota and the managed RAM quota. With the new implementation,
the session implicitly allocates its metadata from its own account. So
there is not need to handle 'Out_of_metadata' and 'Quota_exceeded' via
different exceptions. Instead, the new version solely uses the
'Out_of_ram' exception.
Furthermore, the 'Allocator::Out_of_memory' exception has become an alias
for 'Out_of_ram', which simplifies the error handling.
Issue #2398
The 'Session_object' unifies several aspects of server-component
implementations:
* It keeps track of session quotas and is equipped with standardized
interfaces (Quota_guard) to upgrade (and in the future potentially
downgrade) session quotas in a uniform way.
* It follows the pattern of modern RPC objects / signal handlers that
manage/dissolve themselves at the entrypoint given as constructor
argument. Thereby, the relationship with its entrypoint is always
coupled with the lifetime of the session-component object.
* It stores the session label, which was previously done manually by
most but not all server-component implementations.
* It stores the session 'diag' flag.
* It is equipped with output methods 'diag', 'error', and 'warning'.
All messages printed from the context of a session component is
automatically prefixed with the session type and client label.
Messages passed via 'diag' are only printed if the 'diag' flag of
the session is set.
Issue #2398
The 'diag' flag can be defined by a target node of a route in init's
configuration. It is propagated as session argument to the server, which
may evaluate the flag to enable diagnostic output for the corresponding
session.
Issue #2398
This patch makes use of the new 'Quota_transfer::Account' by the service
types in base/service.h and uses 'Quota_transfer' objects in
base/child.cc and init/server.cc.
Furthermore, it decouples the notion of an 'Async_service' from
'Child_service'. Init's 'Routed_service' is no longer a 'Child_service'
but is based on the new 'Async_service' instead.
With this patch in place, quota transfers do no longer implicitly use
'Ram_session_client' objects. So transfers can in principle originate
from component-local 'Ram_session_component' objects, e.g., as used by
noux. Therefore, this patch removes a strumbling block for turning noux
into a single threaded component in the future.
Issue #2398
The 'Quota_transfer' helper facilitated the implementation of quota
transfers between components in a transactional manner. It is designated
for framework-internal use (replacing the 'Transfer' class in child.h).
However, since it is also useful for init, we make it publicly
available.
The 'Quota_transfer::Account' class serves as an interface representing
the donor or receiver of quotas (parent, service, client).
Issue #2398
This patch replaces the 'Parent::Quota_exceeded',
'Service::Quota_exceeded', and 'Root::Quota_exceeded' exceptions
by the single 'Insufficient_ram_quota' exception type.
Furthermore, the 'Parent' interface distinguished now between
'Out_of_ram' (the child's RAM is exhausted) from
'Insufficient_ram_quota' (the child's RAM donation does not suffice to
establish the session).
This eliminates ambiguities and removes the need to convert exception
types along the path of the session creation.
Issue #2398
This patch adds sanity checks to the RPC entrypoint that detect attempts
to manage or dissolve the same RPC object twice. This is not always a
bug. I.e., if RPC objects are implemented in the modern way where the
object manages/dissolves itself. As the generic framework code (in
particular root/component.h) cannot rely on this pattern, it has to
call manage/dissolve for session objects anyway. For modern session
objects, this double attempt would result in a serious error (double
insertion into the object pool's AVL tree).
Issue #2398
This patch replaces the former use of size_t with the use of the
'Ram_quota' type to improve type safety (in particular to avoid
accidentally mixing up RAM quotas with cap quotas).
Issue #2398
The 'Ram_allocator' interface contains the subset of the RAM session
interface that is needed to satisfy the needs of the 'Heap' and
'Sliced_heap'. Its small size makes it ideal for intercepting memory
allocations as done by the new 'Constrained_ram_allocator' wrapper
class, which is meant to replace the existing 'base/allocator_guard.h'
and 'os/ram_session_guard.h'.
Issue #2398
This patch augments the existing session/session.h with useful types for
the session creation:
* The new 'Insufficient_ram_quota' and 'Insufficient_cap_quota'
exceptions are meant to supersede the old 'Quota_exceeded' exception
of the 'Parent' and 'Root' interfaces.
* The 'Session::Resources' struct subsumes the information about the
session quota provided by the client.
* The boolean 'Session::Diag' type will allow sessions to operate in a
diagnostic mode.
* The existing 'Session_label' is not also available under the alias
'Session::Label'.
* A few helper functions ease the extraction of typed session arguments
from the session-argument string.
Issue #2398
This accessor is useful to eagerly expand the slab with new slab blocks,
side stepping the slab's built-in policy for the allocation of new slab
blocks.
This is particularly important when using the slab for allocating the
cap space meta-data for the base-hw kernel. To guarantee that the slab
gets never exhausted in the kernel, it is expanded before entering the
kernel.
With the introduction of the 'Out_of_caps' exception type, the slab
needs to consider exceptions during the call of '_new_slab_block' by
reverting the 'nested' state.
This commit moves the headers residing in `repos/base/include/spec/*/drivers`
to `repos/base/include/drivers/defs` or repos/base/include/drivers/uart`
respectively. The first one contains definitions about board-specific MMIO
iand RAM addresses, or IRQ lines. While the latter contains device driver
code for UART devices. Those definitions are used by driver implementations
in `repos/base-hw`, `repos/os`, and `repos/dde-linux`, which now need to
include them more explicitely.
This work is a step in the direction of reducing 'SPEC' identifiers overall.
Ref #2403
For asynchronously provided sessions, the parent has to maintain the
session state as long as the server hasn't explicitly responded to a
close request. For this reason, the lifetime of such session states is
bound to the server, not the client.
When the server responds to a close request, the session state gets
freed. The 'session_response' implementation does not immediately
destroy the session state but delegates the destruction to a client-side
callback, which thereby also notifies the client. However, the code did
not consider the case where the client has completely vanished at
session-response time. In this case, we need to drop the session state
immediately.
Fixes#2391
Originally, the spec files for less specific SPEC values were include
via the 'select_from_repositories' function. This implies that BASE_DIR
must always be present in the list of 'REPOSITORIES'. Otherwise the
spec files won't be found. By explicitly including sub specs from
'$(BASE_DIR)/mk', we lift this restriction.
The <build-dir>/bin/ directory used to contain symbolic links to the
unstripped build results. However, since the upcoming depot tool
extracts the content of binary archives from bin/, the resulting
archives would contain overly large unstripped binaries, which is
undesired. On the other hand, always stripping the build results is not
a good option either because we rely of symbol information during
debugging.
This patch changes the installation of build results such that a new
'debug/' directory is populated besides the existing 'bin/' directory.
The debug directory contains symbolic links to the unstripped build
results whereas the bin directory contains stripped binaries that are
palatable for packaging (depot tool) and for assembling boot images (run
tool).
By installing the core object to bin/, we follow the same convention as
for regular binaries. This, in turn, enables us to ship core in a
regular binary archive. The patch also adjusts the run tool to pick up
the core object from bin/ for the final linking stage.
The use of 'select_from_repositories' for locating the linker script for
dynamically-linked executables only works if 'BASE_DIR' appears in the
list of 'REPOSITORIES'. This is the case when using the build system in
the traditional way but it is not desired when building binary archives
of individual components.
The base class of Registered must provide a virtual destructor to enable
safe deletion with just a base class pointer. This requirement can be
lifted by using Registered_no_delete in places where the deletion
property is not needed.
Fixes#2331
Ldso now does not automatically execute static constructors of the
binary and shared libraries the binary depends on. If static
construction is required (e.g., if a shared library with constructor is
used or a compilation unit contains global statics) the component needs
to execute the constructors explicitly in Component::construct() via
Genode::Env::exec_static_constructors().
In the case of libc components this is done by the libc startup code
(i.e., the Component::construct() implementation in the libc).
The loading of shared objects at runtime is not affected by this change
and constructors of those objects are executed immediately.
Fixes#2332
This patch destructs the environment sessions for the binary and the
dynamic linker along with the other environment sessions to avoid a
warning about reverting quota that occurs when attempting to close
these sessions too late.
This patch addresses the corner cases where an environment session
could not be routed, i.e., if an environment LOG log session is
routed to a non-existing child.
This patch extends the constructor of 'Local_connection' with an
optional 'label' argument, which was previously passed implicitly as
part of the 'args' argument. Keeping the label separate from 'args'
enables us to distinguish the client-specified label from a label that
resulted from a server-side label as it is used when rewriting a label
of an environment session (i.e., the binary name) in init's routing
policy. In principle, this patch eliminates the need for init's
explicite handling of the binary name via the '<binary>' node, or
at least allows us to simplity the binary-node handling.