In the past, the core-only privileged syscall `update_pd` was used only
to invalidate the TLB after removal of page-table entries.
Until now, the whole TLB, at least that of one protection domain, got
invalidated. In preparation for optimizations and the upcoming ARM v8
support, it is necessary to pass the virtual memory region that needs to get
invalidated. Moreover, the name of the call shall state explicitly
that it is used to invalidate the TLB.
Ref #3405
On x86 the CPU count is determined through ACPI's MADT by counting the
local APICs reported there. Some platforms report more APICs
than there are actual CPUs. These might be physically disabled CPUs.
Therefore, a check if the LAPIC is actually physically enabled in
hardware fixes this issue.
Thanks to Alex Boettcher
Fixes #3376
Fix initial stack pointer alignment for x86_64 in crt0.s startup code of
bootstrap. SysV ABI states that upon function entry (rsp + 8) % 16 = 0.
Therefore, we have to align the stack to 16 bytes, not 8, before every
'call' instruction. Otherwise, an FPU (GP) exception might be raised later
on because of unaligned FPU accesses.
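The alignment rule can be illustrated with a small C++ sketch. The helper names are hypothetical and not part of crt0.s; the sketch only models the arithmetic, assuming that 'call' pushes an 8-byte return address:

```cpp
#include <cassert>
#include <cstdint>

/* hypothetical helper: round a stack pointer down to a 16-byte boundary */
inline std::uint64_t align_stack_16(std::uint64_t sp)
{
    return sp & ~std::uint64_t(0xf);
}

/* stack pointer as seen by the callee: 'call' pushes an 8-byte return address */
inline std::uint64_t rsp_at_function_entry(std::uint64_t sp_before_call)
{
    assert(sp_before_call >= 8);
    return sp_before_call - 8;
}

/* SysV x86_64 condition at function entry: (rsp + 8) % 16 == 0 */
inline bool abi_conformant_entry(std::uint64_t rsp)
{
    return (rsp + 8) % 16 == 0;
}
```

A stack pointer that is only 8-byte aligned before the call (e.g., 0x1008) violates the condition at function entry, whereas the same value rounded down to 0x1000 satisfies it.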
issue #3365
Since gcc 8.3.0 generates SSE instructions into kernel code, the
kernel itself may raise FPU exceptions and/or corrupt user level FPU
contexts thereby. Neither is acceptable. Therefore, lazy FPU
switching becomes a no-go for base-hw: we cannot avoid FPU
instructions because of the entanglement of base-hw, base, and the tool
chain (libgcc_eh.a).
issue #3365
Also disable TS (task switch) flag in cr0 during kernel initialization,
so FPU faults are not raised. This became necessary since GCC lately
aggressively generates FPU instructions at arbitrary places and also at
early kernel-bootstrapping stages.
Fixes #3365
Components like kernel, core, and bootstrap that are built for a
specific board need to reside inside the same architecture-dependent
build directory. For instance there are sel4, foc, and hw kernel builds
for imx6q_sabrelite and imx7d_sabre, which have to reside inside the same
arm_v7 build directory.
This commit names those components explicitly and adapts the run tool accordingly.
Fix #3316
This enforces the use of unsigned 64-bit values for time in the duration type,
the timeout framework, the timer session, the userland timer-drivers, and the
alarm framework on all platforms. The commit also adapts the code that uses
these tools across all basic repositories (base, base-*, os, gems, libports,
ports, dde_*) to use unsigned 64-bit values for time as well as far as this
does not imply profound modifications.
Fixes #3208
Instead of using `cps` instruction, use an exception return
instruction to switch from `hyp` mode to `svc` mode.
Otherwise it causes unpredictable behaviour on ARM.
Fix #3284
Track the dataspaces used by attach and add handling of flushing VM space
when dataspace gets destroyed (not triggered via the vm_session interface).
Issue #3111
Triggering of an invalidated signal seems to be no real exception,
but something that occurs regularly. Therefore, the kernel warning
is of no use to developers anymore.
Ref #3277
As far as possible, remove the use of warning/error/log in the kernel.
Otherwise, the kernel context might try to take a lock held by a core
thread, which results in a syscall to block.
Fix #3277
* Introduces pending_signal syscall to check for new signals for the
calling thread without blocking
* Implements pending_signal in the base-library specific for hw to use the
new syscall
Fix #3217
* Introduce 64-bit tick counter
* Let the timer always count when possible, also if it already fired
* Simplify the kernel syscall API to have one current time call,
which returns the elapsed microseconds since boot
In commit "hw: improve cross-cpu synchronization" the implicitly safe
initialization of the global kernel lock became unsafe.
It is a static object, which is protected by the cxx library regarding
its initialization. But our cxx library uses a Genode::semaphore in
the contention case of object construction, which implicitly leads
to kernel syscalls for blocking the corresponding thread. This behaviour
is unacceptable for the kernel code.
Therefore, this fix explicitly guards the initialization of the kernel code
with a simple static boolean value.
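The idea of such a guard can be sketched as follows. This is not the actual kernel code: 'Lock_sketch', 'kernel_lock_sketch', and the construction counter are hypothetical names, and the sketch assumes that the first call happens before any concurrent access is possible.

```cpp
#include <cassert>
#include <new>

/* hypothetical stand-in for the global kernel lock */
struct Lock_sketch
{
    bool locked { false };

    void lock()   { locked = true;  }
    void unlock() { locked = false; }
};

/* counts constructor runs, for demonstration only */
static unsigned lock_constructions = 0;

Lock_sketch &kernel_lock_sketch()
{
    /*
     * All three statics are constant- or zero-initialized, so the
     * compiler needs no initialization guard (and hence no call into
     * the C++ runtime) for them.
     */
    static bool         initialized = false;
    static Lock_sketch *lock_ptr    = nullptr;
    alignas(Lock_sketch) static unsigned char storage[sizeof(Lock_sketch)];

    if (!initialized) {
        lock_ptr    = new (storage) Lock_sketch();
        initialized = true;
        lock_constructions++;
    }

    assert(lock_ptr != nullptr);
    return *lock_ptr;
}
```

Repeated calls return the same object and run the constructor only once, without relying on the compiler-generated guard of a function-local static object.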
Ref #3042
Ref #3043
This commit removes APIs that were previously marked as deprecated. This
change has the following implications:
- The use of the global 'env()' accessor is not possible anymore.
- Boolean accessor methods are no longer prefixed with 'is_'. E.g.,
instead of 'is_valid()', use 'valid()'.
- The last traces of 'Ram_session' are gone now. The 'Env::ram()'
accessor returns the 'Ram_allocator' interface, which is a subset of
the 'Pd_session' interface.
- All connection constructors need the 'Env' as argument.
- The 'Reporter' constructor needs an 'Env' argument now because the
reporter creates a report connection.
- The old overload 'Child_policy::resolve_session_request' that returned
a 'Service' does not exist anymore.
- The base/printf.h header has been removed, use base/log.h instead.
- The old notion of 'Signal_dispatcher' is gone. Use 'Signal_handler'.
- Transitional headers like os/server.h, cap_session/,
volatile_object.h, os/attached*_dataspace.h, signal_rpc_dispatcher.h
have been removed.
- The distinction between 'Thread_state' and 'Thread_state_base' does
not exist anymore.
- The header cpu_thread/capability.h along with the type definition of
'Cpu_thread_capability' has been removed. Use the type
'Thread_capability' defined in cpu_session/cpu_session.h instead.
- Several XML utilities (i.e., at os/include/decorator) could be removed
because their functionality is nowadays covered by util/xml_node.h.
- The 'os/ram_session_guard.h' has been removed.
Use 'Constrained_ram_allocator' provided by base/ram_allocator.h instead.
Issue #1987
This patch adjusts the implementation of the base library and core such
that the code no longer relies on deprecated APIs except for very few
cases, mainly to keep those deprecated APIs intact for now.
The most prominent changes are:
- Removing the use of base/printf.h
- Removing of the log backend for printf. The 'Console' with the
format-string parser is still there along with 'snprintf.h' because
the latter is still used at a few places, most prominently the
'Connection' classes.
- Removing the notion of a RAM session, which does not exist in
Genode anymore. Still the types were preserved (by typedefs to
PD session) to keep up compatibility. But this transition should
come to an end now.
- Slight renovation of core's tracing service, e.g., the use of an
Attached_dataspace as the Argument_buffer.
- Reducing the reliance on global accessors like deprecated_env() or
core_env(). Still there is a longish way to go to eliminate all such
calls. A useful pattern (or at least a stop-gap solution) is to
pass the 'Env' to the individual compilation units via init functions.
- Avoiding the use of the old 'Child_policy::resolve_session_request'
interface that returned a 'Service' instead of a 'Route'.
Issue #1987
- support to create multiple vCPUs
- support to implement Vm_session methods client side within base library
- adjust muen specific virtualbox4 version to compile/link
Issue #3111
As we don't execute the acpi_drv on Muen, we have to supply a static
'acpi' info as boot module. This is normally done by the
base/run/platform.inc include. However, when using base-hw-muen kernel
from a depot archive - as done by modern run scripts like
depot_download.run - the platform.inc magic is not applied.
This patch enhances the src archive of base-hw-muen with a mechanism
that creates a pre-defined acpi info at the bin directory via an
artificial src/acpi/target.mk file. This way, the static acpi ROM ends
up as boot module when importing the base-hw-muen archive into a
run script.
This patch replaces the former prominent use of pointers by references
wherever feasible. This has the following benefits:
* The contract between caller and callee becomes more obvious. When
passing a reference, the contract says that the argument cannot be
a null pointer. The caller is responsible to ensure that. Therefore,
the use of reference eliminates the need to add defensive null-pointer
checks at the callee site, which sometimes merely exist to be on the
safe side. The bottom line is that the code becomes easier to follow.
* Reference members must be initialized via an object initializer,
which promotes a programming style that avoids intermediate object-
construction states. Within core, there are still a few pointers
as member variables left though. E.g., caused by the late association
of 'Platform_thread' objects with their 'Platform_pd' objects.
* If no pointers are present as member variables, we don't need to
manually provide declarations of a private copy constructor and
an assignment operator to avoid -Weffc++ errors "class ... has
pointer data members [-Werror=effc++]".
This patch also changes a few system bindings on NOVA and Fiasco.OC,
e.g., the return value of the global 'cap_map' accessor has become a
reference. Hence, the patch touches a few places outside of core.
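The difference in contract can be sketched with two heavily simplified, hypothetical types ('Pd_sketch' and the two thread variants are illustration only and do not mirror core's classes):

```cpp
#include <cassert>

struct Pd_sketch { unsigned id; };   /* hypothetical stand-in */

/* pointer-based variant: the callee checks for null defensively */
struct Thread_with_pointer
{
    Pd_sketch *_pd;

    Thread_with_pointer(Pd_sketch *pd) : _pd(pd) { assert(pd != nullptr); }

    unsigned pd_id() const { return _pd->id; }
};

/* reference-based variant: a null argument cannot be expressed */
struct Thread_with_reference
{
    Pd_sketch &_pd;

    Thread_with_reference(Pd_sketch &pd) : _pd(pd) { }

    unsigned pd_id() const { return _pd.id; }
};
```

In the reference-based variant, the member must be bound in the initializer list, and the defensive check disappears from the callee.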
Fixes #3135
This patch moves the removal of the signal context from the
'_platform_finish_dissolve' to the '_platform_begin_dissolve'
method. This is needed because the removal involves taking
the signal-registry lock. The latter must adhere the same
locking order as the code path used for signal delivery.
Fixes#3109
Since the timer and timeout handling is part of the base library (the
dynamic linker), it belongs to the base repository.
Besides moving the timer and its related infrastructure (alarm, timeout
libs, tests) to the base repository, this patch also moves the timer
from the 'drivers' subdirectory directly to 'src' and disambiguates the
timer's build locations for the various kernels. Otherwise the different
timer implementations could interfere with each other when using one
build directory with multiple kernels.
Note that this patch changes the include paths for the former os/timer,
os/alarm.h, os/duration.h, and os/timed_semaphore.h to base/.
Issue #3101
This commit solves several issues:
* correct calculation of overlap region when detaching regions
in managed dataspaces
* prevent unmap of Fiasco.OC's core log buffer
* calculate the core-local address of regions in managed dataspaces
if possible at all and use it to unmap on kernels where this is
needed
Fix #976
Fix #3082
This commit addresses several multiprocessing issues in base-hw:
* it reworks cross-cpu maintenance work for TLB invalidation by
introducing a generic Inter_processor_work and removes the
so-called Cpu_domain_update
* thereby it solves the cross-cpu thread destruction, when the
corresponding thread is active on another cpu (fix #3043)
* it adds the missing TLB shootdown for x86 (fix #3042)
* on ARM it removes the TLB shootdown via IPIs, because this
is not needed on the multiprocessing ARM platforms we support
* it enables the per-cpu initialization of the kernel's cpu
objects, which means the initialization of those objects is executed
by the proper cpu
* it rolls back the prior decision to make multiprocessing an aspect,
but puts back certain 'smp' mechanisms (like cross-cpu lock)
into the generic code base for simplicity reasons
* To base-hw/recipes/src add base-hw-arndale, base-hw-imx53_qsb,
base-hw-imx53_qsb_tz, base-hw-odroid_xu, base-hw-panda, base-hw-rpi,
base-hw-wand_quad
* Ensure that the correct base-hw recipe is chosen by the run module
'boot_dir/hw'
The new 'conditional' method simplifies the typical use case for
'Constructible' objects where the constructed/destructed state depends
on a configuration parameter. The method alleviates the need to
re-implement the logic again and again.
The patch also removes the 'Reconstructible' constructor arguments
because they are unused.
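A minimal sketch of the pattern, using a simplified stand-in built on std::optional (C++17) rather than the real 'Constructible' utility. It assumes that 'conditional' constructs the object if the condition holds and the object does not exist yet, and destructs it if the condition does not hold:

```cpp
#include <cassert>
#include <optional>
#include <utility>

/* simplified, hypothetical stand-in for a 'Constructible'-like holder */
template <typename T>
class Constructible_sketch
{
    private:

        std::optional<T> _obj { };

    public:

        bool constructed() const { return _obj.has_value(); }

        template <typename... ARGS>
        void construct(ARGS &&... args)
        {
            _obj.emplace(std::forward<ARGS>(args)...);
        }

        void destruct() { _obj.reset(); }

        /* bring the constructed/destructed state in line with 'condition' */
        template <typename... ARGS>
        void conditional(bool condition, ARGS &&... args)
        {
            if (condition && !constructed())
                construct(std::forward<ARGS>(args)...);

            if (!condition)
                destruct();
        }

        T &operator * () { assert(constructed()); return *_obj; }
};
```

A component could then call something like 'driver.conditional(enabled, args...)' on each configuration update instead of spelling out the construct/destruct logic by hand ('driver' and 'enabled' being hypothetical names).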
Fixes #3006
Previously, the trace control of a thread was initialized in its
constructor (which is generic for all components). This has the
disadvantage that the CPU-session-pointer member of the thread might not
be valid at this point. It cannot be replaced by using the
"deprecated_env" CPU session either, as constructing the deprecated
environment causes trouble in Core. But as the trace control
shouldn't be needed in Core anyway, the initialization can be moved to
the Thread::start implementation of non-core components. This code
already takes care of the CPU session pointer.
Fixes #2901
The sinfo API now also exports PCI devices without logical IRQs.
Therefore, explicitly check interrupt count in get_msi_params() function
and ignore such devices.
- Use latest Muen version
- Sync VirtualBox Muen subject state
- Drop unnecessary subject IP patch
- Adapt Muen RUN_OPTs
- Update documentation
Note: the GPL 2017 toolchain is now required and as the debug output
format has changed the mulog-subject.py script must be updated on
autopilot instances.
AVL trees can't be copied with the default copy constructor as the
parent pointer of the first item of both of the resulting trees would
point to the original tree. Copying an AVL node, however, generally
violates the integrity of the corresponding tree. The copy constructor
of Avl_tree is used in some places but in those places it can be
replaced easily. So, this commit deletes the copy constructor of
Avl_node_base which makes Avl_node and Avl_tree non-copyable.
Issue #2654
This patch adjusts the code of the base, base-<kernel>, and os repositories
to fix violations of the best practices suggested by "Effective C++" as
reported by the -Weffc++ compiler argument. The changes follow the patterns
outlined below:
* A class with virtual functions can no longer publicly inherit base
classes without a vtable. The inherited object may either be moved
to a member variable, or inherited privately. The latter would be
used for classes that inherit 'List::Element' or 'Avl_node'. In order
to enable the 'List' and 'Avl_tree' to access the meta data, the
'List' must become a friend.
* Instead of adding a virtual destructor to abstract base classes,
we inherit the new 'Interface' class, which contains a virtual
destructor. This way, single-line abstract base classes can stay
as compact as they are now. The 'Interface' utility resides in
base/include/util/interface.h.
* With the new warnings enabled, all member variables must be explicitly
initialized. Basic types may be initialized with '='. All other types
are initialized with braces '{ ... }' or as class initializers. If
basic types and non-basic types appear in a row, it is nice to only
use the brace syntax (also for basic types) and align the braces.
* If a class contains pointers as members, it must now also provide a
copy constructor and assignment operator. In most cases, one
would make them private, effectively disallowing the objects to be
copied. Unfortunately, this warning cannot be fixed by inheriting
our existing 'Noncopyable' class (the compiler fails to detect that
the inheriting class cannot be copied and still gives the error).
For now, we have to manually add declarations for both the copy
constructor and assignment operator as private class members. Those
declarations should be prepended with a comment like this:
    /*
     * Noncopyable
     */
    Thread(Thread const &);
    Thread &operator = (Thread const &);
In the future, we should revisit these places and try to replace
the pointers with references. In the presence of at least one
reference member, the compiler would no longer implicitly generate
a copy constructor. So we could remove the manual declaration.
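The patterns above can be combined in one small sketch. 'Interface' here is a local stand-in for the utility from util/interface.h, and 'Logger' and 'Counting_logger' are hypothetical example classes:

```cpp
#include <cassert>

/* local stand-in for the 'Interface' utility: provides a virtual destructor */
struct Interface { virtual ~Interface() { } };

/* a single-line abstract base class stays compact */
struct Logger : Interface { virtual void log(char const *msg) = 0; };

class Counting_logger : public Logger
{
    private:

        /* all members are explicitly initialized, using the brace syntax */
        unsigned  _count    { 0 };
        unsigned *_external { nullptr };

        /*
         * Noncopyable
         */
        Counting_logger(Counting_logger const &);
        Counting_logger &operator = (Counting_logger const &);

    public:

        Counting_logger(unsigned *external) : _external(external)
        {
            assert(external != nullptr);
        }

        void log(char const *) override { _count++; (*_external)++; }

        unsigned count() const { return _count; }
};
```

Because the class holds a pointer member, the copy constructor and assignment operator are declared privately and left undefined, as described above.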
Issue #465
This is necessary because in contrast to the zynq boards (see specs in genode-world), only zynq_qemu uses UART_0.
These files should thus fall under the zynq_qemu spec.
Fixes #2615
Instead of changing the attributes (e.g., Xd bit) of the top-level page-tables,
set them to allow everything. Only leafs of the paging hierarchy are set
according to the paging attributes given by core. Otherwise, top-level page-
table attributes are changed during lifetime, which requires a TLB flush
operation (not intended in the semantic of the kernel/core).
This led to problems when using the non-executable features introduced by
issue #1723 in the recent past.
Recent work related to issue 1723 showed that there is potential
to get rid of code duplication in MMU fault handling especially
with regard to ARM cpus.
* Instead of always reloading the page tables when a thread context is
switched, only do this when the next target is a thread of another user PD;
core threads are always executed within the last PD's page-table set
* remove the concept of the mode transition
* instead map the exception vector once in bootstrap code into kernel's
memory segment
* when a new page directory is constructed for a user PD, copy over the
top-level kernel segment entries on RISCV and X86, on ARM we use a designated
page directory register for the kernel segment
* transfer the current CPU id from bootstrap to core/kernel in a register
to ease first stack address calculation
* align cpu context member of threads and vms, because of x86 constraints
regarding the stack-pointer loading
* introduce Align_at template for members with alignment constraints
* let the x86 hardware do part of the context saving in ISS, by passing
the thread context into the TSS before leaving to user-land
* use one exception vector for all ARM platforms including Arm_v6
Fix #2091
* introduce new syscall (core-only) to create privileged threads
* take the privilege level of the thread into account
when doing a context switch
* map kernel segment as accessible for privileged code only
Ref #2091
Always switch to the "exception stack" instead of having a hardware initiated
stack switch during exceptions/interrupts when the privilege level changes only.
Moreover, this commit increases the exception stack slightly.
Ref #2091
* introduces central memory map for core/kernel
* on 32-bit platforms the kernel/core starts at 0x80000000
* on 64-bit platforms the kernel/core starts at 0xffffffc000000000
* mark kernel/core mappings as global ones (tagged TLB)
* move the exception vector to begin of core's binary,
thereby bootstrap knows from where to map it appropriately
* do not map boot modules into core anymore
* constrain core's virtual heap memory area
* differentiate between user's and core's main thread's UTCB,
which now resides inside the kernel segment
Ref #2091
In the past, a signal context that was chosen for handling by
'Signal_receiver::pending_signal' and always triggered again before
the next call of 'pending_signal', caused all other contexts behind
in the list to starve. This was the case because 'pending_signal'
always took the first pending context in its context list.
We avoid this problem now by handling pending signals in a round-robin
fashion instead.
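A simplified model of the round-robin selection is sketched below. 'Receiver_sketch' and its index-based interface are hypothetical; the real implementation operates on a list of signal contexts:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Context_sketch { bool pending; };

class Receiver_sketch
{
    private:

        std::vector<Context_sketch> _contexts;
        std::size_t                 _next { 0 };  /* start of next search */

    public:

        explicit Receiver_sketch(std::size_t n)
        : _contexts(n, Context_sketch { false }) { }

        void submit(std::size_t i)
        {
            assert(i < _contexts.size());
            _contexts[i].pending = true;
        }

        /* returns the index of the handled context, or -1 if none is pending */
        long pending_signal()
        {
            for (std::size_t n = 0; n < _contexts.size(); n++) {

                std::size_t const i = (_next + n) % _contexts.size();
                if (!_contexts[i].pending)
                    continue;

                _contexts[i].pending = false;

                /* continue behind the handled context next time */
                _next = (i + 1) % _contexts.size();
                return (long)i;
            }
            return -1;
        }
};
```

Even if context 0 is triggered again right after being handled, the next call starts searching behind it, so a pending context further back is served first.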
Ref #2532
Some x86 machines do have a LAPIC speed < 1000 ticks per millisecond
when configured to use the maximum divider (as it was always the case).
But we need microseconds precision for the timeout framework. Thus,
reduce the divider dynamically until the frequency fulfills our
requirements.
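The selection loop can be sketched as follows. The sketch assumes power-of-two dividers from 128 down to 1 and models the resulting frequency as the undivided frequency divided by the divider, whereas a real driver would have to measure the frequency for each divider. All names are hypothetical:

```cpp
#include <cassert>
#include <cstdint>

/* modelled timer frequency for a given divider (assumption, see above) */
inline std::uint32_t ticks_per_ms(std::uint32_t undivided_ticks_per_ms,
                                  unsigned      divider)
{
    assert(divider != 0);
    return undivided_ticks_per_ms / divider;
}

/*
 * Start with the maximum divider and halve it until the timer delivers
 * at least 'min_ticks_per_ms' or no further reduction is possible.
 */
inline unsigned choose_divider(std::uint32_t undivided_ticks_per_ms,
                               unsigned      max_divider      = 128,
                               std::uint32_t min_ticks_per_ms = 1000)
{
    unsigned divider = max_divider;

    while (divider > 1
        && ticks_per_ms(undivided_ticks_per_ms, divider) < min_ticks_per_ms)
        divider /= 2;

    return divider;
}
```

With an undivided frequency of 64000 ticks per millisecond, the maximum divider yields only 500 ticks per millisecond, so the loop settles on a divider of 64.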
Ref #2400
There are hardware timers whose frequency can't be expressed as
ticks-per-microsecond integer-value because only a ticks-per-millisecond
integer-value is precise enough. We don't want to use expensive
floating-point values here but nonetheless want to translate from ticks
to time with microseconds precision. Thus, we split the input in two and
translate both parts separately. This way, we can raise precision by
shifting the values to their optimal bit position. Afterwards, the results
are shifted back and merged together again.
As this algorithm is not so trivial anymore and used by at least three
timer drivers (base-hw/x86_64, base-hw/cortex_a9, timer/pit), move it to a
generic header to avoid redundancy.
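The split translation can be sketched for 64-bit values as shown below. The function name, the fixed 64-bit width, and the shift constants are assumptions made for illustration and need not match the generic header:

```cpp
#include <cassert>
#include <cstdint>

/*
 * Translate timer ticks to microseconds without floating point by
 * handling the upper and lower half of 'ticks' separately.
 */
inline std::uint64_t ticks_to_us_sketch(std::uint64_t ticks,
                                        std::uint64_t ticks_per_ms)
{
    assert(ticks_per_ms != 0);

    constexpr unsigned      HALF_WIDTH = 32;
    constexpr std::uint64_t MSB_MASK   = ~std::uint64_t(0) << HALF_WIDTH;
    constexpr std::uint64_t LSB_MASK   = ~std::uint64_t(0) >> HALF_WIDTH;
    constexpr unsigned      MSB_RSHIFT = 10;
    constexpr unsigned      LSB_LSHIFT = HALF_WIDTH - MSB_RSHIFT;

    /* upper half: shift right so that the multiplication cannot overflow */
    std::uint64_t const msb =
        ((((ticks & MSB_MASK) >> MSB_RSHIFT) * 1000) / ticks_per_ms)
        << MSB_RSHIFT;

    /* lower half: shift left to keep the fractional part of the division */
    std::uint64_t const lsb =
        ((((ticks & LSB_MASK) << LSB_LSHIFT) * 1000) / ticks_per_ms)
        >> LSB_LSHIFT;

    return msb + lsb;
}
```

For a timer with 1500 ticks per millisecond, 2500 ticks translate to 1666 microseconds, although the frequency has no integer ticks-per-microsecond representation. For very small 'ticks_per_ms' values and large tick counts, the result may still exceed 64 bits.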
Ref #2400
Due to the simplicity of the algorithm that translated from timer ticks
to time, we lost microseconds precision although the timer allows for it.
Ref #2400
When running core as the kernel inside every component, a separate
stack area for core is needed that is different from the user-land
component's one.
Ref #2091
For most base platforms (except linux and sel4), the initialization of
boot modules is the same. Thus, merge this default implementation in the
new unit base/src/core/platform_rom_modules.cc.
Ref #2490
The kernel timer on RPI is able to measure time microseconds-precise.
However, due to a bug, we dropped precision during the ticks-to-time
translation and returned only milliseconds-precise time.
Ref #2400
rm_fault.run triggers write on read-only ROM provided by core, which
fails without this patch:
arm - "raised unhandled data abort"
x86 - (silent/invisible) busy loop because write fault gets never resolved
A bug in the timer-ticks-to-microseconds translation of the kernel timer
caused the user time to periodically get stuck for about 32 milliseconds
and then jump forward to the normal level again.
Ref #2400
The recently implemented capability resource trading scheme unfortunately
broke the automated capability memory upgrade mechanism needed by base-hw
kernel/core. This commit splits the capability memory upgrade mechanism
from the PD session ram_quota upgrade, and moves that functionality
into a separate Pd_session::Native_pd interface.
Ref #2398
On ARM, we do not have a component-local hardware time-source. The ARM
performance counter has no reliable frequency as the ARM idle command
halts the counter. Thus, we do not do local time interpolation on ARM.
Except we're on the HW kernel. In this case we can read out the kernel
time instead.
Ref #2435
By separating the session-interface concerns from the mechanics of the
dataspace creation, the code becomes simpler to follow, and the RAM
session can be more easily merged with the PD session in a subsequent
step.
Issue #2407
This patch allows core's 'Signal_transmitter' implementation to sidestep
the 'Env::Pd' interface and thereby adhere to a stricter layering within
core. The 'Signal_transmitter' now uses - on kernels that depend on it -
a dedicated (and fairly freestanding) RPC proxy mechanism for signal
deliver, instead of channeling signals through the 'Pd_session::submit'
RPC function.
Previously, the Genode::Timer::curr_time always used the
Timer_session::elapsed_ms RPC as back end. Now, Genode::Timer reads
this remote time only in a periodic fashion independently from the calls
to Genode::Timer::curr_time. If now one calls Genode::Timer::curr_time,
the function takes the last read remote time value and adapts it using
the timestamp difference since the remote-time read. The conversion
factor from timestamps to time is estimated on every remote-time read
using the last read remote-time value and the timestamp difference since
the last remote time read.
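The interpolation scheme can be modelled roughly as follows. 'Interpolator_sketch' is a hypothetical type; it leaves out the periodic scheduling of remote-time reads and any stability checks on the conversion factor:

```cpp
#include <cassert>
#include <cstdint>

struct Interpolator_sketch
{
    std::uint64_t remote_us = 0;  /* last remote time read */
    std::uint64_t remote_ts = 0;  /* timestamp taken at that read */
    std::uint64_t ts_per_ms = 0;  /* estimated factor, 0 means unknown */

    /* called on every (periodic) remote-time read */
    void update(std::uint64_t new_remote_us, std::uint64_t ts)
    {
        assert(new_remote_us >= remote_us && ts >= remote_ts);

        std::uint64_t const us_diff = new_remote_us - remote_us;
        std::uint64_t const ts_diff = ts - remote_ts;

        /* estimate the factor from the differences since the last read */
        if (us_diff)
            ts_per_ms = (ts_diff * 1000) / us_diff;

        remote_us = new_remote_us;
        remote_ts = ts;
    }

    /* local time: last remote time plus interpolated timestamp difference */
    std::uint64_t curr_time_us(std::uint64_t ts) const
    {
        assert(ts >= remote_ts);

        if (!ts_per_ms)
            return remote_us;

        return remote_us + ((ts - remote_ts) * 1000) / ts_per_ms;
    }
};
```

With a remote read of 100000 us at timestamp 200000000, the estimated factor is 2000000 timestamp ticks per millisecond, so a timestamp that is 1000000 ticks later yields a local time of 100500 us.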
This commit also re-works the timeout test. The test now has two stages.
In the first stage, it tests fast polling of the
Genode::Timer::curr_time. This stage checks the error between locally
interpolated and timer-driver time as well as whether the locally
interpolated time is monotone and sufficiently homogeneous. In the
second stage, several periodic and one-shot timeouts are scheduled at
once. This stage checks whether the timeouts trigger with sufficient precision.
This commit adds the new Kernel::time syscall to base-hw. The syscall is
solely used by the Genode::Timer on base-hw as substitute for the
timestamp. This is because on ARM, the timestamp function uses the ARM
performance counter that stops counting when the WFI (wait for
interrupt) instruction is active. This instruction, however is used by
the base-hw idle contexts that get active when no user thread needs to
be scheduled. Thus, the ARM performance counter is not a good choice for
time interpolation and we use the kernel internal time instead.
With this commit, the timeout library becomes a basic library. That means
that it is linked against the LDSO which then provides it to the program it
serves. Furthermore, you can't use the timeout library anymore without the
LDSO because through the kernel-dependent LDSO make-files we can achieve a
kernel-dependent timeout implementation.
This commit introduces a structured Duration type that shall successively
replace the use of Microseconds, Milliseconds, and integer types for duration
values.
Open issues:
* The timeout test fails on Raspberry PI because of precision errors in the
first stage. However, this does not render the framework unusable in general
on the RPI but merely is an issue when speaking of microseconds precision.
* If we run on ARM with another Kernel than HW the timestamp speed may
continuously vary from almost 0 up to CPU speed. The Timer, however,
only uses interpolation if the timestamp speed remained stable (12.5%
tolerance) for at least 3 observation periods. Currently, one period is
100ms, so it's 300ms. As long as this is not the case,
Timer_session::elapsed_ms is called instead.
Anyway, it might happen that the CPU load was stable for some time so
interpolation becomes active and now the timestamp speed drops. In the
worst case, we would now have 100ms of slowed down time. The bad thing
about it would be, that this also affects the timeout of the period.
Thus, it might "freeze" the local time for more than 100ms.
On the other hand, if the timestamp speed suddenly rises after some
stable time, interpolated time can get too fast. This would shorten the
period but nonetheless may result in drifting away into the far future.
Now we would have the problem that we can't deliver the real time
anymore until it has caught up because the output of Timer::curr_time
shall be monotone. So, effectively local time might "freeze" again for
more than 100ms.
It would be a solution to not use the Trace::timestamp on ARM w/o HW but
a function whose return value causes the Timer to never use
interpolation because of its stability policy.
Fixes #2400
With this, we get rid of platform specific timer interfaces. The new
Timer class does the same as the old Clock class and has a generic
interface. The old Timer class was merely used by the old Clock class.
Also, we get rid of having only one timer instance which we tell with
each method call for which CPU it shall be done. Instead now each Cpu
object has its own Timer member that knows the CPU it works for.
Also, rename all "tics" to "ticks".
Fixes #2347
Previously we did write the SPSR via an MSR instruction without
additional flags. Unfortunately, this tells the CPU to write the
register only partially. This often isn't a problem as the user's PSR
reset value normally conforms to our expectations, but in some cases
(e.g., PSR endianness bit on WandBoard core #4) the reset value is bad.
Thus, we have to add the CXSF flags (access Control + eXtension + Status
+ Flags) so the CPU overwrites the entire register.
Fixes #2254
This patch reduces the number of exception types by facilitating
globally defined exceptions for common usage patterns shared by most
services. In particular, RPC functions that demand a session-resource
upgrade no longer reflect this condition via a session-specific
exception but via the 'Out_of_ram' or 'Out_of_caps' types.
Furthermore, the 'Parent::Service_denied', 'Parent::Unavailable',
'Root::Invalid_args', 'Root::Unavailable', 'Service::Invalid_args',
'Service::Unavailable', and 'Local_service::Factory::Denied' types have
been replaced by the single 'Service_denied' exception type defined in
'session/session.h'.
This consolidation eases the error handling (there are fewer exceptions
to handle), alleviates the need to convert exceptions along the
session-creation call chain, and avoids possible aliasing problems
(catching the wrong type with the same name but living in a different
scope).
This patch mirrors the accounting and trading scheme that Genode employs
for physical memory to the accounting of capability allocations.
Capability quotas must now be explicitly assigned to subsystems by
specifying a 'caps=<amount>' attribute to init's start nodes.
Analogously to RAM quotas, cap quotas can be traded between clients and
servers as part of the session protocol. The capability budget of each
component is maintained by the component's corresponding PD session at
core.
At the current stage, the accounting is applied to RPC capabilities,
signal-context capabilities, and dataspace capabilities. Capabilities
that are dynamically allocated via core's CPU and TRACE service are not
yet covered. Also, the capabilities allocated by resource multiplexers
outside of core (like nitpicker) must be accounted by the respective
servers, which is not covered yet.
If a component runs out of capabilities, core's PD service prints a
warning to the log. To observe the consumption of capabilities per
component in detail, the PD service is equipped with a diagnostic
mode, which can be enabled via the 'diag' attribute in the target
node of init's routing rules. E.g., the following route enables the
diagnostic mode for the PD session of the "timer" component:
  <default-route>
    <service name="PD" unscoped_label="timer">
      <parent diag="yes"/>
    </service>
    ...
  </default-route>
For subsystems based on a sub-init instance, init can be configured
to report the capability-quota information of its subsystems by
adding the attribute 'child_caps="yes"' to init's '<report>'
config node. Init's own capability quota can be reported by adding
the attribute 'init_caps="yes"'.
Fixes #2398