The new types in base/ram.h model different allocation scenarios and
error cases by mere C++ types without using exceptions. They are meant
to replace the former 'Ram_allocator' interface. As of now, the
'Unmapped_allocator' closely captures the former 'Ram_allocator'
semantics. The 'Constrained_allocator' is currently an alias for
'Unmapped_allocator' but is designated for eventually allocating
mapped RAM.
In contrast to the 'Ram_allocator' interface, which talked about
dataspace capabilites but left the lifetime management of the
allocated RAM to the caller, the new API represents an allocation
as a guard type 'Allocation', which deallocates on destruction by
default.
Allocation errors are captured by a 'Result' type that follows
the 'Attempt' pattern.
As a transitionary feature, the patch largely maintains API
compatibility with the original 'Ram_allocator' by providing
the original (exception-based) 'Ram_allocator::alloc' and
'Ram_allocator::free' methods as a wrapper around the new
'Ram::Constrained_allocator'. So components can be gradually
updated to the new 'Ram::' interface.
Issue #5502
After constructed, a 'Thread' object may remain in a dysfunctional state
should the stack allocation have failed. This condition is no longer
reflected as a C++ exception but as result value of 'Thread::info()'.
Keep 'Thread::name' as public constant because the stack is not always
available for storing the name.
The 'stack_top' accessor has been removed because this information is
already provided by 'Thread::info()'.
Issue #5245
With planned removal of Thread:: exceptions, we need to consider that a
'Thread' object may exist without a valid 'Stack' and therefore without
a valid 'Native_thread', which is hosted as part of the 'Stack'.
This patch reworks the code that accesses the 'Native_thread' to use the
new 'Thread::with_native_thread' interface. Within the local scope,
the native thread is referred to as 'nt'.
The _init_platform_thread and _deinit_platform_thread() have been
replaced by _init_native_thread and _deinit_native_thread, which take
a 'Stack &' as argument.
As a safety caution, 'Native_thread' objects can no longer be copied.
Issue #5245
To correctly deal with sparsely used cpu numbers on x86,
add an array to the board data-structure that tracks the actual
cpus detected. Later when memory is available, and the per-cpu
kernel stacks and cpu-local memory gets initialized, use the
array to reference the correct memory slots.
Fixgenodelabs/genode#5468
The special `hw/memory_consts.h` header file that is automatically
generated from an assembler base file to avoid pre-processing assembler
files, was missing inclusion guards.
Ref genodelabs/genode#5468
The methods to read and write state to the vCPU were named
from the viewpoint of the `Vcpu_state` data structure that was being
exchanged with the VMM, i.e. `read_vcpu_state()` was reading the data
structure (and writing it into the vCPU) wheras `write_vcpu_state()` was
writing the vCPU's state into `Vcpu_state`. This is confusing from a
more generic understanding of a "vCPU state".
Rename the state manipulation methods into `load()` and `store()`, where
`load()` loads the supplied `Vcpu_state` into the vCPU and `store()` stores the
vCPU's state in the supplied `Vcpu_state`.
Issue #5483
Initializing a vCPU on base-hw involved a round-trip to the kernel to
send the initial startup exit signal and setting up the vCPU on first
run of the vCPU thread.
Signal the vCPU handler directly from the base libary. This removes the
need to call `Vm::run()` during construction of the vCPU, possibly from
a remote core. Check that `Vm::run()` and `Vm::pause()` are only called
from the local CPU core. Finally, move the initialization of the vCPU
from `Vm::proceed()` to `Vm::run()`, so that the `Vcpu_state` can be
synced from the VMM directly after initialization.
Fixes#5483
Syncing the vCPU state with the VMM before the virtualization technology
was inititialized on the target CPU core may fail, such as on VMX where
the resulting `vmread` instructions will cause #UD faults.
Skip syncing the vCPU state when the vCPU isn't initialized
yet and sync the initial `Vcpu_state` from the VMM when the vCPU is
known to be initialized.
Issue #5474
The work in #5425 missed an adaption for setting up the stack pointer
and pushing the appropriate TRAP_VMEXIT exception value before jumping
to `_kernel_entry` after returning from `vmlaunch`.
Adapt the VMX implementation and remove a now superfluous argument from
the initialization method.
Issue #5474
On x86_64, calling Hw_vcpu::run() will cause a startup exit that is
signaled to the VMM. The VMM will subsequently call with_state().
When Hw_vcpu::run() is called from the Hw_vcpu constructor, this can
lead to a situation where the VMM calls with_state() on a vCPU that
isn't fully constructed yet.
The VMM library API requires that the vCPU starts up in order to emit a
startup exit at construction. Call Hw_vcpu::run() from the
Vm_connection::Vcpu constructor instead of calling run() from the
Hw_vcpu constructor to avoid running a native vCPU that isn't fully
constructed yet.
Fixes#5442
* Instead of only implicitely update the last scheduled Cpu context with
the CPU state, when entering the architecture-speficic machine
exception vector, cache this data on kernel context stack, and deliver
it as argument when entering the kernel via high-level language
* Handle Cpu context's exception explicitely in kernel main routine
* Make cached CPU state available to error handling lambda in case of
Kernel::Mutex double entering (aka kernel fault)
* Rename Cpu::schedule to Cpu::assign
Ref genodelabs/genode#5425
* Rename Kernel::Lock into Kernel::Mutex
* Replace Guard object by template function that expects
lambda to handle re-entrance by same cpu
Ref genodelabs/genode#5425
- and store the vCPU startup state in a dedicated enum
- return the STARTUP exit from Vm::run()
- initialize the vCPU from Vm::proceed() instead of Vm::exception()
Fixgenodelabs/genode#5450
While implementing TSC calibration in #5215, the issue of properly serializing
TSC reads came up. Some learnings of the discussion were noted in #5430.
Using `cpuid` for serialization as in Trace::timestamp() is portable,
but will cause VM exits on VMX and SVM and is therefore unsuitable to
retain a roughly working calibration loop while running virtualized.
On the other hand on most AMD systems, dispatch serializing `lfence`
needs to be explicitly enabled via a non-architectural MSR.
Enable setting up dispatch serializing lfence on AMD systems and always
serialize rdtsc accesses in Hw::Tsc::rdtsc() for maximum reliability.
Issues #5215, #5430
Upto now, bootstrap used the Programmable Interval Timer to set a
suitable divider and determine the frequency of the Local APIC.
The PIT is not available on recent x86_64 hardware anymore.
Move Local APIC calibration to bootstrap and use the ACPI timer as a
reference. Clean up hw's timer implementation a little and disable the
PIT in bootstrap.
Fixes#5215
To get the Time Stamp Counter's frequency, hw relied on a complex and
incomplete algorithm.
Since this is a one-time initialization issue, move TSC calibration to
bootstrap and implement it using the ACPI timer.
Issue #5215
* Move all Kernel::Signal_* structures to kernel/signal.*
* Remove return value of kill_signal_context, which wasn't evaluated
* Remove Kernel::Signal_context::can_kill
* Remove Kernel::Signal_context::can_submit
* Remove Kernel::Signal_receiver::can_add_handler
* Turn nullptr into cxx nullptr instead of just zero
* Turn boolean values into true/false instead of one/zero
* Always add to signal FIFO also if submit counter
cannot get increased enough
Fixgenodelabs/genode#5416
Instead of blocking in case of exceptions and MMU faults, delegate
the faulter's scheduling context to the assigned pager thread.
Fixgenodelabs/genode#5318
Instatiate a separate pager thread for each cpu core. Every time a pager
object gets managed by the Pager_entrypoint, assign it to the pager thread
on the same cpu core.
Ref genodelabs/genode#5318
* Rename Kernel::Cpu_job to Kernel::Cpu_context (alias Kernel::Cpu::Context)
* State first Cpu affinity of Cpu::Context at construction time
* Move cpu affinity argument from kernel syscall create_thread to start_thread
* Ensure that Cpu pointer is always valid
Fixgenodelabs/genode#5319
On x86, the `Vm_session_component` obscured the differences between SVM
and VMX.
Separate the implementations, factor out common functionality and
address a number of long-standing issues in the process:
- Allocate nested page tables from Core_ram_allocator as a more suitable
abstraction and account for the required memory, subtract the
necessary amount of RAM from the session's `Ram_quota` *before*
constructing the session object, to make sure that the memory
allocated from the `Core_ram_allocator` is available from the VMM's
RAM quota.
- Move the allocation of Vcpu_state and Vcpu_data into the Core::Vcpu
class and use the Core RAM Allocator to allocate memory with a known
physical address.
- Remove the fixed number of virtual CPUs and the associated reservation
of memory by using a Registry for a flexible amount of vCPUs.
Issue #5221
This patch replaces the use of 'core_env()' in 'platform_services.cc' by
the function arguments 'core_ram', 'core_rm', and 'io_port_ranges'.
It also removes the 'Pd_session' argument from 'Io_port_root' and
'Irq_root' to avoid the reliance on the 'Pd_session' interface within
core,
Issue #5408
Replace the use of the global 'core_env()' accessor by the explicit
delegation of interfaces.
- For allocating UTCBs in base-hw, 'Platform_thread' requires
a way to allocate dataspaces ('Ram_allocator') accounted to the
corresponding CPU session, a way to locally map the allocated
dataspaces (core's 'Region_map'), and a way to determine the
physical address (via 'Rpc_entrypoint') used for the initial
UTCB mapping of main threads. Hence those interfaces must be
passed to 'Platform_thread'.
- NOVA's pager code needs to look up 'Cpu_thread_component'
objects using a map item as key. The lookup requires the
'Rpc_entrypoint' that hold the 'Cpu_thread_component' objects.
To make this 'Rpc_entrypoint' available, this patch adds
the 'init_page_fault_handing' function.
- The 'Region_map_mmap' for Linux requires a way to look up
'Linux_dataspace' objects for given dataspace capabilities.
This lookup requires the 'Rpc_entrypoint' holding the dataspaces,
which is now passed to 'platform.cc' via the new Linux-specific
'Core_region_map::init' function.
Issue #5408
`Vm_session_component::create_vcpu()` is present across all supported
kernels, yet until now it was not part of the `Vm_session` interface.
Add the method to the `Vm_session` interface. This unifies calls in the
base library and is the basis to remove the need for a common base class
for separate `Vm_session` implementations for SVM and VMX on x86_64.
Issue #5221
If one of core's threads is causing an MMU fault, dump the
register state and stack backtrace of the faulting stack to
aid debugging.
Fixgenodelabs/genode#5387
This patch removes the only residual C++ exception from the kernel part
of core, eliminating the risk of the kernel thread trying to enter the
kernel itself via the C++ exception-handling path. When throwing an
exception, __cxa_allocate_exception invokes the cxx_heap, which
synchronizes accesses via a Genode::Mutex. In the contention case,
the blocking of the mutex issues a syscall to pause the caller.
The patch fixes the problem by replacing the exception with a return
value.
Fixes#5382
Issue #5245
over rsdp v1. The multiboot2 provided rsdp_v1 version may not contain the
xsdt pointer, but may have the very same acpi revision as the acpi rsdp v2
version of multiboot2.
Fixes#5332