The kernel timer used to truncated timeouts to the next lower
millisecond, which not only limits the wakeup accuracy but also results
in situations where a user-level timeout is triggered earlier than
expected. The latter effect results in the observation of a spurious
timeouts and the subsequent programming of another timeout.
The patch solves the problem by preserving the sub-milliseconds bits
in the 'us_to_ticks' implementation(s).
Issue #5142
* Move Kernel::Cpu_scheduler to Kernel::Scheduler
(we only have that one scheduler)
* Move Kernel::Cpu_share to Kernel::Scheduler::Context
* Move Kernel::Cpu_priority to Kernel::Scheduler::Priority
* Rename all functions and variables refereing to `claims` to
`prioritzed`, because claims is not that common
* Rename occurrences of `fill(s)` to `slack` to name the time portions
outside of the prioritized quantum
* Get rid of some two-liner sub-functions with only one occurrence
(like `_quota_introduction`, `_quota_revokation`,...)
Ref genodelabs/genode#5115
This private, internal function is used only in one scope,
and therefore not neccessary. But it has nasty side-effects as
it takes the parameter `duration` as reference and changes its
value. Just remove it completely.
Ref genodelabs/genode#5115
The name head is already extensively used in the context of the lists
managed by the scheduler. This terminology duplications does not simplify
reading the code. Instead we keep head for the first item in the list,
but use `current` in the variable name and API of the `Cpu_scheduler`
class to refer to the current scheduled share.
Moreover, the `_head_quota` is now `_current_time_left`, because it does
not denote quota but time left for the current schedule. The boolean
variable `_head_claims` gets removed at all. It duplicated the state of
whether a current share ist set, and whether it has so-called claim time
left.
Ref genodelabs/genode#5115
Give certain scheduler class wide variables and functions clear names:
* quota => super_period_length
* residual => super_period_left
Ref genodelabs/genode#5115
Replace double linked list by normal Genode::List with an additional
pointer to last list member to efficiently handle the scheduler share lists.
Moreover, move it into the private part of the Cpu_scheduler class,
the only scope where it is used anymore.
Ref genodelabs/genode#5115
Minor changes that should not change any semantics:
* Remove `_next_fill()` its short and only used in one context,
in which it is good to know what that code actually does
* Turn boolean values into actual boolean values
* Remove some brackets around one-liner pathes
Ref genodelabs/genode#5115
The `_head_was_removed` variable got introduced in solving #4710, but it
reflects only whether `_head` is a valid pointer or not, thereby it
duplicates state.
Ref genodelabs/genode#5115
The classes Genode::Mmio, Genode::Register_set, Genode::Attached_mmio, and
Platform::Device::Mmio now receive a template parameter 'size_t SIZE'. In each
type that derives from one of these classes, it is now statically checked that
the range of each Genode::Register::Register- and
Genode::Register_set::Register_array-deriving sub-type is within [0..SIZE).
That said, SIZE is the minimum size of the memory region provided to the above
mentioned Mmio classes in order to avoid page faults or memory corruption when
accessing the registers and register arrays declared inside.
Note, that the range end of a register array is not the end of the last item
but the end of integer access that is used for accessing the last bit in the
last item.
The constructors of Genode::Mmio, Genode::Attached_mmio, and
Platform::Device::Mmio now receive an argument 'Byte_range_ptr range' that is
expected to be the range of the backing memory region. In each type that derives
from on of these classes, it is now dynamically checked that 'range.num_bytes
>= SIZE', thereby implementing the above mention protection against page faults
and memory corruption.
The rest of the commit adapts the code throughout the Genode Labs repositories
regarding the changes. Note that for that code inside Core, the commits mostly
uses a simplified approach by constructing MMIO objects with range
[base..base+SIZE) and not with a mapping- or specification-related range size.
This should be fixed in the future.
Furthermore, there are types that derive from an MMIO class but don't declare
any registers or register arrays (especially with Platform::Device::Mmio). In
this case SIZE is set to 0. This way, the parameters must be actively corrected
by someone who later wants to add registers or register arrays, plus the places
can be easily found by grep'ing for Mmio<0>.
Fix#4081
The physical address of the memory used for the guest VMCB is already
present in Vcpu_data. Use the information there instead of storing the
physical address in the host data area, thereby freeing up 8 bytes for
a bigger Mmio class.
Issue #4081
Clearing very large RAM dataspaces could fill up core's page table,
because the dataspaces are locally mapped to clear them.
This would manifest in a loop where exhausting the local page table
leads to its flushing (which does not work for core) and a retry that
again fills up the page table and so on.
To prevent this, flush RAM dataspaces in chunks of at most 128MiB.
Fixes#5086
By adding the `irq_type` argument, one can explicitly specify whether to
use LEGACY, MSI or MSI-X interrupts. We formerly used the
`device_phys_config` to implicitly select MSI, however, with the
addition of IOMMU support to the platform driver there is at least one
instance where we need an MSI for a non-PCI device.
Yet, by adding another session argument to the Irq session, we exceed
the character limit for session args. Since not all arguments are
relevant for LEGACY interrupts resp. MSI, we can split the Irq_connection
constructor to handle the two cases separately and omit unneeded
arguments.
genodelabs/genode#5002
Make naming across architectures coherent by renaming Vm_state to
Vcpu_state, to reflect that it contains the state of a Vcpu and not that
of an entire VM.
Ref #4968
Per Affinity::Location a system control cap can be requested. The capability
provides an RPC interface to request and set Cpu_state, as provided by the
former Pd::managing_system(Cpu_state) method. Invocation of those system
control capabilities then *can* (see below) be executed on the desired CPU
as described by Affinity::Location.
The system control cap will be invalid for kernels that don't support
system_control/managing_system functionality at all.
The system control cap will be ever by the same, e.g. ignoring the
Affinity::Location parameter, if the used kernel doesn't support or doesn't
require the feature to execute the system control per CPU.
The commit is a preparation step to add guarded and selective x86 MSR
access per CPU.
Fixes#5009
The timer used to read the counter first and then the IRQ status. This
could cause a non-wrapped counter value to be considered a wrapped
counter value, leading to bogus timeout durations.
This commit fixes the bug and documents the used timer mode in the
driver in order to make future debugging of the driver easier.
Ref #4959
By splitting the 'init_capability_slab()' implementation to a separate
compilation unit 'capability_slab.cc', base-hw no longer needs a
customized version of 'lib/base/platform.cc'.
Related to issue #4784
This patch replaces the internal use 'env_deprecated()' from the
implementation of the thread API in the base library. It also
replaces the global accessor 'main_thread_cap' by the explicit
propagation of the main-thread's capability to the single point of
use via a new 'init_thread_bootstap' function.
Issue #4784
The change "core: allow offset-attached managed dataspaces" addressed a
corner case of the use of nested region maps. Apparently, this change
negatively affects other scenarios (tool_chain_auto).
In order to confidently cover all the differnt situations, this patch
reworks the page-fault resolution code for improved clarity and safety,
by introducing dedicated result types, reducing the use of basic types,
choosing expressive names, and fostering constness.
It also introduces a number of 'print' hooks that greatly ease manual
instrumentation and streamlines the error messages printed by core.
Those messages no longer appear when a user-level page-fault handler
is reistered for the faulted-at region map. So the monitor component
produces less noise on the attempt to dump non-existing memory.
Issue #4917Fixes#4920