Preloading a few cache lines ahead brings a significant speedup in
memcpy throughput. Note, the particular (optimal) value was empirically
determined on a Cortex-A9 (Zynq-7000) SoC @ 666Mhz. It is best combined
with L2 prefetching enabled (including double linefills and prefetch
offset 7). Yet, even without L2 prefetching this seems to be the sweet
spot.
genodelabs/genode#4456
The implementation is not in use any more. Furthermore, on typical ARM
cores such as the Cortex-A9, the cached read appears to be the
bottleneck rather than instruction density. On a Zynq-7000 SoC, the vfp
implementation performed significantly worse than the standard load/store
multiple implementation with preloading.
genodelabs/genode#4456
This patch makes the trace-subject state as reflected to the trace
monitor more accurate.
Until now, a subject could be in UNTRACED or TRACED state. In reality,
however, there exists an intermediate state after the trace monitor
called 'trace' for the subject but before the subject locally activated
the tracing (done when passing a trace point). This intermediate state
was reflected as UNTRACED. Consequently, threads that never pass a trace
point (e.g., just waiting for I/O) would remain to appear as UNTRACED
even after enabling its tracing by the trace monitor. This is confusing.
This patch replaces the former UNTRACED and TRACED states by three
distinct states:
UNATTACHED prior any call of 'trace'
ATTACHED after a trace monitor called 'trace'
but before the tracing is active
TRACE tracing is active
Fixes#4447
The new macros GENODE_TRACE_TSC and GENODE_TRACE_TSC_NAMED complement
the existing GENODE_LOG_TSC and GENODE_LOG_TSC_NAMED macros to simplify
TSC measurements at a low overhead of the trace mechanism.
Split the trace buffer into two partitions in order to prevent overwriting
of entries when the consumer is too slow. See file comment in buffer.h.
genodelabs/genode#4434
This commit simplifies the current implementation by overloading the
length field with a padding indicator in addition to the zero-length
head entry. This simplifies the iteration semantics as it eliminates
the need for determining whether a zero-length entries is the actual
head of the buffer or a padding at the buffer end.
genodelabs/genode#4434
When committing a new entry, the buffer wrapped if the last entry fit
perfectly into the buffer. Otherwise, the length field of the next entry
was set to 0 to mark the new head. Yet, if there was still some padding but not
enough to hold the length field of another entry, we ended up with a
headless buffer.
genodelabs/genode#4430
Since the head of the buffer is marked by a zero-length entry, we must
only write the length field if a new head was set. Otherwise, the
consumer might already read the new entry and not find the new head as a stop
condition.
genodelabs/genode#4430
XML allows attribute values like <node attr="\"/>. The XML parser
wrongly reflects this case as 'Invalid_syntax'. This behavior stems from
the implicit use of the 'end_of_quote' function, which considers the
sequence of '\"' as a quoted '"' rather than the end of a quoted string.
The patch solves this problem by making the 'end_of_quote' part of
the tokenizer's scanner policy.
The patch removes the 'end_of_quote' function from 'util/string.h'
because it is not universal, and to avoid the ambiguity with
'SCANNER_POLICY::end_of_quote'.
Fixes#4431
The official way to obtain DMA addresses for RAM dataspaces is
the RPC function 'Pd_session::dma_addr' now. User-level device drivers
should not call this function directly but use the 'Platform_session'
interface of the platform driver instead.
Fixes#2243
This patch enhances the PD-session interface with the support needed for
user-level device drivers performing DMA. Both RPC functions are
intended for the direct use by the platform driver only. If invoked for
PDs that lack the managing-system role, the operations have no effect.
The 'dma_addr()' RPC function allows the platform driver to request the
DMA address of a given RAM dataspace. It is meant to replace the
'Dataspace::phys_addr' RPC function.
The 'attach_dma' RPC function adds the given dataspace to the device
PD's I/O page table. It replaces the former heuristics of marking DMA
buffers as uncached RAM on x86.
With this patch, the UNCACHED attribute of RAM dataspaces is no longer
used to distinguish DMA buffers from regular RAM dataspaces.
Issue #2243
* renamed rpi pic to Bcm2835_pic
* renamed rpi3 pic to Bcm2837_pic
* added bcm2837 control for setting prescaler value (to fix timer_accuracy)
* changed handling of all interrupts for rpi3 by cascading to bcm2835 pic
* rpi3 irq controller base address made consistent with rpi
* added usb controller memory region for pic on rpi3 (for SOF interrupts)
Ref #3415
Fix some trivial cases where the signedness of the constant value does
not match the signedness of type the code expects to see. GCC can be
asked to warn about those by passing Wsign-covnersion flag.
Issue #4354
As far as I can tell this is not raised by any released GCC versions.
Clang 13 on the other hand warns about it due to implicit-int-conversion
warning which is automatically enabled together with Wconversion. The
problem is relatively simple, shifting access_t value does not always
produce result which is also of access_t type. For example, if access_t
is uint16_t, shifting it will produce integer result. This can be
observed even with GCC. Building the following C++ example will fail:
#include <type_traits>
#include <stdint.h>
int test() {
uint16_t a = 0xabcd;
static_assert(std::is_same_v<decltype(a<<1), uint16_t>);
return 0;
}
Changing uint16_t in the static_assert to int, will allow the code to
build.
Make such int to access_t implicit conversion explicit to allow the code
to be compiled with both GCC and clang.
Issue #4354
This patch improves the robustness of the CPU-affinity handling.
- The types in base/affinity.h received the accessors
'Location::within(space)' and 'Affinity::valid', which alleviates
the fiddling with coordinates when sanity checking the values,
in init or core.
- The 'Affinity::Location::valid' method got removed because its
meaning was too vague. For sanity checks of affinity configurations,
the new 'within' method is approriate. In cases where only the x,y
values are used for selecting a physical CPU (during thread creation),
the validity check (width*height > 0) was not meaningful anyway.
- The 'Affinity::Location::from_xml' requires a 'Affinity::Space'
as argument because a location always relates to the bounds of
a specific space. This function now implements the selection of
whole rows or columns, which has previously a feature of the
sandbox library only.
- Whenever the sandbox library (init) encounters an invalid affinity
configuration, it prints a warning message as a diagnostic aid.
- A new 'Affinity::unrestricted' function constructs an affinity that
covers the whole affinity space. The named functions clarifies
the meaning over the previous use of the default constructor.
- Core's CPU service denies session requests with an invalid
affinity parameter. Previously, it would fall back to an
unrestricted affinity.
Issue #4300
This patch changes the 'Allocator' interface to the use of 'Attempt'
return values instead of using exceptions for propagating errors.
To largely uphold compatibility with components using the original
exception-based interface - in particluar use cases where an 'Allocator'
is passed to the 'new' operator - the traditional 'alloc' is still
supported. But it existes merely as a wrapper around the new
'try_alloc'.
Issue #4324
This patch replaces the 'Ram_allocator::alloc' RPC function by a
'try_alloc' function, which reflects errors as 'Attempt' return value
instead of an exception.
Issue #4322
Issue #3612
The new 'update_list_model_from_xml' function template simplifies the
use of the list model utility by alleviating the need for implementing a
custom policy class for each model. Instead, the transformation is done
using a few lambda functions given directly as arguments.
Issue #4317
Alignas should be placed before the type. Placing it after it works for
GCC, but fails when building the same codee with clang. The error
message is:
reconstructible.h:48:27: error: 'alignas' attribute cannot be applied to types
char _space[sizeof(MT)] alignas(sizeof(addr_t));
^
Issue #4298
The new 'Env::try_session' method mirrors the existing 'Env::session'
without implicitly handling exceptions of the types 'Out_of_ram',
'Out_of_caps', 'Insufficient_ram_quota', and 'Insufficient_cap_quota'.
It enables runtime environments like init to reflect those exceptions to
their children instead of paying the costs of implicit session-quota
upgrades out of the own pocket.
By changing the 'Parent_service' to use 'try_session', this patch fixes
a resource-exhaustion problem of init in Sculpt OS that occurred when
the GPU multiplexer created a large batch of IO_MEM sessions, with each
session requiring a second attempt with the session quota upgraded by
4 KiB.
Issue #3767
If one has an object X that has a minimum alignment requirement specified
through 'alignas' this requirement is normally inherited by objects that have
object X as member, and by those that have objects as member that have X as
member, and so on... . However, this chain used to get silently interrupted
(dropping the minimum alignment requirement to 8 again) at objects that are
managed with Genode::Reconstructible or Genode::Constructible. In order to fix
this, the commit ensures that Genode::Reconstructible (and therefore also
Genode::Constructible) has at least the minimum alignment requirement (using
'alignas') as the object it manages.
Ref #4217
Introduce two new cache maintainance functions:
* cache_clean_invalidate_data
* cache_invalidate_data
used to flush or invalidate data-cache lines.
Both functions are typically empty, accept for the ARM architecture.
The commit provides implementations for the base-hw kernel, and Fiasco.OC.
Fixes#4207
This patch changes the 'alloc_aligned' interface as follows:
- The former 'from' and 'to' arguments are replaced by a single
'range' argument.
- The distinction of the use cases of regular allocations vs.
address-constrained allocations is now overed by a dedicated
overload instead of relying on a default argument.
- The 'align' argument has been changed from 'int' to 'unsigned'
to be better compatible with 'addr_t' and 'size_t'.
Fixes#4067
The 'Timer::Session::trigger_periodic' RPC function used to accept 0 as
a way to de-schedule the periodic processing. Several components such as
nitpicker relied on this special case. In "timeout: rework timeout
framework", the value of zero was silently clamped to 1, which has the
opposite effect: triggering signals at the maximum rate. This results in
a visible effect in Sculpt where the leitzentrale-nitpicker instance
produces a constant load of 2% CPU time.
This patch restores the original timer semantics by
- Documenting it in timer_session.h,
- Handling the case explicitly in the timer implementation, and
- Replacing the silent clamping of the unexpected value 0 passed
to the timeout framework by a diagnostic error message.
Issue #3884
- remove Spike/BBL support in favour of Qemu (>=4.2.1)
- add 'riscv_qemu' board, remove 'spike' board'
- update to privileged ISA v1.10 (from v1.9.1)
- use direct system calls for privileged core threads (they call into
the kernel and don't use mode changing system calls, i.e. 'ecall',
semantics)
- use 'OpenSBI' semtantics for SBI calls (to machine mode) instead of
BBL
issue #4012
By first removing unused ranges, implicitly meta data allocations are freed
up. This leads to more unused slab blocks and freed up meta data allocations
in the avl tree.
Issue #4014
Clang is generally fine with Genode::List and compiles code using it
without emitting any warnings. There is however one exception. Clang
fails hard when building base-hw/src/core/kernel/object.cc.
This is due to a call to Genode::List::remove made from
Object_identity::invalidate function. The error message clang
produces is:
list.h:96:33: error: 'Genode::List<Kernel::Object_identity_reference>::Element::_next'
is not a member of class 'const Kernel::Object_identity'
_first = le->List::Element::_next;
~~~~~~~~~~~~~~~^
When we look at the declaration of the Kernel::Object class on which
the remove method is called. as expected it does inherit Genode::List:
using Object_identity_list
= Genode::List<Kernel::Object_identity>;
class Kernel::Object : private Object_identity_list
{
...
}
Given the error message we see that List::Element should be resolved to
Genode::List<Kernel::Object_identity>::Element, and not
Genode::List<Kernel::Object_identity_reference>::Element. But how does
clang manage to figure out we're talking about Object_identity_refecence
list here? Well, I admit I don't know the exact steps it takes to arrive
at this conclusion, but it is not entirely wrong. If we take a look at
what Kernel::Object_identity is we'll see:
class Kernel::Object_identity
: public Object_identity_list::Element,
public Kernel::Object_identity_reference_list
{
...
}
Where as one can guess Object_identity_reference_list is defined as:
using Object_identity_reference_list
= Genode::List<Object_identity_reference>;
Long story short Kernel::Object has Genode::List of both Kernel::Object_identity
and Kernel::Object_identity_reference in its inheritance chain and clang
is not really sure to which of those the code refers to in
Genode::List::remove method by using List::Element::.
The fix for this is relatively simple, explicitly state the full type of
the base class the code intends to refer to. Replacing List::Element,
with List<LT>::Element makes the code buildable with both clang and GCC.
Fixes#3990
This commit restores the diag feature for selecting diagnostic output of
services provided by core. This feature became unavailable with commit
"base: remove dependency from deprecated APIs", which hard-wired the
diag flag for core services to false.
To control this feature, three possible policies can be expressed in a
routing target of init's configuration:
* Forcing silence by specifying 'diag="no"'
* Enabling diagnostics by specifying 'diag="yes"'
* Forwarding the preference of the client by omitting the 'diag'
attribute
Fixes#3962
The msg argument in Genode::Rpc_dispatcher::_read_arg is not used. GCC
does not care about this, but clang does and prints a warning regaring
this. Silence it by removing unused argument name.
fixup! base: Silence unused arg warning in rpc_server.h
To enable the interaction of a VMM with the kernel directly,
a hidden RPC gets introduced. It allows a kernel-specific
base-library implementation of the Vm_session::Client to request
a kernel-specific capability to address a VCPU, e.g., to
run/stop it.
Ref #3926
* get rid of alarm abstraction
* get rid of Timeout::Time type
* get rid of pointer arguments
* get rid of _discard_timeout indirection
* get rid of 65th bit in stored time values
* get rid of Timeout_scheduler interface
* get rid of uninitialized deadlines
* get rid of default arguments
* get rid of Timeout::_periodic
* get rid of Timeout::Raw
* use list abstraction
* only one interface for timeout handlers
* rework locking scheme to be smp safe
* move all method definitions to CC file
* name mutexes more accurate
* fix when & how to set time-source timeout
* fix deadlocks
Fixes#3884
By now, the enumeration of peripheral interrupts on Raspberry Pi 1 was
different in between base-hw kernel and Fiasco.OC. Therefore, hacks were
needed in every driver to request the correct interrupt number dependent
on the kernel. Before reproducing the same in the platform driver for rpi,
we can more easily use the same enumeration with base-hw.
Ref #3864
Introduce the managing_system privilege for components like the
platform_driver to allow it to call system management functionality
that is reserved by kernel or special firmware, e.g., ARM Trusted Firmware.
The former RAM resource configuration attribute `constrain_phys`,
which enabled to constrain the region of physical RAM to be used,
gets replaced by the new, broader managing_system configuration
attribute of a `start` node. It gets enforced by the sandbox library.
Ref #3816
- base/cancelable_lock.h becomes base/lock.h
- all members become private within base/lock.h
- solely Mutex and Blockade are friends to use base/lock.h
Fixes#3819
- Since Genode::strncpy is not 100% compatible with the POSIX
strncpy function, better use a distinct name.
- Remove bogus return value from the function, easing the potential
enforcement of mandatory return-value checks later.
Fixes#3752
The 'WHITESPACE' case of the _calc_len method wrongly accessed the
character before checking upper bound of the token. The problem is fixed
by switching the order of both conditions.
Fixes#3756
This patch removes old 'Allocator_guard' utility and replaces its use
with the modern 'Constrained_ram_allocator'.
The adjustment of core in this respect has the side effect of a more
accurate capability accounting in core's CPU, TRACE, and RM services.
In particular, the dataspace capabilities needed for core-internal
allocations via the 'Sliced_heap' are accounted to the client now.
The same goes for nitpicker and nic_dump as other former users of the
allocator guard. Hence, the patch also touches code at the client and
server sides related to these services.
The only remaining user of the 'Allocator_guard' is the Intel GPU
driver. As the adaptation of this component would be too invasive
without testing, this patch leaves this component unchanged by keeping a
copy of the 'allocator_guard.h' locally at the component.
Fixes#3750
This patch largely reverts the commit "base: lay groundwork for
base-linux caps change" because the use of 'epoll' instead of 'select'
alleviated the need to allocate large FD sets, which motivated the
introduction of the 'Native_context' hook.
Related to issue #3581