Compared to the bytewise memset, a wordwise (or even multi-word) memset
achieves a speedup of roughly 6x.
On Zynq-7000/Cortex-A9:
317 MiB/s -> 2040 MiB/s
On base-linux x86_64:
3580 MiB/s -> 23700 MiB/s
genodelabs/genode#4456
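
The wordwise approach described above can be sketched as follows. This is a minimal illustration and not the actual Genode implementation: the name 'wordwise_memset' is hypothetical, and the sketch stores one machine word per iteration rather than multiple words.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper for illustration, not Genode's actual implementation.
void *wordwise_memset(void *dst, int value, std::size_t size)
{
    using word_t = std::uintptr_t;

    unsigned char      *d    = static_cast<unsigned char *>(dst);
    unsigned char const byte = static_cast<unsigned char>(value);

    // Fill bytewise until the destination is word-aligned.
    while (size > 0 && reinterpret_cast<std::uintptr_t>(d) % sizeof(word_t) != 0) {
        *d++ = byte;
        --size;
    }

    // Replicate the byte into every byte lane of one machine word:
    // ~0/0xff yields 0x0101...01, multiplying by the byte fills each lane.
    word_t const word = (~word_t(0) / 0xff) * byte;

    // Store one word per iteration. The fixed-size memcpy sidesteps
    // aliasing issues; optimising compilers typically emit a plain store.
    for (; size >= sizeof(word_t); size -= sizeof(word_t), d += sizeof(word_t))
        std::memcpy(d, &word, sizeof(word_t));

    // Fill the remaining tail bytewise.
    while (size-- > 0)
        *d++ = byte;

    return dst;
}
```

The bytewise head and tail loops keep the function correct for unaligned destinations and for sizes that are not a multiple of the word size.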
Preloading a few cache lines ahead brings a significant speedup in
memcpy throughput. Note, the particular (optimal) value was empirically
determined on a Cortex-A9 (Zynq-7000) SoC @ 666 MHz. It is best combined
with L2 prefetching enabled (including double linefills and prefetch
offset 7). Yet, even without L2 prefetching this seems to be the sweet
spot.
genodelabs/genode#4456
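
The idea of preloading a few cache lines ahead can be sketched in portable form as follows. This is only an illustration under stated assumptions: it uses the GCC/Clang builtin '__builtin_prefetch' instead of the ARM load/store-multiple assembly, the name 'copy_with_preload' is hypothetical, and the preload distance of 3 lines is a placeholder because the text does not state the tuned value.

```cpp
#include <cstddef>
#include <cstring>

// Assumed values for illustration; the text does not state the tuned distance.
constexpr std::size_t CACHE_LINE_SIZE = 32; // assumed L1 line size (Cortex-A9)
constexpr std::size_t PRELOAD_LINES   = 3;  // hypothetical preload distance

void *copy_with_preload(void *dst, void const *src, std::size_t size)
{
    char       *d = static_cast<char *>(dst);
    char const *s = static_cast<char const *>(src);

    constexpr std::size_t AHEAD = PRELOAD_LINES * CACHE_LINE_SIZE;

    while (size >= CACHE_LINE_SIZE) {

        // Hint the cache to fetch a line that is read a few iterations
        // later. The hint is skipped near the end of the source buffer.
        if (size > AHEAD)
            __builtin_prefetch(s + AHEAD);

        // Copy one cache line.
        std::memcpy(d, s, CACHE_LINE_SIZE);
        d    += CACHE_LINE_SIZE;
        s    += CACHE_LINE_SIZE;
        size -= CACHE_LINE_SIZE;
    }

    // Copy the remaining bytes that do not fill a whole cache line.
    std::memcpy(d, s, size);
    return dst;
}
```

On other cores, the best distance has to be determined by measurement, as was done for the Cortex-A9.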
The implementation is not in use any more. Furthermore, on typical ARM
cores such as the Cortex-A9, the cached read appears to be the
bottleneck rather than instruction density. On a Zynq-7000 SoC, the VFP
implementation performed significantly worse than the standard load/store
multiple implementation with preloading.
genodelabs/genode#4456
When executed on Linux, the test was impaired by the copy-on-write
optimisation since the source buffer was never initialised. By default,
Linux only maps a zeroed page until the first write access to the page
occurs. Since the source buffer was never written, the corresponding
page was always present in the physically-indexed data cache. In
consequence, the test merely measured write performance (similar to memset).
genodelabs/genode#4454
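
A minimal sketch of a throughput measurement that avoids this pitfall by writing the source buffer before timing. The function name and structure are hypothetical and do not reproduce the actual test. The sketch assumes size > 0.

```cpp
#include <chrono>
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical benchmark helper; returns the memcpy throughput in MiB/s.
double memcpy_mib_per_sec(std::size_t size, unsigned iterations)
{
    // 'new char[]' leaves the buffers uninitialised, so a fresh anonymous
    // mapping on Linux may still refer to the shared zeroed page.
    std::unique_ptr<char[]> src(new char[size]);
    std::unique_ptr<char[]> dst(new char[size]);

    // Write every source page before measuring so that each page gets
    // its own physical backing and the copy really reads from memory.
    std::memset(src.get(), 0x5a, size);

    // Touch the destination too, keeping page faults out of the timing.
    std::memset(dst.get(), 0, size);

    auto const start = std::chrono::steady_clock::now();
    for (unsigned i = 0; i < iterations; i++)
        std::memcpy(dst.get(), src.get(), size);
    auto const end = std::chrono::steady_clock::now();

    // Read one byte through a volatile so the copy is not optimised away.
    volatile char sink = dst[size - 1];
    (void)sink;

    std::chrono::duration<double> const seconds = end - start;
    double const mib = double(size) * iterations / (1024.0 * 1024.0);
    return mib / seconds.count();
}
```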
Send USB control transfers with a one-second timeout as some devices
(e.g., smartcard readers) do not respond to certain control transfers.
Thanks to Peter for the investigation.
The i2c code has a busy loop (see commit for the location), which expects
jiffies to advance without a cooperative scheduling decision.
Issue #4450
Our usb_host driver supports UHCI, OHCI, EHCI, and XHCI host
controllers. The USB4 host interface / Thunderbolt is currently not
supported and must therefore not be passed to the USB host driver.
If a PCI device reports 0 as interrupt pin, drivers may try to force
MSI setup (e.g., xhci). So, we clamp the interrupt pin to 1 to let
drivers finish initialization without bothering the platform driver.
Encountered on the Fuji5, where for reasons currently unknown the
first xHCI HC (0:0d.0) could not be initialized due to incomplete
interrupt information. The other HCs appear to work fine (tested
with a USB low-speed mouse).
This commit removes all physical notions from the information given
to the Linux kernel regarding PCI BARs.
With the exception of the host bridge, which needs to be located at
'0:00.0' as required by the Intel FB driver, all other devices are
announced on the PCI bus in ascending order.
Additionally, the MMIO regions start at 1 GiB and are capped at 32 bit
to prevent unnecessary access to 64-bit addresses.
With this fix, the driver no longer aborts on the Tigerlake notebook and
just skips the out-of-region ACPI table. Issue #4452 is not fixed by
this commit, but in this specific case the table is not used anyway.
Upgrade to the well-known worst cases by the GPU multiplexer. Do not
keep track of resources locally: if resources are exceeded, they
remain so anyway.
Issue #4451
Check if there are at least 4 caps + 2 MB (heap) + the possible buffer
size available before any resource allocation. Only account resources
that are actually used.
Issue #4451
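
The up-front check can be sketched as follows. The type and function names are hypothetical, and "2 MB" is read as 2 MiB here, which is an assumption. Only the numbers stem from the text above.

```cpp
#include <cstddef>

// Hypothetical types and names; only the numbers stem from the text.
struct Available { std::size_t caps; std::size_t ram_bytes; };

constexpr std::size_t REQUIRED_CAPS = 4;
constexpr std::size_t HEAP_BYTES    = 2 * 1024 * 1024; // "2 MB", read as 2 MiB

bool sufficient_resources(Available avail, std::size_t buffer_bytes)
{
    if (avail.caps < REQUIRED_CAPS)   return false;
    if (avail.ram_bytes < HEAP_BYTES) return false;

    // Subtract first so that a huge buffer size cannot overflow the sum.
    return avail.ram_bytes - HEAP_BYTES >= buffer_bytes;
}
```

Performing the check before the first allocation means a request that cannot be satisfied is rejected without having to roll back partial allocations.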
The former implementation relied on the way the old Intel FB driver
requested the PCI devices. The new lxkit, however, actually wants to
have all available PCI devices.
Issue #4450
Required by the upcoming update of the Intel display driver. Make this addition
explicit because it also triggers an adjustment of the new pc_usb_host_drv.
Issue #4450
If size is zero, the platform goes out of service by:
[init -> platform_drv] Error: Uncaught exception of type 'Genode::Ram_allocator::Denied'
[init -> platform_drv] Warning: abort called - thread: e
Issue #4450
Also, the repository URL was adapted to follow the permanent redirect to
GitHub in order to prevent the following warning.
dde_ipxe download http://git.ipxe.org/ipxe.git
dde_ipxe git Cloning into 'src/lib/dde_ipxe'...
dde_ipxe git warning: redirecting to https://github.com/ipxe/ipxe/
dde_ipxe update src/lib/dde_ipxe
This patch adds the trace-logger utility to the default set of packages
along with an optional launcher. With this change, only two steps are
needed to use Genode's tracing mechanism with Sculpt:
- Add 'trace_logger' to the 'launcher:' list of the .sculpt file
- Either manually select the 'trace_logger' from the '+' menu,
  or add the following entry to the deploy configuration:
  <start name="trace_logger"/>
By default, the trace logger is configured to trace all threads
executed in the runtime subsystem and to print a report every 10
seconds. This default policy can be refined in the launcher's <config>
node. Note that the trace logger does not respond to configuration
changes during runtime. Changes take effect only after restarting
the component.
Issue #4448
This patch changes the output format of the trace logger to make it
better suited for human consumption. For example, when instrumenting
the VFS server in Sculpt using the GENODE_TRACE_TSC utility, the
trace logger now generates tabular output as follows.
  Report 4
  PD "init -> runtime -> arch_vbox6 -> vbox -> " ----------------
   Thread "vCPU" at (0,0) total:12909024 recent:989229
   Thread "vCPU" at (1,0) total:5643234 recent:786437
  PD "init -> runtime -> ahci-0.fs" -----------------------------
   Thread "ahci-0.fs" at (0,0) total:910497 recent:6335
   Thread "ep" at (0,0) total:0 recent:0
     71919692932: TSC process_packets: 8005M (4998 calls, last 4932K)
     71921558516: TSC process_packets: 8006M (4999 calls, last 1596K)
     71922760220: TSC process_packets: 8007M (5000 calls, last 1006K)
     71929853586: TSC process_packets: 8009M (5001 calls, last 1840K)
     71931315246: TSC process_packets: 8011M (5002 calls, last 1253K)
     72127999920: TSC process_packets: 8016M (5003 calls, last 5606K)
     72129568198: TSC process_packets: 8018M (5004 calls, last 1345K)
     77161908178: TSC process_packets: 8029M (5005 calls, last 11349K)
     77643225736: TSC process_packets: 8029M (5006 calls, last 217K)
     89422100594: TSC process_packets: 8035M (5007 calls, last 5656K)
     89422123632: TSC process_packets: 8035M (5008 calls, last 1342)
   Thread "signal handler" at (0,0) total:36329 recent:3001
   Thread "signal_proxy" at (0,0) total:51838 recent:13099
   Thread "pdaemon" at (0,0) total:97184 recent:332
   Thread "vdrain" at (0,0) total:1266 recent:286
   Thread "vrele" at (0,0) total:1904 recent:516
  PD "init -> runtime -> nic_drv" -------------------------------
   Thread "nic_drv" at (0,0) total:34044 recent:897
   Thread "signal handler" at (0,0) total:369 recent:142
  ...
Subjects that belong to the same PD are grouped together. The formerly
optional affinity and activity options have been removed. This
information is now displayed unconditionally. The trace entries
belonging to a thread appear slightly indented.
The patch also updates the coding style, avoiding excessively long
lines.
Issue #4448
This patch reduces repetitive log output by omitting inactive trace
subjects from the log output. The information about all subjects can
still be dumped by setting 'verbose="yes"'.
Issue #4448
This patch splits the creation and updating of monitor objects into two
stages. The creation of a monitor object changes the state of the
associated trace subject. The patch ensures that the new state is
captured by the update of the monitor object.
Issue #4448
This patch makes the trace-subject state as reflected to the trace
monitor more accurate.
Until now, a subject could be in UNTRACED or TRACED state. In reality,
however, there exists an intermediate state after the trace monitor
called 'trace' for the subject but before the subject locally activated
the tracing (done when passing a trace point). This intermediate state
was reflected as UNTRACED. Consequently, threads that never pass a trace
point (e.g., just waiting for I/O) would continue to appear as UNTRACED
even after the trace monitor enabled their tracing. This is confusing.
This patch replaces the former UNTRACED and TRACED states by three
distinct states:
  UNATTACHED  prior to any call of 'trace'
  ATTACHED    after a trace monitor called 'trace'
              but before the tracing is active
  TRACED      tracing is active
Fixes #4447
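
The three states can be modelled as a small illustrative sketch. The enum and function names below are assumptions made for this example and are not taken from the actual Genode API. The two flags stand for "a trace monitor called 'trace'" and "the subject has locally activated tracing".

```cpp
// Illustrative model; the names are assumptions, not the actual Genode API.
enum class Subject_state { UNATTACHED, ATTACHED, TRACED };

constexpr Subject_state subject_state(bool trace_called, bool tracing_active)
{
    return !trace_called   ? Subject_state::UNATTACHED
         : !tracing_active ? Subject_state::ATTACHED
         :                   Subject_state::TRACED;
}
```

A thread that merely waits for I/O after the monitor called 'trace' maps to ATTACHED in this model, which distinguishes it from a subject that was never selected for tracing.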