Commit Graph

7916 Commits

Author SHA1 Message Date
Johannes Schlatow
410099df70 base/memset: speedup implementation
Compared to the bytewise memset, a wordwise memset (or even multi-word)
achieves a speedup of ~6.

On Zynq-7000/Cortex-A9:
317 MiB/s -> 2040 MiB/s

On base-linux x86_64:
3580 MiB/s -> 23700 MiB/s

genodelabs/genode#4456
2022-04-13 09:29:03 +02:00
Johannes Schlatow
9409f814a4 memcpy (x86): implement memcpy_cpu
By changing the bytewise copy into a wordwise copy, we get a speedup of
~3 (on base-linux x86_64).

genodelabs/genode#4456
2022-04-13 09:29:02 +02:00
Johannes Schlatow
0104a74028 memcpy (arm): cache align and use pld for speedup
Preloading a few cache lines ahead brings a significant speedup in
memcpy throughput. Note, the particular (optimal) value was empirically
determined on a Cortex-A9 (Zynq-7000) SoC @ 666Mhz. It is best combined
with L2 prefetching enabled (including double linefills and prefetch
offset 7). Yet, even without L2 prefetching this seems to be the sweet
spot.

genodelabs/genode#4456
2022-04-13 08:08:01 +02:00
Johannes Schlatow
4dcc095e5e memcpy (arm): remove unused vfp implementation
The implementation is not in use any more. Furthermore, on typical ARM
cores such as the Cortex-A9, the cached read appears to be the
bottleneck rather than instruction density. On a Zynq-7000 SoC, the vfp
implementation performed significantly worse than the standard load/store
multiple implementation with preloading.

genodelabs/genode#4456
2022-04-13 08:08:01 +02:00
Johannes Schlatow
052c33fc8c test/cache: refine test pattern
- run multiple access patterns (touch words, touch lines, memcpy)
- add make file for linux

genodelabs/genode#4454
2022-04-13 08:08:01 +02:00
Johannes Schlatow
5a0e22eb98 test/memcpy: tweak test timing
add some log calls to give run script a bit more time to catch the next
output

genodelabs/genode#4454
2022-04-13 08:08:01 +02:00
Johannes Schlatow
92bcc50c0a test/memcpy: test with a more consistent alignment
On some platforms, the page index affects the measurements.

genodelabs/genode#4454
2022-04-13 08:08:01 +02:00
Johannes Schlatow
07736d1689 test/memcpy: fix optimistic results on Linux
When executed on Linux, the test was impaired by the copy-on-write
optimisation since the source buffer was never initialised. By default,
Linux only maps a zeroed page until the first write access to the page
occurs. Since the source buffer was never written, the corresponding
page was always present in the physically-indexed data cache. In
consequence, the test merely measured write performance (similar to memset).

genodelabs/genode#4454
2022-04-13 08:08:01 +02:00
Johannes Schlatow
62f37c5b1b test/memcpy: disable Thumb when compiled on linux
genodelabs/genode#4454
2022-04-13 08:08:01 +02:00
Johannes Schlatow
85daf1b3b2 cpu_bench: disable Thumb when compiled on linux
genodelabs/genode#4454
2022-04-13 08:08:01 +02:00
Johannes Schlatow
d372afd81e base-hw: add bitfield to pl310
Enabling double linefills improves memcpy throughput.

genodelabs/genode#4456
2022-04-13 08:08:01 +02:00
Christian Helmuth
d1f9434fd5 qemu-usb: send usb ctrl transfers with timeout
Send usb ctrl transfers with one second timeout as some devices (e.g.,
smartcard readers) do not response to certain control transfers.

Thanks to Peter for the investigation.
2022-04-13 08:08:00 +02:00
Alexander Boettcher
00479aea29 lx_emul(x86): shadow cpu_relax to advance jiffies
The i2c code has a busy loop (see commit for the location), which expects that
the jiffies advances without a cooperative scheduling decision.

Issue #4450
2022-04-13 08:08:00 +02:00
Christian Helmuth
18c5f1e90d tool/run: improve disk image size automatic
Set disk size to 1.5 times the run folder size and shrinked later to
real content.

Thanks to Roland for the patch.
2022-04-13 08:08:00 +02:00
Christian Helmuth
108fe84f5a Remove SIGNAL/CAP/RAM services from run scripts
Related to #2407
2022-04-13 08:08:00 +02:00
Christian Helmuth
77b572f36a platform: distinct USB4 from other USB PCI devices
Our usb_host driver supports UHCI, OHCI, EHCI, and XHCI host
controllers. The USB4 host interface / Thunderbolt is currently not
supported and must therefore not be passed to the USB host driver.
2022-04-13 08:08:00 +02:00
Christian Helmuth
1b4cd93dc2 lx_kit/x86: clamp PCI interrupt PIN to 1
If any PCI device reports 0 as interrupt PIN, drivers may try to force
MSI setup (e.g., xhci). So, we clamp the interrupt PIN to 1 to let
drivers finish initialization and don't bother the platform driver.
2022-04-13 08:08:00 +02:00
Josef Söntgen
afe02efb8f pc_usb_host: implement 'dma_pool_destroy'
Encountered on the Fuji5 where for reasons currently unknown the
first xHCI HC (0:0d.0) could not be initialize due to incomplete
interupt informations. The other HCs appear to work fine (tested
with a USB low-speed mouse).
2022-04-13 08:08:00 +02:00
Josef Söntgen
c6cc43f0e4 lx_kit/x86: use virtual information for PCI
This commit removes all physical notions from the information given
to the Linux kernel regarding PCI BARs.

With the exception for the host bridge that needs to be located at
'0:00.0' as required by the Intel FB driver, all other devices are
announced at the PCI BUS in an ascending order.

Additionally the MMIO regions start at 1 GiB and are capped at 32 bit
to prevent unnecessary access to 64 bit addresses.
2022-04-13 08:08:00 +02:00
Christian Helmuth
1c79c95868 acpi_drv: skip tables outside predefined region
With this fix, the driver no longer aborts on the Tigerlake notebook and
just skips the out-of-region ACPI table. Issue #4452 is not fixed by
this commit, but in this specific case the table is not used anyway.
2022-04-13 08:08:00 +02:00
Sebastian Sumpf
49b8232ebd libdrm: simplify resource accounting
Upgrade to the well known worst cases by the GPU multiplexer. Do not
keep track of resources locally, in case resources are exceeded the
remain so anyway.

issue #4451
2022-04-13 08:08:00 +02:00
Sebastian Sumpf
105e82ad84 gpu/intel: check resources before any operation
Check if there are a least 4 caps + 2MB (heap) + possible buffer size
available before any resource allocation. Only account resources that are
actually used.

issue #4451
2022-04-13 08:08:00 +02:00
Christian Helmuth
c1c94d37d7 microcode_intel: update to version 20220207 2022-04-13 08:08:00 +02:00
Alexander Boettcher
c0560ab0cb pc: update intel display driver
Fixes #4450
2022-04-13 08:08:00 +02:00
Alexander Boettcher
7813fca946 gpu/intel: report all devices via next_device
The former implementation relied on the behaviour of how the old
intel fb driver requested the pci devices. The new lxkit however actually
really want to have all available pci devices.

Issue #4450
2022-04-13 08:07:59 +02:00
Alexander Boettcher
2548830140 pc_linux: add ACPI config
required by the upcoming update of the intel display driver. Make this addition
explicit, because it triggers adjustment also on the new pc_usb_host_drv.

Issue #4450
2022-04-13 08:07:59 +02:00
Alexander Boettcher
6d924d3285 lx_kit(x86): restrict usb heuristics to usb
Issue #4450
2022-04-13 08:07:59 +02:00
Alexander Boettcher
cda0fafbd1 lx_emul: remove sw_width/height from common_dummies
required by the new upcoming intel display driver. Make the step explicit,
because it needs adjustment on the new usb driver as well.

Issue #4450
2022-04-13 08:07:59 +02:00
Alexander Boettcher
b6c1b7806b lx_kit: io_mem_map with write combined support (x86)
Issue #4450
2022-04-13 08:07:59 +02:00
Alexander Boettcher
6f64917e8f lx_emul: add ioremap_cache/_wc to shadow/asm/io.h
used by intel_fb for write combined allocation

Issue #4450
2022-04-13 08:07:59 +02:00
Alexander Boettcher
8dbcda9943 lx_emul: x86_32 shadow header adaptations
required for upcoming intel display driver in 32bit

Issue #4450
2022-04-13 08:07:59 +02:00
Alexander Boettcher
7c3f010cd6 lx_emul: shadow asm/uaccess_32/64.h
Issue #4450
2022-04-13 08:07:59 +02:00
Alexander Boettcher
cdf1b39c5e lx_emul: shadow asm/special_insns.h
wbinvd is not supported in user mode

Issue #4450
2022-04-13 08:07:59 +02:00
Alexander Boettcher
88a6a9d628 lx_emul: add missing fpu/api.h to shadow pgtable.h
Issue #4450
2022-04-13 08:07:59 +02:00
Alexander Boettcher
279f038b9e lx_emul: shadow asm/cpufeature and asm/page_64
Issue #4450
2022-04-13 08:07:58 +02:00
Josef Söntgen
fd8df3a623 lx_emul: handle page refcount 2022-04-13 08:07:58 +02:00
Alexander Boettcher
4474460377 lx_emul: __alloc_pages support in shadow/mm/page_alloc.c 2022-04-13 08:07:58 +02:00
Alexander Boettcher
a222df31ba platform_drv(x86): avoid exception in alloc_dma
If size is zero, the platform goes out of service by:

[init -> platform_drv] Error: Uncaught exception of type 'Genode::Ram_allocator::Denied'
[init -> platform_drv] Warning: abort called - thread: e

Issue #4450
2022-04-13 08:07:58 +02:00
Alexander Boettcher
dd10e5d977 intel_fb: move to legacy_intel_fb
Move the depot recipe and consistently name the old drivers with a legacy_
prefix as done with the old usb_host driver.

Issue #4450
2022-04-13 08:07:58 +02:00
Christian Helmuth
1a2677ebe6 dde_ipxe: update Intel NIC support list from upstream
Also, the repository URL was adapted to the permanent redirect to github
to prevent the following warning.

  dde_ipxe  download http://git.ipxe.org/ipxe.git
  dde_ipxe  git Cloning into 'src/lib/dde_ipxe'...
  dde_ipxe  git warning: redirecting to https://github.com/ipxe/ipxe/
  dde_ipxe  update src/lib/dde_ipxe
2022-04-13 08:07:58 +02:00
Christian Helmuth
ad4fb2b088 nova: fix IOTLB flush for global mode
Issue alex-ab/nova#6
2022-04-13 08:07:58 +02:00
Christian Helmuth
c56ac3e909 nova: support extended addresses in FADT
Issue alex-ab/nova#5
2022-04-13 08:07:58 +02:00
Martin Stein
50fc2aa251 black_hole: provide Gpu service
Ref #4419
2022-04-13 08:07:58 +02:00
Martin Stein
046ebc3d34 black_hole: provide ROM service
Ref #4419
2022-04-13 08:07:58 +02:00
Norman Feske
bb26a986e6 sculpt: add trace_logger as optional launcher
This patch adds the trace-logger utility to the default set of packages
along with an optional launcher. With this change, only two steps are
needed to use Genode's tracing mechanism with Sculpt:

- Add 'trace_logger' to the 'launcher:' list of the .sculpt file

- Either manually select the 'trace_logger' from the '+' menu,
  or add the following entry to the deploy configuration:

    <start name="trace_logger"/>

By default, the trace logger is configured to trace all threads
executed in the runtime subsystem and to print a report every 10
seconds. This default policy can be refined in the launcher's <config>
node. Note that the trace logger does not respond to configuration
changes during runtime. Changes come into effect not before restarting
the component.

Issue #4448
2022-04-13 08:07:58 +02:00
Norman Feske
3394f97f86 trace_logger: make output format more concise
This patch changes the output format of the trace logger to become
better suitable for human consumption. For example, when instrumenting
the VFS server in Sculpt using the GENODE_TRACE_TSC utility, the
trace logger now generates tabular output as follows.

  Report 4

  PD "init -> runtime -> arch_vbox6 -> vbox -> " ----------------
   Thread "vCPU"           at (0,0)  total:12909024 recent:989229
   Thread "vCPU"           at (1,0)  total:5643234  recent:786437

  PD "init -> runtime -> ahci-0.fs" -----------------------------
   Thread "ahci-0.fs"      at (0,0)  total:910497   recent:6335
   Thread "ep"             at (0,0)  total:0        recent:0
    71919692932: TSC process_packets: 8005M (4998 calls, last 4932K)
    71921558516: TSC process_packets: 8006M (4999 calls, last 1596K)
    71922760220: TSC process_packets: 8007M (5000 calls, last 1006K)
    71929853586: TSC process_packets: 8009M (5001 calls, last 1840K)
    71931315246: TSC process_packets: 8011M (5002 calls, last 1253K)
    72127999920: TSC process_packets: 8016M (5003 calls, last 5606K)
    72129568198: TSC process_packets: 8018M (5004 calls, last 1345K)
    77161908178: TSC process_packets: 8029M (5005 calls, last 11349K)
    77643225736: TSC process_packets: 8029M (5006 calls, last 217K)
    89422100594: TSC process_packets: 8035M (5007 calls, last 5656K)
    89422123632: TSC process_packets: 8035M (5008 calls, last 1342)
   Thread "signal handler" at (0,0)  total:36329    recent:3001
   Thread "signal_proxy"   at (0,0)  total:51838    recent:13099
   Thread "pdaemon"        at (0,0)  total:97184    recent:332
   Thread "vdrain"         at (0,0)  total:1266     recent:286
   Thread "vrele"          at (0,0)  total:1904     recent:516

  PD "init -> runtime -> nic_drv" -------------------------------
   Thread "nic_drv"        at (0,0)  total:34044    recent:897
   Thread "signal handler" at (0,0)  total:369      recent:142

  ...

Subjects that belong to the same PD are grouped together. The formerly
optional affinity and activity options have been removed. Those
information are now unconditionally displayed. The trace entries
belonging to a thread appear as slightly indented.

The patch also updates the coding style, avoiding excessively long
lines.

Issue #4448
2022-04-13 08:07:58 +02:00
Norman Feske
f7270c44cb trace_logger: omit inactive subjects by default
This patch reduces repetitive log output by omitting inactive trace
subjects from the log output. The information about all subjects can
still be dumped by setting 'verbose="yes"'.

Issue #4448
2022-04-13 08:07:58 +02:00
Norman Feske
ceb91732bf trace_logger: update state after adding subjects
This patch splits the creation and updating of monitor objects into two
stages. The creation of a monitor object changes the state of the
associated trace subject. The patch ensures that the new state is
captured by the update of the monitor object.

Issue #4448
2022-04-13 08:07:58 +02:00
Norman Feske
be0a1742ac base: distinct TRACED from ATTACHED trace subjects
This patch makes the trace-subject state as reflected to the trace
monitor more accurate.

Until now, a subject could be in UNTRACED or TRACED state. In reality,
however, there exists an intermediate state after the trace monitor
called 'trace' for the subject but before the subject locally activated
the tracing (done when passing a trace point). This intermediate state
was reflected as UNTRACED. Consequently, threads that never pass a trace
point (e.g., just waiting for I/O) would remain to appear as UNTRACED
even after enabling its tracing by the trace monitor. This is confusing.

This patch replaces the former UNTRACED and TRACED states by three
distinct states:

  UNATTACHED  prior any call of 'trace'
  ATTACHED    after a trace monitor called 'trace'
              but before the tracing is active
  TRACE       tracing is active

Fixes #4447
2022-04-13 08:07:58 +02:00
Norman Feske
f3984ba5a9 base: declare build artifact for core
This is a generalization of the recent commit "base-hw: declare build
artifact for core".
2022-04-13 08:07:58 +02:00