By adding a 'write_ready' interface following the lines of the existing
'read_ready', VFS plugins become able to propagate the (de-)saturation
of I/O buffers to the VFS user. This information is important when using
a non-blocking file descriptor for writing into a TCP socket. Once the
application observes EAGAIN, it expects a subsequent 'select' call to
return as soon as new I/O buffer space becomes available.
Before this patch, the select call would always return under this
condition, causing an unnecessarily busy write loop.
Issue #4697
The new interface is meant to replace the 'Vfs::Io_response_handler'.
In contrast to the 'Io_response_handler', which had to be called
on a 'Vfs_handle', the new interface does not require any specific
'Vfs_handle'. It is merely meant to prompt the VFS user (like the libc)
to re-attempt stalled I/O operations but it does not provide any
immediate hint, about which of the handles have become ready for
reading/writing.
Issue #4697
This patch removes the 'Insufficient_buffer' exception by returning the
WRITE_ERR_WOULD_BLOCK result value instead. It also eliminates the
superfluous WRITE_ERR_AGAIN and WRITE_ERR_INTERRUPT codes.
Issue #4697
This patch fosters the batching of network packets transferred by the
lwIP stack over the NIC connection. It replaces the eager submission of
the packet-stream's data-flow signals by explicit wakeup notifications.
The commit also increases the NIC session's buffer size from 128 to 1024
packets.
Issue #4697
...and tighten constness in adjacent code parts.
The VFS-internal synchronization via mutexes is no longer needed because
the access to the VFS is serialized by the VFS client, i.e., the libc.
Issue #4697
This patch facilitates the batching of I/O operations in the VFS library
by replacing the implicit wakeup of remote peer (via the traditional
packet-stream interface like 'submit_packet') by explicit wakeup
signalling.
The wakeup signalling is triggered not before the VFS user settles down.
E.g., for libc-based applications, this is the case if the libc goes
idle, waiting for external I/O.
In the case of a busy writer to a non-blocking file descriptor or socket
(e.g., lighttpd), the remote peers are woken up once a write operation
yields an out-count of 0.
The deferring of wakeup signals is accommodated by the new 'Remote_io'
mechanism (vfs/remote_io.h) that is designated to be used by all VFS
plugins that interact with asynchronous Genode services for I/O.
Issue #4697
By replacing the calls of 'acknowledge_packet' and 'get_packet' with
'try_ack_packet' and 'try_get_packet', we avoid the implicit triggering
of data-flow signals. Instead, the VFS server now relies on explicit
calls of the packet stream's 'wakeup' interface.
Issue #4697
The change of the queue size from 16 to 32 has negligible costs (4 KiB
instead of 2 KiB for the packet-stream queues) while facilitating the
batching of many small consecutive write operations.
Issue #4697
Count more accurately how much packets are in flied, and whether
new packets can be handled. Moreover, catch potential exceptions
whenever acknowledging a packet, and warn about the lost acknowledgement.
Fixgenodelabs/genode#4678
This is required for scenarios in which a device appears at a later
point in time. If the ROM is not updated, the device_by_type() method may
operate on an outdated dataspace and never find the device it is waiting for.
Although we do not have the full ACPI information parsed yet, to
announce non-PCI devices derived from the ACPI tables, the device
description of the assumed devices is now integral-part of pci_decode.
Formerly, the information was gained separatedly as boot-module, whereby
we lost synchronization in between ACPI/PCI parsing, BIOS handover, and
PS/2 emulation code already acting.
During interrupt handling the driver masked and cleared interrupts as
recommended in the spec to prevent spurious or unnecessary interrupts
from occurring.
Due to the way the current implementation operates new Block requests
got submitted while handling completions for already finished ones.
Since interrupts where masked at this point the controller did not
generate interrupts when the newly submitted requests got completed.
As the mask/clear optimization is apparently not strictly needed and
according to the spec undefined when using MSI-X it is removed.
Fixes#4684
This commit enables users of the VMM to define CPU type and count, RAM size,
kernel and initrd ROM names, GIC version, and Virtio devices to be used.
Derived from the configuration values a flattened device-tree blob (DTB) is
generated and transfered to the VM.
Fixgenodelabs/genode#4670
Consume '<iommu/>' tag from 'devices' report. In case an IOMMU is
present map physical memory to arbitrary locations within IO page table
range 1K-4G. This way every device PD has access to ~4GB of DMA space.
issue #4665
'_env_ram' allocations can lead to
'Expanding_pd_session_client::try_alloc' quota upgrades, which in turn
may lead to a resource request by the platform driver. Therefore, we
check the available quota within the platform driver before allocations.
This is not an optimal solution.
issue #4667
related issue #3767
To prevent caching side-effects of USB DMA memory taken from the packet stream
all allocations of USB packets need to be on separated cachelines at least.
Fixgenodelabs/genode#4655
This patch equips the pin-driver framework with support for the
time-multiplexed operation of a pin as output or input. This is needed
when implementing I2C communication via a bit-banging driver.
To operate pin in both directions, a driver obtains both a pin-state and
a pin-control session for the same pin. The pin-state session can be
used to sense the current pin state. The control session allows the
client to set the pin to high or low (using the 'state' method), or to
set it to high-impedance via the 'yield' method. Once switched to
high-impedance, the pin can be used as input.
Issue genodelabs/genode-allwinner#10
If a device should not be reset, powered off, and its clocks
shall stay untouched when it gets released, the leave_operational
attribute can be set to true in the device node of the related
device inside the devices ROM delivered to the platform driver.
This is useful for drivers, which only enable and initialize
their device, and can be closed afterwards.
Ref genodelabs/genode#4654
Reintroduce:
USB Attached SCSI devices might expose a bulk-only interface
as fall-back at interface 0 and alternate setting 0. This commit
allows for probing all alternate settings of the active interface
to be able to use such devices.
The configuration was extended so that in case the device interface
is known beforehand the driver can be configured accordingly.
Additionally:
Perform configuration reset upon sessions close in order to bring USB
device to a well defined state.
fixes#4494
Implement the guest code in dedicated assembler source file, assemble
and link the binary to vmm_x86. The resulting guest-code binary
populates one page that is mapped to host the reset vector of the guest.
This approach simplifies future guest code adaption resp. extension,
e.g., to test rdmsr/wrmsr exiting.
Fixes#4638
This reverts commit 9a37ccfe29 except for the
new declarations in public headers (in order to not change any APIs again).
We revert the commit as we found that there are corner cases in which it
produces a bad UDP checksum. The bad UDP checksum was observed via Wireshark at
a TFTP server in a Sculpt 22.10 Debian 11 VM on the first request of fetching a
file with the TFTP client of the uboot on our iMX8 test board.
Ref #4636
According to OpenBSD's azalia driver some AMD HDAudio devices do not
play nice with MSIs although the capability is set. At least the
0x1457 device was tested and worked using GSIs only.
genodelabs/genode#4578
Some DHCP clients (Debian VM in Sculpt) persistently store the last lease they
obtained and try to directly DHCP REQUEST it on a new startup whithout doing
DHCP DISCOVER beforehand. In case the NIC router doesn't know about the lease
anymore (timeout, new router instance), the router used to just ignore the DHCP
REQUEST. This led to significant delays in the network startup of the client
(delayed retries until give-up and DHCP DISCOVER). With this commit, the router
answers such packets with a DHCP NAK instead, causing the client to directly
switch to DHCP DISCOVER.
Fixes#4634
On-demand initialization prevents read-write operations on BARs of
invalid devices at construction time, which may result in surprising
behavior later on, for example, when resetting X260 notebooks via ACPI
information.
These utilities simplify the control of clocks, resets, and power
domains from within the platform driver.
This is needed when driving a low-level device directly from the
platform driver, for example for driving the mbox mechanism to access
the system-control processor of the PinePhone.
Implemented as depicted in the OpenBSD driver, register description
found in 'AMD SB700/710/750 Register Reference Guide'
(43009_sb7xx_rrg_pub_1.00.pdf).
Issue #4629.
Instead of using a global value to enumerate the MSIs, use a function argument
instead. Whenever the process of PCI device reporting gets started again,
due to an initially too small report buffer, the MSI enumeration value is reset
again. Formerly, we wasted MSI numbers.
Ref genodelabs/genode#4628
Don't skip IRQ reporting if legacy IRQ/GSIs are not supported as the
device may support MSI/MSI-X exclusively.
The commit also enables reserved_memory reporting of devices without
IRQs.
Ref genodelabs/genode#4578
* Add EHCI PCI quirk
* Add UHCI reset to UHCI quirk
* Apply all PCI quirks in order of the PCI bus numbering
otherwise the machine might stall
Ref genodelabs/genode#4578
Instead of allowing the client to set a caching attribute
in the io_mem() call of the device interface, which was
only used to decide in between of the memory being
write-combined or not, remove it from the API.
Instead use the information delivered by the devices ROM,
whether memory from a PCI BAR is prefetchable or not,
to decide whether it is mapped write-combined or not.
Ref genodelabs/genode#4578
Memory descriptors in PCI BARs have a prefetchable bit, which can
be used to optimize memory access when setting, e.g. write-combined
in page-table entries.
Ref genodelabs/genode#4578
The DHCP client used to always send packets with a size of 1024 regardless of
the size of the actual content, which was always significantly lower. 1024
bytes was simply a guess to provide enough space for all types of DHCP client
packets. As we know the exact size of each packet the DHCP client sends even
before packet creation, this commit makes use of the knowledge resulting in
much smaller packets sent by the DHCP client.
Fixes#4619
Consumes the information about reserved memory region reports from
the devices ROM, and adds appropriated mappings to the corresponding
device PD.
Ref genodelabs/genode#4578
We need the information about reserved memory region reports
from the ACPI tables within the platform driver to pre-fill
IOMMU tables with the corresponding mappings. Therefore,
the pci_decode component now parses the information from the
ACPI ROM, and adds "reserved_memory" nodes to all related
devices in the devices report.
Ref genodelabs/genode#4578
In case of the GPU multiplexer, we need to delegate MMIO memory
to the framebuffer client in form of a managed dataspace. To be
able to attach a given Platform::Device::Mmio object to a region map
we need to access its capability.
Ref genodelabs/genode#4578
This commit gets rid of the router-local wrapper of Genode's AVL string tree
and replaces it with Genode's new Dictionary structure. The Dictionary is now
used for managing domains and NIC clients. Due to this change, the formerly
necessary helper classes Domain_base and Nic_client_base could be removed as
well.
Ref #4610
The default file-system communication-buffer size of 128 KiB combined
with the clamping of requests to 1/4th the buffer size results in the
fragementation of read operations into 32 KiB chunks. This is overly
conservative and causes high context-switch overhead down the storage
stack (vfs server -> part_block -> block driver).
Related to #4613
The NIC router README claims that the 'dns_config_from' attribute in a DHCP
server configuration binds the propagated link state of all interfaces at the
domain of the server to the validity of the IP config of the domain that is
given through 'dns_config_from'.
However, this was not true. The router missed to implement this detail which
led to clients of such a DHCP server sending DHCP DISCOVER packets too early.
These early DHCP DISCOVER packets were dropped by the router potentially
causing a big delay until the client started a new attempt. Unnecessary long
network boot-up delays were observed with at least the lwip run script and
Sculpt on the PinePhone and could be tracked down to this former
inconsistency in the router.
This commit fixes the inconsistency.
Fixes#4612
The new 'Dictionary' provides an easy way to access objects using
strings as key. The 'String' received the 'operator >' to simplify the
organization of strings in an AVL tree.
The patch removes the former definition of the 'operator >' from the
platform driver because it would be ambigious now.
Fixes#4610
In case a driver is waiting for data, is should only investigate 'pos'.
It should not advance the ring in any way until there is data available.
issue #4609
This patch consolidates the repetitive error handling across the RPC
functions, which take node handles or directory handles as arguments.
During this change, I noticed that directory handles - which are values
provided by the client - were not checked for their type before being
used. A misbehaving client may open a file, manually construct a
directory handle using the number of the file handle, and invoke a
directory operation at lx_fs, which would then wrongly access a file
node as directory node.
This patch solves this issue by introducing two distinct methods
_with_open_node and _with_open_dir_node, which perform the respective
safety checks.
Fixes#4608
Creating and destructing an interface was not considered a change of its real
link state as defined in the description of the <report link_state_triggers="">
config attribute in the router's README. In case of Uplink sessions this is
obviously a problem as they communicate their real link state through session
lifetime. But also in case of NIC sessions it's a possible to create an
interface that is immediately "up" after creation or destruct an interface
without its link state going "down" beforehand.
Taking into account also the practical application of the
<report link_state_triggers=""> attribute, reporting only on destruction and
construction of interfaces that are "up" seems shorthanded. This is because a
report-receiver most likely needs to be able to synchronize the lifetime of
the objects that keep track of the link states with the lifetime of the
corresponding sessions.
That said, with this commit, the router triggers a report update on each
session construction/destruction when <report link_state_triggers=""> is
set.
Fixes#4462
The NIC router used to generate reports triggered by IP config changes or link
state changes synchonously, i.e., inline with the activation context that
caused the change. This has two disadvantages. First, it can lead to an
excessive number of report updates in situations with quick bursts of
triggering changes. In such situations it is preferable to collect the changes
and reflect them with only one final report update.
Second, synchronous reporting may happen while the router is in a state that
leads to an incorrect report (e.g. during reconfiguration). To prevent this
from happening, the router so far explicitely switched off reporting when
entering incoherent states and back on when leaving them. However, this
solution is error-prone as the exclusion windows must be maintained manually.
Both issues can be solved by not directly generating a report when necessary
but instead submitting a signal and letting the signal handler do the work in
a dedicated activation context.
Ref #4462
This patch splits the querying of the number of directory entries from
the directory's 'status' information. Subsuming the number of directory
entries as part of the status makes 'stat' calls too costly for some
file systems that need to read a directory for determining the number of
entries. So when stat'ing the entries of one directory that contains sub
directories, all entries of each sub directory are visited.
Thanks to Cedric Degea for pointing out this performance bottleneck!
With this change, the 'status' function returns a 'Status::size' value
of 0 when called for a directory handle.
Fixes#4603
The DHCP client of the NIC router used to end up in an uncaught exception if
an IP address in the DNS server option of a DHCP ACK was invalid. This commit
makes the 'Dns_server' constructor (where the exception originated from)
private and instead introduces a public lambda method 'construct' that calls
one lambda argument on success and another on failure. This is also in line
with the most recent changes to the 'find_by_*' methods of other classes in
the NIC router and contributes to the goal of reducing expensive exception
handling.
Fixes#4465