Implicitely fixes problems with USB devices having more than 4G blocks.
Formerly the 16-Cmd LBA requests were silently casted to 32-bit.
Fixgenodelabs/genode#4771
Several explicit casts could not be avoided yet, due to the missing
differentiation in between virtual and physical addresses that leads
to casting problems when using 32-bit ARM, and because the MMIO
register framework does not allow to return narrowed types of bitfields.
Apart from that, this commit fixes a switch-case fallthrough error in
Mmio_register::write.
Fixgenodelabs/genode#4770
The read_config and write_config functions in the generic virtio
headers used by all drivers lead to compiler warnings resp. errors
if effective-c++ switch is enabled. Moreover, the functions require
to define the access width as parameter. We can better turn them
into template functions using the value type to read resp. write to
derive the access width.
Ref genodelabs/genode#4344
The 'file_size' type denotes the size of files on disk in bytes. On
32-bit architectures it is larger than the size_t, which refers to
in-memory object sizes.
Whereas the use of 'file_size' is appropriate for ftruncate and seek, it
is not a suitable type for the parameters of read/write operations
because those operations refer to in-memory buffers.
This patch replaces the use of 'file_size' by size_t. However, since it
affects all sites where the read/write interface is uses, it takes the
opportunity to replace the C-style (pointer, size) arguments by
'Byte_range_ptr' and 'Const_byte_range_ptr'.
Issue #4706
We determine the owner of a Vram object by the session cap of the
session that created the object. We should not copy this cap to other
places because this increases reference counting, that can become very
large with many objects. Therefore, we pass a wrapper 'Owner' object
containing the cap by reference.
issue #4713
Change the abstraction from buffers to video RAM (VRAM). The notion of
buffers can be provided at the client side (e.g., Mesa) and multiple
buffers can be there be associated to one VRAM area, thus saving
resources (meta data overhead) when allocating many buffers. A VRAM area
can also be mapped to one single buffer as before for clients or drivers
that do not take advantage of this feature.
issue #4713
This patch improves the Readonly_file::read method such that the
capacity of the specified buffer is used as upper bound for the read
operation instead of VFS-internal I/O buffer sizes. This relieves the
caller from implementing a read loop in most cases.
As a step away from C-ish use of the API, the patch deprecates the old
'read' method that takes the buffer as char *, size_t arguments.
Fixes#4745
The new utility returns a key code for a passed name and is implemented
by linear search, which is slow but sufficient in situations like config
updates.
Issue #4748
Up to now it was only checked if an issued admin command was processed
in a timely fashion. Otherwise it has been treated as failed.
However, the completion-queue entry was not examined and the caller was
not able to access the entry itself. Depending on the command, checking
the completion-queue entry might be necessary, e.g. GET/SET_FEATURE.
Issue #4715.
Since the 'Platform::Device' constructor will defer the creation until
the content of the devices ROM is valid performing the PRP list helper
creation afterwards should be done with valid IOMMU information.
Issue #4715.
* Update links from forward rules only with forward rules and links from
transport-routing rules only with transport-routing rules. Besides raising
the performance of the code, this also fixes a former bug that allowed
forward-rule links to falsely stay active because of a transport-routing
rule that matched the client destination ip and port.
* Don't use good-case exceptions for updating TCP/UDP links on re-configuration
of the router.
* Make conditions when to dismiss a forward rule easier to read.
* Introduces != operator to the public Port class in the net library.
* Fix unnecessary log message that a link was dismissed when only a potentially
matching forward rule turned out to be not matching.
* Apply Genode coding style to if statements with a single body statement.
Fix#4728
This fixes a bug that was introduced by this earlier commit:
"nic_router: find forward rules w/o exceptions"
The NIC router used to falsely dissolve TCP/UDP connection states when
reconfiguring although the connection states were still legal according to the
new config. The reason was that the above mention commit nested lambdas but
missed to return from the last nesting level when having found a configuration
that legitimates the connection state.
Ref #4728
The old 'Io_response_handler::io_progress_response' interface has been
replaced by the 'Vfs::Env::User::wakeup_vfs_user' (issue #4697). The
remaining 'read_ready_response' method is now hosted in the
appropriately named 'Read_ready_response_handler'.
Issue #4706
This commit supplements the various I/O signal handlers of the VFS
plugins with calls of the new 'Vfs::Env::User::wakeup_vfs_user'
interface, which will subsequently replace the old 'Io_progress_handler'
(issue #4697).
Issue #4706
When a domain receives a new dynamic router IP address and that domain has
active connection states (TCP/UDP/ICMP) from another domain with NAT applied,
the connection states used to stay active while becoming obsolete. They
become obsolete because their identification and their packet processor
use the old routers IP address due to NAT.
One consequence was that connections became dysfunctional when the server
domain received a new dynamic router IP address. Request packets were still
routed from client to server, but when entering the server, their source IP
address was the outdated router address. Consequently, the server responses
used the outdated address as destination and the router dropped the responses
because it did not know this address anymore.
This commit fixes the problem by letting a domain destroy all its connection
states that were initiated from within other domains whenever it detaches from
its current IP configuration.
Strictly speaking, it is not necessary to destroy all connection states, only
those that the domain applies NAT to. However, the Genode AVL tree is not built
for removing a selection of nodes and trying to do it anyways is complicated.
So, for now, we simply destroy all connection states.
Note that the other way around was handled correctly already. When a domain
detaches from its IP config, all interfaces of that domain destroy all the
connection states they created (towards other domains).
Fixes#4696
If the IP config does not change on updates to the router IP config of a domain
change (a common case on DHCP RENEW), prevent detaching from the old config and
attaching to the new one. Because this would not only create unnecessary CPU
overhead but also force all clients at all interfaces that are listening to
this config (via config attribute 'dns_config_from') to restart their
networking (re-do DHCP).
Ref #4696
By adding a 'write_ready' interface following the lines of the existing
'read_ready', VFS plugins become able to propagate the (de-)saturation
of I/O buffers to the VFS user. This information is important when using
a non-blocking file descriptor for writing into a TCP socket. Once the
application observes EAGAIN, it expects a subsequent 'select' call to
return as soon as new I/O buffer space becomes available.
Before this patch, the select call would always return under this
condition, causing an unnecessarily busy write loop.
Issue #4697
The new interface is meant to replace the 'Vfs::Io_response_handler'.
In contrast to the 'Io_response_handler', which had to be called
on a 'Vfs_handle', the new interface does not require any specific
'Vfs_handle'. It is merely meant to prompt the VFS user (like the libc)
to re-attempt stalled I/O operations but it does not provide any
immediate hint, about which of the handles have become ready for
reading/writing.
Issue #4697
This patch removes the 'Insufficient_buffer' exception by returning the
WRITE_ERR_WOULD_BLOCK result value instead. It also eliminates the
superfluous WRITE_ERR_AGAIN and WRITE_ERR_INTERRUPT codes.
Issue #4697
This patch fosters the batching of network packets transferred by the
lwIP stack over the NIC connection. It replaces the eager submission of
the packet-stream's data-flow signals by explicit wakeup notifications.
The commit also increases the NIC session's buffer size from 128 to 1024
packets.
Issue #4697
...and tighten constness in adjacent code parts.
The VFS-internal synchronization via mutexes is no longer needed because
the access to the VFS is serialized by the VFS client, i.e., the libc.
Issue #4697
This patch facilitates the batching of I/O operations in the VFS library
by replacing the implicit wakeup of remote peer (via the traditional
packet-stream interface like 'submit_packet') by explicit wakeup
signalling.
The wakeup signalling is triggered not before the VFS user settles down.
E.g., for libc-based applications, this is the case if the libc goes
idle, waiting for external I/O.
In the case of a busy writer to a non-blocking file descriptor or socket
(e.g., lighttpd), the remote peers are woken up once a write operation
yields an out-count of 0.
The deferring of wakeup signals is accommodated by the new 'Remote_io'
mechanism (vfs/remote_io.h) that is designated to be used by all VFS
plugins that interact with asynchronous Genode services for I/O.
Issue #4697
By replacing the calls of 'acknowledge_packet' and 'get_packet' with
'try_ack_packet' and 'try_get_packet', we avoid the implicit triggering
of data-flow signals. Instead, the VFS server now relies on explicit
calls of the packet stream's 'wakeup' interface.
Issue #4697
The change of the queue size from 16 to 32 has negligible costs (4 KiB
instead of 2 KiB for the packet-stream queues) while facilitating the
batching of many small consecutive write operations.
Issue #4697
Count more accurately how much packets are in flied, and whether
new packets can be handled. Moreover, catch potential exceptions
whenever acknowledging a packet, and warn about the lost acknowledgement.
Fixgenodelabs/genode#4678
This is required for scenarios in which a device appears at a later
point in time. If the ROM is not updated, the device_by_type() method may
operate on an outdated dataspace and never find the device it is waiting for.
Although we do not have the full ACPI information parsed yet, to
announce non-PCI devices derived from the ACPI tables, the device
description of the assumed devices is now integral-part of pci_decode.
Formerly, the information was gained separatedly as boot-module, whereby
we lost synchronization in between ACPI/PCI parsing, BIOS handover, and
PS/2 emulation code already acting.
During interrupt handling the driver masked and cleared interrupts as
recommended in the spec to prevent spurious or unnecessary interrupts
from occurring.
Due to the way the current implementation operates new Block requests
got submitted while handling completions for already finished ones.
Since interrupts where masked at this point the controller did not
generate interrupts when the newly submitted requests got completed.
As the mask/clear optimization is apparently not strictly needed and
according to the spec undefined when using MSI-X it is removed.
Fixes#4684
This commit enables users of the VMM to define CPU type and count, RAM size,
kernel and initrd ROM names, GIC version, and Virtio devices to be used.
Derived from the configuration values a flattened device-tree blob (DTB) is
generated and transfered to the VM.
Fixgenodelabs/genode#4670
Consume '<iommu/>' tag from 'devices' report. In case an IOMMU is
present map physical memory to arbitrary locations within IO page table
range 1K-4G. This way every device PD has access to ~4GB of DMA space.
issue #4665
'_env_ram' allocations can lead to
'Expanding_pd_session_client::try_alloc' quota upgrades, which in turn
may lead to a resource request by the platform driver. Therefore, we
check the available quota within the platform driver before allocations.
This is not an optimal solution.
issue #4667
related issue #3767
To prevent caching side-effects of USB DMA memory taken from the packet stream
all allocations of USB packets need to be on separated cachelines at least.
Fixgenodelabs/genode#4655
This patch equips the pin-driver framework with support for the
time-multiplexed operation of a pin as output or input. This is needed
when implementing I2C communication via a bit-banging driver.
To operate pin in both directions, a driver obtains both a pin-state and
a pin-control session for the same pin. The pin-state session can be
used to sense the current pin state. The control session allows the
client to set the pin to high or low (using the 'state' method), or to
set it to high-impedance via the 'yield' method. Once switched to
high-impedance, the pin can be used as input.
Issue genodelabs/genode-allwinner#10
If a device should not be reset, powered off, and its clocks
shall stay untouched when it gets released, the leave_operational
attribute can be set to true in the device node of the related
device inside the devices ROM delivered to the platform driver.
This is useful for drivers, which only enable and initialize
their device, and can be closed afterwards.
Ref genodelabs/genode#4654
Reintroduce:
USB Attached SCSI devices might expose a bulk-only interface
as fall-back at interface 0 and alternate setting 0. This commit
allows for probing all alternate settings of the active interface
to be able to use such devices.
The configuration was extended so that in case the device interface
is known beforehand the driver can be configured accordingly.
Additionally:
Perform configuration reset upon sessions close in order to bring USB
device to a well defined state.
fixes#4494
Implement the guest code in dedicated assembler source file, assemble
and link the binary to vmm_x86. The resulting guest-code binary
populates one page that is mapped to host the reset vector of the guest.
This approach simplifies future guest code adaption resp. extension,
e.g., to test rdmsr/wrmsr exiting.
Fixes#4638
This reverts commit 9a37ccfe29 except for the
new declarations in public headers (in order to not change any APIs again).
We revert the commit as we found that there are corner cases in which it
produces a bad UDP checksum. The bad UDP checksum was observed via Wireshark at
a TFTP server in a Sculpt 22.10 Debian 11 VM on the first request of fetching a
file with the TFTP client of the uboot on our iMX8 test board.
Ref #4636
According to OpenBSD's azalia driver some AMD HDAudio devices do not
play nice with MSIs although the capability is set. At least the
0x1457 device was tested and worked using GSIs only.
genodelabs/genode#4578
Some DHCP clients (Debian VM in Sculpt) persistently store the last lease they
obtained and try to directly DHCP REQUEST it on a new startup whithout doing
DHCP DISCOVER beforehand. In case the NIC router doesn't know about the lease
anymore (timeout, new router instance), the router used to just ignore the DHCP
REQUEST. This led to significant delays in the network startup of the client
(delayed retries until give-up and DHCP DISCOVER). With this commit, the router
answers such packets with a DHCP NAK instead, causing the client to directly
switch to DHCP DISCOVER.
Fixes#4634
On-demand initialization prevents read-write operations on BARs of
invalid devices at construction time, which may result in surprising
behavior later on, for example, when resetting X260 notebooks via ACPI
information.
These utilities simplify the control of clocks, resets, and power
domains from within the platform driver.
This is needed when driving a low-level device directly from the
platform driver, for example for driving the mbox mechanism to access
the system-control processor of the PinePhone.
Implemented as depicted in the OpenBSD driver, register description
found in 'AMD SB700/710/750 Register Reference Guide'
(43009_sb7xx_rrg_pub_1.00.pdf).
Issue #4629.
Instead of using a global value to enumerate the MSIs, use a function argument
instead. Whenever the process of PCI device reporting gets started again,
due to an initially too small report buffer, the MSI enumeration value is reset
again. Formerly, we wasted MSI numbers.
Ref genodelabs/genode#4628
Don't skip IRQ reporting if legacy IRQ/GSIs are not supported as the
device may support MSI/MSI-X exclusively.
The commit also enables reserved_memory reporting of devices without
IRQs.
Ref genodelabs/genode#4578
* Add EHCI PCI quirk
* Add UHCI reset to UHCI quirk
* Apply all PCI quirks in order of the PCI bus numbering
otherwise the machine might stall
Ref genodelabs/genode#4578
Instead of allowing the client to set a caching attribute
in the io_mem() call of the device interface, which was
only used to decide in between of the memory being
write-combined or not, remove it from the API.
Instead use the information delivered by the devices ROM,
whether memory from a PCI BAR is prefetchable or not,
to decide whether it is mapped write-combined or not.
Ref genodelabs/genode#4578
Memory descriptors in PCI BARs have a prefetchable bit, which can
be used to optimize memory access when setting, e.g. write-combined
in page-table entries.
Ref genodelabs/genode#4578
The DHCP client used to always send packets with a size of 1024 regardless of
the size of the actual content, which was always significantly lower. 1024
bytes was simply a guess to provide enough space for all types of DHCP client
packets. As we know the exact size of each packet the DHCP client sends even
before packet creation, this commit makes use of the knowledge resulting in
much smaller packets sent by the DHCP client.
Fixes#4619
Consumes the information about reserved memory region reports from
the devices ROM, and adds appropriated mappings to the corresponding
device PD.
Ref genodelabs/genode#4578
We need the information about reserved memory region reports
from the ACPI tables within the platform driver to pre-fill
IOMMU tables with the corresponding mappings. Therefore,
the pci_decode component now parses the information from the
ACPI ROM, and adds "reserved_memory" nodes to all related
devices in the devices report.
Ref genodelabs/genode#4578
In case of the GPU multiplexer, we need to delegate MMIO memory
to the framebuffer client in form of a managed dataspace. To be
able to attach a given Platform::Device::Mmio object to a region map
we need to access its capability.
Ref genodelabs/genode#4578
This commit gets rid of the router-local wrapper of Genode's AVL string tree
and replaces it with Genode's new Dictionary structure. The Dictionary is now
used for managing domains and NIC clients. Due to this change, the formerly
necessary helper classes Domain_base and Nic_client_base could be removed as
well.
Ref #4610
The default file-system communication-buffer size of 128 KiB combined
with the clamping of requests to 1/4th the buffer size results in the
fragementation of read operations into 32 KiB chunks. This is overly
conservative and causes high context-switch overhead down the storage
stack (vfs server -> part_block -> block driver).
Related to #4613
The NIC router README claims that the 'dns_config_from' attribute in a DHCP
server configuration binds the propagated link state of all interfaces at the
domain of the server to the validity of the IP config of the domain that is
given through 'dns_config_from'.
However, this was not true. The router missed to implement this detail which
led to clients of such a DHCP server sending DHCP DISCOVER packets too early.
These early DHCP DISCOVER packets were dropped by the router potentially
causing a big delay until the client started a new attempt. Unnecessary long
network boot-up delays were observed with at least the lwip run script and
Sculpt on the PinePhone and could be tracked down to this former
inconsistency in the router.
This commit fixes the inconsistency.
Fixes#4612
The new 'Dictionary' provides an easy way to access objects using
strings as key. The 'String' received the 'operator >' to simplify the
organization of strings in an AVL tree.
The patch removes the former definition of the 'operator >' from the
platform driver because it would be ambigious now.
Fixes#4610
In case a driver is waiting for data, is should only investigate 'pos'.
It should not advance the ring in any way until there is data available.
issue #4609
This patch consolidates the repetitive error handling across the RPC
functions, which take node handles or directory handles as arguments.
During this change, I noticed that directory handles - which are values
provided by the client - were not checked for their type before being
used. A misbehaving client may open a file, manually construct a
directory handle using the number of the file handle, and invoke a
directory operation at lx_fs, which would then wrongly access a file
node as directory node.
This patch solves this issue by introducing two distinct methods
_with_open_node and _with_open_dir_node, which perform the respective
safety checks.
Fixes#4608
Creating and destructing an interface was not considered a change of its real
link state as defined in the description of the <report link_state_triggers="">
config attribute in the router's README. In case of Uplink sessions this is
obviously a problem as they communicate their real link state through session
lifetime. But also in case of NIC sessions it's a possible to create an
interface that is immediately "up" after creation or destruct an interface
without its link state going "down" beforehand.
Taking into account also the practical application of the
<report link_state_triggers=""> attribute, reporting only on destruction and
construction of interfaces that are "up" seems shorthanded. This is because a
report-receiver most likely needs to be able to synchronize the lifetime of
the objects that keep track of the link states with the lifetime of the
corresponding sessions.
That said, with this commit, the router triggers a report update on each
session construction/destruction when <report link_state_triggers=""> is
set.
Fixes#4462
The NIC router used to generate reports triggered by IP config changes or link
state changes synchonously, i.e., inline with the activation context that
caused the change. This has two disadvantages. First, it can lead to an
excessive number of report updates in situations with quick bursts of
triggering changes. In such situations it is preferable to collect the changes
and reflect them with only one final report update.
Second, synchronous reporting may happen while the router is in a state that
leads to an incorrect report (e.g. during reconfiguration). To prevent this
from happening, the router so far explicitely switched off reporting when
entering incoherent states and back on when leaving them. However, this
solution is error-prone as the exclusion windows must be maintained manually.
Both issues can be solved by not directly generating a report when necessary
but instead submitting a signal and letting the signal handler do the work in
a dedicated activation context.
Ref #4462
This patch splits the querying of the number of directory entries from
the directory's 'status' information. Subsuming the number of directory
entries as part of the status makes 'stat' calls too costly for some
file systems that need to read a directory for determining the number of
entries. So when stat'ing the entries of one directory that contains sub
directories, all entries of each sub directory are visited.
Thanks to Cedric Degea for pointing out this performance bottleneck!
With this change, the 'status' function returns a 'Status::size' value
of 0 when called for a directory handle.
Fixes#4603
The DHCP client of the NIC router used to end up in an uncaught exception if
an IP address in the DNS server option of a DHCP ACK was invalid. This commit
makes the 'Dns_server' constructor (where the exception originated from)
private and instead introduces a public lambda method 'construct' that calls
one lambda argument on success and another on failure. This is also in line
with the most recent changes to the 'find_by_*' methods of other classes in
the NIC router and contributes to the goal of reducing expensive exception
handling.
Fixes#4465
The Interface class of the router is an abstraction for NIC client sessions,
NIC server sessions, and Uplink sessions. Nonetheless, Interface generally used
to use the packet stream types of the Nic namespace and it worked because the
Uplink packet stream types are factually the same (the are typedef'd from the
same base type templates with the same parameters).
The initial intention of this issue was to remove dependency on the diverse
packet stream stream types from Interface. However, this turned out to be more
tricky than thought. The Interface class calls function templates on the packet
stream types, making a generic virtual interface impossible. And moving the
calling code to the session classes as well would produce a lot of redundancy.
Therefore, this commit removes only the use of the Nic namespace in the
interface.* files by typedef'ing the packet stream types from the generic
Genode type templates with the same parameters as in Nic and Uplink.
Fixes#4385
The `with_sub_node` method is renamed to `with_optional_sub_node` to
better reflect that the non-existence of a sub node with the desired type is
ignored.
At the same time, the new `with_sub_node` now takes a second functor that is
called when no sub node of the desired type exists.
genodelabs/genode#4600
By using the new functions provided by the base API, this patch removes
the dependency of several components from include/decorator/xml_utils.h.
Issue #4584
The NIC router used to send an ICMP "Destination Unreachable" packet as
response to every unroutable IPv4 packet. However, RFC 1812 section 4.3.2.7
defines certain properties that must be fullfilled by an incoming packet in
order to be answered with this type of ICMP. One requirement is that the packet
is no IPv4 multicast.
This commit prevents sending the mentioned ICMP response for unroutable IPv4
multicasts and instead drops them silently.
Fixes#4563
Instead of having a generic "virt_qemu" board use "virt_qemu_<arch>" in
order to have a clean distinction between boards. Current supported
boards are "virt_qemu_arm_v7a", "virt_qemu_arm_v8a", and
"virt_qemu_riscv".
issue #4034
The NIC router used to add the DNS servers field to DHCP replies regardless of
whether there were DNS servers or not. As reported by a Genode user, the empty
DNS server field irritated at least Windows 10 guests (Vbox 6) that connected
to the NIC router. This resulted in Windows 10 ignoring DHCP offers from the
router with such characteristic.
With this commit adding the DNS server DHCP option is skipped if there are no
DNS servers at the corresponding DHCP server or the domain IP config the server
shall fetch its DNS servers from.
Fixes#4581
Provide additional PCI register information inside the pci-config part
of the devices ROM for clients able to access an Intel graphic card,
namely the GMCH control register content, which contains for instance
the GTT size and stolen memory size.
Ref genodelabs/genode#4578
Implement BIOS handover and Intel resume register update
apart from device driver to circumvent export of PCI
config space to drivers.
Ref genodelabs/genode#4578
The pci_decode has to extract the additional fields from the PCI configuration
space. The platform driver again has to parse and forward the knowledge too.
The PCI BAR indices are exported when info="yes" is set in the policy node for
the corresponding session.
Fixgenodelabs/genode#4577
The base address of I/O ports has a different encoding than
those of I/O memory. This needs to be encountered in the PCI
config helper utilities.
Fixgenodelabs/genode#4576
On okl4, pistachio, sel4 the test didn't come up fast enough in order to still
experience the first configuration of NIC router #1. This commit doubles the
lifetime of the first configuration of NIC router #1 to 4 seconds and raises
the overall test timeout accordingly.
Ref #4555
In overload situations, i.e. when a sender fills up the entire buffer, we land
in situations where the sender receives an ack_avail signal, releases one
packet, allocates and sends a packet and fails to allocate a second packet.
This is especially relevant if the receiver does not batch ack_avail signals
(such as vfs_lwip). In those ping-pong scheduling scenarios, the overhead from
catching the Packet_alloc_failed exception becomes significant. In case of the
NIC router, we will land in an overload situation if the sender is faster than
the receiver. The packet buffer will be filled up at some point and the NIC
router starts to drop packets. For every dropped packet, we currently have to
catch the Packet_alloc_failed exception.
This commit adds a new method alloc_packet_attempt to Packet_stream_source that
has almost the same signature as the older alloc_packet method but returns
an Attempt<Packet_descriptor, Alloc_packet_error> object. As the method already
used the allocator back end exception-less, changes on lower levels were not
needed. Furthermore, the NIC router was modified to use the new exception-less
alloc_packet_attempt instead of alloc_packet.
Ref #4555
Replaces the former implementation of the 'find_by_ip' method at the data
structure for ARP cache entries. This method used to return a reference to the
found object and threw an exception if no matching object was found.
The new implementation doesn't return anything and doesn't throw exceptions. It
takes two lambda arguments instead. One for handling the case that a match was
found with a reference to the matching object as argument and another for
handling the case that no object matches.
This way, expensive exception handling can be avoided and object references
stay in a local scope.
Ref #4555
According to a benchmarking series on Zynq (base-hw) and x260 (base-nova) using
test-nic_perf_router, increasing the 'max_packets_per_signal' has a significant
effect on the packet throughput. By increasing the default value from 32
to 150, we could gain a few hundred Mbit/s. Increasing the value further
does not seem to have such a strong effect, though.
genodelabs/genode#4555
The checksums for forwarded/routed UDP, TCP and ICMP, used to be always
re-calculated from scratch in the NIC router although the router changes only
a few packet fields. This commit replaces the old approach whereever sensible
with an algorithm for incremental checksum updates suggested in RFC 1071.
The goal is to improve router performance.
Ref #4555
The checksums for forwarded/routed IPv4, used to be always re-calculated from
scratch in the NIC router although the router changes only a few packet fields.
This commit replaces the old approach whereever sensible with an algorithm for
incremental checksum updates suggested in RFC 1071. The goal is to improve
router performance.
Ref #4555
We used to use 'unsigned long' for the accumulating variable when calculating
internet checksums. However, 'signed long' is more in accordance with RFC 1071
and will allow us to share the same back end for folding, once we implement
incremental updating of internet checksums.
Ref #4555
Prevent public reflection of the only internally used 'init_sum' argument in
'uint16_t internet_checksum(...)' that, in addition, added a default value to
the function interface.
Ref #4555
When sending an ICMP ECHO reply, the router merely swaps SRC and DST of the
IPv4 header of the corresponding request and these changes cancel each other
out in checksum calculation. Therefore, with this commit, the router skips
updating the IPv4 checksum in this context.
Ref #4555
The router used to update IPv4 checksums when routing via an <ip> rule
despite the fact that it doesn't change any IPv4 header fields in this case.
Ref #4555
The NIC router used to update IPv4 and layer 4 checksums of a packet for each
interface it was sent to (say, all interfaces of the domain the packet was
routed to). However, there was and is no technical reason for not doing it
only once and then iterating over the interfaces with the already updated
packet. This is what this commit does in an intent to raise the router's
performance.
Ref #4555
The NIC router uses the timer for relatively coarse-grained timeouts.
It therefore suffices to update and store the current time when the NIC router
is signalled and use the cached time instead. This prevents frequent
syscalls or RPCs when acquiring the current time for every packet.
genodelabs/genode#4555
The link dissolve timeout is updated for every packet, which leads to
trigger_once() RPCs that only marginally change the scheduled timeout but
significantly slow down the packet throughput.
genodelabs/genode#4555
The wakeup call only emits a single signal as it assumed both are
handled by the same signal handler. However, the original implementation
did not reset the wakeup_needed variable properly.
genodelabs/genode#4555
When using signal batching, ack_avail and packet_avail should always
be emitted and preferred over ready_to_submit and ready_to_ack.
A signal receiver might decide to not register the ready_to_* signals when it
handles congestion by dropping packets. The Nic router is an example of
such a signal receiver.
genodelabs/genode#4555
Using the 'query_buffer_ppgtt()' function allows for retrieving the
virtual address of the buffer in the PPGTT.
This is for components that manage the GPU virtual addresses rather than
the client as is the case with the lima driver.
Issue #4559.
The parent-provides model is destroyed if no <parent-provides> node is
found in the configuration, which resulted in
Warning: list model not empty at destruction time
and leaking memory for the allocated nodes. The commit now explicitly
empties the list model in the destructor of ~Parent_provides_model.
Note, the case is implicitly tested in pkg/test-init by step "denial of
forwarded session request" and <init_config version="empty">.
Thanks to Peter for reporting this issue.
Fixes#4547
Whenever a domain looses all its interfaces or the link state of all attached
interfaces is down at once, the domain potentially moves to another Ethernet
segment and should therefore consider its ARP cache to be outdated.
RFC 826 states that "... If a host moves, any connections initiated by that
host will work, assuming its own address resolution table is cleared when it
moves. ...".
Therefore, this commit introduces clearing the ARP cache and the initially
stated events.
This commit was motivated by an issue with the PinePhone Modem and USB NIC.
On the PinePhone, the Modem has its own OS and acts as direct gateway to the
outer world for the USB NIC that is driven by Genode. However, whenever the
Modem gets restarted, Modem and USB NIC receive a new MAC address. This used
to conflict with the NIC routers ARP entry for the Modem that didn't cease to
be valid.
With this commit, the integrator of such a scenario at least has a convenient
way of fixing this by ensuring that all interfaces at the USB NIC domain go
down when resetting (e.g. by ensuring that the USB NIC is the only interface at
that domain).
Fixes#4558
Replaces the former use of the 'find_by_name' method of the AVL string tree.
This method returned a reference to the found object and threw an exception if
no matching object was found.
The locally implemented replacement doesn't return anything and doesn't throw
exceptions. It takes two lambda arguments instead. One for handling the case
that a match was found with a reference to the matching object as argument and
another for handling the case that no object matches.
This way, expensive exception handling can be avoided and object references
stay in a local scope.
Furthermore, this commit modifies the local wrapper for the insert method of
the AVL string tree, so, that it follows the above mentioned concept as well.
Ref #4536
Replaces the former implementation of the 'find_by_domain' method at the data
structure for NAT rules. This method used to return a reference to the found
object and threw an exception if no matching object was found.
The new implementation doesn't return anything and doesn't throw exceptions. It
takes two lambda arguments instead. One for handling the case that a match was
found with a reference to the matching object as argument and another for
handling the case that no object matches.
This way, expensive exception handling can be avoided and object references
stay in a local scope.
Ref #4536
Replaces the former implementation of the 'find_by_port' method at the data
structure for permit rules. This method used to return a reference to the found
object and threw an exception if no matching object was found.
The new implementation doesn't return anything and doesn't throw exceptions. It
takes two lambda arguments instead. One for handling the case that a match was
found with a reference to the matching object as argument and another for
handling the case that no object matches.
This way, expensive exception handling can be avoided and object references
stay in a local scope.
Furthermore, the commit introduces a convenience wrapper for finding the best
matching pair of transport rule and corresponding permit rule for a given
destination IP and port. This method as well follows the above mentioned
concept.
Ref #4536
Replaces the former implementation of the 'find_longest_prefix_match' method at
the data structure for direct rules. This method used to return a reference to
the found object and threw an exception if no matching object was found.
The new implementation doesn't return anything and doesn't throw exceptions. It
takes two lambda arguments instead. One for handling the case that a match was
found with a reference to the matching object as argument and another for
handling the case that no object matches.
This way, expensive exception handling can be avoided and object references
stay in a local scope.
Ref #4536
Replaces the former implementation of the 'find_longest_prefix_match' method at
the data structure for direct rules. This method used to return a reference to
the found object and threw an exception if no matching object was found.
The new implementation doesn't return anything and doesn't throw exceptions. It
takes two lambda arguments instead. One for handling the case that a match was
found with a reference to the matching object as argument and another for
handling the case that no object matches.
This way, expensive exception handling can be avoided and object references
stay in a local scope.
Ref #4536
Replaces the former implementation of find_by_id at the data structure for
links. This method used to return a reference to the found object and threw an
exception if no matching object was found.
The new implementation doesn't return anything and doesn't throw exceptions. It
takes two lambda arguments instead. One for handling the case that a match was
found with a reference to the matching object as argument and another for
handling the case that no object matches.
This way, expensive exception handling can be avoided and object references
stay in a local scope.
Ref #4536
Just add riscv spec files. The riscv versions should use MMIO transport
as ARM versions do. They also should work fine for riscv_qemu machine
from genode-riscv repository.
acpica and the Intel display driver tries to use the Intel Opregion
simultaneously on Genode, which is not supported nor wanted for IO_MEM region as
which it is handled.
Attempts to remove the access to the region was not successful, since some
SSDT table contains ACPI AML code which is executed regularly and read/write
the Opregion.
The patch adds support to make a copy of the Intel Opregion and report it as
is. The copy was sufficient to make the Intel display driver working to find
and lookup the Intel VBT (video bios table) information to setup all
connectors on a Fujitsu U7411 docking station.
Issue #4531
ACPICA needs access to the host bridge 0:0.0 on Intel, which is also
accessed by the Intel display driver. Since for the Intel display driver the
PCI device is specified in the policy explicitly, the PCI device is filtered
out for the ACPICA driver which uses the policy "ALL".
Issue #4532
As accommodating the session component object is already taken care of
be the root component implementation, remove the remaining redundant
checks.
Fixes#4521.
There is a race between the trace subject doing the buffer
initialization and the monitor trying to iterate the buffer entries. If
the monitor tries to iterate entries of an uninitialized buffer, it will
read the very first entry twice. The monitor should therefore only start
iteration when the buffer has been initialised.
genodelabs/genode#4513
The hover state is evaluated for the routing of input events. When
routing a touch event, the decision should be based on the most recently
observed touch position. Without this patch, however, the hover state kept
referring to the initial pointer position (screen center) in the absence
of any other motion events.
Issue #4514
* Adds methods for copying raw data to the data field of Ethernet frames and
UDP packets. This is used in the port to wrap the higher-layer packet data
prepared by the contrib code with the additionally required headers before
sending it at a network session.
* Adds a method to cast raw data to an IPv4 packet. This is required in the
port in order to check values in stand-alone IP packets produced by the
contrib code before sending them at a network session.
* Adds methods for setting UDP ports given big endian port values without
having to convert to little endian in the app and then back to big endian in
the net lib.
Ref #4397
* Some fixups for the README
* Make config ROM const when used for the session policies
* Turn Reporter into Expanding_reporter
* Always first register ROM signal handler before parsing it the first time
Outsource parts of the Main object into a common compound object,
common parts of the Makefile description and depot source package.
Fixgenodelabs/genode#4503
* Parse PCI specific information from devices ROM
* Enable DMA, I/O memory and I/O port access dependent on BARs in config space
* Introduce device PD for Nova + IOMMU support
* Enable MSIs if available
* Add PCI specific policy rules
Fixesgenodelabs/genode#4502
Instead of returning an invalid device capability when a device
is (not yet) available, e.g. a PCI device is requested before the
PCI bus got parsed accordingly, we check the device capability
within the Platform::Connection utilities, and register temporarily
an Io_signal_handler to wait for changes of the devices ROM, and
try the device aquisition again. Thereby, simple drivers so not have
to take the burden to do so.
To enable this feature for all drivers, we always have to export a
devices ROM, but limit the information about physical resources
(I/O memory addresses, IRQ numbers, I/O port ranges) to clients with
'info=yes' in their policy description.
Fixgenodelabs/genode#4496
By adding a 'report' node to the platform driver's configuration
one can enable either devices or config reports. The devices
report contains all devices and their detailed state, as well as
whether it is already in use or not. The config report contains
one by one the current configuration of the platform driver.
Moreover, this commit adds a README file describing the facilities
of the platform driver.
Fixgenodelabs/genode#4386
If PCI devices happen to miss complete configuration after boot, the
platform driver supports <pci-fixup> nodes for concrete devices
(specified by bus-device-functions tuples). The
<bar> node instructs the platform driver to remap BAR id 0 to address
0x4017002000, which amends the BIOS configuration and is stringently
required for BARs with address 0.
! <pci-fixup bus="0" device="0x15" function="3">
! <bar id="0" address="0x4017002000"/>
! </pci-fixup>
The issue was discovered with Intel LPSS devices in Fujitsu notebooks.
Fixes#4501
To discharge the generic platform driver from certain PCI bus scanning,
and ACPI + kernel specifics, this commit introduces a new component,
which consumes the acpi drivers report and the platform_info from core
to prepare a devices ROM for the platform driver that contains all
PCI devices and its resources.
Fixgenodelabs/genode#4495
USB Attached SCSI devices might expose a bulk-only interface
as fall-back at interface 0 and alternate setting 0. This commit
allows for probing all alternate settings of the active interface
to be able to use such devices.
The configuration was extended so that in case the device interface
is known beforehand the driver can be configured accordingly.
Fixes#4494.
Since the driver relies on all requests being Nvme::MPS_LOG2 aligned
as advertised in its Block::Info the added check will reject any
misaligned requests (using 'gpt_write' led to an IOMMU write fault).
Issue #4486.
To signal that a device gets used and released by a session
introduce claim, release, and release all callbacks in the
USB interface of the C-API.
Ref genodelabs/genode#4483
This patch changes the meaning of the 'offset' parameter of the
'produce_write_content' and 'consume_read_result' hook functions.
The value used to reflect the absolute byte position but in practice,
a job-relative byte offset is desired.
Issue #4474
Before this commit, the block-request handler was implemented as
Io_signal_handler and, additionally, the USB driver called the
block-request handler on request completion directly on I/O level. This
is generally a bad idea because I/O handlers should avoid to have direct
global side effects. In contrast, application logic should be
implemented in way that it consumes atomic state changes after I/O
completed. Now USB I/O completion locally submits a signal to the
block-request Signal_handler.
The Uplink_test used to end in an uncaucht exception about a failed packet
allocation on several x86_32 platforms.
* Destruct and re-construct the corresponding TX packet allocator during a
link-down-up step in the Uplink test. Fixes the exceptions but results in a
never ending test.
* Decouple the link-down-up steps from the handling of packet stream signals
by simply triggering it with a local periodic timeout of 1 sec period.
This prevents that the Uplink_test never finishes because it destructs the
Uplink connection too often.
* The test finishes not before at least 3 link-down-up steps were executed.
* Replace the Allocator_avl's used for the TX packet allocators of the Nic
and Uplink Connection with the better suited Nic::Packet_allocator.
Ref #4419
The service is merely announced but trying to request a session always causes a
Service_denied exception. This helps in scenarios where the client is
won't open a session anyway but expects the service to be available. This is
considered a temporary solution.
Ref #4419
This patch reverts the vfs-watch-handle creation whenever the subsequent
allocation of the VFS server's 'Watch' object fails. This can happen
when the session RAM or cap quota is depleted.
Fixes#4472
This fixes the following warning when building the binary archive:
Library-description file vfs_capture.mk is missing
Library-description file vfs_tap.mk is missing
'Platform::Device_pd::attach_dma_mem' may lead to insufficient resources
for meta data, which is reflected to the client via 'Out_of_caps' or
'Out_of_ram'. In case the client upgrades its session the quotas need to
be passed to core as done by
'Platform::Device_pd::Expanding_region_map_client::attach'.
issue #4451
Use 'avail_caps' and 'avail_ram' for resource guards because 'used_caps'
and 'used_ram' do not account for resources given to the platform
driver. This lead to incorrect resource accounting by the GPU
multiplexer.
issue #4451
The calculations of packet_size and packet_count in the block_io() did
not consider rounding errors. This resulted in diverging values over
several bisecting operations (/= 2) and wrongly-size packet allocations
as well as memcpy operations.
Related to #2263 (comments about partial block accesses and
_block_io()).
Fixes#4471
The NIC router used to handle each type of packet-stream signal with a distinct
method in the Interface class. However, merging those methods has advantages.
It ensures that sent packets that were already acknowledged by the counter side
are always released before handling received packets. This frees packet stream
memory which facilitates the potential allocation of response packets while
handling received packets. Furthermore, it simplifies the code and reduces the
number of entry points into the router.
This commit also removes the installation of signal handlers at packet streams
for events that are of no interest for the router (TX-ready-to-ack /
RX-ready-to-submit at NIC sessions and RX-ready-to-ack / TX-ready-to-submit at
Uplink sessions).
Fixes#4470
Whenever the nic_router encounters ARP requests on an interface
that does not have a valid IP config it will ignore them. However,
When increasing the verbosity of the component for diagnostic
purposes the resulting 'Bad network protocol' message is misleading.
Issue #4455.
* Test DHCP RENEW by the test client in the unmanaged variant.
* Add event IDs to log output of test client in order to prevent false positive
result in the managed variant.
* Let managed and unmanaged variant have separate string patterns for
'run_genode_until' because they already had different output and it will
differ even more as we don't want to test DHCP RENEW with the managed
variant.
* Delay first test client DHCP in order to fix unexpected sporadic initial IP
config.
* Remove some unnecessary code from the run script
Fixes#4460
The NIC router did update the IP config of a domain on a completed DHCP
REQUEST but not on completed DHCP RENEW or DHCP REBIND. Thus, it didn't adapt
to "real" DHCP servers (not NIC router servers) that got restarted with a
changed configuration by the means of RENEW/REBIND. The commit fixes this.
Note, that testing this is complicated as we don't have the necessary
infrastructure (we cannot simply use the DHCP server of the NIC router as this
would apply a link down/up sequence in order to let the client restart DHCP)
Ref #4460
by using the io_mem RPC of the platform session instead of parsing the
bar resources manually. This commits avoids and breakage on systems where
the Intel graphic cards just uses 64bits with addresses above 4G.
Issue #4450
Our usb_host driver supports UHCI, OHCI, EHCI, and XHCI host
controllers. The USB4 host interface / Thunderbolt is currently not
supported and must therefore not be passed to the USB host driver.
With this fix, the driver no longer aborts on the Tigerlake notebook and
just skips the out-of-region ACPI table. Issue #4452 is not fixed by
this commit, but in this specific case the table is not used anyway.
Check if there are a least 4 caps + 2MB (heap) + possible buffer size
available before any resource allocation. Only account resources that are
actually used.
issue #4451
The former implementation relied on the behaviour of how the old
intel fb driver requested the pci devices. The new lxkit however actually
really want to have all available pci devices.
Issue #4450
If size is zero, the platform goes out of service by:
[init -> platform_drv] Error: Uncaught exception of type 'Genode::Ram_allocator::Denied'
[init -> platform_drv] Warning: abort called - thread: e
Issue #4450
This patch adds the trace-logger utility to the default set of packages
along with an optional launcher. With this change, only two steps are
needed to use Genode's tracing mechanism with Sculpt:
- Add 'trace_logger' to the 'launcher:' list of the .sculpt file
- Either manually select the 'trace_logger' from the '+' menu,
or add the following entry to the deploy configuration:
<start name="trace_logger"/>
By default, the trace logger is configured to trace all threads
executed in the runtime subsystem and to print a report every 10
seconds. This default policy can be refined in the launcher's <config>
node. Note that the trace logger does not respond to configuration
changes during runtime. Changes come into effect not before restarting
the component.
Issue #4448
This patch changes the output format of the trace logger to become
better suitable for human consumption. For example, when instrumenting
the VFS server in Sculpt using the GENODE_TRACE_TSC utility, the
trace logger now generates tabular output as follows.
Report 4
PD "init -> runtime -> arch_vbox6 -> vbox -> " ----------------
Thread "vCPU" at (0,0) total:12909024 recent:989229
Thread "vCPU" at (1,0) total:5643234 recent:786437
PD "init -> runtime -> ahci-0.fs" -----------------------------
Thread "ahci-0.fs" at (0,0) total:910497 recent:6335
Thread "ep" at (0,0) total:0 recent:0
71919692932: TSC process_packets: 8005M (4998 calls, last 4932K)
71921558516: TSC process_packets: 8006M (4999 calls, last 1596K)
71922760220: TSC process_packets: 8007M (5000 calls, last 1006K)
71929853586: TSC process_packets: 8009M (5001 calls, last 1840K)
71931315246: TSC process_packets: 8011M (5002 calls, last 1253K)
72127999920: TSC process_packets: 8016M (5003 calls, last 5606K)
72129568198: TSC process_packets: 8018M (5004 calls, last 1345K)
77161908178: TSC process_packets: 8029M (5005 calls, last 11349K)
77643225736: TSC process_packets: 8029M (5006 calls, last 217K)
89422100594: TSC process_packets: 8035M (5007 calls, last 5656K)
89422123632: TSC process_packets: 8035M (5008 calls, last 1342)
Thread "signal handler" at (0,0) total:36329 recent:3001
Thread "signal_proxy" at (0,0) total:51838 recent:13099
Thread "pdaemon" at (0,0) total:97184 recent:332
Thread "vdrain" at (0,0) total:1266 recent:286
Thread "vrele" at (0,0) total:1904 recent:516
PD "init -> runtime -> nic_drv" -------------------------------
Thread "nic_drv" at (0,0) total:34044 recent:897
Thread "signal handler" at (0,0) total:369 recent:142
...
Subjects that belong to the same PD are grouped together. The formerly
optional affinity and activity options have been removed. Those
information are now unconditionally displayed. The trace entries
belonging to a thread appear as slightly indented.
The patch also updates the coding style, avoiding excessively long
lines.
Issue #4448
This patch reduces repetitive log output by omitting inactive trace
subjects from the log output. The information about all subjects can
still be dumped by setting 'verbose="yes"'.
Issue #4448
This patch splits the creation and updating of monitor objects into two
stages. The creation of a monitor object changes the state of the
associated trace subject. The patch ensures that the new state is
captured by the update of the monitor object.
Issue #4448
This patch makes the trace-subject state as reflected to the trace
monitor more accurate.
Until now, a subject could be in UNTRACED or TRACED state. In reality,
however, there exists an intermediate state after the trace monitor
called 'trace' for the subject but before the subject locally activated
the tracing (done when passing a trace point). This intermediate state
was reflected as UNTRACED. Consequently, threads that never pass a trace
point (e.g., just waiting for I/O) would remain to appear as UNTRACED
even after enabling its tracing by the trace monitor. This is confusing.
This patch replaces the former UNTRACED and TRACED states by three
distinct states:
UNATTACHED prior any call of 'trace'
ATTACHED after a trace monitor called 'trace'
but before the tracing is active
TRACE tracing is active
Fixes#4447
The utilities of the new util/formatted_output.h header complement the
existing base/output.h with the text-formatting support needed to
produce tabular output.
Fixes#4449
This commit adjusts the value such that USB sessions requested by
VirtualBox6 on Sculpt OS can get established on the first try without
invoking the session-retry mechanism. This reduces the number of
diagnostic log messages like:
Error: Insufficient 'ram_quota',got 6296372 need 6297928
The existing assignment of CPU quotas did not anticipate the dynamic
reconfiguration of init. It merely tracked the available CPU quota by
deducing the consumed amount from a global variable but never
replenished the value. This worked for static scenarios but failed in
situations where components are dynamically re-started.
So far this deficiency remained detected because CPU quotas were not
used in highly dynamic systems like Sculpt OS. However, this has
recently changed by commit "sculpt: assign CPU quotas".
The patch improves the accounting by mirroring the existing handling of
RAM and cap quotas. Note that the CPU-quota accounting is still rather
limited. In particular the dynamic rebalancing is not yet supported.
Issue #4445
With the consolidation of the file-system session's signal handlers
implemented by commit "file_system_session: merge ack and submit sigh",
we can now change the VFS server to produce batches of acknowledgements
before explicitly waking up the client. (in contrast to the traditional
'acknowledge_packet', the new 'try_ack_packet' triggers no signal)
Issue #4388
Split the trace buffer into two partitions in order to prevent overwriting
of entries when the consumer is too slow. See file comment in buffer.h.
genodelabs/genode#4434
This commit simplifies the current implementation by overloading the
length field with a padding indicator in addition to the zero-length
head entry. This simplifies the iteration semantics as it eliminates
the need for determining whether a zero-length entries is the actual
head of the buffer or a padding at the buffer end.
genodelabs/genode#4434