This commit streamlines the interaction between the Wifi::Manager
and the wpa_supplicant's CTRL interface.
As user-facing changes it alters some default settings and introduces
new features:
* Every configured network now needs to explicitly have its
'auto_connect' (to be considered an option for joining) attribute
set to 'true' whereas this was previously the default value if the
attribute was not set at all.
* The 'log_level' attribute is added and configures the supplicant's
verbosity. Valid values correspond to levels used by the supplicant
and are as follows 'excessive', 'msgdump', 'debug', 'info', 'warning'
and 'error'. The default value is 'error' and configures the least
amount of verbosity.
* The 'bgscan' attribute may be used to configure the way the
supplicant performs background-scanning to steer or rather optimize
roaming decision within the same network. The default value is set
to 'simple:30:-70:600'. It can be disabled by specifying an empty
value, e.g. 'bgscan=""'.
* The 'verbose_state' attribute was removed alltogether and similar
functionality is now coverted by 'verbose' attribute.
Implementation-wise the internals changed significantly and are
outlined in the following paragraphs.
Formerly the interaction between the manager and the supplicant
was handled in an apparent way where the internal state of each
interaction was in plain sight. This made the flow cumbersome to
follow and therefor each interaction is now confined to its own
'Action' object that encapsulates the ping-pong of commands and
responses between the manager and the supplicant. All actions are
processed in an sequential way and thus there is no longer any
need to defer pending actions depending on the interal state of
the current interaction. Configuration changes as well as events
issued by the supplicant where new actions can be created are
handled in this fashion. Of note are both signal-handlers,
'_handle_cmds' and '_handle_events' respectively.
The state report, which provides the information about the current
state of connectivity to a given wireless network, was dealt with
in the same vein and its handling was spread across the manager
implementation. Again, to make it easier to follow, the generation
of the state report is now purely driven by the 'Join_state' object.
This object encapsulates the state of connectivity and is normally
updated by events issued from the supplicant (see '_handle_events').
It is also incorporated when handling command responses (see
'_handle_cmds').
Handling of timed-actions, like scan and signal quality
update requests, was done by setting a timeout at the Timer session
directly and thus only one timed-action could be pending at any time.
This excluded dealing with timed-actions like connected-scanning
and signal quality polling concurrently. This was changed and now
a One_shot_timeout is used to programm each concurrent timed-action.
For implementing the communication channel for the CTRL interface the
manager and supplicant use a shared memory buffer, the Msg_buffer.
Since the CTRL interface for Genode was implemented using C, some
shenanigans were performed to access the memory buffer. Now the
CTRL interface implementation uses C++ and only exports the functions
required by the supplicant as C. This simplifies the usage of the
Msg_buffer and allows for removing the global functions needed for
synchronizing the Msg_buffer access as those are now part of the
object itself via the 'Notify_interface'.
Fixes#5341.
The 'convert_errno_from_linux' function was already used internally to
convert the Linux errno values to the matching FreeBSD libc ones when
calling socket functions.
It will now also be used to convert the error values included in
netlink messages as those, naturally, also correspond to the Linux
ones.
Issue #5341.
When starting testnit with the wm, the child views briefly appear at a
position relative to the top-left corner of the screen until the
top-level view has been positioned by the layouter. This patch keeps
child views invisible until their respective parent views are
positioned.
When restacking a top-level view, execute the top-level restacking
before updating the child views. Otherwise, child views may wrongly
refer to the old stacking position of the top-level view.
Issue #5242
Use Genode namespace, indicate 'Main' members as being private,
use Session_object, remove unused '_focus_request_reporter',
use Id_space for Window_registry, replace lookup by with pattern.
This is a follow-up commit to "gui_session: manage view ID at the client
side", which missed to invalidate the neighbor view of a window but
instead wrongly assigned the (now always valid) view ID 0 as neighbor.
In situations where a window disappears and re-appears (e.g., repeatedly
launching testnit in the wm.run scenario), the new window could not
always be topped.
Issue #5242
- only mark framebuffer dirty if necessary
-> gives the hardware chance to save longer power
- remove extra timer connection on Genode component side
-> use Linux time primitives
Issue #5339
With this patch, the wm accounts RAM and caps consumed on behalf of its
clients to the respective client's session quota instead of paying out
of its own pocket. This should make the wm resilient against resource
exhaustion and lowers the quota requirements.
Issue #5340
The current default session RAM quota of 36 KiB reflects the needs of
the nitpicker GUI server. However, in most commonly used scenarios, a
GUI client connects to nitpicker indirectly via the wm. The low value
worked so far because the wm did not account RAM and cap usage per
client so far but paid out of its own pocket and faithfully forwarded
all resource upgrades to nitpicker.
When adding resource accounting to the wm, the old default value has the
effect that a new client has to repeatedly attempt the session creation -
each time offering sligthly more session quota - until both nitpicker and
the wm are satisfied.
By roughly doubling the default to 80 KiB, a wm client immediately
succeeds with opening a GUI session without repeated attempts.
By specifying a custom 'cap_quota' amount to the 'Genode::Connection',
the Gui::Connection now donates enough caps for both the wm and
nitpicker.
Issue #5340
By default, a 'Connection' donates an amount of caps as declared in
SESSION_TYPE::CAP_QUOTA to the server at session-creation time.
In some situations, however, a client may deliberately want to donate a
larger amount. For example, when opening a GUI session at the wm, the
total amount of needed caps is the sum of those consumed by the wm plus
those consumed by nitpicker. Using this knowledge, the Gui::Connection
may specify a sufficient amount to avoid iterative session-creation
retries. The new 'Connection' constructor accommodates this use case by
accepting an explicit 'cap_quota' argument.
Issue #5340
This patch deduces the caps needed for the framebuffer and input RPC
objects from the resources accounted locally within the session. It also
takes precautions for the situation where a client offers too little
resources, prompting the mid-way cancelling of the 'Session_component'
creation. With the patch, the 'ep.manage' operations are rolled back
by the corresponding 'ep.dissolve' operations.
Issue #5340
This patch moves the eager allocation of view capabilities from the
'view' and 'child_view' RPC functions to the 'view_capability' RPC
function, reducing the consumption of capabilities in all scenarios
where views don't need to be shared between GUI sessions.
Issue #5340
over rsdp v1. The multiboot2 provided rsdp_v1 version may not contain the
xsdt pointer, but may have the very same acpi revision as the acpi rsdp v2
version of multiboot2.
Fixes#5332
.../repos/libports/src/test/pthread/main.cc:539:76: warning: ‘++’ expression of ‘volatile’-qualified type is deprecated [-Wvolatile]
.../repos/libports/src/test/pthread/main.cc:1104:32: warning: ‘test’ may be used uninitialized [-Wmaybe-uninitialized]
The gui_fb client may have installed a custom sync_sigh and mode_sigh.
Reset those signal handlers at the GUI server should the client
disappear. Otherwise, the GUI server (nitpicker) continues to attempt
transmitting sync signals to the no-longer existing component, spamming
the log with "Warning: invalid signal-context capability" messages.
Don't use client-provided view IDs as IDs for the wrapped nitpicker
views. There is no 1:1 relation of IDs and physical views. So if a wm
client re-uses an ID, the physical view is expected to stay in tact.
If the corresponding view object within the wm is not destroyed, however,
its ID remains allocated, which may then conflict the ID of a new view
if the ID is reused by the client. This scenario resulted in the
following error:
Error: Uncaught exception of type 'Genode::Id_space<Gui::View_ref>::Conflicting_id'
This patch handles the situation by keeping the allocator of physical
views (_real_view) decoupled from the client's ID allocator.
Issue #5242
This patch enables basic use cases of the POSIX 'alarm' function, which
schedules the delivery of a SIGALRM signal after a specified amount of
seconds.
Issue #5337
This update uses a -current (that will become 7.6 later this year)
snapshot from 2024-08-16 that includes fixes for MSI support on
AMD systems.
Fixes#5331.
This commit replaces the current vbox5 based USB HID raw test, which
runs a Genode guest to test USB passthrough with a USB human interface
device, with one using vbox6.
Fixes#5330.
This commit adds support for TIOCSETA and TIOCFLUSH in a dummy fashion
that is enough to allow vbox6's serial-port implementation to print
lines to the log.
Issue #5330.
This commit allows for suppressing failed extract operations by
setting the 'ignore_failures' attribute in the 'config' node.
It is intended for operating the component in batch-mode where
multiple archives need to be extracted but failing to extract
some of them can by ignored. The default value of this option
is 'false'.
It also adds the 'stop_on_failure' attribute that instructs
the component to stop processing any following archives after
it already has failed to do so. The default value of this
option is 'true' to preserve the current behavior.
Issue #5326.
Now, we schedule before unblocking the rx_task. This is done in order to
execute a potentially ready ksoftirqd before unblocking the rx_task,
which in turn may execute soft-interrupt handlers through bottom half
code leading to double lock attempts of the socket spinlock.
Issue #5264
Since the preset contains mesa_gpu-intel, it is specific to the pc
platform. Other platform-specific repos (such as allwinner) may contain
their own preset with the same name. To prevent that Sculpt images use
the wrong preset due to the particular order in the build.conf, we move
the preset into the pc repo.
Fixes#5322
Avoid double allocation of alternative stack. Genode's sigaltstack variant
allocates the stack with alloc_secondary_stack. Disable the warning of
sigaltstack by using explicitly the nullptr in ss_sp.
Issue #5305
This interface change gives GUI servers the freedom to allocate view
capabilities at the time of request instead of the creation time of the
view. This is useful because view capabilities are rarely needed.
Issue #5242
This patch moves the management of view IDs from the server to the
client side. The former 'create_view' and 'create_child_view'
operations do no longer return a view ID but take a view ID as
argument. While changing those operations, this patch takes the
opportunity to allow for initial view attributes. Combined, those
changes simplify the window manager while accommodating typical
client use cases with less code.
To ease the client-side ID management, the Gui::Connection hosts
a 'view_ids' ID space for optional use. E.g., the new 'Top_level_view'
class uses this ID space for ID allocation. This class accommodates the
most typical use case of opening a single window.
The 'alloc_view_id' RPC function is no longer needed.
Issue #5242
This patch reworks the view-ID handling within the nitpicker GUI server
and the window manager. The namespace of view handles are now represented
as an Id_space. In constrast to the former "handles", which could be
invalid, IDs cannot be semantically overloaded with anything other than
an actual view reference. There is no notion of an invalid handle.
IDs are like C++ references (which cannot be a nullptr).
This change requires the code to be more explicit. E.g., the stacking of
a few at the front-most position can no longer be expressed by passing
an invalid handle as neighbor.
Issue #5242
Express the allocation of a new view handle by a dedicated RPC function
instead of passing an invalid view handle to the existing 'view_handle'
function.
This eliminates the notion of invalid view handles at the GUI session
interface, clearing the way for managing view handles via an Id_space.
Issue #5242
This patch eliminates the use of invalid view handles as special
Session::Command arguments. The TO_FRONT and TO_BACK operations
interpreted as invalid neighbor as top-most or back-most position.
Those corner cases are now expressed via dedicated commands. The
new stacking commands are FRONT, BACK, FRONT_OF, and BEHIND_OF.
While changing the command interface, the patch removes the OP_
prefix from the opcode values.
Issue #5242
- Rename framebuffer_session to framebuffer and
input_session to input as those RPC interfaces are no longer
meant to be used as stand-alone sessions.
- Host Connection::input and Connection::framebuffer as public
members, thereby removing the use of pointers. This simplifies
the client-sized code. E.g., '_gui.input()->pending()' becomes
'_gui.input.pending()'.
Issue #5242
To maintain ease of use at the client side, the OUT_OF_RAM and
OUT_OF_CAPS results are handled at the 'Gui::Connection' now.
Gui::Connection does not inherit the Gui::Session interface any longer,
which allows for the use of different result types.
Issue #5242
Issue #5245
This patch replaces the optional parent argument of the create_view
RPC function by a dedicated create_child_view RPC function. This
is a preparatory step of removing the notion of an invalid handle
as a special case.
Issue #5242
The fd event handling uses the fd to directly access the array slot and
expects the fds to be contiguous and capped.
Since the returned fds from our libc were much larger than expected,
because the libc itself consumes multiple fds when managing sockets,
using the fd in this manner leads to memory corruption.
This commit limits the maxfds to 63 and always allocates 1024 slots
in the fd-array.
Fixes#5320.
Add extended FPU state detection and handling (via xsave and friends) to the
kernel, which has to store/load more FPU state (~512 -> 2k++) during context
switching of threads. Additional the referenced nova branch contains various
optimization during VM destruction and cross core IPC resource caching.
This FPU work is based upon upstream NOVA kernel and Hedron commits.
Issue #5314Fixes#3914
some more adjustments are needed for xsave support, but this port is scheduled
to be removed. Just disable xsave for the time being to make nightly test
happy.
Issue #5314
The commit that added firmware loading via the VFS (see #4861)
introduces a double-free bug where the memory that contains the
image is freed twice, once from the callback and once from the
work function.
As alle examined drivers call 'release_firmware' from the callback
function themselves, remove the erroneous 'kfree' call from the
work function.
Issue #5264.
The 'update_quality_interval' instructs the wifi driver to update
the approximated link quality to the currently connected AP every
30 seconds.
Issue #5262.
This commit introduces support for querying and updating the signal
quality of the established connection to the current accesspoint.
By setting the 'update_quality_interval' to a non-zero value specified
in seconds the 'state' report will be updated to incorporate the
current signal quality. It uses the same approximation as is already
in use by the scan results.
Fixes#5262.
Now, USB audio class devices become available in Sculpt, e.g., for vbox
passthrough, and are not automatically grabbed by the usb_hid class=3
policy. In the future, interface/endpoint level policies will enable
driving the HID interface only from usb_hid while a usb_audio driver
controls the rest of the device.
This commit adds the firmware image for the AX200 device as found
in the Tuxedo Pulse 15 Gen1, the 9560 as found in the Starlite and
the for devices found in the T430/T530.
Fixes#5282.
Turn some of the current assertions into warnings/error messages and
continue boot. Print the messages as soon as core_log is initialized,
so that on live/release systems (Sculpt OS) it may be inspected later on.
Related to issue #5307
The code to group together SMT threads of one CPU and to move P-Core to
the beginning of Genode's affinity-space, did not consider to run on
SOCs with only E-Core CPUs.
Re-structure the code to support e-Core only SOCs.
Additionally, provide a fallback mapping in case of CPU id reordering problems.
Track faulty re-mapping and delay the reporting until core_log is initialized,
so that the warnings is visible to consumers, e.g. on Sculpt OS.
Related to discussion of #5304Fixes#5307
Unmasking of a pending interrupt did not lead to immediate IRQ handler
execution in all cases.
This commit also addresses some style concerns risen during the issue
discussion.
- Replace multi-boolean IRQ state by state enum
- EOI and ACK should be same in DDE context
- Unify x86 and ARM irqchip.c
- Remove Pending_irq type
- Remove dde_irq_set_wake()
Fixes#5164
The use of the Linux-internal SLUB allocator is supported by lx_emul and
drivers may now decide between the Linux implementation or our emulation
of kmem_cache. Drivers for pc and virt already use SLUB, while other
drivers still use the emulation and may be adapted step-by-step incl.
the testing on the devices.
Fixes#5236
Allocate a Genode known stack via alloc_secondary_stack and register it
as alternative stack via Signal:use_alternative_stack().
The original semantic of Posix, where the caller may choose arbitary stack
pointers is currently not possible. Warn about the fact.
Issue #5305
The Fn key on keyboards should never be reported as real scancode event,
as it is just a hardware switch that changes the reported scancodes of
other keys. The behavior of Linux hid-apple.c is wrong as it on one hand
reports different scancodes for the same hard key depending on the Fn
state but sends the Fn press and release events too. Thus from now on,
we just drop KEY_FN events for all drivers as otherwise, scancodes
generated generated by Fn+key combinations would never be single-key
events on upper layers, for example KEY_FN + KEY_F12 on the Matias Apple
keyboard clone in the fixed issue.
Fixes#5288
By calling run_genode_until twice, we take into account that the boot
time on some boards might long than on others, while still verifying
that the second "set_rtc" is reported within about 1min (+10s).
Fixes#5306
Since page tables might need to be allocated during
insert_translation(), Out_of_ram or Out_of_caps exceptions might occur.
Entries that have already been added by insert_translation() must thus be
removed once one of those exceptions occurred.
Fixes#5254
Add TAR_OPT to global.mk that defaults to user and group 1, while
setting mtime to 0 for tar archives. This can be used in components to
produce consistent (reproducible) tar archives.
issue #5255
This patch replaces the former Child::Process and
Child::Process::Loaded_executable classes by static functions that
return failure conditions as return values.
Issue #5245
By using GCC's --debug-prefix-map argument, we can make sure that debug
archives always refer to source files at /depot. With this change, GDB
can be pointed to the correct source-file location by using the `set
substitute-path /depot /path/to/local/depot`.
Fixes#5260
This patch tightens the coupling of the 'Platform_thread' objects
with their corresponding 'Platform_pd' objects by specifying the
'Platform_pd' as constructor argument, keeping the relationship
as a reference (instead of a pointer), and constraining the
lifetime of 'Platform_pd' objects to the lifetime of the PD.
It thereby clears the way to simplify the thread creation since all
PD-related information (like quota budgets) are now known at the
construction time of the 'Platform_thread'.
The return value of 'Platform_thread::start' has been removed because it
is not evaluated by 'Cpu_thread_component'.
Related to #5256
Fix the wrong assumption about isochronous packets being always send
with maximum EP's packet size. Instead the isochronous cache now contains
a sizes array to deal with arbitrary packet sizes.
Fixgenodelabs/genode#5257
For consistency reasons, remove the cortex_a8, cortex_a9, and cortex_a15
spec directories. Such SPEC variables do not exist since a while.
Also rename remaining translation_table.h header to page_table.h to
stay consistent with the class names inside.
Fixgenodelabs/genode#5253
- Remove exceptions
- Use 'Attr' struct for attach arguments
- Let 'attach' return 'Range' instead of 'Local_addr'
- Renamed 'Region_map::State' to 'Region_map::Fault'
Issue #5245Fixes#5070
The 'Thread_creation_failed' error is now reflected as
'Thread::Start_result' return value. This change also removes the
use of 'Invalid_thread' within core as this exception is an alias
of Cpu_session::Thread_creation_failed.
Issue #5245
For each packet that got stuck with an ARP-cache miss, the router used to send
one ARP request and create one ARP waiter. However, in situations where many
packets target the same IP at one destination domain and during a short period
of time, this causes unnecessary session-quota consumption and network traffic.
This issue becomes especially pressing when taking malicious source peers,
absent destination peers, and packet batching into account.
Therefore, with this commit, the router can accumulate multiple source packets
with the same destination IP at one ARP waiter. This means, that only the first
packet with an ARP-cache for a certain IP sends an ARP request and creates an
ARP waiter. For situations where the ARP request is not answered, this
essentially rate-limits ARP requests for one IP at one destination domain
according to the lifetime of ARP waiters (default: 10s)
Ref #4534
The router used to send an ARP request for a packet before allocating the
corresponding ARP waiter. If the ARP waiter could not be allocated due to
resource exhaustion plus emergency free failed, the packet got dropped and the
router had produced unnecessary network traffic. The commit fixes this by
sending only after successful allocation.
Ref #4534
The previous default packet-batch count of 150 (<config
max_packets_per_signal>) was choosen with the only goal of preventing
starvation by huge amounts of packets from one session.
However, there is something else to keep in mind. A packet that is found to
require ARP sends an ARP request and becomes blocked after having consumed
resources. This means, that, in the worst case, the router used to send 150 ARP
requests and consume resources 150 times before making it even possible for the
outer world to react and cause resources to be freed.
With this additional scenario in mind, the default batch size should be
significantly lower.
Ref #4534
poll(2) needs to handle invalid file descriptors in the pollfd struct,
specifically -1 as it may be used to disable entries in the fds[] array.
Fix a possible nullptr dereference by checking the File_descriptor
pointer returned by find_by_libc_fd() for validity and skip processing
of any unresolved FDs, effectively implementing standard POSIX
semantics.
Fixes#5249
This patch removes the exception formerly thrown by 'Cpu_thread::state'
and turns the 'Thread_state' structure into a plain compound type w/o a
constructor.
Issue #5245Fixes#5250
With libxml2 >= 2.13, the `-path` argument can no longer be used for
setting search paths for xsd files. Instead, we use an XML catalog to
replace genode:// URIs with absolute paths.
Fixes#5248
This patch replaces exceptions of the PD session RPC interface with
result types.
The change of the quota-transfer RPC functions required the adaptation
of base/quota_transfer.h and base/child.h.
The 'alloc_signal_source' method has been renamed to 'signal_source'
to avoid an exceedingly long name of the corresponding result type.
The Pd_session::map function takes a 'Virt_range' instead of basic-type
arguments.
The 'Signal_source_capability' alias for 'Capability<Signal_source>' has
been removed.
Issue #5245
So far, this test used dynamic_rom for the re-configuration of the nic router
and tested for the expected ping results by inspecting the log with the run
tool. However, this approach had two issues:
* Timing differs significantly on different targets and so the dynamic_rom had
the difficult task of compensating with heuristics without bloating the test
duration too much.
* In case of a failing test, it was difficult to determine the cause as the
test kept running and produced output for quite some time and there was also
no specific error message but only a generic timeout.
These two issues are now fixed by introducing a test component that listens to
the ping-result report and manages the nic router configuration. The new
component exits early on failure and provides information on the error
circumstances. Furthermore, the component advances to the next test step only
after having seen the expected result of the active test step and thereby
removes the need for heuristics about target timing.
Fixes#5192
This patch updates the signal API to avoid raw pointers, and
replaces the Context_already_in_use and Context_not_associated
exceptions by diagnostic messages.
Fixes#5247
The router used to ignore the value of the <report quota=".."/> attribute when
it came to determining whether an interface's report is empty or not.
Therefore, merely configuring <report quota="yes"/> didn't cause interfaces
(and their quota) to show up in the report. Instead, interface quota was
reported as side effect of <report stats="yes"/>. The commit fixes this
inconsistency with the README.
The only object that is dynamically allocated by a network interface and that
was not equipped with a self-destruct timeout was the ARP waiter. This commit
closes this gap by adding a timeout to each ARP waiter that is set to 10
seconds by default but can be configured via the new <config> attribute
'arp_request_timeout_sec'.
Ref #4729
RFCs recommend to keep TCP connections for a certain time even after they
finished a close handshake, AFAIK, in order to be able to recognize astray
packets when they arrive later. This seems overambitious especially when in
the context of the router where session quota is pretty limited. Therefore,
this commit drops this final timeout and drops closed connections immediately.
Ref #4729
The previous value of 60 seconds was never observed in real-time scenarios and
UDP, for instance always used a timeout of 30 seconds without causing issues.
Note that this applies only to TCP connections in a state other than
ESTABLISHED, i.e., while it is still safe to early-drop the connection.
Ref #4729
The TCP connection state "ESTABLISHED" (in the router "OPEN") is a privileged
one for peers because it lasts very long without any peer interaction (in the
NIC router it's only 10 minutes, but RFCs recommend not less than 2 hours and
4 minutes). Furthermore, TCP connections in this state are normally not
available for early-drop on resource exhaustion. This means that this state
binds resources to a connection potentially for a long time without the option
of regaining them under stress. Therefore, this state should be entered with
care.
Up to now, the router marked a TCP connection with this state as soon as it had
seen one matching packet in both directions, which is rather quick. However,
implementing a very precise tracking of the exact TCP states of both peers and
only marking the connection "ESTABLISHED" when both peers are "ESTABLISHED" is
a difficult task with lots of corner cases.
That said, this commit implements a compromise. The router now has two flags
for each peer of a TCP connection - FIN sent and FIN acked - and sets them
according to the observed TCP flags. The "ESTABLISHED" state is entered only
when FIN acked is set for both peers (without having observed an RST or FIN
flag meanwhile).
Ref #4729
The Reference and Const_reference utility were introduced in order to express
that something is a reference (no null value) but can be changed dynamically
(not possible with built-in C++ references). However, the idea of preventing
every possibility for null pointer faults, with which the router was built
initially, has not prevailed and using pointers instead of the utility saves
logic and makes the code more readable to other C++ developers.
Ref #4729
The deinitialization method of Domain used to rely on Domain::with_dhcp_server
in order to dissolve and destroy a present DHCP server. However, this method
skipped calling its functor argument also when there was a DHCP server but an
invalid one. This commt replaces the with_dhcp_server with a pointer null-check
in order to fix the leak.
Ref #4729
Re-implements an emergency freeing of resources on exhaustion of session quota.
In contrast to the past one, the new algorithm is executed directly where the
exhaustion occurs. Instead of interupting the packet handling and restart it
from the beginning after the freeing action, packet handling is now continued
at the point of exhaustion (if enough resources could be freed). Furthermore,
the new algorithm frees only 100 objects (instead of 1024) at a max as we found
this to better match real-life observations. And finally, the router now drops
ICMP first, then UDP, then TCP - as this better reflects priorities - and
refrains from dropping TCP connections in the ESTABLISHED state. If the router
cannot free a sufficient amount of resources, the packet that caused the
exhaustion is dropped with a warning (verbose_packet_drop="yes").
Ref #4729
Remove the use of C++ exception as much as possible from the router as C++
exception handling can be resource intensive and can make code hard to
understand.
This also removes the garbage collection that the router used to do when a
session ran out of quota. This is motivated by the fact that the garbage
collection was rather simple and removed connection states regardless of their
current state, thereby causing broken connections. The change is part of this
commit as the approach to integrating garbage collection relied strongly on
exception handling.
The user story behind removing garbage collection: The router emergency-dropped
an established TCP connection (with NAPT) and on the next matching packet
re-created it with a different NAPT port, thereby breaking the connection. With
this commit, existing connections are prioritized over new ones during resource
exhaustion and the packets that attempt to create a new connection in such a
state are dropped with a warning in the log (verbose_packet_drop="yes").
Note that the state resolves itself with time as existing connections time out
or are closed by peers.
Ref #4729