This commit introduces the Kernel::Main class that replaces the former way of
initializing the kernel (former 'kernel_init' function) and calling the C++
kernel entry handler (former 'kernel' function). These two are now
'Main::initialize_and_handle_kernel_entry' and 'Main::handle_kernel_entry'.
Also reading the execution time of the idle threads was already moved to
'Main'. The one static Main instance is meant to successivly replace all the
global static objects of the base-hw kernel with data members of the Main
instance making the data model of the kernel much more comprehensible. The
instance and most of its interface are hidden in kernel/main.cc. There are only
rare cases where parts of the Main interface must be accessible from the
outside. This should be done in the most specific way possible (see main.h)
and, if possible, without handing out references to Main data members or the
Main instance itself.
Ref #4217
Normally, the board header can be found for each supported board under
'src/core/board/<BOARD>/board.h'. This was not the case for the board 'pc'
that was located under 'src/core/spec/x86_64/board.h'. The commit fixes this.
Ref #4217
The class name Core_thread in Kernel for the object of the first thread of
core is too generic as there can be an arbitrary number of threads in core
besides this one. Furthermore, creating a core thread has its own syscall
'new_core_thread' that isn't related in any way to Core_thread. Therefore
this commit introduces the more specific name Core_main_thread as replacement
for Core_thread.
Ref #4217
The function was only still used for reading the execution time of idle threads
of CPUs. Certainly, it is technically fine and more performant to read these
values directly from the kernel objects without doing a syscall. However,
calling cpu_pool() for it provides read and write access to a lot more than
only the execution time values. The interface via which Core directly reads
state of the kernel should be as narrow and specific as possible.
Perspectively, we want to get rid of the cpu_pool() accessor anyway. Therefore
this commit introduces Kernel::read_idle_thread_execution_time(cpu_idx) as
replacement. The function is implemented in kernel code and called by Core in
platform.cc.
Ref #4217
Apparently, there is no need for exposing the data members of Trace_source, so,
we sould better make them private before someone gets the impression that they
are meant to be accessed directly.
Ref #4217
Core used to read the kernel-reserved IRQs from the timer objects in the
kernel's CPU objects and the PIC class (inter-processor IRQ). Besides not
being "good style" to access a kernel object in Core, this becomes a problem
when trying to prevent CPU pool from being accessed via global functions.
As a solution, this commit extends the boot info to also carry an array of all
kernel-reserved IRQs.
Ref #4217
For the constructor of Kernel_object<T> there are two variants. One for the
case that it is called from Core where the kernel object (type T) must be
created via a syscall and one when it is called from within the kernel and the
kernel object can be created directly. Selecting one of these variants was done
using a bool argument to the constructor. However, this implies that the
constructor of Kernel_object<T> and that of T have the same signature in the
variadic arguments, even in the syscall case, although technically it would
then not be necessary.
This becomes a problem as soon as kernel objects created by Core shall receive
additional arguments from the kernel, for instance a reference to the global
CPU pool, and therefore stands in the way when wanting to get rid of global
statics in the kernel. Therefore, this commit introduces two constructors that
are selected through enum arguments:
! Kernel_object(Called_from_kernel, ...);
! Kernel_object(Called_from_core, ...);
Ref #4217
The various mapping methods are modelled after the requirements of
the Intel GPUs or rather the Mesa driver back end.
With upcoming support for other driver back ends, we need to
sequeeze their requirements in as well. For now hijack 'map_buffer'
to provide for specifying the kind of attributes the client needs.
For now all buffers mapped in the GGTT for Intel GPUs are treated
as RW.
Issue #4265.
This call allows for checking if the given execution buffer has been
completed and complements the completion signal. Initially the GPU
multiplexer always sent such a signal when the currently scheduled
execution buffer has been completed. During enablement of the 'iris'
driver it became necessary to properly check of sequence number.
In case of the Intel GPU multiplexer the sequence numbers are
continous, which prompted the greater-than-or-equal check in the
DRM back end. By hidding this implementation detail behind the
interface, GPU drivers are free to deal with sequence numbers any
way they like and allows for polling in the client, where the
completion signal is now more of a progress signal.
Issue #4265.
The current info implementation (as RPC) is limited in a few ways:
* The amount of data that may be transferred is constrained by the
underlying base platform
* Most information never changes during run time but is copied
nonetheless
* The information differs depending on the used GPU device and
in its current implementation only contains Intel GPU specific
details
With this commit the 'info' RPC call is replaced with the
'info_dataspace' call that transfers the capability for the dataspace
containing the information only. This is complemented by a client
local 'attached_info' call that allows for getting typed access to
the information. The layout of the information is moved to its own
and GPU-specific header file, e.g., 'gpu/info_intel.h'
Issue #4265.
Rather than using the dataspace capability directly, let the client
choose its own local identifier that is linked to the underlying
capability.
Fixes#4265.
Right now the warning about failure to forward packet from driver to
uplink RX connection reads:
"exception while trying to forward packet from driverto Uplink
connection TX"
Add missing space between "driver" and "to".
Issue #4264
32KB is a rather small value. The driver can cope with it now, but
it does not perform as well as it should. This visible especially
in scenarions like nic_router_flood where we still often hit
synchronous wait path. Bump the size to 256kB.
Issue #4264
The problem can be seen when running nic_router_flood scenarion on arm
qemu_virt boards. With the amount of data this scenario tries to send
the driver quickly complains it has failed to push data into TX VirtIO
queue. After this warning message is printed nothing really happens and
after a while the test scenario fails.
The fact that we can't write all available data to the device is not
unexpected. VirtIO queue size is slected at initialization time and we
don't change it during driver lifetime. It can be tweaked via driver
config, but this does not change the fact that we'll always be able to
produce more data packets than we have free space in the VirtIO queue.
IMO the expected behavior of the driver in such case should be to:
1. Notify the device there is data to process.
2. Wait for the device to process at least part of it.
3. Retry sending queued packets.
One could expect returning Transmit_result::RETRY from _drv_transmit_pkt
would produce such result. Unfortunately it seems that Uplink_client_base
treats RETRY return value as indication of link being down. It'll retry
sending the packet only after the device notifies it the link is once
again up. This is the reason why nothing happens when running
nic_router_flood on top of virtio_nic driver. The link never goes down
in this case so once we fill the TX VirtIO queue and tell the base class
to retry the send, we'll be stuck waiting for link up change event
which will never arrive.
To fix this problem, when sending a packet to the device fails, do a
synchrnonus TX VirtIO queue flush (tell device there is data to process
and wait until its done with it).
With this fix in place nic_router_flood test scenario passes on both arm
qemu_virt boards.
Issue #4264
The contents of those descriptor rings can be modified by the device.
Mark them as volatile so the compiler does not make any assumptions
about them.
Issue #4264
The former encoding was UTF-8, which works quite well if LC_CTYPE is
ensured to be an UTF-8 codeset (e.g., en_US.UTF-8 or C.UTF-8 . But, if
LC_CTYPE is set to C or latin1 for example, the Tcl regex library enters
an infinite loop because of unexpected characters used as markers
n the strings (e.g., SECTION SIGN U+00A7).
Therefore, the extract tool was converted to latin1 with the following
commands and now works for LC_CTYPE C and UTF-8 codesets.
iconv -f utf-8 -t latin1 tool/dts/extract > /tmp/e
cp /tmp/e tool/dts/extract
This commit contains features and buf fixes:
* Catch errors during resource allocation
* Because Mesa tries to allocate fence (hardware) registers for each
batch buffer execution, do not allocate new fences for buffer objects
that are already fenced
* Add support for global hardware status page. Each context additionally
has a per-process hardware status page, which we used to set the
global hardware status page during Vgpu switch. This was obviously
wrong. There is only one global hardware status page (set once during
initialization) and a distinct per-process page for contexts.
* Write the sequence number of the currently executing batch buffer to
dword 52 of the per-process hardware status page. We use the pipe line
command with QW_WRITE (quad word write), GLOBAL_GTT_IVB disabled
(address space is per-process address space), and STORE_DATA_INDEX
enabled (write goes to offset of hardware status page). This command
used to write to the scratch page. But Linux now uses the first
reserved word of the per-process hardware status page.
* Add Gen9+ WaEnableGapsTsvCreditFix workaround. This sets the "GAPS TSV
Credit fix Enable" bit of the Arbiter control register (GARBCNTLREG)
as described by the documentation this bit should be set by the BIOS
but is not on most Gen9/9.5 platforms. Not setting this bit leads to
random GPU hangs.
* Increase the context size from 20 to 22 pages for Gen9. On Gen8 the
hardware context is 20 pages (1 hardware status page + 19 ring context
register pages). On Gen9 the size of the ring context registers has
increased by two pages to 21 pages or 81.3125 KBytes as the IGD
documentation states.
* The logical ring size in the ring buffer control of the execlist
context has to be programmed with number of pages - 1. So 0 is 1 page.
We programmed the actual number of pages before, leading to ring
buffer execution of NOOPs if page behind our ring buffer was empty or
GPU hangs if there was data on the page.
issue #4260
* Wait for for completion before return from 'execbuffer2'. This makes
buffer execution synchronous.
* Because the Iris driver manages the virtual address space of the GPU
and creates one GEM context for each batch buffer we have to map/unmap
all buffer objects before and after batch buffer execution.
issue #4260
The lx_emul_virt_to_pages implementation initialized the page ref
counter only for the first page, leaving the remaining elements in
uninitialized state. This, in turn, rendered the Linux page_pool (as
used by the emac network driver) ineffective, ultimately leading the a
memory leak. The fix changes the call of 'init_page_count' to take the
loop variable as argument.
Issue #4225
Increased number of trace subjects since the test sporadically fails on
some platforms.
Also added a sanity check to print an error message in case we run into
the same issue again.
Fixesgenodelabs/genode#4261
This patch adds a switch to internal poll function in libssh that
allows to force this function to immediately return without actually
polling for data and in consequence processing this data. This switch
is used to avoid calling callback functions when flushing output
streams which caused locks due to recursive access to internal
ssh_terminal sessions registry.
Issue #4258
This patch allows to replace sftp packet read and write with
completely asynchronous versions needed to properly hook in existing
ssh_terminal implementation.
Issue #4258
Using 'alignas' in declarations might cause GCC to request for an
implementation of 'operator delete(void*, unsigned long, std::align_val_t)'
although it might actually never be called. This commit adds a dummy
implementation to 'cxx/new_delete.cc' that does nothing more than printing an
error to the log that a proper implementation is missing. This approach is
coherent with our treatment of other global delete operators.
Ref #4217