Release notes for version 23.02
parent e265cf6d49
commit 89e198204f
doc/release_notes/23-02.txt (normal file, 886 lines added)
@ -0,0 +1,886 @@

===============================================
Release notes for the Genode OS Framework 23.02
===============================================

Genode Labs

With Genode's February release, almost everything goes
[https://genode.org/about/road-map - according to plan].
As envisioned on our road map, it features the first ready-to-install
system image of Sculpt OS for the PinePhone, which is not merely a re-targeted
version of the PC version but comes with a novel user interface, a new
mechanism for rapidly switching between different application scenarios, and
system-update functionality.
Section [First system image of mobile Sculpt OS (PinePhone)] gives an
overview and further links about running Genode on your PinePhone.

While enabling substantial application workloads on devices as constrained as
the PinePhone, we engaged in holistic performance optimizations, ranging from
kernel scheduling (Section [Base-HW microkernel]), over the framework's VFS
infrastructure (Section [VFS optimization and simplification]), to the
interfacing of GPU drivers (Section [GPU performance optimizations]).

For stationary ARM-based platforms like the MNT Reform laptop,
interactive graphical virtual machines have become available now, which
brings us close to mirroring the experience of the PC version of Sculpt OS on
such devices (Section [Interactive graphical VMs on ARM]). This development
is accompanied by several device-driver improvements for NXP's i.MX family.

For embedded devices based on Xilinx Zynq, the release introduces custom
FPGA fabric for implementing DMA protection that is normally not covered by
Zynq SoCs. This line of work - as outlined in
Section [Custom IP block for DMA protection on AMD/Xilinx Zynq] - exemplifies
how well Genode and reconfigurable hardware can go hand in hand.

PC platforms got their share of attention as well, benefiting from the
new distinction between Intel's P and E cores and from the basic support for
suspend/resume on both NOVA and Genode's custom base-hw microkernel.

When it comes to running applications on top of Genode, the release brings
good news as well. Our custom Goa tool for streamlining
application-development workflows received the ability to largely automate
the porting and packaging of 3rd-party libraries using CMake
(Section [Build system and tools]).

First system image of mobile Sculpt OS (PinePhone)
##################################################

Just in time for our
[https://fosdem.org/2023/schedule/event/genode_on_the_pinephone/ - public presentation]
of Genode on the PinePhone at FOSDEM in the beginning of February,
we published a first ready-to-use system image:

:First system image of mobile Sculpt OS:

  [https://genodians.org/nfeske/2023-02-01-mobile-sculpt]

It features a
[https://genodians.org/nfeske/2023-01-05-mobile-user-interface - custom user interface],
voice calls and mobile-data connectivity, on-target software installation and
system update, device controls (battery, brightness, volume, mic, reset,
shutdown), and a variety of installable software. Among the installable
applications are the Chromium-based Morph web browser, an OpenGL demo
using the GPU, tests for the camera and microphone, as well as a light-weight
Unix-like system shell.

The underpinnings of the Genode system image for the PinePhone are nearly
identical to Sculpt OS on the PC. However, besides the new user interface
specifically designed for the touch screen of the phone, two noteworthy
differences set it apart from the regular version of Sculpt OS.

[image pinephone_presets]

First, the phone variant allows the user to rapidly switch between different
runtime configurations, called presets. This way, the limited resources of the
phone can be accounted for and fully leveraged for each preset individually,
while making the system extremely versatile. The loading of a preset can be
imagined as the boot into a separate operating system, but it takes only a
fraction of a second. The structure of the running system is made fully
transparent to the user by the component graph known from Sculpt OS.

[image pinephone_scenarios]
  The variety of presets includes the Morph browser, GLMark2, a system shell,
  a simple oscilloscope, and a camera test.

Second, the system is equipped with an on-target system update mechanism that
allows the user to install new versions of the system image when they become
available. System updates are secured by cryptographic signatures. The
mechanism does not only allow for updating the system but also for the
rollback to any previously downloaded version. This way, the user can try
out a new version while being able to fall back to the previous one in the
case of a regression. This reinforces the end user's ultimate control.

[image pinephone_update]

Interactive graphical VMs on ARM
################################

The virtual-machine monitor (VMM) using hardware-assisted virtualization on
ARM started as a case study eight years ago for Samsung's Exynos 5250 SoC.
Originally, it supported the virtualization of CPU, timer, interrupt
controller, and a UART device only. Since then, it has received several
extensions, such as support for 64-bit ARMv8 systems and VirtIO devices for
network, console, and block access. With release 22.11, the VMM's I/O device
access, RAM consumption, and CPU count became configurable.

With the current release, we further enhance the VMM for ARM devices to
provide all the means necessary to become a useful virtualization solution for
interactive scenarios.

[image mnt_interactive_debian_vm]
  Sculpt OS running Debian in a virtual machine on the MNT Reform laptop

Two additional VirtIO device models are available now: A GPU model and one for
input. Both models are mapped to Genode's GUI service under the hood. One can
extend the configuration of the VMM accordingly:

! <config ...>
!   <virtio_device name="fb0"    type="gpu"/>
!   <virtio_device name="event0" type="input"/>
!   ...
! </config>

For now, only one GPU and one input device can be declared. Both devices get
mapped to the very same GUI service, according to the service routing of the
VMM.

Caution: the GPU and input models are still in an experimental state, and
there are known corner cases, e.g., when the graphical window size of the VMM
gets changed dynamically.

Formerly, the VMM always expected an initial RAM file system to be provided as
ROM dataspace, which got loaded together with the Linux kernel into the VM's
memory. Now, it is possible to omit the "initrd_rom" configuration option.
If omitted, no initrd is provided to the Linux guest.

Custom IP block for DMA protection on AMD/Xilinx Zynq
#####################################################

As a continuation of the hardware-software co-design efforts presented in the
[https://genode.org/documentation/release-notes/22.11#Hardware-software_co-design_with_Genode_on_Xilinx_Zynq - previous release],
we turned towards enabling bulk-data transfer between the Zynq's CPU and its
FPGA. In a first step, we built a custom hardware design that implements a DMA
loopback device based on Xilinx' AXI DMA IP. Since we were particularly
interested in testing out the Zynq's accelerator coherency port (ACP), we
implemented two loopback devices: one attached to the ACP and one to the
high-performance (HP) AXI port of the Zynq. In order to test the design in
Genode, we added a port of Xilinx' embeddedsw repository that hosts standalone
driver code for the Xilinx IP cores. Based on this port, we implemented the
xilinx_axidma library as a Genode wrapper in order to simplify development of
custom drivers using Xilinx' AXI DMA IP. A newly written test component takes
throughput measurements for varying transfer sizes. A more detailed account of
this story is published in an
[https://www.hackster.io/johannes-schlatow/using-axi-dma-on-genode-6482d2 - article on hackster.io].

Knowing that DMA bypasses any memory protection on the Zynq as it does not
feature an IOMMU, we further invested development effort in implementing a
custom IP block, called DMA Guard, for protecting against unintended DMA
transfers from/to the FPGA. The DMA Guard is configured with a limited set of
address ranges for which DMA transfers will be granted. Any out-of-range
transfer will be denied. The configuration of the DMA Guard is conducted by
the Zynq's platform driver based on the allocated DMA buffers. For the time
being, we applied several changes to the platform driver. These modifications
are currently hosted in the genode-zynq repository but are going to find their
way into the generic platform driver for the next release.

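The filtering rule of the DMA Guard can be pictured as a plain range check.
The following Python sketch is a behavioural model for illustration only. It
is neither the hardware logic nor the platform driver's code. The class name,
the method names, and the limit of four ranges are assumptions that do not
stem from the release.

```python
from dataclasses import dataclass, field


@dataclass
class DmaGuardModel:
    """Toy model: a bounded list of permitted [base, base + size) ranges."""
    max_ranges: int = 4                        # assumed limit, chosen for the sketch
    ranges: list = field(default_factory=list)

    def allow(self, base: int, size: int) -> None:
        """Register a DMA buffer, as a platform driver might do on allocation."""
        if size <= 0:
            raise ValueError("size must be positive")
        if len(self.ranges) >= self.max_ranges:
            raise RuntimeError("no free range slot")
        self.ranges.append((base, base + size))

    def permits(self, addr: int, length: int) -> bool:
        """Grant a transfer only if it lies entirely within one permitted range."""
        if length <= 0:
            return False
        end = addr + length
        return any(lo <= addr and end <= hi for lo, hi in self.ranges)


guard = DmaGuardModel()
guard.allow(0x1000_0000, 0x1000)               # one 4 KiB DMA buffer

inside   = guard.permits(0x1000_0100, 0x100)   # fully inside the buffer
straddle = guard.permits(0x1000_0F00, 0x200)   # crosses the upper bound
outside  = guard.permits(0x2000_0000, 0x10)    # unrelated address
```

In this model, a transfer that merely overlaps a permitted range is denied
just like one that misses it completely, which matches the "any out-of-range
transfer will be denied" rule stated above.
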
If you are interested in more details about the DMA Guard's development, keep
watching our [https://www.hackster.io/genode - hackster.io channel] where we
will put out a new article concerning this matter very soon.

Base framework and OS-level infrastructure
##########################################

VFS optimization and simplification
===================================

For regular applications executed on Genode, input and output involve the
virtual file system (VFS). In contrast to traditional monolithic operating
systems (which host the VFS in the kernel) or traditional microkernel-based
operating systems (which host the VFS in a dedicated server component),
Genode's VFS has the form of a library, giving each component an individual
virtual file system. The feature set of the VFS library is not fixed
but extensible by so-called VFS plugins that come in the form of optional
shared libraries. These plugins can implement new file-system types, but also
expose other I/O facilities as pseudo files. For example, TCP/IP stacks like
lwIP and lxIP (IP stack ported from Linux) have the form of VFS plugins.
The extensibility of the VFS gives us extreme flexibility without compromising
Genode's simplicity.

On the other hand, the pervasiveness of the VFS - being embedded in Genode's C
runtime - puts it on the performance-critical path whenever application I/O is
involved. The ever-growing sophistication of application workloads like
running a Chromium-based web browser on the PinePhone puts merciless pressure
on the VFS, which motivated the following I/O-throughput optimizations.

Even though the VFS and various VFS plugins work asynchronously, the batching
of I/O operations is not consistently effective across different kernels. It
particularly depends on the kernel's scheduling decision upon the delivery of
asynchronous notifications. Kernels that eagerly switch to the signal receiver
may thereby prevent the batching of consecutive write operations. We could
observe variances of more than an order of magnitude of TCP throughput,
depending on the used kernel. In the worst case, when executing a kernel that
eagerly schedules the recipient of each asynchronous notification, the
application performance is largely dominated by context-switching costs.

Based on these observations, we concluded that the influence of the kernel's
scheduler should better be mitigated by scheduling asynchronous notifications
less eagerly at the application level. By waking up a remote peer only once
the application stalls for I/O, all scheduled operations appear at the
remote side as one batch.

The implementation of this idea required a slight redesign of the VFS,
replacing the former implicit wakeup of remote peers by explicit wakeup
signalling. The wakeup signalling is deferred until the VFS user settles
down. For libc-based applications, for example, this is the case when the
libc goes idle, waiting for external I/O. In the case of a busy writer to a
non-blocking file descriptor or socket (e.g., lighttpd), the remote peers are
woken up once a write operation yields an out-count of 0. The deferring of
wakeup signals is accommodated by the new 'Remote_io' mechanism
(_vfs/remote_io.h_) that is designated to be used by all VFS plugins that
interact with asynchronous Genode services for I/O.

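The effect of deferring wakeups can be illustrated with a small model. The
following Python sketch is not the 'Remote_io' API. All class and function
names are hypothetical. It merely contrasts one wakeup per operation with one
wakeup at the point where the user goes idle, under the assumption that the
peer handles everything pending whenever it is woken up.

```python
class Peer:
    """Stand-in for a remote service that handles all pending requests per wakeup."""
    def __init__(self):
        self.pending = []
        self.batches = []           # number of requests seen at each wakeup

    def submit(self, request):
        self.pending.append(request)

    def wakeup(self):
        if self.pending:
            self.batches.append(len(self.pending))
            self.pending.clear()


class DeferredWakeup:
    """Remembers which peers need a wakeup until the user goes idle."""
    def __init__(self):
        self._needed = []

    def schedule(self, peer):
        if peer not in self._needed:
            self._needed.append(peer)

    def user_goes_idle(self):
        for peer in self._needed:
            peer.wakeup()
        self._needed.clear()


def write_eagerly(peer, requests):
    for request in requests:
        peer.submit(request)
        peer.wakeup()               # implicit wakeup after each operation


def write_deferred(peer, requests, deferred):
    for request in requests:
        peer.submit(request)
        deferred.schedule(peer)     # only note that a wakeup is due
    deferred.user_goes_idle()       # e.g., the libc blocks waiting for I/O


eager_peer, lazy_peer = Peer(), Peer()
write_eagerly(eager_peer, range(8))
write_deferred(lazy_peer, range(8), DeferredWakeup())
```

With eight writes, the eagerly woken peer sees eight batches of one request
each, whereas the peer with deferred wakeup sees a single batch of eight. In
a real system, each avoided wakeup potentially saves a context switch.
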
Combined with additional adjustments of I/O buffer sizes - like the request
queue of the file-system session, the TCP send buffer of the lwIP stack, or
the packet buffer of the NIC session - the VFS optimization almost eliminated
the variance of the I/O throughput among the different kernels and generally
improved the performance. On kernels that suffered most from the eager context
switching, netperf
[https://github.com/genodelabs/genode/issues/4697#issuecomment-1342542399 - shows a 10x]
improvement. But even on kernels with more balanced scheduling, the effect is
impressive.

While we were at it, and since this structural change affected all VFS plugins
and users anyway, we took the opportunity to simplify and modernize other
aspects of the VFS-related code as well.

In particular, the new interface 'Vfs::Env::User' replaces the former
'Vfs::Io_response_handler'. In contrast to the 'Io_response_handler', which
had to be called on a 'Vfs_handle', the new interface does not require any
specific handle. It is merely meant to prompt the VFS user (like the libc) to
re-attempt stalled I/O operations but it does not provide any immediate hint
about which of the handles have become ready for reading/writing. This
decoupling led to welcome simplifications of asynchronously working VFS
plugins.

Furthermore, we removed the 'file_size' type from read/write interfaces. The
former C-style pair of (pointer, size) arguments to those operations has been
replaced by 'Byte_range_ptr' and 'Const_byte_range_ptr' argument types, which
make the code safer and easier to follow. Also, the VFS utilities offered by
_os/vfs.h_ benefit from this safety improvement.

GPU performance optimizations
=============================

Session interface changes
-------------------------

The GPU session interface was originally developed along the first version of
our GPU multiplexer for Intel devices. For this reason, the interface
contained Intel-specific nomenclature, like GTT and PPGTT for memory map and
unmap operations. With the introduction of new GPU drivers with different
architectures (e.g., Mali and Vivante), the Intel specifics had to go.
With the current Genode release, we streamlined the map and unmap
functions to be semantically more correct on all supported hardware. There are
two map functions now: _map_cpu_, which maps GPU graphics memory to be
accessed by the CPU, and _map_gpu_, which establishes a mapping of
graphics memory within the GPU.

Additionally, we removed the concept of buffers (as used by Mesa and Linux
drivers) to manage graphics memory and replaced it by the notion of video
memory (VRAM) where VRAM stands for the actual graphics memory used by a GPU -
may it be dedicated on-card memory or system RAM. The change makes it possible
to separate the graphics-memory management from the buffer management as
required by the Mesa library.


Intel graphics
--------------

When porting 3D applications using Mesa's OpenGL, we found that Mesa allocates
and frees a lot of small GPU buffer objects (data in GPU memory) during
operation. This is suboptimal for component-based systems because the Mesa
library has to perform an RPC to the GPU multiplexer for each buffer
allocation and for each buffer mapping. As mentioned above, we changed the
session semantics from buffer object to video memory and implemented this
feature within Intel's GPU multiplexer, which now only hands out VRAM. This
made it possible to move the buffer handling completely to the Mesa client
side (libdrm). Libdrm now allocates large chunks of video memory (i.e., 16MB)
and hands out memory for buffer objects from this pool. This brings two
advantages: First, the client-side VRAM pool acts as cache, which reduces the
number of RPCs required for memory management significantly. Second, because
of the larger VRAM allocations (compared to many 4K or 16K allocations before),
fewer capabilities for the actual dataspaces that back the memory are
required. Measurements showed that almost an order of magnitude of
capabilities can be saved at the Mesa (client) side this way.

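The client-side pooling idea can be sketched in a few lines. The following
Python model is not the libdrm back end. The class, the bump-allocation
strategy, and the function standing in for the session RPC are assumptions
for illustration, and the sketch deliberately omits freeing and reuse of
buffers.

```python
CHUNK_SIZE = 16 * 1024 * 1024        # chunk size as named in the text


class VramPool:
    """Client-side bump allocator on top of large VRAM chunks."""
    def __init__(self, alloc_chunk):
        self._alloc_chunk = alloc_chunk   # callable standing in for the session RPC
        self._chunk = None
        self._used = CHUNK_SIZE           # forces a chunk allocation on first use

    def alloc_buffer(self, size, align=4096):
        """Return (chunk id, offset) of a buffer carved out of the current chunk."""
        if size > CHUNK_SIZE:
            raise ValueError("sketch handles only buffers up to one chunk")
        offset = -(-self._used // align) * align       # round up to alignment
        if offset + size > CHUNK_SIZE:
            self._chunk = self._alloc_chunk(CHUNK_SIZE)
            offset = 0
        self._used = offset + size
        return (self._chunk, offset)


rpc_calls = []

def fake_vram_rpc(size):
    """Records each call; a real client would request VRAM from the GPU multiplexer."""
    rpc_calls.append(size)
    return len(rpc_calls)                 # chunk identifier


pool = VramPool(fake_vram_rpc)
buffers = [pool.alloc_buffer(16 * 1024) for _ in range(2000)]
```

In this model, 2000 buffers of 16 KiB each require only two chunk
allocations instead of 2000 individual ones, which hints at why both the
number of RPCs and the number of dataspace capabilities drop.
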
Mali graphics
-------------

The 22.08 release introduced a
[https://genode.org/documentation/release-notes/22.08#GPU_and_Mesa_driver_for_Mali-400 - driver]
for the GPU found in the PinePhone. Since it was merely a rapid prototype, it
was limited to one client at a time, and was normally started and stopped
together with its client. With this release, we remedied these limitations,
enabled support for multiple concurrent clients, and revised our libdrm
backend for Mesa's Lima driver.

We have not yet explored applying the same VRAM optimizations that are employed
by our Intel graphics stack. One VRAM allocation still corresponds to one
buffer object.

More flexible ACPI-event handling
=================================

The _acpica_ component uses the Intel ACPICA library to parse and interpret
ACPI tables and AML code. One designated feature is the monitoring of several
ACPI event sources including optional reporting of information about state
changes. The supported event sources are:

* Lid, which can be open or closed
* Smart battery (SB), information about battery parameters (e.g., capacity)
  and charging/discharging status
* ACPI fixed events, e.g., power buttons
* AC adapters, which reflect power cable plug/unplug
* Embedded controller (EC), events like Fn-* keys, Lid, AC, SB changes
* Vendor-specific hardware events, e.g., Fujitsu FUJ02E3 key events

Acpica optionally reports information about state changes. These reports can
be monitored by other components as ROMs. The following configuration
illustrates the feature:

!<start name="report_rom">
!  <resource name="RAM" quantum="2M"/>
!  <provides> <service name="ROM" /> <service name="Report" /> </provides>
!  <config>
!    <policy label="acpi_event -> acpi_lid"     report="acpica -> acpi_lid"/>
!    <policy label="acpi_event -> acpi_battery" report="acpica -> acpi_battery"/>
!    <policy label="acpi_event -> acpi_fixed"   report="acpica -> acpi_fixed"/>
!    <policy label="acpi_event -> acpi_ac"      report="acpica -> acpi_ac"/>
!    <policy label="acpi_event -> acpi_ec"      report="acpica -> acpi_ec"/>
!    <policy label="acpi_event -> acpi_hid"     report="acpica -> acpi_hid"/>
!  </config>
!</start>
!
!<start name="acpica">
!  <resource name="RAM" quantum="8M"/>
!  <config report="yes"/>
!  <route>
!    <service name="Report"> <child name="report_rom"/> </service>
!    ...
!  </route>
!</start>

One such ACPI monitor component is _acpi_event_, which maps ACPI events to key
events of a requested Event session based on its configuration. This way, ACPI
state changes can be processed like ordinary key press-release events via, for
example, the _event_filter_. The following configuration illustrates how to
map the ACPI event types to key events:

!<start name="acpi_event">
!  <resource name="RAM" quantum="1M"/>
!  <config>
!    <map acpi="lid"   value="CLOSED"    to_key="KEY_SLEEP"/>
!    <map acpi="fixed" value="0"         to_key="KEY_POWER"/>
!    <map acpi="ac"    value="ONLINE"    to_key="KEY_WAKEUP"/>
!    <map acpi="ec"    value="20"        to_key="KEY_BRIGHTNESSUP"/>
!    <map acpi="ec"    value="21"        to_key="KEY_BRIGHTNESSDOWN"/>
!    <map acpi="hid"   value="0x4000000" to_key="KEY_FN_F4"/>
!  </config>
!  <route>
!    <service name="ROM" label="acpi_lid">     <child name="report_rom"/> </service>
!    <service name="ROM" label="acpi_battery"> <child name="report_rom"/> </service>
!    <service name="ROM" label="acpi_fixed">   <child name="report_rom"/> </service>
!    <service name="ROM" label="acpi_ac">      <child name="report_rom"/> </service>
!    <service name="ROM" label="acpi_ec">      <child name="report_rom"/> </service>
!    <service name="ROM" label="acpi_hid">     <child name="report_rom"/> </service>
!    <service name="Event"> <child name="event_filter" label="acpi"/> </service>
!    ...
!  </route>
!</start>

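The mapping rule expressed by the '<map>' nodes boils down to a table lookup.
The following Python sketch illustrates this lookup on a shortened copy of
the configuration. It is not the implementation of the _acpi_event_
component. The function names are hypothetical, and the exact string
comparison of values is a simplifying assumption.

```python
import xml.etree.ElementTree as ET

CONFIG = """
<config>
  <map acpi="lid"   value="CLOSED" to_key="KEY_SLEEP"/>
  <map acpi="fixed" value="0"      to_key="KEY_POWER"/>
  <map acpi="ec"    value="20"     to_key="KEY_BRIGHTNESSUP"/>
  <map acpi="ec"    value="21"     to_key="KEY_BRIGHTNESSDOWN"/>
</config>
"""


def build_table(config_xml):
    """Return {(event type, value): key name} for all <map> nodes."""
    table = {}
    for node in ET.fromstring(config_xml).iter("map"):
        table[(node.get("acpi"), node.get("value"))] = node.get("to_key")
    return table


def key_for(table, acpi_type, value):
    """Look up the key for a reported state change; None means no mapping."""
    return table.get((acpi_type, str(value)))


table    = build_table(CONFIG)
lid_key  = key_for(table, "lid", "CLOSED")
ec_key   = key_for(table, "ec", 21)
unmapped = key_for(table, "lid", "OPEN")
```

State changes without a matching '<map>' node, like the opening of the lid in
this example, yield no key event in the model.
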
In the current release, we replaced the limited list of supported key names by
a general mechanism, which supports the use of all key names declared in
_repos/os/include/input/keycodes.h_.


Base API changes
================

As part of our continuous effort to streamline and simplify the framework's
base API as much as possible, the current release removes the interfaces
_base/blocking.h_, _base/debug.h_, and _base/lock_guard.h_ as those headers
contained parts of the API that have become obsolete by now. As a further
minor change, the 'abs' function of _util/misc_math.h_ got removed.

The string utilities _util/string.h_ received the new 'Const_byte_range_ptr'
type complementing the existing 'Byte_range_ptr'. Both types are designated
for passing arguments that refer to a byte buffer, e.g., the source buffer of
a write operation.

On-target system-update and rollback mechanism
##############################################

For the mobile version of Sculpt OS as covered in
Section [First system image of mobile Sculpt OS (PinePhone)],
we envisioned easy-to-use system updates that would enable us to quickly
iterate based on the feedback of early field testers.

This topic confronted us with a variety of concerns, to name just a few:
conventions for booting that would not require changes in the future,
equipping (system) images with self-reflecting version information, tools for
generating and publishing digitally-signed images, on-target discovery of new
image versions, secure downloading and cryptographic checking of new images,
directing the machine's boot loader to use the new version, and possibly
reverting to an earlier version.

Fortunately, most of these concerns have a lot in common with the problems
we had to address for Genode's
[https://genode.org/documentation/release-notes/18.02#On-target_package_installation_and_deployment - package management].
For example, the off-target and on-target tooling for digital signatures,
the notion of a depot, and the concept of federated software providers
(depot users) are established and time-tested by now.


Self-reflecting version information
-----------------------------------

To allow a running Sculpt system to know its own version, the sculpt.run
script generates an artificial boot module named "build_info", which can be
evaluated at runtime by the sculpt-manager component.

! <build_info genode_version="22.11-260-g89be3404c0d"
!             date="2023-01-19" depot_user="nfeske" board="pinephone">

Formalism for generating images and image metadata
--------------------------------------------------

To enable the Sculpt system to easily detect new versions, system images must
be accompanied by metadata discoverable at a known location. This information
is provided by a so-called image-index file located at
_depot/<user>/image/index_. The image index of a depot user lists the
available images in XML form, e.g.,

! <index>
!   <image os="sculpt" board="pinephone" version="2023-01-19">
!     <info text="initial version"/>
!   </image>
!   ...
! </index>

The 'os', 'board', and 'version' attributes can be used to infer the file name
of the corresponding image file. The '<info>' nodes contain a summary of
changes as information for the end user.

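How a consumer of the image index might derive image names can be sketched
as follows. The Python code below is an illustration and not part of the
Sculpt tooling. The joining of the three attributes with dashes follows the
pattern of the depot path _sculpt-pinephone-2023-02-02_ that appears in the
installation example of this document. The second index entry, the function
names, and the plain string comparison of date-shaped versions are
assumptions.

```python
import xml.etree.ElementTree as ET

INDEX = """
<index>
  <image os="sculpt" board="pinephone" version="2023-01-19">
    <info text="initial version"/>
  </image>
  <image os="sculpt" board="pinephone" version="2023-02-02">
    <info text="hypothetical follow-up entry"/>
  </image>
</index>
"""


def image_name(node):
    """Join os, board, and version with dashes (assumed naming pattern)."""
    return "-".join(node.get(attr) for attr in ("os", "board", "version"))


def newer_images(index_xml, board, current_version):
    """Names of images for 'board' whose version sorts after 'current_version'."""
    result = []
    for node in ET.fromstring(index_xml).iter("image"):
        if node.get("board") == board and node.get("version") > current_version:
            result.append(image_name(node))
    return result


all_names = [image_name(n) for n in ET.fromstring(INDEX).iter("image")]
updates   = newer_images(INDEX, "pinephone", "2023-01-19")
```

The comparison relies on the versions having the form year-month-day, for
which lexicographic order coincides with chronological order.
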
The new _gems/run/sculpt_image.run_ script provides assistance with generating
appropriately named images, placing them into the depot, and presenting a
template for the manually curated image index.


Signing and publishing
----------------------

For signing and publishing system images and image indices, we extended the
existing _tool/depot/publish_ tool. To publish a new version of an image
index:

! ./tool/depot/publish <depot-user>/image/index

Each system image comes in two forms, a bootable disk image and an archive of
the boot directory. The bootable disk image can be used to install a new
system from scratch by copying the image directly to a block device. It
contains raw block data. The archive of the boot directory contains the
content needed for an on-target system update to this version. Within the
depot, this archive has the form of a directory - named after the image - that
contains the designated content of the boot directory on target. Depending on
the board, it may contain only a single file loaded by the boot loader (e.g.,
uImage), or several boot modules, or even the boot-loader configuration. The
following command publishes both forms:

! ./tool/depot/publish <depot-user>/image/<image-name>

This results in the following - accompanied by their respective .sig
files - in the public directory:

! <depot-user>/image/<image-name>.img.xz  (disk image)
! <depot-user>/image/<image-name>.tar.xz  (boot archive)
! <depot-user>/image/<image-name>.zip     (disk image)

The .zip file contains the .img file. It is provided for users who download
the image on a system with no support for .xz.

On-target image discovery, download, and verification
-----------------------------------------------------

To enable a running Sculpt system to fetch image index files and images, the
existing depot-download component accepts the following two new download
types:

! <image_index path="<user>/image/index"/>
! <image       path="<user>/image/<name>"/>

Internally, the depot-download subsystem employs the depot-query component to
determine the missing depot content. This component accepts the following two
new queries:

! <images      user="..."/>
! <image_index user="..."/>

If present in the query, depot_query generates reports labeled as "images" and
"image_index" respectively. These reports are picked up by the depot-download
component to track the completion of each job. The reported information is
also used by the system updater to get hold of the images that are ready to
install.

On-target image installation and rollback
-----------------------------------------

Once downloaded into the local depot of a Sculpt system, the content of the
boot directory for a given image version is readily available, e.g.,

! depot/nfeske/image/sculpt-pinephone-2023-02-02/uImage

The installation comes down to copying this content to the _/boot/_ directory.
On the next reboot, the new image is executed.

When subsequently downloading new image versions, the old versions stay
available in the depot as sibling directories. This allows for an easy
rollback by copying the boot content of an old version to the _/boot/_
directory.

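The copy-based installation and rollback can be modelled with ordinary file
operations. The following Python sketch works on a scratch directory and is
meant as an illustration only. On the target, the copying is performed by
Sculpt's own components, not by a script like this. The function name and the
fake image content are assumptions, and the sketch does not remove stale
files that exist only in the previously installed version.

```python
import shutil
import tempfile
from pathlib import Path


def install_image(depot, user, image, boot):
    """Copy the boot content of one depot image into the boot directory."""
    src = Path(depot) / user / "image" / image
    if not src.is_dir():
        raise FileNotFoundError(f"image {image} not present in depot")
    Path(boot).mkdir(parents=True, exist_ok=True)
    for entry in src.iterdir():
        if entry.is_dir():
            shutil.copytree(entry, Path(boot) / entry.name, dirs_exist_ok=True)
        else:
            shutil.copy2(entry, Path(boot) / entry.name)


# Demonstration in a scratch directory with two fake image versions
root  = Path(tempfile.mkdtemp())
depot = root / "depot"
boot  = root / "boot"
for version in ("2023-01-19", "2023-02-02"):
    image_dir = depot / "nfeske" / "image" / f"sculpt-pinephone-{version}"
    image_dir.mkdir(parents=True)
    (image_dir / "uImage").write_text(f"kernel image {version}")

install_image(depot, "nfeske", "sculpt-pinephone-2023-02-02", boot)
after_update = (boot / "uImage").read_text()

install_image(depot, "nfeske", "sculpt-pinephone-2023-01-19", boot)  # rollback
after_rollback = (boot / "uImage").read_text()
```

Because old versions remain in the depot as sibling directories, the
rollback is the very same operation as the update, just with a different
source directory.
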
Device drivers
##############

NXP i.MX Ethernet & USB
=======================

The Ethernet drivers for i.MX53, i.MX6, and i.MX7 got updated to use a more
recent Linux kernel version (5.11). They are now aligned with the
source-code base originally ported for the i.MX8 SoC.

Our recent approach to porting Linux device drivers tries to preserve the
original semantics, which makes it necessary to provide the correct clock
rates to the driver. Therefore, specific platform drivers for i.MX6 and i.MX7
were created that enable the network-related clocks and export their rate
values. The i.MX53-related platform driver got extended to support these
clocks.

The USB host-controller driver for the i.MX 8MQ EVK is now able to drive the
USB-C connector of this board too.


Realtek Wifi
============

As a welcome side effect of switching to the new DDE-Linux approach,
enabling other drivers that are part of the same subsystem has become less
involved. In the past, we mostly focused on getting wireless devices supported
by the iwlwifi driver to work as those are the devices predominantly found in
commodity laptops. That being said, every now and then, one comes across a
different vendor. Especially with the shifting focus on ARM-based systems,
covering those as well became necessary.

As a first experiment, we enabled the rtlwifi driver that provides support
for Realtek-based wireless devices. Due to lacking access to other hardware,
the driver has so far been tested only with a specific RTL8188EE-based device
(10ec:8179 rev 01). Of course, some trade-offs were made as power management
is currently not available. But getting it to work, nevertheless, took barely
half a day of work, which is promising.

Platforms
|
||||
#########
|
||||
|
||||
Base-HW microkernel
|
||||
===================
|
||||
|
||||
Cache-maintenance optimization
|
||||
------------------------------
|
||||
|
||||
On ARM systems, the memory view on instructions and data of the CPUs, as well
|
||||
as between CPUs and other devices is not necessarily consistent. When dealing
|
||||
with DMA transfers of devices, developers of related drivers need to ensure
|
||||
that corresponding cache lines are cleaned before a DMA transfer gets
|
||||
acknowledged. When dealing with just-in-time compilation, where instructions
|
||||
are generated on demand, the data and instruction caches have to be synchronized
|
||||
too.
|
||||
|
||||
Until now, the base-API functions for such cache-maintenance operations were
|
||||
mapped to kernel system calls specific to base-hw. Only the kernel was allowed
|
||||
to execute cache-maintenance instructions. On ARMv8, however, it is
|
||||
possible to allow unprivileged components to execute most of these
|
||||
instructions.
|
||||
|
||||
With this release, we have implemented the cache-maintenance functions outside
|
||||
the kernel on ARMv8 where possible. Thereby, several device drivers with a lot
|
||||
of DMA transactions, e.g. the GPU driver, benefit from this optimization
|
||||
enormously. The JavaScript engine used in the Morph and Falkon browsers
|
||||
profits as well.
|
||||
|
||||
|
||||
ACPI suspend & resume
|
||||
---------------------
|
||||
|
||||
In the previous release, we started to support the low-level
|
||||
[https://genode.org/documentation/release-notes/22.11#Low-level_mechanism_for_suspend_resume_on_PC_platforms - ACPI suspend and resume]
|
||||
mechanism with Genode for the NOVA kernel. With the current release, we added
|
||||
the required low-level support to Genode's base-hw kernel for 64-bit x86
|
||||
platforms. Similar to the base-nova version, on base-hw the
|
||||
'Pd::managing_system' RPC function of Genode's core roottask is used to
|
||||
transfer the required ACPI values representing the S3 sleep state to the
|
||||
kernel. The kernel then takes care to halt all CPUs and flush its state to
|
||||
memory, before finally suspending the PC using the ACPI mechanism. On resume,
|
||||
the kernel re-initializes necessary hardware used by the kernel, e.g., all
|
||||
CPUs, interrupt controller, timer device, and serial device. One can
|
||||
test-drive the new feature using the _run/acpi_suspend_ scenario introduced by
|
||||
the previous release.
|
||||
|
||||
|
||||
Scheduling improvements for interactive workloads
|
||||
-------------------------------------------------
|
||||
|
||||
As Genode conquers the PinePhone, the base-hw kernel, for the first time, has
|
||||
to perform real-life multimedia on a daily basis given a resource-limited
|
||||
mobile target. One particularly important and ambitious use case has become
|
||||
video conferencing in the Morph browser: a combination of an already demanding
|
||||
browser engine with an application that not only streams video and audio in
|
||||
both directions over network but also handles video and audio I/O at the
|
||||
device, and all that fluently and at the same time.
|
||||
|
||||
A lot of thinking went into how to optimize this scenario on each level of
|
||||
abstraction and one rather low-level lever was the scheduling scheme of the
|
||||
base-hw kernel. The base-hw scheduling scheme consists of a combination of
|
||||
absolute priority bands with execution-time quotas that prevent higher
|
||||
prioritized subjects from starving lower ones. There is the notion of a super
|
||||
period and each subject owns only a fraction of that super period as quota
|
||||
together with its priority. Once a subject has depleted its quota, it can't
|
||||
use its priority until the end of the current super period where its quota
|
||||
will be re-filled. However, during that time, the subject is not blocked. It
|
||||
can become active whenever there is no subject with priority and remaining
|
||||
quota present.
|
||||
|
||||
So, this "zero" band below all the priority bands temporarily accommodates all
|
||||
subjects that have a priority but that are out of quota. It contains, however,
|
||||
also subjects that have no priority in general. These might be tasks like a GCC
|
||||
compilation or a ray tracer, whereas prioritized tasks would be user-input
|
||||
handlers or the display driver. Now, one difficult problem that arises with
|
||||
this scheduling scheme is that system integration has to decide how much quota
|
||||
is required by a prioritized task. The perfect value can't be determined as it
|
||||
depends on many factors including the target platform. Therefore, we have to
|
||||
consider that an important task like the audio driver in the video-conference
|
||||
scenario runs out of quota shortly before finishing its work.
|
||||
|
||||
This is already bad as is, because the audio driver now has to share the CPU with
|
||||
many unimportant tasks until the next super period. But it became even worse
|
||||
because, in the past implementation, subjects always entered the zero band at
|
||||
the tail position. It meant that, e.g., the remaining audio handling had to
|
||||
wait at least until all the unprioritized tasks (e.g., long-running computations)
|
||||
had used up their zero-band time slice. In order to mitigate this situation, we
|
||||
decided that prioritized tasks, when depleting their quota, become the head of the
|
||||
zero band, so they will be scheduled first whenever the higher bands become
|
||||
idle.
|
||||
|
||||
This change relaxes the consequences of quota-depletion events for
|
||||
time-critical tasks in a typical system with many unprioritized tasks.
|
||||
At the same time, it should not have a significant impact on the overall
|
||||
schedule because depletion events are rare and zero-band time slices are short.
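
The effect of the changed queueing position can be illustrated with a small
toy model. The following Python sketch is not the base-hw implementation. It
models only the zero band as a round-robin queue and ignores priorities,
quotas, and super periods entirely. The class name, the method names, and the
task names are made up for illustration.

```python
from collections import deque


class ZeroBand:
    """Toy model of the round-robin band below all priority bands."""

    def __init__(self):
        self._queue = deque()

    def add_unprioritized(self, name):
        # Subjects without any priority queue up at the tail.
        self._queue.append(name)

    def add_depleted(self, name, head_insertion=True):
        # A prioritized subject that ran out of quota joins the zero band.
        # Former behavior: tail position. New behavior: head position.
        if head_insertion:
            self._queue.appendleft(name)
        else:
            self._queue.append(name)

    def schedule_order(self):
        # Order in which the subjects receive their next zero-band time slice.
        return list(self._queue)


def order_after_depletion(head_insertion):
    band = ZeroBand()
    for task in ("gcc", "ray_tracer"):
        band.add_unprioritized(task)
    band.add_depleted("audio_drv", head_insertion)
    return band.schedule_order()


print(order_after_depletion(head_insertion=False))
# ['gcc', 'ray_tracer', 'audio_drv']
print(order_after_depletion(head_insertion=True))
# ['audio_drv', 'gcc', 'ray_tracer']
```

With tail insertion, the depleted audio driver waits for every unprioritized
task to consume a time slice. With head insertion, it is the first zero-band
subject to run once the higher bands become idle.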
|
||||
|
||||
|
||||
NOVA microhypervisor
|
||||
====================
|
||||
|
||||
ACPI suspend & resume
|
||||
---------------------
|
||||
|
||||
As an extension to the principal
|
||||
[https://genode.org/documentation/release-notes/22.11#Low-level_mechanism_for_suspend_resume_on_PC_platforms - ACPI suspend and resume]
|
||||
support introduced with the Genode 22.11 release, the NOVA kernel now also
|
||||
supports the re-enablement of the IOMMU after ACPI resume. The IOMMU as a hardware
|
||||
feature has been supported by Genode since
|
||||
[https://genode.org/documentation/release-notes/13.02#DMA_protection_via_IOMMU - release 13.02]
|
||||
and extended in
|
||||
[https://genode.org/documentation/release-notes/20.11#NOVA_microhypervisor - release 20.11],
|
||||
which sandboxed device hardware and (malicious/faulty) drivers to avoid
|
||||
arbitrary DMA transactions.
|
||||
|
||||
Intel P/E cores
|
||||
---------------
|
||||
|
||||
Starting with [https://en.wikipedia.org/wiki/Intel_Core#12th_generation - Intel CPU generation 12],
|
||||
Intel introduced CPUs with heterogeneous cores, similar to
|
||||
[https://en.wikipedia.org/wiki/ARM_big.LITTLE - ARM's big.LITTLE] concept.
|
||||
The new CPUs have a number of so-called P-cores (performance) and E-cores
|
||||
(efficient), which differ in their performance and power characteristics.
|
||||
The CPU cores
|
||||
([https://en.wikipedia.org/wiki/Alder_Lake#CPUID_incoherence - should be])
|
||||
instruction-compatible and are reported as identical via x86's CPUID
|
||||
instruction nowadays. However, an operating system such as Genode must be able
|
||||
to differentiate the cores in order to make informed decisions about the
|
||||
placement and scheduling of Genode components.
|
||||
|
||||
With the current release, we added support to the NOVA kernel to propagate the
|
||||
information about P/E cores to Genode's 'core' roottask. In Genode's core,
|
||||
this information is used to group the CPU cores into Genode's
|
||||
[https://genode.org/documentation/release-notes/13.08#Management_of_CPU_affinities - affinity space].
|
||||
With
|
||||
[https://genode.org/documentation/release-notes/20.05#NOVA_microhypervisor - release 20.05],
|
||||
we introduced the grouping of hyperthreads on the y-axis, which we keep in
|
||||
case the P-cores have the feature enabled. Following the P-cores and
|
||||
hyperthreads, all remaining E-cores are placed in the affinity space.
|
||||
|
||||
The following examples showcase the grouping in the affinity space along the x and y axes:
|
||||
|
||||
Core i7 1270P - 4 P-cores (hyperthreading enabled) and 8 E-cores:
|
||||
|
||||
! x-axis 1 2 3 4 5 6 7 8
|
||||
! ----------------------------------
|
||||
! y-axis 1 | P\ P\ P\ P\ E E E E
|
||||
! 2 | P/ P/ P/ P/ E E E E
|
||||
!
|
||||
! hyperthreads \ / of same core
|
||||
|
||||
Core i7 1280P - 6 P-cores (hyperthreading enabled) and 8 E-cores:
|
||||
|
||||
! x-axis 1 2 3 4 5 6 7 8 9 10
|
||||
! -----------------------------------------
|
||||
! y-axis 1 | P\ P\ P\ P\ P\ P\ E E E E
|
||||
! 2 | P/ P/ P/ P/ P/ P/ E E E E
|
||||
!
|
||||
! hyperthreads \ / of same core
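
The grouping shown in the two examples can be reproduced with a few lines of
Python. This is merely a sketch derived from the examples, not code taken from
Genode's core. It assumes that hyperthreading is enabled on the P-cores, that
the affinity space is two rows high, and that the E-cores fill the remaining
columns top to bottom. The function name is hypothetical. Positions are
counted from 0 here, whereas the diagrams count from 1.

```python
def affinity_space(p_cores, e_cores):
    """Return a list of ((x, y), cpu_id, cpu_type) tuples.

    Assumes hyperthreading is enabled on all P-cores, so that the
    affinity space is two rows high.
    """
    height = 2
    mapping = []
    cpu_id = 0

    # Each P-core occupies one column, its two hyperthreads share that column.
    for x in range(p_cores):
        for y in range(height):
            mapping.append(((x, y), cpu_id, "P"))
            cpu_id += 1

    # The E-cores follow, filling the remaining columns top to bottom.
    for i in range(e_cores):
        x = p_cores + i // height
        y = i % height
        mapping.append(((x, y), cpu_id, "E"))
        cpu_id += 1

    return mapping


# Core i7 1270P: 4 P-cores and 8 E-cores
for (x, y), cpu_id, cpu_type in affinity_space(4, 8):
    print(f"({x}x{y}) -> {cpu_id:2} {cpu_type}")
```

For the Core i7 1270P, the sketch yields 16 entries spread over 8 columns. For
the Core i7 1280P (6 P-cores, 8 E-cores), it yields 20 entries spread over 10
columns, matching the width of the second diagram.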
|
||||
|
||||
The information about the P/E cores is visible in the kernel and Genode's
|
||||
log output and is reported in the 'platform_info' ROM, e.g.
|
||||
|
||||
! kernel:
|
||||
!
|
||||
! [ 0] CORE:00:00:0 6:9a:3:7 [415] P 12th Gen Intel(R) Core(TM) i7-1270P
|
||||
! ...
|
||||
! [15] CORE:00:17:0 6:9a:3:7 [415] E 12th Gen Intel(R) Core(TM) i7-1270P
|
||||
! ...
|
||||
|
||||
! Genode's core:
|
||||
!
|
||||
! mapping: affinity space -> kernel cpu id - package:core:thread
|
||||
! remap (0x0) -> 0 - 0: 0:0 P boot cpu
|
||||
! remap (0x1) -> 1 - 0: 0:1 P
|
||||
! remap (1x0) -> 2 - 0: 4:0 P
|
||||
! remap (1x1) -> 3 - 0: 4:1 P
|
||||
! remap (2x0) -> 4 - 0: 8:0 P
|
||||
! remap (2x1) -> 5 - 0: 8:1 P
|
||||
! remap (3x0) -> 6 - 0:12:0 P
|
||||
! remap (3x1) -> 7 - 0:12:1 P
|
||||
! remap (4x0) -> 8 - 0:16:0 E
|
||||
! remap (4x1) -> 9 - 0:17:0 E
|
||||
! remap (5x0) -> 10 - 0:18:0 E
|
||||
! remap (5x1) -> 11 - 0:19:0 E
|
||||
! remap (6x0) -> 12 - 0:20:0 E
|
||||
! remap (6x1) -> 13 - 0:21:0 E
|
||||
! remap (7x0) -> 14 - 0:22:0 E
|
||||
! remap (7x1) -> 15 - 0:23:0 E
|
||||
! ...
|
||||
|
||||
! platform_info ROM:
|
||||
!
|
||||
! ...
|
||||
! <cpus>
|
||||
! <cpu xpos="0" ypos="0" cpu_type="P" .../>
|
||||
! ...
|
||||
! <cpu xpos="5" ypos="0" cpu_type="E" .../>
|
||||
! ...
|
||||
! </cpus>
|
||||
! ...
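
A component that wants to take the core type into account could evaluate the
'cpu_type' attribute along the following lines. The Python sketch below only
illustrates the structure of the information. The sample XML is hypothetical
and reduced to the '<cpus>' node with the three attributes shown above, and
the function name is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpt, limited to the attributes shown in the release notes
SAMPLE = """
<cpus>
  <cpu xpos="0" ypos="0" cpu_type="P"/>
  <cpu xpos="0" ypos="1" cpu_type="P"/>
  <cpu xpos="5" ypos="0" cpu_type="E"/>
</cpus>
"""


def cores_by_type(xml_text):
    """Group the affinity-space positions of all <cpu> nodes by cpu_type."""
    root = ET.fromstring(xml_text.strip())
    result = {}
    # iter() also finds <cpu> nodes nested deeper inside a larger document
    for cpu in root.iter("cpu"):
        position = (int(cpu.get("xpos")), int(cpu.get("ypos")))
        result.setdefault(cpu.get("cpu_type"), []).append(position)
    return result


print(cores_by_type(SAMPLE))
# {'P': [(0, 0), (0, 1)], 'E': [(5, 0)]}
```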
|
||||
|
||||
|
||||
Build system and tools
|
||||
######################
|
||||
|
||||
Building and packaging CMake-based shared libraries (via Goa)
|
||||
=============================================================
|
||||
|
||||
The [https://github.com/nfeske/goa - Goa] tool streamlines the work of
|
||||
cross-developing, testing, and publishing Genode application software
|
||||
using commodity build tools like CMake. The tool is particularly suited for
|
||||
porting existing 3rd-party software to Sculpt OS.
|
||||
|
||||
Until recently, Goa was solely focused on applications whereas the porting of
|
||||
3rd-party libraries required the traditional approach of hand-crafting
|
||||
build rules for Genode's build system. This limitation of Goa got
|
||||
lifted now.
|
||||
|
||||
In the new version, a Goa project can host an _api_ file indicating that
|
||||
the project is a library project. The file contains the list of headers that
|
||||
comprise the library's public interface. The build artifact of a library
|
||||
is declared in the _artifacts_ file and is expected to have the form
|
||||
_<library-name>.lib.so_. The ABI symbols of such a library must be listed
|
||||
in the file _symbols/<library-name>_. With these bits of information supplied
|
||||
to Goa, the tool is able to build and publish both the library and the API as
|
||||
depot archives - ready to use by Genode applications linking to the library.
|
||||
The way all those little pieces work together is best illustrated by the
|
||||
accompanying
|
||||
[https://github.com/nfeske/goa/tree/master/examples/cmake_library - example].
|
||||
For further details, please consult Goa's built-in documentation via 'goa help'
|
||||
(overview of Goa's subcommands and files) and 'goa help api' (specifics of
|
||||
the _api_ declaration file).
|
||||
|
||||
When porting a library to Genode, one manual step remains, which is the
|
||||
declaration of the ABI symbols exported by the library. The new subcommand
|
||||
'goa extract-abi-symbols' eases this manual step. It automatically generates a
|
||||
template for the _symbols/<library-name>_ file from the library's built shared
|
||||
object. Note, however, that the generated symbols file is expected to be
|
||||
manually reviewed and tidied up, e.g., by removing library-internal symbols.
|
||||
|
||||
_Thanks to Pirmin Duss for having contributed this welcome new feature, which_
|
||||
_makes Goa much more versatile!_
|
||||
|
||||
|
||||
New tool for querying metadata of ports
|
||||
=======================================
|
||||
|
||||
The integration of third-party software into Genode is implemented via _ports_
|
||||
that specify how to retrieve, verify, and patch the source code in preparation
|
||||
for use with our build system. Ports are managed by tools residing in the
|
||||
_tool/ports_ directory. For example, _tool/ports/prepare_port_ is used to
|
||||
execute all required preparation steps.
|
||||
|
||||
Currently, the base Genode sources support 90 ports (you may try
|
||||
_tool/ports/list_ yourself) and, thus, it's not trivial to keep track of all
|
||||
the ports in the repo directories. Therefore, we introduce the
|
||||
_tool/ports/metadata_ tool to extract information about license, upstream
|
||||
version, and source URLs of individual ports. The tool can be used as follows:
|
||||
|
||||
!./tool/ports/metadata virtualbox6
|
||||
!
|
||||
!PORT: virtualbox6
|
||||
!LICENSE: GPLv2
|
||||
!VERSION: 6.1.26
|
||||
!SOURCE: http://download.virtualbox.org/virtualbox/6.1.26/VirtualBox-6.1.26.tar.bz2 (virtualbox)
|
||||
!SOURCE: http://download.virtualbox.org/virtualbox/6.1.26/VirtualBoxSDK-6.1.26-145957.zip (virtualbox_sdk)
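
Because the output consists of simple 'KEY: value' lines, it lends itself to
further processing by scripts. The following Python sketch shows one possible
way to turn such output into a dictionary. It is not part of Genode's tooling.
The function name is hypothetical, and the parser assumes the line format of
the example above.

```python
SAMPLE_OUTPUT = """\
PORT: virtualbox6
LICENSE: GPLv2
VERSION: 6.1.26
SOURCE: http://download.virtualbox.org/virtualbox/6.1.26/VirtualBox-6.1.26.tar.bz2 (virtualbox)
SOURCE: http://download.virtualbox.org/virtualbox/6.1.26/VirtualBoxSDK-6.1.26-145957.zip (virtualbox_sdk)
"""


def parse_port_metadata(text):
    """Parse 'KEY: value' lines, collecting all SOURCE lines in a list."""
    info = {"SOURCE": []}
    for line in text.splitlines():
        # Split at the first colon only, URLs contain further colons
        key, separator, value = line.partition(":")
        if not separator:
            continue  # skip empty lines and anything without a key
        key, value = key.strip(), value.strip()
        if key == "SOURCE":
            # A SOURCE value has the form '<url> (<name>)'
            url, _, name = value.rpartition(" ")
            info["SOURCE"].append((url, name.strip("()")))
        else:
            info[key] = value
    return info


metadata = parse_port_metadata(SAMPLE_OUTPUT)
print(metadata["LICENSE"], metadata["VERSION"])
for url, name in metadata["SOURCE"]:
    print(name, url)
```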
|
||||
|
||||
|
||||
Harmonization of the boot concepts across ARM and PC platforms
|
||||
==============================================================
|
||||
|
||||
To make the system-update functionality covered in
|
||||
Section [On-target system-update and rollback mechanism] equally usable across
|
||||
PC and ARM platforms, the conventions of booting the platforms had to be
|
||||
unified.
|
||||
|
||||
Traditionally, a bootable disk image for the PC contains a _boot/_ directory.
|
||||
E.g., when using NOVA it contains the GRUB boot-loader config + the hypervisor
|
||||
+ the bender pre-boot loader + the banner image + the Genode system image.
|
||||
This structure corresponds 1:1 to the _boot/_ directory as found on the 3rd
|
||||
partition of the Sculpt system, which is very nice. A manual system update of
|
||||
Sculpt comes down to replacing these files. However, on ARM platforms, SD-card
|
||||
images used to host a _uImage_ file and a U-Boot environment configuration
|
||||
file in the root directory. This discrepancy complicates
|
||||
both the build-time tooling and the on-target handling of system updates.
|
||||
|
||||
The current release unifies the boot convention by hosting a _boot/_ directory
|
||||
on all platforms and reinforces the consistent naming of files. On ARM, the
|
||||
_uImage_ and _uboot.env_ files now always reside under _boot/_. Thanks to this
|
||||
uniform convention, Genode's new system-update mechanism can now equally
|
||||
expect that a system update corresponds to the mere replacement of the content
|
||||
of the _boot/_ directory.
|
||||
|
||||
|
||||
Minor run-tool changes
|
||||
======================
|
||||
|
||||
The functionality of the _image/uboot_fit_ plugin has been integrated into the
|
||||
regular _image/uboot_ plugin as both plugins were quite similar.
|
||||
FIT images can now be produced by adding the run option '--image-uboot-fit'.
|
||||
|