Since the HW-kern-caps commit, there was a bug in the Platform_thread
constructor. When called for a user thread, the constructor stated 0
as CPU quota at the Kernel_object instead of its quota input-paramater.
Fixes#1620
- free up kernel memory of empty slabs (if already one empty slab is in
place)
- free up more page table entries
- handle CPUs with invariant TSCs gracefully
Genode/Nova running on CPUs without the invariant TSC feature may seem
to 'hang'. The referenced commit of the nova branch fixes the issue
for some older Intel CPUs.
Fixes#1615
Adjust bomb to specify the various hard-coded parameters and set up bomb.run
this way that it manages at our test machine to succeed in the given time.
Issue #1615
Bomb and any server may generate references to capabilities exceeding 256 -
use a 16bit counter until the cap handling in Genode gets unified.
Additionally try to print a warning, instead of dying, if we get cap reference
count under or overflow.
Issue #1615
Do not use slabs for allocations above 64KB, this seems to lead to memory
corruptions and the error described in issue #1613 under certain circumstances.
fixes#1613
Add a test where a locked pointer shall be taken during object destruction.
Moreover, extend the run-script so it runs on different platforms with
"real" timers.
SDL uses the Audio_out session in streaming fashion. For this reason
the audio might be played with delay of at most the queue size. To
mitigate the effect we synchronize the tail pointer to the current play
pointer when the PlayAudio() function is called by SDL for the first
time.
Fixes#1612.
Init used to specify the unique child name as session label when
requesting the binary image of a dynamically linked child. The actual
module name was propagated as "filename" session argument. Since we want
to move towards the sole use of the session label, which can be taken
into account for the session routing, the module name should always be
the last part of a ROM session label.
This patch changes the window manager, the decorator, and the
floating window layouter to propagate the usage of an alpha channel from
the client application to the decorator. This way, the decorator can
paint the decoration elements behind the affected windows, which would
otherwise be skipped.
This patch adds two new painters located at gems/include/polygon_gfx.
Both painters draw convex polygons with an arbirary number of points.
The shaded-polygon painter interpolates the color and alpha values
whereas the textured-polygon painter applies a texture to the polygon.
The painters are accompanied by simplistic 3D routines located at
gems/include/nano3d/ and a corresponding example (gems/run/nano3d.run).
This patch changes the way how CLI monitor obtains its subsystem
configurations. Originally, this information was provided via the
Genode::config mechanism. But for managing complex scenarios, the config
node becomes very complex. Hence, it is preferrable to have a distinct
file for each subsystem configuration.
The CLI monitor scans the directory '/subsystems' for files ending with
".subsystem". Each file has the same syntax as the formerly used
subsystem nodes.
Instead of using the Genode user-level signal API to signal page-faults to
a page-fault handler, use the kernel API directly. Thereby the accounting
of signal contexts needed for each paging subject can be done easily.
Fix#956
Moreover, be strict when calculating the page-table requirements of
core, which is architecture specific, and declare the virtual memory
requirements of core architecture-wise.
Ref #1588
We set 'ld -z max-page-size' to 4KiB to prevent the linker from aligning
the text segment to any built-in default (e.g., 4MiB on x86_64 or 64KiB
on ARM). Otherwise, the padding bytes are wasted at the beginning of the
final binary.
Removed the Nic::Driver implementation. All nic servers now inherit from
Nic::Session_component. Packet stream signals are dispatched to
the 'handle_packet_stream' function within a session component. Thus, nic
servers now have direct access to the packet stream channels, making handling
more flexible.
Adjusted nic_loobpack, dde_ipxe, wifi, usb, lan9118, Linux nic, and OpenVPN to
the new interface.
Fixes#1602
The ~Irq_session_component relied on the IRQ number obtained by the
corresponding kernel IRQ object to mark the IRQ as free at the IRQ
allocator. However, since the kernel IRQ object is initialized not
before the 'sigh' function is called, the IRQ of sessions that
never called 'sigh' could not be freed correctly. This patch fixes
the problem by not relying on the kernel IRQ object for obtaining
the number in the destructor but using the '_irq_number' member
variable instead.
Use printf format specifier with correct size to log error code which is
if type uint32_t. Also print the error code in hex since this simplifies
lookup as the error values are also defined as hexadecimal values, see
[1].
Fixes#1600
[1] - repos/ports/src/virtualbox/include/xpcom/nsError.h
Instead of organizing page tables within slab blocks and allocating such
blocks dynamically on demand, replace the page table allocator with a
simple, static alternative. The new page table allocator is dimensioned
at compile-time. When a PD runs out of page-tables, we simply flush its
current mappings, and re-use the freed tables. The only exception is
core/kernel that should not produce any page faults. Thereby it has to
be ensured that core has enough page tables to populate it's virtual
memory.
A positive side-effect of this static approach is that the accounting
of memory used for page-tables is now possible again. In the dynamic case
there was no protocol existent that solved the problem of donating memory
to core during a page fault.
Fix#1588
Unregister callbacks, so rx packets will not be propated to the deleteted
'Driver' object. Initialize ipxe once in the 'Main' object, thus allowing new
session connections.
Fixes#1595
With the server framework this becomes unnecessary. Also when the 'platform_drv'
has a lower priority, signaling will cause a constant load that starves the
'platform_drv'.
Fixes#1594
The recent change of the TRACE session interface triggered the
following warning:
/home/no/src/genode/repos/base/include/base/ipc.h:79:4: warning: ‘ret’ may be used uninitialized in this function [-Wmaybe-uninitialized]
*reinterpret_cast<T *>(&_sndbuf[_write_offset]) = value;
^
In file included from /home/no/src/genode/repos/base/src/core/include/trace/session_component.h:19:0,
from /home/no/src/genode/repos/base/src/core/trace_session_component.cc:15:
/home/no/src/genode/repos/base/include/base/rpc_server.h:132:42: note: ‘ret’ was declared here
typename This_rpc_function::Ret_type ret;
The warning occurs for basic return types (like size_t), which are
indeed not initialized. The variable gets its value assigned by the
corresponding 'call_member' overload, to which the variable is passed as
reference. But the compiler apparently is not able to detect this assignment.
Declaring 'ret' with a C++11-style default initializer fixes the warning.
While importing trace sources as trace subjects into a TRACE session,
the session quota might become depleted. The TRACE session already keeps
track of the session quota via an allocator guard but the 'subjects' RPC
function missed to handle the out-of-memory condition. This patch
reflects the error condition as an 'Out_of_metadata' exception to the
TRACE client. It also contains an extension of the trace test to
exercise the corner case.
This patch enable clients of core's TRACE service to obtain the
execution times of trace subjects (i.e., threads). The execution time is
delivered as part of the 'Subject_info' structure.
Right now, the feature is available solely on NOVA. On all other base
platforms, the returned execution times are 0.
Issue #813
This patch bases the size of the destination buffer in
'Init::Child_policy_redirect_rom_file' on the maximum label size
instead of the filename size. Otherwise, the use of a long configfile
name (i.e., "trace_subject_reporter.config") in combination with a long
child name ("trace_subject_reporter") would result in a truncated label
string.
When replacing a report with a smaller one, the corresponding ROM
dataspace should not contain any traces of the old report. Otherwise,
the consumer of the ROM dataspace may mistake the stale content as
meaningful information. This is particularly annoying when manually
inspecting reports. This patch overwrites the stale content with zeros.
By appending a newline to the generated XML data, we prevent the output
from messing with the command prompt when using 'cat' on a shell.
Futhermore, when using line-buffered output, the trailing newline
ensures that the output gets gets properly flushed.
The result of the second run (TCP_MAERTS) gets extracted wrongly - due to the
change introduced by commit "run: always append to output buffer"
(Issue #1327). The output buffer is no longer reseted between several
run_genode_until invocation within a run script.
On ARM, the compiler generates calls to memcpy and memset. Most
dynamically linked programs use the libc, which provides these
functions. However, if a dynamically linked program does not use the
libc (e.g., noux/minimal or the new version of cli_monitor), those
symbols remain unresolved. By adding them to ldso's symbol.map, the
dynamic linker will resolve them with the functions of the cxx
library, which is part of the dynamic linker.
Issue #1561
This patch moves the VFS file-system factory to a separate vfs library
that is independent from libc. This enables libc-less Genode programs to
easily use the VFS infrastructure.
Fixes#1561
With this patch, the VESA driver reports the framebuffer width to the
client instead of the visible width This fixes possible distortion
if these widths differ, at the cost that content in the right-most area
might be invisible in such cases.
Issue #1264.
Add a Platform::setup_irq_mode function which enables the IRQ session to
update the trigger mode and polarity of the associated IRQ according to
the session parameters. On ARM this function is a nop.
This change enables the x86_64 platform to support devices which use
arbitrary trigger modes and polarity settings, e.g. AHCI on QEMU and
real hardware.
Fixes#1528.
Because of helping, it is possible that a core thread that wants to
destroy another thread at the kernel is using the scheduling context of
the thread that shall be destroyed at this point in time. When building
without GENODE_RELEASE defined, this always triggers an assertion in the
kernel. But when building with GENODE_RELEASE defined, this might silently
lead to kernel-memory corruption. This commit eliminates the latter case.
Should be reverted as soon as the scheduler is able to remove its head.
Ref #1537
Placement new can be misleading, as we already overload the new operator
to construct objects via pointers to allocators. To prohibit any problems here,
and to use one consistent approach, we can explicitely construct the object
with the already available 'construct_at' template function.
Ref #1443
* Introduce a hw specific Address_space interface for protection
domains, which combines all memory-virtualization related functionality
* Introduce a core-specific Platform_pd object that solves all the hen-egg
problems formerly distributed in kernel and core-platform code
Ref #595
Ref #1443
For several basic sessions that core provides default ram quota values
exist in the form of enum values. They are used e.g. by init to deduce
session costs. Unfortunately they were not used when actually establishing
the session, which lead to inconsistencies.
Ref #1443
This patch installs the parent endpoint selector and the PD's CNode into
a PD at its creation time. Furthermore, it initializes the IPC buffer
for the main thread of the new component.
This allows us to see debug messages printed at the eary initialization
of init (before init is able to obtain the regular LOG session). This
will be reverted as soon as the initialziation of the non-core base
environment works.
To build core and other Genode components, we will need to extend the
base-common.mk library with additions that conflict with the
minimalistic root-task environment of test/sel4. To preserve the
minimalistic root task, we need to decouple it from the base-common
library.
We implicitly know that the value range will not exceed access_t despite
the integer-based arithmetics, i.e., negation and shift operations.
Fixes#1524
Avoids the need to have per IRQ a thread that blocks synchronously for next
interrupt. Now a thread may wait for multiple IRQs as other signals
simultaneously.
In core no threads are required anymore for IRQs/MSI - the clients (either
the pci_drv or in case of MSI the driver) gets the IRQ delivered directly as
a ordinary Genode signal.
Useful since #1216 and #1487 is now available.
Commit applies feature of #1446 also to IRQ/MSIs.
The assumption that IRQs in the legacy ISA range are always
edge-triggered is wrong. For the free-for-use IRQs it depends on the
actual device which uses the specific IRQ. Therefore, treat IRQs 9, 10
and 11 as level-triggered.
Currently, libc_noux includes the 'base/src/base/env/platform_env.h' file
to be able to reinitialize the environment using the 'Platform_env'
interface. For base-linux, a special version of this file exists and the
inclusion of the generic version in libc_noux causes GCC 4.9 to make wrong
assumptions about the memory layout of the 'Env' object returned by
'Genode::env()'.
This commit moves the reinitialization functions to the 'Env' interface to
avoid the need to include the 'platform_env.h' file in libc_noux.
Fixes#1510
Some functions in the kernel, which create a static object and return its
address, are declared with a GCC 'const' attribute, which can cause GCC
4.9 to optimize the function call out and use the static object's address
without calling the constructor.
Fixes#1509
The information about connected devices is obtained from a ROM file named
'usb_devices', which is supposed to contain a device list as in the device
report generated by the USB driver (see issue #1506).
A policy for 'report_rom' would look like:
<policy label="vbox -> usb_devices" report="usb_drv -> devices"/>
If the 'usb_devices' ROM file is not available, a warning message gets
printed and VirtualBox continues without USB pass-through support.
The devices to be passed-through need to have a matching device filter in
the '.vbox' file. Example:
<USB>
<DeviceFilters>
<DeviceFilter name="USB Scanner" active="true" vendorId="04a9"
productId="2220" remote="0"/>
</DeviceFilters>
</USB>
The feature was tested with HID devices (mouse, keyboard) and a flatbed
scanner. Mass storage devices didn't work correctly (they also didn't work
with VirtualBox on Linux without the closed-source extension pack).
It should be made sure that the USB driver does not try to control the
devices to be passed-through itself, for example, when passing-through
a HID device, the '<hid/>' config option should not be set.
Fixes#1507
The report lists all connected devices and gets updated when devices are
added or removed.
Example report:
<devices>
<device vendor_id="0x17ef" product_id="0x4816"/>
<device vendor_id="0x0a5c" product_id="0x217f"/>
<device vendor_id="0x8087" product_id="0x0020"/>
<device vendor_id="0x8087" product_id="0x0020"/>
<device vendor_id="0x1d6b" product_id="0x0002"/>
<device vendor_id="0x1d6b" product_id="0x0002"/>
</devices>
There is no distinction yet for multiple devices of the same type.
The report is named "devices" and an example policy for 'report_rom' would
look like:
<policy label="vbox -> usb_devices" report="usb_drv -> devices"/>
The report only gets generated if enabled in the 'usb_drv' configuration:
<config>
<raw>
<report devices="yes"/>
</raw>
</config>
Fixes#1506
On NOVA, a Genode thread currently cannot destroy itself by destroying its
own 'Thread' object, because in 'Thread_base::_deinit_platform_thread()'
it cannot call 'Cpu_session::kill_thread()' anymore after it has revoked
its own UTCB.
As solution, the revocation of the UTCB can be delayed until its location
in the context area is needed by a new thread.
Fixes#1505
Enable a platform to specify how the MMIO memory allocator is to be
initialized. On ARM the existing behavior is kept while on x86 the I/O
memory is defined as the entire address space excluding the core only
RAM regions. This aligns the hw_x86_64 I/O memory allocator
initialization with how it is done for other x86 kernels such as NOVA or
Fiasco.
Perform lazy-initialization of FPU state when it is enabled for the
first time. This assures that the FXSAVE area (including the stored
MXCSR) is always properly setup and initialized to the platform default
values.
Perform all FPU-related setup in the Cpu class' init_fpu function instead of
the general system bring-up assembly code.
Set all required control register 0 and 4 flags according to Intel SDM Vol. 3A,
sections 9.2 and 9.6 instead of only enabling FPU error reporting and OSFXSR.
On seL4, we need to convert untyped memory to page frames before being
able to use it as normal memory. There already exists the hook function
'_export_ds' that is principally suitable for such tasks. It is
currently solely used on Linux where we have to create a file for each
dataspace. To make the hook useful also for seL4, we need to call
_export_ds prior _clear_ds. Otherwise, we would try to clear memory that
is still untyped.
On our test machine the xhci controller has a usb3.0 network adapter attached
and the xhci controller is the only usb controller which has MSI support,
so let us use and test it.
Issue #1216
This repository is superseded by the 'dde_bsd' repository. Though
OSSv4 served us well, its future is uncertain and having active
upstream development is preferable. In addition the ported Intel
HD Audio driver did not work on any Thinpad model newer than T60.
Issue #1498.
These audio drivers enable support for Intel HD Audio (Azalia) and
Ensoniq AudioPCI (ES1370) compatible soundcards. They are ported
from OpenBSD 5.7.
Fixes#1498.
This is especially true for i.MX53 but is not needed on Arndale
currently.
@skalk the test will still fail each night as we do not have a nic_drv
for imx53...
- send a 'state_change' signal on session creation if the device is
already attached
- evaluate the status code of a finished asynchronous operation
- return the number of actually transferred bytes for control transfers,
too
Fixes#1490
This patch avoids the attempt to extend the cxx-local heap during the
startup phase of an application. Originally, the static part of the cxx
was merely 100 bytes, which did not suffice to run the minimalistic test
roottask on seL4.
Fiasco.OC limits the UTCB area for roottask to 16K. Therefore, the
number of threads is limited to 16K / L4_UTCB_OFFSET. (see
kernel/fiasco/src/kern/kernel_thread-std.cpp:94)
In the past, when the user blocked for an IRQ signal, the last signal was
acknowledged automatically thereby unmasking the IRQ. Now, the signal session
got a dedicated RPC for acknowledging IRQs and the HW back-end of that RPC
acknowledged the IRQ signal too. This led to the situation that IRQs were
unmasked twice. However, drivers expect an interrupt to be unmasked only on
the Irq_session::ack_irq and thus IRQ unmasking was moved from
Kernel::ack_signal to a dedicated kernel call.
Fixes#1493
White list access to ports we actually need for our drivers so far and
deny everything else by default. The extend pci config space dataspace is
currently not used and exposes a potential risk (BAR rewrite) - so deny.
Related to #1487
If a null-terminated string exactly of length MAX (0 byte included) is
provided, it will be handled as invalid because of wrong string size length
checks.
Commit fixes this.
Discovered during #1486 development.
Step to move shared irq handling out of core in the long run. So, use
irq_proxy implementation from base in os and implement shared irq handling
in platform driver of x86 (pci_drv).
Fixes#1471
The thread library (thread.cc) in base-foc shared 95% of the code with
the generic implementation except myself(). Therefore, its
implementation is now separated from the other generic sources into
myself.cc, which allows base-foc to use a foc-specific primitive to
enable our base libraries in L4Linux.
Issue #1491
Otherwise RPC calls to dead/invalid destinations are rated as successful,
which leads to wrong execution paths later on. Triggered by bomb.run where
rm_session.attach() returned as successful with local address set to 0, which
causes un-handled page-faults later on.
Fixes#1480
Physical CPU quota was previously given to a thread on construction only
by directly specifying a percentage of the quota of the according CPU
session. Now, a new thread is given a weighting that can be any value.
The physical counter-value of such a weighting depends on the weightings
of the other threads at the CPU session. Thus, the physical quota of all
threads of a CPU session must be updated when a weighting is added or
removed. This is each time the session creates or destroys a thread.
This commit also adapts the "cpu_quota" test in base-hw accordingly.
Ref #1464
Use the new asynchronous IRQ interface in the mostly used drivers, e.g.:
* ahci_drv: x86/exynos5
* gpio_drv: imx53/omap4
* input_drv: imx53/dummy
* ps2_drv: x86/pl050
* timer_drv
Now, the Irq_session is requested from Gpio::Session:
From now on we use an asynchronous IRQ interface. To prevent triggering
another GPIO IRQ while currently handling the former one, IRQs must
now by acknowledged explicitly. While here, we also changed the GPIO
session interface regarding IRQ management. The generic GPIO component
now wraps the Irq_session managed by the backend instead of using the
GPIO backend methods directly. A client using the GPIO session may
request the Irq_session_capability by calling
'Gpio::Session::irq_session()' and can use this capability when using
a local Irq_session_client.
Issue #1456.
A long long time ago, in a galaxy^W^W^W we used DDE kit to ease the
porting of purely C based drivers. By now it became clear, that we
do not gain that much by following this approach. DDE kit contains
much generic functionality, which is not used or rather not needed
by most ported drivers. Hence, we implement a slim C wrapper on top
of Genode's C++ APIs, that is especially tailored to the driver.
In addition to removing the dependency on DDE kit, the iPXE driver
now uses the server framework and the newly introduced signal based
IRQ handling.
Issue #1456.
This patch adds const qualifiers to the functions Allocator::consumed,
Allocator::overhead, Allocator::avail, and Range_allocator::valid_addr.
Fixes#1481
Instead of handing over object ids to the kernel, which has to find them
in object pools then, core can simply use object pointers to reference
kernel objects.
Ref #1443
Instead of having an ID allocator per object class use one global allocator for
all. Thereby artificial limitations for the different object types are
superfluent. Moreover, replace the base-hw specific id allocator implementation
with the generic Bit_allocator, which is also memory saving.
Ref #1443
The verb "bin" in the context of destroying kernel objects seems pretty
unusual in contrast to "delete". When reading "bin" in the context of
systems software an association to something like "binary" is more likely.
Ref #1443
* Instead of using local capabilities within core's context area implementation
for stack allocation/attachment, simply do both operations while stack gets
attached, thereby getting rid of the local capabilities in generic code
* In base-hw the UTCB of core's main thread gets mapped directly instead of
constructing a dataspace component out of it and hand over its local
capability
* Remove local capability implementation from all platforms except Linux
Ref #1443
The global capability ID counter is not used by NOVA and Fiasco.OC
and in the future not needed by base-hw too. Thereby, remove the static
counter variable from the generic code base and add it where appropriated.
Ref #1443
Enable platform specific allocations and ram quota accounting for
protection domains. Needed to allocate object identity references
in the base-hw kernel when delegating capabilities via IPC.
Moreover, it can be used to account translation table entries in the
future.
Ref #1443
Currently, the 'pointed session' gets updated only when an input event
occurs, but an update is also needed in other situations, for example
when the view under the current mouse position was moved.
With this commit, the 'pointed session' gets updated whenever the
timer-triggered 'handle_input()' function is called.
Fixes#1473
There are lots of places where a numeric argument of an argument string
gets extraced as signed long value and then assigned to an unsigned long
variable. If the value in the string was negative, it would not be
detected as invalid (and replaced by the default value), but become a
positive bogus value.
With this patch, numeric values which are supposed to be unsigned get
extracted with the 'ulong_value()' function, which returns the default
value for negative numbers.
Fixes#1472
This commit enables the VirtualBox graphics adapter, provides guest mouse
pointer integration with Nitpicker using the 'vbox_pointer' application
and enhances the VirtualBox run scripts with the configuration of
Nitpicker, input merger and network driver.
Fixes#1474
The driver operates in PIO mode only. Depending on the block size (512
bytes versus 128 KiB), it has a troughput of 2 MiB/sec - 10 MiB/sec for
reading and 173 KiB/sec - 8 MiB/sec for writing.
Fixes#1475
This patch enhances the generic SD-card protocol implementation in
sd-card.h with the ability to handle the version 1.0 of the CSD register
(containing the capacity information of older SD cards).
The emergency dataspace is used to accommodate the corner case where
a signal context capability is created while issuing the first
resource request. Normally, the attempt to upgrade the signal-session
quota under such a constrained situation would fail. By freeing the
emergency dataspace in this situation, we regain enough quota to
upgrade the signal session.
This is a follow up commit for "base: Raise RAM quota of signal session
to 16K" and fixes the resource_request test on 64-bit platforms.
The 'Thread_base' class is constructed differently in some special cases
like the main thread or a thread that use a distinct CPU session. The
official API, however, should be clean from such artifacts. Hence, I
separated the official constructor from the other cases.
VirtualBox can receive absolute or relative mouse motion events from the
'Input' service and the VM can support either or both of them. With this
patch, more of the possible combinations are handled.
Fixes#1470
There were two bugs. First, the caller of Kernel::await_signal wasn't
re-activated for scheduling. Second, the caller did not memorize that he
doesn't wait on a receiver anymore which had bad side effects on further
signal handling.
Fix#1459
The port uses the Cortex-A9 private timer for the kernel and an EPIT as
user timer. It was successfully tested on the Wandboard Quad and the CuBox-i
with the signal test. It lacks L2-cache and Trustzone support by now.
Thanks to Praveen Srinivas (IIT Madras, India) and Nikolay Golikov (Ksys Labs
LLC, Russia). This work is partially based on their contributions.
Fix#1467
Do not mask edge-triggered interrupts to avoid losing them while masked,
see Intel 82093AA I/O Advanced Programmable Interrupt Controller
(IOAPIC) specification, section 3.4.2, "Interrupt Mask":
"When this bit is 1, the interrupt signal is masked. Edge-sensitive
interrupts signaled on a masked interrupt pin are ignored (i.e., not
delivered or held pending)"
Or to quote Linus Torvalds on the subject:
"Now, edge-triggered interrupts are a _lot_ harder to mask, because the
Intel APIC is an unbelievable piece of sh*t, and has the edge-detect
logic _before_ the mask logic, so if a edge happens _while_ the device
is masked, you'll never ever see the edge ever again (unmasking will not
cause a new edge, so you simply lost the interrupt)."
So when you "mask" an edge-triggered IRQ, you can't really mask it at
all, because if you did that, you'd lose it forever if the IRQ comes in
while you masked it. Instead, we're supposed to leave it active, and set
a flag, and IF the IRQ comes in, we just remember it, and mask it at
that point instead, and then on unmasking, we have to replay it by
sending a self-IPI." [1]
[1] - http://yarchive.net/comp/linux/edge_triggered_interrupts.html
Ref #1448
In order to match the I/O APIC configuration, a request for user timer
IRQ 0 is remapped to vector 50 (Board::TIMER_VECTOR_USER), all other
requests are transposed by adding the vector offset 48
(Board::VECTOR_REMAP_BASE).
On base-hw/x86_64 the quota of the signal session is not sufficient due to
the large size of the Signal_session_component. Increasing the quota to
16K avoids signal-context resource exhaustion messages as emmitted by the
run/launcher scenario:
...
Quota exceeded! amount=4096, size=4096, consumed=4096
failed to allocate signal-context resources
upgrading quota donation for signal session
C++ runtime: Genode::Parent::Quota_exceeded
void* abort(): abort called
...
Note: This change increases the quota for all kernels even though it is
strictly only required for base-hw/x86_64.
* Enable the use of the FXSAVE and FXRSTOR instructions, see Intel SDM
Vol. 3C, section 2.5.
* The state of the x87 floating point unit (FPU) is loaded and saved on
demand.
* Make the cr0 control register accessible in the Cpu class. This is in
preparation of the upcoming FPU management.
* Access to the FPU is disabled by setting the Task Switch flag in the cr0
register.
* Access to the FPU is enabled by clearing the Task Switch flag in the cr0
register.
* Implement FPU initialization
* Add is_fpu_enabled helper function
* Add pointer to CPU lazy state to CPU class
* Init FPU when finishing kernel initialization
* Add function to retry FPU instruction:
Similar to the ARM mechanism to retry undefined instructions, implement a
function for retrying an FPU instruction. If a floating-point instruction
causes an #NM exception due to the FPU being disabled, it can be retried
after the correct FPU state is restored, saving the current state and
enabling the FPU in the process.
* Disable FPU when switching to different user context:
This enables lazy save/restore of the FPU since trying to execute a
floating point instruction when the FPU is disabled will cause a #NM
exception.
* Declare constant for #NM exception
* Retry FPU instruction on #NM exception
* Assure alignment of FXSAVE area:
The FXSAVE area is 512-byte memory region that must be 16-byte aligned. As
it turns out the alignment attribute is not honored in all cases so add a
workaround to assure the alignment constraint is met by manually rounding
the start of the FXSAVE area to the next 16-byte boundary if necessary.
The LAPIC timer is programmed in one-shot mode with vector 32
(Board::TIMER_VECTOR_KERNEL). The timer frequency is measured using PIT
channel 2 as reference (50ms delay).
Disable PIT timer channel 0 since BIOS programs it to fire periodically.
This avoids potential spurious timer interrupts.
The implementation initializes the Local APIC (LAPIC) of CPU 0 in xapic
mode (mmio register access) and uses the I/O APIC to remap, mask and
unmask hardware IRQs. The remapping offset of IRQs is 48.
Also initialize the legacy PIC and mask all interrupts in order to
disable it.
For more information about LAPIC and I/O APIC see Intel SDM Vol. 3A,
chapter 10 and the Intel 82093AA I/O Advanced Programmable Interrupt
Controller (IOAPIC) specification
Set bit 9 in the RFLAGS register of user CPU context to enable
interrupts on kernel- to usermode switch.
Make the local APIC accessible via its MMIO region by adding a 2 MB
large page mapping at 0xfee00000 with memory type UC.
Note: The mapping is added to the initial page tables to make the APIC
usable prior to the activation of core's page tables, e.g. in the
constructor of the timer class.
The location in memory is arbitrary but we use the same address as the
ARM architecture. Adjust references to virtual addresses in the mode
transition pages to cope with 64-bit values.
The interrupt stack must reside in the mtc region in order to use it for
non-core threads. The size of the stack is set to 56 bytes in order to
hold the interrupt stack frame plus the additional vector number that is
pushed onto the stack by the ISR.
Call the _virt_mtc_addr function with the _mt_isrs label to calculate
the ISR base address in Idt::setup. Again, assume the address to be
below 0x10000.
Use parameter instead of class member variable because it would get
stored into the mtc region otherwise. In a further iteration only the
actual IDT should be saved into the mtc, not the complete class
instance. Currently the class instance size is equal to the IDT table
size.
The class provides the load() function which reloads the GDTR with the
GDT address in the mtc region. This is needed to make the segments
accessible to non-core threads.
Make the _gdt_start label global to use it in the call to
_virt_mtc_addr().
Use the _mt_tss label and the placement new operator to create the
Tss class instance in the mtc region. Update the hard-coded
TSS base address to use the virtual mtc address.
On exception, the CPU first checks the IDT in order to find the
associated ISR. The IDT must therefore be placed in the mode transition
pages to make them available for non-core threads.
The limit is set to match the TSS size - 1 and the base address is
hardcoded to the *current* address of the TSS instance (0x3a1100).
TODO: Set the base address using the 'tss' label. If the TSS descriptor
format were not so utterly unusable this would be straightforward.
Changes to the code that indirectly lead to a different location
of the tss result in #GP since the base address will be invalid.
The class Genode::Tss represents a 64-bit Task State Segment (TSS) as
specified by Intel SDM Vol. 3A, section 7.7.
The setup function sets the stack pointers for privilege levels 0-2 to
the kernel stack address. The load function loads the TSS segment
selector into the task register.
Implement user argument setter and getter support functions. The mapping of
the state registers corresponds to the system call parameter passing
convention.
The instruction pointer is the first field of the master context and can
directly be used as a jump argument, which avoids additional register
copy operations.
Point stack to client context region and save registers using push
instructions.
Note that since the push instruction first increments the stack pointer
and then stores the value on the stack, the RSP has to point one field
past RBP before pushing the first register value.
As the kernel entry is called from the interrupt handler the stack
layout is as specified by Intel SDM Vol. 3A, figure 6-8. An additional
vector number is stored at the top of the stack.
Gather the necessary client information from the interrupt stack frame
and store it in the client context.
The new errcode field is used to store the error code that some
interrupts provide (e.g. #PF). Rework mode transition reserved space and
offset constants to match the new CPU_state layout.
The macros are used to assign syscall arguments to specific registers.
Using the AMD64 parameter passing convention avoids additional copying of
variables since the C++ function parameters are already in the right
registers.
The interrupt return instruction in IA-32e mode applies the prepared
interrupt stack frame to set the RFLAGS, CS and SS segment as well as
the RIP and RSP registers. It then continues execution of the user code.
For detailed information refer to Intel SDM Vol. 3A, section 6.14.3.
After activating the client page tables the client context cannot be
accessed any longer. The mode transition buffer however is globally
mapped and can be used to restore the remaining register values.
Set the stack pointer to the R8 field in the client context to enable
restoring registers by popping values of the stack.
After this step the only remaining registers that do not contain client
values are RAX, RSP and RIP.
Note that the client value of RAX is pop'd to the global buffer region as
the register will still be used by subsequent steps. It will be restored to
the value in the buffer area just prior to resuming client code execution.
Set I/O privilege level to 3 to allow core to perform port I/O from
userspace. Also make sure the IF flag is cleared for now until interrupt
handling is implemented.
Setup an IA-32e interrupt stack frame in the mode transition buffer region.
It will be used to perform the mode switch to userspace using the iret
instruction.
For detailed information about the IA-32e interrupt stack frame refer to
Intel SDM Vol. 3A, figure 6-8.
The constants specify offset values of CPU context member variables as
specified by Genode::Cpu_state [1] and Genode::Cpu::Context [2].
[1] - repos/base/include/x86_64/cpu/cpu_state.h
[2] - repos/base-hw/src/core/include/spec/x86/cpu.h
The new entries specify a 64-bit code segment with DPL 3 at index 3 and a
64-bit data segment with DPL 3 at index 4.
These segments are needed for transitioning to user mode.
A pointer to the client context is placed in the mt_client_context_ptr area.
It is used to pass the current client context to the lowlevel mode-switching
assembly code.
IA-32e paging translates 48-bit linear addresses to 52-bit physical
addresses. Translation structures are hierarchical and four levels deep.
The current implementation supports regular 4KB and 1 GB and 2 MB large
page mappings.
Memory typing is not yet implemented since the encoded type bits depend
on the active page attribute table (PAT)*.
For detailed information refer to Intel SDM Vol. 3A, section 4.5.
* The default PAT after power up does not allow the encoding of the
write-combining memory type, see Intel SDM Vol. 3A, section 11.12.4.
* Add common IA-32e paging descriptor type:
The type represents a table entry and encompasses all fields shared by
paging structure entries of all four levels (PML4, PDPT, PD and PT).
* Simplify PT entry type by using common descriptor:
Differing fields are the physical address, the global flag and the memory
type flags.
* Simplify directory entry type by using common descriptor:
Page directory entries (PDPT and PD) have an additional 'page size' field
that specifies if the entry references a next level paging structure or
represents a large page mapping.
* Simplify PML4 entry type by using common descriptor
Top-level paging structure entries (PML4) do not have a 'pat' flag and the
memory type is specified by the 'pwt' and 'pcd' fields only.
* Implement access right merging for directory paging entries
The access rights for translations are determined by the U/S, R/W and XD
flags. Paging structure entries that reference other tables must provide
the superset of rights required for all entries of the referenced table.
Thus merge access rights of new mappings into existing directory entries to
grant additional rights if needed.
* Add cr3 register definition:
The control register 3 is used to set the current page-directory base
register.
* Add cr3 variable to x86_64 Cpu Context
The variable designates the address of the top-level paging structure.
* Return current cr3 value as translation table base
* Set context cr3 value on translation table assignment
* Implement switch to virtual mode in kernel
Activate translation table in init_virt_kernel function by updating the
cr3 register.
* Ignore accessed and dirty flags when comparing existing table entries
These flags can be set by the MMU and must be disregarded.
* Add isr.s assembler file:
The file declares an array of Interrupt Service Routines (ISR) to handle
the exception vectors from 0 to 19, see Intel SDM Vol. 3A, section
6.3.1.
* Add Idt class:
* The class Genode::Idt represents an Interrupt Descriptor Table as
specified by Intel SDM Vol. 3A, section 6.10.
* The setup function initializes the IDT with 20 entries using the ISR
array defined in the isr.s assembly file.
* Setup and load IDT in Genode::Cpu ctor:
The Idt::setup function is only executed once on the BSP.
* Declare ISRs for interrupts 20-255
* Set IDT size to 256
This patch contains the initial code needed to build and bootstrap the
base-hw kernel on x86 64-bit platforms. It gets stuck earlier
because the binary contains 64-bit instructions, but it is started in
32-bit mode. The initial setup of page tables and switch to long mode is
still missing from the crt0 code.
A Nic::Session client can install a signal handler that is used to
propagate changes of the link-state by calling 'link_state_sigh()'.
The actual link state is queried via 'link_state()'.
The nic-driver interface now provides a Driver_notification callback,
which is used to forward link-state changes from the driver to the
Nic::Session_component.
The following drivers now provide real link state: dde_ipxe, nic_bridge,
and usb_drv. Currently, OpenVPN, Linux nic_drv, and lan9118 do not
support link state and always report link up.
Fixes#1327
If a client acknowledges the same packet more than once, the packet also
gets freed more than once. At the second attempt the underlaying
Bit_array will throw an 'Invalid_clear' exception, which results in an
uncaught exception that leads to an abort() call in the freeing
component.
Fixes#1462.
To ease debugging without the need to tweak the kernel every time, and to
support userland developers with useful information this commit extends several
warnings and errors printed by the kernel/core by which thread/application
caused the problem, and what exactly failed.
Fix#1382Fix#1406
The driver for the Freescale eSDHCv2 doesn't support the highest
available bus frequency by now and also the bus width may be set to a
higher value but that needs further checks on the capabilities of the
inserted card.
The commits provide a benchmark as it exists for the OMAP4 SDHC driver.
Fix#1458
The 'continue_hw_accelerated' assertion at the end of the recall handler
can fail in situations which are not problematic, for example if the
'Timer' thread has set the 'VMCPU_FF_TIMER' flag in the meantime and
requested a recall afterwards. Since we don't know for sure if a recall is
requested for the other flags as well, the assertion gets replaced by a
debug message, which gets printed if any of the 'not yet verified as safe'
flags is set.
Fixes#1426
The GUID partition table (GPT) is primarily used by systems using
(U)EFI and is a replacement for the legacy MBR. For now, the current
implementation is able to address up to 128 GUID partition entries
(GPE).
To enable the GPT support in 'part_blk' it has to be configured
accrodingly:
! <start name="part_blk">
! [...]
! <config use_gpt="yes">
! [...]
! </start>
If 'part_blk' is not able to find a valid GPT header it falls back
to using the MBR.
Current limitations:
Since no endian conversion takes place it only works on LE platforms
and of all characters in the UTF-16 encoded name field of an entry
only the ones included in the ASCII encoding are printed. It also
ignores all GPE attributes.
Issue #1429.
The hover reports provides information about the session currently
pointed-to, i.e., hovered session. It can be enabled by the 'hover'
attribute of nitpicker's 'report' configuration element
<report hover="yes" />
Fixes#1442
The bindings for 32bit did not consider that in the syscall_3 function
edx changes due to the assembly instructions and that in the syscall_4
function edx and ecx change. So, the compiler wrongly assumed that the
content of these registers stayed unchanged.
Fixes#1447
In the past, unmap sometimes occured on RM clients that have no thread,
PD, or translation table assigned. However, this shouldn't be the
case anymore.
Fixes#504
* Introduce hw-specific crt0 for core that calls e.g.: init_main_thread
* re-map core's main thread UTCB to fit the right context area location
* switch core's main thread's stack to fit the right context area location
Fix#1440
This decouples the size of the mode transition control region from the
minimal mapping size of the page tables implementation. Rather, the CPU
architecture is able to specify the actual size.
Rationale: For x86_64, we need the mtc region to span two pages in order
to store all the tables required to perform the mode switch.
The size of empty structs differs in C (0 byte) and C++ (1 byte), which
leads to different offsets in compound structures. This fixes the driver
on 32Bit platforms.
Issue #1439.
The wireless stack calls timer_before(foo, timer.expires) and up to now
it was always 0. Let's be save and set this field when scheduling the
timer, although it worked fine so far.
Issue #1439.
We will always see this error message when the driver is started. It
is expected and not an actual error. When the driver is running it will
not allocate larger chunks than the Slab provides. Therefore, we can
safely ignore this message.
Issue #1439.
Some functions in the time manager, for example 'TMTimerSet()' and
'TMTimerStop()' let VirtualBox abort with a failed assertion if the timer
does not change to a 'stable' state after 1000 calls of a mixture of
'yield' and 'sleep'. On Genode, this happens sometimes when the 'EMT'
thread is executing 'TMTimerSet()' and gets interrupted by the 'TAP'
thread, which calls 'TMTimerStop()' and waits for the 'EMT' thread to
finish setting the timer. Since the 'EMT' thread has the lowest priority,
1000 retries can be too few. Without the assertion, these functions would
return an error code, which is often ignored by the caller, so it seems
safer to keep retrying until the function can return successfully.
Fixes#1437
Among others, this function is used in the for_each_set_big() macro,
which is used when configuring the data rate tables. Therefore, this
fixes observed performance issues.
Fixes#1439.
If running multiple VBox VMMs with Windows as guest concurrently then it may
happen that the system seem to hang. It turned out that actually
a VM-exit storm (vmx_exception->handle_exc_nm) causes a endless loop between
kernel and vCPU. Nothing gets scheduled nor interrupts are received anymore.
The referenced kernel commit fixes this issue.
Issue #1343