Use a bit allocator for the allocation management of thread contexts,
instead of holding allocation information within the Thread_base objects,
which lead to race conditions in the past.
Moreover, extend the Thread_base class interface with the ability to
to add additional stacks to a thread, and associate the context they're
located in with the corresponding Thread_base object. Additional stacks
can be used to do user-level scheduling with stack switching, without breaking
Genode's API.
Fixes#1024Fixes#1036
This patch introduces new types for expressing CPU affinities. Instead
of dealing with physical CPU numbers, affinities are expressed as
rectangles in a grid of virtual CPU nodes. This clears the way to
conveniently assign sets of adjacent CPUs to subsystems, each of them
managing their respective viewport of the coordinate space.
By using 2D Cartesian coordinates, the locality of CPU nodes can be
modeled for different topologies such as SMP (simple Nx1 grid), grids of
NUMA nodes, or ring topologies.
With this patch, the 'futex' syscall gets used for blocking and unblocking
of threads in the Linux-specific lock implementation.
The 'Native_thread_id' type, which was previously used in the
lock-internal 'Applicant' class to identify a thread to be woken up,
was not suitable anymore for implementing this change. With this patch,
the 'Thread_base*' type gets used instead, which also has the positive
effect of making the public 'cancelable_lock.h' header file
platform-independent.
Fixes#646.
The distinction between 'ipc.h' and 'ipc_generic.h' is no more. The only
use case for platform-specific extensions of the IPC support was the
marshalling of capabilities. However, this case is accommodated by a
function interface ('_marshal_capability', '_unmarshal_capability'). By
moving the implementation of these functions from the headers into the
respective ipc libraries, we can abandon the platform-specific 'ipc.h'
headers.
Add functionality to lookup an object and lock it. Additional the case is
handled that a object may be already in-destruction and the lookup will deny
returning the object.
The object_pool generalize the lookup and lock functionality of the rpc_server
and serve as base for following up patches to fix dangling pointer issues.
The CPU session interfaces comes with the ability to install an
exception handler per thread. This patch enhances the feature with the
provision of a default signal handler that is used if no thread-specific
handler is installed. The default signal handler can be set by
specifying an invalid thread capability and a valid signal context
capability.
Furthermore, this patch relaxes the requirement of the order of the
calls of 'exception_handler' and 'set_pager'. Originally, the exception
handler could be installed not before setting a pager. Now, we remember
the installed exception handler in the 'Cpu_thread' and propagate to to
the platform thread at a later time.
With this patch, core responds to SIGCHLD signals of terminating Genode
processes by reflecting these events as exceptions to the CPU session
interface. This way, Genode processes become able to respond to
terminating Genode child processes.
The Linux version of core used a part of the BSS to simulate access to
physical memory. All dataspaces would refer to a portion of 'some_mem'.
So every time when core would access the dataspace content, it would
access its local BSS. For all processes outside of core, dataspaces were
represented as files. This patch removes the distinction between core
and non-core processes. Now, core uses the same 'Rm_session_mmap'
implementation as regular processes. This way, the 'some_mem' could be
abandoned. We still use BSS variable for allocating core-local meta
data through.
On Linux, we want to attach additional attributes to processes, i.e.,
the chroot location, the designated UID, and GID. Instead of polluting
the generic code with such Linux-specific platform details, I introduced
the new 'Native_pd_args' type, which can be customized for each
platform. The platform-dependent policy of init is factored out in the
new 'pd_args' library.
The new 'base-linux/run/lx_pd_args.run' script can be used to validate
the propagation of those attributes into core.
Note that this patch does not add the interpretation of the new UID and
PID attributes by core. This will be subject of a follow-up patch.
Related to #510.
On Linux, we use the session label for naming the corresponding Linux
process. When looking up the processes via 'ps', the Genode process
hierarchy becomes immediately visible.
Genode used to create new processes by directly forking from the
respective Genode parent using the process library. The forking process
created a PD session at core merely for propagating the PID of the new
process into core (for later destruction). This traditional mechanisms
has the following disadvantages:
First, the PID reported by the creating process to core cannot easily be
validated by core. Therefore core has to trust the PD client to not
specify a PID of an existing process, which would happen to be killed
once the PD session gets destructed. This problem is documented by
issue #318. Second, there is no way for a Genode process to detect the
failure of its any grandchildren. The immediate parent of a faulting
process could use the SIGCHLD-and-waitpid mechanism to observe its
children but this mechanism does not work transitively.
By performing the process creation exclusively within core, all Genode
processes become immediate child processes of core. Hence, core can
respond to failures of any of those processes and reflect such
conditions via core's session interfaces. Furthermore, the PID
associated to a PD session is locally known within core and cannot be
forged anymore. In fact, there is actually no need at all to make
processes aware of any PIDs of other processes.
Please note that this patch breaks the 'chroot' mechanism that comes in
the form of the 'os/src/app/chroot' program. Because all processes are
forked from core, a chroot'ed process could sneak outside its chroot
environment by just creating a new Genode process. To address this
issue, the chroot mechanism must be added to core.
This patch changes the way of how dataspace content is accessed by
processes outside of core. Dataspaces are opened by core only and the
corresponding file descriptors are handed out the other processes via
the 'Linux_dataspace::fd()' RPC function. At the client side, the
returned file descriptor is then used to mmap the file.
Consequently, this patch eliminates all files from 'lx_rpath'. The
path is still needed by core to temporarily create dataspaces and
unix domain sockets. However, those files are unlinked immediately
after their creation.
This patch alleviates the need for any non-core process to create Unix
domain sockets locally. All sockets used for RPC communication are
created by core and subsequently passed to the other processes via RPC
or the parent interface. The immediate benefit is that no process other
than core needs to access the 'rpath' directory in order to communicate.
However, access to 'rpath' is still needed for accessing dataspaces.
Core creates one socket pair per thread on demand on the first call of
the 'Linux_cpu_session::server_sd()' or 'Linux_cpu_session::client_sd()'
functions. 'Linux_cpu_session' is a Linux-specific extension to the CPU
session interface. In addition to the socket accessors, the extension
provides a mechanism to register the PID/TID of a thread. Those
information were formerly propagated into core along with the thread
name as argument to 'create_thread()'.
Because core creates socket pairs for entrypoints, it needs to know all
threads that are potential entrypoints. For lx_hybrid programs, we
hadn't had propagated any thread information into core, yet. Hence, this
patch also contains the code for registering threads of hybrid
applications at core.
This patch eliminates the thread ID portion of the 'Native_capability'
type. The access to entrypoints is now exclusively handled by passing
socket descripts over Unix domain sockets and by inheriting the socket
descriptor of the parent entrypoint at process-creation time.
Each entrypoint creates a socket pair. The server-side socket is bound
to a unique name defined by the server. The client-side socket is then
connected to the same name. Whereas the server-side socket is meant to
be exclusively used by the server to wait for incoming requests, the
client-side socket can be delegated to other processes as payload of RPC
messages (via SCM rights). Anyone who receives a capability over RPC
receives the client-side socket of the entrypoint to which the
capability refers. Given this socket descriptor, the unique name (as
defined by the server) can be requested using 'getpeername'. Using this
name, it is possible to compare socket descriptors, which is important
to avoid duplicates from polluting the limited socket-descriptor name
space.
Wheras this patch introduces capability-based delegation of access
rights to entrypoints, it does not cover the protection of the integrity
of RPC objects. RPC objects are still referenced by a global ID passed
as normal message payload.
This patch adds prinicipal support for transmitting socket descriptors
as RPC payload. Socket descriptors are handled by the linux-specific
implementation of the capability marshalling and unmarshalling functions
in 'ipc.h'. The 'Message' type in 'src/platform/linux_socket.h' has been
extended to carry multiple descriptors in a single message.
Unfortuately, we hit a problem (and potential show stopper) here:
lx_sendmsg failed with -109 in lx_call()
The error code corresponds to ETOOMANYREFS. There is only one place in
the Linux kernel where this error code is used (net/unix/af_unix.c).
The code for 'unix_attach_fds()' suggests that there is a limit with
regard to the maximum number of references for a given Unix domain
socket. When the error occurs, core and init are running. The socket
of core's server entrypoint is present in the '/proc/pid/fd' of those
processes 8 times. The error occurs when core tries to perform an
RPC to the entrypoint to perform 'Ram_session::transfer_quota()'
(base/include/base/child.h at line 248).
By storing the reply socket descriptor inside the 'Ipc_ostream::_dst'
capability instead as part of the connection state object, we can
use the 'explicit_reply' mechanism as usual. Right now, we store
both the tid and socket handle in 'Native_capability::Dst'. In the
final version, the 'tid' member will be gone.
In the final version, the 'socket' will be the only member to remain in
the 'Dst' time. In the transition phase, we store both the old 'tid' and
the 'socket'.
This patch, which was originally created by Christian Helmuth,
represents the first step towards using SCM rights as capability
mechanism on Linux. It employs the SCM rights mechanism for transmitting
a reply capability to the server as argument of each IPC call. The
server will then send its respond to this reply file descriptor. This
way, the reply channel does not need to be globally visible anymore.
This patch extends the RAM session interface with the ability to
allocate DMA buffers. The client specifies the type of RAM dataspace to
allocate via the new 'cached' argument of the 'Ram_session::alloc()'
function. By default, 'cached' is true, which correponds to the common
case and the original behavior. When setting 'cached' to 'false', core
takes the precautions needed to register the memory as uncached in the
page table of each process that has the dataspace attached.
Currently, the support for allocating DMA buffers is implemented for
Fiasco.OC only. On x86 platforms, it is generally not needed. But on
platforms with more relaxed cache coherence (such as ARM), user-level
device drivers should always use uncacheable memory for DMA transactions.
With this patch clients of the RM service can state if they want a mapping
to be executable or not. This allows dataspaces to be mapped as
non-executable on Linux by default and as executable only if needed.
Partially fixes#176.
The 'copy_to' function turned out to be not flexible enough to
accommodate the Noux fork mechanism. This patch removes the function,
adds an accessor for the capability destination and a compound type
'Native_capability::Raw' to be used wherever plain capability
information must be communicated.
This commit unifies the policy name for the template argument for
Native_capability_tpl to Cap_dst_policy, like suggested by Norman in the
discussion resulting from issue #145. Moreover, it takes the memcpy
operation for copying a Native_capability out of the template, which is
included by a significant bunch of files, and separates it in a library,
analog to the suggestion in issue #145.
Because we use to pass a policy class to 'Native_capability_tpl'
we can pass the dst type as part of the policy instead of as
a separate template argument. This patch also adds documentation
of the POLICY interface as expected by 'Native_capability_tpl'.
This patch unifies the Native_capability classes for the different kernel
platforms by introducing an appropriate template, and eliminating naming
differences. Please refer issue #145.
To give the platform developer more freedom in how the Native_capability
class is internally implemented (e.g. turning it into a smart-pointer),
this patch removes the memcpy operation, when transfering the parent-capability
to a new process from the generic code, and let the implementation of the
platform-specific Native_capability decide how the transfer has to be done.
Please refer to issue #144.
Introduce a factory-, and dereference method for local capabilities. These are
capabilities that reference objects of services, which are known to be used
protection-domain internally only. To support the new Capability class methods
a protected constructor and accessor to the local object's pointer is needed
in the platform's capability base-classes. For further discussion details please
refer issue #139.
- Let hybrid Linux/Genode programs use POSIX threads for the
implementation of the Thread API.
- Prevent linkage of cxx library to hybrid Linux/Genode programs because
the cxx functionality is covered by glibc.