If page faults are handled concurrently (as for base-nova) the traverse lookup
call in rm_session_component must be thread safe, which it isn't.
If the faulting area is backed by nested dataspaces which are managed by
various rm_sessions then a race happens under following circumstances
(triggered occasionally by the bomb test).
The traverse lookup may return a pointer to a rm_session of a nested dataspace.
If the rm_session is in parallel subject to destruction it happened that faults
got enqueued to the faulters list of the deleted rm_session and internally to
a list of the current rm_session of the Rm_client.
During destruction of the faulting Rm_client the associated rm_session will
be dissolved from the Rm_client, which leads to dereferencing the
dangling pointer of the already destructed rm_session.
On base-nova the memory of the rm_session object get unmapped eventually, so
that the de-referencing of the dangling pointer caused page faults in core.
The memory on other kernels inside core never get unmapped so that the
bug doesn't trigger visible faults.
The patch replace the keeping of a rm_session pointer by keeping a
capability instead. The rm_session object must be looked up now explicitly in
the Object_pool implementation, which implements proper reference counting on
the rm_session object.
Issue #549
This patch removes the 'soname' link option for building the host
library for the 'lx_hybrid_ctors' test. Without this option, the
library's absolute path at build time gets hardcoded into the
application, which should be okay for this simple test case.
Fixes#638.
Add functionality to lookup an object and lock it. Additional the case is
handled that a object may be already in-destruction and the lookup will deny
returning the object.
The object_pool generalize the lookup and lock functionality of the rpc_server
and serve as base for following up patches to fix dangling pointer issues.
The CPU session interfaces comes with the ability to install an
exception handler per thread. This patch enhances the feature with the
provision of a default signal handler that is used if no thread-specific
handler is installed. The default signal handler can be set by
specifying an invalid thread capability and a valid signal context
capability.
Furthermore, this patch relaxes the requirement of the order of the
calls of 'exception_handler' and 'set_pager'. Originally, the exception
handler could be installed not before setting a pager. Now, we remember
the installed exception handler in the 'Cpu_thread' and propagate to to
the platform thread at a later time.
With this patch, core responds to SIGCHLD signals of terminating Genode
processes by reflecting these events as exceptions to the CPU session
interface. This way, Genode processes become able to respond to
terminating Genode child processes.
With this patch hybrid applications get linked with the Genode GCC's
'crtbegin.o' and 'crtend.o' files instead of the host GCC's versions
to avoid compatibility problems. This only affects the 'linux_x86'
platform, since on the 'lx_hybrid_x86' platform the Genode GCC is the
host GCC.
Fixes#550.
The Linux version of core used a part of the BSS to simulate access to
physical memory. All dataspaces would refer to a portion of 'some_mem'.
So every time when core would access the dataspace content, it would
access its local BSS. For all processes outside of core, dataspaces were
represented as files. This patch removes the distinction between core
and non-core processes. Now, core uses the same 'Rm_session_mmap'
implementation as regular processes. This way, the 'some_mem' could be
abandoned. We still use BSS variable for allocating core-local meta
data through.
This patch improves the life-time management of socket descriptors and
addresses several corner cases exposed by the 'bomb' test.
The lookup and association of file descriptors with global IDs have been
turned into an atomic operation. Otherwise, multiple threads interacting
with the singleton 'ep_sd_registry' may override each other's
associations.
Closing the socket pair used for the reply channel has been implemented
via the RAII pattern to capture all corner cases, in particular
exceptions.
If blocking operations are interrupted by signals, we throw a
'Blocking_canceled' exception.
We preserve core's socket descriptor at 'PARENT_SOCKET_HANDLE' to avoid
a corner case where the parent capability is going to dup2'ed to the
same handle.
Support for 'Thread_base::join' within core to enable leaving Genode via
Control-C.
With this patch, custom UIDs and GIDs can be assigned to individual
Genode processes or whole Genode subsystems.
The new 'base-linux/run/lx_uid.run' script contains an example of how to
use the feature.
Fixes#510
On Linux, we want to attach additional attributes to processes, i.e.,
the chroot location, the designated UID, and GID. Instead of polluting
the generic code with such Linux-specific platform details, I introduced
the new 'Native_pd_args' type, which can be customized for each
platform. The platform-dependent policy of init is factored out in the
new 'pd_args' library.
The new 'base-linux/run/lx_pd_args.run' script can be used to validate
the propagation of those attributes into core.
Note that this patch does not add the interpretation of the new UID and
PID attributes by core. This will be subject of a follow-up patch.
Related to #510.
This ensures that the cwd of the process is within the chroot
environment, improving security for root processes.
The cwd after the chroot is the same as before, this is needed to
start binaries given as relative path name.
Using the new 'join()' function, the caller can explicitly block for the
completion of the thread's 'entry()' function. The test case for this
feature can be found at 'os/src/test/thread_join'. For hybrid
Linux/Genode programs, the 'Thread_base::join()' does not map directly
to 'pthread_join'. The latter function gets already called by the
destructor of 'Thread_base'. According to the documentation, subsequent
calls of 'pthread_join' for one thread may result in undefined behaviour.
So we use a 'Genode::Lock' on this platform, which is in line with the
other platforms.
Related to #194, #501
When an IPC server is finalized two important things should happen:
First, the association of the server socket with a capability must be
invalidated. And finally, the server socket pair (server side and client
side) must be closed.
Related to #38.
This patch fixes the 'lx_hybrid_pthread_ipc.run' test. In order to use
the 'Genode::Lock' we need to set the SIGUSR1 handler to an empty handler.
Normally, this happens when creating a thread via the Genode API. But as
this test creates a thread via the pthread library and thereby bypasses
the Genode API, the signal handler remained unset.
Using the host compiler in this case seems to be an artifact from an
older change. On x86_64, this approach ended in unsable hybrid binaries
due to incompatible handling of non-trivial return values, i.e.
structures. See '-freg-struct-return' in GCC manual page:
"[...] If there is no standard convention, GCC defaults to
-fpcc-struct-return, except on targets where GCC is the principal
compiler. In those cases, we can choose the standard, and we chose
the more efficient register return alternative."
In other words: All x86_64 Linux systems break the ABI standard :-(
The thread ID reported to core was not always initialized prior the RPC
call. The 'startup_lock' ensures that the thread is completely
initialized before this information gets propagated.
Since the recent move of the process creation into core, the original chroot trampoline
mechanism implemented in 'os/src/app/chroot' does not work anymore. A
process could simply escape the chroot environment by spawning a new
process via core's PD service. Therefore, this patch moves the chroot
support into core. So the chroot policy becomes mandatory part of the
process creation. For each process created by core, core checks for
'root' argument of the PD session. If a path is present, core takes the
precautions needed to execute the new process in the specified chroot
environment.
This conceptual change implies minor changes with respect to the Genode
API and the configuration of the init process. The API changes are the
enhancement of the 'Genode::Child' and 'Genode::Process' constructors to
take the root path as argument. Init supports the specification of a
chroot per process by specifying the new 'root' attribute to the
'<start>' node of the process. In line with these changes, the
'Loader::Session::start' function has been enhanced with the additional
(optional) root argument.
When building in hybrid Linux/Genode mode, there exist two definitions
of 'size_t', one in the 'Genode' namespace and one imported from the
glibc headers.
On Linux, we use the session label for naming the corresponding Linux
process. When looking up the processes via 'ps', the Genode process
hierarchy becomes immediately visible.
Genode used to create new processes by directly forking from the
respective Genode parent using the process library. The forking process
created a PD session at core merely for propagating the PID of the new
process into core (for later destruction). This traditional mechanisms
has the following disadvantages:
First, the PID reported by the creating process to core cannot easily be
validated by core. Therefore core has to trust the PD client to not
specify a PID of an existing process, which would happen to be killed
once the PD session gets destructed. This problem is documented by
issue #318. Second, there is no way for a Genode process to detect the
failure of its any grandchildren. The immediate parent of a faulting
process could use the SIGCHLD-and-waitpid mechanism to observe its
children but this mechanism does not work transitively.
By performing the process creation exclusively within core, all Genode
processes become immediate child processes of core. Hence, core can
respond to failures of any of those processes and reflect such
conditions via core's session interfaces. Furthermore, the PID
associated to a PD session is locally known within core and cannot be
forged anymore. In fact, there is actually no need at all to make
processes aware of any PIDs of other processes.
Please note that this patch breaks the 'chroot' mechanism that comes in
the form of the 'os/src/app/chroot' program. Because all processes are
forked from core, a chroot'ed process could sneak outside its chroot
environment by just creating a new Genode process. To address this
issue, the chroot mechanism must be added to core.
This patch simplifies the system call bindings. The common syscall
bindings in 'src/platform/' have been reduced to the syscalls needed by
non-core programs. The additional syscalls that are needed solely by
core have been moved to 'src/core/include/core_linux_syscalls.h'.
Furthermore, the resource path is not used outside of core anymore.
Hence, we could get rid of the rpath library. The resource-path code has
been moved to 'src/core/include/resource_path.h'. The IPC-related parts
of 'src/platform' have been moved to the IPC library. So there is now a
clean separation between low-level syscall bindings (in 'src/platform')
and higher-level code.
The code for the socket-descriptor registry is now located in the
'src/base/ipc/socket_descriptor_registry.h' header. The interface is
separated from 'ipc.cc' because core needs to access the registry from
outside the ipc library.
This patch changes the way of how dataspace content is accessed by
processes outside of core. Dataspaces are opened by core only and the
corresponding file descriptors are handed out the other processes via
the 'Linux_dataspace::fd()' RPC function. At the client side, the
returned file descriptor is then used to mmap the file.
Consequently, this patch eliminates all files from 'lx_rpath'. The
path is still needed by core to temporarily create dataspaces and
unix domain sockets. However, those files are unlinked immediately
after their creation.
This patch alleviates the need for any non-core process to create Unix
domain sockets locally. All sockets used for RPC communication are
created by core and subsequently passed to the other processes via RPC
or the parent interface. The immediate benefit is that no process other
than core needs to access the 'rpath' directory in order to communicate.
However, access to 'rpath' is still needed for accessing dataspaces.
Core creates one socket pair per thread on demand on the first call of
the 'Linux_cpu_session::server_sd()' or 'Linux_cpu_session::client_sd()'
functions. 'Linux_cpu_session' is a Linux-specific extension to the CPU
session interface. In addition to the socket accessors, the extension
provides a mechanism to register the PID/TID of a thread. Those
information were formerly propagated into core along with the thread
name as argument to 'create_thread()'.
Because core creates socket pairs for entrypoints, it needs to know all
threads that are potential entrypoints. For lx_hybrid programs, we
hadn't had propagated any thread information into core, yet. Hence, this
patch also contains the code for registering threads of hybrid
applications at core.
This patch eliminates the thread ID portion of the 'Native_capability'
type. The access to entrypoints is now exclusively handled by passing
socket descripts over Unix domain sockets and by inheriting the socket
descriptor of the parent entrypoint at process-creation time.
Each entrypoint creates a socket pair. The server-side socket is bound
to a unique name defined by the server. The client-side socket is then
connected to the same name. Whereas the server-side socket is meant to
be exclusively used by the server to wait for incoming requests, the
client-side socket can be delegated to other processes as payload of RPC
messages (via SCM rights). Anyone who receives a capability over RPC
receives the client-side socket of the entrypoint to which the
capability refers. Given this socket descriptor, the unique name (as
defined by the server) can be requested using 'getpeername'. Using this
name, it is possible to compare socket descriptors, which is important
to avoid duplicates from polluting the limited socket-descriptor name
space.
Wheras this patch introduces capability-based delegation of access
rights to entrypoints, it does not cover the protection of the integrity
of RPC objects. RPC objects are still referenced by a global ID passed
as normal message payload.
This patch adds prinicipal support for transmitting socket descriptors
as RPC payload. Socket descriptors are handled by the linux-specific
implementation of the capability marshalling and unmarshalling functions
in 'ipc.h'. The 'Message' type in 'src/platform/linux_socket.h' has been
extended to carry multiple descriptors in a single message.
Unfortuately, we hit a problem (and potential show stopper) here:
lx_sendmsg failed with -109 in lx_call()
The error code corresponds to ETOOMANYREFS. There is only one place in
the Linux kernel where this error code is used (net/unix/af_unix.c).
The code for 'unix_attach_fds()' suggests that there is a limit with
regard to the maximum number of references for a given Unix domain
socket. When the error occurs, core and init are running. The socket
of core's server entrypoint is present in the '/proc/pid/fd' of those
processes 8 times. The error occurs when core tries to perform an
RPC to the entrypoint to perform 'Ram_session::transfer_quota()'
(base/include/base/child.h at line 248).
By storing the reply socket descriptor inside the 'Ipc_ostream::_dst'
capability instead as part of the connection state object, we can
use the 'explicit_reply' mechanism as usual. Right now, we store
both the tid and socket handle in 'Native_capability::Dst'. In the
final version, the 'tid' member will be gone.
In the final version, the 'socket' will be the only member to remain in
the 'Dst' time. In the transition phase, we store both the old 'tid' and
the 'socket'.