Future optimizations of Genode

Norman Feske

Abstract
########

This document outlines possible optimizations for Genode.
Genode was originally meant as a feasibility study.
Therefore, optimizing performance and memory consumption was
not a primary goal of the experiment. However, there are
several ideas for improving these properties. These ideas are
definitely not to be regarded as ToDo items. Each idea should
be pursued only if it fixes a _real_ hotspot in the system
that was found by quantitative analysis.


Memory consumption
##################

Currently, we use an AVL-tree-based best-fit memory allocator
('Allocator_avl') as heap. Although this allocator was
intended to be used only as the root allocator for physical
memory inside Core, we use it as the basis for the current
heap implementation. This way, we can reuse the code,
reducing the overall code complexity.

The AVL-tree-based allocator uses a meta-data entry of 32
bytes per free or used memory block. This means that each
memory allocation from the heap introduces a meta-data
overhead of 32 to 64 bytes. When allocating many small
objects, this overhead can exceed the size of the objects
themselves.

:Question:: Is this issue a real bottleneck?

Possible improvements are:

* Using slab allocators for known allocation sizes.
  Slab entries have a much smaller footprint.
* Creating a list-based allocator implementation for
  the heap. This comes at the cost of additional
  code complexity.

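To put the numbers above into perspective: with 32 to 64 bytes
of meta data per allocation, a 16-byte object taken from the
heap occupies 48 to 80 bytes in total. The slab idea can be
sketched as follows. This is a minimal illustration, not
Genode's actual slab implementation. The class 'Simple_slab'
and its interface are assumptions made up for this example.
Free entries are chained through their own storage, so a used
entry needs no separate meta-data record.

```cpp
#include <cstddef>

// Hypothetical fixed-size slab for objects of type T with room
// for N entries. The name and interface are illustrative only.
template <typename T, std::size_t N>
class Simple_slab
{
    private:

        // An entry is either a link in the free list or the
        // storage of one object, never both at the same time.
        union Entry
        {
            Entry *next;
            alignas(T) unsigned char payload[sizeof(T)];
        };

        Entry       _entries[N];
        Entry      *_free = nullptr;
        std::size_t _used = 0;

    public:

        Simple_slab()
        {
            // chain all entries into the free list
            for (std::size_t i = 0; i < N; i++) {
                _entries[i].next = _free;
                _free = &_entries[i];
            }
        }

        // return storage for one object, or nullptr if exhausted
        void *alloc()
        {
            if (!_free) return nullptr;

            Entry *e = _free;
            _free = e->next;
            _used++;
            return e->payload;
        }

        // hand back storage previously obtained via 'alloc'
        void release(void *ptr)
        {
            Entry *e = reinterpret_cast<Entry *>(ptr);
            e->next = _free;
            _free = e;
            _used--;
        }

        std::size_t used() const { return _used; }

        // bytes occupied per entry, including all bookkeeping
        static constexpr std::size_t entry_size() { return sizeof(Entry); }
};
```

In this sketch, the per-entry cost is the larger of the object
size and one pointer. The price is that each slab serves only
one allocation size and has a fixed capacity, which is why the
approach fits known allocation sizes only.
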

RPC Performance
###############

We use C++ streams for RPC communication and thereby
introduce run-time overhead for dynamic marshaling.
Using an IDL compiler to generate the marshaling code
statically would improve the RPC performance.
Is the RPC performance a real problem? What is the profile
of RPC usage (number of RPCs per second, percentage of CPU
time spent on RPCs, secondary effects of RPCs such as cache
and TLB pollution)?

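To illustrate what dynamic marshaling via stream operators
means, here is a minimal sketch. The class 'Msg_stream' and
the struct 'Add_request' are hypothetical names invented for
this example and do not mirror Genode's IPC classes. Each
argument is copied separately and bounds-checked at run time,
whereas a statically generated stub could use a message layout
that is fixed at compile time.

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical message buffer with stream-style operators.
class Msg_stream
{
    private:

        unsigned char _buf[256];
        std::size_t   _write_pos = 0;
        std::size_t   _read_pos  = 0;
        bool          _error     = false;

    public:

        // marshal one plain-data argument, checked at run time
        template <typename T>
        Msg_stream &operator << (T const &value)
        {
            static_assert(std::is_trivially_copyable<T>::value,
                          "only plain data can be marshaled");

            if (_write_pos + sizeof(T) > sizeof(_buf)) {
                _error = true;
                return *this;
            }
            std::memcpy(_buf + _write_pos, &value, sizeof(T));
            _write_pos += sizeof(T);
            return *this;
        }

        // unmarshal one argument in the order it was written
        template <typename T>
        Msg_stream &operator >> (T &value)
        {
            if (_read_pos + sizeof(T) > _write_pos) {
                _error = true;
                return *this;
            }
            std::memcpy(&value, _buf + _read_pos, sizeof(T));
            _read_pos += sizeof(T);
            return *this;
        }

        bool        error() const { return _error; }
        std::size_t size()  const { return _write_pos; }
};

// What a generated stub could use instead: the layout is known
// at compile time, so the whole request is one fixed-size copy.
struct Add_request { int opcode; int a; int b; };
```

A caller would write 'msg << opcode << a << b' on one side
and read the values back in the same order on the other side.
Whether the per-argument checks and copies matter in practice
is exactly the open question raised above.
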

Locking
#######

On L4v2-Genode, locking is implemented via yielding spin locks.
We may consider an L4env-like lock implementation.

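For readers unfamiliar with the term, a yielding spin lock can
be sketched as follows. This is an illustration using the C++
standard library as a stand-in for the kernel primitives. The
class 'Yielding_spin_lock' is a hypothetical name and not the
code used on L4v2.

```cpp
#include <atomic>
#include <thread>

// Hypothetical yielding spin lock built on standard C++.
class Yielding_spin_lock
{
    private:

        std::atomic<bool> _locked { false };

    public:

        // try to take the lock once, return true on success
        bool try_lock()
        {
            return !_locked.exchange(true, std::memory_order_acquire);
        }

        // spin until the lock is free, giving up the CPU
        // after each failed attempt
        void lock()
        {
            while (!try_lock())
                std::this_thread::yield();
        }

        void unlock()
        {
            _locked.store(false, std::memory_order_release);
        }
};
```

With this scheme, a waiting thread is scheduled again and
again just to find the lock still taken. A lock that blocks
waiters until the holder releases it would avoid this cost
under contention, at the price of a more complex
implementation.
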

Misc
####

Take a look at include/util/string.h and judge for yourself :-)