genode/doc/future_optimizations.txt


             Future optimizations of Genode

                     Norman Feske

Abstract
########

This document outlines possible optimizations for Genode.
Genode was initially meant as a feasibility study.
Therefore, optimizing performance and memory consumption was
not a primary goal of the experiment. However, there are
several ideas for improving these properties. These ideas are
definitely not to be regarded as ToDo items. Each idea should
be pursued only if it fixes a _real_ hotspot in the system
that was found by quantitative analysis.


Memory consumption
##################

Currently, we use an AVL-tree-based best-fit memory allocator
('Allocator_avl') as the basis of the heap. This allocator was
originally intended to be used only as the root allocator for
physical memory inside Core. Using it for the heap as well
lets us reuse the code, reducing the overall code complexity.

The AVL-tree-based allocator uses a meta-data entry of 32
bytes per free or used memory block. This means that each
memory allocation from the heap introduces a meta-data
overhead of 32 to 64 bytes. When allocating many small
objects, this overhead is very high relative to the payload.

:Question:: Is this issue a real bottleneck?

Possible improvements are:

* Using slab allocators for known allocation sizes.
  Slab entries have a much smaller footprint.
* Creating a list-based allocator implementation for
  the heap. This comes at the cost of additional
  code complexity.
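
The first option can be illustrated with a minimal sketch of a
fixed-size slab. It is not Genode's slab implementation. The class
'Simple_slab' and its interface are hypothetical names chosen for
this example. The point it shows is that free entries can be chained
through their own storage, so a used entry needs no separate
meta-data record.

```cpp
#include <cassert>
#include <cstddef>

/*
 * Hypothetical fixed-size slab (illustration only, not Genode code).
 * All entries have the same size, so no per-block size or tree node
 * has to be stored. A free entry holds the free-list link in the
 * memory that later serves as payload.
 */
template <std::size_t ENTRY_SIZE, std::size_t NUM_ENTRIES>
class Simple_slab
{
	private:

		union Entry
		{
			Entry *next;  /* used only while the entry is free */
			alignas(std::max_align_t) unsigned char payload[ENTRY_SIZE];
		};

		Entry       _entries[NUM_ENTRIES];
		Entry      *_free_list = nullptr;
		std::size_t _used      = 0;

	public:

		Simple_slab()
		{
			/* put every entry on the free list */
			for (std::size_t i = 0; i < NUM_ENTRIES; i++) {
				_entries[i].next = _free_list;
				_free_list       = &_entries[i];
			}
		}

		/* return pointer to a free entry, or nullptr if the slab is full */
		void *alloc()
		{
			if (!_free_list)
				return nullptr;

			Entry *entry = _free_list;
			_free_list   = entry->next;
			_used++;
			return entry->payload;
		}

		/* hand an entry obtained via 'alloc' back to the slab */
		void free(void *ptr)
		{
			assert(ptr && _used > 0);

			Entry *entry = static_cast<Entry *>(ptr);
			entry->next  = _free_list;
			_free_list   = entry;
			_used--;
		}

		std::size_t used() const { return _used; }
};
```

The trade-off is that such a slab serves only one allocation size and
has a fixed capacity. Whether this pays off depends on the answer to
the question above.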


RPC Performance
###############

We use C++ streams for RPC communication and thereby
introduce run-time overhead for the dynamic marshaling.
Using an IDL compiler to generate the marshaling code
statically would likely improve the RPC performance.
Open questions: Is the RPC performance a real performance
problem? What is the profile of RPC usage (number of RPCs
per second, percentage of CPU time spent on RPCs, secondary
effects of RPCs such as cache and TLB pollution)?
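
The difference between the two approaches can be sketched as
follows. This is a simplified illustration, not Genode's RPC code.
'Msg_stream', 'Add_request', 'OPCODE_ADD', and both dispatch
functions are hypothetical names. The stream variant copies each
argument and advances a position at run time, whereas the struct
variant stands for what generated stub code might look like, with
the message layout fixed at compile time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

enum { OPCODE_ADD = 1 };

/* hypothetical stream-style message buffer (dynamic marshaling) */
class Msg_stream
{
	private:

		unsigned char _buf[64];
		std::size_t   _write_pos = 0;
		std::size_t   _read_pos  = 0;

	public:

		/* append one value, advancing the write position at run time */
		template <typename T>
		Msg_stream &operator << (T const &value)
		{
			assert(_write_pos + sizeof(T) <= sizeof(_buf));
			std::memcpy(_buf + _write_pos, &value, sizeof(T));
			_write_pos += sizeof(T);
			return *this;
		}

		/* extract one value, advancing the read position at run time */
		template <typename T>
		Msg_stream &operator >> (T &value)
		{
			assert(_read_pos + sizeof(T) <= _write_pos);
			std::memcpy(&value, _buf + _read_pos, sizeof(T));
			_read_pos += sizeof(T);
			return *this;
		}
};

/* hypothetical statically laid-out message, as a stub generator might emit */
struct Add_request
{
	int opcode;
	int a;
	int b;
};

/* server side, dynamic variant: arguments are pulled out one by one */
inline int dispatch_stream(Msg_stream &msg)
{
	int opcode = 0, a = 0, b = 0;
	msg >> opcode >> a >> b;
	return (opcode == OPCODE_ADD) ? a + b : -1;
}

/* server side, static variant: arguments are plain struct members */
inline int dispatch_static(Add_request const &request)
{
	return (request.opcode == OPCODE_ADD) ? request.a + request.b : -1;
}
```

Both variants compute the same result. How much the static variant
actually saves can only be determined by measuring the profile
asked for above.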


Locking
#######

On L4v2-Genode, locking is implemented via yielding spin locks.
We may consider an L4env-like lock implementation.
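
For readers unfamiliar with the term, the following sketch shows the
general idea of a yielding spin lock. It uses standard C++ facilities
instead of L4 system calls and is not the actual Genode code. The
class name 'Yielding_spin_lock' is an assumption made for this
example.

```cpp
#include <atomic>
#include <thread>

/*
 * Hypothetical yielding spin lock (illustration only). A thread that
 * finds the lock taken gives up its time slice and retries later
 * instead of busy-waiting. It is never put to sleep on the lock
 * itself and is not woken up explicitly on unlock.
 */
class Yielding_spin_lock
{
	private:

		std::atomic_flag _locked = ATOMIC_FLAG_INIT;

	public:

		void lock()
		{
			/* retry until we are the one who sets the flag */
			while (_locked.test_and_set(std::memory_order_acquire))
				std::this_thread::yield();
		}

		/* single attempt, returns true if the lock was acquired */
		bool try_lock()
		{
			return !_locked.test_and_set(std::memory_order_acquire);
		}

		void unlock()
		{
			_locked.clear(std::memory_order_release);
		}
};
```

Under contention, each waiting thread keeps getting scheduled just to
find the lock still taken. A lock that blocks waiters until the
holder releases it avoids these wasted time slices at the price of a
more complex implementation.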


Misc
####

Take a look at include/util/string.h and judge for yourself :-)
Imported Genode release 11.11 2011-12-22 15:19:25 +00:00