============================ Package management on Genode ============================ Norman Feske Motivation and inspiration ########################## The established system-integration work flow with Genode is based on the 'run' tool, which automates the building, configuration, integration, and testing of Genode-based systems. Whereas the run tool succeeds in overcoming the challenges that come with Genode's diversity of kernels and supported hardware platforms, its scalability is somewhat limited to appliance-like system scenarios: The result of the integration process is a system image with a certain feature set. Whenever requirements change, the system image is replaced with a new created image that takes those requirements into account. In practice, there are two limitations of this system-integration approach: First, since the run tool implicitly builds all components required for a system scenario, the system integrator has to compile all components from source. E.g., if a system includes a component based on Qt5, one needs to compile the entire Qt5 application framework, which induces significant overhead to the actual system-integration tasks of composing and configuring components. Second, general-purpose systems tend to become too complex and diverse to be treated as system images. When looking at commodity OSes, each installation differs with respect to the installed set of applications, user preferences, used device drivers and system preferences. A system based on the run tool's work flow would require the user to customize the run script of the system for each tweak. To stay up to date, the user would need to re-create the system image from time to time while manually maintaining any customizations. In practice, this is a burden, very few end users are willing to endure. The primary goal of Genode's package management is to overcome these scalability limitations, in particular: * Alleviating the need to build everything that goes into system scenarios from scratch, * Facilitating modular system compositions while abstracting from technical details, * On-target system update and system development, * Assuring the user that system updates are safe to apply by providing the ability to easily roll back the system or parts thereof to previous versions, * Securing the integrity of the deployed software, * Fostering a federalistic evolution of Genode systems, * Low friction for existing developers. The design of Genode's package-management concept is largely influenced by Git as well as the [https://nixos.org/nix/ - Nix] package manager. In particular the latter opened our eyes to discover the potential that lies beyond the package management employed in state-of-the art commodity systems. Even though we considered adapting Nix for Genode and actually conducted intensive experiments in this direction (thanks to Emery Hemingway who pushed forward this line of work), we settled on a custom solution that leverages Genode's holistic view on all levels of the operating system including the build system and tooling, source structure, ABI design, framework API, system configuration, inter-component interaction, and the components itself. Whereby Nix is designed for being used on top of Linux, Genode's whole-systems view led us to simplifications that eliminated the needs for Nix' powerful features like its custom description language. Nomenclature ############ When speaking about "package management", one has to clarify what a "package" in the context of an operating system represents. Traditionally, a package is the unit of delivery of a bunch of "dumb" files, usually wrapped up in a compressed archive. A package may depend on the presence of other packages. Thereby, a dependency graph is formed. To express how packages fit with each other, a package is usually accompanied with meta data (description). Depending on the package manager, package descriptions follow certain formalisms (e.g., package-description language) and express more-or-less complex concepts such as versioning schemes or the distinction between hard and soft dependencies. Genode's package management does not follow this notion of a "package". Instead of subsuming all deliverable content under one term, we distinguish different kinds of content, each in a tailored and simple form. To avoid the clash of the notions of the common meaning of a "package", we speak of "archives" as the basic unit of delivery. The following subsections introduce the different categories. Archives are named with their version as suffix, appended via a slash. The suffix is maintained by the author of the archive. The recommended naming scheme is the use of the release date as version suffix, e.g., 'report_rom/2017-05-14'. Raw-data archives ================= A raw-data archive contains arbitrary data that is - in contrast to executable binaries - independent from the processor architecture. Examples are configuration data, game assets, images, or fonts. The content of raw-data archives is expected to be consumed by components at runtime. It is not relevant for the build process for executable binaries. Each raw-data archive contains merely a collection of data files. There is no meta data. API archive =========== An API archive has the structure of a Genode source-code repository. It may contain all the typical content of such a source-code repository such as header files (in the _include/_ subdirectory), source codes (in the _src/_ subdirectory), library-description files (in the _lib/mk/_ subdirectory), or ABI symbols (_lib/symbols/_ subdirectory). At the top level, a LICENSE file is expected that clarifies the license of the contained source code. There is no meta data contained in an API archive. An API archive is meant to provide _ingredients_ for building components. The canonical example is the public programming interface of a library (header files) and the library's binary interface in the form of an ABI-symbols file. One API archive may contain the interfaces of multiple libraries. For example, the interfaces of libc and libm may be contained in a single "libc" API archive because they are closely related to each other. Conversely, an API archive may contain a single header file only. The granularity of those archives may vary. But they have in common that they are used at build time only, not at runtime. Source archive ============== Like an API archive, a source archive has the structure of a Genode source-tree repository and is expected to contain all the typical content of such a source repository along with a LICENSE file. But unlike an API archive, it contains descriptions of actual build targets in the form of Genode's usual 'target.mk' files. In addition to the source code, a source archive contains a file called 'used_apis', which contains a list of API-archive names with each name on a separate line. For example, the 'used_apis' file of the 'report_rom' source archive looks as follows: ! base/2017-05-14 ! os/2017-05-13 ! report_session/2017-05-13 The 'used_apis' file declares the APIs needed to incorporate into the build process when building the source archive. Hence, they represent _build-time_ _dependencies_ on the specific API versions. A source archive may be equipped with a top-level file called 'api' containing the name of exactly one API archive. If present, it declares that the source archive _implements_ the specified API. For example, the 'libc/2017-05-14' source archive contains the actual source code of the libc and libm as well as an 'api' file with the content 'libc/2017-04-13'. The latter refers to the API implemented by this version of the libc source package (note the differing versions of the API and source archives) Binary archive ============== A binary archive contains the build result of the equally-named source archive when built for a particular architecture. That is, all files that would appear at the _/bin/_ subdirectory when building all targets present in the source archive. There is no meta data present in a binary archive. A binary archive is created out of the content of its corresponding source archive and all API archives listed in the source archive's 'used_apis' file. Note that since a binary archive depends on only one source archive, which has no further dependencies, all binary archives can be built independently from each other. For example, a libc-using application needs the source code of the application as well as the libc's API archive (the libc's header file and ABI) but it does not need the actual libc library to be present. Package archive =============== A package archive contains an 'archives' file with a list of archive names that belong together at runtime. Each listed archive appears on a separate line. For example, the 'archives' file of the package archive for the window manager 'wm/2018-02-26' looks as follows: ! genodelabs/raw/wm/2018-02-14 ! genodelabs/src/wm/2018-02-26 ! genodelabs/src/report_rom/2018-02-26 ! genodelabs/src/decorator/2018-02-26 ! genodelabs/src/floating_window_layouter/2018-02-26 In contrast to the list of 'used_apis' of a source archive, the content of the 'archives' file denotes the origin of the respective archives ("genodelabs"), the archive type, followed by the versioned name of the archive. An 'archives' file may specify raw archives, source archives, or package archives (as type 'pkg'). It thereby allows the expression of _runtime dependencies_. If a package archive lists another package archive, it inherits the content of the listed archive. This way, a new package archive may easily customize an existing package archive. A package archive does not specify binary archives directly as they differ between the architecture and are already referenced by the source archives. In addition to an 'archives' file, a package archive is expected to contain a 'README' file explaining the purpose of the collection. Depot structure ############### Archives are stored within a directory tree called _depot/_. The depot is structured as follows: ! /pubkey ! /download ! /src/// ! /api/// ! /raw/// ! /pkg/// ! /bin//// The stands for the origin of the contained archives. For example, the official archives provided by Genode Labs reside in a _genodelabs/_ subdirectory. Within this directory, there is a 'pubkey' file with the user's public key that is used to verify the integrity of archives downloaded from the user. The file 'download' specifies the download location as an URL. Subsuming archives in a subdirectory that correspond to their origin (user) serves two purposes. First, it provides a user-local name space for versioning archives. E.g., there might be two versions of a 'nitpicker/2017-04-15' source archive, one by "genodelabs" and one by "nfeske". However, since each version resides under its origin's subdirectory, version-naming conflicts between different origins cannot happen. Second, by allowing multiple archive origins in the depot side-by-side, package archives may incorporate archives of different origins, which fosters the goal of a federalistic development, where contributions of different origins can be easily combined. The actual archives are stored in the subdirectories named after the archive types ('raw', 'api', 'src', 'bin', 'pkg'). Archives contained in the _bin/_ subdirectories are further subdivided in the various architectures (like 'x86_64', or 'arm_v7'). Depot management ################ The tools for managing the depot content reside under the _tool/depot/_ directory. When invoked without arguments, each tool prints a brief description of the tool and its arguments. Unless stated otherwise, the tools are able to consume any number of archives as arguments. By default, they perform their work sequentially. This can be changed by the '-j' argument, where denotes the desired level of parallelization. For example, by specifying '-j4' to the _tool/depot/build_ tool, four concurrent jobs are executed during the creation of binary archives. Downloading archives ==================== The depot can be populated with archives in two ways, either by creating the content from locally available source codes as explained by Section [Automated extraction of archives from the source tree], or by downloading ready-to-use archives from a web server. In order to download archives originating from a specific user, the depot's corresponding user subdirectory must contain two files: :_pubkey_: contains the public key of the GPG key pair used by the creator (aka "user") of the to-be-downloaded archives for signing the archives. The file contains the ASCII-armored version of the public key. :_download_: contains the base URL of the web server where to fetch archives from. The web server is expected to mirror the structure of the depot. That is, the base URL is followed by a sub directory for the user, which contains the archive-type-specific subdirectories. If both the public key and the download locations are defined, the download tool can be used as follows: ! ./tool/depot/download genodelabs/src/zlib/2018-01-10 The tool automatically downloads the specified archives and their dependencies. For example, as the zlib depends on the libc API, the libc API archive is downloaded as well. All archive types are accepted as arguments including binary and package archives. Furthermore, it is possible to download all binary archives referenced by a package archive. For example, the following command downloads the window-manager (wm) package archive including all binary archives for the 64-bit x86 architecture. Downloaded binary archives are always accompanied with their corresponding source and used API archives. ! ./tool/depot/download genodelabs/pkg/x86_64/wm/2018-02-26 Archive content is not downloaded directly to the depot. Instead, the individual archives and signature files are downloaded to a quarantine area in the form of a _public/_ directory located in the root of Genode's source tree. As its name suggests, the _public/_ directory contains data that is imported from or to-be exported to the public. The download tool populates it with the downloaded archives in their compressed form accompanied with their signatures. The compressed archives are not extracted before their signature is checked against the public key defined at _depot//pubkey_. If however the signature is valid, the archive content is imported to the target destination within the depot. This procedure ensures that depot content - whenever downloaded - is blessed by a cryptographic signature of its creator. Building binary archives from source archives ============================================= With the depot populated with source and API archives, one can use the _tool/depot/build_ tool to produce binary archives. The arguments have the form '/bin//' where '' stands for the targeted CPU architecture. For example, the following command builds the 'zlib' library for the 64-bit x86 architecture. It executes four concurrent jobs during the build process. ! ./tool/depot/build genodelabs/bin/x86_64/zlib/2018-01-10 -j4 Note that the command expects a specific version of the source archive as argument. The depot may contain several versions. So the user has to decide, which one to build. After the tool is finished, the freshly built binary archive can be found in the depot within the _genodelabs/bin////_ subdirectory. Only the final result of the built process is preserved. In the example above, that would be the _zlib.lib.so_ library. For debugging purposes, it might be interesting to inspect the intermediate state of the build. This is possible by adding 'KEEP_BUILD_DIR=1' as argument to the build command. The binary's intermediate build directory can be found besides the binary archive's location named with a '.build' suffix. By default, the build tool won't attempt to rebuild a binary archive that is already present in the depot. However, it is possible to force a rebuild via the 'REBUILD=1' argument. Publishing archives =================== Archives located in the depot can be conveniently made available to the public using the _tool/depot/publish_ tool. Given an archive path, the tool takes care of determining all archives that are implicitly needed by the specified one, wrapping the archive's content into compressed tar archives, and signing those. As a precondition, the tool requires you to possess the private key that matches the _depot//pubkey_ file within your depot. The key pair should be present in the key ring of your GNU privacy guard. To publish archives, one needs to specify the specific version to publish. For example: ! ./tool/depot/publish /pkg/x86_64/wm/2018-02-26 The command checks that the specified archive and all dependencies are present in the depot. It then proceeds with the archiving and signing operations. For the latter, the pass phrase for your private key will be requested. The publish tool prints the information about the processed archives, e.g.: ! publish /.../public//api/base/2018-02-26.tar.xz ! publish /.../public//api/framebuffer_session/2017-05-31.tar.xz ! publish /.../public//api/gems/2018-01-28.tar.xz ! publish /.../public//api/input_session/2018-01-05.tar.xz ! publish /.../public//api/nitpicker_gfx/2018-01-05.tar.xz ! publish /.../public//api/nitpicker_session/2018-01-05.tar.xz ! publish /.../public//api/os/2018-02-13.tar.xz ! publish /.../public//api/report_session/2018-01-05.tar.xz ! publish /.../public//api/scout_gfx/2018-01-05.tar.xz ! publish /.../public//bin/x86_64/decorator/2018-02-26.tar.xz ! publish /.../public//bin/x86_64/floating_window_layouter/2018-02-26.tar.xz ! publish /.../public//bin/x86_64/report_rom/2018-02-26.tar.xz ! publish /.../public//bin/x86_64/wm/2018-02-26.tar.xz ! publish /.../public//pkg/wm/2018-02-26.tar.xz ! publish /.../public//raw/wm/2018-02-14.tar.xz ! publish /.../public//src/decorator/2018-02-26.tar.xz ! publish /.../public//src/floating_window_layouter/2018-02-26.tar.xz ! publish /.../public//src/report_rom/2018-02-26.tar.xz ! publish /.../public//src/wm/2018-02-26.tar.xz According to the output, the tool populates a directory called _public/_ at the root of the Genode source tree with the to-be-published archives. The content of the _public/_ directory is now ready to be copied to a web server, e.g., by using rsync. Automated extraction of archives from the source tree ##################################################### Genode users are expected to populate their local depot with content obtained via the _tool/depot/download_ tool. However, Genode developers need a way to create depot archives locally in order to make them available to users. Thanks to the _tool/depot/extract_ tool, the assembly of archives does not need to be a manual process. Instead, archives can be conveniently generated out of the source codes present in the Genode source tree and the _contrib/_ directory. However, the granularity of splitting source code into archives, the definition of what a particular API entails, and the relationship between archives must be augmented by the archive creator as this kind of information is not present in the source tree as is. This is where so-called "archive recipes" enter the picture. An archive recipe defines the content of an archive. Such recipes can be located at an _recipes/_ subdirectory of any source-code repository, similar to how port descriptions and run scripts are organized. Each _recipe/_ directory contains subdirectories for the archive types, which, in turn, contain a directory for each archive. The latter is called a _recipe directory_. Recipe directory ---------------- The recipe directory is named after the archive _omitting the archive version_ and contains at least one file named _hash_. This file defines the version of the archive along with a hash value of the archive's content separated by a space character. By tying the version name to a particular hash value, the _extract_ tool is able to detect the appropriate points in time whenever the version should be increased due to a change of the archive's content. API, source, and raw-data archive recipes ----------------------------------------- Recipe directories for API, source, or raw-data archives contain a _content.mk_ file that defines the archive content in the form of make rules. The content.mk file is executed from the archive's location within the depot. Hence, the contained rules can refer to archive-relative files as targets. The first (default) rule of the content.mk file is executed with a customized make environment: :GENODE_DIR: A variable that holds the path to root of the Genode source tree, :REP_DIR: A variable with the path to source code repository where the recipe is located :port_dir: A make function that returns the directory of a port within the _contrib/_ directory. The function expects the location of the corresponding port file as argument, for example, the 'zlib' recipe residing in the _libports/_ repository may specify '$(REP_DIR)/ports/zlib' to access the 3rd-party zlib source code. Source archive recipes contain simplified versions of the 'used_apis' and (for libraries) 'api' files as found in the archives. In contrast to the depot's counterparts of these files, which contain version-suffixed names, the files contained in recipe directories omit the version suffix. This is possible because the extract tool always extracts the _current_ version of a given archive from the source tree. This current version is already defined in the corresponding recipe directory. Package-archive recipes ----------------------- The recipe directory for a package archive contains the verbatim content of the to-be-created package archive except for the _archives_ file. All other files are copied verbatim to the archive. The content of the recipe's _archives_ file may omit the version information from the listed ingredients. Furthermore, the user part of each entry can be left blank by using '_' as a wildcard. When generating the package archive from the recipe, the extract tool will replace this wildcard with the user that creates the archive. Convenience front-end to the extract, build tools ################################################# For developers, the work flow of interacting with the depot is most often the combination of the _extract_ and _build_ tools whereas the latter expects concrete version names as arguments. The _create_ tool accelerates this common usage pattern by allowing the user to omit the version names. Operations implicitly refer to the _current_ version of the archives as defined in the recipes. Furthermore, the _create_ tool is able to manage version updates for the developer. If invoked with the argument 'UPDATE_VERSIONS=1', it automatically updates hash files of the involved recipes by taking the current date as version name. This is a valuable assistance in situations where a commonly used API changes. In this case, the versions of the API and all dependent archives must be increased, which would be a labour-intensive task otherwise. If the depot already contains an archive of the current version, the create tools won't re-create the depot archive by default. Local modifications of the source code in the repository do not automatically result in a new archive. To ensure that the depot archive is current, one can specify 'FORCE=1' to the create tool. With this argument, existing depot archives are replaced by freshly extracted ones and version updates are detected. When specified for creating binary archives, 'FORCE=1' normally implies 'REBUILD=1'. To prevent the superfluous rebuild of binary archives whose source versions remain unchanged, 'FORCE=1' can be combined with the argument 'REBUILD='. Accessing depot content from run scripts ######################################## The depot tools are not meant to replace the run tool but rather to complement it. When both tools are combined, the run tool implicitly refers to "current" archive versions as defined for the archive's corresponding recipes. This way, the regular run-tool work flow can be maintained while attaining a productivity boost by fetching content from the depot instead of building it. Run scripts can use the 'import_from_depot' function to incorporate archive content from the depot into a scenario. The function must be called after the 'create_boot_directory' function and takes any number of pkg, src, or raw archives as arguments. An archive is specified as depot-relative path of the form '//name'. Run scripts may call 'import_from_depot' repeatedly. Each argument can refer to a specific version of an archive or just the version-less archive name. In the latter case, the current version (as defined by a corresponding archive recipe in the source tree) is used. If a 'src' archive is specified, the run tool integrates the content of the corresponding binary archive into the scenario. The binary archives are selected according the spec values as defined for the build directory.