File.........: overview.txt
Content......: Overview of how crosstool-NG works.
Copyright....: (C) 2007 Yann E. MORIN
License......: Creative Commons Attribution Share Alike (CC-by-sa), v2.5

____________________
/ Table Of Contents /
_________________/

Introduction
History
Installing crosstool-NG
  Install method
  The hacker's way
  Preparing for packaging
  Shell completion
  Contributed code
Configuring crosstool-NG
  Interesting config options
  Re-building an existing toolchain
Running crosstool-NG
  Stopping and restarting a build
  Building all toolchains at once
  Overriding the number of // jobs
  Note on // jobs
Using the toolchain
Toolchain types
Internals
  Makefile front-end
  Kconfig parser
  Architecture-specific
  Kernel specific
  Adding a new version of a component
  Build scripts

________________
/ Introduction /
_____________/

crosstool-NG aims at building toolchains. A toolchain is an essential component
of a software development project: it compiles, assembles and links the code
that is being developed. Some pieces of the toolchain even end up in the
resulting binaries: static libraries are but one example.

A toolchain is thus a very sensitive piece of software, as any bug in one of
its components, or a poorly configured component, can lead to execution
problems, ranging from poor performance, to applications ending unexpectedly,
to misbehaving software (which more often than not is hard to detect), to
hardware damage, or even to risks for humans (which is more than regrettable).

Toolchains are made of different pieces of software, each being quite complex
and requiring specially crafted options to build and work seamlessly. This is
usually not that easy, even in the not-so-trivial case of native toolchains.
The work reaches a higher degree of complexity when it comes to
cross-compilation, where it can become quite a nightmare...

Some cross-toolchains exist on the internet, and can be used for general
development, but they have a number of limitations:
  - they can be general purpose, in that they are configured for the majority:
    no optimisation for your specific target,
  - they can be prepared for a specific target, and thus not easy to use for,
    optimised for, or even supporting, your own target,
  - they often use aging components (compiler, C library, etc...) that do not
    support special features of your shiny new processor.
On the other hand, these toolchains offer some advantages:
  - they are ready to use and quite easy to install and set up,
  - they are proven, being used by a wide community.

But once you want to get all the juice out of your specific hardware, you will
want to build your own toolchain. This is where crosstool-NG comes into play.

There are also a number of tools that build toolchains for specific needs,
which are not really scalable. Examples are:
  - buildroot (buildroot.uclibc.org), whose main purpose is to build root file
    systems, hence the name. But once you have your toolchain with buildroot,
    part of it is installed in the root-to-be, so if you want to build a whole
    new root, you either have to save the existing one as a template and
    restore it later, or start again from scratch. This is not convenient,
  - ptxdist (www.pengutronix.de/software/ptxdist), whose purpose is very
    similar to buildroot,
  - other projects (openembedded.org for example), which are again used to
    build root file systems.

crosstool-NG is really targeted at building toolchains, and only toolchains.
It is then up to you to use it the way you want.
___________
/ History /
________/

crosstool was first 'conceived' by Dan Kegel, who offered it to the community
as a set of scripts, a repository of patches, and some pre-configured, general
purpose setup files to be used to configure crosstool. This is available at
http://www.kegel.com/crosstool, and the subversion repository is hosted on
google at http://code.google.com/p/crosstool/.

I once managed to add support for uClibc-based toolchains, but it did not make
it into mainline, mostly because I didn't have time to port the patch forward
to the new versions, due in part to the big effort it was taking.

So I decided to clean up crosstool in the state it was, re-order things in
place, and add appropriate support for what I needed, that is uClibc support
and a menu-driven configuration. I named the new implementation crosstool-NG
(standing for crosstool Next Generation, as many other community projects do,
and as a wink at the TV series "Star Trek: The Next Generation" ;-) ), and
made it available to the community, in case it was of interest to anyone.

___________________________
/ Installing crosstool-NG /
________________________/

There are two ways you can use crosstool-NG:
  - build and install it, then get rid of the sources like you'd do for most
    programs,
  - or only build it and run it from the source directory.

The former should be used if you got crosstool-NG from a packaged tarball, see
"Install method", below, while the latter is most useful for developers that
checked the code out from SVN and want to submit patches, see "The Hacker's
way", below.

Install method |
---------------+

If you go for the install, then you just follow the classical, yet easy,
./configure way:
  ./configure --prefix=/some/place
  make
  make install
  export PATH="${PATH}:/some/place/bin"

You can then get rid of the crosstool-NG sources. Next, create a directory to
serve as a working place, cd in there and run:
  ct-ng help

See below for complete usage.

The Hacker's way |
-----------------+

If you go the hacker's way, then the usage is a bit different, although very
simple:
  ./configure --local
  make
  make install

Now, *do not* remove the crosstool-NG sources. They are needed to run
crosstool-NG! Stay in the directory holding the sources, and run:
  ./ct-ng help

See below for complete usage.

Now, provided you checked-out the code, you can send me your interesting
changes by running:
  svn diff

and mailing me the result! :-P

Preparing for packaging |
------------------------+

If you plan on packaging crosstool-NG, you surely don't want to install it in
your root file system. The install procedure of crosstool-NG honors the
DESTDIR variable:
  ./configure --prefix=/usr
  make
  make DESTDIR=/packaging/place install

Shell completion |
-----------------+

crosstool-NG comes with a shell script fragment that defines bash-compatible
completion. That shell fragment is currently not installed automatically, but
this is planned.

To install the shell script fragment, you have two options:
  - install system-wide, most probably by copying ct-ng.comp into
    /etc/bash_completion.d/
  - install for a single user, by copying ct-ng.comp into ${HOME}/ and
    sourcing this file from your ${HOME}/.bashrc
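For example, assuming ct-ng.comp sits in the current directory, either of the
following could do (the destination paths are the usual ones, but may differ
on your system):
  # System-wide installation (needs appropriate privileges):
  cp ct-ng.comp /etc/bash_completion.d/

  # Single-user installation:
  cp ct-ng.comp "${HOME}/"
  echo '. "${HOME}/ct-ng.comp"' >>"${HOME}/.bashrc"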
Contributed code |
-----------------+

Some people contributed code that couldn't get merged for various reasons.
This code is available as patches in the contrib/ sub-directory. These patches
are to be applied to the source of crosstool-NG, prior to installing.

An easy way to use contributed code is to pass the --with-contrib= option to
./configure. The possible values depend upon which contributions are packaged
with your version, but you can use one of these two special values:
  --with-contrib=list   will list all available contributions
  --with-contrib=all    will select all available contributions

There is no guarantee that a particular contribution applies to the current
version of crosstool-NG, or that it will work at all. Use contributions at
your own risk.

____________________________
/ Configuring crosstool-NG /
_________________________/

crosstool-NG is configured with a configurator presenting a menu-structured
set of options. These options let you specify the way you want your toolchain
built, where you want it installed, what architecture and specific processor
it will support, the versions of the components you want to use, etc... The
values for those options are then stored in a configuration file.

The configurator works the same way you configure your Linux kernel. It is
assumed you know how to handle this.

To enter the menu, type:
  ct-ng menuconfig

Almost every config item has a help entry. Read them carefully.

String and number options can refer to environment variables. In such a case,
you must use the shell syntax: ${VAR}. You shall neither single- nor
double-quote the string/number options.

There are three environment variables that are computed by crosstool-NG, and
that you can use:

CT_TARGET:
  It represents the target tuple you are building for. You can use it for
  example in the installation/prefix directory, such as:
    /opt/x-tools/${CT_TARGET}

CT_TOP_DIR:
  The top directory where crosstool-NG is running. You shouldn't need it in
  most cases. There is one case where you may need it: if you have local
  patches and you store them in your running directory, you can refer to them
  by using CT_TOP_DIR, such as:
    ${CT_TOP_DIR}/patches.myproject

CT_VERSION:
  The version of crosstool-NG you are using. Not much use for you, but it's
  there if you need it.

Interesting config options |
---------------------------+

CT_LOCAL_TARBALLS_DIR:
  If you already have some tarballs in a directory, enter it here. That will
  speed up the retrieving phase, where crosstool-NG would otherwise download
  those tarballs.

CT_PREFIX_DIR:
  This is where the toolchain will be installed (and, for now, where it will
  run from). A common practice is to add the target tuple in the directory
  path, such as (see above):
    /opt/x-tools/${CT_TARGET}

CT_TARGET_VENDOR:
  An identifier for your toolchain, which will be used as the vendor part of
  the target tuple. It shall *not* contain spaces or dashes. Usually, keep it
  to a one-word string, or use underscores to separate words if you need.
  Avoid dots, commas, and special characters.

CT_TARGET_ALIAS:
  An alias for the toolchain. It will be used as a prefix to the toolchain
  tools. For example, you will have ${CT_TARGET_ALIAS}-gcc

Also, if you think you don't see enough versions, you can try to enable one of
those:

CT_OBSOLETE:
  Show obsolete versions or tools. Most of the time, you don't want to base
  your toolchain on too old a version (of gcc, for example). But at times, it
  can come in handy to use such an old version for regression tests. Those old
  versions are hidden behind CT_OBSOLETE.

CT_EXPERIMENTAL:
  Show experimental versions or tools. Again, you might not want to base your
  toolchain on too recent tools (eg. gcc) for production. But if you need a
  feature present only in a recent version, or a new tool, you can find them
  hidden behind CT_EXPERIMENTAL.
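As an illustration, after configuration the resulting .config file could
contain entries such as these (the values are purely examples; the surrounding
double quotes are added by the configurator itself, you do not type them in
the menu):
  CT_LOCAL_TARBALLS_DIR="${HOME}/src"
  CT_PREFIX_DIR="/opt/x-tools/${CT_TARGET}"
  CT_TARGET_VENDOR="acme"
  CT_TARGET_ALIAS="arm-linux"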
Re-building an existing toolchain |
----------------------------------+

If you have an existing toolchain, you can re-use the options used to build it
to create a new toolchain. That requires only very little effort on your side,
and is quite easy. The options used to build a toolchain are saved with the
toolchain, and you can retrieve this configuration by running:
  ${CT_TARGET}-config

This will dump the configuration to stdout, so to rebuild a toolchain with
this configuration, the following is all you need to do:
  ${CT_TARGET}-config >.config

Then, you can review and change the configuration by running:
  ct-ng menuconfig

________________________
/ Running crosstool-NG /
_____________________/

To build the toolchain, simply type:
  ct-ng build

This will use the above configuration to retrieve, extract and patch the
components, then build, install and eventually test your newly built
toolchain.

You are then free to add the toolchain's /bin directory to your PATH to use it
at will.

In any case, you can get some terse help. Just type:
  ct-ng help
or:
  man 1 ct-ng

Stopping and restarting a build |
--------------------------------+

If you want to stop the build after a step you are debugging, you can pass the
STOP variable to ct-ng:
  ct-ng STOP=some_step

Conversely, if you want to restart a build at a specific step you are
debugging, you can pass the RESTART variable to ct-ng:
  ct-ng RESTART=some_step

Alternatively, you can call ct-ng with the name of a step to just do that
step:
  ct-ng libc_headers
is equivalent to:
  ct-ng RESTART=libc_headers STOP=libc_headers

The shortcuts +step_name and step_name+ allow you to stop at, or restart from,
that step, respectively. Thus:
  ct-ng +libc_headers
and:
  ct-ng libc_headers+
are equivalent to:
  ct-ng STOP=libc_headers
and:
  ct-ng RESTART=libc_headers

To obtain the list of acceptable steps, please call:
  ct-ng list-steps

Note that in order to restart a build, you'll have to say 'Y' to the config
option CT_DEBUG_CT_SAVE_STEPS, and the previous build must have effectively
reached that step.

Building all toolchains at once |
--------------------------------+

You can build all samples; simply call:
  ct-ng build-all

Overriding the number of // jobs |
---------------------------------+

If you want to override the number of jobs to run in // (the -j option to
make), you can either re-enter the menuconfig, or simply add it on the command
line, as such:
  ct-ng build.4

which tells crosstool-NG to override the number of // jobs to 4.

You can see the actions that support overriding the number of // jobs in the
help menu. Those are the ones with [.#] after them (eg. build[.#] or
build-all[.#], and so on...).

Note on // jobs |
----------------+

The crosstool-NG script 'ct-ng' is a Makefile script. It does *not* execute in
parallel (there is not much to gain). When speaking of // jobs, we are
referring to the number of // jobs when making the *components*. That is, we
speak of the number of // jobs used to build gcc, glibc, and so on...

_______________________
/ Using the toolchain /
____________________/

Using the toolchain is as simple as adding the toolchain's bin directory to
your PATH, such as:
  export PATH="${PATH}:/your/toolchain/path/bin"

and then using the target tuple to tell the build systems to use your
toolchain:
  ./configure --target=your-target-tuple
or:
  make CC=your-target-tuple-gcc
or:
  make CROSS_COMPILE=your-target-tuple-
and so on...
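For example, a quick sanity check of a freshly built toolchain could look like
this (the path and the tuple are placeholders; substitute your own):
  export PATH="${PATH}:/your/toolchain/path/bin"
  your-target-tuple-gcc -o hello hello.c
  file hello    # should report an executable for your target,
                # not for your build machine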
It is strongly advised not to use the toolchain sys-root directory as an
install directory for your programs/packages. If you do so, you will not be
able to use your toolchain for another project. It is even strongly advised
that your toolchain be chmod-ed to read-only once successfully built, so that
you don't go polluting your toolchain with your programs'/packages' files.

Thus, when you build a program/package, install it in a separate directory,
eg. /your/root. This directory is the /image/ of what would be in the root
file system of your target, and will contain all that your programs/packages
have installed.

When your root directory is ready, it is still missing some important bits:
the toolchain's libraries. To populate your root directory with those libs,
just run:
  your-target-tuple-populate -s /your/root -d /your/root-populated

This will copy /your/root into /your/root-populated, and put the needed, and
only the needed, libraries there. Thus you don't pollute /your/root with any
cruft that would no longer be needed should you have to remove stuff;
/your/root always contains only those things you installed in it.

You can then use /your/root-populated to build up your file system image, a
tarball, or to NFS-mount it from your target, or whatever you need.

populate accepts the following options:
  -s [src_dir]  use 'src_dir' as the 'source', un-populated root directory
  -d [dst_dir]  put the 'destination', populated root directory in 'dst_dir'
  -f            remove 'dst_dir' if it previously existed
  -v            be verbose, and tell what's going on (you can see exactly
                where libs are coming from)
  -h            print the help

___________________
/ Toolchain types /
________________/

There are four kinds of toolchains you could encounter.

First off, you must understand the following: when it comes to compilers,
there are up to four machines involved:
  1) the machine configuring the toolchain components: the config machine
  2) the machine building the toolchain components:    the build machine
  3) the machine running the toolchain:                the host machine
  4) the machine the toolchain is generating code for: the target machine

We can most of the time assume that the config machine and the build machine
are the same. The only time they differ is when using distributed compilation
(such as distcc). Let's forget this for the sake of simplicity.

So we're left with three machines:
  - build
  - host
  - target

Any toolchain will involve those three machines. You can be as sure of this as
"2 and 2 are 4". Here is how they come into play:

1) build == host == target
   This is a plain native toolchain, targeting the exact same machine as the
   one it is built on, and running again on this exact same machine. You have
   to build such a toolchain when you want to use an updated component, such
   as a newer gcc for example.
   crosstool-NG calls it "native".

2) build == host != target
   This is a classic cross-toolchain, which is expected to be run on the same
   machine it is compiled on, and to generate code to run on a second machine,
   the target.
   crosstool-NG calls it "cross".

3) build != host == target
   Such a toolchain is also a native toolchain, as it targets the same machine
   as it runs on. But it is built on another machine. You want such a
   toolchain when porting to a new architecture, or if the build machine is
   much faster than the host machine.
   crosstool-NG calls it "cross-native".

4) build != host != target
   This one is called a canadian-toolchain (*), and is tricky. The three
   machines in play are all different. You might want such a toolchain if you
   have a fast build machine, but the users will run the toolchain on another
   machine, and it will produce code to run on a third machine.
   crosstool-NG calls it "canadian".
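To make this more concrete, here is how the three machines would map onto the
classical ./configure options if you were configuring, say, gcc by hand (the
tuples below are purely illustrative; crosstool-NG computes and passes them
for you):
  # 1) native:       build == host == target
  ./configure --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu \
              --target=x86_64-pc-linux-gnu
  # 2) cross:        build == host != target
  ./configure --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu \
              --target=arm-unknown-linux-gnu
  # 3) cross-native: build != host == target
  ./configure --build=x86_64-pc-linux-gnu --host=arm-unknown-linux-gnu \
              --target=arm-unknown-linux-gnu
  # 4) canadian:     build != host != target
  ./configure --build=x86_64-pc-linux-gnu --host=i686-pc-mingw32 \
              --target=arm-unknown-linux-gnu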
crosstool-NG can build all these kinds of toolchains (or is aiming at it,
anyway!).

(*) The term Canadian Cross came about because at the time that these issues
were all being hashed out, Canada had three national political parties.
http://en.wikipedia.org/wiki/Cross_compiler

_____________
/ Internals /
__________/

Internally, crosstool-NG is script-based. To ease usage, the front-end is
Makefile-based.

Makefile front-end |
-------------------+

The entry point to crosstool-NG is the Makefile script "ct-ng". Calling this
script with an action will act exactly as if the Makefile were in the current
working directory and make were called with the action as its rule. Thus:
  ct-ng menuconfig
is equivalent to having the Makefile in the current working directory and
calling:
  make menuconfig

Having ct-ng work this way avoids copying the Makefile everywhere, and it
behaves like a traditional command.

ct-ng loads sub-Makefiles from the library directory $(CT_LIB_DIR), as set up
at configuration time with ./configure.

ct-ng also searches for config files, sub-tools, samples, scripts and patches
in that library directory.

Because of a stupid make behavior/bug I was unable to track down, implicit
make rules are disabled: installing with --local would trigger those rules,
and mconf was unbuildable.

Kconfig parser |
---------------+

The kconfig language is a hacked version, vampirised from the Linux kernel
(http://www.kernel.org/), and (heavily) adapted to my needs.

The list of the most notable changes (at least the ones I remember) follows
(see the small example at the end of this section):
  - the CONFIG_ prefix has been replaced with CT_
  - a leading | in prompts is skipped, and subsequent leading spaces are not
    trimmed
  - otherwise, leading spaces are silently trimmed

The kconfig parsers (conf and mconf) are not installed pre-built, but as
source files. Thus you can have the directory where crosstool-NG is installed
exported (via NFS or whatever) and have clients with different architectures
use the same crosstool-NG installation, and most notably, the same set of
patches.
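As a small, hypothetical illustration of these changes (FOO is not a real
crosstool-NG option), an entry such as:
  config FOO
      string
      prompt "|    Foo value"
would have its prompt displayed with the four leading spaces preserved (the
leading | itself is skipped), and the option would show up in the .config file
as CT_FOO=..., not CONFIG_FOO=...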
Architecture-specific |
----------------------+

Note: this chapter is not really well written, and might thus be a little bit
complex to understand. To get a better grasp of what an architecture is, the
reader is kindly encouraged to look at the "arch/" sub-directory, and at the
existing architectures, to see how things are laid out.

An architecture is defined by:
  - a human-readable name, in lower case letters, with numbers as appropriate.
    The underscore is allowed; spaces and special characters are not.
      Eg.: arm, x86_64
  - a file in "config/arch/", named after the architecture's name, and
    suffixed with ".in".
      Eg.: config/arch/arm.in
  - a file in "scripts/build/arch/", named after the architecture's name, and
    suffixed with ".sh".
      Eg.: scripts/build/arch/arm.sh

The architecture's ".in" file API:
 > the config option "ARCH_%arch%" (where %arch% is to be replaced with the
   actual architecture name).
   That config option must have *neither* a type, *nor* a prompt! Also, it can
   *not* depend on any other config option (EXPERIMENTAL is managed as above).
     Eg.:
       config ARCH_arm
   + mandatory:
       defines a (terse) help entry for this architecture:
       Eg.:
         config ARCH_arm
           help
             The ARM architecture.
   + optional:
       selects adequate associated config options.
       Note: 64-bit architectures *shall* select ARCH_64.
       Eg.:
         config ARCH_arm
           select ARCH_SUPPORTS_BOTH_ENDIAN
           select ARCH_DEFAULT_LE
           help
             The ARM architecture.
       Eg.:
         config ARCH_x86_64
           select ARCH_64
           help
             The x86_64 architecture.

 > other target-specific options, at your discretion. Note however that, to
   avoid name-clashing, such options shall be prefixed with "ARCH_%arch%",
   where %arch% is again replaced by the actual architecture name.
   (Note: due to historical reasons, and lack of time to clean up the code,
   I may have left some config options that do not completely conform to this,
   as the architecture name was written all upper case. However, the prefix is
   unique among architectures, and does not cause harm.)

The architecture's ".sh" file API:
 > the function "CT_DoArchTupleValues"
   + parameters: none
   + environment:
     - all variables from the ".config" file,
     - the two variables "target_endian_eb" and "target_endian_el", which are
       the endianness suffixes
   + return value: 0 upon success, !0 upon failure
   + provides:
     - mandatory
     - the environment variable CT_TARGET_ARCH
     - contains:
       the architecture part of the target tuple.
       Eg.: "armeb" for big endian ARM
            "i386" for an i386
   + provides:
     - optional
     - the environment variable CT_TARGET_SYS
     - contains:
       the system part of the target tuple.
       Eg.: "gnu" for glibc on most architectures
            "gnueabi" for glibc on an ARM EABI
     - defaults to:
       - for glibc-based toolchains: "gnu"
       - for uClibc-based toolchains: "uclibc"
   + provides:
     - optional
     - the environment variable CT_KERNEL_ARCH
     - contains:
       the architecture name as understood by the Linux kernel build system.
       Eg.: "arm" for an ARM
            "powerpc" for a PowerPC
            "i386" for an x86
     - defaults to: ${CT_ARCH}
   + provides:
     - optional
     - the environment variables to configure the cross-gcc (defaults)
       - CT_ARCH_WITH_ARCH  : the gcc ./configure switch to select the
                              architecture level  ( "--with-arch=${CT_ARCH_ARCH}" )
       - CT_ARCH_WITH_ABI   : the gcc ./configure switch to select the
                              ABI level           ( "--with-abi=${CT_ARCH_ABI}" )
       - CT_ARCH_WITH_CPU   : the gcc ./configure switch to select the
                              CPU instruction set ( "--with-cpu=${CT_ARCH_CPU}" )
       - CT_ARCH_WITH_TUNE  : the gcc ./configure switch to select the
                              scheduling          ( "--with-tune=${CT_ARCH_TUNE}" )
       - CT_ARCH_WITH_FPU   : the gcc ./configure switch to select the
                              FPU type            ( "--with-fpu=${CT_ARCH_FPU}" )
       - CT_ARCH_WITH_FLOAT : the gcc ./configure switch to select the
                              floating point arithmetic
                              ( "--with-float=soft" or /empty/ )
   + provides:
     - optional
     - the environment variables to pass to the cross-gcc to build target
       binaries (defaults)
       - CT_ARCH_ARCH_CFLAG   : the gcc switch to select the architecture level
                                ( "-march=${CT_ARCH_ARCH}" )
       - CT_ARCH_ABI_CFLAG    : the gcc switch to select the ABI level
                                ( "-mabi=${CT_ARCH_ABI}" )
       - CT_ARCH_CPU_CFLAG    : the gcc switch to select the CPU instruction
                                set ( "-mcpu=${CT_ARCH_CPU}" )
       - CT_ARCH_TUNE_CFLAG   : the gcc switch to select the scheduling
                                ( "-mtune=${CT_ARCH_TUNE}" )
       - CT_ARCH_FPU_CFLAG    : the gcc switch to select the FPU type
                                ( "-mfpu=${CT_ARCH_FPU}" )
       - CT_ARCH_FLOAT_CFLAG  : the gcc switch to choose the floating point
                                arithmetic ( "-msoft-float" or /empty/ )
       - CT_ARCH_ENDIAN_CFLAG : the gcc switch to choose big or little endian
                                ( "-mbig-endian" or "-mlittle-endian" )
     - default to: see above
   + provides:
     - optional
     - the environment variables to configure the core and final compiler,
       specific to this architecture:
       - CT_ARCH_CC_CORE_EXTRA_CONFIG : additional, architecture-specific
                                        core gcc ./configure flags
       - CT_ARCH_CC_EXTRA_CONFIG      : additional, architecture-specific
                                        final gcc ./configure flags
     - default to:
       - all empty
   + provides:
     - optional
     - the architecture-specific CFLAGS and LDFLAGS:
       - CT_ARCH_TARGET_CFLAGS
       - CT_ARCH_TARGET_LDFLAGS
     - default to:
       - all empty

You can have a look at "config/arch/arm.in" and "scripts/build/arch/arm.sh"
for a quite complete example of what an actual architecture description looks
like.
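As a complement, here is a minimal, purely illustrative sketch of what such a
".sh" file could look like, for a hypothetical architecture named "foo" (this
is not a real crosstool-NG architecture, and a real file would usually set
more of the variables listed above):
  # scripts/build/arch/foo.sh -- hypothetical example
  CT_DoArchTupleValues() {
      # Mandatory: the architecture part of the target tuple,
      # here using the little-endian suffix computed by crosstool-NG.
      CT_TARGET_ARCH="foo${target_endian_el}"

      # Optional: the architecture name as the Linux kernel knows it.
      CT_KERNEL_ARCH="foo"

      # Optional: cross-gcc ./configure switches, from the .config values.
      CT_ARCH_WITH_ARCH="--with-arch=${CT_ARCH_ARCH}"
      CT_ARCH_WITH_CPU="--with-cpu=${CT_ARCH_CPU}"

      # Optional: flags used when the cross-gcc builds target binaries.
      CT_ARCH_ARCH_CFLAG="-march=${CT_ARCH_ARCH}"
      CT_ARCH_CPU_CFLAG="-mcpu=${CT_ARCH_CPU}"
  }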
Kernel specific |
----------------+

A kernel is defined by:
  - a human-readable name, in lower case letters, with numbers as appropriate.
    The underscore is allowed; spaces and special characters are not (although
    they are internally replaced with underscores).
      Eg.: linux, bare-metal
  - a file in "config/kernel/", named after the kernel name, and suffixed with
    ".in".
      Eg.: config/kernel/linux.in, config/kernel/bare-metal.in
  - a file in "scripts/build/kernel/", named after the kernel name, and
    suffixed with ".sh".
      Eg.: scripts/build/kernel/linux.sh, scripts/build/kernel/bare-metal.sh

The kernel's ".in" file must contain:
 > an optional line containing exactly "# EXPERIMENTAL", starting on the first
   column, and without any following space or other character.
   If this line is present, then this kernel is considered EXPERIMENTAL, and
   the correct dependency on EXPERIMENTAL will be set.

 > the config option "KERNEL_%kernel_name%" (where %kernel_name% is to be
   replaced with the actual kernel name, with all special characters and
   spaces replaced by underscores).
   That config option must have *neither* a type, *nor* a prompt! Also, it can
   *not* depend on EXPERIMENTAL.
     Eg.: KERNEL_linux, KERNEL_bare_metal
   + mandatory:
       defines a (terse) help entry for this kernel.
       Eg.:
         config KERNEL_bare_metal
           help
             Build a compiler for use without any kernel.
   + optional:
       selects adequate associated config options.
       Eg.:
         config KERNEL_bare_metal
           select BARE_METAL
           help
             Build a compiler for use without any kernel.

 > other kernel-specific options, at your discretion. Note however that, to
   avoid name-clashing, such options should be prefixed with
   "KERNEL_%kernel_name%", where %kernel_name% is again to be replaced with
   the actual kernel name.
   (Note: due to historical reasons, and lack of time to clean up the code,
   I may have left some config options that do not completely conform to this,
   as the kernel name was written all upper case. However, the prefix is
   unique among kernels, and does not cause harm.)

The kernel's ".sh" file API:
 > is a bash script fragment

 > defines the function CT_DoKernelTupleValues
   + see the architecture's CT_DoArchTupleValues, except for:
   + sets the environment variable CT_TARGET_KERNEL, the kernel part of the
     target tuple
   + return value: ignored

 > defines the function "do_print_filename":
   + parameters: none
   + environment:
     - all variables from the ".config" file
   + return value: ignored
   + behavior: outputs the kernel's tarball filename, with the adequate
     suffix, on stdout.
       Eg.: linux-2.6.26.5.tar.bz2

 > defines the function "do_kernel_get":
   + parameters: none
   + environment:
     - all variables from the ".config" file
   + return value: 0 for success, !0 for failure
   + behavior: downloads the kernel's sources, and stores the tarball into
     "${CT_TARBALLS_DIR}". To this end, a function is available that abstracts
     downloading tarballs:
     - CT_DoGet
         Eg.: CT_DoGet linux-2.6.26.5 ftp://ftp.kernel.org/pub/linux/kernel/v2.6
     Note: retrieving sources from svn, cvs, git and the like is not supported
     by CT_DoGet. You'll have to do this by hand, as it is done for eglibc in
     "scripts/build/libc/eglibc.sh"

 > defines the function "do_kernel_extract":
   + parameters: none
   + environment:
     - all variables from the ".config" file
   + return value: 0 for success, !0 for failure
   + behavior: extracts the kernel's tarball into "${CT_SRC_DIR}", and applies
     the required patches. To this end, a function is available that abstracts
     extracting tarballs:
     - CT_ExtractAndPatch
         Eg.: CT_ExtractAndPatch linux-2.6.26.5

 > defines the function "do_kernel_headers":
   + parameters: none
   + environment:
     - all variables from the ".config" file
   + return value: 0 for success, !0 for failure
   + behavior: installs the kernel headers (if any) in
     "${CT_SYSROOT_DIR}/usr/include"

 > defines any kernel-specific helper functions
   These functions, if any, must be prefixed with "do_kernel_%CT_KERNEL%_",
   where %CT_KERNEL% is to be replaced with the actual kernel name, to avoid
   any name-clashing.

You can have a look at "config/kernel/linux.in" and
"scripts/build/kernel/linux.sh" as an example of what a complex kernel
description looks like.

Adding a new version of a component |
------------------------------------+

When a new component, such as the Linux kernel, gcc or any other, is released,
adding the new version to crosstool-NG is quite easy. There is a script that
will do all that for you:
  tools/addToolVersion.sh

Run it with no option to get some help.

Build scripts |
--------------+

To Be Written later...