mirror of
https://github.com/AFLplusplus/AFLplusplus.git
synced 2025-06-12 01:58:17 +00:00
Edit instrumentation READMEs
@@ -1,11 +1,12 @@
|
|||||||
# CmpLog instrumentation
|
# CmpLog instrumentation
|
||||||
|
|
||||||
The CmpLog instrumentation enables logging of comparison operands in a
|
The CmpLog instrumentation enables logging of comparison operands in a shared
|
||||||
shared memory.
|
memory.
|
||||||
|
|
||||||
These values can be used by various mutators built on top of it.
|
These values can be used by various mutators built on top of it. At the moment,
|
||||||
At the moment we support the RedQueen mutator (input-2-state instructions only),
|
we support the RedQueen mutator (input-2-state instructions only), for details
|
||||||
for details see [the RedQueen paper](https://www.syssec.ruhr-uni-bochum.de/media/emma/veroeffentlichungen/2018/12/17/NDSS19-Redqueen.pdf).
|
see
|
||||||
|
[the RedQueen paper](https://www.syssec.ruhr-uni-bochum.de/media/emma/veroeffentlichungen/2018/12/17/NDSS19-Redqueen.pdf).
|
||||||
|
|
||||||
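To illustrate the kind of comparison CmpLog helps with, here is a minimal sketch. The function `has_magic` and the constant `0xDEADBEEF` are hypothetical and not part of AFL++. The point is only that one comparison operand is copied verbatim from the input (an input-to-state comparison), so a logged operand can in principle be written back into the input by a RedQueen-style mutator.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical target code: returns 1 if the first four input bytes,
   read in host byte order, equal the magic value, otherwise 0. */
int has_magic(const uint8_t *buf, size_t len) {

  uint32_t magic;

  if (len < sizeof(magic)) return 0;

  /* memcpy avoids unaligned access; the value is taken verbatim from input. */
  memcpy(&magic, buf, sizeof(magic));

  /* A 32-bit equality check like this is hard to pass by random mutation,
     but both operands are visible to comparison logging. */
  return magic == 0xDEADBEEFu;

}
```

Random byte flips have roughly a 1 in 2^32 chance of satisfying such a check, which is why logging the operands is useful here.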
## Build
|
## Build
|
||||||
|
|
||||||
@@ -14,7 +15,8 @@ program.
|
|||||||
|
|
||||||
The first version is built using the regular AFL++ instrumentation.
|
The first version is built using the regular AFL++ instrumentation.
|
||||||
|
|
||||||
The second one, the CmpLog binary, is built with setting AFL_LLVM_CMPLOG during the compilation.
|
The second one, the CmpLog binary, is built with setting AFL_LLVM_CMPLOG during
|
||||||
|
the compilation.
|
||||||
|
|
||||||
For example:
|
For example:
|
||||||
|
|
||||||
@@ -32,8 +34,8 @@ unset AFL_LLVM_CMPLOG
|
|||||||
|
|
||||||
## Use
|
## Use
|
||||||
|
|
||||||
AFL++ has the new `-c` option that needs to be used to specify the CmpLog binary (the second
|
AFL++ has the new `-c` option that needs to be used to specify the CmpLog binary
|
||||||
build).
|
(the second build).
|
||||||
|
|
||||||
For example:
|
For example:
|
||||||
|
|
||||||
@@ -41,4 +43,4 @@ For example:
|
|||||||
afl-fuzz -i input -o output -c ./program.cmplog -m none -- ./program.afl @@
|
afl-fuzz -i input -o output -c ./program.cmplog -m none -- ./program.afl @@
|
||||||
```
|
```
|
||||||
|
|
||||||
Be sure to use `-m none` because CmpLog can map a lot of pages.
|
Be sure to use `-m none` because CmpLog can map a lot of pages.
|
@@ -1,64 +1,68 @@
|
|||||||
# GCC-based instrumentation for afl-fuzz
|
# GCC-based instrumentation for afl-fuzz
|
||||||
|
|
||||||
See [../README.md](../README.md) for the general instruction manual.
|
For the general instruction manual, see [../README.md](../README.md).
|
||||||
See [README.llvm.md](README.llvm.md) for the LLVM-based instrumentation.
|
For the LLVM-based instrumentation, see [README.llvm.md](README.llvm.md).
|
||||||
|
|
||||||
This document describes how to build and use `afl-gcc-fast` and `afl-g++-fast`,
|
This document describes how to build and use `afl-gcc-fast` and `afl-g++-fast`,
|
||||||
which instrument the target with the help of gcc plugins.
|
which instrument the target with the help of gcc plugins.
|
||||||
|
|
||||||
TLDR:
|
TL;DR:
|
||||||
* check the version of your gcc compiler: `gcc --version`
|
* Check the version of your gcc compiler: `gcc --version`
|
||||||
* `apt-get install gcc-VERSION-plugin-dev` or similar to install headers for gcc plugins
|
* `apt-get install gcc-VERSION-plugin-dev` or similar to install headers for gcc
|
||||||
* `gcc` and `g++` must match the gcc-VERSION you installed headers for. You can set `AFL_CC`/`AFL_CXX`
|
plugins.
|
||||||
to point to these!
|
* `gcc` and `g++` must match the gcc-VERSION you installed headers for. You can
|
||||||
* `make`
|
set `AFL_CC`/`AFL_CXX` to point to these!
|
||||||
* just use `afl-gcc-fast`/`afl-g++-fast` normally like you would do with `afl-clang-fast`
|
* `make`
|
||||||
|
* Just use `afl-gcc-fast`/`afl-g++-fast` normally like you would do with
|
||||||
|
`afl-clang-fast`.
|
||||||
|
|
||||||
## 1) Introduction
|
## 1) Introduction
|
||||||
|
|
||||||
The code in this directory allows to instrument programs for AFL using
|
The code in this directory allows to instrument programs for AFL++ using true
|
||||||
true compiler-level instrumentation, instead of the more crude
|
compiler-level instrumentation, instead of the more crude assembly-level
|
||||||
assembly-level rewriting approach taken by afl-gcc and afl-clang. This has
|
rewriting approach taken by afl-gcc and afl-clang. This has several interesting
|
||||||
several interesting properties:
|
properties:
|
||||||
|
|
||||||
- The compiler can make many optimizations that are hard to pull off when
|
- The compiler can make many optimizations that are hard to pull off when
|
||||||
manually inserting assembly. As a result, some slow, CPU-bound programs will
|
manually inserting assembly. As a result, some slow, CPU-bound programs will
|
||||||
run up to around faster.
|
run up to around faster.
|
||||||
|
|
||||||
The gains are less pronounced for fast binaries, where the speed is limited
|
The gains are less pronounced for fast binaries, where the speed is limited
|
||||||
chiefly by the cost of creating new processes. In such cases, the gain will
|
chiefly by the cost of creating new processes. In such cases, the gain will
|
||||||
probably stay within 10%.
|
probably stay within 10%.
|
||||||
|
|
||||||
- The instrumentation is CPU-independent. At least in principle, you should
|
- The instrumentation is CPU-independent. At least in principle, you should be
|
||||||
be able to rely on it to fuzz programs on non-x86 architectures (after
|
able to rely on it to fuzz programs on non-x86 architectures (after building
|
||||||
building `afl-fuzz` with `AFL_NOX86=1`).
|
`afl-fuzz` with `AFL_NOX86=1`).
|
||||||
|
|
||||||
- Because the feature relies on the internals of GCC, it is gcc-specific
|
- Because the feature relies on the internals of GCC, it is gcc-specific and
|
||||||
and will *not* work with LLVM (see [README.llvm.md](README.llvm.md) for an alternative).
|
will *not* work with LLVM (see [README.llvm.md](README.llvm.md) for an
|
||||||
|
alternative).
|
||||||
|
|
||||||
Once this implementation is shown to be sufficiently robust and portable, it
|
Once this implementation is shown to be sufficiently robust and portable, it
|
||||||
will probably replace afl-gcc. For now, it can be built separately and
|
will probably replace afl-gcc. For now, it can be built separately and co-exists
|
||||||
co-exists with the original code.
|
with the original code.
|
||||||
|
|
||||||
The idea and much of the implementation comes from Laszlo Szekeres.
|
The idea and much of the implementation comes from Laszlo Szekeres.
|
||||||
|
|
||||||
## 2) How to use
|
## 2) How to use
|
||||||
|
|
||||||
In order to leverage this mechanism, you need to have modern enough GCC
|
In order to leverage this mechanism, you need to have modern enough GCC (>=
|
||||||
(>= version 4.5.0) and the plugin development headers installed on your system. That
|
version 4.5.0) and the plugin development headers installed on your system. That
|
||||||
should be all you need. On Debian machines, these headers can be acquired by
|
should be all you need. On Debian machines, these headers can be acquired by
|
||||||
installing the `gcc-VERSION-plugin-dev` packages.
|
installing the `gcc-VERSION-plugin-dev` packages.
|
||||||
|
|
||||||
To build the instrumentation itself, type `make`. This will generate binaries
|
To build the instrumentation itself, type `make`. This will generate binaries
|
||||||
called `afl-gcc-fast` and `afl-g++-fast` in the parent directory.
|
called `afl-gcc-fast` and `afl-g++-fast` in the parent directory.
|
||||||
|
|
||||||
The gcc and g++ compiler links have to point to gcc-VERSION - or set these
|
The gcc and g++ compiler links have to point to gcc-VERSION - or set these by
|
||||||
by pointing the environment variables `AFL_CC`/`AFL_CXX` to them.
|
pointing the environment variables `AFL_CC`/`AFL_CXX` to them. If the `CC`/`CXX`
|
||||||
If the `CC`/`CXX` environment variables have been set, those compilers will be
|
environment variables have been set, those compilers will be preferred over
|
||||||
preferred over those from the `AFL_CC`/`AFL_CXX` settings.
|
those from the `AFL_CC`/`AFL_CXX` settings.
|
||||||
|
|
||||||
Once this is done, you can instrument third-party code in a way similar to the
|
Once this is done, you can instrument third-party code in a way similar to the
|
||||||
standard operating mode of AFL, e.g.:
|
standard operating mode of AFL++, e.g.:
|
||||||
|
|
||||||
```
|
```
|
||||||
CC=/path/to/afl/afl-gcc-fast
|
CC=/path/to/afl/afl-gcc-fast
|
||||||
CXX=/path/to/afl/afl-g++-fast
|
CXX=/path/to/afl/afl-g++-fast
|
||||||
@@ -66,15 +70,15 @@ standard operating mode of AFL, e.g.:
|
|||||||
./configure [...options...]
|
./configure [...options...]
|
||||||
make
|
make
|
||||||
```
|
```
|
||||||
|
|
||||||
Note: We also used `CXX` to set the C++ compiler to `afl-g++-fast` for C++ code.
|
Note: We also used `CXX` to set the C++ compiler to `afl-g++-fast` for C++ code.
|
||||||
|
|
||||||
The tool honors roughly the same environmental variables as `afl-gcc` (see
|
The tool honors roughly the same environmental variables as `afl-gcc` (see
|
||||||
[env_variables.md](../docs/env_variables.md)). This includes `AFL_INST_RATIO`,
|
[docs/env_variables.md](../docs/env_variables.md)). This includes
|
||||||
`AFL_USE_ASAN`, `AFL_HARDEN`, and `AFL_DONT_OPTIMIZE`.
|
`AFL_INST_RATIO`, `AFL_USE_ASAN`, `AFL_HARDEN`, and `AFL_DONT_OPTIMIZE`.
|
||||||
|
|
||||||
Note: if you want the GCC plugin to be installed on your system for all
|
Note: if you want the GCC plugin to be installed on your system for all users,
|
||||||
users, you need to build it before issuing 'make install' in the parent
|
you need to build it before issuing 'make install' in the parent directory.
|
||||||
directory.
|
|
||||||
|
|
||||||
## 3) Gotchas, feedback, bugs
|
## 3) Gotchas, feedback, bugs
|
||||||
|
|
||||||
@@ -83,41 +87,40 @@ reports to afl@aflplus.plus.
|
|||||||
|
|
||||||
## 4) Bonus feature #1: deferred initialization
|
## 4) Bonus feature #1: deferred initialization
|
||||||
|
|
||||||
AFL tries to optimize performance by executing the targeted binary just once,
|
AFL++ tries to optimize performance by executing the targeted binary just once,
|
||||||
stopping it just before main(), and then cloning this "main" process to get
|
stopping it just before `main()`, and then cloning this "main" process to get a
|
||||||
a steady supply of targets to fuzz.
|
steady supply of targets to fuzz.
|
||||||
|
|
||||||
Although this approach eliminates much of the OS-, linker- and libc-level
|
Although this approach eliminates much of the OS-, linker- and libc-level costs
|
||||||
costs of executing the program, it does not always help with binaries that
|
of executing the program, it does not always help with binaries that perform
|
||||||
perform other time-consuming initialization steps - say, parsing a large config
|
other time-consuming initialization steps - say, parsing a large config file
|
||||||
file before getting to the fuzzed data.
|
before getting to the fuzzed data.
|
||||||
|
|
||||||
In such cases, it's beneficial to initialize the forkserver a bit later, once
|
In such cases, it's beneficial to initialize the forkserver a bit later, once
|
||||||
most of the initialization work is already done, but before the binary attempts
|
most of the initialization work is already done, but before the binary attempts
|
||||||
to read the fuzzed input and parse it; in some cases, this can offer a 10x+
|
to read the fuzzed input and parse it; in some cases, this can offer a 10x+
|
||||||
performance gain. You can implement delayed initialization in GCC mode in a
|
performance gain. You can implement delayed initialization in GCC mode in a
|
||||||
fairly simple way.
|
fairly simple way:
|
||||||
|
|
||||||
First, locate a suitable location in the code where the delayed cloning can
|
First, locate a suitable location in the code where the delayed cloning can take
|
||||||
take place. This needs to be done with *extreme* care to avoid breaking the
|
place. This needs to be done with *extreme* care to avoid breaking the binary.
|
||||||
binary. In particular, the program will probably malfunction if you select
|
In particular, the program will probably malfunction if you select a location
|
||||||
a location after:
|
after:
|
||||||
|
|
||||||
- The creation of any vital threads or child processes - since the forkserver
|
- The creation of any vital threads or child processes - since the forkserver
|
||||||
can't clone them easily.
|
can't clone them easily.
|
||||||
|
|
||||||
- The initialization of timers via setitimer() or equivalent calls.
|
- The initialization of timers via `setitimer()` or equivalent calls.
|
||||||
|
|
||||||
- The creation of temporary files, network sockets, offset-sensitive file
|
- The creation of temporary files, network sockets, offset-sensitive file
|
||||||
descriptors, and similar shared-state resources - but only provided that
|
descriptors, and similar shared-state resources - but only provided that their
|
||||||
their state meaningfully influences the behavior of the program later on.
|
state meaningfully influences the behavior of the program later on.
|
||||||
|
|
||||||
- Any access to the fuzzed input, including reading the metadata about its
|
- Any access to the fuzzed input, including reading the metadata about its size.
|
||||||
size.
|
|
||||||
|
|
||||||
With the location selected, add this code in the appropriate spot:
|
With the location selected, add this code in the appropriate spot:
|
||||||
|
|
||||||
```
|
```c
|
||||||
#ifdef __AFL_HAVE_MANUAL_CONTROL
|
#ifdef __AFL_HAVE_MANUAL_CONTROL
|
||||||
__AFL_INIT();
|
__AFL_INIT();
|
||||||
#endif
|
#endif
|
||||||
@@ -131,14 +134,14 @@ Finally, recompile the program with afl-gcc-fast (afl-gcc or afl-clang will
|
|||||||
|
|
||||||
## 5) Bonus feature #2: persistent mode
|
## 5) Bonus feature #2: persistent mode
|
||||||
|
|
||||||
Some libraries provide APIs that are stateless, or whose state can be reset in
|
Some libraries provide APIs that are stateless or whose state can be reset in
|
||||||
between processing different input files. When such a reset is performed, a
|
between processing different input files. When such a reset is performed, a
|
||||||
single long-lived process can be reused to try out multiple test cases,
|
single long-lived process can be reused to try out multiple test cases,
|
||||||
eliminating the need for repeated `fork()` calls and the associated OS overhead.
|
eliminating the need for repeated `fork()` calls and the associated OS overhead.
|
||||||
|
|
||||||
The basic structure of the program that does this would be:
|
The basic structure of the program that does this would be:
|
||||||
|
|
||||||
```
|
```c
|
||||||
while (__AFL_LOOP(1000)) {
|
while (__AFL_LOOP(1000)) {
|
||||||
|
|
||||||
/* Read input data. */
|
/* Read input data. */
|
||||||
@@ -147,22 +150,21 @@ The basic structure of the program that does this would be:
|
|||||||
|
|
||||||
}
|
}
|
||||||
|
|
||||||
/* Exit normally */
|
/* Exit normally. */
|
||||||
```
|
```
|
||||||
|
|
||||||
The numerical value specified within the loop controls the maximum number
|
The numerical value specified within the loop controls the maximum number of
|
||||||
of iterations before AFL will restart the process from scratch. This minimizes
|
iterations before AFL++ will restart the process from scratch. This minimizes
|
||||||
the impact of memory leaks and similar glitches; 1000 is a good starting point.
|
the impact of memory leaks and similar glitches; 1000 is a good starting point.
|
||||||
|
|
||||||
A more detailed template is shown in ../utils/persistent_mode/.
|
A more detailed template is shown in ../utils/persistent_mode/. Similarly to the
|
||||||
Similarly to the previous mode, the feature works only with afl-gcc-fast or
|
previous mode, the feature works only with afl-gcc-fast or afl-clang-fast;
|
||||||
afl-clang-fast; #ifdef guards can be used to suppress it when using other
|
#ifdef guards can be used to suppress it when using other compilers.
|
||||||
compilers.
|
|
||||||
|
|
||||||
Note that as with the previous mode, the feature is easy to misuse; if you
|
Note that as with the previous mode, the feature is easy to misuse; if you do
|
||||||
do not reset the critical state fully, you may end up with false positives or
|
not reset the critical state fully, you may end up with false positives or waste
|
||||||
waste a whole lot of CPU power doing nothing useful at all. Be particularly
|
a whole lot of CPU power doing nothing useful at all. Be particularly wary of
|
||||||
wary of memory leaks and the state of file descriptors.
|
memory leaks and the state of file descriptors.
|
||||||
|
|
||||||
When running in this mode, the execution paths will inherently vary a bit
|
When running in this mode, the execution paths will inherently vary a bit
|
||||||
depending on whether the input loop is being entered for the first time or
|
depending on whether the input loop is being entered for the first time or
|
||||||
@@ -171,5 +173,5 @@ executed again. To avoid spurious warnings, the feature implies
|
|||||||
|
|
||||||
## 6) Bonus feature #3: selective instrumentation
|
## 6) Bonus feature #3: selective instrumentation
|
||||||
|
|
||||||
It can be more effective for fuzzing to instrument only parts of the code.
|
It can be more effective for fuzzing to instrument only parts of the code. For
|
||||||
For details see [README.instrument_list.md](README.instrument_list.md).
|
details, see [README.instrument_list.md](README.instrument_list.md).
|
@@ -1,80 +1,84 @@
|
|||||||
# Using AFL++ with partial instrumentation
|
# Using AFL++ with partial instrumentation
|
||||||
|
|
||||||
This file describes two different mechanisms to selectively instrument
|
This file describes two different mechanisms to selectively instrument only
|
||||||
only specific parts in the target.
|
specific parts in the target.
|
||||||
|
|
||||||
Both mechanisms work for LLVM and GCC_PLUGIN, but not for afl-clang/afl-gcc.
|
Both mechanisms work for LLVM and GCC_PLUGIN, but not for afl-clang/afl-gcc.
|
||||||
|
|
||||||
## 1) Description and purpose
|
## 1) Description and purpose
|
||||||
|
|
||||||
When building and testing complex programs where only a part of the program is
|
When building and testing complex programs where only a part of the program is
|
||||||
the fuzzing target, it often helps to only instrument the necessary parts of
|
the fuzzing target, it often helps to only instrument the necessary parts of the
|
||||||
the program, leaving the rest uninstrumented. This helps to focus the fuzzer
|
program, leaving the rest uninstrumented. This helps to focus the fuzzer on the
|
||||||
on the important parts of the program, avoiding undesired noise and
|
important parts of the program, avoiding undesired noise and disturbance by
|
||||||
disturbance by uninteresting code being exercised.
|
uninteresting code being exercised.
|
||||||
|
|
||||||
For this purpose, "partial instrumentation" support is provided by AFL++ that
|
For this purpose, "partial instrumentation" support is provided by AFL++ that
|
||||||
allows to specify what should be instrumented and what not.
|
allows to specify what should be instrumented and what not.
|
||||||
|
|
||||||
Both mechanisms can be used together.
|
Both mechanisms for partial instrumentation can be used together.
|
||||||
|
|
||||||
## 2) Selective instrumentation with __AFL_COVERAGE_... directives
|
## 2) Selective instrumentation with __AFL_COVERAGE_... directives
|
||||||
|
|
||||||
In this mechanism the selective instrumentation is done in the source code.
|
In this mechanism, the selective instrumentation is done in the source code.
|
||||||
|
|
||||||
After the includes a special define has to be made, eg.:
|
After the includes, a special define has to be made, e.g.:
|
||||||
|
|
||||||
```
|
```
|
||||||
#include <stdio.h>
|
#include <stdio.h>
|
||||||
#include <stdint.h>
|
#include <stdint.h>
|
||||||
// ...
|
// ...
|
||||||
|
|
||||||
__AFL_COVERAGE(); // <- required for this feature to work
|
__AFL_COVERAGE(); // <- required for this feature to work
|
||||||
```
|
```
|
||||||
|
|
||||||
If you want to disable the coverage at startup until you specify coverage
|
If you want to disable the coverage at startup until you specify coverage should
|
||||||
should be started, then add `__AFL_COVERAGE_START_OFF();` at that position.
|
be started, then add `__AFL_COVERAGE_START_OFF();` at that position.
|
||||||
|
|
||||||
From here on out you have the following macros available that you can use
|
From here on out, you have the following macros available that you can use in
|
||||||
in any function where you want:
|
any function where you want:
|
||||||
|
|
||||||
* `__AFL_COVERAGE_ON();` - enable coverage from this point onwards
|
* `__AFL_COVERAGE_ON();` - Enable coverage from this point onwards.
|
||||||
* `__AFL_COVERAGE_OFF();` - disable coverage from this point onwards
|
* `__AFL_COVERAGE_OFF();` - Disable coverage from this point onwards.
|
||||||
* `__AFL_COVERAGE_DISCARD();` - reset all coverage gathered until this point
|
* `__AFL_COVERAGE_DISCARD();` - Reset all coverage gathered until this point.
|
||||||
* `__AFL_COVERAGE_SKIP();` - mark this test case as unimportant. Whatever happens, afl-fuzz will ignore it.
|
* `__AFL_COVERAGE_SKIP();` - Mark this test case as unimportant. Whatever
|
||||||
|
happens, afl-fuzz will ignore it.
|
||||||
|
|
||||||
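The macros above could be combined roughly as in the following sketch. It assumes that `__AFL_COVERAGE` is visible to the preprocessor as a macro when building with an AFL++ compiler wrapper, so that `#ifdef` can be used to fall back to no-ops elsewhere; that guard, the wrapper macros `COV_ON`/`COV_OFF`, and the functions `checksum`, `parse_record`, and `handle_input` are hypothetical names chosen for illustration.

```c
#include <stddef.h>

#ifdef __AFL_COVERAGE
__AFL_COVERAGE();           /* required for this feature to work */
__AFL_COVERAGE_START_OFF(); /* start with coverage disabled */
#define COV_ON() __AFL_COVERAGE_ON()
#define COV_OFF() __AFL_COVERAGE_OFF()
#else
/* Assumption: with other compilers, the wrappers become no-ops. */
#define COV_ON() ((void)0)
#define COV_OFF() ((void)0)
#endif

/* Uninteresting helper: sums all bytes of the buffer. */
static unsigned checksum(const unsigned char *buf, size_t len) {

  unsigned sum = 0;
  for (size_t i = 0; i < len; i++) sum += buf[i];
  return sum;

}

/* Code of interest: accepts records that start with "AF". */
static int parse_record(const unsigned char *buf, size_t len) {

  return len >= 2 && buf[0] == 'A' && buf[1] == 'F';

}

/* Returns 1 if the record parses and has a nonzero checksum, otherwise 0. */
int handle_input(const unsigned char *buf, size_t len) {

  unsigned sum = checksum(buf, len); /* runs while coverage is off */

  COV_ON();
  int ok = parse_record(buf, len); /* coverage is gathered only here */
  COV_OFF();

  return ok && sum != 0;

}
```

Wrapping the macros this way keeps the same source buildable with a plain compiler, at the cost of one extra layer of indirection.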
A special function is `__afl_coverage_interesting`.
|
A special function is `__afl_coverage_interesting`. To use this, you must define
|
||||||
To use this, you must define `void __afl_coverage_interesting(u8 val, u32 id);`.
|
`void __afl_coverage_interesting(u8 val, u32 id);`. Then you can use this
|
||||||
Then you can use this function globally, where the `val` parameter can be set
|
function globally, where the `val` parameter can be set by you, the `id`
|
||||||
by you, the `id` parameter is for afl-fuzz and will be overwritten.
|
parameter is for afl-fuzz and will be overwritten. Note that useful parameters
|
||||||
Note that useful parameters for `val` are: 1, 2, 3, 4, 8, 16, 32, 64, 128.
|
for `val` are: 1, 2, 3, 4, 8, 16, 32, 64, 128. A value of, e.g., 33 will be seen
|
||||||
A value of e.g. 33 will be seen as 32 for coverage purposes.
|
as 32 for coverage purposes.
|
||||||
|
|
||||||
## 3) Selective instrumentation with AFL_LLVM_ALLOWLIST/AFL_LLVM_DENYLIST
|
## 3) Selective instrumentation with AFL_LLVM_ALLOWLIST/AFL_LLVM_DENYLIST
|
||||||
|
|
||||||
This feature is equivalent to llvm 12 sancov feature and allows to specify
|
This feature is equivalent to llvm 12 sancov feature and allows to specify on a
|
||||||
on a filename and/or function name level to instrument these or skip them.
|
filename and/or function name level to instrument these or skip them.
|
||||||
|
|
||||||
### 3a) How to use the partial instrumentation mode
|
### 3a) How to use the partial instrumentation mode
|
||||||
|
|
||||||
In order to build with partial instrumentation, you need to build with
|
In order to build with partial instrumentation, you need to build with
|
||||||
afl-clang-fast/afl-clang-fast++ or afl-clang-lto/afl-clang-lto++.
|
afl-clang-fast/afl-clang-fast++ or afl-clang-lto/afl-clang-lto++. The only
|
||||||
The only required change is that you need to set either the environment variable
|
required change is that you need to set either the environment variable
|
||||||
AFL_LLVM_ALLOWLIST or AFL_LLVM_DENYLIST to a filename.
|
`AFL_LLVM_ALLOWLIST` or `AFL_LLVM_DENYLIST` to a filename.
|
||||||
|
|
||||||
That file should contain the file names or functions that are to be instrumented
|
That file should contain the file names or functions that are to be instrumented
|
||||||
(AFL_LLVM_ALLOWLIST) or are specifically NOT to be instrumented (AFL_LLVM_DENYLIST).
|
(`AFL_LLVM_ALLOWLIST`) or are specifically NOT to be instrumented
|
||||||
|
(`AFL_LLVM_DENYLIST`).
|
||||||
|
|
||||||
GCC_PLUGIN: you can use either AFL_LLVM_ALLOWLIST or AFL_GCC_ALLOWLIST (or the
|
GCC_PLUGIN: you can use either `AFL_LLVM_ALLOWLIST` or `AFL_GCC_ALLOWLIST` (or
|
||||||
same for _DENYLIST), both work.
|
the same for `_DENYLIST`), both work.
|
||||||
|
|
||||||
For matching to succeed, the function/file name that is being compiled must end in the
|
For matching to succeed, the function/file name that is being compiled must end
|
||||||
function/file name entry contained in this instrument file list. That is to avoid
|
in the function/file name entry contained in this instrument file list. That is
|
||||||
breaking the match when absolute paths are used during compilation.
|
to avoid breaking the match when absolute paths are used during compilation.
|
||||||
|
|
||||||
**NOTE:** In builds with optimization enabled, functions might be inlined and would not match!
|
**NOTE:** In builds with optimization enabled, functions might be inlined and
|
||||||
|
would not match!
|
||||||
|
|
||||||
|
For example, if your source tree looks like this:
|
||||||
|
|
||||||
For example if your source tree looks like this:
|
|
||||||
```
|
```
|
||||||
project/
|
project/
|
||||||
project/feature_a/a1.cpp
|
project/feature_a/a1.cpp
|
||||||
@@ -83,36 +87,45 @@ project/feature_b/b1.cpp
|
|||||||
project/feature_b/b2.cpp
|
project/feature_b/b2.cpp
|
||||||
```
|
```
|
||||||
|
|
||||||
and you only want to test feature_a, then create an "instrument file list" file containing:
|
And you only want to test feature_a, then create an "instrument file list" file
|
||||||
|
containing:
|
||||||
|
|
||||||
```
|
```
|
||||||
feature_a/a1.cpp
|
feature_a/a1.cpp
|
||||||
feature_a/a2.cpp
|
feature_a/a2.cpp
|
||||||
```
|
```
|
||||||
|
|
||||||
However if the "instrument file list" file contains only this, it works as well:
|
However, if the "instrument file list" file contains only this, it works as
|
||||||
|
well:
|
||||||
|
|
||||||
```
|
```
|
||||||
a1.cpp
|
a1.cpp
|
||||||
a2.cpp
|
a2.cpp
|
||||||
```
|
```
|
||||||
but it might lead to files being unwantedly instrumented if the same filename
|
|
||||||
|
But it might lead to files being unwantedly instrumented if the same filename
|
||||||
exists somewhere else in the project directories.
|
exists somewhere else in the project directories.
|
||||||
|
|
||||||
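The "must end in the entry" rule can be pictured with the following minimal sketch. The helper `ends_with_entry` is hypothetical and only models the rule as stated in this document; it is not the actual AFL++ matching code, which also supports patterns.

```c
#include <string.h>

/* Returns 1 if `name` (the path seen during compilation) ends with
   `entry` (a line from the instrument file list), otherwise 0. */
int ends_with_entry(const char *name, const char *entry) {

  size_t name_len = strlen(name);
  size_t entry_len = strlen(entry);

  if (entry_len > name_len) return 0;

  /* Compare only the tail of `name` against the whole entry. */
  return strcmp(name + (name_len - entry_len), entry) == 0;

}
```

With this rule, the entry `feature_a/a1.cpp` matches `/home/user/project/feature_a/a1.cpp`, while the shorter entry `a1.cpp` would also match an unrelated `/home/user/project/other/a1.cpp`.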
You can also specify function names. Note that for C++ the function names
|
You can also specify function names. Note that for C++ the function names must
|
||||||
must be mangled to match! `nm` can print these names.
|
be mangled to match! `nm` can print these names.
|
||||||
|
|
||||||
|
AFL++ is able to identify whether an entry is a filename or a function. However,
|
||||||
|
if you want to be sure (and compliant to the sancov allow/blocklist format), you
|
||||||
|
can specify source file entries like this:
|
||||||
|
|
||||||
AFL++ is able to identify whether an entry is a filename or a function.
|
|
||||||
However if you want to be sure (and compliant to the sancov allow/blocklist
|
|
||||||
format), you can specify source file entries like this:
|
|
||||||
```
|
```
|
||||||
src: *malloc.c
|
src: *malloc.c
|
||||||
```
|
```
|
||||||
and function entries like this:
|
|
||||||
|
And function entries like this:
|
||||||
|
|
||||||
```
|
```
|
||||||
fun: MallocFoo
|
fun: MallocFoo
|
||||||
```
|
```
|
||||||
|
|
||||||
Note that whitespace is ignored and comments (`# foo`) are supported.
|
Note that whitespace is ignored and comments (`# foo`) are supported.
|
||||||
|
|
||||||
### 3b) UNIX-style pattern matching
|
### 3b) UNIX-style pattern matching
|
||||||
|
|
||||||
You can add UNIX-style pattern matching in the "instrument file list" entries.
|
You can add UNIX-style pattern matching in the "instrument file list" entries.
|
||||||
See `man fnmatch` for the syntax. We do not set any of the `fnmatch` flags.
|
See `man fnmatch` for the syntax. We do not set any of the `fnmatch` flags.
|
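As a rough illustration of the pattern syntax, the following sketch calls POSIX `fnmatch()` with no flags. The wrapper `entry_matches` and the sample patterns are hypothetical; how AFL++ preprocesses entries before matching is not shown here.

```c
#include <fnmatch.h>

/* Returns 1 if `name` matches the shell-style `pattern`, otherwise 0.
   The flags argument is 0, following the statement above that no
   fnmatch flags are set. */
int entry_matches(const char *pattern, const char *name) {

  return fnmatch(pattern, name, 0) == 0;

}
```

Because `FNM_PATHNAME` is not set, a `*` in the pattern also matches `/`, so `*malloc.c` matches `/src/lib/malloc.c`.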
@@ -2,19 +2,17 @@
|
|||||||
|
|
||||||
## Introduction
|
## Introduction
|
||||||
|
|
||||||
This originally is the work of an individual nicknamed laf-intel.
|
This originally is the work of an individual nicknamed laf-intel. His blog
|
||||||
His blog [Circumventing Fuzzing Roadblocks with Compiler Transformations](https://lafintel.wordpress.com/)
|
[Circumventing Fuzzing Roadblocks with Compiler Transformations](https://lafintel.wordpress.com/)
|
||||||
and gitlab repo [laf-llvm-pass](https://gitlab.com/laf-intel/laf-llvm-pass/)
|
and GitLab repo [laf-llvm-pass](https://gitlab.com/laf-intel/laf-llvm-pass/)
|
||||||
describe some code transformations that
|
describe some code transformations that help AFL++ to enter conditional blocks,
|
||||||
help AFL++ to enter conditional blocks, where conditions consist of
|
where conditions consist of comparisons of large values.
|
||||||
comparisons of large values.
|
|
||||||
|
|
||||||
## Usage
|
## Usage
|
||||||
|
|
||||||
By default these passes will not run when you compile programs using
|
By default, these passes will not run when you compile programs using
|
||||||
afl-clang-fast. Hence, you can use AFL as usual.
|
afl-clang-fast. Hence, you can use AFL++ as usual. To enable the passes, you
|
||||||
To enable the passes you must set environment variables before you
|
must set environment variables before you compile the target project.
|
||||||
compile the target project.
|
|
||||||
|
|
||||||
The following options exist:
|
The following options exist:
|
||||||
|
|
||||||
@@ -24,32 +22,30 @@ Enables the split-switches pass.
|
|||||||
|
|
||||||
`export AFL_LLVM_LAF_TRANSFORM_COMPARES=1`
|
`export AFL_LLVM_LAF_TRANSFORM_COMPARES=1`
|
||||||
|
|
||||||
Enables the transform-compares pass (strcmp, memcmp, strncmp,
|
Enables the transform-compares pass (strcmp, memcmp, strncmp, strcasecmp,
|
||||||
strcasecmp, strncasecmp).
|
strncasecmp).
|
||||||
|
|
||||||
`export AFL_LLVM_LAF_SPLIT_COMPARES=1`
|
`export AFL_LLVM_LAF_SPLIT_COMPARES=1`
|
||||||
|
|
||||||
Enables the split-compares pass.
|
Enables the split-compares pass. By default, it will
|
||||||
By default it will
|
|
||||||
1. simplify operators >= (and <=) into chains of > (<) and == comparisons
|
1. simplify operators >= (and <=) into chains of > (<) and == comparisons
|
||||||
2. change signed integer comparisons to a chain of sign-only comparison
|
2. change signed integer comparisons to a chain of sign-only comparison and
|
||||||
and unsigned integer comparisons
|
unsigned integer comparisons
|
||||||
3. split all unsigned integer comparisons with bit widths of
|
3. split all unsigned integer comparisons with bit widths of 64, 32, or 16 bits
|
||||||
64, 32 or 16 bits to chains of 8 bits comparisons.
|
to chains of 8 bits comparisons.
|
||||||
|
|
||||||
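The three steps above can be sketched at the source level as follows. The real pass operates on LLVM IR; the function names and the most-significant-byte-first order are assumptions made for illustration only. Each function returns the same result as the original single comparison.

```c
#include <stdint.h>

/* Step 1: a >= b rewritten as (a > b) or (a == b). */
int ge_split(uint32_t a, uint32_t b) {

  if (a > b) return 1;
  return a == b;

}

/* Step 2: signed a < b as a sign-only check plus an unsigned comparison. */
int slt_split(int32_t a, int32_t b) {

  uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
  uint32_t sign_a = ua >> 31, sign_b = ub >> 31;

  if (sign_a != sign_b) return sign_a == 1; /* negative < non-negative */
  return ua < ub; /* same sign: unsigned order equals signed order */

}

/* Step 3: 32-bit equality as a chain of 8-bit comparisons. */
int eq32_split(uint32_t a, uint32_t b) {

  if ((uint8_t)(a >> 24) != (uint8_t)(b >> 24)) return 0;
  if ((uint8_t)(a >> 16) != (uint8_t)(b >> 16)) return 0;
  if ((uint8_t)(a >> 8) != (uint8_t)(b >> 8)) return 0;
  return (uint8_t)a == (uint8_t)b;

}
```

In `eq32_split`, every correctly guessed byte reaches a new branch, so the fuzzer gets coverage feedback for partial progress instead of having to guess all 32 bits at once.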
You can change the behaviour of the last step by setting
|
You can change the behavior of the last step by setting `export
|
||||||
`export AFL_LLVM_LAF_SPLIT_COMPARES_BITW=<bit_width>`, where
|
AFL_LLVM_LAF_SPLIT_COMPARES_BITW=<bit_width>`, where bit_width may be 64, 32, or
|
||||||
bit_width may be 64, 32 or 16. For example, a bit_width of 16
|
16. For example, a bit_width of 16 would split larger comparisons down to 16 bit
|
||||||
would split larger comparisons down to 16 bit comparisons.
|
comparisons.
|
||||||
|
|
||||||
A new experimental feature is splitting floating point comparisons into a
|
A new experimental feature is splitting floating point comparisons into a series
|
||||||
series of sign, exponent and mantissa comparisons followed by splitting each
|
of sign, exponent and mantissa comparisons followed by splitting each of them
|
||||||
of them into 8 bit comparisons when necessary.
|
into 8 bit comparisons when necessary. It is activated with the
|
||||||
It is activated with the `AFL_LLVM_LAF_SPLIT_FLOATS` setting.
|
`AFL_LLVM_LAF_SPLIT_FLOATS` setting. Please note that full IEEE 754
|
||||||
Please note that full IEEE 754 functionality is not preserved, that is
|
functionality is not preserved, that is values of nan and infinity will probably
|
||||||
values of nan and infinity will probably behave differently.
|
behave differently.
|
||||||
|
|
||||||
Note that setting this automatically activates `AFL_LLVM_LAF_SPLIT_COMPARES`
|
Note that setting this automatically activates `AFL_LLVM_LAF_SPLIT_COMPARES`.
|
||||||
|
|
||||||
You can also set `AFL_LLVM_LAF_ALL` and have all of the above enabled :-)
|
|
||||||
|
|
||||||
|
You can also set `AFL_LLVM_LAF_ALL` and have all of the above enabled. :-)
|
@ -1,55 +1,56 @@
|
|||||||
# afl-clang-lto - collision free instrumentation at link time
|
# afl-clang-lto - collision free instrumentation at link time
|
||||||
|
|
||||||
## TLDR;
|
## TL;DR:
|
||||||
|
|
||||||
This version requires a current llvm 11+ compiled from the github master.
|
This version requires a current llvm 11+ compiled from the GitHub master.
|
||||||
|
|
||||||
1. Use afl-clang-lto/afl-clang-lto++ because it is faster and gives better
|
1. Use afl-clang-lto/afl-clang-lto++ because it is faster and gives better
|
||||||
coverage than anything else that is out there in the AFL world
|
coverage than anything else that is out there in the AFL world.
|
||||||
|
|
||||||
2. You can use it together with llvm_mode: laf-intel and the instrument file listing
|
2. You can use it together with llvm_mode: laf-intel and the instrument file
|
||||||
features and can be combined with cmplog/Redqueen
|
listing features and can be combined with cmplog/Redqueen.
|
||||||
|
|
||||||
3. It only works with llvm 11+
|
3. It only works with llvm 11+.
|
||||||
|
|
||||||
4. AUTODICTIONARY feature! see below
|
4. AUTODICTIONARY feature (see below)!
|
||||||
|
|
||||||
5. If any problems arise be sure to set `AR=llvm-ar RANLIB=llvm-ranlib`.
|
5. If any problems arise, be sure to set `AR=llvm-ar RANLIB=llvm-ranlib`. Some
|
||||||
Some targets might need `LD=afl-clang-lto` and others `LD=afl-ld-lto`.
|
targets might need `LD=afl-clang-lto` and others `LD=afl-ld-lto`.
|
||||||
|
|
||||||
## Introduction and problem description
|
## Introduction and problem description
|
||||||
|
|
||||||
A big issue with how AFL/AFL++ works is that the basic block IDs that are
|
A big issue with how AFL++ works is that the basic block IDs that are set during
|
||||||
set during compilation are random - and hence naturally the larger the number
|
compilation are random - and hence naturally the larger the number of
|
||||||
of instrumented locations, the higher the number of edge collisions are in the
|
instrumented locations, the higher the number of edge collisions are in the map.
|
||||||
map. This can result in not discovering new paths and therefore degrade the
|
This can result in not discovering new paths and therefore degrade the
|
||||||
efficiency of the fuzzing process.
|
efficiency of the fuzzing process.
|
||||||
|
|
||||||
*This issue is underestimated in the fuzzing community!*
|
*This issue is underestimated in the fuzzing community!* With a 2^16 = 64kb
|
||||||
With a 2^16 = 64kb standard map at already 256 instrumented blocks there is
|
standard map at already 256 instrumented blocks, there is on average one
|
||||||
on average one collision. On average a target has 10.000 to 50.000
|
collision. On average, a target has 10.000 to 50.000 instrumented blocks, hence
|
||||||
instrumented blocks hence the real collisions are between 750-18.000!
|
the real collisions are between 750-18.000!
|
||||||
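The order of magnitude of these numbers can be checked with a rough estimate. The sketch below assumes every instrumented location lands on a uniformly random slot of the 65536-entry map, which simplifies how edge IDs are actually derived; `expected_collisions` is a hypothetical helper name, not part of AFL++:

```sh
# Expected number of locations that land on an already occupied map slot,
# for n uniformly random IDs in a map of m slots:
#   collisions = n - m * (1 - (1 - 1/m)^n)
expected_collisions() {
  awk -v n="$1" -v m="${2:-65536}" \
    'BEGIN { printf "%.0f\n", n - m * (1 - exp(n * log(1 - 1 / m))) }'
}

expected_collisions 10000   # several hundred
expected_collisions 50000   # well over ten thousand
```

Under this simplified model the results fall in the same ballpark as the figures quoted above.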
|
|
||||||
To reach a solution that prevents any collisions took several approaches
|
To reach a solution that prevents any collisions took several approaches and
|
||||||
and many dead ends until we got to this:
|
many dead ends until we got to this:
|
||||||
|
|
||||||
* We instrument at link time when we have all files pre-compiled
|
* We instrument at link time when we have all files pre-compiled.
|
||||||
* To instrument at link time we compile in LTO (link time optimization) mode
|
* To instrument at link time, we compile in LTO (link time optimization) mode.
|
||||||
* Our compiler (afl-clang-lto/afl-clang-lto++) takes care of setting the
|
* Our compiler (afl-clang-lto/afl-clang-lto++) takes care of setting the correct
|
||||||
correct LTO options and runs our own afl-ld linker instead of the system
|
LTO options and runs our own afl-ld linker instead of the system linker.
|
||||||
linker
|
* The LLVM linker collects all LTO files to link and instruments them so that we
|
||||||
* The LLVM linker collects all LTO files to link and instruments them so that
|
have non-colliding edge coverage.
|
||||||
we have non-colliding edge coverage
|
* We use a new (for afl) edge coverage - which is the same as in llvm
|
||||||
* We use a new (for afl) edge coverage - which is the same as in llvm
|
-fsanitize=coverage edge coverage mode. :)
|
||||||
-fsanitize=coverage edge coverage mode :)
|
|
||||||
|
|
||||||
The result:
|
The result:
|
||||||
* 10-25% speed gain compared to llvm_mode
|
|
||||||
* guaranteed non-colliding edge coverage :-)
|
* 10-25% speed gain compared to llvm_mode
|
||||||
* The compile time especially for binaries to an instrumented library can be
|
* guaranteed non-colliding edge coverage :-)
|
||||||
much longer
|
* The compile time, especially for binaries to an instrumented library, can be
|
||||||
|
much longer.
|
||||||
|
|
||||||
Example build output from a libtiff build:
|
Example build output from a libtiff build:
|
||||||
|
|
||||||
```
|
```
|
||||||
libtool: link: afl-clang-lto -g -O2 -Wall -W -o thumbnail thumbnail.o ../libtiff/.libs/libtiff.a ../port/.libs/libport.a -llzma -ljbig -ljpeg -lz -lm
|
libtool: link: afl-clang-lto -g -O2 -Wall -W -o thumbnail thumbnail.o ../libtiff/.libs/libtiff.a ../port/.libs/libport.a -llzma -ljbig -ljpeg -lz -lm
|
||||||
afl-clang-lto++2.63d by Marc "vanHauser" Heuse <mh@mh-sec.de> in mode LTO
|
afl-clang-lto++2.63d by Marc "vanHauser" Heuse <mh@mh-sec.de> in mode LTO
|
||||||
@ -62,21 +63,24 @@ AUTODICTIONARY: 11 strings found
|
|||||||
|
|
||||||
### Installing llvm version 11 or 12
|
### Installing llvm version 11 or 12
|
||||||
|
|
||||||
llvm 11 or even 12 should be available in all current Linux repositories.
|
llvm 11 or even 12 should be available in all current Linux repositories. If you
|
||||||
If you use an outdated Linux distribution read the next section.
|
use an outdated Linux distribution, read the next section.
|
||||||
|
|
||||||
### Installing llvm from the llvm repository (version 12+)
|
### Installing llvm from the llvm repository (version 12+)
|
||||||
|
|
||||||
Installing the llvm snapshot builds is easy and mostly painless:
|
Installing the llvm snapshot builds is easy and mostly painless:
|
||||||
|
|
||||||
In the follow line change `NAME` for your Debian or Ubuntu release name
|
In the following line, change `NAME` for your Debian or Ubuntu release name
|
||||||
(e.g. buster, focal, eon, etc.):
|
(e.g. buster, focal, eon, etc.):
|
||||||
|
|
||||||
```
|
```
|
||||||
echo deb http://apt.llvm.org/NAME/ llvm-toolchain-NAME NAME >> /etc/apt/sources.list
|
echo deb http://apt.llvm.org/NAME/ llvm-toolchain-NAME NAME >> /etc/apt/sources.list
|
||||||
```
|
```
|
||||||
then add the pgp key of llvm and install the packages:
|
|
||||||
|
Then add the pgp key of llvm and install the packages:
|
||||||
|
|
||||||
```
|
```
|
||||||
wget -O - https://apt.llvm.org/llvm-snapshot.gpg.key | apt-key add -
|
wget -O - https://apt.llvm.org/llvm-snapshot.gpg.key | apt-key add -
|
||||||
apt-get update && apt-get upgrade -y
|
apt-get update && apt-get upgrade -y
|
||||||
apt-get install -y clang-12 clang-tools-12 libc++1-12 libc++-12-dev \
|
apt-get install -y clang-12 clang-tools-12 libc++1-12 libc++-12-dev \
|
||||||
libc++abi1-12 libc++abi-12-dev libclang1-12 libclang-12-dev \
|
libc++abi1-12 libc++abi-12-dev libclang1-12 libclang-12-dev \
|
||||||
@ -87,7 +91,8 @@ apt-get install -y clang-12 clang-tools-12 libc++1-12 libc++-12-dev \
|
|||||||
|
|
||||||
### Building llvm yourself (version 12+)
|
### Building llvm yourself (version 12+)
|
||||||
|
|
||||||
Building llvm from github takes quite some long time and is not painless:
|
Building llvm from GitHub takes quite some time and is not painless:
|
||||||
|
|
||||||
```sh
|
```sh
|
||||||
sudo apt install binutils-dev # this is *essential*!
|
sudo apt install binutils-dev # this is *essential*!
|
||||||
git clone --depth=1 https://github.com/llvm/llvm-project
|
git clone --depth=1 https://github.com/llvm/llvm-project
|
||||||
@ -126,10 +131,12 @@ sudo make install
|
|||||||
|
|
||||||
Just use afl-clang-lto like you did with afl-clang-fast or afl-gcc.
|
Just use afl-clang-lto like you did with afl-clang-fast or afl-gcc.
|
||||||
|
|
||||||
Also the instrument file listing (AFL_LLVM_ALLOWLIST/AFL_LLVM_DENYLIST -> [README.instrument_list.md](README.instrument_list.md)) and
|
Also, the instrument file listing (AFL_LLVM_ALLOWLIST/AFL_LLVM_DENYLIST ->
|
||||||
laf-intel/compcov (AFL_LLVM_LAF_* -> [README.laf-intel.md](README.laf-intel.md)) work.
|
[README.instrument_list.md](README.instrument_list.md)) and laf-intel/compcov
|
||||||
|
(AFL_LLVM_LAF_* -> [README.laf-intel.md](README.laf-intel.md)) work.
|
||||||
|
|
||||||
Example:
|
Example:
|
||||||
|
|
||||||
```
|
```
|
||||||
CC=afl-clang-lto CXX=afl-clang-lto++ RANLIB=llvm-ranlib AR=llvm-ar ./configure
|
CC=afl-clang-lto CXX=afl-clang-lto++ RANLIB=llvm-ranlib AR=llvm-ar ./configure
|
||||||
make
|
make
|
||||||
@ -143,51 +150,48 @@ NOTE: some targets also need to set the linker, try both `afl-clang-lto` and
|
|||||||
Note: this is highly discouraged! Try to compile to static libraries with
|
Note: this is highly discouraged! Try to compile to static libraries with
|
||||||
afl-clang-lto instead of shared libraries!
|
afl-clang-lto instead of shared libraries!
|
||||||
|
|
||||||
To make instrumented shared libraries work with afl-clang-lto you have to do
|
To make instrumented shared libraries work with afl-clang-lto, you have to do
|
||||||
quite some extra steps.
|
quite some extra steps.
|
||||||
|
|
||||||
Every shared library you want to instrument has to be individually compiled.
|
Every shared library you want to instrument has to be individually compiled. The
|
||||||
The environment variable `AFL_LLVM_LTO_DONTWRITEID=1` has to be set during
|
environment variable `AFL_LLVM_LTO_DONTWRITEID=1` has to be set during
|
||||||
compilation.
|
compilation. Additionally, the environment variable `AFL_LLVM_LTO_STARTID` has
|
||||||
Additionally the environment variable `AFL_LLVM_LTO_STARTID` has to be set to
|
to be set to the added edge count values of all previous compiled instrumented
|
||||||
the added edge count values of all previous compiled instrumented shared
|
shared libraries for that target. E.g., for the first shared library this would
|
||||||
libraries for that target.
|
be `AFL_LLVM_LTO_STARTID=0` and afl-clang-lto will then report how many edges
|
||||||
E.g. for the first shared library this would be `AFL_LLVM_LTO_STARTID=0` and
|
have been instrumented (let's say it reported 1000 instrumented edges). The
|
||||||
afl-clang-lto will then report how many edges have been instrumented (let's say
|
second shared library then has to be set to that value
|
||||||
it reported 1000 instrumented edges).
|
|
||||||
The second shared library then has to be set to that value
|
|
||||||
(`AFL_LLVM_LTO_STARTID=1000` in our example), for the third to all previous
|
(`AFL_LLVM_LTO_STARTID=1000` in our example), for the third to all previous
|
||||||
counts added, etc.
|
counts added, etc.
|
||||||
|
|
||||||
The final program compilation step then may *not* have `AFL_LLVM_LTO_DONTWRITEID`
|
The final program compilation step then may *not* have
|
||||||
set, and `AFL_LLVM_LTO_STARTID` must be set to all edge counts added of all shared
|
`AFL_LLVM_LTO_DONTWRITEID` set, and `AFL_LLVM_LTO_STARTID` must be set to all
|
||||||
libraries it will be linked to.
|
edge counts added of all shared libraries it will be linked to.
|
||||||
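The bookkeeping described above is a running sum. A minimal sketch, assuming three shared libraries whose edge counts (1000, 250, and 4000) are made up for illustration; `lto_start_ids` is a hypothetical helper, not an AFL++ tool:

```sh
# Print the AFL_LLVM_LTO_STARTID to use for each build step, given the edge
# counts that afl-clang-lto reported for the shared libraries, in build order.
# The counts passed in below are hypothetical; take the real ones from the
# afl-clang-lto output of each library build.
lto_start_ids() {
  start=0
  for edges in "$@"; do
    echo "$start"                # start ID for this shared library
    start=$((start + edges))     # add its edges for the next build step
  done
  echo "$start"                  # start ID for the final program build
}

lto_start_ids 1000 250 4000      # prints 0, 1000, 1250, 5250 (one per line)
```

The first three values are the `AFL_LLVM_LTO_STARTID` for each library build (together with `AFL_LLVM_LTO_DONTWRITEID=1`); the last value is the one for the final program build, where `AFL_LLVM_LTO_DONTWRITEID` must not be set.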
|
|
||||||
This is quite some hands-on work, so better stay away from instrumenting
|
This is quite some hands-on work, so better stay away from instrumenting shared
|
||||||
shared libraries :-)
|
libraries. :-)
|
||||||
|
|
||||||
## AUTODICTIONARY feature
|
## AUTODICTIONARY feature
|
||||||
|
|
||||||
While compiling, a dictionary based on string comparisons is automatically
|
While compiling, a dictionary based on string comparisons is automatically
|
||||||
generated and put into the target binary. This dictionary is transfered to afl-fuzz
|
generated and put into the target binary. This dictionary is transferred to
|
||||||
on start. This improves coverage statistically by 5-10% :)
|
afl-fuzz on start. This improves coverage statistically by 5-10%. :)
|
||||||
|
|
||||||
Note that if for any reason you do not want to use the autodictionary feature
|
Note that if for any reason you do not want to use the autodictionary feature,
|
||||||
then just set the environment variable `AFL_NO_AUTODICT` when starting afl-fuzz.
|
then just set the environment variable `AFL_NO_AUTODICT` when starting afl-fuzz.
|
||||||
|
|
||||||
## Fixed memory map
|
## Fixed memory map
|
||||||
|
|
||||||
To speed up fuzzing a little bit more, it is possible to set a fixed shared
|
To speed up fuzzing a little bit more, it is possible to set a fixed shared
|
||||||
memory map.
|
memory map. Recommended is the value 0x10000.
|
||||||
Recommended is the value 0x10000.
|
|
||||||
|
|
||||||
In most cases this will work without any problems. However if a target uses
|
In most cases, this will work without any problems. However, if a target uses
|
||||||
early constructors, ifuncs or a deferred forkserver this can crash the target.
|
early constructors, ifuncs, or a deferred forkserver, this can crash the target.
|
||||||
|
|
||||||
Also on unusual operating systems/processors/kernels or weird libraries the
|
Also, on unusual operating systems/processors/kernels or weird libraries the
|
||||||
recommended 0x10000 address might not work, so then change the fixed address.
|
recommended 0x10000 address might not work, so then change the fixed address.
|
||||||
|
|
||||||
To enable this feature set AFL_LLVM_MAP_ADDR with the address.
|
To enable this feature, set `AFL_LLVM_MAP_ADDR` with the address.
|
||||||
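For example, with the recommended address and the configure invocation used elsewhere in this README (a sketch; adjust the build command to your target):

```sh
# Use a fixed shared memory map at the recommended address.
export AFL_LLVM_MAP_ADDR=0x10000
CC=afl-clang-lto CXX=afl-clang-lto++ RANLIB=llvm-ranlib AR=llvm-ar ./configure
make
```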
|
|
||||||
## Document edge IDs
|
## Document edge IDs
|
||||||
|
|
||||||
@ -206,143 +210,155 @@ these.
|
|||||||
An example of a hard to solve target is ffmpeg. Here is how to successfully
|
An example of a hard to solve target is ffmpeg. Here is how to successfully
|
||||||
instrument it:
|
instrument it:
|
||||||
|
|
||||||
1. Get and extract the current ffmpeg and change to its directory
|
1. Get and extract the current ffmpeg and change to its directory.
|
||||||
|
|
||||||
2. Running configure with --cc=clang fails and various other items will fail
|
2. Running configure with --cc=clang fails and various other items will fail
|
||||||
when compiling, so we have to trick configure:
|
when compiling, so we have to trick configure:
|
||||||
|
|
||||||
```
|
```
|
||||||
./configure --enable-lto --disable-shared --disable-inline-asm
|
./configure --enable-lto --disable-shared --disable-inline-asm
|
||||||
```
|
```
|
||||||
|
|
||||||
3. Now the configuration is done - and we edit the settings in `./ffbuild/config.mak`
|
3. Now the configuration is done - and we edit the settings in
|
||||||
(-: the original line, +: what to change it into):
|
`./ffbuild/config.mak` (-: the original line, +: what to change it into):
|
||||||
```
|
|
||||||
-CC=gcc
|
|
||||||
+CC=afl-clang-lto
|
|
||||||
-CXX=g++
|
|
||||||
+CXX=afl-clang-lto++
|
|
||||||
-AS=gcc
|
|
||||||
+AS=llvm-as
|
|
||||||
-LD=gcc
|
|
||||||
+LD=afl-clang-lto++
|
|
||||||
-DEPCC=gcc
|
|
||||||
+DEPCC=afl-clang-lto
|
|
||||||
-DEPAS=gcc
|
|
||||||
+DEPAS=afl-clang-lto++
|
|
||||||
-AR=ar
|
|
||||||
+AR=llvm-ar
|
|
||||||
-AR_CMD=ar
|
|
||||||
+AR_CMD=llvm-ar
|
|
||||||
-NM_CMD=nm -g
|
|
||||||
+NM_CMD=llvm-nm -g
|
|
||||||
-RANLIB=ranlib -D
|
|
||||||
+RANLIB=llvm-ranlib -D
|
|
||||||
```
|
|
||||||
|
|
||||||
4. Then type make, wait for a long time and you are done :)
|
```
|
||||||
|
-CC=gcc
|
||||||
|
+CC=afl-clang-lto
|
||||||
|
-CXX=g++
|
||||||
|
+CXX=afl-clang-lto++
|
||||||
|
-AS=gcc
|
||||||
|
+AS=llvm-as
|
||||||
|
-LD=gcc
|
||||||
|
+LD=afl-clang-lto++
|
||||||
|
-DEPCC=gcc
|
||||||
|
+DEPCC=afl-clang-lto
|
||||||
|
-DEPAS=gcc
|
||||||
|
+DEPAS=afl-clang-lto++
|
||||||
|
-AR=ar
|
||||||
|
+AR=llvm-ar
|
||||||
|
-AR_CMD=ar
|
||||||
|
+AR_CMD=llvm-ar
|
||||||
|
-NM_CMD=nm -g
|
||||||
|
+NM_CMD=llvm-nm -g
|
||||||
|
-RANLIB=ranlib -D
|
||||||
|
+RANLIB=llvm-ranlib -D
|
||||||
|
```
|
||||||
|
|
||||||
|
4. Then type make, wait for a long time, and you are done. :)
|
||||||
|
|
||||||
### Example: WebKit jsc
|
### Example: WebKit jsc
|
||||||
|
|
||||||
Building jsc is difficult as the build script has bugs.
|
Building jsc is difficult as the build script has bugs.
|
||||||
|
|
||||||
1. checkout Webkit:
|
1. Checkout Webkit:
|
||||||
```
|
|
||||||
svn checkout https://svn.webkit.org/repository/webkit/trunk WebKit
|
```
|
||||||
cd WebKit
|
svn checkout https://svn.webkit.org/repository/webkit/trunk WebKit
|
||||||
```
|
cd WebKit
|
||||||
|
```
|
||||||
|
|
||||||
2. Fix the build environment:
|
2. Fix the build environment:
|
||||||
```
|
|
||||||
mkdir -p WebKitBuild/Release
|
|
||||||
cd WebKitBuild/Release
|
|
||||||
ln -s ../../../../../usr/bin/llvm-ar-12 llvm-ar-12
|
|
||||||
ln -s ../../../../../usr/bin/llvm-ranlib-12 llvm-ranlib-12
|
|
||||||
cd ../..
|
|
||||||
```
|
|
||||||
|
|
||||||
3. Build :)
|
```
|
||||||
|
mkdir -p WebKitBuild/Release
|
||||||
|
cd WebKitBuild/Release
|
||||||
|
ln -s ../../../../../usr/bin/llvm-ar-12 llvm-ar-12
|
||||||
|
ln -s ../../../../../usr/bin/llvm-ranlib-12 llvm-ranlib-12
|
||||||
|
cd ../..
|
||||||
|
```
|
||||||
|
|
||||||
```
|
3. Build. :)
|
||||||
Tools/Scripts/build-jsc --jsc-only --cli --cmakeargs="-DCMAKE_AR='llvm-ar-12' -DCMAKE_RANLIB='llvm-ranlib-12' -DCMAKE_VERBOSE_MAKEFILE:BOOL=ON -DCMAKE_CC_FLAGS='-O3 -lrt' -DCMAKE_CXX_FLAGS='-O3 -lrt' -DIMPORTED_LOCATION='/lib/x86_64-linux-gnu/' -DCMAKE_CC=afl-clang-lto -DCMAKE_CXX=afl-clang-lto++ -DENABLE_STATIC_JSC=ON"
|
|
||||||
```
|
```
|
||||||
|
Tools/Scripts/build-jsc --jsc-only --cli --cmakeargs="-DCMAKE_AR='llvm-ar-12' -DCMAKE_RANLIB='llvm-ranlib-12' -DCMAKE_VERBOSE_MAKEFILE:BOOL=ON -DCMAKE_CC_FLAGS='-O3 -lrt' -DCMAKE_CXX_FLAGS='-O3 -lrt' -DIMPORTED_LOCATION='/lib/x86_64-linux-gnu/' -DCMAKE_CC=afl-clang-lto -DCMAKE_CXX=afl-clang-lto++ -DENABLE_STATIC_JSC=ON"
|
||||||
|
```
|
||||||
|
|
||||||
## Potential issues
|
## Potential issues
|
||||||
|
|
||||||
### compiling libraries fails
|
### Compiling libraries fails
|
||||||
|
|
||||||
If you see this message:
|
If you see this message:
|
||||||
|
|
||||||
```
|
```
|
||||||
/bin/ld: libfoo.a: error adding symbols: archive has no index; run ranlib to add one
|
/bin/ld: libfoo.a: error adding symbols: archive has no index; run ranlib to add one
|
||||||
```
|
```
|
||||||
This is because usually gnu gcc ranlib is being called which cannot deal with clang LTO files.
|
|
||||||
The solution is simple: when you ./configure you also have to set RANLIB=llvm-ranlib and AR=llvm-ar
|
This is because usually gnu gcc ranlib is being called which cannot deal with
|
||||||
|
clang LTO files. The solution is simple: when you `./configure`, you also have
|
||||||
|
to set `RANLIB=llvm-ranlib` and `AR=llvm-ar`.
|
||||||
|
|
||||||
Solution:
|
Solution:
|
||||||
|
|
||||||
```
|
```
|
||||||
AR=llvm-ar RANLIB=llvm-ranlib CC=afl-clang-lto CXX=afl-clang-lto++ ./configure --disable-shared
|
AR=llvm-ar RANLIB=llvm-ranlib CC=afl-clang-lto CXX=afl-clang-lto++ ./configure --disable-shared
|
||||||
```
|
```
|
||||||
and on some targets you have to set AR=/RANLIB= even for make as the configure script does not save it.
|
|
||||||
Other targets ignore environment variables and need the parameters set via
|
|
||||||
`./configure --cc=... --cxx= --ranlib= ...` etc. (I am looking at you ffmpeg!).
|
|
||||||
|
|
||||||
|
And on some targets you have to set `AR=/RANLIB=` even for `make` as the
|
||||||
|
configure script does not save it. Other targets ignore environment variables
|
||||||
|
and need the parameters set via `./configure --cc=... --cxx= --ranlib= ...` etc.
|
||||||
|
(I am looking at you ffmpeg!)
|
||||||
|
|
||||||
|
If you see this message:
|
||||||
|
|
||||||
If you see this message
|
|
||||||
```
|
```
|
||||||
assembler command failed ...
|
assembler command failed ...
|
||||||
```
|
```
|
||||||
then try setting `llvm-as` for configure:
|
|
||||||
|
Then try setting `llvm-as` for configure:
|
||||||
|
|
||||||
```
|
```
|
||||||
AS=llvm-as ...
|
AS=llvm-as ...
|
||||||
```
|
```
|
||||||
|
|
||||||
### compiling programs still fails
|
### Compiling programs still fails
|
||||||
|
|
||||||
afl-clang-lto is still work in progress.
|
afl-clang-lto is still work in progress.
|
||||||
|
|
||||||
Known issues:
|
Known issues:
|
||||||
* Anything that llvm 11+ cannot compile, afl-clang-lto cannot compile either - obviously
|
* Anything that llvm 11+ cannot compile, afl-clang-lto cannot compile either -
|
||||||
* Anything that does not compile with LTO, afl-clang-lto cannot compile either - obviously
|
obviously.
|
||||||
|
* Anything that does not compile with LTO, afl-clang-lto cannot compile either -
|
||||||
|
obviously.
|
||||||
|
|
||||||
Hence if building a target with afl-clang-lto fails try to build it with llvm12
|
Hence, if building a target with afl-clang-lto fails, try to build it with
|
||||||
and LTO enabled (`CC=clang-12` `CXX=clang++-12` `CFLAGS=-flto=full` and
|
llvm12 and LTO enabled (`CC=clang-12`, `CXX=clang++-12`, `CFLAGS=-flto=full`,
|
||||||
`CXXFLAGS=-flto=full`).
|
and `CXXFLAGS=-flto=full`).
|
||||||
|
|
||||||
If this succeeeds then there is an issue with afl-clang-lto. Please report at
|
If this succeeds, then there is an issue with afl-clang-lto. Please report at
|
||||||
[https://github.com/AFLplusplus/AFLplusplus/issues/226](https://github.com/AFLplusplus/AFLplusplus/issues/226)
|
[https://github.com/AFLplusplus/AFLplusplus/issues/226](https://github.com/AFLplusplus/AFLplusplus/issues/226).
|
||||||
|
|
||||||
Even some targets where clang-12 fails can be built if the failure is just in
|
Even some targets where clang-12 fails can be built if the failure is just in
|
||||||
`./configure`, see `Solving difficult targets` above.
|
`./configure`, see `Solving difficult targets` above.
|
||||||
|
|
||||||
## History
|
## History
|
||||||
|
|
||||||
This was originally envisioned by hexcoder- in Summer 2019, however we saw no
|
This was originally envisioned by hexcoder- in Summer 2019. However, we saw no
|
||||||
way to create a pass that is run at link time - although there is an option
|
way to create a pass that is run at link time - although there is an option for
|
||||||
for this in the PassManager: EP_FullLinkTimeOptimizationLast
|
this in the PassManager: EP_FullLinkTimeOptimizationLast. ("Fun" info - nobody
|
||||||
("Fun" info - nobody knows what this is doing. And the developer who
|
knows what this is doing. And the developer who implemented this didn't respond
|
||||||
implemented this didn't respond to emails.)
|
to emails.)
|
||||||
|
|
||||||
In December then came the idea to implement this as a pass that is run via
|
In December then came the idea to implement this as a pass that is run via the
|
||||||
the llvm "opt" program, which is performed via an own linker that afterwards
|
llvm "opt" program, which is performed via an own linker that afterwards calls
|
||||||
calls the real linker.
|
the real linker. This was first implemented in January and worked ... kinda. The
|
||||||
This was first implemented in January and worked ... kinda.
|
LTO time instrumentation worked, however, "how" the basic blocks were
|
||||||
The LTO time instrumentation worked, however "how" the basic blocks were
|
instrumented was a problem, as reducing duplicates turned out to be very, very
|
||||||
instrumented was a problem, as reducing duplicates turned out to be very,
|
difficult with a program that has so many paths and therefore so many
|
||||||
very difficult with a program that has so many paths and therefore so many
|
dependencies. A lot of strategies were implemented - and failed. And then sat
|
||||||
dependencies. A lot of strategies were implemented - and failed.
|
solvers were tried, but with over 10.000 variables that turned out to be a
|
||||||
And then sat solvers were tried, but with over 10.000 variables that turned
|
dead-end too.
|
||||||
out to be a dead-end too.
|
|
||||||
|
|
||||||
The final idea to solve this came from domenukk who proposed to insert a block
|
The final idea to solve this came from domenukk who proposed to insert a block
|
||||||
into an edge and then just use incremental counters ... and this worked!
|
into an edge and then just use incremental counters ... and this worked! After
|
||||||
After some trials and errors to implement this vanhauser-thc found out that
|
some trials and errors to implement this vanhauser-thc found out that there is
|
||||||
there is actually an llvm function for this: SplitEdge() :-)
|
actually an llvm function for this: SplitEdge() :-)
|
||||||
|
|
||||||
Still more problems came up though as this only works without bugs from
|
Still more problems came up though as this only works without bugs from llvm 9
|
||||||
llvm 9 onwards, and with high optimization the link optimization ruins
|
onwards, and with high optimization the link optimization ruins the instrumented
|
||||||
the instrumented control flow graph.
|
control flow graph.
|
||||||
|
|
||||||
This is all now fixed with llvm 11+. The llvm's own linker is now able to
|
This is all now fixed with llvm 11+. The llvm's own linker is now able to load
|
||||||
load passes and this bypasses all problems we had.
|
passes and this bypasses all problems we had.
|
||||||
|
|
||||||
Happy end :)
|
Happy end :)
|
@ -132,7 +132,7 @@ and you should be all set!
|
|||||||
Some libraries provide APIs that are stateless, or whose state can be reset in
|
Some libraries provide APIs that are stateless, or whose state can be reset in
|
||||||
between processing different input files. When such a reset is performed, a
|
between processing different input files. When such a reset is performed, a
|
||||||
single long-lived process can be reused to try out multiple test cases,
|
single long-lived process can be reused to try out multiple test cases,
|
||||||
eliminating the need for repeated fork() calls and the associated OS overhead.
|
eliminating the need for repeated `fork()` calls and the associated OS overhead.
|
||||||
|
|
||||||
The basic structure of the program that does this would be:
|
The basic structure of the program that does this would be:
|
||||||
|
|
||||||
|