mirror of
https://github.com/AFLplusplus/AFLplusplus.git
synced 2025-06-18 20:48:07 +00:00
Clean up docs folder
This commit is contained in:
@ -37,9 +37,10 @@ superior to blind fuzzing or coverage-only tools.

## Understanding the status screen

This chapter provides an overview of the status screen - plus tips for
troubleshooting any warnings and red text shown in the UI.

For the general instruction manual, see [README.md](../README.md).

### A note about colors

@ -47,7 +48,7 @@ The status screen and error messages use colors to keep things readable and
attract your attention to the most important details. For example, red almost
always means "consult this doc" :-)

Unfortunately, the UI will only render correctly if your terminal is using a
traditional un*x palette (white text on black background) or something close to
that.

@ -61,7 +62,7 @@ If you are using inverse video, you may want to change your settings, say:

Alternatively, if you really like your current colors, you can edit config.h to
comment out USE_COLORS, then do `make clean all`.

We are not aware of any other simple way to make this work without causing other
side effects - sorry about that.

With that out of the way, let's talk about what's actually on the screen...

@ -103,8 +104,8 @@ will be allowed to run for months.

There's one important thing to watch out for: if the tool is not finding new
paths within several minutes of starting, you're probably not invoking the
target binary correctly and it never gets to parse the input files we're
throwing at it; other possible explanations are that the default memory limit
(`-m`) is too restrictive and the program exits after failing to allocate a
buffer very early on; or that the input files are patently invalid and always
fail a basic header check.

@ -124,9 +125,9 @@ red warning in this section, too :-)

The first field in this section gives you the count of queue passes done so far
- that is, the number of times the fuzzer went over all the interesting test
cases discovered so far, fuzzed them, and looped back to the very beginning.
Every fuzzing session should be allowed to complete at least one cycle; and
ideally, should run much longer than that.

As noted earlier, the first pass can take a day or longer, so sit back and
relax.

@ -140,7 +141,8 @@ while.

The remaining fields in this part of the screen should be pretty obvious:
there's the number of test cases ("paths") discovered so far, and the number of
unique faults. The test cases, crashes, and hangs can be explored in real-time
by browsing the output directory, see
[#interpreting-output](#interpreting-output).

### Cycle progress

@ -1,49 +1,61 @@

# Important features of AFL++

AFL++ supports llvm from 3.8 up to version 12, very fast binary fuzzing with
QEMU 5.1 with laf-intel and redqueen, frida mode, unicorn mode, gcc plugin, full
*BSD, Mac OS, Solaris and Android support and much, much, much more.

| Feature/Instrumentation  | afl-gcc | llvm      | gcc_plugin | frida_mode(9)    | qemu_mode(10)    |unicorn_mode(10)  |coresight_mode(11)|
| -------------------------|:-------:|:---------:|:----------:|:----------------:|:----------------:|:----------------:|:----------------:|
| Threadsafe counters      |         | x(3)      |            |                  |                  |                  |                  |
| NeverZero                | x86[_64]| x(1)      | x          | x                | x                | x                |                  |
| Persistent Mode          |         | x         | x          | x86[_64]/arm64   | x86[_64]/arm[64] | x                |                  |
| LAF-Intel / CompCov      |         | x         |            |                  | x86[_64]/arm[64] | x86[_64]/arm[64] |                  |
| CmpLog                   |         | x         |            | x86[_64]/arm64   | x86[_64]/arm[64] |                  |                  |
| Selective Instrumentation|         | x         | x          | x                | x                |                  |                  |
| Non-Colliding Coverage   |         | x(4)      |            |                  | (x)(5)           |                  |                  |
| Ngram prev_loc Coverage  |         | x(6)      |            |                  |                  |                  |                  |
| Context Coverage         |         | x(6)      |            |                  |                  |                  |                  |
| Auto Dictionary          |         | x(7)      |            |                  |                  |                  |                  |
| Snapshot LKM Support     |         | (x)(8)    | (x)(8)     |                  | (x)(5)           |                  |                  |
| Shared Memory Test cases |         | x         | x          | x86[_64]/arm64   | x                | x                |                  |

1. default for LLVM >= 9.0, env var for older version due to an efficiency bug
   in previous llvm versions
2. GCC creates non-performant code, hence it is disabled in gcc_plugin
3. with `AFL_LLVM_THREADSAFE_INST`, disables NeverZero
4. with pcguard mode and LTO mode for LLVM 11 and newer
5. upcoming, development in the branch
6. not compatible with LTO instrumentation and needs at least LLVM v4.1
7. automatic in LTO mode with LLVM 11 and newer, an extra pass for all LLVM
   versions that writes to a file to use with afl-fuzz' `-x`
8. the snapshot LKM is currently unmaintained due to too many kernel changes
   coming too fast :-(
9. frida mode is supported on Linux and MacOS for Intel and ARM
10. QEMU/Unicorn is only supported on Linux
11. Coresight mode is only available on AARCH64 Linux with a CPU with Coresight
    extension

Among others, the following features and patches have been integrated:

* NeverZero patch for afl-gcc, instrumentation, qemu_mode and unicorn_mode which
  prevents a map value from wrapping around to zero and increases coverage
* Persistent mode, deferred forkserver and in-memory fuzzing for qemu_mode
* Unicorn mode which allows fuzzing of binaries from completely different
  platforms (integration provided by domenukk)
* The new CmpLog instrumentation for LLVM and QEMU inspired by
  [Redqueen](https://www.syssec.ruhr-uni-bochum.de/media/emma/veroeffentlichungen/2018/12/17/NDSS19-Redqueen.pdf)
* Win32 PE binary-only fuzzing with QEMU and Wine
* AFLfast's power schedules by Marcel Böhme:
  [https://github.com/mboehme/aflfast](https://github.com/mboehme/aflfast)
* The MOpt mutator:
  [https://github.com/puppet-meteor/MOpt-AFL](https://github.com/puppet-meteor/MOpt-AFL)
* LLVM mode Ngram coverage by Adrian Herrera
  [https://github.com/adrianherrera/afl-ngram-pass](https://github.com/adrianherrera/afl-ngram-pass)
* LAF-Intel/CompCov support for instrumentation, qemu_mode and unicorn_mode
  (with enhanced capabilities)
* Radamsa and honggfuzz mutators (as custom mutators).
* QBDI mode to fuzz android native libraries via Quarkslab's
  [QBDI](https://github.com/QBDI/QBDI) framework
* Frida and ptrace mode to fuzz binary-only libraries, etc.

So all in all this is the best-of AFL that is out there :-)

@ -84,6 +84,8 @@ Wine, python3, and the pefile python package installed.

It is included in AFL++.

For more information, see
[qemu_mode/README.wine.md](../qemu_mode/README.wine.md).

### Frida_mode

In frida_mode, you can fuzz binary-only targets as easily as with QEMU.

@ -99,11 +101,13 @@ make
```

For additional instructions and caveats, see
[frida_mode/README.md](../frida_mode/README.md).

If possible, you should use the persistent mode, see
[qemu_frida/README.md](../qemu_frida/README.md). The mode is approximately 2-5x
slower than compile-time instrumentation, and is less conducive to
parallelization. But for binary-only fuzzing, it gives a huge speed improvement
if it is possible to use.

If you want to fuzz a binary-only library, then you can fuzz it with frida-gum
via frida_mode/. You will have to write a harness to call the target function in

@ -154,8 +158,6 @@ and use afl-untracer.c as a template. It is slower than frida_mode.

For more information, see
[utils/afl_untracer/README.md](../utils/afl_untracer/README.md).

### Coresight

Coresight is ARM's answer to Intel's PT. With AFL++ v3.15, there is a coresight
@ -163,6 +165,35 @@ tracer implementation available in `coresight_mode/` which is faster than QEMU,
however, it cannot run in parallel. Currently, only one process can be traced;
it is WIP.

For more information, see
[coresight_mode/README.md](../coresight_mode/README.md).

## Binary rewriters

An alternative solution is binary rewriters. They are faster than the solutions
native to AFL++ but don't always work.

### ZAFL

ZAFL is a static rewriting platform supporting x86-64 C/C++,
stripped/unstripped, and PIE/non-PIE binaries. Beyond conventional
instrumentation, ZAFL's API enables transformation passes (e.g., laf-Intel,
context sensitivity, InsTrim, etc.).

Its baseline instrumentation speed typically averages 90-95% of
afl-clang-fast's.

[https://git.zephyr-software.com/opensrc/zafl](https://git.zephyr-software.com/opensrc/zafl)

### RetroWrite

If you have an x86/x86_64 binary that still has its symbols, is compiled with
position independent code (PIC/PIE), and does not use most of the C++ features,
then the RetroWrite solution might be for you. It decompiles to ASM files which
can then be instrumented with afl-gcc.

It is at about 80-85% performance.

[https://github.com/HexHive/retrowrite](https://github.com/HexHive/retrowrite)

### Dyninst

Dyninst is a binary instrumentation framework similar to Pintool and DynamoRIO.
@ -183,27 +214,6 @@ with afl-dyninst.

[https://github.com/vanhauser-thc/afl-dyninst](https://github.com/vanhauser-thc/afl-dyninst)

### Mcsema

Theoretically, you can also decompile to llvm IR with mcsema, and then use
@ -211,6 +221,8 @@ llvm_mode to instrument the binary. Good luck with that.

[https://github.com/lifting-bits/mcsema](https://github.com/lifting-bits/mcsema)

## Binary tracers

### Pintool & DynamoRIO

Pintool and DynamoRIO are dynamic instrumentation engines. They can be used for
@ -236,27 +248,26 @@ Pintool solutions:
* [https://github.com/spinpx/afl_pin_mode](https://github.com/spinpx/afl_pin_mode)
  <= only old Pintool version supported

### Intel PT

If you have a newer Intel CPU, you can make use of Intel's processor trace. The
big issue with Intel's PT is the small buffer size and the complex encoding of
the debug information collected through PT. This makes the decoding very CPU
intensive and hence slow. As a result, the overall speed decrease is about
70-90% (depending on the implementation and other factors).

There are two AFL intel-pt implementations:

1. [https://github.com/junxzm1990/afl-pt](https://github.com/junxzm1990/afl-pt)
   => This needs Ubuntu 14.04.05 without any updates and the 4.4 kernel.

2. [https://github.com/hunter-ht-2018/ptfuzzer](https://github.com/hunter-ht-2018/ptfuzzer)
   => This needs a 4.14 or 4.15 kernel. The "nopti" kernel boot option must be
   used. This one is faster than the other.

Note that there is also honggfuzz:
[https://github.com/google/honggfuzz](https://github.com/google/honggfuzz). But
its IPT performance is just 6%!

## Non-AFL++ solutions

@ -1,36 +1,37 @@

# Known limitations & areas for improvement

Here are some of the most important caveats for AFL++:

- AFL++ detects faults by checking for the first spawned process dying due to a
  signal (SIGSEGV, SIGABRT, etc). Programs that install custom handlers for
  these signals may need to have the relevant code commented out. In the same
  vein, faults in child processes spawned by the fuzzed target may evade
  detection unless you manually add some code to catch that.

- As with any other brute-force tool, the fuzzer offers limited coverage if
  encryption, checksums, cryptographic signatures, or compression are used to
  wholly wrap the actual data format to be tested.

  To work around this, you can comment out the relevant checks (see
  utils/libpng_no_checksum/ for inspiration); if this is not possible, you can
  also write a postprocessor, one of the hooks of custom mutators. See
  [custom_mutators.md](custom_mutators.md) on how to use
  `AFL_CUSTOM_MUTATOR_LIBRARY`.
- There are some unfortunate trade-offs with ASAN and 64-bit binaries. This
  isn't due to any specific fault of afl-fuzz.

- There is no direct support for fuzzing network services, background daemons,
  or interactive apps that require UI interaction to work. You may need to make
  simple code changes to make them behave in a more traditional way. Preeny may
  offer a relatively simple option, too - see:
  [https://github.com/zardus/preeny](https://github.com/zardus/preeny)

  Some useful tips for modifying network-based services can also be found at:
  [https://www.fastly.com/blog/how-to-fuzz-server-american-fuzzy-lop](https://www.fastly.com/blog/how-to-fuzz-server-american-fuzzy-lop)

- Occasionally, sentient machines rise against their creators. If this happens
  to you, please consult
  [https://lcamtuf.coredump.cx/prep/](https://lcamtuf.coredump.cx/prep/).
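The checksum workaround described earlier (a postprocessor hook of a custom mutator) can be sketched in a few lines of Python; AFL++'s Python custom mutator interface calls a module-level `post_process(buf)` on every mutated input right before execution. The 4-byte big-endian CRC32 header below is a hypothetical format, not any particular target's - substitute whatever integrity check your target actually enforces.

```python
# Sketch of a custom-mutator postprocessor (loaded via AFL++'s Python
# custom mutator support). Assumes a HYPOTHETICAL input format whose first
# 4 bytes must hold the big-endian CRC32 of the remaining payload.
import struct
import zlib

def post_process(buf):
    # Called on each mutated test case before execution: patch the checksum
    # so the input survives the target's header validation.
    data = bytes(buf)
    if len(data) < 4:
        return data  # too short to carry a checksum header; pass through
    payload = data[4:]
    checksum = struct.pack(">I", zlib.crc32(payload) & 0xFFFFFFFF)
    return checksum + payload
```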

Beyond this, see [INSTALL.md](INSTALL.md) for platform-specific tips.
@ -1,256 +0,0 @@

# Tips for parallel fuzzing

This document talks about synchronizing afl-fuzz jobs on a single machine or
across a fleet of systems. See README.md for the general instruction manual.

Note that this document is rather outdated. Please refer to the main document
section on multiple core usage
[fuzzing_in_depth.md:b) Using multiple cores](fuzzing_in_depth.md#b-using-multiple-cores)
for up-to-date strategies!

## 1) Introduction

Every copy of afl-fuzz will take up one CPU core. This means that on an n-core
system, you can almost always run around n concurrent fuzzing jobs with
virtually no performance hit (you can use the afl-gotcpu tool to make sure).

In fact, if you rely on just a single job on a multi-core system, you will be
underutilizing the hardware. So, parallelization is always the right way to go.

When targeting multiple unrelated binaries or using the tool in
"non-instrumented" (-n) mode, it is perfectly fine to just start up several
fully separate instances of afl-fuzz. The picture gets more complicated when you
want to have multiple fuzzers hammering a common target: if a hard-to-hit but
interesting test case is synthesized by one fuzzer, the remaining instances will
not be able to use that input to guide their work.

To help with this problem, afl-fuzz offers a simple way to synchronize test
cases on the fly.

It is a good idea to use different power schedules if you run several instances
in parallel (`-p` option).

Alternatively, running other AFL spinoffs in parallel can be of value, e.g.
Angora (https://github.com/AngoraFuzzer/Angora/).

## 2) Single-system parallelization

If you wish to parallelize a single job across multiple cores on a local system,
simply create a new, empty output directory ("sync dir") that will be shared by
all the instances of afl-fuzz; and then come up with a naming scheme for every
instance - say, "fuzzer01", "fuzzer02", etc.

Run the first one ("main node", -M) like this:

```
./afl-fuzz -i testcase_dir -o sync_dir -M fuzzer01 [...other stuff...]
```

...and then, start up secondary (-S) instances like this:

```
./afl-fuzz -i testcase_dir -o sync_dir -S fuzzer02 [...other stuff...]
./afl-fuzz -i testcase_dir -o sync_dir -S fuzzer03 [...other stuff...]
```

Each fuzzer will keep its state in a separate subdirectory, like so:

  /path/to/sync_dir/fuzzer01/

Each instance will also periodically rescan the top-level sync directory for any
test cases found by other fuzzers - and will incorporate them into its own
fuzzing when they are deemed interesting enough. For performance reasons, only
the -M main node syncs the queue with everyone; the -S secondary nodes will only
sync from the main node.
The difference between the -M and -S modes is that the main instance will still
perform deterministic checks; while the secondary instances will proceed
straight to random tweaks.

Note that you must always have one -M main instance! Running multiple -M
instances is wasteful!

You can also monitor the progress of your jobs from the command line with the
provided afl-whatsup tool. When the instances are no longer finding new paths,
it's probably time to stop.

WARNING: Exercise caution when explicitly specifying the -f option. Each fuzzer
must use a separate temporary file; otherwise, things will go south. One safe
example may be:

```
./afl-fuzz [...] -S fuzzer10 -f file10.txt ./fuzzed/binary @@
./afl-fuzz [...] -S fuzzer11 -f file11.txt ./fuzzed/binary @@
./afl-fuzz [...] -S fuzzer12 -f file12.txt ./fuzzed/binary @@
```

This is not a concern if you use @@ without -f and let afl-fuzz come up with the
file name.

## 3) Multiple -M mains

There is support for parallelizing the deterministic checks. This is only needed
where

1. many new paths are found fast over a long time and it looks unlikely that
   the main node will ever catch up, and
2. deterministic fuzzing is actively helping path discovery (you can see this
   in the main node for the first four lines in the "fuzzing strategy yields"
   section. If the ratio `found/attempts` is high, then it is effective. It
   most commonly isn't.)

Only if both are true is it beneficial to have more than one main. You can
leverage this by creating -M instances like so:

```
./afl-fuzz -i testcase_dir -o sync_dir -M mainA:1/3 [...]
./afl-fuzz -i testcase_dir -o sync_dir -M mainB:2/3 [...]
./afl-fuzz -i testcase_dir -o sync_dir -M mainC:3/3 [...]
```

... where the first value after ':' is the sequential ID of a particular main
instance (starting at 1), and the second value is the total number of fuzzers to
distribute the deterministic fuzzing across. Note that if you boot up fewer
fuzzers than indicated by the second number passed to -M, you may end up with
poor coverage.

## 4) Syncing with non-AFL fuzzers or independent instances

A -M main node can be told with the `-F other_fuzzer_queue_directory` option to
sync results from other fuzzers, e.g. libfuzzer or honggfuzz.

Only the specified directory will be synced into afl, not subdirectories. The
specified directory does not need to exist yet at the start of afl.

The `-F` option can be passed to the main node several times.

## 5) Multi-system parallelization

The basic operating principle for multi-system parallelization is similar to the
mechanism explained in section 2. The key difference is that you need to write a
simple script that performs two actions:

- Uses SSH with authorized_keys to connect to every machine and retrieve a tar
  archive of the /path/to/sync_dir/<main_node(s)> directory local to the
  machine. It is best to use a naming scheme that includes the host name and
  whether it is a main node (e.g. main1, main2) in the fuzzer ID, so that you
  can do something like:

  ```sh
  for host in `cat HOSTLIST`; do
    ssh user@$host "tar -czf - sync/${host}_main*/" > $host.tgz
  done
  ```

- Distributes and unpacks these files on all the remaining machines, e.g.:

  ```sh
  for srchost in `cat HOSTLIST`; do
    for dsthost in `cat HOSTLIST`; do
      test "$srchost" = "$dsthost" && continue
      ssh user@$srchost 'tar -kxzf -' < $dsthost.tgz
    done
  done
  ```

There is an example of such a script in utils/distributed_fuzzing/.

There are other (older) more featured, experimental tools:
* https://github.com/richo/roving
* https://github.com/MartijnB/disfuzz-afl

However, these do not support syncing just main nodes (yet).

When developing custom test case sync code, there are several optimizations to
keep in mind:

- The synchronization does not have to happen very often; running the task
  every 60 minutes or even less often at later fuzzing stages is fine.

- There is no need to synchronize crashes/ or hangs/; you only need to copy
  over queue/* (and ideally, also fuzzer_stats).

- It is not necessary (and not advisable!) to overwrite existing files; the -k
  option in tar is a good way to avoid that.

- There is no need to fetch directories for fuzzers that are not running
  locally on a particular machine, and were simply copied over onto that
  system during earlier runs.

- For large fleets, you will want to consolidate tarballs for each host, as
  this will let you use n SSH connections for sync, rather than n*(n-1).

  You may also want to implement staged synchronization. For example, you
  could have 10 groups of systems, with group 1 pushing test cases only to
  group 2; group 2 pushing them only to group 3; and so on, with group 10
  eventually feeding back to group 1.

  This arrangement would allow interesting test cases to propagate across the
  fleet without having to copy every fuzzer queue to every single host.

- You do not want a "main" instance of afl-fuzz on every system; you should
  run them all with -S, and just designate a single process somewhere within
  the fleet to run with -M.

- Syncing is only necessary for the main nodes on a system. It is possible to
  run main-less with only secondaries. However, then you need to find out which
  secondary took over the temporary role to be the main node. Look for the
  `is_main_node` file in the fuzzer directories, e.g.
  `sync-dir/hostname-*/is_main_node`.

It is *not* advisable to skip the synchronization script and run the fuzzers
directly on a network filesystem; unexpected latency and unkillable processes in
I/O wait state can mess things up.

## 6) Remote monitoring and data collection

You can use screen, nohup, tmux, or something equivalent to run remote instances
of afl-fuzz. If you redirect the program's output to a file, it will
automatically switch from a fancy UI to more limited status reports. There is
also basic machine-readable information which is always written to the
fuzzer_stats file in the output directory. Locally, that information can be
interpreted with afl-whatsup.
|
||||
|
||||
In principle, you can use the status screen of the main (-M) instance to
monitor the overall fuzzing progress and decide when to stop. In this mode,
the most important signal is just that no new paths are being found for a long
while. If you do not have a main instance, just pick any single secondary
instance to watch and go by that.

You can also rely on that instance's output directory to collect the
synthesized corpus that covers all the noteworthy paths discovered anywhere
within the fleet. Secondary (-S) instances do not require any special
monitoring, other than just making sure that they are up.

Keep in mind that crashing inputs are *not* automatically propagated to the
main instance, so you may still want to monitor for crashes fleet-wide from
within your synchronization or health checking scripts (see afl-whatsup).

## 7) Asymmetric setups

It is perhaps worth noting that all of the following is permitted:

- Running afl-fuzz in conjunction with other guided tools that can extend
  coverage (e.g., via concolic execution). Third-party tools simply need to
  follow the protocol described above for pulling new test cases from
  out_dir/&lt;fuzzer_id&gt;/queue/* and writing their own finds to sequentially
  numbered id:nnnnnn files in out_dir/&lt;ext_tool_id&gt;/queue/*.

- Running some of the synchronized fuzzers with different (but related) target
  binaries. For example, simultaneously stress-testing several different JPEG
  parsers (say, IJG jpeg and libjpeg-turbo) while sharing the discovered test
  cases can have synergistic effects and improve the overall coverage.

  (In this case, running one -M instance per target is necessary.)

- Having some of the fuzzers invoke the binary in different ways. For example,
  'djpeg' supports several DCT modes, configurable with a command-line flag,
  while 'dwebp' supports incremental and one-shot decoding. In some scenarios,
  going after multiple distinct modes and then pooling test cases will improve
  coverage.

- Much less convincingly, running the synchronized fuzzers with different
  starting test cases (e.g., progressive and standard JPEG) or dictionaries.
  The synchronization mechanism ensures that the test sets will get fairly
  homogeneous over time, but it introduces some initial variability.

# Technical "whitepaper" for afl-fuzz

NOTE: this document is mostly outdated!

This document provides a quick overview of the guts of American Fuzzy Lop.
See README.md for the general instruction manual; and for a discussion of
motivations and design goals behind AFL, see historical_notes.md.

## 0. Design statement

American Fuzzy Lop does its best not to focus on any singular principle of
operation and not be a proof-of-concept for any specific theory. The tool can
be thought of as a collection of hacks that have been tested in practice,
found to be surprisingly effective, and have been implemented in the simplest,
most robust way I could think of at the time.

Many of the resulting features are made possible thanks to the availability of
lightweight instrumentation that served as a foundation for the tool, but this
mechanism should be thought of merely as a means to an end. The only true
governing principles are speed, reliability, and ease of use.

## 1. Coverage measurements

The instrumentation injected into compiled programs captures branch (edge)
coverage, along with coarse branch-taken hit counts. The code injected at
branch points is essentially equivalent to:

```c
cur_location = <COMPILE_TIME_RANDOM>;
shared_mem[cur_location ^ prev_location]++;
prev_location = cur_location >> 1;
```

The `cur_location` value is generated randomly to simplify the process of
linking complex projects and keep the XOR output distributed uniformly.

The `shared_mem[]` array is a 64 kB SHM region passed to the instrumented
binary by the caller. Every byte set in the output map can be thought of as a
hit for a particular (`branch_src`, `branch_dst`) tuple in the instrumented
code.

The size of the map is chosen so that collisions are sporadic with almost all
of the intended targets, which usually sport between 2k and 10k discoverable
branch points:

```
   Branch cnt | Colliding tuples | Example targets
  ------------+------------------+-----------------
        1,000 | 0.75%            | giflib, lzo
        2,000 | 1.5%             | zlib, tar, xz
        5,000 | 3.5%             | libpng, libwebp
       10,000 | 7%               | libxml
       20,000 | 14%              | sqlite
       50,000 | 30%              | -
```

At the same time, its size is small enough to allow the map to be analyzed
in a matter of microseconds on the receiving end, and to effortlessly fit
within L2 cache.

This form of coverage provides considerably more insight into the execution
path of the program than simple block coverage. In particular, it trivially
distinguishes between the following execution traces:

```
A -> B -> C -> D -> E (tuples: AB, BC, CD, DE)
A -> B -> D -> C -> E (tuples: AB, BD, DC, CE)
```

This aids the discovery of subtle fault conditions in the underlying code,
because security vulnerabilities are more often associated with unexpected
or incorrect state transitions than with merely reaching a new basic block.

The reason for the shift operation in the last line of the pseudocode shown
earlier in this section is to preserve the directionality of tuples (without
this, A ^ B would be indistinguishable from B ^ A) and to retain the identity
of tight loops (otherwise, A ^ A would be obviously equal to B ^ B).

The absence of simple saturating arithmetic opcodes on Intel CPUs means that
the hit counters can sometimes wrap around to zero. Since this is a fairly
unlikely and localized event, it's seen as an acceptable performance trade-off.

## 2. Detecting new behaviors

The fuzzer maintains a global map of tuples seen in previous executions; this
data can be rapidly compared with individual traces and updated in just a
couple of dword- or qword-wide instructions and a simple loop.

When a mutated input produces an execution trace containing new tuples, the
corresponding input file is preserved and routed for additional processing
later on (see section #3). Inputs that do not trigger new local-scale state
transitions in the execution trace (i.e., produce no new tuples) are
discarded, even if their overall control flow sequence is unique.

This approach allows for a very fine-grained and long-term exploration of
program state while not having to perform any computationally intensive and
fragile global comparisons of complex execution traces, and while avoiding the
scourge of path explosion.

To illustrate the properties of the algorithm, consider that the second trace
shown below would be considered substantially new because of the presence of
new tuples (CA, AE):

```
#1: A -> B -> C -> D -> E
#2: A -> B -> C -> A -> E
```

At the same time, with #2 processed, the following pattern will not be seen
as unique, despite having a markedly different overall execution path:

```
#3: A -> B -> C -> A -> B -> C -> A -> B -> C -> D -> E
```

In addition to detecting new tuples, the fuzzer also considers coarse tuple
hit counts. These are divided into several buckets:

```
1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+
```

To some extent, the number of buckets is an implementation artifact: it allows
an in-place mapping of an 8-bit counter generated by the instrumentation to
an 8-position bitmap relied on by the fuzzer executable to keep track of the
already-seen execution counts for each tuple.

Changes within the range of a single bucket are ignored; transition from one
bucket to another is flagged as an interesting change in program control flow,
and is routed to the evolutionary process outlined in the section below.

The hit count behavior provides a way to distinguish between potentially
interesting control flow changes, such as a block of code being executed
twice when it was normally hit only once. At the same time, it is fairly
insensitive to empirically less notable changes, such as a loop going from
47 cycles to 48. The counters also provide some degree of "accidental"
immunity against tuple collisions in dense trace maps.

The execution is policed fairly heavily through memory and execution time
limits; by default, the timeout is set at 5x the initially-calibrated
execution speed, rounded up to 20 ms. The aggressive timeouts are meant to
prevent dramatic fuzzer performance degradation by descending into tarpits
that, say, improve coverage by 1% while being 100x slower; we pragmatically
reject them and hope that the fuzzer will find a less expensive way to reach
the same code. Empirical testing strongly suggests that more generous time
limits are not worth the cost.

## 3. Evolving the input queue

Mutated test cases that produced new state transitions within the program are
added to the input queue and used as a starting point for future rounds of
fuzzing. They supplement, but do not automatically replace, existing finds.

In contrast to more greedy genetic algorithms, this approach allows the tool
to progressively explore various disjoint and possibly mutually incompatible
features of the underlying data format, as shown in this image:

![gzip_coverage](./resources/afl_gzip.png)

Several practical examples of the results of this algorithm are discussed
here:

  https://lcamtuf.blogspot.com/2014/11/pulling-jpegs-out-of-thin-air.html
  https://lcamtuf.blogspot.com/2014/11/afl-fuzz-nobody-expects-cdata-sections.html

The synthetic corpus produced by this process is essentially a compact
collection of "hmm, this does something new!" input files, and can be used to
seed any other testing processes down the line (for example, to manually
stress-test resource-intensive desktop apps).

With this approach, the queue for most targets grows to somewhere between 1k
and 10k entries; approximately 10-30% of this is attributable to the discovery
of new tuples, and the remainder is associated with changes in hit counts.

The following table compares the relative ability to discover file syntax and
explore program states when using several different approaches to guided
fuzzing. The instrumented target was GNU patch 2.7.3 compiled with `-O3` and
seeded with a dummy text file; the session consisted of a single pass over the
input queue with afl-fuzz:

```
  Fuzzer guidance | Blocks  | Edges   | Edge hit | Highest-coverage
  strategy used   | reached | reached | cnt var  | test case generated
------------------+---------+---------+----------+---------------------------
   (Initial file) |     156 |     163 |     1.00 | (none)
                  |         |         |          |
  Blind fuzzing S |     182 |     205 |     2.23 | First 2 B of RCS diff
  Blind fuzzing L |     228 |     265 |     2.23 | First 4 B of -c mode diff
   Block coverage |     855 |   1,130 |     1.57 | Almost-valid RCS diff
    Edge coverage |   1,452 |   2,070 |     2.18 | One-chunk -c mode diff
        AFL model |   1,765 |   2,597 |     4.99 | Four-chunk -c mode diff
```

The first entry for blind fuzzing ("S") corresponds to executing just a single
round of testing; the second set of figures ("L") shows the fuzzer running in
a loop for a number of execution cycles comparable with that of the
instrumented runs, which required more time to fully process the growing
queue.

Roughly similar results have been obtained in a separate experiment where the
fuzzer was modified to compile out all the random fuzzing stages and leave
just a series of rudimentary, sequential operations such as walking bit flips.
Because this mode would be incapable of altering the size of the input file,
the sessions were seeded with a valid unified diff:

```
  Queue extension | Blocks  | Edges   | Edge hit | Number of unique
  strategy used   | reached | reached | cnt var  | crashes found
------------------+---------+---------+----------+------------------
   (Initial file) |     624 |     717 |     1.00 | -
                  |         |         |          |
    Blind fuzzing |   1,101 |   1,409 |     1.60 | 0
   Block coverage |   1,255 |   1,649 |     1.48 | 0
    Edge coverage |   1,259 |   1,734 |     1.72 | 0
        AFL model |   1,452 |   2,040 |     3.16 | 1
```

As noted earlier on, some of the prior work on genetic fuzzing relied on
maintaining a single test case and evolving it to maximize coverage. At least
in the tests described above, this "greedy" approach appears to confer no
substantial benefits over blind fuzzing strategies.

## 4. Culling the corpus

The progressive state exploration approach outlined above means that some of
the test cases synthesized later on in the game may have edge coverage that
is a strict superset of the coverage provided by their ancestors.

To optimize the fuzzing effort, AFL periodically re-evaluates the queue using
a fast algorithm that selects a smaller subset of test cases that still cover
every tuple seen so far, and whose characteristics make them particularly
favorable to the tool.

The algorithm works by assigning every queue entry a score proportional to its
execution latency and file size, and then selecting lowest-scoring candidates
for each tuple.

The tuples are then processed sequentially using a simple workflow:

  1) Find next tuple not yet in the temporary working set,
  2) Locate the winning queue entry for this tuple,
  3) Register *all* tuples present in that entry's trace in the working set,
  4) Go to #1 if there are any missing tuples in the set.

The generated corpus of "favored" entries is usually 5-10x smaller than the
starting data set. Non-favored entries are not discarded, but they are skipped
with varying probabilities when encountered in the queue:

- If there are new, yet-to-be-fuzzed favorites present in the queue, 99%
  of non-favored entries will be skipped to get to the favored ones.
- If there are no new favorites:
  * If the current non-favored entry was fuzzed before, it will be skipped
    95% of the time.
  * If it hasn't gone through any fuzzing rounds yet, the odds of skipping
    drop down to 75%.

Based on empirical testing, this provides a reasonable balance between queue
cycling speed and test case diversity.

Slightly more sophisticated but much slower culling can be performed on input
or output corpora with `afl-cmin`. This tool permanently discards the
redundant entries and produces a smaller corpus suitable for use with
`afl-fuzz` or external tools.

## 5. Trimming input files

File size has a dramatic impact on fuzzing performance, both because large
files make the target binary slower, and because they reduce the likelihood
that a mutation would touch important format control structures, rather than
redundant data blocks. This is discussed in more detail in perf_tips.md.

The possibility that the user will provide a low-quality starting corpus
aside, some types of mutations can have the effect of iteratively increasing
the size of the generated files, so it is important to counter this trend.

Luckily, the instrumentation feedback provides a simple way to automatically
trim down input files while ensuring that the changes made to the files have
no impact on the execution path.

The built-in trimmer in afl-fuzz attempts to sequentially remove blocks of
data with variable length and stepover; any deletion that doesn't affect the
checksum of the trace map is committed to disk. The trimmer is not designed to
be particularly thorough; instead, it tries to strike a balance between
precision and the number of `execve()` calls spent on the process, selecting
the block size and stepover to match. The average per-file gains are around
5-20%.

The standalone `afl-tmin` tool uses a more exhaustive, iterative algorithm,
and also attempts to perform alphabet normalization on the trimmed files. The
operation of `afl-tmin` is as follows.

First, the tool automatically selects the operating mode. If the initial input
crashes the target binary, afl-tmin will run in non-instrumented mode, simply
keeping any tweaks that produce a simpler file but still crash the target.
The same mode is used for hangs, if `-H` (hang mode) is specified.
If the target is non-crashing, the tool uses an instrumented mode and keeps
only the tweaks that produce exactly the same execution path.

The actual minimization algorithm is:

  1) Attempt to zero large blocks of data with large stepovers. Empirically,
     this is shown to reduce the number of execs by preempting finer-grained
     efforts later on.
  2) Perform a block deletion pass with decreasing block sizes and stepovers,
     binary-search-style.
  3) Perform alphabet normalization by counting unique characters and trying
     to bulk-replace each with a zero value.
  4) As a last resort, perform byte-by-byte normalization on non-zero bytes.

Instead of zeroing with a 0x00 byte, `afl-tmin` uses the ASCII digit '0'. This
is done because such a modification is much less likely to interfere with
text parsing, so it is more likely to result in successful minimization of
text files.

The algorithm used here is less involved than some other test case
minimization approaches proposed in academic work, but requires far fewer
executions and tends to produce comparable results in most real-world
applications.

## 6. Fuzzing strategies

The feedback provided by the instrumentation makes it easy to understand the
value of various fuzzing strategies and optimize their parameters so that they
work equally well across a wide range of file types. The strategies used by
afl-fuzz are generally format-agnostic and are discussed in more detail here:

  https://lcamtuf.blogspot.com/2014/08/binary-fuzzing-strategies-what-works.html

It is somewhat notable that especially early on, most of the work done by
`afl-fuzz` is actually highly deterministic, and progresses to random stacked
modifications and test case splicing only at a later stage. The deterministic
strategies include:

- Sequential bit flips with varying lengths and stepovers,
- Sequential addition and subtraction of small integers,
- Sequential insertion of known interesting integers (`0`, `1`, `INT_MAX`,
  etc.).

The purpose of opening with deterministic steps is related to their tendency
to produce compact test cases and small diffs between the non-crashing and
crashing inputs.

With deterministic fuzzing out of the way, the non-deterministic steps include
stacked bit flips, insertions, deletions, arithmetics, and splicing of
different test cases.

The relative yields and `execve()` costs of all these strategies have been
investigated and are discussed in the aforementioned blog post.

For the reasons discussed in historical_notes.md (chiefly, performance,
simplicity, and reliability), AFL generally does not try to reason about the
relationship between specific mutations and program states; the fuzzing steps
are nominally blind, and are guided only by the evolutionary design of the
input queue.

That said, there is one (trivial) exception to this rule: when a new queue
entry goes through the initial set of deterministic fuzzing steps, and tweaks
to some regions in the file are observed to have no effect on the checksum of
the execution path, they may be excluded from the remaining phases of
deterministic fuzzing - and the fuzzer may proceed straight to random tweaks.
Especially for verbose, human-readable data formats, this can reduce the
number of execs by 10-40% or so without an appreciable drop in coverage. In
extreme cases, such as normally block-aligned tar archives, the gains can be
as high as 90%.

Because the underlying "effector maps" are local to every queue entry and
remain in force only during deterministic stages that do not alter the size or
the general layout of the underlying file, this mechanism appears to work very
reliably and proved to be simple to implement.

## 7. Dictionaries

The feedback provided by the instrumentation makes it easy to automatically
identify syntax tokens in some types of input files, and to detect that
certain combinations of predefined or auto-detected dictionary terms
constitute a valid grammar for the tested parser.

A discussion of how these features are implemented within afl-fuzz can be
found here:

  https://lcamtuf.blogspot.com/2015/01/afl-fuzz-making-up-grammar-with.html

In essence, when basic, typically easily-obtained syntax tokens are combined
together in a purely random manner, the instrumentation and the evolutionary
design of the queue together provide a feedback mechanism to differentiate
between meaningless mutations and ones that trigger new behaviors in the
instrumented code - and to incrementally build more complex syntax on top of
this discovery.

The dictionaries have been shown to enable the fuzzer to rapidly reconstruct
the grammar of highly verbose and complex languages such as JavaScript, SQL,
or XML; several examples of generated SQL statements are given in the blog
post mentioned above.

Interestingly, the AFL instrumentation also allows the fuzzer to automatically
isolate syntax tokens already present in an input file. It can do so by
looking for runs of bytes that, when flipped, produce a consistent change to
the program's execution path; this is suggestive of an underlying atomic
comparison to a predefined value baked into the code. The fuzzer relies on
this signal to build compact "auto dictionaries" that are then used in
conjunction with other fuzzing strategies.

## 8. De-duping crashes

De-duplication of crashes is one of the more important problems for any
competent fuzzing tool. Many of the naive approaches run into problems; in
particular, looking just at the faulting address may lead to completely
unrelated issues being clustered together if the fault happens in a common
library function (say, `strcmp`, `strcpy`); while checksumming call stack
backtraces can lead to extreme crash count inflation if the fault can be
reached through a number of different, possibly recursive code paths.

The solution implemented in `afl-fuzz` considers a crash unique if either of
two conditions is met:

- The crash trace includes a tuple not seen in any of the previous crashes,
- The crash trace is missing a tuple that was always present in earlier
  faults.

The approach is vulnerable to some path count inflation early on, but exhibits
a very strong self-limiting effect, similar to the execution path analysis
logic that is the cornerstone of `afl-fuzz`.

## 9. Investigating crashes

The exploitability of many types of crashes can be ambiguous; afl-fuzz tries
to address this by providing a crash exploration mode where a known-faulting
test case is fuzzed in a manner very similar to the normal operation of the
fuzzer, but with a constraint that causes any non-crashing mutations to be
thrown away.

A detailed discussion of the value of this approach can be found here:

  https://lcamtuf.blogspot.com/2014/11/afl-fuzz-crash-exploration-mode.html

The method uses instrumentation feedback to explore the state of the crashing
program to get past the ambiguous faulting condition and then isolate the
newly-found inputs for human review.

On the subject of crashes, it is worth noting that in contrast to normal
queue entries, crashing inputs are *not* trimmed; they are kept exactly as
discovered to make it easier to compare them to the parent, non-crashing entry
in the queue. That said, `afl-tmin` can be used to shrink them at will.

## 10. The fork server

To improve performance, `afl-fuzz` uses a "fork server", where the fuzzed
process goes through `execve()`, linking, and libc initialization only once,
and is then cloned from a stopped process image by leveraging copy-on-write.
The implementation is described in more detail here:

  https://lcamtuf.blogspot.com/2014/10/fuzzing-binaries-without-execve.html

The fork server is an integral aspect of the injected instrumentation and
simply stops at the first instrumented function to await commands from
`afl-fuzz`.

With fast targets, the fork server can offer considerable performance gains,
usually between 1.5x and 2x. It is also possible to:

- Use the fork server in manual ("deferred") mode, skipping over larger,
  user-selected chunks of initialization code. It requires very modest
  code changes to the targeted program and, with some targets, can
  produce 10x+ performance gains.

- Enable "persistent" mode, where a single process is used to try out
  multiple inputs, greatly limiting the overhead of repetitive `fork()`
  calls. This generally requires some code changes to the targeted program,
  but can improve the performance of fast targets by a factor of 5 or more,
  approximating the benefits of in-process fuzzing jobs while still
  maintaining very robust isolation between the fuzzer process and the
  targeted binary.

## 11. Parallelization

The parallelization mechanism relies on periodically examining the queues
produced by independently-running instances on other CPU cores or on remote
machines, and then selectively pulling in the test cases that, when tried
out locally, produce behaviors not yet seen by the fuzzer at hand.

This allows for extreme flexibility in fuzzer setup, including running synced
instances against different parsers of a common data format, often with
synergistic effects.

For more information about this design, see parallel_fuzzing.md.

## 12. Binary-only instrumentation

Instrumentation of black-box, binary-only targets is accomplished with the
help of a separately-built version of QEMU in "user emulation" mode. This also
allows the execution of cross-architecture code - say, ARM binaries on x86.

QEMU uses basic blocks as translation units; the instrumentation is
implemented on top of this and uses a model roughly analogous to the
compile-time hooks:

```c
if (block_address > elf_text_start && block_address < elf_text_end) {

  cur_location = (block_address >> 4) ^ (block_address << 8);
  shared_mem[cur_location ^ prev_location]++;
  prev_location = cur_location >> 1;

}
```

The shift-and-XOR-based scrambling in the second line is used to mask the
effects of instruction alignment.

The start-up of binary translators such as QEMU, DynamoRIO, and PIN is fairly
slow; to counter this, the QEMU mode leverages a fork server similar to that
used for compiler-instrumented code, effectively spawning copies of an
already-initialized process paused at `_start`.

First-time translation of a new basic block also incurs substantial latency.
To eliminate this problem, the AFL fork server is extended by providing a
channel between the running emulator and the parent process. The channel is
used to notify the parent about the addresses of any newly-encountered blocks
and to add them to the translation cache that will be replicated for future
child processes.

As a result of these two optimizations, the overhead of the QEMU mode is
roughly 2-5x, compared to 100x+ for PIN.

## 13. The `afl-analyze` tool
|
||||
|
||||
The file format analyzer is a simple extension of the minimization algorithm
|
||||
discussed earlier on; instead of attempting to remove no-op blocks, the tool
|
||||
performs a series of walking byte flips and then annotates runs of bytes
|
||||
in the input file.
|
||||
|
||||
It uses the following classification scheme:
|
||||
|
||||
- "No-op blocks" - segments where bit flips cause no apparent changes to
|
||||
control flow. Common examples may be comment sections, pixel data within
|
||||
a bitmap file, etc.
|
||||
- "Superficial content" - segments where some, but not all, bitflips
|
||||
produce some control flow changes. Examples may include strings in rich
|
||||
documents (e.g., XML, RTF).
|
||||
- "Critical stream" - a sequence of bytes where all bit flips alter control
|
||||
flow in different but correlated ways. This may be compressed data,
|
||||
non-atomically compared keywords or magic values, etc.
|
||||
- "Suspected length field" - small, atomic integer that, when touched in
|
||||
any way, causes a consistent change to program control flow, suggestive
|
||||
of a failed length check.
|
||||
- "Suspected cksum or magic int" - an integer that behaves similarly to a
|
||||
length field, but has a numerical value that makes the length explanation
|
||||
unlikely. This is suggestive of a checksum or other "magic" integer.
|
||||
- "Suspected checksummed block" - a long block of data where any change
|
||||
always triggers the same new execution path. Likely caused by failing
|
||||
a checksum or a similar integrity check before any subsequent parsing
|
||||
takes place.
|
||||
- "Magic value section" - a generic token where changes cause the type
|
||||
of binary behavior outlined earlier, but that doesn't meet any of the
|
||||
other criteria. May be an atomically compared keyword or so.
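
To make the walking-byte-flip idea concrete, here is a heavily simplified
sketch. The `toy_target` function is a hypothetical stand-in for an
instrumented execution that returns a path fingerprint; the real tool compares
execution-trace checksums and distinguishes all of the categories above, not
just two:

```python
def classify_bytes(data, run_target):
    # Walking byte flips: XOR each byte with 0xFF, re-run the target, and
    # compare the resulting path fingerprint against the unmodified baseline.
    baseline = run_target(data)
    labels = []
    for i in range(len(data)):
        mutated = bytearray(data)
        mutated[i] ^= 0xFF
        labels.append("no-op" if run_target(bytes(mutated)) == baseline
                      else "critical")
    return labels

def toy_target(buf):
    # Hypothetical instrumented run: the execution path depends only on
    # whether the 4-byte magic value matches; trailing bytes are never parsed.
    return buf[:4] == b"FUZZ"

print(classify_bytes(b"FUZZdata", toy_target))
# → ['critical', 'critical', 'critical', 'critical',
#    'no-op', 'no-op', 'no-op', 'no-op']
```

Flipping any byte of the magic changes the path, so those bytes come out
"critical"; the unparsed tail comes out "no-op", exactly the pattern the
annotations are meant to surface.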
# Tools that help fuzzing with AFL++

Speeding up fuzzing:
* [libfiowrapper](https://github.com/marekzmyslowski/libfiowrapper) - if the
  function you want to fuzz requires loading a file, this allows using the
  shared memory test case feature :-) - recommended.

Minimization of test cases:
* [afl-pytmin](https://github.com/ilsani/afl-pytmin) - a wrapper for afl-tmin
  that tries to speed up the process of minimization of a single test case by
  using many CPU cores.
* [afl-ddmin-mod](https://github.com/MarkusTeufelberger/afl-ddmin-mod) - a
  variation of afl-tmin based on the ddmin algorithm.
* [halfempty](https://github.com/googleprojectzero/halfempty) - a fast utility
  by Tavis Ormandy for minimizing test cases, based on parallelization.

Distributed execution:
* [disfuzz-afl](https://github.com/MartijnB/disfuzz-afl) - distributed fuzzing
  for AFL.
* [AFLDFF](https://github.com/quantumvm/AFLDFF) - AFL distributed fuzzing
  framework.
* [afl-launch](https://github.com/bnagy/afl-launch) - a tool for the execution
  of many AFL instances.
* [afl-mothership](https://github.com/afl-mothership/afl-mothership) -
  management and execution of many synchronized AFL fuzzers on AWS cloud.
* [afl-in-the-cloud](https://github.com/abhisek/afl-in-the-cloud) - another
  script for running AFL in AWS.

Deployment, management, monitoring, reporting:
* [afl-utils](https://gitlab.com/rc0r/afl-utils) - a set of utilities for
  automatic processing/analysis of crashes and reducing the number of test
  cases.
* [afl-other-arch](https://github.com/shellphish/afl-other-arch) - a set of
  patches and scripts for easily adding support for various non-x86
  architectures for AFL.
* [afl-trivia](https://github.com/bnagy/afl-trivia) - a few small scripts to
  simplify the management of AFL.
* [afl-monitor](https://github.com/reflare/afl-monitor) - a script for
  monitoring AFL.
* [afl-manager](https://github.com/zx1340/afl-manager) - a web server in
  Python for managing multi-afl.
* [afl-remote](https://github.com/block8437/afl-remote) - a web server for the
  remote management of AFL instances.
* [afl-extras](https://github.com/fekir/afl-extras) - shell scripts to
  parallelize afl-tmin, startup, and data collection.

Crash processing:
* [afl-crash-analyzer](https://github.com/floyd-fuh/afl-crash-analyzer) -
  another crash analyzer for AFL.
* [fuzzer-utils](https://github.com/ThePatrickStar/fuzzer-utils) - a set of
  scripts for the analysis of results.
* [atriage](https://github.com/Ayrx/atriage) - a simple triage tool.
* [afl-kit](https://github.com/kcwu/afl-kit) - afl-cmin in Python.
* [AFLize](https://github.com/d33tah/aflize) - a tool that automatically
  generates builds of Debian packages suitable for AFL.
* [afl-fid](https://github.com/FoRTE-Research/afl-fid) - a set of tools for
  working with input data.
# Tutorials

Here are some good write-ups to show how to effectively use AFL++:

* [https://aflplus.plus/docs/tutorials/libxml2_tutorial/](https://aflplus.plus/docs/tutorials/libxml2_tutorial/)
* [https://bananamafia.dev/post/gb-fuzz/](https://bananamafia.dev/post/gb-fuzz/)
If you are interested in fuzzing structured data (where you define what the
structure is), these links have you covered:

* Superion for AFL++:
  [https://github.com/adrian-rt/superion-mutator](https://github.com/adrian-rt/superion-mutator)
* libprotobuf for AFL++:
  [https://github.com/P1umer/AFLplusplus-protobuf-mutator](https://github.com/P1umer/AFLplusplus-protobuf-mutator)
* libprotobuf raw:
  [https://github.com/bruce30262/libprotobuf-mutator_fuzzing_learning/tree/master/4_libprotobuf_aflpp_custom_mutator](https://github.com/bruce30262/libprotobuf-mutator_fuzzing_learning/tree/master/4_libprotobuf_aflpp_custom_mutator)
* libprotobuf for old AFL++ API:
  [https://github.com/thebabush/afl-libprotobuf-mutator](https://github.com/thebabush/afl-libprotobuf-mutator)
If you find other good ones, please send them to us :-)