moved txt to md (fleissarbeit)

Dominik Maier
2020-02-03 15:09:10 +01:00
parent 2fe7889912
commit 8908803532
12 changed files with 689 additions and 803 deletions

@ -1,28 +1,30 @@
========================= # Installation instructions
Installation instructions
=========================
This document provides basic installation instructions and discusses known This document provides basic installation instructions and discusses known
issues for a variety of platforms. See README for the general instruction issues for a variety of platforms. See README.md for the general instruction
manual. manual.
1) Linux on x86 ## 1) Linux on x86
--------------- ---------------
This platform is expected to work well. Compile the program with: This platform is expected to work well. Compile the program with:
$ make ```bash
make
```
You can start using the fuzzer without installation, but it is also possible to You can start using the fuzzer without installation, but it is also possible to
install it with: install it with:
# make install ```bash
make install
```
There are no special dependencies to speak of; you will need GNU make and a There are no special dependencies to speak of; you will need GNU make and a
working compiler (gcc or clang). Some of the optional scripts bundled with the working compiler (gcc or clang). Some of the optional scripts bundled with the
program may depend on bash, gdb, and similar basic tools. program may depend on bash, gdb, and similar basic tools.
If you are using clang, please review llvm_mode/README.llvm; the LLVM If you are using clang, please review llvm_mode/README.md; the LLVM
integration mode can offer substantial performance gains compared to the integration mode can offer substantial performance gains compared to the
traditional approach. traditional approach.
@ -30,27 +32,30 @@ You may have to change several settings to get optimal results (most notably,
disable crash reporting utilities and switch to a different CPU governor), but disable crash reporting utilities and switch to a different CPU governor), but
afl-fuzz will guide you through that if necessary. afl-fuzz will guide you through that if necessary.
2) OpenBSD, FreeBSD, NetBSD on x86 ## 2. OpenBSD, FreeBSD, NetBSD on x86
----------------------------------
Similarly to Linux, these platforms are expected to work well and are Similarly to Linux, these platforms are expected to work well and are
regularly tested. Compile everything with GNU make: regularly tested. Compile everything with GNU make:
$ gmake ```bash
gmake
```
Note that BSD make will *not* work; if you do not have gmake on your system, Note that BSD make will *not* work; if you do not have gmake on your system,
please install it first. As on Linux, you can use the fuzzer itself without please install it first. As on Linux, you can use the fuzzer itself without
installation, or install it with: installation, or install it with:
# gmake install ```bash
gmake install
```
Keep in mind that if you are using csh as your shell, the syntax of some of the Keep in mind that if you are using csh as your shell, the syntax of some of the
shell commands given in the README and other docs will be different. shell commands given in the README.md and other docs will be different.
The llvm_mode requires a dynamically linked, fully-operational installation of The `llvm_mode` requires a dynamically linked, fully-operational installation of
clang. At least on FreeBSD, the clang binaries are static and do not include clang. At least on FreeBSD, the clang binaries are static and do not include
some of the essential tools, so if you want to make it work, you may need to some of the essential tools, so if you want to make it work, you may need to
follow the instructions in llvm_mode/README.llvm. follow the instructions in llvm_mode/README.md.
Beyond that, everything should work as advertised. Beyond that, everything should work as advertised.
@ -58,8 +63,7 @@ The QEMU mode is currently supported only on Linux. I think it's just a QEMU
problem, I couldn't get a vanilla copy of user-mode emulation support working problem, I couldn't get a vanilla copy of user-mode emulation support working
correctly on BSD at all. correctly on BSD at all.
3) MacOS X on x86 ## 3. MacOS X on x86
-----------------
MacOS X should work, but there are some gotchas due to the idiosyncrasies of MacOS X should work, but there are some gotchas due to the idiosyncrasies of
the platform. On top of this, I have limited release testing capabilities the platform. On top of this, I have limited release testing capabilities
@ -69,8 +73,8 @@ To build AFL, install Xcode and follow the general instructions for Linux.
The Xcode 'gcc' tool is just a wrapper for clang, so be sure to use afl-clang The Xcode 'gcc' tool is just a wrapper for clang, so be sure to use afl-clang
to compile any instrumented binaries; afl-gcc will fail unless you have GCC to compile any instrumented binaries; afl-gcc will fail unless you have GCC
installed from another source (in which case, please specify AFL_CC and installed from another source (in which case, please specify `AFL_CC` and
AFL_CXX to point to the "real" GCC binaries). `AFL_CXX` to point to the "real" GCC binaries).
Only 64-bit compilation will work on the platform; porting the 32-bit Only 64-bit compilation will work on the platform; porting the 32-bit
instrumentation would require a fair amount of work due to the way OS X instrumentation would require a fair amount of work due to the way OS X
@ -80,47 +84,45 @@ The crash reporting daemon that comes by default with MacOS X will cause
problems with fuzzing. You need to turn it off by following the instructions problems with fuzzing. You need to turn it off by following the instructions
provided here: http://goo.gl/CCcd5u provided here: http://goo.gl/CCcd5u
The fork() semantics on OS X are a bit unusual compared to other unix systems The `fork()` semantics on OS X are a bit unusual compared to other unix systems
and definitely don't look POSIX-compliant. This means two things: and definitely don't look POSIX-compliant. This means two things:
- Fuzzing will be probably slower than on Linux. In fact, some folks report - Fuzzing will be probably slower than on Linux. In fact, some folks report
considerable performance gains by running the jobs inside a Linux VM on considerable performance gains by running the jobs inside a Linux VM on
MacOS X. MacOS X.
- Some non-portable, platform-specific code may be incompatible with the - Some non-portable, platform-specific code may be incompatible with the
AFL forkserver. If you run into any problems, set AFL_NO_FORKSRV=1 in the AFL forkserver. If you run into any problems, set `AFL_NO_FORKSRV=1` in the
environment before starting afl-fuzz. environment before starting afl-fuzz.
User emulation mode of QEMU does not appear to be supported on MacOS X, so User emulation mode of QEMU does not appear to be supported on MacOS X, so
black-box instrumentation mode (-Q) will not work. black-box instrumentation mode (`-Q`) will not work.
The llvm_mode requires a fully-operational installation of clang. The one that The llvm_mode requires a fully-operational installation of clang. The one that
comes with Xcode is missing some of the essential headers and helper tools. comes with Xcode is missing some of the essential headers and helper tools.
See llvm_mode/README.llvm for advice on how to build the compiler from scratch. See llvm_mode/README.md for advice on how to build the compiler from scratch.
4) Linux or *BSD on non-x86 systems ## 4. Linux or *BSD on non-x86 systems
-----------------------------------
Standard build will fail on non-x86 systems, but you should be able to Standard build will fail on non-x86 systems, but you should be able to
leverage two other options: leverage two other options:
- The LLVM mode (see llvm_mode/README.llvm), which does not rely on - The LLVM mode (see llvm_mode/README.md), which does not rely on
x86-specific assembly shims. It's fast and robust, but requires a x86-specific assembly shims. It's fast and robust, but requires a
complete installation of clang. complete installation of clang.
- The QEMU mode (see qemu_mode/README.md), which can be also used for
- The QEMU mode (see qemu_mode/README.qemu), which can be also used for
fuzzing cross-platform binaries. It's slower and more fragile, but fuzzing cross-platform binaries. It's slower and more fragile, but
can be used even when you don't have the source for the tested app. can be used even when you don't have the source for the tested app.
If you're not sure what you need, you need the LLVM mode. To get it, try: If you're not sure what you need, you need the LLVM mode. To get it, try:
$ AFL_NO_X86=1 gmake && gmake -C llvm_mode ```bash
AFL_NO_X86=1 gmake && gmake -C llvm_mode
```
...and compile your target program with afl-clang-fast or afl-clang-fast++ ...and compile your target program with afl-clang-fast or afl-clang-fast++
instead of the traditional afl-gcc or afl-clang wrappers. instead of the traditional afl-gcc or afl-clang wrappers.
5) Solaris on x86 ## 5. Solaris on x86
-----------------
The fuzzer reportedly works on Solaris, but I have not tested this first-hand, The fuzzer reportedly works on Solaris, but I have not tested this first-hand,
and the user base is fairly small, so I don't have a lot of feedback. and the user base is fairly small, so I don't have a lot of feedback.
@ -128,36 +130,39 @@ and the user base is fairly small, so I don't have a lot of feedback.
To get the ball rolling, you will need to use GNU make and GCC or clang. I'm To get the ball rolling, you will need to use GNU make and GCC or clang. I'm
being told that the stock version of GCC that comes with the platform does not being told that the stock version of GCC that comes with the platform does not
work properly due to its reliance on a hardcoded location for 'as' (completely work properly due to its reliance on a hardcoded location for 'as' (completely
ignoring the -B parameter or $PATH). ignoring the `-B` parameter or `$PATH`).
To fix this, you may want to build stock GCC from the source, like so: To fix this, you may want to build stock GCC from the source, like so:
$ ./configure --prefix=$HOME/gcc --with-gnu-as --with-gnu-ld \ ```sh
./configure --prefix=$HOME/gcc --with-gnu-as --with-gnu-ld \
--with-gmp-include=/usr/include/gmp --with-mpfr-include=/usr/include/mpfr --with-gmp-include=/usr/include/gmp --with-mpfr-include=/usr/include/mpfr
$ make make
$ sudo make install sudo make install
```
Do *not* specify --with-as=/usr/gnu/bin/as - this will produce a GCC binary that Do *not* specify `--with-as=/usr/gnu/bin/as` - this will produce a GCC binary that
ignores the -B flag and you will be back to square one. ignores the `-B` flag and you will be back to square one.
Note that Solaris reportedly comes with crash reporting enabled, which causes Note that Solaris reportedly comes with crash reporting enabled, which causes
problems with crashes being misinterpreted as hangs, similarly to the gotchas problems with crashes being misinterpreted as hangs, similarly to the gotchas
for Linux and MacOS X. AFL does not auto-detect crash reporting on this for Linux and MacOS X. AFL does not auto-detect crash reporting on this
particular platform, but you may need to run the following command: particular platform, but you may need to run the following command:
$ coreadm -d global -d global-setid -d process -d proc-setid \ ```sh
coreadm -d global -d global-setid -d process -d proc-setid \
-d kzone -d log -d kzone -d log
```
User emulation mode of QEMU is not available on Solaris, so black-box User emulation mode of QEMU is not available on Solaris, so black-box
instrumentation mode (-Q) will not work. instrumentation mode (`-Q`) will not work.
6) Everything else ## 6. Everything else
------------------
You're on your own. On POSIX-compliant systems, you may be able to compile and You're on your own. On POSIX-compliant systems, you may be able to compile and
run the fuzzer; and the LLVM mode may offer a way to instrument non-x86 code. run the fuzzer; and the LLVM mode may offer a way to instrument non-x86 code.
The fuzzer will not run on Windows. It will also not work under Cygwin. It The fuzzer will run on Windows in WSL only. It will not work under Cygwin or in the normal Windows world. It
could be ported to the latter platform fairly easily, but it's a pretty bad could be ported to the latter platform fairly easily, but it's a pretty bad
idea, because Cygwin is extremely slow. It makes much more sense to use idea, because Cygwin is extremely slow. It makes much more sense to use
VirtualBox or so to run a hardware-accelerated Linux VM; it will run around VirtualBox or so to run a hardware-accelerated Linux VM; it will run around
@ -171,13 +176,15 @@ It's possible that all you need is this workaround:
https://github.com/pelya/android-shmem https://github.com/pelya/android-shmem
Joshua J. Drake notes that the Android linker adds a shim that automatically Joshua J. Drake notes that the Android linker adds a shim that automatically
intercepts SIGSEGV and related signals. To fix this issue and be able to see intercepts `SIGSEGV` and related signals. To fix this issue and be able to see
crashes, you need to put this at the beginning of the fuzzed program: crashes, you need to put this at the beginning of the fuzzed program:
```c
signal(SIGILL, SIG_DFL); signal(SIGILL, SIG_DFL);
signal(SIGABRT, SIG_DFL); signal(SIGABRT, SIG_DFL);
signal(SIGBUS, SIG_DFL); signal(SIGBUS, SIG_DFL);
signal(SIGFPE, SIG_DFL); signal(SIGFPE, SIG_DFL);
signal(SIGSEGV, SIG_DFL); signal(SIGSEGV, SIG_DFL);
```
You may need to #include <signal.h> first. You may need to `#include <signal.h>` first.

@ -1,9 +1,11 @@
# Applied Patches
The following patches from https://github.com/vanhauser-thc/afl-patches The following patches from https://github.com/vanhauser-thc/afl-patches
have been installed or not installed: have been installed or not installed:
INSTALLED ## INSTALLED
========= ```
afl-llvm-fix.diff by kcwu(at)csie(dot)org afl-llvm-fix.diff by kcwu(at)csie(dot)org
afl-sort-all_uniq-fix.diff by legarrec(dot)vincent(at)gmail(dot)com afl-sort-all_uniq-fix.diff by legarrec(dot)vincent(at)gmail(dot)com
laf-intel.diff by heiko(dot)eissfeldt(at)hexco(dot)de laf-intel.diff by heiko(dot)eissfeldt(at)hexco(dot)de
@ -16,6 +18,7 @@ afl-qemu-ppc64.diff by william(dot)barsse(at)airbus(dot)com
afl-qemu-optimize-entrypoint.diff by mh(at)mh-sec(dot)de afl-qemu-optimize-entrypoint.diff by mh(at)mh-sec(dot)de
afl-qemu-speed.diff by abiondo on github afl-qemu-speed.diff by abiondo on github
afl-qemu-optimize-map.diff by mh(at)mh-sec(dot)de afl-qemu-optimize-map.diff by mh(at)mh-sec(dot)de
```
+ Custom mutator (native library) (by kyakdan) + Custom mutator (native library) (by kyakdan)
+ unicorn_mode (modernized and updated by domenukk) + unicorn_mode (modernized and updated by domenukk)
@ -28,10 +31,12 @@ afl-qemu-optimize-map.diff by mh(at)mh-sec(dot)de
+ forkserver patch for afl-tmin (github.com/nccgroup/TriforceAFL) + forkserver patch for afl-tmin (github.com/nccgroup/TriforceAFL)
NOT INSTALLED ## NOT INSTALLED
=============
```
afl-fuzz-context_sensitive.diff - changes too much of the behaviour afl-fuzz-context_sensitive.diff - changes too much of the behaviour
afl-tmpfs.diff - same as afl-fuzz-tmpdir.diff but more complex afl-tmpfs.diff - same as afl-fuzz-tmpdir.diff but more complex
afl-cmin-reduce-dataset.diff - unsure of the impact afl-cmin-reduce-dataset.diff - unsure of the impact
afl-llvm-fix2.diff - not needed with the other patches afl-llvm-fix2.diff - not needed with the other patches
```

@ -1,17 +1,14 @@
================ # Historical notes
Historical notes
================
This doc talks about the rationale of some of the high-level design decisions This doc talks about the rationale of some of the high-level design decisions
for American Fuzzy Lop. It's adopted from a discussion with Rob Graham. for American Fuzzy Lop. It's adopted from a discussion with Rob Graham.
See README for the general instruction manual, and technical_details.txt for See README.md for the general instruction manual, and technical_details.md for
additional implementation-level insights. additional implementation-level insights.
1) Influences ## 1) Influences
-------------
In short, afl-fuzz is inspired chiefly by the work done by Tavis Ormandy back In short, `afl-fuzz` is inspired chiefly by the work done by Tavis Ormandy back
in 2007. Tavis did some very persuasive experiments using gcov block coverage in 2007. Tavis did some very persuasive experiments using `gcov` block coverage
to select optimal test cases out of a large corpus of data, and then using to select optimal test cases out of a large corpus of data, and then using
them as a starting point for traditional fuzzing workflows. them as a starting point for traditional fuzzing workflows.
@ -22,7 +19,7 @@ In parallel to this, both Tavis and I were interested in evolutionary fuzzing.
Tavis had his experiments, and I was working on a tool called bunny-the-fuzzer, Tavis had his experiments, and I was working on a tool called bunny-the-fuzzer,
released somewhere in 2007. released somewhere in 2007.
Bunny used a generational algorithm not much different from afl-fuzz, but Bunny used a generational algorithm not much different from `afl-fuzz`, but
also tried to reason about the relationship between various input bits and also tried to reason about the relationship between various input bits and
the internal state of the program, with hopes of deriving some additional value the internal state of the program, with hopes of deriving some additional value
from that. The reasoning / correlation part was probably in part inspired by from that. The reasoning / correlation part was probably in part inspired by
@ -75,8 +72,7 @@ But I digress; ultimately, attribution is hard, and glorying the fundamental
concepts behind AFL is probably a waste of time. The devil is very much in the concepts behind AFL is probably a waste of time. The devil is very much in the
often-overlooked details, which brings us to... often-overlooked details, which brings us to...
2) Design goals for afl-fuzz ## 2. Design goals for afl-fuzz
----------------------------
In short, I believe that the current implementation of afl-fuzz takes care of In short, I believe that the current implementation of afl-fuzz takes care of
several itches that seemed impossible to scratch with other tools: several itches that seemed impossible to scratch with other tools:
@ -86,7 +82,7 @@ several itches that seemed impossible to scratch with other tools:
likely to find a bug, but runs 100x slower, your users are getting a bad likely to find a bug, but runs 100x slower, your users are getting a bad
deal. deal.
To avoid starting with a handicap, afl-fuzz is meant to let you fuzz most of To avoid starting with a handicap, `afl-fuzz` is meant to let you fuzz most of
the intended targets at roughly their native speed - so even if it doesn't the intended targets at roughly their native speed - so even if it doesn't
add value, you do not lose much. add value, you do not lose much.
@ -107,7 +103,7 @@ several itches that seemed impossible to scratch with other tools:
them strictly worse than "dumb" tools, and such degradation can be difficult them strictly worse than "dumb" tools, and such degradation can be difficult
for less experienced users to notice and correct. for less experienced users to notice and correct.
In contrast, afl-fuzz is designed to be rock solid, chiefly by keeping it In contrast, `afl-fuzz` is designed to be rock solid, chiefly by keeping it
simple. In fact, at its core, it's designed to be just a very good simple. In fact, at its core, it's designed to be just a very good
traditional fuzzer with a wide range of interesting, well-researched traditional fuzzer with a wide range of interesting, well-researched
strategies to go by. The fancy parts just help it focus the effort in strategies to go by. The fancy parts just help it focus the effort in
@ -143,5 +139,5 @@ of small, complementary methods that were shown to reliably yields results
better than chance. The use of instrumentation is a part of that toolkit, but is better than chance. The use of instrumentation is a part of that toolkit, but is
far from being the most important one. far from being the most important one.
Ultimately, what matters is that afl-fuzz is designed to find cool bugs - and Ultimately, what matters is that `afl-fuzz` is designed to find cool bugs - and
has a pretty robust track record of doing just that. has a pretty robust track record of doing just that.

90
docs/life_pro_tips.md Normal file

@ -0,0 +1,90 @@
# AFL "Life Pro Tips"
Bite-sized advice for those who understand the basics, but can't be bothered
to read or memorize every other piece of documentation for AFL.
## Get more bang for your buck by using fuzzing dictionaries.
See dictionaries/README.md to learn how.
## You can get the most out of your hardware by parallelizing AFL jobs.
See docs/parallel_fuzzing.md for step-by-step tips.
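As a rough sketch of what a parallel setup looks like: afl-fuzz takes `-M` for the single main instance and `-S` for each secondary one, all pointed at the same output directory. The helper below only prints the command lines and launches nothing. The function name, `in_dir`, `sync_dir`, and the `fuzzerNN` instance names are placeholders chosen for illustration; docs/parallel_fuzzing.md remains the authoritative reference.

```bash
# Hypothetical helper: print one afl-fuzz command line per job.
# The first job is the main instance (-M), the rest are secondaries (-S).
# in_dir and sync_dir are placeholder directory names.
print_parallel_cmds() {
  count=$1
  shift
  i=1
  while [ "$i" -le "$count" ]; do
    if [ "$i" -eq 1 ]; then role=-M; else role=-S; fi
    printf 'afl-fuzz -i in_dir -o sync_dir %s fuzzer%02d -- %s\n' "$role" "$i" "$*"
    i=$((i + 1))
  done
}
```

For example, `print_parallel_cmds 4 ./target @@` lists four command lines that could each be started in its own terminal or screen session.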
## Improve the odds of spotting memory corruption bugs with libdislocator.so!
It's easy. Consult libdislocator/README.md for usage tips.
## Want to understand how your target parses a particular input file?
Try the bundled `afl-analyze` tool; it's got colors and all!
## You can visually monitor the progress of your fuzzing jobs.
Run the bundled `afl-plot` utility to generate browser-friendly graphs.
## Need to monitor AFL jobs programmatically?
Check out the `fuzzer_stats` file in the AFL output dir or try `afl-whatsup`.
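A minimal sketch of reading `fuzzer_stats` from a script, assuming the file keeps its usual layout of one `key : value` pair per line. The function name `stat_value` is made up for this example.

```bash
# Hypothetical helper: print the value of one key from a fuzzer_stats file.
# Assumes lines of the form "key<padding>: value".
stat_value() {
  stats_file=$1
  key=$2
  awk -F' *: ' -v k="$key" '$1 == k { print $2; exit }' "$stats_file"
}
```

A call such as `stat_value out_dir/fuzzer_stats execs_per_sec` (with `out_dir` standing in for your own output directory) can then feed a monitoring script or a cron job.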
## Puzzled by something showing up in red or purple in the AFL UI?
It could be important - consult docs/status_screen.md right away!
## Know your target? Convert it to persistent mode for a huge performance gain!
Consult section #5 in llvm_mode/README.md for tips.
## Using clang?
Check out llvm_mode/ for a faster alternative to afl-gcc!
## Did you know that AFL can fuzz closed-source or cross-platform binaries?
Check out qemu_mode/README.md and unicorn_mode/README.md for more.
## Did you know that afl-fuzz can minimize any test case for you?
Try the bundled `afl-tmin` tool - and get small repro files fast!
## Not sure if a crash is exploitable? AFL can help you figure it out.
Specify `-C` to enable the peruvian were-rabbit mode.
## Trouble dealing with a machine uprising? Relax, we've all been there.
Find essential survival tips at http://lcamtuf.coredump.cx/prep/.
## Want to automatically spot non-crashing memory handling bugs?
Try running an AFL-generated corpus through ASAN, MSAN, or Valgrind.
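One way to do this, sketched under the assumption that the target takes its input file as the first argument and exits with a non-zero status when the checker trips (as an ASAN-instrumented build does on a detected error). The function name `replay_corpus` is invented for this example.

```bash
# Hypothetical helper: run every file in a corpus directory through a
# target binary and print the inputs that make it exit abnormally.
# Assumes the target reads the file named by its first argument.
replay_corpus() {
  target=$1
  corpus_dir=$2
  for input in "$corpus_dir"/*; do
    [ -f "$input" ] || continue
    if ! "$target" "$input" >/dev/null 2>&1; then
      echo "$input"
    fi
  done
}
```

With Valgrind, the target would be a small wrapper script that passes `--error-exitcode=1`, since Valgrind otherwise returns the exit code of the program it runs.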
## Good selection of input files is critical to a successful fuzzing job.
See docs/perf_tips.md for pro tips.
## You can improve the odds of automatically spotting stack corruption issues.
Specify `AFL_HARDEN=1` in the environment to enable hardening flags.
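For a make-based project this might look like the sketch below. It assumes the project's Makefile honors `CC` and `CXX`; the wrapper function name is hypothetical.

```bash
# Hypothetical helper: build with the AFL compiler wrappers and
# hardening flags enabled. Assumes the Makefile respects CC and CXX.
build_hardened() {
  AFL_HARDEN=1 CC=afl-gcc CXX=afl-g++ make "$@"
}
```

Calling `build_hardened clean all` forwards its arguments to make, so the same wrapper works for any target the project defines.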
## Bumping into problems with non-reproducible crashes?
It happens, but usually isn't hard to diagnose. See section #7 in README for tips.
## Fuzzing is not just about memory corruption issues in the codebase.
Add some sanity-checking `assert()` / `abort()` statements to effortlessly catch logic bugs.
## Hey kid... pssst... want to figure out how AFL really works?
Check out docs/technical_details.md for all the gory details in one place!
## There's a ton of third-party helper tools designed to work with AFL!
Be sure to check out docs/sister_projects.md before writing your own.
## Need to fuzz the command-line arguments of a particular program?
You can find a simple solution in experimental/argv_fuzzing.
## Attacking a format that uses checksums?
Remove the checksum-checking code or use a postprocessor! See experimental/post_library/ for more.
## Dealing with a very slow target or hoping for instant results?
Specify `-d` when calling afl-fuzz!

@ -1,128 +0,0 @@
# ===================
# AFL "Life Pro Tips"
# ===================
#
# Bite-sized advice for those who understand the basics, but can't be bothered
# to read or memorize every other piece of documentation for AFL.
#
%
Get more bang for your buck by using fuzzing dictionaries.
See dictionaries/README.dictionaries to learn how.
%
You can get the most out of your hardware by parallelizing AFL jobs.
See docs/parallel_fuzzing.md for step-by-step tips.
%
Improve the odds of spotting memory corruption bugs with libdislocator.so!
It's easy. Consult libdislocator/README.dislocator for usage tips.
%
Want to understand how your target parses a particular input file?
Try the bundled afl-analyze tool; it's got colors and all!
%
You can visually monitor the progress of your fuzzing jobs.
Run the bundled afl-plot utility to generate browser-friendly graphs.
%
Need to monitor AFL jobs programmatically? Check out the fuzzer_stats file
in the AFL output dir or try afl-whatsup.
%
Puzzled by something showing up in red or purple in the AFL UI?
It could be important - consult docs/status_screen.txt right away!
%
Know your target? Convert it to persistent mode for a huge performance gain!
Consult section #5 in llvm_mode/README.llvm for tips.
%
Using clang? Check out llvm_mode/ for a faster alternative to afl-gcc!
%
Did you know that AFL can fuzz closed-source or cross-platform binaries?
Check out qemu_mode/README.qemu for more.
%
Did you know that afl-fuzz can minimize any test case for you?
Try the bundled afl-tmin tool - and get small repro files fast!
%
Not sure if a crash is exploitable? AFL can help you figure it out. Specify
-C to enable the peruvian were-rabbit mode. See section #10 in README for more.
%
Trouble dealing with a machine uprising? Relax, we've all been there.
Find essential survival tips at http://lcamtuf.coredump.cx/prep/.
%
AFL-generated corpora can be used to power other testing processes.
See section #2 in README for inspiration - it tends to pay off!
%
Want to automatically spot non-crashing memory handling bugs?
Try running an AFL-generated corpus through ASAN, MSAN, or Valgrind.
%
Good selection of input files is critical to a successful fuzzing job.
See section #5 in README (or docs/perf_tips.txt) for pro tips.
%
You can improve the odds of automatically spotting stack corruption issues.
Specify AFL_HARDEN=1 in the environment to enable hardening flags.
%
Bumping into problems with non-reproducible crashes? It happens, but usually
isn't hard to diagnose. See section #7 in README for tips.
%
Fuzzing is not just about memory corruption issues in the codebase. Add some
sanity-checking assert() / abort() statements to effortlessly catch logic bugs.
%
Hey kid... pssst... want to figure out how AFL really works?
Check out docs/technical_details.txt for all the gory details in one place!
%
There's a ton of third-party helper tools designed to work with AFL!
Be sure to check out docs/sister_projects.txt before writing your own.
%
Need to fuzz the command-line arguments of a particular program?
You can find a simple solution in experimental/argv_fuzzing.
%
Attacking a format that uses checksums? Remove the checksum-checking code or
use a postprocessor! See experimental/post_library/ for more.
%
Dealing with a very slow target or hoping for instant results? Specify -d
when calling afl-fuzz!
%

@ -1,12 +1,9 @@
================================= ## Tips for performance optimization
Tips for performance optimization
=================================
This file provides tips for troubleshooting slow or wasteful fuzzing jobs. This file provides tips for troubleshooting slow or wasteful fuzzing jobs.
See README for the general instruction manual. See README for the general instruction manual.
1) Keep your test cases small ## 1. Keep your test cases small
-----------------------------
This is probably the single most important step to take! Large test cases do This is probably the single most important step to take! Large test cases do
not merely take more time and memory to be parsed by the tested binary, but not merely take more time and memory to be parsed by the tested binary, but
@ -29,22 +26,20 @@ as high as 500x or so.
In practice, this means that you shouldn't fuzz image parsers with your In practice, this means that you shouldn't fuzz image parsers with your
vacation photos. Generate a tiny 16x16 picture instead, and run it through vacation photos. Generate a tiny 16x16 picture instead, and run it through
jpegtran or pngcrunch for good measure. The same goes for most other types `jpegtran` or `pngcrunch` for good measure. The same goes for most other types
of documents. of documents.
There's plenty of small starting test cases in ../testcases/* - try them out There's plenty of small starting test cases in ../testcases/ - try them out
or submit new ones! or submit new ones!
If you want to start with a larger, third-party corpus, run afl-cmin with an If you want to start with a larger, third-party corpus, run `afl-cmin` with an
aggressive timeout on that data set first. aggressive timeout on that data set first.
2) Use a simpler target ## 2. Use a simpler target
-----------------------
Consider using a simpler target binary in your fuzzing work. For example, for Consider using a simpler target binary in your fuzzing work. For example, for
image formats, bundled utilities such as djpeg, readpng, or gifhisto are image formats, bundled utilities such as `djpeg`, `readpng`, or `gifhisto` are
considerably (10-20x) faster than the convert tool from ImageMagick - all while considerably (10-20x) faster than the convert tool from ImageMagick - all while exercising roughly the same library-level image parsing code.
exercising roughly the same library-level image parsing code.
Even if you don't have a lightweight harness for a particular target, remember Even if you don't have a lightweight harness for a particular target, remember
that you can always use another, related library to generate a corpus that will that you can always use another, related library to generate a corpus that will
@ -53,11 +48,10 @@ be then manually fed to a more resource-hungry program later on.
Also note that reading the fuzzing input via stdin is faster than reading from Also note that reading the fuzzing input via stdin is faster than reading from
a file. a file.
3) Use LLVM instrumentation ## 3. Use LLVM instrumentation
---------------------------
When fuzzing slow targets, you can gain 20-100% performance improvement by When fuzzing slow targets, you can gain 20-100% performance improvement by
using the LLVM-based instrumentation mode described in llvm_mode/README.llvm. using the LLVM-based instrumentation mode described in [the llvm_mode README](../llvm_mode/README.md).
Note that this mode requires the use of clang and will not work with GCC. Note that this mode requires the use of clang and will not work with GCC.
The LLVM mode also offers a "persistent", in-process fuzzing mode that can The LLVM mode also offers a "persistent", in-process fuzzing mode that can
@ -67,25 +61,26 @@ that can offer huge benefits for programs with high startup overhead. Both
modes require you to edit the source code of the fuzzed program, but the modes require you to edit the source code of the fuzzed program, but the
changes often amount to just strategically placing a single line or two. changes often amount to just strategically placing a single line or two.
If there are important data comparisons performed (e.g. strcmp(ptr, MAGIC_HDR) If there are important data comparisons performed (e.g. `strcmp(ptr, MAGIC_HDR)`)
then using laf-intel (see llvm_mode/README.laf-intel) will help afl-fuzz a lot then using laf-intel (see llvm_mode/README.laf-intel.md) will help `afl-fuzz` a lot
to get to the important parts in the code. to get to the important parts in the code.
If you are only intested in specific parts of the code being fuzzed, you can If you are only interested in specific parts of the code being fuzzed, you can
whitelist the files that are actually relevant. This improves the speed and whitelist the files that are actually relevant. This improves the speed and
accuracy of afl. See llvm_mode/README.whitelist accuracy of afl. See llvm_mode/README.whitelist.md
Also use the InsTrim mode on larger binaries; this improves performance and Also use the InsTrim mode on larger binaries; this improves performance and
coverage a lot. coverage a lot.
4) Profile and optimize the binary ## 4. Profile and optimize the binary
----------------------------------
Check for any parameters or settings that obviously improve performance. For Check for any parameters or settings that obviously improve performance. For
example, the djpeg utility that comes with IJG jpeg and libjpeg-turbo can be example, the djpeg utility that comes with IJG jpeg and libjpeg-turbo can be
called with: called with:
```bash
-dct fast -nosmooth -onepass -dither none -scale 1/4 -dct fast -nosmooth -onepass -dither none -scale 1/4
```
...and that will speed things up. There is a corresponding drop in the quality ...and that will speed things up. There is a corresponding drop in the quality
of decoded images, but it's probably not something you care about. of decoded images, but it's probably not something you care about.
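A minimal sketch of how these flags might be wired into a fuzzing run: the helper below only prints an `afl-fuzz` command line so it can be reviewed first. The directory names `testcases` and `findings` and the `./djpeg` path are placeholders assumed for illustration. djpeg reads the image from stdin when no file name is given, which is why no `@@` appears.

```bash
# Print (do not run) an afl-fuzz command line that passes the fast decoding
# flags to djpeg. All paths are illustrative placeholders.
djpeg_fuzz_cmd() {
  # $1 = seed directory, $2 = output directory, $3 = djpeg path (optional)
  echo "afl-fuzz -i $1 -o $2 -- ${3:-./djpeg}" \
       "-dct fast -nosmooth -onepass -dither none -scale 1/4"
}

djpeg_fuzz_cmd testcases findings
```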
@ -98,134 +93,132 @@ With some laid-back parsers, enabling "strict" mode (i.e., bailing out after
first error) may result in smaller files and improved run time without first error) may result in smaller files and improved run time without
sacrificing coverage; for example, for sqlite, you may want to specify -bail. sacrificing coverage; for example, for sqlite, you may want to specify -bail.
If the program is still too slow, you can use strace -tt or an equivalent If the program is still too slow, you can use `strace -tt` or an equivalent
profiling tool to see if the targeted binary is doing anything silly. profiling tool to see if the targeted binary is doing anything silly.
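One way to apply this is a small wrapper that replays a single input under `strace -tt` and saves the timestamped trace for inspection. This is a sketch that assumes the target reads its input from stdin; `trace_target` is a hypothetical helper name, not part of AFL.

```bash
# Run a target under strace with microsecond timestamps (-tt), following
# child processes (-f), and write the trace to a file (-o).
trace_target() {
  # $1 = trace file, $2 = input file, remaining arguments = target command
  trace_file="$1"; input_file="$2"; shift 2
  if ! command -v strace >/dev/null 2>&1; then
    echo "strace not found in PATH" >&2
    return 127
  fi
  strace -tt -f -o "$trace_file" "$@" < "$input_file"
}
```

For example, `trace_target trace.log testcases/sample.jpg ./djpeg` would leave the trace in `trace.log`. Large gaps between consecutive timestamps, or unexpected `execve` and `nanosleep` calls, point at the expensive spots.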
Sometimes, you can speed things up simply by specifying /dev/null as the Sometimes, you can speed things up simply by specifying `/dev/null` as the
config file, or disabling some compile-time features that aren't really needed config file, or disabling some compile-time features that aren't really needed
for the job (try ./configure --help). One of the notoriously resource-consuming for the job (try `./configure --help`). One of the notoriously resource-consuming
things would be calling other utilities via exec*(), popen(), system(), or things would be calling other utilities via `exec*()`, `popen()`, `system()`, or
equivalent calls; for example, tar can invoke external decompression tools equivalent calls; for example, tar can invoke external decompression tools
when it decides that the input file is a compressed archive. when it decides that the input file is a compressed archive.
Some programs may also intentionally call sleep(), usleep(), or nanosleep(); Some programs may also intentionally call `sleep()`, `usleep()`, or `nanosleep()`;
vim is a good example of that. Other programs may attempt fsync() and so on. vim is a good example of that. Other programs may attempt `fsync()` and so on.
There are third-party libraries that make it easy to get rid of such code, There are third-party libraries that make it easy to get rid of such code,
e.g.: e.g.:
https://launchpad.net/libeatmydata https://launchpad.net/libeatmydata
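As a sketch, such a library can be injected into the target only (not into `afl-fuzz` itself) through the `AFL_PRELOAD` environment variable. The install location of `libeatmydata.so` varies between distributions, so the search directories below are assumptions, and `find_preload_lib` is a hypothetical helper. The script prints the resulting command rather than running it.

```bash
# Print the first file whose name starts with <name> in the given directories.
find_preload_lib() {
  # $1 = library file name prefix, remaining arguments = directories to search
  lib_name="$1"; shift
  for dir in "$@"; do
    for candidate in "$dir"/"$lib_name"*; do
      if [ -f "$candidate" ]; then
        echo "$candidate"
        return 0
      fi
    done
  done
  return 1
}

# The directory list is a guess; adjust it for your distribution.
if lib=$(find_preload_lib libeatmydata.so /usr/lib /usr/lib64 /usr/lib/*-linux-gnu); then
  echo "AFL_PRELOAD=$lib afl-fuzz -i testcases -o findings -- ./target"
else
  echo "libeatmydata.so not found; install libeatmydata first" >&2
fi
```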
In programs that are slow due to unavoidable initialization overhead, you may In programs that are slow due to unavoidable initialization overhead, you may
want to try the LLVM deferred forkserver mode (see llvm_mode/README.llvm), want to try the LLVM deferred forkserver mode (see llvm_mode/README.md),
which can give you speed gains up to 10x, as mentioned above. which can give you speed gains up to 10x, as mentioned above.
Last but not least, if you are using ASAN and the performance is unacceptable, Last but not least, if you are using ASAN and the performance is unacceptable,
consider turning it off for now, and manually examining the generated corpus consider turning it off for now, and manually examining the generated corpus
with an ASAN-enabled binary later on. with an ASAN-enabled binary later on.
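The later examination can be scripted. The sketch below replays every queue entry through a second, ASAN-enabled build and lists the inputs that make it exit with a non-zero status. It assumes the target reads its input from stdin; `replay_corpus` is a hypothetical helper, and a target that exits non-zero on merely malformed input will be flagged as well.

```bash
# Replay each queue entry through the given command and report failures.
replay_corpus() {
  # $1 = queue directory, remaining arguments = ASAN-enabled target command
  queue_dir="$1"; shift
  failed=0
  for testcase in "$queue_dir"/id*; do
    [ -f "$testcase" ] || continue
    # Any non-zero exit status (including an ASAN report) counts as a failure.
    if ! "$@" < "$testcase" >/dev/null 2>&1; then
      echo "FAIL: $testcase"
      failed=$((failed + 1))
    fi
  done
  echo "failing inputs: $failed"
}
```

A typical call might look like `replay_corpus findings/queue ./target_asan`, where both names are placeholders.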
5) Instrument just what you need ## 5. Instrument just what you need
--------------------------------
Instrument just the libraries you actually want to stress-test right now, one Instrument just the libraries you actually want to stress-test right now, one
at a time. Let the program use system-wide, non-instrumented libraries for at a time. Let the program use system-wide, non-instrumented libraries for
any functionality you don't actually want to fuzz. For example, in most any functionality you don't actually want to fuzz. For example, in most
cases, it doesn't make sense to instrument libgmp just because you're testing a cases, it doesn't make sense to instrument `libgmp` just because you're testing a
crypto app that relies on it for bignum math. crypto app that relies on it for bignum math.
Beware of programs that come with oddball third-party libraries bundled with Beware of programs that come with oddball third-party libraries bundled with
their source code (Spidermonkey is a good example of this). Check ./configure their source code (Spidermonkey is a good example of this). Check `./configure`
options to use non-instrumented system-wide copies instead. options to use non-instrumented system-wide copies instead.
6) Parallelize your fuzzers ## 6. Parallelize your fuzzers
---------------------------
The fuzzer is designed to need ~1 core per job. This means that on a, say, The fuzzer is designed to need ~1 core per job. This means that on a, say,
4-core system, you can easily run four parallel fuzzing jobs with relatively 4-core system, you can easily run four parallel fuzzing jobs with relatively
little performance hit. For tips on how to do that, see parallel_fuzzing.md. little performance hit. For tips on how to do that, see parallel_fuzzing.md.
The afl-gotcpu utility can help you understand if you still have idle CPU The `afl-gotcpu` utility can help you understand if you still have idle CPU
capacity on your system. (It won't tell you about memory bandwidth, cache capacity on your system. (It won't tell you about memory bandwidth, cache
misses, or similar factors, but they are less likely to be a concern.) misses, or similar factors, but they are less likely to be a concern.)
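As a rough sketch of the one-job-per-core idea, the helper below prints one `afl-fuzz` command per job: a single `-M` instance plus `-S` instances, all sharing one output directory. The instance names (`fuzzer01`, ...) and directory names are placeholders, and `parallel_fuzz_cmds` is a hypothetical helper; see parallel_fuzzing.md for the authoritative setup.

```bash
# Print one afl-fuzz command line per parallel job.
parallel_fuzz_cmds() {
  # $1 = number of jobs, $2 = seed dir, $3 = shared sync dir, rest = target command
  n_jobs="$1"; in_dir="$2"; sync_dir="$3"; shift 3
  # The first instance is started with -M, all others with -S.
  echo "afl-fuzz -i $in_dir -o $sync_dir -M fuzzer01 -- $*"
  i=2
  while [ "$i" -le "$n_jobs" ]; do
    printf 'afl-fuzz -i %s -o %s -S fuzzer%02d -- %s\n' "$in_dir" "$sync_dir" "$i" "$*"
    i=$((i + 1))
  done
}

parallel_fuzz_cmds 4 testcases sync_dir ./target @@
```

Each printed line is meant to be started in its own terminal or `screen` session. Running `afl-gotcpu` afterwards shows whether there is still idle capacity for another instance.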
7) Keep memory use and timeouts in check ## 7. Keep memory use and timeouts in check
----------------------------------------
If you have increased the -m or -t limits more than truly necessary, consider If you have increased the `-m` or `-t` limits more than truly necessary, consider
dialing them back down. dialing them back down.
For programs that are nominally very fast, but get sluggish for some inputs, For programs that are nominally very fast, but get sluggish for some inputs,
you can also try setting -t values that are more punishing than what afl-fuzz you can also try setting `-t` values that are more punishing than what `afl-fuzz`
dares to use on its own. On fast and idle machines, going down to -t 5 may be dares to use on its own. On fast and idle machines, going down to `-t 5` may be
a viable plan. a viable plan.
The -m parameter is worth looking at, too. Some programs can end up spending The `-m` parameter is worth looking at, too. Some programs can end up spending
a fair amount of time allocating and initializing megabytes of memory when a fair amount of time allocating and initializing megabytes of memory when
presented with pathological inputs. Low -m values can make them give up sooner presented with pathological inputs. Low `-m` values can make them give up sooner
and not waste CPU time. and not waste CPU time.
8) Check OS configuration ## 8. Check OS configuration
-------------------------
There are several OS-level factors that may affect fuzzing speed: There are several OS-level factors that may affect fuzzing speed:
- If you have no risk of power loss then run your fuzzing on a tmpfs - If you have no risk of power loss then run your fuzzing on a tmpfs
partition. This increases the performance noticeably. partition. This increases the performance noticeably.
Alternatively you can use AFL_TMPDIR to point to a tmpfs location to Alternatively you can use `AFL_TMPDIR` to point to a tmpfs location to
just write the input file to a tmpfs. just write the input file to a tmpfs.
- High system load. Use idle machines where possible. Kill any non-essential - High system load. Use idle machines where possible. Kill any non-essential
CPU hogs (idle browser windows, media players, complex screensavers, etc). CPU hogs (idle browser windows, media players, complex screensavers, etc).
- Network filesystems, either used for fuzzer input / output, or accessed by - Network filesystems, either used for fuzzer input / output, or accessed by
the fuzzed binary to read configuration files (pay special attention to the the fuzzed binary to read configuration files (pay special attention to the
home directory - many programs search it for dot-files). home directory - many programs search it for dot-files).
- On-demand CPU scaling. The Linux `ondemand` governor performs its analysis
- On-demand CPU scaling. The Linux 'ondemand' governor performs its analysis
on a particular schedule and is known to underestimate the needs of on a particular schedule and is known to underestimate the needs of
short-lived processes spawned by afl-fuzz (or any other fuzzer). On Linux, short-lived processes spawned by `afl-fuzz` (or any other fuzzer). On Linux,
this can be fixed with: this can be fixed with:
``` bash
cd /sys/devices/system/cpu cd /sys/devices/system/cpu
echo performance | tee cpu*/cpufreq/scaling_governor echo performance | tee cpu*/cpufreq/scaling_governor
```
On other systems, the impact of CPU scaling will be different; when fuzzing, On other systems, the impact of CPU scaling will be different; when fuzzing,
use OS-specific tools to find out if all cores are running at full speed. use OS-specific tools to find out if all cores are running at full speed.
- Transparent huge pages. Some allocators, such as `jemalloc`, can incur a
- Transparent huge pages. Some allocators, such as jemalloc, can incur a
heavy fuzzing penalty when transparent huge pages (THP) are enabled in the heavy fuzzing penalty when transparent huge pages (THP) are enabled in the
kernel. You can disable this via: kernel. You can disable this via:
```bash
echo never > /sys/kernel/mm/transparent_hugepage/enabled echo never > /sys/kernel/mm/transparent_hugepage/enabled
```
- Suboptimal scheduling strategies. The significance of this will vary from - Suboptimal scheduling strategies. The significance of this will vary from
one target to another, but on Linux, you may want to make sure that the one target to another, but on Linux, you may want to make sure that the
following options are set: following options are set:
```bash
echo 1 >/proc/sys/kernel/sched_child_runs_first echo 1 >/proc/sys/kernel/sched_child_runs_first
echo 1 >/proc/sys/kernel/sched_autogroup_enabled echo 1 >/proc/sys/kernel/sched_autogroup_enabled
```
Setting a different scheduling policy for the fuzzer process - say Setting a different scheduling policy for the fuzzer process - say
SCHED_RR - can usually speed things up, too, but needs to be done with `SCHED_RR` - can usually speed things up, too, but needs to be done with
care. care.
- Use the `afl-system-config` script to set all proc/sys settings above in one go.
- Use the afl-system-config script to set all proc/sys settings above
- Disable all the spectre, meltdown etc. security countermeasures in the - Disable all the spectre, meltdown etc. security countermeasures in the
kernel if your machine is properly separated: kernel if your machine is properly separated:
```
"ibpb=off ibrs=off kpti=off l1tf=off mds=off mitigations=off ibpb=off ibrs=off kpti=off l1tf=off mds=off mitigations=off
no_stf_barrier noibpb noibrs nopcid nopti nospec_store_bypass_disable no_stf_barrier noibpb noibrs nopcid nopti nospec_store_bypass_disable
nospectre_v1 nospectre_v2 pcid=off pti=off spec_store_bypass_disable=off nospectre_v1 nospectre_v2 pcid=off pti=off spec_store_bypass_disable=off
spectre_v2=off stf_barrier=off" spectre_v2=off stf_barrier=off
```
In most Linux distributions you can put this into a /etc/default/grub In most Linux distributions you can put this into a `/etc/default/grub`
variable. variable.
9) If all other options fail, use -d ## 9. If all other options fail, use `-d`
------------------------------------
For programs that are genuinely slow, in cases where you really can't escape For programs that are genuinely slow, in cases where you really can't escape
using huge input files, or when you simply want to get quick and dirty results using huge input files, or when you simply want to get quick and dirty results
early on, you can always resort to the -d mode. early on, you can always resort to the `-d` mode.
The mode causes afl-fuzz to skip all the deterministic fuzzing steps, which The mode causes `afl-fuzz` to skip all the deterministic fuzzing steps, which
makes output a lot less neat and can ultimately make the testing a bit less makes output a lot less neat and can ultimately make the testing a bit less
in-depth, but it will give you an experience more familiar from other fuzzing in-depth, but it will give you an experience more familiar from other fuzzing
tools. tools.


@ -1,8 +1,9 @@
afl++'s power schedules based on AFLfast # afl++'s power schedules based on AFLfast
<a href="https://comp.nus.edu.sg/~mboehme/paper/CCS16.pdf"><img src="https://comp.nus.edu.sg/~mboehme/paper/CCS16.png" align="right" width="250"></a> <a href="https://mboehme.github.io/paper/CCS16.pdf"><img src="https://mboehme.github.io/paper/CCS16.png" align="right" width="250"></a>
Power schedules implemented by Marcel Böhme \<marcel.boehme@acm.org\>. Power schedules implemented by Marcel Böhme \<marcel.boehme@acm.org\>.
AFLFast is an extension of AFL which was written by Michal Zalewski. AFLFast is an extension of AFL which is written and maintained by
Michal Zalewski \<lcamtuf@google.com\>.
AFLfast has helped in the success of Team Codejitsu at the finals of the DARPA Cyber Grand Challenge where their bot Galactica took **2nd place** in terms of #POVs proven (see red bar at https://www.cybergrandchallenge.com/event#results). AFLFast exposed several previously unreported CVEs that could not be exposed by AFL in 24 hours and otherwise exposed vulnerabilities significantly faster than AFL while generating orders of magnitude more unique crashes. AFLfast has helped in the success of Team Codejitsu at the finals of the DARPA Cyber Grand Challenge where their bot Galactica took **2nd place** in terms of #POVs proven (see red bar at https://www.cybergrandchallenge.com/event#results). AFLFast exposed several previously unreported CVEs that could not be exposed by AFL in 24 hours and otherwise exposed vulnerabilities significantly faster than AFL while generating orders of magnitude more unique crashes.

318
docs/sister_projects.md Normal file

@ -0,0 +1,318 @@
# Sister projects
This doc lists some of the projects that are inspired by, derived from,
designed for, or meant to integrate with AFL. See README for the general
instruction manual.
!!!
!!! This list is outdated and needs an update, missing: e.g. Angora, FairFuzz
!!!
## Support for other languages / environments:
### Python AFL (Jakub Wilk)
Allows fuzz-testing of Python programs. Uses custom instrumentation and its
own forkserver.
http://jwilk.net/software/python-afl
### Go-fuzz (Dmitry Vyukov)
AFL-inspired guided fuzzing approach for Go targets:
https://github.com/dvyukov/go-fuzz
### afl.rs (Keegan McAllister)
Allows Rust features to be easily fuzzed with AFL (using the LLVM mode).
https://github.com/kmcallister/afl.rs
### OCaml support (KC Sivaramakrishnan)
Adds AFL-compatible instrumentation to OCaml programs.
https://github.com/ocamllabs/opam-repo-dev/pull/23
http://canopy.mirage.io/Posts/Fuzzing
### AFL for GCJ Java and other GCC frontends (-)
GCC Java programs are actually supported out of the box - simply rename
afl-gcc to afl-gcj. Unfortunately, by default, unhandled exceptions in GCJ do
not result in abort() being called, so you will need to manually add a
top-level exception handler that exits with SIGABRT or something equivalent.
Other GCC-supported languages should be fairly easy to get working, but may
face similar problems. See https://gcc.gnu.org/frontends.html for a list of
options.
### AFL-style in-process fuzzer for LLVM (Kostya Serebryany)
Provides an evolutionary instrumentation-guided fuzzing harness that allows
some programs to be fuzzed without the fork / execve overhead. (Similar
functionality is now available as the "persistent" feature described in
[the llvm_mode readme](../llvm_mode/README.md))
http://llvm.org/docs/LibFuzzer.html
### AFL fixup shim (Ben Nagy)
Allows AFL_POST_LIBRARY postprocessors to be written in arbitrary languages
that don't have C / .so bindings. Includes examples in Go.
https://github.com/bnagy/aflfix
### TriforceAFL (Tim Newsham and Jesse Hertz)
Leverages QEMU full system emulation mode to allow AFL to target operating
systems and other alien worlds:
https://www.nccgroup.trust/us/about-us/newsroom-and-events/blog/2016/june/project-triforce-run-afl-on-everything/
### WinAFL (Ivan Fratric)
As the name implies, allows you to fuzz Windows binaries (using DynamoRio).
https://github.com/ivanfratric/winafl
Another Windows alternative may be:
https://github.com/carlosgprado/BrundleFuzz/
## Network fuzzing
### Preeny (Yan Shoshitaishvili)
Provides a fairly simple way to convince dynamically linked network-centric
programs to read from a file or not fork. Not AFL-specific, but described as
useful by many users. Some assembly required.
https://github.com/zardus/preeny
## Distributed fuzzing and related automation
### roving (Richo Healey)
A client-server architecture for effortlessly orchestrating AFL runs across
a fleet of machines. You don't want to use this on systems that face the
Internet or live in other untrusted environments.
https://github.com/richo/roving
### Disfuzz-AFL (Martijn Bogaard)
Simplifies the management of afl-fuzz instances on remote machines. The
author notes that the current implementation isn't secure and should not
be exposed on the Internet.
https://github.com/MartijnB/disfuzz-afl
### AFLDFF (quantumvm)
A nice GUI for managing AFL jobs.
https://github.com/quantumvm/AFLDFF
### afl-launch (Ben Nagy)
Batch AFL launcher utility with a simple CLI.
https://github.com/bnagy/afl-launch
### AFL Utils (rc0r)
Simplifies the triage of discovered crashes, starting parallel instances, etc.
https://github.com/rc0r/afl-utils
Another crash triage tool:
https://github.com/floyd-fuh/afl-crash-analyzer
### afl-fuzzing-scripts (Tobias Ospelt)
Simplifies starting up multiple parallel AFL jobs.
https://github.com/floyd-fuh/afl-fuzzing-scripts/
### afl-sid (Jacek Wielemborek)
Allows users to more conveniently build and deploy AFL via Docker.
https://github.com/d33tah/afl-sid
Another Docker-related project:
https://github.com/ozzyjohnson/docker-afl
### afl-monitor (Paul S. Ziegler)
Provides more detailed and versatile statistics about your running AFL jobs.
https://github.com/reflare/afl-monitor
### FEXM (Security in Telecommunications)
Fully automated fuzzing framework, based on AFL.
https://github.com/fgsect/fexm
## Crash triage, coverage analysis, and other companion tools:
### afl-crash-analyzer (Tobias Ospelt)
Makes it easier to navigate and annotate crashing test cases.
https://github.com/floyd-fuh/afl-crash-analyzer/
### Crashwalk (Ben Nagy)
AFL-aware tool to annotate and sort through crashing test cases.
https://github.com/bnagy/crashwalk
### afl-cov (Michael Rash)
Produces human-readable coverage data based on the output queue of afl-fuzz.
https://github.com/mrash/afl-cov
### afl-sancov (Bhargava Shastry)
Similar to afl-cov, but uses clang sanitizer instrumentation.
https://github.com/bshastry/afl-sancov
### RecidiVM (Jakub Wilk)
Makes it easy to estimate memory usage limits when fuzzing with ASAN or MSAN.
http://jwilk.net/software/recidivm
### aflize (Jacek Wielemborek)
Automatically builds AFL-enabled versions of Debian packages.
https://github.com/d33tah/aflize
### afl-ddmin-mod (Markus Teufelberger)
A variant of afl-tmin that uses a more sophisticated (but slower)
minimization algorithm.
https://github.com/MarkusTeufelberger/afl-ddmin-mod
### afl-kit (Kuang-che Wu)
Replacements for afl-cmin and afl-tmin with additional features, such
as the ability to filter crashes based on stderr patterns.
https://github.com/kcwu/afl-kit
## Narrow-purpose or experimental:
### Cygwin support (Ali Rizvi-Santiago)
Pretty self-explanatory. As per the author, this "mostly" ports AFL to
Windows. Field reports welcome!
https://github.com/arizvisa/afl-cygwin
### Pause and resume scripts (Ben Nagy)
Simple automation to suspend and resume groups of fuzzing jobs.
https://github.com/bnagy/afl-trivia
### Static binary-only instrumentation (Aleksandar Nikolich)
Allows black-box binaries to be instrumented statically (i.e., by modifying
the binary ahead of the time, rather than translating it on the run). Author
reports better performance compared to QEMU, but occasional translation
errors with stripped binaries.
https://github.com/vanhauser-thc/afl-dyninst
### AFL PIN (Parker Thompson)
Early-stage Intel PIN instrumentation support (from before we settled on
faster-running QEMU).
https://github.com/mothran/aflpin
### AFL-style instrumentation in llvm (Kostya Serebryany)
Allows AFL-equivalent instrumentation to be injected at compiler level.
This is currently not supported by AFL as-is, but may be useful in other
projects.
https://code.google.com/p/address-sanitizer/wiki/AsanCoverage#Coverage_counters
### AFL JS (Han Choongwoo)
One-off optimizations to speed up the fuzzing of JavaScriptCore (now likely
superseded by LLVM deferred forkserver init - see llvm_mode/README.llvm).
https://github.com/tunz/afl-fuzz-js
### AFL harness for fwknop (Michael Rash)
An example of a fairly involved integration with AFL.
https://github.com/mrash/fwknop/tree/master/test/afl
### Building harnesses for DNS servers (Jonathan Foote, Ron Bowes)
Two articles outlining the general principles and showing some example code.
https://www.fastly.com/blog/how-to-fuzz-server-american-fuzzy-lop
https://goo.gl/j9EgFf
### Fuzzer shell for SQLite (Richard Hipp)
A simple SQL shell designed specifically for fuzzing the underlying library.
http://www.sqlite.org/src/artifact/9e7e273da2030371
### Support for Python mutation modules (Christian Holler)
Now integrated in AFL++; originally from:
https://github.com/choller/afl/blob/master/docs/mozilla/python_modules.txt
### Support for selective instrumentation (Christian Holler)
Now integrated in AFL++; originally from:
https://github.com/choller/afl/blob/master/docs/mozilla/partial_instrumentation.txt
### Syzkaller (Dmitry Vyukov)
A similar guided approach as applied to fuzzing syscalls:
https://github.com/google/syzkaller/wiki/Found-Bugs
https://github.com/dvyukov/linux/commit/33787098ffaaa83b8a7ccf519913ac5fd6125931
http://events.linuxfoundation.org/sites/events/files/slides/AFL%20filesystem%20fuzzing%2C%20Vault%202016_0.pdf
### Kernel Snapshot Fuzzing using Unicornafl (Security in Telecommunications)
https://github.com/fgsect/unicorefuzz
### Android support (ele7enxxh)
Based on a somewhat dated version of AFL:
https://github.com/ele7enxxh/android-afl
### CGI wrapper (floyd)
Facilitates the testing of CGI scripts.
https://github.com/floyd-fuh/afl-cgi-wrapper
### Fuzzing difficulty estimation (Marcel Boehme)
A fork of AFL that tries to quantify the likelihood of finding additional
paths or crashes at any point in a fuzzing job.
https://github.com/mboehme/pythia


@ -1,360 +0,0 @@
===============
Sister projects
===============
This doc lists some of the projects that are inspired by, derived from,
designed for, or meant to integrate with AFL. See README for the general
instruction manual.
!!!
!!! This list is outdated and needs an update, missing: e.g. Angora, FairFuzz
!!!
-------------------------------------------
Support for other languages / environments:
-------------------------------------------
Python AFL (Jakub Wilk)
-----------------------
Allows fuzz-testing of Python programs. Uses custom instrumentation and its
own forkserver.
http://jwilk.net/software/python-afl
Go-fuzz (Dmitry Vyukov)
-----------------------
AFL-inspired guided fuzzing approach for Go targets:
https://github.com/dvyukov/go-fuzz
afl.rs (Keegan McAllister)
--------------------------
Allows Rust features to be easily fuzzed with AFL (using the LLVM mode).
https://github.com/kmcallister/afl.rs
OCaml support (KC Sivaramakrishnan)
-----------------------------------
Adds AFL-compatible instrumentation to OCaml programs.
https://github.com/ocamllabs/opam-repo-dev/pull/23
http://canopy.mirage.io/Posts/Fuzzing
AFL for GCJ Java and other GCC frontends (-)
--------------------------------------------
GCC Java programs are actually supported out of the box - simply rename
afl-gcc to afl-gcj. Unfortunately, by default, unhandled exceptions in GCJ do
not result in abort() being called, so you will need to manually add a
top-level exception handler that exits with SIGABRT or something equivalent.
Other GCC-supported languages should be fairly easy to get working, but may
face similar problems. See https://gcc.gnu.org/frontends.html for a list of
options.
AFL-style in-process fuzzer for LLVM (Kostya Serebryany)
--------------------------------------------------------
Provides an evolutionary instrumentation-guided fuzzing harness that allows
some programs to be fuzzed without the fork / execve overhead. (Similar
functionality is now available as the "persistent" feature described in
../llvm_mode/README.llvm.)
http://llvm.org/docs/LibFuzzer.html
AFL fixup shim (Ben Nagy)
-------------------------
Allows AFL_POST_LIBRARY postprocessors to be written in arbitrary languages
that don't have C / .so bindings. Includes examples in Go.
https://github.com/bnagy/aflfix
TriforceAFL (Tim Newsham and Jesse Hertz)
-----------------------------------------
Leverages QEMU full system emulation mode to allow AFL to target operating
systems and other alien worlds:
https://www.nccgroup.trust/us/about-us/newsroom-and-events/blog/2016/june/project-triforce-run-afl-on-everything/
WinAFL (Ivan Fratric)
---------------------
As the name implies, allows you to fuzz Windows binaries (using DynamoRio).
https://github.com/ivanfratric/winafl
Another Windows alternative may be:
https://github.com/carlosgprado/BrundleFuzz/
----------------
Network fuzzing:
----------------
Preeny (Yan Shoshitaishvili)
----------------------------
Provides a fairly simple way to convince dynamically linked network-centric
programs to read from a file or not fork. Not AFL-specific, but described as
useful by many users. Some assembly required.
https://github.com/zardus/preeny
-------------------------------------------
Distributed fuzzing and related automation:
-------------------------------------------
roving (Richo Healey)
---------------------
A client-server architecture for effortlessly orchestrating AFL runs across
a fleet of machines. You don't want to use this on systems that face the
Internet or live in other untrusted environments.
https://github.com/richo/roving
Distfuzz-AFL (Martijn Bogaard)
------------------------------
Simplifies the management of afl-fuzz instances on remote machines. The
author notes that the current implementation isn't secure and should not
be exposed on the Internet.
https://github.com/MartijnB/disfuzz-afl
AFLDFF (quantumvm)
------------------
A nice GUI for managing AFL jobs.
https://github.com/quantumvm/AFLDFF
afl-launch (Ben Nagy)
---------------------
Batch AFL launcher utility with a simple CLI.
https://github.com/bnagy/afl-launch
AFL Utils (rc0r)
----------------
Simplifies the triage of discovered crashes, start parallel instances, etc.
https://github.com/rc0r/afl-utils
Another crash triage tool:
https://github.com/floyd-fuh/afl-crash-analyzer
afl-fuzzing-scripts (Tobias Ospelt)
-----------------------------------
Simplifies starting up multiple parallel AFL jobs.
https://github.com/floyd-fuh/afl-fuzzing-scripts/
afl-sid (Jacek Wielemborek)
---------------------------
Allows users to more conveniently build and deploy AFL via Docker.
https://github.com/d33tah/afl-sid
Another Docker-related project:
https://github.com/ozzyjohnson/docker-afl
afl-monitor (Paul S. Ziegler)
-----------------------------
Provides more detailed and versatile statistics about your running AFL jobs.
https://github.com/reflare/afl-monitor
-----------------------------------------------------------
Crash triage, coverage analysis, and other companion tools:
-----------------------------------------------------------
afl-crash-analyzer (Tobias Ospelt)
----------------------------------
Makes it easier to navigate and annotate crashing test cases.
https://github.com/floyd-fuh/afl-crash-analyzer/
Crashwalk (Ben Nagy)
--------------------
AFL-aware tool to annotate and sort through crashing test cases.
https://github.com/bnagy/crashwalk
afl-cov (Michael Rash)
----------------------
Produces human-readable coverage data based on the output queue of afl-fuzz.
https://github.com/mrash/afl-cov
afl-sancov (Bhargava Shastry)
-----------------------------
Similar to afl-cov, but uses clang sanitizer instrumentation.
https://github.com/bshastry/afl-sancov
RecidiVM (Jakub Wilk)
---------------------
Makes it easy to estimate memory usage limits when fuzzing with ASAN or MSAN.
http://jwilk.net/software/recidivm
aflize (Jacek Wielemborek)
--------------------------
Automatically build AFL-enabled versions of Debian packages.
https://github.com/d33tah/aflize
afl-ddmin-mod (Markus Teufelberger)
-----------------------------------
A variant of afl-tmin that uses a more sophisticated (but slower)
minimization algorithm.
https://github.com/MarkusTeufelberger/afl-ddmin-mod
afl-kit (Kuang-che Wu)
----------------------
Replacements for afl-cmin and afl-tmin with additional features, such
as the ability to filter crashes based on stderr patterns.
https://github.com/kcwu/afl-kit
-------------------------------
Narrow-purpose or experimental:
-------------------------------
Cygwin support (Ali Rizvi-Santiago)
-----------------------------------
Pretty self-explanatory. As per the author, this "mostly" ports AFL to
Windows. Field reports welcome!
https://github.com/arizvisa/afl-cygwin
Pause and resume scripts (Ben Nagy)
-----------------------------------
Simple automation to suspend and resume groups of fuzzing jobs.
https://github.com/bnagy/afl-trivia
Static binary-only instrumentation (Aleksandar Nikolich)
--------------------------------------------------------
Allows black-box binaries to be instrumented statically (i.e., by modifying
the binary ahead of the time, rather than translating it on the run). Author
reports better performance compared to QEMU, but occasional translation
errors with stripped binaries.
https://github.com/vanhauser-thc/afl-dyninst
AFL PIN (Parker Thompson)
-------------------------
Early-stage Intel PIN instrumentation support (from before we settled on
faster-running QEMU).
https://github.com/mothran/aflpin
AFL-style instrumentation in llvm (Kostya Serebryany)
-----------------------------------------------------
Allows AFL-equivalent instrumentation to be injected at compiler level.
This is currently not supported by AFL as-is, but may be useful in other
projects.
https://code.google.com/p/address-sanitizer/wiki/AsanCoverage#Coverage_counters
AFL JS (Han Choongwoo)
----------------------
One-off optimizations to speed up the fuzzing of JavaScriptCore (now likely
superseded by LLVM deferred forkserver init - see llvm_mode/README.llvm).
https://github.com/tunz/afl-fuzz-js
AFL harness for fwknop (Michael Rash)
-------------------------------------
An example of a fairly involved integration with AFL.
https://github.com/mrash/fwknop/tree/master/test/afl
Building harnesses for DNS servers (Jonathan Foote, Ron Bowes)
--------------------------------------------------------------
Two articles outlining the general principles and showing some example code.
https://www.fastly.com/blog/how-to-fuzz-server-american-fuzzy-lop
https://goo.gl/j9EgFf
Fuzzer shell for SQLite (Richard Hipp)
--------------------------------------
A simple SQL shell designed specifically for fuzzing the underlying library.
http://www.sqlite.org/src/artifact/9e7e273da2030371
Support for Python mutation modules (Christian Holler)
------------------------------------------------------
Now integrated in AFL++; originally from:
https://github.com/choller/afl/blob/master/docs/mozilla/python_modules.txt
Support for selective instrumentation (Christian Holler)
--------------------------------------------------------
Now integrated in AFL++; originally from:
https://github.com/choller/afl/blob/master/docs/mozilla/partial_instrumentation.txt
Kernel fuzzing (Dmitry Vyukov)
------------------------------
A similar guided approach as applied to fuzzing syscalls:
https://github.com/google/syzkaller/wiki/Found-Bugs
https://github.com/dvyukov/linux/commit/33787098ffaaa83b8a7ccf519913ac5fd6125931
http://events.linuxfoundation.org/sites/events/files/slides/AFL%20filesystem%20fuzzing%2C%20Vault%202016_0.pdf
Android support (ele7enxxh)
---------------------------
Based on a somewhat dated version of AFL:
https://github.com/ele7enxxh/android-afl
CGI wrapper (floyd)
-------------------
Facilitates the testing of CGI scripts.
https://github.com/floyd-fuh/afl-cgi-wrapper
Fuzzing difficulty estimation (Marcel Boehme)
---------------------------------------------
A fork of AFL that tries to quantify the likelihood of finding additional
paths or crashes at any point in a fuzzing job.
https://github.com/mboehme/pythia

@ -1,13 +1,10 @@
=============================== # Understanding the status screen
Understanding the status screen
===============================
This document provides an overview of the status screen - plus tips for This document provides an overview of the status screen - plus tips for
troubleshooting any warnings and red text shown in the UI. See README for troubleshooting any warnings and red text shown in the UI. See README for
the general instruction manual. the general instruction manual.
0) A note about colors ## A note about colors
----------------------
The status screen and error messages use colors to keep things readable and The status screen and error messages use colors to keep things readable and
attract your attention to the most important details. For example, red almost attract your attention to the most important details. For example, red almost
@ -19,21 +16,18 @@ to that.
If you are using inverse video, you may want to change your settings, say: If you are using inverse video, you may want to change your settings, say:
- For GNOME Terminal, go to Edit > Profile preferences, select the "colors" - For GNOME Terminal, go to `Edit > Profile` preferences, select the "colors" tab, and from the list of built-in schemes, choose "white on black".
tab, and from the list of built-in schemes, choose "white on black". - For the MacOS X Terminal app, open a new window using the "Pro" scheme via the `Shell > New Window` menu (or make "Pro" your default).
- For the MacOS X Terminal app, open a new window using the "Pro" scheme via
the Shell > New Window menu (or make "Pro" your default).
Alternatively, if you really like your current colors, you can edit config.h Alternatively, if you really like your current colors, you can edit config.h
to comment out USE_COLORS, then do 'make clean all'. to comment out USE_COLORS, then do `make clean all`.
I'm not aware of any other simple way to make this work without causing I'm not aware of any other simple way to make this work without causing
other side effects - sorry about that. other side effects - sorry about that.
With that out of the way, let's talk about what's actually on the screen... With that out of the way, let's talk about what's actually on the screen...
0) The status bar ### The status bar
The top line shows you which mode afl-fuzz is running in The top line shows you which mode afl-fuzz is running in
(normal: "american fuzzy lop", crash exploration mode: "peruvian rabbit mode") (normal: "american fuzzy lop", crash exploration mode: "peruvian rabbit mode")
@ -43,15 +37,16 @@ either show the binary name being fuzzed, or the -M/-S master/slave name for
parallel fuzzing. parallel fuzzing.
Finally, the last item is the power schedule mode being run (default: explore). Finally, the last item is the power schedule mode being run (default: explore).
1) Process timing ### Process timing
-----------------
```
+----------------------------------------------------+ +----------------------------------------------------+
| run time : 0 days, 8 hrs, 32 min, 43 sec | | run time : 0 days, 8 hrs, 32 min, 43 sec |
| last new path : 0 days, 0 hrs, 6 min, 40 sec | | last new path : 0 days, 0 hrs, 6 min, 40 sec |
| last uniq crash : none seen yet | | last uniq crash : none seen yet |
| last uniq hang : 0 days, 1 hrs, 24 min, 32 sec | | last uniq hang : 0 days, 1 hrs, 24 min, 32 sec |
+----------------------------------------------------+ +----------------------------------------------------+
```
This section is fairly self-explanatory: it tells you how long the fuzzer has This section is fairly self-explanatory: it tells you how long the fuzzer has
been running and how much time has elapsed since its most recent finds. This is been running and how much time has elapsed since its most recent finds. This is
@ -67,36 +62,36 @@ There's one important thing to watch out for: if the tool is not finding new
paths within several minutes of starting, you're probably not invoking the paths within several minutes of starting, you're probably not invoking the
target binary correctly and it never gets to parse the input files we're target binary correctly and it never gets to parse the input files we're
throwing at it; other possible explanations are that the default memory limit throwing at it; other possible explanations are that the default memory limit
(-m) is too restrictive, and the program exits after failing to allocate a (`-m`) is too restrictive, and the program exits after failing to allocate a
buffer very early on; or that the input files are patently invalid and always buffer very early on; or that the input files are patently invalid and always
fail a basic header check. fail a basic header check.
If there are no new paths showing up for a while, you will eventually see a big If there are no new paths showing up for a while, you will eventually see a big
red warning in this section, too :-) red warning in this section, too :-)
2) Overall results ### Overall results
------------------
```
+-----------------------+ +-----------------------+
| cycles done : 0 | | cycles done : 0 |
| total paths : 2095 | | total paths : 2095 |
| uniq crashes : 0 | | uniq crashes : 0 |
| uniq hangs : 19 | | uniq hangs : 19 |
+-----------------------+ +-----------------------+
```
The first field in this section gives you the count of queue passes done so far The first field in this section gives you the count of queue passes done so far - that is, the number of times the fuzzer went over all the interesting test
- that is, the number of times the fuzzer went over all the interesting test
cases discovered so far, fuzzed them, and looped back to the very beginning. cases discovered so far, fuzzed them, and looped back to the very beginning.
Every fuzzing session should be allowed to complete at least one cycle; and Every fuzzing session should be allowed to complete at least one cycle; and
ideally, should run much longer than that. ideally, should run much longer than that.
As noted earlier, the first pass can take a day or longer, so sit back and As noted earlier, the first pass can take a day or longer, so sit back and
relax. If you want to get broader but more shallow coverage right away, try relax. If you want to get broader but more shallow coverage right away, try
the -d option - it gives you a more familiar experience by skipping the the `-d` option - it gives you a more familiar experience by skipping the
deterministic fuzzing steps. It is, however, inferior to the standard mode in deterministic fuzzing steps. It is, however, inferior to the standard mode in
a couple of subtle ways. a couple of subtle ways.
To help make the call on when to hit Ctrl-C, the cycle counter is color-coded. To help make the call on when to hit `Ctrl-C`, the cycle counter is color-coded.
It is shown in magenta during the first pass, progresses to yellow if new finds It is shown in magenta during the first pass, progresses to yellow if new finds
are still being made in subsequent rounds, then blue when that ends - and are still being made in subsequent rounds, then blue when that ends - and
finally, turns green after the fuzzer hasn't been seeing any action for a finally, turns green after the fuzzer hasn't been seeing any action for a
@ -105,33 +100,35 @@ longer while.
The remaining fields in this part of the screen should be pretty obvious: The remaining fields in this part of the screen should be pretty obvious:
there's the number of test cases ("paths") discovered so far, and the number of there's the number of test cases ("paths") discovered so far, and the number of
unique faults. The test cases, crashes, and hangs can be explored in real-time unique faults. The test cases, crashes, and hangs can be explored in real-time
by browsing the output directory, as discussed in the README. by browsing the output directory, as discussed in README.md.
3) Cycle progress ### Cycle progress
-----------------
```
+-------------------------------------+ +-------------------------------------+
| now processing : 1296 (61.86%) | | now processing : 1296 (61.86%) |
| paths timed out : 0 (0.00%) | | paths timed out : 0 (0.00%) |
+-------------------------------------+ +-------------------------------------+
```
This box tells you how far along the fuzzer is with the current queue cycle: it This box tells you how far along the fuzzer is with the current queue cycle: it
shows the ID of the test case it is currently working on, plus the number of shows the ID of the test case it is currently working on, plus the number of
inputs it decided to ditch because they were persistently timing out. inputs it decided to ditch because they were persistently timing out.
The "*" suffix sometimes shown in the first line means that the currently The "*" suffix sometimes shown in the first line means that the currently
processed path is not "favored" (a property discussed later on, in section 6). processed path is not "favored" (a property discussed later on).
If you feel that the fuzzer is progressing too slowly, see the note about the If you feel that the fuzzer is progressing too slowly, see the note about the
-d option in section 2 of this doc. `-d` option in this doc.
4) Map coverage ### Map coverage
---------------
```
+--------------------------------------+ +--------------------------------------+
| map density : 10.15% / 29.07% | | map density : 10.15% / 29.07% |
| count coverage : 4.03 bits/tuple | | count coverage : 4.03 bits/tuple |
+--------------------------------------+ +--------------------------------------+
```
The section provides some trivia about the coverage observed by the The section provides some trivia about the coverage observed by the
instrumentation embedded in the target binary. instrumentation embedded in the target binary.
@ -148,37 +145,35 @@ Be wary of extremes:
due to being linked against a non-instrumented copy of the target due to being linked against a non-instrumented copy of the target
library); or that it is bailing out prematurely on your input test cases. library); or that it is bailing out prematurely on your input test cases.
The fuzzer will try to mark this in pink, just to make you aware. The fuzzer will try to mark this in pink, just to make you aware.
- Percentages over 70% may very rarely happen with very complex programs - Percentages over 70% may very rarely happen with very complex programs
that make heavy use of template-generated code. that make heavy use of template-generated code.
Because high bitmap density makes it harder for the fuzzer to reliably Because high bitmap density makes it harder for the fuzzer to reliably
discern new program states, I recommend recompiling the binary with discern new program states, I recommend recompiling the binary with
AFL_INST_RATIO=10 or so and trying again (see env_variables.txt). `AFL_INST_RATIO=10` or so and trying again (see env_variables.md).
The fuzzer will flag high percentages in red. Chances are, you will never The fuzzer will flag high percentages in red. Chances are, you will never
see that unless you're fuzzing extremely hairy software (say, v8, perl, see that unless you're fuzzing extremely hairy software (say, v8, perl,
ffmpeg). ffmpeg).
The other line deals with the variability in tuple hit counts seen in the The other line deals with the variability in tuple hit counts seen in the
binary. In essence, if every taken branch is always taken a fixed number of binary. In essence, if every taken branch is always taken a fixed number of
times for all the inputs we have tried, this will read "1.00". As we manage times for all the inputs we have tried, this will read `1.00`. As we manage
to trigger other hit counts for every branch, the needle will start to move to trigger other hit counts for every branch, the needle will start to move
toward "8.00" (every bit in the 8-bit map hit), but will probably never toward `8.00` (every bit in the 8-bit map hit), but will probably never
reach that extreme. reach that extreme.
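
As a rough illustration of the "bits/tuple" figure described above, here is a minimal Python sketch. It assumes each map entry is an 8-bit mask in which every set bit stands for one hit-count bucket seen so far; the function name `count_coverage` and that representation are illustrative assumptions, not the fuzzer's actual code.

```python
def count_coverage(seen_buckets):
    """Average number of hit-count buckets observed per touched tuple.

    seen_buckets: iterable of ints in 0..255, where bit i set means
    bucket i was observed for that tuple (assumed representation).
    """
    touched = [b for b in seen_buckets if b]
    if not touched:
        return 0.0
    total_bits = sum(bin(b).count("1") for b in touched)
    return total_bits / len(touched)


# Every touched tuple seen with a single hit count -> 1.0
print(count_coverage([0, 0b00000001, 0b00000100]))
# One tuple seen with two distinct buckets, one with one -> 1.5
print(count_coverage([0b00000011, 0b00000001]))
```

Untouched entries are left out of the average, which is why the value starts at 1.00 rather than near zero.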
Together, the values can be useful for comparing the coverage of several Together, the values can be useful for comparing the coverage of several
different fuzzing jobs that rely on the same instrumented binary. different fuzzing jobs that rely on the same instrumented binary.
5) Stage progress ### Stage progress
-----------------
```
+-------------------------------------+ +-------------------------------------+
| now trying : interest 32/8 | | now trying : interest 32/8 |
| stage execs : 3996/34.4k (11.62%) | | stage execs : 3996/34.4k (11.62%) |
| total execs : 27.4M | | total execs : 27.4M |
| exec speed : 891.7/sec | | exec speed : 891.7/sec |
+-------------------------------------+ +-------------------------------------+
```
This part gives you an in-depth peek at what the fuzzer is actually doing right This part gives you an in-depth peek at what the fuzzer is actually doing right
now. It tells you about the current stage, which can be any of: now. It tells you about the current stage, which can be any of:
@ -186,39 +181,31 @@ now. It tells you about the current stage, which can be any of:
- calibration - a pre-fuzzing stage where the execution path is examined - calibration - a pre-fuzzing stage where the execution path is examined
to detect anomalies, establish baseline execution speed, and so on. Executed to detect anomalies, establish baseline execution speed, and so on. Executed
very briefly whenever a new find is being made. very briefly whenever a new find is being made.
- trim L/S - another pre-fuzzing stage where the test case is trimmed to the - trim L/S - another pre-fuzzing stage where the test case is trimmed to the
shortest form that still produces the same execution path. The length (L) shortest form that still produces the same execution path. The length (L)
and stepover (S) are chosen in general relationship to file size. and stepover (S) are chosen in general relationship to file size.
- bitflip L/S - deterministic bit flips. There are L bits toggled at any given - bitflip L/S - deterministic bit flips. There are L bits toggled at any given
time, walking the input file with S-bit increments. The current L/S variants time, walking the input file with S-bit increments. The current L/S variants
are: 1/1, 2/1, 4/1, 8/8, 16/8, 32/8. are: `1/1`, `2/1`, `4/1`, `8/8`, `16/8`, `32/8`.
- arith L/8 - deterministic arithmetics. The fuzzer tries to subtract or add - arith L/8 - deterministic arithmetics. The fuzzer tries to subtract or add
small integers to 8-, 16-, and 32-bit values. The stepover is always 8 bits. small integers to 8-, 16-, and 32-bit values. The stepover is always 8 bits.
- interest L/8 - deterministic value overwrite. The fuzzer has a list of known - interest L/8 - deterministic value overwrite. The fuzzer has a list of known
"interesting" 8-, 16-, and 32-bit values to try. The stepover is 8 bits. "interesting" 8-, 16-, and 32-bit values to try. The stepover is 8 bits.
- extras - deterministic injection of dictionary terms. This can be shown as - extras - deterministic injection of dictionary terms. This can be shown as
"user" or "auto", depending on whether the fuzzer is using a user-supplied "user" or "auto", depending on whether the fuzzer is using a user-supplied
dictionary (-x) or an auto-created one. You will also see "over" or "insert", dictionary (`-x`) or an auto-created one. You will also see "over" or "insert",
depending on whether the dictionary words overwrite existing data or are depending on whether the dictionary words overwrite existing data or are
inserted by offsetting the remaining data to accommodate their length. inserted by offsetting the remaining data to accommodate their length.
- havoc - a sort-of-fixed-length cycle with stacked random tweaks. The - havoc - a sort-of-fixed-length cycle with stacked random tweaks. The
operations attempted during this stage include bit flips, overwrites with operations attempted during this stage include bit flips, overwrites with
random and "interesting" integers, block deletion, block duplication, plus random and "interesting" integers, block deletion, block duplication, plus
assorted dictionary-related operations (if a dictionary is supplied in the assorted dictionary-related operations (if a dictionary is supplied in the
first place). first place).
- splice - a last-resort strategy that kicks in after the first full queue - splice - a last-resort strategy that kicks in after the first full queue
cycle with no new paths. It is equivalent to 'havoc', except that it first cycle with no new paths. It is equivalent to 'havoc', except that it first
splices together two random inputs from the queue at some arbitrarily splices together two random inputs from the queue at some arbitrarily
selected midpoint. selected midpoint.
- sync - a stage used only when `-M` or `-S` is set (see parallel_fuzzing.md).
- sync - a stage used only when -M or -S is set (see parallel_fuzzing.md).
No real fuzzing is involved, but the tool scans the output from other No real fuzzing is involved, but the tool scans the output from other
fuzzers and imports test cases as necessary. The first time this is done, fuzzers and imports test cases as necessary. The first time this is done,
it may take several minutes or so. it may take several minutes or so.
@ -234,15 +221,16 @@ The fuzzer will explicitly warn you about slow targets, too. If this happens,
see the perf_tips.txt file included with the fuzzer for ideas on how to speed see the perf_tips.txt file included with the fuzzer for ideas on how to speed
things up. things up.
6) Findings in depth ### Findings in depth
--------------------
```
+--------------------------------------+ +--------------------------------------+
| favored paths : 879 (41.96%) | | favored paths : 879 (41.96%) |
| new edges on : 423 (20.19%) | | new edges on : 423 (20.19%) |
| total crashes : 0 (0 unique) | | total crashes : 0 (0 unique) |
| total tmouts : 24 (19 unique) | | total tmouts : 24 (19 unique) |
+--------------------------------------+ +--------------------------------------+
```
This gives you several metrics that are of interest mostly to complete nerds. This gives you several metrics that are of interest mostly to complete nerds.
The section includes the number of paths that the fuzzer likes the most based The section includes the number of paths that the fuzzer likes the most based
@ -255,9 +243,9 @@ Note that the timeout counter is somewhat different from the hang counter; this
one includes all test cases that exceeded the timeout, even if they did not one includes all test cases that exceeded the timeout, even if they did not
exceed it by a margin sufficient to be classified as hangs. exceed it by a margin sufficient to be classified as hangs.
7) Fuzzing strategy yields ### Fuzzing strategy yields
--------------------------
```
+-----------------------------------------------------+ +-----------------------------------------------------+
| bit flips : 57/289k, 18/289k, 18/288k | | bit flips : 57/289k, 18/289k, 18/288k |
| byte flips : 0/36.2k, 4/35.7k, 7/34.6k | | byte flips : 0/36.2k, 4/35.7k, 7/34.6k |
@ -267,6 +255,7 @@ exceed it by a margin sufficient to be classified as hangs.
| havoc : 1903/20.0M, 0/0 | | havoc : 1903/20.0M, 0/0 |
| trim : 20.31%/9201, 17.05% | | trim : 20.31%/9201, 17.05% |
+-----------------------------------------------------+ +-----------------------------------------------------+
```
This is just another nerd-targeted section keeping track of how many paths we This is just another nerd-targeted section keeping track of how many paths we
have netted, in proportion to the number of execs attempted, for each of the have netted, in proportion to the number of execs attempted, for each of the
@ -280,9 +269,9 @@ goal. Finally, the third number shows the proportion of bytes that, although
not possible to remove, were deemed to have no effect and were excluded from not possible to remove, were deemed to have no effect and were excluded from
some of the more expensive deterministic fuzzing steps. some of the more expensive deterministic fuzzing steps.
8) Path geometry ### Path geometry
----------------
```
+---------------------+ +---------------------+
| levels : 5 | | levels : 5 |
| pending : 1570 | | pending : 1570 |
@ -291,6 +280,7 @@ some of the more expensive deterministic fuzzing steps.
| imported : 0 | | imported : 0 |
| stability : 100.00% | | stability : 100.00% |
+---------------------+ +---------------------+
```
The first field in this section tracks the path depth reached through the The first field in this section tracks the path depth reached through the
guided fuzzing process. In essence: the initial test cases supplied by the guided fuzzing process. In essence: the initial test cases supplied by the
@ -323,46 +313,40 @@ there are several things to look at:
- The use of uninitialized memory in conjunction with some intrinsic sources - The use of uninitialized memory in conjunction with some intrinsic sources
of entropy in the tested binary. Harmless to AFL, but could be indicative of entropy in the tested binary. Harmless to AFL, but could be indicative
of a security bug. of a security bug.
- Attempts to manipulate persistent resources, such as left over temporary - Attempts to manipulate persistent resources, such as left over temporary
files or shared memory objects. This is usually harmless, but you may want files or shared memory objects. This is usually harmless, but you may want
to double-check to make sure the program isn't bailing out prematurely. to double-check to make sure the program isn't bailing out prematurely.
Running out of disk space, SHM handles, or other global resources can Running out of disk space, SHM handles, or other global resources can
trigger this, too. trigger this, too.
- Hitting some functionality that is actually designed to behave randomly. - Hitting some functionality that is actually designed to behave randomly.
Generally harmless. For example, when fuzzing sqlite, an input like Generally harmless. For example, when fuzzing sqlite, an input like
'select random();' will trigger a variable execution path. `select random();` will trigger a variable execution path.
- Multiple threads executing at once in semi-random order. This is harmless - Multiple threads executing at once in semi-random order. This is harmless
when the 'stability' metric stays over 90% or so, but can become an issue when the 'stability' metric stays over 90% or so, but can become an issue
if not. Here's what to try: if not. Here's what to try:
* Use afl-clang-fast from [llvm_mode](../llvm_mode/) - it uses a thread-local tracking
- Use afl-clang-fast from llvm_mode/ - it uses a thread-local tracking
model that is less prone to concurrency issues, model that is less prone to concurrency issues,
* See if the target can be compiled or run without threads. Common
- See if the target can be compiled or run without threads. Common `./configure` options include `--without-threads`, `--disable-pthreads`, or
./configure options include --without-threads, --disable-pthreads, or `--disable-openmp`.
--disable-openmp. * Replace pthreads with GNU Pth (https://www.gnu.org/software/pth/), which
- Replace pthreads with GNU Pth (https://www.gnu.org/software/pth/), which
allows you to use a deterministic scheduler. allows you to use a deterministic scheduler.
- In persistent mode, minor drops in the "stability" metric can be normal, - In persistent mode, minor drops in the "stability" metric can be normal,
because not all the code behaves identically when re-entered; but major because not all the code behaves identically when re-entered; but major
dips may signify that the code within __AFL_LOOP() is not behaving dips may signify that the code within `__AFL_LOOP()` is not behaving
correctly on subsequent iterations (e.g., due to incomplete clean-up or correctly on subsequent iterations (e.g., due to incomplete clean-up or
reinitialization of the state) and that most of the fuzzing effort goes reinitialization of the state) and that most of the fuzzing effort goes
to waste. to waste.
The paths where variable behavior is detected are marked with a matching entry The paths where variable behavior is detected are marked with a matching entry
in the <out_dir>/queue/.state/variable_behavior/ directory, so you can look in the `<out_dir>/queue/.state/variable_behavior/` directory, so you can look
them up easily. them up easily.
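
If you would rather script that lookup, a minimal Python sketch could look like the one below. The directory layout is the one named above; the helper name `variable_paths` and the `out` directory in the usage line are placeholders.

```python
from pathlib import Path


def variable_paths(out_dir):
    """List queue entries flagged as showing variable behavior."""
    state_dir = Path(out_dir) / "queue" / ".state" / "variable_behavior"
    if not state_dir.is_dir():
        # Nothing flagged yet, or not an afl-fuzz output directory.
        return []
    return sorted(entry.name for entry in state_dir.iterdir())


# "out" stands in for whatever was passed to afl-fuzz via -o.
for name in variable_paths("out"):
    print(name)
```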
9) CPU load ### CPU load
-----------
```
[cpu: 25%] [cpu: 25%]
```
This tiny widget shows the apparent CPU utilization on the local system. It is This tiny widget shows the apparent CPU utilization on the local system. It is
calculated by taking the number of processes in the "runnable" state, and then calculated by taking the number of processes in the "runnable" state, and then
@ -380,39 +364,37 @@ are ready to run, but not how resource-hungry they may be. It also doesn't
distinguish between physical cores, logical cores, and virtualized CPUs; the distinguish between physical cores, logical cores, and virtualized CPUs; the
performance characteristics of each of these will differ quite a bit. performance characteristics of each of these will differ quite a bit.
If you want a more accurate measurement, you can run the afl-gotcpu utility If you want a more accurate measurement, you can run the `afl-gotcpu` utility from the command line.
from the command line.
10) Addendum: status and plot files ### Addendum: status and plot files
-----------------------------------
For unattended operation, some of the key status screen information can be also For unattended operation, some of the key status screen information can be also
found in a machine-readable format in the fuzzer_stats file in the output found in a machine-readable format in the fuzzer_stats file in the output
directory. This includes: directory. This includes:
- start_time - unix time indicating the start time of afl-fuzz - `start_time` - unix time indicating the start time of afl-fuzz
- last_update - unix time corresponding to the last update of this file - `last_update` - unix time corresponding to the last update of this file
- fuzzer_pid - PID of the fuzzer process - `fuzzer_pid` - PID of the fuzzer process
- cycles_done - queue cycles completed so far - `cycles_done` - queue cycles completed so far
- execs_done - number of execve() calls attempted - `execs_done` - number of execve() calls attempted
- execs_per_sec - current number of execs per second - `execs_per_sec` - current number of execs per second
- paths_total - total number of entries in the queue - `paths_total` - total number of entries in the queue
- paths_found - number of entries discovered through local fuzzing - `paths_found` - number of entries discovered through local fuzzing
- paths_imported - number of entries imported from other instances - `paths_imported` - number of entries imported from other instances
- max_depth - number of levels in the generated data set - `max_depth` - number of levels in the generated data set
- cur_path - currently processed entry number - `cur_path` - currently processed entry number
- pending_favs - number of favored entries still waiting to be fuzzed - `pending_favs` - number of favored entries still waiting to be fuzzed
- pending_total - number of all entries waiting to be fuzzed - `pending_total` - number of all entries waiting to be fuzzed
- stability - percentage of bitmap bytes that behave consistently - `stability` - percentage of bitmap bytes that behave consistently
- variable_paths - number of test cases showing variable behavior - `variable_paths` - number of test cases showing variable behavior
- unique_crashes - number of unique crashes recorded - `unique_crashes` - number of unique crashes recorded
- unique_hangs - number of unique hangs encountered - `unique_hangs` - number of unique hangs encountered
- command_line - full command line used for the fuzzing session - `command_line` - full command line used for the fuzzing session
- slowest_exec_ms - real time of the slowest execution in ms - `slowest_exec_ms` - real time of the slowest execution in ms
- peak_rss_mb - max rss usage reached during fuzzing in MB - `peak_rss_mb` - max rss usage reached during fuzzing in MB
Most of these map directly to the UI elements discussed earlier on. Most of these map directly to the UI elements discussed earlier on.
On top of that, you can also find an entry called 'plot_data', containing a On top of that, you can also find an entry called `plot_data`, containing a
plottable history for most of these fields. If you have gnuplot installed, you plottable history for most of these fields. If you have gnuplot installed, you
can turn this into a nice progress report with the included 'afl-plot' tool. can turn this into a nice progress report with the included `afl-plot` tool.
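
For quick scripting against fuzzer_stats, something along these lines may be enough. It is a sketch that assumes the file consists of one `key : value` pair per line; the helper name `parse_fuzzer_stats` and the sample values are made up for illustration.

```python
def parse_fuzzer_stats(text):
    """Parse 'key : value' lines into a dict of strings."""
    stats = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as the
        # command line may themselves contain colons.
        key, sep, value = line.partition(":")
        if sep:
            stats[key.strip()] = value.strip()
    return stats


# Invented sample content, shaped like the fields listed above.
sample = """\
cycles_done       : 3
paths_total       : 2095
stability         : 100.00%
command_line      : afl-fuzz -i in -o out -- ./target @@
"""

stats = parse_fuzzer_stats(sample)
print(stats["paths_total"], stats["stability"])
```

Values are kept as strings on purpose; fields such as `stability` and `command_line` are not plain integers, so any conversion is left to the caller.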

@ -1,13 +1,10 @@
=================================== # Technical "whitepaper" for afl-fuzz
Technical "whitepaper" for afl-fuzz
===================================
This document provides a quick overview of the guts of American Fuzzy Lop. This document provides a quick overview of the guts of American Fuzzy Lop.
See README for the general instruction manual; and for a discussion of See README for the general instruction manual; and for a discussion of
motivations and design goals behind AFL, see historical_notes.txt. motivations and design goals behind AFL, see historical_notes.md.
0) Design statement ## 0. Design statement
-------------------
American Fuzzy Lop does its best not to focus on any singular principle of American Fuzzy Lop does its best not to focus on any singular principle of
operation and not be a proof-of-concept for any specific theory. The tool can operation and not be a proof-of-concept for any specific theory. The tool can
@ -20,28 +17,30 @@ lightweight instrumentation that served as a foundation for the tool, but this
mechanism should be thought of merely as a means to an end. The only true mechanism should be thought of merely as a means to an end. The only true
governing principles are speed, reliability, and ease of use. governing principles are speed, reliability, and ease of use.
1) Coverage measurements ## 1. Coverage measurements
------------------------
The instrumentation injected into compiled programs captures branch (edge) The instrumentation injected into compiled programs captures branch (edge)
coverage, along with coarse branch-taken hit counts. The code injected at coverage, along with coarse branch-taken hit counts. The code injected at
branch points is essentially equivalent to: branch points is essentially equivalent to:
```c
cur_location = <COMPILE_TIME_RANDOM>; cur_location = <COMPILE_TIME_RANDOM>;
shared_mem[cur_location ^ prev_location]++; shared_mem[cur_location ^ prev_location]++;
prev_location = cur_location >> 1; prev_location = cur_location >> 1;
```
The cur_location value is generated randomly to simplify the process of The `cur_location` value is generated randomly to simplify the process of
linking complex projects and keep the XOR output distributed uniformly. linking complex projects and keep the XOR output distributed uniformly.
The shared_mem[] array is a 64 kB SHM region passed to the instrumented binary The `shared_mem[]` array is a 64 kB SHM region passed to the instrumented binary
by the caller. Every byte set in the output map can be thought of as a hit for by the caller. Every byte set in the output map can be thought of as a hit for
a particular (branch_src, branch_dst) tuple in the instrumented code. a particular (`branch_src`, `branch_dst`) tuple in the instrumented code.
The size of the map is chosen so that collisions are sporadic with almost all The size of the map is chosen so that collisions are sporadic with almost all
of the intended targets, which usually sport between 2k and 10k discoverable of the intended targets, which usually sport between 2k and 10k discoverable
branch points: branch points:
```
Branch cnt | Colliding tuples | Example targets Branch cnt | Colliding tuples | Example targets
------------+------------------+----------------- ------------+------------------+-----------------
1,000 | 0.75% | giflib, lzo 1,000 | 0.75% | giflib, lzo
@ -50,6 +49,7 @@ branch points:
10,000 | 7% | libxml 10,000 | 7% | libxml
20,000 | 14% | sqlite 20,000 | 14% | sqlite
50,000 | 30% | - 50,000 | 30% | -
```
At the same time, its size is small enough to allow the map to be analyzed At the same time, its size is small enough to allow the map to be analyzed
in a matter of microseconds on the receiving end, and to effortlessly fit in a matter of microseconds on the receiving end, and to effortlessly fit
@ -59,8 +59,10 @@ This form of coverage provides considerably more insight into the execution
path of the program than simple block coverage. In particular, it trivially path of the program than simple block coverage. In particular, it trivially
distinguishes between the following execution traces: distinguishes between the following execution traces:
```
A -> B -> C -> D -> E (tuples: AB, BC, CD, DE) A -> B -> C -> D -> E (tuples: AB, BC, CD, DE)
A -> B -> D -> C -> E (tuples: AB, BD, DC, CE) A -> B -> D -> C -> E (tuples: AB, BD, DC, CE)
```
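
As a minimal sketch of why edge tuples separate these two traces, the fragment below replays a sequence of block IDs through the instrumentation formula shown earlier and compares the resulting maps. The helper names (`record_trace`, `traces_differ`) and the block IDs are illustrative assumptions, not part of AFL.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAP_SIZE 65536 /* 64 kB map, as described above */

/* Replay a sequence of block IDs through the instrumentation formula.
   The IDs stand in for the compile-time random values. */
static void record_trace(uint8_t *map, const uint16_t *blocks, size_t n) {
  uint16_t prev_location = 0;
  memset(map, 0, MAP_SIZE);
  for (size_t i = 0; i < n; i++) {
    uint16_t cur_location = blocks[i];
    map[cur_location ^ prev_location]++;
    prev_location = cur_location >> 1;
  }
}

/* Return 1 if the two block sequences produce different maps, else 0. */
static int traces_differ(const uint16_t *a, size_t na,
                         const uint16_t *b, size_t nb) {
  static uint8_t map_a[MAP_SIZE], map_b[MAP_SIZE];
  record_trace(map_a, a, na);
  record_trace(map_b, b, nb);
  return memcmp(map_a, map_b, MAP_SIZE) != 0;
}
```

With hypothetical IDs A=0x1000, B=0x2000, C=0x4000, D=0x8000 and E=0x0100, both traces visit the same five blocks, so plain block coverage cannot tell them apart, while the edge maps differ.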
This aids the discovery of subtle fault conditions in the underlying code, This aids the discovery of subtle fault conditions in the underlying code,
because security vulnerabilities are more often associated with unexpected because security vulnerabilities are more often associated with unexpected
@ -75,8 +77,7 @@ The absence of simple saturating arithmetic opcodes on Intel CPUs means that
the hit counters can sometimes wrap around to zero. Since this is a fairly the hit counters can sometimes wrap around to zero. Since this is a fairly
unlikely and localized event, it's seen as an acceptable performance trade-off. unlikely and localized event, it's seen as an acceptable performance trade-off.
2) Detecting new behaviors ## 2. Detecting new behaviors
--------------------------
The fuzzer maintains a global map of tuples seen in previous executions; this The fuzzer maintains a global map of tuples seen in previous executions; this
data can be rapidly compared with individual traces and updated in just a couple data can be rapidly compared with individual traces and updated in just a couple
@ -97,18 +98,24 @@ To illustrate the properties of the algorithm, consider that the second trace
shown below would be considered substantially new because of the presence of shown below would be considered substantially new because of the presence of
new tuples (CA, AE): new tuples (CA, AE):
```
#1: A -> B -> C -> D -> E #1: A -> B -> C -> D -> E
#2: A -> B -> C -> A -> E #2: A -> B -> C -> A -> E
```
At the same time, with #2 processed, the following pattern will not be seen At the same time, with #2 processed, the following pattern will not be seen
as unique, despite having a markedly different overall execution path: as unique, despite having a markedly different overall execution path:
```
#3: A -> B -> C -> A -> B -> C -> A -> B -> C -> D -> E #3: A -> B -> C -> A -> B -> C -> A -> B -> C -> D -> E
```
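
The three traces above can be checked with a toy model. This sketch assumes blocks are named `A` to `Z` and stores each tuple exactly, whereas the real map is indexed by hashed locations; `add_trace` is a hypothetical helper, and hit counts are ignored here.

```c
#include <stddef.h>
#include <stdint.h>

#define N_BLOCKS 26 /* toy model: blocks are named 'A'..'Z' */

/* Count the tuples in `trace` that are not yet in `seen`, and register
   them. `seen` plays the role of the global map of known tuples. */
static int add_trace(uint8_t seen[N_BLOCKS * N_BLOCKS], const char *trace) {
  int new_tuples = 0;
  for (size_t i = 0; trace[i] && trace[i + 1]; i++) {
    int idx = (trace[i] - 'A') * N_BLOCKS + (trace[i + 1] - 'A');
    if (!seen[idx]) {
      seen[idx] = 1;
      new_tuples++;
    }
  }
  return new_tuples;
}
```

Feeding `"ABCDE"`, `"ABCAE"` and `"ABCABCABCDE"` in that order yields 4, 2 and 0 new tuples: the third trace only repeats tuples that were already registered.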
In addition to detecting new tuples, the fuzzer also considers coarse tuple In addition to detecting new tuples, the fuzzer also considers coarse tuple
hit counts. These are divided into several buckets: hit counts. These are divided into several buckets:
```
1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+ 1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+
```
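
A minimal sketch of the bucketing step follows. It assumes each bucket is encoded as a single bit of an 8-bit value, which is one plausible reading of the in-place mapping; `bucket_of` is a hypothetical name.

```c
#include <stdint.h>

/* Map a raw 8-bit hit count to a one-hot bucket value.
   Buckets: 1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+ (0 means "not hit"). */
static uint8_t bucket_of(uint8_t hits) {
  if (hits == 0) return 0;
  if (hits == 1) return 1;
  if (hits == 2) return 2;
  if (hits == 3) return 4;
  if (hits <= 7) return 8;
  if (hits <= 15) return 16;
  if (hits <= 31) return 32;
  if (hits <= 127) return 64;
  return 128;
}
```

Under this scheme a change from 47 to 48 hits stays in the same bucket, while a change from 1 to 2 hits moves to a new one.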
To some extent, the number of buckets is an implementation artifact: it allows To some extent, the number of buckets is an implementation artifact: it allows
an in-place mapping of an 8-bit counter generated by the instrumentation to an in-place mapping of an 8-bit counter generated by the instrumentation to
@ -135,8 +142,7 @@ reject them and hope that the fuzzer will find a less expensive way to reach
the same code. Empirical testing strongly suggests that more generous time the same code. Empirical testing strongly suggests that more generous time
limits are not worth the cost. limits are not worth the cost.
3) Evolving the input queue ## 3. Evolving the input queue
---------------------------
Mutated test cases that produced new state transitions within the program are Mutated test cases that produced new state transitions within the program are
added to the input queue and used as a starting point for future rounds of added to the input queue and used as a starting point for future rounds of
@ -146,7 +152,7 @@ In contrast to more greedy genetic algorithms, this approach allows the tool
to progressively explore various disjoint and possibly mutually incompatible to progressively explore various disjoint and possibly mutually incompatible
features of the underlying data format, as shown in this image: features of the underlying data format, as shown in this image:
http://lcamtuf.coredump.cx/afl/afl_gzip.png ![gzip_coverage](./visualization/afl_gzip.png)
Several practical examples of the results of this algorithm are discussed Several practical examples of the results of this algorithm are discussed
here: here:
@ -165,10 +171,11 @@ of new tuples, and the remainder is associated with changes in hit counts.
The following table compares the relative ability to discover file syntax and The following table compares the relative ability to discover file syntax and
explore program states when using several different approaches to guided explore program states when using several different approaches to guided
fuzzing. The instrumented target was GNU patch 2.7.3 compiled with -O3 and fuzzing. The instrumented target was GNU patch 2.7.3 compiled with `-O3` and
seeded with a dummy text file; the session consisted of a single pass over the seeded with a dummy text file; the session consisted of a single pass over the
input queue with afl-fuzz: input queue with afl-fuzz:
```
Fuzzer guidance | Blocks | Edges | Edge hit | Highest-coverage Fuzzer guidance | Blocks | Edges | Edge hit | Highest-coverage
strategy used | reached | reached | cnt var | test case generated strategy used | reached | reached | cnt var | test case generated
------------------+---------+---------+----------+--------------------------- ------------------+---------+---------+----------+---------------------------
@ -179,6 +186,7 @@ input queue with afl-fuzz:
Block coverage | 855 | 1,130 | 1.57 | Almost-valid RCS diff Block coverage | 855 | 1,130 | 1.57 | Almost-valid RCS diff
Edge coverage | 1,452 | 2,070 | 2.18 | One-chunk -c mode diff Edge coverage | 1,452 | 2,070 | 2.18 | One-chunk -c mode diff
AFL model | 1,765 | 2,597 | 4.99 | Four-chunk -c mode diff AFL model | 1,765 | 2,597 | 4.99 | Four-chunk -c mode diff
```
The first entry for blind fuzzing ("S") corresponds to executing just a single The first entry for blind fuzzing ("S") corresponds to executing just a single
round of testing; the second set of figures ("L") shows the fuzzer running in a round of testing; the second set of figures ("L") shows the fuzzer running in a
@ -191,6 +199,7 @@ a series of rudimentary, sequential operations such as walking bit flips.
Because this mode would be incapable of altering the size of the input file, Because this mode would be incapable of altering the size of the input file,
the sessions were seeded with a valid unified diff: the sessions were seeded with a valid unified diff:
```
Queue extension | Blocks | Edges | Edge hit | Number of unique Queue extension | Blocks | Edges | Edge hit | Number of unique
strategy used | reached | reached | cnt var | crashes found strategy used | reached | reached | cnt var | crashes found
------------------+---------+---------+----------+------------------ ------------------+---------+---------+----------+------------------
@ -200,14 +209,14 @@ the sessions were seeded with a valid unified diff:
Block coverage | 1,255 | 1,649 | 1.48 | 0 Block coverage | 1,255 | 1,649 | 1.48 | 0
Edge coverage | 1,259 | 1,734 | 1.72 | 0 Edge coverage | 1,259 | 1,734 | 1.72 | 0
AFL model | 1,452 | 2,040 | 3.16 | 1 AFL model | 1,452 | 2,040 | 3.16 | 1
```
As noted earlier on, some of the prior work on genetic fuzzing relied on As noted earlier on, some of the prior work on genetic fuzzing relied on
maintaining a single test case and evolving it to maximize coverage. At least maintaining a single test case and evolving it to maximize coverage. At least
in the tests described above, this "greedy" approach appears to confer no in the tests described above, this "greedy" approach appears to confer no
substantial benefits over blind fuzzing strategies. substantial benefits over blind fuzzing strategies.
4) Culling the corpus ## 4. Culling the corpus
---------------------
The progressive state exploration approach outlined above means that some of The progressive state exploration approach outlined above means that some of
the test cases synthesized later on in the game may have edge coverage that the test cases synthesized later on in the game may have edge coverage that
@ -225,11 +234,8 @@ for each tuple.
The tuples are then processed sequentially using a simple workflow: The tuples are then processed sequentially using a simple workflow:
1) Find next tuple not yet in the temporary working set, 1) Find next tuple not yet in the temporary working set,
2) Locate the winning queue entry for this tuple, 2) Locate the winning queue entry for this tuple,
3) Register *all* tuples present in that entry's trace in the working set, 3) Register *all* tuples present in that entry's trace in the working set,
4) Go to #1 if there are any missing tuples in the set. 4) Go to #1 if there are any missing tuples in the set.
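
The four steps above can be sketched as a greedy loop. This toy model assumes at most 32 tuples (one bit each) and a per-entry score where lower is better; `struct entry`, `winner_for` and `cull_queue` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#define N_TUPLES 32 /* toy model: one bit per tuple */

struct entry {
  uint32_t tuples; /* bitmask of tuples present in this entry's trace */
  uint32_t score;  /* assumed metric, lower is better */
  int favored;     /* set by cull_queue() */
};

/* Index of the lowest-scoring entry that covers `tuple`, or -1. */
static int winner_for(const struct entry *q, size_t n, int tuple) {
  int best = -1;
  for (size_t i = 0; i < n; i++) {
    if (!(q[i].tuples & (UINT32_C(1) << tuple))) continue;
    if (best < 0 || q[i].score < q[best].score) best = (int)i;
  }
  return best;
}

/* Mark the favored entries and return how many were selected. */
static size_t cull_queue(struct entry *q, size_t n) {
  uint32_t working_set = 0;
  size_t favored = 0;
  for (size_t i = 0; i < n; i++) q[i].favored = 0;
  for (int t = 0; t < N_TUPLES; t++) {
    if (working_set & (UINT32_C(1) << t)) continue; /* step 1 */
    int w = winner_for(q, n, t);                    /* step 2 */
    if (w < 0) continue;        /* no entry ever hit this tuple */
    working_set |= q[w].tuples;                     /* step 3 */
    q[w].favored = 1;
    favored++;
  }                                                 /* step 4 */
  return favored;
}
```

An entry whose tuples are all covered by cheaper winners is never selected, which is how the favored set ends up smaller than the queue.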
The generated corpus of "favored" entries is usually 5-10x smaller than the The generated corpus of "favored" entries is usually 5-10x smaller than the
@ -238,30 +244,26 @@ with varying probabilities when encountered in the queue:
- If there are new, yet-to-be-fuzzed favorites present in the queue, 99% - If there are new, yet-to-be-fuzzed favorites present in the queue, 99%
of non-favored entries will be skipped to get to the favored ones. of non-favored entries will be skipped to get to the favored ones.
- If there are no new favorites: - If there are no new favorites:
* If the current non-favored entry was fuzzed before, it will be skipped
- If the current non-favored entry was fuzzed before, it will be skipped
95% of the time. 95% of the time.
* If it hasn't gone through any fuzzing rounds yet, the odds of skipping
- If it hasn't gone through any fuzzing rounds yet, the odds of skipping
drop down to 75%. drop down to 75%.
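
The skip odds for non-favored entries can be written down as a small decision function. This is a sketch of the rules as listed, with hypothetical names; it deliberately says nothing about favored entries, and the random roll is passed in by the caller so the logic stays deterministic.

```c
/* Percent chance of skipping a non-favored queue entry.
   pending_favorites: non-zero if new, not-yet-fuzzed favorites exist.
   was_fuzzed: non-zero if this entry has been fuzzed before. */
static int skip_percent_non_favored(int was_fuzzed, int pending_favorites) {
  if (pending_favorites) return 99;
  return was_fuzzed ? 95 : 75;
}

/* Decide using a roll in the range 0..99 supplied by the caller. */
static int should_skip(int percent, int roll) {
  return roll < percent;
}
```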
Based on empirical testing, this provides a reasonable balance between queue Based on empirical testing, this provides a reasonable balance between queue
cycling speed and test case diversity. cycling speed and test case diversity.
Slightly more sophisticated but much slower culling can be performed on input Slightly more sophisticated but much slower culling can be performed on input
or output corpora with afl-cmin. This tool permanently discards the redundant or output corpora with `afl-cmin`. This tool permanently discards the redundant
entries and produces a smaller corpus suitable for use with afl-fuzz or entries and produces a smaller corpus suitable for use with `afl-fuzz` or
external tools. external tools.
5) Trimming input files ## 5. Trimming input files
-----------------------
File size has a dramatic impact on fuzzing performance, both because large File size has a dramatic impact on fuzzing performance, both because large
files make the target binary slower, and because they reduce the likelihood files make the target binary slower, and because they reduce the likelihood
that a mutation would touch important format control structures, rather than that a mutation would touch important format control structures, rather than
redundant data blocks. This is discussed in more detail in perf_tips.txt. redundant data blocks. This is discussed in more detail in perf_tips.md.
The possibility that the user will provide a low-quality starting corpus aside, The possibility that the user will provide a low-quality starting corpus aside,
some types of mutations can have the effect of iteratively increasing the size some types of mutations can have the effect of iteratively increasing the size
@ -275,12 +277,12 @@ The built-in trimmer in afl-fuzz attempts to sequentially remove blocks of data
with variable length and stepover; any deletion that doesn't affect the checksum with variable length and stepover; any deletion that doesn't affect the checksum
of the trace map is committed to disk. The trimmer is not designed to be of the trace map is committed to disk. The trimmer is not designed to be
particularly thorough; instead, it tries to strike a balance between precision particularly thorough; instead, it tries to strike a balance between precision
and the number of execve() calls spent on the process, selecting the block size and the number of `execve()` calls spent on the process, selecting the block size
and stepover to match. The average per-file gains are around 5-20%. and stepover to match. The average per-file gains are around 5-20%.
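
A rough sketch of this kind of trimmer is shown below. Running the target is replaced by a callback that returns a checksum of the trace; `toy_cksum` is a made-up stand-in whose result depends only on bytes other than `.`. The starting block size (1/16 of the file) and the halving down to single bytes are assumptions chosen for the example, and make it more thorough than the built-in trimmer is described to be.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TRIM_MAX_LEN 4096 /* toy limit for the scratch buffer */

typedef uint32_t (*trace_cksum_fn)(const uint8_t *buf, size_t len);

/* Toy target: its "trace" depends only on the bytes that are not '.'. */
static uint32_t toy_cksum(const uint8_t *buf, size_t len) {
  uint32_t sum = 0;
  for (size_t i = 0; i < len; i++)
    if (buf[i] != '.') sum = sum * 257u + (buf[i] + 1u);
  return sum;
}

/* Delete blocks of decreasing size; keep a deletion only if the trace
   checksum is unchanged. Trims `buf` in place, returns the new length. */
static size_t trim_case(uint8_t *buf, size_t len, trace_cksum_fn cksum) {
  uint8_t tmp[TRIM_MAX_LEN];
  if (len > TRIM_MAX_LEN) return len;
  const uint32_t want = cksum(buf, len);
  size_t block = len / 16;
  if (block == 0) block = 1;
  for (; block >= 1; block /= 2) {
    size_t pos = 0;
    while (pos < len) {
      size_t cut = (len - pos < block) ? len - pos : block;
      size_t new_len = len - cut;
      /* Candidate: the input without bytes [pos, pos + cut). */
      memcpy(tmp, buf, pos);
      memcpy(tmp + pos, buf + pos + cut, new_len - pos);
      if (new_len > 0 && cksum(tmp, new_len) == want) {
        memcpy(buf, tmp, new_len); /* commit, retry at the same offset */
        len = new_len;
      } else {
        pos += cut;                /* keep the block and step over it */
      }
    }
  }
  return len;
}
```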
The standalone afl-tmin tool uses a more exhaustive, iterative algorithm, and The standalone `afl-tmin` tool uses a more exhaustive, iterative algorithm, and
also attempts to perform alphabet normalization on the trimmed files. The also attempts to perform alphabet normalization on the trimmed files. The
operation of afl-tmin is as follows. operation of `afl-tmin` is as follows.
First, the tool automatically selects the operating mode. If the initial input First, the tool automatically selects the operating mode. If the initial input
crashes the target binary, afl-tmin will run in non-instrumented mode, simply crashes the target binary, afl-tmin will run in non-instrumented mode, simply
@ -293,16 +295,13 @@ The actual minimization algorithm is:
1) Attempt to zero large blocks of data with large stepovers. Empirically, 1) Attempt to zero large blocks of data with large stepovers. Empirically,
this is shown to reduce the number of execs by preempting finer-grained this is shown to reduce the number of execs by preempting finer-grained
efforts later on. efforts later on.
2) Perform a block deletion pass with decreasing block sizes and stepovers, 2) Perform a block deletion pass with decreasing block sizes and stepovers,
binary-search-style. binary-search-style.
3) Perform alphabet normalization by counting unique characters and trying 3) Perform alphabet normalization by counting unique characters and trying
to bulk-replace each with a zero value. to bulk-replace each with a zero value.
  4) As a last resort, perform byte-by-byte normalization on non-zero bytes. 4) As a last resort, perform byte-by-byte normalization on non-zero bytes.
Instead of zeroing with a 0x00 byte, afl-tmin uses the ASCII digit '0'. This Instead of zeroing with a 0x00 byte, `afl-tmin` uses the ASCII digit '0'. This
is done because such a modification is much less likely to interfere with is done because such a modification is much less likely to interfere with
text parsing, so it is more likely to result in successful minimization of text parsing, so it is more likely to result in successful minimization of
text files. text files.
@ -312,8 +311,7 @@ minimization approaches proposed in academic work, but requires far fewer
executions and tends to produce comparable results in most real-world executions and tends to produce comparable results in most real-world
applications. applications.
6) Fuzzing strategies ## 6. Fuzzing strategies
---------------------
The feedback provided by the instrumentation makes it easy to understand the The feedback provided by the instrumentation makes it easy to understand the
value of various fuzzing strategies and optimize their parameters so that they value of various fuzzing strategies and optimize their parameters so that they
@ -323,15 +321,13 @@ afl-fuzz are generally format-agnostic and are discussed in more detail here:
http://lcamtuf.blogspot.com/2014/08/binary-fuzzing-strategies-what-works.html http://lcamtuf.blogspot.com/2014/08/binary-fuzzing-strategies-what-works.html
It is somewhat notable that especially early on, most of the work done by It is somewhat notable that especially early on, most of the work done by
afl-fuzz is actually highly deterministic, and progresses to random stacked `afl-fuzz` is actually highly deterministic, and progresses to random stacked
modifications and test case splicing only at a later stage. The deterministic modifications and test case splicing only at a later stage. The deterministic
strategies include: strategies include:
- Sequential bit flips with varying lengths and stepovers, - Sequential bit flips with varying lengths and stepovers,
- Sequential addition and subtraction of small integers, - Sequential addition and subtraction of small integers,
- Sequential insertion of known interesting integers (`0`, `1`, `INT_MAX`, etc),
- Sequential insertion of known interesting integers (0, 1, INT_MAX, etc),
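
To make the first strategy concrete, here is a minimal sketch of a walking bit flip with a configurable run length and a fixed stepover of one bit. The callback type and all function names are hypothetical; in a real fuzzer the callback would execute the target on each variant.

```c
#include <stddef.h>
#include <stdint.h>

typedef void (*try_input_fn)(const uint8_t *buf, size_t len, void *ctx);

/* Flip one bit, counting from the most significant bit of byte 0. */
static void flip_bit(uint8_t *buf, size_t bit) {
  buf[bit >> 3] ^= (uint8_t)(128u >> (bit & 7));
}

/* Flip `width` consecutive bits at every bit offset, hand each variant
   to `try_input`, then restore the buffer. Returns the variant count. */
static size_t walking_bit_flips(uint8_t *buf, size_t len, size_t width,
                                try_input_fn try_input, void *ctx) {
  size_t total_bits = len * 8, tried = 0;
  if (width == 0 || width > total_bits) return 0;
  for (size_t pos = 0; pos + width <= total_bits; pos++) {
    for (size_t b = 0; b < width; b++) flip_bit(buf, pos + b);
    if (try_input) try_input(buf, len, ctx);
    tried++;
    for (size_t b = 0; b < width; b++) flip_bit(buf, pos + b);
  }
  return tried;
}

/* Example callback: OR the first byte of every variant into *ctx. */
static void or_first_byte(const uint8_t *buf, size_t len, void *ctx) {
  if (len > 0) *(uint8_t *)ctx |= buf[0];
}
```

Because every variant is restored before the next one is produced, each test case differs from the parent by exactly one run of flipped bits, which keeps the resulting diffs small.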
The purpose of opening with deterministic steps is related to their tendency to The purpose of opening with deterministic steps is related to their tendency to
produce compact test cases and small diffs between the non-crashing and crashing produce compact test cases and small diffs between the non-crashing and crashing
@ -341,10 +337,10 @@ With deterministic fuzzing out of the way, the non-deterministic steps include
stacked bit flips, insertions, deletions, arithmetics, and splicing of different stacked bit flips, insertions, deletions, arithmetics, and splicing of different
test cases. test cases.
The relative yields and execve() costs of all these strategies have been The relative yields and `execve()` costs of all these strategies have been
investigated and are discussed in the aforementioned blog post. investigated and are discussed in the aforementioned blog post.
For the reasons discussed in historical_notes.txt (chiefly, performance, For the reasons discussed in historical_notes.md (chiefly, performance,
simplicity, and reliability), AFL generally does not try to reason about the simplicity, and reliability), AFL generally does not try to reason about the
relationship between specific mutations and program states; the fuzzing steps relationship between specific mutations and program states; the fuzzing steps
are nominally blind, and are guided only by the evolutionary design of the are nominally blind, and are guided only by the evolutionary design of the
@ -365,8 +361,7 @@ in force only during deterministic stages that do not alter the size or the
general layout of the underlying file, this mechanism appears to work very general layout of the underlying file, this mechanism appears to work very
reliably and proved to be simple to implement. reliably and proved to be simple to implement.
7) Dictionaries ## 7. Dictionaries
---------------
The feedback provided by the instrumentation makes it easy to automatically The feedback provided by the instrumentation makes it easy to automatically
identify syntax tokens in some types of input files, and to detect that certain identify syntax tokens in some types of input files, and to detect that certain
@ -398,31 +393,28 @@ to a predefined value baked into the code. The fuzzer relies on this signal
to build compact "auto dictionaries" that are then used in conjunction with to build compact "auto dictionaries" that are then used in conjunction with
other fuzzing strategies. other fuzzing strategies.
8) De-duping crashes ## 8. De-duping crashes
--------------------
De-duplication of crashes is one of the more important problems for any De-duplication of crashes is one of the more important problems for any
competent fuzzing tool. Many of the naive approaches run into problems; in competent fuzzing tool. Many of the naive approaches run into problems; in
particular, looking just at the faulting address may lead to completely particular, looking just at the faulting address may lead to completely
unrelated issues being clustered together if the fault happens in a common unrelated issues being clustered together if the fault happens in a common
library function (say, strcmp, strcpy); while checksumming call stack library function (say, `strcmp`, `strcpy`); while checksumming call stack
backtraces can lead to extreme crash count inflation if the fault can be backtraces can lead to extreme crash count inflation if the fault can be
reached through a number of different, possibly recursive code paths. reached through a number of different, possibly recursive code paths.
The solution implemented in afl-fuzz considers a crash unique if either of two The solution implemented in `afl-fuzz` considers a crash unique if either of two
conditions is met: conditions is met:
- The crash trace includes a tuple not seen in any of the previous crashes, - The crash trace includes a tuple not seen in any of the previous crashes,
- The crash trace is missing a tuple that was always present in earlier - The crash trace is missing a tuple that was always present in earlier
faults. faults.
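
The two conditions can be sketched with a pair of bitmasks. This toy model assumes at most 32 tuples, one bit each; `struct crash_state` and `crash_is_unique` are hypothetical names, and the first crash is treated as unique by definition.

```c
#include <stdint.h>

struct crash_state {
  uint32_t seen_any; /* tuples seen in at least one earlier crash */
  uint32_t seen_all; /* tuples present in every earlier crash */
  int have_crash;    /* non-zero once the first crash is recorded */
};

/* Return 1 if `trace` counts as a unique crash, then update the state. */
static int crash_is_unique(struct crash_state *s, uint32_t trace) {
  if (!s->have_crash) {
    s->seen_any = trace;
    s->seen_all = trace;
    s->have_crash = 1;
    return 1;
  }
  int has_new_tuple = (trace & ~s->seen_any) != 0;
  int misses_common_tuple = (s->seen_all & ~trace) != 0;
  s->seen_any |= trace;
  s->seen_all &= trace;
  return has_new_tuple || misses_common_tuple;
}
```

Each unique crash either grows `seen_any` or shrinks `seen_all`, so the number of crashes that can still qualify drops over time, which matches the self-limiting behavior.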
The approach is vulnerable to some path count inflation early on, but exhibits The approach is vulnerable to some path count inflation early on, but exhibits
a very strong self-limiting effect, similar to the execution path analysis a very strong self-limiting effect, similar to the execution path analysis
logic that is the cornerstone of afl-fuzz. logic that is the cornerstone of `afl-fuzz`.
9) Investigating crashes ## 9. Investigating crashes
------------------------
The exploitability of many types of crashes can be ambiguous; afl-fuzz tries The exploitability of many types of crashes can be ambiguous; afl-fuzz tries
to address this by providing a crash exploration mode where a known-faulting to address this by providing a crash exploration mode where a known-faulting
@ -441,13 +433,12 @@ newly-found inputs for human review.
On the subject of crashes, it is worth noting that in contrast to normal On the subject of crashes, it is worth noting that in contrast to normal
queue entries, crashing inputs are *not* trimmed; they are kept exactly as queue entries, crashing inputs are *not* trimmed; they are kept exactly as
discovered to make it easier to compare them to the parent, non-crashing entry discovered to make it easier to compare them to the parent, non-crashing entry
in the queue. That said, afl-tmin can be used to shrink them at will. in the queue. That said, `afl-tmin` can be used to shrink them at will.
10) The fork server ## 10. The fork server
-------------------
To improve performance, afl-fuzz uses a "fork server", where the fuzzed process To improve performance, `afl-fuzz` uses a "fork server", where the fuzzed process
goes through execve(), linking, and libc initialization only once, and is then goes through `execve()`, linking, and libc initialization only once, and is then
cloned from a stopped process image by leveraging copy-on-write. The cloned from a stopped process image by leveraging copy-on-write. The
implementation is described in more detail here: implementation is described in more detail here:
@ -455,7 +446,7 @@ implementation is described in more detail here:
The fork server is an integral aspect of the injected instrumentation and The fork server is an integral aspect of the injected instrumentation and
simply stops at the first instrumented function to await commands from simply stops at the first instrumented function to await commands from
afl-fuzz. `afl-fuzz`.
With fast targets, the fork server can offer considerable performance gains, With fast targets, the fork server can offer considerable performance gains,
usually between 1.5x and 2x. It is also possible to: usually between 1.5x and 2x. It is also possible to:
@ -464,17 +455,14 @@ usually between 1.5x and 2x. It is also possible to:
user-selected chunks of initialization code. It requires very modest user-selected chunks of initialization code. It requires very modest
    code changes to the targeted program, and with some targets, can code changes to the targeted program, and with some targets, can
produce 10x+ performance gains. produce 10x+ performance gains.
- Enable "persistent" mode, where a single process is used to try out - Enable "persistent" mode, where a single process is used to try out
multiple inputs, greatly limiting the overhead of repetitive fork() multiple inputs, greatly limiting the overhead of repetitive `fork()`
calls. This generally requires some code changes to the targeted program, calls. This generally requires some code changes to the targeted program,
but can improve the performance of fast targets by a factor of 5 or more but can improve the performance of fast targets by a factor of 5 or more - approximating the benefits of in-process fuzzing jobs while still
- approximating the benefits of in-process fuzzing jobs while still
maintaining very robust isolation between the fuzzer process and the maintaining very robust isolation between the fuzzer process and the
targeted binary. targeted binary.
11) Parallelization ## 11. Parallelization
-------------------
The parallelization mechanism relies on periodically examining the queues The parallelization mechanism relies on periodically examining the queues
produced by independently-running instances on other CPU cores or on remote produced by independently-running instances on other CPU cores or on remote
@ -487,8 +475,7 @@ synergistic effects.
For more information about this design, see parallel_fuzzing.md. For more information about this design, see parallel_fuzzing.md.
12) Binary-only instrumentation ## 12. Binary-only instrumentation
-------------------------------
Instrumentation of black-box, binary-only targets is accomplished with the Instrumentation of black-box, binary-only targets is accomplished with the
help of a separately-built version of QEMU in "user emulation" mode. This also help of a separately-built version of QEMU in "user emulation" mode. This also
@ -497,6 +484,7 @@ allows the execution of cross-architecture code - say, ARM binaries on x86.
QEMU uses basic blocks as translation units; the instrumentation is implemented QEMU uses basic blocks as translation units; the instrumentation is implemented
on top of this and uses a model roughly analogous to the compile-time hooks: on top of this and uses a model roughly analogous to the compile-time hooks:
```c
if (block_address > elf_text_start && block_address < elf_text_end) { if (block_address > elf_text_start && block_address < elf_text_end) {
cur_location = (block_address >> 4) ^ (block_address << 8); cur_location = (block_address >> 4) ^ (block_address << 8);
@ -504,6 +492,7 @@ on top of this and uses a model roughly analogous to the compile-time hooks:
prev_location = cur_location >> 1; prev_location = cur_location >> 1;
} }
```
The shift-and-XOR-based scrambling in the second line is used to mask the The shift-and-XOR-based scrambling in the second line is used to mask the
effects of instruction alignment. effects of instruction alignment.
@ -511,7 +500,7 @@ effects of instruction alignment.
The start-up of binary translators such as QEMU, DynamoRIO, and PIN is fairly The start-up of binary translators such as QEMU, DynamoRIO, and PIN is fairly
slow; to counter this, the QEMU mode leverages a fork server similar to that slow; to counter this, the QEMU mode leverages a fork server similar to that
used for compiler-instrumented code, effectively spawning copies of an used for compiler-instrumented code, effectively spawning copies of an
already-initialized process paused at _start. already-initialized process paused at `_start`.
First-time translation of a new basic block also incurs substantial latency. To First-time translation of a new basic block also incurs substantial latency. To
eliminate this problem, the AFL fork server is extended by providing a channel eliminate this problem, the AFL fork server is extended by providing a channel
@ -523,8 +512,7 @@ processes.
As a result of these two optimizations, the overhead of the QEMU mode is As a result of these two optimizations, the overhead of the QEMU mode is
roughly 2-5x, compared to 100x+ for PIN. roughly 2-5x, compared to 100x+ for PIN.
13) The afl-analyze tool ## 13. The `afl-analyze` tool
------------------------
The file format analyzer is a simple extension of the minimization algorithm The file format analyzer is a simple extension of the minimization algorithm
discussed earlier on; instead of attempting to remove no-op blocks, the tool discussed earlier on; instead of attempting to remove no-op blocks, the tool
@ -536,28 +524,22 @@ It uses the following classification scheme:
- "No-op blocks" - segments where bit flips cause no apparent changes to - "No-op blocks" - segments where bit flips cause no apparent changes to
control flow. Common examples may be comment sections, pixel data within control flow. Common examples may be comment sections, pixel data within
a bitmap file, etc. a bitmap file, etc.
- "Superficial content" - segments where some, but not all, bitflips - "Superficial content" - segments where some, but not all, bitflips
produce some control flow changes. Examples may include strings in rich produce some control flow changes. Examples may include strings in rich
documents (e.g., XML, RTF). documents (e.g., XML, RTF).
- "Critical stream" - a sequence of bytes where all bit flips alter control - "Critical stream" - a sequence of bytes where all bit flips alter control
flow in different but correlated ways. This may be compressed data, flow in different but correlated ways. This may be compressed data,
non-atomically compared keywords or magic values, etc. non-atomically compared keywords or magic values, etc.
- "Suspected length field" - small, atomic integer that, when touched in - "Suspected length field" - small, atomic integer that, when touched in
any way, causes a consistent change to program control flow, suggestive any way, causes a consistent change to program control flow, suggestive
of a failed length check. of a failed length check.
- "Suspected cksum or magic int" - an integer that behaves similarly to a - "Suspected cksum or magic int" - an integer that behaves similarly to a
length field, but has a numerical value that makes the length explanation length field, but has a numerical value that makes the length explanation
unlikely. This is suggestive of a checksum or other "magic" integer. unlikely. This is suggestive of a checksum or other "magic" integer.
- "Suspected checksummed block" - a long block of data where any change - "Suspected checksummed block" - a long block of data where any change
always triggers the same new execution path. Likely caused by failing always triggers the same new execution path. Likely caused by failing
a checksum or a similar integrity check before any subsequent parsing a checksum or a similar integrity check before any subsequent parsing
takes place. takes place.
- "Magic value section" - a generic token where changes cause the type - "Magic value section" - a generic token where changes cause the type
of binary behavior outlined earlier, but that doesn't meet any of the of binary behavior outlined earlier, but that doesn't meet any of the
other criteria. May be an atomically compared keyword or so. other criteria. May be an atomically compared keyword or so.