mirror of
https://github.com/AFLplusplus/AFLplusplus.git
synced 2025-06-10 17:21:33 +00:00
* first commit, looks good * fix ascii percentage calc * fix ascii percentage calc * modify txt configs for test * further refinement * Revert "Merge branch 'text_inputs' into dev" This reverts commit 6d9b29daca46c8912aa9ddf6c053bc8554e9e9f7, reversing changes made to 07648f75ea5ef8f03a92db0c7566da8c229dc27b. * blacklist -> ignore renaming * rename whitelist -> instrumentlist * reduce the time interval in which the secondaries sync Co-authored-by: root <root@localhost.localdomain>
277 lines
10 KiB
Markdown
277 lines
10 KiB
Markdown
# afl-clang-lto - collision free instrumentation at link time
|
|
|
|
## TLDR;
|
|
|
|
This version requires a current llvm 11 compiled from the github master.
|
|
|
|
1. Use afl-clang-lto/afl-clang-lto++ because it is faster and gives better
|
|
coverage than anything else that is out there in the AFL world
|
|
|
|
2. You can use it together with llvm_mode: laf-intel and the instrument file listing
|
|
features and can be combined with cmplog/Redqueen
|
|
|
|
3. It only works with llvm 11 (current github master state)
|
|
|
|
4. AUTODICTIONARY feature! see below
|
|
|
|
5. If any problems arise be sure to set `AR=llvm-ar RANLIB=llvm-ranlib`.
|
|
Some targets might need `LD=afl-clang-lto` and others `LD=afl-ld-lto`.
|
|
|
|
6. If a target uses _init functions or early constructors then additionally
|
|
set `AFL_LLVM_MAP_DYNAMIC=1` as your target will crash otherwise!
|
|
|
|
## Introduction and problem description
|
|
|
|
A big issue with how afl/afl++ works is that the basic block IDs that are
|
|
set during compilation are random - and hence naturally the larger the number
|
|
of instrumented locations, the higher the number of edge collisions are in the
|
|
map. This can result in not discovering new paths and therefore degrade the
|
|
efficiency of the fuzzing process.
|
|
|
|
*This issue is underestimated in the fuzzing community!*
|
|
With a 2^16 = 64kb standard map at already 256 instrumented blocks there is
|
|
on average one collision. On average a target has 10.000 to 50.000
|
|
instrumented blocks hence the real collisions are between 750-18.000!
|
|
|
|
To reach a solution that prevents any collisions took several approaches
|
|
and many dead ends until we got to this:
|
|
|
|
* We instrument at link time when we have all files pre-compiled
|
|
* To instrument at link time we compile in LTO (link time optimization) mode
|
|
* Our compiler (afl-clang-lto/afl-clang-lto++) takes care of setting the
|
|
correct LTO options and runs our own afl-ld linker instead of the system
|
|
linker
|
|
* The LLVM linker collects all LTO files to link and instruments them so that
|
|
we have non-colliding edge overage
|
|
* We use a new (for afl) edge coverage - which is the same as in llvm
|
|
-fsanitize=coverage edge coverage mode :)
|
|
|
|
The result:
|
|
* 10-25% speed gain compared to llvm_mode
|
|
* guaranteed non-colliding edge coverage :-)
|
|
* The compile time especially for binaries to an instrumented library can be
|
|
much longer
|
|
|
|
Example build output from a libtiff build:
|
|
```
|
|
libtool: link: afl-clang-lto -g -O2 -Wall -W -o thumbnail thumbnail.o ../libtiff/.libs/libtiff.a ../port/.libs/libport.a -llzma -ljbig -ljpeg -lz -lm
|
|
afl-clang-lto++2.63d by Marc "vanHauser" Heuse <mh@mh-sec.de> in mode LTO
|
|
afl-llvm-lto++2.63d by Marc "vanHauser" Heuse <mh@mh-sec.de>
|
|
AUTODICTIONARY: 11 strings found
|
|
[+] Instrumented 12071 locations with no collisions (on average 1046 collisions would be in afl-gcc/afl-clang-fast) (non-hardened mode).
|
|
```
|
|
|
|
## Getting llvm 11
|
|
|
|
### Installing llvm 11 from the llvm repository
|
|
|
|
Installing the llvm snapshot builds is easy and mostly painless:
|
|
|
|
In the follow line change `NAME` for your Debian or Ubuntu release name
|
|
(e.g. buster, focal, eon, etc.):
|
|
```
|
|
echo deb http://apt.llvm.org/NAME/ llvm-toolchain-NAME NAME >> /etc/apt/sources.list
|
|
```
|
|
then add the pgp key of llvm and install the packages:
|
|
```
|
|
wget -O - https://apt.llvm.org/llvm-snapshot.gpg.key | apt-key add -
|
|
apt-get update && apt-get upgrade -y
|
|
apt-get install -y clang-11 clang-tools-11 libc++1-11 libc++-11-dev \
|
|
libc++abi1-11 libc++abi-11-dev libclang1-11 libclang-11-dev \
|
|
libclang-common-11-dev libclang-cpp11 libclang-cpp11-dev liblld-11 \
|
|
liblld-11-dev liblldb-11 liblldb-11-dev libllvm11 libomp-11-dev \
|
|
libomp5-11 lld-11 lldb-11 llvm-11 llvm-11-dev llvm-11-runtime llvm-11-tools
|
|
```
|
|
|
|
### Building llvm 11 yourself
|
|
|
|
Building llvm from github takes quite some long time and is not painless:
|
|
```
|
|
sudo apt install binutils-dev # this is *essential*!
|
|
git clone https://github.com/llvm/llvm-project
|
|
cd llvm-project
|
|
mkdir build
|
|
cd build
|
|
cmake -DLLVM_ENABLE_PROJECTS='clang;clang-tools-extra;compiler-rt;libclc;libcxx;libcxxabi;libunwind;lld' -DCMAKE_BUILD_TYPE=Release -DLLVM_BINUTILS_INCDIR=/usr/include/ ../llvm/
|
|
make -j $(nproc)
|
|
export PATH=`pwd`/bin:$PATH
|
|
export LLVM_CONFIG=`pwd`/bin/llvm-config
|
|
cd /path/to/AFLplusplus/
|
|
make
|
|
cd llvm_mode
|
|
make
|
|
cd ..
|
|
make install
|
|
```
|
|
|
|
## How to use afl-clang-lto
|
|
|
|
Just use afl-clang-lto like you did with afl-clang-fast or afl-gcc.
|
|
|
|
Also the instrument file listing (AFL_LLVM_INSTRUMENT_FILE -> [README.instrument_file.md](README.instrument_file.md)) and
|
|
laf-intel/compcov (AFL_LLVM_LAF_* -> [README.laf-intel.md](README.laf-intel.md)) work.
|
|
InsTrim (control flow graph instrumentation) is supported and recommended!
|
|
(set `AFL_LLVM_INSTRUMENT=CFG`)
|
|
|
|
Example:
|
|
```
|
|
CC=afl-clang-lto CXX=afl-clang-lto++ RANLIB=llvm-ranlib AR=llvm-ar ./configure
|
|
export AFL_LLVM_INSTRUMENT=CFG
|
|
make
|
|
```
|
|
|
|
NOTE: some targets also need to set the linker, try both `afl-clang-lto` and
|
|
`afl-ld-lto` for this for `LD=` for `configure`.
|
|
|
|
## AUTODICTIONARY feature
|
|
|
|
Setting `AFL_LLVM_LTO_AUTODICTIONARY` will generate a dictionary in the
|
|
target binary based on string compare and memory compare functions.
|
|
afl-fuzz will automatically get these transmitted when starting to fuzz.
|
|
This improves coverage on a lot of targets.
|
|
|
|
## Fixed memory map
|
|
|
|
To speed up fuzzing, the shared memory map is hard set to a specific address,
|
|
by default 0x10000. In most cases this will work without any problems.
|
|
On unusual operating systems/processors/kernels or weird libraries this might
|
|
fail so to change the fixed address at compile time set
|
|
AFL_LLVM_MAP_ADDR with a better value (a value of 0 or empty sets the map address
|
|
to be dynamic - the original afl way, which is slower).
|
|
AFL_LLVM_MAP_DYNAMIC can be set so the shared memory address is dynamic (which
|
|
is safer but also slower).
|
|
|
|
## Solving difficult targets
|
|
|
|
Some targets are difficult because the configure script does unusual stuff that
|
|
is unexpected for afl. See the next chapter `Potential issues` how to solve
|
|
these.
|
|
|
|
An example of a hard to solve target is ffmpeg. Here is how to successfully
|
|
instrument it:
|
|
|
|
1. Get and extract the current ffmpeg and change to it's directory
|
|
|
|
2. Running configure with --cc=clang fails and various other items will fail
|
|
when compiling, so we have to trick configure:
|
|
|
|
```
|
|
./configure --enable-lto --disable-shared
|
|
```
|
|
|
|
3. Now the configuration is done - and we edit the settings in `./ffbuild/config.mak`
|
|
(-: the original line, +: what to change it into):
|
|
```
|
|
-CC=gcc
|
|
+CC=afl-clang-lto
|
|
-CXX=g++
|
|
+CXX=afl-clang-lto++
|
|
-AS=gcc
|
|
+AS=llvm-as
|
|
-LD=gcc
|
|
+LD=afl-clang-lto++
|
|
-DEPCC=gcc
|
|
+DEPCC=afl-clang-lto
|
|
-DEPAS=gcc
|
|
+DEPAS=afl-clang-lto++
|
|
-AR=ar
|
|
+AR=llvm-ar
|
|
-AR_CMD=ar
|
|
+AR_CMD=llvm-ar
|
|
-NM_CMD=nm -g
|
|
+NM_CMD=llvm-nm -g
|
|
-RANLIB=ranlib -D
|
|
+RANLIB=llvm-ranlib -D
|
|
```
|
|
|
|
4. Then type make, wait for a long time and you are done :)
|
|
|
|
## Potential issues
|
|
|
|
### compiling libraries fails
|
|
|
|
If you see this message:
|
|
```
|
|
/bin/ld: libfoo.a: error adding symbols: archive has no index; run ranlib to add one
|
|
```
|
|
This is because usually gnu gcc ranlib is being called which cannot deal with clang LTO files.
|
|
The solution is simple: when you ./configure you have also have to set RANLIB=llvm-ranlib and AR=llvm-ar
|
|
|
|
Solution:
|
|
```
|
|
AR=llvm-ar RANLIB=llvm-ranlib CC=afl-clang-lto CXX=afl-clang-lto++ ./configure --disable-shared
|
|
```
|
|
and on some target you have to to AR=/RANLIB= even for make as the configure script does not save it.
|
|
Other targets ignore environment variables and need the parameters set via
|
|
`./configure --cc=... --cxx= --ranlib= ...` etc. (I am looking at you ffmpeg!).
|
|
|
|
|
|
If you see this message
|
|
```
|
|
assembler command failed ...
|
|
```
|
|
then try setting `llvm-as` for configure:
|
|
```
|
|
AS=llvm-as ...
|
|
```
|
|
|
|
### compiling programs still fail
|
|
|
|
afl-clang-lto is still work in progress.
|
|
|
|
Known issues:
|
|
* Anything that llvm 11 cannot compile, afl-clang-lto can not compile either - obviously
|
|
* Anything that does not compile with LTO, afl-clang-lto can not compile either - obviously
|
|
|
|
Hence if building a target with afl-clang-lto fails try to build it with llvm11
|
|
and LTO enabled (`CC=clang-11` `CXX=clang++-11` `CFLAGS=-flto=full` and
|
|
`CXXFLAGS=-flto=full`).
|
|
|
|
If this succeeeds then there is an issue with afl-clang-lto. Please report at
|
|
[https://github.com/AFLplusplus/AFLplusplus/issues/226](https://github.com/AFLplusplus/AFLplusplus/issues/226)
|
|
|
|
Even some targets where clang-11 fails can be build if the fail is just in
|
|
`./configure`, see `Solving difficult targets` above.
|
|
|
|
### Target crashes immediately
|
|
|
|
If the target is using early constructors (priority values smaller than 6)
|
|
or have their own _init/.init functions and these are instrumented then the
|
|
target will likely crash when started. This can be avoided by compiling with
|
|
`AFL_LLVM_MAP_DYNAMIC=1` .
|
|
|
|
This can e.g. happen with OpenSSL.
|
|
|
|
## History
|
|
|
|
This was originally envisioned by hexcoder- in Summer 2019, however we saw no
|
|
way to create a pass that is run at link time - although there is a option
|
|
for this in the PassManager: EP_FullLinkTimeOptimizationLast
|
|
("Fun" info - nobody knows what this is doing. And the developer who
|
|
implemented this didn't respond to emails.)
|
|
|
|
In December came then the idea to implement this as a pass that is run via
|
|
the llvm "opt" program, which is performed via an own linker that afterwards
|
|
calls the real linker.
|
|
This was first implemented in January and work ... kinda.
|
|
The LTO time instrumentation worked, however the "how" the basic blocks were
|
|
instrumented was a problem, as reducing duplicates turned out to be very,
|
|
very difficult with a program that has so many paths and therefore so many
|
|
dependencies. At lot of strategies were implemented - and failed.
|
|
And then sat solvers were tried, but with over 10.000 variables that turned
|
|
out to be a dead-end too.
|
|
|
|
The final idea to solve this came from domenukk who proposed to insert a block
|
|
into an edge and then just use incremental counters ... and this worked!
|
|
After some trials and errors to implement this vanhauser-thc found out that
|
|
there is actually an llvm function for this: SplitEdge() :-)
|
|
|
|
Still more problems came up though as this only works without bugs from
|
|
llvm 9 onwards, and with high optimization the link optimization ruins
|
|
the instrumented control flow graph.
|
|
|
|
This is all now fixed with llvm 11. The llvm's own linker is now able to
|
|
load passes and this bypasses all problems we had.
|
|
|
|
Happy end :)
|