add LTO AFL_LLVM_DOCUMENT_IDS feature

van Hauser
2020-07-31 17:53:01 +02:00
parent c101a3f5ab
commit 185f443659
7 changed files with 55 additions and 21 deletions

View File

@ -26,6 +26,8 @@ sending a mail to <afl-users+subscribe@googlegroups.com>.
- LTO: autodictionary mode is a default
- LTO: instrim instrumentation disabled, only classic support used
as it is always better
- LTO: env var AFL_LLVM_DOCUMENT_IDS=file will document which edge ID
was given to which function during compilation
- setting AFL_LLVM_LAF_SPLIT_FLOATS now activates
AFL_LLVM_LAF_SPLIT_COMPARES
- added honggfuzz mangle as a custom mutator in custom_mutators/honggfuzz

View File

@ -95,12 +95,13 @@ afl-clang-fast PCGUARD and afl-clang-lto LTO instrumentation!
2. Second step: Find the responsible function.
a) For LTO instrumented binaries just disassemble or decompile the target
and look which edge is writing to that edge ID. Ghidra is a good tool
for this: [https://ghidra-sre.org/](https://ghidra-sre.org/)
a) For LTO instrumented binaries this can be documented at compile time:
just set `export AFL_LLVM_DOCUMENT_IDS=/path/to/afile`.
Each line of this file will contain one function name and the edge ID
assigned to it.
b) For PCGUARD instrumented binaries it is more difficult. Here you can
either modify the __sanitizer_cov_trace_pc_guard function in
b) For PCGUARD instrumented binaries it is much more difficult. Here you
can either modify the __sanitizer_cov_trace_pc_guard function in
llvm_mode/afl-llvm-rt.o.c to write a backtrace to a file if the ID in
__afl_area_ptr[*guard] is one of the unstable edge IDs. Then recompile
and reinstall llvm_mode and rebuild your target. Run the recompiled
@ -121,4 +122,3 @@ afl-clang-fast PCGUARD and afl-clang-lto LTO instrumentation!
4. Fourth step: recompile the target
Recompile, fuzz it, be happy :)
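
To illustrate step 2a, here is a minimal sketch of resolving an unstable edge ID to its function from the documented IDs file. It assumes every line has the form `<function> <id>`, which matches the `fprintf(documentFile, "%s %u\n", ...)` call added to the LTO pass in this commit, and that function names contain no whitespace. The helper name `lookup_edge_function` is hypothetical and not part of AFL++.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of AFL++.
   Finds the function that was assigned edge ID `wanted`.
   Assumes each line of the file is "<function> <id>".
   Returns 1 and copies the name into `out` on success,
   0 if the ID is not listed or the file cannot be opened. */
int lookup_edge_function(const char *path, unsigned int wanted,
                         char *out, size_t out_len) {

  FILE *f = fopen(path, "r");
  if (!f) return 0;

  char         name[1024];
  unsigned int id;
  int          found = 0;

  /* %1023s stops at whitespace, so names with spaces are not supported */
  while (fscanf(f, "%1023s %u", name, &id) == 2) {

    if (id == wanted) {

      snprintf(out, out_len, "%s", name);
      found = 1;
      break;

    }

  }

  fclose(f);
  return found;

}
```

Because the pass opens the file in append mode, entries from earlier builds stay in the file. Removing the file before recompiling avoids matching a stale ID.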

View File

@ -121,18 +121,16 @@ Then there are a few specific features that are only available in llvm_mode:
built if LLVM 11 or newer is used.
- AFL_LLVM_INSTRUMENT=CFG will use Control Flow Graph instrumentation.
(recommended)
- AFL_LLVM_LTO_AUTODICTIONARY will generate a dictionary in the target
binary based on string compare and memory compare functions.
afl-fuzz will automatically get these transmitted when starting to
fuzz.
(not recommended!)
None of the following options need to be set; they are meant for manual
use (which likely only the author of this LTO implementation will ever need).
They are used if several separate instrumentations are performed which
are then later combined.
- AFL_LLVM_DOCUMENT_IDS=file will document to a file which edge ID was given
to which function. This helps to identify functions with variable bytes
or which functions were touched by an input.
- AFL_LLVM_MAP_ADDR sets the fixed map address to a different address than
the default 0x10000. A value of 0 or empty sets the map address to be
dynamic (the original afl way, which is slower)
@ -254,15 +252,6 @@ checks or alter some of the more exotic semantics of the tool:
useful if you can't change the defaults (e.g., no root access to the
system) and are OK with some performance loss.
- Setting AFL_NO_FORKSRV disables the forkserver optimization, reverting to
fork + execve() call for every tested input. This is useful mostly when
working with unruly libraries that create threads or do other crazy
things when initializing (before the instrumentation has a chance to run).
Note that this setting inhibits some of the user-friendly diagnostics
normally done when starting up the forkserver and causes a pretty
significant performance drop.
- AFL_EXIT_WHEN_DONE causes afl-fuzz to terminate when all existing paths
have been fuzzed and there were no new finds for a while. This would be
normally indicated by the cycle counter in the UI turning green. May be
@ -338,6 +327,13 @@ checks or alter some of the more exotic semantics of the tool:
- In QEMU mode (-Q), AFL_PATH will be searched for afl-qemu-trace.
- Setting AFL_CYCLE_SCHEDULES will switch to a different schedule every time
a cycle is finished.
- Setting AFL_EXPAND_HAVOC_NOW will start in the extended havoc mode that
includes costly mutations. afl-fuzz automatically enables this mode when
deemed useful otherwise.
- Setting AFL_PRELOAD causes AFL to set LD_PRELOAD for the target binary
without disrupting the afl-fuzz process itself. This is useful, among other
things, for bootstrapping libdislocator.so.
@ -365,6 +361,15 @@ checks or alter some of the more exotic semantics of the tool:
for an existing out folder, even if a different `-i` was provided.
Without this setting, afl-fuzz will refuse execution for a long-fuzzed out dir.
- Setting AFL_NO_FORKSRV disables the forkserver optimization, reverting to
fork + execve() call for every tested input. This is useful mostly when
working with unruly libraries that create threads or do other crazy
things when initializing (before the instrumentation has a chance to run).
Note that this setting inhibits some of the user-friendly diagnostics
normally done when starting up the forkserver and causes a pretty
significant performance drop.
- Outdated environment variables that are not supported anymore:
AFL_DEFER_FORKSRV
AFL_PERSISTENT

View File

@ -65,6 +65,7 @@ static char *afl_environment_variables[] = {
"AFL_LLVM_CMPLOG",
"AFL_LLVM_INSTRIM",
"AFL_LLVM_CTX",
"AFL_LLVM_DOCUMENT_IDS",
"AFL_LLVM_INSTRUMENT",
"AFL_LLVM_INSTRIM_LOOPHEAD",
"AFL_LLVM_LTO_AUTODICTIONARY",

View File

@ -140,6 +140,12 @@ to be dynamic - the original afl way, which is slower).
AFL_LLVM_MAP_DYNAMIC can be set so the shared memory address is dynamic (which
is safer but also slower).
## Document edge IDs
Setting `export AFL_LLVM_DOCUMENT_IDS=file` will document to a file which edge
ID was given to which function. This helps to identify functions with variable
bytes or which functions were touched by an input.
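
As a rough sketch of the reverse direction, the following collects all edge IDs that were assigned to one function, for example to check them against a coverage map and see whether an input touched that function. It assumes lines of the form `<function> <id>`, as written by the `fprintf` call in the LTO pass of this commit, and names without whitespace. The helper name `collect_function_edges` is hypothetical and not part of AFL++.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of AFL++.
   Collects the edge IDs assigned to function `func`.
   Assumes each line of the file is "<function> <id>".
   Stores at most `max` IDs in `ids` and returns the total number of
   matching lines (which may exceed `max`), or -1 if the file cannot
   be opened. */
int collect_function_edges(const char *path, const char *func,
                           unsigned int *ids, int max) {

  FILE *f = fopen(path, "r");
  if (!f) return -1;

  char         name[1024];
  unsigned int id;
  int          n = 0;

  while (fscanf(f, "%1023s %u", name, &id) == 2) {

    if (strcmp(name, func) == 0) {

      if (n < max) ids[n] = id;                /* keep only what fits */
      n++;

    }

  }

  fclose(f);
  return n;

}
```

A return value larger than `max` signals that the caller's buffer was too small and the call can be repeated with a bigger one.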
## Solving difficult targets
Some targets are difficult because the configure script does unusual stuff that

View File

@ -890,6 +890,8 @@ int main(int argc, char **argv, char **envp) {
"AFL_NO_BUILTIN: compile for use with libtokencap.so\n"
"AFL_PATH: path to instrumenting pass and runtime "
"(afl-llvm-rt.*o)\n"
"AFL_LLVM_DOCUMENT_IDS: document which edge ID was given to which "
"function (LTO only)\n"
"AFL_QUIET: suppress verbose output\n"
"AFL_USE_ASAN: activate address sanitizer\n"
"AFL_USE_CFISAN: activate control flow sanitizer\n"

View File

@ -103,6 +103,7 @@ bool AFLLTOPass::runOnModule(Module &M) {
std::vector<CallInst *> calls;
DenseMap<Value *, std::string *> valueMap;
char * ptr;
FILE * documentFile = NULL;
IntegerType *Int8Ty = IntegerType::getInt8Ty(C);
IntegerType *Int32Ty = IntegerType::getInt32Ty(C);
@ -120,6 +121,13 @@ bool AFLLTOPass::runOnModule(Module &M) {
be_quiet = 1;
if ((ptr = getenv("AFL_LLVM_DOCUMENT_IDS")) != NULL) {
if ((documentFile = fopen(ptr, "a")) == NULL)
WARNF("Cannot access document file %s", ptr);
}
if (getenv("AFL_LLVM_MAP_DYNAMIC")) map_addr = 0;
if (getenv("AFL_LLVM_INSTRIM_SKIPSINGLEBLOCK") ||
@ -579,6 +587,14 @@ bool AFLLTOPass::runOnModule(Module &M) {
}
if (documentFile) {
fprintf(documentFile, "%s %u\n",
origBB->getParent()->getName().str().c_str(),
afl_global_id);
}
BasicBlock::iterator IP = newBB->getFirstInsertionPt();
IRBuilder<> IRB(&(*IP));
@ -632,6 +648,8 @@ bool AFLLTOPass::runOnModule(Module &M) {
}
if (documentFile) fclose(documentFile);
}
// save highest location ID to global variable