add LTO AFL_LLVM_DOCUMENT_IDS feature

van Hauser
2020-07-31 17:53:01 +02:00
parent c101a3f5ab
commit 185f443659
7 changed files with 55 additions and 21 deletions

@@ -26,6 +26,8 @@ sending a mail to <afl-users+subscribe@googlegroups.com>.
  - LTO: autodictionary mode is a default
  - LTO: instrim instrumentation disabled, only classic support used
  as it is always better
+ - LTO: env var AFL_LLVM_DOCUMENT_IDS=file will document which edge ID
+ was given to which function during compilation
  - setting AFL_LLVM_LAF_SPLIT_FLOATS now activates
  AFL_LLVM_LAF_SPLIT_COMPARES
  - added honggfuzz mangle as a custom mutator in custom_mutators/honggfuzz

@@ -95,12 +95,13 @@ afl-clang-fast PCGUARD and afl-clang-lto LTO instrumentation!
  2. Second step: Find the responsible function.
- a) For LTO instrumented binaries just disassemble or decompile the target
- and look which edge is writing to that edge ID. Ghidra is a good tool
- for this: [https://ghidra-sre.org/](https://ghidra-sre.org/)
+ a) For LTO instrumented binaries this can be documented during compile
+ time, just set `export AFL_LLVM_DOCUMENT_IDS=/path/to/afile`.
+ This file will have one assigned edge ID and the corresponding function
+ per line.
- b) For PCGUARD instrumented binaries it is more difficult. Here you can
- either modify the __sanitizer_cov_trace_pc_guard function in
+ b) For PCGUARD instrumented binaries it is much more difficult. Here you
+ can either modify the __sanitizer_cov_trace_pc_guard function in
  llvm_mode/afl-llvm-rt.o.c to write a backtrace to a file if the ID in
  __afl_area_ptr[*guard] is one of the unstable edge IDs. Then recompile
  and reinstall llvm_mode and rebuild your target. Run the recompiled
@@ -121,4 +122,3 @@ afl-clang-fast PCGUARD and afl-clang-lto LTO instrumentation!
  4. Fourth step: recompile the target
  Recompile, fuzz it, be happy :)

@@ -121,18 +121,16 @@ Then there are a few specific features that are only available in llvm_mode:
  built if LLVM 11 or newer is used.
  - AFL_LLVM_INSTRUMENT=CFG will use Control Flow Graph instrumentation.
- (recommended)
+ (not recommended!)
- - AFL_LLVM_LTO_AUTODICTIONARY will generate a dictionary in the target
- binary based on string compare and memory compare functions.
- afl-fuzz will automatically get these transmitted when starting to
- fuzz.
  None of the following options are necessary to be used and are rather for
  manual use (which only ever the author of this LTO implementation will use).
  These are used if several seperated instrumentation are performed which
  are then later combined.
+ - AFL_LLVM_DOCUMENT_IDS=file will document to a file which edge ID was given
+ to which function. This helps to identify functions with variable bytes
+ or which functions were touched by an input.
  - AFL_LLVM_MAP_ADDR sets the fixed map address to a different address than
  the default 0x10000. A value of 0 or empty sets the map address to be
  dynamic (the original afl way, which is slower)
@@ -254,15 +252,6 @@ checks or alter some of the more exotic semantics of the tool:
  useful if you can't change the defaults (e.g., no root access to the
  system) and are OK with some performance loss.
- - Setting AFL_NO_FORKSRV disables the forkserver optimization, reverting to
- fork + execve() call for every tested input. This is useful mostly when
- working with unruly libraries that create threads or do other crazy
- things when initializing (before the instrumentation has a chance to run).
- Note that this setting inhibits some of the user-friendly diagnostics
- normally done when starting up the forkserver and causes a pretty
- significant performance drop.
  - AFL_EXIT_WHEN_DONE causes afl-fuzz to terminate when all existing paths
  have been fuzzed and there were no new finds for a while. This would be
  normally indicated by the cycle counter in the UI turning green. May be
@@ -338,6 +327,13 @@ checks or alter some of the more exotic semantics of the tool:
  - In QEMU mode (-Q), AFL_PATH will be searched for afl-qemu-trace.
+ - Setting AFL_CYCLE_SCHEDULES will switch to a different schedule everytime
+ a cycle is finished.
+ - Setting AFL_EXPAND_HAVOC_NOW will start in the extended havoc mode that
+ includes costly mutations. afl-fuzz automatically enables this mode when
+ deemed useful otherwise.
  - Setting AFL_PRELOAD causes AFL to set LD_PRELOAD for the target binary
  without disrupting the afl-fuzz process itself. This is useful, among other
  things, for bootstrapping libdislocator.so.
@@ -365,6 +361,15 @@ checks or alter some of the more exotic semantics of the tool:
  for an existing out folder, even if a different `-i` was provided.
  Without this setting, afl-fuzz will refuse execution for a long-fuzzed out dir.
+ - Setting AFL_NO_FORKSRV disables the forkserver optimization, reverting to
+ fork + execve() call for every tested input. This is useful mostly when
+ working with unruly libraries that create threads or do other crazy
+ things when initializing (before the instrumentation has a chance to run).
+ Note that this setting inhibits some of the user-friendly diagnostics
+ normally done when starting up the forkserver and causes a pretty
+ significant performance drop.
  - Outdated environment variables that are that not supported anymore:
  AFL_DEFER_FORKSRV
  AFL_PERSISTENT

@@ -65,6 +65,7 @@ static char *afl_environment_variables[] = {
  "AFL_LLVM_CMPLOG",
  "AFL_LLVM_INSTRIM",
  "AFL_LLVM_CTX",
+ "AFL_LLVM_DOCUMENT_IDS",
  "AFL_LLVM_INSTRUMENT",
  "AFL_LLVM_INSTRIM_LOOPHEAD",
  "AFL_LLVM_LTO_AUTODICTIONARY",

@@ -140,6 +140,12 @@ to be dynamic - the original afl way, which is slower).
  AFL_LLVM_MAP_DYNAMIC can be set so the shared memory address is dynamic (which
  is safer but also slower).
+ ## Document edge IDs
+ Setting `export AFL_LLVM_DOCUMENT_IDS=file` will document to a file which edge
+ ID was given to which function. This helps to identify functions with variable
+ bytes or which functions were touched by an input.
  ## Solving difficult targets
  Some targets are difficult because the configure script does unusual stuff that

@@ -890,6 +890,8 @@ int main(int argc, char **argv, char **envp) {
  "AFL_NO_BUILTIN: compile for use with libtokencap.so\n"
  "AFL_PATH: path to instrumenting pass and runtime "
  "(afl-llvm-rt.*o)\n"
+ "AFL_LLVM_DOCUMENT_IDS: document edge IDs given to which function (LTO "
+ "only)\n"
  "AFL_QUIET: suppress verbose output\n"
  "AFL_USE_ASAN: activate address sanitizer\n"
  "AFL_USE_CFISAN: activate control flow sanitizer\n"

@@ -103,6 +103,7 @@ bool AFLLTOPass::runOnModule(Module &M) {
  std::vector<CallInst *> calls;
  DenseMap<Value *, std::string *> valueMap;
  char * ptr;
+ FILE * documentFile = NULL;
  IntegerType *Int8Ty = IntegerType::getInt8Ty(C);
  IntegerType *Int32Ty = IntegerType::getInt32Ty(C);
@@ -120,6 +121,13 @@ bool AFLLTOPass::runOnModule(Module &M) {
  be_quiet = 1;
+ if ((ptr = getenv("AFL_LLVM_DOCUMENT_IDS")) != NULL) {
+ if ((documentFile = fopen(ptr, "a")) == NULL)
+ WARNF("Cannot access document file %s", ptr);
+ }
  if (getenv("AFL_LLVM_MAP_DYNAMIC")) map_addr = 0;
  if (getenv("AFL_LLVM_INSTRIM_SKIPSINGLEBLOCK") ||
@@ -579,6 +587,14 @@ bool AFLLTOPass::runOnModule(Module &M) {
  }
+ if (documentFile) {
+ fprintf(documentFile, "%s %u\n",
+ origBB->getParent()->getName().str().c_str(),
+ afl_global_id);
+ }
  BasicBlock::iterator IP = newBB->getFirstInsertionPt();
  IRBuilder<> IRB(&(*IP));
@@ -632,6 +648,8 @@ bool AFLLTOPass::runOnModule(Module &M) {
  }
+ if (documentFile) fclose(documentFile);
  }
  // save highest location ID to global variable