AFLplusplus/instrumentation/afl-llvm-common.cc
van Hauser fff7f1c558 Dev (#1962)
2024-01-20 10:19:46 +00:00


#define AFL_LLVM_PASS
#include "config.h"
#include "debug.h"
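#include <stdio.h>
#include <stdlib.h>
#include <string.h>  // strdup(), strncmp()
#include <ctype.h>   // isspace()
#include <unistd.h>
#include <sys/time.h>
#include <fnmatch.h>
#include <list>
#include <string>
#include <fstream>
#include <cmath>
#include <algorithm>  // std::remove_if()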
#include <llvm/Support/raw_ostream.h>
#define IS_EXTERN extern
#include "afl-llvm-common.h"
using namespace llvm;
static std::list<std::string> allowListFiles;
static std::list<std::string> allowListFunctions;
static std::list<std::string> denyListFiles;
static std::list<std::string> denyListFunctions;
char *getBBName(const llvm::BasicBlock *BB) {
  static char *name;

  if (!BB->getName().empty()) {
    name = strdup(BB->getName().str().c_str());
    return name;
  }

  std::string Str;
  raw_string_ostream OS(Str);

#if LLVM_VERSION_MAJOR >= 4 || \
    (LLVM_VERSION_MAJOR == 3 && LLVM_VERSION_MINOR >= 7)
  BB->printAsOperand(OS, false);
#endif

  name = strdup(OS.str().c_str());
  return name;
}
/* Functions that we never instrument or analyze */
/* Note: this ignore check is also called in isInInstrumentList() */
bool isIgnoreFunction(const llvm::Function *F) {
  // Starting from "LLVMFuzzer" these are functions used in libfuzzer based
  // fuzzing campaign installations, e.g. oss-fuzz
  static constexpr const char *ignoreList[] = {
      "asan.",
      "llvm.",
      "sancov.",
      "__ubsan",
      "ign.",
      "__afl",
      "_fini",
      "__libc_",
      "__asan",
      "__msan",
      "__cmplog",
      "__sancov",
      "__san",
      "__cxx_",
      "__decide_deferred",
      "_GLOBAL",
      "_ZZN6__asan",
      "_ZZN6__lsan",
      "msan.",
      "LLVMFuzzerM",
      "LLVMFuzzerC",
      "LLVMFuzzerI",
      "maybe_duplicate_stderr",
      "discard_output",
      "close_stdout",
      "dup_and_close_stderr",
      "maybe_close_fd_mask",
      "ExecuteFilesOnyByOne"
  };

  // Prefix match: ignore every function whose name starts with an entry.
  for (auto const &ignoreListFunc : ignoreList) {
    if (F->getName().startswith(ignoreListFunc)) { return true; }
  }

  static constexpr const char *ignoreSubstringList[] = {
      "__asan", "__msan", "__ubsan", "__lsan", "__san",
      "__sanitize", "DebugCounter", "DwarfDebug", "DebugLoc"
  };

  // This check is very sensitive, we must be sure to not include patterns
  // that are part of user-written C++ functions like the ones including
  // std::string as parameter (see #1927) as the mangled type is inserted in the
  // mangled name of the user-written function
  for (auto const &ignoreListFunc : ignoreSubstringList) {
    // hexcoder: F->getName().contains() not available in llvm 3.8.0
    if (StringRef::npos != F->getName().find(ignoreListFunc)) { return true; }
  }

  return false;
}
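/* Illustrative sketch, not part of AFL++: isIgnoreFunction() above applies a
   prefix test and then a substring test to the function name. The standalone
   helpers below mirror that idea with std::string instead of llvm::StringRef.
   The helper names are assumptions made for this example only. The sketch
   also hints at why the substring list is called sensitive: a hypothetical
   substring pattern such as "__cxx" would hit the mangled name of a user
   function that merely takes a std::string parameter.

```cpp
#include <string>
#include <vector>

// Returns true if `name` starts with any of the given prefixes.
bool hasAnyPrefix(const std::string &name,
                  const std::vector<std::string> &prefixes) {
  for (const auto &p : prefixes) {
    // compare() clamps the range, so a name shorter than p never matches.
    if (name.compare(0, p.size(), p) == 0) { return true; }
  }
  return false;
}

// Returns true if any of the given patterns occurs anywhere in `name`.
bool hasAnySubstring(const std::string &name,
                     const std::vector<std::string> &patterns) {
  for (const auto &p : patterns) {
    if (name.find(p) != std::string::npos) { return true; }
  }
  return false;
}
```

   Example, assuming the Itanium mangling of "void handle(std::string)" with
   the libstdc++ C++11 ABI is
   "_Z6handleNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE":
   hasAnyPrefix(name, {"__cxx_"}) is false, while
   hasAnySubstring(name, {"__cxx"}) is true, so that function would be
   skipped by mistake if such a pattern were in the substring list.
*/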
void initInstrumentList() {
  char *allowlist = getenv("AFL_LLVM_ALLOWLIST");
  if (!allowlist) allowlist = getenv("AFL_LLVM_INSTRUMENT_FILE");
  if (!allowlist) allowlist = getenv("AFL_LLVM_WHITELIST");
  char *denylist = getenv("AFL_LLVM_DENYLIST");
  if (!denylist) denylist = getenv("AFL_LLVM_BLOCKLIST");

  if (allowlist && denylist)
    FATAL(
        "You can only specify either AFL_LLVM_ALLOWLIST or AFL_LLVM_DENYLIST "
        "but not both!");

  if (allowlist) {
    std::string line;
    std::ifstream fileStream;
    fileStream.open(allowlist);
    if (!fileStream) report_fatal_error("Unable to open AFL_LLVM_ALLOWLIST");
    getline(fileStream, line);

    while (fileStream) {
      int is_file = -1;
      std::size_t npos;
      std::string original_line = line;

      line.erase(std::remove_if(line.begin(), line.end(), ::isspace),
                 line.end());

      // remove # and following
      if ((npos = line.find("#")) != std::string::npos)
        line = line.substr(0, npos);

      if (line.compare(0, 4, "fun:") == 0) {
        is_file = 0;
        line = line.substr(4);
      } else if (line.compare(0, 9, "function:") == 0) {
        is_file = 0;
        line = line.substr(9);
      } else if (line.compare(0, 4, "src:") == 0) {
        is_file = 1;
        line = line.substr(4);
      } else if (line.compare(0, 7, "source:") == 0) {
        is_file = 1;
        line = line.substr(7);
      }

      if (line.find(":") != std::string::npos) {
        FATAL("invalid line in AFL_LLVM_ALLOWLIST: %s", original_line.c_str());
      }

      if (line.length() > 0) {
        // if the entry contains / or . it must be a file
        if (is_file == -1)
          if (line.find("/") != std::string::npos ||
              line.find(".") != std::string::npos)
            is_file = 1;
        // otherwise it is a function

        if (is_file == 1)
          allowListFiles.push_back(line);
        else
          allowListFunctions.push_back(line);
      }

      getline(fileStream, line);
    }

    if (debug)
      DEBUGF("loaded allowlist with %zu file and %zu function entries\n",
             allowListFiles.size() / 4, allowListFunctions.size() / 4);
  }

  if (denylist) {
    std::string line;
    std::ifstream fileStream;
    fileStream.open(denylist);
    if (!fileStream) report_fatal_error("Unable to open AFL_LLVM_DENYLIST");
    getline(fileStream, line);

    while (fileStream) {
      int is_file = -1;
      std::size_t npos;
      std::string original_line = line;

      line.erase(std::remove_if(line.begin(), line.end(), ::isspace),
                 line.end());

      // remove # and following
      if ((npos = line.find("#")) != std::string::npos)
        line = line.substr(0, npos);

      if (line.compare(0, 4, "fun:") == 0) {
        is_file = 0;
        line = line.substr(4);
      } else if (line.compare(0, 9, "function:") == 0) {
        is_file = 0;
        line = line.substr(9);
      } else if (line.compare(0, 4, "src:") == 0) {
        is_file = 1;
        line = line.substr(4);
      } else if (line.compare(0, 7, "source:") == 0) {
        is_file = 1;
        line = line.substr(7);
      }

      if (line.find(":") != std::string::npos) {
        FATAL("invalid line in AFL_LLVM_DENYLIST: %s", original_line.c_str());
      }

      if (line.length() > 0) {
        // if the entry contains / or . it must be a file
        if (is_file == -1)
          if (line.find("/") != std::string::npos ||
              line.find(".") != std::string::npos)
            is_file = 1;
        // otherwise it is a function

        if (is_file == 1)
          denyListFiles.push_back(line);
        else
          denyListFunctions.push_back(line);
      }

      getline(fileStream, line);
    }

    if (debug)
      DEBUGF("loaded denylist with %zu file and %zu function entries\n",
             denyListFiles.size() / 4, denyListFunctions.size() / 4);
  }
}
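/* Illustrative sketch, not part of AFL++: the per-line handling that
   initInstrumentList() performs for both list files, pulled out into one
   standalone function so the classification rules are easier to follow.
   EntryKind, ListEntry and parseListLine are names invented for this example.
   Where the real code calls FATAL() on a stray ':', the sketch returns
   EntryKind::Invalid instead.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

enum class EntryKind { None, File, Function, Invalid };

struct ListEntry {
  EntryKind kind;
  std::string value;
};

// Classifies one line of an allow/deny list file.
ListEntry parseListLine(std::string line) {
  // Drop all whitespace first, then cut off a "#" comment.
  line.erase(std::remove_if(
                 line.begin(), line.end(),
                 [](unsigned char c) { return std::isspace(c) != 0; }),
             line.end());
  const std::size_t hash = line.find('#');
  if (hash != std::string::npos) { line = line.substr(0, hash); }

  // -1 = undecided, 0 = function, 1 = file (same encoding as above).
  int is_file = -1;
  if (line.compare(0, 4, "fun:") == 0) {
    is_file = 0;
    line = line.substr(4);
  } else if (line.compare(0, 9, "function:") == 0) {
    is_file = 0;
    line = line.substr(9);
  } else if (line.compare(0, 4, "src:") == 0) {
    is_file = 1;
    line = line.substr(4);
  } else if (line.compare(0, 7, "source:") == 0) {
    is_file = 1;
    line = line.substr(7);
  }

  // Any remaining ':' means an unknown prefix or a malformed line.
  if (line.find(':') != std::string::npos) { return {EntryKind::Invalid, line}; }
  if (line.empty()) { return {EntryKind::None, line}; }

  // Without an explicit prefix, '/' or '.' marks the entry as a file.
  if (is_file == -1 && (line.find('/') != std::string::npos ||
                        line.find('.') != std::string::npos)) {
    is_file = 1;
  }

  return {is_file == 1 ? EntryKind::File : EntryKind::Function, line};
}
```

   For instance, "fun: main" yields the function entry "main", a path such as
   "src/foo.c" followed by a "#" comment yields the file entry "src/foo.c",
   and a line holding only a comment yields EntryKind::None.
*/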
void scanForDangerousFunctions(llvm::Module *M) {
  if (!M) return;

#if LLVM_VERSION_MAJOR >= 4 || \
    (LLVM_VERSION_MAJOR == 3 && LLVM_VERSION_MINOR >= 9)

  for (GlobalIFunc &IF : M->ifuncs()) {
    StringRef ifunc_name = IF.getName();
    Constant *r = IF.getResolver();
    if (r->getNumOperands() == 0) { continue; }
    StringRef r_name = cast<Function>(r->getOperand(0))->getName();
    if (!be_quiet)
      fprintf(stderr,
              "Note: Found an ifunc with name %s that points to resolver "
              "function %s, we will not instrument this, putting it into the "
              "block list.\n",
              ifunc_name.str().c_str(), r_name.str().c_str());
    denyListFunctions.push_back(r_name.str());
  }

  GlobalVariable *GV = M->getNamedGlobal("llvm.global_ctors");
  if (GV && !GV->isDeclaration() && !GV->hasLocalLinkage()) {
    ConstantArray *InitList = dyn_cast<ConstantArray>(GV->getInitializer());

    if (InitList) {
      for (unsigned i = 0, e = InitList->getNumOperands(); i != e; ++i) {
        if (ConstantStruct *CS =
                dyn_cast<ConstantStruct>(InitList->getOperand(i))) {
          if (CS->getNumOperands() >= 2) {
            if (CS->getOperand(1)->isNullValue())
              break;  // Found a null terminator, stop here.

            ConstantInt *CI = dyn_cast<ConstantInt>(CS->getOperand(0));
            int Priority = CI ? CI->getSExtValue() : 0;

            Constant *FP = CS->getOperand(1);
            if (ConstantExpr *CE = dyn_cast<ConstantExpr>(FP))
              if (CE->isCast()) FP = CE->getOperand(0);
            if (Function *F = dyn_cast<Function>(FP)) {
              if (!F->isDeclaration() &&
                  strncmp(F->getName().str().c_str(), "__afl", 5) != 0) {
                if (!be_quiet)
                  fprintf(stderr,
                          "Note: Found constructor function %s with prio "
                          "%u, we will not instrument this, putting it into a "
                          "block list.\n",
                          F->getName().str().c_str(), Priority);
                denyListFunctions.push_back(F->getName().str());
              }
            }
          }
        }
      }
    }
  }

#endif
}
static std::string getSourceName(llvm::Function *F) {
  // let's try to get the filename for the function
  auto bb = &F->getEntryBlock();
  BasicBlock::iterator IP = bb->getFirstInsertionPt();
  IRBuilder<> IRB(&(*IP));
  DebugLoc Loc = IP->getDebugLoc();

#if LLVM_VERSION_MAJOR >= 4 || \
    (LLVM_VERSION_MAJOR == 3 && LLVM_VERSION_MINOR >= 7)
  if (Loc) {
    StringRef instFilename;
    DILocation *cDILoc = dyn_cast<DILocation>(Loc.getAsMDNode());

    if (cDILoc) { instFilename = cDILoc->getFilename(); }

    if (instFilename.str().empty() && cDILoc) {
      /* If the original location is empty, try using the inlined location */
      DILocation *oDILoc = cDILoc->getInlinedAt();
      if (oDILoc) { instFilename = oDILoc->getFilename(); }
    }

    return instFilename.str();
  }
#else
  if (!Loc.isUnknown()) {
    DILocation cDILoc(Loc.getAsMDNode(F->getContext()));

    StringRef instFilename = cDILoc.getFilename();

    /* Continue only if we know where we actually are */
    return instFilename.str();
  }
#endif

  return std::string("");
}
bool isInInstrumentList(llvm::Function *F, std::string Filename) {
  bool return_default = true;

  // is this a function with code? If it is external we don't instrument it
  // anyway and it can't be in the instrument file list. Or if it is, it is
  // ignored.
  if (!F->size() || isIgnoreFunction(F)) return false;

  if (!denyListFiles.empty() || !denyListFunctions.empty()) {
    if (!denyListFunctions.empty()) {
      std::string instFunction = F->getName().str();

      for (std::list<std::string>::iterator it = denyListFunctions.begin();
           it != denyListFunctions.end(); ++it) {
        /* We don't check for function name equality here. Instead we
         * check that the actual function name ends in the name
         * specified in the list. We also allow UNIX-style pattern
         * matching */
        if (instFunction.length() >= it->length()) {
          if (fnmatch(("*" + *it).c_str(), instFunction.c_str(), 0) == 0) {
            if (debug)
              DEBUGF(
                  "Function %s is in the deny function list, not instrumenting "
                  "... \n",
                  instFunction.c_str());
            return false;
          }
        }
      }
    }

    if (!denyListFiles.empty()) {
      std::string source_file = getSourceName(F);

      if (source_file.empty()) { source_file = Filename; }

      if (!source_file.empty()) {
        for (std::list<std::string>::iterator it = denyListFiles.begin();
             it != denyListFiles.end(); ++it) {
          /* We don't check for filename equality here because
           * filenames might actually be full paths. Instead we
           * check that the actual filename ends in the filename
           * specified in the list. We also allow UNIX-style pattern
           * matching */
          if (source_file.length() >= it->length()) {
            if (fnmatch(("*" + *it).c_str(), source_file.c_str(), 0) == 0) {
              return false;
            }
          }
        }
      } else {
        // we could not find out the location. In this case the function
        // cannot match a denied file, so it will be instrumented
        if (!be_quiet)
          WARNF(
              "No debug information found for function %s, will be "
              "instrumented (recompile with -g -O[1-3] and use a modern llvm).",
              F->getName().str().c_str());
      }
    }
  }

  // if we do not have an instrument file list, return true
  if (!allowListFiles.empty() || !allowListFunctions.empty()) {
    return_default = false;

    if (!allowListFunctions.empty()) {
      std::string instFunction = F->getName().str();

      for (std::list<std::string>::iterator it = allowListFunctions.begin();
           it != allowListFunctions.end(); ++it) {
        /* We don't check for function name equality here. Instead we
         * check that the actual function name ends in the name
         * specified in the list. We also allow UNIX-style pattern
         * matching */
        if (instFunction.length() >= it->length()) {
          if (fnmatch(("*" + *it).c_str(), instFunction.c_str(), 0) == 0) {
            if (debug)
              DEBUGF(
                  "Function %s is in the allow function list, instrumenting "
                  "... \n",
                  instFunction.c_str());
            return true;
          }
        }
      }
    }

    if (!allowListFiles.empty()) {
      std::string source_file = getSourceName(F);

      if (source_file.empty()) { source_file = Filename; }

      if (!source_file.empty()) {
        for (std::list<std::string>::iterator it = allowListFiles.begin();
             it != allowListFiles.end(); ++it) {
          /* We don't check for filename equality here because
           * filenames might actually be full paths. Instead we
           * check that the actual filename ends in the filename
           * specified in the list. We also allow UNIX-style pattern
           * matching */
          if (source_file.length() >= it->length()) {
            if (fnmatch(("*" + *it).c_str(), source_file.c_str(), 0) == 0) {
              if (debug)
                DEBUGF(
                    "Function %s is in the allowlist (%s), instrumenting ... "
                    "\n",
                    F->getName().str().c_str(), source_file.c_str());
              return true;
            }
          }
        }
      } else {
        // we could not find out the location. In this case we say it is not
        // in the instrument file list
        if (!be_quiet)
          WARNF(
              "No debug information found for function %s, will not be "
              "instrumented (recompile with -g -O[1-3] and use a modern llvm).",
              F->getName().str().c_str());
        return false;
      }
    }
  }

  return return_default;
}
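/* Illustrative sketch, not part of AFL++: the matching rule used four times
   in isInInstrumentList() above, isolated as a standalone POSIX helper. The
   list entry is prefixed with "*" and handed to fnmatch() with flags 0, so
   the entry only has to match the tail of the name, and "*" may also span
   '/' characters. The helper name matchesListEntry is an assumption made for
   this example.

```cpp
#include <fnmatch.h>
#include <string>

// Returns true if the tail of `name` matches the glob pattern `entry`.
bool matchesListEntry(const std::string &entry, const std::string &name) {
  // Same cheap length filter as the loops above.
  if (name.length() < entry.length()) { return false; }

  // A leading "*" lets the entry match any suffix of the name.
  const std::string pattern = "*" + entry;
  return fnmatch(pattern.c_str(), name.c_str(), 0) == 0;
}
```

   With this rule the entry "foo.c" matches "/home/user/project/src/foo.c",
   but it also matches "src/barfoo.c", because only the suffix is compared.
   An entry that should hit exactly one file therefore needs enough leading
   path to be unambiguous.
*/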
// Calculate the expected number of collisions that would occur if all
// location IDs were assigned randomly (like normal afl/AFL++).
// This uses the "balls into bins" model: edges are the balls, map entries
// are the bins.
unsigned long long int calculateCollisions(uint32_t edges) {
  double bins = MAP_SIZE;
  double balls = edges;
  double step1 = 1 - (1 / bins);      // chance that one ball misses a given bin
  double step2 = pow(step1, balls);   // chance that a given bin stays empty
  double step3 = bins * step2;        // expected number of empty bins
  double step4 = round(step3);

  unsigned long long int empty = step4;

  // edges minus the number of occupied bins
  unsigned long long int collisions = edges - (MAP_SIZE - empty);
  return collisions;
}
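/* Illustrative sketch, not part of AFL++: the same estimate as
   calculateCollisions() above, written as a standalone function so it can be
   tried without the AFL++ headers. The expected number of empty bins is
   bins * (1 - 1 / bins)^balls, and the collisions are the edges that did not
   get a bin of their own. The constant kMapSize = 65536 and the function name
   expectedCollisions are assumptions made for this example; the real code
   takes MAP_SIZE from config.h.

```cpp
#include <cmath>
#include <cstdint>

// Assumed map size for this sketch: 2^16 entries.
constexpr std::uint64_t kMapSize = 1ULL << 16;

// Expected number of colliding edges for `edges` randomly assigned IDs.
std::uint64_t expectedCollisions(std::uint32_t edges) {
  const double bins = static_cast<double>(kMapSize);
  const double balls = static_cast<double>(edges);

  // Expected number of bins that no ball lands in, rounded to an integer.
  const double empty = std::round(bins * std::pow(1.0 - 1.0 / bins, balls));
  const std::uint64_t emptyBins = static_cast<std::uint64_t>(empty);

  // Occupied bins = kMapSize - emptyBins; every further edge shares a bin.
  return edges - (kMapSize - emptyBins);
}
```

   Under these assumptions a single edge gives 0 expected collisions, and the
   estimate grows faster than linearly as the number of edges approaches the
   map size.
*/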