Files
AFLplusplus/instrumentation/afl-llvm-dict2file.so.cc
van Hauser fff7f1c558 Dev (#1962)
* Pure Python (3.6) port of benchmark.sh as benchmark.py, no other changes

* Test standard and persistent modes separately

* Add support for multi-core benchmarking

* Save the results to a json file

* Allow config of all experiment params, average across runs

* Add start_time_of_run and total_execs_per_sec, cleanup for PR

* benchmark: cleanup, add results, add a data exploration notebook

* benchmark: add a README, lower default runs from 5 to 3

* benchmark: notebook wording tweaks

* copy 'detect_leaks=0' from ASAN to LSAN

fix for issue #1733, set "detect_leaks=0" when ASAN_OPTIONS contains it and LSAN_OPTIONS are not set.

* fix of fix: make sure ASAN_OPTIONS and LSAN_OPTIONS agree on leak detection

* fix lsan fix

* clang-format 16->17

* Add missing initialisation for havoc_queued during the custom mutator's stage.

* fix dictionary and cmin

* Use direct call to write to OpenBSD

The linker on OpenBSD emits a warning when linking this file:
warning: syscall() may go away, please rewrite code to use direct calls

* Fix possible doc inconsistency for custom mutator's queue_get function.

* update todos

* benchmark: Add support for COMPARISON file

* benchmark: show the number of cores used in COMPARISON

* benchmark: lower minimum Python version to 3.8

* benchmark: use afl's execs/s; increase CPU model width

* benchmark: disallow duplicate entries for the same CPU in COMPARISON

* Update benchmark.py

* fix inf in stats

* Fix benchmark.py

* missing closing parenthesis

* Update benchmark.py

* benchmark: remove self-calculation of execs/sec

* benchmark: update COMPARISON

* benchmark: Update Jupyter notebook and results file.

* benchmark: rename afl_execs_per_sec to execs_per_sec

* benchmark: update README

* update

* add benchmark

* nits

* add benchmarks

* Update unicornafl ref

* Pass correct Nyx ID when creating a Nyx runner

* Fix typo in docker pull command, add exampe to mount current dir as volume (#1914)

* mini fix

* add custom_post_run.c

* update afl-fuzz-run

* update python module

* format code

* update

* merge function

* changes

* code format

* improve cmplog

* nit

* nit

* fix

* fix

* Stop hardcoding the path /usr/local/lib/afl in afl-ld-lto.c and respect the configured PREFIX.

* Add benchmark for Raspberry Pi 5

* ryzen 5950 benchmark

* add missing raspery5

* comparison -> comparison.md

* removing options "-Wl,-rpath" "LLVM_LIBDIR" when using gcc

* fixing -Wl,-rpath=<LLVM_LIBDIR>

* nits

* fix

* afl-cc fixes

* nit

* add n_fuzz to ignore_timeouts

* fix

* Fix #1927

* in-depth blog post

* add AFL_FUZZER_LOOPCOUNT

* AFL_FUZZER_LOOPCOUNT

* fix 2 mutation bugs

* v4.09c release

* v4.10a init

* switch to explore powerschedule as default

* fix MUT_INSERTASCIINUM

* fix MUT_STRATEGY_ARRAY_SIZE

* fix bad fix for MUT_STRATEGY_ARRAY_SIZE

* remove afl-network-client on uninstall

* update nyx

* Improve binary-only related docs

* llvm 18 build fixes.

* code format

* Fix custom_send link

Add a leading '/' to walk in the repo root instead of current dir.

* Use ../ instead

* initial simple injection detection support

* inject docs

* fix for issue #1916, iLLVM crash in split-floatingpoint-compares

* LLVM 17 bug workaround

* finish injection implementation

* remove tmp todo

* update changelog

* forgot to add the injection pass

* Output afl-clang-fast stuffs only if necessary (#1912)

* afl-cc header

* afl-cc common declarations

 - Add afl-cc-state.c
 - Strip includes, find_object, debug/be_quiet/have_*/callname setting from afl-cc.c
 - Use debugf_args in main
 - Modify execvp stuffs to fit new aflcc struct

* afl-cc show usage

* afl-cc mode selecting

1. compiler_mode by callname in argv[0]
2. compiler_mode by env "AFL_CC_COMPILER"
3. compiler_mode/instrument_mode by command line options "--afl-..."
4. instrument_mode/compiler_mode by various env vars including "AFL_LLVM_INSTRUMENT"
5. final checking steps
6. print "... - mode: %s-%s\n"
7. determine real argv[0] according to compiler_mode

* afl-cc macro defs

* afl-cc linking behaviors

* afl-cc fsanitize behaviors

* afl-cc misc

* afl-cc body update

* afl-cc all-in-one

formated with custom-format.py

* nits

---------

Co-authored-by: vanhauser-thc <vh@thc.org>

* changelog

* update grammar mutator

* lto llvm 12+

* docs(custom_mutators): fix missing ':' (#1953)

* Fix broken LTO mode and response file support (#1948)

* Strip `-Wl,-no-undefined` during compilation (#1952)

Make the compiler wrapper stripping `-Wl,-no-undefined` in addition to `-Wl,--no-undefined`.
Both versions of the flag are accepted by clang and, therefore, used by building systems in the wild (e.g., samba will not build without this fix).

* Remove dead code in write_to_testcase (#1955)

The custom_mutators_count check in if case is duplicate with if condition.
The else case is custom_mutators_count == 0, neither custom_mutator_list iteration nor sent check needed.

Signed-off-by: Xeonacid <h.dwwwwww@gmail.com>

* update qemuafl

* WIP: Add ability to generate drcov trace using QEMU backend (#1956)

* Document new drcov QEMU plugin

* Add link to lightkeeper for QEMU drcov file loading

---------

Co-authored-by: Jean-Romain Garnier <jean-romain.garnier@airbus.com>

* code format

* changelog

* sleep on uid != 0 afl-system-config

* fix segv about skip_next, warn on unsupported cases of linking options (#1958)

* todos

* ensure afl-cc only allows available compiler modes

* update grammar mutator

* disable aslr on apple

* fix for arm64

* help selective instrumentation

* typos

* macos

* add compiler test script

* apple fixes

---------

Signed-off-by: Xeonacid <h.dwwwwww@gmail.com>
Co-authored-by: Chris Ball <chris@printf.net>
Co-authored-by: hexcoder <hexcoder-@users.noreply.github.com>
Co-authored-by: hexcoder- <heiko@hexco.de>
Co-authored-by: Manuel Carrasco <m.carrasco@imperial.ac.uk>
Co-authored-by: Jasper Lievisse Adriaanse <j@jasper.la>
Co-authored-by: ifyGecko <26214995+ifyGecko@users.noreply.github.com>
Co-authored-by: Dominik Maier <domenukk@gmail.com>
Co-authored-by: Christian Holler (:decoder) <choller@mozilla.com>
Co-authored-by: Carlo Maragno <ste.maragno@gmail.com>
Co-authored-by: yangzao <yangzaocn@outlook.com>
Co-authored-by: Romain Geissler <romain.geissler@amadeus.com>
Co-authored-by: Jakob Lell <jakob@jakoblell.com>
Co-authored-by: vincenzo MEZZELA <vincenzo.mezzela@amadeus.com>
Co-authored-by: Andrea Fioraldi <andreafioraldi@gmail.com>
Co-authored-by: Bet4 <0xbet4@gmail.com>
Co-authored-by: David Carlier <devnexen@gmail.com>
Co-authored-by: Xeonacid <h.dwwwwww@gmail.com>
Co-authored-by: Sonic <50692172+SonicStark@users.noreply.github.com>
Co-authored-by: Nils Bars <nils.bars@rub.de>
Co-authored-by: Jean-Romain Garnier <7504819+JRomainG@users.noreply.github.com>
Co-authored-by: Jean-Romain Garnier <jean-romain.garnier@airbus.com>
2024-01-20 10:19:46 +00:00

774 lines
22 KiB
C++

/*
american fuzzy lop++ - LLVM LTO instrumentation pass
----------------------------------------------------
Written by Marc Heuse <mh@mh-sec.de>
Copyright 2019-2023 AFLplusplus Project. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at:
https://www.apache.org/licenses/LICENSE-2.0
This library is plugged into LLVM when invoking clang through afl-clang-lto.
*/
#define AFL_LLVM_PASS
#include "config.h"
#include "debug.h"
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <ctype.h>
#include <list>
#include <string>
#include <fstream>
#include <set>
#include "llvm/Config/llvm-config.h"
#include "llvm/ADT/Statistic.h"
#include "llvm/IR/IRBuilder.h"
#if LLVM_VERSION_MAJOR >= 11 /* use new pass manager */
#include "llvm/Passes/PassPlugin.h"
#include "llvm/Passes/PassBuilder.h"
#include "llvm/IR/PassManager.h"
#else
#include "llvm/IR/LegacyPassManager.h"
#endif
#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/Module.h"
#include "llvm/IR/DebugInfo.h"
#include "llvm/IR/CFG.h"
#include "llvm/IR/Verifier.h"
#include "llvm/Support/Debug.h"
#include "llvm/Support/raw_ostream.h"
#if LLVM_VERSION_MAJOR < 17
#include "llvm/Transforms/IPO/PassManagerBuilder.h"
#endif
#include "llvm/Transforms/Utils/BasicBlockUtils.h"
#include "llvm/Analysis/LoopInfo.h"
#include "llvm/Analysis/ValueTracking.h"
#include "llvm/Pass.h"
#include "llvm/IR/Constants.h"
#include "afl-llvm-common.h"
#ifndef O_DSYNC
#define O_DSYNC O_SYNC
#endif
using namespace llvm;
namespace {
#if LLVM_VERSION_MAJOR >= 11 /* use new pass manager */
class AFLdict2filePass : public PassInfoMixin<AFLdict2filePass> {
std::ofstream of;
void dict2file(u8 *, u32);
public:
AFLdict2filePass() {
#else
class AFLdict2filePass : public ModulePass {
std::ofstream of;
void dict2file(u8 *, u32);
public:
static char ID;
AFLdict2filePass() : ModulePass(ID) {
#endif
if (getenv("AFL_DEBUG")) debug = 1;
}
#if LLVM_VERSION_MAJOR >= 11 /* use new pass manager */
PreservedAnalyses run(Module &M, ModuleAnalysisManager &MAM);
#else
bool runOnModule(Module &M) override;
#endif
};
} // namespace
#if LLVM_MAJOR >= 11
extern "C" ::llvm::PassPluginLibraryInfo LLVM_ATTRIBUTE_WEAK
llvmGetPassPluginInfo() {
return {LLVM_PLUGIN_API_VERSION, "AFLdict2filePass", "v0.1",
/* lambda to insert our pass into the pass pipeline. */
[](PassBuilder &PB) {
#if LLVM_VERSION_MAJOR <= 13
using OptimizationLevel = typename PassBuilder::OptimizationLevel;
#endif
PB.registerOptimizerLastEPCallback(
[](ModulePassManager &MPM, OptimizationLevel OL) {
MPM.addPass(AFLdict2filePass());
});
}};
}
#else
char AFLdict2filePass::ID = 0;
#endif
void AFLdict2filePass::dict2file(u8 *mem, u32 len) {
u32 i, j, binary = 0;
char line[MAX_AUTO_EXTRA * 8], tmp[8];
strcpy(line, "\"");
j = 1;
for (i = 0; i < len; i++) {
if (isprint(mem[i]) && mem[i] != '\\' && mem[i] != '"') {
line[j++] = mem[i];
} else {
if (i + 1 != len || mem[i] != 0 || binary || len == 4 || len == 8) {
line[j] = 0;
sprintf(tmp, "\\x%02x", (u8)mem[i]);
strcat(line, tmp);
j = strlen(line);
}
binary = 1;
}
}
line[j] = 0;
strcat(line, "\"\n");
of << line;
of.flush();
if (!be_quiet) fprintf(stderr, "Found dictionary token: %s", line);
}
#if LLVM_VERSION_MAJOR >= 11 /* use new pass manager */
PreservedAnalyses AFLdict2filePass::run(Module &M, ModuleAnalysisManager &MAM) {
#else
bool AFLdict2filePass::runOnModule(Module &M) {
#endif
DenseMap<Value *, std::string *> valueMap;
char *ptr;
int found = 0, handle_main = 1;
/* Show a banner */
setvbuf(stdout, NULL, _IONBF, 0);
if ((isatty(2) && !getenv("AFL_QUIET")) || debug) {
SAYF(cCYA "afl-llvm-dict2file" VERSION cRST
" by Marc \"vanHauser\" Heuse <mh@mh-sec.de>\n");
} else {
be_quiet = 1;
}
if (getenv("AFL_LLVM_DICT2FILE_NO_MAIN")) { handle_main = 0; }
scanForDangerousFunctions(&M);
ptr = getenv("AFL_LLVM_DICT2FILE");
if (!ptr) {
#if LLVM_VERSION_MAJOR >= 11 /* use new pass manager */
auto PA = PreservedAnalyses::all();
return PA;
#else
return true;
#endif
}
if (*ptr != '/')
FATAL("AFL_LLVM_DICT2FILE is not set to an absolute path: %s", ptr);
of.open(ptr, std::ofstream::out | std::ofstream::app);
if (!of.is_open()) PFATAL("Could not open/create %s.", ptr);
/* Instrument all the things! */
for (auto &F : M) {
if (!handle_main &&
(!F.getName().compare("main") || !F.getName().compare("_main"))) {
continue;
}
if (isIgnoreFunction(&F)) { continue; }
if (!isInInstrumentList(&F, MNAME) || !F.size()) { continue; }
/* Some implementation notes.
*
* We try to handle 3 cases:
* - memcmp("foo", arg, 3) <- literal string
* - static char globalvar[] = "foo";
* memcmp(globalvar, arg, 3) <- global variable
* - char localvar[] = "foo";
* memcmp(locallvar, arg, 3) <- local variable
*
* The local variable case is the hardest. We can only detect that
* case if there is no reassignment or change in the variable.
* And it might not work across llvm version.
* What we do is hooking the initializer function for local variables
* (llvm.memcpy.p0i8.p0i8.i64) and note the string and the assigned
* variable. And if that variable is then used in a compare function
* we use that noted string.
* This seems not to work for tokens that have a size <= 4 :-(
*
* - if the compared length is smaller than the string length we
* save the full string. This is likely better for fuzzing but
* might be wrong in a few cases depending on optimizers
*
* - not using StringRef because there is a bug in the llvm 11
* checkout I am using which sometimes points to wrong strings
*
* Over and out. Took me a full day. damn. mh/vh
*/
for (auto &BB : F) {
for (auto &IN : BB) {
CallInst *callInst = nullptr;
CmpInst *cmpInst = nullptr;
if ((cmpInst = dyn_cast<CmpInst>(&IN))) {
Value *op = cmpInst->getOperand(1);
ConstantInt *ilen = dyn_cast<ConstantInt>(op);
/* We skip > 64 bit integers. why? first because their value is
difficult to obtain, and second because clang does not support
literals > 64 bit (as of llvm 12) */
if (ilen && ilen->uge(0xffffffffffffffff) == false) {
u64 val2 = 0, val = ilen->getZExtValue();
u32 len = 0;
if (val > 0x10000 && val < 0xffffffff) len = 4;
if (val > 0x100000001 && val < 0xffffffffffffffff) len = 8;
if (len) {
auto c = cmpInst->getPredicate();
switch (c) {
case CmpInst::FCMP_OGT: // fall through
case CmpInst::FCMP_OLE: // fall through
case CmpInst::ICMP_SLE: // fall through
case CmpInst::ICMP_SGT:
// signed comparison and it is a negative constant
if ((len == 4 && (val & 80000000)) ||
(len == 8 && (val & 8000000000000000))) {
if ((val & 0xffff) != 1) val2 = val - 1;
break;
}
// fall through
case CmpInst::FCMP_UGT: // fall through
case CmpInst::FCMP_ULE: // fall through
case CmpInst::ICMP_UGT: // fall through
case CmpInst::ICMP_ULE:
if ((val & 0xffff) != 0xfffe) val2 = val + 1;
break;
case CmpInst::FCMP_OLT: // fall through
case CmpInst::FCMP_OGE: // fall through
case CmpInst::ICMP_SLT: // fall through
case CmpInst::ICMP_SGE:
// signed comparison and it is a negative constant
if ((len == 4 && (val & 80000000)) ||
(len == 8 && (val & 8000000000000000))) {
if ((val & 0xffff) != 1) val2 = val - 1;
break;
}
// fall through
case CmpInst::FCMP_ULT: // fall through
case CmpInst::FCMP_UGE: // fall through
case CmpInst::ICMP_ULT: // fall through
case CmpInst::ICMP_UGE:
if ((val & 0xffff) != 1) val2 = val - 1;
break;
default:
val2 = 0;
}
dict2file((u8 *)&val, len);
found++;
if (val2) {
dict2file((u8 *)&val2, len);
found++;
}
}
}
}
if ((callInst = dyn_cast<CallInst>(&IN))) {
bool isStrcmp = true;
bool isMemcmp = true;
bool isStrncmp = true;
bool isStrcasecmp = true;
bool isStrncasecmp = true;
bool isIntMemcpy = true;
bool isStdString = true;
bool isStrstr = true;
size_t optLen = 0;
Function *Callee = callInst->getCalledFunction();
if (!Callee) continue;
if (callInst->getCallingConv() != llvm::CallingConv::C) continue;
std::string FuncName = Callee->getName().str();
isStrcmp &=
(!FuncName.compare("strcmp") || !FuncName.compare("xmlStrcmp") ||
!FuncName.compare("xmlStrEqual") ||
!FuncName.compare("g_strcmp0") ||
!FuncName.compare("curl_strequal") ||
!FuncName.compare("strcsequal"));
isMemcmp &=
(!FuncName.compare("memcmp") || !FuncName.compare("bcmp") ||
!FuncName.compare("CRYPTO_memcmp") ||
!FuncName.compare("OPENSSL_memcmp") ||
!FuncName.compare("memcmp_const_time") ||
!FuncName.compare("memcmpct"));
isStrncmp &= (!FuncName.compare("strncmp") ||
!FuncName.compare("xmlStrncmp") ||
!FuncName.compare("curl_strnequal"));
isStrcasecmp &= (!FuncName.compare("strcasecmp") ||
!FuncName.compare("stricmp") ||
!FuncName.compare("ap_cstr_casecmp") ||
!FuncName.compare("OPENSSL_strcasecmp") ||
!FuncName.compare("xmlStrcasecmp") ||
!FuncName.compare("g_strcasecmp") ||
!FuncName.compare("g_ascii_strcasecmp") ||
!FuncName.compare("Curl_strcasecompare") ||
!FuncName.compare("Curl_safe_strcasecompare") ||
!FuncName.compare("cmsstrcasecmp"));
isStrncasecmp &= (!FuncName.compare("strncasecmp") ||
!FuncName.compare("strnicmp") ||
!FuncName.compare("ap_cstr_casecmpn") ||
!FuncName.compare("OPENSSL_strncasecmp") ||
!FuncName.compare("xmlStrncasecmp") ||
!FuncName.compare("g_ascii_strncasecmp") ||
!FuncName.compare("Curl_strncasecompare") ||
!FuncName.compare("g_strncasecmp"));
isStrstr &= (!FuncName.compare("strstr") ||
!FuncName.compare("g_strstr_len") ||
!FuncName.compare("ap_strcasestr") ||
!FuncName.compare("xmlStrstr") ||
!FuncName.compare("xmlStrcasestr") ||
!FuncName.compare("g_str_has_prefix") ||
!FuncName.compare("g_str_has_suffix"));
isIntMemcpy &= !FuncName.compare("llvm.memcpy.p0i8.p0i8.i64");
isStdString &= ((FuncName.find("basic_string") != std::string::npos &&
FuncName.find("compare") != std::string::npos) ||
(FuncName.find("basic_string") != std::string::npos &&
FuncName.find("find") != std::string::npos));
if (!isStrcmp && !isMemcmp && !isStrncmp && !isStrcasecmp &&
!isStrncasecmp && !isIntMemcpy && !isStdString && !isStrstr)
continue;
/* Verify the strcmp/memcmp/strncmp/strcasecmp/strncasecmp function
* prototype */
FunctionType *FT = Callee->getFunctionType();
isStrstr &=
FT->getNumParams() == 2 &&
FT->getParamType(0) == FT->getParamType(1) &&
FT->getParamType(0) ==
IntegerType::getInt8Ty(M.getContext())->getPointerTo(0);
isStrcmp &=
FT->getNumParams() == 2 && FT->getReturnType()->isIntegerTy(32) &&
FT->getParamType(0) == FT->getParamType(1) &&
FT->getParamType(0) ==
IntegerType::getInt8Ty(M.getContext())->getPointerTo(0);
isStrcasecmp &=
FT->getNumParams() == 2 && FT->getReturnType()->isIntegerTy(32) &&
FT->getParamType(0) == FT->getParamType(1) &&
FT->getParamType(0) ==
IntegerType::getInt8Ty(M.getContext())->getPointerTo(0);
isMemcmp &= FT->getNumParams() == 3 &&
FT->getReturnType()->isIntegerTy(32) &&
FT->getParamType(0)->isPointerTy() &&
FT->getParamType(1)->isPointerTy() &&
FT->getParamType(2)->isIntegerTy();
isStrncmp &=
FT->getNumParams() == 3 && FT->getReturnType()->isIntegerTy(32) &&
FT->getParamType(0) == FT->getParamType(1) &&
FT->getParamType(0) ==
IntegerType::getInt8Ty(M.getContext())->getPointerTo(0) &&
FT->getParamType(2)->isIntegerTy();
isStrncasecmp &=
FT->getNumParams() == 3 && FT->getReturnType()->isIntegerTy(32) &&
FT->getParamType(0) == FT->getParamType(1) &&
FT->getParamType(0) ==
IntegerType::getInt8Ty(M.getContext())->getPointerTo(0) &&
FT->getParamType(2)->isIntegerTy();
isStdString &= FT->getNumParams() >= 2 &&
FT->getParamType(0)->isPointerTy() &&
FT->getParamType(1)->isPointerTy();
if (!isStrcmp && !isMemcmp && !isStrncmp && !isStrcasecmp &&
!isStrncasecmp && !isIntMemcpy && !isStdString && !isStrstr)
continue;
/* is a str{n,}{case,}cmp/memcmp, check if we have
* str{case,}cmp(x, "const") or str{case,}cmp("const", x)
* strn{case,}cmp(x, "const", ..) or strn{case,}cmp("const", x, ..)
* memcmp(x, "const", ..) or memcmp("const", x, ..) */
Value *Str1P = callInst->getArgOperand(0),
*Str2P = callInst->getArgOperand(1);
std::string Str1, Str2;
StringRef TmpStr;
bool HasStr1;
getConstantStringInfo(Str1P, TmpStr);
if (isStrstr || TmpStr.empty()) {
HasStr1 = false;
} else {
HasStr1 = true;
Str1 = TmpStr.str();
}
bool HasStr2;
getConstantStringInfo(Str2P, TmpStr);
if (TmpStr.empty()) {
HasStr2 = false;
} else {
HasStr2 = true;
Str2 = TmpStr.str();
}
if (debug)
fprintf(stderr, "F:%s %p(%s)->\"%s\"(%s) %p(%s)->\"%s\"(%s)\n",
FuncName.c_str(), (void *)Str1P,
Str1P->getName().str().c_str(), Str1.c_str(),
HasStr1 == true ? "true" : "false", (void *)Str2P,
Str2P->getName().str().c_str(), Str2.c_str(),
HasStr2 == true ? "true" : "false");
// we handle the 2nd parameter first because of llvm memcpy
if (!HasStr2) {
auto *Ptr = dyn_cast<ConstantExpr>(Str2P);
if (Ptr && Ptr->getOpcode() == Instruction::GetElementPtr) {
if (auto *Var = dyn_cast<GlobalVariable>(Ptr->getOperand(0))) {
if (Var->hasInitializer()) {
if (auto *Array =
dyn_cast<ConstantDataArray>(Var->getInitializer())) {
HasStr2 = true;
Str2 = Array->getRawDataValues().str();
}
}
}
}
}
// for the internal memcpy routine we only care for the second
// parameter and are not reporting anything.
if (isIntMemcpy == true) {
if (HasStr2 == true) {
Value *op2 = callInst->getArgOperand(2);
ConstantInt *ilen = dyn_cast<ConstantInt>(op2);
if (ilen) {
uint64_t literalLength = Str2.length();
uint64_t optLength = ilen->getZExtValue();
if (optLength > literalLength + 1) {
optLength = Str2.length() + 1;
}
if (literalLength + 1 == optLength) {
Str2.append("\0", 1); // add null byte
}
}
valueMap[Str1P] = new std::string(Str2);
if (debug) {
fprintf(stderr, "Saved: %s for %p\n", Str2.c_str(),
(void *)Str1P);
}
continue;
}
continue;
}
// Neither a literal nor a global variable?
// maybe it is a local variable that we saved
if (!HasStr2) {
std::string *strng = valueMap[Str2P];
if (strng && !strng->empty()) {
Str2 = *strng;
HasStr2 = true;
if (debug)
fprintf(stderr, "Filled2: %s for %p\n", strng->c_str(),
(void *)Str2P);
}
}
if (!HasStr1) {
auto Ptr = dyn_cast<ConstantExpr>(Str1P);
if (Ptr && Ptr->getOpcode() == Instruction::GetElementPtr) {
if (auto *Var = dyn_cast<GlobalVariable>(Ptr->getOperand(0))) {
if (Var->hasInitializer()) {
if (auto *Array =
dyn_cast<ConstantDataArray>(Var->getInitializer())) {
HasStr1 = true;
Str1 = Array->getRawDataValues().str();
}
}
}
}
}
// Neither a literal nor a global variable?
// maybe it is a local variable that we saved
if (!HasStr1) {
std::string *strng = valueMap[Str1P];
if (strng && !strng->empty()) {
Str1 = *strng;
HasStr1 = true;
if (debug)
fprintf(stderr, "Filled1: %s for %p\n", strng->c_str(),
(void *)Str1P);
}
}
/* handle cases of one string is const, one string is variable */
if (!(HasStr1 ^ HasStr2)) continue;
std::string thestring;
if (HasStr1)
thestring = Str1;
else
thestring = Str2;
optLen = thestring.length();
if (optLen < 2 || (optLen == 2 && !thestring[1])) { continue; }
if (isMemcmp || isStrncmp || isStrncasecmp) {
Value *op2 = callInst->getArgOperand(2);
ConstantInt *ilen = dyn_cast<ConstantInt>(op2);
if (ilen) {
uint64_t literalLength = optLen;
optLen = ilen->getZExtValue();
if (optLen > thestring.length() + 1) {
optLen = thestring.length() + 1;
}
if (optLen < 2) { continue; }
if (literalLength + 1 == optLen) { // add null byte
thestring.append("\0", 1);
}
}
}
// add null byte if this is a string compare function and a null
// was not already added
if (!isMemcmp) {
/*
if (addedNull == false && thestring[optLen - 1] != '\0')
{
thestring.append("\0", 1); // add null byte
optLen++;
}
*/
if (!isStdString && thestring.find('\0', 0) != std::string::npos) {
// ensure we do not have garbage
size_t offset = thestring.find('\0', 0);
if (offset + 1 < optLen) optLen = offset + 1;
thestring = thestring.substr(0, optLen);
}
}
// we take the longer string, even if the compare was to a
// shorter part. Note that depending on the optimizer of the
// compiler this can be wrong, but it is more likely that this
// is helping the fuzzer
if (optLen != thestring.length()) optLen = thestring.length();
if (optLen > MAX_AUTO_EXTRA) optLen = MAX_AUTO_EXTRA;
if (optLen < 3) // too short? skip
continue;
ptr = (char *)thestring.c_str();
dict2file((u8 *)ptr, optLen);
found++;
}
}
}
}
of.close();
/* Say something nice. */
if (!be_quiet) {
if (!found)
OKF("No entries for a dictionary found.");
else
OKF("Wrote %d entries to the dictionary file.\n", found);
}
#if LLVM_VERSION_MAJOR >= 11 /* use new pass manager */
auto PA = PreservedAnalyses::all();
return PA;
#else
return true;
#endif
}
#if LLVM_VERSION_MAJOR < 11 /* use old pass manager */
static void registerAFLdict2filePass(const PassManagerBuilder &,
legacy::PassManagerBase &PM) {
PM.add(new AFLdict2filePass());
}
static RegisterPass<AFLdict2filePass> X("afl-dict2file",
"AFL++ dict2file instrumentation pass",
false, false);
static RegisterStandardPasses RegisterAFLdict2filePass(
PassManagerBuilder::EP_OptimizerLast, registerAFLdict2filePass);
static RegisterStandardPasses RegisterAFLdict2filePass0(
PassManagerBuilder::EP_EnabledOnOptLevel0, registerAFLdict2filePass);
#endif