AFLplusplus/instrumentation/afl-gcc-cmptrs-pass.so.cc
van Hauser 602eceed8b push to stable (#1983)
* Output afl-clang-fast stuffs only if necessary (#1912)

* afl-cc header

* afl-cc common declarations

 - Add afl-cc-state.c
 - Strip includes, find_object, debug/be_quiet/have_*/callname setting from afl-cc.c
 - Use debugf_args in main
 - Modify execvp stuffs to fit new aflcc struct

* afl-cc show usage

* afl-cc mode selecting

1. compiler_mode by callname in argv[0]
2. compiler_mode by env "AFL_CC_COMPILER"
3. compiler_mode/instrument_mode by command line options "--afl-..."
4. instrument_mode/compiler_mode by various env vars including "AFL_LLVM_INSTRUMENT"
5. final checking steps
6. print "... - mode: %s-%s\n"
7. determine real argv[0] according to compiler_mode

* afl-cc macro defs

* afl-cc linking behaviors

* afl-cc fsanitize behaviors

* afl-cc misc

* afl-cc body update

* afl-cc all-in-one

formatted with custom-format.py

* nits

---------

Co-authored-by: vanhauser-thc <vh@thc.org>

* changelog

* update grammar mutator

* lto llvm 12+

* docs(custom_mutators): fix missing ':' (#1953)

* Fix broken LTO mode and response file support (#1948)

* Strip `-Wl,-no-undefined` during compilation (#1952)

Make the compiler wrapper strip `-Wl,-no-undefined` in addition to `-Wl,--no-undefined`.
Both versions of the flag are accepted by clang and are therefore used by build systems in the wild (e.g., samba will not build without this fix).

* Remove dead code in write_to_testcase (#1955)

The custom_mutators_count check in the if branch duplicates the if condition.
In the else branch custom_mutators_count == 0, so neither the custom_mutator_list iteration nor the sent check is needed.

Signed-off-by: Xeonacid <h.dwwwwww@gmail.com>

* update qemuafl

* WIP: Add ability to generate drcov trace using QEMU backend (#1956)

* Document new drcov QEMU plugin

* Add link to lightkeeper for QEMU drcov file loading

---------

Co-authored-by: Jean-Romain Garnier <jean-romain.garnier@airbus.com>

* code format

* changelog

* sleep on uid != 0 afl-system-config

* fix segv about skip_next, warn on unsupported cases of linking options (#1958)

* todos

* ensure afl-cc only allows available compiler modes

* update grammar mutator

* disable aslr on apple

* fix for arm64

* help selective instrumentation

* typos

* macos

* add compiler test script

* apple fixes

* bump nyx submodules (#1963)

* fix docs

* update changelog

* update grammar mutator

* improve compiler test script

* gcc asan workaround (#1966)

* fix github merge fuckup

* fix

* Fix afl-cc (#1968)

- Check if too many cmdline params here, each time before insert a new param.
 - Check if it is "-fsanitize=..." before we do sth.
 - Remove improper param_st transfer.

* Avoid adding llvmnative instrumentation when linking rust sanitizer runtime (#1969)

* Dynamic instrumentation filtering for LLVM native (#1971)

* Add two dynamic instrumentation filter methods to runtime

* Always use pc-table with native pcguard

* Add make_symbol_list.py and README

* changelog

* todos

* new forkserver check

* fix

* nyx test for CI

* improve nyx docs

* Fixes to afl-cc and documentation (#1974)

* Always compile with -ldl when building for CODE_COVERAGE

When building with CODE_COVERAGE, the afl runtime contains code that
calls `dladdr` which requires -ldl. Under most circumstances, clang
already adds this (e.g. when building with pc-table), but there are some
circumstances where it isn't added automatically.

* Add visibility declaration to __afl_connected

When building with hidden visibility, the use of __AFL_LOOP inside such
code can cause linker errors due to __afl_connected being declared
"hidden".

* Update docs to clarify that CODE_COVERAGE=1 is required for dynamic_covfilter

* nits

* nyx build script updates

* test error output

* debug ci

* debug ci

* Improve afl-cc (#1975)

* update response file support

 - full support of rsp file
 - fix some segv issues

* Improve afl-cc

 - remove dead code about allow/denylist options of sancov
 - missing `if (!aflcc->have_msan)`
 - add docs for each function
 - typo

* enable nyx

* debug ci

* debug ci

* debug ci

* debug ci

* debug ci

* debug ci

* debug ci

* debug ci

* fix ci

* clean test script

* NO_NYX

* NO_NYX

* fix ci

* debug ci

* fix ci

* finalize ci fix

* Enhancement on Deterministic stage (#1972)

* fuzzer: init commit based on aflpp 60dc37a8cf

* fuzzers: adding the skip variables and initialize

* log: profile the det/havoc finding

* log: add profile log output

* fuzzers: separate log/skipdet module

* fuzzers: add quick eff_map calc

* fuzzers: add skip_eff_map in fuzz_one

* fuzzers: mark whole input space in eff_map

* fuzzers: add undet bit threshold to skip some seeds

* fuzzers: fix one byte overflow

* fuzzers: fix overflow

* fix code format

* add havoc only again

* code format

* remove log to INTROSPECTION, rename skipdet module

* rename skipdet module

* remove log to stats

* clean redundant code

* code format

* remove redundant code format check

* remove redundant doc

* remove redundant objects

* clean files

* change -d to default skipdet

* disable deterministic when using CUSTOM_MUTATOR

* revert fix

* final touches for skipdet

* remove unused var

* remove redundant eff struct (#1977)

* update QEMU-Nyx submodule (#1978)

* update QEMU-Nyx submodule (#1980)

* Fix typo in AFL_NOOPT env variable in afl-cc help message (#1982)

* nits

* 2024 v4.10c release

* fixes

---------

Signed-off-by: Xeonacid <h.dwwwwww@gmail.com>
Co-authored-by: Sonic <50692172+SonicStark@users.noreply.github.com>
Co-authored-by: Xeonacid <h.dwwwwww@gmail.com>
Co-authored-by: Nils Bars <nils.bars@rub.de>
Co-authored-by: Jean-Romain Garnier <7504819+JRomainG@users.noreply.github.com>
Co-authored-by: Jean-Romain Garnier <jean-romain.garnier@airbus.com>
Co-authored-by: Sergej Schumilo <sergej@schumilo.de>
Co-authored-by: Christian Holler (:decoder) <choller@mozilla.com>
Co-authored-by: Han Zheng <35988108+kdsjZh@users.noreply.github.com>
Co-authored-by: Khaled Yakdan <yakdan@code-intelligence.com>
2024-02-03 10:55:51 +00:00


/* GCC plugin for cmplog routines instrumentation of code for AFL++.
Copyright 2014-2019 Free Software Foundation, Inc
Copyright 2015, 2016 Google Inc. All rights reserved.
Copyright 2019-2020 AFLplusplus Project. All rights reserved.
Copyright 2019-2024 AdaCore
Written by Alexandre Oliva <oliva@adacore.com>, based on the AFL++
LLVM CmpLog Routines pass by Andrea Fioraldi
<andreafioraldi@gmail.com>, and on the AFL GCC CmpLog pass.
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#include "afl-gcc-common.h"
/* This plugin, being under the same license as GCC, satisfies the
"GPL-compatible Software" definition in the GCC RUNTIME LIBRARY
EXCEPTION, so it can be part of an "Eligible" "Compilation
Process". */
int plugin_is_GPL_compatible = 1;
namespace {
static const struct pass_data afl_cmptrs_pass_data = {
.type = GIMPLE_PASS,
.name = "aflcmptrs",
.optinfo_flags = OPTGROUP_NONE,
.tv_id = TV_NONE,
.properties_required = 0,
.properties_provided = 0,
.properties_destroyed = 0,
.todo_flags_start = 0,
.todo_flags_finish = (TODO_update_ssa | TODO_cleanup_cfg | TODO_verify_il |
TODO_rebuild_cgraph_edges),
};
struct afl_cmptrs_pass : afl_base_pass {
afl_cmptrs_pass(bool quiet)
: afl_base_pass(quiet, /*debug=*/false, afl_cmptrs_pass_data),
tp8u(),
cmptrs_hooks() {
}
/* A pointer type to an unsigned 8-bit integral type. */
tree tp8u;
/* Declarations for the various cmptrs hook functions, allocated on
demand. [0] is for compares between any pointers, [1] is for
compares between G++ std::string, [2] is for compares between G++
std::string and GCC C strings, [3] and [4] are analogous to [1]
and [2] but for LLVM C++ strings. */
tree cmptrs_hooks[5];
tree cmptrs_hook(unsigned i) {
if (!tp8u) {
tree t8u;
if (BITS_PER_UNIT == 8)
t8u = unsigned_char_type_node;
else
t8u = build_nonstandard_integer_type(8, 1);
tp8u = build_pointer_type(t8u);
}
if (i < ARRAY_SIZE(cmptrs_hooks) && cmptrs_hooks[i])
return cmptrs_hooks[i];
const char *n = NULL;
switch (i) {
case 0:
n = "__cmplog_rtn_hook";
break;
case 1:
n = "__cmplog_rtn_gcc_stdstring_stdstring";
break;
case 2:
n = "__cmplog_rtn_gcc_stdstring_cstring";
break;
case 3:
n = "__cmplog_rtn_llvm_stdstring_stdstring";
break;
case 4:
n = "__cmplog_rtn_llvm_stdstring_cstring";
break;
default:
gcc_unreachable();
}
tree fnt = build_function_type_list(void_type_node, tp8u, tp8u, NULL_TREE);
tree t = cmptrs_hooks[i] = build_fn_decl(n, fnt);
/* Mark the newly-created decl as non-throwing, so that we can
insert calls within basic blocks. */
TREE_NOTHROW(t) = 1;
return t;
}
/* Return true if T is the char* type. */
bool is_c_string(tree t) {
return (POINTER_TYPE_P(t) &&
TYPE_MAIN_VARIANT(TREE_TYPE(t)) == char_type_node);
}
/* Return true if T is an indirect std::string type. The LLVM pass
tests portions of the mangled name of the callee. We could do
that in GCC too, but computing the mangled name may cause
template instantiations and get symbols defined that could
otherwise be considered unused. We check for compatible layout,
and class, namespace, and field names. These have been unchanged
since at least GCC 7, probably longer, up to GCC 11. Odds are
that, if it were to change in significant ways, mangling would
also change to flag the incompatibility, and we'd have to use a
different hook anyway. */
bool is_gxx_std_string(tree t) {
/* We need a pointer or reference type. */
if (!POINTER_TYPE_P(t)) return false;
/* Get to the pointed-to type. */
t = TREE_TYPE(t);
if (!t) return false;
/* Select the main variant, so that we can compare types by pointer. */
t = TYPE_MAIN_VARIANT(t);
/* We expect it to be a record type. */
if (TREE_CODE(t) != RECORD_TYPE) return false;
/* The type has an identifier. */
if (!TYPE_IDENTIFIER(t)) return false;
/* The type of the template is basic_string. */
if (strcmp(IDENTIFIER_POINTER(TYPE_IDENTIFIER(t)), "basic_string") != 0)
return false;
/* It's declared in an internal namespace named __cxx11. */
tree c = DECL_CONTEXT(TYPE_NAME(t));
if (!c || TREE_CODE(c) != NAMESPACE_DECL ||
strcmp(IDENTIFIER_POINTER(DECL_NAME(c)), "__cxx11") != 0)
return false;
/* The __cxx11 namespace is a member of namespace std. */
c = DECL_CONTEXT(c);
if (!c || TREE_CODE(c) != NAMESPACE_DECL ||
strcmp(IDENTIFIER_POINTER(DECL_NAME(c)), "std") != 0)
return false;
/* And the std namespace is in the global namespace. */
c = DECL_CONTEXT(c);
if (c && TREE_CODE(c) != TRANSLATION_UNIT_DECL) return false;
/* Check that the first nonstatic data member of the record type
is named _M_dataplus. */
for (c = TYPE_FIELDS(t); c; c = DECL_CHAIN(c))
if (TREE_CODE(c) == FIELD_DECL) break;
if (!c || !integer_zerop(DECL_FIELD_BIT_OFFSET(c)) ||
strcmp(IDENTIFIER_POINTER(DECL_NAME(c)), "_M_dataplus") != 0)
return false;
/* Check that the second nonstatic data member of the record type
is named _M_string_length. */
tree f2;
for (f2 = DECL_CHAIN(c); f2; f2 = DECL_CHAIN(f2))
if (TREE_CODE(f2) == FIELD_DECL) break;
if (!f2 /* No need to check this field's offset. */
|| strcmp(IDENTIFIER_POINTER(DECL_NAME(f2)), "_M_string_length") != 0)
return false;
/* The type of the second data member is size_t. */
if (!TREE_TYPE(f2) || TYPE_MAIN_VARIANT(TREE_TYPE(f2)) != size_type_node)
return false;
/* Now go back to the first data member. Its type should be a
record type named _Alloc_hider. */
c = TREE_TYPE(c);
if (!c || TREE_CODE(c) != RECORD_TYPE || !TYPE_IDENTIFIER(c) ||
strcmp(IDENTIFIER_POINTER(TYPE_IDENTIFIER(c)), "_Alloc_hider") != 0)
return false;
/* And its first data member is named _M_p. */
for (c = TYPE_FIELDS(c); c; c = DECL_CHAIN(c))
if (TREE_CODE(c) == FIELD_DECL) break;
if (!c || !integer_zerop(DECL_FIELD_BIT_OFFSET(c)) ||
strcmp(IDENTIFIER_POINTER(DECL_NAME(c)), "_M_p") != 0)
return false;
/* For the basic_string<char> type we're interested in, the type
of the data member is the C string type. */
if (!is_c_string(TREE_TYPE(c))) return false;
/* This might not be the real thing, but the bits that matter for
the hook are there. */
return true;
}
/* ??? This is not implemented. What would the point be of
recognizing LLVM's string type in GCC? */
bool is_llvm_std_string(tree t) {
return false;
}
virtual unsigned int execute(function *fn) {
if (!isInInstrumentList(fn)) return 0;
basic_block bb;
FOR_EACH_BB_FN(bb, fn) {
for (gimple_stmt_iterator gsi = gsi_after_labels(bb); !gsi_end_p(gsi);
gsi_next(&gsi)) {
gimple stmt = gsi_stmt(gsi);
/* We're only interested in GIMPLE_CALLs. */
if (gimple_code(stmt) != GIMPLE_CALL) continue;
if (gimple_call_num_args(stmt) < 2) continue;
gcall *c = as_a<gcall *>(stmt);
tree callee_type = gimple_call_fntype(c);
if (!callee_type || !TYPE_ARG_TYPES(callee_type) ||
!TREE_CHAIN(TYPE_ARG_TYPES(callee_type)))
continue;
tree arg_type[2] = {
TYPE_MAIN_VARIANT(TREE_VALUE(TYPE_ARG_TYPES(callee_type))),
TYPE_MAIN_VARIANT(
TREE_VALUE(TREE_CHAIN(TYPE_ARG_TYPES(callee_type))))};
tree fn = NULL;
/* Callee arglist starts with two GCC std::string arguments. */
if (arg_type[0] == arg_type[1] && is_gxx_std_string(arg_type[0]))
fn = cmptrs_hook(1);
/* Callee arglist starts with GCC std::string and C string. */
else if (is_gxx_std_string(arg_type[0]) && is_c_string(arg_type[1]))
fn = cmptrs_hook(2);
/* Callee arglist starts with two LLVM std::string arguments. */
else if (arg_type[0] == arg_type[1] && is_llvm_std_string(arg_type[0]))
fn = cmptrs_hook(3);
/* Callee arglist starts with LLVM std::string and C string. */
else if (is_llvm_std_string(arg_type[0]) && is_c_string(arg_type[1]))
fn = cmptrs_hook(4);
/* Callee arglist starts with two pointers to the same type,
and callee returns a value. */
else if (arg_type[0] == arg_type[1] && POINTER_TYPE_P(arg_type[0]) &&
(TYPE_MAIN_VARIANT(gimple_call_return_type(c)) !=
void_type_node))
fn = cmptrs_hook(0);
else
continue;
tree arg[2] = {gimple_call_arg(c, 0), gimple_call_arg(c, 1)};
for (unsigned i = 0; i < ARRAY_SIZE(arg); i++) {
tree c = fold_convert_loc(UNKNOWN_LOCATION, tp8u, arg[i]);
if (!is_gimple_val(c)) {
tree s = make_ssa_name(tp8u);
gimple g = gimple_build_assign(s, c);
c = s;
gsi_insert_before(&gsi, g, GSI_SAME_STMT);
}
arg[i] = c;
}
gimple call = gimple_build_call(fn, 2, arg[0], arg[1]);
gsi_insert_before(&gsi, call, GSI_SAME_STMT);
}
}
return 0;
}
};
static struct plugin_info afl_cmptrs_plugin = {
.version = "20220420",
.help = G_("AFL gcc cmptrs plugin\n\
\n\
Set AFL_QUIET in the environment to silence it.\n\
"),
};
} // namespace
/* This is the function GCC calls when loading a plugin. Initialize
and register further callbacks. */
int plugin_init(struct plugin_name_args *info,
struct plugin_gcc_version *version) {
if (!plugin_default_version_check(version, &gcc_version))
FATAL(G_("GCC and plugin have incompatible versions, expected GCC %s, "
"is %s"),
gcc_version.basever, version->basever);
/* Show a banner. */
bool quiet = false;
if (isatty(2) && !getenv("AFL_QUIET"))
SAYF(cCYA "afl-gcc-cmptrs-pass " cBRI VERSION cRST
" by <oliva@adacore.com>\n");
else
quiet = true;
const char *name = info->base_name;
register_callback(name, PLUGIN_INFO, NULL, &afl_cmptrs_plugin);
afl_cmptrs_pass *aflp = new afl_cmptrs_pass(quiet);
struct register_pass_info pass_info = {
.pass = aflp,
.reference_pass_name = "ssa",
.ref_pass_instance_number = 1,
.pos_op = PASS_POS_INSERT_AFTER,
};
register_callback(name, PLUGIN_PASS_MANAGER_SETUP, NULL, &pass_info);
return 0;
}
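
The hook selection in `execute()` above is an ordered chain of checks on the callee's first two declared parameter types. The sketch below models that ordering in standalone C++ so it can be checked without GCC's plugin headers. It is an illustration under stated assumptions, not the plugin's code: `ArgKind`, `CallShape`, `is_pointer` and `select_hook` are hypothetical names, and the enum stands in for what the pass derives from GCC `tree` nodes.

```cpp
// Illustrative model only: ArgKind, CallShape, is_pointer and select_hook
// are hypothetical names, not part of the plugin or of GCC.

// Simplified stand-in for what the pass learns from a parameter's type.
enum class ArgKind {
  Other,      // not a pointer or reference
  Pointer,    // some pointer that is neither char* nor std::string
  CString,    // char*, as accepted by is_c_string()
  GxxString,  // pointer or reference to libstdc++ std::string
};

struct CallShape {
  ArgKind first;       // first declared parameter
  ArgKind second;      // second declared parameter
  bool same_type;      // models arg_type[0] == arg_type[1]
  bool returns_value;  // callee's return type is not void
};

// Every kind except Other is a pointer or reference in this model.
constexpr bool is_pointer(ArgKind k) { return k != ArgKind::Other; }

// Returns the index that would be passed to cmptrs_hook(), or -1 when the
// call is left uninstrumented. Hooks 3 and 4 are omitted here because
// is_llvm_std_string() in the pass always returns false.
constexpr int select_hook(CallShape c) {
  if (c.same_type && c.first == ArgKind::GxxString) return 1;
  if (c.first == ArgKind::GxxString && c.second == ArgKind::CString) return 2;
  if (c.same_type && is_pointer(c.first) && c.returns_value) return 0;
  return -1;
}

// A strcmp-like callee: two char* parameters and a non-void result.
static_assert(select_hook({ArgKind::CString, ArgKind::CString, true, true}) == 0,
              "generic pointer hook");
// A callee taking two std::string references.
static_assert(select_hook({ArgKind::GxxString, ArgKind::GxxString, true, true}) == 1,
              "std::string vs std::string hook");
// A callee taking a std::string reference and a char*.
static_assert(select_hook({ArgKind::GxxString, ArgKind::CString, false, true}) == 2,
              "std::string vs C string hook");
// Two identical pointer parameters but a void result: skipped.
static_assert(select_hook({ArgKind::Pointer, ArgKind::Pointer, true, false}) == -1,
              "void callee is skipped");
```

Note that the order matters: the std::string checks come first and do not look at the return type, so only the generic pointer case requires the callee to return a value.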