Compare commits

...

131 Commits
2.52c ... 2.53c

Author SHA1 Message Date
f97409dd2d v2.53c 2019-07-26 14:19:04 +02:00
c384367f17 fix readme.md makefile change 2019-07-26 10:45:26 +02:00
eea1c6606c incorporated most of the 2.53b changes 2019-07-26 10:39:14 +02:00
8f4f45c524 incorporated most of the 2.53b changes 2019-07-26 10:35:58 +02:00
db2392b778 cleanup 2019-07-25 18:48:28 +02:00
ce842648ae afl_trace_pc fix 2019-07-25 11:18:14 +02:00
ad1c4bf202 squash typos 2019-07-25 10:34:03 +02:00
5969b7cdbc filenames should not have spaces 2019-07-25 10:19:17 +02:00
6013d20aef unicorn build workaround 2019-07-25 09:07:35 +02:00
dfb3bd8e33 documentation update 2019-07-25 09:00:22 +02:00
d6c2db9620 Merge pull request #27 from domenukk/afl-unicorn
Add AFL Unicorn
2019-07-25 08:47:22 +02:00
00dc8a0ad5 Added AFL-Unicorn mode 2019-07-25 02:26:51 +02:00
9246f21f2a remove the unreadable AFLFast schedules tabel in markdown from README 2019-07-24 15:54:05 +02:00
2237319ebb qemu mode TODO update 2019-07-24 15:35:52 +02:00
6fa95008bc fix root check 2019-07-24 12:55:37 +02:00
3789a56225 updated changelog and readme 2019-07-23 17:04:04 +02:00
0a2d9af2a1 doc update 2019-07-21 23:58:40 +02:00
2b7a627181 removed gcc_plugin from master 2019-07-21 20:25:06 +02:00
f697752b52 moved gcc_plugin to a branch, it is nowhere near "ok" 2019-07-21 20:24:40 +02:00
914426d887 Merge pull request #26 from vanhauser-thc/qemu-compcov
Qemu CompCov
2019-07-20 14:23:07 +02:00
302e717790 better rely on compiler for size information 2019-07-20 15:04:07 +02:00
27928fbc94 fix conflict 2019-07-20 14:10:19 +02:00
253056b932 more speed to libcompcov using real libc functions 2019-07-20 14:08:45 +02:00
1d1d0d9b6f warn on calling the target binary without an explicit path 2019-07-20 13:15:41 +02:00
c7887abb64 added test and debug 2019-07-20 13:12:19 +02:00
47525f0dd6 fix #24 checking for validity of the requested block address 2019-07-20 13:09:45 +02:00
5ac5d91c6b CompCov TODO 2019-07-20 12:00:31 +02:00
322b5a736b updated docs and crash issues with gcc_plugin 2019-07-20 09:06:47 +02:00
907c054142 this closes #23 2019-07-19 17:56:52 +02:00
7b6d51a9d0 libcompcov for QEMU 2019-07-19 17:47:53 +02:00
d3eba93c7d ops typo 2019-07-19 17:46:24 +02:00
866e22355c show selected core and code cleanup 2019-07-19 12:08:02 +02:00
fe084b9866 several documentation fixes 2019-07-19 11:17:30 +02:00
5f7e3025d9 enable AFL_QUIET again 2019-07-19 11:10:10 +02:00
13b8bc1a89 add root check 2019-07-19 11:08:23 +02:00
054cec8a5d fix typos 2019-07-19 08:35:29 +02:00
8dc326e1f1 env variables update 2019-07-19 01:13:14 +02:00
81dd1aea82 experimental x86 support for compcov in QEMU 2019-07-19 00:55:41 +02:00
5b2cb426be code cleanup and documented secret cmdline option 2019-07-18 12:54:19 +02:00
5fa19f2801 cpu scaling updated for newer kernels 2019-07-18 10:17:50 +02:00
4f5acb8f52 test case files with time information 2019-07-17 16:39:35 +02:00
cf71c53559 Merge pull request #17 from dkasak/patch-1
Fix typo: add missing underscore
2019-07-16 21:06:04 +02:00
80c98f4d0c added readme 2019-07-16 21:05:50 +02:00
73f8ab3aa8 Fix typo: add missing underscore 2019-07-16 18:13:54 +00:00
da372335bf updated .gitignore 2019-07-16 11:14:39 +02:00
0af9f664db env doc update for gcc_plugin 2019-07-16 08:52:13 +02:00
995eb0cd79 deprecate afl-gcc 2019-07-16 08:51:00 +02:00
9f07965876 added TODO file 2019-07-16 08:42:15 +02:00
8a4cdd56d4 added gcc_plugin 2019-07-16 08:34:17 +02:00
3252523823 fixing commit fuckup 2019-07-15 11:22:54 +02:00
2628f9f61b fix crash with case insensitive compare functions (str(n)casecmp()) 2019-07-15 08:54:12 +02:00
0d217e15d5 fix merge artefact (check_binary) 2019-07-14 22:56:27 +02:00
520c85c7b7 updated README 2019-07-14 20:12:46 +02:00
82d70e0720 fix 2019-07-14 20:10:43 +02:00
054976c390 Merge pull request #14 from vanhauser-thc/shared_memory_mmap_refactor
Shared memory mmap refactor
2019-07-14 20:04:26 +02:00
da8e03e18a Merge branch 'master' into shared_memory_mmap_refactor 2019-07-14 20:02:20 +02:00
4a80dbdd10 Merge pull request #13 from vanhauser-thc/instrim
Instrim imported
2019-07-14 19:58:04 +02:00
013a1731d5 set instrim as default and updated documentation 2019-07-14 19:48:28 +02:00
e664024853 whitelist features works now 2019-07-14 10:50:13 +02:00
495f3b9a68 notZero added and first attempt at whitelist 2019-07-14 10:23:54 +02:00
98a6963911 make fix 2019-07-14 10:05:46 +02:00
c204efaaab Compile fix for LLVM 3.8.0 2019-07-13 23:12:36 +02:00
0f13137616 compiles now with LLVM 8.0 2019-07-13 23:40:34 +02:00
864056fcaa initial commit 2019-07-13 11:08:13 +02:00
5c0830f628 fix detection of glibc 2019-07-13 09:39:51 +02:00
e96a2dd681 fix Makefile 2019-07-13 09:39:51 +02:00
f45332e1ab portability fix: getcwd(NULL, 0) is a non-POSIX glibc extension. Refactor
detect_file_args() in a separate file in order to avoid multiple copies.
2019-07-13 09:39:51 +02:00
5508e30854 -E fix 2019-07-12 20:32:07 +02:00
3e14d63a0a update doc 2019-07-12 19:16:59 +02:00
eddfddccb2 -E option and docu update 2019-07-12 18:17:32 +02:00
c067ef0216 qemu was not make clean'ed 2019-07-12 14:00:59 +02:00
f7d9019b8c Readme updates 2019-07-10 16:14:30 +02:00
519678192f Merge pull request #12 from vanhauser-thc/MOpt
Mopt
2019-07-10 14:20:06 +02:00
c3083a77d4 updated references 2019-07-10 14:19:00 +02:00
891ab3951b fix 2019-07-08 17:12:07 +02:00
11251c77ca fix 2019-07-08 11:42:21 +02:00
71e22d9263 updated docs 2019-07-08 11:39:06 +02:00
3095d96715 added doc 2019-07-08 11:37:10 +02:00
198946231c imported MOpt and worked around the collisions with other patches 2019-07-08 11:36:52 +02:00
d9c70c7b8c add explicit llvm library for OpenBSD 2019-07-05 20:33:36 +02:00
7ae61e7393 fix redundant messages (appearing again) 2019-07-05 20:09:42 +02:00
984ae35948 increased portability, replace sed with tr (*BSD)
sanity check versions from clang and llvm, adjust clang path if needed.
2019-07-05 20:02:40 +02:00
0d6cddda4d comment never_zero for afl-as 2019-07-05 13:29:26 +02:00
18e031d346 Merge pull request #11 from vanhauser-thc/neverZero_counters
Never zero counters added
2019-07-05 13:27:53 +02:00
c0332ad98b Merge branch 'master' into neverZero_counters 2019-07-05 13:27:38 +02:00
7f6aaa5314 final touches 2019-07-05 11:28:08 +02:00
9199967022 this is the best solution IMHO 2019-07-04 11:19:18 +02:00
04c92c8470 notzero for afl-gcc 2019-07-03 19:10:48 +02:00
00b22e37df select implementations 2019-07-03 16:36:31 +02:00
aaa810c64a add -lrt with afl-gcc/clang automatically in mmap mode 2019-07-03 12:11:02 +02:00
b57b2073ac LAF_... -> AFL_LLVM_LAF_... 2019-07-03 12:05:58 +02:00
771a9e9cd2 more python module examples 2019-07-03 04:22:53 +02:00
cc48f4499a add librt under NetBSD 2019-07-02 20:20:07 +02:00
3e2f2ddb56 remove redundant header 2019-07-02 20:18:21 +02:00
0ca6df6f09 typo fix 2019-07-02 11:51:09 +02:00
37a379f959 Makefile magic for llvm_mode 2019-07-02 00:26:27 +02:00
625d6c2ed7 fix SHM mmap flag setting 2019-07-01 20:19:30 +02:00
134d2bd766 various fixes 2019-07-01 11:46:45 +02:00
9eb2cd7327 various fixes 2019-07-01 11:46:14 +02:00
c0347c80b2 Merge pull request #7 from bpfoley/master
Fix some github URL typos in docs
2019-06-30 17:20:47 +02:00
d9ff84e39e Refactor to use an alternative method for shared memory.
If USEMMAP is defined, the shared memory segment is created/attached etc.
now by shm_open() and mmap().
This API is hopefully more often available (at least for iOS).

In order to reduce code duplication I have added new files
sharedmem.[ch] which now encapsulate the shared memory method.

This is based on the work of Proteas to support iOS fuzzing (thanks).
866af8ad1c

Currently this is in an experimental status yet. Please report
whether this variant works on 32 and 64 bit and on the supported platforms.

This branch enables USEMMAP and has been tested on Linux.
There is no auto detection for the mmap API yet.
2019-06-30 10:37:14 +02:00
7256e6d203 Fix some github URL typos in docs 2019-06-29 14:31:46 -07:00
c083fd895c added .gitignore 2019-06-27 23:27:13 +02:00
0cd7a3d216 afl-tmin forkserver patch 2019-06-27 18:02:29 +02:00
aa4fc44a80 2 different implementations 2019-06-27 15:43:51 +02:00
f07d49e877 more power 2019-06-27 11:48:08 +02:00
45be91ff48 experimental implementation of counters that skip zero on overflow.
Enable with AFL_NZERO_COUNTS=1 during compilation of target.
2019-06-25 22:03:59 +02:00
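The idea behind the counters in this commit can be shown with a tiny sketch. The helper name is invented for illustration; the real feature patches the emitted instrumentation itself rather than calling a C function.

```c
#include <stdint.h>

/* Illustrative "never zero" hit counter: an 8-bit counter that skips 0
 * when it wraps, so a counter that overflowed is not mistaken for an
 * edge that was never hit. */
uint8_t never_zero_inc(uint8_t counter) {
  counter++;
  counter += (counter == 0);   /* add the carry back in on overflow */
  return counter;
}
```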
c657b3d072 updates patches file 2019-06-25 12:11:34 +02:00
5dfb3ded17 improved documentation 2019-06-25 12:08:50 +02:00
0104e99caa llvm_mode whitelist (partial instrumentation) support added 2019-06-25 12:00:12 +02:00
e16593c9b1 doc update 2019-06-23 19:38:57 +02:00
1cc69df0f4 display power schedule in status screen 2019-06-23 18:37:02 +02:00
2db576f52b better power schedule documentation 2019-06-23 11:19:51 +02:00
421edce623 friendly power schedule names 2019-06-22 19:03:15 +02:00
549b83504f added -s fixed_seed feature 2019-06-20 13:51:39 +02:00
d10ebd1a68 python mutator examples added 2019-06-20 12:22:46 +02:00
4e3d921f1a updated PATCHES file 2019-06-20 11:54:53 +02:00
1d6e1ec61c Python 2.7 mutator module support added 2019-06-19 19:45:05 +02:00
db3cc11195 minor documentation update 2019-06-17 18:47:13 +02:00
d64efa6a68 Merge pull request #6 from pbst/patch
Fix crashes
2019-06-17 15:16:48 +02:00
7b5905bda6 llvm_mode/split-switches-pass: add checks
Add extra check to allow early exist in trivial cases that would
sometimes lead to crashes.
2019-06-17 04:18:55 +02:00
f5ba5ffe80 fix zero terminated string issue
In C "strings" are zero terminated. Functions like
strcmp/strncmp/memcmp/... work on them. We have to be careful to not
ignore the last byte.
2019-06-13 14:42:10 +00:00
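The pitfall this commit fixes can be illustrated with a small sketch: if a split comparison against a known literal stops before the terminating NUL, longer inputs with the same prefix wrongly compare equal. A hypothetical byte-wise split of `strcmp(s, "foo") == 0` (invented helper, not the actual pass output) must therefore include the fourth byte:

```c
/* Hypothetical byte-wise split of strcmp(s, "foo") == 0, as a
 * laf-intel-style pass might emit it. The s[3] == '\0' check is the
 * "last byte" the commit above is about: without it, "foo!" would
 * wrongly match. Short-circuiting keeps shorter inputs safe. */
int split_eq_foo(const char *s) {
  return s[0] == 'f' && s[1] == 'o' && s[2] == 'o' && s[3] == '\0';
}
```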
0113c4f834 Merge branch 'master' of https://github.com/vanhauser-thc/AFLplusplus 2019-06-12 17:21:26 +02:00
1c2ed83960 bugfixes from pbst for laf-intel transformations 2019-06-12 17:20:25 +02:00
7a236b11b8 version bumb for github dev version 2019-06-11 11:32:11 +02:00
a0328bbcf8 Merge pull request #5 from practicalswift/remove-references-to-cla
Remove references to the Google CLA process
2019-06-07 21:33:47 +02:00
46e58b434a Merge pull request #4 from practicalswift/typo
Fix typos
2019-06-07 21:32:27 +02:00
7955f8a7cb Remove references to Google CLA process 2019-06-07 18:10:25 +02:00
263fd37590 Fix typos 2019-06-07 17:56:29 +02:00
ba37bf13d6 fix gui misalignment in show_stats() 2019-06-05 11:50:04 +02:00
b59d71546b improve afl_maybe_log tcg call generation + merge elfload diffs 2019-06-05 11:48:36 +02:00
91 changed files with 11514 additions and 911 deletions

23
.gitignore vendored Normal file

@@ -0,0 +1,23 @@
*.o
*.so
.gitignore
afl-analyze
afl-as
afl-clang
afl-clang++
afl-clang-fast
afl-clang-fast++
afl-fuzz
afl-g++
afl-gcc
afl-gcc-fast
afl-g++-fast
afl-gotcpu
afl-qemu-trace
afl-showmap
afl-tmin
as
qemu_mode/qemu-3.1.0
qemu_mode/qemu-3.1.0.tar.xz
unicorn_mode/unicorn
unicorn_mode/unicorn-*

11
.travis.yml Normal file

@@ -0,0 +1,11 @@
language: c
env:
- AFL_I_DONT_CARE_ABOUT_MISSING_CRASHES=1 AFL_NO_UI=1
script:
- make
- ./afl-gcc ./test-instr.c -o test-instr
- mkdir seeds; mkdir out
- echo "" > seeds/nil_seed
- timeout --preserve-status 5s ./afl-fuzz -i seeds -o out/ -- ./test-instr

135
Makefile

@@ -13,6 +13,9 @@
# http://www.apache.org/licenses/LICENSE-2.0
#
# For Heiko:
#TEST_MMAP=1
PROGNAME = afl
VERSION = $(shell grep '^\#define VERSION ' config.h | cut -d '"' -f2)
@@ -32,6 +35,8 @@ CFLAGS += -Wall -D_FORTIFY_SOURCE=2 -g -Wno-pointer-sign \
-DAFL_PATH=\"$(HELPER_PATH)\" -DDOC_PATH=\"$(DOC_PATH)\" \
-DBIN_PATH=\"$(BIN_PATH)\"
PYTHON_INCLUDE ?= /usr/include/python2.7
ifneq "$(filter Linux GNU%,$(shell uname))" ""
LDFLAGS += -ldl
endif
@@ -44,15 +49,40 @@ endif
COMM_HDR = alloc-inl.h config.h debug.h types.h
all: test_x86 $(PROGS) afl-as test_build all_done
ifeq "$(shell echo '\#include <Python.h>@int main() {return 0; }' | tr @ '\n' | $(CC) -x c - -o .test -I$(PYTHON_INCLUDE) -lpython2.7 2>/dev/null && echo 1 || echo 0 )" "1"
PYTHON_OK=1
PYFLAGS=-DUSE_PYTHON -I$(PYTHON_INCLUDE) -lpython2.7
else
PYTHON_OK=0
PYFLAGS=
endif
ifeq "$(shell echo '\#include <sys/ipc.h>@\#include <sys/shm.h>@int main() { int _id = shmget(IPC_PRIVATE, 65536, IPC_CREAT | IPC_EXCL | 0600); shmctl(_id, IPC_RMID, 0); return 0;}' | tr @ '\n' | $(CC) -x c - -o .test2 2>/dev/null && echo 1 || echo 0 )" "1"
SHMAT_OK=1
else
SHMAT_OK=0
CFLAGS+=-DUSEMMAP=1
LDFLAGS+=-Wno-deprecated-declarations -lrt
endif
ifeq "$(TEST_MMAP)" "1"
SHMAT_OK=0
CFLAGS+=-DUSEMMAP=1
LDFLAGS+=-Wno-deprecated-declarations -lrt
endif
all: test_x86 test_shm test_python27 ready $(PROGS) afl-as test_build all_done
ifndef AFL_NO_X86
test_x86:
@echo "[*] Checking for the ability to compile x86 code..."
@echo 'main() { __asm__("xorb %al, %al"); }' | $(CC) -w -x c - -o .test || ( echo; echo "Oops, looks like your compiler can't generate x86 code."; echo; echo "Don't panic! You can use the LLVM or QEMU mode, but see docs/INSTALL first."; echo "(To ignore this error, set AFL_NO_X86=1 and try again.)"; echo; exit 1 )
@rm -f .test
@echo "[+] Everything seems to be working, ready to compile."
@echo 'main() { __asm__("xorb %al, %al"); }' | $(CC) -w -x c - -o .test1 || ( echo; echo "Oops, looks like your compiler can't generate x86 code."; echo; echo "Don't panic! You can use the LLVM or QEMU mode, but see docs/INSTALL first."; echo "(To ignore this error, set AFL_NO_X86=1 and try again.)"; echo; exit 1 )
@rm -f .test1
else
@@ -61,6 +91,38 @@ test_x86:
endif
ifeq "$(SHMAT_OK)" "1"
test_shm:
@echo "[+] shmat seems to be working."
@rm -f .test2
else
test_shm:
@echo "[-] shmat seems not to be working, switching to mmap implementation"
endif
ifeq "$(PYTHON_OK)" "1"
test_python27:
@rm -f .test 2> /dev/null
@echo "[+] Python 2.7 support seems to be working."
else
test_python27:
@echo "[-] You seem to need to install the package python2.7-dev, but it is optional so we continue"
endif
ready:
@echo "[+] Everything seems to be working, ready to compile."
afl-gcc: afl-gcc.c $(COMM_HDR) | test_x86
$(CC) $(CFLAGS) $@.c -o $@ $(LDFLAGS)
set -e; for i in afl-g++ afl-clang afl-clang++; do ln -sf afl-gcc $$i; done
@@ -69,21 +131,28 @@ afl-as: afl-as.c afl-as.h $(COMM_HDR) | test_x86
$(CC) $(CFLAGS) $@.c -o $@ $(LDFLAGS)
ln -sf afl-as as
afl-fuzz: afl-fuzz.c $(COMM_HDR) | test_x86
$(CC) $(CFLAGS) $@.c -o $@ $(LDFLAGS)
afl-common.o : afl-common.c
$(CC) $(CFLAGS) -c afl-common.c
afl-showmap: afl-showmap.c $(COMM_HDR) | test_x86
$(CC) $(CFLAGS) $@.c -o $@ $(LDFLAGS)
sharedmem.o : sharedmem.c
$(CC) $(CFLAGS) -c sharedmem.c
afl-tmin: afl-tmin.c $(COMM_HDR) | test_x86
$(CC) $(CFLAGS) $@.c -o $@ $(LDFLAGS)
afl-fuzz: afl-fuzz.c afl-common.o sharedmem.o $(COMM_HDR) | test_x86
$(CC) $(CFLAGS) $@.c afl-common.o sharedmem.o -o $@ $(LDFLAGS) $(PYFLAGS)
afl-analyze: afl-analyze.c $(COMM_HDR) | test_x86
$(CC) $(CFLAGS) $@.c -o $@ $(LDFLAGS)
afl-showmap: afl-showmap.c afl-common.o sharedmem.o $(COMM_HDR) | test_x86
$(CC) $(CFLAGS) $@.c afl-common.o sharedmem.o -o $@ $(LDFLAGS)
afl-tmin: afl-tmin.c afl-common.o sharedmem.o $(COMM_HDR) | test_x86
$(CC) $(CFLAGS) $@.c afl-common.o sharedmem.o -o $@ $(LDFLAGS)
afl-analyze: afl-analyze.c afl-common.o sharedmem.o $(COMM_HDR) | test_x86
$(CC) $(CFLAGS) $@.c afl-common.o sharedmem.o -o $@ $(LDFLAGS)
afl-gotcpu: afl-gotcpu.c $(COMM_HDR) | test_x86
$(CC) $(CFLAGS) $@.c -o $@ $(LDFLAGS)
ifndef AFL_NO_X86
test_build: afl-gcc afl-as afl-showmap
@@ -102,17 +171,18 @@ test_build: afl-gcc afl-as afl-showmap
endif
all_done: test_build
@if [ ! "`which clang 2>/dev/null`" = "" ]; then echo "[+] LLVM users: see llvm_mode/README.llvm for a faster alternative to afl-gcc."; fi
@echo "[+] All done! Be sure to review README - it's pretty short and useful."
@echo "[+] All done! Be sure to review the README.md - it's pretty short and useful."
@if [ "`uname`" = "Darwin" ]; then printf "\nWARNING: Fuzzing on MacOS X is slow because of the unusually high overhead of\nfork() on this OS. Consider using Linux or *BSD. You can also use VirtualBox\n(virtualbox.org) to put AFL inside a Linux or *BSD VM.\n\n"; fi
@! tty <&1 >/dev/null || printf "\033[0;30mNOTE: If you can read this, your terminal probably uses white background.\nThis will make the UI hard to read. See docs/status_screen.txt for advice.\033[0m\n" 2>/dev/null
.NOTPARALLEL: clean
clean:
rm -f $(PROGS) afl-as as afl-g++ afl-clang afl-clang++ *.o *~ a.out core core.[1-9][0-9]* *.stackdump test .test test-instr .test-instr0 .test-instr1 qemu_mode/qemu-2.10.0.tar.bz2 afl-qemu-trace
rm -rf out_dir qemu_mode/qemu-2.10.0
rm -f $(PROGS) afl-as as afl-g++ afl-clang afl-clang++ *.o *~ a.out core core.[1-9][0-9]* *.stackdump test .test .test1 .test2 test-instr .test-instr0 .test-instr1 qemu_mode/qemu-3.1.0.tar.xz afl-qemu-trace afl-gcc-fast afl-gcc-pass.so afl-gcc-rt.o afl-g++-fast
rm -rf out_dir qemu_mode/qemu-3.1.0
$(MAKE) -C llvm_mode clean
$(MAKE) -C libdislocator clean
$(MAKE) -C libtokencap clean
@@ -123,8 +193,9 @@ install: all
install -m 755 $(PROGS) $(SH_PROGS) $${DESTDIR}$(BIN_PATH)
rm -f $${DESTDIR}$(BIN_PATH)/afl-as
if [ -f afl-qemu-trace ]; then install -m 755 afl-qemu-trace $${DESTDIR}$(BIN_PATH); fi
#if [ -f afl-gcc-fast ]; then set e; install -m 755 afl-gcc-fast $${DESTDIR}$(BIN_PATH); ln -sf afl-gcc-fast $${DESTDIR}$(BIN_PATH)/afl-g++-fast; install -m 755 afl-gcc-pass.so afl-gcc-rt.o $${DESTDIR}$(HELPER_PATH); fi
ifndef AFL_TRACE_PC
if [ -f afl-clang-fast -a -f afl-llvm-pass.so -a -f afl-llvm-rt.o ]; then set -e; install -m 755 afl-clang-fast $${DESTDIR}$(BIN_PATH); ln -sf afl-clang-fast $${DESTDIR}$(BIN_PATH)/afl-clang-fast++; install -m 755 afl-llvm-pass.so afl-llvm-rt.o $${DESTDIR}$(HELPER_PATH); fi
if [ -f afl-clang-fast -a -f libLLVMInsTrim.so -a -f afl-llvm-rt.o ]; then set -e; install -m 755 afl-clang-fast $${DESTDIR}$(BIN_PATH); ln -sf afl-clang-fast $${DESTDIR}$(BIN_PATH)/afl-clang-fast++; install -m 755 libLLVMInsTrim.so afl-llvm-pass.so afl-llvm-rt.o $${DESTDIR}$(HELPER_PATH); fi
else
if [ -f afl-clang-fast -a -f afl-llvm-rt.o ]; then set -e; install -m 755 afl-clang-fast $${DESTDIR}$(BIN_PATH); ln -sf afl-clang-fast $${DESTDIR}$(BIN_PATH)/afl-clang-fast++; install -m 755 afl-llvm-rt.o $${DESTDIR}$(HELPER_PATH); fi
endif
@@ -134,24 +205,26 @@ endif
if [ -f split-compares-pass.so ]; then set -e; install -m 755 split-compares-pass.so $${DESTDIR}$(HELPER_PATH); fi
if [ -f split-switches-pass.so ]; then set -e; install -m 755 split-switches-pass.so $${DESTDIR}$(HELPER_PATH); fi
set -e; for i in afl-g++ afl-clang afl-clang++; do ln -sf afl-gcc $${DESTDIR}$(BIN_PATH)/$$i; done
set -e; ln -sf afl-gcc $${DESTDIR}$(BIN_PATH)/afl-g++
set -e; if [ -f afl-clang-fast ] ; then ln -sf afl-clang-fast $${DESTDIR}$(BIN_PATH)/afl-clang ; ln -sf afl-clang-fast $${DESTDIR}$(BIN_PATH)/afl-clang++ ; else ln -sf afl-gcc $${DESTDIR}$(BIN_PATH)/afl-clang ; ln -sf afl-gcc $${DESTDIR}$(BIN_PATH)/afl-clang++; fi
install -m 755 afl-as $${DESTDIR}$(HELPER_PATH)
ln -sf afl-as $${DESTDIR}$(HELPER_PATH)/as
install -m 644 docs/README docs/ChangeLog docs/*.txt $${DESTDIR}$(DOC_PATH)
install -m 644 docs/README.md docs/ChangeLog docs/*.txt $${DESTDIR}$(DOC_PATH)
cp -r testcases/ $${DESTDIR}$(MISC_PATH)
cp -r dictionaries/ $${DESTDIR}$(MISC_PATH)
publish: clean
test "`basename $$PWD`" = "afl" || exit 1
test -f ~/www/afl/releases/$(PROGNAME)-$(VERSION).tgz; if [ "$$?" = "0" ]; then echo; echo "Change program version in config.h, mmkay?"; echo; exit 1; fi
cd ..; rm -rf $(PROGNAME)-$(VERSION); cp -pr $(PROGNAME) $(PROGNAME)-$(VERSION); \
tar -cvz -f ~/www/afl/releases/$(PROGNAME)-$(VERSION).tgz $(PROGNAME)-$(VERSION)
chmod 644 ~/www/afl/releases/$(PROGNAME)-$(VERSION).tgz
( cd ~/www/afl/releases/; ln -s -f $(PROGNAME)-$(VERSION).tgz $(PROGNAME)-latest.tgz )
cat docs/README >~/www/afl/README.txt
cat docs/status_screen.txt >~/www/afl/status_screen.txt
cat docs/historical_notes.txt >~/www/afl/historical_notes.txt
cat docs/technical_details.txt >~/www/afl/technical_details.txt
cat docs/ChangeLog >~/www/afl/ChangeLog.txt
cat docs/QuickStartGuide.txt >~/www/afl/QuickStartGuide.txt
echo -n "$(VERSION)" >~/www/afl/version.txt
# test "`basename $$PWD`" = "afl" || exit 1
# test -f ~/www/afl/releases/$(PROGNAME)-$(VERSION).tgz; if [ "$$?" = "0" ]; then echo; echo "Change program version in config.h, mmkay?"; echo; exit 1; fi
# cd ..; rm -rf $(PROGNAME)-$(VERSION); cp -pr $(PROGNAME) $(PROGNAME)-$(VERSION); \
# tar -cvz -f ~/www/afl/releases/$(PROGNAME)-$(VERSION).tgz $(PROGNAME)-$(VERSION)
# chmod 644 ~/www/afl/releases/$(PROGNAME)-$(VERSION).tgz
# ( cd ~/www/afl/releases/; ln -s -f $(PROGNAME)-$(VERSION).tgz $(PROGNAME)-latest.tgz )
# cat docs/README.md >~/www/afl/README.txt
# cat docs/status_screen.txt >~/www/afl/status_screen.txt
# cat docs/historical_notes.txt >~/www/afl/historical_notes.txt
# cat docs/technical_details.txt >~/www/afl/technical_details.txt
# cat docs/ChangeLog >~/www/afl/ChangeLog.txt
# cat docs/QuickStartGuide.txt >~/www/afl/QuickStartGuide.txt
# echo -n "$(VERSION)" >~/www/afl/version.txt

1
README

@@ -1 +0,0 @@
docs/README


@@ -1,41 +1,48 @@
============================
american fuzzy lop plus plus
============================
# american fuzzy lop plus plus (afl++)
Written by Michal Zalewski <lcamtuf@google.com>
Originally developed by Michal "lcamtuf" Zalewski.
Repository: https://github.com/vanhauser-thc/AFLplusplus
Repository: [https://github.com/vanhauser-thc/AFLplusplus](https://github.com/vanhauser-thc/AFLplusplus)
afl++ is maintained by Marc Heuse <mh@mh-sec.de> and Heiko Eissfeldt
<heiko.eissfeldt@hexco.de> as there have been no updates to afl since
November 2017.
afl++ is maintained by Marc Heuse <mh@mh-sec.de>, Heiko Eissfeldt
<heiko.eissfeldt@hexco.de> and Andrea Fioraldi <andreafioraldi@gmail.com>.
This version has several bug fixes, new features and speed enhancements
based on community patches from https://github.com/vanhauser-thc/afl-patches
To see the list of which patches have been applied, see the PATCHES file.
## The enhancements compared to the original stock afl
Additionally AFLfast's power schedules by Marcel Boehme from
github.com/mboehme/aflfast have been incorporated.
Many improvements were made over the official afl release - which did not
get any improvements since November 2017.
Plus it was upgraded to qemu 3.1 from 2.1 with the work of
https://github.com/andreafioraldi/afl and got the community patches applied
to it.
Among others afl++ has, e.g. more performant llvm_mode, supporting
llvm up to version 8, Qemu 3.1, more speed and crashfixes for Qemu,
laf-intel feature for Qemu (with libcompcov) and more.
Additionally the following patches have been integrated:
* AFLfast's power schedules by Marcel Boehme: [https://github.com/mboehme/aflfast](https://github.com/mboehme/aflfast)
* C. Hollers afl-fuzz Python mutator module and llvm_mode whitelist support: [https://github.com/choller/afl](https://github.com/choller/afl)
* the new excellent MOpt mutator: [https://github.com/puppet-meteor/MOpt-AFL](https://github.com/puppet-meteor/MOpt-AFL)
* instrim, a very effective CFG llvm_mode instrumentation implementation for large targets: [https://github.com/csienslab/instrim](https://github.com/csienslab/instrim)
* unicorn_mode which allows fuzzing of binaries from completely different platforms (integration provided by domenukk)
A more thorough list is available in the PATCHES file.
So all in all this is the best-of AFL that is currently out there :-)
Copyright 2013, 2014, 2015, 2016 Google Inc. All rights reserved.
Released under terms and conditions of Apache License, Version 2.0.
For new versions and additional information, check out:
http://lcamtuf.coredump.cx/afl/
[https://github.com/vanhauser-thc/AFLplusplus](https://github.com/vanhauser-thc/AFLplusplus)
To compare notes with other users or get notified about major new features,
send a mail to <afl-users+subscribe@googlegroups.com>.
** See QuickStartGuide.txt if you don't have time to read this file. **
See [docs/QuickStartGuide.txt](docs/QuickStartGuide.txt) if you don't have time to
read this file.
1) Challenges of guided fuzzing
## 1) Challenges of guided fuzzing
-------------------------------
Fuzzing is one of the most powerful and proven strategies for identifying
@@ -62,8 +69,8 @@ All these methods are extremely promising in experimental settings, but tend
to suffer from reliability and performance problems in practical uses - and
currently do not offer a viable alternative to "dumb" fuzzing techniques.
2) The afl-fuzz approach
------------------------
## 2) The afl-fuzz approach
American Fuzzy Lop is a brute-force fuzzer coupled with an exceedingly simple
but rock-solid instrumentation-guided genetic algorithm. It uses a modified
@@ -101,8 +108,13 @@ closed-source tools.
The fuzzer is thoroughly tested to deliver out-of-the-box performance far
superior to blind fuzzing or coverage-only tools.
3) Instrumenting programs for use with AFL
------------------------------------------
## 3) Instrumenting programs for use with AFL
PLEASE NOTE: llvm_mode compilation with afl-clang-fast/afl-clang-fast++
instead of afl-gcc/afl-g++ is much faster and has a few cool features.
See llvm_mode/ - however few code does not compile with llvm.
We support llvm versions 4.0 to 8.
When source code is available, instrumentation can be injected by a companion
tool that works as a drop-in replacement for gcc or clang in any standard build
@@ -115,37 +127,45 @@ or even faster than possible with traditional tools.
The correct way to recompile the target program may vary depending on the
specifics of the build process, but a nearly-universal approach would be:
```shell
$ CC=/path/to/afl/afl-gcc ./configure
$ make clean all
```
For C++ programs, you'd would also want to set CXX=/path/to/afl/afl-g++.
For C++ programs, you'd would also want to set `CXX=/path/to/afl/afl-g++`.
The clang wrappers (afl-clang and afl-clang++) can be used in the same way;
clang users may also opt to leverage a higher-performance instrumentation mode,
as described in llvm_mode/README.llvm.
Clang/LLVM has a much better performance, but only works with LLVM up to and
including 6.0.1.
as described in [llvm_mode/README.llvm](llvm_mode/README.llvm).
Clang/LLVM has a much better performance and works with LLVM version 4.0 to 8.
Using the LAF Intel performance enhancements are also recommended, see
docs/README.laf-intel
[llvm_mode/README.laf-intel](llvm_mode/README.laf-intel)
Using partial instrumentation is also recommended, see
[llvm_mode/README.whitelist](llvm_mode/README.whitelist)
When testing libraries, you need to find or write a simple program that reads
data from stdin or from a file and passes it to the tested library. In such a
case, it is essential to link this executable against a static version of the
instrumented library, or to make sure that the correct .so file is loaded at
runtime (usually by setting LD_LIBRARY_PATH). The simplest option is a static
runtime (usually by setting `LD_LIBRARY_PATH`). The simplest option is a static
build, usually possible via:
```shell
$ CC=/path/to/afl/afl-gcc ./configure --disable-shared
```
Setting AFL_HARDEN=1 when calling 'make' will cause the CC wrapper to
Setting `AFL_HARDEN=1` when calling 'make' will cause the CC wrapper to
automatically enable code hardening options that make it easier to detect
simple memory bugs. Libdislocator, a helper library included with AFL (see
libdislocator/README.dislocator) can help uncover heap corruption issues, too.
[libdislocator/README.dislocator](libdislocator/README.dislocator)) can help uncover heap corruption issues, too.
PS. ASAN users are advised to review notes_for_asan.txt file for important
caveats.
PS. ASAN users are advised to review [docs/notes_for_asan.txt](docs/notes_for_asan.txt)
file for important caveats.
4) Instrumenting binary-only apps
## 4) Instrumenting binary-only apps
---------------------------------
When source code is *NOT* available, the fuzzer offers experimental support for
@@ -155,15 +175,57 @@ with a version of QEMU running in the lesser-known "user space emulation" mode.
QEMU is a project separate from AFL, but you can conveniently build the
feature by doing:
```shell
$ cd qemu_mode
$ ./build_qemu_support.sh
```
For additional instructions and caveats, see qemu_mode/README.qemu.
For additional instructions and caveats, see [qemu_mode/README.qemu](qemu_mode/README.qemu).
The mode is approximately 2-5x slower than compile-time instrumentation, is
less conductive to parallelization, and may have some other quirks.
5) Choosing initial test cases
If [afl-dyninst](https://github.com/vanhauser-thc/afl-dyninst) works for
your binary, then you can use afl-fuzz normally and it will have twice
the speed compared to qemu_mode.
A more comprehensive description of these and other options can be found in
[docs/binaryonly_fuzzing.txt](docs/binaryonly_fuzzing.txt)
## 5) Power schedules
------------------
The power schedules were copied from Marcel Böhme's excellent AFLfast
implementation and expands on the ability to discover new paths and
therefore the coverage.
The available schedules are:
- explore (default)
- fast
- coe
- quad
- lin
- exploit
In parallel mode (-M/-S, several instances with shared queue), we suggest to
run the master using the exploit schedule (-p exploit) and the slaves with a
combination of cut-off-exponential (-p coe), exponential (-p fast; default),
and explore (-p explore) schedules.
In single mode, using -p fast is usually more beneficial than the default
explore mode.
(We don't want to change the default behaviour of afl, so "fast" has not been
made the default mode).
More details can be found in the paper published at the 23rd ACM Conference on
Computer and Communications Security (CCS'16):
(https://www.sigsac.org/ccs/CCS2016/accepted-papers/)[https://www.sigsac.org/ccs/CCS2016/accepted-papers/]
## 6) Choosing initial test cases
------------------------------
To operate correctly, the fuzzer requires one or more starting file that
@@ -171,7 +233,7 @@ contains a good example of the input data normally expected by the targeted
application. There are two basic rules:
- Keep the files small. Under 1 kB is ideal, although not strictly necessary.
For a discussion of why size matters, see perf_tips.txt.
For a discussion of why size matters, see [perf_tips.txt](docs/perf_tips.txt).
- Use multiple test cases only if they are functionally different from
each other. There is no point in using fifty different vacation photos
@@ -184,7 +246,8 @@ PS. If a large corpus of data is available for screening, you may want to use
the afl-cmin utility to identify a subset of functionally distinct files that
exercise different code paths in the target binary.
6) Fuzzing binaries
## 7) Fuzzing binaries
-------------------
The fuzzing process itself is carried out by the afl-fuzz utility. This program
@@ -193,13 +256,17 @@ store its findings, plus a path to the binary to test.
For target binaries that accept input directly from stdin, the usual syntax is:
```shell
$ ./afl-fuzz -i testcase_dir -o findings_dir /path/to/program [...params...]
```
For programs that take input from a file, use '@@' to mark the location in
the target's command line where the input file name should be placed. The
fuzzer will substitute this for you:
```shell
$ ./afl-fuzz -i testcase_dir -o findings_dir /path/to/program @@
```
You can also use the -f option to have the mutated data written to a specific
file. This is useful if the program expects a particular file extension or so.
@@ -211,19 +278,20 @@ You can use -t and -m to override the default timeout and memory limit for the
executed process; rare examples of targets that may need these settings touched
include compilers and video decoders.
Tips for optimizing fuzzing performance are discussed in perf_tips.txt.
Tips for optimizing fuzzing performance are discussed in [perf_tips.txt](docs/perf_tips.txt).
Note that afl-fuzz starts by performing an array of deterministic fuzzing
steps, which can take several days, but tend to produce neat test cases. If you
want quick & dirty results right away - akin to zzuf and other traditional
fuzzers - add the -d option to the command line.
7) Interpreting output
## 8) Interpreting output
----------------------
See the status_screen.txt file for information on how to interpret the
displayed stats and monitor the health of the process. Be sure to consult this
file especially if any UI elements are highlighted in red.
See the [docs/status_screen.txt](docs/status_screen.txt) file for information on
how to interpret the displayed stats and monitor the health of the process. Be
sure to consult this file especially if any UI elements are highlighted in red.
The fuzzing process will continue until you press Ctrl-C. At minimum, you want
to allow the fuzzer to complete one queue cycle, which may take anywhere from a
@ -261,33 +329,39 @@ queue entries. This should help with debugging.
When you can't reproduce a crash found by afl-fuzz, the most likely cause is
that you are not setting the same memory limit as used by the tool. Try:
```shell
$ LIMIT_MB=50
$ ( ulimit -Sv $[LIMIT_MB << 10]; /path/to/tested_binary ... )
```
Change LIMIT_MB to match the -m parameter passed to afl-fuzz. On OpenBSD,
also change -Sv to -Sd.
Any existing output directory can be also used to resume aborted jobs; try:
```shell
$ ./afl-fuzz -i- -o existing_output_dir [...etc...]
```
If you have gnuplot installed, you can also generate some pretty graphs for any
active fuzzing task using afl-plot. For an example of what this looks like,

see http://lcamtuf.coredump.cx/afl/plot/.
see [http://lcamtuf.coredump.cx/afl/plot/](http://lcamtuf.coredump.cx/afl/plot/).
8) Parallelized fuzzing
## 9) Parallelized fuzzing
-----------------------
Every instance of afl-fuzz takes up roughly one core. This means that on
multi-core systems, parallelization is necessary to fully utilize the hardware.
For tips on how to fuzz a common target on multiple cores or multiple networked
machines, please refer to parallel_fuzzing.txt.
machines, please refer to [docs/parallel_fuzzing.txt](docs/parallel_fuzzing.txt).
The parallel fuzzing mode also offers a simple way for interfacing AFL to other
fuzzers, to symbolic or concolic execution engines, and so forth; again, see the
last section of parallel_fuzzing.txt for tips.
last section of [docs/parallel_fuzzing.txt](docs/parallel_fuzzing.txt) for tips.
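As a rough sketch of the scheme described in that document, one master instance handles the deterministic stages while slave instances do random fuzzing, all sharing a single sync directory (the instance names are illustrative):

```shell
# terminal 1: master (-M) performs the deterministic stages
$ ./afl-fuzz -i testcase_dir -o sync_dir -M fuzzer01 /path/to/program @@
# terminal 2..n: slaves (-S) fuzz randomly and sync findings with the master
$ ./afl-fuzz -i testcase_dir -o sync_dir -S fuzzer02 /path/to/program @@
```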
9) Fuzzer dictionaries
## 10) Fuzzer dictionaries
----------------------
By default, the afl-fuzz mutation engine is optimized for compact data formats -
@ -298,13 +372,13 @@ redundant verbiage - notably including HTML, SQL, or JavaScript.
To avoid the hassle of building syntax-aware tools, afl-fuzz provides a way to
seed the fuzzing process with an optional dictionary of language keywords,
magic headers, or other special tokens associated with the targeted data type
- and use that to reconstruct the underlying grammar on the go:
-- and use that to reconstruct the underlying grammar on the go:
http://lcamtuf.blogspot.com/2015/01/afl-fuzz-making-up-grammar-with.html
[http://lcamtuf.blogspot.com/2015/01/afl-fuzz-making-up-grammar-with.html](http://lcamtuf.blogspot.com/2015/01/afl-fuzz-making-up-grammar-with.html)
To use this feature, you first need to create a dictionary in one of the two
formats discussed in dictionaries/README.dictionaries; and then point the fuzzer
to it via the -x option in the command line.
formats discussed in [dictionaries/README.dictionaries](dictionaries/README.dictionaries);
and then point the fuzzer to it via the -x option in the command line.
(Several common dictionaries are already provided in that subdirectory, too.)
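For example, assuming one of the bundled dictionaries (sql.dict is used here for illustration) fits your target, the invocation is just the usual command line plus -x:

```shell
$ ./afl-fuzz -i testcase_dir -o findings_dir -x dictionaries/sql.dict \
    /path/to/program @@
```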
@ -312,7 +386,7 @@ There is no way to provide more structured descriptions of the underlying
syntax, but the fuzzer will likely figure out some of this based on the
instrumentation feedback alone. This actually works in practice, say:
http://lcamtuf.blogspot.com/2015/04/finding-bugs-in-sqlite-easy-way.html
[http://lcamtuf.blogspot.com/2015/04/finding-bugs-in-sqlite-easy-way.html](http://lcamtuf.blogspot.com/2015/04/finding-bugs-in-sqlite-easy-way.html)
PS. Even when no explicit dictionary is given, afl-fuzz will try to extract
existing syntax tokens in the input corpus by watching the instrumentation
@ -321,9 +395,10 @@ parsers and grammars, but isn't nearly as good as the -x mode.
If a dictionary is really hard to come by, another option is to let AFL run
for a while, and then use the token capture library that comes as a companion
utility with AFL. For that, see libtokencap/README.tokencap.
utility with AFL. For that, see [libtokencap/README.tokencap](libtokencap/README.tokencap).
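A minimal capture loop with libtokencap might look like this (the paths and corpus location are assumptions; see README.tokencap for the authoritative steps):

```shell
$ export AFL_TOKEN_FILE=$PWD/tokens.txt
$ for i in testcase_dir/*; do \
    LD_PRELOAD=/path/to/libtokencap.so ./target "$i"; done
$ sort -u tokens.txt > target.dict   # feed this back to afl-fuzz via -x
```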
10) Crash triage
## 11) Crash triage
----------------
The coverage-based grouping of crashes usually produces a small data set that
@ -352,7 +427,9 @@ beneath.
Oh, one more thing: for test case minimization, give afl-tmin a try. The tool
can be operated in a very simple way:
```shell
$ ./afl-tmin -i test_case -o minimized_result -- /path/to/program [...]
```
The tool works with crashing and non-crashing test cases alike. In the crash
mode, it will happily accept instrumented and non-instrumented binaries. In the
@ -367,9 +444,10 @@ file, attempts to sequentially flip bytes, and observes the behavior of the
tested program. It then color-codes the input based on which sections appear to
be critical, and which are not; while not bulletproof, it can often offer quick
insights into complex file formats. More info about its operation can be found
near the end of technical_details.txt.
near the end of [docs/technical_details.txt](docs/technical_details.txt).
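A typical afl-analyze invocation mirrors afl-tmin (the program path is a placeholder):

```shell
$ ./afl-analyze -i test_case -- /path/to/program @@
```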
11) Going beyond crashes
## 12) Going beyond crashes
------------------------
Fuzzing is a wonderful and underutilized technique for discovering non-crashing
@ -390,10 +468,11 @@ found by modifying the target programs to call abort() when, say:
Implementing these or similar sanity checks usually takes very little time;
if you are the maintainer of a particular package, you can make this code
conditional with #ifdef FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION (a flag also
shared with libfuzzer) or #ifdef __AFL_COMPILER (this one is just for AFL).
conditional with `#ifdef FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION` (a flag also
shared with libfuzzer) or `#ifdef __AFL_COMPILER` (this one is just for AFL).
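As a sketch of such a guarded check - the `FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION` flag is real, but `process_record()` and `MAX_RECORDS` are hypothetical names invented for this example:

```c
/* Sketch of an #ifdef-guarded sanity check for fuzzing builds. */
#include <stdlib.h>

#define MAX_RECORDS 1024

static int record_count;

int process_record(int n) {
  record_count += n;
#ifdef FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION
  /* Fuzzing builds turn silent state corruption into a visible crash;
     this branch is compiled out of production binaries. */
  if (record_count < 0 || record_count > MAX_RECORDS) abort();
#endif
  return record_count;
}
```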
12) Common-sense risks
## 13) Common-sense risks
----------------------
Please keep in mind that, similarly to many other computationally-intensive
@ -419,9 +498,12 @@ tasks, fuzzing may put strain on your hardware and on the OS. In particular:
A good way to monitor disk I/O on Linux is the 'iostat' command:
```shell
$ iostat -d 3 -x -k [...optional disk ID...]
```
13) Known limitations & areas for improvement
## 14) Known limitations & areas for improvement
---------------------------------------------
Here are some of the most important caveats for AFL:
@ -439,35 +521,37 @@ Here are some of the most important caveats for AFL:
To work around this, you can comment out the relevant checks (see
experimental/libpng_no_checksum/ for inspiration); if this is not possible,
you can also write a postprocessor, as explained in
experimental/post_library/.
experimental/post_library/ (with AFL_POST_LIBRARY).
- There are some unfortunate trade-offs with ASAN and 64-bit binaries. This
isn't due to any specific fault of afl-fuzz; see notes_for_asan.txt for
tips.
isn't due to any specific fault of afl-fuzz; see [docs/notes_for_asan.txt](docs/notes_for_asan.txt)
for tips.
- There is no direct support for fuzzing network services, background
daemons, or interactive apps that require UI interaction to work. You may
need to make simple code changes to make them behave in a more traditional
way. Preeny may offer a relatively simple option, too - see:
https://github.com/zardus/preeny
[https://github.com/zardus/preeny](https://github.com/zardus/preeny)
Some useful tips for modifying network-based services can be also found at:
https://www.fastly.com/blog/how-to-fuzz-server-american-fuzzy-lop
[https://www.fastly.com/blog/how-to-fuzz-server-american-fuzzy-lop](https://www.fastly.com/blog/how-to-fuzz-server-american-fuzzy-lop)
- AFL doesn't output human-readable coverage data. If you want to monitor
coverage, use afl-cov from Michael Rash: https://github.com/mrash/afl-cov
coverage, use afl-cov from Michael Rash: [https://github.com/mrash/afl-cov](https://github.com/mrash/afl-cov)
- Occasionally, sentient machines rise against their creators. If this
happens to you, please consult http://lcamtuf.coredump.cx/prep/.
happens to you, please consult [http://lcamtuf.coredump.cx/prep/](http://lcamtuf.coredump.cx/prep/).
Beyond this, see INSTALL for platform-specific tips.
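To illustrate the postprocessor idea mentioned in the checksum caveat above, here is a minimal sketch; the function name and signature follow experimental/post_library/, but the one-byte XOR "checksum" is a stand-in for whatever the real file format uses:

```c
/* post_fixup.c - sketch of an AFL postprocessor (loaded via AFL_POST_LIBRARY).
   Patches the last byte of each mutated input so a hypothetical
   XOR-of-all-preceding-bytes checksum validates. */
#include <stddef.h>

static unsigned char buf[4096];

const unsigned char *afl_postprocess(const unsigned char *in_buf,
                                     unsigned int *len) {
  unsigned int i;
  unsigned char csum = 0;

  /* Too short or too long to fix up: pass through unchanged. */
  if (*len < 2 || *len > sizeof(buf)) return in_buf;

  for (i = 0; i < *len - 1; i++) {
    buf[i] = in_buf[i];
    csum ^= in_buf[i];
  }
  buf[*len - 1] = csum;   /* patch the trailing checksum byte */
  return buf;
}
```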
14) Special thanks
## 15) Special thanks
------------------
Many of the improvements to afl-fuzz wouldn't be possible without feedback,
bug reports, or patches from:
Many of the improvements to the original afl wouldn't be possible without
feedback, bug reports, or patches from:
```
Jann Horn Hanno Boeck
Felix Groebert Jakub Wilk
Richard W. M. Jones Alexander Cherepanov
@ -507,27 +591,18 @@ bug reports, or patches from:
Rene Freingruber Sergey Davidoff
Sami Liedes Craig Young
Andrzej Jackowski Daniel Hodson
Nathan Voss Dominik Maier
```
Thank you!
15) Contact
## 16) Contact
-----------
Questions? Concerns? Bug reports? The author can be usually reached at
<lcamtuf@google.com>.
Questions? Concerns? Bug reports? The contributors can be reached via
[https://github.com/vanhauser-thc/AFLplusplus](https://github.com/vanhauser-thc/AFLplusplus)
There is also a mailing list for the project; to join, send a mail to
There is also a mailing list for the afl project; to join, send a mail to
<afl-users+subscribe@googlegroups.com>. Or, if you prefer to browse
archives first, try:
https://groups.google.com/group/afl-users
PS. If you wish to submit raw code to be incorporated into the project, please
be aware that the copyright on most of AFL is claimed by Google. While you do
retain copyright on your contributions, they do ask people to agree to a simple
CLA first:
https://cla.developers.google.com/clas
Sorry about the hassle. Of course, no CLA is required for feature requests or
bug reports.
archives first, try: [https://groups.google.com/group/afl-users](https://groups.google.com/group/afl-users)

TODO (new file, 34 lines):
@ -0,0 +1,34 @@
Roadmap 2.53d:
==============
- indent all the code: clang-format -style=Google
- update docs/sister_projects.txt
afl-fuzz:
- put mutator, scheduler, forkserver and input channels in individual files
- reuse forkserver for showmap, afl-cmin, etc.
gcc_plugin:
- needs to be rewritten
- fix crashes when compiling :(
- whitelist support
- skip over uninteresting blocks
- laf-intel
- neverZero
qemu_mode:
- deferred mode with AFL_DEFERRED_QEMU=0xaddress
unit testing / or large testcase campaign
Roadmap 2.54d:
==============
- expand MAP size to 256k (current L2 cache size on processors)
-> 18 bit map
- llvm_mode: dynamic map size and collision-free basic block IDs
qemu_mode:
- persistent mode patching the return address (WinAFL style)
- instrument only comparisons with immediate values by default when using compcov

afl-analyze.c:

@ -26,6 +26,8 @@
#include "debug.h"
#include "alloc-inl.h"
#include "hash.h"
#include "sharedmem.h"
#include "afl-common.h"
#include <stdio.h>
#include <unistd.h>
@ -47,7 +49,7 @@
static s32 child_pid; /* PID of the tested program */
static u8* trace_bits; /* SHM with instrumentation bitmap */
u8* trace_bits; /* SHM with instrumentation bitmap */
static u8 *in_file, /* Analyzer input test case */
*prog_in, /* Targeted program input file */
@ -64,8 +66,7 @@ static u32 in_len, /* Input data length */
static u64 mem_limit = MEM_LIMIT; /* Memory limit (MB) */
static s32 shm_id, /* ID of the SHM region */
dev_null_fd = -1; /* FD to /dev/null */
static s32 dev_null_fd = -1; /* FD to /dev/null */
static u8 edges_only, /* Ignore hit counts? */
use_hex_offsets, /* Show hex offsets? */
@ -141,37 +142,11 @@ static inline u8 anything_set(void) {
}
/* Get rid of shared memory and temp files (atexit handler). */
/* Get rid of temp files (atexit handler). */
static void remove_shm(void) {
static void at_exit_handler(void) {
unlink(prog_in); /* Ignore errors */
shmctl(shm_id, IPC_RMID, NULL);
}
/* Configure shared memory. */
static void setup_shm(void) {
u8* shm_str;
shm_id = shmget(IPC_PRIVATE, MAP_SIZE, IPC_CREAT | IPC_EXCL | 0600);
if (shm_id < 0) PFATAL("shmget() failed");
atexit(remove_shm);
shm_str = alloc_printf("%d", shm_id);
setenv(SHM_ENV_VAR, shm_str, 1);
ck_free(shm_str);
trace_bits = shmat(shm_id, NULL, 0);
if (!trace_bits) PFATAL("shmat() failed");
}
@ -750,48 +725,6 @@ static void setup_signal_handlers(void) {
}
/* Detect @@ in args. */
static void detect_file_args(char** argv) {
u32 i = 0;
u8* cwd = getcwd(NULL, 0);
if (!cwd) PFATAL("getcwd() failed");
while (argv[i]) {
u8* aa_loc = strstr(argv[i], "@@");
if (aa_loc) {
u8 *aa_subst, *n_arg;
/* Be sure that we're always using fully-qualified paths. */
if (prog_in[0] == '/') aa_subst = prog_in;
else aa_subst = alloc_printf("%s/%s", cwd, prog_in);
/* Construct a replacement argv value. */
*aa_loc = 0;
n_arg = alloc_printf("%s%s%s", argv[i], aa_subst, aa_loc + 2);
argv[i] = n_arg;
*aa_loc = '@';
if (prog_in[0] != '/') ck_free(aa_subst);
}
i++;
}
free(cwd); /* not tracked */
}
/* Display usage hints. */
static void usage(u8* argv0) {
@ -807,7 +740,8 @@ static void usage(u8* argv0) {
" -f file - input file read by the tested program (stdin)\n"
" -t msec - timeout for each run (%u ms)\n"
" -m megs - memory limit for child process (%u MB)\n"
" -Q - use binary-only instrumentation (QEMU mode)\n\n"
" -Q - use binary-only instrumentation (QEMU mode)\n"
" -U - use unicorn-based instrumentation (Unicorn mode)\n\n"
"Analysis settings:\n\n"
@ -933,20 +867,19 @@ static char** get_qemu_argv(u8* own_loc, char** argv, int argc) {
}
/* Main entry point */
int main(int argc, char** argv) {
s32 opt;
u8 mem_limit_given = 0, timeout_given = 0, qemu_mode = 0;
u8 mem_limit_given = 0, timeout_given = 0, qemu_mode = 0, unicorn_mode = 0;
char** use_argv;
doc_path = access(DOC_PATH, F_OK) ? "docs" : DOC_PATH;
SAYF(cCYA "afl-analyze" VERSION cRST " by <lcamtuf@google.com>\n");
while ((opt = getopt(argc,argv,"+i:f:m:t:eQ")) > 0)
while ((opt = getopt(argc,argv,"+i:f:m:t:eQU")) > 0)
switch (opt) {
@ -1026,6 +959,14 @@ int main(int argc, char** argv) {
qemu_mode = 1;
break;
case 'U':
if (unicorn_mode) FATAL("Multiple -U options not supported");
if (!mem_limit_given) mem_limit = MEM_LIMIT_UNICORN;
unicorn_mode = 1;
break;
default:
usage(argv[0]);
@ -1036,13 +977,14 @@ int main(int argc, char** argv) {
use_hex_offsets = !!getenv("AFL_ANALYZE_HEX");
setup_shm();
setup_shm(0);
atexit(at_exit_handler);
setup_signal_handlers();
set_up_environment();
find_binary(argv[optind]);
detect_file_args(argv + optind);
detect_file_args(argv + optind, prog_in);
if (qemu_mode)
use_argv = get_qemu_argv(argv[0], argv + optind, argc - optind);

afl-as.c:

@ -377,7 +377,7 @@ static void add_instrumentation(void) {
}
/* Label of some sort. This may be a branch destination, but we need to
tread carefully and account for several different formatting
read carefully and account for several different formatting
conventions. */
#ifdef __APPLE__

afl-as.h:

@ -189,6 +189,7 @@ static const u8* main_payload_32 =
" orb $1, (%edx, %edi, 1)\n"
#else
" incb (%edx, %edi, 1)\n"
" adcb $0, (%edx, %edi, 1)\n" // never zero counter implementation. slightly better path discovery and little performance impact
#endif /* ^SKIP_COUNTS */
"\n"
"__afl_return:\n"
@ -220,6 +221,29 @@ static const u8* main_payload_32 =
" testl %eax, %eax\n"
" je __afl_setup_abort\n"
"\n"
#ifdef USEMMAP
" pushl $384 /* shm_open mode 0600 */\n"
" pushl $2 /* flags O_RDWR */\n"
" pushl %eax /* SHM file path */\n"
" call shm_open\n"
" addl $12, %esp\n"
"\n"
" cmpl $-1, %eax\n"
" je __afl_setup_abort\n"
"\n"
" pushl $0 /* mmap off */\n"
" pushl %eax /* shm fd */\n"
" pushl $1 /* mmap flags */\n"
" pushl $3 /* mmap prot */\n"
" pushl $"STRINGIFY(MAP_SIZE)" /* mmap len */\n"
" pushl $0 /* mmap addr */\n"
" call mmap\n"
" addl $12, %esp\n"
"\n"
" cmpl $-1, %eax\n"
" je __afl_setup_abort\n"
"\n"
#else
" pushl %eax\n"
" call atoi\n"
" addl $4, %esp\n"
@ -233,6 +257,7 @@ static const u8* main_payload_32 =
" cmpl $-1, %eax\n"
" je __afl_setup_abort\n"
"\n"
#endif
" /* Store the address of the SHM region. */\n"
"\n"
" movl %eax, __afl_area_ptr\n"
@ -417,6 +442,7 @@ static const u8* main_payload_64 =
" orb $1, (%rdx, %rcx, 1)\n"
#else
" incb (%rdx, %rcx, 1)\n"
" adcb $0, (%rdx, %rcx, 1)\n" // never zero counter implementation. slightly better path discovery and little performance impact
#endif /* ^SKIP_COUNTS */
"\n"
"__afl_return:\n"
@ -501,6 +527,27 @@ static const u8* main_payload_64 =
" testq %rax, %rax\n"
" je __afl_setup_abort\n"
"\n"
#ifdef USEMMAP
" movl $384, %edx /* shm_open mode 0600 */\n"
" movl $2, %esi /* flags O_RDWR */\n"
" movq %rax, %rdi /* SHM file path */\n"
CALL_L64("shm_open")
"\n"
" cmpq $-1, %rax\n"
" je __afl_setup_abort\n"
"\n"
" movl $0, %r9d\n"
" movl %eax, %r8d\n"
" movl $1, %ecx\n"
" movl $3, %edx\n"
" movl $"STRINGIFY(MAP_SIZE)", %esi\n"
" movl $0, %edi\n"
CALL_L64("mmap")
"\n"
" cmpq $-1, %rax\n"
" je __afl_setup_abort\n"
"\n"
#else
" movq %rax, %rdi\n"
CALL_L64("atoi")
"\n"
@ -512,6 +559,7 @@ static const u8* main_payload_64 =
" cmpq $-1, %rax\n"
" je __afl_setup_abort\n"
"\n"
#endif
" /* Store the address of the SHM region. */\n"
"\n"
" movq %rax, %rdx\n"

afl-cmin:

@ -49,9 +49,9 @@ MEM_LIMIT=100
TIMEOUT=none
unset IN_DIR OUT_DIR STDIN_FILE EXTRA_PAR MEM_LIMIT_GIVEN \
AFL_CMIN_CRASHES_ONLY AFL_CMIN_ALLOW_ANY QEMU_MODE
AFL_CMIN_CRASHES_ONLY AFL_CMIN_ALLOW_ANY QEMU_MODE UNICORN_MODE
while getopts "+i:o:f:m:t:eQC" opt; do
while getopts "+i:o:f:m:t:eQUC" opt; do
case "$opt" in
@ -83,6 +83,11 @@ while getopts "+i:o:f:m:t:eQC" opt; do
test "$MEM_LIMIT_GIVEN" = "" && MEM_LIMIT=250
QEMU_MODE=1
;;
"U")
EXTRA_PAR="$EXTRA_PAR -U"
test "$MEM_LIMIT_GIVEN" = "" && MEM_LIMIT=250
UNICORN_MODE=1
;;
"?")
exit 1
;;
@ -111,7 +116,8 @@ Execution control settings:
-m megs - memory limit for child process ($MEM_LIMIT MB)
-t msec - run time limit for child process (none)
-Q - use binary-only instrumentation (QEMU mode)
-U - use unicorn-based instrumentation (Unicorn mode)
Minimization settings:
-C - keep crashing inputs, reject everything else
@ -196,7 +202,7 @@ if [ ! -f "$TARGET_BIN" -o ! -x "$TARGET_BIN" ]; then
fi
if [ "$AFL_SKIP_BIN_CHECK" = "" -a "$QEMU_MODE" = "" ]; then
if [ "$AFL_SKIP_BIN_CHECK" = "" -a "$QEMU_MODE" = "" -a "$UNICORN_MODE" = "" ]; then
if ! grep -qF "__AFL_SHM_ID" "$TARGET_BIN"; then
echo "[-] Error: binary '$TARGET_BIN' doesn't appear to be instrumented." 1>&2

afl-common.c (new file, 69 lines):

@ -0,0 +1,69 @@
/*
gather some functions common to multiple executables
detect_file_args
*/
#include <stdlib.h>
#include <stdio.h>
#include <strings.h>
#include "debug.h"
#include "alloc-inl.h"
/* Detect @@ in args. */
#ifndef __GLIBC__
#include <unistd.h>
#endif
void detect_file_args(char** argv, u8* prog_in) {
u32 i = 0;
#ifdef __GLIBC__
u8* cwd = getcwd(NULL, 0); /* non portable glibc extension */
#else
u8* cwd;
char *buf;
long size = pathconf(".", _PC_PATH_MAX);
if ((buf = (char *)malloc((size_t)size)) != NULL) {
cwd = getcwd(buf, (size_t)size); /* portable version */
} else {
PFATAL("getcwd() failed");
}
#endif
if (!cwd) PFATAL("getcwd() failed");
while (argv[i]) {
u8* aa_loc = strstr(argv[i], "@@");
if (aa_loc) {
u8 *aa_subst, *n_arg;
if (!prog_in) FATAL("@@ syntax is not supported by this tool.");
/* Be sure that we're always using fully-qualified paths. */
if (prog_in[0] == '/') aa_subst = prog_in;
else aa_subst = alloc_printf("%s/%s", cwd, prog_in);
/* Construct a replacement argv value. */
*aa_loc = 0;
n_arg = alloc_printf("%s%s%s", argv[i], aa_subst, aa_loc + 2);
argv[i] = n_arg;
*aa_loc = '@';
if (prog_in[0] != '/') ck_free(aa_subst);
}
i++;
}
free(cwd); /* not tracked */
}

afl-common.h (new file, 5 lines):

@ -0,0 +1,5 @@
#ifndef __AFLCOMMON_H
#define __AFLCOMMON_H
void detect_file_args(char **argv, u8 *prog_in);
#endif

afl-fuzz.c (4631 lines changed): diff too large to display.

afl-gcc.c:

@ -252,6 +252,10 @@ static void edit_params(u32 argc, char** argv) {
}
#ifdef USEMMAP
cc_params[cc_par_cnt++] = "-lrt";
#endif
if (!getenv("AFL_DONT_OPTIMIZE")) {
#if defined(__FreeBSD__) && defined(__x86_64__)
@ -304,6 +308,7 @@ int main(int argc, char** argv) {
if (isatty(2) && !getenv("AFL_QUIET")) {
SAYF(cCYA "afl-cc" VERSION cRST " by <lcamtuf@google.com>\n");
SAYF(cYEL "[!] " cBRI "NOTE: " cRST "afl-gcc is deprecated, llvm_mode is much faster and has more options\n");
} else be_quiet = 1;

afl-showmap.c:

@ -28,6 +28,8 @@
#include "debug.h"
#include "alloc-inl.h"
#include "hash.h"
#include "sharedmem.h"
#include "afl-common.h"
#include <stdio.h>
#include <unistd.h>
@ -48,7 +50,7 @@
static s32 child_pid; /* PID of the tested program */
static u8* trace_bits; /* SHM with instrumentation bitmap */
u8* trace_bits; /* SHM with instrumentation bitmap */
static u8 *out_file, /* Trace output file */
*doc_path, /* Path to docs */
@ -59,8 +61,6 @@ static u32 exec_tmout; /* Exec timeout (ms) */
static u64 mem_limit = MEM_LIMIT; /* Memory limit (MB) */
static s32 shm_id; /* ID of the SHM region */
static u8 quiet_mode, /* Hide non-essential messages? */
edges_only, /* Ignore hit counts? */
cmin_mode, /* Generate output in afl-cmin mode? */
@ -126,39 +126,6 @@ static void classify_counts(u8* mem, const u8* map) {
}
/* Get rid of shared memory (atexit handler). */
static void remove_shm(void) {
shmctl(shm_id, IPC_RMID, NULL);
}
/* Configure shared memory. */
static void setup_shm(void) {
u8* shm_str;
shm_id = shmget(IPC_PRIVATE, MAP_SIZE, IPC_CREAT | IPC_EXCL | 0600);
if (shm_id < 0) PFATAL("shmget() failed");
atexit(remove_shm);
shm_str = alloc_printf("%d", shm_id);
setenv(SHM_ENV_VAR, shm_str, 1);
ck_free(shm_str);
trace_bits = shmat(shm_id, NULL, 0);
if (!trace_bits) PFATAL("shmat() failed");
}
/* Write results. */
static u32 write_results(void) {
@ -413,50 +380,6 @@ static void setup_signal_handlers(void) {
}
/* Detect @@ in args. */
static void detect_file_args(char** argv) {
u32 i = 0;
u8* cwd = getcwd(NULL, 0);
if (!cwd) PFATAL("getcwd() failed");
while (argv[i]) {
u8* aa_loc = strstr(argv[i], "@@");
if (aa_loc) {
u8 *aa_subst, *n_arg;
if (!at_file) FATAL("@@ syntax is not supported by this tool.");
/* Be sure that we're always using fully-qualified paths. */
if (at_file[0] == '/') aa_subst = at_file;
else aa_subst = alloc_printf("%s/%s", cwd, at_file);
/* Construct a replacement argv value. */
*aa_loc = 0;
n_arg = alloc_printf("%s%s%s", argv[i], aa_subst, aa_loc + 2);
argv[i] = n_arg;
*aa_loc = '@';
if (at_file[0] != '/') ck_free(aa_subst);
}
i++;
}
free(cwd); /* not tracked */
}
/* Show banner. */
static void show_banner(void) {
@ -481,7 +404,9 @@ static void usage(u8* argv0) {
" -t msec - timeout for each run (none)\n"
" -m megs - memory limit for child process (%u MB)\n"
" -Q - use binary-only instrumentation (QEMU mode)\n\n"
" -Q - use binary-only instrumentation (QEMU mode)\n"
" -U - use Unicorn-based instrumentation (Unicorn mode)\n"
" (Not necessary, here for consistency with other afl-* tools)\n\n"
"Other settings:\n\n"
@ -610,19 +535,18 @@ static char** get_qemu_argv(u8* own_loc, char** argv, int argc) {
}
/* Main entry point */
int main(int argc, char** argv) {
s32 opt;
u8 mem_limit_given = 0, timeout_given = 0, qemu_mode = 0;
u8 mem_limit_given = 0, timeout_given = 0, qemu_mode = 0, unicorn_mode = 0;
u32 tcnt;
char** use_argv;
doc_path = access(DOC_PATH, F_OK) ? "docs" : DOC_PATH;
while ((opt = getopt(argc,argv,"+o:m:t:A:eqZQbc")) > 0)
while ((opt = getopt(argc,argv,"+o:m:t:A:eqZQUbc")) > 0)
switch (opt) {
@ -719,6 +643,14 @@ int main(int argc, char** argv) {
qemu_mode = 1;
break;
case 'U':
if (unicorn_mode) FATAL("Multiple -U options not supported");
if (!mem_limit_given) mem_limit = MEM_LIMIT_UNICORN;
unicorn_mode = 1;
break;
case 'b':
/* Secret undocumented mode. Writes output in raw binary format
@ -741,7 +673,7 @@ int main(int argc, char** argv) {
if (optind == argc || !out_file) usage(argv[0]);
setup_shm();
setup_shm(0);
setup_signal_handlers();
set_up_environment();
@ -753,7 +685,7 @@ int main(int argc, char** argv) {
ACTF("Executing '%s'...\n", target_path);
}
detect_file_args(argv + optind);
detect_file_args(argv + optind, at_file);
if (qemu_mode)
use_argv = get_qemu_argv(argv[0], argv + optind, argc - optind);

afl-system-config:

@ -1,5 +1,9 @@
#!/bin/sh
echo This reconfigures the system to have a better fuzzing performance
if [ '!' "$EUID" = 0 ] && [ '!' `id -u` = 0 ] ; then
echo Error you need to be root to run this
exit 1
fi
sysctl -w kernel.core_pattern=core
sysctl -w kernel.randomize_va_space=0
sysctl -w kernel.sched_child_runs_first=1
@ -7,7 +11,11 @@ sysctl -w kernel.sched_autogroup_enabled=1
sysctl -w kernel.sched_migration_cost_ns=50000000
sysctl -w kernel.sched_latency_ns=250000000
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo performance | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor > /dev/null
test -e /sys/devices/system/cpu/cpufreq/scaling_governor && echo performance | tee /sys/devices/system/cpu/cpufreq/scaling_governor
test -e /sys/devices/system/cpu/cpufreq/policy0/scaling_governor && echo performance | tee /sys/devices/system/cpu/cpufreq/policy*/scaling_governor
test -e /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor && echo performance | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
test -e /sys/devices/system/cpu/intel_pstate/no_turbo && echo 0 > /sys/devices/system/cpu/intel_pstate/no_turbo
test -e /sys/devices/system/cpu/cpufreq/boost && echo 1 > /sys/devices/system/cpu/cpufreq/boost
echo
echo It is recommended to boot the kernel with lots of security off - if you are running a machine that is in a secured network - so set this:
echo '/etc/default/grub:GRUB_CMDLINE_LINUX_DEFAULT="ibpb=off ibrs=off kpti=off l1tf=off mds=off mitigations=off no_stf_barrier noibpb noibrs nopcid nopti nospec_store_bypass_disable nospectre_v1 nospectre_v2 pcid=off pti=off spec_store_bypass_disable=off spectre_v2=off stf_barrier=off"'

afl-tmin.c:

@ -26,6 +26,8 @@
#include "debug.h"
#include "alloc-inl.h"
#include "hash.h"
#include "sharedmem.h"
#include "afl-common.h"
#include <stdio.h>
#include <unistd.h>
@ -44,10 +46,14 @@
#include <sys/types.h>
#include <sys/resource.h>
static s32 child_pid; /* PID of the tested program */
static s32 forksrv_pid, /* PID of the fork server */
child_pid; /* PID of the tested program */
static u8 *trace_bits, /* SHM with instrumentation bitmap */
*mask_bitmap; /* Mask for trace bits (-B) */
static s32 fsrv_ctl_fd, /* Fork server control pipe (write) */
fsrv_st_fd; /* Fork server status pipe (read) */
u8 *trace_bits; /* SHM with instrumentation bitmap */
static u8 *mask_bitmap; /* Mask for trace bits (-B) */
static u8 *in_file, /* Minimizer input test case */
*out_file, /* Minimizer output file */
@ -55,6 +61,8 @@ static u8 *in_file, /* Minimizer input test case */
*target_path, /* Path to target binary */
*doc_path; /* Path to docs */
static s32 prog_in_fd; /* Persistent fd for prog_in */
static u8* in_data; /* Input data for trimming */
static u32 in_len, /* Input data length */
@ -67,8 +75,7 @@ static u32 in_len, /* Input data length */
static u64 mem_limit = MEM_LIMIT; /* Memory limit (MB) */
static s32 shm_id, /* ID of the SHM region */
dev_null_fd = -1; /* FD to /dev/null */
static s32 dev_null_fd = -1; /* FD to /dev/null */
static u8 crash_mode, /* Crash-centric mode? */
exit_crash, /* Treat non-zero exit as crash? */
@ -153,42 +160,12 @@ static inline u8 anything_set(void) {
}
/* Get rid of temp files (atexit handler). */
/* Get rid of shared memory and temp files (atexit handler). */
static void remove_shm(void) {
static void at_exit_handler(void) {
if (prog_in) unlink(prog_in); /* Ignore errors */
shmctl(shm_id, IPC_RMID, NULL);
}
/* Configure shared memory. */
static void setup_shm(void) {
u8* shm_str;
shm_id = shmget(IPC_PRIVATE, MAP_SIZE, IPC_CREAT | IPC_EXCL | 0600);
if (shm_id < 0) PFATAL("shmget() failed");
atexit(remove_shm);
shm_str = alloc_printf("%d", shm_id);
setenv(SHM_ENV_VAR, shm_str, 1);
ck_free(shm_str);
trace_bits = shmat(shm_id, NULL, 0);
if (!trace_bits) PFATAL("shmat() failed");
}
/* Read initial file. */
static void read_initial_file(void) {
@ -236,38 +213,70 @@ static s32 write_to_file(u8* path, u8* mem, u32 len) {
}
/* Write modified data to file for testing. If use_stdin is clear, the old file
is unlinked and a new one is created. Otherwise, prog_in_fd is rewound and
truncated. */
static void write_to_testcase(void* mem, u32 len) {
s32 fd = prog_in_fd;
if (!use_stdin) {
unlink(prog_in); /* Ignore errors. */
fd = open(prog_in, O_WRONLY | O_CREAT | O_EXCL, 0600);
if (fd < 0) PFATAL("Unable to create '%s'", prog_in);
} else lseek(fd, 0, SEEK_SET);
ck_write(fd, mem, len, prog_in);
if (use_stdin) {
if (ftruncate(fd, len)) PFATAL("ftruncate() failed");
lseek(fd, 0, SEEK_SET);
} else close(fd);
}
/* Handle timeout signal. */
static void handle_timeout(int sig) {
if (child_pid > 0) {
child_timed_out = 1;
if (child_pid > 0) kill(child_pid, SIGKILL);
kill(child_pid, SIGKILL);
} else if (child_pid == -1 && forksrv_pid > 0) {
child_timed_out = 1;
kill(forksrv_pid, SIGKILL);
}
}
/* Execute target application. Returns 0 if the changes are a dud, or
1 if they should be kept. */
static u8 run_target(char** argv, u8* mem, u32 len, u8 first_run) {
/* start the app and its fork server */
static void init_forkserver(char **argv) {
static struct itimerval it;
int st_pipe[2], ctl_pipe[2];
int status = 0;
s32 rlen;
s32 prog_in_fd;
u32 cksum;
ACTF("Spinning up the fork server...");
if (pipe(st_pipe) || pipe(ctl_pipe)) PFATAL("pipe() failed");
memset(trace_bits, 0, MAP_SIZE);
MEM_BARRIER();
forksrv_pid = fork();
prog_in_fd = write_to_file(prog_in, mem, len);
if (forksrv_pid < 0) PFATAL("fork() failed");
child_pid = fork();
if (child_pid < 0) PFATAL("fork() failed");
if (!child_pid) {
if (!forksrv_pid) {
struct rlimit r;
@ -304,6 +313,16 @@ static u8 run_target(char** argv, u8* mem, u32 len, u8 first_run) {
r.rlim_max = r.rlim_cur = 0;
setrlimit(RLIMIT_CORE, &r); /* Ignore errors */
/* Set up control and status pipes, close the unneeded original fds. */
if (dup2(ctl_pipe[0], FORKSRV_FD) < 0) PFATAL("dup2() failed");
if (dup2(st_pipe[1], FORKSRV_FD + 1) < 0) PFATAL("dup2() failed");
close(ctl_pipe[0]);
close(ctl_pipe[1]);
close(st_pipe[0]);
close(st_pipe[1]);
execv(target_path, argv);
*(u32*)trace_bits = EXEC_FAIL_SIG;
@ -311,17 +330,113 @@ static u8 run_target(char** argv, u8* mem, u32 len, u8 first_run) {
}
close(prog_in_fd);
/* Close the unneeded endpoints. */
close(ctl_pipe[0]);
close(st_pipe[1]);
fsrv_ctl_fd = ctl_pipe[1];
fsrv_st_fd = st_pipe[0];
/* Configure timeout, wait for child, cancel timeout. */
if (exec_tmout) {
child_timed_out = 0;
it.it_value.tv_sec = (exec_tmout * FORK_WAIT_MULT / 1000);
it.it_value.tv_usec = ((exec_tmout * FORK_WAIT_MULT) % 1000) * 1000;
}
setitimer(ITIMER_REAL, &it, NULL);
rlen = read(fsrv_st_fd, &status, 4);
it.it_value.tv_sec = 0;
it.it_value.tv_usec = 0;
setitimer(ITIMER_REAL, &it, NULL);
/* If we have a four-byte "hello" message from the server, we're all set.
Otherwise, try to figure out what went wrong. */
if (rlen == 4) {
ACTF("All right - fork server is up.");
return;
}
if (waitpid(forksrv_pid, &status, 0) <= 0)
PFATAL("waitpid() failed");
u8 child_crashed = 0;
if (WIFSIGNALED(status))
child_crashed = 1;
if (child_timed_out)
SAYF(cLRD "\n+++ Program timed out +++\n" cRST);
else if (stop_soon)
SAYF(cLRD "\n+++ Program aborted by user +++\n" cRST);
else if (child_crashed)
SAYF(cLRD "\n+++ Program killed by signal %u +++\n" cRST, WTERMSIG(status));
}
/* Execute target application. Returns 0 if the changes are a dud, or
1 if they should be kept. */
static u8 run_target(char** argv, u8* mem, u32 len, u8 first_run) {
static struct itimerval it;
static u32 prev_timed_out = 0;
int status = 0;
u32 cksum;
memset(trace_bits, 0, MAP_SIZE);
MEM_BARRIER();
write_to_testcase(mem, len);
s32 res;
/* we have the fork server up and running, so simply
tell it to have at it, and then read back PID. */
if ((res = write(fsrv_ctl_fd, &prev_timed_out, 4)) != 4) {
if (stop_soon) return 0;
RPFATAL(res, "Unable to request new process from fork server (OOM?)");
}
if ((res = read(fsrv_st_fd, &child_pid, 4)) != 4) {
if (stop_soon) return 0;
RPFATAL(res, "Unable to request new process from fork server (OOM?)");
}
if (child_pid <= 0) FATAL("Fork server is misbehaving (OOM?)");
/* Configure timeout, wait for child, cancel timeout. */
if (exec_tmout) {
child_timed_out = 0;
it.it_value.tv_sec = (exec_tmout / 1000);
it.it_value.tv_usec = (exec_tmout % 1000) * 1000;
}
setitimer(ITIMER_REAL, &it, NULL);
if (waitpid(child_pid, &status, 0) <= 0) FATAL("waitpid() failed");
if ((res = read(fsrv_st_fd, &status, 4)) != 4) {
if (stop_soon) return 0;
RPFATAL(res, "Unable to communicate with fork server (OOM?)");
}
child_pid = 0;
it.it_value.tv_sec = 0;
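The exec_tmout conversion above (milliseconds into the seconds/microseconds pair of struct itimerval) is easy to get wrong; a small helper mirroring the same computation:

```c
#include <assert.h>
#include <sys/time.h>

/* Convert a millisecond timeout into the itimerval layout used by
   setitimer(), mirroring the exec_tmout computation above. */
static struct itimerval tmout_to_itimer(unsigned int msec) {
  struct itimerval it = {0};
  it.it_value.tv_sec  = msec / 1000;          /* whole seconds      */
  it.it_value.tv_usec = (msec % 1000) * 1000; /* remainder, in usec */
  return it;
}
```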
@ -556,7 +671,7 @@ next_del_blksize:
alpha_del1 = 0;
syms_removed = 0;
memset(alpha_map, 0, 256 * sizeof(u32));
memset(alpha_map, 0, sizeof(alpha_map));
for (i = 0; i < in_len; i++) {
if (!alpha_map[in_data[i]]) alpha_size++;
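The alphabet-minimization pass above counts distinct byte values via a 256-slot histogram; the counting logic in isolation (a sketch of the idea, not the real alpha_map code):

```c
#include <assert.h>
#include <stddef.h>

/* Count distinct byte values in a buffer, the way alpha_size is
   derived in the minimizer: bump a per-byte histogram and count
   first occurrences. */
static unsigned int count_alpha(const unsigned char* data, size_t len) {
  unsigned int alpha_map[256] = {0}, alpha_size = 0;
  for (size_t i = 0; i < len; i++) {
    if (!alpha_map[data[i]]) alpha_size++;  /* first time this byte appears */
    alpha_map[data[i]]++;
  }
  return alpha_size;
}
```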
@ -687,6 +802,13 @@ static void set_up_environment(void) {
}
unlink(prog_in);
prog_in_fd = open(prog_in, O_RDWR | O_CREAT | O_EXCL, 0600);
if (prog_in_fd < 0) PFATAL("Unable to create '%s'", prog_in);
/* Set sane defaults... */
x = getenv("ASAN_OPTIONS");
@ -760,48 +882,6 @@ static void setup_signal_handlers(void) {
}
/* Detect @@ in args. */
static void detect_file_args(char** argv) {
u32 i = 0;
u8* cwd = getcwd(NULL, 0);
if (!cwd) PFATAL("getcwd() failed");
while (argv[i]) {
u8* aa_loc = strstr(argv[i], "@@");
if (aa_loc) {
u8 *aa_subst, *n_arg;
/* Be sure that we're always using fully-qualified paths. */
if (prog_in[0] == '/') aa_subst = prog_in;
else aa_subst = alloc_printf("%s/%s", cwd, prog_in);
/* Construct a replacement argv value. */
*aa_loc = 0;
n_arg = alloc_printf("%s%s%s", argv[i], aa_subst, aa_loc + 2);
argv[i] = n_arg;
*aa_loc = '@';
if (prog_in[0] != '/') ck_free(aa_subst);
}
i++;
}
free(cwd); /* not tracked */
}
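The @@ substitution above can be exercised on its own; a simplified sketch that uses snprintf instead of afl's alloc_printf (the helper name is illustrative):

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Replace the first "@@" in arg with path, as detect_file_args() does.
   Caller frees the result; returns NULL if no "@@" is present. */
static char* subst_aa(const char* arg, const char* path) {
  const char* aa = strstr(arg, "@@");
  if (!aa) return NULL;

  /* Length: arg minus the two '@' chars, plus the path, plus NUL. */
  size_t n = strlen(arg) - 2 + strlen(path) + 1;
  char* out = malloc(n);
  if (!out) return NULL;

  snprintf(out, n, "%.*s%s%s", (int)(aa - arg), arg, path, aa + 2);
  return out;
}
```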
/* Display usage hints. */
static void usage(u8* argv0) {
@ -818,7 +898,9 @@ static void usage(u8* argv0) {
" -f file - input file read by the tested program (stdin)\n"
" -t msec - timeout for each run (%u ms)\n"
" -m megs - memory limit for child process (%u MB)\n"
" -Q - use binary-only instrumentation (QEMU mode)\n\n"
" -Q - use binary-only instrumentation (QEMU mode)\n"
" -U - use Unicorn-based instrumentation (Unicorn mode)\n\n"
" (Not necessary, here for consistency with other afl-* tools)\n\n"
"Minimization settings:\n\n"
@ -945,7 +1027,6 @@ static char** get_qemu_argv(u8* own_loc, char** argv, int argc) {
}
/* Read mask bitmap from file. This is for the -B option. */
static void read_bitmap(u8* fname) {
@ -967,14 +1048,14 @@ static void read_bitmap(u8* fname) {
int main(int argc, char** argv) {
s32 opt;
u8 mem_limit_given = 0, timeout_given = 0, qemu_mode = 0;
u8 mem_limit_given = 0, timeout_given = 0, qemu_mode = 0, unicorn_mode = 0;
char** use_argv;
doc_path = access(DOC_PATH, F_OK) ? "docs" : DOC_PATH;
SAYF(cCYA "afl-tmin" VERSION cRST " by <lcamtuf@google.com>\n");
while ((opt = getopt(argc,argv,"+i:o:f:m:t:B:xeQ")) > 0)
while ((opt = getopt(argc,argv,"+i:o:f:m:t:B:xeQU")) > 0)
switch (opt) {
@ -1066,6 +1147,14 @@ int main(int argc, char** argv) {
qemu_mode = 1;
break;
case 'U':
if (unicorn_mode) FATAL("Multiple -U options not supported");
if (!mem_limit_given) mem_limit = MEM_LIMIT_UNICORN;
unicorn_mode = 1;
break;
case 'B': /* load bitmap */
/* This is a secret undocumented option! It is speculated to be useful
@ -1094,13 +1183,14 @@ int main(int argc, char** argv) {
if (optind == argc || !in_file || !out_file) usage(argv[0]);
setup_shm();
setup_shm(0);
atexit(at_exit_handler);
setup_signal_handlers();
set_up_environment();
find_binary(argv[optind]);
detect_file_args(argv + optind);
detect_file_args(argv + optind, prog_in);
if (qemu_mode)
use_argv = get_qemu_argv(argv[0], argv + optind, argc - optind);
@ -1113,6 +1203,8 @@ int main(int argc, char** argv) {
read_initial_file();
init_forkserver(use_argv);
ACTF("Performing dry run (mem limit = %llu MB, timeout = %u ms%s)...",
mem_limit, exec_tmout, edges_only ? ", edges only" : "");

View File

@ -76,6 +76,17 @@
/* Sanity-checking macros for pointers. */
#define CHECK_PTR(_p) do { \
if (_p) { \
if (ALLOC_C1(_p) ^ ALLOC_MAGIC_C1) {\
if (ALLOC_C1(_p) == ALLOC_MAGIC_F) \
ABORT("Use after free."); \
else ABORT("Corrupted head alloc canary."); \
} \
} \
} while (0)
/*
#define CHECK_PTR(_p) do { \
if (_p) { \
if (ALLOC_C1(_p) ^ ALLOC_MAGIC_C1) {\
@ -87,6 +98,7 @@
ABORT("Corrupted tail alloc canary."); \
} \
} while (0)
*/
#define CHECK_PTR_EXPR(_p) ({ \
typeof (_p) _tmp = (_p); \

View File

@ -21,7 +21,7 @@
/* Version string: */
#define VERSION "++2.52c"
#define VERSION "++2.53c"
/******************************************************
* *
@ -59,6 +59,10 @@
#define MEM_LIMIT_QEMU 200
/* Default memory limit when running in Unicorn mode (MB): */
#define MEM_LIMIT_UNICORN 200
/* Number of calibration cycles per every new test case (and for test
cases that show variable behavior): */
@ -83,6 +87,7 @@
of 32-bit int overflows): */
#define HAVOC_MAX_MULT 16
#define HAVOC_MAX_MULT_MOPT 32
/* Absolute minimum number of havoc cycles (after all adjustments): */

View File

@ -13,14 +13,58 @@ Want to stay in the loop on major new features? Join our mailing list by
sending a mail to <afl-users+subscribe@googlegroups.com>.
--------------------------
Version ++2.53c (release):
--------------------------
- README is now README.md
- imported the few minor changes from the 2.53b release
- unicorn_mode got added - thanks to domenukk for the patch!
- fix llvm_mode AFL_TRACE_PC with modern llvm
- fix a crash in qemu_mode which also exists in stock afl
- added libcompcov, a laf-intel implementation for qemu! :)
see qemu_mode/libcompcov/README.libcompcov
- afl-fuzz now displays the selected core in the status screen (blue {#})
- updated afl-fuzz and afl-system-config for new scaling governor location
in modern kernels
- using the old ineffective afl-gcc will now show a deprecation warning
- all queue, hang and crash files now have their discovery time in their name
- if llvm_mode was compiled, afl-clang/afl-clang++ will point to these
instead of afl-gcc
- added instrim, a much faster llvm_mode instrumentation at the cost of
path discovery. See llvm_mode/README.instrim (https://github.com/csienslab/instrim)
- added MOpt (github.com/puppet-meteor/MOpt-AFL) mode, see docs/README.MOpt
- added code to make it more portable to other platforms than Intel Linux
- added never zero counters for afl-gcc and optionally (because of an
optimization issue in llvm < 9) for llvm_mode (AFL_LLVM_NEVER_ZERO=1)
- added a new doc about binary only fuzzing: docs/binaryonly_fuzzing.txt
- more cpu power for afl-system-config
- added forkserver patch to afl-tmin, makes it much faster (originally from
github.com/nccgroup/TriforceAFL)
- added whitelist support for llvm_mode via AFL_LLVM_WHITELIST to allow
only to instrument what is actually interesting. Gives more speed and less
map pollution (originally by choller@mozilla)
- added Python Module mutator support, python2.7-dev is autodetected.
see docs/python_mutators.txt (originally by choller@mozilla)
- added AFL_CAL_FAST for slow applications and AFL_DEBUG_CHILD_OUTPUT for
debugging
- added -V time and -E execs options for better comparison runs; runs
afl-fuzz for a specific time/number of executions.
- added a -s seed switch to allow an afl run with a fixed initial
seed that is not updated. This is good for performance and path discovery
tests as the random numbers are deterministic then
- llvm_mode LAF_... env variables can now be specified as AFL_LLVM_LAF_...
that is longer but in line with other llvm specific env vars
-----------------------------
Version ++2.52c (2019-05-28):
Version ++2.52c (2019-06-05):
-----------------------------
- Applied community patches. See docs/PATCHES for the full list.
LLVM and Qemu modes are now faster.
Important changes:
afl-fuzz: -e EXTENSION commandline option
afl-fuzz: -e EXTENSION commandline option
llvm_mode: LAF-intel performance (needs activation, see llvm/README.laf-intel)
a few new environment variables for afl-fuzz, llvm and qemu, see docs/env_variables.txt
- Added the power schedules of AFLfast by Marcel Boehme, but set the default

View File

@ -17,7 +17,14 @@ afl-qemu-optimize-entrypoint.diff by mh(at)mh-sec(dot)de
afl-qemu-speed.diff by abiondo on github
afl-qemu-optimize-map.diff by mh(at)mh-sec(dot)de
additionally AFLfast additions (github.com/mboehme/aflfast) were incorporated.
+ unicorn_mode (modernized and updated by domenukk)
+ instrim (https://github.com/csienslab/instrim) was integrated
+ MOpt (github.com/puppet-meteor/MOpt-AFL) was imported
+ AFLfast additions (github.com/mboehme/aflfast) were incorporated.
+ Qemu 3.1 upgrade with enhancement patches (github.com/andreafioraldi/afl)
+ Python mutator modules support (github.com/choller/afl)
+ Whitelisting in LLVM mode (github.com/choller/afl)
+ forkserver patch for afl-tmin (github.com/nccgroup/TriforceAFL)
NOT INSTALLED

View File

@ -2,7 +2,7 @@
AFL quick start guide
=====================
You should read docs/README. It's pretty short. If you really can't, here's
You should read docs/README.md - it's pretty short. If you really can't, here's
how to hit the ground running:
1) Compile AFL with 'make'. If build fails, see docs/INSTALL for tips.
@ -12,10 +12,12 @@ how to hit the ground running:
If testing a network service, modify it to run in the foreground and read
from stdin. When fuzzing a format that uses checksums, comment out the
checksum verification code, too.
If this is not possible (e.g. in -Q(emu) mode) then use AFL_POST_LIBRARY
to calculate the values with your own library.
The program must crash properly when a fault is encountered. Watch out for
custom SIGSEGV or SIGABRT handlers and background processes. For tips on
detecting non-crashing flaws, see section 11 in docs/README.
detecting non-crashing flaws, see section 11 in docs/README.md .
3) Compile the program / library to be fuzzed using afl-gcc. A common way to
do this would be:
@ -40,10 +42,13 @@ how to hit the ground running:
6) Investigate anything shown in red in the fuzzer UI by promptly consulting
docs/status_screen.txt.
7) Compile and use llvm_mode (afl-clang-fast/afl-clang-fast++), as it is way
faster and has a few cool features.
That's it. Sit back, relax, and - time permitting - try to skim through the
following files:
- docs/README - A general introduction to AFL,
- docs/README.md - A general introduction to AFL,
- docs/perf_tips.txt - Simple tips on how to fuzz more quickly,
- docs/status_screen.txt - An explanation of the tidbits shown in the UI,
- docs/parallel_fuzzing.txt - Advice on running AFL on multiple cores.

docs/README.MOpt Normal file
View File

@ -0,0 +1,51 @@
# MOpt(imized) AFL by <puppet@zju.edu.cn>
### 1. Description
MOpt-AFL is an AFL-based fuzzer that utilizes a customized Particle Swarm
Optimization (PSO) algorithm to find the optimal selection probability
distribution of operators with respect to fuzzing effectiveness.
More details can be found in the technical report.
### 2. Cite Information
Chenyang Lyu, Shouling Ji, Chao Zhang, Yuwei Li, Wei-Han Lee, Yu Song and
Raheem Beyah, MOPT: Optimized Mutation Scheduling for Fuzzers,
USENIX Security 2019.
### 3. Seed Sets
We open source all the seed sets used in the paper
"MOPT: Optimized Mutation Scheduling for Fuzzers".
### 4. Experiment Results
The experiment results can be found in
https://drive.google.com/drive/folders/184GOzkZGls1H2NuLuUfSp9gfqp1E2-lL?usp=sharing.
We only open source the crash files since the space is limited.
### 5. Technical Report
MOpt_TechReport.pdf is the technical report of the paper
"MOPT: Optimized Mutation Scheduling for Fuzzers", which contains more details.
### 6. Parameter Introduction
Most importantly, you must add the parameter `-L` (e.g., `-L 0`) to launch the
MOpt scheme.
Option '-L' controls the time to move on to the pacemaker fuzzing mode.
'-L t': when MOpt-AFL finishes the mutation of one input, if it has not
discovered any new unique crash or path for more than t minutes, MOpt-AFL will
enter the pacemaker fuzzing mode.
Setting 0 will enter the pacemaker fuzzing mode at first, which is
recommended in a short time-scale evaluation.
Other important parameters can be found in afl-fuzz.c, for instance,
'swarm_num': the number of the PSO swarms used in the fuzzing process.
'period_pilot': how many times MOpt-AFL will execute the target program
in the pilot fuzzing module, then it will enter the core fuzzing module.
'period_core': how many times MOpt-AFL will execute the target program in the
core fuzzing module, then it will enter the PSO updating module.
'limit_time_bound': control how many interesting test cases need to be found
before MOpt-AFL quits the pacemaker fuzzing mode and reuses the deterministic stage.
0 < 'limit_time_bound' < 1, MOpt-AFL-tmp.
'limit_time_bound' >= 1, MOpt-AFL-ever.
Have fun with MOpt in AFL!
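The '-L t' behaviour described above can be sketched as a simple time check (the names below are illustrative, not the actual afl-fuzz.c internals):

```c
#include <assert.h>

/* Decide whether MOpt should enter pacemaker fuzzing mode: no new
   unique crash or path for more than limit_min minutes, where
   limit_min == 0 enters pacemaker mode right away (-L 0). */
static int should_enter_pacemaker(unsigned long now_ms,
                                  unsigned long last_find_ms,
                                  unsigned long limit_min) {
  if (limit_min == 0) return 1;  /* -L 0: pacemaker mode immediately */
  return now_ms - last_find_ms > limit_min * 60UL * 1000UL;
}
```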

docs/README.md Symbolic link
View File

@ -0,0 +1 @@
../README.md

docs/binaryonly_fuzzing.txt Normal file
View File

@ -0,0 +1,140 @@
Fuzzing binary-only programs with afl++
=======================================
afl++, libfuzzer and others are great if you have the source code, as
they allow for very fast and coverage guided fuzzing.
However, if there is only the binary program and no source code available,
then standard afl++ (dumb mode) is not effective.
The following is a description of how such binaries can be fuzzed with afl++:
!!!!!
TL;DR: try DYNINST with afl-dyninst. If it produces too many crashes then
use afl -Q qemu_mode, or better: use both in parallel.
!!!!!
QEMU
----
Qemu is the "native" solution to the problem.
It is available in the ./qemu_mode/ directory and once compiled it can
be accessed by the afl-fuzz -Q command line option.
The speed decrease is about 50%.
It is the easiest to use alternative and even works for cross-platform binaries.
As it is included in afl++ this needs no URL.
UNICORN
-------
Unicorn is a fork of QEMU. The instrumentation is, therefore, very similar.
In contrast to QEMU, Unicorn does not offer a full system or even userland emulation.
Runtime environment and/or loaders have to be written from scratch, if needed.
On top, block chaining has been removed. This means the speed boost introduced
into the patched QEMU mode of afl++ cannot simply be ported over to Unicorn.
For further information, check out ./unicorn_mode.txt.
DYNINST
-------
Dyninst is a binary instrumentation framework similar to Pintool and Dynamorio
(see far below). However, whereas Pintool and Dynamorio work at runtime, dyninst
instruments the target at load time and then lets it run.
This is great for some things, e.g. fuzzing, and not so effective for others,
e.g. malware analysis.
So what we can do with dyninst is take every basic block, put afl's
instrumentation code in there, and then save the binary.
Afterwards we can just fuzz the newly saved target binary with afl-fuzz.
Sounds great? It is. The issue though: it is a non-trivial problem to
insert instructions that change addresses in the process space in such
a way that everything still works afterwards. Hence, more often than not,
binaries crash when they are run (because of the instrumentation).
The speed decrease is about 15-35%, depending on the optimization options
used with afl-dyninst.
So if dyninst works, it is the best option available. Otherwise it just doesn't
work well.
https://github.com/vanhauser-thc/afl-dyninst
INTEL-PT
--------
If you have a newer Intel CPU, you can make use of Intel's Processor Trace (PT).
The big issue with Intel's PT is the small buffer size and the complex
encoding of the debug information collected through PT.
This makes the decoding very CPU intensive and hence slow.
As a result, the overall speed decrease is about 70-90% (depending on
the implementation and other factors).
There are two afl intel-pt implementations:
1. https://github.com/junxzm1990/afl-pt
=> this needs Ubuntu 14.04.05 without any updates and the 4.4 kernel.
2. https://github.com/hunter-ht-2018/ptfuzzer
=> this needs a 4.14 or 4.15 kernel. the "nopti" kernel boot option must
be used. This one is faster than the other.
CORESIGHT
---------
Coresight is ARM's answer to Intel's PT.
There is no implementation so far which handles coresight, and getting
it working on ARM Linux is very difficult because building custom kernels
for embedded systems is hard. And finding one that has coresight in
the ARM chip is difficult too.
My guess is that it is slower than Qemu, but faster than Intel PT.
If anyone finds any coresight implementation for afl please ping me:
vh@thc.org
PIN & DYNAMORIO
---------------
Pintool and Dynamorio are dynamic instrumentation engines, and they can be
used for getting basic block information at runtime.
Pintool is only available for Intel x32/x64 on Linux, Mac OS and Windows
whereas Dynamorio is additionally available for ARM and AARCH64.
Dynamorio is also 10x faster than Pintool.
The big issue with Dynamorio (and therefore Pintool too) is speed.
Dynamorio has a speed decrease of 98-99%
Pintool has a speed decrease of 99.5%
Hence Dynamorio is the option to go for if everything fails, and Pintool
only if Dynamorio fails too.
Dynamorio solutions:
https://github.com/vanhauser-thc/afl-dynamorio
https://github.com/mxmssh/drAFL
https://github.com/googleprojectzero/winafl/ <= very good but windows only
Pintool solutions:
https://github.com/vanhauser-thc/afl-pin
https://github.com/mothran/aflpin
https://github.com/spinpx/afl_pin_mode <= only old Pintool version supported
Non-AFL solutions
-----------------
There are many binary-only fuzzing frameworks. Some are great for CTFs but don't
work with large binaries, others are very slow but have good path discovery,
some are very hard to set up ...
QSYM: https://github.com/sslab-gatech/qsym
Manticore: https://github.com/trailofbits/manticore
S2E: https://github.com/S2E
<please send me any missing that are good>
That's it!
News, corrections, updates?
Email vh@thc.org

View File

@ -7,8 +7,8 @@ Environmental variables
users or for some types of custom fuzzing setups. See README for the general
instruction manual.
1) Settings for afl-gcc, afl-clang, and afl-as
----------------------------------------------
1) Settings for afl-gcc, afl-clang, and afl-as - and gcc_plugin afl-gcc-fast
----------------------------------------------------------------------------
Because they can't directly accept command-line options, the compile-time
tools make fairly broad use of environmental variables:
@ -68,8 +68,11 @@ tools make fairly broad use of environmental variables:
- Setting AFL_QUIET will prevent afl-cc and afl-as banners from being
displayed during compilation, in case you find them distracting.
2) Settings for afl-clang-fast
------------------------------
- Setting AFL_CAL_FAST will speed up the initial calibration, if the
application is very slow
2) Settings for afl-clang-fast / afl-clang-fast++
-------------------------------------------------
The native LLVM instrumentation helper accepts a subset of the settings
discussed in section #1, with the exception of:
@ -79,9 +82,57 @@ discussed in section #1, with the exception of:
- TMPDIR and AFL_KEEP_ASSEMBLY, since no temporary assembly files are
created.
Note that AFL_INST_RATIO will behave a bit differently than for afl-gcc,
because functions are *not* instrumented unconditionally - so low values
will have a more striking effect. For this tool, 0 is not a valid choice.
- AFL_INST_RATIO, as we switched to instrim instrumentation, which
is more effective but does not make much sense together with this option.
Then there are a few specific features that are only available in llvm_mode:
LAF-INTEL
=========
This great feature will split compares into a series of single byte comparisons
to allow afl-fuzz to find otherwise rather impossible paths. It is not
restricted to Intel CPUs ;-)
- Setting AFL_LLVM_LAF_SPLIT_SWITCHES will split switch()es
- Setting AFL_LLVM_LAF_TRANSFORM_COMPARES will split string compare functions
- Setting AFL_LLVM_LAF_SPLIT_COMPARES will split > 8 bit CMP instructions
See llvm_mode/README.laf-intel for more information.
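The effect of the splitting can be illustrated in plain C: one 32-bit compare becomes four byte compares, each giving afl-fuzz a separately observable branch (a conceptual sketch, not the actual output of the LLVM pass):

```c
#include <assert.h>
#include <stdint.h>

/* Conceptual laf-intel split: compare a 32-bit value byte by byte,
   most significant byte first, so every matching byte is its own
   branch that coverage tracking can reward. */
static int split_eq32(uint32_t a, uint32_t b) {
  for (int shift = 24; shift >= 0; shift -= 8) {
    if (((a >> shift) & 0xff) != ((b >> shift) & 0xff)) return 0;
  }
  return 1;  /* all four byte compares matched */
}
```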
WHITELIST
=========
This feature allows selective instrumentation of the source
- Setting AFL_LLVM_WHITELIST with a filename will only instrument those
files that match the names listed in this file.
See llvm_mode/README.whitelist for more information.
INSTRIM
=======
This feature increases the speed by a whopping 20% but at the cost of
lower path discovery and therefore coverage.
- Setting AFL_LLVM_INSTRIM activates this mode
- Setting AFL_LLVM_INSTRIM_LOOPHEAD=1 expands on INSTRIM to optimize loops.
afl-fuzz will only be able to see the path the loop took, but not how
many times it was called (unless it is a complex loop).
See llvm_mode/README.instrim
NOT_ZERO
========
- Setting AFL_LLVM_NOT_ZERO=1 during compilation will use counters
that skip zero on overflow. This is the default for llvm >= 9;
however, for llvm versions below that it causes an unnecessary
slowdown due to a performance issue that is only fixed in llvm 9+.
This feature increases path discovery by a little bit.
See llvm_mode/README.neverzero
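The never-zero counter trick fits in two lines: on overflow the hit counter wraps to 1 instead of 0, so an edge that has been taken can never look unvisited (a sketch of the idea, not the emitted instrumentation):

```c
#include <assert.h>
#include <stdint.h>

/* Skip-zero-on-overflow increment: a taken edge's hit counter can
   never wrap back to 0 and thus never appears as "never taken". */
static uint8_t never_zero_inc(uint8_t counter) {
  counter++;
  if (!counter) counter = 1;  /* 255 would wrap to 0; make it 1 */
  return counter;
}
```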
3) Settings for afl-fuzz
------------------------
@ -132,8 +183,8 @@ checks or alter some of the more exotic semantics of the tool:
- AFL_TMPDIR is used to write the .cur_input file to if exists, and in
the normal output directory otherwise. You would use this to point to
a ramdisk/tmpfs. This increases the speed by a very minimal value but
also reduces the stress on SSDs.
a ramdisk/tmpfs. This increases the speed by a small value but also
reduces the stress on SSDs.
- When developing custom instrumentation on top of afl-fuzz, you can use
AFL_SKIP_BIN_CHECK to inhibit the checks for non-instrumented binaries
@ -150,6 +201,11 @@ checks or alter some of the more exotic semantics of the tool:
mutated files - say, to fix up checksums. See experimental/post_library/
for more.
- AFL_PYTHON_MODULE and AFL_PYTHON_ONLY require afl-fuzz to be compiled
with -DUSE_PYTHON. Please see docs/python_mutators.txt
This feature allows you to configure custom mutators, which can be very
helpful e.g. when fuzzing XML or other highly structured input.
- AFL_FAST_CAL keeps the calibration stage about 2.5x faster (albeit less
precise), which can help when starting a session against a slow target.
@ -174,6 +230,9 @@ checks or alter some of the more exotic semantics of the tool:
processing the first queue entry; and AFL_BENCH_UNTIL_CRASH causes it to
exit soon after the first crash is found.
- Setting AFL_DEBUG_CHILD_OUTPUT will not suppress the child output.
Not pretty but good for debugging purposes.
4) Settings for afl-qemu-trace
------------------------------
@ -185,6 +244,10 @@ The QEMU wrapper used to instrument binary-only code supports several settings:
- Setting AFL_INST_LIBS causes the translator to also instrument the code
inside any dynamically linked libraries (notably including glibc).
- Setting AFL_QEMU_COMPCOV enables the CompareCoverage tracing of all
cmp and sub in x86 and x86_64. Support for other architectures and
comparison functions (mem/strcmp et al.) is planned.
- The underlying QEMU binary will recognize any standard "user space
emulation" variables (e.g., QEMU_STACK_SIZE), but there should be no

View File

@ -64,6 +64,14 @@ that can offer huge benefits for programs with high startup overhead. Both
modes require you to edit the source code of the fuzzed program, but the
changes often amount to just strategically placing a single line or two.
If there are important data comparisons performed (e.g. strcmp(ptr, MAGIC_HDR)),
then using laf-intel (see llvm_mode/README.laf-intel) will help afl-fuzz a lot
to get to the important parts in the code.
If you are only interested in specific parts of the code being fuzzed, you can
whitelist the files that are actually relevant. This improves the speed and
accuracy of afl. See llvm_mode/README.whitelist
4) Profile and optimize the binary
----------------------------------
@ -191,7 +199,7 @@ There are several OS-level factors that may affect fuzzing speed:
- Use the afl-system-config script to set all proc/sys settings above
- Disable all the spectre, meltdown etc. security countermeasures in the
kernel if your machine is properly seperated:
kernel if your machine is properly separated:
"ibpb=off ibrs=off kpti=off l1tf=off mds=off mitigations=off
no_stf_barrier noibpb noibrs nopcid nopti nospec_store_bypass_disable
nospectre_v1 nospectre_v2 pcid=off pti=off spec_store_bypass_disable=off

View File

@ -1,31 +1,26 @@
This document was copied and modified from AFLfast at github.com/mboehme/aflfast
afl++'s power schedules based on AFLfast
<a href="https://comp.nus.edu.sg/~mboehme/paper/CCS16.pdf"><img src="https://comp.nus.edu.sg/~mboehme/paper/CCS16.png" align="right" width="250"></a>
Power schedules implemented by Marcel Böhme \<marcel.boehme@acm.org\>.
AFLFast is an extension of AFL which was written by Michal Zalewski \<lcamtuf@google.com\>.
Essentially, we observed that most generated inputs exercise the same few
"high-frequency" paths and developed strategies to gravitate towards
low-frequency paths, to stress significantly more program behavior in the
same amount of time. We devised several **search strategies** that decide
in which order the seeds should be fuzzed and **power schedules** that
smartly regulate the number of inputs generated from a seed (i.e., the
time spent fuzzing a seed). We call the number of inputs generated from a
seed, the seed's **energy**.
AFLfast has helped in the success of Team Codejitsu at the finals of the DARPA Cyber Grand Challenge where their bot Galactica took **2nd place** in terms of #POVs proven (see red bar at https://www.cybergrandchallenge.com/event#results). AFLFast exposed several previously unreported CVEs that could not be exposed by AFL in 24 hours and otherwise exposed vulnerabilities significantly faster than AFL while generating orders of magnitude more unique crashes.
Old AFL used -p exploit, which had too high a cost; current afl++ uses -p explore.
Essentially, we observed that most generated inputs exercise the same few "high-frequency" paths and developed strategies to gravitate towards low-frequency paths, to stress significantly more program behavior in the same amount of time. We devised several **search strategies** that decide in which order the seeds should be fuzzed and **power schedules** that smartly regulate the number of inputs generated from a seed (i.e., the time spent fuzzing a seed). We call the number of inputs generated from a seed, the seed's **energy**.
AFLfast implemented 4 new power schedules which are highly recommended to run
in parallel.
We find that AFL's exploitation-based constant schedule assigns **too much energy to seeds exercising high-frequency paths** (e.g., paths that reject invalid inputs) and not enough energy to seeds exercising low-frequency paths (e.g., paths that stress interesting behaviors). Technically, we modified the computation of a seed's performance score (`calculate_score`), which seed is marked as favourite (`update_bitmap_score`), and which seed is chosen next from the circular queue (`main`). We implemented the following schedules (in the order of their effectiveness, best first):
| AFL flag | Power Schedule |
| ------------- | -------------------------- |
| `-p fast` (default)| ![FAST](http://latex.codecogs.com/gif.latex?p(i)=\\min\\left(\\frac{\\alpha(i)}{\\beta}\\cdot\\frac{2^{s(i)}}{f(i)},M\\right)) |
| `-p explore` (default)| ![EXPLORE](http://latex.codecogs.com/gif.latex?p%28i%29%3D%5Cfrac%7B%5Calpha%28i%29%7D%7B%5Cbeta%7D) |
| `-p fast` | ![FAST](http://latex.codecogs.com/gif.latex?p(i)=\\min\\left(\\frac{\\alpha(i)}{\\beta}\\cdot\\frac{2^{s(i)}}{f(i)},M\\right)) |
| `-p coe` | ![COE](http://latex.codecogs.com/gif.latex?p%28i%29%3D%5Cbegin%7Bcases%7D%200%20%26%20%5Ctext%7B%20if%20%7D%20f%28i%29%20%3E%20%5Cmu%5C%5C%20%5Cmin%5Cleft%28%5Cfrac%7B%5Calpha%28i%29%7D%7B%5Cbeta%7D%5Ccdot%202%5E%7Bs%28i%29%7D%2C%20M%5Cright%29%20%26%20%5Ctext%7B%20otherwise.%7D%20%5Cend%7Bcases%7D) |
| `-p explore` | ![EXPLORE](http://latex.codecogs.com/gif.latex?p%28i%29%3D%5Cfrac%7B%5Calpha%28i%29%7D%7B%5Cbeta%7D) |
| `-p quad` | ![QUAD](http://latex.codecogs.com/gif.latex?p%28i%29%20%3D%20%5Cmin%5Cleft%28%5Cfrac%7B%5Calpha%28i%29%7D%7B%5Cbeta%7D%5Ccdot%5Cfrac%7Bs%28i%29%5E2%7D%7Bf%28i%29%7D%2CM%5Cright%29) |
| `-p lin` | ![LIN](http://latex.codecogs.com/gif.latex?p%28i%29%20%3D%20%5Cmin%5Cleft%28%5Cfrac%7B%5Calpha%28i%29%7D%7B%5Cbeta%7D%5Ccdot%5Cfrac%7Bs%28i%29%7D%7Bf%28i%29%7D%2CM%5Cright%29) |
| `-p exploit` (AFL) | ![LIN](http://latex.codecogs.com/gif.latex?p%28i%29%20%3D%20%5Calpha%28i%29) |
where *α(i)* is the performance score that AFL uses to compute for the seed input *i*, *β(i)>1* is a constant, *s(i)* is the number of times that seed *i* has been chosen from the queue, *f(i)* is the number of generated inputs that exercise the same path as seed *i*, and *μ* is the average number of generated inputs exercising a path.
More details can be found in our paper that was recently accepted at the [23rd ACM Conference on Computer and Communications Security (CCS'16)](https://www.sigsac.org/ccs/CCS2016/accepted-papers/).
More details can be found in the paper that was accepted at the [23rd ACM Conference on Computer and Communications Security (CCS'16)](https://www.sigsac.org/ccs/CCS2016/accepted-papers/).
PS: In parallel mode (several instances with shared queue), we suggest to run the master using the exploit schedule (-p exploit) and the slaves with a combination of cut-off-exponential (-p coe), exponential (-p fast; default), and explore (-p explore) schedules. In single mode, the default settings will do. **EDIT:** In parallel mode, AFLFast seems to perform poorly because the path probability estimates are incorrect for the imported seeds. Pull requests to fix this issue by syncing the estimates accross instances are appreciated :)
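The fast schedule's energy formula from the table, p(i) = min(α(i)/β · 2^s(i)/f(i), M), can be transcribed directly (a plain transcription with illustrative arguments, not the afl-fuzz implementation):

```c
#include <assert.h>

/* Energy assigned by the FAST power schedule:
   p(i) = min(alpha(i)/beta * 2^s(i) / f(i), M)
   where s is the number of times seed i was chosen and f the number
   of generated inputs exercising the same path. */
static double fast_energy(double alpha, double beta, unsigned s,
                          double f, double M) {
  double p = (alpha / beta) * (double)(1ULL << s) / f;  /* 2^s(i) */
  return p < M ? p : M;  /* cap at M */
}
```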

docs/python_mutators.txt Normal file
View File

@ -0,0 +1,152 @@
==================================================
Adding custom mutators to AFL using Python modules
==================================================
This file describes how you can utilize the external Python API to write
your own custom mutation routines.
Note: This feature is highly experimental. Use at your own risk.
Implemented by Christian Holler (:decoder) <choller@mozilla.com>.
NOTE: This is for Python 2.7 !
Anyone who wants to add Python 3.7 support is happily welcome :)
For an example and a template see ../python_mutators/
1) Description and purpose
--------------------------
While AFLFuzz comes with a good selection of generic deterministic and
non-deterministic mutation operations, it sometimes might make sense to extend
these to implement strategies more specific to the target you are fuzzing.
For simplicity and in order to allow people without C knowledge to extend
AFLFuzz, I implemented a "Python" stage that can make use of an external
module (written in Python) that implements a custom mutation stage.
The main motivation behind this is to lower the barrier for people
experimenting with this tool. Hopefully, someone will be able to do useful
things with this extension.
If you find it useful, have questions or need additional features added to the
interface, feel free to send a mail to <choller@mozilla.com>.
See the following information to get a better picture:
https://www.agarri.fr/docs/XML_Fuzzing-NullCon2017-PUBLIC.pdf
https://bugs.chromium.org/p/chromium/issues/detail?id=930663
2) What the Python module looks like
-----------------------------------
You can find a simple example in pymodules/example.py, including documentation
explaining each function. The same directory contains another module that
performs basic mutations.
Right now, "init" is called at program startup and can be used to perform any
kinds of one-time initializations while "fuzz" is called each time a mutation
is requested.
There is also optional support for a trimming API, see the section below for
further information about this feature.
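As an illustration, a minimal module might look like the sketch below. The signatures follow the shipped example module; treat the exact names and the single-byte strategy as assumptions of this sketch, the authoritative template is in ../python_mutators/:

```python
import random

def init(seed):
    # Called once at afl-fuzz startup; seed the RNG so the module's
    # behavior is reproducible across runs.
    random.seed(seed)

def fuzz(buf, add_buf):
    # Called for each requested mutation. 'buf' holds the current
    # test case, 'add_buf' a second queue entry (e.g. for splicing).
    out = bytearray(buf)
    if out:
        pos = random.randrange(len(out))
        out[pos] = random.randrange(256)  # mutate a single byte
    return out
```

afl-fuzz uses the returned buffer as the mutated test case; returning the input unchanged is allowed but wastes executions.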
3) How to compile AFLFuzz with Python support
---------------------------------------------
You must install the python 2.7 development package of your Linux distribution
before this will work. On Debian/Ubuntu/Kali this can be done with:
apt install python2.7-dev
A prerequisite for using this mode is to compile AFLFuzz with Python support.
The afl Makefile performs some magic: if Python 2.7 is found in the default
location (/usr/include/python2.7 for the Python.h header and
/usr/lib/x86_64-linux-gnu for the libpython2.7.a library), it compiles
afl-fuzz with the feature enabled.
In case your setup is different set the necessary variables like this:
PYTHON_INCLUDE=/path/to/python2.7/include LDFLAGS=-L/path/to/python2.7/lib make
4) How to run AFLFuzz with your custom module
---------------------------------------------
You must pass the module name inside the env variable AFL_PYTHON_MODULE.
In addition, if you are trying to load the module from the local directory,
you must adjust your PYTHONPATH to reflect this circumstance. The following
command should work if you are inside the aflfuzz directory:
$ AFL_PYTHON_MODULE="pymodules.test" PYTHONPATH=. ./afl-fuzz
Optionally, the following environment variables are supported:
AFL_PYTHON_ONLY - Disable all other mutation stages. This can prevent broken
testcases (those that your Python module can't work with
anymore) from filling up your queue. Best combined with a custom
trimming routine (see below), because trimming can cause the
same test breakage as havoc and splice.
AFL_DEBUG - When combined with AFL_NO_UI, this causes the C trimming code
to emit additional messages about the performance and actions
of your custom Python trimmer. Use this to see if it works :)
5) Order and statistics
-----------------------
The Python stage is set to be the first non-deterministic stage (right before
the havoc stage). In the statistics however, it shows up as the third number
under "havoc". That's because I'm lazy and I didn't want to mess with the UI
too much ;)
6) Trimming support
-------------------
The generic trimming routines implemented in AFLFuzz can easily destroy the
structure of complex formats, possibly leading to a point where you have a lot
of testcases in the queue that your Python module cannot process anymore but
your target application still accepts. This is especially the case when your
target can process a part of the input (causing coverage) and then errors out
on the remaining input.
In such cases, it makes sense to implement a custom trimming routine in Python.
The API consists of multiple methods because after each trimming step, we have
to go back into the C code to check if the coverage bitmap is still the same
for the trimmed input. Here's a quick API description:
init_trim: This method is called at the start of each trimming operation
and receives the initial buffer. It should return the number
of iteration steps possible on this input (e.g. if your input
has n elements and you want to remove them one by one, return n;
if you do a binary search, return log(n), and so on).
If your trimming algorithm doesn't allow you to determine the
number of (remaining) steps easily (esp. while running), then you
can alternatively return 1 here and always return 0 in post_trim
until you are finished and no steps remain. In that case,
returning 1 in post_trim will end the trimming routine. The whole
current index/max iterations stuff is only used to show progress.
trim: This method is called for each trimming operation. It doesn't
have any arguments because we already have the initial buffer
from init_trim and we can memorize the current state in global
variables. This can also save reparsing steps for each iteration.
It should return the trimmed input buffer, where the returned data
must not exceed the initial input data in length. Returning anything
that is larger than the original data (passed to init_trim) will
result in a fatal abort of AFLFuzz.
post_trim: This method is called after each trim operation to inform you
if your trimming step was successful or not (in terms of coverage).
If you receive a failure here, you should reset your input to the
last known good state.
In any case, this method must return the next trim iteration index
(from 0 to the maximum number of steps you returned in init_trim).
Omitting any of the methods will cause Python trimming to be disabled and
trigger a fallback to the builtin default trimming routine.
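To make the three methods concrete, here is a hypothetical byte-wise trimmer. The global-state layout and the step accounting are assumptions of this sketch, not the official template (see pymodules/example.py for that):

```python
trim_buf = None    # last known good buffer
trim_index = 0     # byte currently proposed for removal
steps_done = 0     # iterations performed so far

def init_trim(buf):
    global trim_buf, trim_index, steps_done
    trim_buf = bytearray(buf)
    trim_index = 0
    steps_done = 0
    return len(trim_buf)   # one step per byte we may try to drop

def trim():
    # Propose the current buffer with one byte removed; the result
    # is never longer than the initial input.
    return trim_buf[:trim_index] + trim_buf[trim_index + 1:]

def post_trim(success):
    global trim_buf, trim_index, steps_done
    steps_done += 1
    if success:
        # Coverage unchanged: keep the shorter buffer. The next
        # candidate byte has already shifted into trim_index.
        trim_buf = trim_buf[:trim_index] + trim_buf[trim_index + 1:]
    else:
        # Coverage changed: trim_buf is still the last good state,
        # just move on to the next byte.
        trim_index += 1
    return steps_done      # next iteration index, used for progress
</imports>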
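To make the three methods concrete, here is a hypothetical byte-wise trimmer. The global-state layout and the step accounting are assumptions of this sketch, not the official template (see pymodules/example.py for that):

```python
trim_buf = None    # last known good buffer
trim_index = 0     # byte currently proposed for removal
steps_done = 0     # iterations performed so far

def init_trim(buf):
    global trim_buf, trim_index, steps_done
    trim_buf = bytearray(buf)
    trim_index = 0
    steps_done = 0
    return len(trim_buf)   # one step per byte we may try to drop

def trim():
    # Propose the current buffer with one byte removed; the result
    # is never longer than the initial input.
    return trim_buf[:trim_index] + trim_buf[trim_index + 1:]

def post_trim(success):
    global trim_buf, trim_index, steps_done
    steps_done += 1
    if success:
        # Coverage unchanged: keep the shorter buffer. The next
        # candidate byte has already shifted into trim_index.
        trim_buf = trim_buf[:trim_index] + trim_buf[trim_index + 1:]
    else:
        # Coverage changed: trim_buf is still the last good state,
        # just move on to the next byte.
        trim_index += 1
    return steps_done      # next iteration index, used for progress
```

Returning steps_done guarantees the routine terminates after the number of steps promised by init_trim, even if every removal succeeds.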

View File

@ -6,6 +6,10 @@ Sister projects
designed for, or meant to integrate with AFL. See README for the general
instruction manual.
!!!
!!! This list is outdated and needs an update, missing: e.g. Angora, FairFuzz
!!!
-------------------------------------------
Support for other languages / environments:
-------------------------------------------
@ -263,7 +267,7 @@ Static binary-only instrumentation (Aleksandar Nikolich)
reports better performance compared to QEMU, but occasional translation
errors with stripped binaries.
https://github.com/vrtadmin/moflow/tree/master/afl-dyninst
https://github.com/vanhauser-thc/afl-dyninst
AFL PIN (Parker Thompson)
-------------------------

View File

@ -33,6 +33,16 @@ other side effects - sorry about that.
With that out of the way, let's talk about what's actually on the screen...
0) The status bar
The top line shows you which mode afl-fuzz is running in
(normal: "american fuzzy lop", crash exploration mode: "peruvian were-rabbit")
and the version of afl++.
Next to the version is the banner, which, if not set with -T by hand, will
either show the binary name being fuzzed, or the -M/-S master/slave name for
parallel fuzzing.
Finally, the last item is the power schedule mode being run (default: explore).
1) Process timing
-----------------

107
docs/unicorn_mode.txt Normal file
View File

@ -0,0 +1,107 @@
=========================================================
Unicorn-based binary-only instrumentation for afl-fuzz
=========================================================
1) Introduction
---------------
The code in ./unicorn_mode allows you to build a standalone feature that
leverages the Unicorn Engine and allows callers to obtain instrumentation
output for black-box, closed-source binary code snippets. This mechanism
can then be used by afl-fuzz to stress-test targets that couldn't be built
with afl-gcc or used in QEMU mode, or with other extensions such as
TriforceAFL.
There is a significant performance penalty compared to native AFL,
but at least we're able to use AFL on these binaries, right?
The idea and much of the implementation comes from Nathan Voss <njvoss299@gmail.com>.
2) How to use
-------------
*** Building AFL's Unicorn Mode ***
First, make afl as usual.
Once that completes successfully you need to build and add in the Unicorn Mode
features:
$ cd unicorn_mode
$ ./build_unicorn_support.sh
NOTE: This script downloads a recent Unicorn Engine commit that has been tested
and is stable-ish from the Unicorn github page. If you are offline, you'll need
to hack up this script a little bit and supply your own copy of Unicorn's latest
stable release. It's not very hard, just check out the beginning of the
build_unicorn_support.sh script and adjust as necessary.
Building Unicorn will take a little while (~5-10 minutes). Once it completes,
it automatically compiles a sample application and verifies that it works.
*** Fuzzing with Unicorn Mode ***
To really use unicorn-mode effectively you need to prepare the following:
* Relevant binary code to be fuzzed
* Knowledge of the memory map and good starting state
* Folder containing sample inputs to start fuzzing with
- Same ideas as any other AFL inputs
- Quality/speed of results will depend greatly on quality of starting
samples
- See AFL's guidance on how to create a sample corpus
* Unicorn-based test harness which:
- Adds memory map regions
- Loads binary code into memory
- Emulates at least one instruction*
- Yeah, this is lame. See 'Gotchas' section below for more info
- Loads and verifies data to fuzz from a command-line specified file
- AFL will provide mutated inputs by changing the file passed to
the test harness
- Presumably the data to be fuzzed is at a fixed buffer address
- If input constraints (size, invalid bytes, etc.) are known they
should be checked after the file is loaded. If a constraint
fails, just exit the test harness. AFL will treat the input as
'uninteresting' and move on.
- Sets up registers and memory state for beginning of test
- Emulates the code of interest from beginning to end
- If a crash is detected, the test harness must 'crash' by
throwing a signal (SIGSEGV, SIGKILL, SIGABRT, etc.)
Once you have all those things ready to go you just need to run afl-fuzz in
'unicorn-mode' by passing in the '-U' flag:
$ afl-fuzz -U -m none -i /path/to/inputs -o /path/to/results -- ./test_harness @@
The normal afl-fuzz command line format applies to everything here. Refer to
AFL's main documentation for more info about how to use afl-fuzz effectively.
For a much clearer picture of what all of this looks like, please refer to the
sample provided in the 'unicorn_mode/samples' directory. There is also a blog
post that goes over the basics at:
https://medium.com/@njvoss299/afl-unicorn-fuzzing-arbitrary-binary-code-563ca28936bf
The 'helper_scripts' directory also contains several helper scripts that allow you
to dump context from a running process, load it, and hook heap allocations. For details
on how to use them, check out the follow-up blog post to the one linked above.
An example use of AFL-Unicorn mode is discussed in the paper Unicorefuzz:
https://www.usenix.org/conference/woot19/presentation/maier
3) Gotchas, feedback, bugs
--------------------------
To make sure that AFL's fork server starts up correctly the Unicorn test
harness script must emulate at least one instruction before loading the
data that will be fuzzed from the input file. It doesn't matter what the
instruction is, nor if it is valid. This is an artifact of how the fork-server
is started and could likely be fixed with some clever re-arranging of the
patches applied to Unicorn.
Running the build script builds Unicorn and its python bindings and installs
them on your system. This installation will supersede any existing Unicorn
installation with the patched afl-unicorn version.
Refer to unicorn_mode/samples/arm_example/arm_tester.c for an example of how
to do this (the one-instruction warm-up) properly! If you don't get it right,
AFL will not load any mutated inputs and your fuzzing will be useless!

350
llvm_mode/LLVMInsTrim.so.cc Normal file
View File

@ -0,0 +1,350 @@
#include <stdio.h>
#include <stdlib.h>
#include <stdarg.h>
#include <unistd.h>
#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/DenseSet.h"
#include "llvm/IR/CFG.h"
#include "llvm/IR/Dominators.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Instructions.h"
#include "llvm/IR/LegacyPassManager.h"
#include "llvm/IR/Module.h"
#include "llvm/Pass.h"
#include "llvm/Support/raw_ostream.h"
#include "llvm/Transforms/IPO/PassManagerBuilder.h"
#include "llvm/Transforms/Utils/BasicBlockUtils.h"
#include "llvm/IR/DebugInfo.h"
#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/CFG.h"
#include <unordered_set>
#include <random>
#include <list>
#include <string>
#include <fstream>
#include "../config.h"
#include "../debug.h"
#include "MarkNodes.h"
using namespace llvm;
static cl::opt<bool> MarkSetOpt("markset", cl::desc("MarkSet"),
cl::init(false));
static cl::opt<bool> LoopHeadOpt("loophead", cl::desc("LoopHead"),
cl::init(false));
namespace {
struct InsTrim : public ModulePass {
protected:
std::list<std::string> myWhitelist;
private:
std::mt19937 generator;
int total_instr = 0;
unsigned genLabel() {
return generator() % 65536;
}
public:
static char ID;
InsTrim() : ModulePass(ID), generator(0) {//}
// AFLCoverage() : ModulePass(ID) {
char* instWhiteListFilename = getenv("AFL_LLVM_WHITELIST");
if (instWhiteListFilename) {
std::string line;
std::ifstream fileStream;
fileStream.open(instWhiteListFilename);
if (!fileStream)
report_fatal_error("Unable to open AFL_LLVM_WHITELIST");
getline(fileStream, line);
while (fileStream) {
myWhitelist.push_back(line);
getline(fileStream, line);
}
}
}
void getAnalysisUsage(AnalysisUsage &AU) const override {
AU.addRequired<DominatorTreeWrapperPass>();
}
#if LLVM_VERSION_MAJOR < 4
const char *
#else
StringRef
#endif
getPassName() const override {
return "InstTrim Instrumentation";
}
bool runOnModule(Module &M) override {
char be_quiet = 0;
if (isatty(2) && !getenv("AFL_QUIET")) {
SAYF(cCYA "LLVMInsTrim" VERSION cRST " by csienslab\n");
} else be_quiet = 1;
#if LLVM_VERSION_MAJOR < 9
char* neverZero_counters_str;
if ((neverZero_counters_str = getenv("AFL_LLVM_NOT_ZERO")) != NULL)
OKF("LLVM neverZero activated (by hexcoder)\n");
#endif
if (getenv("AFL_LLVM_INSTRIM_LOOPHEAD") != NULL || getenv("LOOPHEAD") != NULL) {
LoopHeadOpt = true;
}
// this is our default
MarkSetOpt = true;
/* // I dont think this makes sense to port into LLVMInsTrim
char* inst_ratio_str = getenv("AFL_INST_RATIO");
unsigned int inst_ratio = 100;
if (inst_ratio_str) {
if (sscanf(inst_ratio_str, "%u", &inst_ratio) != 1 || !inst_ratio || inst_ratio > 100)
FATAL("Bad value of AFL_INST_RATIO (must be between 1 and 100)");
}
*/
LLVMContext &C = M.getContext();
IntegerType *Int8Ty = IntegerType::getInt8Ty(C);
IntegerType *Int32Ty = IntegerType::getInt32Ty(C);
GlobalVariable *CovMapPtr = new GlobalVariable(
M, PointerType::getUnqual(Int8Ty), false, GlobalValue::ExternalLinkage,
nullptr, "__afl_area_ptr");
GlobalVariable *OldPrev = new GlobalVariable(
M, Int32Ty, false, GlobalValue::ExternalLinkage, 0, "__afl_prev_loc",
0, GlobalVariable::GeneralDynamicTLSModel, 0, false);
u64 total_rs = 0;
u64 total_hs = 0;
for (Function &F : M) {
if (!F.size()) {
continue;
}
if (!myWhitelist.empty()) {
bool instrumentBlock = false;
DebugLoc Loc;
StringRef instFilename;
for (auto &BB : F) {
BasicBlock::iterator IP = BB.getFirstInsertionPt();
IRBuilder<> IRB(&(*IP));
if (!Loc)
Loc = IP->getDebugLoc();
}
if ( Loc ) {
DILocation *cDILoc = dyn_cast<DILocation>(Loc.getAsMDNode());
unsigned int instLine = cDILoc->getLine();
instFilename = cDILoc->getFilename();
if (instFilename.str().empty()) {
/* If the original location is empty, try using the inlined location */
DILocation *oDILoc = cDILoc->getInlinedAt();
if (oDILoc) {
instFilename = oDILoc->getFilename();
instLine = oDILoc->getLine();
}
}
/* Continue only if we know where we actually are */
if (!instFilename.str().empty()) {
for (std::list<std::string>::iterator it = myWhitelist.begin(); it != myWhitelist.end(); ++it) {
if (instFilename.str().length() >= it->length()) {
if (instFilename.str().compare(instFilename.str().length() - it->length(), it->length(), *it) == 0) {
instrumentBlock = true;
break;
}
}
}
}
}
/* Either we couldn't figure out our location or the location is
* not whitelisted, so we skip instrumentation. */
if (!instrumentBlock) {
if (!instFilename.str().empty())
SAYF(cYEL "[!] " cBRI "Not in whitelist, skipping %s ...\n", instFilename.str().c_str());
else
SAYF(cYEL "[!] " cBRI "No filename information found, skipping it");
continue;
}
}
std::unordered_set<BasicBlock *> MS;
if (!MarkSetOpt) {
for (auto &BB : F) {
MS.insert(&BB);
}
total_rs += F.size();
} else {
auto Result = markNodes(&F);
auto RS = Result.first;
auto HS = Result.second;
MS.insert(RS.begin(), RS.end());
if (!LoopHeadOpt) {
MS.insert(HS.begin(), HS.end());
total_rs += MS.size();
} else {
DenseSet<std::pair<BasicBlock *, BasicBlock *>> EdgeSet;
DominatorTreeWrapperPass *DTWP = &getAnalysis<DominatorTreeWrapperPass>(F);
auto DT = &DTWP->getDomTree();
total_rs += RS.size();
total_hs += HS.size();
for (BasicBlock *BB : HS) {
bool Inserted = false;
for (auto BI = pred_begin(BB), BE = pred_end(BB);
BI != BE; ++BI
) {
auto Edge = BasicBlockEdge(*BI, BB);
if (Edge.isSingleEdge() && DT->dominates(Edge, BB)) {
EdgeSet.insert({*BI, BB});
Inserted = true;
break;
}
}
if (!Inserted) {
MS.insert(BB);
total_rs += 1;
total_hs -= 1;
}
}
for (auto I = EdgeSet.begin(), E = EdgeSet.end(); I != E; ++I) {
auto PredBB = I->first;
auto SuccBB = I->second;
auto NewBB = SplitBlockPredecessors(SuccBB, {PredBB}, ".split",
DT, nullptr,
#if LLVM_VERSION_MAJOR >= 8
nullptr,
#endif
false);
MS.insert(NewBB);
}
}
auto *EBB = &F.getEntryBlock();
if (succ_begin(EBB) == succ_end(EBB)) {
MS.insert(EBB);
total_rs += 1;
}
for (BasicBlock &BB : F) {
if (MS.find(&BB) == MS.end()) {
continue;
}
IRBuilder<> IRB(&*BB.getFirstInsertionPt());
IRB.CreateStore(ConstantInt::get(Int32Ty, genLabel()), OldPrev);
}
}
for (BasicBlock &BB : F) {
auto PI = pred_begin(&BB);
auto PE = pred_end(&BB);
if (MarkSetOpt && MS.find(&BB) == MS.end()) {
continue;
}
IRBuilder<> IRB(&*BB.getFirstInsertionPt());
Value *L = NULL;
if (PI == PE) {
L = ConstantInt::get(Int32Ty, genLabel());
} else {
auto *PN = PHINode::Create(Int32Ty, 0, "", &*BB.begin());
DenseMap<BasicBlock *, unsigned> PredMap;
for (auto PI = pred_begin(&BB), PE = pred_end(&BB);
PI != PE; ++PI
) {
BasicBlock *PBB = *PI;
auto It = PredMap.insert({PBB, genLabel()});
unsigned Label = It.first->second;
PN->addIncoming(ConstantInt::get(Int32Ty, Label), PBB);
}
L = PN;
}
/* Load prev_loc */
LoadInst *PrevLoc = IRB.CreateLoad(OldPrev);
PrevLoc->setMetadata(M.getMDKindID("nosanitize"), MDNode::get(C, None));
Value *PrevLocCasted = IRB.CreateZExt(PrevLoc, IRB.getInt32Ty());
/* Load SHM pointer */
LoadInst *MapPtr = IRB.CreateLoad(CovMapPtr);
MapPtr->setMetadata(M.getMDKindID("nosanitize"), MDNode::get(C, None));
Value *MapPtrIdx = IRB.CreateGEP(MapPtr, IRB.CreateXor(PrevLocCasted, L));
/* Update bitmap */
LoadInst *Counter = IRB.CreateLoad(MapPtrIdx);
Counter->setMetadata(M.getMDKindID("nosanitize"), MDNode::get(C, None));
Value *Incr = IRB.CreateAdd(Counter, ConstantInt::get(Int8Ty, 1));
#if LLVM_VERSION_MAJOR < 9
if (neverZero_counters_str != NULL) { // with llvm 9 we make this the default as the bug in llvm is then fixed
#else
#warning "neverZero implementation needs to be reviewed!"
#endif
/* hexcoder: Realize a counter that skips zero during overflow.
* Once this counter reaches its maximum value, it next increments to 1
*
* Instead of
* Counter + 1 -> Counter
* we inject now this
* Counter + 1 -> {Counter, OverflowFlag}
* Counter + OverflowFlag -> Counter
*/
auto cf = IRB.CreateICmpEQ(Incr, ConstantInt::get(Int8Ty, 0));
auto carry = IRB.CreateZExt(cf, Int8Ty);
Incr = IRB.CreateAdd(Incr, carry);
#if LLVM_VERSION_MAJOR < 9
}
#endif
IRB.CreateStore(Incr, MapPtrIdx)->setMetadata(M.getMDKindID("nosanitize"), MDNode::get(C, None));
/* Set prev_loc to cur_loc >> 1 */
/*
StoreInst *Store = IRB.CreateStore(ConstantInt::get(Int32Ty, cur_loc >> 1), AFLPrevLoc);
Store->setMetadata(M.getMDKindID("nosanitize"), MDNode::get(C, None));
*/
total_instr++;
}
}
OKF("Instrumented %u locations (%llu, %llu) (%s mode)\n"/*", ratio %u%%)."*/,
total_instr, total_rs, total_hs,
getenv("AFL_HARDEN") ? "hardened" :
((getenv("AFL_USE_ASAN") || getenv("AFL_USE_MSAN")) ?
"ASAN/MSAN" : "non-hardened")/*, inst_ratio*/);
return false;
}
}; // end of struct InsTrim
} // end of anonymous namespace
char InsTrim::ID = 0;
static void registerAFLPass(const PassManagerBuilder &,
legacy::PassManagerBase &PM) {
PM.add(new InsTrim());
}
static RegisterStandardPasses RegisterAFLPass(
PassManagerBuilder::EP_OptimizerLast, registerAFLPass);
static RegisterStandardPasses RegisterAFLPass0(
PassManagerBuilder::EP_EnabledOnOptLevel0, registerAFLPass);
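The neverZero counter update implemented in the pass above is easier to see in isolation: an 8-bit hit counter that would wrap from 255 to 0 instead wraps to 1, so a location that has been hit can never read as "not covered". A purely illustrative sketch of the arithmetic:

```python
def never_zero_increment(counter):
    # 8-bit add with the overflow carry folded back in:
    #   Counter + 1            -> {Counter, OverflowFlag}
    #   Counter + OverflowFlag -> Counter
    incr = (counter + 1) & 0xFF
    carry = 1 if incr == 0 else 0
    return (incr + carry) & 0xFF
```

This mirrors the CreateICmpEQ/CreateZExt/CreateAdd sequence the pass injects instead of a plain increment.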

View File

@ -16,6 +16,9 @@
# http://www.apache.org/licenses/LICENSE-2.0
#
# For Heiko:
#TEST_MMAP=1
PREFIX ?= /usr/local
HELPER_PATH = $(PREFIX)/lib/afl
BIN_PATH = $(PREFIX)/bin
@ -23,17 +26,23 @@ BIN_PATH = $(PREFIX)/bin
VERSION = $(shell grep '^\#define VERSION ' ../config.h | cut -d '"' -f2)
LLVM_CONFIG ?= llvm-config
#LLVM_OK = $(shell $(LLVM_CONFIG) --version | egrep -q '^[5-6]' && echo 0 || echo 1 )
LLVMVER = $(shell $(LLVM_CONFIG) --version)
LLVM_UNSUPPORTED = $(shell $(LLVM_CONFIG) --version | egrep -q '^9|3.0' && echo 1 || echo 0 )
LLVM_MAJOR = $(shell $(LLVM_CONFIG) --version | sed 's/\..*//')
ifeq "$(LLVM_UNSUPPORTED)" "1"
$(warning llvm_mode only supports versions 3.8.0 up to 8.x)
endif
# this is not visible yet:
ifeq "$(LLVM_MAJOR)" "9"
$(info llvm_mode detected llvm 9, enabling neverZero implementation)
endif
CFLAGS ?= -O3 -funroll-loops
CFLAGS += -Wall -D_FORTIFY_SOURCE=2 -g -Wno-pointer-sign \
-DAFL_PATH=\"$(HELPER_PATH)\" -DBIN_PATH=\"$(BIN_PATH)\" \
-DVERSION=\"$(VERSION)\"
ifdef AFL_TRACE_PC
CFLAGS += -DUSE_TRACE_PC=1
endif
@ -45,12 +54,16 @@ CXXFLAGS += -Wall -D_FORTIFY_SOURCE=2 -g -Wno-pointer-sign \
CLANG_CFL = `$(LLVM_CONFIG) --cxxflags` -Wl,-znodelete -fno-rtti -fpic $(CXXFLAGS)
CLANG_LFL = `$(LLVM_CONFIG) --ldflags` $(LDFLAGS)
# User teor2345 reports that this is required to make things work on MacOS X.
ifeq "$(shell uname)" "Darwin"
CLANG_LFL += -Wl,-flat_namespace -Wl,-undefined,suppress
endif
ifeq "$(shell uname)" "OpenBSD"
CLANG_LFL += `$(LLVM_CONFIG) --libdir`/libLLVM.so.0.0
endif
# We were using llvm-config --bindir to get the location of clang, but
# this seems to be busted on some distros, so using the one in $PATH is
# probably better.
@ -60,13 +73,53 @@ ifeq "$(origin CC)" "default"
CXX = clang++
endif
# sanity check.
# Are versions of clang --version and llvm-config --version equal?
CLANGVER = $(shell $(CC) --version | sed -E -ne '/^.*([0-9]\.[0-9]\.[0-9]).*/s//\1/p')
ifeq "$(shell echo '\#include <sys/ipc.h>@\#include <sys/shm.h>@int main() { int _id = shmget(IPC_PRIVATE, 65536, IPC_CREAT | IPC_EXCL | 0600); shmctl(_id, IPC_RMID, 0); return 0;}' | tr @ '\n' | $(CC) -x c - -o .test2 2>/dev/null && echo 1 || echo 0 )" "1"
SHMAT_OK=1
else
SHMAT_OK=0
CFLAGS+=-DUSEMMAP=1
LDFLAGS += -lrt
endif
ifeq "$(TEST_MMAP)" "1"
SHMAT_OK=0
CFLAGS+=-DUSEMMAP=1
LDFLAGS += -lrt
endif
ifndef AFL_TRACE_PC
PROGS = ../afl-clang-fast ../afl-llvm-pass.so ../libLLVMInsTrim.so ../afl-llvm-rt.o ../afl-llvm-rt-32.o ../afl-llvm-rt-64.o ../compare-transform-pass.so ../split-compares-pass.so ../split-switches-pass.so
else
PROGS = ../afl-clang-fast ../afl-llvm-rt.o ../afl-llvm-rt-32.o ../afl-llvm-rt-64.o ../compare-transform-pass.so ../split-compares-pass.so ../split-switches-pass.so
endif
ifneq "$(CLANGVER)" "$(LLVMVER)"
CC = $(shell llvm-config --bindir)/clang
CXX = $(shell llvm-config --bindir)/clang++
endif
all: test_shm test_deps $(PROGS) test_build all_done
ifeq "$(SHMAT_OK)" "1"
test_shm:
@echo "[+] shmat seems to be working."
@rm -f .test2
else
test_shm:
@echo "[-] shmat seems not to be working, switching to mmap implementation"
endif
test_deps:
ifndef AFL_TRACE_PC
@ -77,6 +130,13 @@ else
endif
@echo "[*] Checking for working '$(CC)'..."
@which $(CC) >/dev/null 2>&1 || ( echo "[-] Oops, can't find '$(CC)'. Make sure that it's in your \$$PATH (or set \$$CC and \$$CXX)."; exit 1 )
@echo "[*] Checking for matching versions of '$(CC)' and '$(LLVM_CONFIG)'"
ifneq "$(CLANGVER)" "$(LLVMVER)"
@echo "[!] WARNING: we have llvm-config version $(LLVMVER) and a clang version $(CLANGVER)"
@echo "[!] Retrying with the clang compiler from llvm: CC=`llvm-config --bindir`/clang"
else
@echo "[*] We have llvm-config version $(LLVMVER) with a clang version $(CLANGVER), good."
endif
@echo "[*] Checking for '../afl-showmap'..."
@test -f ../afl-showmap || ( echo "[-] Oops, can't find '../afl-showmap'. Be sure to compile AFL first."; exit 1 )
@echo "[+] All set and ready to build."
@ -85,8 +145,11 @@ endif
$(CC) $(CFLAGS) $< -o $@ $(LDFLAGS)
ln -sf afl-clang-fast ../afl-clang-fast++
../libLLVMInsTrim.so: LLVMInsTrim.so.cc MarkNodes.cc | test_deps
$(CXX) $(CLANG_CFL) -DLLVMInsTrim_EXPORTS -fno-rtti -fPIC -std=gnu++11 -shared $< MarkNodes.cc -o $@ $(CLANG_LFL)
../afl-llvm-pass.so: afl-llvm-pass.so.cc | test_deps
$(CXX) $(CLANG_CFL) -DLLVMInsTrim_EXPORTS -fno-rtti -fPIC -std=gnu++11 -shared $< -o $@ $(CLANG_LFL)
# laf
../split-switches-pass.so: split-switches-pass.so.cc | test_deps
@ -110,7 +173,7 @@ endif
test_build: $(PROGS)
@echo "[*] Testing the CC wrapper and instrumentation output..."
unset AFL_USE_ASAN AFL_USE_MSAN AFL_INST_RATIO; AFL_QUIET=1 AFL_PATH=. AFL_CC=$(CC) AFL_LLVM_LAF_SPLIT_SWITCHES=1 AFL_LLVM_LAF_TRANSFORM_COMPARES=1 AFL_LLVM_LAF_SPLIT_COMPARES=1 ../afl-clang-fast $(CFLAGS) ../test-instr.c -o test-instr $(LDFLAGS)
echo 0 | ../afl-showmap -m none -q -o .test-instr0 ./test-instr
echo 1 | ../afl-showmap -m none -q -o .test-instr1 ./test-instr
@rm -f test-instr
@ -123,5 +186,5 @@ all_done: test_build
.NOTPARALLEL: clean
clean:
rm -f *.o *.so *~ a.out core core.[1-9][0-9]* .test2 test-instr .test-instr0 .test-instr1
rm -f $(PROGS) ../afl-clang-fast++

355
llvm_mode/MarkNodes.cc Normal file
View File

@ -0,0 +1,355 @@
#include <algorithm>
#include <map>
#include <queue>
#include <set>
#include <vector>
#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/DenseSet.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/CFG.h"
#include "llvm/IR/Constants.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Instructions.h"
#include "llvm/IR/Module.h"
#include "llvm/Pass.h"
#include "llvm/Support/Debug.h"
#include "llvm/Support/raw_ostream.h"
using namespace llvm;
DenseMap<BasicBlock *, uint32_t> LMap;
std::vector<BasicBlock *> Blocks;
std::set<uint32_t> Marked , Markabove;
std::vector< std::vector<uint32_t> > Succs , Preds;
void reset(){
LMap.clear();
Blocks.clear();
Marked.clear();
Markabove.clear();
}
uint32_t start_point;
void labelEachBlock(Function *F) {
// Fake single endpoint;
LMap[NULL] = Blocks.size();
Blocks.push_back(NULL);
// Assign the unique LabelID to each block;
for (auto I = F->begin(), E = F->end(); I != E; ++I) {
BasicBlock *BB = &*I;
LMap[BB] = Blocks.size();
Blocks.push_back(BB);
}
start_point = LMap[&F->getEntryBlock()];
}
void buildCFG(Function *F) {
Succs.resize( Blocks.size() );
Preds.resize( Blocks.size() );
for( size_t i = 0 ; i < Succs.size() ; i ++ ){
Succs[ i ].clear();
Preds[ i ].clear();
}
//uint32_t FakeID = 0;
for (auto S = F->begin(), E = F->end(); S != E; ++S) {
BasicBlock *BB = &*S;
uint32_t MyID = LMap[BB];
//if (succ_begin(BB) == succ_end(BB)) {
//Succs[MyID].push_back(FakeID);
//Marked.insert(MyID);
//}
for (auto I = succ_begin(BB), E = succ_end(BB); I != E; ++I) {
Succs[MyID].push_back(LMap[*I]);
}
}
}
std::vector< std::vector<uint32_t> > tSuccs;
std::vector<bool> tag , indfs;
void DFStree(size_t now_id) {
if(tag[now_id]) return;
tag[now_id]=true;
indfs[now_id]=true;
for (auto succ: tSuccs[now_id]) {
if(tag[succ] and indfs[succ]) {
Marked.insert(succ);
Markabove.insert(succ);
continue;
}
Succs[now_id].push_back(succ);
Preds[succ].push_back(now_id);
DFStree(succ);
}
indfs[now_id]=false;
}
void turnCFGintoDAG(Function *F) {
tSuccs = Succs;
tag.resize(Blocks.size());
indfs.resize(Blocks.size());
for (size_t i = 0; i < Blocks.size(); ++ i) {
Succs[i].clear();
tag[i]=false;
indfs[i]=false;
}
DFStree(start_point);
for (size_t i = 0; i < Blocks.size(); ++ i)
if( Succs[i].empty() ){
Succs[i].push_back(0);
Preds[0].push_back(i);
}
}
uint32_t timeStamp;
namespace DominatorTree{
std::vector< std::vector<uint32_t> > cov;
std::vector<uint32_t> dfn, nfd, par, sdom, idom, mom, mn;
bool Compare(uint32_t u, uint32_t v) {
return dfn[u] < dfn[v];
}
uint32_t eval(uint32_t u) {
if( mom[u] == u ) return u;
uint32_t res = eval( mom[u] );
if(Compare(sdom[mn[mom[u]]] , sdom[mn[u]])) {
mn[u] = mn[mom[u]];
}
return mom[u] = res;
}
void DFS(uint32_t now) {
timeStamp += 1;
dfn[now] = timeStamp;
nfd[timeStamp - 1] = now;
for( auto succ : Succs[now] ) {
if( dfn[succ] == 0 ) {
par[succ] = now;
DFS(succ);
}
}
}
void DominatorTree(Function *F) {
if( Blocks.empty() ) return;
uint32_t s = start_point;
// Initialization
mn.resize(Blocks.size());
cov.resize(Blocks.size());
dfn.resize(Blocks.size());
nfd.resize(Blocks.size());
par.resize(Blocks.size());
mom.resize(Blocks.size());
sdom.resize(Blocks.size());
idom.resize(Blocks.size());
for( uint32_t i = 0 ; i < Blocks.size() ; i ++ ) {
dfn[i] = 0;
nfd[i] = Blocks.size();
cov[i].clear();
idom[i] = mom[i] = mn[i] = sdom[i] = i;
}
timeStamp = 0;
DFS(s);
for( uint32_t i = Blocks.size() - 1 ; i >= 1u ; i -- ) {
uint32_t now = nfd[i];
if( now == Blocks.size() ) {
continue;
}
for( uint32_t pre : Preds[ now ] ) {
if( dfn[ pre ] ) {
eval(pre);
if( Compare(sdom[mn[pre]], sdom[now]) ) {
sdom[now] = sdom[mn[pre]];
}
}
}
cov[sdom[now]].push_back(now);
mom[now] = par[now];
for( uint32_t x : cov[par[now]] ) {
eval(x);
if( Compare(sdom[mn[x]], par[now]) ) {
idom[x] = mn[x];
} else {
idom[x] = par[now];
}
}
}
for( uint32_t i = 1 ; i < Blocks.size() ; i += 1 ) {
uint32_t now = nfd[i];
if( now == Blocks.size() ) {
continue;
}
if(idom[now] != sdom[now])
idom[now] = idom[idom[now]];
}
}
}; // End of DominatorTree
std::vector<uint32_t> Visited, InStack;
std::vector<uint32_t> TopoOrder, InDeg;
std::vector< std::vector<uint32_t> > t_Succ , t_Pred;
void Go(uint32_t now, uint32_t tt) {
if( now == tt ) return;
Visited[now] = InStack[now] = timeStamp;
for(uint32_t nxt : Succs[now]) {
if(Visited[nxt] == timeStamp and InStack[nxt] == timeStamp) {
Marked.insert(nxt);
}
t_Succ[now].push_back(nxt);
t_Pred[nxt].push_back(now);
InDeg[nxt] += 1;
if(Visited[nxt] == timeStamp) {
continue;
}
Go(nxt, tt);
}
InStack[now] = 0;
}
void TopologicalSort(uint32_t ss, uint32_t tt) {
timeStamp += 1;
Go(ss, tt);
TopoOrder.clear();
std::queue<uint32_t> wait;
wait.push(ss);
while( not wait.empty() ) {
uint32_t now = wait.front(); wait.pop();
TopoOrder.push_back(now);
for(uint32_t nxt : t_Succ[now]) {
InDeg[nxt] -= 1;
if(InDeg[nxt] == 0u) {
wait.push(nxt);
}
}
}
}
std::vector< std::set<uint32_t> > NextMarked;
bool Indistinguish(uint32_t node1, uint32_t node2) {
if(NextMarked[node1].size() > NextMarked[node2].size()){
uint32_t _swap = node1;
node1 = node2;
node2 = _swap;
}
for(uint32_t x : NextMarked[node1]) {
if( NextMarked[node2].find(x) != NextMarked[node2].end() ) {
return true;
}
}
return false;
}
void MakeUniq(uint32_t now) {
bool StopFlag = false;
if (Marked.find(now) == Marked.end()) {
for(uint32_t pred1 : t_Pred[now]) {
for(uint32_t pred2 : t_Pred[now]) {
if(pred1 == pred2) continue;
if(Indistinguish(pred1, pred2)) {
Marked.insert(now);
StopFlag = true;
break;
}
}
if (StopFlag) {
break;
}
}
}
if(Marked.find(now) != Marked.end()) {
NextMarked[now].insert(now);
} else {
for(uint32_t pred : t_Pred[now]) {
for(uint32_t x : NextMarked[pred]) {
NextMarked[now].insert(x);
}
}
}
}
void MarkSubGraph(uint32_t ss, uint32_t tt) {
TopologicalSort(ss, tt);
if(TopoOrder.empty()) return;
for(uint32_t i : TopoOrder) {
NextMarked[i].clear();
}
NextMarked[TopoOrder[0]].insert(TopoOrder[0]);
for(uint32_t i = 1 ; i < TopoOrder.size() ; i += 1) {
MakeUniq(TopoOrder[i]);
}
}
void MarkVertice(Function *F) {
uint32_t s = start_point;
InDeg.resize(Blocks.size());
Visited.resize(Blocks.size());
InStack.resize(Blocks.size());
t_Succ.resize(Blocks.size());
t_Pred.resize(Blocks.size());
NextMarked.resize(Blocks.size());
for( uint32_t i = 0 ; i < Blocks.size() ; i += 1 ) {
Visited[i] = InStack[i] = InDeg[i] = 0;
t_Succ[i].clear();
t_Pred[i].clear();
}
timeStamp = 0;
uint32_t t = 0;
//MarkSubGraph(s, t);
//return;
while( s != t ) {
MarkSubGraph(DominatorTree::idom[t], t);
t = DominatorTree::idom[t];
}
}
// return {marked nodes}
std::pair<std::vector<BasicBlock *>,
std::vector<BasicBlock *> >markNodes(Function *F) {
assert(F->size() > 0 && "Function can not be empty");
reset();
labelEachBlock(F);
buildCFG(F);
turnCFGintoDAG(F);
DominatorTree::DominatorTree(F);
MarkVertice(F);
std::vector<BasicBlock *> Result , ResultAbove;
for( uint32_t x : Markabove ) {
auto it = Marked.find( x );
if( it != Marked.end() )
Marked.erase( it );
if( x )
ResultAbove.push_back(Blocks[x]);
}
for( uint32_t x : Marked ) {
if (x == 0) {
continue;
} else {
Result.push_back(Blocks[x]);
}
}
return { Result , ResultAbove };
}

llvm_mode/MarkNodes.h Normal file

@ -0,0 +1,11 @@
#ifndef __MARK_NODES__
#define __MARK_NODES__
#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/Function.h"
#include<vector>
std::pair<std::vector<llvm::BasicBlock *>,
std::vector<llvm::BasicBlock *>> markNodes(llvm::Function *F);
#endif

llvm_mode/README.instrim Normal file

@ -0,0 +1,26 @@
# InsTrim
InsTrim: Lightweight Instrumentation for Coverage-guided Fuzzing
## Introduction
InsTrim uses the CFG and markers to instrument just what is necessary in the
binary in llvm_mode. It is about 20-25% faster, but at the cost of lower
path discovery.
## Usage
Set the environment variable AFL_LLVM_INSTRIM=1
There is also an advanced mode which instruments loops so that afl-fuzz can
see which loop path has been selected, but not how often the loop has been
rerun.
This again trades path information for speed.
To enable this mode set AFL_LLVM_INSTRIM_LOOPHEAD=1
## Background
The paper: [InsTrim: Lightweight Instrumentation for Coverage-guided Fuzzing](https://www.ndss-symposium.org/wp-content/uploads/2018/07/bar2018_14_Hsu_paper.pdf)


@ -8,13 +8,13 @@ compile the target project.
The following options exist:
export LAF_SPLIT_SWITCHES=1 Enables the split-switches pass.
export AFL_LLVM_LAF_SPLIT_SWITCHES=1 Enables the split-switches pass.
export LAF_TRANSFORM_COMPARES=1 Enables the transform-compares pass
export AFL_LLVM_LAF_TRANSFORM_COMPARES=1 Enables the transform-compares pass
(strcmp, memcmp, strncmp, strcasecmp, strncasecmp).
export LAF_SPLIT_COMPARES=1 Enables the split-compares pass.
export AFL_LLVM_LAF_SPLIT_COMPARES=1 Enables the split-compares pass.
By default it will split all compares with a bit width <= 64 bits.
You can change this behaviour by setting
export LAF_SPLIT_COMPARES_BITW=<bit_width>.
export AFL_LLVM_LAF_SPLIT_COMPARES_BITW=<bit_width>.
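Conceptually, the split-compares pass turns one wide comparison into several narrower ones, so the fuzzer gets separate coverage feedback for each half it solves. A hypothetical C sketch of the idea (the real pass rewrites LLVM IR, not C source):

```c
#include <stdint.h>

/* Hypothetical illustration of compare splitting: a single 64-bit
   equality test becomes two 32-bit tests, each a separate branch
   (and thus a separate coverage edge) for the fuzzer to solve. */
static int eq64_split(uint64_t a, uint64_t b) {
    if ((uint32_t)(a >> 32) != (uint32_t)(b >> 32))
        return 0;                      /* top halves differ */
    return (uint32_t)a == (uint32_t)b; /* compare bottom halves */
}
```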


@ -3,6 +3,7 @@ Fast LLVM-based instrumentation for afl-fuzz
============================================
(See ../docs/README for the general instruction manual.)
(See ../gcc_plugin/README.gcc for the GCC-based instrumentation.)
1) Introduction
---------------
@ -30,7 +31,7 @@ several interesting properties:
- The instrumentation can cope a bit better with multi-threaded targets.
- Because the feature relies on the internals of LLVM, it is clang-specific
and will *not* work with GCC.
and will *not* work with GCC (see ../gcc_plugin/ for an alternative).
Once this implementation is shown to be sufficiently robust and portable, it
will probably replace afl-clang. For now, it can be built separately and
@ -38,8 +39,8 @@ co-exists with the original code.
The idea and much of the implementation comes from Laszlo Szekeres.
2) How to use
-------------
2) How to use this
------------------
In order to leverage this mechanism, you need to have clang installed on your
system. You should also make sure that the llvm-config tool is in your path
@ -69,21 +70,47 @@ operating mode of AFL, e.g.:
Be sure to also include CXX set to afl-clang-fast++ for C++ code.
The tool honors roughly the same environmental variables as afl-gcc (see
../docs/env_variables.txt). This includes AFL_INST_RATIO, AFL_USE_ASAN,
AFL_HARDEN, and AFL_DONT_OPTIMIZE.
../docs/env_variables.txt). This includes AFL_USE_ASAN,
AFL_HARDEN, and AFL_DONT_OPTIMIZE. However AFL_INST_RATIO is not honored
as it does not serve a good purpose with the more effective instrim CFG
analysis.
Note: if you want the LLVM helper to be installed on your system for all
users, you need to build it before issuing 'make install' in the parent
directory.
3) Gotchas, feedback, bugs
3) Options
Several options are present to make llvm_mode faster or help it rearrange
the code to make afl-fuzz path discovery easier.
If you only need to instrument specific parts of the code, you can whitelist
which C/C++ files to actually instrument. See README.whitelist
For splitting memcmp, strncmp, etc. please see README.laf-intel
Then there is an optimized instrumentation strategy that uses CFGs and
markers to instrument just what is needed. This increases speed by 20-25%,
however it has a lower path discovery.
If you want to use this, set AFL_LLVM_INSTRIM=1
See README.instrim
Finally, if your llvm version is 8 or lower, you can activate a mode that
prevents a counter overflow from resulting in a 0 value. This is good for
path discovery, but the llvm implementation of this functionality for Intel
targets is not optimal and was only fixed in llvm 9.
You can set this with AFL_LLVM_NOT_ZERO=1
See README.neverzero
4) Gotchas, feedback, bugs
--------------------------
This is an early-stage mechanism, so field reports are welcome. You can send bug
reports to <afl-users@googlegroups.com>.
4) Bonus feature #1: deferred instrumentation
---------------------------------------------
5) Bonus feature #1: deferred initialization
--------------------------------------------
AFL tries to optimize performance by executing the targeted binary just once,
stopping it just before main(), and then cloning this "master" process to get
@ -129,7 +156,7 @@ will keep working normally when compiled with a tool other than afl-clang-fast.
Finally, recompile the program with afl-clang-fast (afl-gcc or afl-clang will
*not* generate a deferred-initialization binary) - and you should be all set!
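A minimal sketch of a deferred-initialization target, assuming the __AFL_INIT() / __AFL_HAVE_MANUAL_CONTROL macros that afl-clang-fast provides (the run_target() name is just for illustration):

```c
#include <stdio.h>

/* Sketch of a deferred-initialization target: with afl-clang-fast,
   __AFL_INIT() starts the forkserver right here instead of before
   main(); with any other compiler the #ifdef compiles it out, so
   the program keeps working normally. */
static int run_target(void) {
    /* expensive one-time setup (config parsing, etc.) goes here,
       before the forkserver starts */
#ifdef __AFL_HAVE_MANUAL_CONTROL
    __AFL_INIT();
#endif
    /* per-input work goes after this point */
    return 0;
}
```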
5) Bonus feature #2: persistent mode
6) Bonus feature #2: persistent mode
------------------------------------
Some libraries provide APIs that are stateless, or whose state can be reset in
@ -169,7 +196,7 @@ PS. Because there are task switches still involved, the mode isn't as fast as
faster than the normal fork() model, and compared to in-process fuzzing,
should be a lot more robust.
6) Bonus feature #3: new 'trace-pc-guard' mode
8) Bonus feature #3: new 'trace-pc-guard' mode
----------------------------------------------
Recent versions of LLVM are shipping with a built-in execution tracing feature
@ -178,10 +205,8 @@ post-process the assembly or install any compiler plugins. See:
http://clang.llvm.org/docs/SanitizerCoverage.html#tracing-pcs-with-guards
As of this writing, the feature is only available on SVN trunk, and is yet to
make it to an official release of LLVM. Nevertheless, if you have a
sufficiently recent compiler and want to give it a try, build afl-clang-fast
this way:
If you have a sufficiently recent compiler and want to give it a try, build
afl-clang-fast this way:
AFL_TRACE_PC=1 make clean all


@ -0,0 +1,22 @@
Usage
=====
In larger, complex or highly iterative programs the counters in the map that
collects the edge pairs can easily fill up and wrap.
This is not much of an issue - unless, by chance, a counter wraps to exactly
0 when the program execution ends.
In this case afl-fuzz is not able to see that the pair has been accessed and
will ignore it.
NeverZero prevents this behaviour. If a counter wraps, it jumps over the 0
directly to a 1. This improves path discovery (by a very small amount)
at a very small cost (one instruction per edge).
This is implemented in afl-gcc; for llvm_mode, however, it is optional if
the llvm version is below 9 - as there is a performance bug that is only
fixed in version 9 and onwards.
If you want to enable this for llvm < 9 then set
export AFL_LLVM_NOT_ZERO=1
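The skip-over-zero update described above can be sketched in C (an illustration of the arithmetic, not the injected IR):

```c
#include <stdint.h>

/* NeverZero counter update: add the carry of the 8-bit increment
   back in, so a wrap from 255 lands on 1 rather than 0. */
static uint8_t neverzero_bump(uint8_t counter) {
    uint8_t incr = (uint8_t)(counter + 1);
    incr += (incr == 0);   /* carry: 1 only on overflow */
    return incr;
}
```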


@ -0,0 +1,75 @@
========================================
Using afl++ with partial instrumentation
========================================
This file describes how you can selectively instrument only the source files
that are interesting to you using the LLVM instrumentation provided by
afl++
Originally developed by Christian Holler (:decoder) <choller@mozilla.com>.
1) Description and purpose
--------------------------
When building and testing complex programs where only a part of the program is
the fuzzing target, it often helps to only instrument the necessary parts of
the program, leaving the rest uninstrumented. This helps to focus the fuzzer
on the important parts of the program, avoiding undesired noise and
disturbance by uninteresting code being exercised.
For this purpose, I have added "partial instrumentation" support to the LLVM
mode of AFLFuzz that allows you to specify, at the source file level, which
files should be compiled with or without instrumentation.
2) Building the LLVM module
---------------------------
The new code is part of the existing afl++ LLVM module in the llvm_mode/
subdirectory. There is nothing specifically to do :)
3) How to use the partial instrumentation mode
----------------------------------------------
In order to build with partial instrumentation, you need to build with
afl-clang-fast and afl-clang-fast++ respectively. The only required change is
that you need to set the environment variable AFL_LLVM_WHITELIST when calling
the compiler.
The environment variable must point to a file containing all the filenames
that should be instrumented. For matching, the filename that is being compiled
must end in the filename contained in this whitelist (to avoid breaking the
matching when absolute paths are used during compilation).
For example if your source tree looks like this:
project/
project/feature_a/a1.cpp
project/feature_a/a2.cpp
project/feature_b/b1.cpp
project/feature_b/b2.cpp
And you only want to test feature_a, then create a whitelist file containing:
feature_a/a1.cpp
feature_a/a2.cpp
However if the whitelist file contains this, it works as well:
a1.cpp
a2.cpp
but it might lead to files being unintentionally instrumented if the same filename
exists somewhere else in the project.
The created whitelist file is then set in AFL_LLVM_WHITELIST when you compile
your program. For each file that didn't match the whitelist, the compiler will
issue a warning at the end stating that no blocks were instrumented. If you
didn't intend to instrument that file, then you can safely ignore that warning.
For old LLVM versions this feature might require compiling with debug
information (-g); however, at least from llvm version 6.0 onwards this is not
required anymore (and -g might hurt performance and crash detection, so
better not to use it)
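The matching rule above (the compiled filename must end in a whitelist entry) boils down to a suffix comparison; a hypothetical C sketch (the real pass does this on LLVM StringRefs):

```c
#include <string.h>

/* Whitelist matching sketch: instrument a compile unit when its
   (possibly absolute) filename ends in the whitelist entry. */
static int whitelist_match(const char *filename, const char *entry) {
    size_t flen = strlen(filename), elen = strlen(entry);
    if (flen < elen) return 0;
    return strcmp(filename + (flen - elen), entry) == 0;
}
```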


@ -32,6 +32,7 @@
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>
static u8* obj_path; /* Path to runtime libraries */
static u8** cc_params; /* Parameters passed to the real CC */
@ -87,7 +88,7 @@ static void find_obj(u8* argv0) {
return;
}
FATAL("Unable to find 'afl-llvm-rt.o' or 'afl-llvm-pass.so'. Please set AFL_PATH");
FATAL("Unable to find 'afl-llvm-rt.o' or 'afl-llvm-pass.so.cc'. Please set AFL_PATH");
}
@ -112,29 +113,29 @@ static void edit_params(u32 argc, char** argv) {
cc_params[0] = alt_cc ? alt_cc : (u8*)"clang";
}
/* There are two ways to compile afl-clang-fast. In the traditional mode, we
use afl-llvm-pass.so to inject instrumentation. In the experimental
/* There are three ways to compile with afl-clang-fast. In the traditional
mode, we use afl-llvm-pass.so, then there is libLLVMInsTrim.so which is
much faster but has less coverage. Finally there is the experimental
'trace-pc-guard' mode, we use native LLVM instrumentation callbacks
instead. The latter is a very recent addition - see:
instead. For trace-pc-guard see:
http://clang.llvm.org/docs/SanitizerCoverage.html#tracing-pcs-with-guards */
// laf
if (getenv("LAF_SPLIT_SWITCHES")) {
if (getenv("LAF_SPLIT_SWITCHES")||getenv("AFL_LLVM_LAF_SPLIT_SWITCHES")) {
cc_params[cc_par_cnt++] = "-Xclang";
cc_params[cc_par_cnt++] = "-load";
cc_params[cc_par_cnt++] = "-Xclang";
cc_params[cc_par_cnt++] = alloc_printf("%s/split-switches-pass.so", obj_path);
}
if (getenv("LAF_TRANSFORM_COMPARES")) {
if (getenv("LAF_TRANSFORM_COMPARES")||getenv("AFL_LLVM_LAF_TRANSFORM_COMPARES")) {
cc_params[cc_par_cnt++] = "-Xclang";
cc_params[cc_par_cnt++] = "-load";
cc_params[cc_par_cnt++] = "-Xclang";
cc_params[cc_par_cnt++] = alloc_printf("%s/compare-transform-pass.so", obj_path);
}
if (getenv("LAF_SPLIT_COMPARES")) {
if (getenv("LAF_SPLIT_COMPARES")||getenv("AFL_LLVM_LAF_SPLIT_COMPARES")) {
cc_params[cc_par_cnt++] = "-Xclang";
cc_params[cc_par_cnt++] = "-load";
cc_params[cc_par_cnt++] = "-Xclang";
@ -143,14 +144,18 @@ static void edit_params(u32 argc, char** argv) {
// /laf
#ifdef USE_TRACE_PC
cc_params[cc_par_cnt++] = "-fsanitize-coverage=trace-pc-guard";
cc_params[cc_par_cnt++] = "-mllvm";
cc_params[cc_par_cnt++] = "-sanitizer-coverage-block-threshold=0";
cc_params[cc_par_cnt++] = "-fsanitize-coverage=trace-pc-guard"; // edge coverage by default
//cc_params[cc_par_cnt++] = "-mllvm";
//cc_params[cc_par_cnt++] = "-fsanitize-coverage=trace-cmp,trace-div,trace-gep";
//cc_params[cc_par_cnt++] = "-sanitizer-coverage-block-threshold=0";
#else
cc_params[cc_par_cnt++] = "-Xclang";
cc_params[cc_par_cnt++] = "-load";
cc_params[cc_par_cnt++] = "-Xclang";
cc_params[cc_par_cnt++] = alloc_printf("%s/afl-llvm-pass.so", obj_path);
if (getenv("AFL_LLVM_INSTRIM") != NULL || getenv("INSTRIM_LIB") != NULL)
cc_params[cc_par_cnt++] = alloc_printf("%s/libLLVMInsTrim.so", obj_path);
else
cc_params[cc_par_cnt++] = alloc_printf("%s/afl-llvm-pass.so", obj_path);
#endif /* ^USE_TRACE_PC */
cc_params[cc_par_cnt++] = "-Qunused-arguments";
@ -246,6 +251,10 @@ static void edit_params(u32 argc, char** argv) {
}
#ifdef USEMMAP
cc_params[cc_par_cnt++] = "-lrt";
#endif
cc_params[cc_par_cnt++] = "-D__AFL_HAVE_MANUAL_CONTROL=1";
cc_params[cc_par_cnt++] = "-D__AFL_COMPILER=1";
cc_params[cc_par_cnt++] = "-DFUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION=1";


@ -31,6 +31,11 @@
#include <stdlib.h>
#include <unistd.h>
#include <list>
#include <string>
#include <fstream>
#include "llvm/IR/DebugInfo.h"
#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/LegacyPassManager.h"
@ -48,7 +53,21 @@ namespace {
public:
static char ID;
AFLCoverage() : ModulePass(ID) { }
AFLCoverage() : ModulePass(ID) {
char* instWhiteListFilename = getenv("AFL_LLVM_WHITELIST");
if (instWhiteListFilename) {
std::string line;
std::ifstream fileStream;
fileStream.open(instWhiteListFilename);
if (!fileStream)
report_fatal_error("Unable to open AFL_LLVM_WHITELIST");
getline(fileStream, line);
while (fileStream) {
myWhitelist.push_back(line);
getline(fileStream, line);
}
}
}
bool runOnModule(Module &M) override;
@ -56,6 +75,10 @@ namespace {
// return "American Fuzzy Lop Instrumentation";
// }
protected:
std::list<std::string> myWhitelist;
};
}
@ -95,6 +118,10 @@ bool AFLCoverage::runOnModule(Module &M) {
}
#if LLVM_VERSION_MAJOR < 9
char* neverZero_counters_str = getenv("AFL_LLVM_NOT_ZERO");
#endif
/* Get globals for the SHM region and the previous location. Note that
__afl_prev_loc is thread-local. */
@ -115,6 +142,51 @@ bool AFLCoverage::runOnModule(Module &M) {
BasicBlock::iterator IP = BB.getFirstInsertionPt();
IRBuilder<> IRB(&(*IP));
if (!myWhitelist.empty()) {
bool instrumentBlock = false;
/* Get the current location using debug information.
* For now, just instrument the block if we are not able
* to determine our location. */
DebugLoc Loc = IP->getDebugLoc();
if ( Loc ) {
DILocation *cDILoc = dyn_cast<DILocation>(Loc.getAsMDNode());
unsigned int instLine = cDILoc->getLine();
StringRef instFilename = cDILoc->getFilename();
if (instFilename.str().empty()) {
/* If the original location is empty, try using the inlined location */
DILocation *oDILoc = cDILoc->getInlinedAt();
if (oDILoc) {
instFilename = oDILoc->getFilename();
instLine = oDILoc->getLine();
}
}
/* Continue only if we know where we actually are */
if (!instFilename.str().empty()) {
for (std::list<std::string>::iterator it = myWhitelist.begin(); it != myWhitelist.end(); ++it) {
/* We don't check for filename equality here because
* filenames might actually be full paths. Instead we
* check that the actual filename ends in the filename
* specified in the list. */
if (instFilename.str().length() >= it->length()) {
if (instFilename.str().compare(instFilename.str().length() - it->length(), it->length(), *it) == 0) {
instrumentBlock = true;
break;
}
}
}
}
}
/* Either we couldn't figure out our location or the location is
* not whitelisted, so we skip instrumentation. */
if (!instrumentBlock) continue;
}
if (AFL_R(100) >= inst_ratio) continue;
@ -159,21 +231,69 @@ bool AFLCoverage::runOnModule(Module &M) {
LoadInst *MapPtr = IRB.CreateLoad(AFLMapPtr);
MapPtr->setMetadata(M.getMDKindID("nosanitize"), MDNode::get(C, None));
Value *MapPtrIdx =
IRB.CreateGEP(MapPtr, IRB.CreateXor(PrevLocCasted, CurLoc));
Value *MapPtrIdx = IRB.CreateGEP(MapPtr, IRB.CreateXor(PrevLocCasted, CurLoc));
/* Update bitmap */
LoadInst *Counter = IRB.CreateLoad(MapPtrIdx);
Counter->setMetadata(M.getMDKindID("nosanitize"), MDNode::get(C, None));
Value *Incr = IRB.CreateAdd(Counter, ConstantInt::get(Int8Ty, 1));
IRB.CreateStore(Incr, MapPtrIdx)
->setMetadata(M.getMDKindID("nosanitize"), MDNode::get(C, None));
#if LLVM_VERSION_MAJOR < 9
if (neverZero_counters_str != NULL) { // with llvm 9 we make this the default as the bug in llvm is then fixed
#endif
/* hexcoder: Realize a counter that skips zero during overflow.
* Once this counter reaches its maximum value, it next increments to 1
*
* Instead of
* Counter + 1 -> Counter
* we inject now this
* Counter + 1 -> {Counter, OverflowFlag}
* Counter + OverflowFlag -> Counter
*/
/* // we keep the old solutions just in case
// Solution #1
if (neverZero_counters_str[0] == '1') {
CallInst *AddOv = IRB.CreateBinaryIntrinsic(Intrinsic::uadd_with_overflow, Counter, ConstantInt::get(Int8Ty, 1));
AddOv->setMetadata(M.getMDKindID("nosanitize"), MDNode::get(C, None));
Value *SumWithOverflowBit = AddOv;
Incr = IRB.CreateAdd(IRB.CreateExtractValue(SumWithOverflowBit, 0), // sum
IRB.CreateZExt( // convert from one bit type to 8 bits type
IRB.CreateExtractValue(SumWithOverflowBit, 1), // overflow
Int8Ty));
// Solution #2
} else if (neverZero_counters_str[0] == '2') {
auto cf = IRB.CreateICmpEQ(Counter, ConstantInt::get(Int8Ty, 255));
Value *HowMuch = IRB.CreateAdd(ConstantInt::get(Int8Ty, 1), cf);
Incr = IRB.CreateAdd(Counter, HowMuch);
// Solution #3
} else if (neverZero_counters_str[0] == '3') {
*/
// this is the solution we choose because llvm9 should do the right thing here
auto cf = IRB.CreateICmpEQ(Incr, ConstantInt::get(Int8Ty, 0));
auto carry = IRB.CreateZExt(cf, Int8Ty);
Incr = IRB.CreateAdd(Incr, carry);
/*
// Solution #4
} else if (neverZero_counters_str[0] == '4') {
auto cf = IRB.CreateICmpULT(Incr, ConstantInt::get(Int8Ty, 1));
auto carry = IRB.CreateZExt(cf, Int8Ty);
Incr = IRB.CreateAdd(Incr, carry);
} else {
fprintf(stderr, "Error: unknown value for AFL_NZERO_COUNTS: %s (valid is 1-4)\n", neverZero_counters_str);
exit(-1);
}
*/
#if LLVM_VERSION_MAJOR < 9
}
#endif
IRB.CreateStore(Incr, MapPtrIdx)->setMetadata(M.getMDKindID("nosanitize"), MDNode::get(C, None));
/* Set prev_loc to cur_loc >> 1 */
StoreInst *Store =
IRB.CreateStore(ConstantInt::get(Int32Ty, cur_loc >> 1), AFLPrevLoc);
StoreInst *Store = IRB.CreateStore(ConstantInt::get(Int32Ty, cur_loc >> 1), AFLPrevLoc);
Store->setMetadata(M.getMDKindID("nosanitize"), MDNode::get(C, None));
inst_blocks++;
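In C terms, the instrumentation injected above amounts to the classic AFL edge bookkeeping; a simplified sketch (the MAP_SIZE value and the modulo indexing are assumptions of this illustration, not the pass's exact IR):

```c
#include <stdint.h>

#define MAP_SIZE 65536        /* size of the shared coverage map */

static uint8_t afl_area[MAP_SIZE];
static uint32_t prev_loc;     /* __afl_prev_loc equivalent */

/* Per-block instrumentation: hash the edge as prev_loc XOR cur_loc,
   bump that counter, then store cur_loc >> 1 as the new prev_loc. */
static void edge_hit(uint32_t cur_loc) {
    afl_area[(prev_loc ^ cur_loc) % MAP_SIZE]++;
    prev_loc = cur_loc >> 1;
}
```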


@ -44,6 +44,9 @@
# define CONST_PRIO 0
#endif /* ^USE_TRACE_PC */
#include <sys/mman.h>
#include <fcntl.h>
/* Globals needed by the injected instrumentation. The __afl_area_initial region
is used for instrumentation output before __afl_map_shm() has a chance to run.
@ -71,10 +74,34 @@ static void __afl_map_shm(void) {
hacky .init code to work correctly in projects such as OpenSSL. */
if (id_str) {
#ifdef USEMMAP
const char *shm_file_path = id_str;
int shm_fd = -1;
unsigned char *shm_base = NULL;
/* create the shared memory segment as if it was a file */
shm_fd = shm_open(shm_file_path, O_RDWR, 0600);
if (shm_fd == -1) {
printf("shm_open() failed\n");
exit(1);
}
/* map the shared memory segment to the address space of the process */
shm_base = mmap(0, MAP_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, shm_fd, 0);
if (shm_base == MAP_FAILED) {
close(shm_fd);
shm_fd = -1;
printf("mmap() failed\n");
exit(2);
}
__afl_area_ptr = shm_base;
#else
u32 shm_id = atoi(id_str);
__afl_area_ptr = shmat(shm_id, NULL, 0);
#endif
/* Whooooops. */


@ -144,7 +144,7 @@ bool CompareTransform::transformCmps(Module &M, const bool processStrcmp, const
if (!isStrcmp && !isMemcmp && !isStrncmp && !isStrcasecmp && !isStrncasecmp)
continue;
/* is a str{n,}{case,}cmp/memcmp, check is we have
/* is a str{n,}{case,}cmp/memcmp, check if we have
* str{case,}cmp(x, "const") or str{case,}cmp("const", x)
* strn{case,}cmp(x, "const", ..) or strn{case,}cmp("const", x, ..)
* memcmp(x, "const", ..) or memcmp("const", x, ..) */
@ -184,6 +184,7 @@ bool CompareTransform::transformCmps(Module &M, const bool processStrcmp, const
Value *Str1P = callInst->getArgOperand(0), *Str2P = callInst->getArgOperand(1);
StringRef Str1, Str2, ConstStr;
std::string TmpConstStr;
Value *VarStr;
bool HasStr1 = getConstantStringInfo(Str1P, Str1);
getConstantStringInfo(Str2P, Str2);
@ -202,15 +203,21 @@ bool CompareTransform::transformCmps(Module &M, const bool processStrcmp, const
}
if (HasStr1) {
ConstStr = Str1;
TmpConstStr = Str1.str();
VarStr = Str2P;
constLen = isMemcmp ? sizedLen : GetStringLength(Str1P);
}
else {
ConstStr = Str2;
TmpConstStr = Str2.str();
VarStr = Str1P;
constLen = isMemcmp ? sizedLen : GetStringLength(Str2P);
}
/* properly handle zero terminated C strings by adding the terminating 0 to
* the StringRef (in comparison to std::string a StringRef has built-in
* runtime bounds checking, which makes debugging easier) */
TmpConstStr.append("\0", 1); ConstStr = StringRef(TmpConstStr);
if (isSizedcmp && constLen > sizedLen) {
constLen = sizedLen;
}
@ -250,6 +257,7 @@ bool CompareTransform::transformCmps(Module &M, const bool processStrcmp, const
std::vector<Value *> args;
args.push_back(load);
load = IRB.CreateCall(tolowerFn, args, "tmp");
load = IRB.CreateTrunc(load, Int8Ty);
}
Value *isub;
if (HasStr1)
@ -265,14 +273,9 @@ bool CompareTransform::transformCmps(Module &M, const bool processStrcmp, const
next_bb = BasicBlock::Create(C, "cmp_added", end_bb->getParent(), end_bb);
BranchInst::Create(end_bb, next_bb);
#if LLVM_VERSION_MAJOR < 8
TerminatorInst *term = cur_bb->getTerminator();
#else
Instruction *term = cur_bb->getTerminator();
#endif
Value *icmp = IRB.CreateICmpEQ(isub, ConstantInt::get(Int8Ty, 0));
IRB.CreateCondBr(icmp, next_bb, end_bb);
term->eraseFromParent();
cur_bb->getTerminator()->eraseFromParent();
} else {
//IRB.CreateBr(end_bb);
}
@ -297,7 +300,8 @@ bool CompareTransform::transformCmps(Module &M, const bool processStrcmp, const
bool CompareTransform::runOnModule(Module &M) {
llvm::errs() << "Running compare-transform-pass by laf.intel@gmail.com, extended by heiko@hexco.de\n";
if (getenv("AFL_QUIET") == NULL)
llvm::errs() << "Running compare-transform-pass by laf.intel@gmail.com, extended by heiko@hexco.de\n";
transformCmps(M, true, true, true, true, true);
verifyModule(M);


@ -259,7 +259,7 @@ bool SplitComparesTransform::simplifySignedness(Module &M) {
Instruction *icmp_inv_sig_cmp;
BasicBlock* sign_bb = BasicBlock::Create(C, "sign", end_bb->getParent(), end_bb);
if (pred == CmpInst::ICMP_SGT) {
/* if we check for > and the op0 positiv and op1 negative then the final
/* if we check for > and the op0 positive and op1 negative then the final
* result is true. if op0 negative and op1 pos, the cmp must result
* in false
*/
@ -369,7 +369,7 @@ bool SplitComparesTransform::splitCompares(Module &M, unsigned bitw) {
BasicBlock* end_bb = bb->splitBasicBlock(BasicBlock::iterator(IcmpInst));
/* create the comparison of the top halfs of the original operands */
/* create the comparison of the top halves of the original operands */
Instruction *s_op0, *op0_high, *s_op1, *op1_high, *icmp_high;
s_op0 = BinaryOperator::Create(Instruction::LShr, op0, ConstantInt::get(OldIntType, bitw / 2));
@ -403,7 +403,7 @@ bool SplitComparesTransform::splitCompares(Module &M, unsigned bitw) {
cmp_low_bb->getInstList().push_back(icmp_low);
BranchInst::Create(end_bb, cmp_low_bb);
/* dependant on the cmp of the high parts go to the end or go on with
/* dependent on the cmp of the high parts go to the end or go on with
* the comparison */
auto term = bb->getTerminator();
if (pred == CmpInst::ICMP_EQ) {
@ -448,7 +448,7 @@ bool SplitComparesTransform::splitCompares(Module &M, unsigned bitw) {
term->eraseFromParent();
BranchInst::Create(end_bb, inv_cmp_bb, icmp_high, bb);
/* create a bb which handles the cmp of the lower halfs */
/* create a bb which handles the cmp of the lower halves */
BasicBlock* cmp_low_bb = BasicBlock::Create(C, "injected", end_bb->getParent(), end_bb);
op0_low = new TruncInst(op0, NewIntType);
cmp_low_bb->getInstList().push_back(op0_low);
@ -477,6 +477,8 @@ bool SplitComparesTransform::runOnModule(Module &M) {
int bitw = 64;
char* bitw_env = getenv("LAF_SPLIT_COMPARES_BITW");
if (!bitw_env)
bitw_env = getenv("AFL_LLVM_LAF_SPLIT_COMPARES_BITW");
if (bitw_env) {
bitw = atoi(bitw_env);
}
@ -485,7 +487,8 @@ bool SplitComparesTransform::runOnModule(Module &M) {
simplifySignedness(M);
errs() << "Split-compare-pass by laf.intel@gmail.com\n";
if (getenv("AFL_QUIET") == NULL)
errs() << "Split-compare-pass by laf.intel@gmail.com\n";
switch (bitw) {
case 64:


@ -87,6 +87,7 @@ BasicBlock* SplitSwitchesTransform::switchConvert(CaseVector Cases, std::vector<
std::vector<uint8_t> setSizes;
std::vector<std::set<uint8_t>> byteSets(BytesInValue, std::set<uint8_t>());
assert(ValTypeBitWidth >= 8 && ValTypeBitWidth <= 64);
/* for each of the possible cases we iterate over all bytes of the values
* build a set of possible values at each byte position in byteSets */
@ -98,6 +99,8 @@ BasicBlock* SplitSwitchesTransform::switchConvert(CaseVector Cases, std::vector<
}
}
/* find the index of the first byte position that was not yet checked. then
* save the number of possible values at that byte position */
unsigned smallestIndex = 0;
unsigned smallestSize = 257;
for(unsigned i = 0; i < byteSets.size(); i++) {
@ -152,7 +155,7 @@ BasicBlock* SplitSwitchesTransform::switchConvert(CaseVector Cases, std::vector<
}
PHINode *PN = cast<PHINode>(I);
/* Only update the first occurence. */
/* Only update the first occurrence. */
unsigned Idx = 0, E = PN->getNumIncomingValues();
for (; Idx != E; ++Idx) {
if (PN->getIncomingBlock(Idx) == OrigBlock) {
@ -235,9 +238,14 @@ bool SplitSwitchesTransform::splitSwitches(Module &M) {
/* this is the value we are switching on */
Value *Val = SI->getCondition();
BasicBlock* Default = SI->getDefaultDest();
unsigned bitw = Val->getType()->getIntegerBitWidth();
/* If there is only the default destination, don't bother with the code below. */
if (!SI->getNumCases()) {
errs() << "switch: " << SI->getNumCases() << " cases " << bitw << " bit\n";
/* If there is only the default destination or the condition checks 8 bit or less, don't bother with the code below. */
if (!SI->getNumCases() || bitw <= 8) {
if (getenv("AFL_QUIET") == NULL)
errs() << "skip trivial switch..\n";
continue;
}
@ -259,7 +267,9 @@ bool SplitSwitchesTransform::splitSwitches(Module &M) {
#else
Cases.push_back(CaseExpr(i->getCaseValue(), i->getCaseSuccessor()));
#endif
std::vector<bool> bytesChecked(Cases[0].Val->getBitWidth() / 8, false);
/* bugfix thanks to pbst
* round up bytesChecked (in case getBitWidth() % 8 != 0) */
std::vector<bool> bytesChecked((7 + Cases[0].Val->getBitWidth()) / 8, false);
BasicBlock* SwitchBlock = switchConvert(Cases, bytesChecked, OrigBlock, NewDefault, Val, 0);
/* Branch to our shiny new if-then stuff... */
@ -276,7 +286,7 @@ bool SplitSwitchesTransform::splitSwitches(Module &M) {
}
PHINode *PN = cast<PHINode>(I);
/* Only update the first occurence. */
/* Only update the first occurrence. */
unsigned Idx = 0, E = PN->getNumIncomingValues();
for (; Idx != E; ++Idx) {
if (PN->getIncomingBlock(Idx) == OrigBlock) {
@ -293,7 +303,8 @@ bool SplitSwitchesTransform::splitSwitches(Module &M) {
bool SplitSwitchesTransform::runOnModule(Module &M) {
llvm::errs() << "Running split-switches-pass by laf.intel@gmail.com\n";
if (getenv("AFL_QUIET") == NULL)
llvm::errs() << "Running split-switches-pass by laf.intel@gmail.com\n";
splitSwitches(M);
verifyModule(M);
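The bytesChecked fix in the split-switches hunk above is the usual bits-to-bytes ceiling; a minimal check of the expression:

```c
/* Round a bit width up to whole bytes, as in the bytesChecked fix:
   (7 + bits) / 8 rounds up when bits is not a multiple of 8. */
static unsigned bytes_for_bits(unsigned bits) {
    return (7u + bits) / 8u;
}
```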

python_mutators/README Normal file

@ -0,0 +1,15 @@
These are example and helper files for the AFL_PYTHON_MODULE feature.
See docs/python_mutators.txt for more information
example.py - this is the template you can use, the functions are there
but they are empty
simple-chunk-replace.py - this is a simple example where chunks are replaced
common.py - this can be used for common functions and helpers.
the examples do not use this though. But you can :)
wrapper_afl_min.py - mutation of XML documents, loads XmlMutatorMin.py
XmlMutatorMin.py - module for XML mutation


@ -0,0 +1,331 @@
#!/usr/bin/python
""" Mutation of XML documents, should be called from one of its wrappers (CLI, AFL, ...) """
from __future__ import print_function
from copy import deepcopy
from lxml import etree as ET
import random, re, io
###########################
# The XmlMutatorMin class #
###########################
class XmlMutatorMin:
"""
Optional parameters:
seed Seed used by the PRNG (default: "RANDOM")
verbose Verbosity (default: False)
"""
def __init__(self, seed="RANDOM", verbose=False):
""" Initialize seed, database and mutators """
# Verbosity
self.verbose = verbose
# Initialize PRNG
self.seed = str(seed)
if self.seed == "RANDOM":
random.seed()
else:
if self.verbose:
print("Static seed '%s'" % self.seed)
random.seed(self.seed)
# Initialize input and output documents
self.input_tree = None
self.tree = None
# High-level mutators (no database needed)
hl_mutators_delete = [ "del_node_and_children", "del_node_but_children", "del_attribute", "del_content" ] # Delete items
hl_mutators_fuzz = ["fuzz_attribute"] # Randomly change attribute values
# Exposed mutators
self.hl_mutators_all = hl_mutators_fuzz + hl_mutators_delete
def __parse_xml (self, xml):
""" Parse an XML string. Basic wrapper around lxml.parse() """
try:
# Function parse() takes care of comments / DTD / processing instructions / ...
tree = ET.parse(io.BytesIO(xml))
except ET.ParseError:
raise RuntimeError("XML isn't well-formed!")
except LookupError as e:
raise RuntimeError(e)
# Return a document wrapper
return tree
def __exec_among (self, module, functions, min_times, max_times):
""" Randomly execute $functions between $min and $max times """
for i in xrange (random.randint (min_times, max_times)):
# Function names are mangled because they are "private"
getattr (module, "_XmlMutatorMin__" + random.choice(functions)) ()
def __serialize_xml (self, tree):
""" Serialize a XML document. Basic wrapper around lxml.tostring() """
return ET.tostring(tree, with_tail=False, xml_declaration=True, encoding=tree.docinfo.encoding)
def __ver (self, version):
""" Helper for displaying lxml version numbers """
return ".".join(map(str, version))
def reset (self):
""" Reset the mutator """
self.tree = deepcopy(self.input_tree)
def init_from_string (self, input_string):
""" Initialize the mutator from a XML string """
# Get a pointer to the top-element
self.input_tree = self.__parse_xml(input_string)
# Get a working copy
self.tree = deepcopy(self.input_tree)
def save_to_string (self):
""" Return the current XML document as UTF-8 string """
# Return a text version of the tree
return self.__serialize_xml(self.tree)
def __pick_element (self, exclude_root_node = False):
""" Pick a random element from the current document """
# Get a list of all elements, excluding nodes such as PIs and comments
elems = list(self.tree.getroot().iter(tag=ET.Element))
# Is the root node excluded?
if exclude_root_node:
start = 1
else:
start = 0
# Pick a random element
try:
elem_id = random.randint (start, len(elems) - 1)
elem = elems[elem_id]
except ValueError:
# Should only occur if "exclude_root_node = True"
return (None, None)
return (elem_id, elem)
def __fuzz_attribute (self):
""" Fuzz (part of) an attribute value """
# Select a node to modify
(rand_elem_id, rand_elem) = self.__pick_element()
# Get all the attributes
attribs = rand_elem.keys()
# Are there any attributes?
if len(attribs) < 1:
if self.verbose:
print("No attribute: can't replace!")
return
# Pick a random attribute
rand_attrib_id = random.randint (0, len(attribs) - 1)
rand_attrib = attribs[rand_attrib_id]
# We have the attribute to modify
# Get its value
attrib_value = rand_elem.get(rand_attrib)
# print("- Value: " + attrib_value)
# Should we work on the whole value?
func_call = r"(?P<func>[a-zA-Z:\-]+)\((?P<args>.*?)\)"
p = re.compile(func_call)
l = p.findall(attrib_value)
if random.choice((True,False)) and l:
# Randomly pick one of the function calls
(func, args) = random.choice(l)
# Split by "," and randomly pick one of the arguments
value = random.choice(args.split(','))
# Remove superfluous characters
unclean_value = value
value = value.strip(" ").strip("'")
# print("Selected argument: [%s]" % value)
else:
value = attrib_value
# For each type, define some possible replacement values
choices_number = ( \
"0", \
"11111", \
"-128", \
"2", \
"-1", \
"1/3", \
"42/0", \
"1094861636 idiv 1.0", \
"-1123329771506872 idiv 3.8", \
"17=$numericRTF", \
str(3 + random.randrange(0, 100)), \
)
choices_letter = ( \
"P" * (25 * random.randrange(1, 100)), \
"%s%s%s%s%s%s", \
"foobar", \
)
choices_alnum = ( \
"Abc123", \
"020F0302020204030204", \
"020F0302020204030204" * (random.randrange(5, 20)), \
)
# Fuzz the value
if random.choice((True,False)) and value == "":
# Empty
new_value = value
elif random.choice((True,False)) and value.isdigit():
# Numbers
new_value = random.choice(choices_number)
elif random.choice((True,False)) and value.isalpha():
# Letters
new_value = random.choice(choices_letter)
elif random.choice((True,False)) and value.isalnum():
# Alphanumeric
new_value = random.choice(choices_alnum)
else:
# Default type
new_value = random.choice(choices_alnum + choices_letter + choices_number)
# If we worked on a substring, apply changes to the whole string
if value != attrib_value:
# No ' around empty values
if new_value != "" and value != "":
new_value = "'" + new_value + "'"
# Apply changes
new_value = attrib_value.replace(unclean_value, new_value)
# Log something
if self.verbose:
print("Fuzzing attribute #%i '%s' of tag #%i '%s'" % (rand_attrib_id, rand_attrib, rand_elem_id, rand_elem.tag))
# Modify the attribute
rand_elem.set(rand_attrib, new_value.decode("utf-8"))
def __del_node_and_children (self):
""" High-level minimizing mutator
Delete a random node and its children (i.e. delete a random tree) """
self.__del_node(True)
def __del_node_but_children (self):
""" High-level minimizing mutator
Delete a random node but its children (i.e. link them to the parent of the deleted node) """
self.__del_node(False)
def __del_node (self, delete_children):
""" Called by the __del_node_* mutators """
# Select a node to modify (but the root one)
(rand_elem_id, rand_elem) = self.__pick_element (exclude_root_node = True)
# If the document contains only a top-level element,
# then we can't pick an element (given that "exclude_root_node = True")
# Is the document deep enough?
if rand_elem is None:
if self.verbose:
print("Can't delete a node: document not deep enough!")
return
# Log something
if self.verbose:
but_or_and = "and" if delete_children else "but"
print("Deleting tag #%i '%s' %s its children" % (rand_elem_id, rand_elem.tag, but_or_and))
if delete_children is False:
# Link children of the random (soon to be deleted) node to its parent
for child in rand_elem:
rand_elem.getparent().append(child)
# Remove the node
rand_elem.getparent().remove(rand_elem)
def __del_content (self):
""" High-level minimizing mutator
Delete the attributes and children of a random node """
# Select a node to modify
(rand_elem_id, rand_elem) = self.__pick_element()
# Log something
if self.verbose:
print("Reseting tag #%i '%s'" % (rand_elem_id, rand_elem.tag))
# Reset the node
rand_elem.clear()
def __del_attribute (self):
""" High-level minimizing mutator
Delete a random attribute from a random node """
# Select a node to modify
(rand_elem_id, rand_elem) = self.__pick_element()
# Get all the attributes
attribs = rand_elem.keys()
# Are there any attributes?
if len(attribs) < 1:
if self.verbose:
print("No attribute: can't delete!")
return
# Pick a random attribute
rand_attrib_id = random.randint (0, len(attribs) - 1)
rand_attrib = attribs[rand_attrib_id]
# Log something
if self.verbose:
print("Deleting attribute #%i '%s' of tag #%i '%s'" % (rand_attrib_id, rand_attrib, rand_elem_id, rand_elem.tag))
# Delete the attribute
rand_elem.attrib.pop(rand_attrib)
def mutate (self, min=1, max=5):
""" Execute some high-level mutators between $min and $max times, then some medium-level ones """
# High-level mutation
self.__exec_among(self, self.hl_mutators_all, min, max)
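The `func_call` regex used in `__fuzz_attribute()` deserves a quick illustration; the sample attribute value below is invented for demonstration (it mimics an SVG-style transform):

```python
import re

# Same pattern as in __fuzz_attribute(): capture "name(args)" pairs
func_call = r"(?P<func>[a-zA-Z:\-]+)\((?P<args>.*?)\)"
calls = re.findall(func_call, "translate(10, 20) scale('2')")
# Each match is a (function name, raw argument string) tuple;
# the mutator then splits the arguments on "," and fuzzes one of them.
```

findall() returns one tuple per call site, so `calls` here holds `('translate', '10, 20')` and `('scale', "'2'")`.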

python_mutators/common.py
@ -0,0 +1,37 @@
#!/usr/bin/env python
# encoding: utf-8
'''
Module containing functions shared between multiple AFL modules
@author: Christian Holler (:decoder)
@license:
This Source Code Form is subject to the terms of the Mozilla Public
License, v. 2.0. If a copy of the MPL was not distributed with this
file, You can obtain one at http://mozilla.org/MPL/2.0/.
@contact: choller@mozilla.com
'''
from __future__ import print_function
import random
import os
import re
def randel(l):
if not l:
return None
return l[random.randint(0,len(l)-1)]
def randel_pop(l):
if not l:
return None
return l.pop(random.randint(0,len(l)-1))
def write_exc_example(data, exc):
exc_name = re.sub(r'[^a-zA-Z0-9]', '_', repr(exc))
if not os.path.exists(exc_name):
with open(exc_name, 'w') as f:
f.write(data)
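The two random-selection helpers behave like `random.choice()` but tolerate empty lists; a small self-contained demonstration (the functions are restated so the snippet runs on its own):

```python
import random

def randel(l):
    # Pick a random element, or None if the list is empty
    if not l:
        return None
    return l[random.randint(0, len(l) - 1)]

def randel_pop(l):
    # Remove and return a random element, or None if the list is empty
    if not l:
        return None
    return l.pop(random.randint(0, len(l) - 1))

items = [1, 2, 3]
picked = randel_pop(items)  # items now has two elements left
```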

python_mutators/example.py
@ -0,0 +1,103 @@
#!/usr/bin/env python
# encoding: utf-8
'''
Example Python Module for AFLFuzz
@author: Christian Holler (:decoder)
@license:
This Source Code Form is subject to the terms of the Mozilla Public
License, v. 2.0. If a copy of the MPL was not distributed with this
file, You can obtain one at http://mozilla.org/MPL/2.0/.
@contact: choller@mozilla.com
'''
import random
def init(seed):
'''
Called once when AFLFuzz starts up. Used to seed our RNG.
@type seed: int
@param seed: A 32-bit random value
'''
random.seed(seed)
return 0
def fuzz(buf, add_buf):
'''
Called per fuzzing iteration.
@type buf: bytearray
@param buf: The buffer that should be mutated.
@type add_buf: bytearray
@param add_buf: A second buffer that can be used as mutation source.
@rtype: bytearray
@return: A new bytearray containing the mutated data
'''
ret = bytearray(buf)
# Do something interesting with ret
return ret
# Uncomment and implement the following methods if you want to use a custom
# trimming algorithm. See also the documentation for a better API description.
# def init_trim(buf):
# '''
# Called per trimming iteration.
#
# @type buf: bytearray
# @param buf: The buffer that should be trimmed.
#
# @rtype: int
# @return: The maximum number of trimming steps.
# '''
# global ...
#
# # Initialize global variables
#
# # Figure out how many trimming steps are possible.
# # If this is not possible for your trimming, you can
# # return 1 instead and always return 0 in post_trim
# # until you are done (then you return 1).
#
# return steps
#
# def trim():
# '''
# Called per trimming iteration.
#
# @rtype: bytearray
# @return: A new bytearray containing the trimmed data.
# '''
# global ...
#
# # Implement the actual trimming here
#
# return bytearray(...)
#
# def post_trim(success):
# '''
# Called after each trimming operation.
#
# @type success: bool
# @param success: Indicates if the last trim operation was successful.
#
# @rtype: int
# @return: The next trim index (0 to max number of steps) where max
# number of steps indicates the trimming is done.
# '''
# global ...
#
# if not success:
# # Restore last known successful input, determine next index
# else:
# # Just determine the next index, based on what was successfully
# # removed in the last step
#
# return next_index


@ -0,0 +1,59 @@
#!/usr/bin/env python
# encoding: utf-8
'''
Simple Chunk Cross-Over Replacement Module for AFLFuzz
@author: Christian Holler (:decoder)
@license:
This Source Code Form is subject to the terms of the Mozilla Public
License, v. 2.0. If a copy of the MPL was not distributed with this
file, You can obtain one at http://mozilla.org/MPL/2.0/.
@contact: choller@mozilla.com
'''
import random
def init(seed):
'''
Called once when AFLFuzz starts up. Used to seed our RNG.
@type seed: int
@param seed: A 32-bit random value
'''
# Seed our RNG
random.seed(seed)
return 0
def fuzz(buf, add_buf):
'''
Called per fuzzing iteration.
@type buf: bytearray
@param buf: The buffer that should be mutated.
@type add_buf: bytearray
@param add_buf: A second buffer that can be used as mutation source.
@rtype: bytearray
@return: A new bytearray containing the mutated data
'''
# Make a copy of our input buffer for returning
ret = bytearray(buf)
# Take a random fragment length between 1 and 32 (or less if add_buf is shorter)
fragment_len = random.randint(1, min(len(add_buf), 32))
# Determine a random source index where to take the data chunk from
rand_src_idx = random.randint(0, len(add_buf) - fragment_len)
# Determine a random destination index where to put the data chunk
rand_dst_idx = random.randint(0, len(buf))
# Make the chunk replacement
ret[rand_dst_idx:rand_dst_idx + fragment_len] = add_buf[rand_src_idx:rand_src_idx + fragment_len]
# Return data
return ret
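The splice above is easy to sanity-check in isolation; this standalone rerun of the same logic (the function name is ours) verifies that the output only ever contains bytes from the two inputs:

```python
import random

def chunk_splice(buf, add_buf):
    # Same steps as fuzz() above: pick a fragment of add_buf,
    # then overwrite/insert it at a random position in a copy of buf.
    ret = bytearray(buf)
    fragment_len = random.randint(1, min(len(add_buf), 32))
    src = random.randint(0, len(add_buf) - fragment_len)
    dst = random.randint(0, len(buf))
    ret[dst:dst + fragment_len] = add_buf[src:src + fragment_len]
    return ret

random.seed(1234)
out = chunk_splice(bytearray(b"AAAABBBBCCCC"), bytearray(b"xyz"))
```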


@ -0,0 +1,117 @@
#!/usr/bin/env python
from XmlMutatorMin import XmlMutatorMin
# Default settings (production mode)
__mutator__ = None
__seed__ = "RANDOM"
__log__ = False
__log_file__ = "wrapper.log"
# AFL functions
def log(text):
"""
Logger
"""
global __seed__
global __log__
global __log_file__
if __log__:
with open(__log_file__, "a") as logf:
logf.write("[%s] %s\n" % (__seed__, text))
def init(seed):
"""
Called once when AFL starts up. Seed is used to identify the AFL instance in log files
"""
global __mutator__
global __seed__
# Get the seed
__seed__ = seed
# Create a global mutation class
try:
__mutator__ = XmlMutatorMin(__seed__, verbose=__log__)
log("init(): Mutator created")
except RuntimeError as e:
log("init(): Can't create mutator: %s" % e.message)
def fuzz(buf, add_buf):
"""
Called for each fuzzing iteration.
"""
global __mutator__
# Do we have a working mutator object?
if __mutator__ is None:
log("fuzz(): Can't fuzz, no mutator available")
return buf
# Try to use the AFL buffer
via_buffer = True
# Interpret the AFL buffer (an array of bytes) as a string
if via_buffer:
try:
buf_str = str(buf)
log("fuzz(): AFL buffer converted to a string")
except:
via_buffer = False
log("fuzz(): Can't convert AFL buffer to a string")
# Load XML from the AFL string
if via_buffer:
try:
__mutator__.init_from_string(buf_str)
log("fuzz(): Mutator successfully initialized with AFL buffer (%d bytes)" % len(buf_str))
except:
via_buffer = False
log("fuzz(): Can't initialize mutator with AFL buffer")
# If init from AFL buffer wasn't successful
if not via_buffer:
log("fuzz(): Returning unmodified AFL buffer")
return buf
# Successful initialization -> mutate
try:
__mutator__.mutate(max=5)
log("fuzz(): Input mutated")
except:
log("fuzz(): Can't mutate input => returning buf")
return buf
# Convert mutated data to an array of bytes
try:
data = bytearray(__mutator__.save_to_string())
log("fuzz(): Mutated data converted as bytes")
except:
log("fuzz(): Can't convert mutated data to bytes => returning buf")
return buf
# Everything went fine, returning mutated content
log("fuzz(): Returning %d bytes" % len(data))
return data
# Main (for debug)
if __name__ == '__main__':
__log__ = True
__log_file__ = "/dev/stdout"
__seed__ = "RANDOM"
init(__seed__)
in_1 = bytearray("<foo ddd='eeee'>ffff<a b='c' d='456' eee='ffffff'>zzzzzzzzzzzz</a><b yyy='YYY' zzz='ZZZ'></b></foo>")
in_2 = bytearray("<abc abc123='456' abcCBA='ppppppppppppppppppppppppppppp'/>")
out = fuzz(in_1, in_2)
print(out)


@ -117,7 +117,7 @@ program control flow without actually executing each and every code path.
If you want to experiment with this mode of operation, there is a module
contributed by Aleksandar Nikolich:
https://github.com/vrtadmin/moflow/tree/master/afl-dyninst
https://github.com/vanhauser-thc/afl-dyninst
https://groups.google.com/forum/#!topic/afl-users/HlSQdbOTlpg
At this point, the author reports the possibility of hiccups with stripped


@ -133,7 +133,7 @@ patch -p1 <../patches/cpu-exec.diff || exit 1
patch -p1 <../patches/syscall.diff || exit 1
patch -p1 <../patches/translate-all.diff || exit 1
patch -p1 <../patches/tcg.diff || exit 1
patch -p1 <../patches/elfload2.diff || exit 1
patch -p1 <../patches/i386-translate.diff || exit 1
echo "[+] Patching done."


@ -0,0 +1,42 @@
#
# american fuzzy lop - libcompcov
# --------------------------------
#
# Written by Andrea Fioraldi <andreafioraldi@gmail.com>
#
# Copyright 2019 Andrea Fioraldi. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at:
#
# http://www.apache.org/licenses/LICENSE-2.0
#
PREFIX ?= /usr/local
HELPER_PATH = $(PREFIX)/lib/afl
VERSION = $(shell grep '^\#define VERSION ' ../config.h | cut -d '"' -f2)
CFLAGS ?= -O3 -funroll-loops
CFLAGS += -Wall -Wno-unused-result -D_FORTIFY_SOURCE=2 -g -Wno-pointer-sign
LDFLAGS += -ldl
all: libcompcov.so compcovtest
libcompcov.so: libcompcov.so.c ../../config.h
$(CC) $(CFLAGS) -shared -fPIC $< -o $@ $(LDFLAGS)
.NOTPARALLEL: clean
clean:
rm -f *.o *.so *~ a.out core core.[1-9][0-9]*
rm -f libcompcov.so compcovtest
compcovtest: compcovtest.cc
$(CXX) $< -o $@
install: all
install -m 755 libcompcov.so $${DESTDIR}$(HELPER_PATH)
install -m 644 README.compcov $${DESTDIR}$(HELPER_PATH)


@ -0,0 +1,33 @@
================================================================
strcmp() / memcmp() CompareCoverage library for AFLplusplus-QEMU
================================================================
Written by Andrea Fioraldi <andreafioraldi@gmail.com>
This Linux-only companion library allows you to instrument strcmp(), memcmp(),
and related functions to log the CompareCoverage of these libcalls.
Use this with caution. While it can greatly speed up the bypassing of hard
branch conditions, it can also waste a lot of time and take up unnecessary space
in the shared memory when logging the coverage related to functions that
don't process input-related data.
To use the library, you *need* to make sure that your fuzzing target is linked
dynamically and makes use of strcmp(), memcmp(), and related functions.
For optimized binaries this is an issue: those functions are often inlined,
and this module is not capable of logging the coverage in that case.
If you have the source code of the fuzzing target, you should not use this
library and QEMU; instead, build it with afl-clang-fast and the laf-intel options.
To use this library make sure to preload it with AFL_PRELOAD.
export AFL_PRELOAD=/path/to/libcompcov.so
export AFL_QEMU_COMPCOV=1
afl-fuzz -Q -i input -o output <your options> -- <target args>
The library makes use of https://github.com/ouadev/proc_maps_parser and so is
Linux-specific. However, this is not a strict dependency; other UNIX operating
systems can be supported by simply replacing the code related to the
/proc/self/maps parsing.


@ -0,0 +1,63 @@
/////////////////////////////////////////////////////////////////////////
//
// Author: Mateusz Jurczyk (mjurczyk@google.com)
//
// Copyright 2019 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// https://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
// solution: echo -ne 'The quick brown fox jumps over the lazy dog\xbe\xba\xfe\xca\xbe\xba\xfe\xca\xde\xc0\xad\xde\xef\xbe' | ./compcovtest
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <cstring>
int main() {
char buffer[44] = { /* zero padding */ };
fread(buffer, 1, sizeof(buffer) - 1, stdin);
if (memcmp(&buffer[0], "The quick brown fox ", 20) != 0 ||
strncmp(&buffer[20], "jumps over ", 11) != 0 ||
strcmp(&buffer[31], "the lazy dog") != 0) {
return 1;
}
uint64_t x = 0;
fread(&x, sizeof(x), 1, stdin);
if (x != 0xCAFEBABECAFEBABE) {
return 2;
}
uint32_t y = 0;
fread(&y, sizeof(y), 1, stdin);
if (y != 0xDEADC0DE) {
return 3;
}
uint16_t z = 0;
fread(&z, sizeof(z), 1, stdin);
switch (z) {
case 0xBEEF:
break;
default:
return 4;
}
printf("Puzzle solved, congrats!\n");
abort();
return 0;
}
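The "solution" input from the comment at the top of the file can be rebuilt programmatically. Assuming a little-endian target, the three magic integers pack to exactly the escape bytes in the echo command:

```python
import struct

# 43 text bytes: fills buffer[44] up to the terminating NUL
payload = b"The quick brown fox jumps over the lazy dog"
payload += struct.pack("<Q", 0xCAFEBABECAFEBABE)  # uint64_t x
payload += struct.pack("<I", 0xDEADC0DE)          # uint32_t y
payload += struct.pack("<H", 0xBEEF)              # uint16_t z
```

Feeding `payload` to `./compcovtest` on stdin reaches the abort() at the end.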


@ -0,0 +1,279 @@
/*
american fuzzy lop++ - strcmp() / memcmp() CompareCoverage library
------------------------------------------------------------------
Written and maintained by Andrea Fioraldi <andreafioraldi@gmail.com>
Copyright 2019 Andrea Fioraldi. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at:
http://www.apache.org/licenses/LICENSE-2.0
This Linux-only companion library allows you to instrument strcmp(),
memcmp(), and related functions to get compare coverage.
See README.compcov for more info.
*/
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>
#include <string.h>
#include <ctype.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/shm.h>
#include "../../types.h"
#include "../../config.h"
#include "pmparser.h"
#ifndef __linux__
# error "Sorry, this library is Linux-specific for now!"
#endif /* !__linux__ */
/* Change this value to tune the compare coverage */
#define MAX_CMP_LENGTH 32
static void *__compcov_code_start,
*__compcov_code_end;
static u8 *__compcov_afl_map;
static int (*__libc_strcmp)(const char*, const char*);
static int (*__libc_strncmp)(const char*, const char*, size_t);
static int (*__libc_strcasecmp)(const char*, const char*);
static int (*__libc_strncasecmp)(const char*, const char*, size_t);
static int (*__libc_memcmp)(const void*, const void*, size_t);
static int debug_fd = -1;
static size_t __strlen2(const char *s1, const char *s2, size_t max_length) {
// from https://github.com/googleprojectzero/CompareCoverage
size_t len = 0;
for (; len < max_length && s1[len] != '\0' && s2[len] != '\0'; len++) { }
return len;
}
/* Identify the binary boundaries in the memory mapping */
static void __compcov_load(void) {
__libc_strcmp = dlsym(RTLD_NEXT, "strcmp");
__libc_strncmp = dlsym(RTLD_NEXT, "strncmp");
__libc_strcasecmp = dlsym(RTLD_NEXT, "strcasecmp");
__libc_strncasecmp = dlsym(RTLD_NEXT, "strncasecmp");
__libc_memcmp = dlsym(RTLD_NEXT, "memcmp");
char *id_str = getenv(SHM_ENV_VAR);
int shm_id;
if (id_str) {
shm_id = atoi(id_str);
__compcov_afl_map = shmat(shm_id, NULL, 0);
if (__compcov_afl_map == (void*)-1) exit(1);
} else {
__compcov_afl_map = calloc(1, MAP_SIZE);
}
if (getenv("AFL_INST_LIBS")) {
__compcov_code_start = (void*)0;
__compcov_code_end = (void*)-1;
return;
}
char* bin_name = getenv("AFL_COMPCOV_BINNAME");
procmaps_iterator* maps = pmparser_parse(-1);
procmaps_struct* maps_tmp = NULL;
while ((maps_tmp = pmparser_next(maps)) != NULL) {
/* If AFL_COMPCOV_BINNAME is not set pick the first executable segment */
if (!bin_name || strstr(maps_tmp->pathname, bin_name) != NULL) {
if (maps_tmp->is_x) {
if (!__compcov_code_start)
__compcov_code_start = maps_tmp->addr_start;
if (!__compcov_code_end)
__compcov_code_end = maps_tmp->addr_end;
}
}
}
pmparser_free(maps);
}
static void __compcov_trace(u64 cur_loc, const u8* v0, const u8* v1, size_t n) {
size_t i;
if (debug_fd != -1) {
char debugbuf[4096];
snprintf(debugbuf, sizeof(debugbuf), "0x%llx %s %s %lu\n", cur_loc, v0 == NULL ? "(null)" : (char*)v0, v1 == NULL ? "(null)" : (char*)v1, n);
write(debug_fd, debugbuf, strlen(debugbuf));
}
for (i = 0; i < n && v0[i] == v1[i]; ++i) {
__compcov_afl_map[cur_loc +i]++;
}
}
/* Check whether an address falls inside the instrumented code boundaries. */
static u8 __compcov_is_in_bound(const void* ptr) {
return ptr >= __compcov_code_start && ptr < __compcov_code_end;
}
/* Replacements for strcmp(), memcmp(), and so on. Note that these will be used
only if the target is compiled with -fno-builtin and linked dynamically. */
#undef strcmp
int strcmp(const char* str1, const char* str2) {
void* retaddr = __builtin_return_address(0);
if (__compcov_is_in_bound(retaddr)) {
size_t n = __strlen2(str1, str2, MAX_CMP_LENGTH +1);
if (n <= MAX_CMP_LENGTH) {
u64 cur_loc = (u64)retaddr;
cur_loc = (cur_loc >> 4) ^ (cur_loc << 8);
cur_loc &= MAP_SIZE - 1;
__compcov_trace(cur_loc, str1, str2, n);
}
}
return __libc_strcmp(str1, str2);
}
#undef strncmp
int strncmp(const char* str1, const char* str2, size_t len) {
void* retaddr = __builtin_return_address(0);
if (__compcov_is_in_bound(retaddr)) {
size_t n = __strlen2(str1, str2, MAX_CMP_LENGTH +1);
n = MIN(n, len);
if (n <= MAX_CMP_LENGTH) {
u64 cur_loc = (u64)retaddr;
cur_loc = (cur_loc >> 4) ^ (cur_loc << 8);
cur_loc &= MAP_SIZE - 1;
__compcov_trace(cur_loc, str1, str2, n);
}
}
return __libc_strncmp(str1, str2, len);
}
#undef strcasecmp
int strcasecmp(const char* str1, const char* str2) {
void* retaddr = __builtin_return_address(0);
if (__compcov_is_in_bound(retaddr)) {
/* Fallback to strcmp, maybe improve in future */
size_t n = __strlen2(str1, str2, MAX_CMP_LENGTH +1);
if (n <= MAX_CMP_LENGTH) {
u64 cur_loc = (u64)retaddr;
cur_loc = (cur_loc >> 4) ^ (cur_loc << 8);
cur_loc &= MAP_SIZE - 1;
__compcov_trace(cur_loc, str1, str2, n);
}
}
return __libc_strcasecmp(str1, str2);
}
#undef strncasecmp
int strncasecmp(const char* str1, const char* str2, size_t len) {
void* retaddr = __builtin_return_address(0);
if (__compcov_is_in_bound(retaddr)) {
/* Fallback to strncmp, maybe improve in future */
size_t n = __strlen2(str1, str2, MAX_CMP_LENGTH +1);
n = MIN(n, len);
if (n <= MAX_CMP_LENGTH) {
u64 cur_loc = (u64)retaddr;
cur_loc = (cur_loc >> 4) ^ (cur_loc << 8);
cur_loc &= MAP_SIZE - 1;
__compcov_trace(cur_loc, str1, str2, n);
}
}
return __libc_strncasecmp(str1, str2, len);
}
#undef memcmp
int memcmp(const void* mem1, const void* mem2, size_t len) {
void* retaddr = __builtin_return_address(0);
if (__compcov_is_in_bound(retaddr)) {
size_t n = len;
if (n <= MAX_CMP_LENGTH) {
u64 cur_loc = (u64)retaddr;
cur_loc = (cur_loc >> 4) ^ (cur_loc << 8);
cur_loc &= MAP_SIZE - 1;
__compcov_trace(cur_loc, mem1, mem2, n);
}
}
return __libc_memcmp(mem1, mem2, len);
}
/* Constructor code to initialize the library. */
__attribute__((constructor)) void __compcov_init(void) {
if (getenv("AFL_QEMU_COMPCOV_DEBUG") != NULL)
debug_fd = open("compcov.debug", O_WRONLY | O_CREAT | O_TRUNC | O_SYNC, 0644);
__compcov_load();
}
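All five wrappers share the same two-step scheme: hash the call site's return address into a map slot, then bump one counter per matching prefix byte. A Python sketch of that logic (MAP_SIZE set to AFL's usual 64 KB default; the sample address is arbitrary):

```python
MAP_SIZE = 1 << 16  # AFL's default map size

def compcov_index(retaddr):
    # Same mixing as the C code: (addr >> 4) ^ (addr << 8), masked to the map
    return ((retaddr >> 4) ^ (retaddr << 8)) & (MAP_SIZE - 1)

def compcov_trace(afl_map, loc, v0, v1, n):
    # One counter increment per matching prefix byte, as in __compcov_trace()
    i = 0
    while i < n and v0[i] == v1[i]:
        afl_map[(loc + i) & (MAP_SIZE - 1)] += 1  # masked here for safety
        i += 1

afl_map = bytearray(MAP_SIZE)
loc = compcov_index(0x400B2D)
compcov_trace(afl_map, loc, b"foobar", b"foobaz", 6)  # 5 matching bytes
```

This is why CompCov turns a single hard comparison into a gradient the fuzzer can climb: each extra matching byte lights up one more map slot.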


@ -0,0 +1,280 @@
/*
@Author : ouadimjamal@gmail.com
@date : December 2015
Permission to use, copy, modify, distribute, and sell this software and its
documentation for any purpose is hereby granted without fee, provided that
the above copyright notice appear in all copies and that both that
copyright notice and this permission notice appear in supporting
documentation. No representations are made about the suitability of this
software for any purpose. It is provided "as is" without express or
implied warranty.
*/
#ifndef H_PMPARSER
#define H_PMPARSER
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <errno.h>
#include <linux/limits.h>
//maximum line length in a procmaps file
#define PROCMAPS_LINE_MAX_LENGTH (PATH_MAX + 100)
/**
* procmaps_struct
* @desc hold all the information about an area in the process's VM
*/
typedef struct procmaps_struct{
void* addr_start; //< start address of the area
void* addr_end; //< end address
unsigned long length; //< size of the range
char perm[5]; //< permissions rwxp
short is_r; //< perm broken out into individual flags
short is_w;
short is_x;
short is_p;
long offset; //< offset
char dev[12]; //< dev major:minor
int inode; //< inode of the file that backs the area
char pathname[600]; //< the path of the file that backs the area
//chained list
struct procmaps_struct* next; //< next node of the chained list
} procmaps_struct;
/**
* procmaps_iterator
* @desc holds iterating information
*/
typedef struct procmaps_iterator{
procmaps_struct* head;
procmaps_struct* current;
} procmaps_iterator;
/**
* pmparser_parse
* @param pid the process id whose memory map is to be parsed; the current process if pid<0
* @return an iterator over all the nodes
*/
procmaps_iterator* pmparser_parse(int pid);
/**
* pmparser_next
* @description move between areas
* @param p_procmaps_it the iterator to move one step in the chained list
* @return a procmaps structure filled with information about this VM area
*/
procmaps_struct* pmparser_next(procmaps_iterator* p_procmaps_it);
/**
* pmparser_free
* @description should be called at the end to free the resources
* @param p_procmaps_it the iterator structure returned by pmparser_parse
*/
void pmparser_free(procmaps_iterator* p_procmaps_it);
/**
* _pmparser_split_line
* @description internal usage
*/
void _pmparser_split_line(char*buf,char*addr1,char*addr2,char*perm, char* offset, char* device,char*inode,char* pathname);
/**
* pmparser_print
* @param map the head of the list
* @param order the order of the area to print, -1 to print everything
*/
void pmparser_print(procmaps_struct* map,int order);
/**
* global variables
*/
//procmaps_struct* g_last_head=NULL;
//procmaps_struct* g_current=NULL;
procmaps_iterator* pmparser_parse(int pid){
procmaps_iterator* maps_it = malloc(sizeof(procmaps_iterator));
char maps_path[500];
if(pid>=0 ){
sprintf(maps_path,"/proc/%d/maps",pid);
}else{
sprintf(maps_path,"/proc/self/maps");
}
FILE* file=fopen(maps_path,"r");
if(!file){
fprintf(stderr,"pmparser : cannot open the memory maps, %s\n",strerror(errno));
return NULL;
}
int ind=0;char buf[PROCMAPS_LINE_MAX_LENGTH];
//int c;
procmaps_struct* list_maps=NULL;
procmaps_struct* tmp;
procmaps_struct* current_node=list_maps;
char addr1[20],addr2[20], perm[8], offset[20], dev[10],inode[30],pathname[PATH_MAX];
while( fgets(buf,PROCMAPS_LINE_MAX_LENGTH,file) != NULL ){
//allocate a node
tmp=(procmaps_struct*)malloc(sizeof(procmaps_struct));
//fill the node
_pmparser_split_line(buf,addr1,addr2,perm,offset, dev,inode,pathname);
//printf("#%s",buf);
//printf("%s-%s %s %s %s %s\t%s\n",addr1,addr2,perm,offset,dev,inode,pathname);
//addr_start & addr_end
//unsigned long l_addr_start;
sscanf(addr1,"%lx",(long unsigned *)&tmp->addr_start );
sscanf(addr2,"%lx",(long unsigned *)&tmp->addr_end );
//size
tmp->length=(unsigned long)(tmp->addr_end-tmp->addr_start);
//perm
strcpy(tmp->perm,perm);
tmp->is_r=(perm[0]=='r');
tmp->is_w=(perm[1]=='w');
tmp->is_x=(perm[2]=='x');
tmp->is_p=(perm[3]=='p');
//offset
sscanf(offset,"%lx",&tmp->offset );
//device
strcpy(tmp->dev,dev);
//inode
tmp->inode=atoi(inode);
//pathname
strcpy(tmp->pathname,pathname);
tmp->next=NULL;
//attach the node
if(ind==0){
list_maps=tmp;
list_maps->next=NULL;
current_node=list_maps;
}
current_node->next=tmp;
current_node=tmp;
ind++;
//printf("%s",buf);
}
//close file
fclose(file);
//g_last_head=list_maps;
maps_it->head = list_maps;
maps_it->current = list_maps;
return maps_it;
}
procmaps_struct* pmparser_next(procmaps_iterator* p_procmaps_it){
if(p_procmaps_it->current == NULL)
return NULL;
procmaps_struct* p_current = p_procmaps_it->current;
p_procmaps_it->current = p_procmaps_it->current->next;
return p_current;
/*
if(g_current==NULL){
g_current=g_last_head;
}else
g_current=g_current->next;
return g_current;
*/
}
void pmparser_free(procmaps_iterator* p_procmaps_it){
procmaps_struct* maps_list = p_procmaps_it->head;
if(maps_list==NULL) return ;
procmaps_struct* act=maps_list;
procmaps_struct* nxt=act->next;
while(act!=NULL){
free(act);
act=nxt;
if(nxt!=NULL)
nxt=nxt->next;
}
}
void _pmparser_split_line(
char*buf,char*addr1,char*addr2,
char*perm,char* offset,char* device,char*inode,
char* pathname){
//
int orig=0;
int i=0;
//addr1
while(buf[i]!='-'){
addr1[i-orig]=buf[i];
i++;
}
addr1[i]='\0';
i++;
//addr2
orig=i;
while(buf[i]!='\t' && buf[i]!=' '){
addr2[i-orig]=buf[i];
i++;
}
addr2[i-orig]='\0';
//perm
while(buf[i]=='\t' || buf[i]==' ')
i++;
orig=i;
while(buf[i]!='\t' && buf[i]!=' '){
perm[i-orig]=buf[i];
i++;
}
perm[i-orig]='\0';
//offset
while(buf[i]=='\t' || buf[i]==' ')
i++;
orig=i;
while(buf[i]!='\t' && buf[i]!=' '){
offset[i-orig]=buf[i];
i++;
}
offset[i-orig]='\0';
//dev
while(buf[i]=='\t' || buf[i]==' ')
i++;
orig=i;
while(buf[i]!='\t' && buf[i]!=' '){
device[i-orig]=buf[i];
i++;
}
device[i-orig]='\0';
//inode
while(buf[i]=='\t' || buf[i]==' ')
i++;
orig=i;
while(buf[i]!='\t' && buf[i]!=' '){
inode[i-orig]=buf[i];
i++;
}
inode[i-orig]='\0';
//pathname
pathname[0]='\0';
while(buf[i]=='\t' || buf[i]==' ')
i++;
orig=i;
while(buf[i]!='\t' && buf[i]!=' ' && buf[i]!='\n'){
pathname[i-orig]=buf[i];
i++;
}
pathname[i-orig]='\0';
}
#endif
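The character-by-character field splitting above can be cross-checked against a much shorter whitespace-split version; this Python sketch (our own helper, with a made-up sample line) extracts the same fields:

```python
def parse_maps_line(line):
    # Field order matches _pmparser_split_line():
    # addresses, perms, offset, dev, inode, optional pathname
    fields = line.split(None, 5)
    addr_start, addr_end = (int(a, 16) for a in fields[0].split("-"))
    return {
        "addr_start": addr_start,
        "addr_end": addr_end,
        "length": addr_end - addr_start,
        "perm": fields[1],
        "offset": int(fields[2], 16),
        "dev": fields[3],
        "inode": int(fields[4]),
        "pathname": fields[5].strip() if len(fields) > 5 else "",
    }

entry = parse_maps_line("00400000-00452000 r-xp 00000000 08:02 173521 /usr/bin/dbus-daemon\n")
```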


@ -10,6 +10,9 @@
TCG instrumentation and block chaining support by Andrea Biondo
<andrea.biondo965@gmail.com>
QEMU 3.1.0 port, TCG thread-safety and CompareCoverage by Andrea Fioraldi
<andreafioraldi@gmail.com>
Copyright 2015, 2016, 2017 Google Inc. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License");
@ -19,7 +22,7 @@
http://www.apache.org/licenses/LICENSE-2.0
This code is a shim patched into the separately-distributed source
code of QEMU 2.10.0. It leverages the built-in QEMU tracing functionality
code of QEMU 3.1.0. It leverages the built-in QEMU tracing functionality
to implement AFL-style instrumentation and to take care of the remaining
parts of the AFL fork server logic.
@ -63,6 +66,8 @@ abi_ulong afl_entry_point, /* ELF entry point (_start) */
afl_start_code, /* .text start pointer */
afl_end_code; /* .text end pointer */
u8 afl_enable_compcov;
/* Set in the child process in forkserver mode: */
static int forkserver_installed = 0;
@ -145,7 +150,6 @@ static void afl_setup(void) {
if (inst_r) afl_area_ptr[0] = 1;
}
if (getenv("AFL_INST_LIBS")) {
@ -154,6 +158,11 @@ static void afl_setup(void) {
afl_end_code = (abi_ulong)-1;
}
if (getenv("AFL_QEMU_COMPCOV")) {
afl_enable_compcov = 1;
}
/* pthread_atfork() seems somewhat broken in util/rcu.c, and I'm
not entirely sure what is the cause. This disables that
@ -269,6 +278,25 @@ static void afl_request_tsl(target_ulong pc, target_ulong cb, uint32_t flags, ui
}
/* Check if an address is valid in the current mapping */
static inline int is_valid_addr(target_ulong addr) {
int l, flags;
target_ulong page;
void * p;
page = addr & TARGET_PAGE_MASK;
l = (page + TARGET_PAGE_SIZE) - addr;
flags = page_get_flags(page);
if (!(flags & PAGE_VALID) || !(flags & PAGE_READ))
return 0;
return 1;
}
/* This is the other side of the same channel. Since timeouts are handled by
afl-fuzz simply killing the child, we can just wait until the pipe breaks. */
@@ -280,6 +308,8 @@ static void afl_wait_tsl(CPUState *cpu, int fd) {
while (1) {
u8 invalid_pc = 0;
/* Broken pipe means it's time to return to the fork server routine. */
if (read(fd, &t, sizeof(struct afl_tsl)) != sizeof(struct afl_tsl))
@@ -288,19 +318,34 @@ static void afl_wait_tsl(CPUState *cpu, int fd) {
tb = tb_htable_lookup(cpu, t.tb.pc, t.tb.cs_base, t.tb.flags, t.tb.cf_mask);
if(!tb) {
mmap_lock();
tb = tb_gen_code(cpu, t.tb.pc, t.tb.cs_base, t.tb.flags, 0);
mmap_unlock();
/* The child may request to translate a block of memory that is not
mapped in the parent (e.g. jitted code or dlopened code).
This causes a SIGSEGV in gen_intermediate_code() and associated
subroutines. We simply avoid caching such blocks. */
if (is_valid_addr(t.tb.pc)) {
mmap_lock();
tb = tb_gen_code(cpu, t.tb.pc, t.tb.cs_base, t.tb.flags, 0);
mmap_unlock();
} else {
invalid_pc = 1;
}
}
if (t.is_chain) {
if (read(fd, &c, sizeof(struct afl_chain)) != sizeof(struct afl_chain))
break;
last_tb = tb_htable_lookup(cpu, c.last_tb.pc, c.last_tb.cs_base,
c.last_tb.flags, c.cf_mask);
if (last_tb) {
tb_add_jump(last_tb, c.tb_exit, tb);
if (!invalid_pc) {
last_tb = tb_htable_lookup(cpu, c.last_tb.pc, c.last_tb.cs_base,
c.last_tb.flags, c.cf_mask);
if (last_tb) {
tb_add_jump(last_tb, c.tb_exit, tb);
}
}
}


@@ -0,0 +1,125 @@
/*
american fuzzy lop - high-performance binary-only instrumentation
-----------------------------------------------------------------
Written by Andrew Griffiths <agriffiths@google.com> and
Michal Zalewski <lcamtuf@google.com>
Idea & design very much by Andrew Griffiths.
TCG instrumentation and block chaining support by Andrea Biondo
<andrea.biondo965@gmail.com>
QEMU 3.1.0 port, TCG thread-safety and CompareCoverage by Andrea Fioraldi
<andreafioraldi@gmail.com>
Copyright 2015, 2016, 2017 Google Inc. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at:
http://www.apache.org/licenses/LICENSE-2.0
This code is a shim patched into the separately-distributed source
code of QEMU 3.1.0. It leverages the built-in QEMU tracing functionality
to implement AFL-style instrumentation and to take care of the remaining
parts of the AFL fork server logic.
The resulting QEMU binary is essentially a standalone instrumentation
tool; for an example of how to leverage it for other purposes, you can
have a look at afl-showmap.c.
*/
#include "../../config.h"
#include "tcg.h"
#include "tcg-op.h"
/* Declared in afl-qemu-cpu-inl.h */
extern unsigned char *afl_area_ptr;
extern unsigned int afl_inst_rms;
extern abi_ulong afl_start_code, afl_end_code;
extern u8 afl_enable_compcov;
void tcg_gen_afl_compcov_log_call(void *func, target_ulong cur_loc,
TCGv_i64 arg1, TCGv_i64 arg2);
static void afl_compcov_log_16(target_ulong cur_loc, target_ulong arg1,
target_ulong arg2) {
if ((arg1 & 0xff) == (arg2 & 0xff)) {
afl_area_ptr[cur_loc]++;
}
}
static void afl_compcov_log_32(target_ulong cur_loc, target_ulong arg1,
target_ulong arg2) {
if ((arg1 & 0xff) == (arg2 & 0xff)) {
afl_area_ptr[cur_loc]++;
if ((arg1 & 0xffff) == (arg2 & 0xffff)) {
afl_area_ptr[cur_loc +1]++;
if ((arg1 & 0xffffff) == (arg2 & 0xffffff)) {
afl_area_ptr[cur_loc +2]++;
}
}
}
}
static void afl_compcov_log_64(target_ulong cur_loc, target_ulong arg1,
target_ulong arg2) {
if ((arg1 & 0xff) == (arg2 & 0xff)) {
afl_area_ptr[cur_loc]++;
if ((arg1 & 0xffff) == (arg2 & 0xffff)) {
afl_area_ptr[cur_loc +1]++;
if ((arg1 & 0xffffff) == (arg2 & 0xffffff)) {
afl_area_ptr[cur_loc +2]++;
if ((arg1 & 0xffffffff) == (arg2 & 0xffffffff)) {
afl_area_ptr[cur_loc +3]++;
if ((arg1 & 0xffffffffff) == (arg2 & 0xffffffffff)) {
afl_area_ptr[cur_loc +4]++;
if ((arg1 & 0xffffffffffff) == (arg2 & 0xffffffffffff)) {
afl_area_ptr[cur_loc +5]++;
if ((arg1 & 0xffffffffffffff) == (arg2 & 0xffffffffffffff)) {
afl_area_ptr[cur_loc +6]++;
}
}
}
}
}
}
}
}
static void afl_gen_compcov(target_ulong cur_loc, TCGv_i64 arg1, TCGv_i64 arg2,
TCGMemOp ot) {
void *func;
if (!afl_enable_compcov || cur_loc > afl_end_code || cur_loc < afl_start_code)
return;
switch (ot) {
case MO_64:
func = &afl_compcov_log_64;
break;
case MO_32:
func = &afl_compcov_log_32;
break;
case MO_16:
func = &afl_compcov_log_16;
break;
default:
return;
}
cur_loc = (cur_loc >> 4) ^ (cur_loc << 8);
cur_loc &= MAP_SIZE - 1;
if (cur_loc >= afl_inst_rms) return;
tcg_gen_afl_compcov_log_call(func, cur_loc, arg1, arg2);
}


@@ -0,0 +1,306 @@
/*
american fuzzy lop - high-performance binary-only instrumentation
-----------------------------------------------------------------
Written by Andrew Griffiths <agriffiths@google.com> and
Michal Zalewski <lcamtuf@google.com>
Idea & design very much by Andrew Griffiths.
TCG instrumentation and block chaining support by Andrea Biondo
<andrea.biondo965@gmail.com>
QEMU 3.1.0 port, TCG thread-safety and CompareCoverage by Andrea Fioraldi
<andreafioraldi@gmail.com>
Copyright 2015, 2016, 2017 Google Inc. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at:
http://www.apache.org/licenses/LICENSE-2.0
This code is a shim patched into the separately-distributed source
code of QEMU 3.1.0. It leverages the built-in QEMU tracing functionality
to implement AFL-style instrumentation and to take care of the remaining
parts of the AFL fork server logic.
The resulting QEMU binary is essentially a standalone instrumentation
tool; for an example of how to leverage it for other purposes, you can
have a look at afl-showmap.c.
*/
void afl_maybe_log(void* cur_loc);
/* Note: we convert the 64 bit args to 32 bit and do some alignment
and endian swap. Maybe it would be better to do the alignment
and endian swap in tcg_reg_alloc_call(). */
void tcg_gen_afl_maybe_log_call(target_ulong cur_loc)
{
int real_args, pi;
unsigned sizemask, flags;
TCGOp *op;
TCGTemp *arg = tcgv_i64_temp( tcg_const_tl(cur_loc) );
flags = 0;
sizemask = dh_sizemask(void, 0) | dh_sizemask(i64, 1);
#if defined(__sparc__) && !defined(__arch64__) \
&& !defined(CONFIG_TCG_INTERPRETER)
/* We have 64-bit values in one register, but need to pass as two
separate parameters. Split them. */
int orig_sizemask = sizemask;
TCGv_i64 retl, reth;
TCGTemp *split_args[MAX_OPC_PARAM];
retl = NULL;
reth = NULL;
if (sizemask != 0) {
real_args = 0;
int is_64bit = sizemask & (1 << 2);
if (is_64bit) {
TCGv_i64 orig = temp_tcgv_i64(arg);
TCGv_i32 h = tcg_temp_new_i32();
TCGv_i32 l = tcg_temp_new_i32();
tcg_gen_extr_i64_i32(l, h, orig);
split_args[real_args++] = tcgv_i32_temp(h);
split_args[real_args++] = tcgv_i32_temp(l);
} else {
split_args[real_args++] = arg;
}
nargs = real_args;
args = split_args;
sizemask = 0;
}
#elif defined(TCG_TARGET_EXTEND_ARGS) && TCG_TARGET_REG_BITS == 64
int is_64bit = sizemask & (1 << 2);
int is_signed = sizemask & (2 << 2);
if (!is_64bit) {
TCGv_i64 temp = tcg_temp_new_i64();
TCGv_i64 orig = temp_tcgv_i64(arg);
if (is_signed) {
tcg_gen_ext32s_i64(temp, orig);
} else {
tcg_gen_ext32u_i64(temp, orig);
}
arg = tcgv_i64_temp(temp);
}
#endif /* TCG_TARGET_EXTEND_ARGS */
op = tcg_emit_op(INDEX_op_call);
pi = 0;
TCGOP_CALLO(op) = 0;
real_args = 0;
int is_64bit = sizemask & (1 << 2);
if (TCG_TARGET_REG_BITS < 64 && is_64bit) {
#ifdef TCG_TARGET_CALL_ALIGN_ARGS
/* some targets want aligned 64 bit args */
if (real_args & 1) {
op->args[pi++] = TCG_CALL_DUMMY_ARG;
real_args++;
}
#endif
/* If stack grows up, then we will be placing successive
arguments at lower addresses, which means we need to
reverse the order compared to how we would normally
treat either big or little-endian. For those arguments
that will wind up in registers, this still works for
HPPA (the only current STACK_GROWSUP target) since the
argument registers are *also* allocated in decreasing
order. If another such target is added, this logic may
have to get more complicated to differentiate between
stack arguments and register arguments. */
#if defined(HOST_WORDS_BIGENDIAN) != defined(TCG_TARGET_STACK_GROWSUP)
op->args[pi++] = temp_arg(arg + 1);
op->args[pi++] = temp_arg(arg);
#else
op->args[pi++] = temp_arg(arg);
op->args[pi++] = temp_arg(arg + 1);
#endif
real_args += 2;
}
op->args[pi++] = temp_arg(arg);
real_args++;
op->args[pi++] = (uintptr_t)&afl_maybe_log;
op->args[pi++] = flags;
TCGOP_CALLI(op) = real_args;
/* Make sure the fields didn't overflow. */
tcg_debug_assert(TCGOP_CALLI(op) == real_args);
tcg_debug_assert(pi <= ARRAY_SIZE(op->args));
#if defined(__sparc__) && !defined(__arch64__) \
&& !defined(CONFIG_TCG_INTERPRETER)
/* Free all of the parts we allocated above. */
real_args = 0;
int is_64bit = orig_sizemask & (1 << 2);
if (is_64bit) {
tcg_temp_free_internal(args[real_args++]);
tcg_temp_free_internal(args[real_args++]);
} else {
real_args++;
}
if (orig_sizemask & 1) {
/* The 32-bit ABI returned two 32-bit pieces. Re-assemble them.
Note that describing these as TCGv_i64 eliminates an unnecessary
zero-extension that tcg_gen_concat_i32_i64 would create. */
tcg_gen_concat32_i64(temp_tcgv_i64(NULL), retl, reth);
tcg_temp_free_i64(retl);
tcg_temp_free_i64(reth);
}
#elif defined(TCG_TARGET_EXTEND_ARGS) && TCG_TARGET_REG_BITS == 64
int is_64bit = sizemask & (1 << 2);
if (!is_64bit) {
tcg_temp_free_internal(arg);
}
#endif /* TCG_TARGET_EXTEND_ARGS */
}
void tcg_gen_afl_compcov_log_call(void *func, target_ulong cur_loc, TCGv_i64 arg1, TCGv_i64 arg2)
{
int i, real_args, nb_rets, pi;
unsigned sizemask, flags;
TCGOp *op;
const int nargs = 3;
TCGTemp *args[3] = { tcgv_i64_temp( tcg_const_tl(cur_loc) ),
tcgv_i64_temp(arg1),
tcgv_i64_temp(arg2) };
flags = 0;
sizemask = dh_sizemask(void, 0) | dh_sizemask(i64, 1) |
dh_sizemask(i64, 2) | dh_sizemask(i64, 3);
#if defined(__sparc__) && !defined(__arch64__) \
&& !defined(CONFIG_TCG_INTERPRETER)
/* We have 64-bit values in one register, but need to pass as two
separate parameters. Split them. */
int orig_sizemask = sizemask;
int orig_nargs = nargs;
TCGv_i64 retl, reth;
TCGTemp *split_args[MAX_OPC_PARAM];
retl = NULL;
reth = NULL;
if (sizemask != 0) {
for (i = real_args = 0; i < nargs; ++i) {
int is_64bit = sizemask & (1 << (i+1)*2);
if (is_64bit) {
TCGv_i64 orig = temp_tcgv_i64(args[i]);
TCGv_i32 h = tcg_temp_new_i32();
TCGv_i32 l = tcg_temp_new_i32();
tcg_gen_extr_i64_i32(l, h, orig);
split_args[real_args++] = tcgv_i32_temp(h);
split_args[real_args++] = tcgv_i32_temp(l);
} else {
split_args[real_args++] = args[i];
}
}
nargs = real_args;
args = split_args;
sizemask = 0;
}
#elif defined(TCG_TARGET_EXTEND_ARGS) && TCG_TARGET_REG_BITS == 64
for (i = 0; i < nargs; ++i) {
int is_64bit = sizemask & (1 << (i+1)*2);
int is_signed = sizemask & (2 << (i+1)*2);
if (!is_64bit) {
TCGv_i64 temp = tcg_temp_new_i64();
TCGv_i64 orig = temp_tcgv_i64(args[i]);
if (is_signed) {
tcg_gen_ext32s_i64(temp, orig);
} else {
tcg_gen_ext32u_i64(temp, orig);
}
args[i] = tcgv_i64_temp(temp);
}
}
#endif /* TCG_TARGET_EXTEND_ARGS */
op = tcg_emit_op(INDEX_op_call);
pi = 0;
nb_rets = 0;
TCGOP_CALLO(op) = nb_rets;
real_args = 0;
for (i = 0; i < nargs; i++) {
int is_64bit = sizemask & (1 << (i+1)*2);
if (TCG_TARGET_REG_BITS < 64 && is_64bit) {
#ifdef TCG_TARGET_CALL_ALIGN_ARGS
/* some targets want aligned 64 bit args */
if (real_args & 1) {
op->args[pi++] = TCG_CALL_DUMMY_ARG;
real_args++;
}
#endif
/* If stack grows up, then we will be placing successive
arguments at lower addresses, which means we need to
reverse the order compared to how we would normally
treat either big or little-endian. For those arguments
that will wind up in registers, this still works for
HPPA (the only current STACK_GROWSUP target) since the
argument registers are *also* allocated in decreasing
order. If another such target is added, this logic may
have to get more complicated to differentiate between
stack arguments and register arguments. */
#if defined(HOST_WORDS_BIGENDIAN) != defined(TCG_TARGET_STACK_GROWSUP)
op->args[pi++] = temp_arg(args[i] + 1);
op->args[pi++] = temp_arg(args[i]);
#else
op->args[pi++] = temp_arg(args[i]);
op->args[pi++] = temp_arg(args[i] + 1);
#endif
real_args += 2;
continue;
}
op->args[pi++] = temp_arg(args[i]);
real_args++;
}
op->args[pi++] = (uintptr_t)func;
op->args[pi++] = flags;
TCGOP_CALLI(op) = real_args;
/* Make sure the fields didn't overflow. */
tcg_debug_assert(TCGOP_CALLI(op) == real_args);
tcg_debug_assert(pi <= ARRAY_SIZE(op->args));
#if defined(__sparc__) && !defined(__arch64__) \
&& !defined(CONFIG_TCG_INTERPRETER)
/* Free all of the parts we allocated above. */
for (i = real_args = 0; i < orig_nargs; ++i) {
int is_64bit = orig_sizemask & (1 << (i+1)*2);
if (is_64bit) {
tcg_temp_free_internal(args[real_args++]);
tcg_temp_free_internal(args[real_args++]);
} else {
real_args++;
}
}
if (orig_sizemask & 1) {
/* The 32-bit ABI returned two 32-bit pieces. Re-assemble them.
Note that describing these as TCGv_i64 eliminates an unnecessary
zero-extension that tcg_gen_concat_i32_i64 would create. */
tcg_gen_concat32_i64(temp_tcgv_i64(NULL), retl, reth);
tcg_temp_free_i64(retl);
tcg_temp_free_i64(reth);
}
#elif defined(TCG_TARGET_EXTEND_ARGS) && TCG_TARGET_REG_BITS == 64
for (i = 0; i < nargs; ++i) {
int is_64bit = sizemask & (1 << (i+1)*2);
if (!is_64bit) {
tcg_temp_free_internal(args[i]);
}
}
#endif /* TCG_TARGET_EXTEND_ARGS */
}


@@ -10,6 +10,9 @@
TCG instrumentation and block chaining support by Andrea Biondo
<andrea.biondo965@gmail.com>
QEMU 3.1.0 port, TCG thread-safety and CompareCoverage by Andrea Fioraldi
<andreafioraldi@gmail.com>
Copyright 2015, 2016, 2017 Google Inc. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License");
@@ -19,7 +22,7 @@
http://www.apache.org/licenses/LICENSE-2.0
This code is a shim patched into the separately-distributed source
code of QEMU 2.10.0. It leverages the built-in QEMU tracing functionality
code of QEMU 3.1.0. It leverages the built-in QEMU tracing functionality
to implement AFL-style instrumentation and to take care of the remaining
parts of the AFL fork server logic.
@@ -37,10 +40,9 @@ extern unsigned char *afl_area_ptr;
extern unsigned int afl_inst_rms;
extern abi_ulong afl_start_code, afl_end_code;
void tcg_gen_afl_callN(void *func, TCGTemp *ret, int nargs, TCGTemp **args);
void tcg_gen_afl_maybe_log_call(target_ulong cur_loc);
void afl_maybe_log(abi_ulong cur_loc) {
void afl_maybe_log(target_ulong cur_loc) {
static __thread abi_ulong prev_loc;
@@ -49,7 +51,6 @@ void afl_maybe_log(abi_ulong cur_loc) {
}
/* Generates TCG code for AFL's tracing instrumentation. */
static void afl_gen_trace(target_ulong cur_loc) {
@@ -59,7 +60,7 @@ static void afl_gen_trace(target_ulong cur_loc) {
if (cur_loc > afl_end_code || cur_loc < afl_start_code /*|| !afl_area_ptr*/) // not needed because of static dummy buffer
return;
/* Looks like QEMU always maps to fixed locations, so ASAN is not a
/* Looks like QEMU always maps to fixed locations, so ASLR is not a
concern. Phew. But instruction addresses may be aligned. Let's mangle
the value to get something quasi-uniform. */
@@ -71,7 +72,6 @@ static void afl_gen_trace(target_ulong cur_loc) {
if (cur_loc >= afl_inst_rms) return;
TCGTemp *args[1] = { tcgv_i64_temp( tcg_const_tl(cur_loc) ) };
tcg_gen_afl_callN(afl_maybe_log, NULL, 1, args);
tcg_gen_afl_maybe_log_call(cur_loc);
}


@@ -1,5 +1,5 @@
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 5bccd2e2..94e928a4 100644
index 5bccd2e2..fd7460b3 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -20,6 +20,8 @@
@@ -11,16 +11,29 @@ index 5bccd2e2..94e928a4 100644
/* from personality.h */
/*
@@ -2301,6 +2303,8 @@ static void load_elf_image(const char *image_name, int image_fd,
@@ -2301,6 +2303,21 @@ static void load_elf_image(const char *image_name, int image_fd,
info->brk = 0;
info->elf_flags = ehdr->e_flags;
+ if (!afl_entry_point) afl_entry_point = info->entry;
+ if (!afl_entry_point) {
+ char *ptr;
+ if ((ptr = getenv("AFL_ENTRYPOINT")) != NULL) {
+ afl_entry_point = strtoul(ptr, NULL, 16);
+ } else {
+ afl_entry_point = info->entry;
+ }
+#ifdef TARGET_ARM
+ /* The least significant bit indicates Thumb mode. */
+ afl_entry_point = afl_entry_point & ~(target_ulong)1;
+#endif
+ }
+ if (getenv("AFL_DEBUG") != NULL)
+ fprintf(stderr, "AFL forkserver entrypoint: %p\n", (void*)afl_entry_point);
+
for (i = 0; i < ehdr->e_phnum; i++) {
struct elf_phdr *eppnt = phdr + i;
if (eppnt->p_type == PT_LOAD) {
@@ -2335,9 +2339,11 @@ static void load_elf_image(const char *image_name, int image_fd,
@@ -2335,9 +2352,11 @@ static void load_elf_image(const char *image_name, int image_fd,
if (elf_prot & PROT_EXEC) {
if (vaddr < info->start_code) {
info->start_code = vaddr;
@@ -32,3 +45,26 @@ index 5bccd2e2..94e928a4 100644
}
}
if (elf_prot & PROT_WRITE) {
@@ -2662,6 +2681,22 @@ int load_elf_binary(struct linux_binprm *bprm, struct image_info *info)
change some of these later */
bprm->p = setup_arg_pages(bprm, info);
+ // On PowerPC64 the entry point is the _function descriptor_
+ // of the entry function. For AFL to properly initialize,
+ // afl_entry_point needs to be set to the actual first instruction
+ // executed by the target program, as opposed to where the
+ // function's descriptor sits in memory.
+ // copied from PPC init_thread
+#if defined(TARGET_PPC64) && !defined(TARGET_ABI32)
+ if (get_ppc64_abi(infop) < 2) {
+ uint64_t val;
+ get_user_u64(val, infop->entry + 8);
+ _regs->gpr[2] = val + infop->load_bias;
+ get_user_u64(val, infop->entry);
+ infop->entry = val + infop->load_bias;
+ }
+#endif
+
scratch = g_new0(char, TARGET_PAGE_SIZE);
if (STACK_GROWS_DOWN) {
bprm->p = copy_elf_strings(1, &bprm->filename, scratch,


@@ -1,47 +0,0 @@
--- a/linux-user/elfload.c 2019-06-03 13:06:40.755755923 +0200
+++ b/linux-user/elfload.c 2019-06-03 13:33:01.315709801 +0200
@@ -2303,7 +2303,20 @@
info->brk = 0;
info->elf_flags = ehdr->e_flags;
- if (!afl_entry_point) afl_entry_point = info->entry;
+ if (!afl_entry_point) {
+ char *ptr;
+ if ((ptr = getenv("AFL_ENTRYPOINT")) != NULL) {
+ afl_entry_point = strtoul(ptr, NULL, 16);
+ } else {
+ afl_entry_point = info->entry;
+ }
+#ifdef TARGET_ARM
+ /* The least significant bit indicates Thumb mode. */
+ afl_entry_point = afl_entry_point & ~(target_ulong)1;
+#endif
+ }
+ if (getenv("AFL_DEBUG") != NULL)
+ fprintf(stderr, "AFL forkserver entrypoint: %p\n", (void*)afl_entry_point);
for (i = 0; i < ehdr->e_phnum; i++) {
struct elf_phdr *eppnt = phdr + i;
@@ -2668,6 +2681,22 @@
change some of these later */
bprm->p = setup_arg_pages(bprm, info);
+ // On PowerPC64 the entry point is the _function descriptor_
+ // of the entry function. For AFL to properly initialize,
+ // afl_entry_point needs to be set to the actual first instruction
+ // executed by the target program, as opposed to where the
+ // function's descriptor sits in memory.
+ // copied from PPC init_thread
+#if defined(TARGET_PPC64) && !defined(TARGET_ABI32)
+ if (get_ppc64_abi(infop) < 2) {
+ uint64_t val;
+ get_user_u64(val, infop->entry + 8);
+ _regs->gpr[2] = val + infop->load_bias;
+ get_user_u64(val, infop->entry);
+ infop->entry = val + infop->load_bias;
+ }
+#endif
+
scratch = g_new0(char, TARGET_PAGE_SIZE);
if (STACK_GROWS_DOWN) {
bprm->p = copy_elf_strings(1, &bprm->filename, scratch,


@@ -0,0 +1,33 @@
diff --git a/target/i386/translate.c b/target/i386/translate.c
index 0dd5fbe4..b95d341e 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -32,6 +32,8 @@
#include "trace-tcg.h"
#include "exec/log.h"
+#include "../patches/afl-qemu-cpu-translate-inl.h"
+
#define PREFIX_REPZ 0x01
#define PREFIX_REPNZ 0x02
#define PREFIX_LOCK 0x04
@@ -1343,9 +1345,11 @@ static void gen_op(DisasContext *s1, int op, TCGMemOp ot, int d)
tcg_gen_atomic_fetch_add_tl(s1->cc_srcT, s1->A0, s1->T0,
s1->mem_index, ot | MO_LE);
tcg_gen_sub_tl(s1->T0, s1->cc_srcT, s1->T1);
+ afl_gen_compcov(s1->pc, s1->cc_srcT, s1->T1, ot);
} else {
tcg_gen_mov_tl(s1->cc_srcT, s1->T0);
tcg_gen_sub_tl(s1->T0, s1->T0, s1->T1);
+ afl_gen_compcov(s1->pc, s1->T0, s1->T1, ot);
gen_op_st_rm_T0_A0(s1, ot, d);
}
gen_op_update2_cc(s1);
@@ -1389,6 +1393,7 @@ static void gen_op(DisasContext *s1, int op, TCGMemOp ot, int d)
tcg_gen_mov_tl(cpu_cc_src, s1->T1);
tcg_gen_mov_tl(s1->cc_srcT, s1->T0);
tcg_gen_sub_tl(cpu_cc_dst, s1->T0, s1->T1);
+ afl_gen_compcov(s1->pc, s1->T0, s1->T1, ot);
set_cc_op(s1, CC_OP_SUBB + ot);
break;
}


@@ -2,179 +2,12 @@ diff --git a/tcg/tcg.c b/tcg/tcg.c
index e85133ef..54b9b390 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -1612,6 +1612,176 @@ bool tcg_op_supported(TCGOpcode op)
@@ -1612,6 +1612,9 @@ bool tcg_op_supported(TCGOpcode op)
}
}
+
+/* Call the instrumentation function from the TCG IR */
+void tcg_gen_afl_callN(void *func, TCGTemp *ret, int nargs, TCGTemp **args)
+{
+ int i, real_args, nb_rets, pi;
+ unsigned sizemask, flags;
+ TCGOp *op;
+
+ flags = 0;
+ sizemask = 0;
+
+#if defined(__sparc__) && !defined(__arch64__) \
+ && !defined(CONFIG_TCG_INTERPRETER)
+ /* We have 64-bit values in one register, but need to pass as two
+ separate parameters. Split them. */
+ int orig_sizemask = sizemask;
+ int orig_nargs = nargs;
+ TCGv_i64 retl, reth;
+ TCGTemp *split_args[MAX_OPC_PARAM];
+
+ retl = NULL;
+ reth = NULL;
+ if (sizemask != 0) {
+ for (i = real_args = 0; i < nargs; ++i) {
+ int is_64bit = sizemask & (1 << (i+1)*2);
+ if (is_64bit) {
+ TCGv_i64 orig = temp_tcgv_i64(args[i]);
+ TCGv_i32 h = tcg_temp_new_i32();
+ TCGv_i32 l = tcg_temp_new_i32();
+ tcg_gen_extr_i64_i32(l, h, orig);
+ split_args[real_args++] = tcgv_i32_temp(h);
+ split_args[real_args++] = tcgv_i32_temp(l);
+ } else {
+ split_args[real_args++] = args[i];
+ }
+ }
+ nargs = real_args;
+ args = split_args;
+ sizemask = 0;
+ }
+#elif defined(TCG_TARGET_EXTEND_ARGS) && TCG_TARGET_REG_BITS == 64
+ for (i = 0; i < nargs; ++i) {
+ int is_64bit = sizemask & (1 << (i+1)*2);
+ int is_signed = sizemask & (2 << (i+1)*2);
+ if (!is_64bit) {
+ TCGv_i64 temp = tcg_temp_new_i64();
+ TCGv_i64 orig = temp_tcgv_i64(args[i]);
+ if (is_signed) {
+ tcg_gen_ext32s_i64(temp, orig);
+ } else {
+ tcg_gen_ext32u_i64(temp, orig);
+ }
+ args[i] = tcgv_i64_temp(temp);
+ }
+ }
+#endif /* TCG_TARGET_EXTEND_ARGS */
+
+ op = tcg_emit_op(INDEX_op_call);
+
+ pi = 0;
+ if (ret != NULL) {
+#if defined(__sparc__) && !defined(__arch64__) \
+ && !defined(CONFIG_TCG_INTERPRETER)
+ if (orig_sizemask & 1) {
+ /* The 32-bit ABI is going to return the 64-bit value in
+ the %o0/%o1 register pair. Prepare for this by using
+ two return temporaries, and reassemble below. */
+ retl = tcg_temp_new_i64();
+ reth = tcg_temp_new_i64();
+ op->args[pi++] = tcgv_i64_arg(reth);
+ op->args[pi++] = tcgv_i64_arg(retl);
+ nb_rets = 2;
+ } else {
+ op->args[pi++] = temp_arg(ret);
+ nb_rets = 1;
+ }
+#else
+ if (TCG_TARGET_REG_BITS < 64 && (sizemask & 1)) {
+#ifdef HOST_WORDS_BIGENDIAN
+ op->args[pi++] = temp_arg(ret + 1);
+ op->args[pi++] = temp_arg(ret);
+#else
+ op->args[pi++] = temp_arg(ret);
+ op->args[pi++] = temp_arg(ret + 1);
+#endif
+ nb_rets = 2;
+ } else {
+ op->args[pi++] = temp_arg(ret);
+ nb_rets = 1;
+ }
+#endif
+ } else {
+ nb_rets = 0;
+ }
+ TCGOP_CALLO(op) = nb_rets;
+
+ real_args = 0;
+ for (i = 0; i < nargs; i++) {
+ int is_64bit = sizemask & (1 << (i+1)*2);
+ if (TCG_TARGET_REG_BITS < 64 && is_64bit) {
+#ifdef TCG_TARGET_CALL_ALIGN_ARGS
+ /* some targets want aligned 64 bit args */
+ if (real_args & 1) {
+ op->args[pi++] = TCG_CALL_DUMMY_ARG;
+ real_args++;
+ }
+#endif
+ /* If stack grows up, then we will be placing successive
+ arguments at lower addresses, which means we need to
+ reverse the order compared to how we would normally
+ treat either big or little-endian. For those arguments
+ that will wind up in registers, this still works for
+ HPPA (the only current STACK_GROWSUP target) since the
+ argument registers are *also* allocated in decreasing
+ order. If another such target is added, this logic may
+ have to get more complicated to differentiate between
+ stack arguments and register arguments. */
+#if defined(HOST_WORDS_BIGENDIAN) != defined(TCG_TARGET_STACK_GROWSUP)
+ op->args[pi++] = temp_arg(args[i] + 1);
+ op->args[pi++] = temp_arg(args[i]);
+#else
+ op->args[pi++] = temp_arg(args[i]);
+ op->args[pi++] = temp_arg(args[i] + 1);
+#endif
+ real_args += 2;
+ continue;
+ }
+
+ op->args[pi++] = temp_arg(args[i]);
+ real_args++;
+ }
+ op->args[pi++] = (uintptr_t)func;
+ op->args[pi++] = flags;
+ TCGOP_CALLI(op) = real_args;
+
+ /* Make sure the fields didn't overflow. */
+ tcg_debug_assert(TCGOP_CALLI(op) == real_args);
+ tcg_debug_assert(pi <= ARRAY_SIZE(op->args));
+
+#if defined(__sparc__) && !defined(__arch64__) \
+ && !defined(CONFIG_TCG_INTERPRETER)
+ /* Free all of the parts we allocated above. */
+ for (i = real_args = 0; i < orig_nargs; ++i) {
+ int is_64bit = orig_sizemask & (1 << (i+1)*2);
+ if (is_64bit) {
+ tcg_temp_free_internal(args[real_args++]);
+ tcg_temp_free_internal(args[real_args++]);
+ } else {
+ real_args++;
+ }
+ }
+ if (orig_sizemask & 1) {
+ /* The 32-bit ABI returned two 32-bit pieces. Re-assemble them.
+ Note that describing these as TCGv_i64 eliminates an unnecessary
+ zero-extension that tcg_gen_concat_i32_i64 would create. */
+ tcg_gen_concat32_i64(temp_tcgv_i64(ret), retl, reth);
+ tcg_temp_free_i64(retl);
+ tcg_temp_free_i64(reth);
+ }
+#elif defined(TCG_TARGET_EXTEND_ARGS) && TCG_TARGET_REG_BITS == 64
+ for (i = 0; i < nargs; ++i) {
+ int is_64bit = sizemask & (1 << (i+1)*2);
+ if (!is_64bit) {
+ tcg_temp_free_internal(args[i]);
+ }
+ }
+#endif /* TCG_TARGET_EXTEND_ARGS */
+}
+
+#include "../../patches/afl-qemu-tcg-inl.h"
+
/* Note: we convert the 64 bit args to 32 bit and do some alignment
and endian swap. Maybe it would be better to do the alignment

sharedmem.c

@@ -0,0 +1,137 @@
/*
*/
#define AFL_MAIN
#include "config.h"
#include "types.h"
#include "debug.h"
#include "alloc-inl.h"
#include "hash.h"
#include "sharedmem.h"
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <errno.h>
#include <signal.h>
#include <dirent.h>
#include <fcntl.h>
#include <sys/wait.h>
#include <sys/time.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/resource.h>
#include <sys/mman.h>
#ifndef USEMMAP
#include <sys/ipc.h>
#include <sys/shm.h>
#endif
extern unsigned char *trace_bits;
#ifdef USEMMAP
/* ================ Proteas ================ */
int g_shm_fd = -1;
unsigned char *g_shm_base = NULL;
char g_shm_file_path[L_tmpnam];
/* ========================================= */
#else
static s32 shm_id; /* ID of the SHM region */
#endif
/* Get rid of shared memory (atexit handler). */
void remove_shm(void) {
#ifdef USEMMAP
if (g_shm_base != NULL) {
munmap(g_shm_base, MAP_SIZE);
g_shm_base = NULL;
}
if (g_shm_fd != -1) {
close(g_shm_fd);
g_shm_fd = -1;
}
#else
shmctl(shm_id, IPC_RMID, NULL);
#endif
}
/* Configure shared memory. */
void setup_shm(unsigned char dumb_mode) {
#ifdef USEMMAP
/* generate random file name for multi instance */
/* thanks to f*cking glibc we can not use tmpnam securely, it generates a security warning that cannot be suppressed */
/* so we do this worse workaround */
snprintf(g_shm_file_path, L_tmpnam, "/afl_%d_%ld", getpid(), random());
/* create the shared memory segment as if it was a file */
g_shm_fd = shm_open(g_shm_file_path, O_CREAT | O_RDWR | O_EXCL, 0600);
if (g_shm_fd == -1) {
PFATAL("shm_open() failed");
}
/* configure the size of the shared memory segment */
if (ftruncate(g_shm_fd, MAP_SIZE)) {
PFATAL("setup_shm(): ftruncate() failed");
}
/* map the shared memory segment to the address space of the process */
g_shm_base = mmap(0, MAP_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, g_shm_fd, 0);
if (g_shm_base == MAP_FAILED) {
close(g_shm_fd);
g_shm_fd = -1;
PFATAL("mmap() failed");
}
atexit(remove_shm);
/* If somebody is asking us to fuzz instrumented binaries in dumb mode,
we don't want them to detect instrumentation, since we won't be sending
fork server commands. This should be replaced with better auto-detection
later on, perhaps? */
if (!dumb_mode) setenv(SHM_ENV_VAR, g_shm_file_path, 1);
trace_bits = g_shm_base;
if (!trace_bits) PFATAL("mmap() failed");
#else
u8* shm_str;
shm_id = shmget(IPC_PRIVATE, MAP_SIZE, IPC_CREAT | IPC_EXCL | 0600);
if (shm_id < 0) PFATAL("shmget() failed");
atexit(remove_shm);
shm_str = alloc_printf("%d", shm_id);
/* If somebody is asking us to fuzz instrumented binaries in dumb mode,
we don't want them to detect instrumentation, since we won't be sending
fork server commands. This should be replaced with better auto-detection
later on, perhaps? */
if (!dumb_mode) setenv(SHM_ENV_VAR, shm_str, 1);
ck_free(shm_str);
trace_bits = shmat(shm_id, NULL, 0);
if (!trace_bits) PFATAL("shmat() failed");
#endif
}

sharedmem.h

@@ -0,0 +1,6 @@
#ifndef __SHAREDMEM_H
#define __SHAREDMEM_H
void setup_shm(unsigned char dumb_mode);
void remove_shm(void);
#endif


@@ -22,15 +22,17 @@ int main(int argc, char** argv) {
char buf[8];
if (read(0, buf, 8) < 1) {
if (read(0, buf, sizeof(buf)) < 1) {
printf("Hum?\n");
exit(1);
}
if (buf[0] == '0')
printf("Looks like a zero to me!\n");
else if (buf[0] == '1')
printf("Pretty sure that is a one!\n");
else
printf("A non-zero value? How quaint!\n");
printf("Neither one nor zero? How quaint!\n");
exit(0);

types.h

@@ -78,9 +78,14 @@ typedef int64_t s64;
#define STRINGIFY(x) STRINGIFY_INTERNAL(x)
#define MEM_BARRIER() \
asm volatile("" ::: "memory")
__asm__ volatile("" ::: "memory")
#define likely(_x) __builtin_expect(!!(_x), 1)
#define unlikely(_x) __builtin_expect(!!(_x), 0)
#if __GNUC__ < 6
#define likely(_x) (_x)
#define unlikely(_x) (_x)
#else
#define likely(_x) __builtin_expect(!!(_x), 1)
#define unlikely(_x) __builtin_expect(!!(_x), 0)
#endif
#endif /* ! _HAVE_TYPES_H */

unicorn_mode/README.md

@@ -0,0 +1,21 @@
```
__ _ _
__ _ / _| | _ _ _ __ (_) ___ ___ _ __ _ __
/ _` | |_| |___| | | | '_ \| |/ __/ _ \| '__| '_ \
| (_| | _| |___| |_| | | | | | (_| (_) | | | | | |
\__,_|_| |_| \__,_|_| |_|_|\___\___/|_| |_| |_|
```
afl-unicorn lets you fuzz any piece of binary code that can be emulated by
[Unicorn Engine](http://www.unicorn-engine.org/).
For the full readme please see docs/unicorn_mode.txt
For an in-depth description of what this is, how to install it, and how to use
it check out this [blog post](https://medium.com/@njvoss299/afl-unicorn-fuzzing-arbitrary-binary-code-563ca28936bf).
For general help with AFL, please refer to the documents in the ./docs/ directory.
Created by Nathan Voss, originally funded by
[Battelle](https://www.battelle.org/cyber).


@@ -0,0 +1,191 @@
#!/bin/sh
#
# american fuzzy lop - Unicorn-Mode build script
# --------------------------------------
#
# Written by Nathan Voss <njvoss99@gmail.com>
#
# Adapted from code by Andrew Griffiths <agriffiths@google.com> and
# Michal Zalewski <lcamtuf@google.com>
#
# Adapted for Afl++ by Dominik Maier <mail@dmnk.co>
#
# Copyright 2017 Battelle Memorial Institute. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at:
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# This script downloads, patches, and builds a version of Unicorn with
# minor tweaks to allow Unicorn-emulated binaries to be run under
# afl-fuzz.
#
# The modifications reside in patches/*. The standalone Unicorn library
# will be written to /usr/lib/libunicornafl.so, and the Python bindings
# will be installed system-wide.
#
# You must make sure that Unicorn Engine is not already installed before
# running this script. If it is, please uninstall it first.
UNICORN_URL="https://github.com/unicorn-engine/unicorn/archive/24f55a7973278f20f0de21b904851d99d4716263.tar.gz"
UNICORN_SHA384="7180d47ca52c99b4c073a343a2ead91da1a829fdc3809f3ceada5d872e162962eab98873a8bc7971449d5f34f41fdb93"
echo "================================================="
echo "Unicorn-AFL build script"
echo "================================================="
echo
echo "[*] Performing basic sanity checks..."
if [ ! "`uname -s`" = "Linux" ]; then
echo "[-] Error: Unicorn instrumentation is supported only on Linux."
exit 1
fi
if [ ! -f "patches/afl-unicorn-cpu-inl.h" -o ! -f "../config.h" ]; then
echo "[-] Error: key files not found - wrong working directory?"
exit 1
fi
if [ ! -f "../afl-showmap" ]; then
echo "[-] Error: ../afl-showmap not found - compile AFL first!"
exit 1
fi
for i in wget python automake autoconf sha384sum; do
T=`which "$i" 2>/dev/null`
if [ "$T" = "" ]; then
echo "[-] Error: '$i' not found. Run 'sudo apt-get install $i'."
exit 1
fi
done
if ! which easy_install > /dev/null; then
# workaround for unusual installs
if [ '!' -e /usr/lib/python2.7/dist-packages/easy_install.py ]; then
echo "[-] Error: Python setup-tools not found. Run 'sudo apt-get install python-setuptools'."
exit 1
fi
fi
if echo "$CC" | grep -qF /afl-; then
echo "[-] Error: do not use afl-gcc or afl-clang to compile this tool."
exit 1
fi
echo "[+] All checks passed!"
ARCHIVE="`basename -- "$UNICORN_URL"`"
CKSUM=`sha384sum -- "$ARCHIVE" 2>/dev/null | cut -d' ' -f1`
if [ ! "$CKSUM" = "$UNICORN_SHA384" ]; then
echo "[*] Downloading Unicorn from the web..."
rm -f "$ARCHIVE"
wget -O "$ARCHIVE" -- "$UNICORN_URL" || exit 1
CKSUM=`sha384sum -- "$ARCHIVE" 2>/dev/null | cut -d' ' -f1`
fi
if [ "$CKSUM" = "$UNICORN_SHA384" ]; then
echo "[+] Checksum for $ARCHIVE checks out."
else
echo "[-] Error: checksum mismatch on $ARCHIVE (perhaps a download error?)."
exit 1
fi
echo "[*] Uncompressing archive (this will take a while)..."
rm -rf "unicorn" || exit 1
mkdir "unicorn" || exit 1
tar xzf "$ARCHIVE" -C ./unicorn --strip-components=1 || exit 1
echo "[+] Unpacking successful."
rm -rf "$ARCHIVE" || exit 1
echo "[*] Applying patches..."
cp patches/afl-unicorn-cpu-inl.h unicorn || exit 1
patch -p1 --directory unicorn <patches/patches.diff || exit 1
echo "[+] Patching done."
echo "[*] Configuring Unicorn build..."
cd "unicorn" || exit 1
echo "[+] Configuration complete."
echo "[*] Attempting to build Unicorn (fingers crossed!)..."
UNICORN_QEMU_FLAGS='--python=python2' make || exit 1
echo "[+] Build process successful!"
echo "[*] Installing Unicorn python bindings..."
cd bindings/python || exit 1
if [ -z "$VIRTUAL_ENV" ]; then
echo "[*] Info: Installing python unicorn using --user"
python setup.py install --user || exit 1
else
echo "[*] Info: Installing python unicorn to virtualenv: $VIRTUAL_ENV"
python setup.py install || exit 1
fi
export LIBUNICORN_PATH="$(pwd)" # in theory, this allows switching between the afl-unicorn and vanilla unicorn shared objects.
cd ../../ || exit 1
echo "[+] Unicorn bindings installed successfully."
# Compile the sample, run it, verify that it works!
echo "[*] Testing unicorn-mode functionality by running a sample test harness under afl-unicorn"
cd ../samples/simple || exit 1
# Run afl-showmap on the sample application. If anything comes out then it must have worked!
unset AFL_INST_RATIO
echo 0 | ../../../afl-showmap -U -m none -q -o .test-instr0 -- python simple_test_harness.py ./sample_inputs/sample1.bin || exit 1
if [ -s .test-instr0 ]
then
echo "[+] Instrumentation tests passed."
echo "[+] All set, you can now use Unicorn mode (-U) in afl-fuzz!"
RETVAL=0
else
echo "[-] Error: Unicorn mode doesn't seem to work!"
RETVAL=1
fi
rm -f .test-instr0
exit $RETVAL
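The download-and-verify step above boils down to comparing a SHA-384 digest of the archive against a pinned constant. A minimal Python sketch of the same check, using only the standard library (the digest constant here is for the empty input, purely for illustration):

```python
import hashlib

def verify_archive(data, expected_sha384):
    """Return True if the SHA-384 digest of data matches the pinned value."""
    return hashlib.sha384(data).hexdigest() == expected_sha384

# SHA-384 of the empty byte string, used here only as a demo constant.
EMPTY_SHA384 = ("38b060a751ac96384cd9327eb1b1e36a21fdb71114be07434c0cc7bf63f6e1da"
                "274edebfe76f65fbd51ad2f14898b95b")

assert verify_archive(b"", EMPTY_SHA384)
assert not verify_archive(b"tampered", EMPTY_SHA384)
```

Note that this is an integrity check against a pinned hash, not a cryptographic signature: it protects against corrupted downloads, not against an attacker who can also change the pinned constant.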


@ -0,0 +1,104 @@
"""
template_test_harness.py
Template which loads the context of a process into a Unicorn Engine
instance, loads a custom (mutated) input, and executes the
desired code. Designed to be used in conjunction with one of the
Unicorn Context Dumper scripts.
Author:
Nathan Voss <njvoss299@gmail.com>
"""
import argparse
from unicorn import *
from unicorn.x86_const import * # TODO: Set correct architecture here as necessary
import unicorn_loader
# Simple stand-in heap to prevent OS/kernel issues
unicorn_heap = None
# Start and end address of emulation
START_ADDRESS = None # TODO: Set start address here
END_ADDRESS = None # TODO: Set end address here
"""
Implement target-specific hooks in here.
Stub out, skip past, and re-implement necessary functionality as appropriate
"""
def unicorn_hook_instruction(uc, address, size, user_data):
# TODO: Setup hooks and handle anything you need to here
# - For example, hook malloc/free/etc. and handle it internally
pass
#------------------------
#---- Main test function
def main():
parser = argparse.ArgumentParser()
parser.add_argument('context_dir', type=str, help="Directory containing process context")
parser.add_argument('input_file', type=str, help="Path to the file containing the mutated input content")
parser.add_argument('-d', '--debug', default=False, action="store_true", help="Dump trace info")
args = parser.parse_args()
print("Loading context from {}".format(args.context_dir))
uc = unicorn_loader.AflUnicornEngine(args.context_dir, enable_trace=args.debug, debug_print=False)
# Instantiate the heap and register the instruction hook to avoid emulation errors
global unicorn_heap
unicorn_heap = unicorn_loader.UnicornSimpleHeap(uc, debug_print=True)
uc.hook_add(UC_HOOK_CODE, unicorn_hook_instruction)
# Execute 1 instruction just to start up the forkserver
# NOTE: This instruction will be executed again later, so be sure that
# there are no negative consequences to the overall execution state.
# If there are, change the later call to emu_start to not re-execute
# the first instruction.
print("Starting the forkserver by executing 1 instruction")
try:
uc.emu_start(START_ADDRESS, 0, 0, count=1)
except UcError as e:
print("ERROR: Failed to execute a single instruction (error: {})!".format(e))
return
# Allocate a buffer and load a mutated input and put it into the right spot
if args.input_file:
print("Loading input content from {}".format(args.input_file))
input_file = open(args.input_file, 'rb')
input_content = input_file.read()
input_file.close()
# TODO: Apply constraints to mutated input here
raise NotImplementedError('No constraints on the mutated inputs have been set!')
# Allocate a new buffer and put the input into it
buf_addr = unicorn_heap.malloc(len(input_content))
uc.mem_write(buf_addr, input_content)
print("Allocated mutated input buffer @ 0x{0:016x}".format(buf_addr))
# TODO: Set the input into the state so it will be handled
raise NotImplementedError('The mutated input was not loaded into the Unicorn state!')
# Run the test
print("Executing from 0x{0:016x} to 0x{1:016x}".format(START_ADDRESS, END_ADDRESS))
try:
result = uc.emu_start(START_ADDRESS, END_ADDRESS, timeout=0, count=0)
except UcError as e:
# If something went wrong during emulation a signal is raised to force this
# script to crash in a way that AFL can detect ('uc.force_crash()' should be
# called for any condition that you want AFL to treat as a crash).
print("Execution failed with error: {}".format(e))
uc.dump_regs()
uc.force_crash(e)
print("Final register state:")
uc.dump_regs()
print("Done.")
if __name__ == "__main__":
main()
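The template above deliberately raises until you implement input constraints for your target. A hedged sketch of what such a constraint step often looks like (the magic value and size cap are hypothetical, not part of the template):

```python
MAX_INPUT_SIZE = 0x1000  # hypothetical cap so the buffer fits the emulated heap
MAGIC = b"\x7fELF"       # hypothetical magic bytes the target expects

def apply_constraints(input_content):
    """Trim oversized inputs and force the expected magic bytes."""
    data = input_content[:MAX_INPUT_SIZE]
    if not data.startswith(MAGIC):
        data = MAGIC + data[len(MAGIC):]
    return data

assert apply_constraints(b"AAAA" * 2048).startswith(MAGIC)
assert len(apply_constraints(b"A" * 0x2000)) == MAX_INPUT_SIZE
```

Constraining inputs this way keeps the fuzzer from wasting executions on inputs the target would reject before reaching interesting code.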


@ -0,0 +1,190 @@
"""
unicorn_dumper_gdb.py
When run with GDB sitting at a debug breakpoint, this
dumps the current state (registers/memory/etc) of
the process to a directory consisting of an index
file with register and segment information and
sub-files containing all actual process memory.
The output of this script is expected to be used
to initialize context for Unicorn emulation.
-----------
In order to run this script, GEF needs to be running in the GDB session (gef.py)
# HELPERS from: https://github.com/hugsy/gef/blob/master/gef.py
It can be loaded with:
source <path_to_gef>/gef.py
Call this function when at a breakpoint in your process with:
source unicorn_dumper_gdb.py
-----------
"""
import datetime
import hashlib
import json
import os
import sys
import time
import zlib
# GDB Python SDK
import gdb
# Maximum segment size that we'll store
# Yep, this could break stuff pretty quickly if we
# omit something that's used during emulation.
MAX_SEG_SIZE = 128 * 1024 * 1024
# Name of the index file
INDEX_FILE_NAME = "_index.json"
#----------------------
#---- Helper Functions
def map_arch():
arch = get_arch() # from GEF
if 'x86_64' in arch or 'x86-64' in arch:
return "x64"
elif 'x86' in arch or 'i386' in arch:
return "x86"
elif 'aarch64' in arch or 'arm64' in arch:
return "arm64le"
elif 'aarch64_be' in arch:
return "arm64be"
elif 'armeb' in arch:
# check for THUMB mode
cpsr = get_register('cpsr')
if (cpsr & (1 << 5)):
return "armbethumb"
else:
return "armbe"
elif 'arm' in arch:
# check for THUMB mode
cpsr = get_register('cpsr')
if (cpsr & (1 << 5)):
return "armlethumb"
else:
return "armle"
else:
return ""
#-----------------------
#---- Dumping functions
def dump_arch_info():
arch_info = {}
arch_info["arch"] = map_arch()
return arch_info
def dump_regs():
reg_state = {}
for reg in current_arch.all_registers:
reg_val = get_register(reg)
# current dumper script looks for register values to be hex strings
# reg_str = "0x{:08x}".format(reg_val)
# if "64" in get_arch():
# reg_str = "0x{:016x}".format(reg_val)
# reg_state[reg.strip().strip('$')] = reg_str
reg_state[reg.strip().strip('$')] = reg_val
return reg_state
def dump_process_memory(output_dir):
# Segment information dictionary
final_segment_list = []
# GEF:
vmmap = get_process_maps()
if not vmmap:
print("No address mapping information found")
return final_segment_list
for entry in vmmap:
if entry.page_start == entry.page_end:
continue
seg_info = {'start': entry.page_start, 'end': entry.page_end, 'name': entry.path, 'permissions': {
"r": entry.is_readable() > 0,
"w": entry.is_writable() > 0,
"x": entry.is_executable() > 0
}, 'content_file': ''}
# "(deleted)" may or may not be valid, but don't push it.
if entry.is_readable() and not '(deleted)' in entry.path:
try:
# Compress and dump the content to a file
seg_content = read_memory(entry.page_start, entry.size)
if seg_content is None:
print("Segment empty: @0x{0:016x} (size:UNKNOWN) {1}".format(entry.page_start, entry.path))
else:
print("Dumping segment @0x{0:016x} (size:0x{1:x}): {2} [{3}]".format(entry.page_start, len(seg_content), entry.path, repr(seg_info['permissions'])))
compressed_seg_content = zlib.compress(seg_content)
md5_sum = hashlib.md5(compressed_seg_content).hexdigest() + ".bin"
seg_info["content_file"] = md5_sum
# Write the compressed contents to disk
out_file = open(os.path.join(output_dir, md5_sum), 'wb')
out_file.write(compressed_seg_content)
out_file.close()
except:
print("Exception reading segment ({}): {}".format(entry.path, sys.exc_info()[0]))
else:
print("Skipping segment {0}@0x{1:016x}".format(entry.path, entry.page_start))
# Add the segment to the list
final_segment_list.append(seg_info)
return final_segment_list
#----------
#---- Main
def main():
print("----- Unicorn Context Dumper -----")
print("You must be actively debugging before running this!")
print("If it fails, double check that you are actively debugging before running.")
try:
GEF_TEST = set_arch()
except Exception as e:
print("!!! GEF not running in GDB. Please run gef.py by executing:")
print('\tpython execfile ("<path_to_gef>/gef.py")')
return
try:
# Create the output directory
timestamp = datetime.datetime.fromtimestamp(time.time()).strftime('%Y%m%d_%H%M%S')
output_path = "UnicornContext_" + timestamp
if not os.path.exists(output_path):
os.makedirs(output_path)
print("Process context will be output to {}".format(output_path))
# Get the context
context = {
"arch": dump_arch_info(),
"regs": dump_regs(),
"segments": dump_process_memory(output_path),
}
# Write the index file
index_file = open(os.path.join(output_path, INDEX_FILE_NAME), 'w')
index_file.write(json.dumps(context, indent=4))
index_file.close()
print("Done.")
except Exception as e:
print("!!! ERROR:\n\t{}".format(repr(e)))
if __name__ == "__main__":
main()
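All the dumpers share the same on-disk layout: each readable segment is zlib-compressed, stored under a filename derived from the MD5 of the compressed bytes, and referenced from `_index.json`. A self-contained sketch of that scheme and its inverse, as a loader would perform it:

```python
import hashlib
import os
import tempfile
import zlib

def dump_segment(output_dir, seg_content):
    """Compress a segment and store it under the MD5 of the compressed bytes."""
    compressed = zlib.compress(seg_content)
    name = hashlib.md5(compressed).hexdigest() + ".bin"
    with open(os.path.join(output_dir, name), "wb") as f:
        f.write(compressed)
    return name  # recorded as 'content_file' in the index

def load_segment(output_dir, content_file):
    """Inverse operation: read the file back and decompress it."""
    with open(os.path.join(output_dir, content_file), "rb") as f:
        return zlib.decompress(f.read())

out = tempfile.mkdtemp()
name = dump_segment(out, b"\x90" * 4096)
assert load_segment(out, name) == b"\x90" * 4096
```

Naming files after the digest of their contents also deduplicates identical segments for free.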


@ -0,0 +1,209 @@
"""
unicorn_dumper_ida.py
When run with IDA (<v7) sitting at a debug breakpoint,
dumps the current state (registers/memory/etc) of
the process to a directory consisting of an index
file with register and segment information and
sub-files containing all actual process memory.
The output of this script is expected to be used
to initialize context for Unicorn emulation.
"""
import datetime
import hashlib
import json
import os
import sys
import time
import zlib
# IDA Python SDK
from idaapi import *
from idc import *
# Maximum segment size that we'll store
# Yep, this could break stuff pretty quickly if we
# omit something that's used during emulation.
MAX_SEG_SIZE = 128 * 1024 * 1024
# Name of the index file
INDEX_FILE_NAME = "_index.json"
#----------------------
#---- Helper Functions
def get_arch():
if ph.id == PLFM_386 and ph.flag & PR_USE64:
return "x64"
elif ph.id == PLFM_386 and ph.flag & PR_USE32:
return "x86"
elif ph.id == PLFM_ARM and ph.flag & PR_USE64:
if cvar.inf.is_be():
return "arm64be"
else:
return "arm64le"
elif ph.id == PLFM_ARM and ph.flag & PR_USE32:
if cvar.inf.is_be():
return "armbe"
else:
return "armle"
else:
return ""
def get_register_list(arch):
if arch == "arm64le" or arch == "arm64be":
arch = "arm64"
elif arch == "armle" or arch == "armbe":
arch = "arm"
registers = {
"x64" : [
"rax", "rbx", "rcx", "rdx", "rsi", "rdi", "rbp", "rsp",
"r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15",
"rip", "efl",
"cs", "ds", "es", "fs", "gs", "ss",
],
"x86" : [
"eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp",
"eip", "efl",
"cs", "ds", "es", "fs", "gs", "ss",
],
"arm" : [
"R0", "R1", "R2", "R3", "R4", "R5", "R6", "R7",
"R8", "R9", "R10", "R11", "R12", "PC", "SP", "LR",
"PSR",
],
"arm64" : [
"X0", "X1", "X2", "X3", "X4", "X5", "X6", "X7",
"X8", "X9", "X10", "X11", "X12", "X13", "X14",
"X15", "X16", "X17", "X18", "X19", "X20", "X21",
"X22", "X23", "X24", "X25", "X26", "X27", "X28",
"PC", "SP", "FP", "LR", "CPSR"
# "NZCV",
]
}
return registers[arch]
#-----------------------
#---- Dumping functions
def dump_arch_info():
arch_info = {}
arch_info["arch"] = get_arch()
return arch_info
def dump_regs():
reg_state = {}
for reg in get_register_list(get_arch()):
reg_state[reg] = GetRegValue(reg)
return reg_state
def dump_process_memory(output_dir):
# Segment information dictionary
segment_list = []
# Loop over the segments, fill in the info dictionary
for seg_ea in Segments():
seg_start = SegStart(seg_ea)
seg_end = SegEnd(seg_ea)
seg_size = seg_end - seg_start
seg_info = {}
seg_info["name"] = SegName(seg_ea)
seg_info["start"] = seg_start
seg_info["end"] = seg_end
perms = getseg(seg_ea).perm
seg_info["permissions"] = {
"r": False if (perms & SEGPERM_READ) == 0 else True,
"w": False if (perms & SEGPERM_WRITE) == 0 else True,
"x": False if (perms & SEGPERM_EXEC) == 0 else True,
}
if (perms & SEGPERM_READ) and seg_size <= MAX_SEG_SIZE and isLoaded(seg_start):
try:
# Compress and dump the content to a file
seg_content = get_many_bytes(seg_start, seg_end - seg_start)
if seg_content is None:
print("Segment empty: {0}@0x{1:016x} (size:UNKNOWN)".format(SegName(seg_ea), seg_ea))
seg_info["content_file"] = ""
else:
print("Dumping segment {0}@0x{1:016x} (size:{2})".format(SegName(seg_ea), seg_ea, len(seg_content)))
compressed_seg_content = zlib.compress(seg_content)
md5_sum = hashlib.md5(compressed_seg_content).hexdigest() + ".bin"
seg_info["content_file"] = md5_sum
# Write the compressed contents to disk
out_file = open(os.path.join(output_dir, md5_sum), 'wb')
out_file.write(compressed_seg_content)
out_file.close()
except:
print("Exception reading segment: {}".format(sys.exc_info()[0]))
seg_info["content_file"] = ""
else:
print("Skipping segment {0}@0x{1:016x}".format(SegName(seg_ea), seg_ea))
seg_info["content_file"] = ""
# Add the segment to the list
segment_list.append(seg_info)
return segment_list
"""
TODO: FINISH IMPORT DUMPING
def import_callback(ea, name, ord):
if not name:
else:
# True -> Continue enumeration
# False -> End enumeration
return True
def dump_imports():
import_dict = {}
for i in xrange(0, number_of_import_modules):
enum_import_names(i, import_callback)
return import_dict
"""
#----------
#---- Main
def main():
try:
print("----- Unicorn Context Dumper -----")
print("You must be actively debugging before running this!")
print("If it fails, double check that you are actively debugging before running.")
# Create the output directory
timestamp = datetime.datetime.fromtimestamp(time.time()).strftime('%Y%m%d_%H%M%S')
output_path = os.path.dirname(os.path.abspath(GetIdbPath()))
output_path = os.path.join(output_path, "UnicornContext_" + timestamp)
if not os.path.exists(output_path):
os.makedirs(output_path)
print("Process context will be output to {}".format(output_path))
# Get the context
context = {
"arch": dump_arch_info(),
"regs": dump_regs(),
"segments": dump_process_memory(output_path),
#"imports": dump_imports(),
}
# Write the index file
index_file = open(os.path.join(output_path, INDEX_FILE_NAME), 'w')
index_file.write(json.dumps(context, indent=4))
index_file.close()
print("Done.")
except Exception, e:
print("!!! ERROR:\n\t{}".format(str(e)))
if __name__ == "__main__":
main()
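The `SEGPERM_*` handling above is plain bit-masking, and the `False if ... == 0 else True` ternaries can be written as simple truth tests. A sketch with hypothetical flag values (use IDA's real `SEGPERM_*` constants in practice):

```python
# Hypothetical flag values for illustration only.
SEGPERM_EXEC, SEGPERM_WRITE, SEGPERM_READ = 1, 2, 4

def decode_perms(perms):
    """Map a permission bitfield to the r/w/x dictionary the dumpers emit."""
    return {
        "r": bool(perms & SEGPERM_READ),
        "w": bool(perms & SEGPERM_WRITE),
        "x": bool(perms & SEGPERM_EXEC),
    }

assert decode_perms(SEGPERM_READ | SEGPERM_EXEC) == {"r": True, "w": False, "x": True}
assert decode_perms(0) == {"r": False, "w": False, "x": False}
```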


@ -0,0 +1,299 @@
"""
unicorn_dumper_lldb.py
When run with LLDB sitting at a debug breakpoint, this
dumps the current state (registers/memory/etc) of
the process to a directory consisting of an index
file with register and segment information and
sub-files containing all actual process memory.
The output of this script is expected to be used
to initialize context for Unicorn emulation.
-----------
Call this function when at a breakpoint in your process with:
command script import -r unicorn_dumper_lldb
If there is trouble with "split on a NoneType", issue the following command:
script lldb.target.triple
and try to import the script again.
-----------
"""
from copy import deepcopy
import datetime
import hashlib
import json
import os
import sys
import time
import zlib
# LLDB Python SDK
import lldb
# Maximum segment size that we'll store
# Yep, this could break stuff pretty quickly if we
# omit something that's used during emulation.
MAX_SEG_SIZE = 128 * 1024 * 1024
# Name of the index file
INDEX_FILE_NAME = "_index.json"
DEBUG_MEM_FILE_NAME = "_memory.json"
# Page size required by Unicorn
UNICORN_PAGE_SIZE = 0x1000
# Alignment functions to align all memory segments to Unicorn page boundaries (4KB pages only)
ALIGN_PAGE_DOWN = lambda x: x & ~(UNICORN_PAGE_SIZE - 1)
ALIGN_PAGE_UP = lambda x: (x + UNICORN_PAGE_SIZE - 1) & ~(UNICORN_PAGE_SIZE-1)
#----------------------
#---- Helper Functions
def overlap_alignments(segments, memory):
final_list = []
curr_seg_idx = 0
curr_end_addr = 0
curr_node = None
current_segment = None
sorted_segments = sorted(segments, key=lambda k: (k['start'], k['end']))
if curr_seg_idx < len(sorted_segments):
current_segment = sorted_segments[curr_seg_idx]
for mem in sorted(memory, key=lambda k: (k['start'], -k['end'])):
if curr_node is None:
if current_segment is not None and current_segment['start'] == mem['start']:
curr_node = deepcopy(current_segment)
curr_node['permissions'] = mem['permissions']
else:
curr_node = deepcopy(mem)
curr_end_addr = curr_node['end']
while curr_end_addr <= mem['end']:
if curr_node['end'] == mem['end']:
if current_segment is not None and current_segment['start'] > curr_node['start'] and current_segment['start'] < curr_node['end']:
curr_node['end'] = current_segment['start']
if(curr_node['end'] > curr_node['start']):
final_list.append(curr_node)
curr_node = deepcopy(current_segment)
curr_node['permissions'] = mem['permissions']
curr_end_addr = curr_node['end']
else:
if(curr_node['end'] > curr_node['start']):
final_list.append(curr_node)
# if curr_node is a segment
if current_segment is not None and current_segment['end'] == mem['end']:
curr_seg_idx += 1
if curr_seg_idx < len(sorted_segments):
current_segment = sorted_segments[curr_seg_idx]
else:
current_segment = None
curr_node = None
break
# could only be a segment
else:
if curr_node['end'] < mem['end']:
# check for remaining segments and valid segments
if(curr_node['end'] > curr_node['start']):
final_list.append(curr_node)
curr_seg_idx += 1
if curr_seg_idx < len(sorted_segments):
current_segment = sorted_segments[curr_seg_idx]
else:
current_segment = None
if current_segment is not None and current_segment['start'] <= curr_end_addr and current_segment['start'] < mem['end']:
curr_node = deepcopy(current_segment)
curr_node['permissions'] = mem['permissions']
else:
# no more segments
curr_node = deepcopy(mem)
curr_node['start'] = curr_end_addr
curr_end_addr = curr_node['end']
return final_list
# https://github.com/llvm-mirror/llvm/blob/master/include/llvm/ADT/Triple.h
def get_arch():
arch, arch_vendor, arch_os = lldb.target.GetTriple().split('-')
if arch == 'x86_64':
return "x64"
elif arch == 'x86' or arch == 'i386':
return "x86"
elif arch == 'aarch64' or arch == 'arm64':
return "arm64le"
elif arch == 'aarch64_be':
return "arm64be"
elif arch == 'armeb':
return "armbe"
elif arch == 'arm':
return "armle"
else:
return ""
#-----------------------
#---- Dumping functions
def dump_arch_info():
arch_info = {}
arch_info["arch"] = get_arch()
return arch_info
def dump_regs():
reg_state = {}
for reg_list in lldb.frame.GetRegisters():
if 'general purpose registers' in reg_list.GetName().lower():
for reg in reg_list:
reg_state[reg.GetName()] = int(reg.GetValue(), 16)
return reg_state
def get_section_info(sec):
name = sec.name if sec.name is not None else ''
if sec.GetParent().name is not None:
name = sec.GetParent().name + '.' + sec.name
module_name = sec.addr.module.file.GetFilename()
module_name = module_name if module_name is not None else ''
long_name = module_name + '.' + name
return sec.addr.load_addr, (sec.addr.load_addr + sec.size), sec.size, long_name
def dump_process_memory(output_dir):
# Segment information dictionary
raw_segment_list = []
raw_memory_list = []
# 1st pass:
# Loop over the segments, fill in the segment info dictionary
for module in lldb.target.module_iter():
for seg_ea in module.section_iter():
seg_info = {'module': module.file.GetFilename() }
seg_info['start'], seg_info['end'], seg_size, seg_info['name'] = get_section_info(seg_ea)
# TODO: Ugly hack for -1 LONG address on 32-bit
if seg_info['start'] >= sys.maxint or seg_size <= 0:
print("Throwing away page: {}".format(seg_info['name']))
continue
# Page-align segment
seg_info['start'] = ALIGN_PAGE_DOWN(seg_info['start'])
seg_info['end'] = ALIGN_PAGE_UP(seg_info['end'])
print("Appending: {}".format(seg_info['name']))
raw_segment_list.append(seg_info)
# Add the stack memory region (just hardcode 0x1000 around the current SP)
sp = lldb.frame.GetSP()
start_sp = ALIGN_PAGE_DOWN(sp)
raw_segment_list.append({'start': start_sp, 'end': start_sp + 0x1000, 'name': 'STACK'})
# Write the original memory to file for debugging
index_file = open(os.path.join(output_dir, DEBUG_MEM_FILE_NAME), 'w')
index_file.write(json.dumps(raw_segment_list, indent=4))
index_file.close()
# Loop over raw memory regions
mem_info = lldb.SBMemoryRegionInfo()
start_addr = -1
next_region_addr = 0
while next_region_addr > start_addr:
err = lldb.process.GetMemoryRegionInfo(next_region_addr, mem_info)
# TODO: Should check err.success. If False, what do we do?
if not err.success:
break
next_region_addr = mem_info.GetRegionEnd()
if next_region_addr >= sys.maxsize:
break
start_addr = mem_info.GetRegionBase()
end_addr = mem_info.GetRegionEnd()
# Unknown region name
region_name = 'UNKNOWN'
# Ignore regions that aren't even mapped
if mem_info.IsMapped() and mem_info.IsReadable():
mem_info_obj = {'start': start_addr, 'end': end_addr, 'name': region_name, 'permissions': {
"r": mem_info.IsReadable(),
"w": mem_info.IsWritable(),
"x": mem_info.IsExecutable()
}}
raw_memory_list.append(mem_info_obj)
final_segment_list = overlap_alignments(raw_segment_list, raw_memory_list)
for seg_info in final_segment_list:
try:
seg_info['content_file'] = ''
start_addr = seg_info['start']
end_addr = seg_info['end']
region_name = seg_info['name']
# Compress and dump the content to a file
err = lldb.SBError()
seg_content = lldb.process.ReadMemory(start_addr, end_addr - start_addr, err)
if seg_content is None:
print("Segment empty: @0x{0:016x} (size:UNKNOWN) {1}".format(start_addr, region_name))
seg_info['content_file'] = ''
else:
print("Dumping segment @0x{0:016x} (size:0x{1:x}): {2} [{3}]".format(start_addr, len(seg_content), region_name, repr(seg_info['permissions'])))
compressed_seg_content = zlib.compress(seg_content)
md5_sum = hashlib.md5(compressed_seg_content).hexdigest() + ".bin"
seg_info['content_file'] = md5_sum
# Write the compressed contents to disk
out_file = open(os.path.join(output_dir, md5_sum), 'wb')
out_file.write(compressed_seg_content)
out_file.close()
except:
print("Exception reading segment ({}): {}".format(region_name, sys.exc_info()[0]))
return final_segment_list
#----------
#---- Main
def main():
try:
print("----- Unicorn Context Dumper -----")
print("You must be actively debugging before running this!")
print("If it fails, double check that you are actively debugging before running.")
# Create the output directory
timestamp = datetime.datetime.fromtimestamp(time.time()).strftime('%Y%m%d_%H%M%S')
output_path = "UnicornContext_" + timestamp
if not os.path.exists(output_path):
os.makedirs(output_path)
print("Process context will be output to {}".format(output_path))
# Get the context
context = {
"arch": dump_arch_info(),
"regs": dump_regs(),
"segments": dump_process_memory(output_path),
}
# Write the index file
index_file = open(os.path.join(output_path, INDEX_FILE_NAME), 'w')
index_file.write(json.dumps(context, indent=4))
index_file.close()
print("Done.")
except Exception, e:
print("!!! ERROR:\n\t{}".format(repr(e)))
if __name__ == "__main__":
main()
elif lldb.debugger:
main()
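The `ALIGN_PAGE_DOWN`/`ALIGN_PAGE_UP` lambdas used above round addresses to Unicorn's 4 KB page granularity; every mapped region must start and end on a page boundary. Their behavior in isolation:

```python
UNICORN_PAGE_SIZE = 0x1000

ALIGN_PAGE_DOWN = lambda x: x & ~(UNICORN_PAGE_SIZE - 1)
ALIGN_PAGE_UP = lambda x: (x + UNICORN_PAGE_SIZE - 1) & ~(UNICORN_PAGE_SIZE - 1)

assert ALIGN_PAGE_DOWN(0x1234) == 0x1000  # rounds down to the page base
assert ALIGN_PAGE_UP(0x1234) == 0x2000    # rounds up to the next boundary
assert ALIGN_PAGE_UP(0x2000) == 0x2000    # already aligned: unchanged
```

Masking with `~(PAGE_SIZE - 1)` only works because the page size is a power of two.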


@ -0,0 +1,224 @@
"""
unicorn_dumper_pwndbg.py
When run with GDB sitting at a debug breakpoint, this
dumps the current state (registers/memory/etc) of
the process to a directory consisting of an index
file with register and segment information and
sub-files containing all actual process memory.
The output of this script is expected to be used
to initialize context for Unicorn emulation.
-----------
In order to run this script, PWNDBG needs to be running in the GDB session (gdbinit.py)
# HELPERS from: https://github.com/pwndbg/pwndbg
It can be loaded with:
source <path_to_pwndbg>/gdbinit.py
Call this function when at a breakpoint in your process with:
source unicorn_dumper_pwndbg.py
-----------
"""
import datetime
import hashlib
import json
import os
import sys
import time
import zlib
# GDB Python SDK
import gdb
pwndbg_loaded = False
try:
import pwndbg.arch
import pwndbg.regs
import pwndbg.vmmap
import pwndbg.memory
pwndbg_loaded = True
except ImportError:
print("!!! PWNDBG not running in GDB. Please run gdbinit.py by executing:")
print('\tpython execfile ("<path_to_pwndbg>/gdbinit.py")')
# Maximum segment size that we'll store
# Yep, this could break stuff pretty quickly if we
# omit something that's used during emulation.
MAX_SEG_SIZE = 128 * 1024 * 1024
# Name of the index file
INDEX_FILE_NAME = "_index.json"
#----------------------
#---- Helper Functions
def map_arch():
arch = pwndbg.arch.current # from PWNDBG
if 'x86_64' in arch or 'x86-64' in arch:
return "x64"
elif 'x86' in arch or 'i386' in arch:
return "x86"
elif 'aarch64' in arch or 'arm64' in arch:
return "arm64le"
elif 'aarch64_be' in arch:
return "arm64be"
elif 'arm' in arch:
cpsr = pwndbg.regs['cpsr']
# check endianess
if pwndbg.arch.endian == 'big':
# check for THUMB mode
if (cpsr & (1 << 5)):
return "armbethumb"
else:
return "armbe"
else:
# check for THUMB mode
if (cpsr & (1 << 5)):
return "armlethumb"
else:
return "armle"
elif 'mips' in arch:
if pwndbg.arch.endian == 'little':
return 'mipsel'
else:
return 'mips'
else:
return ""
#-----------------------
#---- Dumping functions
def dump_arch_info():
arch_info = {}
arch_info["arch"] = map_arch()
return arch_info
def dump_regs():
reg_state = {}
for reg in pwndbg.regs.all:
reg_val = pwndbg.regs[reg]
# current dumper script looks for register values to be hex strings
# reg_str = "0x{:08x}".format(reg_val)
# if "64" in get_arch():
# reg_str = "0x{:016x}".format(reg_val)
# reg_state[reg.strip().strip('$')] = reg_str
reg_state[reg.strip().strip('$')] = reg_val
return reg_state
def dump_process_memory(output_dir):
# Segment information dictionary
final_segment_list = []
# PWNDBG:
vmmap = pwndbg.vmmap.get()
# Pointer to end of last dumped memory segment
segment_last_addr = 0x0
start = None
end = None
if not vmmap:
print("No address mapping information found")
return final_segment_list
# Assume segment entries are sorted by start address
for entry in vmmap:
if entry.start == entry.end:
continue
start = entry.start
end = entry.end
if (segment_last_addr > entry.start): # indicates overlap
if (segment_last_addr > entry.end): # indicates complete overlap, so we skip the segment entirely
continue
else:
start = segment_last_addr
seg_info = {'start': start, 'end': end, 'name': entry.objfile, 'permissions': {
"r": entry.read,
"w": entry.write,
"x": entry.execute
}, 'content_file': ''}
# "(deleted)" may or may not be valid, but don't push it.
if entry.read and not '(deleted)' in entry.objfile:
try:
# Compress and dump the content to a file
seg_content = pwndbg.memory.read(start, end - start)
if seg_content is None:
print("Segment empty: @0x{0:016x} (size:UNKNOWN) {1}".format(entry.start, entry.objfile))
else:
print("Dumping segment @0x{0:016x} (size:0x{1:x}): {2} [{3}]".format(entry.start, len(seg_content), entry.objfile, repr(seg_info['permissions'])))
compressed_seg_content = zlib.compress(seg_content)
md5_sum = hashlib.md5(compressed_seg_content).hexdigest() + ".bin"
seg_info["content_file"] = md5_sum
# Write the compressed contents to disk
out_file = open(os.path.join(output_dir, md5_sum), 'wb')
out_file.write(compressed_seg_content)
out_file.close()
except:
print("Exception reading segment ({}): {}".format(entry.objfile, sys.exc_info()[0]))
else:
print("Skipping segment {0}@0x{1:016x}".format(entry.objfile, entry.start))
segment_last_addr = end
# Add the segment to the list
final_segment_list.append(seg_info)
return final_segment_list
#----------
#---- Main
def main():
print("----- Unicorn Context Dumper -----")
print("You must be actively debugging before running this!")
print("If it fails, double check that you are actively debugging before running.")
try:
# Create the output directory
timestamp = datetime.datetime.fromtimestamp(time.time()).strftime('%Y%m%d_%H%M%S')
output_path = "UnicornContext_" + timestamp
if not os.path.exists(output_path):
os.makedirs(output_path)
print("Process context will be output to {}".format(output_path))
# Get the context
context = {
"arch": dump_arch_info(),
"regs": dump_regs(),
"segments": dump_process_memory(output_path),
}
# Write the index file
index_file = open(os.path.join(output_path, INDEX_FILE_NAME), 'w')
index_file.write(json.dumps(context, indent=4))
index_file.close()
print("Done.")
except Exception as e:
print("!!! ERROR:\n\t{}".format(repr(e)))
if __name__ == "__main__" and pwndbg_loaded:
main()
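The `segment_last_addr` bookkeeping above trims or skips vmmap entries that overlap what was already dumped, assuming entries are sorted by start address. The core of that logic, isolated on plain `(start, end)` tuples:

```python
def trim_overlaps(entries):
    """Given (start, end) ranges sorted by start, drop fully covered ranges
    and clip partially covered ones, mirroring the dumper's bookkeeping."""
    last_addr = 0
    result = []
    for start, end in entries:
        if start == end:
            continue
        if last_addr > start:        # overlap with an earlier range
            if last_addr > end:      # completely inside it: skip entirely
                continue
            start = last_addr        # partial overlap: clip the front
        result.append((start, end))
        last_addr = end
    return result

assert trim_overlaps([(0, 10), (5, 8), (5, 20)]) == [(0, 10), (10, 20)]
```

The result is a set of non-overlapping regions, which is what Unicorn's `mem_map` ultimately requires.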


@ -0,0 +1,560 @@
"""
unicorn_loader.py
Loads a process context dump created using a
Unicorn Context Dumper script into a Unicorn Engine
instance. Once this is performed emulation can be
started.
"""
import argparse
import binascii
from collections import namedtuple
import datetime
import hashlib
import json
import os
import signal
import struct
import time
import zlib
# Unicorn imports
from unicorn import *
from unicorn.arm_const import *
from unicorn.arm64_const import *
from unicorn.x86_const import *
from unicorn.mips_const import *
# Name of the index file
INDEX_FILE_NAME = "_index.json"
# Page size required by Unicorn
UNICORN_PAGE_SIZE = 0x1000
# Max allowable segment size (1G)
MAX_ALLOWABLE_SEG_SIZE = 1024 * 1024 * 1024
# Alignment functions to align all memory segments to Unicorn page boundaries (4KB pages only)
ALIGN_PAGE_DOWN = lambda x: x & ~(UNICORN_PAGE_SIZE - 1)
ALIGN_PAGE_UP = lambda x: (x + UNICORN_PAGE_SIZE - 1) & ~(UNICORN_PAGE_SIZE-1)
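The two alignment lambdas round an address down or up to the 4 KB page granularity Unicorn requires. A minimal, self-contained sketch of their behavior (the constants are copied from the loader above; the sample addresses are arbitrary):

```python
UNICORN_PAGE_SIZE = 0x1000  # 4 KB pages, as required by Unicorn

ALIGN_PAGE_DOWN = lambda x: x & ~(UNICORN_PAGE_SIZE - 1)
ALIGN_PAGE_UP = lambda x: (x + UNICORN_PAGE_SIZE - 1) & ~(UNICORN_PAGE_SIZE - 1)

# Rounding an unaligned address:
assert ALIGN_PAGE_DOWN(0x1234) == 0x1000
assert ALIGN_PAGE_UP(0x1234) == 0x2000
# An already page-aligned address is left untouched by ALIGN_PAGE_UP:
assert ALIGN_PAGE_UP(0x2000) == 0x2000
```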
#---------------------------------------
#---- Unicorn-based heap implementation
class UnicornSimpleHeap(object):
""" Use this class to provide a simple heap implementation. This should
be used if malloc/free calls break things during emulation. This heap also
implements basic guard-page capabilities which give immediate notice of
heap overflows and underflows.
"""
# Helper data-container used to track chunks
class HeapChunk(object):
def __init__(self, actual_addr, total_size, data_size):
self.total_size = total_size # Total size of the chunk (including padding and guard page)
self.actual_addr = actual_addr # Actual start address of the chunk
self.data_size = data_size # Size requested by the caller of actual malloc call
self.data_addr = actual_addr + UNICORN_PAGE_SIZE # Address where data actually starts
# Returns true if the specified buffer is completely within the chunk, else false
def is_buffer_in_chunk(self, addr, size):
if addr >= self.data_addr and ((addr + size) <= (self.data_addr + self.data_size)):
return True
else:
return False
# Skip the zero-page to avoid weird potential issues with segment registers
HEAP_MIN_ADDR = 0x00002000
HEAP_MAX_ADDR = 0xFFFFFFFF
_uc = None # Unicorn engine instance to interact with
_chunks = [] # List of all known chunks
_debug_print = False # True to print debug information
def __init__(self, uc, debug_print=False):
self._uc = uc
self._debug_print = debug_print
# Add the watchpoint hook that will be used to implement pseudo-guard-page support
self._uc.hook_add(UC_HOOK_MEM_WRITE | UC_HOOK_MEM_READ, self.__check_mem_access)
def malloc(self, size):
# Figure out the overall size to be allocated/mapped
# - Allocate at least 1 4k page of memory to make Unicorn happy
# - Add guard pages at the start and end of the region
total_chunk_size = UNICORN_PAGE_SIZE + ALIGN_PAGE_UP(size) + UNICORN_PAGE_SIZE
# Gross but efficient way to find space for the chunk:
chunk = None
for addr in xrange(self.HEAP_MIN_ADDR, self.HEAP_MAX_ADDR, UNICORN_PAGE_SIZE):
try:
self._uc.mem_map(addr, total_chunk_size, UC_PROT_READ | UC_PROT_WRITE)
chunk = self.HeapChunk(addr, total_chunk_size, size)
if self._debug_print:
print("Allocating 0x{0:x}-byte chunk @ 0x{1:016x}".format(chunk.data_size, chunk.data_addr))
break
except UcError as e:
continue
# Something went very wrong
if chunk is None:
return 0
self._chunks.append(chunk)
return chunk.data_addr
def calloc(self, size, count):
# Simple wrapper around malloc with calloc() args
return self.malloc(size*count)
def realloc(self, ptr, new_size):
# Wrapper around malloc(new_size) / memcpy(new, old, old_size) / free(old)
if self._debug_print:
print("Reallocating chunk @ 0x{0:016x} to be 0x{1:x} bytes".format(ptr, new_size))
old_chunk = None
for chunk in self._chunks:
if chunk.data_addr == ptr:
old_chunk = chunk
new_chunk_addr = self.malloc(new_size)
if old_chunk is not None:
self._uc.mem_write(new_chunk_addr, str(self._uc.mem_read(old_chunk.data_addr, old_chunk.data_size)))
self.free(old_chunk.data_addr)
return new_chunk_addr
def free(self, addr):
for chunk in self._chunks:
if chunk.is_buffer_in_chunk(addr, 1):
if self._debug_print:
print("Freeing 0x{0:x}-byte chunk @ 0x{1:016x}".format(chunk.data_size, chunk.data_addr))
self._uc.mem_unmap(chunk.actual_addr, chunk.total_size)
self._chunks.remove(chunk)
return True
return False
# Implements basic guard-page functionality
def __check_mem_access(self, uc, access, address, size, value, user_data):
for chunk in self._chunks:
if address >= chunk.actual_addr and ((address + size) <= (chunk.actual_addr + chunk.total_size)):
if not chunk.is_buffer_in_chunk(address, size):
if self._debug_print:
print("Heap over/underflow attempting to {0} 0x{1:x} bytes @ {2:016x}".format( \
"write" if access == UC_MEM_WRITE else "read", size, address))
# Force a memory-based crash
uc.force_crash(UcError(UC_ERR_READ_PROT))
#---------------------------
#---- Loading function
class AflUnicornEngine(Uc):
def __init__(self, context_directory, enable_trace=False, debug_print=False):
"""
Initializes an AflUnicornEngine instance, which extends the standard Unicorn engine
with a number of helper routines that are useful for creating afl-unicorn test harnesses.
Parameters:
- context_directory: Path to the directory generated by one of the context dumper scripts
- enable_trace: If True, trace information will be printed to STDOUT
- debug_print: If True, debugging information will be printed while loading the context
"""
# Make sure the index file exists and load it
index_file_path = os.path.join(context_directory, INDEX_FILE_NAME)
if not os.path.isfile(index_file_path):
raise Exception("Index file not found. Expected it to be at {}".format(index_file_path))
# Load the process context from the index file
if debug_print:
print("Loading process context index from {}".format(index_file_path))
index_file = open(index_file_path, 'r')
context = json.load(index_file)
index_file.close()
# Check the context to make sure we have the basic essential components
if 'arch' not in context:
raise Exception("Couldn't find architecture information in index file")
if 'regs' not in context:
raise Exception("Couldn't find register information in index file")
if 'segments' not in context:
raise Exception("Couldn't find segment/memory information in index file")
# Set the UnicornEngine instance's architecture and mode
self._arch_str = context['arch']['arch']
arch, mode = self.__get_arch_and_mode(self._arch_str)
Uc.__init__(self, arch, mode)
# Load the registers
regs = context['regs']
reg_map = self.__get_register_map(self._arch_str)
for register, value in regs.iteritems():
if debug_print:
print("Reg {0} = {1}".format(register, value))
if not reg_map.has_key(register.lower()):
if debug_print:
print("Skipping Reg: {}".format(register))
else:
reg_write_retry = True
try:
self.reg_write(reg_map[register.lower()], value)
reg_write_retry = False
except Exception as e:
if debug_print:
print("ERROR writing register: {}, value: {} -- {}".format(register, value, repr(e)))
if reg_write_retry:
if debug_print:
print("Trying to parse value ({}) as hex string".format(value))
try:
self.reg_write(reg_map[register.lower()], int(value, 16))
except Exception as e:
if debug_print:
print("ERROR writing hex string register: {}, value: {} -- {}".format(register, value, repr(e)))
# Setup the memory map and load memory content
self.__map_segments(context['segments'], context_directory, debug_print)
if enable_trace:
self.hook_add(UC_HOOK_BLOCK, self.__trace_block)
self.hook_add(UC_HOOK_CODE, self.__trace_instruction)
self.hook_add(UC_HOOK_MEM_WRITE | UC_HOOK_MEM_READ, self.__trace_mem_access)
self.hook_add(UC_HOOK_MEM_WRITE_UNMAPPED | UC_HOOK_MEM_READ_INVALID, self.__trace_mem_invalid_access)
if debug_print:
print("Done loading context.")
def get_arch(self):
return self._arch
def get_mode(self):
return self._mode
def get_arch_str(self):
return self._arch_str
def force_crash(self, uc_error):
""" This function should be called to indicate to AFL that a crash occurred during emulation.
You can pass the exception received from Uc.emu_start
"""
mem_errors = [
UC_ERR_READ_UNMAPPED, UC_ERR_READ_PROT, UC_ERR_READ_UNALIGNED,
UC_ERR_WRITE_UNMAPPED, UC_ERR_WRITE_PROT, UC_ERR_WRITE_UNALIGNED,
UC_ERR_FETCH_UNMAPPED, UC_ERR_FETCH_PROT, UC_ERR_FETCH_UNALIGNED,
]
if uc_error.errno in mem_errors:
# Memory error - throw SIGSEGV
os.kill(os.getpid(), signal.SIGSEGV)
elif uc_error.errno == UC_ERR_INSN_INVALID:
# Invalid instruction - throw SIGILL
os.kill(os.getpid(), signal.SIGILL)
else:
# Not sure what happened - throw SIGABRT
os.kill(os.getpid(), signal.SIGABRT)
def dump_regs(self):
""" Dumps the contents of all the registers to STDOUT """
for reg in sorted(self.__get_register_map(self._arch_str).items(), key=lambda reg: reg[0]):
print(">>> {0:>4}: 0x{1:016x}".format(reg[0], self.reg_read(reg[1])))
# TODO: Make this dynamically get the stack pointer register and pointer width for the current architecture
"""
def dump_stack(self, window=10):
print(">>> Stack:")
stack_ptr_addr = self.reg_read(UC_X86_REG_RSP)
for i in xrange(-window, window + 1):
addr = stack_ptr_addr + (i*8)
print("{0}0x{1:016x}: 0x{2:016x}".format( \
'SP->' if i == 0 else ' ', addr, \
struct.unpack('<Q', self.mem_read(addr, 8))[0]))
"""
#-----------------------------
#---- Loader Helper Functions
def __map_segment(self, name, address, size, perms, debug_print=False):
# - size is unsigned and must be != 0
# - starting address must be aligned to 4KB
# - map size must be multiple of the page size (4KB)
mem_start = address
mem_end = address + size
mem_start_aligned = ALIGN_PAGE_DOWN(mem_start)
mem_end_aligned = ALIGN_PAGE_UP(mem_end)
if debug_print:
if mem_start_aligned != mem_start or mem_end_aligned != mem_end:
print("Aligning segment to page boundary:")
print(" name: {}".format(name))
print(" start: {0:016x} -> {1:016x}".format(mem_start, mem_start_aligned))
print(" end: {0:016x} -> {1:016x}".format(mem_end, mem_end_aligned))
print("Mapping segment from {0:016x} - {1:016x} with perm={2}: {3}".format(mem_start_aligned, mem_end_aligned, perms, name))
if mem_start_aligned < mem_end_aligned:
self.mem_map(mem_start_aligned, mem_end_aligned - mem_start_aligned, perms)
def __map_segments(self, segment_list, context_directory, debug_print=False):
for segment in segment_list:
# Get the segment information from the index
name = segment['name']
seg_start = segment['start']
seg_end = segment['end']
perms = \
(UC_PROT_READ if segment['permissions']['r'] == True else 0) | \
(UC_PROT_WRITE if segment['permissions']['w'] == True else 0) | \
(UC_PROT_EXEC if segment['permissions']['x'] == True else 0)
if debug_print:
print("Handling segment {}".format(name))
# Check for any overlap with existing segments. If there is, it must
# be consolidated and merged together before mapping since Unicorn
# doesn't allow overlapping segments.
found = False
overlap_start = False
overlap_end = False
tmp = 0
for (mem_start, mem_end, mem_perm) in self.mem_regions():
mem_end = mem_end + 1
if seg_start >= mem_start and seg_end < mem_end:
found = True
break
if seg_start >= mem_start and seg_start < mem_end:
overlap_start = True
tmp = mem_end
break
if seg_end >= mem_start and seg_end < mem_end:
overlap_end = True
tmp = mem_start
break
# Map memory into the address space if it is of an acceptable size.
if (seg_end - seg_start) > MAX_ALLOWABLE_SEG_SIZE:
if debug_print:
print("Skipping segment (LARGER THAN {0}) from {1:016x} - {2:016x} with perm={3}: {4}".format(MAX_ALLOWABLE_SEG_SIZE, seg_start, seg_end, perms, name))
continue
elif not found: # Make sure it's not already mapped
if overlap_start: # Partial overlap (start)
self.__map_segment(name, tmp, seg_end - tmp, perms, debug_print)
elif overlap_end: # Partial overlap (end)
self.__map_segment(name, seg_start, tmp - seg_start, perms, debug_print)
else: # Not found
self.__map_segment(name, seg_start, seg_end - seg_start, perms, debug_print)
else:
if debug_print:
print("Segment {} already mapped. Moving on.".format(name))
# Load the content (if available)
if 'content_file' in segment and len(segment['content_file']) > 0:
content_file_path = os.path.join(context_directory, segment['content_file'])
if not os.path.isfile(content_file_path):
raise Exception("Unable to find segment content file. Expected it to be at {}".format(content_file_path))
#if debug_print:
# print("Loading content for segment {} from {}".format(name, segment['content_file']))
content_file = open(content_file_path, 'rb')
compressed_content = content_file.read()
content_file.close()
self.mem_write(seg_start, zlib.decompress(compressed_content))
else:
if debug_print:
print("No content found for segment {0} @ {1:016x}".format(name, seg_start))
self.mem_write(seg_start, '\x00' * (seg_end - seg_start))
def __get_arch_and_mode(self, arch_str):
arch_map = {
"x64" : [ UC_X86_REG_RIP, UC_ARCH_X86, UC_MODE_64 ],
"x86" : [ UC_X86_REG_EIP, UC_ARCH_X86, UC_MODE_32 ],
"arm64be" : [ UC_ARM64_REG_PC, UC_ARCH_ARM64, UC_MODE_ARM | UC_MODE_BIG_ENDIAN ],
"arm64le" : [ UC_ARM64_REG_PC, UC_ARCH_ARM64, UC_MODE_ARM | UC_MODE_LITTLE_ENDIAN ],
"armbe" : [ UC_ARM_REG_PC, UC_ARCH_ARM, UC_MODE_ARM | UC_MODE_BIG_ENDIAN ],
"armle" : [ UC_ARM_REG_PC, UC_ARCH_ARM, UC_MODE_ARM | UC_MODE_LITTLE_ENDIAN ],
"armbethumb": [ UC_ARM_REG_PC, UC_ARCH_ARM, UC_MODE_THUMB | UC_MODE_BIG_ENDIAN ],
"armlethumb": [ UC_ARM_REG_PC, UC_ARCH_ARM, UC_MODE_THUMB | UC_MODE_LITTLE_ENDIAN ],
"mips" : [ UC_MIPS_REG_PC, UC_ARCH_MIPS, UC_MODE_MIPS32 | UC_MODE_BIG_ENDIAN ],
"mipsel" : [ UC_MIPS_REG_PC, UC_ARCH_MIPS, UC_MODE_MIPS32 | UC_MODE_LITTLE_ENDIAN ],
}
return (arch_map[arch_str][1], arch_map[arch_str][2])
def __get_register_map(self, arch):
if arch == "arm64le" or arch == "arm64be":
arch = "arm64"
elif arch == "armle" or arch == "armbe" or "thumb" in arch:
arch = "arm"
elif arch == "mipsel":
arch = "mips"
registers = {
"x64" : {
"rax": UC_X86_REG_RAX,
"rbx": UC_X86_REG_RBX,
"rcx": UC_X86_REG_RCX,
"rdx": UC_X86_REG_RDX,
"rsi": UC_X86_REG_RSI,
"rdi": UC_X86_REG_RDI,
"rbp": UC_X86_REG_RBP,
"rsp": UC_X86_REG_RSP,
"r8": UC_X86_REG_R8,
"r9": UC_X86_REG_R9,
"r10": UC_X86_REG_R10,
"r11": UC_X86_REG_R11,
"r12": UC_X86_REG_R12,
"r13": UC_X86_REG_R13,
"r14": UC_X86_REG_R14,
"r15": UC_X86_REG_R15,
"rip": UC_X86_REG_RIP,
"efl": UC_X86_REG_EFLAGS,
"cs": UC_X86_REG_CS,
"ds": UC_X86_REG_DS,
"es": UC_X86_REG_ES,
"fs": UC_X86_REG_FS,
"gs": UC_X86_REG_GS,
"ss": UC_X86_REG_SS,
},
"x86" : {
"eax": UC_X86_REG_EAX,
"ebx": UC_X86_REG_EBX,
"ecx": UC_X86_REG_ECX,
"edx": UC_X86_REG_EDX,
"esi": UC_X86_REG_ESI,
"edi": UC_X86_REG_EDI,
"ebp": UC_X86_REG_EBP,
"esp": UC_X86_REG_ESP,
"eip": UC_X86_REG_EIP,
"efl": UC_X86_REG_EFLAGS,
# Segment registers removed...
# They caused segfaults (from unicorn?) when they were here
},
"arm" : {
"r0": UC_ARM_REG_R0,
"r1": UC_ARM_REG_R1,
"r2": UC_ARM_REG_R2,
"r3": UC_ARM_REG_R3,
"r4": UC_ARM_REG_R4,
"r5": UC_ARM_REG_R5,
"r6": UC_ARM_REG_R6,
"r7": UC_ARM_REG_R7,
"r8": UC_ARM_REG_R8,
"r9": UC_ARM_REG_R9,
"r10": UC_ARM_REG_R10,
"r11": UC_ARM_REG_R11,
"r12": UC_ARM_REG_R12,
"pc": UC_ARM_REG_PC,
"sp": UC_ARM_REG_SP,
"lr": UC_ARM_REG_LR,
"cpsr": UC_ARM_REG_CPSR
},
"arm64" : {
"x0": UC_ARM64_REG_X0,
"x1": UC_ARM64_REG_X1,
"x2": UC_ARM64_REG_X2,
"x3": UC_ARM64_REG_X3,
"x4": UC_ARM64_REG_X4,
"x5": UC_ARM64_REG_X5,
"x6": UC_ARM64_REG_X6,
"x7": UC_ARM64_REG_X7,
"x8": UC_ARM64_REG_X8,
"x9": UC_ARM64_REG_X9,
"x10": UC_ARM64_REG_X10,
"x11": UC_ARM64_REG_X11,
"x12": UC_ARM64_REG_X12,
"x13": UC_ARM64_REG_X13,
"x14": UC_ARM64_REG_X14,
"x15": UC_ARM64_REG_X15,
"x16": UC_ARM64_REG_X16,
"x17": UC_ARM64_REG_X17,
"x18": UC_ARM64_REG_X18,
"x19": UC_ARM64_REG_X19,
"x20": UC_ARM64_REG_X20,
"x21": UC_ARM64_REG_X21,
"x22": UC_ARM64_REG_X22,
"x23": UC_ARM64_REG_X23,
"x24": UC_ARM64_REG_X24,
"x25": UC_ARM64_REG_X25,
"x26": UC_ARM64_REG_X26,
"x27": UC_ARM64_REG_X27,
"x28": UC_ARM64_REG_X28,
"pc": UC_ARM64_REG_PC,
"sp": UC_ARM64_REG_SP,
"fp": UC_ARM64_REG_FP,
"lr": UC_ARM64_REG_LR,
"nzcv": UC_ARM64_REG_NZCV,
"cpsr": UC_ARM_REG_CPSR,
},
"mips" : {
"0" : UC_MIPS_REG_ZERO,
"at": UC_MIPS_REG_AT,
"v0": UC_MIPS_REG_V0,
"v1": UC_MIPS_REG_V1,
"a0": UC_MIPS_REG_A0,
"a1": UC_MIPS_REG_A1,
"a2": UC_MIPS_REG_A2,
"a3": UC_MIPS_REG_A3,
"t0": UC_MIPS_REG_T0,
"t1": UC_MIPS_REG_T1,
"t2": UC_MIPS_REG_T2,
"t3": UC_MIPS_REG_T3,
"t4": UC_MIPS_REG_T4,
"t5": UC_MIPS_REG_T5,
"t6": UC_MIPS_REG_T6,
"t7": UC_MIPS_REG_T7,
"t8": UC_MIPS_REG_T8,
"t9": UC_MIPS_REG_T9,
"s0": UC_MIPS_REG_S0,
"s1": UC_MIPS_REG_S1,
"s2": UC_MIPS_REG_S2,
"s3": UC_MIPS_REG_S3,
"s4": UC_MIPS_REG_S4,
"s5": UC_MIPS_REG_S5,
"s6": UC_MIPS_REG_S6,
"s7": UC_MIPS_REG_S7,
"s8": UC_MIPS_REG_S8,
"k0": UC_MIPS_REG_K0,
"k1": UC_MIPS_REG_K1,
"gp": UC_MIPS_REG_GP,
"pc": UC_MIPS_REG_PC,
"sp": UC_MIPS_REG_SP,
"fp": UC_MIPS_REG_FP,
"ra": UC_MIPS_REG_RA,
"hi": UC_MIPS_REG_HI,
"lo": UC_MIPS_REG_LO
}
}
return registers[arch]
#---------------------------
# Callbacks for tracing
# TODO: Make integer-printing fixed widths dependent on bitness of architecture
# (i.e. only show 4 bytes for 32-bit, 8 bytes for 64-bit)
# TODO: Figure out how best to determine the capstone mode and architecture here
"""
try:
# If Capstone is installed then we'll dump disassembly, otherwise just dump the binary.
from capstone import *
cs = Cs(CS_ARCH_MIPS, CS_MODE_MIPS32 + CS_MODE_BIG_ENDIAN)
def __trace_instruction(self, uc, address, size, user_data):
mem = uc.mem_read(address, size)
for (cs_address, cs_size, cs_mnemonic, cs_opstr) in cs.disasm_lite(bytes(mem), size):
print(" Instr: {:#016x}:\t{}\t{}".format(address, cs_mnemonic, cs_opstr))
except ImportError:
def __trace_instruction(self, uc, address, size, user_data):
print(" Instr: addr=0x{0:016x}, size=0x{1:016x}".format(address, size))
"""
def __trace_instruction(self, uc, address, size, user_data):
print(" Instr: addr=0x{0:016x}, size=0x{1:016x}".format(address, size))
def __trace_block(self, uc, address, size, user_data):
print("Basic Block: addr=0x{0:016x}, size=0x{1:016x}".format(address, size))
def __trace_mem_access(self, uc, access, address, size, value, user_data):
if access == UC_MEM_WRITE:
print(" >>> Write: addr=0x{0:016x} size={1} data=0x{2:016x}".format(address, size, value))
else:
print(" >>> Read: addr=0x{0:016x} size={1}".format(address, size))
def __trace_mem_invalid_access(self, uc, access, address, size, value, user_data):
if access == UC_MEM_WRITE_UNMAPPED:
print(" >>> INVALID Write: addr=0x{0:016x} size={1} data=0x{2:016x}".format(address, size, value))
else:
print(" >>> INVALID Read: addr=0x{0:016x} size={1}".format(address, size))


@@ -0,0 +1,290 @@
/*
american fuzzy lop - high-performance binary-only instrumentation
-----------------------------------------------------------------
Written by Andrew Griffiths <agriffiths@google.com> and
Michal Zalewski <lcamtuf@google.com>
TCG instrumentation and block chaining support by Andrea Biondo
<andrea.biondo965@gmail.com>
Adapted for afl-unicorn by Dominik Maier <mail@dmnk.co>
Idea & design very much by Andrew Griffiths.
Copyright 2015, 2016 Google Inc. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at:
http://www.apache.org/licenses/LICENSE-2.0
This code is a shim patched into the separately-distributed source
code of Unicorn 1.0.1. It leverages the built-in QEMU tracing functionality
to implement AFL-style instrumentation and to take care of the remaining
parts of the AFL fork server logic.
The resulting QEMU binary is essentially a standalone instrumentation
tool; for an example of how to leverage it for other purposes, you can
have a look at afl-showmap.c.
*/
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>
#include "../../config.h"
/***************************
* VARIOUS AUXILIARY STUFF *
***************************/
/* A snippet patched into tb_find_slow to inform the parent process that
we have hit a new block that hasn't been translated yet, and to tell
it to translate within its own context, too (this avoids translation
overhead in the next forked-off copy). */
#define AFL_UNICORN_CPU_SNIPPET1 do { \
afl_request_tsl(pc, cs_base, flags); \
} while (0)
/* This snippet kicks in when the instruction pointer is positioned at
_start and does the usual forkserver stuff, not very different from
regular instrumentation injected via afl-as.h. */
#define AFL_UNICORN_CPU_SNIPPET2 do { \
if(unlikely(afl_first_instr == 0)) { \
afl_setup(); \
afl_forkserver(env); \
afl_first_instr = 1; \
} \
afl_maybe_log(tb->pc); \
} while (0)
/* We use one additional file descriptor to relay "needs translation"
messages between the child and the fork server. */
#define TSL_FD (FORKSRV_FD - 1)
/* This is equivalent to afl-as.h: */
static unsigned char *afl_area_ptr;
/* Set in the child process in forkserver mode: */
static unsigned char afl_fork_child;
static unsigned int afl_forksrv_pid;
/* Instrumentation ratio: */
static unsigned int afl_inst_rms = MAP_SIZE;
/* Function declarations. */
static void afl_setup(void);
static void afl_forkserver(CPUArchState*);
static inline void afl_maybe_log(unsigned long);
static void afl_wait_tsl(CPUArchState*, int);
static void afl_request_tsl(target_ulong, target_ulong, uint64_t);
static TranslationBlock *tb_find_slow(CPUArchState*, target_ulong,
target_ulong, uint64_t);
/* Data structure passed around by the translate handlers: */
struct afl_tsl {
target_ulong pc;
target_ulong cs_base;
uint64_t flags;
};
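These `afl_tsl` records travel over the `TSL_FD` pipe as raw structs. Purely as an illustration, the wire format can be modeled in Python; the 64-bit `target_ulong` and padding-free layout are assumptions, since the real layout depends on the build:

```python
import struct

# Hypothetical model of struct afl_tsl (pc, cs_base, flags),
# assuming a 64-bit target_ulong and no struct padding.
TSL_FMT = "<QQQ"

def pack_tsl(pc, cs_base, flags):
    # Serialize one translation request as the child would write it
    return struct.pack(TSL_FMT, pc, cs_base, flags)

def unpack_tsl(buf):
    # Parse one record as the fork server side would read it
    return struct.unpack(TSL_FMT, buf)

msg = pack_tsl(0x400000, 0x0, 0x0)
assert len(msg) == 24  # three 64-bit fields
assert unpack_tsl(msg) == (0x400000, 0, 0)
```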
/*************************
* ACTUAL IMPLEMENTATION *
*************************/
/* Set up SHM region and initialize other stuff. */
static void afl_setup(void) {
char *id_str = getenv(SHM_ENV_VAR),
*inst_r = getenv("AFL_INST_RATIO");
int shm_id;
if (inst_r) {
unsigned int r;
r = atoi(inst_r);
if (r > 100) r = 100;
if (!r) r = 1;
afl_inst_rms = MAP_SIZE * r / 100;
}
if (id_str) {
shm_id = atoi(id_str);
afl_area_ptr = shmat(shm_id, NULL, 0);
if (afl_area_ptr == (void*)-1) exit(1);
/* With AFL_INST_RATIO set to a low value, we want to touch the bitmap
so that the parent doesn't give up on us. */
if (inst_r) afl_area_ptr[0] = 1;
}
}
/* Fork server logic, invoked once we hit the first emulated instruction. */
static void afl_forkserver(CPUArchState *env) {
static unsigned char tmp[4];
if (!afl_area_ptr) return;
/* Tell the parent that we're alive. If the parent doesn't want
to talk, assume that we're not running in forkserver mode. */
if (write(FORKSRV_FD + 1, tmp, 4) != 4) return;
afl_forksrv_pid = getpid();
/* All right, let's await orders... */
while (1) {
pid_t child_pid;
int status, t_fd[2];
/* Whoops, parent dead? */
if (read(FORKSRV_FD, tmp, 4) != 4) exit(2);
/* Establish a channel with child to grab translation commands. We'll
read from t_fd[0], child will write to TSL_FD. */
if (pipe(t_fd) || dup2(t_fd[1], TSL_FD) < 0) exit(3);
close(t_fd[1]);
child_pid = fork();
if (child_pid < 0) exit(4);
if (!child_pid) {
/* Child process. Close descriptors and run free. */
afl_fork_child = 1;
close(FORKSRV_FD);
close(FORKSRV_FD + 1);
close(t_fd[0]);
return;
}
/* Parent. */
close(TSL_FD);
if (write(FORKSRV_FD + 1, &child_pid, 4) != 4) exit(5);
/* Collect translation requests until child dies and closes the pipe. */
afl_wait_tsl(env, t_fd[0]);
/* Get and relay exit status to parent. */
if (waitpid(child_pid, &status, 0) < 0) exit(6);
if (write(FORKSRV_FD + 1, &status, 4) != 4) exit(7);
}
}
/* The equivalent of the tuple logging routine from afl-as.h. */
static inline void afl_maybe_log(unsigned long cur_loc) {
static __thread unsigned long prev_loc;
// DEBUG
//printf("IN AFL_MAYBE_LOG 0x%lx\n", cur_loc);
// MODIFIED FOR UNICORN MODE -> We want to log all addresses,
// so the checks for 'start < addr < end' are removed
if(!afl_area_ptr)
return;
// DEBUG
//printf("afl_area_ptr = %p\n", afl_area_ptr);
/* Looks like QEMU always maps to fixed locations, so ASAN is not a
concern. Phew. But instruction addresses may be aligned. Let's mangle
the value to get something quasi-uniform. */
cur_loc = (cur_loc >> 4) ^ (cur_loc << 8);
cur_loc &= MAP_SIZE - 1;
/* Implement probabilistic instrumentation by looking at scrambled block
address. This keeps the instrumented locations stable across runs. */
// DEBUG
//printf("afl_inst_rms = 0x%lx\n", afl_inst_rms);
if (cur_loc >= afl_inst_rms) return;
// DEBUG
//printf("cur_loc = 0x%lx\n", cur_loc);
afl_area_ptr[cur_loc ^ prev_loc]++;
prev_loc = cur_loc >> 1;
}
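The address mangling and map update above are the classic AFL edge-coverage scheme. It can be modeled in a few lines of Python; `MAP_SIZE` is assumed to be AFL's default of 65536 (the real value comes from config.h):

```python
MAP_SIZE = 1 << 16  # assumed default; defined in AFL's config.h

def afl_edge(cur_loc, prev_loc):
    """Return (map index to bump, next prev_loc) for one basic block."""
    # Scramble the block address so aligned addresses spread across the map
    cur_loc = ((cur_loc >> 4) ^ (cur_loc << 8)) & (MAP_SIZE - 1)
    # XOR with the (shifted) previous location identifies the edge, not the block
    return cur_loc ^ prev_loc, cur_loc >> 1

# The same pair of blocks always hashes to the same edge slot:
idx1, prev = afl_edge(0x400100, 0)
idx2, _ = afl_edge(0x400100, 0)
assert idx1 == idx2 and 0 <= idx1 < MAP_SIZE
```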
/* This code is invoked whenever QEMU decides that it doesn't have a
translation of a particular block and needs to compute it. When this happens,
we tell the parent to mirror the operation, so that the next fork() has a
cached copy. */
static void afl_request_tsl(target_ulong pc, target_ulong cb, uint64_t flags) {
struct afl_tsl t;
if (!afl_fork_child) return;
t.pc = pc;
t.cs_base = cb;
t.flags = flags;
if (write(TSL_FD, &t, sizeof(struct afl_tsl)) != sizeof(struct afl_tsl))
return;
}
/* This is the other side of the same channel. Since timeouts are handled by
afl-fuzz simply killing the child, we can just wait until the pipe breaks. */
static void afl_wait_tsl(CPUArchState *env, int fd) {
struct afl_tsl t;
while (1) {
/* Broken pipe means it's time to return to the fork server routine. */
if (read(fd, &t, sizeof(struct afl_tsl)) != sizeof(struct afl_tsl))
break;
tb_find_slow(env, t.pc, t.cs_base, t.flags);
}
close(fd);
}


@@ -0,0 +1,107 @@
diff --git a/Makefile b/Makefile
index 7d73782..fb3ccfd 100644
--- a/Makefile
+++ b/Makefile
@@ -88,6 +88,10 @@ AR = llvm-ar
LDFLAGS := -fsanitize=address ${LDFLAGS}
endif
+ifeq ($(UNICORN_AFL),yes)
+UNICORN_CFLAGS += -DUNICORN_AFL
+endif
+
ifeq ($(CROSS),)
CC ?= cc
AR ?= ar
diff --git a/config.mk b/config.mk
index c3621fb..c7b4f7e 100644
--- a/config.mk
+++ b/config.mk
@@ -8,7 +8,7 @@
# Compile with debug info when you want to debug code.
# Change this to 'no' for release edition.
-UNICORN_DEBUG ?= yes
+UNICORN_DEBUG ?= no
################################################################################
# Specify which archs you want to compile in. By default, we build all archs.
@@ -28,3 +28,9 @@ UNICORN_STATIC ?= yes
# a shared library.
UNICORN_SHARED ?= yes
+
+
+################################################################################
+# Changing 'UNICORN_AFL = yes' to 'UNICORN_AFL = no' disables AFL instrumentation
+
+UNICORN_AFL ?= yes
diff --git a/qemu/cpu-exec.c b/qemu/cpu-exec.c
index 7755adf..8114b70 100644
--- a/qemu/cpu-exec.c
+++ b/qemu/cpu-exec.c
@@ -24,6 +24,11 @@
#include "uc_priv.h"
+#if defined(UNICORN_AFL)
+#include "../afl-unicorn-cpu-inl.h"
+static int afl_first_instr = 0;
+#endif
+
static tcg_target_ulong cpu_tb_exec(CPUState *cpu, uint8_t *tb_ptr);
static TranslationBlock *tb_find_slow(CPUArchState *env, target_ulong pc,
target_ulong cs_base, uint64_t flags);
@@ -231,6 +236,10 @@ int cpu_exec(struct uc_struct *uc, CPUArchState *env) // qq
next_tb & TB_EXIT_MASK, tb);
}
+#if defined(UNICORN_AFL)
+ AFL_UNICORN_CPU_SNIPPET2;
+#endif
+
/* cpu_interrupt might be called while translating the
TB, but before it is linked into a potentially
infinite loop and becomes env->current_tb. Avoid
@@ -369,6 +378,11 @@ static TranslationBlock *tb_find_slow(CPUArchState *env, target_ulong pc,
not_found:
/* if no translated code available, then translate it now */
tb = tb_gen_code(cpu, pc, cs_base, (int)flags, 0); // qq
+
+#if defined(UNICORN_AFL)
+ /* There seems to be no chaining in unicorn ever? :( */
+ AFL_UNICORN_CPU_SNIPPET1;
+#endif
found:
/* Move the last found TB to the head of the list */
diff --git a/qemu/translate-all.c b/qemu/translate-all.c
index 1a96c34..7ef4878 100644
--- a/qemu/translate-all.c
+++ b/qemu/translate-all.c
@@ -403,11 +403,25 @@ static PageDesc *page_find_alloc(struct uc_struct *uc, tb_page_addr_t index, int
#if defined(CONFIG_USER_ONLY)
/* We can't use g_malloc because it may recurse into a locked mutex. */
+#if defined(UNICORN_AFL)
+ /* This was added by unicorn-afl to bail out semi-gracefully if out of memory. */
+# define ALLOC(P, SIZE) \
+ do { \
+ void* _tmp = mmap(NULL, SIZE, PROT_READ | PROT_WRITE, \
+ MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); \
+ if (_tmp == (void*)-1) { \
+ qemu_log(">>> Out of memory for stack, bailing out. <<<\n"); \
+ exit(1); \
+ } \
+ (P) = _tmp; \
+ } while (0)
+#else /* !UNICORN_AFL */
# define ALLOC(P, SIZE) \
do { \
P = mmap(NULL, SIZE, PROT_READ | PROT_WRITE, \
MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); \
} while (0)
+#endif /* UNICORN_AFL */
#else
# define ALLOC(P, SIZE) \
do { P = g_malloc0(SIZE); } while (0)


@@ -0,0 +1,41 @@
Compiling simple_target.c
==========================
You shouldn't need to compile simple_target.c since a MIPS binary version is
pre-built and shipped with afl-unicorn. This file documents how the binary
was built in case you want to rebuild it or recompile it for any reason.
The pre-built binary (simple_target.bin) was built by cross-compiling
simple_target.c for MIPS using the mips-linux-gnu-gcc package on an Ubuntu
16.04 LTS system. This cross compiler (and associated binutils) was installed
from apt-get packages:
```
sudo apt-get install gcc-mips-linux-gnu
```
simple_target.c was compiled without optimization, position-independent,
and without standard libraries using the following command line:
```
mips-linux-gnu-gcc -o simple_target.elf simple_target.c -fPIC -O0 -nostdlib
```
The .text section from the resulting ELF binary was then extracted to create
the raw binary blob that is loaded and emulated by simple_test_harness.py:
```
mips-linux-gnu-objcopy -O binary --only-section=.text simple_target.elf simple_target.bin
```
In summary, to recreate simple_target.bin execute the following:
```
mips-linux-gnu-gcc -o simple_target.elf simple_target.c -fPIC -O0 -nostdlib
&& mips-linux-gnu-objcopy -O binary --only-section=.text simple_target.elf simple_target.bin
&& rm simple_target.elf
```
Note that the resulting binary is padded with nulls to a 16-byte alignment. This
matters when emulating it, as NOPs will be appended after the return from main()
as needed.


@@ -0,0 +1 @@
abcd

Binary file not shown.


@@ -0,0 +1 @@



@@ -0,0 +1 @@



@@ -0,0 +1 @@


Binary file not shown.


@@ -0,0 +1,31 @@
/*
* Sample target file to test afl-unicorn fuzzing capabilities.
* This is a very trivial example that will crash pretty easily
* in several different exciting ways.
*
* Input is assumed to come from a buffer located at DATA_ADDRESS
* (0x00300000), so make sure that your Unicorn emulation of this
* puts user data there.
*
* Written by Nathan Voss <njvoss99@gmail.com>
*/
// Magic address where mutated data will be placed
#define DATA_ADDRESS 0x00300000
int main(void) {
unsigned char *data_buf = (unsigned char *) DATA_ADDRESS;
if (data_buf[20] != 0) {
// Cause an 'invalid read' crash if data[20] is non-zero
unsigned char invalid_read = *(unsigned char *) 0x00000000;
} else if (data_buf[0] > 0x10 && data_buf[0] < 0x20 && data_buf[1] > data_buf[2]) {
// Cause an 'invalid read' crash if (0x10 < data[0] < 0x20) and data[1] > data[2]
unsigned char invalid_read = *(unsigned char *) 0x00000000;
} else if (data_buf[9] == 0x00 && data_buf[10] != 0x00 && data_buf[11] == 0x00) {
// Cause a crash if data[10] is not zero, but [9] and [11] are zero
unsigned char invalid_read = *(unsigned char *) 0x00000000;
}
return 0;
}


@@ -0,0 +1,170 @@
"""
Simple test harness for AFL's Unicorn Mode.
This loads the simple_target.bin binary (precompiled as MIPS code) into
Unicorn's memory map for emulation, places the specified input into
simple_target's buffer (hardcoded to be at 0x300000), and executes 'main()'.
If any crashes occur during emulation, this script throws a matching signal
to tell AFL that a crash occurred.
Run under AFL as follows:
$ cd <afl_path>/unicorn_mode/samples/simple/
$ ../../../afl-fuzz -U -m none -i ./sample_inputs -o ./output -- python simple_test_harness.py @@
"""
import argparse
import os
import signal
from unicorn import *
from unicorn.mips_const import *
# Path to the file containing the binary to emulate
BINARY_FILE = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'simple_target.bin')
# Memory map for the code to be tested
CODE_ADDRESS = 0x00100000 # Arbitrary address where code to test will be loaded
CODE_SIZE_MAX = 0x00010000 # Max size for the code (64kb)
STACK_ADDRESS = 0x00200000 # Address of the stack (arbitrarily chosen)
STACK_SIZE = 0x00010000 # Size of the stack (arbitrarily chosen)
DATA_ADDRESS = 0x00300000 # Address where mutated data will be placed
DATA_SIZE_MAX = 0x00010000 # Maximum allowable size of mutated data
try:
    # If Capstone is installed then we'll dump disassembly, otherwise just dump the binary.
    from capstone import *
    cs = Cs(CS_ARCH_MIPS, CS_MODE_MIPS32 + CS_MODE_BIG_ENDIAN)

    def unicorn_debug_instruction(uc, address, size, user_data):
        mem = uc.mem_read(address, size)
        for (cs_address, cs_size, cs_mnemonic, cs_opstr) in cs.disasm_lite(bytes(mem), size):
            print("    Instr: {:#016x}:\t{}\t{}".format(address, cs_mnemonic, cs_opstr))

except ImportError:
    def unicorn_debug_instruction(uc, address, size, user_data):
        print("    Instr: addr=0x{0:016x}, size=0x{1:016x}".format(address, size))
def unicorn_debug_block(uc, address, size, user_data):
    print("Basic Block: addr=0x{0:016x}, size=0x{1:016x}".format(address, size))

def unicorn_debug_mem_access(uc, access, address, size, value, user_data):
    if access == UC_MEM_WRITE:
        print("        >>> Write: addr=0x{0:016x} size={1} data=0x{2:016x}".format(address, size, value))
    else:
        print("        >>> Read: addr=0x{0:016x} size={1}".format(address, size))

def unicorn_debug_mem_invalid_access(uc, access, address, size, value, user_data):
    if access == UC_MEM_WRITE_UNMAPPED:
        print("        >>> INVALID Write: addr=0x{0:016x} size={1} data=0x{2:016x}".format(address, size, value))
    else:
        print("        >>> INVALID Read: addr=0x{0:016x} size={1}".format(address, size))
def force_crash(uc_error):
    # This function should be called to indicate to AFL that a crash occurred during emulation.
    # Pass in the exception received from Uc.emu_start()
    mem_errors = [
        UC_ERR_READ_UNMAPPED, UC_ERR_READ_PROT, UC_ERR_READ_UNALIGNED,
        UC_ERR_WRITE_UNMAPPED, UC_ERR_WRITE_PROT, UC_ERR_WRITE_UNALIGNED,
        UC_ERR_FETCH_UNMAPPED, UC_ERR_FETCH_PROT, UC_ERR_FETCH_UNALIGNED,
    ]
    if uc_error.errno in mem_errors:
        # Memory error - throw SIGSEGV
        os.kill(os.getpid(), signal.SIGSEGV)
    elif uc_error.errno == UC_ERR_INSN_INVALID:
        # Invalid instruction - throw SIGILL
        os.kill(os.getpid(), signal.SIGILL)
    else:
        # Not sure what happened - throw SIGABRT
        os.kill(os.getpid(), signal.SIGABRT)
def main():
    parser = argparse.ArgumentParser(description="Test harness for simple_target.bin")
    parser.add_argument('input_file', type=str, help="Path to the file containing the mutated input to load")
    parser.add_argument('-d', '--debug', default=False, action="store_true", help="Enables debug tracing")
    args = parser.parse_args()

    # Instantiate a MIPS32 big endian Unicorn Engine instance
    uc = Uc(UC_ARCH_MIPS, UC_MODE_MIPS32 + UC_MODE_BIG_ENDIAN)

    if args.debug:
        uc.hook_add(UC_HOOK_BLOCK, unicorn_debug_block)
        uc.hook_add(UC_HOOK_CODE, unicorn_debug_instruction)
        uc.hook_add(UC_HOOK_MEM_WRITE | UC_HOOK_MEM_READ, unicorn_debug_mem_access)
        uc.hook_add(UC_HOOK_MEM_WRITE_UNMAPPED | UC_HOOK_MEM_READ_INVALID, unicorn_debug_mem_invalid_access)
    #---------------------------------------------------
    # Load the binary to emulate and map it into memory

    print("Loading binary to emulate from {}".format(BINARY_FILE))
    binary_file = open(BINARY_FILE, 'rb')
    binary_code = binary_file.read()
    binary_file.close()

    # Make sure the binary fits into the mapped code region
    if len(binary_code) > CODE_SIZE_MAX:
        print("Binary code is too large (> {} bytes)".format(CODE_SIZE_MAX))
        return

    # Map the code region and write the binary into it
    uc.mem_map(CODE_ADDRESS, CODE_SIZE_MAX)
    uc.mem_write(CODE_ADDRESS, binary_code)

    # Set the program counter to the start of the code
    start_address = CODE_ADDRESS          # Address of entry point of main()
    end_address = CODE_ADDRESS + 0xf4     # Address of last instruction in main()
    uc.reg_write(UC_MIPS_REG_PC, start_address)
    #-----------------
    # Setup the stack

    uc.mem_map(STACK_ADDRESS, STACK_SIZE)
    uc.reg_write(UC_MIPS_REG_SP, STACK_ADDRESS + STACK_SIZE)
    #-----------------------------------------------------
    # Emulate 1 instruction to kick off AFL's fork server
    # THIS MUST BE DONE BEFORE LOADING USER DATA!
    # If this isn't done every single run, the AFL fork server
    # will not be started appropriately and you'll get erratic results!
    # It doesn't matter what this returns, it just has to execute at
    # least one instruction in order to get the fork server started.

    print("Starting the AFL forkserver by executing 1 instruction")
    try:
        uc.emu_start(uc.reg_read(UC_MIPS_REG_PC), 0, 0, count=1)
    except UcError as e:
        print("ERROR: Failed to execute a single instruction (error: {})!".format(e))
        return
    #-----------------------------------------------
    # Load the mutated input and map it into memory

    print("Loading data input from {}".format(args.input_file))
    input_file = open(args.input_file, 'rb')
    input_data = input_file.read()
    input_file.close()

    # Apply constraints to the mutated input
    if len(input_data) > DATA_SIZE_MAX:
        print("Test input is too long (> {} bytes)".format(DATA_SIZE_MAX))
        return

    # Write the mutated command into the data buffer
    uc.mem_map(DATA_ADDRESS, DATA_SIZE_MAX)
    uc.mem_write(DATA_ADDRESS, input_data)
    #------------------------------------------------------------
    # Emulate the code, allowing it to process the mutated input

    print("Executing until a crash or execution reaches 0x{0:016x}".format(end_address))
    try:
        uc.emu_start(uc.reg_read(UC_MIPS_REG_PC), end_address, timeout=0, count=0)
    except UcError as e:
        print("Execution failed with error: {}".format(e))
        force_crash(e)

    print("Done.")

if __name__ == "__main__":
    main()
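A note on `force_crash()` above: it translates Unicorn's `UcError` codes into real POSIX signals so that AFL's wait-status-based crash detection sees the harness die the same way a native target would. The stand-alone sketch below mirrors that decision logic with stand-in errno values (the real `UC_ERR_*` constants come from the `unicorn` package and may differ numerically); it returns the chosen signal instead of raising it.

```python
import signal

# Stand-in errno values; the real UC_ERR_* constants come from the
# unicorn package and may differ numerically.
ERR_READ_UNMAPPED, ERR_WRITE_PROT, ERR_INSN_INVALID, ERR_HOOK = 6, 12, 10, 9
MEM_ERRORS = {ERR_READ_UNMAPPED, ERR_WRITE_PROT}

def signal_for(errno):
    """Mirror of force_crash(): pick the signal the harness delivers to itself."""
    if errno in MEM_ERRORS:
        return signal.SIGSEGV   # memory fault -> report as a segfault
    if errno == ERR_INSN_INVALID:
        return signal.SIGILL    # bad opcode -> illegal instruction
    return signal.SIGABRT       # anything else -> abort

# The harness would then do: os.kill(os.getpid(), signal_for(e.errno))
```

Delivering the signal to the harness's own PID is what lets an unmodified AFL treat an emulated fault exactly like a native crash.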