mirror of
https://github.com/AFLplusplus/AFLplusplus.git
synced 2025-06-15 11:28:08 +00:00
fix sync script, update remote sync documentation
@@ -10,8 +10,8 @@ n-core system, you can almost always run around n concurrent fuzzing jobs with
|
|||||||
virtually no performance hit (you can use the afl-gotcpu tool to make sure).
|
virtually no performance hit (you can use the afl-gotcpu tool to make sure).
|
||||||
|
|
||||||
In fact, if you rely on just a single job on a multi-core system, you will
|
In fact, if you rely on just a single job on a multi-core system, you will
|
||||||
be underutilizing the hardware. So, parallelization is usually the right
|
be underutilizing the hardware. So, parallelization is always the right way to
|
||||||
way to go.
|
go.
|
||||||
|
|
||||||
When targeting multiple unrelated binaries or using the tool in
|
When targeting multiple unrelated binaries or using the tool in
|
||||||
"non-instrumented" (-n) mode, it is perfectly fine to just start up several
|
"non-instrumented" (-n) mode, it is perfectly fine to just start up several
|
||||||
@@ -65,22 +65,7 @@ still perform deterministic checks; while the secondary instances will
|
|||||||
proceed straight to random tweaks.
|
proceed straight to random tweaks.
|
||||||
|
|
||||||
Note that you must always have one -M main instance!
|
Note that you must always have one -M main instance!
|
||||||
|
Running multiple -M instances is wasteful!
|
||||||
Note that running multiple -M instances is wasteful, although there is an
|
|
||||||
experimental support for parallelizing the deterministic checks. To leverage
|
|
||||||
that, you need to create -M instances like so:
|
|
||||||
|
|
||||||
```
|
|
||||||
./afl-fuzz -i testcase_dir -o sync_dir -M mainA:1/3 [...]
|
|
||||||
./afl-fuzz -i testcase_dir -o sync_dir -M mainB:2/3 [...]
|
|
||||||
./afl-fuzz -i testcase_dir -o sync_dir -M mainC:3/3 [...]
|
|
||||||
```
|
|
||||||
|
|
||||||
...where the first value after ':' is the sequential ID of a particular main
|
|
||||||
instance (starting at 1), and the second value is the total number of fuzzers to
|
|
||||||
distribute the deterministic fuzzing across. Note that if you boot up fewer
|
|
||||||
fuzzers than indicated by the second number passed to -M, you may end up with
|
|
||||||
poor coverage.
|
|
||||||
|
|
||||||
You can also monitor the progress of your jobs from the command line with the
|
You can also monitor the progress of your jobs from the command line with the
|
||||||
provided afl-whatsup tool. When the instances are no longer finding new paths,
|
provided afl-whatsup tool. When the instances are no longer finding new paths,
|
||||||
@@ -99,61 +84,88 @@ example may be:
|
|||||||
This is not a concern if you use @@ without -f and let afl-fuzz come up with the
|
This is not a concern if you use @@ without -f and let afl-fuzz come up with the
|
||||||
file name.
|
file name.
|
||||||
|
|
||||||
## 3) Syncing with non-afl fuzzers or independent instances
|
## 3) Multiple -M mains
|
||||||
|
|
||||||
|
|
||||||
|
There is support for parallelizing the deterministic checks.
|
||||||
|
This is only needed where
|
||||||
|
|
||||||
|
1. many new paths are found fast over a long time and it looks unlikely that
|
||||||
|
the main node will ever catch up, and
|
||||||
|
2. deterministic fuzzing is actively helping path discovery (you can see this
|
||||||
|
in the main node for the first four lines in the "fuzzing strategy yields"
|
||||||
|
section. If the ratio `found/attempts` is high, then it is effective. It
|
||||||
|
most commonly isn't.)
|
||||||
|
|
||||||
|
Only if both are true is it beneficial to have more than one main.
|
||||||
|
You can leverage this by creating -M instances like so:
|
||||||
|
|
||||||
|
```
|
||||||
|
./afl-fuzz -i testcase_dir -o sync_dir -M mainA:1/3 [...]
|
||||||
|
./afl-fuzz -i testcase_dir -o sync_dir -M mainB:2/3 [...]
|
||||||
|
./afl-fuzz -i testcase_dir -o sync_dir -M mainC:3/3 [...]
|
||||||
|
```
|
||||||
|
|
||||||
|
... where the first value after ':' is the sequential ID of a particular main
|
||||||
|
instance (starting at 1), and the second value is the total number of fuzzers to
|
||||||
|
distribute the deterministic fuzzing across. Note that if you boot up fewer
|
||||||
|
fuzzers than indicated by the second number passed to -M, you may end up with
|
||||||
|
poor coverage.
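
The `name:ID/TOTAL` pattern above can be generated for any number of mains. The following is only a sketch: `print_main_cmds` is a hypothetical helper, the `mainN` names are an assumed naming scheme, and it merely prints the command lines (with the target arguments left as `[...]`) rather than starting any fuzzer.

```sh
#!/bin/sh
# Hypothetical helper: print one afl-fuzz command line per -M main instance.
# testcase_dir, sync_dir and the mainN names are illustrative assumptions.
print_main_cmds() {
  total=$1
  i=1
  while [ "$i" -le "$total" ]; do
    # -M name:ID/TOTAL, where ID starts at 1 and TOTAL is the number of mains
    echo "./afl-fuzz -i testcase_dir -o sync_dir -M main${i}:${i}/${total} [...]"
    i=$((i + 1))
  done
}

print_main_cmds 3
```

Generating the lines from a single total keeps the second number identical across all mains, so you are less likely to start fewer instances than it announces.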
|
||||||
|
|
||||||
|
## 4) Syncing with non-afl fuzzers or independent instances
|
||||||
|
|
||||||
A -M main node can be told with the `-F other_fuzzer_queue_directory` option
|
A -M main node can be told with the `-F other_fuzzer_queue_directory` option
|
||||||
to sync results from other fuzzers, e.g. libfuzzer or honggfuzz.
|
to sync results from other fuzzers, e.g. libfuzzer or honggfuzz.
|
||||||
|
|
||||||
Only the specified directory will be synced into afl, not subdirectories.
|
Only the specified directory will be synced into afl, not subdirectories.
|
||||||
The specified directories do not need to exist yet at the start of afl.
|
The specified directory does not need to exist yet at the start of afl.
|
||||||
|
|
||||||
## 4) Multi-system parallelization
|
The `-F` option can be passed to the main node several times.
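
As an illustration of passing `-F` more than once, the sketch below assembles a main-node command line with one `-F` per foreign queue directory. The helper name `build_main_cmd` and the example paths are assumptions made for this example, and the command is only printed, not run.

```sh
#!/bin/sh
# Sketch: build an afl-fuzz main-node command line with one -F per foreign queue.
# build_main_cmd and the directory paths are illustrative assumptions.
build_main_cmd() {
  cmd="./afl-fuzz -i testcase_dir -o sync_dir -M main"
  for dir in "$@"; do
    # Each foreign fuzzer queue directory gets its own -F option
    cmd="$cmd -F $dir"
  done
  echo "$cmd [...]"
}

build_main_cmd /path/to/libfuzzer/corpus /path/to/honggfuzz/queue
```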
|
||||||
|
|
||||||
|
## 5) Multi-system parallelization
|
||||||
|
|
||||||
The basic operating principle for multi-system parallelization is similar to
|
The basic operating principle for multi-system parallelization is similar to
|
||||||
the mechanism explained in section 2. The key difference is that you need to
|
the mechanism explained in section 2. The key difference is that you need to
|
||||||
write a simple script that performs two actions:
|
write a simple script that performs two actions:
|
||||||
|
|
||||||
- Uses SSH with authorized_keys to connect to every machine and retrieve
|
- Uses SSH with authorized_keys to connect to every machine and retrieve
|
||||||
a tar archive of the /path/to/sync_dir/<fuzzer_id>/queue/ directories for
|
a tar archive of the /path/to/sync_dir/<main_node(s)> directory local to
|
||||||
every <fuzzer_id> local to the machine. It's best to use a naming scheme
|
the machine.
|
||||||
that includes host name in the fuzzer ID, so that you can do something
|
It is best to use a naming scheme that includes the host name and marks it as
|
||||||
like:
|
a main node (e.g. main1, main2) in the fuzzer ID, so that you can do
|
||||||
|
something like:
|
||||||
|
|
||||||
```sh
|
```sh
|
||||||
for s in {1..10}; do
|
for host in `cat HOSTLIST`; do
|
||||||
ssh user@host${s} "tar -czf - sync/host${s}_fuzzid*/[qf]*" >host${s}.tgz
|
ssh user@$host "tar -czf - sync/${host}_main*/" > $host.tgz
|
||||||
done
|
done
|
||||||
```
|
```
|
||||||
|
|
||||||
- Distributes and unpacks these files on all the remaining machines, e.g.:
|
- Distributes and unpacks these files on all the remaining machines, e.g.:
|
||||||
|
|
||||||
```sh
|
```sh
|
||||||
for s in {1..10}; do
|
for srchost in `cat HOSTLIST`; do
|
||||||
for d in {1..10}; do
|
for dsthost in `cat HOSTLIST`; do
|
||||||
test "$s" = "$d" && continue
|
test "$srchost" = "$dsthost" && continue
|
||||||
ssh user@host${d} 'tar -kxzf -' <host${s}.tgz
|
ssh user@$dsthost 'tar -kxzf -' < $srchost.tgz
|
||||||
done
|
done
|
||||||
done
|
done
|
||||||
```
|
```
|
||||||
|
|
||||||
There is an example of such a script in examples/distributed_fuzzing/;
|
There is an example of such a script in examples/distributed_fuzzing/.
|
||||||
you can also find a more featured, experimental tool developed by
|
|
||||||
Martijn Bogaard at:
|
|
||||||
|
|
||||||
https://github.com/MartijnB/disfuzz-afl
|
There are other (older) more featured, experimental tools:
|
||||||
|
* https://github.com/richo/roving
|
||||||
|
* https://github.com/MartijnB/disfuzz-afl
|
||||||
|
|
||||||
Another client-server implementation from Richo Healey is:
|
However, these do not support syncing just main nodes (yet).
|
||||||
|
|
||||||
https://github.com/richo/roving
|
|
||||||
|
|
||||||
Note that these third-party tools are unsafe to run on systems exposed to the
|
|
||||||
Internet or to untrusted users.
|
|
||||||
|
|
||||||
When developing custom test case sync code, there are several optimizations
|
When developing custom test case sync code, there are several optimizations
|
||||||
to keep in mind:
|
to keep in mind:
|
||||||
|
|
||||||
- The synchronization does not have to happen very often; running the
|
- The synchronization does not have to happen very often; running the
|
||||||
task every 30 minutes or so may be perfectly fine.
|
task every 60 minutes or even less often at later fuzzing stages is
|
||||||
|
fine.
|
||||||
|
|
||||||
- There is no need to synchronize crashes/ or hangs/; you only need to
|
- There is no need to synchronize crashes/ or hangs/; you only need to
|
||||||
copy over queue/* (and ideally, also fuzzer_stats).
|
copy over queue/* (and ideally, also fuzzer_stats).
|
||||||
@@ -180,11 +192,16 @@ to keep in mind:
|
|||||||
run them all with -S, and just designate a single process somewhere within
|
run them all with -S, and just designate a single process somewhere within
|
||||||
the fleet to run with -M.
|
the fleet to run with -M.
|
||||||
|
|
||||||
|
- Syncing is only necessary for the main nodes on a system. It is possible
|
||||||
|
to run main-less with only secondaries. However, you then need to find out
|
||||||
|
which secondary temporarily took over the role of the main node. Look for
|
||||||
|
the `is_main` file in the fuzzer directories, e.g. `sync-dir/hostname-*/is_main`.
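
A sync script could locate the current main node by looking for that marker file. This is a minimal sketch under the directory layout described above; `find_main_dirs` is a hypothetical helper name, not part of AFL++.

```sh
#!/bin/sh
# Sketch: list fuzzer directories under a sync dir that contain an is_main file.
# find_main_dirs is a hypothetical helper; pass the sync directory as $1.
find_main_dirs() {
  sync_dir=$1
  for marker in "$sync_dir"/*/is_main; do
    # Skip the unexpanded glob pattern when no instance has the marker
    [ -e "$marker" ] || continue
    dirname "$marker"
  done
}

# Example (path is illustrative): find_main_dirs /home/bob/sync_dir
```

Only the directories it prints would then need to be included in the tar archive.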
|
||||||
|
|
||||||
It is *not* advisable to skip the synchronization script and run the fuzzers
|
It is *not* advisable to skip the synchronization script and run the fuzzers
|
||||||
directly on a network filesystem; unexpected latency and unkillable processes
|
directly on a network filesystem; unexpected latency and unkillable processes
|
||||||
in I/O wait state can mess things up.
|
in I/O wait state can mess things up.
|
||||||
|
|
||||||
## 5) Remote monitoring and data collection
|
## 6) Remote monitoring and data collection
|
||||||
|
|
||||||
You can use screen, nohup, tmux, or something equivalent to run remote
|
You can use screen, nohup, tmux, or something equivalent to run remote
|
||||||
instances of afl-fuzz. If you redirect the program's output to a file, it will
|
instances of afl-fuzz. If you redirect the program's output to a file, it will
|
||||||
@@ -208,7 +225,7 @@ Keep in mind that crashing inputs are *not* automatically propagated to the
|
|||||||
main instance, so you may still want to monitor for crashes fleet-wide
|
main instance, so you may still want to monitor for crashes fleet-wide
|
||||||
from within your synchronization or health checking scripts (see afl-whatsup).
|
from within your synchronization or health checking scripts (see afl-whatsup).
|
||||||
|
|
||||||
## 6) Asymmetric setups
|
## 7) Asymmetric setups
|
||||||
|
|
||||||
It is perhaps worth noting that all of the following is permitted:
|
It is perhaps worth noting that all of the following is permitted:
|
||||||
|
|
||||||
@@ -224,7 +241,7 @@ It is perhaps worth noting that all of the following is permitted:
|
|||||||
the discovered test cases can have synergistic effects and improve the
|
the discovered test cases can have synergistic effects and improve the
|
||||||
overall coverage.
|
overall coverage.
|
||||||
|
|
||||||
(In this case, running one -M instance per each binary is a good plan.)
|
(In this case, running one -M instance per target is necessary.)
|
||||||
|
|
||||||
- Having some of the fuzzers invoke the binary in different ways.
|
- Having some of the fuzzers invoke the binary in different ways.
|
||||||
For example, 'djpeg' supports several DCT modes, configurable with
|
For example, 'djpeg' supports several DCT modes, configurable with
|
||||||
|
@@ -39,8 +39,11 @@ FUZZ_USER=bob
|
|||||||
# Directory to synchronize
|
# Directory to synchronize
|
||||||
SYNC_DIR='/home/bob/sync_dir'
|
SYNC_DIR='/home/bob/sync_dir'
|
||||||
|
|
||||||
# Interval (seconds) between sync attempts
|
# We only capture -M main nodes; set the name to your chosen naming scheme
|
||||||
SYNC_INTERVAL=$((30 * 60))
|
MAIN_NAME='main'
|
||||||
|
|
||||||
|
# Interval (seconds) between sync attempts (e.g. one hour)
|
||||||
|
SYNC_INTERVAL=$((60 * 60))
|
||||||
|
|
||||||
if [ "$AFL_ALLOW_TMP" = "" ]; then
|
if [ "$AFL_ALLOW_TMP" = "" ]; then
|
||||||
|
|
||||||
@@ -63,7 +66,7 @@ while :; do
|
|||||||
echo "[*] Retrieving data from ${host}.${FUZZ_DOMAIN}..."
|
echo "[*] Retrieving data from ${host}.${FUZZ_DOMAIN}..."
|
||||||
|
|
||||||
ssh -o 'passwordauthentication no' ${FUZZ_USER}@${host}.$FUZZ_DOMAIN \
|
ssh -o 'passwordauthentication no' ${FUZZ_USER}@${host}.$FUZZ_DOMAIN \
|
||||||
"cd '$SYNC_DIR' && tar -czf - ${host}_*/[qf]*" >".sync_tmp/${host}.tgz"
|
"cd '$SYNC_DIR' && tar -czf - ${host}_${MAIN_NAME}*/" > ".sync_tmp/${host}.tgz"
|
||||||
|
|
||||||
done
|
done
|
||||||
|
|
||||||
|