Remove deprecated libfuzzer_coverage task (#2021)

- Remove the ability to create or execute a `libfuzzer_coverage` task
- Preserve the enum variant in `onefuzztypes` to prevent errors when deserializing old data
- Remove doc references to `libfuzzer_coverage`
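
The second bullet keeps the old variant around so stored records still parse, while new tasks of that type are refused. The following is a minimal Rust sketch of that pattern, not the actual `onefuzztypes` code (which is Python); the `TaskType` enum shape and the `validate_new_task` helper are hypothetical names chosen for illustration.

```rust
use std::str::FromStr;

/// Hypothetical task type enum, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TaskType {
    Coverage,
    LibfuzzerFuzz,
    /// Deprecated: retained only so previously stored records still parse.
    LibfuzzerCoverage,
}

impl FromStr for TaskType {
    type Err = String;

    /// Parsing accepts the deprecated name so old data does not fail to load.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "coverage" => Ok(TaskType::Coverage),
            "libfuzzer_fuzz" => Ok(TaskType::LibfuzzerFuzz),
            "libfuzzer_coverage" => Ok(TaskType::LibfuzzerCoverage),
            other => Err(format!("unknown task type: {}", other)),
        }
    }
}

/// Hypothetical creation-time check: deprecated types are rejected here,
/// even though they can still be parsed from stored data.
fn validate_new_task(task_type: TaskType) -> Result<TaskType, String> {
    match task_type {
        TaskType::LibfuzzerCoverage => {
            Err("libfuzzer_coverage is deprecated; use coverage".to_string())
        }
        other => Ok(other),
    }
}
```

With this split, `"libfuzzer_coverage".parse::<TaskType>()` succeeds for old records, while `validate_new_task(TaskType::LibfuzzerCoverage)` returns an error.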
Joe Ranweiler
2022-06-13 12:38:35 -07:00
committed by GitHub
parent 9989189e60
commit 52ccf05a29
22 changed files with 33 additions and 912 deletions

View File

@ -40,8 +40,8 @@ chmod +x deploy.py
When running `deploy.py` the first time for an instance, you will be prompted
to follow a manual step to initialize your CLI config.
The $NSG_CONFIG_FILE is a required parameter that specifies the 'allow rules' for the OneFuzz Network Security Group. A default `config.json` is provided in the deployment zip.
This 'allow' config resembles the following:
The $NSG_CONFIG_FILE is a required parameter that specifies the 'allow rules' for the OneFuzz Network Security Group. A default `config.json` is provided in the deployment zip.
This 'allow' config resembles the following:
```
{
"proxy_nsg_config": {
@ -50,7 +50,7 @@ This 'allow' config resembles the following:
}
}
```
Future updates can be made to this configuration via the OneFuzz CLI.
Future updates can be made to this configuration via the OneFuzz CLI.
## Install the CLI
@ -146,7 +146,7 @@ INFO:onefuzz:using container: oft-no-repro-14b8ea05ca635426bd9ccf3ee71b2e45
INFO:onefuzz:using container: oft-coverage-14b8ea05ca635426bd9ccf3ee71b2e45
INFO:onefuzz:uploading target exe `fuzz.exe`
INFO:onefuzz:creating libfuzzer task
INFO:onefuzz:creating libfuzzer_coverage task
INFO:onefuzz:creating coverage task
INFO:onefuzz:creating libfuzzer_crash_report task
INFO:onefuzz:done creating tasks
$

View File

@ -11,7 +11,7 @@ When using libFuzzer in C, developers provide a function
and the length of said buffer. ([Tutorial using libFuzzer in
C](https://github.com/google/fuzzing/blob/master/tutorial/libFuzzerTutorial.md))
With libfuzzer-dotnet, developers provide an application that calls the method `Fuzzer.LibFuzzer.Run` from within `Main`, with a callback that passes a read-only byte stream to their function of interest.
With libfuzzer-dotnet, developers provide an application that calls the method `Fuzzer.LibFuzzer.Run` from within `Main`, with a callback that passes a read-only byte stream to their function of interest.
> NOTE: libfuzzer-dotnet only works on Linux at this time.
@ -24,7 +24,7 @@ Standard](https://dotnet.microsoft.com/platform/dotnet-standard) check if your
framework version is supported.
## Issues using libfuzzer-dotnet in OneFuzz
* The `libfuzzer_coverage` task does not support the coverage features used by libfuzzer-dotnet. (Work item: [#536](https://github.com/microsoft/onefuzz/issues/536))
* The `coverage` task does not support the coverage features used by libfuzzer-dotnet.
* The `libfuzzer_crash_report` task does not support extracting unique output during analysis, making the crash de-duplication and reporting ineffective. (Work item: [#538](https://github.com/microsoft/onefuzz/issues/538))
As such, a libfuzzer-dotnet template is available, which only uses the `libfuzzer_fuzz` task. As these issues are resolved, the template will be updated to include the additional tasks.
@ -40,7 +40,7 @@ Let's fuzz the `Func` function of our example library named [problems](../../src
sudo apt-get install -y clang
```
2. We need to build an application that uses `Fuzzer.LibFuzzer.Run` that calls our function `Func`. For this example, let's call this [wrapper](../../src/integration-tests/libfuzzer-dotnet/wrapper/)
2. We need to build an application that uses `Fuzzer.LibFuzzer.Run` that calls our function `Func`. For this example, let's call this [wrapper](../../src/integration-tests/libfuzzer-dotnet/wrapper/)
The [wrapper/wrapper.csproj](../../src/integration-tests/libfuzzer-dotnet/wrapper/wrapper.csproj) project file uses SharpFuzz 1.6.1 and refers to our [problems](../../src/integration-tests/libfuzzer-dotnet/problems/) library locally.
```xml
@ -57,7 +57,7 @@ Let's fuzz the `Func` function of our example library named [problems](../../src
</PropertyGroup>
</Project>
```
For our example [problems](../../src/integration-tests/libfuzzer-dotnet/problems/) library, our callback for `Fuzzer.LibFuzzer.Run` is straightforward. `Func` already takes a `ReadOnlySpan<byte>`. If your function takes strings, this would be the place to convert the span of bytes to strings.
[wrapper/program.cs](../../src/integration-tests/libfuzzer-dotnet/wrapper/program.cs)
```C#
@ -88,7 +88,7 @@ Let's fuzz the `Func` function of our example library named [problems](../../src
clang -fsanitize=fuzzer libfuzzer-dotnet.cc -o my-fuzzer/libfuzzer-dotnet
```
6. We should provide some sample inputs for our fuzzing. For this example, a basic file will do. However, this should include reasonable known-good inputs for your function. If you're fuzzing PNGs, use a selection of valid PNGs.
6. We should provide some sample inputs for our fuzzing. For this example, a basic file will do. However, this should include reasonable known-good inputs for your function. If you're fuzzing PNGs, use a selection of valid PNGs.
```
mkdir -p inputs
echo hi > inputs/hi.txt
@ -131,7 +131,7 @@ Let's fuzz the `Func` function of our example library named [problems](../../src
#10 0x45a942 in main (/home/bcaswell/projects/onefuzz/onefuzz/src/integration-tests/libfuzzer-dotnet/my-fuzzer/libfuzzer-dotnet+0x45a942)
#11 0x7fd6c2ee20b2 in __libc_start_main /build/glibc-eX1tMB/glibc-2.31/csu/../csu/libc-start.c:308:16
#12 0x40689d in _start (/home/bcaswell/projects/onefuzz/onefuzz/src/integration-tests/libfuzzer-dotnet/my-fuzzer/libfuzzer-dotnet+0x40689d)
NOTE: libFuzzer has rudimentary signal handlers.
Combine libFuzzer with AddressSanitizer or similar for better crash reports.
SUMMARY: libFuzzer: deadly signal
@ -141,7 +141,7 @@ Let's fuzz the `Func` function of our example library named [problems](../../src
artifact_prefix='./'; Test unit written to ./crash-ad81c382bc24cb4edb13f5ab12ce1ee454600a69
Base64: AAEAAA4A
```
As shown in the output, our fuzzing run generated the file `crash-ad81c382bc24cb4edb13f5ab12ce1ee454600a69`. If we provide this file on the command line, we can reproduce the identified crash:
```
$ ./my-fuzzer/libfuzzer-dotnet --target_path=./my-fuzzer/wrapper ./crash-ad81c382bc24cb4edb13f5ab12ce1ee454600a69
@ -168,7 +168,7 @@ Let's fuzz the `Func` function of our example library named [problems](../../src
#8 0x45a942 in main (/home/bcaswell/projects/onefuzz/onefuzz/src/integration-tests/libfuzzer-dotnet/my-fuzzer/libfuzzer-dotnet+0x45a942)
#9 0x7f16819c50b2 in __libc_start_main /build/glibc-eX1tMB/glibc-2.31/csu/../csu/libc-start.c:308:16
#10 0x40689d in _start (/home/bcaswell/projects/onefuzz/onefuzz/src/integration-tests/libfuzzer-dotnet/my-fuzzer/libfuzzer-dotnet+0x40689d)
NOTE: libFuzzer has rudimentary signal handlers.
Combine libFuzzer with AddressSanitizer or similar for better crash reports.
SUMMARY: libFuzzer: deadly signal

View File

@ -1,94 +0,0 @@
# Understanding libFuzzer coverage within OneFuzz
The `libfuzzer_coverage` task in OneFuzz provides coverage data from
libFuzzer targets by extracting compiler-based coverage at runtime.
The extracted data isn't directly mappable to developer-consumable data at
this time. Microsoft uses this data to identify coverage growth and to help
reverse engineers identify areas in the applications that need
investigation.
For developer-focused coverage, use [source-based coverage](https://clang.llvm.org/docs/SourceBasedCodeCoverage.html).
## Implementation Details
For each input in the corpus, the fuzzing target is run using a
platform-specific debugging script which extracts a per-module `sancov` table.
The per-input `sancov` files are summarized for each module, as well as in a
total for the target.
> NOTE: Per-module means the primary executable as well as any loaded .so or .dll that is instrumented with sancov.
* On Linux: [gdb script](../../src/agent/script/linux/libfuzzer-coverage/coverage_cmd.py).
* Supported tables:
* LLVM: `_sancov_cntrs`
* On Windows: [cdb script](../../src/agent/script/win64/libfuzzer-coverage/DumpCounters.js)
* Supported tables:
* LLVM: `_sancov_cntrs`
* MSVC: `sancov$BoolFlag`, `sancov$8bitCounters`, `SancovBitmap`
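
As a rough illustration of how a dumped counter table can be turned into a coverage figure, the sketch below treats each 8-bit counter as a hit/not-hit flag and reports the covered count, the table size, and their ratio. This is a simplified sketch under stated assumptions, not the OneFuzz implementation: the `Summary` type and the `summarize` and `merge_flags` functions are hypothetical names, and the merge rule (a slot counts as covered if any input hit it) is an assumption.

```rust
/// Hypothetical summary of a counter table, treating entries as flags.
#[derive(Debug, PartialEq)]
struct Summary {
    covered: usize,  // entries with a non-zero counter
    features: usize, // total entries in the table
    rate: f64,       // covered / features, or 0.0 for an empty table
}

/// Summarize a raw table of 8-bit counters.
fn summarize(table: &[u8]) -> Summary {
    let features = table.len();
    let covered = table.iter().filter(|&&counter| counter != 0).count();
    let rate = if features == 0 {
        0.0
    } else {
        covered as f64 / features as f64
    };
    Summary { covered, features, rate }
}

/// Merge a new per-input table into a running total (assumed rule:
/// a slot is set to 1 if either table has a non-zero value there).
fn merge_flags(total: &mut Vec<u8>, new: &[u8]) {
    if total.len() < new.len() {
        total.resize(new.len(), 0);
    }
    for (slot, counter) in total.iter_mut().zip(new) {
        *slot = u8::from(*slot != 0 || *counter != 0);
    }
}
```

Merging every per-input table into one total and summarizing it after each input gives a monotonically growing `covered` value, which is the kind of coverage-growth signal described above.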
## Understanding the coverage
Launching an [example libfuzzer](../../src/integration-tests/libfuzzer),
we'll see something like this:
```
$ onefuzz template libfuzzer basic bmc-2021-03-03 bmc 1 linux
INFO:onefuzz:creating libfuzzer from template
INFO:onefuzz:creating job (runtime: 24 hours)
INFO:onefuzz:created job: cd5660e3-3391-48d4-bfff-6f91533fc387
INFO:onefuzz:using container: oft-setup-3907b00953315a1693dcf057be11d03d
INFO:onefuzz:using container: oft-inputs-85f2d72678ad533c83e2be999481dec3
INFO:onefuzz:using container: oft-crashes-85f2d72678ad533c83e2be999481dec3
INFO:onefuzz:using container: oft-reports-85f2d72678ad533c83e2be999481dec3
INFO:onefuzz:using container: oft-unique-reports-85f2d72678ad533c83e2be999481dec3
INFO:onefuzz:using container: oft-unique-inputs-85f2d72678ad533c83e2be999481dec3
INFO:onefuzz:using container: oft-no-repro-85f2d72678ad533c83e2be999481dec3
INFO:onefuzz:using container: oft-coverage-3907b00953315a1693dcf057be11d03d
INFO:onefuzz:uploading target exe `fuzz.exe`
INFO:onefuzz:creating libfuzzer task
INFO:onefuzz:creating libfuzzer_coverage task
INFO:onefuzz:creating libfuzzer_crash_report task
INFO:onefuzz:done creating tasks
{
    "config": {
        "build": "1",
        "duration": 24,
        "name": "bmc",
        "project": "bmc-2021-03-03"
    },
    "end_time": "2021-03-04T21:44:43+00:00",
    "job_id": "cd5660e3-3391-48d4-bfff-6f91533fc387",
    "state": "init",
    "user_info": {
        "application_id": "db5c6d5c-f6d7-477c-9376-1889d3a6b183",
        "object_id": "77b19309-f8e0-4772-9756-f92ca3b35a0f",
        "upn": "example@contoso.com"
    }
}
$
```
After letting our task run for a while, we can fetch our coverage from the `oft-coverage` container listed above.
Let's examine the coverage generated thus far:
```
$ mkdir my-coverage
$ onefuzz containers files download_dir oft-coverage-3907b00953315a1693dcf057be11d03d ./my-coverage/
$ cd my-coverage; find . -type f
./by-module/fuzz.exe.cov
./inputs/01ba4719c80b6fe911b091a7c05124b64eeece964e09c058ef8f9805daca546b.cov
./inputs/01ba4719c80b6fe911b091a7c05124b64eeece964e09c058ef8f9805daca546b/fuzz.exe.cov
./inputs/15dab3cc1c78958bc8c6d959cf708c2062e8327d3db873c2629b243c7e1a1759.cov
./inputs/15dab3cc1c78958bc8c6d959cf708c2062e8327d3db873c2629b243c7e1a1759/fuzz.exe.cov
./total.cov
$
```
What is shown here is:
* A per-module summary from all of the inputs. This is stored as `by-module/module.cov`.
* A per-input/per-module sancov file. This is stored as `inputs/SHA256_OF_INPUT/module.cov`.
* A per-input summary of all of the per-module sancov gathered for the input. This is stored as `inputs/SHA256_OF_INPUT.cov`
* A summary of all of the coverage thus far, as `total.cov`
> NOTE: The `inputs/SHA256_OF_INPUT.cov` and `total.cov` are built by naively concatenating the per-module inputs. The result is primarily useful for understanding coverage growth in general, but doesn't easily map back to source code.
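
The container layout and the naive concatenation described above can be sketched in Rust as follows. This is an illustrative sketch only: the function names are hypothetical, and sorting modules by name before concatenating is an assumption made so the output is deterministic.

```rust
use std::path::{Path, PathBuf};

/// Per-input, per-module file: inputs/SHA256_OF_INPUT/module.cov
fn per_module_path(root: &Path, input_sha256: &str, module: &str) -> PathBuf {
    root.join("inputs")
        .join(input_sha256)
        .join(format!("{}.cov", module))
}

/// Per-input summary file: inputs/SHA256_OF_INPUT.cov
fn per_input_summary_path(root: &Path, input_sha256: &str) -> PathBuf {
    root.join("inputs").join(format!("{}.cov", input_sha256))
}

/// Naively concatenate per-module tables, ordered by module name (assumed).
/// The result has no module boundaries, which is why it is hard to map
/// back to source code.
fn concat_modules(mut modules: Vec<(String, Vec<u8>)>) -> Vec<u8> {
    modules.sort_by(|a, b| a.0.cmp(&b.0));
    modules.into_iter().flat_map(|(_, data)| data).collect()
}
```

Because the concatenated file drops the module boundaries, its length and non-zero count are meaningful as an aggregate, but individual offsets are not attributable to a module without the per-module files.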

View File

@ -18,11 +18,11 @@ are made up of a handful of components, primarily including:
The current task types available are:
* libfuzzer_fuzz: fuzz with a libFuzzer target
* libfuzzer_coverage: extract coverage from a libFuzzer target with the seeds
* libfuzzer_crash_report: Execute the target with crashing inputs, attempting to
generate an informational report for each discovered crash
* libfuzzer_merge: merge newly discovered inputs with an input corpus using
corpus minimization
* coverage: record binary block and source line coverage
* generic_analysis: perform [custom analysis](custom-analysis.md) on every
crashing input
* generic_supervisor: fuzz using user-provided supervisors (such as AFL)

View File

@ -77,7 +77,7 @@ The following are common data types used in multiple locations:
* Scaleset ID - A randomly generated GUID used to uniquely identify a VM
scaleset.
* Task Type - The type of task being executed. Examples include
`generic_crash_report` or `libfuzzer_coverage`. For a full list, see the enum
`generic_crash_report` or `coverage`. For a full list, see the enum
[TaskType](../src/pytypes/onefuzztypes/enums.py).
* OS - An enum value describing the OS used (Currently, only Windows or Linux).
* Version - A compile-time generated string that specifies the OneFuzz version number based on [CURRENT\_RELEASE](../CURRENT_RELEASE) and the sha-1 git revision (See [example](../src/agent/onefuzz-task/build.rs)).

View File

@ -1074,7 +1074,7 @@ If webhook is set to have Event Grid message format then the payload will look a
},
{
"task_id": "00000000-0000-0000-0000-000000000001",
"task_type": "libfuzzer_coverage"
"task_type": "coverage"
}
]
}

View File

@ -1,13 +1,13 @@
// Copyright (c) Microsoft Corporation.
// Licensed under the MIT License.
#[cfg(any(target_os = "linux", target_os = "windows"))]
use crate::local::coverage;
use crate::local::{
common::add_common_config, generic_analysis, generic_crash_report, generic_generator,
libfuzzer, libfuzzer_crash_report, libfuzzer_fuzz, libfuzzer_merge, libfuzzer_regression,
libfuzzer_test_input, radamsa, test_input, tui::TerminalUi,
};
#[cfg(any(target_os = "linux", target_os = "windows"))]
use crate::local::{coverage, libfuzzer_coverage};
use anyhow::{Context, Result};
use clap::{App, Arg, SubCommand};
use crossterm::tty::IsTty;
@ -28,8 +28,6 @@ enum Commands {
LibfuzzerCrashReport,
LibfuzzerTestInput,
LibfuzzerRegression,
#[cfg(any(target_os = "linux", target_os = "windows"))]
LibfuzzerCoverage,
Libfuzzer,
CrashReport,
Generator,
@ -66,8 +64,6 @@ pub async fn run(args: clap::ArgMatches<'static>) -> Result<()> {
libfuzzer_crash_report::run(&sub_args, event_sender).await
}
Commands::LibfuzzerFuzz => libfuzzer_fuzz::run(&sub_args, event_sender).await,
#[cfg(any(target_os = "linux", target_os = "windows"))]
Commands::LibfuzzerCoverage => libfuzzer_coverage::run(&sub_args, event_sender).await,
Commands::LibfuzzerMerge => libfuzzer_merge::run(&sub_args, event_sender).await,
Commands::LibfuzzerTestInput => {
libfuzzer_test_input::run(&sub_args, event_sender).await
@ -124,8 +120,6 @@ pub fn args(name: &str) -> App<'static, 'static> {
Commands::Radamsa => radamsa::args(subcommand.into()),
Commands::LibfuzzerCrashReport => libfuzzer_crash_report::args(subcommand.into()),
Commands::LibfuzzerFuzz => libfuzzer_fuzz::args(subcommand.into()),
#[cfg(any(target_os = "linux", target_os = "windows"))]
Commands::LibfuzzerCoverage => libfuzzer_coverage::args(subcommand.into()),
Commands::LibfuzzerMerge => libfuzzer_merge::args(subcommand.into()),
Commands::LibfuzzerTestInput => libfuzzer_test_input::args(subcommand.into()),
Commands::LibfuzzerRegression => libfuzzer_regression::args(subcommand.into()),

View File

@ -5,9 +5,9 @@
use crate::{
local::{
common::COVERAGE_DIR,
libfuzzer_coverage::{build_coverage_config, build_shared_args as build_coverage_args},
coverage::{build_coverage_config, build_shared_args as build_coverage_args},
},
tasks::coverage::libfuzzer_coverage::CoverageTask,
tasks::coverage::generic::CoverageTask,
};
use crate::{
local::{
@ -91,7 +91,7 @@ pub async fn run(args: &clap::ArgMatches<'_>, event_sender: Option<Sender<UiEven
)?;
let mut coverage = CoverageTask::new(coverage_config);
let coverage_task = spawn(async move { coverage.managed_run().await });
let coverage_task = spawn(async move { coverage.run().await });
task_handles.push(coverage_task);
task_handles.push(coverage_input_monitor.handle);

View File

@ -1,124 +0,0 @@
// Copyright (c) Microsoft Corporation.
// Licensed under the MIT License.
use crate::{
local::common::{
build_local_context, get_cmd_arg, get_cmd_env, get_cmd_exe, get_synced_dir,
get_synced_dirs, CmdType, CHECK_FUZZER_HELP, COVERAGE_DIR, INPUTS_DIR, READONLY_INPUTS,
TARGET_ENV, TARGET_EXE, TARGET_OPTIONS,
},
tasks::{
config::CommonConfig,
coverage::libfuzzer_coverage::{Config, CoverageTask},
},
};
use anyhow::Result;
use clap::{App, Arg, SubCommand};
use flume::Sender;
use storage_queue::QueueClient;
use super::common::{SyncCountDirMonitor, UiEvent};
pub fn build_coverage_config(
args: &clap::ArgMatches<'_>,
local_job: bool,
input_queue: Option<QueueClient>,
common: CommonConfig,
event_sender: Option<Sender<UiEvent>>,
) -> Result<Config> {
let target_exe = get_cmd_exe(CmdType::Target, args)?.into();
let target_env = get_cmd_env(CmdType::Target, args)?;
let target_options = get_cmd_arg(CmdType::Target, args);
let readonly_inputs = if local_job {
vec![
get_synced_dir(INPUTS_DIR, common.job_id, common.task_id, args)?
.monitor_count(&event_sender)?,
]
} else {
get_synced_dirs(READONLY_INPUTS, common.job_id, common.task_id, args)?
.into_iter()
.map(|sd| sd.monitor_count(&event_sender))
.collect::<Result<Vec<_>>>()?
};
let coverage = get_synced_dir(COVERAGE_DIR, common.job_id, common.task_id, args)?
.monitor_count(&event_sender)?;
let check_fuzzer_help = args.is_present(CHECK_FUZZER_HELP);
let config = Config {
target_exe,
target_env,
target_options,
check_fuzzer_help,
input_queue,
readonly_inputs,
coverage,
common,
check_queue: false,
};
Ok(config)
}
pub async fn run(args: &clap::ArgMatches<'_>, event_sender: Option<Sender<UiEvent>>) -> Result<()> {
let context = build_local_context(args, true, event_sender.clone())?;
let config = build_coverage_config(
args,
false,
None,
context.common_config.clone(),
event_sender,
)?;
let mut task = CoverageTask::new(config);
task.managed_run().await
}
pub fn build_shared_args(local_job: bool) -> Vec<Arg<'static, 'static>> {
let mut args = vec![
Arg::with_name(TARGET_EXE)
.long(TARGET_EXE)
.takes_value(true)
.required(true),
Arg::with_name(TARGET_ENV)
.long(TARGET_ENV)
.takes_value(true)
.multiple(true),
Arg::with_name(TARGET_OPTIONS)
.long(TARGET_OPTIONS)
.takes_value(true)
.value_delimiter(" ")
.help("Use a quoted string with space separation to denote multiple arguments"),
Arg::with_name(COVERAGE_DIR)
.takes_value(true)
.required(!local_job)
.long(COVERAGE_DIR),
Arg::with_name(CHECK_FUZZER_HELP)
.takes_value(false)
.long(CHECK_FUZZER_HELP),
];
if local_job {
args.push(
Arg::with_name(INPUTS_DIR)
.long(INPUTS_DIR)
.takes_value(true)
.required(true),
)
} else {
args.push(
Arg::with_name(READONLY_INPUTS)
.takes_value(true)
.required(true)
.long(READONLY_INPUTS)
.multiple(true),
)
}
args
}
pub fn args(name: &'static str) -> App<'static, 'static> {
SubCommand::with_name(name)
.about("execute a local-only libfuzzer coverage task")
.args(&build_shared_args(false))
}

View File

@ -9,8 +9,6 @@ pub mod generic_analysis;
pub mod generic_crash_report;
pub mod generic_generator;
pub mod libfuzzer;
#[cfg(any(target_os = "linux", target_os = "windows"))]
pub mod libfuzzer_coverage;
pub mod libfuzzer_crash_report;
pub mod libfuzzer_fuzz;
pub mod libfuzzer_merge;

View File

@ -92,10 +92,6 @@ pub enum Config {
#[serde(alias = "libfuzzer_merge")]
LibFuzzerMerge(merge::libfuzzer_merge::Config),
#[cfg(any(target_os = "linux", target_os = "windows"))]
#[serde(alias = "libfuzzer_coverage")]
LibFuzzerCoverage(coverage::libfuzzer_coverage::Config),
#[serde(alias = "libfuzzer_regression")]
LibFuzzerRegression(regression::libfuzzer::Config),
@ -137,8 +133,6 @@ impl Config {
Config::LibFuzzerFuzz(c) => &mut c.common,
Config::LibFuzzerMerge(c) => &mut c.common,
Config::LibFuzzerReport(c) => &mut c.common,
#[cfg(any(target_os = "linux", target_os = "windows"))]
Config::LibFuzzerCoverage(c) => &mut c.common,
Config::LibFuzzerRegression(c) => &mut c.common,
Config::GenericAnalysis(c) => &mut c.common,
Config::GenericMerge(c) => &mut c.common,
@ -156,8 +150,6 @@ impl Config {
Config::LibFuzzerFuzz(c) => &c.common,
Config::LibFuzzerMerge(c) => &c.common,
Config::LibFuzzerReport(c) => &c.common,
#[cfg(any(target_os = "linux", target_os = "windows"))]
Config::LibFuzzerCoverage(c) => &c.common,
Config::LibFuzzerRegression(c) => &c.common,
Config::GenericAnalysis(c) => &c.common,
Config::GenericMerge(c) => &c.common,
@ -175,8 +167,6 @@ impl Config {
Config::LibFuzzerFuzz(_) => "libfuzzer_fuzz",
Config::LibFuzzerMerge(_) => "libfuzzer_merge",
Config::LibFuzzerReport(_) => "libfuzzer_crash_report",
#[cfg(any(target_os = "linux", target_os = "windows"))]
Config::LibFuzzerCoverage(_) => "libfuzzer_coverage",
Config::LibFuzzerRegression(_) => "libfuzzer_regression",
Config::GenericAnalysis(_) => "generic_analysis",
Config::GenericMerge(_) => "generic_merge",
@ -242,12 +232,6 @@ impl Config {
.managed_run()
.await
}
#[cfg(any(target_os = "linux", target_os = "windows"))]
Config::LibFuzzerCoverage(config) => {
coverage::libfuzzer_coverage::CoverageTask::new(config)
.managed_run()
.await
}
Config::LibFuzzerMerge(config) => merge::libfuzzer_merge::spawn(Arc::new(config)).await,
Config::GenericAnalysis(config) => analysis::generic::run(config).await,

View File

@ -1,298 +0,0 @@
// Copyright (c) Microsoft Corporation.
// Licensed under the MIT License.
//! # Coverage Task
//!
//! Computes a streaming coverage metric using Sancov-instrumented libFuzzers.
//! Reports the latest coverage rate via telemetry events and updates a remote
//! total coverage map in blob storage.
//!
//! ## Instrumentation
//!
//! Assumes the libFuzzer is instrumented with Sancov inline 8-bit counters.
//! This feature updates a global table without any PC callback. The coverage
//! scripts find and dump this table after executing the test input. For now,
//! our metric projects the counter value to a single bit, treating each table
//! entry as a flag rather than a counter.
//!
//! ## Dependencies
//!
//! This task invokes OS-specific debugger scripts to dump the coverage for
//! each input. To do this, the following must be in the `$PATH`:
//!
//! ### Linux
//! - `python3` (3.6)
//! - `gdb` (8.1)
//!
//! ### Windows
//! - `powershell.exe` (5.1)
//! - `cdb.exe` (10.0)
//!
//! Versions in parentheses have been tested.
use crate::tasks::heartbeat::*;
use crate::tasks::{config::CommonConfig, generic::input_poller::*};
use crate::tasks::{
coverage::{recorder::CoverageRecorder, total::TotalCoverage},
utils::default_bool_true,
};
use anyhow::{Context, Result};
use async_trait::async_trait;
use onefuzz::{fs::list_files, libfuzzer::LibFuzzer, syncdir::SyncedDir};
use onefuzz_telemetry::{Event::coverage_data, EventData};
use reqwest::Url;
use serde::Deserialize;
use std::collections::{BTreeMap, HashMap};
use std::{
ffi::OsString,
path::{Path, PathBuf},
sync::Arc,
};
use storage_queue::{Message, QueueClient};
use tokio::fs;
const TOTAL_COVERAGE: &str = "total.cov";
#[derive(Debug, Deserialize)]
pub struct Config {
pub target_exe: PathBuf,
pub target_env: HashMap<String, String>,
pub target_options: Vec<String>,
pub input_queue: Option<QueueClient>,
pub readonly_inputs: Vec<SyncedDir>,
pub coverage: SyncedDir,
#[serde(default = "default_bool_true")]
pub check_queue: bool,
#[serde(default = "default_bool_true")]
pub check_fuzzer_help: bool,
#[serde(flatten)]
pub common: CommonConfig,
}
/// Compute the coverage provided by one or both of:
///
/// 1. A list of seed corpus containers (one-time batch mode)
/// 2. A queue of inputs pending coverage analysis (streaming)
///
/// If `readonly_inputs` is empty and `input_queue` is absent, this task
/// will do nothing. If `input_queue` is present, then this task will poll
/// forever.
pub struct CoverageTask {
config: Arc<Config>,
poller: InputPoller<Message>,
}
impl CoverageTask {
pub fn new(config: Config) -> Self {
let config = Arc::new(config);
let poller = InputPoller::new("libfuzzer-coverage");
Self { config, poller }
}
pub async fn verify(&self) -> Result<()> {
let fuzzer = LibFuzzer::new(
&self.config.target_exe,
&self.config.target_options,
&self.config.target_env,
&self.config.common.setup_dir,
);
fuzzer.verify(self.config.check_fuzzer_help, None).await
}
pub async fn managed_run(&mut self) -> Result<()> {
info!("starting libFuzzer coverage task");
self.verify().await?;
self.config.coverage.init_pull().await?;
self.process().await
}
async fn process(&mut self) -> Result<()> {
let mut processor = CoverageProcessor::new(self.config.clone()).await?;
info!("processing initial dataset");
let mut seen_inputs = false;
// Update the total with the coverage from each seed corpus.
for dir in &self.config.readonly_inputs {
debug!("recording coverage for {}", dir.local_path.display());
dir.init_pull().await?;
if self.record_corpus_coverage(&mut processor, dir).await? {
seen_inputs = true;
}
}
if seen_inputs {
processor.report_total().await?;
self.config.coverage.sync_push().await?;
info!(
"recorded coverage for {} containers in `readonly_inputs`",
self.config.readonly_inputs.len(),
);
} else {
info!("no initial inputs in `readonly_inputs`",);
}
// If a queue has been provided, poll it for new coverage.
if let Some(queue) = &self.config.input_queue {
info!("polling queue for new coverage");
let callback = CallbackImpl::new(queue.clone(), processor)?;
self.poller.run(callback).await?;
}
Ok(())
}
async fn record_corpus_coverage(
&self,
processor: &mut CoverageProcessor,
corpus_dir: &SyncedDir,
) -> Result<bool> {
let mut corpus = fs::read_dir(&corpus_dir.local_path)
.await
.with_context(|| {
format!(
"unable to read corpus coverage directory: {}",
corpus_dir.local_path.display()
)
})?;
let mut seen_inputs = false;
let mut count = 0;
loop {
let input = match corpus.next_entry().await {
Ok(Some(input)) => input,
Ok(None) => break,
Err(err) => {
error!("{}", err);
continue;
}
};
processor.test_input(&input.path()).await?;
seen_inputs = true;
count += 1;
// sync the coverage container after every 10 inputs
if count % 10 == 0 {
self.config.coverage.sync_push().await?;
}
}
Ok(seen_inputs)
}
}
pub struct CoverageProcessor {
config: Arc<Config>,
pub recorder: CoverageRecorder,
pub total: TotalCoverage,
pub module_totals: BTreeMap<OsString, TotalCoverage>,
heartbeat_client: Option<TaskHeartbeatClient>,
}
impl CoverageProcessor {
pub async fn new(config: Arc<Config>) -> Result<Self> {
let heartbeat_client = config.common.init_heartbeat(None).await?;
let total = TotalCoverage::new(config.coverage.local_path.join(TOTAL_COVERAGE));
let recorder = CoverageRecorder::new(config.clone()).await?;
let module_totals = BTreeMap::default();
Ok(Self {
config,
recorder,
total,
module_totals,
heartbeat_client,
})
}
async fn update_module_total(&mut self, file: &Path, data: &[u8]) -> Result<()> {
let module = file
.file_name()
.ok_or_else(|| format_err!("module must have filename"))?
.to_os_string();
debug!("updating module info {:?}", module);
if !self.module_totals.contains_key(&module) {
let parent = &self.config.coverage.local_path.join("by-module");
fs::create_dir_all(parent).await.with_context(|| {
format!(
"unable to create by-module coverage directory: {}",
parent.display()
)
})?;
let module_total = parent.join(&module);
let total = TotalCoverage::new(module_total);
self.module_totals.insert(module.clone(), total);
}
self.module_totals[&module].update_bytes(data).await?;
debug!("updated {:?}", module);
Ok(())
}
async fn collect_by_module(&mut self, path: &Path) -> Result<()> {
let mut files = list_files(&path).await?;
files.sort();
let mut sum = Vec::new();
for file in &files {
debug!("checking {:?}", file);
let mut content = fs::read(file)
.await
.with_context(|| format!("unable to read module coverage: {}", file.display()))?;
self.update_module_total(file, &content).await?;
sum.append(&mut content);
}
let mut combined = path.as_os_str().to_owned();
combined.push(".cov");
fs::write(&combined, sum)
.await
.with_context(|| format!("unable to write combined coverage file: {:?}", combined))?;
Ok(())
}
pub async fn test_input(&mut self, input: &Path) -> Result<()> {
info!("processing input {:?}", input);
let new_coverage = self.recorder.record(input).await?;
self.collect_by_module(&new_coverage).await?;
self.update_total().await?;
Ok(())
}
async fn update_total(&mut self) -> Result<()> {
let mut total = Vec::new();
for module_total in self.module_totals.values() {
if let Some(mut module_data) = module_total.data().await? {
total.append(&mut module_data);
}
}
self.total.write(&total).await?;
Ok(())
}
pub async fn report_total(&self) -> Result<()> {
let info = self.total.info().await?;
event!(coverage_data; EventData::Covered = info.covered, EventData::Features = info.features, EventData::Rate = info.rate);
Ok(())
}
}
#[async_trait]
impl Processor for CoverageProcessor {
async fn process(&mut self, _url: Option<Url>, input: &Path) -> Result<()> {
self.heartbeat_client.alive();
self.test_input(input).await?;
self.report_total().await?;
self.config.coverage.sync_push().await?;
Ok(())
}
}

View File

@ -2,6 +2,3 @@
// Licensed under the MIT License.
pub mod generic;
pub mod libfuzzer_coverage;
pub mod recorder;
pub mod total;

View File

@ -1,215 +0,0 @@
// Copyright (c) Microsoft Corporation.
// Licensed under the MIT License.
use std::{
env,
path::{Path, PathBuf},
process::Stdio,
sync::Arc,
};
use anyhow::{Context, Result};
use onefuzz::{fs::has_files, sha256::digest_file};
use tempfile::{tempdir, TempDir};
use tokio::{
fs,
process::{Child, Command},
};
use crate::tasks::coverage::libfuzzer_coverage::Config;
pub struct CoverageRecorder {
config: Arc<Config>,
script_path: PathBuf,
// keep _temp_dir such that Drop cleans up temporary files
_temp_dir: Option<TempDir>,
}
const SYMBOL_EXTRACT_ERROR: &str = "Target appears to be missing sancov instrumentation. This error can also happen if symbols for the target are not available.";
impl CoverageRecorder {
pub async fn new(config: Arc<Config>) -> Result<Self> {
let (script_path, _temp_dir) = match env::var("ONEFUZZ_TOOLS") {
Ok(tools_dir) => {
let script_path = PathBuf::from(tools_dir);
if cfg!(target_os = "linux") {
(
script_path
.join("linux")
.join("libfuzzer-coverage")
.join("coverage_cmd.py"),
None,
)
} else if cfg!(target_os = "windows") {
(
script_path
.join("win64")
.join("libfuzzer-coverage")
.join("DumpCounters.js"),
None,
)
} else {
bail!("coverage recorder not implemented for target os");
}
}
Err(_) => {
let temp_dir = tempdir()?;
let script_path = if cfg!(target_os = "linux") {
let script_path = temp_dir.path().join("coverage_cmd.py");
let content = include_bytes!(
"../../../../script/linux/libfuzzer-coverage/coverage_cmd.py"
);
fs::write(&script_path, content).await.with_context(|| {
format!("unable to write file: {}", script_path.display())
})?;
script_path
} else if cfg!(target_os = "windows") {
let script_path = temp_dir.path().join("DumpCounters.js");
let content = include_bytes!(
"../../../../script/win64/libfuzzer-coverage/DumpCounters.js"
);
fs::write(&script_path, content).await.with_context(|| {
format!("unable to write file: {}", script_path.display())
})?;
script_path
} else {
bail!("coverage recorder not implemented for target os");
};
(script_path, Some(temp_dir))
}
};
Ok(Self {
config,
script_path,
_temp_dir,
})
}
/// Invoke a script to write coverage to a file.
///
/// Per module coverage is written to:
/// coverage/inputs/<SHA256_OF_INPUT>/<module_name>.cov
///
/// The `.cov` file is a binary dump of the 8-bit PC counter table.
pub async fn record(&mut self, test_input: impl AsRef<Path>) -> Result<PathBuf> {
let test_input = test_input.as_ref();
let coverage_path = {
let digest = digest_file(test_input).await?;
self.config.coverage.local_path.join("inputs").join(digest)
};
fs::create_dir_all(&coverage_path).await.with_context(|| {
format!(
"unable to create coverage path: {}",
coverage_path.display()
)
})?;
let script = self.invoke_debugger_script(test_input, &coverage_path)?;
let output = script.wait_with_output().await?;
let stdout = String::from_utf8_lossy(&output.stdout);
let stderr = String::from_utf8_lossy(&output.stderr);
if !output.status.success() {
let err = format_err!("coverage recording failed: {}", output.status);
error!("{}", err);
error!("recording stderr: {}", stderr);
error!("recording stdout: {}", stdout);
return Err(err);
} else {
debug!("recording stderr: {}", stderr);
debug!("recording stdout: {}", stdout);
}
if !has_files(&coverage_path).await? {
tokio::fs::remove_dir(&coverage_path)
.await
.with_context(|| {
format!(
"unable to remove coverage path: {}",
coverage_path.display()
)
})?;
let filename = test_input
.file_name()
.ok_or_else(|| format_err!("unable to identify coverage input filename"))?;
bail!(
"{}\ntarget_exe: {}\ninput: {:?}\ndebugger stdout: {}\ndebugger stderr: {}",
SYMBOL_EXTRACT_ERROR,
self.config.target_exe.display(),
filename,
stdout,
stderr
);
}
Ok(coverage_path)
}
#[cfg(target_os = "linux")]
fn invoke_debugger_script(&self, test_input: &Path, output: &Path) -> Result<Child> {
let mut cmd = Command::new("gdb");
cmd.arg(&self.config.target_exe)
.arg("-nh")
.arg("-batch")
.arg("-x")
.arg(&self.script_path)
.arg("-ex")
.arg(format!(
"coverage {} {} {}",
&self.config.target_exe.to_string_lossy(),
test_input.to_string_lossy(),
output.to_string_lossy(),
))
.stdin(Stdio::null())
.stdout(Stdio::piped())
.stderr(Stdio::piped())
.kill_on_drop(true);
for (k, v) in &self.config.target_env {
cmd.env(k, v);
}
let child = cmd.spawn().context("gdb failed to start")?;
Ok(child)
}
#[cfg(target_os = "windows")]
fn invoke_debugger_script(&self, test_input: &Path, output: &Path) -> Result<Child> {
let should_disable_sympath = !self.config.target_env.contains_key("_NT_SYMBOL_PATH");
let cdb_cmd = format!(
".scriptload {}; !dumpcounters {:?}, {}; q",
self.script_path.to_string_lossy(),
output.to_string_lossy(),
should_disable_sympath,
);
let mut cmd = Command::new("cdb.exe");
cmd.arg("-c")
.arg(cdb_cmd)
.arg(&self.config.target_exe)
.arg(test_input)
.stdin(Stdio::null())
.stdout(Stdio::piped())
.stderr(Stdio::piped())
.kill_on_drop(true);
for (k, v) in &self.config.target_env {
cmd.env(k, v);
}
let child = cmd.spawn().context("cdb.exe failed to start")?;
Ok(child)
}
}

View File

@ -1,94 +0,0 @@
// Copyright (c) Microsoft Corporation.
// Licensed under the MIT License.
use std::path::{Path, PathBuf};
use anyhow::Result;
use tokio::{fs, io};
pub struct TotalCoverage {
/// Absolute path to the total coverage file.
///
/// May not yet exist on disk.
path: PathBuf,
}
#[derive(Debug)]
pub struct Info {
pub covered: u64,
pub features: u64,
pub rate: f64,
}
impl TotalCoverage {
pub fn new(path: PathBuf) -> Self {
Self { path }
}
pub async fn data(&self) -> Result<Option<Vec<u8>>> {
use io::ErrorKind::NotFound;
let data = fs::read(&self.path).await;
if let Err(err) = &data {
if err.kind() == NotFound {
return Ok(None);
}
}
Ok(Some(data?))
}
pub fn path(&self) -> &Path {
&self.path
}
pub async fn write(&self, data: &[u8]) -> Result<()> {
fs::write(self.path(), data).await?;
Ok(())
}
pub async fn update_bytes(&self, new_data: &[u8]) -> Result<()> {
match self.data().await {
Ok(Some(mut total_data)) => {
if total_data.len() < new_data.len() {
total_data.resize_with(new_data.len(), || 0);
}
for (i, b) in new_data.iter().enumerate() {
if *b > 0 {
total_data[i] = 1;
}
}
self.write(&total_data).await?;
}
Ok(None) => {
// Base case: we don't yet have any total coverage. Promote the
// new coverage to being our total coverage.
info!("initializing total coverage map {}", self.path().display());
self.write(new_data).await?;
}
Err(err) => {
// Couldn't read total for some other reason, so this is a real error.
return Err(err);
}
}
Ok(())
}
pub async fn info(&self) -> Result<Info> {
let data = self
.data()
.await?
.ok_or_else(|| format_err!("coverage file not found"))?;
let covered = data.iter().filter(|&&c| c > 0).count() as u64;
let features = data.len() as u64;
let rate = (covered as f64) / (features as f64);
Ok(Info {
covered,
features,
rate,
})
}
}
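
Reviewer note: the merge rule in the removed `TotalCoverage::update_bytes` can be sketched as a pure function, without the file I/O. This is a minimal illustration, not the original implementation: `merge_coverage` and `coverage_rate` are hypothetical names that appear nowhere in this commit. As in the removed code, the first table is adopted verbatim (raw counters are kept), while later merges only set a slot to `1` when the new run hit it. The removed `info` divides without checking for an empty table, which yields `NaN` for zero features; the sketch returns `None` in that case, which is an assumption about the desired behavior.

```rust
/// Merge one run's 8-bit counter table into the accumulated total.
///
/// Hypothetical helper: mirrors the logic of the removed `update_bytes`,
/// but takes and returns in-memory buffers instead of reading a file.
fn merge_coverage(total: Option<Vec<u8>>, new_data: &[u8]) -> Vec<u8> {
    match total {
        // Base case: no total exists yet, so the new table becomes the total.
        None => new_data.to_vec(),
        Some(mut total) => {
            // Grow the total with zeroed slots if the new table is longer.
            if total.len() < new_data.len() {
                total.resize(new_data.len(), 0);
            }
            // Mark every slot that the new run hit at least once.
            for (slot, &count) in total.iter_mut().zip(new_data) {
                if count > 0 {
                    *slot = 1;
                }
            }
            total
        }
    }
}

/// Fraction of slots with a nonzero counter, or `None` for an empty table.
///
/// Hypothetical helper: the empty-table guard is an assumption added here.
fn coverage_rate(data: &[u8]) -> Option<f64> {
    if data.is_empty() {
        return None;
    }
    let covered = data.iter().filter(|&&c| c > 0).count();
    Some(covered as f64 / data.len() as f64)
}
```

Calling `merge_coverage(Some(previous), &latest)` once per recorded input and then `coverage_rate(&total)` reproduces the covered/features ratio that the removed `Info` struct reported.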

View File

@ -78,7 +78,7 @@ libfuzzer_linux = JobTemplate(
job_id=UUID(int=0),
prereq_tasks=[UUID(int=0)],
task=TaskDetails(
type=TaskType.libfuzzer_coverage,
type=TaskType.coverage,
duration=1,
target_exe="fuzz.exe",
target_env={},

View File

@ -179,40 +179,6 @@ TASK_DEFINITIONS = {
],
monitor_queue=ContainerType.crashes,
),
TaskType.libfuzzer_coverage: TaskDefinition(
features=[
TaskFeature.target_exe,
TaskFeature.target_env,
TaskFeature.target_options,
TaskFeature.check_fuzzer_help,
],
vm=VmDefinition(compare=Compare.Equal, value=1),
containers=[
ContainerDefinition(
type=ContainerType.setup,
compare=Compare.Equal,
value=1,
permissions=[ContainerPermission.Read, ContainerPermission.List],
),
ContainerDefinition(
type=ContainerType.readonly_inputs,
compare=Compare.AtLeast,
value=1,
permissions=[ContainerPermission.Read, ContainerPermission.List],
),
ContainerDefinition(
type=ContainerType.coverage,
compare=Compare.Equal,
value=1,
permissions=[
ContainerPermission.List,
ContainerPermission.Read,
ContainerPermission.Write,
],
),
],
monitor_queue=ContainerType.readonly_inputs,
),
TaskType.libfuzzer_merge: TaskDefinition(
features=[
TaskFeature.target_exe,

View File

@ -106,7 +106,7 @@ class Task(BASE_TASK, ORMMixin):
# 'task_id': '835f7b3f-43ad-4718-b7e4-d506d9667b09',
# 'state': 'stopped',
# 'config': {
# 'task': {'type': 'libfuzzer_coverage'},
# 'task': {'type': 'coverage'},
# 'vm': {'count': 1}
# }
# }

View File

@ -25,6 +25,10 @@ class TestTaskDefinition(unittest.TestCase):
def test_all_defined(self) -> None:
for entry in [TaskType[x] for x in TaskType.__members__]:
if entry == TaskType.libfuzzer_coverage:
# Deprecated, kept in enum for deserialization back-compat.
continue
self.assertIn(entry, TASK_DEFINITIONS)
def test_basic(self) -> None:

View File

@ -854,11 +854,11 @@ class Tasks(Endpoint):
self.logger.debug("creating task: %s", task_type)
if task_type == TaskType.libfuzzer_coverage:
self.logger.warning(
"DEPRECATED: the `libfuzzer_coverage` task type is deprecated. "
"It will be removed in an upcoming release. "
self.logger.error(
"The `libfuzzer_coverage` task type is deprecated. "
"Please migrate to the `coverage` task type."
)
raise RuntimeError("`libfuzzer_coverage` task type not supported")
job_id_expanded = self._disambiguate_uuid(
"job_id",

View File

@ -236,7 +236,7 @@ def main() -> None:
),
JobTaskStopped(
task_id=UUID(int=1),
task_type=TaskType.libfuzzer_coverage,
task_type=TaskType.coverage,
),
],
),

View File

@ -151,7 +151,10 @@ class TaskState(Enum):
class TaskType(Enum):
coverage = "coverage"
libfuzzer_fuzz = "libfuzzer_fuzz"
# Deprecated, kept for deserialization of old task data.
libfuzzer_coverage = "libfuzzer_coverage"
libfuzzer_crash_report = "libfuzzer_crash_report"
libfuzzer_merge = "libfuzzer_merge"
libfuzzer_regression = "libfuzzer_regression"