
2095 Commits

Author SHA1 Message Date
f26838452b remove stray ? (#652) 2021-03-09 11:57:58 -05:00
d2e4baa48d Updating Task Heartbeat Struct to actually include job_id (#646) 2021-03-09 10:24:04 -05:00
0a3812d8bc Add job stopped task info (#648) 2021-03-09 10:06:06 -05:00
18bf361d62 release 2.8.0 (#639) 2.8.0 2021-03-08 12:39:51 -05:00
54e2cb2bf1 update signalrcore (#640) 2021-03-08 12:22:55 -05:00
23dc8ad301 explain the source of task failures related notifications (#635) 2021-03-06 13:35:09 +00:00
157a14d003 fix markdown link error (#637) 2021-03-05 19:20:00 -05:00
fb8e7490c6 pre-define clusterfuzz as a dependency (#636) 2021-03-05 15:37:30 -05:00
1c09caedc5 add howto guide to understand the libfuzzer_coverage task (#631) 2021-03-05 19:21:11 +00:00
12df25ca17 Defer log formatting (#634)
Log formatting allocates memory and should only happen
if the log message is needed.
2021-03-05 10:39:15 -08:00
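
The idea behind 12df25ca17 (only format a log message when it will actually be emitted) can be sketched as follows. This is a Python analogue using the standard `logging` module, not the code from the commit; the `Expensive` class and the `defer_demo` logger name are hypothetical and exist only to count formatting calls.

```python
import logging


class Expensive:
    """Hypothetical value that counts how often it is turned into a string."""

    def __init__(self):
        self.format_calls = 0

    def __str__(self):
        self.format_calls += 1
        return "expensive-value"


logger = logging.getLogger("defer_demo")
logger.setLevel(logging.WARNING)          # DEBUG messages are discarded
logger.addHandler(logging.NullHandler())  # keep the demo silent

value = Expensive()

# Eager: the f-string is built before the logger checks the level,
# so the formatting cost is paid even though nothing is logged.
logger.debug(f"state: {value}")
eager_calls = value.format_calls

# Deferred: the arguments are passed through unformatted and are only
# interpolated if a handler actually emits the record.
logger.debug("state: %s", value)
deferred_calls = value.format_calls - eager_calls
```

Here the eager call formats the value once, while the deferred call never formats it, because the record is dropped at the level check.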
fdac6b02a8 re-pin pyjwt version due to conflicts with azure-cli-core (#630) 2021-03-03 22:18:45 -05:00
7fc725d012 add non-x86_64 architecture libfuzzer target support using qemu-user (#600) 2021-03-03 19:06:50 -05:00
92c1d0a7a1 only set VM passwords on Windows (#620) 2021-03-03 21:27:56 +00:00
d2e6c29e6b add info to help clarify success after warnings (#629) 2021-03-03 15:20:31 -05:00
ec33531870 Add Threat Model documentation (#482)
This PR includes a [Threat Model](https://aka.ms/tmt) for OneFuzz.
2021-03-03 19:30:56 +00:00
4ccc84a7de remove pyjwt from dependency list (#627) 2021-03-03 12:47:04 -05:00
2bec9db828 update azure-cli and azure-cli-core (#626) 2021-03-03 10:50:30 +00:00
78d6adf555 upgrade azure-storage-blob to 12.8.0 (#625)
Note: this makes use of the new `ContainerClient.exists()` feature, which models our existing mechanism for checking container existence.
2021-03-03 10:33:23 +00:00
b30ade7d0c update AFL++ to 3.10c (#609) 2021-03-02 22:11:46 -05:00
04fc41597e require target_exe to be a canonicalized relative path (#613) 2021-03-02 19:06:02 -05:00
4489036d9f add node & task heartbeat events (#621)
This adds node & task heartbeats and makes the event data available as structured data in the logs.
2021-03-02 22:04:39 +00:00
7f66eeee0d handle OperationNotAllowed errors when creating VMSS (#614) 2021-03-02 16:14:10 -05:00
a0c04ec3d1 Add symbol cache and filtering (#570)
- Add caching to symbol table-driven module disassembly on Linux.
- Add configurable regex-based filtering for coverage collection, by module and module-scoped symbol name.

Block coverage recording can be manually tested using the `block_coverage` example in the `coverage` crate. See `./block_coverage -h` for expected args.

The filter file is optional. The file format is JSON like this:
```json
{
    "modules": {
        "allow": [
            "<module-path-regex-1>",
            "<module-path-regex-2>"
        ]
    },
    "symbols": {
        "<module-path-regex-1>": {
            "allow": [
                "<symbol-name-regex-1>",
                "<symbol-name-regex-2>"
            ]
        },
        "<module-path-regex-2>": {
            "deny": [
                "<symbol-name-regex-3>",
                "<symbol-name-regex-4>"
            ]
        }
    }
}
```

Closes #285.
2021-03-02 19:42:05 +00:00
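
A rough sketch of how a filter file shaped like the one in a0c04ec3d1 could be evaluated. This is Python, not the `coverage` crate's implementation, and the matching semantics are assumptions: a module is kept if any `modules.allow` regex matches its path, the first matching entry under `symbols` decides, `allow` keeps only matching symbols, and `deny` drops matching ones. The paths, patterns, and function names are hypothetical.

```python
import json
import re

# Hypothetical filter file contents, following the documented layout.
FILTER_JSON = """
{
    "modules": {"allow": ["/usr/lib/libdemo.*", ".*/target_app$"]},
    "symbols": {
        "/usr/lib/libdemo.*": {"allow": ["^parse_.*"]},
        ".*/target_app$": {"deny": ["^test_.*"]}
    }
}
"""


def _any_match(patterns, text):
    """True if at least one regex in `patterns` matches `text`."""
    return any(re.search(p, text) for p in patterns)


def module_allowed(cfg, module_path):
    """Assumption: with no module allow list, every module is kept."""
    allow = cfg.get("modules", {}).get("allow")
    return True if allow is None else _any_match(allow, module_path)


def symbol_allowed(cfg, module_path, symbol):
    """Apply the module filter, then the first matching symbol rule."""
    if not module_allowed(cfg, module_path):
        return False
    for module_re, rule in cfg.get("symbols", {}).items():
        if re.search(module_re, module_path):
            if "allow" in rule:
                return _any_match(rule["allow"], symbol)
            if "deny" in rule:
                return not _any_match(rule["deny"], symbol)
    return True  # assumption: no matching rule means the symbol is kept


cfg = json.loads(FILTER_JSON)
```

For example, `symbol_allowed(cfg, "/usr/lib/libdemo.so", "parse_header")` is kept under these assumed rules, while `symbol_allowed(cfg, "/opt/target_app", "test_x")` is dropped.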
b97093735a fix agent retry on connection level failures (#623)
In debugging the connection retry issues, I dug into this more.  

Apparently, some of hyper's connection errors are not mapped to std::io::Error, rendering the existing downcast impl ineffective.

As such, this PR makes the following updates:
1. Any request that fails for what `reqwest` calls a `connection` error is considered transient.
2. Updates the retry notify code to use our `warn` macro such that the events show up in application insights.
3. Updates the unit test to demonstrate the failure by trying to connect to `http://localhost:81/`, which shouldn't be listening on any system.
4. Adds a simple unit test to verify that, with `send_retry_default`, connections to https://www.microsoft.com work.

Fixes #263
2021-03-02 19:02:10 +00:00
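
The policy described in b97093735a (treat connection-level failures as transient, report each retry, and let other errors through) might look like this in outline. This is a Python sketch, not the agent's Rust code: `send_with_retry`, its parameters, and the backoff schedule are all hypothetical, and Python's built-in `ConnectionError` stands in for what `reqwest` calls a connection error.

```python
import time


def send_with_retry(send, max_attempts=4, base_delay=0.01, notify=print):
    """Call `send()` and retry only on connection-level failures.

    `notify` receives a message for every retried failure, so the
    retries are visible to whatever logging backend is plugged in.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return send()
        except ConnectionError as err:
            if attempt == max_attempts:
                raise  # retries exhausted: surface the last error
            notify(f"transient connection error (attempt {attempt}): {err}")
            time.sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff


# Usage: a hypothetical request that fails twice, then succeeds.
warnings = []
failures_left = [2]


def flaky_request():
    if failures_left[0] > 0:
        failures_left[0] -= 1
        raise ConnectionRefusedError("connection refused")
    return "ok"


result = send_with_retry(flaky_request, notify=warnings.append)
```

Any exception that is not a `ConnectionError` propagates on the first attempt, so a non-transient failure cannot cause a retry loop.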
c537458ade update azure-mgmt-compute to 19.0.0 (#611) 2021-03-02 18:44:26 +00:00
ba836a2062 update azure-mgmt-eventgrid to 3.0.0rc9 (#610) 2021-03-02 18:18:38 +00:00
296ba2ee23 update azure-mgmt-storage to 17.0.0 (#612) 2021-03-02 18:00:07 +00:00
e43c1c875c simplify batch-processing log (#622)
Simplifies the logs from:

`Processing batch-downloaded input Ok(DirEntry(DirEntry("task_crashes_1/input-b4c3482194a6ebd275577ea52255fcea3358f3220c408d3c53b9f32b653e6586.txt")))`

to:

`Processing batch-downloaded input: task_crashes_1/input-b4c3482194a6ebd275577ea52255fcea3358f3220c408d3c53b9f32b653e6586.txt`
2021-03-02 17:32:07 +00:00
d4cedabdf8 update 3rd party rust dependencies (#624) 2021-03-02 11:41:30 -05:00
32681b2611 update azure-mgmt-resource (#607) 2021-03-02 08:35:07 +00:00
37af0f1112 add CodeQL pipeline (#617) 2021-03-01 14:06:38 -05:00
100e22a359 Rewrite redundant Result wraps (#616) 2021-03-01 12:43:30 -05:00
91bbcf7a59 update Azure Devops tool integration prereqs (#608) 2021-03-01 11:55:25 -05:00
d1fe872a07 Release 2.7.0 (#604) 2.7.0 2021-02-28 01:32:26 +00:00
0f895d11c9 add context to logging of supervisor work queue interaction (#601) 2021-02-27 20:17:04 -05:00
c1a2c9febb fix infinite loop on request error that isn't an IO Error (#603) 2021-02-26 20:23:39 -05:00
43585d84e3 handle azcopy's rename of NOTICES.txt (#602) 2021-02-26 19:43:04 -05:00
6a82f57c4a remove unused library from prereqs (#599) 2021-02-26 16:57:55 -05:00
e3c73d7a10 Update command variable expansion (#561)
* Documents `crashes_account` and `crashes_container`
* Adds `reports_dir` and support for `unique_reports`, `reports`, and `no_repro` containers to the generic analysis task
* Adds `microsoft_telemetry_key` and `instance_telemetry_key` to generic supervisor, generator, and analysis tasks
2021-02-26 20:58:09 +00:00
419ca05b28 Actively tail worker stdio from supervisor agent (#588)
In the supervisor agent, incrementally read from the running worker agent's redirected stderr and stdout, instead of waiting until it exits.

The worker agent's stderr and stdout are piped to the supervisor when tasks are run. The supervisor's `WorkerRunner` does _not_ use `wait_with_output()`, which handles this (at the cost of blocking). Instead, it makes repeated calls to `try_wait()` on timer-based state transitions, and does not try to read the pipes until the worker exits. But when one of the child's pipes is full, the child can block forever waiting on a `write(2)`, such as in a `log` facade implementation.

This bug has not been caught because we control the child worker agent, and until recently, it mostly only wrote to these streams using `env_logger` at its default log level. But recent work: (1) set more-verbose `INFO` level default logging, (2) logged stderr/stdout lines of child processes of _the worker_, and (3) some user targets logged very verbosely for debugging. This surfaced the underlying issue.
2021-02-26 20:09:02 +00:00
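
The failure mode in 419ca05b28 and the general shape of the fix can be illustrated with a small Python sketch. It is not the supervisor's Rust implementation: `run_and_tail`, `drain`, and the child script are hypothetical. The child writes more output than a typical pipe buffer holds, so it would block on a write if the parent only polled for exit without reading.

```python
import subprocess
import sys
import threading
import time


def drain(pipe, sink):
    """Read lines from `pipe` as they arrive so the child never blocks."""
    for line in pipe:
        sink.append(line)
    pipe.close()


def run_and_tail(cmd, poll_interval=0.05):
    """Poll the child for exit while background threads tail its pipes."""
    child = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    out, err = [], []
    readers = [
        threading.Thread(target=drain, args=(child.stdout, out), daemon=True),
        threading.Thread(target=drain, args=(child.stderr, err), daemon=True),
    ]
    for reader in readers:
        reader.start()
    # Non-blocking exit check, comparable to calling try_wait() on a timer.
    while child.poll() is None:
        time.sleep(poll_interval)
    for reader in readers:
        reader.join()  # collect whatever was still buffered at exit
    return child.returncode, out, err


# Hypothetical noisy child: 20000 lines on each of stdout and stderr.
CHILD = (
    "import sys\n"
    "for i in range(20000):\n"
    "    print('line', i)\n"
    "    print('err', i, file=sys.stderr)\n"
)
code, out, err = run_and_tail([sys.executable, "-c", CHILD])
```

Without the two reader threads, the polling loop alone could wait forever once either pipe filled up.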
06f45f338c Update Task Heartbeat to include Job_id (#594) 2021-02-26 13:36:10 -05:00
6a049db3a3 Renames application insights keys to be more clear (#587)
* renames `telemetry_key` to `microsoft_telemetry_key`
* renames `instrumentation_key` to `instance_telemetry_key`
* renames `can_share` to `can_share_with_microsoft`
* renames the `applicationinsights-rs` instances to `internal` and `microsoft`, according to the keys used during construction.

This clarifies the underlying use of Application Insights keys and uses tuple structs to ensure the keys are used correctly via Rust's type checker.
2021-02-26 17:04:49 +00:00
8600a44f1f fix bool queries (#597)
This addresses broken queries used for identifying outdated nodes.
2021-02-26 16:51:05 +00:00
bc6c8408c4 add onefuzz containers files download_dir (#598)
fixes #571
2021-02-26 15:27:51 +00:00
4cd2de0e93 Update azure-cli & azure-cli-core (#596) 2021-02-26 09:19:25 -05:00
daef1637f8 update jinja2 (#595) 2021-02-26 09:19:10 -05:00
a3fa5f6b62 Update onefuzz-agent unit tests (#592) 2021-02-24 20:54:36 -05:00
ed86bb0099 Use non-deprecated atomic method (#593) 2021-02-24 17:41:24 -05:00
fb482e357e don't schedule work to a node if the scaleset or pool is shutting down (#583) 2021-02-23 13:33:41 -05:00
e7fe099f25 handle delayed AAD resources in deployments (#585) 2021-02-22 19:40:07 -05:00