Commit Graph

2095 Commits

Author SHA1 Message Date
2dde7f16e0 create proxy-configs container during install (#437) 2021-01-15 15:11:40 -05:00
bb83c03f5c Update Linux tracer version (#429)
Update `pete` to 0.4, which enables and requires us to use `std::process::Child` for spawning tracees.

Toward #370.
2021-01-15 14:23:45 +00:00
5cef03f234 enable sccache & incremental builds for non-release builds (#431) 2021-01-14 15:56:44 -05:00
773b8f203e Bump Github Actions revisions (#430) 2021-01-13 17:43:27 -05:00
a89065f882 adding {setup_dir} to variable expansion (#417)
## Summary of the Pull Request

Adds a new placeholder {setup_dir} for the setup directory 

## PR Checklist
* [x] Applies to work item: #221
* [x] CLA signed. If not, go over [here](https://cla.opensource.microsoft.com/microsoft/onefuzz) and sign the CLI.
* [x] Requires documentation to be updated
* [x] I've discussed this with core contributors already. If not checked, I'm ready to accept this work might be rejected in favor of a different grand plan. Issue number where discussion took place: #xxx

## Info on Pull Request

_What does this include?_

## Validation Steps Performed

_How does someone test & validate?_
2021-01-13 00:39:59 +00:00
2e2ba988ee Fix condition for triggering new unique report event (#422) 2021-01-12 14:00:34 -05:00
70d41d1cc5 Switch to memmap2 (#428)
- Depend on `memmap2`, a maintained fork of the abandoned `memmap` crate
- Revert #364, which temporarily suppressed the relevant `cargo-audit` CI error

Closes #363.
2021-01-12 17:08:48 +00:00
f5dc8ad285 update MSAL to 1.8.0 (#426) 2021-01-12 10:27:32 +00:00
513d1f52c9 Unify Dashboard & Webhook events (#394)
This change unifies the previously adhoc SignalR events and Webhooks into a single event format.
2021-01-11 21:43:09 +00:00
465727680d add context to all fs calls (#423)
Adds additional context in error handling to all `std::fs` and `tokio::fs` calls.

Fixes #309
2021-01-11 20:55:22 +00:00
d573100a97 Clear node messages on deletion (#419)
## Summary of the Pull Request

_What is this about?_

## PR Checklist
* [ ] Applies to work item: #xxx
* [ ] CLA signed. If not, go over [here](https://cla.opensource.microsoft.com/microsoft/onefuzz) and sign the CLI.
* [ ] Tests added/passed
* [ ] Requires documentation to be updated
* [ ] I've discussed this with core contributors already. If not checked, I'm ready to accept this work might be rejected in favor of a different grand plan. Issue number where discussion took place: #xxx

## Info on Pull Request

_What does this include?_

## Validation Steps Performed

_How does someone test & validate?_
2021-01-11 20:14:43 +00:00
5f9110f97b Release 1.11.0 (#421) 1.11.0 2021-01-11 13:51:48 -05:00
08b1f74e09 fix queue_file_changes logic on corpus accounts (#425) 2021-01-11 12:12:38 -05:00
6aa7d5f6cf remove unused back_channel_address entry (#420) 2021-01-08 14:23:30 -05:00
46e8454569 compare containers rather than SAS urls when building worksets (#418)
By comparing container names rather than SAS urls, this removes a race condition that prevented co-locatable tasks from being co-located.
2021-01-08 09:45:05 +00:00
e799eb03cd Shorten the expiry window for the work queue SAS URLs assigned at node registration (#416)
The underlying impact is that nodes must re-register on a more frequent basis.

Nodes find out they are out-of-date is during registration and immediately prior to starting a new set of work.  Requiring nodes re-register on a shortened cycle provides more opportunities for nodes to get re-imaged.

Additionally, this addresses an issue handling the SAS URL expiry in a more clean fashion in the supervisor.
2021-01-07 12:34:26 +00:00
3b26ffef65 support multiple corpus accounts (#334)
Add support for sharding across multiple storage accounts for blob containers used for corpus management.

Things to note:

1. Additional storage accounts must be in the same resource group, support the "blob" endpoint, and have the tag `storage_type` with the value `corpus`.  A utility is provided (`src/utils/add-corpus-storage-accounts`), which adds storage accounts. 
2. If any secondary storage accounts exist, they are used by default for containers.
3. Storage account names are cached in memory the Azure Function instance forever.   Upon adding new storage accounts, the app needs to be restarted to pick up the new accounts.
2021-01-06 23:11:39 +00:00
f345bd239d Add ssh keys to nodes on demand (#411)
Our existing model has a per-scaleset SSH key.  This update moves towards using user provided SSH keys when they need to connect to a given node.
2021-01-06 19:29:38 +00:00
dae1759b57 update devops prereq (#399) 2021-01-06 09:57:01 -05:00
c1a50f6f6c Colocate tasks (#402)
Enables co-locating multiple tasks in a given work-set.

Tasks are bucketed by the following:
* OS
* job id
* setup container
* VM SKU & image (used in pre-1.0 style tasks)
* pool name (used in 1.0+ style tasks)
* if the task needs rebooting after the task setup script executes.

Additionally, a task will end up in a unique bucket if any of the following are true:
* The task is set to run on more than one VM
* The task is missing the `task.config.colocate` flag (all tasks created prior to this functionality) or the value is False

This updates the libfuzzer template to make use of colocation.  Users can specify co-locating all of the tasks *or* co-locating the secondary tasks.
2021-01-06 13:49:15 +00:00
883f38cb87 Multi-tenant authentication support in CLI (#346)
## Summary of the Pull Request

These are purposed changes to resolve ticket #344 

I have tested these changes and it does not effect or break the current functionality.

I don't necessarily expect this PR to be merged without some tweaks. I'll coordinate over the next week or so to get it right.

One coding issue I would like to discuss/highlight is the assumption (in code) that if "--tenant_domain" is used then the 'common' authority is also used. I am open to suggestions. 

## PR Checklist
* [X] Applies to work item: #344
* [X] CLA signed. If not, go over [here](https://cla.opensource.microsoft.com/microsoft/onefuzz) and sign the CLI.
* [X] Tests passed (with and without multitenant authentication)
* [?] Requires documentation to be updated
* [No] I've discussed this with core contributors already. If not checked, I'm ready to accept this work might be rejected in favor of a different grand plan. Issue number where discussion took place: #344

## Info on Pull Request

Minor changes to the config file and the login process.

## Validation Steps Performed

Tested these changes with a multi-tenant enabled endpoint and a single-tenant endpoint.
2021-01-06 12:35:47 +00:00
986df8fcc6 limit updating outdated nodes to 500 at a time (#397) 2021-01-05 17:40:36 -05:00
633e5b5f02 restrict api endpoints (#404)
Restrict API endpoints from agents
2021-01-05 19:40:58 +00:00
7e56efa6a8 Address clippy issues (#409) 2021-01-05 15:41:46 +00:00
37f06bb324 handle libfuzzer fuzzing non-zero exits better (#381)
When running libfuzzer in 'fuzzing' mode, we expect the following on exit.

If the exit code is zero, crashing input isn't required.  This happens if the user specifies '-runs=N'

If the exit code is non-zero, then crashes are expected.  In practice, there are two causes to non-zero exits.
1. If the binary can't execute for some reason, like a missing prerequisite
2. If the binary _can_ execute, sometimes the sanitizers are put in such a bad place that they are unable to record the input that caused the crash.

This PR enables handling these two non-zero exit cases.

1. Optionally verify the libfuzzer target loads appropriately using `target_exe -help=1`.  This allows failing faster in the common issues, such a missing prerequisite library.
2. Optionally allow non-zero exits without crashes to be a warning, rather than a task failure.
2021-01-05 14:40:15 +00:00
75d2ffd7f4 lint test utils (#395) 2021-01-05 08:50:52 -05:00
014cb5bcfd Re-adds POST for node endpoint (#412)
Re-adds the POST method for the `node` endpoint, which got accidentally dropped.
2021-01-05 10:49:20 +00:00
4d9abe936b increase function timeout to 15 minutes (#384) 2021-01-04 20:55:15 -05:00
365722c5fa upgrade AFL++ to 3.00b (#393)
Update the version of AFL++ provided in OneFuzz to 3.00b, which was released yesterday.
2021-01-05 00:42:52 +00:00
e51d7affb7 Fixes race condition of a libfuzzer coverage without inputs (#403)
This fixes an issue running a libfuzzer coverage task and don't have any initial seeds (or there are seeds found by the fuzzer by the time the task starts), it will fail.
2021-01-05 00:05:13 +00:00
ce32981b1b address clippy issues in proxy-manager (#410) 2021-01-04 22:33:42 +00:00
1b1af1f84f log stdout & stderr lines for supervisor & generator (#400)
This fixes #371 and #372.
2021-01-04 21:53:49 +00:00
f8f7e28aa2 add 'onefuzz debug log tail' (#401)
Adds `onefuzz debug log tail <keyword>`, which enables performing the same component in `onefuzz debug log keyword <keyword>` in a loop.  

Optimizations:
* only returns the N records at a time (default 1000)
* each query only returns records that occur after the latest record received.
* If no results are returned, waits 10s before retrying
* Increases the wait time by 1.5x until the wait time is larger than 60s

Using `--filter` provides the ability to filter each record that comes back via jmespath.

Example uses:

Monitor any log messages (which ignores metrics) for a given job_id GUID
```
onefuzz debug logs tail bf4efdfd-685c-444a-81c5-d911477433ae --filter message
```

Log the job_id and task_id for each new unique report:
```
onefuzz debug logs tail new_unique_report --filter '[customDimensions.job_id, customDimensions.task_id]'
```

Log the job_id and task_id for each new unique report only for the specific job_id:
```
onefuzz debug logs tail "new_unique_report d5bcd4d2-4dab-49d5-a215-66db94fb0309" --filter '[customDimensions.job_id, customDimensions.task_id]'
```
2021-01-04 21:08:27 +00:00
29c7cfbd5d filter out deleted nodes as to prevent them from being saved later (#391)
In `Scaleset.cleanup_nodes`, nodes that are no longer part of the scaleset should get deleted.  Without filtering the list, the nodes could get re-saved to the Node table later on.
2021-01-04 20:28:57 +00:00
4c2679d61e Re-add windows ssh key (#390)
Adds a scaleset specific setup script, which allows us to save the scaleset based SSH keys into the VM on setup.
2021-01-04 19:52:27 +00:00
3441790322 add delayed start to heartbeats (#387)
Adds a random initial jitter the size of the heartbeat periodicity to prevent heartbeats storming the service when we launch 3000 nodes roughly at the same time.

Fixes #386
2021-01-04 18:50:02 +00:00
d038cca1e1 Verify a workset only exists along with a reboot context (#378)
Adds the following:

1. Serializes a workset to disk during setup.
2. Upon deserializing a RebootContext, delete the file from disk (We support rebooting once and only once)
3. Check if a workset exists with a RebootContext
    1. If True, continuing processing
    2. if False, mark the tasks & node as "Done" with appropriate errors via:
        1. send WorkerEvent::Done events for each of the tasks in the work set
        2. send StateUpdateEvent::Done for the node
2021-01-04 17:51:20 +00:00
36b3e2a5aa disable py-cache prior to mypy on cli (#408) 2021-01-04 11:49:28 -05:00
e222b01003 update rust prereqs (#396) 2020-12-16 07:38:37 -05:00
6dc7b78447 support ASAN odr-violation outputs (#380) 2020-12-10 15:48:15 -05:00
7f5673eb21 handle non-utf8 from libfuzzer stderr (#379) 2020-12-10 15:13:14 -05:00
56090cb01d Demonstrate a more complex template management (#366)
Add a job_template example that demonstrates customization of the arguments to the job. 

This example demonstrates setting the Area and Iteration paths for Azure Devops work items.
2020-12-05 12:30:37 +00:00
69fc9f508b fix clippy issue (#367) 2020-12-04 15:04:29 -05:00
f1b4efc5ff Add troubleshooting guide for the registration issue at deployment (#362) 2020-12-02 18:54:29 -05:00
1d49f27961 Release 1.10.0 (#365) 1.10.0 2020-12-02 17:48:27 -05:00
203bc22756 Allow unmaintained memmap (#364) 2020-12-02 15:34:22 -05:00
fd131c63bf Document managing declarative templates (#361) 2020-12-02 14:18:45 -05:00
b81c6fa89e fix job_templates deletion (#360) 2020-12-02 14:02:16 -05:00
054989f232 Add support for ASAN print_scariness (#359) 2020-12-02 11:33:22 -05:00
e6b55ab95a Simplify job template management workflow (#354)
1. Merge 'create' and 'update' to a single 'save' operation.
2. Allow fetching a single template.

This enables the following workflow:

```
$ onefuzz job_templates manage get libfuzzer_linux > template.json
$ <... update template as desired ...>
$ onefuzz job_templates manage save libfuzzer_linux @./template.json
$
```
2020-12-02 14:27:42 +00:00