1278 Commits

Author SHA1 Message Date
f345bd239d Add ssh keys to nodes on demand (#411)
Our existing model has a per-scaleset SSH key.  This update moves towards using user-provided SSH keys when users need to connect to a given node.
2021-01-06 19:29:38 +00:00
dae1759b57 update devops prereq (#399) 2021-01-06 09:57:01 -05:00
c1a50f6f6c Colocate tasks (#402)
Enables co-locating multiple tasks in a given work-set.

Tasks are bucketed by the following:
* OS
* job id
* setup container
* VM SKU & image (used in pre-1.0 style tasks)
* pool name (used in 1.0+ style tasks)
* if the task needs rebooting after the task setup script executes.

Additionally, a task will end up in a unique bucket if any of the following are true:
* The task is set to run on more than one VM
* The task is missing the `task.config.colocate` flag (all tasks created prior to this functionality) or the value is False

This updates the libfuzzer template to make use of colocation.  Users can specify co-locating all of the tasks *or* co-locating the secondary tasks.
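
The bucketing rules above can be illustrated with a minimal sketch. This is not the OneFuzz scheduler code: the `Task` record, its field names, and the `bucket_key` and `bucket_tasks` helpers are all hypothetical and exist only to show how the listed attributes could combine into a bucket key.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Hashable, List, Optional, Tuple


@dataclass
class Task:
    # Hypothetical, simplified task record; field names are illustrative only.
    task_id: str
    os: str
    job_id: str
    setup_container: str
    vm: Optional[Tuple[str, str]]  # (SKU, image), used by pre-1.0 style tasks
    pool_name: Optional[str]  # used by 1.0+ style tasks
    reboot_after_setup: bool
    vm_count: int = 1
    colocate: Optional[bool] = None  # None models a task created before the flag existed


def bucket_key(task: Task) -> Hashable:
    # A task that runs on more than one VM, or that lacks colocate=True,
    # always lands in a bucket of its own.
    if task.vm_count > 1 or not task.colocate:
        return ("unique", task.task_id)
    return (
        task.os,
        task.job_id,
        task.setup_container,
        task.vm,
        task.pool_name,
        task.reboot_after_setup,
    )


def bucket_tasks(tasks: List[Task]) -> Dict[Hashable, List[Task]]:
    # Group tasks that share a key; each group could become one work-set.
    buckets: Dict[Hashable, List[Task]] = defaultdict(list)
    for task in tasks:
        buckets[bucket_key(task)].append(task)
    return dict(buckets)
```

In this sketch, two tasks from the same job that share an OS, setup container, pool, and reboot requirement, and that both set `colocate=True`, end up in one bucket, while a task without the flag stays alone.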
2021-01-06 13:49:15 +00:00
883f38cb87 Multi-tenant authentication support in CLI (#346)
## Summary of the Pull Request

These are proposed changes to resolve ticket #344.

I have tested these changes and they do not affect or break the current functionality.

I don't necessarily expect this PR to be merged without some tweaks. I'll coordinate over the next week or so to get it right.

One coding issue I would like to discuss/highlight is the assumption (in code) that if "--tenant_domain" is used then the 'common' authority is also used. I am open to suggestions. 

## PR Checklist
* [X] Applies to work item: #344
* [X] CLA signed. If not, go over [here](https://cla.opensource.microsoft.com/microsoft/onefuzz) and sign the CLA.
* [X] Tests passed (with and without multitenant authentication)
* [?] Requires documentation to be updated
* [No] I've discussed this with core contributors already. If not checked, I'm ready to accept this work might be rejected in favor of a different grand plan. Issue number where discussion took place: #344

## Info on Pull Request

Minor changes to the config file and the login process.

## Validation Steps Performed

Tested these changes with a multi-tenant enabled endpoint and a single-tenant endpoint.
2021-01-06 12:35:47 +00:00
986df8fcc6 limit updating outdated nodes to 500 at a time (#397) 2021-01-05 17:40:36 -05:00
633e5b5f02 restrict api endpoints (#404)
Restrict API endpoints from agents
2021-01-05 19:40:58 +00:00
7e56efa6a8 Address clippy issues (#409) 2021-01-05 15:41:46 +00:00
37f06bb324 handle libfuzzer fuzzing non-zero exits better (#381)
When running libfuzzer in 'fuzzing' mode, we expect the following on exit.

If the exit code is zero, crashing input isn't required.  This happens if the user specifies '-runs=N'

If the exit code is non-zero, then crashes are expected.  In practice, there are two causes of non-zero exits.
1. If the binary can't execute for some reason, like a missing prerequisite
2. If the binary _can_ execute, sometimes the sanitizers are put in such a bad place that they are unable to record the input that caused the crash.

This PR enables handling these two non-zero exit cases.

1. Optionally verify the libfuzzer target loads appropriately using `target_exe -help=1`.  This allows failing faster on common issues, such as a missing prerequisite library.
2. Optionally allow non-zero exits without crashes to be a warning, rather than a task failure.
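
The decision logic described above might look roughly like the following sketch. The names `Outcome`, `target_loads`, `classify_exit`, and `allow_crashless_failure` are hypothetical and not taken from the OneFuzz source. Treating a zero exit code from `-help=1` as "the target loads" is also an assumption.

```python
import subprocess
from enum import Enum
from typing import List


class Outcome(Enum):
    SUCCESS = "success"
    CRASH_FOUND = "crash found"
    WARNING = "warning"
    TASK_FAILURE = "task failure"


def target_loads(target_exe: str) -> bool:
    # Run `target_exe -help=1` and treat a zero exit code as "loads fine".
    # A missing binary or prerequisite library typically shows up here.
    try:
        result = subprocess.run(
            [target_exe, "-help=1"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def classify_exit(
    exit_code: int, crash_inputs: List[str], allow_crashless_failure: bool
) -> Outcome:
    # Zero exit (for example with -runs=N): no crashing input is required.
    if exit_code == 0:
        return Outcome.SUCCESS
    # Non-zero exit with a recorded crashing input is the expected case.
    if crash_inputs:
        return Outcome.CRASH_FOUND
    # Non-zero exit without a crash: a warning if allowed, otherwise a failure.
    return Outcome.WARNING if allow_crashless_failure else Outcome.TASK_FAILURE
```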
2021-01-05 14:40:15 +00:00
75d2ffd7f4 lint test utils (#395) 2021-01-05 08:50:52 -05:00
014cb5bcfd Re-adds POST for node endpoint (#412)
Re-adds the POST method for the `node` endpoint, which got accidentally dropped.
2021-01-05 10:49:20 +00:00
4d9abe936b increase function timeout to 15 minutes (#384) 2021-01-04 20:55:15 -05:00
365722c5fa upgrade AFL++ to 3.00b (#393)
Update the version of AFL++ provided in OneFuzz to 3.00b, which was released yesterday.
2021-01-05 00:42:52 +00:00
e51d7affb7 Fixes race condition of a libfuzzer coverage without inputs (#403)
This fixes an issue where a libfuzzer coverage task fails if it does not have any initial seeds (or no seeds have been found by the fuzzer by the time the task starts).
2021-01-05 00:05:13 +00:00
ce32981b1b address clippy issues in proxy-manager (#410) 2021-01-04 22:33:42 +00:00
1b1af1f84f log stdout & stderr lines for supervisor & generator (#400)
This fixes #371 and #372.
2021-01-04 21:53:49 +00:00
f8f7e28aa2 add 'onefuzz debug log tail' (#401)
Adds `onefuzz debug log tail <keyword>`, which performs the same query as `onefuzz debug log keyword <keyword>` in a loop.

Optimizations:
* only returns N records at a time (default 1000)
* each query only returns records that occur after the latest record received.
* If no results are returned, waits 10s before retrying
* Increases the wait time by 1.5x until the wait time is larger than 60s
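
The polling behavior listed above can be sketched as follows. This is an illustration under stated assumptions, not the CLI's implementation: `tail`, `next_wait`, the `query` callback, and the `timestamp` record field are hypothetical, and resetting the wait to 10s after a non-empty result is an assumption the commit message does not confirm.

```python
import time
from typing import Callable, Dict, Iterator, List, Optional

INITIAL_WAIT = 10.0  # seconds to wait after an empty result
GROWTH = 1.5  # multiplier applied after each empty result
MAX_WAIT = 60.0  # stop growing once the wait is larger than this

Record = Dict[str, str]


def next_wait(wait: float) -> float:
    # Grow by 1.5x until the wait time is larger than 60s, then hold steady.
    return wait * GROWTH if wait <= MAX_WAIT else wait


def tail(
    query: Callable[[Optional[str], int], List[Record]],
    sleep: Callable[[float], None] = time.sleep,
    limit: int = 1000,
    max_polls: Optional[int] = None,
) -> Iterator[Record]:
    latest: Optional[str] = None  # timestamp of the newest record seen so far
    wait = INITIAL_WAIT
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        # Only ask for records newer than `latest`, at most `limit` at a time.
        records = query(latest, limit)
        if records:
            latest = records[-1]["timestamp"]
            wait = INITIAL_WAIT  # assumption: reset the backoff on success
            yield from records
        else:
            sleep(wait)
            wait = next_wait(wait)
```

With a query that never returns records, the successive waits in this sketch are 10, 15, 22.5, 33.75, 50.625, and then 75.9375 seconds from there on.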

Using `--filter` provides the ability to filter each record that comes back via jmespath.

Example uses:

Monitor any log messages (which ignores metrics) for a given job_id GUID
```
onefuzz debug logs tail bf4efdfd-685c-444a-81c5-d911477433ae --filter message
```

Log the job_id and task_id for each new unique report:
```
onefuzz debug logs tail new_unique_report --filter '[customDimensions.job_id, customDimensions.task_id]'
```

Log the job_id and task_id for each new unique report only for the specific job_id:
```
onefuzz debug logs tail "new_unique_report d5bcd4d2-4dab-49d5-a215-66db94fb0309" --filter '[customDimensions.job_id, customDimensions.task_id]'
```
2021-01-04 21:08:27 +00:00
29c7cfbd5d filter out deleted nodes as to prevent them from being saved later (#391)
In `Scaleset.cleanup_nodes`, nodes that are no longer part of the scaleset should get deleted.  Without filtering the list, the nodes could get re-saved to the Node table later on.
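
A minimal sketch of the filtering idea, assuming a hypothetical `Node` record and a `split_nodes` helper that are not part of the actual `Scaleset.cleanup_nodes` code:

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass
class Node:
    # Hypothetical, simplified node record.
    machine_id: str


def split_nodes(
    known_nodes: List[Node], scaleset_machine_ids: Set[str]
) -> Tuple[List[Node], List[Node]]:
    # Nodes no longer present in the scaleset are marked for deletion.
    to_delete = [n for n in known_nodes if n.machine_id not in scaleset_machine_ids]
    # Only the remaining nodes are carried forward, so later steps cannot
    # re-save a deleted node.
    remaining = [n for n in known_nodes if n.machine_id in scaleset_machine_ids]
    return remaining, to_delete
```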
2021-01-04 20:28:57 +00:00
4c2679d61e Re-add windows ssh key (#390)
Adds a scaleset specific setup script, which allows us to save the scaleset based SSH keys into the VM on setup.
2021-01-04 19:52:27 +00:00
3441790322 add delayed start to heartbeats (#387)
Adds a random initial delay (jitter) of up to one heartbeat period to prevent heartbeats from storming the service when we launch 3000 nodes at roughly the same time.

Fixes #386
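
The idea can be sketched as below. The agent itself is not written in Python, and `initial_delay` and `heartbeat_schedule` are hypothetical names used only to illustrate spreading the first heartbeat across one period.

```python
import random
from typing import List, Optional


def initial_delay(period_seconds: float, rng: Optional[random.Random] = None) -> float:
    # Pick a random delay between zero and one full heartbeat period.
    source = rng if rng is not None else random.Random()
    return source.uniform(0.0, period_seconds)


def heartbeat_schedule(
    period_seconds: float, count: int, rng: Optional[random.Random] = None
) -> List[float]:
    # The first heartbeat fires after the jittered delay; later heartbeats
    # follow at the regular period, so nodes started together stay spread out.
    start = initial_delay(period_seconds, rng)
    return [start + i * period_seconds for i in range(count)]
```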
2021-01-04 18:50:02 +00:00
d038cca1e1 Verify a workset only exists along with a reboot context (#378)
Adds the following:

1. Serializes a workset to disk during setup.
2. Upon deserializing a RebootContext, delete the file from disk (We support rebooting once and only once)
3. Check if a workset exists with a RebootContext
    1. If True, continue processing
    2. If False, mark the tasks & node as "Done" with appropriate errors via:
        1. send WorkerEvent::Done events for each of the tasks in the work set
        2. send StateUpdateEvent::Done for the node
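
One reading of the steps above, sketched in Python rather than in the agent's own language. The file layout, the `task_ids` field, and the function names are hypothetical. The sketch assumes the check runs at startup: a saved work set is only valid if a reboot context is also present.

```python
import json
from pathlib import Path
from typing import List, Optional


def save_json(path: Path, data: dict) -> None:
    # Step 1: serialize state (the work set or the reboot context) to disk.
    path.write_text(json.dumps(data))


def read_json(path: Path) -> Optional[dict]:
    return json.loads(path.read_text()) if path.exists() else None


def take_reboot_context(path: Path) -> Optional[dict]:
    # Step 2: loading the reboot context deletes it, so a reboot is
    # supported once and only once.
    context = read_json(path)
    if context is not None:
        path.unlink()
    return context


def on_startup(workset_path: Path, reboot_path: Path) -> List[str]:
    # Step 3: returns the actions the agent would take, as plain strings.
    workset = read_json(workset_path)
    if workset is None:
        return ["no saved work set"]
    if take_reboot_context(reboot_path) is not None:
        return ["continue processing"]
    # A work set without a reboot context: mark the tasks and node as done.
    events = [f"WorkerEvent::Done task={task_id}" for task_id in workset["task_ids"]]
    events.append("StateUpdateEvent::Done")
    return events
```

In this sketch, the first startup after a reboot finds both files and continues. A second, unexpected restart finds only the work set and reports every task and the node as done.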
2021-01-04 17:51:20 +00:00
36b3e2a5aa disable py-cache prior to mypy on cli (#408) 2021-01-04 11:49:28 -05:00
e222b01003 update rust prereqs (#396) 2020-12-16 07:38:37 -05:00
6dc7b78447 support ASAN odr-violation outputs (#380) 2020-12-10 15:48:15 -05:00
7f5673eb21 handle non-utf8 from libfuzzer stderr (#379) 2020-12-10 15:13:14 -05:00
56090cb01d Demonstrate a more complex template management (#366)
Add a job_template example that demonstrates customization of the arguments to the job. 

This example demonstrates setting the Area and Iteration paths for Azure Devops work items.
2020-12-05 12:30:37 +00:00
69fc9f508b fix clippy issue (#367) 2020-12-04 15:04:29 -05:00
f1b4efc5ff Add troubleshooting guide for the registration issue at deployment (#362) 2020-12-02 18:54:29 -05:00
1d49f27961 Release 1.10.0 (#365) 1.10.0 2020-12-02 17:48:27 -05:00
203bc22756 Allow unmaintained memmap (#364) 2020-12-02 15:34:22 -05:00
fd131c63bf Document managing declarative templates (#361) 2020-12-02 14:18:45 -05:00
b81c6fa89e fix job_templates deletion (#360) 2020-12-02 14:02:16 -05:00
054989f232 Add support for ASAN print_scariness (#359) 2020-12-02 11:33:22 -05:00
e6b55ab95a Simplify job template management workflow (#354)
1. Merge 'create' and 'update' to a single 'save' operation.
2. Allow fetching a single template.

This enables the following workflow:

```
$ onefuzz job_templates manage get libfuzzer_linux > template.json
$ <... update template as desired ...>
$ onefuzz job_templates manage save libfuzzer_linux @./template.json
$
```
2020-12-02 14:27:42 +00:00
9b3ccf37ea use the correct instrumentation key (#355) 2020-12-01 18:44:10 -05:00
0182dc597d handle asan check failures (#358) 2020-12-01 18:23:26 -05:00
fc34725428 update rust prereqs (#357) 2020-12-01 17:22:32 -05:00
aef511efe8 Fail the task if parsing asan_log files fail (#351)
This differentiates ASAN log parse failures from ASAN logs not existing, fixing the first part of #343.
2020-12-01 21:10:59 +00:00
7f97c142ed add the instrumentation key to Info (#353) 2020-12-01 11:13:06 -05:00
3f3193beeb Use disable_check_debugger on asan integration tests (#352) 2020-12-01 10:36:53 -05:00
a1af90cb83 Update deployment prerequisites to remove pyopenssl errors (#348)
Over the weekend, pyOpenssl 20.0 was released.  This causes an incompatible library issue during deployment.

Prior to this change, deployment would generate the following error
```
ERROR: pyopenssl 20.0.0 has requirement cryptography>=3.2, but you'll have cryptography 2.9.2 which is incompatible.
```
2020-12-01 14:43:53 +00:00
5092f96af4 Fix deployment of backdated versions of OneFuzz (#347)
When running automated deployments, 'tools' were not being properly replaced with the updated versions if the deployment was created _prior_ to the original instance deployment.
2020-12-01 10:59:43 +00:00
37e3251966 render the event model as json to not include error (#350) 2020-11-30 23:19:27 -05:00
30cc5d4778 ignore nodes already scheduled for re-imaging in outdated check (#341)
If a node is already scheduled to be reimaged/deleted, we should not bother checking if it's outdated.
2020-11-30 17:36:15 +00:00
2391d927f7 Updating yml file to run config endpoint command with tenant/authority ID. (#339)
## Summary of the Pull Request

Originally, the yml file printed out a semi-generalized _onefuzz config --endpoint_ command. This command did not have a specified _--authority_ and so it used the Microsoft id by default. To enable users to work with OneFuzz on tenants other than the standard Microsoft tenant, we have added an _--authority_ parameter that is printed out at the end of the deployment.

## PR Checklist
* [ ] I've discussed this with core contributors already. If not checked, I'm ready to accept this work might be rejected in favor of a different grand plan. Issue number where discussion took place: #xxx

## Info on Pull Request

Changes to the yml file. 

## Validation Steps Performed

We have made this change to our local automation repository and tested an automated deployment pipeline with this change.
2020-11-30 14:54:42 +00:00
079f387b88 clarify prefix-expansion errors (#342) 2020-11-24 11:51:03 -05:00
33b7608aaf Adding option to merge all inputs at once (#282) 2020-11-24 08:43:08 -05:00
79cc82098a Move integration test artifacts into primary source tree (#336) 2020-11-24 08:03:01 -05:00
905dc7c0d6 Re-enable the retry logic for App Password creation (#338) 2020-11-24 08:00:31 -05:00
d47124fe8c Fix state management in the scheduler (#337) 2020-11-24 12:43:51 +00:00
32ba86be9d Update current_thread_id when setting current thread (#340) 2020-11-23 13:39:03 -08:00