Mitigate the deployment issue related to the conditional access policy.
The registration logic is updated to use the old RBAC Python library when possible.
The deployment prints manual steps for any operations that cannot be automated.
- Switch to using `goblin` for both ELF and PE parsing
- Refactor block entry point recovery, with better documentation
- Fix a broken example binary
Co-authored-by: bmc-msft <41130664+bmc-msft@users.noreply.github.com>
This PR adds an experimental "local" mode for the agent, starting with `libfuzzer`. In local mode, tasks that would normally poll a queue instead monitor a directory for new files (see the sketch after the command list below).
Supported commands:
* libfuzzer-fuzz (models the `libfuzzer-fuzz` task)
* libfuzzer-coverage (models the `libfuzzer-coverage` task)
* libfuzzer-crash-report (models the `libfuzzer-crash-report` task)
* libfuzzer (models the `libfuzzer basic` job template: runs the libfuzzer-fuzz and libfuzzer-crash-report tasks concurrently, automatically turning any files that show up in `crashes_dir` into reports, and optionally runs the coverage task, which runs the coverage data exporter for each file that shows up in `inputs_dir`)
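The agent itself is written in Rust; the following is only a minimal Python sketch of the idea behind that queue replacement, with illustrative names rather than the agent's actual API: watch a directory and hand each newly appearing file to the task.
```
import os
import time

def watch_dir(path, handle_file, poll_interval=1.0):
    # poll `path` and invoke `handle_file` once for each new file that appears
    seen = set()
    while True:
        for name in os.listdir(path):
            full = os.path.join(path, name)
            if os.path.isfile(full) and full not in seen:
                seen.add(full)
                handle_file(full)
        time.sleep(poll_interval)
```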
Under the hood, there are a handful of changes required to the rest of the system to enable this feature.
1. `SyncedDir` URLs are now optional. In local mode, these no longer make sense. (We've discussed moving management of `SyncedDirs` to the Supervisor. This is tangential to that effort.)
2. `InputPoller` uses a `tempdir` rather than abusing `task_id` for temporary directory naming.
3. Moved the `agent` to use a single tokio runtime, rather than one for each subcommand.
4. Sets the default log level to `info`. (`RUST_LOG` can still be used as-is.)
Note: this removes the `onefuzz-agent debug` commands for the tasks that are now exposed via `onefuzz-agent local`, as the local commands provide a more featureful version of the debug tasks.
## Summary of the Pull Request
Adds a new placeholder, `{setup_dir}`, for the setup directory.
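For illustration only, here is how such a placeholder might be used and expanded; the config key (`target_options`) and the expansion mechanism shown are assumptions, not the service's actual implementation.
```
# Hypothetical example: a target option referencing a file shipped in the
# setup container, expanded by simple placeholder substitution.
target_options = ["-dict={setup_dir}/fuzz.dict"]
expanded = [opt.format(setup_dir="/onefuzz/setup") for opt in target_options]
# expanded == ["-dict=/onefuzz/setup/fuzz.dict"]
```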
## PR Checklist
* [x] Applies to work item: #221
* [x] CLA signed. If not, go over [here](https://cla.opensource.microsoft.com/microsoft/onefuzz) and sign the CLA.
* [x] Requires documentation to be updated
* [x] I've discussed this with core contributors already. If not checked, I'm ready to accept this work might be rejected in favor of a different grand plan. Issue number where discussion took place: #xxx
- Depend on `memmap2`, a maintained fork of the abandoned `memmap` crate
- Revert #364, which temporarily suppressed the relevant `cargo-audit` CI error
Closes #363.
The underlying impact is that nodes must re-register on a more frequent basis.
Nodes find out they are out-of-date during registration and immediately prior to starting a new set of work. Requiring nodes to re-register on a shortened cycle provides more opportunities for them to be re-imaged.
Additionally, this handles SAS URL expiry in the supervisor in a cleaner fashion.
Add support for sharding across multiple storage accounts for blob containers used for corpus management.
Things to note:
1. Additional storage accounts must be in the same resource group, support the "blob" endpoint, and have the tag `storage_type` with the value `corpus`. A utility (`src/utils/add-corpus-storage-accounts`) is provided to add such storage accounts.
2. If any secondary storage accounts exist, they are used by default for containers.
3. Storage account names are cached in the memory of the Azure Function instance indefinitely. Upon adding new storage accounts, the app needs to be restarted to pick up the new accounts.
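To make the selection behavior concrete, here is a hedged Python sketch; the function name, the shape of the account list, and the random selection strategy are assumptions for illustration, not the service's actual code.
```
import random

def choose_corpus_account(primary, secondaries):
    # `secondaries` stands in for the cached list of account names tagged
    # storage_type=corpus; when any exist, they are preferred over the
    # primary account for new containers.
    if secondaries:
        return random.choice(secondaries)  # selection strategy here is an assumption
    return primary
```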
Enables co-locating multiple tasks in a given work-set.
Tasks are bucketed by the following:
* OS
* job id
* setup container
* VM SKU & image (used in pre-1.0 style tasks)
* pool name (used in 1.0+ style tasks)
* whether the task needs a reboot after the task setup script executes
Additionally, a task will end up in a unique bucket if any of the following are true:
* The task is set to run on more than one VM
* The task is missing the `task.config.colocate` flag (all tasks created prior to this functionality) or the value is False
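As a concrete illustration of the bucketing, here is a hedged sketch; the attribute names are assumptions rather than the real task model.
```
def bucket_key(task):
    # tasks that can't be colocated get a bucket of their own
    if not getattr(task, "colocate", False) or task.vm_count > 1:
        return ("unique", task.task_id)
    return (
        task.os,
        task.job_id,
        task.setup_container,
        task.vm_sku_image,        # pre-1.0 style tasks
        task.pool_name,           # 1.0+ style tasks
        task.reboot_after_setup,  # reboot needed after the setup script
    )
```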
This updates the libfuzzer template to make use of colocation. Users can specify co-locating all of the tasks *or* co-locating the secondary tasks.
## Summary of the Pull Request
These are proposed changes to resolve ticket #344.
I have tested these changes, and they do not affect or break the current functionality.
I don't necessarily expect this PR to be merged without some tweaks. I'll coordinate over the next week or so to get it right.
One coding issue I would like to discuss/highlight is the assumption (in code) that if `--tenant_domain` is used, then the 'common' authority is also used (sketched below). I am open to suggestions.
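For reference, that assumption amounts to something like the following sketch; the function name and the exact authority handling are illustrative, not the actual CLI code.
```
def choose_authority(tenant_id, tenant_domain=None):
    # assumption under discussion: --tenant_domain implies the multi-tenant
    # "common" authority; otherwise use the tenant-specific authority
    if tenant_domain:
        return "https://login.microsoftonline.com/common"
    return "https://login.microsoftonline.com/" + tenant_id
```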
## PR Checklist
* [X] Applies to work item: #344
* [X] CLA signed. If not, go over [here](https://cla.opensource.microsoft.com/microsoft/onefuzz) and sign the CLA.
* [X] Tests passed (with and without multitenant authentication)
* [?] Requires documentation to be updated
* [No] I've discussed this with core contributors already. If not checked, I'm ready to accept this work might be rejected in favor of a different grand plan. Issue number where discussion took place: #344
## Info on Pull Request
Minor changes to the config file and the login process.
## Validation Steps Performed
Tested these changes with a multi-tenant enabled endpoint and a single-tenant endpoint.
When running libfuzzer in 'fuzzing' mode, we expect the following on exit.
If the exit code is zero, a crashing input isn't required. This happens when the user specifies `-runs=N`.
If the exit code is non-zero, then crashes are expected. In practice, there are two causes of non-zero exits.
1. The binary can't execute for some reason, such as a missing prerequisite.
2. The binary _can_ execute, but the sanitizer ends up in such a bad state that it is unable to record the input that caused the crash.
This PR enables handling these two non-zero exit cases.
1. Optionally verify the libfuzzer target loads appropriately using `target_exe -help=1`. This allows failing faster on common issues, such as a missing prerequisite library.
2. Optionally allow non-zero exits without crashes to be a warning, rather than a task failure.
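The agent is written in Rust; the following Python sketch only illustrates the two options described above, with function and flag names that are assumptions: a pre-flight `-help=1` load check, and treating a non-zero exit without crashes as a warning instead of a failure.
```
import subprocess

def check_target_loads(target_exe):
    # `-help=1` exits quickly, so a failure here usually points at a common
    # issue such as a missing prerequisite library
    result = subprocess.run([target_exe, "-help=1"], capture_output=True)
    if result.returncode != 0:
        raise RuntimeError("libfuzzer target failed to load: %r" % result.stderr)

def classify_exit(exit_code, found_crashes, allow_no_crashes):
    if exit_code == 0 or found_crashes:
        return "ok"
    # non-zero exit, but no crashing input was recorded
    return "warning" if allow_no_crashes else "error"
```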
This fixes an issue where a libfuzzer coverage task fails when it has no initial seeds and the fuzzer has not yet found any seeds by the time the task starts.
Adds `onefuzz debug logs tail <keyword>`, which performs the same query as `onefuzz debug logs keyword <keyword>` in a loop.
Optimizations (sketched below):
* only returns N records at a time (default 1000)
* each query only returns records that occur after the latest record received
* if no results are returned, waits 10s before retrying
* increases the wait time by 1.5x until the wait time is larger than 60s
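A minimal sketch of that loop, assuming a `query_fn` that returns only records newer than a given timestamp; the names are illustrative, not the CLI's internals.
```
import time

def tail(query_fn, min_wait=10.0, max_wait=60.0):
    wait = min_wait
    last_seen = None
    while True:
        records = query_fn(after=last_seen)  # only records newer than the last one received
        if records:
            last_seen = records[-1]["timestamp"]
            for record in records:
                print(record)
            wait = min_wait                   # results arrived: reset the backoff
        else:
            time.sleep(wait)
            wait = min(wait * 1.5, max_wait)  # back off by 1.5x, capped at 60s
```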
Using `--filter` provides the ability to filter each returned record via a JMESPath expression.
Example uses:
Monitor any log messages (which ignores metrics) for a given job_id GUID:
```
onefuzz debug logs tail bf4efdfd-685c-444a-81c5-d911477433ae --filter message
```
Log the job_id and task_id for each new unique report:
```
onefuzz debug logs tail new_unique_report --filter '[customDimensions.job_id, customDimensions.task_id]'
```
Log the job_id and task_id for each new unique report only for the specific job_id:
```
onefuzz debug logs tail "new_unique_report d5bcd4d2-4dab-49d5-a215-66db94fb0309" --filter '[customDimensions.job_id, customDimensions.task_id]'
```
In `Scaleset.cleanup_nodes`, nodes that are no longer part of the scaleset should get deleted. Without filtering the list, the nodes could get re-saved to the Node table later on.
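A hedged sketch of the fix; the names here are assumptions rather than the service's actual models: delete the stale Node records and drop them from the working list so nothing downstream re-saves them.
```
def cleanup_nodes(scaleset_machine_ids, nodes):
    # keep only nodes still present in the scaleset; delete the rest
    remaining = []
    for node in nodes:
        if node.machine_id in scaleset_machine_ids:
            remaining.append(node)
        else:
            node.delete()  # no longer part of the scaleset
    return remaining
```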