Updates the following libraries in the service:
* azure-core
* azure-functions
* azure-identity
* azure-keyvault-keys
* azure-keyvault-secrets
* azure-mgmt-compute
* azure-mgmt-core
* azure-mgmt-loganalytics
* azure-mgmt-network
* azure-mgmt-resource
* azure-mgmt-storage
* azure-mgmt-subscription
* azure-storage-blob
* azure-storage-queue
* pydantic
* requests
* jsonpatch
Removes the following libraries in the service:
* azure-cli-core
* azure-cli-nspkg
* azure-mgmt-cosmosdb
* azure-servicebus
Updates the following libraries in the CLI:
* requests
* semver
* asciimatics
* pydantic
* tenacity
Updates the following libraries in onefuzztypes:
* pydantic
The primary "legacy" libraries are [azure-graphrbac](https://pypi.org/project/azure-graphrbac/) and azure-cosmosdb-table. The former has not been updated to use azure-identity yet. The later is being rewritten as [azure-data-tables](https://pypi.org/project/azure-data-tables/), but is still in early beta.
This PR adds an experimental "local" mode for the agent, starting with `libfuzzer`. For tasks that poll a queue, in local mode, they just monitor a directory for new files.
Supported commands:
* libfuzzer-fuzz (models the `libfuzzer-fuzz` task)
* libfuzzer-coverage (models the `libfuzzer-coverage` task)
* libfuzzer-crash-report (models the `libfuzzer-crash-report` task)
* libfuzzer (models the `libfuzzer basic` job template, running libfuzzer-fuzz and libfuzzer-crash-report tasks concurrently, where any files that show up in `crashes_dir` are automatically turned into reports, and optionally runs the coverage task which runs the coverage data exporter for each file that shows up in `inputs_dir`).
Under the hood, there are a handful of changes required to the rest of the system to enable this feature.
1. `SyncedDir` URLs are now optional. In local mode, these no longer make sense. (We've discussed moving management of `SyncedDirs` to the Supervisor. This is tangential to that effort.)
2. `InputPoller` uses a `tempdir` rather than abusing `task_id` for temporary directory naming.
3. Moved the `agent` to only use a single tokio runtime, rather than one for each of the subcommands.
4. Sets the default log level to `info`. (RUST_LOG can still be used as is).
Note, this removes the `onefuzz-agent debug` commands for the tasks that are now exposed via `onefuzz-agent local`, as these provide a more featureful version of the debug tasks.
Add support for sharding across multiple storage accounts for blob containers used for corpus management.
Things to note:
1. Additional storage accounts must be in the same resource group, support the "blob" endpoint, and have the tag `storage_type` with the value `corpus`. A utility is provided (`src/utils/add-corpus-storage-accounts`), which adds storage accounts.
2. If any secondary storage accounts exist, they are used by default for containers.
3. Storage account names are cached in memory the Azure Function instance forever. Upon adding new storage accounts, the app needs to be restarted to pick up the new accounts.
Enables co-locating multiple tasks in a given work-set.
Tasks are bucketed by the following:
* OS
* job id
* setup container
* VM SKU & image (used in pre-1.0 style tasks)
* pool name (used in 1.0+ style tasks)
* if the task needs rebooting after the task setup script executes.
Additionally, a task will end up in a unique bucket if any of the following are true:
* The task is set to run on more than one VM
* The task is missing the `task.config.colocate` flag (all tasks created prior to this functionality) or the value is False
This updates the libfuzzer template to make use of colocation. Users can specify co-locating all of the tasks *or* co-locating the secondary tasks.
When running libfuzzer in 'fuzzing' mode, we expect the following on exit.
If the exit code is zero, crashing input isn't required. This happens if the user specifies '-runs=N'
If the exit code is non-zero, then crashes are expected. In practice, there are two causes to non-zero exits.
1. If the binary can't execute for some reason, like a missing prerequisite
2. If the binary _can_ execute, sometimes the sanitizers are put in such a bad place that they are unable to record the input that caused the crash.
This PR enables handling these two non-zero exit cases.
1. Optionally verify the libfuzzer target loads appropriately using `target_exe -help=1`. This allows failing faster in the common issues, such a missing prerequisite library.
2. Optionally allow non-zero exits without crashes to be a warning, rather than a task failure.
We need to move to supporting data sharding.
One of the steps towards that is stop passing around `account_id`, rather we need to specify the type of storage we need.
This PR makes user information from JWT tokens available as part of a Task.
Included changes:
* Renamed `verify_token` to `call_if_agent`, since this function is specific to agent token verification
* Renames `is_authorized` to `is_agent`, since this function checks if the token is an agent
* Adds support for unmanaged nodes in `is_agent` (see #133 for information)
* Saves the user information from the JWT token on task create as part of `TaskConfig`
Note, `TaskConfig` is what is provided to notification templates. This enables Github issues and ADO work items to tie back to the user that created the task.
Note, while `upn` _usually_ means email for AAD user tokens. If we were going to make use of the email address, we should perform a graph lookup based on the `oid`, but we're not.