When running libfuzzer in 'fuzzing' mode, we expect the following on exit.
If the exit code is zero, crashing input isn't required. This happens if the user specifies '-runs=N'
If the exit code is non-zero, then crashes are expected. In practice, there are two causes to non-zero exits.
1. If the binary can't execute for some reason, like a missing prerequisite
2. If the binary _can_ execute, sometimes the sanitizers are put in such a bad place that they are unable to record the input that caused the crash.
This PR enables handling these two non-zero exit cases.
1. Optionally verify the libfuzzer target loads appropriately using `target_exe -help=1`. This allows failing faster in the common issues, such a missing prerequisite library.
2. Optionally allow non-zero exits without crashes to be a warning, rather than a task failure.
This fixes an issue running a libfuzzer coverage task and don't have any initial seeds (or there are seeds found by the fuzzer by the time the task starts), it will fail.
Adds `onefuzz debug log tail <keyword>`, which enables performing the same component in `onefuzz debug log keyword <keyword>` in a loop.
Optimizations:
* only returns the N records at a time (default 1000)
* each query only returns records that occur after the latest record received.
* If no results are returned, waits 10s before retrying
* Increases the wait time by 1.5x until the wait time is larger than 60s
Using `--filter` provides the ability to filter each record that comes back via jmespath.
Example uses:
Monitor any log messages (which ignores metrics) for a given job_id GUID
```
onefuzz debug logs tail bf4efdfd-685c-444a-81c5-d911477433ae --filter message
```
Log the job_id and task_id for each new unique report:
```
onefuzz debug logs tail new_unique_report --filter '[customDimensions.job_id, customDimensions.task_id]'
```
Log the job_id and task_id for each new unique report only for the specific job_id:
```
onefuzz debug logs tail "new_unique_report d5bcd4d2-4dab-49d5-a215-66db94fb0309" --filter '[customDimensions.job_id, customDimensions.task_id]'
```
In `Scaleset.cleanup_nodes`, nodes that are no longer part of the scaleset should get deleted. Without filtering the list, the nodes could get re-saved to the Node table later on.
Adds a random initial jitter the size of the heartbeat periodicity to prevent heartbeats storming the service when we launch 3000 nodes roughly at the same time.
Fixes#386
Adds the following:
1. Serializes a workset to disk during setup.
2. Upon deserializing a RebootContext, delete the file from disk (We support rebooting once and only once)
3. Check if a workset exists with a RebootContext
1. If True, continuing processing
2. if False, mark the tasks & node as "Done" with appropriate errors via:
1. send WorkerEvent::Done events for each of the tasks in the work set
2. send StateUpdateEvent::Done for the node
1. Merge 'create' and 'update' to a single 'save' operation.
2. Allow fetching a single template.
This enables the following workflow:
```
$ onefuzz job_templates manage get libfuzzer_linux > template.json
$ <... update template as desired ...>
$ onefuzz job_templates manage save libfuzzer_linux @./template.json
$
```
Over the weekend, pyOpenssl 20.0 was released. This causes an incompatible library issue during deployment.
Prior to this change, deployment would generate the following error
```
ERROR: pyopenssl 20.0.0 has requirement cryptography>=3.2, but you'll have cryptography 2.9.2 which is incompatible.
```
When running automated deployments, 'tools' were not being properly replaced with the updated versions if the deployment was created _prior_ to the original instance deployment.
Starting earlier today, I saw roughly 1 in 3 deployments fail with the error `Azure.Functions.Cli.Common.CliException: Timed out waiting for SCM to update the Environment Settings`. Redeploying the application resolves the issue. New builds and past releases alike hit this exception.
According to https://github.com/Azure/azure-functions-core-tools/issues/1863, function app deployments may fail due to timeouts related to cold-start.
This PR executes the deploy in a loop with a delay in the case of failure.
This PR fixes two issues:
- First, in MSVC compiled binaries both the LLVM _and_ MSVC symbols are
present, but only the MSVC symbols have correct values. For example:
```
0:000> cdb: Reading initial command '.scriptload DumpCountersOld.js ; !dumpcounters "cov" ; q'
JavaScript script successfully loaded from 'DumpCountersOld.js'
[+] not disabling sympath
INFO: Seed: 58715679
INFO: Loaded 1 modules (3968 inline 8-bit counters): 3968 [00007FF70DB4B000, 00007FF70DB4BF80), # XXX Note
xxx.exe: Running 1 inputs 1 time(s) each.
Running: inp
[+] processing xxx.exe
[+] using LLVM 10 symbols - 0x7ff70db72b00:0x7ff70db72b08 # XXX These are wrong
```
This means the order we search for the coverage symbols is important.
- Secondly, this enables support for MSVC 8bit counter coverage.
## Validation Steps Performed
Running any recent MSVC compiled libfuzzer target should fail to actually collect coverage, instead just returning the 8 null bytes described in the linked issue.