Commit Graph

571 Commits

Author SHA1 Message Date
c537458ade update azure-mgmt-compute to 19.0.0 (#611) 2021-03-02 18:44:26 +00:00
ba836a2062 update azure-mgmt-eventgrid to 3.0.0rc9 (#610) 2021-03-02 18:18:38 +00:00
296ba2ee23 update azure-mgmt-storage to 17.0.0 (#612) 2021-03-02 18:00:07 +00:00
e43c1c875c simplify batch-processing log (#622)
Simplifies the logs from:

`Processing batch-downloaded input Ok(DirEntry(DirEntry("task_crashes_1/input-b4c3482194a6ebd275577ea52255fcea3358f3220c408d3c53b9f32b653e6586.txt")))`

to:

`Processing batch-downloaded input: task_crashes_1/input-b4c3482194a6ebd275577ea52255fcea3358f3220c408d3c53b9f32b653e6586.txt`
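
The difference between the two log lines can be sketched as follows: the old output looks like the `Debug` rendering of a `Result<DirEntry>`, while the new one prints only the path. This is a minimal sketch, not the code from the PR; the helper name `batch_input_log_line` is hypothetical.

```rust
use std::path::Path;

/// Hypothetical helper: build the log line from the entry's path
/// (via `Path::display`) rather than from the `Debug` output of the
/// whole `Result<DirEntry>` value.
fn batch_input_log_line(path: &Path) -> String {
    format!("Processing batch-downloaded input: {}", path.display())
}
```

With a `std::fs::DirEntry` in hand, the path would come from `entry.path()`.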
2021-03-02 17:32:07 +00:00
d4cedabdf8 update 3rd party rust dependencies (#624) 2021-03-02 11:41:30 -05:00
32681b2611 update azure-mgmt-resource (#607) 2021-03-02 08:35:07 +00:00
37af0f1112 add CodeQL pipeline (#617) 2021-03-01 14:06:38 -05:00
100e22a359 Rewrite redundant Result wraps (#616) 2021-03-01 12:43:30 -05:00
91bbcf7a59 update Azure Devops tool integration prereqs (#608) 2021-03-01 11:55:25 -05:00
d1fe872a07 Release 2.7.0 (#604) 2.7.0 2021-02-28 01:32:26 +00:00
0f895d11c9 add context to logging of supervisor work queue interaction (#601) 2021-02-27 20:17:04 -05:00
c1a2c9febb fix infinite loop on request error that isn't an IO Error (#603) 2021-02-26 20:23:39 -05:00
43585d84e3 handle azcopy's rename of NOTICES.txt (#602) 2021-02-26 19:43:04 -05:00
6a82f57c4a remove unused library from prereqs (#599) 2021-02-26 16:57:55 -05:00
e3c73d7a10 Update command variable expansion (#561)
* Documents `crashes_account` and `crashes_container`
* Adds `reports_dir` and support for `unique_reports`, `reports`, and `no_repro` containers to the generic analysis task
* Adds `microsoft_telemetry_key` and `instance_telemetry_key` to generic supervisor, generator, and analysis tasks
2021-02-26 20:58:09 +00:00
419ca05b28 Actively tail worker stdio from supervisor agent (#588)
In the supervisor agent, incrementally read from the running worker agent's redirected stderr and stdout, instead of waiting until it exits.

The worker agent's stderr and stdout are piped to the supervisor when tasks are run. The supervisor's `WorkerRunner` does _not_ use `wait_with_output()`, which handles this (at the cost of blocking). Instead, it makes repeated calls to `try_wait()` on timer-based state transitions, and does not try to read the pipes until the worker exits. But when one of the child's pipes is full, the child can block forever waiting on a `write(2)`, such as in a `log` facade implementation.

This bug had not been caught because we control the child worker agent, and until recently it wrote to these streams mostly through `env_logger` at its default log level. Recent work changed that: (1) the default logging became the more verbose `INFO` level, (2) the worker began logging stderr/stdout lines of _its own_ child processes, and (3) some user targets logged very verbosely for debugging. Together these surfaced the underlying issue.
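
One general way to avoid the full-pipe deadlock described above is to drain each pipe on its own thread while the child is still running. The following is a minimal sketch using only the Rust standard library, not the approach taken in the PR; `drain` and `run_and_tail` are hypothetical names.

```rust
use std::io::{BufRead, BufReader, Read};
use std::process::{Command, ExitStatus, Stdio};
use std::thread;

/// Read lines from a pipe on a dedicated thread so the child never
/// blocks on a full pipe buffer. Reading stops at EOF or the first error.
fn drain<R: Read + Send + 'static>(pipe: R) -> thread::JoinHandle<Vec<String>> {
    thread::spawn(move || {
        BufReader::new(pipe)
            .lines()
            .map_while(Result::ok)
            .collect()
    })
}

/// Spawn the command with piped stdout/stderr and consume both streams
/// while it runs, instead of reading them only after it exits.
fn run_and_tail(mut cmd: Command) -> std::io::Result<(ExitStatus, Vec<String>, Vec<String>)> {
    let mut child = cmd.stdout(Stdio::piped()).stderr(Stdio::piped()).spawn()?;
    let out = drain(child.stdout.take().expect("stdout was piped"));
    let err = drain(child.stderr.take().expect("stderr was piped"));
    // The reader threads keep the pipes empty, so waiting here cannot deadlock
    // on the child's writes.
    let status = child.wait()?;
    let stdout = out.join().expect("stdout reader panicked");
    let stderr = err.join().expect("stderr reader panicked");
    Ok((status, stdout, stderr))
}
```

A supervisor that must stay non-blocking could keep polling `try_wait()` and have the reader threads forward lines over a channel instead of collecting them.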
2021-02-26 20:09:02 +00:00
06f45f338c Update Task Heartbeat to include Job_id (#594) 2021-02-26 13:36:10 -05:00
6a049db3a3 Renames application insights keys to be more clear (#587)
* renames `telemetry_key` to `microsoft_telemetry_key`
* renames `instrumentation_key` to `instance_telemetry_key`
* renames `can_share` to `can_share_with_microsoft`
* renames the `applicationinsights-rs` instances to `internal` and `microsoft`, according to the key used to construct each.

This clarifies the underlying use of Application Insights keys and uses tuple structs so that Rust's type checker ensures the keys are used correctly.
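
The tuple-struct idea can be illustrated as below. This is a sketch under assumptions: the type names, the `String` payload, and `TelemetryKeys` are hypothetical and may not match the definitions in the PR.

```rust
/// Hypothetical newtype for the key whose telemetry is shared with Microsoft.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct MicrosoftTelemetryKey(pub String);

/// Hypothetical newtype for the key belonging to the instance owner.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct InstanceTelemetryKey(pub String);

/// Holds both optional keys; each field accepts only its own key type.
#[derive(Debug, Default)]
pub struct TelemetryKeys {
    pub instance: Option<InstanceTelemetryKey>,
    pub microsoft: Option<MicrosoftTelemetryKey>,
}

impl TelemetryKeys {
    /// Passing the two arguments in the wrong order is a compile error,
    /// which two bare `String` parameters would not catch.
    pub fn new(
        instance: Option<InstanceTelemetryKey>,
        microsoft: Option<MicrosoftTelemetryKey>,
    ) -> Self {
        Self { instance, microsoft }
    }
}
```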
2021-02-26 17:04:49 +00:00
8600a44f1f fix bool queries (#597)
This addresses broken queries used for identifying outdated nodes.
2021-02-26 16:51:05 +00:00
bc6c8408c4 add onefuzz containers files download_dir (#598)
fixes #571
2021-02-26 15:27:51 +00:00
4cd2de0e93 Update azure-cli & azure-cli-core (#596) 2021-02-26 09:19:25 -05:00
daef1637f8 update jinja2 (#595) 2021-02-26 09:19:10 -05:00
a3fa5f6b62 Update onefuzz-agent unit tests (#592) 2021-02-24 20:54:36 -05:00
ed86bb0099 Use non-deprecated atomic method (#593) 2021-02-24 17:41:24 -05:00
fb482e357e don't schedule work to a node if the scaleset or pool is shutting down (#583) 2021-02-23 13:33:41 -05:00
e7fe099f25 handle delayed AAD resources in deployments (#585) 2021-02-22 19:40:07 -05:00
e2e44ace8a release 2.6.0 (#584) 2.6.0 2021-02-22 12:25:12 -05:00
cebb84b9e7 handle error condition when creating a container that is being deleted (#582)
When users try to create a container immediately after deleting it, Azure fails the request, reporting that the deletion is still in progress.

Catching `ResourceExistsError` during creation handles this error.
2021-02-22 01:49:07 +00:00
feb80ecb54 allow nodes with multiple tasks to continue on task stop (#567)
As is, when multiple tasks are running on a single node, if any one of them stops, the node gets reimaged.

This changes the behavior such that when a node with multiple tasks has one task stop, the other tasks will continue.
2021-02-19 23:54:26 +00:00
6ba5795f36 update proxy port ranges to avoid current blocks (#552) 2021-02-19 17:50:09 -05:00
4de19ffe5e stop jobs that do not start within 30 days (#565)
If a job does not start within 30 days, stop the job and mark all of the tasks as `failed`.
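
The rule above can be sketched as a small state check. This is an illustrative model only: the `Job`, `JobState`, and `TaskState` types and the `stop_if_never_started` function are hypothetical and are not taken from the service code.

```rust
use std::time::Duration;

/// Thirty days, the start deadline described in the commit message.
const START_TIMEOUT: Duration = Duration::from_secs(30 * 24 * 60 * 60);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum JobState {
    Init,
    Enabled,
    Stopping,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TaskState {
    Waiting,
    Failed,
}

/// Hypothetical job record; `age` is the time elapsed since creation.
struct Job {
    state: JobState,
    age: Duration,
    tasks: Vec<TaskState>,
}

/// Stop a job that is still in its initial state after the deadline and
/// mark every task as failed. Returns true if the job was stopped.
fn stop_if_never_started(job: &mut Job) -> bool {
    if job.state != JobState::Init || job.age <= START_TIMEOUT {
        return false;
    }
    job.state = JobState::Stopping;
    for task in job.tasks.iter_mut() {
        *task = TaskState::Failed;
    }
    true
}
```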
2021-02-19 21:23:35 +00:00
305c23a4d9 add instance information to webhooks (#577)
Fixes #574
2021-02-19 21:00:51 +00:00
8ce4638b8a clarify scaleset logging (#568) 2021-02-19 19:36:16 +00:00
4992b494f1 add task config to all task events (#580) 2021-02-19 14:10:48 -05:00
872a5ddc14 add details to exceptions generated during report render failures (#576) 2021-02-19 13:48:49 -05:00
3a7bc95316 import local relative paths (#579) 2021-02-19 12:29:35 -05:00
cc5965ebbf add .gitignore to ignore libfuzzer-dotnet build artifacts (#564) 2021-02-19 09:32:26 +00:00
657af9722c coverage containers should be unique to the project/name/build/platform (#572) 2021-02-18 17:07:44 -05:00
929d9ce496 make user triggered reimaging happen immediately (#566) 2021-02-18 14:08:25 -05:00
21f08f6a98 Release 2.5.0 (#559) 2.5.0 2021-02-17 19:07:02 -05:00
279629292f handle SkuNotAvailable errors when creating VM Scalesets (#557) 2021-02-17 16:52:37 -05:00
89d7f060dd make missing symbols for coverage tasks more explicit (#554)
This moves from:

```
"Error: coverage extraction from C:\users\bcaswell\projects\bugs\andrew-coverage-fail\setup\oft-setup-5c77cfe1b181520ab0b33a16286a690a\fuzz.exe failed when processing file "11f6ad8ec52a2984abaafd7c3b516503785c2072".  target appears to be missing sancov instrumentation",
```

To the even more explicit:
```
Error: Target appears to be missing sancov instrumentation.  This error can happen due to missing coverage symbols.
target_exe: C:\users\bcaswell\projects\bugs\andrew-coverage-fail\setup\oft-setup-5c77cfe1b181520ab0b33a16286a690a\fuzz.exe
input: "11f6ad8ec52a2984abaafd7c3b516503785c2072"
debugger stdout:
...
[+] disabling sympath
[+] processing fuzz.exe
[+] no tables  fuzz.exe
[+] processing C:\WINDOWS\SYSTEM32\kernel.appcore.dll
[+] no tables  C:\WINDOWS\SYSTEM32\kernel.appcore.dll
[+] processing C:\WINDOWS\System32\KERNELBASE.dll
[+] no tables  C:\WINDOWS\System32\KERNELBASE.dll
[+] processing C:\WINDOWS\System32\RPCRT4.dll
[+] no tables  C:\WINDOWS\System32\RPCRT4.dll
[+] processing C:\WINDOWS\System32\msvcrt.dll
[+] no tables  C:\WINDOWS\System32\msvcrt.dll
[+] processing C:\WINDOWS\System32\KERNEL32.DLL
[+] no tables  C:\WINDOWS\System32\KERNEL32.DLL
[+] processing ntdll.dll
[+] no tables  ntdll.dll
Error: unable to find sancov counter symbols [at DumpCounters (line 114 col 9)]
...
```
2021-02-17 16:34:09 +00:00
ce47e4924a add status job commands (#550) 2021-02-16 13:47:57 -05:00
c160088998 expose input_blob fields needed to generate crash reports (#551) 2021-02-16 13:16:54 -05:00
f64a0dcc05 lint integration-test.py (#549) 2021-02-16 12:22:45 -05:00
e9b67952e3 update 3rd-party rust dependencies (#548) 2021-02-16 11:11:20 -05:00
933fe6850c libfuzzer-dotnet integration (#535) 2021-02-11 17:30:24 -05:00
360693e8a4 move verbose to debug to align with log and opentelemetry (#541) 2021-02-11 16:49:27 -05:00
a3d73a240d report the total coverage after processing all inputs in local mode (#537) 2021-02-11 19:34:09 +00:00
1e536c54d3 update error message when coverage extraction fails (#539) 2021-02-11 14:18:49 -05:00