2050 Commits

Author SHA1 Message Date
969701aa7d Lowercase webhooks digest header value (#2471)
Co-authored-by: stas <statis@microsoft.com>
2022-09-29 19:20:24 -07:00
2155c48b99 Allow worker loops to continue after errors (#2469)
In `TimerWorkers`, if updating one entity fails and throws an exception, the whole update is abandoned. Instead, log the error and continue processing the remaining entities, so that we make progress even if one entity is stuck.
2022-09-30 12:40:38 +13:00
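The log-and-continue pattern described in 2155c48b99 can be sketched as follows. This is a minimal Python illustration, not the project's C# code; `process_all`, `flaky_update`, and the entity names are hypothetical.

```python
import logging

logger = logging.getLogger("timer_workers")


def process_all(entities, update):
    """Apply `update` to every entity, logging failures instead of aborting."""
    failed = []
    for entity in entities:
        try:
            update(entity)
        except Exception:
            # Log with traceback and move on, so that one stuck entity
            # does not block the rest of the batch.
            logger.exception("failed to update entity %r", entity)
            failed.append(entity)
    return failed


processed = []


def flaky_update(entity):
    # Hypothetical update function that fails for one specific entity.
    if entity == "node-2":
        raise RuntimeError("stuck")
    processed.append(entity)


failed = process_all(["node-1", "node-2", "node-3"], flaky_update)
```

Returning the failed entities is a design choice of this sketch; it lets the caller report or retry them without stopping the loop.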
4c5023cb55 CLI: Retry on connection reset (#2468)
In our integration test runs we are seeing connection-reset errors, which cause the CLI operation to fail.

To fix this:
1. Set TCP-KeepAlive to keep Azure load balancer connections alive longer than the default timeout (4 minutes).
2. Treat ConnectionResetError as retryable.
2022-09-30 11:23:29 +13:00
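The two steps in 4c5023cb55 can be sketched roughly as below. This is an illustration under assumptions, not the CLI's actual code: `enable_keepalive`, `with_retries`, the 120-second idle value, and the attempt count are all hypothetical.

```python
import socket
import time


def enable_keepalive(sock: socket.socket, idle_seconds: int = 120) -> None:
    """Turn on TCP keep-alive so idle connections still send probes."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # TCP_KEEPIDLE is platform-specific (available on Linux), so guard it.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_seconds)


def with_retries(operation, attempts: int = 3, delay_seconds: float = 0.0):
    """Call `operation`, retrying only when the connection is reset."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ConnectionResetError:
            if attempt == attempts:
                raise  # out of attempts: surface the error to the caller
            time.sleep(delay_seconds)
```

Keeping the idle time well under the load balancer's timeout is the point of the first step; the retry wrapper covers resets that still slip through.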
41f973184e Prefix target_exe with setup dir at use sites (#2405) 2022-09-29 13:47:04 -07:00
0c4cd5414d C#: Fix UpdateConfigs (#2463) 2022-09-29 07:00:04 +00:00
1ec9b13e55 Disable PoolName validation (#2459)
* Add comment

* Disable test
2022-09-29 04:18:11 +00:00
8f4cf9d3b6 Correct pool transitions (#2462) 2022-09-29 17:10:39 +13:00
0e2f651a35 fix null ref exception (#2460)
* fix null ref exception

* write out <null> if builder or message are null

Co-authored-by: stas <statis@microsoft.com>
2022-09-28 14:44:32 -07:00
9180215a10 Permit periods in Pool names (#2452) 2022-09-28 21:05:42 +00:00
0e9e32a934 EnsureNotNull doesn't support our custom formatter (#2458) 2022-09-28 19:18:51 +00:00
9a042724d7 Can create ado notifications (#2456)
* Can create ado notifications

* Missed a small issue
2022-09-28 11:04:39 -04:00
872c1070fc Fix: Node state getting reset to init (#2454)
When `isNew` is passed, creation should fail if a `Node` already exists. Instead, the existing `Node` was being overwritten.
2022-09-28 16:53:40 +13:00
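The insert-versus-overwrite distinction behind 872c1070fc can be sketched as follows. This is a hypothetical in-memory stand-in, not the project's storage layer; `NodeTable`, `EntityExistsError`, and the `is_new` parameter name are assumptions for illustration.

```python
class EntityExistsError(Exception):
    """Raised when a create is requested for a key that already exists."""


class NodeTable:
    """Hypothetical in-memory table keyed by machine ID."""

    def __init__(self):
        self._rows = {}

    def save(self, machine_id, node, is_new=False):
        # With is_new=True this is a strict insert: refuse to clobber
        # an existing row instead of silently resetting its state.
        if is_new and machine_id in self._rows:
            raise EntityExistsError(machine_id)
        self._rows[machine_id] = node

    def get(self, machine_id):
        return self._rows[machine_id]


table = NodeTable()
table.save("m1", {"state": "init"}, is_new=True)
table.save("m1", {"state": "busy"})  # plain update is allowed
```

A second `table.save("m1", ..., is_new=True)` now raises rather than resetting the node to `init`.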
b918720083 Update download-artifact action (#2453) 2022-09-27 22:38:17 +00:00
476c99a998 use InterpolatedStringHandler to move values to CustomDimensions Tags instead of keeping them in the error message (#2450)
* use InterpolatedStringHandler to move values to CustomDimensions Tags instead of keeping them in the error message

* log blob save raw response failure

* add StringBuilder to CSharpExtensions

Co-authored-by: stas <statis@microsoft.com>
2022-09-27 15:22:29 -07:00
b3748e4283 Handle 404 in Queue.RemoveFirstMessage (#2451) 2022-09-28 11:01:02 +13:00
e6d3b39d1a Bump protobuf from 3.20.0 to 3.20.2 in /src/api-service/__app__ (#2446)
Bumps [protobuf](https://github.com/protocolbuffers/protobuf) from 3.20.0 to 3.20.2.
- [Release notes](https://github.com/protocolbuffers/protobuf/releases)
- [Changelog](https://github.com/protocolbuffers/protobuf/blob/main/generate_changelog.py)
- [Commits](https://github.com/protocolbuffers/protobuf/compare/v3.20.0...v3.20.2)

---
updated-dependencies:
- dependency-name: protobuf
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-09-27 12:42:55 +13:00
5ee4cd045d Add Roslyn analyzer to check results are used (#2443)
As seen in #2441, it is easy to drop return values of updated entities accidentally.

This PR adds a Roslyn Analyzer which will detect when return values are unused. To explicitly ignore a value you can drop it with `_ = …;`

Closes #2442.
2022-09-26 22:26:06 +00:00
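The pitfall that motivates 5ee4cd045d can be illustrated in Python with an immutable record. This is only an analogy to the C# case, and `Proxy`, its fields, and `set_state` are hypothetical names.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Proxy:
    region: str
    state: str


def set_state(proxy: Proxy, state: str) -> Proxy:
    # Returns an updated copy; the original object is never modified.
    return replace(proxy, state=state)


proxy = Proxy(region="westus2", state="init")

set_state(proxy, "running")  # bug: the updated copy is dropped
state_after_dropped_call = proxy.state  # still "init"

proxy = set_state(proxy, "running")  # correct: keep the returned value
state_after_assignment = proxy.state

_ = set_state(proxy, "stopped")  # explicit discard, visible to a reviewer
```

Python itself does not flag the dropped call, which is the kind of mistake an analyzer for unused return values is meant to catch.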
bb2e8ad05e Fix logic to retrieve partitionKey and rowKey (#2447)
* Fix logic to retrieve partitionKey and rowKey

* Moved key getters to EntityConverter and added unit test
2022-09-26 22:01:00 +00:00
85fca0d945 Release 5.14.0 (#2440)
* Release-5.14.0

* removing pr

* Release-5.14.0

* Adding PRs 2441 & 2438

* Adding PR 2434
5.14.0
2022-09-23 16:19:02 -07:00
3f35d81f4b Adding New Default Image Config Value to IC. (#2434)
* Adding New Default Image Config Value to IC.

* Removing forced image setting.

* Updating Webhook Events.

* Removing typo.

* Updating webhook_events again.

* Syncing webhook events.

* Fixing check for os type.

* Fixing import.

* PR Suggestions.

* Fix C# Model Typo.

* Removing other refs to images.

* Removing remaining refs to images outside of models.

* Removing hardcoded image values from tests.

* Update Default Proxy and Repro Images.

Co-authored-by: Marc Greisen <mgreisen@microsoft.com>
2022-09-23 10:40:44 -07:00
dc2c4649c8 do not lose proxy objects when setting state (#2441)
Co-authored-by: stas <statis@microsoft.com>
2022-09-23 13:29:22 +12:00
e1851b0af4 Add more logs (#2438)
* add logs

* avoid relying on exceptions for logic flow control

* add logs to agent commands

* add more logs and fix error logging when table writes fail

* move machine ID to CustomDimensions

* log insert errors

* Log Delete failures

* more logs

* more logs

* more logs

* More logs (I think that's it there is no more...)

Co-authored-by: stas <statis@microsoft.com>
2022-09-23 11:28:37 +12:00
4f9682d3cf Do not fail task on notification failure (#2435)
* Do not fail task on notification failure

* Need to throw on the last iteration in order for it to go to poison queue

* lint
2022-09-22 21:05:07 +00:00
b14bade0fc cleanup queues for non-existent pools and non-existent tasks (#2433)
* cleanup queues for non-existent pools and non-existent tasks

* extra logs

Co-authored-by: stas <statis@microsoft.com>
2022-09-22 08:15:57 -07:00
1013e01a3d Release-5.14.0 (#2428)
* Release-5.14.0
2022-09-21 17:21:09 -07:00
de766dfa78 Delete pool queue when pool is deleted (#2431)
* Delete pool queue when pool is deleted

* Also delete shrink queue when pool is deleted

Co-authored-by: stas <statis@microsoft.com>
2022-09-21 14:39:30 -07:00
2f42cd74d4 Adding State Transition Error Logging. (#2425)
* Adding State Transition Logging.

* Fix log call.

* Removing logging statement.

* Fix error! call.

* Adding error to error message.

* Make err var borrowed.

* Formatting.
2022-09-21 12:54:28 -07:00
39c3736bef Adjust concurrency spec (#2426) 2022-09-20 22:12:53 +00:00
a6addbf83a Minor fixes (#2420)
* delete all temp files

* add try/catch and some logging when deleting temp files

Co-authored-by: stas <statis@microsoft.com>
2022-09-20 10:20:29 -07:00
f40a69a37f fix linux repro extensions (#2415)
* fix OMS Linux repro extension config

* Fixing lost Node state updates

* fix bug in ReproVmss

* rewrite ssh auth

* win azure function ssh-keygen fix

* more logs

* try -P

* use empty string for password

* use argument list

* addressing comments

Co-authored-by: stas <statis@microsoft.com>
Co-authored-by: George Pollard <gpollard@microsoft.com>
2022-09-20 08:15:26 -07:00
b647e4a1fe mark tasks as failed if a work unit cannot be created for the task (#2409)
* mark tasks as failed if a work unit cannot be created for the task

* fix up time queries

* query improvements

Co-authored-by: stas <statis@microsoft.com>
2022-09-17 12:50:41 -07:00
867cdbc06f Port SyncAutoscaleSettings from Python to C# (#2407)
* Port SyncAutoscaleSettings from Python to C#

* address comment

Co-authored-by: stas <statis@microsoft.com>
2022-09-16 07:58:37 -07:00
3f86fc8689 fix some bugs (#2406)
* fix queries in timer retention

* do not discard proxy record after proxy state is processed, since that record needs to persist

* addressing comments

Co-authored-by: stas <statis@microsoft.com>
2022-09-15 15:45:39 -07:00
4f1ac523da Adding error message to catch Model HttpResponseError. (#2384)
* Adding error message to catch Model HttpResponseError.

* Changing error message.

* Formatting.
2022-09-15 14:44:23 -07:00
f22dee18df CodeQL needs explicit permissions to run (#2404) 2022-09-15 09:05:18 -04:00
4cc4de9c9e Codecov setup for C# & Rust code (#2400)
Use Codecov to show coverage reports, so we get highlighted versions of the files where it is easy to see missing coverage.

- Set up Rust coverage using [`cargo-llvm-cov`](https://github.com/taiki-e/cargo-llvm-cov).
- Add the `ci/agent.sh` build script to the agent artifact cache key, since it wasn't there before.
- Don't run Rust tests in `--release` mode (have been meaning to change this so doing it at the same time).

There is some subtlety about putting the coverage result into the cached agent artifact, so that when we reuse the agent artifact we can still upload the coverage information for it to Codecov. Without this it would look like the coverage had dropped.
2022-09-15 02:29:22 +00:00
61a797e224 Restore self-hosted configuration (#2394) 2022-09-14 23:59:41 +00:00
ac9d072e1d fix linux proxy extensions provisioning failures (#2401)
* fix linux proxy extensions provisioning failures

* format

Co-authored-by: stas <statis@microsoft.com>
2022-09-14 23:52:35 +00:00
2fe73ab79c bug fix (#2392)
* bug fix

* rename anyNotStoppedJobs to anyNotStoppedTasks

Co-authored-by: stas <statis@microsoft.com>
2022-09-14 10:07:15 -07:00
ca7b6be43b Refactor notification support (#2363)
* Add teams notifications

* .

* Fix compilation issues

* Checkpoint

* Added Ado

* Fix some TODOs

* Teams messages work! 🎉

* fmt

* Bug fix container url generator

* Some small ado changes

* 🧹

* PR comments

* Fix packages

* Get more detailed restore information to debug errors

* Maybe fixes this issue?

* Undo CI change
2022-09-14 15:07:52 +00:00
f375ee719e Two fixes to C# scheduling (#2390)
Two fixes to scheduling code:

- `GetPool` was not correct for the VM case (this code is possibly legacy and not used any more)
- `BuildWorkUnit` could fetch the same pool multiple times and then fail due to `BucketConfig` mismatch (on `TimeStamp`)
  - add a cache to the loop so that we only fetch each pool once
2022-09-14 02:06:01 +00:00
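The per-loop cache mentioned in f375ee719e can be sketched as below. This is a minimal Python illustration, not the C# `BuildWorkUnit` code; `build_work_units`, `fetch_pool`, and the task dictionary shape are hypothetical.

```python
def build_work_units(tasks, fetch_pool):
    """Pair each task with its pool, fetching every pool at most once."""
    pool_cache = {}
    units = []
    for task in tasks:
        pool_name = task["pool"]
        if pool_name not in pool_cache:
            # First time this pool is seen in the loop: fetch and remember it,
            # so later tasks compare against the same snapshot.
            pool_cache[pool_name] = fetch_pool(pool_name)
        units.append((task["id"], pool_cache[pool_name]))
    return units


fetch_calls = []


def fetch_pool(name):
    # Hypothetical fetch that records how often each pool is requested.
    fetch_calls.append(name)
    return {"name": name, "fetch_number": len(fetch_calls)}


tasks = [
    {"id": "t1", "pool": "linux"},
    {"id": "t2", "pool": "linux"},
    {"id": "t3", "pool": "windows"},
]
units = build_work_units(tasks, fetch_pool)
```

Because both `linux` tasks share one fetched object, any field that changes between fetches (such as a timestamp) can no longer cause a mismatch within the loop.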
3b8cbc3f1e Cancel any previous PR builds when new one starts (#2393) 2022-09-14 01:34:42 +00:00
bb81f2ec51 Fix MarkDependantsFailed (#2389) 2022-09-14 01:21:13 +00:00
2ff758464e Use Github-hosted Ubuntu until issue with self-hosted pool is resolved (#2391) 2022-09-14 12:39:18 +12:00
f7f91df622 CSharp Refactor - Instance Config Endpoint (#2347)
* CSharp Refactor - Instance Config Endpoint

* Finishing config update.

* Formatting.

* Formatting.

* formatting.

* Fixing encoding.

* Fixing config references.

* Fixing refs.

* Trying location.

* Trying ref to location.

* Passing nsg.

* Passing nsg.

* Setting nsg to not null.

* Fixing ok reference.

* Adding Instance Config Response.

* Setting required attribute.

* Adding route specifier.

* Formatting.

* Fixing route.

* Fixing optionals.

* Trying to set default

* Trying again.

* Setting require admins

* Removing optional.

* Testing with instancename.

* Updating instanceconfig model.

* Updating instance config response.

* Formatting.

* Removing AllowPoolManagement.

* Readding.

* Removing arg.

* Replacing with RequireAdminPrivs.

* Fix orm test.

* Setting requireadminprivs to true.

* Requiring admin privs.

* Fix formatting.

* fix test.

* Fixing.

* Changing error message.

* Changing.

* Reordering test args.

* Flipping.

* Fixing args.

* Fixing again.

* Removing false.

* Removing from constructor.

* Setting.

* Setting string to optional.

* Formatting.

* Adding default value.

* Pushing changes to OrmModelsTest

* Updating test to not pass null.

* George's suggestions.

* Removing entityconverter changes.

* Fixing import.
2022-09-13 08:55:40 -07:00
67e55910ac Fix TaskOperations.SearchStates (#2383) 2022-09-13 00:31:35 +00:00
bc33ae1d7a Fix Scaleset response Auth inclusion (#2382)
The `Auth` property is not meant to be returned upon Create/`POST`. Fix this, and make it easier to specify when `Auth` should be included or not.
2022-09-12 17:19:10 -07:00
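One way to make `Auth` inclusion explicit, as bc33ae1d7a describes, is an opt-in flag on the response builder. The sketch below is a Python illustration under assumptions; `scaleset_response`, `include_auth`, and the dictionary fields are hypothetical and not the service's actual model.

```python
def scaleset_response(scaleset: dict, include_auth: bool = False) -> dict:
    """Build an API response, leaving out `auth` unless explicitly requested."""
    response = dict(scaleset)  # shallow copy; the stored record is untouched
    if not include_auth:
        response.pop("auth", None)
    return response


stored = {"scaleset_id": "abc", "size": 3, "auth": {"password": "placeholder"}}

created = scaleset_response(stored)  # e.g. the reply to Create/POST
detailed = scaleset_response(stored, include_auth=True)
```

Defaulting the flag to `False` means a new call site has to opt in before credentials are returned.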
ecf858bbdd Updating error and fixing default value for uto_create_cli_app (#2378) 2022-09-12 16:28:41 -07:00
ce1fc773a9 implement not-implemented: GetInputContainerQueues (#2380)
* implement not-implemented: GetInputContainerQueues

* named tuple

Co-authored-by: stas <statis@microsoft.com>
2022-09-12 15:17:21 -07:00
44f74f622a DoNotRunExtensionsOnOverprovisionedVms must be false if Overprovision is false (#2375) 2022-09-12 22:00:47 +00:00