Commit Graph

1562 Commits

Author SHA1 Message Date
41fc3b22c0 Update Logging Statement to Warning and Include Http Info (#2484)
* Convert Error to Warning on Poll State Transition Failure.

* Adding Result information.
2022-10-10 10:49:41 -07:00
cdc104f966 Update documentation to specify appropriate RID (#2490)
Closes #2457
2022-10-07 01:03:18 +00:00
053153fa44 Release 5.15.1 (#2493) 2022-10-06 11:56:10 -07:00
80b1122a8d Undo tab change from previous PR (#2492) 2022-10-05 17:32:33 +00:00
124c62756c Migrating notification templates (#2486)
* Add jinja template migration

* Support migrating our most common jinja templates to scriban on the fly

* Fix tests
2022-10-05 14:48:20 +00:00
f6a680bb3d Attempt to fix integration-tests-linux failure (#2487) 2022-10-05 00:29:08 +00:00
7809b40e74 Update to Rust 1.64 (#2488)
[Release notes.](https://blog.rust-lang.org/2022/09/22/Rust-1.64.0.html)

Probably one of the more important things is:
> performance improvements of 10-20% for compiling Rust code on Windows
2022-10-04 22:31:53 +00:00
7529a184cf Release 5.15.0 (#2474)
* Release 5.15.0

* Update CHANGELOG.md

Co-authored-by: Joe Ranweiler <joe@lemma.co>

* Adding PR.

* Update CHANGELOG.md

Co-authored-by: Marc Greisen <mgreisen@microsoft.com>

* Adding state machine pr.

* Updating changelog.

* Update CHANGELOG.md

Co-authored-by: Joe Ranweiler <joe@lemma.co>

* Adding cache PR.

Co-authored-by: Joe Ranweiler <joe@lemma.co>
Co-authored-by: Marc Greisen <mgreisen@microsoft.com>
5.15.0
2022-10-04 14:13:47 -07:00
489579a971 Adding missing caching from python code (#2467)
* bringing back some more caching

* more caching

* formatting

* use a record instead of a string as the key to the cache entry
2022-10-03 16:05:52 -07:00
ef5682c282 Debug failing check pr (#2476)
* add more logs

* bug fix

* more logs

* another fix

* fix integration tests

* do not log error when vm deletion is in progress

* addressing comments

* .

* ..

Co-authored-by: stas <statis@microsoft.com>
2022-10-03 10:40:10 -07:00
e77a87a782 Enable backtraces for agent (#2437) 2022-09-30 10:13:09 -07:00
4662df3e39 Cache VMSS VM InstanceID lookups (#2464)
* Cache VMSS VM InstanceID lookups

* Adding an expiration time to the cache

* make the TTL 10 min

* properly add entries to the cache

Co-authored-by: Cheick Keita <chkeita@microsoft.com>
2022-09-30 16:22:09 +00:00
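
The caching change above (#2464) can be sketched as a lookup cache whose entries expire after 10 minutes. This is only an illustration under stated assumptions: `TtlCache`, `get_or_add`, and the `load` callback are hypothetical names, not taken from the repository, and the injectable clock exists only so the expiry can be exercised without waiting.

```python
import time
from typing import Callable, Dict, Generic, Hashable, Tuple, TypeVar

K = TypeVar("K", bound=Hashable)
V = TypeVar("V")


class TtlCache(Generic[K, V]):
    """Hypothetical cache whose entries expire ttl_seconds after being stored."""

    def __init__(
        self,
        ttl_seconds: float = 600.0,  # 10 minutes, as in the commit message
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[K, Tuple[float, V]] = {}

    def get_or_add(self, key: K, load: Callable[[], V]) -> V:
        """Return the cached value, calling load() only on a miss or after expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = load()
        # Store the value together with the time at which it stops being valid.
        self._entries[key] = (now + self._ttl, value)
        return value
```

A caller would wrap the expensive lookup in `load`, so repeated requests for the same key within the TTL never reach the backing service.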
969701aa7d Lowercase webhooks digest header value (#2471)
Co-authored-by: stas <statis@microsoft.com>
2022-09-29 19:20:24 -07:00
2155c48b99 Allow worker loops to continue after errors (#2469)
In `TimerWorkers`, if updating one entity fails and throws an exception, the whole update is abandoned. Instead, log the error and continue processing the remaining entities, so that progress is made even if one entity is stuck.
2022-09-30 12:40:38 +13:00
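
The pattern described in #2469 can be sketched in a few lines. This is a Python illustration of the idea, not the C# code from the change; `process_all` and its `update` callback are hypothetical names.

```python
import logging
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
log = logging.getLogger("timer_workers")


def process_all(entities: Iterable[T], update: Callable[[T], None]) -> List[T]:
    """Update every entity and return the ones whose update raised."""
    failed: List[T] = []
    for entity in entities:
        try:
            update(entity)
        except Exception:
            # Log and move on so one stuck entity cannot block the rest.
            log.exception("failed to update %r", entity)
            failed.append(entity)
    return failed
```

Returning the failed entities is a choice made for this sketch; it lets the caller report or retry them without stopping the loop.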
4c5023cb55 CLI: Retry on connection reset (#2468)
In our integration test runs we are seeing some connection-reset errors, which cause the CLI operation to fail.

To fix this:
1. Set TCP-KeepAlive to keep Azure load balancer connections alive longer than the default timeout (4 minutes).
2. Treat ConnectionResetError as retryable.
2022-09-30 11:23:29 +13:00
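
The two steps listed in #2468 might look roughly like this. It is a sketch under assumptions: `enable_keepalive` and `with_retries` are hypothetical helpers, not the CLI's actual functions. Enabling `SO_KEEPALIVE` only turns probes on; the idle time before the first probe is set with platform-specific options that are not shown here.

```python
import socket
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def enable_keepalive(sock: socket.socket) -> None:
    """Ask the OS to send TCP keep-alive probes on an otherwise idle connection."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)


def with_retries(
    operation: Callable[[], T], attempts: int = 3, delay_seconds: float = 1.0
) -> T:
    """Run operation, retrying when the peer resets the connection."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ConnectionResetError:
            if attempt == attempts:
                raise  # out of retries: surface the error to the caller
            time.sleep(delay_seconds)
    raise ValueError("attempts must be at least 1")
```

Only `ConnectionResetError` is retried in this sketch; any other exception propagates immediately.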
41f973184e Prefix target_exe with setup dir at use sites (#2405) 2022-09-29 13:47:04 -07:00
0c4cd5414d C#: Fix UpdateConfigs (#2463) 2022-09-29 07:00:04 +00:00
1ec9b13e55 Disable PoolName validation (#2459)
* Add comment

* Disable test
2022-09-29 04:18:11 +00:00
8f4cf9d3b6 Correct pool transitions (#2462) 2022-09-29 17:10:39 +13:00
0e2f651a35 fix null ref exception (#2460)
* fix null ref exception

* write out <null> if builder or message are null

Co-authored-by: stas <statis@microsoft.com>
2022-09-28 14:44:32 -07:00
9180215a10 Permit periods in Pool names (#2452) 2022-09-28 21:05:42 +00:00
0e9e32a934 EnsureNotNull doesn't support our custom formatter (#2458) 2022-09-28 19:18:51 +00:00
9a042724d7 Can create ado notifications (#2456)
* Can create ado notifications

* Missed a small issue
2022-09-28 11:04:39 -04:00
872c1070fc Fix: Node state getting reset to init (#2454)
When `isNew` is passed, creation should fail if a `Node` already exists. Instead, the existing `Node` was being overwritten.
2022-09-28 16:53:40 +13:00
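
The behaviour described in #2454 is the difference between insert and overwrite semantics. Here is a minimal Python sketch, assuming an in-memory store; `NodeStore`, `save`, and `state_of` are hypothetical and stand in for the real table storage code.

```python
from typing import Dict


class NodeStore:
    """In-memory stand-in for a table of node records keyed by machine ID."""

    def __init__(self) -> None:
        self._rows: Dict[str, str] = {}

    def save(self, machine_id: str, state: str, is_new: bool = False) -> bool:
        """Store the state; with is_new=True, refuse to overwrite an existing row."""
        if is_new and machine_id in self._rows:
            return False  # insert semantics: the existing record is left untouched
        self._rows[machine_id] = state
        return True

    def state_of(self, machine_id: str) -> str:
        """Return the stored state for a machine ID."""
        return self._rows[machine_id]
```

With this check in place, a second "create" for a node that is already busy reports failure and the stored state is not reset to `init`.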
b918720083 Update download-artifact action (#2453) 2022-09-27 22:38:17 +00:00
476c99a998 use InterpolatedStringHandler to move values to CustomDimensions Tags instead of keeping them in the error message (#2450)
* use InterpolatedStringHandler to move values to CustomDimensions Tags instead of keeping them in the error message

* log blob save raw response failure

* add StringBuilder to CSharpExtensions

Co-authored-by: stas <statis@microsoft.com>
2022-09-27 15:22:29 -07:00
b3748e4283 Handle 404 in Queue.RemoveFirstMessage (#2451) 2022-09-28 11:01:02 +13:00
e6d3b39d1a Bump protobuf from 3.20.0 to 3.20.2 in /src/api-service/__app__ (#2446)
Bumps [protobuf](https://github.com/protocolbuffers/protobuf) from 3.20.0 to 3.20.2.
- [Release notes](https://github.com/protocolbuffers/protobuf/releases)
- [Changelog](https://github.com/protocolbuffers/protobuf/blob/main/generate_changelog.py)
- [Commits](https://github.com/protocolbuffers/protobuf/compare/v3.20.0...v3.20.2)

---
updated-dependencies:
- dependency-name: protobuf
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-09-27 12:42:55 +13:00
5ee4cd045d Add Roslyn analyzer to check results are used (#2443)
As seen in #2441, it is easy to drop return values of updated entities accidentally.

This PR adds a Roslyn Analyzer which will detect when return values are unused. To explicitly ignore a value you can drop it with `_ = …;`

Closes #2442.
2022-09-26 22:26:06 +00:00
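
The mistake that #2443 guards against is easy to reproduce with any immutable record type. The snippet below is a Python analogue, not the C# analyzer itself; `Node` and `set_state` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Node:
    machine_id: str
    state: str


def set_state(node: Node, state: str) -> Node:
    """Return an updated copy; the argument itself is never modified."""
    return replace(node, state=state)


node = Node(machine_id="m1", state="init")

set_state(node, "busy")         # bug: the updated copy is silently dropped
still_init = node.state         # the original record is unchanged

node = set_state(node, "busy")  # correct: keep the returned value
_ = set_state(node, "done")     # explicit discard, mirroring the `_ = …;` form
```

Python does not flag the dropped result on the first call, which is the kind of silent error an analyzer for unused return values is meant to surface.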
bb2e8ad05e Fix logic to retrieve partitionKey and rowKey (#2447)
* Fix logic to retrieve partitionKey and rowKey

* Moved key getters to EntityConverter and added unit test
2022-09-26 22:01:00 +00:00
85fca0d945 Release 5.14.0 (#2440)
* Release-5.14.0

* removing pr

* Release-5.14.0

* Adding PRs 2441 & 2438

* Adding PR 2434
5.14.0
2022-09-23 16:19:02 -07:00
3f35d81f4b Adding New Default Image Config Value to IC. (#2434)
* Adding New Default Image Config Value to IC.

* Removing forced image setting.

* Updating Webhook Events.

* Removing typo.

* Updating webhook_events again.

* Syncing webhook events.

* Fixing check for os type.

* Fixing import.

* PR Suggestions.

* Fix C# Model Typo.

* Removing other refs to images.

* Removing remaining refs to images outside of models.

* Removing hardcoded image values from tests.

* Update Default Proxy and Repro Images.

Co-authored-by: Marc Greisen <mgreisen@microsoft.com>
2022-09-23 10:40:44 -07:00
dc2c4649c8 do not lose proxy objects when setting state (#2441)
Co-authored-by: stas <statis@microsoft.com>
2022-09-23 13:29:22 +12:00
e1851b0af4 Add more logs (#2438)
* add logs

* avoid relying on exceptions for logic flow control

* add logs to agent commands

* add more logs and fix error logging when table writes fail

* move machine ID to CustomDimensions

* log insert errors

* Log Delete failures

* more logs

* more logs

* more logs

* More logs (I think that's it there is no more...)

Co-authored-by: stas <statis@microsoft.com>
2022-09-23 11:28:37 +12:00
4f9682d3cf Do not fail task on notification failure (#2435)
* Do not fail task on notification failure

* Need to throw on the last iteration in order for it to go to poison queue

* lint
2022-09-22 21:05:07 +00:00
b14bade0fc cleanup queues for non-existent pools and non-existent tasks (#2433)
* cleanup queues for non-existent pools and non-existent tasks

* extra logs

Co-authored-by: stas <statis@microsoft.com>
2022-09-22 08:15:57 -07:00
1013e01a3d Release-5.14.0 (#2428)
* Release-5.14.0
2022-09-21 17:21:09 -07:00
de766dfa78 Delete pool queue when pool is deleted (#2431)
* Delete pool queue when pool is deleted

* Also delete shrink queue when pool is deleted

Co-authored-by: stas <statis@microsoft.com>
2022-09-21 14:39:30 -07:00
2f42cd74d4 Adding State Transition Error Logging. (#2425)
* Adding State Transition Logging.

* Fix log call.

* Removing logging statement.

* Fix error! call.

* Adding error to error message.

* Make err var borrowed.

* Formatting.
2022-09-21 12:54:28 -07:00
39c3736bef Adjust concurrency spec (#2426) 2022-09-20 22:12:53 +00:00
a6addbf83a Minor fixes (#2420)
* delete all temp files

* add try/catch and some logging when deleting temp files

Co-authored-by: stas <statis@microsoft.com>
2022-09-20 10:20:29 -07:00
f40a69a37f fix linux repro extensions (#2415)
* fix OMS Linux repro extension config

* Fixing lost Node state updates

* fix bug in ReproVmss

* rewrite ssh auth

* win azure function ssh-keygen fix

* more logs

* try -P

* use empty string for password

* use argument list

* addressing comments

Co-authored-by: stas <statis@microsoft.com>
Co-authored-by: George Pollard <gpollard@microsoft.com>
2022-09-20 08:15:26 -07:00
b647e4a1fe mark tasks as failed if a work unit cannot be created for the task (#2409)
* mark tasks as failed if a work unit cannot be created for the task

* fix up time queries

* query improvements

Co-authored-by: stas <statis@microsoft.com>
2022-09-17 12:50:41 -07:00
867cdbc06f Port SyncAutoscaleSettings from Python to C# (#2407)
* Port SyncAutoscaleSettings from Python to C#

* address comment

Co-authored-by: stas <statis@microsoft.com>
2022-09-16 07:58:37 -07:00
3f86fc8689 fix some bugs (#2406)
* - fix queries in timer retention

- do not discard proxy record after proxy state is processed, since that record needs to persist

* addressing comments

Co-authored-by: stas <statis@microsoft.com>
2022-09-15 15:45:39 -07:00
4f1ac523da Adding error message to catch Model HttpResponseError. (#2384)
* Adding error message to catch Model HttpResponseError.

* Changing error message.

* Formatting.
2022-09-15 14:44:23 -07:00
f22dee18df CodeQL needs explicit permissions to run (#2404) 2022-09-15 09:05:18 -04:00
4cc4de9c9e Codecov setup for C# & Rust code (#2400)
Use Codecov to show coverage reports, so we get highlighted versions of the files where it is easy to see missing coverage.

- Setup Rust coverage using [`cargo-llvm-cov`](https://github.com/taiki-e/cargo-llvm-cov).
- Add the `ci/agent.sh` build script to the agent artifact cache key, since it wasn't there before.
- Don't run Rust tests in `--release` mode (have been meaning to change this so doing it at the same time).

There is some subtlety about putting the coverage result into the cached agent artifact, so that when we reuse the agent artifact we can still upload the coverage information for it to Codecov. Without this it would look like the coverage had dropped.
2022-09-15 02:29:22 +00:00
61a797e224 Restore self-hosted configuration (#2394) 2022-09-14 23:59:41 +00:00
ac9d072e1d fix linux proxy extensions provisioning failures (#2401)
* fix linux proxy extensions provisioning failures

* format

Co-authored-by: stas <statis@microsoft.com>
2022-09-14 23:52:35 +00:00