331 Commits

Author SHA1 Message Date
46f4aac805 Default must be greater than or equal to minimum (#2248) 2022-08-12 14:40:55 -04:00
660d943824 Adjust auto scale to scale down nodes on shutdown (#2232)
* Only scale down when scale set in shutdown state

* Bug fix + explaining the logic a bit better

* Fix some bugs

* linting and bug fixes

* lint

* Actually now

* I'm not writing sql

* last try

* It's working

* lint

* Small docs update
2022-08-11 18:26:39 +00:00
4fa6e74241 Enable .NET functions in check-pr for Agent-specific functions (#2119)
Enable the .NET functions for the agent by sending the agent the URI for the `-net` service.

Also fix some things causing failures when using the .NET functions (`CouldShrinkScaleset` was not implemented).

Improve error handling around table serialization/deserialization, fix an issue with int64/long mismatch between Python & C# code.

----

For `check-pr` testing:

1. There's a new parameter `enable_dotnet` which maps directly to the `--enable_dotnet` switch on `deploy.py`.
2. If you put `agent` there, all the `agent_*` functions will be enabled for .NET and disabled for Python.
3. If `agent_can_schedule` is disabled on the Python side, it will automatically tell the agent to use the .NET functions.

So to test the .NET agent functions, do a `check-pr` run with `enable_dotnet` set to `agent` and it should all work.
2022-07-20 20:40:30 +00:00
3347e7e67b Use .get instead of lookup (#2165) 2022-07-14 23:55:30 +00:00
0b8093b89e Reuse Agent artifacts if nothing in src/agent changes (#2115)
The agent build takes most of the CI runtime, so improve it by only rebuilding if the pre-reqs or something inside `src/agent` changes.

We will always skip the cache for builds on tags and from the `main` branch, so that version is stamped correctly there.
2022-07-12 22:09:32 +00:00
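The caching rule described in commit 0b8093b89e can be sketched roughly as follows. This is only an illustration of the stated policy, not the CI configuration from the repository: the function name `can_reuse_agent_artifacts`, the ref-string format, and the `prereq_paths` parameter are all hypothetical.

```python
from typing import Iterable


def can_reuse_agent_artifacts(
    git_ref: str,
    changed_paths: Iterable[str],
    prereq_paths: Iterable[str] = (),
) -> bool:
    """Return True if cached agent artifacts may be reused (hypothetical helper)."""
    # Builds on tags and on main always skip the cache so that the
    # version is stamped correctly.
    if git_ref.startswith("refs/tags/") or git_ref == "refs/heads/main":
        return False

    # Assumption: the caller supplies the list of pre-requisite files;
    # the commit message does not say which files these are.
    prereqs = set(prereq_paths)
    for path in changed_paths:
        # Any change inside src/agent, or to a pre-requisite, forces a rebuild.
        if path.startswith("src/agent/") or path in prereqs:
            return False
    return True
```

Under these assumptions, a feature-branch build that only touches documentation would reuse the cached artifacts, while any build from `main` or a tag would rebuild.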
f37224e8bb Add dotnet coverage task (#2062)
* checkpoint

* some more progress

* more progress

* More progress

* Now it's time to test it

* It works locally 🎉

* Attempting clean build

* fmt

* temporarily stub out macos

* missed a few

* please be the last one

* .

* .

* .

* no-op change to unstick actions

* .

* .

* Fix setup script

* Some fixes

* It works except for a race condition -- use a directory watcher to fix it

* It works end to end!

* Execute the commands using tokio's structs and timeout mechanism

* It works.... for real this time

* Undo timer changes

* Cleanup

* 🧹

* Fix import

* .

* PR comments

* Fix clippy

* Clippy whyyy

* Only check dotnet path once

* fmt

* Fix a couple more comments
2022-07-06 16:13:45 -04:00
61fc091f88 Make the log sas url last as long as the job duration (#2116) 2022-07-02 01:40:02 +00:00
30daae2215 Potential solution for TLS errors in OneFuzz (#2087)
* proposed fix from here:

https://github.com/Azure/azure-functions-durable-python/issues/194#issuecomment-710670377

* Update src/api-service/__app__/__init__.py

Co-authored-by: George Pollard <porges@porg.es>

Co-authored-by: stas <statis@microsoft.com>
Co-authored-by: George Pollard <porges@porg.es>
Co-authored-by: George Pollard <gpollard@microsoft.com>
2022-06-29 08:24:40 -07:00
52ccf05a29 Remove deprecated libfuzzer_coverage task (#2021)
- Remove the ability to create or execute a `libfuzzer_coverage` task
- Preserve the enum variant in `onefuzztypes` to prevent errors when deserializing old data
- Remove doc references to `libfuzzer_coverage`
2022-06-13 12:38:35 -07:00
9989189e60 Adding Node State to Node Heartbeat (#2024)
* Adding Node State to Node Heartbeat.

* Updating docs.

* Fixing webhook events.

* Formatting.

* Resetting type.

* Updating param.

* Setting to nodestate.
2022-06-13 10:13:57 -07:00
60b304a220 handle messages that are too big to fit in a queue message (#2020)
* handle messages that are too big to fit in a queue message

* tests

Co-authored-by: stas <statis@microsoft.com>
2022-06-06 12:16:47 -07:00
79cc5d54d3 Fix require_admin_privileges Logic. (#2016) 2022-06-03 15:59:08 -07:00
0d14ca1ed1 pin protobuf (#1985) 2022-05-26 17:42:59 +00:00
91922ce01a enable python functions (#1907)
- queue_node_heartbeat
- queue_task_heartbeat

disable dotnet functions
- QueueNodeHeartbeat
- QueueTaskHeartbeat
2022-05-06 17:50:33 -07:00
5393dbab65 Replace queue_task_heartbeat (#1899)
* Replace queue_task_heartbeat

* don't rename statsFormat

* using hashsets for the helpers
2022-05-05 11:02:36 -07:00
3370c3df9a Replace queue_node_heartbeat (#1875)
* replace queue_node_heartbeat

* Changing the name of the input queue

* [testing] hard code deployment of -net function

* explicitly disable other C# functions
2022-05-04 17:09:50 -07:00
b2399c4571 allow jobs with no log data to be scheduled (#1893) 2022-05-04 20:09:19 +00:00
44059f20ca Adding Admin Checks to Node Operations. (#1779)
* Adding Admin Checks to Node Operations.

* Importing function.

* Changing naming convention.

* Fixing webhook events.

* Adding changes to scaleset init.
2022-04-27 11:31:43 -07:00
cb45c5685f add tool_name and onefuzz_version to CrashReport (#1635) 2022-04-18 23:56:07 +00:00
87eb606b35 Delete nodes when they're done (#1763)
* Delete nodes when they're done

* Missed a file

* Load node disposal strategy from env var

* Lint

* Fix subtle bug

* Deleting doesn't work, will 'decommission' nodes once they complete work

* Missed a file

* Remove logging line
2022-04-12 17:32:15 +00:00
8299d8fb57 Using existing auto scale settings isn't an error (#1745) 2022-04-06 12:41:58 +00:00
7add51fd3a Log redirection, service side (#1727)
* Setting up the service side of log management
- a log is created or reused when we create a job
- when scheduling the task we send the log location to the agent
The expected log structure looks like
{fuzzContainer}/logs/{job_id}/{task_id}/{machine_id}/1.log

* regenerate docs

* including job_id in the container name

* regenerating docs
removing bad doc file
2022-03-29 18:47:20 +00:00
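The log layout given in commit 7add51fd3a can be illustrated with a small path builder. This is a minimal sketch assuming the IDs are UUIDs; the function name `log_blob_path` and the `index` parameter are hypothetical and not taken from the service code.

```python
from uuid import UUID


def log_blob_path(
    fuzz_container: str,
    job_id: UUID,
    task_id: UUID,
    machine_id: UUID,
    index: int = 1,
) -> str:
    """Build a path of the form
    {fuzzContainer}/logs/{job_id}/{task_id}/{machine_id}/1.log
    """
    # Assumption: the trailing number is a per-machine log index starting at 1;
    # the commit message only shows "1.log".
    return f"{fuzz_container}/logs/{job_id}/{task_id}/{machine_id}/{index}.log"
```

Grouping by job, then task, then machine means all logs for one job share a single prefix, which is convenient when listing or cleaning them up.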
424ffdb4b5 Adding auto scale via cli (#1717)
* Initial implementation for adding auto scale via cli

* Remove unused argument

* Remove unused import

* I had a 👻extra line👻
2022-03-29 09:50:07 -04:00
24454e3681 pin click to fix black (#1726)
* pin click to fix black

* missed a couple

Co-authored-by: stas <statis@microsoft.com>
2022-03-28 12:06:13 -07:00
ce03394376 Handle instance being destroyed before updating scaling protection (#1719)
* Handle instance being destroyed before updating scaling protection

* Fix bug where we release protection too early
2022-03-28 10:14:39 -04:00
5c418eeb36 Add autoscaling diagnostics (#1708)
* Initial attempt

* Adding diagnostics works

* 🧹

* lint

* I wish the linter could auto fix these issues

* Lint
2022-03-21 13:25:21 +00:00
40b0e6685a Give function app resource group scoped contributor role (#1698)
* Give function app resource group scoped contributor role

* Reenable autoscaling

* We don't know what the minimum capacity for a sku is yet

* Lint
2022-03-09 13:07:22 -05:00
4d1c1f5713 Abstract out node disposal (#1686)
* Abstract node disposal strategy

* Cleanup + lint

* Handle scalesets possibly being in resize state

* Setting the size is still exposed via CLI, we don't want to break that functionality

* PR comments
2022-03-08 13:30:34 -05:00
7c507ab7c7 Remove dependency on onefuzz deployment role to unblock (#1693) 2022-03-04 18:51:57 +00:00
16166e1c14 Create autoscale resources for scaleset (#1661)
* Initial progress toward adding an auto scale resource

* auto scale API is ready

* When creating a scaleset, add an autoscale resource to it as well

* Auto scale is correctly linked with scaleset

* 🧹

* Lint

* Cleaned up
2022-02-28 17:28:31 +00:00
5d8516bd70 Enable scale in protection on VMSS instances (#1647)
* draft attempt at adding scaling protection

* Service can now control scaling protection policy on VM instances

* Improve logging a bit

* Error message was missing info

* Linter

* Don't schedule work if we can't protect the node

* Last of the linter changes
2022-02-14 14:56:55 +00:00
77dcd57b46 Add EventGrid compatible webhook format (#1640) 2022-02-11 16:39:19 -08:00
62731f3836 bump azure-cli-core and azure-cli to 2.32.0 (#1634) 2022-02-02 17:38:46 -08:00
809db31186 fix node serialization (#1627)
* fix node serialization

* remove intermediate variable

* mypy fix
2022-01-31 18:05:28 +00:00
4e6f496e06 bump azure-mgmt-subscription to 3.0.0 (#1606) 2022-01-28 18:57:12 -08:00
6100191aaf Fixing VMSS Re-Image 7-Day Timer (#1616)
* Fixing VMSS Re-Image 7-Day Timer

* Updating use of TimeStamp to created_at

* Renaming.

* Updating field name.

* Removing test change.

* Updating query to work if init_at entry does not exist yet.

* Changing timer for testing.

* Adding field comment.

* Formatting models.py

* Fixing where save is called.

* Adding logging.

* Removing logging. Ready for merge.

* Update src/pytypes/onefuzztypes/models.py

Co-authored-by: Joe Ranweiler <joe@lemma.co>

* Formatting.

* Updating datetime.

* Testing after datetime change.

* Removing test.

Co-authored-by: nharper285 <nharper285@gmail.com>
Co-authored-by: Joe Ranweiler <joe@lemma.co>
2022-01-26 17:40:27 -08:00
f374801d35 Re-Image Functionality Now Includes 'Upgrade' Call (#1612)
* Re-Image Functionality Now Includes 'Upgrade' Call

* Fixing call.

* Fixing testing change.

* Shrinking timedelta even more.

* Update nodes.py

* Update scalesets.py

* Update nodes.py

* Update nodes.py

* Fixing merge.

* Revert "Fixing merge."

This reverts commit ab4d2a54c3.

* Adding comment and logging to new upgrade call.

* Removing old logging statement.

Co-authored-by: nharper285 <nharper285@gmail.com>
2022-01-26 10:42:03 -08:00
464940d716 bump azure-mgmt-compute to 24.0.1 (#1599) 2022-01-22 08:25:06 -08:00
13e52308f2 Bump azure-identity to 1.7.1 (#1586) 2022-01-19 16:09:37 -08:00
6901cd15ff bump azure-functions to 1.8.0 (#1582) 2022-01-14 22:34:01 -08:00
5083dc6e03 bump azure-core to 1.21.1 (#1579) 2022-01-13 09:34:44 -08:00
9092276d96 bump jinja2 to 3.0.3 (#1577) 2022-01-12 18:30:25 -08:00
1731ca4dde Bump requests to 2.27.1 (#1567)
Co-authored-by: stas <statis@microsoft.com>
2022-01-12 07:29:34 -08:00
f505ece25f Fixing proxy tag issue. (#1568) 2022-01-07 13:10:10 -08:00
83e48e7e7b Adding new InstanceConfig value for VMSS & VM tags (#1560)
* Adding new instanceconfig value for tags.

* Removing bad import.

* Updating where tags are generated.

* Updating tag generation for scalesets.

* Updating tag generation in vm.

* Updating vm tag generation.

* Updating vm tag generation.

* Fixing extension.

* Fixing import.

* Fixing typing.

* Fixing get_vm calls.

* Fixing calls to get_vm.

* Fixing optional tag.
2022-01-05 13:16:03 -08:00
5515aa1819 Move call to check_access to call_if (#1472)
* Move call to check_access to call_if

* fix logic

* Update src/api-service/__app__/onefuzzlib/endpoint_authorization.py
2022-01-04 00:34:03 +00:00
91630f2a28 Adding AutoConfig Properties. (#1541)
Co-authored-by: nharper285 <nharper285@gmail.com>
2021-12-21 13:41:40 -08:00
bb972c22f4 pin mypy to 0.910 (#1531)
https://github.com/samuelcolvin/pydantic/issues/3528

https://github.com/python/mypy/issues/6617#issuecomment-892438903
https://github.com/samuelcolvin/pydantic/pull/3175#issuecomment-914897604

updating mypy in build yml and requirements to 0.910

Co-authored-by: stas <statis@microsoft.com>
2021-12-16 14:13:54 -08:00
08691c007f Integration tests reliability fixes (#1505)
* only reimage nodes that are in the done state

* ignore done message when the node is deleted

* log warning instead of error when receiving a heartbeat from a deleted node
2021-12-03 10:08:30 -08:00
aa74550160 Group membership check (#1074) 2021-11-22 14:06:03 -08:00