onefuzz

mirror of https://github.com/microsoft/onefuzz.git synced 2025-06-16 03:48:09 +00:00

Author	SHA1	Message	Date
Stas	720c8dc466	Azure DevOps notifications not appearing (#1370 ) Co-authored-by: stas <statis@microsoft.com>	2021-10-19 08:50:00 -07:00
bmc-msft	22b2d62e29	enable configurable virtual network ranges (#1268 )	2021-09-27 18:01:32 +00:00
Noah McGregor Harper	599c400fa0	Custom Extension Instance Configuration (#1184 )	2021-09-24 12:27:39 -04:00
bmc-msft	3d1766271e	backdate SAS URLs to avoid time sync issues (#1195 )	2021-08-27 17:00:15 +00:00
bmc-msft	2a2844ae7a	enable configuring proxy VM sku (#1128 )	2021-08-23 16:04:59 +00:00
Joe Ranweiler	d2faf7c66d	Fix case of logger format string specifier (#1160 ) Fix a log statement with an invalid format string specifier. At runtime, the invalid specifier causes the service to throw a `ValueError`. This is typically invoked in the `agent_can_schedule` function [here](https://github.com/microsoft/onefuzz/blob/main/src/api-service/__app__/agent_can_schedule/__init__.py#L33).	2021-08-23 14:37:01 +00:00
bmc-msft	2fcb499888	Merge pull request from GHSA-q5vh-6whw-x745 * verify aad tenants, primarily needed in multi-tenant deployments * add logging and fix trailing slash for issuer * handle call_if* not supporting additional argument callbacks * add logging * include new datatype in webhook docs * fix pytypes unit tests Co-authored-by: Brian Caswell <bmc@shmoo.com>	2021-08-13 14:50:54 -04:00
bmc-msft	5a8a1c998e	Enable ado render testing (#1144 )	2021-08-12 16:38:49 +00:00
bmc-msft	338b541a94	expose `coverage` as an optional directory that gets synced to supervisor tasks (#1123 ) Addresses #1122	2021-08-06 19:13:23 +00:00
bmc-msft	39bd0d2ca7	don't overload `list` builtin (#1120 )	2021-08-02 13:51:35 -04:00
bmc-msft	cfe0ec8d5f	address lint issues (#1117 )	2021-08-02 12:05:10 -04:00
bmc-msft	95e2ecff3d	fix format in notification (#1115 )	2021-08-02 12:04:46 -04:00
bmc-msft	9ec7e7a20a	process all expired nodes rather than those not already marked for deletion (#1103 ) This makes sure debug_keep_node is reset and the rest of the reimage processing occurs regardless of reimage_requested and delete_requested being set. Without this, nodes that are marked `debug_keep_node` do not get reimaged/deleted.	2021-07-27 00:53:04 +00:00
bmc-msft	0e27256faf	Remove signalr from endpoints (#1102 ) This is a follow-on PR from #1100	2021-07-23 15:47:08 +00:00
bmc-msft	7e6a42cdd6	require {input} in target_env or target_options for generator and coverage tasks (#1106 ) Fixes #925	2021-07-23 14:58:42 +00:00
bmc-msft	b90ee03fd9	tasks must use pools not VMs (#1105 ) using config.vm was deprecated prior to 1.0.0	2021-07-23 14:10:51 +00:00
bmc-msft	55366e751a	allow pools & scalesets set to `shutdown` to `halt` (#1104 ) Currently, if a pool or scaleset is set to `shutdown`, it cannot be set to `halt`. While moving from `halt` to `shutdown` would cause issues, moving from `shutdown` to `halt` is fine.	2021-07-23 13:14:47 +00:00
bmc-msft	5be9c4dcee	relay SignalR integrations through a storage queue (#1100 ) The SignalR integration from Azure Functions does not have automatic retry. When the SignalR instance has issues, all other APIs fail. To make the service resilient to SignalR outages, this bounces SignalR events through an Azure Storage queue. NOTE: This PR does not remove the integration from all of the functions. That is intended to be done as a follow-on PR.	2021-07-22 18:10:20 +00:00
bmc-msft	ee3d0871f2	handle azure-mgmt expired auth tokens by clearing the client cache and retrying (#1099 ) In order to reduce how frequently the IMS is hit from the service, the service caches the azure-mgmt clients between API calls. While the management APIs should have some amount of authentication expiration redundancy built in, not all of them do. This is seen with `ClientAuthenticationError`, most often with the nested exception record of `ExpiredAuthenticationToken`. This wraps all of the compute layer functionality with a wrapper that checks if there has been an exception, and retries the request.	2021-07-22 18:01:02 +00:00
bmc-msft	3269dbb1aa	delete secret on object delete (#1085 )	2021-07-21 16:04:27 -04:00
bmc-msft	065272191e	Replace notifications by default (#1084 )	2021-07-20 18:39:31 -04:00
Cheick Keita	152dd190b7	Add more information to the logs of transient error (#1082 )	2021-07-16 17:52:06 -04:00
bmc-msft	39beb1591c	use managed identity reader access for scaleset configs (#1060 )	2021-07-13 13:20:50 -04:00
bmc-msft	7a7ded6b7e	force upgrade custom script extensions (#1059 )	2021-07-13 12:08:07 -04:00
Cheick Keita	89b7d13125	Fix get_dead_nodes query (#1054 )	2021-07-09 13:33:42 -04:00
bmc-msft	826ef8dd22	Pool shrink queue (#1050 )	2021-07-08 10:23:54 -04:00
bmc-msft	45d468f2ce	set pool_id on node creation (#1049 )	2021-07-07 17:58:24 -04:00
bmc-msft	52f83b5b26	add EventScalesetResizeScheduled (#1047 )	2021-07-07 14:15:26 -04:00
bmc-msft	7b2679a1ce	make ShrinkQueue not scaleset specific (#1046 )	2021-07-07 13:27:49 -04:00
bmc-msft	15063908b0	update azure-cli to 2.26.0 (#1045 )	2021-07-07 12:07:34 -04:00
bmc-msft	29dda54b83	instance wide configuration (#1010 ) TODO: * [x] add setting initial set of admins during deployment	2021-06-30 21:13:58 +00:00
Cheick Keita	1e90ed6092	Allow notifications to be retried when an error occurs (#1026 )	2021-06-30 14:05:25 -04:00
bmc-msft	883c93aaf4	ensure VM IDs are unique before calling Azure reimage/delete APIs (#1023 )	2021-06-25 11:54:52 -04:00
bmc-msft	10d2e3e366	update azure-keyvault-secrets to 4.3.0 (#1012 )	2021-06-23 18:27:32 -04:00
bmc-msft	5f8e423265	remove nodes from db upon reimage (#1005 ) The flag `Node.reimage_queued` is intended to stop nodes from reimaging repeatedly. In #970, in order to work around Azure API failures, this flag was cycled if the node was already set to cleanup. Unfortunately, reimaging can sometimes take a significant amount of time, causing this change to reimage nodes multiple times. Instead of using `reimage_queued` as a flag, this PR deletes the node from the storage table upon reimage. The Node will be recreated automatically when the node registers or the next time through `Scaleset.cleanup_nodes`, whichever comes first.	2021-06-23 22:25:15 +00:00
bmc-msft	50652c2e48	mark tasks as failed when the node is being reimaged due to heartbeat issues (#1015 )	2021-06-23 16:39:47 -04:00
bmc-msft	b9950c5526	update log messages to ease debugging (#988 )	2021-06-14 15:18:03 -04:00
bmc-msft	bcdae2d5cb	Check scaleset size for missing nodes (#984 )	2021-06-11 18:47:21 -04:00
bmc-msft	2be1edd9dc	handle reimaging failures by resetting reimage_queued (#970 ) In a previous commit, reimage_queued was added to prevent reimaging a node while it is reimaging. However, this means reimaging failures due to Azure issues don't finish reimaging. This resets the flag, allowing the node to reimage in the following cleanup cycle.	2021-06-09 18:58:56 +00:00
bmc-msft	da931b3a5c	address issues raised from latest mypy (#972 )	2021-06-09 12:04:24 -04:00
bmc-msft	af39d25a7d	reimage/delete expired nodes even with the debug_keep_node flag (#968 ) Fixes #965	2021-06-08 17:37:10 +00:00
bmc-msft	ed289c9a3c	handle scaleset resize exceptions (#967 )	2021-06-08 09:30:36 -04:00
Joe Ranweiler	2c72bd590f	Add generic coverage task (#763 ) Todo: - [x] Finalize format for coverage file(s) - [x] Add service support - [x] Integration test - [x] Merge #926 - [x] Merge #929	2021-06-03 23:36:00 +00:00
bmc-msft	a92c84d42a	work around issue with discriminated typed unions (#939 ) We're experiencing a bug where Unions of sub-models are getting downcast, which causes a loss of information. As an example, EventScalesetCreated was getting downcast to EventScalesetDeleted. I have not figured out why, nor can I replicate it locally to minimize the bug to send upstream, but I was able to reliably replicate it on the service. While working through this issue, I noticed that deserialization of SignalR events was frequently wrong, leaving things like tasks as "init" in `status top`. Both of these issues are Unions of models with a type field, so it's likely these are related.	2021-06-02 16:40:58 +00:00
bmc-msft	60ae07c34f	handle azure-storage deleting nonexistent containers (#948 )	2021-06-02 15:11:33 +00:00
bmc-msft	b761908409	send NodeCommandStopIfFree on node shutdown (#940 ) If we move to shutdown a single node but it's not doing work, it will wait until it picks up work to shutdown. This shortcuts that.	2021-06-01 15:03:33 +00:00
bmc-msft	0a6021bfa1	prevent object id collision in hide_secrets (#936 ) this fixes an issue related to object id reuse that can occur making the object identification cache fail. Instead, this simplifies the hide_secrets to always recurse and use setattr to always set the value based on the recursion. Note, the object id reuse issue was seen in the `events.filter_event_recurse` development and this was the fix for the id reuse there. Python documentation states: id(object): Return the “identity” of an object. This is an integer (or long integer) which is guaranteed to be unique and constant for this object during its lifetime. Two objects with non-overlapping lifetimes may have the same id() value.	2021-05-27 08:28:02 -04:00
bmc-msft	d557fc16c6	mark tasks that are stopped that never started with an error (#935 )	2021-05-26 18:42:21 -04:00
bmc-msft	c107a04cf9	fix issue deleting proxy from storage tables (#932 )	2021-05-26 13:33:22 -04:00
bmc-msft	8b74d08d3d	fix deleting nodes with expired heartbeats (#930 )	2021-05-26 13:06:44 -04:00


258 Commits