onefuzz

mirror of https://github.com/microsoft/onefuzz.git synced 2025-06-16 11:58:09 +00:00

Author	SHA1	Message	Date
Joe Ranweiler	d2faf7c66d	Fix case of logger format string specifier (#1160) Fix a log statement with an invalid format string specifier. At runtime, the invalid specifier causes the service to throw a `ValueError`. This is typically invoked in the `agent_can_schedule` function [here](https://github.com/microsoft/onefuzz/blob/main/src/api-service/__app__/agent_can_schedule/__init__.py#L33).	2021-08-23 14:37:01 +00:00
bmc-msft	9ec7e7a20a	process all expired nodes rather than those not already marked for deletion (#1103) This makes sure debug_keep_node is reset and the rest of the reimage processing occurs regardless of reimage_requested and delete_requested being set. Without this, nodes that are marked `debug_keep_node` do not get reimaged/deleted.	2021-07-27 00:53:04 +00:00
bmc-msft	55366e751a	allow pools & scalesets set to `shutdown` to `halt` (#1104) Currently, if a pool or scaleset is set to `shutdown`, it cannot be set to `halt`. While moving from `halt` to `shutdown` would cause issues, moving from `shutdown` to `halt` is fine.	2021-07-23 13:14:47 +00:00
Cheick Keita	89b7d13125	Fix get_dead_nodes query (#1054)	2021-07-09 13:33:42 -04:00
bmc-msft	826ef8dd22	Pool shrink queue (#1050)	2021-07-08 10:23:54 -04:00
bmc-msft	45d468f2ce	set pool_id on node creation (#1049)	2021-07-07 17:58:24 -04:00
bmc-msft	52f83b5b26	add EventScalesetResizeScheduled (#1047)	2021-07-07 14:15:26 -04:00
bmc-msft	7b2679a1ce	make ShrinkQueue not scaleset specific (#1046)	2021-07-07 13:27:49 -04:00
bmc-msft	883c93aaf4	ensure VM IDs are unique before calling Azure reimage/delete APIs (#1023)	2021-06-25 11:54:52 -04:00
bmc-msft	5f8e423265	remove nodes from db upon reimage (#1005) The flag `Node.reimage_queued` is intended to stop nodes from reimaging repeatedly. In #970, in order to work around Azure API failures, this flag was cycled if the node was already set to cleanup. Unfortunately, reimaging can sometimes take a significant amount of time, causing that change to reimage nodes multiple times. Instead of using `reimage_queued` as a flag, this PR deletes the node from the storage table upon reimage. The Node will be recreated automatically when the node registers or the next time through `Scaleset.cleanup_nodes`, whichever comes first.	2021-06-23 22:25:15 +00:00
bmc-msft	50652c2e48	mark tasks as failed when the node is being reimaged due to heartbeat issues (#1015)	2021-06-23 16:39:47 -04:00
bmc-msft	b9950c5526	update log messages to ease debugging (#988)	2021-06-14 15:18:03 -04:00
bmc-msft	bcdae2d5cb	Check scaleset size for missing nodes (#984)	2021-06-11 18:47:21 -04:00
bmc-msft	2be1edd9dc	handle reimaging failures by resetting reimage_queued (#970) In a previous commit, reimage_queued was added to prevent reimaging a node while it is reimaging. However, this means nodes whose reimaging fails due to Azure issues never finish reimaging. This resets the flag, allowing the node to reimage in the following cleanup cycle.	2021-06-09 18:58:56 +00:00
bmc-msft	af39d25a7d	reimage/delete expired nodes even with the debug_keep_node flag (#968) Fixes #965	2021-06-08 17:37:10 +00:00
bmc-msft	b761908409	send NodeCommandStopIfFree on node shutdown (#940) If we move to shut down a single node but it's not doing work, it will wait until it picks up work to shut down. This shortcuts that.	2021-06-01 15:03:33 +00:00
bmc-msft	8b74d08d3d	fix deleting nodes with expired heartbeats (#930)	2021-05-26 13:06:44 -04:00
bmc-msft	ff140a6b1b	Stop tasks on nodes before deleting task queues (#801)	2021-05-17 18:59:13 +00:00
bmc-msft	cb5e786bcd	add event for scaleset state updates (#882) This moves all scaleset state updates through `Scaleset.set_state` and adds a new event EventScalesetStateUpdated.	2021-05-13 21:23:02 +00:00
bmc-msft	584f68065d	cleanup a handful of scaleset logs (#880)	2021-05-12 17:31:08 -04:00
bmc-msft	221a3316a1	Add StopIfFree node command to tell free nodes to stop asking for new work (#866)	2021-05-07 13:55:50 -04:00
bmc-msft	007ecf2efe	shutdown missing scalesets during resize (#860)	2021-05-06 12:00:09 -04:00
bmc-msft	ced21b2ea3	Add node messages to node get (#836) This exposes the node commands that have yet to be processed by the node. Example use case: The SDK can now ask "has this node installed my SSH key"	2021-04-26 16:14:58 -04:00
bmc-msft	f4b5c1ae73	when processing node updates, don't wait on the node in cases it should be stopped (#834) In situations when the node should be done, mark it as done without waiting for the node to respond to the Done command.	2021-04-26 15:19:46 -04:00
bmc-msft	cf3d904940	address formatting from black 21.4b0 (#831)	2021-04-26 12:35:16 -04:00
Cheick Keita	80b3533f83	Report the setup failure in the task when available (#781)	2021-04-09 08:57:56 -04:00
bmc-msft	3096f99e86	enable using ephemeral disks by default (#461)	2021-03-30 18:48:44 -04:00
bmc-msft	a3fdc74c53	handle exception related to manually deleted scalesets (#672) If a user manually deletes a scaleset managed by OneFuzz, then `get_vmss_size` returns None. When this happens, `Scaleset.shutdown` generates an exception from the `logging.info` call on line 573. This PR handles this edge condition.	2021-03-15 14:18:59 +00:00
bmc-msft	fb482e357e	don't schedule work to a node if the scaleset or pool is shutting down (#583)	2021-02-23 13:33:41 -05:00
bmc-msft	feb80ecb54	allow nodes with multiple tasks to continue on task stop (#567) As is, when multiple tasks are running on a single node, if any one of them stops, the node gets reimaged. This changes the behavior such that when a node with multiple tasks has one task stop, the other tasks will continue.	2021-02-19 23:54:26 +00:00
bmc-msft	8ce4638b8a	clarify scaleset logging (#568)	2021-02-19 19:36:16 +00:00
bmc-msft	929d9ce496	make user triggered reimaging happen immediately (#566)	2021-02-18 14:08:25 -05:00
bmc-msft	8c9f65c0be	add missing scaleset nodes (#518)	2021-02-08 13:50:08 -05:00
bmc-msft	1d74379a70	use the primitive types in more places (#514)	2021-02-05 13:10:37 -05:00
bmc-msft	e3dfcb8b95	Scalesets that are about to be deleted don't need updated configs (#511)	2021-02-05 09:53:29 -05:00
bmc-msft	3cb055d331	clarify message upon service & agent version mismatch (#510)	2021-02-04 19:58:45 -05:00
bmc-msft	a02e084522	split out node, scaleset, and pool code (#507)	2021-02-04 19:07:49 -05:00

37 Commits
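
The class of bug fixed in d2faf7c66d (#1160) can be reproduced with plain printf-style formatting in Python. This is a minimal sketch, not the service's code: the message text and the helpers `render` and `try_render` are hypothetical. It only shows that a conversion type with the wrong case (`%S` instead of `%s`) raises `ValueError` when the string is formatted. How the service's logging path surfaces that error is not shown here.

```python
def render(template: str, *args: object) -> str:
    """Apply printf-style formatting, as a %-style log call eventually does."""
    return template % args


def try_render(template: str, *args: object) -> str:
    """Return the rendered message, or a description of the formatting failure."""
    try:
        return render(template, *args)
    except ValueError as err:
        return f"ValueError: {err}"


# Lowercase %s is a valid conversion type.
ok = try_render("can schedule node: %s", "node-1")

# Uppercase %S is not a valid conversion type, so formatting raises ValueError.
bad = try_render("can schedule node: %S", "node-1")

print(ok)
print(bad)
```

Because the failure only occurs when the string is actually formatted, a statement like this can pass import and static checks and still fail at runtime, which matches the commit's description.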