balena-supervisor

mirror of https://github.com/balena-os/balena-supervisor.git synced 2024-12-27 01:11:05 +00:00

Author	SHA1	Message	Date
Christina Ying Wang	7eba48f8b8	Improve tests surrounding Engine-host race patch See: #2170 Change-type: patch Signed-off-by: Christina Ying Wang <christina@balena.io>	2023-06-19 11:11:26 -07:00
Christina Ying Wang	9e249e6ae8	Remove unnecessary async/await from method Signed-off-by: Christina Ying Wang <christina@balena.io>	2023-06-19 11:11:26 -07:00
Christina Ying Wang	6e6f79c71d	Decrease wait time before start from 60s to 30s 60 seconds to wait may be excessively long. Signed-off-by: Christina Ying Wang <christina@balena.io>	2023-06-19 11:11:26 -07:00
Christina Ying Wang	ace642ea0f	Improve naming of a util function & add unit test isOlderThan -> isValidDateAndOlderThan See: https://github.com/balena-os/balena-supervisor/pull/2170#discussion_r1226809686 Signed-off-by: Christina Ying Wang <christina@balena.io>	2023-06-19 11:11:26 -07:00
Christina Ying Wang	ab80f198d8	Add exitCode property to Service class Since we need to conditionally query the service's exit code during step inference, adding the exitCode property keeps the step inference function pure. See: https://github.com/balena-os/balena-supervisor/pull/2170#discussion_r1226805153 Signed-off-by: Christina Ying Wang <christina@balena.io>	2023-06-19 11:11:26 -07:00
flowzone-app[bot]	7e24f095cc	v14.11.4	2023-06-19 07:56:46 +00:00
flowzone-app[bot]	96d2c6af64	Merge pull request #2177 from balena-os/specify-fs-type-when-mounting-partitions Specify fs type when mounting partitions to prevent "Can't open blockdev" warnings	2023-06-19 07:55:21 +00:00
Christina Ying Wang	e6662f664c	Specify fs type when mounting partitions to prevent "Can't open blockdev" warnings Change-type: patch Signed-off-by: Christina Ying Wang <christina@balena.io>	2023-06-16 13:46:41 -07:00
flowzone-app[bot]	0521d97c96	v14.11.3	2023-06-15 19:49:47 +00:00
flowzone-app[bot]	51ad257e7f	Merge pull request #2173 from balena-os/renovate/balena-io-deploy-to-balena-action-0.x Update balena-io/deploy-to-balena-action action to v0.27.0	2023-06-15 19:48:57 +00:00
Self-hosted Renovate Bot	1675c16622	Update balena-io/deploy-to-balena-action action to v0.27.0 Update balena-io/deploy-to-balena-action Change-type: patch Changelog-entry: Update balena-io/deploy-to-balena-action to v0.27.0	2023-06-08 11:15:42 -07:00
Balena CI	d3f9821895	v14.11.2	2023-06-05 18:53:19 +00:00
flowzone-app[bot]	ce9ba9aac1	Merge pull request #2170 from balena-os/handle-engine-host-resource-race-condition Handle Engine-host race condition	2023-06-05 18:52:38 +00:00
Christina Ying Wang	2537eb8189	Handle the case of 'on-failure' restart policy As explained in the comments of this commit, a container with the restart policy of 'on-failure' with a non-zero exit code matches the conditions for the race, so the Supervisor will also attempt to start it. A container with the 'no' restart policy that has been started once will not be started again. If a container with 'no' has never been started, its service status will be 'Installed' and the Supervisor will already try to start it until success, so the service with 'no' doesn't require special handling. Signed-off-by: Christina Ying Wang <christina@balena.io>	2023-06-05 11:05:58 -07:00
Christina Ying Wang	95f3e13d50	Add extra delay after state engine integration tests This ensures target state has settled (since it seems that the 'applied' status that's reported isn't 100% accurate and the actual Engine state may lag behind slightly) Signed-off-by: Christina Ying Wang <christina@balena.io>	2023-05-31 11:33:27 -07:00
Christina Ying Wang	7f32141958	Handle Engine-host race condition for "always" and "unless-stopped" restart policy There exists a race condition between Engine and a host resource that may not be immediately created. In this race condition, if a container's compose config depends on the existence of that host resource, such as a network interface, and the Engine tries to create & start the container before the host resource is created, the Engine will not reattempt to start the container, regardless of the restart policy. This is undesirable behavior but seems to be the behavior as implemented by Docker. To rectify this, the Supervisor state funnel noops for a grace period of 1 minute after starting a container to see that the container's status has become 'running'. If the container exits because of the race condition, the status becomes 'exited' and the Supervisor will attempt to generate another start step. This noop-wait-start step loop will repeat until the container is able to start. If the container is never able to start, there was a problem in the host in the creation of the host resource, and that should be fixed at the host level. This commit does not handle the case of services with restart policies "no" or "on-failure" which encounter this host race, as metadata from container inspects needs to be introduced during step calculation in order to figure out whether services with those restart policies need to be started. This will be fixed in a future PR. Change-type: patch Signed-off-by: Christina Ying Wang <christina@balena.io>	2023-05-31 11:32:19 -07:00
Balena CI	e6c136d6cd	v14.11.1	2023-05-11 22:07:34 +00:00
flowzone-app[bot]	09f975395e	Merge pull request #2168 from balena-os/fix-contract-arch-test Fix `sw.arch` typo when testing contracts	2023-05-11 22:06:50 +00:00
Felipe Lalanne	2758e190b2	Fix `sw.arch` typo when testing contracts Change-type: patch	2023-05-11 13:07:26 -04:00
Balena CI	ec363c305a	v14.11.0	2023-05-10 12:39:17 +00:00
flowzone-app[bot]	3b6878fd80	Merge pull request #2167 from balena-os/hw-arch-contract Add `arch.sw` to supported container requirements	2023-05-10 12:38:21 +00:00
Felipe Lalanne	8656bd62f7	Add `arch.sw` to the valid container requirements Change-type: minor	2023-05-09 15:44:26 -04:00
Felipe Lalanne	f1f09e0e27	Allow using slug to validate hw.device-type contract This also adds the hw.device-type test case to the unit tests. Change-type: patch	2023-05-09 15:20:18 -04:00
Felipe Lalanne	a884a58b4c	Simplify and move lib/contract.spec.ts to tests/unit Improve contract tests to remove dependence on stubs and unnecessary system calls. Change-type: patch	2023-05-09 15:20:12 -04:00
Balena CI	eec4d06909	v14.10.11	2023-05-08 20:35:29 +00:00
flowzone-app[bot]	196bc820b1	Merge pull request #2166 from balena-os/hdmi-port-1-docs Add information about hdmi port 2 config vars	2023-05-08 20:34:43 +00:00
Felipe Lalanne	d5cc8238cb	Add information about hdmi port 2 config vars Support for colon characters was added in v14.6.0, which enabled configurations for HDMI port 2 (e.g. on the RPi 4). These configurations are not documented anywhere else, so this allows users to better find the relevant information for working with HDMI. Change-type: patch Relates-to: #2090	2023-05-08 15:21:28 -04:00
Felipe Lalanne	ba39cf539e	Update table formatting on configurations.md For better readability on text editor Change-type: patch	2023-05-08 15:15:37 -04:00
Balena CI	6148ed6ed1	v14.10.10	2023-05-03 16:01:39 +00:00
flowzone-app[bot]	4087782e80	Merge pull request #2165 from balena-os/mtoman/detect-crypt-mounts mount-partitions.sh: Add support for encrypted partitions	2023-05-03 16:00:51 +00:00
Michal Toman	0045928944	mount-partitions.sh: Add support for encrypted partitions After a recent change enforcing all the partitions to be on the same block device, encrypted partitions are no longer being detected correctly. This is because the assumption that the parent block device is a substring of the actually mounted block device does not work for LUKS devices - the mount will either be /dev/mapper/luks-XXX or /dev/dm-X while the parent device is still e.g. /dev/sda. The usual balenaOS boot partition is also split in two - boot and efi. The boot partition (mounted under /mnt/boot) is encrypted and the efi partition (mounted under /mnt/efi) is not. This patch generalizes the detection of the parent device so that it works with both encrypted and unencrypted partitions. Change-type: patch Signed-off-by: Michal Toman <michalt@balena.io>	2023-05-03 16:29:16 +02:00
Balena CI	c8d7b28a7e	v14.10.9	2023-05-03 14:26:38 +00:00
flowzone-app[bot]	cf21b093a6	Merge pull request #2164 from balena-os/klutchell-patch-1 Run test supervisor under a different service name	2023-05-03 14:25:58 +00:00
Kyle Harding	33b29cfa22	Run test supervisor under a different service name The docker compose V2 spec no longer accepts `network_mode: bridge`, which means we can no longer override the network configuration of the `balena-supervisor` service for tests. For this reason we now create a separate service to run the built supervisor `balena-supervisor-sut` and run API tests against this service instead of the default `balena-supervisor`. Change-type: patch	2023-05-03 09:33:22 -04:00
Balena CI	f6e0683032	v14.10.8	2023-04-26 18:49:44 +00:00
flowzone-app[bot]	bc969c8c89	Merge pull request #2161 from balena-os/network-plus-service-bug Fix device state not applied when a network change happens during the update	2023-04-26 18:48:55 +00:00
Felipe Lalanne	5fdd689590	Fix service comparison when creating component steps A bug in service comparison would make it that a device already running a service from a new release with network changes would never stop the running service so remaining services would forever get stuck in `Downloaded` state. This fixes the comparison so the service will get killed in this case, particularly allowing devices to recover from #1576 Change-type: patch	2023-04-26 11:58:48 -04:00
Felipe Lalanne	7b8b187c74	Create tests with recovery from #1576 Devices affected by the bug described in #1576 are also stuck with some services in the `Downloaded` state, because the state engine does not detect that the running services should be killed on a network change even if they belong to a new release. This is a bug, which can be replicated by the tests in this commit. Change-type: patch	2023-04-26 11:58:42 -04:00
Felipe Lalanne	7aecaae8b0	Skip updateMetadata step if there are network changes Previous behavior would make it that an `updateMetadata` step would take precedence over a `kill` step when network changes are present. This would lead to an inconsistent state if an update included a network and a container change. Closes: #1576 Change-type: patch	2023-04-25 14:47:00 -04:00
Felipe Lalanne	0a358a4463	Add replication of issue using unit tests Change-type: patch	2023-04-25 14:47:00 -04:00
Felipe Lalanne	138aec5de4	Add integration tests for state-engine These tests use the supervisor API to check that applying a target state allows the device to eventually get to the desired target configuration. These are high-level tests that work with real images and containers using dind. Change-type: patch	2023-04-25 14:47:00 -04:00
Felipe Lalanne	c1207cbbff	Do not pass auth to images with no registry The supervisor allows the target image to be an image without a registry (e.g. `alpine:latest`). While this really only happens in local mode, we don't want to pass credentials to the default registry, as those credentials are meant for the balena registry and will otherwise fail. Change-type: patch	2023-04-25 14:47:00 -04:00
Balena CI	d3be730c8e	v14.10.7	2023-04-21 23:04:21 +00:00
flowzone-app[bot]	48951d0333	Merge pull request #2153 from balena-os/local-mode Refactor state engine to be able to use current state as target	2023-04-21 23:03:37 +00:00
Felipe Lalanne	6c031299d6	Remove safeStateClone function This function is no longer needed with the latest changes to getCurrentState Change-type: patch	2023-04-20 14:58:58 -04:00
Felipe Lalanne	36311ef7a1	Get rid of targetVolatile in app manager Target volatile doesn't make sense now that we can use the current state as a target. It wasn't actually being used for anything anymore apparently Change-type: patch	2023-04-20 14:58:58 -04:00
Felipe Lalanne	1e0dd381f5	Make pausingApply a private member of device-state This simplifies this module interface and hides implementation details from the rest of the code. The function `applyIntermediateTarget` will now call `pausingApply` before applying the target. API actions no longer need to call `pausingApply`. Change-type: patch	2023-04-20 14:58:58 -04:00
Felipe Lalanne	3d43f7e3b3	Simplify doRestart and doPurge actions The actions now work by passing an intermediate state to the state engine. - doPurge first removes the user app from the target state and passes that to the state engine for purging. Since intermediate state doesn't remove images, this will have the effect of basically re-installing the app. - doRestart modifies the target state by first removing only the services from the current state but keeping volumes and networks. This has the same effect as before where services were stopped one by one Change-type: patch	2023-04-20 14:58:58 -04:00
Felipe Lalanne	43630e5267	Fix network appUuid inference in local mode Local mode uses a numeric `appUuid` which was messing up parsing the network name. This fixes this issue so the current state can be used as a target state Change-type: patch	2023-04-20 14:58:58 -04:00
Felipe Lalanne	b1fc4e1761	Get image name from DB when getting the app current state The Service class in `compose/service.ts` cannot get the image name from the image id when building the object from the container metadata. We query the metadata in the application manager getCurrentApps method so the current state can be used as target by API methods Change-type: patch	2023-04-20 14:58:58 -04:00

4570 Commits