balena-supervisor

mirror of https://github.com/balena-os/balena-supervisor.git synced 2025-06-07 18:11:40 +00:00

Author	SHA1	Message	Date
Christina Ying Wang	8c69166271	Support target state apply cancellation The current target state apply is cancelled when either: - /v1/update is called with cancel: true - A different target state is received from the cloud (with a non-304 status) Following apply cancellation, a target state apply is re-triggered. This ensures that the user can force a device out of a dead-locked situation where a long-running task such as an image fetch fails to cede control back to the Supervisor, which is the behavior observed in an Engine bug with infinite pull retries with a bad network. Change-type: minor Signed-off-by: Christina Ying Wang <christina@balena.io>	2025-05-28 07:46:10 -07:00
Christina Ying Wang	0af915d815	Pass AbortSignal to image pull functions When abortController.abort() is called, this signal is passed down to the functions that interface with Docker Engine for image pulls, cancelling those pulls. The next commit will limit when abortController.abort() is called. Signed-off-by: Christina Ying Wang <christina@balena.io>	2025-05-27 11:09:19 -07:00
Felipe Lalanne	4318272844	Remove unsupported fields from contract requirements A contract including extra requirement fields, such as "name" would fail validation. This PR removes any extra fields from the validated contract to prevent services with these extra fields from getting rejected by the contract validation. Change-type: patch	2025-05-15 17:38:03 -04:00
Christina Ying Wang	b596c77ce2	Add Docker network label if custom ipam config In a target release where the only change is the addition or removal of a custom ipam config, the Supervisor does not recreate the network due to ignoring ipam config differences when comparing current and target network (in network.isEqualConfig). This commit implements the addition of a network label if the target compose object includes a network with custom ipam. With the label, the Supervisor will detect a difference between a network with a custom ipam and a network without, without needing to compare the ipam configs themselves. This is a major change, as devices running networks with custom ipam configs will have their networks recreated to add the network label. Closes: #2251 Change-type: major See: https://balena.fibery.io/Work/Project/Fix-Supervisor-not-recreating-network-when-passed-custom-ipam-config-1127 Signed-off-by: Christina Ying Wang <christina@balena.io>	2025-03-24 14:55:19 -07:00
Felipe Lalanne	7764f98c9d	Start a dependent if all dependencies are started The previous behavior required that dependencies were running before starting the dependent service. This meant that services dependent on a one-shot service would not get started, which goes against the default docker behavior. Depending on a service to be running will require the implementation of [long syntax depends_on](https://docs.docker.com/reference/compose-file/services/#long-syntax-1) and the condition `service_healthy`. Change-type: patch Closes: #2409	2025-03-20 14:51:32 -03:00
Pagan Gazzard	49163e92a0	Decrease balenaCloud api request timeout from 15m to 59s This was mistakenly increased due to confusion between the timeout for requests to the supervisor's api vs the timeout for requests from the supervisor to the balenaCloud api. This separates the two configs and documents the difference between the timeouts whilst also decreasing the timeout for balenaCloud api requests to the correct/expected value Change-type: patch	2025-03-04 12:29:18 +00:00
Felipe Lalanne	51f1fb0f30	Refactor device-config as part of device-state Move the device-config module to the device-state folder and export only those functions that are needed elsewhere in the codebase This moves us closer to making the device-state module the only way to modify application and configuration. Change-type: patch	2025-01-09 14:31:43 -03:00
Felipe Lalanne	8e6c0fcad7	Wait for service dependencies to be running This fixes a regression where dependencies would only be started in order and would start the dependent service if its dependency had been started at some point in the past, regardless of the running condition. This makes the behavior more consistent with docker compose where the [dependency needs to be running or healthy](`69a83d1303/pkg/compose/convergence.go (L441)`) for the service to be started. Change-type: patch	2024-12-13 16:22:11 -03:00
Christina Ying Wang	828bd22ba0	Add PowerFanConfig config backend This config backend uses ConfigJsonConfigBackend to update os.power and os.fan subfields under the "os" key, in order to set power and fan configs. The expected format for os.power and os.fan settings is: ``` { os: { power: { mode: string }, fan: { profile: string } } } ``` There may be other keys in os which are not managed by the Supervisor, so PowerFanConfig backend doesn't read or write to them. Extra keys in os.power and os.fan are ignored when getting boot config and removed when setting boot config. After this backend writes to config.json, host services os-power-mode and os-fan-profile pick up the changes, on reboot in the former's case and at runtime in the latter's case. The changes are applied by the host services, which the Supervisor does not manage aside from streaming their service logs to the dashboard. Change-type: minor Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-12-09 18:43:51 -08:00
Christina Ying Wang	54fcfa22a7	Support "os" key with object values in ConfigJsonConfigBackend Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-12-09 18:29:26 -08:00
Christina Ying Wang	9ec45a724b	Add tests for ConfigJsonConfigBackend Also deprecate path-getting method, and remove OS version check. The OS version itself is not used in ConfigJsonConfigBackend, so it seems the OS version check is to confirm the existence of config.json during class init, because OS version is a field that's always there in a valid config.json. Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-12-09 18:29:26 -08:00
Christina Ying Wang	c610710f03	Move logger.ts into logging/index.ts Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-12-05 21:55:09 -08:00
Felipe Lalanne	a2d4b31b23	Take update locks for host-config changes This adds update-lock support to hostname changes via the host-config endpoint, in addition to proxy changes as changing the hostname may cause an engine restart from the OS. Change-type: minor	2024-12-03 15:07:24 -03:00
Felipe Lalanne	9c09329b86	Clean up remaining locks on state settle Locks could remain from a previous supervisor run that didn't get to settle the state. This ensures that cleanup will happen for remaining locks every time the state is settled. Change-type: patch	2024-11-27 16:40:58 -03:00
Felipe Lalanne	3c6e9dd209	Refactor update-locks implementation The refactor simplifies the implementation and ensures that locks per app can only be held by one supervisor task at a time. Change-type: patch	2024-11-27 16:40:50 -03:00
Felipe Lalanne	d8f54c05e7	Refactor lockfile module Updated interfaces for clarity Change-type: patch	2024-11-15 18:25:50 -03:00
Christina Ying Wang	3d3f659f16	Delete apps not in target from db by appUuid instead of appId Resolve an issue in balenaMachine instances that were installed at <v14.1.0, in which a Supervisor app with random UUID is kept in the target db due to its appId being the same, even after the BM instance has upgraded to v14.1.0 which patches the correct reserved Supervisor app UUIDs in. This results in two Supervisors running on devices under the BM instance which persists after BM upgrade. See: https://balena.fibery.io/search/T7ozi#Inputs/Pattern/Two-supervisors-are-running-on-device-3370 Change-type: patch Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-11-04 14:15:55 -08:00
Felipe Lalanne	e9a52e6786	Store rejected apps in the database This moves from throwing an error when an app is rejected due to unmet requirements (because of contracts) to storing the target with a `rejected` flag on the database. The application manager filters rejected apps when calculating steps to prevent them from affecting the current state. The state engine uses the rejection info to generate the state report. Change-type: minor	2024-08-30 10:52:11 -04:00
Felipe Lalanne	227fee9941	Set the app update status when reporting state Change-type: minor	2024-08-30 10:52:11 -04:00
Christina Ying Wang	eaa07e97a9	Add support for redsocks dnsu2t config Users may specify dnsu2t config by including a `dns` field in the `proxy` section of PATCH /v1/device/host-config's body: ``` { network: { proxy: { dns: '1.1.1.1:53', } } } ``` If `dns` is a string, ADDRESS and PORT are required and should be in the format `ADDRESS:PORT`. The endpoint will error with code 400 if either ADDRESS or PORT is missing. `dns` may also be a boolean. If true, defaults will be configured. If false, the dns configuration will be removed. If `proxy` is patched to empty, `dns` will be removed regardless of its current or input configs, as `dns` depends on an active redsocks proxy to function. Change-type: minor Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-08-28 14:01:51 -07:00
Felipe Lalanne	b088b78a3e	Do not write `noProxy` to redsocks.conf This fixes a regression introduced by the refactor in #2329 where `noProxy` was being included in the data added to redsocks.conf. Change-type: patch	2024-08-08 11:59:20 -04:00
Felipe Lalanne	a255001c2e	Verify that LED_FILE exists on blinking setup Before v1, the blinking module would not throw when the passed led file does not exist. This change checks for file existence and defaults to `/dev/null` otherwise Change-type: patch	2024-08-07 15:33:07 -04:00
Felipe Lalanne	f38714d40f	Cleanup images after state-engine tests Tests on GitHub started failing recently because of leftover images from the state engine test suite. This fixes that issue to allow tests to pass. Change-type: patch	2024-07-16 16:33:52 -04:00
Christina Ying Wang	f99ccb58c6	Remove unnecessary exports from host-config This limits the host-config interface to necessary methods only Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-07-03 16:47:51 -07:00
Christina Ying Wang	53f5641ef1	Refactor host-config to be its own module The host-config module exposes the following interfaces: get, patch, and parse. `get` gets host configuration such as redsocks proxy configuration and hostname and returns it in an object of type HostConfiguration. `patch` takes an object of type HostConfiguration or LegacyHostConfiguration and updates the hostname and redsocks proxy configuration, optionally forcing the patch through update locks. `parse` takes a user input of unknown type and parses it into type HostConfiguration or LegacyHostConfiguration for patching, erroring if parse was unsuccessful. LegacyHostConfiguration is a looser typing of the user input which does not validate values of the five known proxy fields of type, ip, port, username, and password. We should stop supporting it in the next major Supervisor API release. Change-type: minor Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-07-03 16:47:51 -07:00
Christina Ying Wang	be986a62a5	Add HostConfig.parse method Parses input from PATCH /v1/device/host-config into either type HostConfiguration, or LegacyHostConfiguration if the input is of an acceptable shape (for backwards compatibility). Once input has been determined to be of type HostConfiguration, we can easily extract ProxyConfig from the object for patching, stringifying, and writing to redsocks.conf. Change-type: minor Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-07-03 16:47:51 -07:00
Christina Ying Wang	1e224be0cd	Add RedsocksConf.parse method This is part of the host-config refactor which enables easier encoding to / decoding from `redsocks.conf`. Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-07-03 16:45:06 -07:00
Christina Ying Wang	725d7790fb	Move noProxy handling to separate module Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-06-28 11:34:27 -07:00
Christina Ying Wang	0cf5a4bf18	Move hostname get/set to separate "module" (directory) Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-06-28 11:34:21 -07:00
Felipe Lalanne	5d93f358aa	Create replicating test This adds a test to check the case where a service image changes along with networks being removed.	2024-06-24 15:54:19 -04:00
Felipe Lalanne	9497eed380	Move device-state/target state to api-binder/poll This goes in the direction of grouping modules by responsibility. The api-binder module is the middleware between the device and the backend, thus the target state polling code makes more sense there. Change-type: patch	2024-06-03 11:40:46 -04:00
Felipe Lalanne	ac2db38742	Move api-keys module to src/lib This removes circular dependencies between the device-api module and the compose module, reducing total circular dependencies to 15 Change-type: patch	2024-05-27 14:36:03 -04:00
Felipe Lalanne	234e0de075	Move composition types to compose/types This reduces circular dependencies from 250 to 80 by ensuring that modules that only require types do not import the full module with all its dependencies. Change-type: patch	2024-05-27 14:36:03 -04:00
Felipe Lalanne	94de4006a0	Split compose types into interface and implementation This splits `App`, `Network`, `Service` and `Volume` which used to be defined as classes into an interface and a class implementation that is not exported. This will allow to work with just the types in some cases and prevent circular dependencies when importing. Change-type: patch	2024-05-27 14:36:03 -04:00
Christina Ying Wang	9c968b8d06	Move lib/fs-utils tests to testfs This removes mock-fs as a dependency Change-type: patch Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-04-24 13:37:45 -07:00
Felipe Lalanne	6f02b17968	Refactor MDNS resolver into a module Also add integration tests for the resolver functionality to prevent regressions. Change-type: patch	2024-04-23 19:23:32 -04:00
Christina Ying Wang	57207c3539	Add additional update lock tests for lockOverride & force Change-type: patch Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-04-15 14:46:44 -04:00
Christina Ying Wang	6e185fbd44	Don't follow symlinks when checking for lockfiles The Supervisor should only care whether a lockfile exists or not. This also fixes an edge case where a user symlinked a lockfile to a nonexistent file, causing the Supervisor to enter an error loop as it was not able to `stat` the nonexistent file. Change-type: patch Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-04-12 10:34:46 -04:00
Christina Ying Wang	f863075bdc	Add memory usage healthcheck This healthcheck fails when Supervisor memory usage is above a threshold based on initial memory measurements after device state has settled. Change-type: patch Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-04-11 18:16:47 -07:00
Christina Ying Wang	8ac2ce4677	Respect lockOverride when taking locks Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-04-06 00:59:04 -07:00
Christina Ying Wang	fd7d58f89a	Clean up lockfiles on takeLock step failure We don't want any Supervisor lockfiles to remain on the device when a takeLock step fails because this would interfere with the user app. Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-04-04 14:07:47 -07:00
Christina Ying Wang	fb1bd33ab6	Refine update locking interface * Remove Supervisor lockfile cleanup SIGTERM listener * Modify lockfile.getLocksTaken to read files from the filesystem * Remove in-memory tracking of locks taken in favor of filesystem * Require both `(resin-)updates.lock` to be locked with `nobody` UID for service to count as locked by the Supervisor Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-04-04 14:07:47 -07:00
Christina Ying Wang	10f294cf8e	Add takeLock to state funnel A takeLock step should be generated before any of the following steps: * kill * start * stop * updateMetadata * restart * handover ALL services in an app will be locked for any of the above actions, unless the action is generated through Supervisor API's `POST /v2/applications/:appId/(start|stop|restart)-service` endpoints, in which case only the target service will be locked. A lock will be taken for a service before it starts by creating the directory in /tmp before the Engine creates it through bind mounts. Also, the commit simplifies the generation of service kill steps from network/volume changes or removals. Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-04-04 14:07:47 -07:00
Christina Ying Wang	cf8d8cedd7	Simplify lock interface to prep for adding takeLock to state funnel This commit changes a few things: * Pass `force` to `takeLock` step directly. This allows us to remove the `lockFn` used by app manager's action executors, setting takeLock as the main interface to interact with the update lock module. Note that this commit by itself will not pass tests, as no update locking occurs where it once did. This will be amended in the next commit. * Remove locking functions from doRestart & doPurge, as this is the only area where skipLock is required. * Remove `skipLock` interface, as it's redundant with the functionality of `force`. The only time `skipLock` is true is in doRestart/doPurge, as those API methods are already run within a lock function. We removed the lock function which removes the need for skipLock, and in the next commit we'll add locking as a composition step to replace the functionality removed here. * Remove some methods not in use, such as app manager's `stopAll`. Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-04-04 14:07:47 -07:00
Christina Ying Wang	af6359f7ae	Take lock before updating service metadata Change-type: minor Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-04-04 14:07:47 -07:00
Christina Ying Wang	e6df78a22b	Implement takeLock composition step + tests This commit only implements the action that a takeLock step results in. It does not add takeLock step generation logic to the state funnel yet. Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-04-04 14:07:47 -07:00
Christina Ying Wang	f2843e1382	Add update lock release functionality to state funnel releaseLock is a step that will be inferred if there are services in target state, and if some of those services have locks taken by the Supervisor. The releaseLock composition step calls the method of the same name in the updateLock module, which takes the exclusive process lock before disposing all Supervisor lockfiles in the target appId. This is half of the update lock incorporation into the state funnel, as we also need to introduce a takeLock step which triggers during crucial stages of device state transition. Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-04-04 14:07:47 -07:00
Christina Ying Wang	d18a740a40	Add methods for easier checking of lockfile existence Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-04-04 14:07:47 -07:00
Felipe Lalanne	6217546894	Update typescript to v5 This also updates code to use the default import syntax instead of `import * as` when the imported module exposes a default. This is needed with the latest typescript version. Change-type: patch	2024-03-05 15:33:56 -03:00
Felipe Lalanne	988a1c9e9a	Update @balena/lint to v7 This updates balena lint to the latest version to enable eslint support and unblock Typescript updates. This is a huge number of changes as the linting rules are much more strict now, requiring modifying most files in the codebase. This commit also bumps the test dependency `rewire` as that was interfering with the update of balena-lint Change-type: patch	2024-03-01 18:27:30 -03:00
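
The `dns` field rules described in commit eaa07e97a9 above (string in `ADDRESS:PORT` form, or a boolean, with a 400 error when either part is missing) can be sketched roughly as follows. This is a minimal TypeScript illustration under stated assumptions, not the Supervisor's code: the names `parseDnsField`, `DnsTarget`, and `DnsParseResult` are hypothetical, and the port-range check is an extra assumption that the commit message does not mention.

```typescript
// Hypothetical sketch of the `dns` validation described in eaa07e97a9.
// None of these names come from the balena-supervisor codebase.
type DnsTarget = { address: string; port: number };

type DnsParseResult =
  | { ok: true; value: DnsTarget | boolean }
  | { ok: false; status: 400; message: string };

function parseDnsField(input: string | boolean): DnsParseResult {
  // Booleans pass through: true requests defaults, false removes the config.
  if (typeof input === 'boolean') {
    return { ok: true, value: input };
  }
  // Split on the last colon so the port is always the final segment.
  const idx = input.lastIndexOf(':');
  const address = idx === -1 ? '' : input.slice(0, idx);
  const portText = idx === -1 ? '' : input.slice(idx + 1);
  // Both ADDRESS and PORT are required; PORT must be all digits.
  if (address === '' || !/^\d+$/.test(portText)) {
    return { ok: false, status: 400, message: 'dns must be in the format ADDRESS:PORT' };
  }
  const port = Number(portText);
  // Assumption: reject ports outside the valid TCP/UDP range.
  if (port < 1 || port > 65535) {
    return { ok: false, status: 400, message: 'dns port must be between 1 and 65535' };
  }
  return { ok: true, value: { address, port } };
}
```

With this sketch, `parseDnsField('1.1.1.1:53')` produces an address and port, while `parseDnsField('1.1.1.1')` and `parseDnsField(':53')` both produce a 400 result because one half of `ADDRESS:PORT` is missing.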


131 Commits