balena-supervisor

mirror of https://github.com/balena-os/balena-supervisor.git synced 2025-03-15 08:41:03 +00:00

Author	SHA1	Message	Date
Christina Ying Wang	1dcd156fc8	Update @balena/contrato to 0.9.4 Change-type: patch Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-04-30 16:39:49 -07:00
Pagan Gazzard	4adf710520	Update @types dependencies Change-type: patch	2024-04-29 16:29:07 +01:00
Felipe Lalanne	ae823fea18	Update docker related dependencies This bumps dockerode, removes resin-docker-build in favor of @balena/compose, and updates docker-delta and docker-progress packages. Change-type: patch	2024-04-26 12:03:04 -04:00
Felipe Lalanne	6f02b17968	Refactor MDNS resolver into a module Also add integration tests for the resolver functionality to prevent regressions. Change-type: patch	2024-04-23 19:23:32 -04:00
Felipe Lalanne	ad52561de5	Fix mdnsResolver import The `mdns-resolver` module does not provide a default export. Trying to use a default import notation, causes the `resolve` function to not be found, breaking MDNS resolution. Change-type: patch	2024-04-23 19:23:32 -04:00
Christina Ying Wang	14bdc522c1	Gracefully handle multiple reboot/shutdown requests Since HTTP's server.close() is async, there is a slim chance for two instances of /v1/reboot or /v1/shutdown to be processed. If the server is already closed when server.close() is called, the call throws ERR_SERVER_NOT_RUNNING which doesn't need to be surfaced to the user. This change only allows one server.close() attempt to occur at a time. Change-type: patch Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-04-23 12:59:44 -07:00
Christina Ying Wang	6e185fbd44	Don't follow symlinks when checking for lockfiles The Supervisor should only care whether a lockfile exists or not. This also fixes an edge case where a user symlinked a lockfile to a nonexistent file, causing the Supervisor to enter an error loop as it was not able to `stat` the nonexistent file. Change-type: patch Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-04-12 10:34:46 -04:00
Christina Ying Wang	f863075bdc	Add memory usage healthcheck This healthcheck fails when Supervisor memory usage is above a threshold based on initial memory measurements after device state has settled. Change-type: patch Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-04-11 18:16:47 -07:00
Christina Ying Wang	8ac2ce4677	Respect lockOverride when taking locks Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-04-06 00:59:04 -07:00
Christina Ying Wang	b7922e6875	Fix some RegEx io-ts types io-ts types that were generated using `shortStringWithRegex` were testing against `VAR_NAME_REGEX`, instead of the Regex that was specified when generating the type. This affected `DockerName` such that service names with a dash in the middle were returning as false when passed through the `DockerName.is` type guard, affecting how `getServicesLockedByAppId` was returning a map of locked services. Change-type: patch Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-04-06 00:20:34 -07:00
Christina Ying Wang	7220e994dc	Log takeLock and releaseLock steps as system events Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-04-04 14:07:47 -07:00
Christina Ying Wang	fd7d58f89a	Clean up lockfiles on takeLock step failure We don't want any Supervisor lockfiles to remain on the device when a takeLock step fails because this would interfere with the user app. Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-04-04 14:07:47 -07:00
Christina Ying Wang	fb1bd33ab6	Refine update locking interface * Remove Supervisor lockfile cleanup SIGTERM listener * Modify lockfile.getLocksTaken to read files from the filesystem * Remove in-memory tracking of locks taken in favor of filesystem * Require both `(resin-)updates.lock` to be locked with `nobody` UID for service to count as locked by the Supervisor Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-04-04 14:07:47 -07:00
Christina Ying Wang	10f294cf8e	Add takeLock to state funnel A takeLock step should be generated before any of the following steps: * kill * start * stop * updateMetadata * restart * handover ALL services in an app will be locked for any of the above actions, unless the action is generated through Supervisor API's `POST /v2/applications/:appId/(start|stop|restart)-service` endpoints, in which case only the target service will be locked. A lock will be taken for a service before it starts by creating the directory in /tmp before the Engine creates it through bind mounts. Also, the commit simplifies the generation of service kill steps from network/volume changes or removals. Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-04-04 14:07:47 -07:00
Christina Ying Wang	cf8d8cedd7	Simplify lock interface to prep for adding takeLock to state funnel This commit changes a few things: * Pass `force` to `takeLock` step directly. This allows us to remove the `lockFn` used by app manager's action executors, setting takeLock as the main interface to interact with the update lock module. Note that this commit by itself will not pass tests, as no update locking occurs where it once did. This will be amended in the next commit. * Remove locking functions from doRestart & doPurge, as this is the only area where skipLock is required. * Remove `skipLock` interface, as it's redundant with the functionality of `force`. The only time `skipLock` is true is in doRestart/doPurge, as those API methods are already run within a lock function. We removed the lock function which removes the need for skipLock, and in the next commit we'll add locking as a composition step to replace the functionality removed here. * Remove some methods not in use, such as app manager's `stopAll`. Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-04-04 14:07:47 -07:00
Christina Ying Wang	af6359f7ae	Take lock before updating service metadata Change-type: minor Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-04-04 14:07:47 -07:00
Christina Ying Wang	e6df78a22b	Implement takeLock composition step + tests This commit only implements the action that a takeLock step results in. It does not add takeLock step generation logic to the state funnel yet. Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-04-04 14:07:47 -07:00
Christina Ying Wang	f2843e1382	Add update lock release functionality to state funnel releaseLock is a step that will be inferred if there are services in target state, and if some of those services have locks taken by the Supervisor. The releaseLock composition step calls the method of the same name in the updateLock module, which takes the exclusive process lock before disposing all Supervisor lockfiles in the target appId. This is half of the update lock incorporation into the state funnel, as we also need to introduce a takeLock step which triggers during crucial stages of device state transition. Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-04-04 14:07:47 -07:00
Christina Ying Wang	7cfc42e197	Separate rwlock functionality from update-lock for clarity Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-04-04 14:07:47 -07:00
Christina Ying Wang	d18a740a40	Add methods for easier checking of lockfile existence Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-04-04 14:07:47 -07:00
Christina Ying Wang	b9a6a6b685	Improve types & remove some lodash from state engine Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-04-04 14:07:47 -07:00
Shreya Patel	b5dbef82d7	Add revpi-connect-4 to RPi variants We need the supervisor to be able to manage config.txt changes for the RevPi Connect 4. Change-type: patch Signed-off-by: Shreya Patel <shreya@dynamicdevices.co.uk>	2024-03-27 11:55:15 +00:00
Pagan Gazzard	20e57f7f16	Log the full error on device state report failure as it is more useful The message can be an empty string or similarly unhelpful, therefore logging the entire error means that we will have whatever the message may be along with the stack trace and other info that will be helpful even when the message is not Change-type: patch	2024-03-25 15:17:09 -03:00
Pagan Gazzard	6b0500cdbc	Set @balena/es-version to es2022 to match tsconfig.json Change-type: patch	2024-03-25 16:56:27 +00:00
Pagan Gazzard	5cd37e73ac	Increase the timeout for auto select family to 5000ms to avoid issues On slower networks the default of 250ms can cause problems as all attempts will fail rather than only the ones for interfaces that do not actually work correctly. Increasing this timeout to 5000ms will help to avoid these issues Change-type: patch	2024-03-25 15:05:13 +00:00
Felipe Lalanne	08727ed2b5	Remove dependency on @balena/happy-eyeballs Node 20 now implements the happy eyeballs algorithm as part of its core `net` module, with the [autoSelectFamily](https://nodejs.org/docs/latest-v20.x/api/net.html#netgetdefaultautoselectfamily) option of `socket.connect`. This option defaults to `true`, meaning that a separate implementation of happy eyeballs is no longer needed. Change-type: patch	2024-03-06 15:16:33 -03:00
Felipe Lalanne	b77dba2046	Update Node to v20 This updates the supervisor runtime to latest Node LTS version. There are no breaking changes related to this bump. Change-type: patch	2024-03-06 12:29:54 -03:00
Felipe Lalanne	6217546894	Update typescript to v5 This also updates code to use the default import syntax instead of `import * as` when the imported module exposes a default. This is needed with the latest typescript version. Change-type: patch	2024-03-05 15:33:56 -03:00
Felipe Lalanne	988a1c9e9a	Update @balena/lint to v7 This updates balena lint to the latest version to enable eslint support and unblock Typescript updates. This is a huge number of changes as the linting rules are much more strict now, requiring modifying most files in the codebase. This commit also bumps the test dependency `rewire` as that was interfering with the update of balena-lint Change-type: patch	2024-03-01 18:27:30 -03:00
Felipe Lalanne	bda1bac04c	Add support for repeated overlays RPI firmware configuration allows repeating overlays to define configurations on multiple devices. For instance, for configuring multiple `ads` devices, `config.txt` needs to be setup this way ``` dtoverlay=ads1115,addr=0x48 dtoverlay=ads1115,addr=0x49 ``` Before this change, the supervisor would interpret both lines as belonging to the same overlay, preventing users from configuring multiple devices, and leading to a loop when trying to apply configurations with repeated overlays coming from the cloud side. Change-type: minor	2024-02-27 14:52:41 -03:00
Christina Ying Wang	3fd035c5bd	Patch default dtparam handling in config.txt This commit completes the list of default / board-wide dtparams to include some `baudrate` and `vc` i2c params. Change-type: patch Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-02-21 12:45:29 -08:00
Christina Ying Wang	e22253ce6e	Patch config.txt backend to return array configs correctly Previously, getBootConfig() of the config.txt backend was omitting array configurations such as gpio settings, thus resulting in the SV mistakenly assuming that boot config had not been applied, since gpio would not be in current config.txt config but would be in target config. This resulted in SV entering an infinite loop of attempting to apply the gpio config when it wasn't necessary. Change-type: patch Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-02-16 18:12:33 -08:00
Felipe Lalanne	6e6a796da5	Add special case for base DTO params on RPI config While ordering is important in the RPI firmware configuration file (config.txt), some dt params are by default considered part of the base dt overlay if they are not used by other overlays. Unfortunately the [list of dtparams](https://github.com/raspberrypi/firmware/blob/master/boot/overlays/README#L133) is too long to add all of them as exceptions, but we can add the params used in the default config.txt provided in OS images, to avoid reboots when updating to this new supervisor and correctly parsing the provisioning config.txt as variables. While this addition handles most common scenarios, there is still a chance a user may have used other base overlay dt params in the initial config, in which case those will be interpreted according to the relative ordering Change-type: patch	2024-02-08 15:48:10 -03:00
Felipe Lalanne	9546a1a3b1	Fix processing of dtoverlay/dtparams on config.txt DT overlays and DT params need to be consumed in the order that they appear on the file. DT params apply to the last dtoverlay defined on the file, or to the base overlay. This commit updates config.txt parsing to consider this ordering, and it also ensures global dtparams are written first so they cannot be overridden by later overlays. Because of the more strict parsing method, it is possible that existing HOST_CONFIG vars do not match the interpretation of the parser. If that's the case, the supervisor will re-apply the target state which will cause the device to reboot. Change-type: major	2024-02-08 15:46:07 -03:00
Felipe Lalanne	a8e371f0c9	Refactor config-txt backend Cleans up code and adds better type detection	2024-02-07 20:39:41 -03:00
Christina Ying Wang	3afcef2969	Respect update strategies app-wide instead of at the service level Fixes behavior for release updates which removes a service in current state and adds a new service in target state. Change-type: patch Closes: #2095 Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-01-29 12:26:28 -08:00
Felipe Lalanne	dec39a35d4	Try MDNS lookup only if regular DNS lookup fails This is meant to allow users to configure their device to resolve `.local` queries via dnsmasq by modifying config.json, e.g. `"dnsServers": "/bob.local/172.17.0.33"`. This would fail before as MDNS lookups would always come first Change-type: minor	2024-01-03 14:42:23 -03:00
Felipe Lalanne	7a39da92b7	Refactor mdns lookup code in app entry Change-type: patch	2024-01-03 14:42:23 -03:00
Felipe Lalanne	3ea8d4727a	Force remove container if updateMetadata fails The `updateMetadata` step renames the container to match the target release when the service doesn't change between releases. We have seen this step fail because of an engine bug that seems to relate to the engine keeping stale references after container restarts. The only way around this issue is to remove the old container and create it again. This implements that workaround during the updateMetadata step to deal with that issue. Change-type: minor Relates-to: balena-os/balena-engine#261	2023-11-22 14:16:44 -03:00
Christina Ying Wang	eb8ad11cd7	Cache last reported current state to /mnt/root/tmp Whenever the Supervisor reports current state, it diffs the current state with its last reported current state. However, when the Supervisor starts up, there is no last reported state, since that last report is stored in process memory. Caching the last report in a location that survives Supervisor restarts will reduce the current report bandwidth used on startup. Change-type: patch Signed-off-by: Christina Ying Wang <christina@balena.io>	2023-11-14 16:15:36 -08:00
Christina Ying Wang	d440776881	Convert current state types to io-ts Signed-off-by: Christina Ying Wang <christina@balena.io>	2023-11-08 16:00:54 -08:00
Christina Ying Wang	a993b3e7af	Set applyInProgress to true while applying intermediate state Intermediate state is utilized when executing device actions such as a volume purge. It's a type of state apply, but despite that, applyInProgress is not true. Change-type: patch Signed-off-by: Christina Ying Wang <christina@balena.io>	2023-10-25 10:32:10 -07:00
Felipe Lalanne	9bd216327f	Expose ports from port mappings on services PR #2217 removed the expose configuration but also caused a regression where ports set via the `ports` configuration would no longer get exposed to the host, despite portmappings being set. This fixes that issue by exposing only those ports coming from port mappings. Change-type: patch	2023-10-24 15:04:39 -03:00
Felipe Lalanne	416170bc05	Ignore `expose` service compose configuration The docker EXPOSE directive and corresponding docker-compose `expose` service configuration serves as documentation/metadata that a container listens on a certain port that may be used for service discovery but it doesn't have any real impact on the ability for other containers on the same network to access the exposed service via the port. In newer engine implementations, this property may conflict with other network configurations, and prevent the container from being started by the docker engine (see #2211). This PR removes code that would manage the expose property and takes the property out of the whitelist. A composition with the `expose` property will result in the log message `Ignoring unsupported or unknown compose fields: expose`. While this change should not have operational impact, it still removes a previously supported configuration and as such there is a chance of it being a breaking change for some applications. For this reason it is being published as a new major version. Change-type: major Closes: #2211	2023-10-23 11:41:32 -03:00
Felipe Lalanne	b107868765	Add note regarding API jitter on target state poll Change-type: patch	2023-10-23 14:11:20 +01:00
Pagan Gazzard	e15205301c	Switch some _.includes usage to native versions Change-type: patch	2023-10-16 14:30:25 -03:00
Pagan Gazzard	a4a9a17c1a	Switch _.assign usage to native versions Change-type: patch	2023-10-16 14:30:25 -03:00
Pagan Gazzard	d0cb54537f	Switch _.isNaN usage to native versions Change-type: patch	2023-10-16 14:30:25 -03:00
Pagan Gazzard	3bfdc4454e	Switch _.isUndefined usage to native versions Change-type: patch	2023-10-16 14:30:25 -03:00
Pagan Gazzard	8e23091aa9	Switch _.isNull usage to native versions Change-type: patch	2023-10-16 14:30:25 -03:00
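
The ordering rule described in commits 9546a1a3b1 and bda1bac04c (each `dtparam` applies to the last `dtoverlay` seen, or to the base overlay, and a repeated `dtoverlay` starts a new entry) can be illustrated with a minimal sketch. This is not the supervisor's actual config.txt backend: the `Overlay` type, the `parseOverlays` function, and the `base` label are hypothetical names chosen for the example, and the special-casing of default base params from commit 6e6a796da5 is left out.

```typescript
// Hypothetical shape for illustration only; not the supervisor's real types.
interface Overlay {
  name: string; // overlay name, or 'base' for board-wide params
  params: string[]; // raw parameter strings, in file order
}

// Walk config.txt top to bottom, attaching each dtparam to the most
// recent dtoverlay, or to the base overlay if none has been seen yet.
function parseOverlays(configTxt: string): Overlay[] {
  const base: Overlay = { name: 'base', params: [] };
  const overlays: Overlay[] = [base];
  let current = base;

  for (const raw of configTxt.split('\n')) {
    const line = raw.trim();
    if (line === '' || line.startsWith('#')) {
      continue; // skip blank lines and comments
    }
    const eq = line.indexOf('=');
    if (eq < 0) {
      continue; // not a key=value line
    }
    const key = line.slice(0, eq).trim();
    const value = line.slice(eq + 1).trim();

    if (key === 'dtoverlay') {
      // Every dtoverlay line opens a new entry, even if the name repeats.
      const parts = value.split(',');
      current = { name: parts[0] ?? '', params: parts.slice(1) };
      overlays.push(current);
    } else if (key === 'dtparam') {
      current.params.push(...value.split(','));
    }
  }
  return overlays;
}
```

Under these assumptions, the two `ads1115` lines quoted in commit bda1bac04c produce two separate entries rather than one merged overlay, and a `dtparam` line placed before any `dtoverlay` lands on the `base` entry.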

1588 Commits