balena-supervisor

mirror of https://github.com/balena-os/balena-supervisor.git synced 2025-03-15 08:41:03 +00:00

Author	SHA1	Message	Date
Felipe Lalanne	026dc0aed2	Release locks when removing apps This prevents leftover locks that can prevent other operations from taking place. Change-type: patch	2025-03-06 11:50:31 -03:00
Felipe Lalanne	6d00be2093	Log non-API errors during state poll The supervisor was failing silently if an error happened while establishing the connection (e.g. requesting the socket). Change-type: patch	2025-03-04 10:46:45 -03:00
Felipe Lalanne	f8bdb14335	Fix target poll healthcheck The Target.lastFetch time compared when performing the healthcheck resets any time a poll is attempted no matter the outcome. This changes the behavior so the time is reset only on a successful poll Change-type: patch	2025-03-04 10:45:31 -03:00
Pagan Gazzard	49163e92a0	Decrease balenaCloud api request timeout from 15m to 59s This was mistakenly increased due to confusion between the timeout for requests to the supervisor's api vs the timeout for requests from the supervisor to the balenaCloud api. This separates the two configs and documents the difference between the timeouts whilst also decreasing the timeout for balenaCloud api requests to the correct/expected value Change-type: patch	2025-03-04 12:29:18 +00:00
Christina Ying Wang	2dc9d275b1	Don't revert to regular pull if delta server 401 If the Supervisor receives a 401 Unauthorized from the delta server when requesting a delta image location, we should surface the error instead of falling back to a regular pull immediately, as there could be an issue with the delta auth token, which refreshes after DELTA_TOKEN_TIMEOUT (10min), or some other edge case. Change-type: patch Signed-off-by: Christina Ying Wang <christina@balena.io>	2025-02-24 16:17:15 -08:00
Christina Ying Wang	341111f1f9	Retry DELTA_APPLY_RETRY_COUNT (3) times during delta apply fail before reverting to regular pull This prevents an image download error loop where the delta image on the delta server is present, but some aspect of the delta image or the base image on the device does not match up, causing the delta to fail to be applied to the base image. Delta apply errors don't raise status codes as they are thrown from the Engine (although they should), so if an error with a status code is raised during this time, throw an error to the handler indicating that the delta should be retried until success. Errors with status codes raised during this time are largely network related, so falling back to a regular pull won't improve anything. Upon delta apply errors exceeding DELTA_APPLY_RETRY_COUNT, revert to a regular pull. Change-type: patch Signed-off-by: Christina Ying Wang <christina@balena.io>	2025-02-11 12:19:53 -08:00
Christina Ying Wang	1fc242200f	Revert to regular pull immediately on delta server failure (code 400s) If the delta server responds immediately with HTTP 4xx upon requesting a delta image, this means the server is not able to supply the resource, so fall back to a regular pull immediately. Change-type: patch Signed-off-by: Christina Ying Wang <christina@balena.io>	2025-02-11 10:58:51 -08:00
Felipe Lalanne	f71f98777c	Update network-manager to v1 Change-type: patch	2025-01-23 23:40:52 -03:00
Felipe Lalanne	85fc5784bc	Update contrato to v0.12.0 Change-type: patch	2025-01-15 18:56:24 -03:00
Felipe Lalanne	e416ad0daf	Add support for `io.balena.update.requires-reboot` This label can be used by user services to indicate that a reboot is required after the install of a service in order to fully apply an update. Change-type: minor	2025-01-14 11:20:35 -03:00
Felipe Lalanne	75127c6074	Move reboot breadcrumb check to device-state This was on device-config before, but we'll need to set the reboot breadcrumb from the application-manager as well when we introduce `requires-reboot` as a label. Change-type: patch	2025-01-09 14:31:55 -03:00
Felipe Lalanne	51f1fb0f30	Refactor device-config as part of device-state Move the device-config module to the device-state folder and export only those functions that are needed elsewhere in the codebase This moves us closer to making the device-state module the only way to modify application and configuration. Change-type: patch	2025-01-09 14:31:43 -03:00
Felipe Lalanne	8e6c0fcad7	Wait for service dependencies to be running This fixes a regression where dependencies would only be started in order and would start the dependent service if its dependency had been started at some point in the past, regardless of the running condition. This makes the behavior more consistent with docker compose where the [dependency needs to be running or healthy](`69a83d1303/pkg/compose/convergence.go (L441)`) for the service to be started. Change-type: patch	2024-12-13 16:22:11 -03:00
Christina Ying Wang	2f2b2e1c50	Don't require reboot if setting fan control Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-12-09 18:43:57 -08:00
Christina Ying Wang	828bd22ba0	Add PowerFanConfig config backend This config backend uses ConfigJsonConfigBackend to update os.power and os.fan subfields under the "os" key, in order to set power and fan configs. The expected format for os.power and os.fan settings is: ``` { os: { power: { mode: string }, fan: { profile: string } } } ``` There may be other keys in os which are not managed by the Supervisor, so PowerFanConfig backend doesn't read or write to them. Extra keys in os.power and os.fan are ignored when getting boot config and removed when setting boot config. After this backend writes to config.json, host services os-power-mode and os-fan-profile pick up the changes, on reboot in the former's case and at runtime in the latter's case. The changes are applied by the host services, which the Supervisor does not manage aside from streaming their service logs to the dashboard. Change-type: minor Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-12-09 18:43:51 -08:00
Christina Ying Wang	54fcfa22a7	Support "os" key with object values in ConfigJsonConfigBackend Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-12-09 18:29:26 -08:00
Christina Ying Wang	9ec45a724b	Add tests for ConfigJsonConfigBackend Also deprecate path-getting method, and remove OS version check. The OS version itself is not used in ConfigJsonConfigBackend, so it seems the OS version check is to confirm the existence of config.json during class init, because OS version is a field that's always there in a valid config.json. Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-12-09 18:29:26 -08:00
Christina Ying Wang	8f3eeff72d	Stream logs from last SV's State.FinishedAt, process uptime otherwise This will catch any container or host logs between Supervisor runs. If FinishedAt is invalid (0), the last sent timestamp is already set (i.e. this isn't the first time logMonitor.start() has been called), or the Supervisor container metadata couldn't be acquired, use the Supervisor process uptime as the default. This has the downside of missing any logs generated during SV downtime, but at least means the log-streamer can proceed without error. Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-12-06 07:46:38 -08:00
Christina Ying Wang	fb6fa9b16c	Add ability to stream logs from host services to cloud Add `os-power-mode.service`, `nvpmodel.service`, and `os-fan-profile.service` which report status from applying power mode and fan profile configs as read from config.json. The Supervisor sets these configs in config.json for these host services to pick up and apply. Also add host log streaming from `jetson-qspi-manager.service` as it will very soon be needed for Jetson Orins. Relates-to: #2379 See: balena-io/open-balena-api#1792 See: balena-os/balena-jetson-orin#513 Change-type: minor Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-12-06 07:45:43 -08:00
Christina Ying Wang	c610710f03	Move logger.ts into logging/index.ts Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-12-05 21:55:09 -08:00
Christina Ying Wang	e62e245fc7	Modify log monitor logging to be more generic Includes other host services in addition to balena.service Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-12-05 09:11:04 -08:00
Felipe Lalanne	a2d4b31b23	Take update locks for host-config changes This adds update-lock support to hostname changes via the host-config endpoint, in addition to proxy changes as changing the hostname may cause an engine restart from the OS. Change-type: minor	2024-12-03 15:07:24 -03:00
Felipe Lalanne	8b3b9a5b7b	Respect lockOverride when using withLock	2024-11-27 16:40:58 -03:00
Felipe Lalanne	9c09329b86	Clean up remaining locks on state settle Locks could remain from a previous supervisor run that didn't get to settle the state. This ensures that cleanup will happen for remaining locks every time the state is settled. Change-type: patch	2024-11-27 16:40:58 -03:00
Felipe Lalanne	3c6e9dd209	Refactor update-locks implementation The refactor simplifies the implementation and ensures that locks per app can only be held by one supervisor task at a time. Change-type: patch	2024-11-27 16:40:50 -03:00
Felipe Lalanne	d8f54c05e7	Refactor lockfile module Updated interfaces for clarity Change-type: patch	2024-11-15 18:25:50 -03:00
Christina Ying Wang	7e1cafa866	Firewall: allow DNS requests from custom Docker bridge networks We only allow DNS requests through `balena0` interface, but this is the default Docker bridge which is used for containers that don't have a custom bridge. However, the Supervisor creates a custom bridge for all containers unless another network mode is specified. Change-type: patch Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-11-08 17:02:34 -08:00
Christina Ying Wang	3d3f659f16	Delete apps not in target from db by appUuid instead of appId Resolve an issue in balenaMachine instances that were installed at <v14.1.0, in which a Supervisor app with random UUID is kept in the target db due to its appId being the same, even after the BM instance has upgraded to v14.1.0 which patches the correct reserved Supervisor app UUIDs in. This results in two Supervisors running on devices under the BM instance which persists after BM upgrade. See: https://balena.fibery.io/search/T7ozi#Inputs/Pattern/Two-supervisors-are-running-on-device-3370 Change-type: patch Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-11-04 14:15:55 -08:00
Christina Ying Wang	ed1c18e369	Add support for init field from compose Init supports boolean values, and is not included in the config when not defined. Change-type: patch Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-09-26 10:39:59 -03:00
Felipe Lalanne	e9a52e6786	Store rejected apps in the database This moves from throwing an error when an app is rejected due to unmet requirements (because of contracts) to storing the target with a `rejected` flag on the database. The application manager filters rejected apps when calculating steps to prevent them from affecting the current state. The state engine uses the rejection info to generate the state report. Change-type: minor	2024-08-30 10:52:11 -04:00
Felipe Lalanne	227fee9941	Set the app update status when reporting state Change-type: minor	2024-08-30 10:52:11 -04:00
Felipe Lalanne	48e526ec43	Refactor contracts validation code This updates the interfaces on lib/contracts and the validation in the application-manager module.	2024-08-30 10:52:11 -04:00
Felipe Lalanne	e9f460fd75	Add update status to types Change-type: minor	2024-08-30 10:52:11 -04:00
Felipe Lalanne	788afee9a1	Remove unused patchDevice function This function was a remnant of the dependent devices code that was removed in #2105 Change-type: patch	2024-08-29 10:34:43 -04:00
Christina Ying Wang	eaa07e97a9	Add support for redsocks dnsu2t config Users may specify dnsu2t config by including a `dns` field in the `proxy` section of PATCH /v1/device/host-config's body: ``` { network: { proxy: { dns: '1.1.1.1:53', } } } ``` If `dns` is a string, ADDRESS and PORT are required and should be in the format `ADDRESS:PORT`. The endpoint will error with code 400 if either ADDRESS or PORT is missing. `dns` may also be a boolean. If true, defaults will be configured. If false, the dns configuration will be removed. If `proxy` is patched to empty, `dns` will be removed regardless of its current or input configs, as `dns` depends on an active redsocks proxy to function. Change-type: minor Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-08-28 14:01:51 -07:00
Christina Ying Wang	8bf346a6fd	Parse dnsu2t block to dns config Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-08-28 13:51:46 -07:00
Christina Ying Wang	b775f8f14d	Stringify dns subsection of redsocks input config to dnsu2t Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-08-28 13:51:46 -07:00
Christina Ying Wang	e724f60beb	Strip additional fields from HostConfiguration type Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-08-28 13:51:46 -07:00
Christina Ying Wang	51e59725f8	Add unit test for usingInferStepsLock Change-type: patch Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-08-26 13:44:51 -07:00
Christina Ying Wang	3cebfa9f78	Revert PR #2364 Change-type: patch Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-08-22 14:31:35 -07:00
Christina Ying Wang	fc6927e53d	Avoid unnecessary config calls during Supervisor init Change-type: patch Signed-off-by: Christina Ying Wang <christina@balena.io>	2024-08-20 19:11:14 -07:00
Felipe Lalanne	b088b78a3e	Do not write `noProxy` to redsocks.conf This fixes a regression introduced by the refactor in #2329 where `noProxy` was being included in the data added to redsocks.conf. Change-type: patch	2024-08-08 11:59:20 -04:00
Felipe Lalanne	a255001c2e	Verify that LED_FILE exists on blinking setup Before v1, the blinking module would not throw when the passed led file does not exist. This change checks for file existence and defaults to `/dev/null` otherwise Change-type: patch	2024-08-07 15:33:07 -04:00
Felipe Lalanne	d789e5bb77	Avoid leaking memory on deep promise recursions The following pattern ```ts async function longRunning() { // do something await setTimeout(delay); await longRunning(); } ``` Is regularly used for long running operations on the supervisor (e.g. polling target state). We have recently discovered that this pattern can slowly leak memory as it essentially creates an infinite promise chain. Using `void longRunning()` breaks the chain and avoids the issue. This commit fixes all those instances where the pattern was used. Change-type: patch	2024-07-31 18:39:29 -04:00
Felipe Lalanne	8bc08750e9	Use promises for setup/writing for logging backend The balena logging backend now uses async functions to set up the connection and write messages to the request stream. This adds some backpressure on `log` calls by the log monitor module, to prevent a very aggressive container causing the supervisor to waste CPU cycles just dropping messages. Change-type: patch	2024-07-30 10:51:19 -04:00
Felipe Lalanne	f3fcb0db7a	Improve the LogBackend interface This makes the LogBackend `log` method into an async method in preparation for upcoming changes that will use backpressure from the connection to delay logging coming from containers. This also removes unnecessary imageId from the LogMessage type Change-type: patch	2024-07-30 10:51:19 -04:00
Felipe Lalanne	5af948483a	Use stream pipeline instead of pipe This also removes the use of JSONStream from the monitor module Change-type: patch	2024-07-30 10:51:19 -04:00
Felipe Lalanne	dbacca977a	Do not use DB to store container logs info This removes the dependence of the supervisor on the containerLogs database for remembering the last sent timestamp. This commit instead uses the supervisor startup time as the initial time for log retrieval. This might result in some logs missing for services that may start before the supervisor after a boot, or if the supervisor restarts. However this seems like an acceptable trade-off as the current implementation seems to make things worse in resource constrained environments. We'll move storing the last sent timestamp to a better storage medium in a future commit. Change-type: minor	2024-07-30 10:51:18 -04:00
Pagan Gazzard	4976578a83	Improve log message typing Change-type: patch	2024-07-17 11:14:17 +01:00
Pagan Gazzard	c5d0eafea9	Logs: only truncate the message if it's possible it will need it Change-type: patch	2024-07-16 18:09:12 -04:00
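
Commit eaa07e97a9 above describes how the `dns` field of the `proxy` section is interpreted: a string must be `ADDRESS:PORT`, `true` selects defaults, and `false` removes the configuration. The following is a minimal sketch of that rule, not the Supervisor's actual validation code. `parseDnsField`, `DnsSetting`, and `BadRequestError` are hypothetical names, and the default values are left out because the commit message does not state them.

```typescript
// Hypothetical error type carrying the HTTP status the endpoint would return.
class BadRequestError extends Error {
  readonly statusCode = 400;
}

// Hypothetical result type: what the caller should do with the `dns` input.
type DnsSetting =
  | { kind: 'custom'; address: string; port: number }
  | { kind: 'defaults' }
  | { kind: 'remove' };

function parseDnsField(dns: string | boolean): DnsSetting {
  // Booleans: true means "configure defaults", false means "remove dns config".
  if (typeof dns === 'boolean') {
    return dns ? { kind: 'defaults' } : { kind: 'remove' };
  }
  // Strings must contain both parts, separated by the last colon.
  const sep = dns.lastIndexOf(':');
  const address = sep === -1 ? '' : dns.slice(0, sep);
  const portText = sep === -1 ? '' : dns.slice(sep + 1);
  if (address === '' || !/^\d+$/.test(portText)) {
    throw new BadRequestError(
      `dns must be in the format ADDRESS:PORT, got '${dns}'`,
    );
  }
  return { kind: 'custom', address, port: Number(portText) };
}
```

With this sketch, `parseDnsField('1.1.1.1:53')` yields a `custom` setting, while `parseDnsField('1.1.1.1')` throws an error whose `statusCode` is 400.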

1660 Commits
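
Commit d789e5bb77 above describes replacing `await longRunning()` with `void longRunning()` in self-rescheduling async functions, so that each call's promise settles on its own instead of waiting on the next call. The sketch below illustrates the idea under stated assumptions: `startPolling`, `sleep`, and the `maxRuns` bound are hypothetical and exist only so the example terminates. The Supervisor's real pollers run indefinitely and are structured differently.

```typescript
// Hypothetical helper: resolve after `ms` milliseconds.
function sleep(ms: number): Promise<void> {
  return new Promise<void>((resolve) => {
    setTimeout(resolve, ms);
  });
}

// Hypothetical poller: runs `task` `maxRuns` times, `delayMs` apart, and
// resolves with the number of completed runs.
function startPolling(
  task: () => void,
  delayMs: number,
  maxRuns: number,
): Promise<number> {
  let runs = 0;
  return new Promise<number>((resolve) => {
    async function loop(): Promise<void> {
      task();
      runs += 1;
      if (runs >= maxRuns) {
        resolve(runs);
        return;
      }
      await sleep(delayMs);
      // `void` instead of `await`: this invocation's promise resolves right
      // after scheduling the next one, so no chain of pending promises builds up.
      void loop();
    }
    void loop();
  });
}
```

One trade-off of this pattern is that a rejection inside a later `loop()` call is no longer propagated to the original caller, so errors need to be handled inside the loop itself.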