1564 Commits

Author SHA1 Message Date
Pagan Gazzard
5cd37e73ac Increase the timeout for auto select family to 5000ms to avoid issues
On slower networks the default of 250ms can cause problems as all
attempts will fail rather than only the ones for interfaces that do not
actually work correctly. Increasing this timeout to 5000ms will help to
avoid these issues

Change-type: patch
2024-03-25 15:05:13 +00:00
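The setting this commit refers to can be illustrated with Node's `net` module. This is a minimal sketch, assuming Node 20 or later; where and how the supervisor applies the value is not shown in this log, and the constant name is hypothetical.

```typescript
import * as net from 'node:net';

// Assumption: this runs once at process start, before any sockets are opened.
// The value mirrors the 5000ms mentioned in the commit message.
const ATTEMPT_TIMEOUT_MS = 5000;

// Sets the default time to wait for one address to connect before
// autoSelectFamily moves on to the next address.
net.setDefaultAutoSelectFamilyAttemptTimeout(ATTEMPT_TIMEOUT_MS);

// Read the value back to confirm what later connections will use by default.
const effectiveTimeout = net.getDefaultAutoSelectFamilyAttemptTimeout();
```

A single connection can also override the default through the `autoSelectFamilyAttemptTimeout` option of `socket.connect`.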
Felipe Lalanne
08727ed2b5 Remove dependency on @balena/happy-eyeballs
Node 20 now implements the happy eyeballs algorithm as part of its core
`net` module, with the [autoSelectFamily](https://nodejs.org/docs/latest-v20.x/api/net.html#netgetdefaultautoselectfamily) option of `socket.connect`. This option defaults to `true`, meaning that a separate
implementation of happy eyeballs is no longer needed.

Change-type: patch
2024-03-06 15:16:33 -03:00
Felipe Lalanne
b77dba2046 Update Node to v20
This updates the supervisor runtime to latest Node LTS version. There
are no breaking changes related to this bump.

Change-type: patch
2024-03-06 12:29:54 -03:00
Felipe Lalanne
6217546894 Update typescript to v5
This also updates code to use the default import syntax instead of
`import * as` when the imported module exposes a default. This is needed
with the latest typescript version.

Change-type: patch
2024-03-05 15:33:56 -03:00
Felipe Lalanne
988a1c9e9a Update @balena/lint to v7
This updates balena lint to the latest version to enable eslint support
and unblock Typescript updates. This is a huge number of changes as the
linting rules are much more strict now, requiring modifying most files
in the codebase. This commit also bumps the test dependency `rewire` as
that was interfering with the update of balena-lint

Change-type: patch
2024-03-01 18:27:30 -03:00
Felipe Lalanne
bda1bac04c Add support for repeated overlays
RPI firmware configuration allows repeating overlays to define
configurations on multiple devices. For instance, for configuring
multiple `ads` devices, `config.txt` needs to be set up this way:

```
dtoverlay=ads1115,addr=0x48
dtoverlay=ads1115,addr=0x49
```

Before this change, the supervisor would interpret both lines as
belonging to the same overlay, preventing users from configuring multiple
devices, and leading to a loop when trying to apply configurations with
repeated overlays coming from the cloud side.

Change-type: minor
2024-02-27 14:52:41 -03:00
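One way to see why repeated overlays need an ordered list rather than a map keyed by overlay name is the sketch below. The `OverlayEntry` type and `parseOverlays` function are hypothetical names for illustration, not the supervisor's actual implementation.

```typescript
// Hypothetical shape for a parsed overlay entry; not the supervisor's real type.
interface OverlayEntry {
  name: string;
  params: Record<string, string>;
}

// Parses `dtoverlay=` lines into an ordered list, so that a repeated overlay
// name yields one entry per line instead of being merged into a single entry.
function parseOverlays(configTxt: string): OverlayEntry[] {
  const entries: OverlayEntry[] = [];
  for (const raw of configTxt.split('\n')) {
    const line = raw.trim();
    if (line.indexOf('dtoverlay=') !== 0) {
      continue; // this sketch ignores every other kind of line
    }
    const [name, ...rest] = line.slice('dtoverlay='.length).split(',');
    const params: Record<string, string> = {};
    for (const pair of rest) {
      const idx = pair.indexOf('=');
      if (idx === -1) {
        params[pair] = ''; // bare parameter, stored with an empty value
      } else {
        params[pair.slice(0, idx)] = pair.slice(idx + 1);
      }
    }
    entries.push({ name, params });
  }
  return entries;
}

const overlays = parseOverlays(
  'dtoverlay=ads1115,addr=0x48\ndtoverlay=ads1115,addr=0x49',
);
```

With a map keyed by `ads1115`, the second line would overwrite or merge into the first; the list keeps both entries.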
Christina Ying Wang
3fd035c5bd Patch default dtparam handling in config.txt
This commit completes the list of default / board-wide dtparams
to include some `baudrate` and `vc` i2c params.

Change-type: patch
Signed-off-by: Christina Ying Wang <christina@balena.io>
2024-02-21 12:45:29 -08:00
Christina Ying Wang
e22253ce6e Patch config.txt backend to return array configs correctly
Previously, getBootConfig() of the config.txt backend was omitting
array configurations such as gpio settings, thus resulting in the SV
mistakenly assuming that boot config had not been applied, since gpio
would not be in current config.txt config but would be in target config.
This resulted in SV entering an infinite loop of attempting to apply the
gpio config when it wasn't necessary.

Change-type: patch
Signed-off-by: Christina Ying Wang <christina@balena.io>
2024-02-16 18:12:33 -08:00
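The loop described above can be reproduced with a small comparison sketch. The types and function names are assumptions made for illustration, and the `gpio` values are sample strings only.

```typescript
// Hypothetical types: a boot config value is either a scalar or an array.
type ConfigValue = string | string[];
type BootConfig = Record<string, ConfigValue>;

function valuesEqual(a?: ConfigValue, b?: ConfigValue): boolean {
  if (Array.isArray(a) && Array.isArray(b)) {
    return a.length === b.length && a.every((v, i) => v === b[i]);
  }
  return a === b;
}

// True when both configs hold the same keys with the same values.
function configMatches(current: BootConfig, target: BootConfig): boolean {
  const keys = Object.keys(current).concat(Object.keys(target));
  return keys.every((k) => valuesEqual(current[k], target[k]));
}

const target: BootConfig = { enable_uart: '1', gpio: ['17=op,dh', '27=ip'] };

// A backend that drops array values reports a config that can never match.
const withoutArrays: BootConfig = { enable_uart: '1' };
const withArrays: BootConfig = { enable_uart: '1', gpio: ['17=op,dh', '27=ip'] };

const loops = !configMatches(withoutArrays, target); // mismatch on every check
const settles = configMatches(withArrays, target);
```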
Felipe Lalanne
6e6a796da5 Add special case for base DTO params on RPI config
While ordering is important in the RPI firmware configuration file (config.txt),
some dt params are by default considered part of the base dt overlay
if they are not used by other overlays.
Unfortunately the [list of dtparams](https://github.com/raspberrypi/firmware/blob/master/boot/overlays/README#L133)
is too long to add all of them as exceptions, but we can add the params
used in the default config.txt provided in OS images, to avoid reboots
when updating to this new supervisor and correctly parsing the
provisioning config.txt as variables.

While this addition handles most common scenarios, there is still a
chance a user may have used other base overlay dt params in the initial
config, in which case those will be interpreted according to the
relative ordering

Change-type: patch
2024-02-08 15:48:10 -03:00
Felipe Lalanne
9546a1a3b1 Fix processing of dtoverlay/dtparams on config.txt
DT overlays and DT params need to be consumed in the order that they
appear on the file. DT params apply to the last dtoverlay defined on the
file, or to the base overlay.

This commit updates config.txt parsing to consider this ordering, and it
also ensures global dtparams are written first so they cannot be
overridden by later overlays.

Because of the more strict parsing method, it is possible that existing
HOST_CONFIG vars do not match the interpretation of the parser. If
that's the case, the supervisor will re-apply the target state which
will cause the device to reboot.

Change-type: major
2024-02-08 15:46:07 -03:00
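The ordering rule in this commit can be sketched as follows. This is a simplified illustration under stated assumptions: `DtConfig`, `parseDtConfig` and `serializeDtConfig` are hypothetical names, and the sketch only handles `dtoverlay=` and `dtparam=` lines.

```typescript
// Hypothetical result shape: base params plus an ordered list of overlays.
interface DtConfig {
  baseParams: string[];
  overlays: Array<{ name: string; params: string[] }>;
}

function parseDtConfig(lines: string[]): DtConfig {
  const config: DtConfig = { baseParams: [], overlays: [] };
  for (const raw of lines) {
    const line = raw.trim();
    if (line.indexOf('dtoverlay=') === 0) {
      const [name, ...params] = line.slice('dtoverlay='.length).split(',');
      config.overlays.push({ name, params });
    } else if (line.indexOf('dtparam=') === 0) {
      const params = line.slice('dtparam='.length).split(',');
      const last = config.overlays[config.overlays.length - 1];
      // A dtparam applies to the most recent dtoverlay, or to the base
      // overlay when no dtoverlay has been seen yet.
      (last ? last.params : config.baseParams).push(...params);
    }
  }
  return config;
}

// Writes base params first so that no overlay line precedes them.
function serializeDtConfig(config: DtConfig): string[] {
  const out = config.baseParams.map((p) => `dtparam=${p}`);
  for (const { name, params } of config.overlays) {
    out.push([`dtoverlay=${name}`, ...params].join(','));
  }
  return out;
}

const parsed = parseDtConfig([
  'dtparam=i2c_arm=on',
  'dtoverlay=ads1115',
  'dtparam=addr=0x48',
]);
const written = serializeDtConfig(parsed);
```

Moving the `dtparam=i2c_arm=on` line below the `dtoverlay` line would attach it to `ads1115` instead of the base overlay, which is why the stricter parser can disagree with previously stored variables.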
Felipe Lalanne
a8e371f0c9 Refactor config-txt backend
Cleans up code and adds better type detection
2024-02-07 20:39:41 -03:00
Christina Ying Wang
3afcef2969 Respect update strategies app-wide instead of at the service level
Fixes behavior for release updates that remove a service in current state
and add a new service in target state.

Change-type: patch
Closes: #2095
Signed-off-by: Christina Ying Wang <christina@balena.io>
2024-01-29 12:26:28 -08:00
Felipe Lalanne
dec39a35d4 Try MDNS lookup only if regular DNS lookup fails
This is meant to allow users to configure their device to
resolve `.local` queries via dnsmasq by modifying config.json, e.g.
`"dnsServers": "/bob.local/172.17.0.33"`.

This would fail before as MDNS lookups would always come first

Change-type: minor
2024-01-03 14:42:23 -03:00
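The lookup order introduced here can be sketched with injected resolvers. The `Resolver` type and `lookupWithFallback` function are hypothetical; limiting the fallback to `.local` names is an assumption of this sketch, not something the commit message states.

```typescript
// Hypothetical resolver signature: resolves a hostname to an address string.
type Resolver = (hostname: string) => Promise<string>;

// Tries regular DNS first and only falls back to mDNS if that lookup fails.
async function lookupWithFallback(
  hostname: string,
  dnsLookup: Resolver,
  mdnsLookup: Resolver,
): Promise<string> {
  try {
    return await dnsLookup(hostname);
  } catch (dnsError) {
    // Assumption: only `.local` names are worth retrying over mDNS.
    if (!hostname.endsWith('.local')) {
      throw dnsError;
    }
    return mdnsLookup(hostname);
  }
}
```

With this order, a `.local` name configured in dnsmasq resolves through regular DNS and the mDNS resolver is never consulted for it.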
Felipe Lalanne
7a39da92b7 Refactor mdns lookup code in app entry
Change-type: patch
2024-01-03 14:42:23 -03:00
Felipe Lalanne
3ea8d4727a Force remove container if updateMetadata fails
The `updateMetadata` step renames the container to match the target
release when the service doesn't change between releases. We have seen
this step fail because of an engine bug that seems to relate to the
engine keeping stale references after container restarts. The only way
around this issue is to remove the old container and create it again.
This implements that workaround during the updateMetadata step to deal
with that issue.

Change-type: minor
Relates-to: balena-os/balena-engine#261
2023-11-22 14:16:44 -03:00
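The workaround can be pictured as a rename with a remove-and-recreate fallback. The `Engine` interface below is a hypothetical stand-in for the container engine client, not its real API.

```typescript
// Hypothetical minimal engine interface, for illustration only.
interface Engine {
  rename(containerId: string, newName: string): Promise<void>;
  remove(containerId: string, opts: { force: boolean }): Promise<void>;
  create(name: string): Promise<string>; // resolves to the new container id
}

// Renames the container to match the target release. If the rename fails,
// the old container is force removed and a new one is created in its place.
async function updateMetadataOrRecreate(
  engine: Engine,
  containerId: string,
  targetName: string,
): Promise<string> {
  try {
    await engine.rename(containerId, targetName);
    return containerId;
  } catch {
    await engine.remove(containerId, { force: true });
    return engine.create(targetName);
  }
}
```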
Christina Ying Wang
eb8ad11cd7 Cache last reported current state to /mnt/root/tmp
Whenever the Supervisor reports current state, it diffs the current state
with its last reported current state. However, when the Supervisor starts
up, there is no last reported state, since that last report is stored in
process memory. Caching the last report in a location that survives
Supervisor restarts will reduce the current report bandwidth used on startup.

Change-type: patch
Signed-off-by: Christina Ying Wang <christina@balena.io>
2023-11-14 16:15:36 -08:00
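The bandwidth saving comes from the diff step, which the following sketch illustrates. `diffReport` is a hypothetical helper that compares top-level keys only; in the scenario described above, `lastReport` would be loaded from the cached file after a restart instead of starting as `null`.

```typescript
type State = Record<string, unknown>;

// Returns only the top-level keys whose values differ from the last report.
// With no last report available, the whole current state has to be sent.
function diffReport(current: State, lastReport: State | null): State {
  if (lastReport == null) {
    return { ...current };
  }
  const changed: State = {};
  for (const key of Object.keys(current)) {
    // JSON comparison is key-order sensitive; good enough for a sketch.
    if (JSON.stringify(current[key]) !== JSON.stringify(lastReport[key])) {
      changed[key] = current[key];
    }
  }
  return changed;
}

const current: State = { os_version: '5.1.0', apps: { main: 'running' } };

// Fresh start with nothing cached: the full state is reported.
const fullReport = diffReport(current, null);

// Fresh start with a cached report: only the changed key is reported.
const cached: State = { os_version: '5.1.0', apps: { main: 'downloading' } };
const smallReport = diffReport(current, cached);
```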
Christina Ying Wang
d440776881 Convert current state types to io-ts
Signed-off-by: Christina Ying Wang <christina@balena.io>
2023-11-08 16:00:54 -08:00
Christina Ying Wang
a993b3e7af Set applyInProgress to true while applying intermediate state
Intermediate state is utilized when executing device actions such as a
volume purge. It's a type of state apply, but despite that,
applyInProgress is not true.

Change-type: patch
Signed-off-by: Christina Ying Wang <christina@balena.io>
2023-10-25 10:32:10 -07:00
Felipe Lalanne
9bd216327f Expose ports from port mappings on services
PR #2217 removed the expose configuration but also caused a regression
where ports set via the `ports` configuration would no longer get
exposed to the host, despite port mappings being set. This fixes that
issue by exposing only those ports coming from port mappings.

Change-type: patch
2023-10-24 15:04:39 -03:00
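Deriving the exposed ports from port mappings could look roughly like the sketch below. `exposedPortsFromMappings` is a hypothetical helper; it only handles the `container`, `host:container` and `ip:host:container` forms with an optional `/proto` suffix, and ignores port ranges.

```typescript
// Returns the container-side ports (e.g. "80/tcp") found in a list of
// compose-style port mappings, without duplicates.
function exposedPortsFromMappings(mappings: string[]): string[] {
  const exposed: string[] = [];
  for (const mapping of mappings) {
    // Split off the protocol; this sketch assumes tcp when none is given.
    const [spec, proto = 'tcp'] = mapping.split('/');
    // The container port is always the last colon-separated segment.
    const parts = spec.split(':');
    const key = `${parts[parts.length - 1]}/${proto}`;
    if (exposed.indexOf(key) === -1) {
      exposed.push(key);
    }
  }
  return exposed;
}

const exposedPorts = exposedPortsFromMappings([
  '8080:80',
  '127.0.0.1:53:53/udp',
  '80',
]);
```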
Felipe Lalanne
416170bc05 Ignore expose service compose configuration
The docker EXPOSE directive and the corresponding docker-compose `expose`
service configuration serve as documentation/metadata that a container
listens on a certain port, which may be used for service discovery. They
do not have any real impact on the ability of other containers on the
same network to access the exposed service via the port. In newer engine
implementations, this property may conflict
with other network configurations, and prevent the container from being
started by the docker engine (see #2211).

This PR removes code that would manage the expose property and takes the
property out of the whitelist. A composition with the `expose` property
will result in the log message `Ignoring unsupported or unknown compose fields: expose`.

While this change should not have operational impact, it still removes
a previously supported configuration and as such there is a chance of it
being a breaking change for some applications. For this reason it is
being published as a new major version.

Change-type: major
Closes: #2211
2023-10-23 11:41:32 -03:00
Felipe Lalanne
b107868765 Add note regarding API jitter on target state poll
Change-type: patch
2023-10-23 14:11:20 +01:00
Pagan Gazzard
e15205301c Switch some _.includes usage to native versions
Change-type: patch
2023-10-16 14:30:25 -03:00
Pagan Gazzard
a4a9a17c1a Switch _.assign usage to native versions
Change-type: patch
2023-10-16 14:30:25 -03:00
Pagan Gazzard
d0cb54537f Switch _.isNaN usage to native versions
Change-type: patch
2023-10-16 14:30:25 -03:00
Pagan Gazzard
3bfdc4454e Switch _.isUndefined usage to native versions
Change-type: patch
2023-10-16 14:30:25 -03:00
Pagan Gazzard
8e23091aa9 Switch _.isNull usage to native versions
Change-type: patch
2023-10-16 14:30:25 -03:00
Pagan Gazzard
ca3faebfc9 Switch _.isNumber usage to native versions
Change-type: patch
2023-10-16 14:30:25 -03:00
Pagan Gazzard
20df54668c Switch _.isArray usage to native versions
Change-type: patch
2023-10-16 14:30:25 -03:00
Pagan Gazzard
3fe8a22fb0 Switch _.isString usage to native versions
Change-type: patch
2023-10-16 14:30:25 -03:00
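The lodash helpers replaced in this series have close native counterparts. The sketch below lists one plausible replacement for each; the exact expressions used in the codebase are not shown in this log, so treat these as assumptions.

```typescript
const value: unknown = 'abc';
const list = [1, 2, 3];

const nativeChecks = {
  includes: list.includes(2), // _.includes(list, 2)
  assign: Object.assign({}, { a: 1 }, { b: 2 }), // _.assign({}, ...)
  isNaN: Number.isNaN(Number('x')), // _.isNaN(Number('x'))
  isUndefined: value === undefined, // _.isUndefined(value)
  isNull: value === null, // _.isNull(value)
  isNumber: typeof value === 'number', // _.isNumber(value)
  isArray: Array.isArray(list), // _.isArray(list)
  isString: typeof value === 'string', // _.isString(value)
};
```

The native forms are not exact equivalents in every edge case. For example, `_.isNumber` and `_.isString` also accept boxed `Number` and `String` objects, and `_.includes` works on strings and plain objects as well as arrays.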
Felipe Lalanne
3e828dcc52 Revert "Do not expose ports from image if service network mode"
This reverts commit 0c7bad779291e15e419166a2c66c2a21dd06aa83, as that
change causes a service restart loop. The supervisor cannot distinguish
between ports exposed via the `EXPOSE` directive and the docker-compose
`expose` property. Because of this, in the case of `network-mode:
service:<...>` the current state and target state never match, leading
to a service restart loop.

Change-type: patch
2023-10-16 13:06:50 -03:00
Pagan Gazzard
766cce89c7 Convert multiple bluebird uses to native promises
Change-type: patch
2023-10-16 11:40:45 +01:00
Felipe Lalanne
0c7bad7792 Do not expose ports from image if service network mode
The supervisor exposes ports configured using the `EXPOSE` directive in
the dockerfile when configuring the container for runtime. This can
cause issues if using `network_mode: service:<service name>` as the
expose configuration is not compatible with that network mode. This
fix now skips image exposed ports for that particular network mode.

Change-type: patch
Relates-to: #2211
2023-10-12 18:03:42 -03:00
Pagan Gazzard
3d73bf3e91 Use mutation for adding service/image ids to logs to reduce allocations
Change-type: patch
2023-10-11 15:39:19 -03:00
Pagan Gazzard
d685ccacb2 Keep the container lock for the entire duration of attaching logs
Change-type: patch
2023-10-11 15:39:19 -03:00
Pagan Gazzard
74d374b5ad Remove unnecessary async on handling journald stderr entries
Change-type: patch
2023-10-11 15:39:19 -03:00
Pagan Gazzard
e3806ec018 Avoid unnecessary work in systemd log row handling for invalid logs
Change-type: patch
2023-10-11 15:39:19 -03:00
Pagan Gazzard
894bdeeeb6 Remove unused docker logs logging code
Change-type: patch
2023-10-11 14:20:33 +01:00
Christina Ying Wang
06d4775178 Use native structuredClone instead of _.cloneDeep
Memory tests have shown performance improvements from using the native method.

Change-type: patch
Signed-off-by: Christina Ying Wang <christina@balena.io>
2023-09-29 12:29:50 -07:00
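A minimal sketch of the native replacement, assuming a runtime where `structuredClone` is available as a global (Node 17 or later). The object shape is made up for illustration.

```typescript
// Sample nested object; the shape is hypothetical.
const original = {
  services: [{ name: 'main', ports: [80] }],
  updatedAt: new Date(0),
};

// Deep copy without lodash: nested arrays, objects and Dates are duplicated.
const copy = structuredClone(original);

// Mutating the copy leaves the original untouched.
copy.services[0].ports.push(443);
```

One behavioral difference to keep in mind: `structuredClone` throws when the value contains a function, so it is only a drop-in replacement for plain data.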
jaomaloy
ab513cc021 Dump target-state to hostOS tmp dir
This change is mainly for the hostOS
to know if update locks should be ignored
when updating to a newer version.

Change-type: patch
Signed-off-by: jaomaloy <jao.maloy@balena.io>
2023-09-14 11:03:34 +08:00
Felipe Lalanne
327dc31ef0 Replace node-dbus with @balena/systemd
The node-dbus module is unmaintained and a blocker for the update to
Node 18. Switching to our own node bindings for systemd solves this
issue

Relates-to: Shouqun/node-dbus#241
Change-type: patch
2023-08-16 15:58:52 -04:00
Alexandru Costache
512240c544 backends: Add Jetson Orin NANO custom device-tree support
Signed-off-by: Alexandru Costache <alexandru@balena.io>
Change-type: patch
2023-07-11 18:11:32 +03:00
Florin Sarbu
8d2b310af8 Add revpi-connect-s to Raspberry Pi variants
We need the supervisor to be able to manage config.txt changes for the
Revolution Pi Connect S.

Change-type: patch
Signed-off-by: Florin Sarbu <florin@balena.io>
2023-07-05 13:55:29 +02:00
Christina Ying Wang
38fe8dae75 Remove the 'Stopped' status for services
It's not an official status from container inspects, and the Supervisor
doesn't set it internally anywhere. It's better to remove it entirely as the
method by which Supervisor sets internal service statuses is by using a global
event emitter (reportNewStatus) which makes things difficult to test.

Change-type: patch
Signed-off-by: Christina Ying Wang <christina@balena.io>
2023-06-28 11:17:13 -04:00
Christina W
71d24d6e33 Parse container exit error message instead of status
The previous implementation in #2170 of parsing the container status was too general,
because it relied on the mistaken assumption that a container would have a status of
`Stopped` if it was manually stopped. This turned out to be untrue, as manually stopped
containers were also getting restarted by the Supervisor due to their inspect status of
`exited`. With this, parsing the exit message became unavoidable as there are no other
clear ways to discern a container that has been manually stopped and shouldn't be started
from a container experiencing the Engine-host race condition issue (again, see #2170).

Since we're just parsing the exit error message, we don't need to worry about different behaviors
amongst restart policies, as any container with the error message on exit should be started.

Change-type: patch
Closes: #2178
Signed-off-by: Christina Ying Wang <christina@balena.io>
2023-06-22 14:43:17 -07:00
Felipe Lalanne
12eac04484 Fix /v2/applications/state endpoint
It was returning stale information, particularly the download progress
of the target release images never got updated.

Change-type: patch
Closes: #2174
2023-06-19 17:16:36 -04:00
Christina Ying Wang
9e249e6ae8 Remove unnecessary async/await from method
Signed-off-by: Christina Ying Wang <christina@balena.io>
2023-06-19 11:11:26 -07:00
Christina Ying Wang
6e6f79c71d Decrease wait time before start from 60s to 30s
60 seconds to wait may be excessively long.

Signed-off-by: Christina Ying Wang <christina@balena.io>
2023-06-19 11:11:26 -07:00
Christina Ying Wang
ace642ea0f Improve naming of a util function & add unit test
isOlderThan -> isValidDateAndOlderThan

See: https://github.com/balena-os/balena-supervisor/pull/2170#discussion_r1226809686
Signed-off-by: Christina Ying Wang <christina@balena.io>
2023-06-19 11:11:26 -07:00
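The new name signals that the helper checks two things: that the input is a valid date and that it is older than a threshold. A sketch under stated assumptions follows; the parameter list, including the injectable `now`, is hypothetical and the real helper may differ.

```typescript
// Returns true only if `value` is a valid Date that lies more than
// `seconds` seconds before `now`. Anything else yields false.
function isValidDateAndOlderThan(
  value: unknown,
  seconds: number,
  now: Date = new Date(),
): boolean {
  if (!(value instanceof Date) || Number.isNaN(value.getTime())) {
    return false;
  }
  return now.getTime() - value.getTime() > seconds * 1000;
}
```

Passing `now` explicitly keeps the function deterministic in unit tests, for example `isValidDateAndOlderThan(new Date(0), 30, new Date(31000))`.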
Christina Ying Wang
ab80f198d8 Add exitCode property to Service class
Since we need to conditionally query the service's exit code
during step inference, adding the exitCode property keeps the
step inference function pure.

See: https://github.com/balena-os/balena-supervisor/pull/2170#discussion_r1226805153
Signed-off-by: Christina Ying Wang <christina@balena.io>
2023-06-19 11:11:26 -07:00
Christina Ying Wang
2537eb8189 Handle the case of 'on-failure' restart policy
As explained in the comments of this commit, a container with the restart policy
of 'on-failure' with a non-zero exit code matches the conditions for the race, so
the Supervisor will also attempt to start it. A container with the 'no' restart
policy that has been started once will not be started again. If a container with
'no' has never been started, its service status will be 'Installed' and the Supervisor
will already try to start it until success, so the service with 'no' doesn't require
special handling.

Signed-off-by: Christina Ying Wang <christina@balena.io>
2023-06-05 11:05:58 -07:00