4600 Commits

Author SHA1 Message Date
Christina Ying Wang
f863075bdc Add memory usage healthcheck
This healthcheck fails when Supervisor memory usage is above a threshold
based on initial memory measurements after device state has settled.

Change-type: patch
Signed-off-by: Christina Ying Wang <christina@balena.io>
2024-04-11 18:16:47 -07:00
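Note on f863075bdc: the idea of a threshold derived from an initial measurement can be sketched as below. This is a minimal illustration, not the Supervisor's implementation. `createMemoryHealthcheck`, `settle`, `check`, and the 15 MiB allowance are all hypothetical names and values chosen for the example.

```typescript
// Hypothetical sketch: all names and the 15 MiB allowance are assumptions,
// not taken from the Supervisor source.
type MemoryReader = () => number; // returns current memory usage in bytes

function createMemoryHealthcheck(
  readMemory: MemoryReader,
  allowedGrowthBytes: number,
): { settle: () => void; check: () => boolean } {
  let baseline: number | null = null;
  return {
    // Call once device state has settled to record the initial measurement.
    settle: (): void => {
      baseline = readMemory();
    },
    // Passes until a baseline exists; afterwards passes only while growth
    // over the baseline stays within the allowance.
    check: (): boolean => {
      const initial = baseline;
      if (initial === null) {
        return true;
      }
      return readMemory() - initial <= allowedGrowthBytes;
    },
  };
}

// Usage with a fake reader so the example is deterministic.
let fakeUsageBytes = 100 * 1024 * 1024;
const memoryHealthcheck = createMemoryHealthcheck(
  () => fakeUsageBytes,
  15 * 1024 * 1024,
);
memoryHealthcheck.settle();
const healthyAtBaseline = memoryHealthcheck.check();
fakeUsageBytes += 20 * 1024 * 1024; // simulate 20 MiB of growth
const healthyAfterGrowth = memoryHealthcheck.check();
```

In a real Node.js process the reader could be something like `() => process.memoryUsage().rss`; whether the Supervisor measures memory that way is not stated in the commit message.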
flowzone-app[bot]
a71cc374db
v16.2.6 2024-04-10 16:30:15 +00:00
balena-renovate[bot]
31ec3a86a3
Merge pull request #2268 from balena-os/renovate/balena-io-deploy-to-balena-action-2.0.x
Update balena-io/deploy-to-balena-action action to v2.0.55
2024-04-10 16:29:24 +00:00
Self-hosted Renovate Bot
6864ab329a Update balena-io/deploy-to-balena-action action to v2.0.55
Update balena-io/deploy-to-balena-action from 2.0.54 to 2.0.55

Change-type: patch
2024-04-10 16:09:40 +00:00
flowzone-app[bot]
93ac25eff7
v16.2.5 2024-04-09 18:43:54 +00:00
balena-renovate[bot]
16304cc39c
Merge pull request #2267 from balena-os/renovate/balena-io-deploy-to-balena-action-2.0.x
Update balena-io/deploy-to-balena-action action to v2.0.54
2024-04-09 18:41:43 +00:00
Self-hosted Renovate Bot
30351e5f12 Update balena-io/deploy-to-balena-action action to v2.0.54
Update balena-io/deploy-to-balena-action from 2.0.53 to 2.0.54

Change-type: patch
2024-04-09 18:09:41 +00:00
flowzone-app[bot]
5f14b5c407
v16.2.4 2024-04-09 17:32:33 +00:00
balena-renovate[bot]
23e1c3b8e3
Merge pull request #2266 from balena-os/renovate/balena-io-deploy-to-balena-action-2.0.x
Update balena-io/deploy-to-balena-action action to v2.0.53
2024-04-09 17:31:35 +00:00
Self-hosted Renovate Bot
7d73911948 Update balena-io/deploy-to-balena-action action to v2.0.53
Update balena-io/deploy-to-balena-action from 2.0.52 to 2.0.53

Change-type: patch
2024-04-09 17:08:52 +00:00
flowzone-app[bot]
6d4c7a5709
v16.2.3 2024-04-09 16:30:50 +00:00
balena-renovate[bot]
0b45962c6c
Merge pull request #2264 from balena-os/renovate/balena-io-deploy-to-balena-action-2.0.x
Update balena-io/deploy-to-balena-action action to v2.0.52
2024-04-09 16:29:46 +00:00
Self-hosted Renovate Bot
55de8ae430 Update balena-io/deploy-to-balena-action action to v2.0.52
Update balena-io/deploy-to-balena-action from 2.0.27 to 2.0.52

Change-type: patch
2024-04-09 16:09:02 +00:00
flowzone-app[bot]
1c18e8319d
v16.2.2 2024-04-08 14:36:48 +00:00
flowzone-app[bot]
cfad7f86a0
Merge pull request #2256 from balena-os/kyle/renovate-extends-balena-io
Inherit Renovate settings from balena-io
2024-04-08 14:35:46 +00:00
Kyle Harding
58e05d0f63 Inherit Renovate settings from balena-io
Change-type: patch
Signed-off-by: Kyle Harding <kyle@balena.io>
2024-04-07 19:23:52 -04:00
flowzone-app[bot]
2cef1b9bca
v16.2.1 2024-04-06 08:22:27 +00:00
flowzone-app[bot]
4319d0aa56
Merge pull request #2263 from balena-os/bugfix-1-for-update-lock-state-apply
Update lock state apply: patch DockerName & respect lockOverride
2024-04-06 08:21:00 +00:00
Christina Ying Wang
8ac2ce4677 Respect lockOverride when taking locks
Signed-off-by: Christina Ying Wang <christina@balena.io>
2024-04-06 00:59:04 -07:00
Christina Ying Wang
b7922e6875 Fix some RegEx io-ts types
io-ts types that were generated using `shortStringWithRegex` were testing
against `VAR_NAME_REGEX` instead of the regex that was specified when
generating the type. As a result, the `DockerName.is` type guard returned
false for service names with a dash in the middle, which affected the map
of locked services returned by `getServicesLockedByAppId`.

Change-type: patch
Signed-off-by: Christina Ying Wang <christina@balena.io>
2024-04-06 00:20:34 -07:00
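Note on b7922e6875: the class of bug described above can be reproduced with a plain type-guard factory. The sketch below does not use io-ts, and both regular expressions are illustrative stand-ins, since the Supervisor's actual patterns are not shown in the commit message. `buggyShortStringWithRegex` and the 255-character limit are assumptions made for the example.

```typescript
// Illustrative patterns only; the Supervisor's real regexes may differ.
const VAR_NAME_REGEX = /^[a-zA-Z_][a-zA-Z0-9_]*$/;
const DOCKER_NAME_REGEX = /^[a-zA-Z0-9][a-zA-Z0-9_.-]*$/;

type StringGuard = (value: unknown) => value is string;

// Buggy factory: accepts a regex but always tests against VAR_NAME_REGEX.
function buggyShortStringWithRegex(_regex: RegExp): StringGuard {
  return (value: unknown): value is string =>
    typeof value === 'string' &&
    value.length <= 255 &&
    VAR_NAME_REGEX.test(value);
}

// Fixed factory: tests against the regex it was given.
function shortStringWithRegex(regex: RegExp): StringGuard {
  return (value: unknown): value is string =>
    typeof value === 'string' && value.length <= 255 && regex.test(value);
}

const buggyIsDockerName = buggyShortStringWithRegex(DOCKER_NAME_REGEX);
const fixedIsDockerName = shortStringWithRegex(DOCKER_NAME_REGEX);

// A service name with a dash in the middle.
const buggyResult = buggyIsDockerName('my-service'); // false
const fixedResult = fixedIsDockerName('my-service'); // true
```

With the buggy factory, any code that filters services through the guard silently drops names containing a dash, which matches the symptom the commit describes.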
flowzone-app[bot]
aa00727f45
v16.2.0 2024-04-05 02:35:17 +00:00
flowzone-app[bot]
1e025ec410
Merge pull request #2234 from balena-os/update-lock-during-state-apply
Update lock during state apply
2024-04-05 02:34:23 +00:00
Christina Ying Wang
7220e994dc Log takeLock and releaseLock steps as system events
Signed-off-by: Christina Ying Wang <christina@balena.io>
2024-04-04 14:07:47 -07:00
Christina Ying Wang
fd7d58f89a Clean up lockfiles on takeLock step failure
We don't want any Supervisor lockfiles to remain on the device
when a takeLock step fails because this would interfere with the user app.

Signed-off-by: Christina Ying Wang <christina@balena.io>
2024-04-04 14:07:47 -07:00
Christina Ying Wang
fb1bd33ab6 Refine update locking interface
* Remove Supervisor lockfile cleanup SIGTERM listener
* Modify lockfile.getLocksTaken to read files from the filesystem
* Remove in-memory tracking of locks taken in favor of filesystem
* Require both `(resin-)updates.lock` to be locked with `nobody` UID
  for service to count as locked by the Supervisor

Signed-off-by: Christina Ying Wang <christina@balena.io>
2024-04-04 14:07:47 -07:00
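Note on fb1bd33ab6: one reading of the last bullet is that a service counts as locked by the Supervisor only when both `updates.lock` and `resin-updates.lock` exist and are owned by the `nobody` user. The sketch below illustrates that rule over an in-memory list rather than the filesystem. `LockfileInfo`, `servicesLockedBySupervisor`, and the UID value 65534 (the usual `nobody` UID on Linux) are assumptions for the example.

```typescript
// Assumption: `nobody` has UID 65534, as on most Linux systems.
const NOBODY_UID = 65534;
const REQUIRED_LOCKFILES = ['updates.lock', 'resin-updates.lock'];

// Hypothetical record describing one lockfile found on disk.
interface LockfileInfo {
  service: string;
  fileName: string;
  uid: number;
}

// Returns the services for which every required lockfile exists
// and is owned by the nobody UID.
function servicesLockedBySupervisor(files: LockfileInfo[]): string[] {
  const found = new Map<string, Set<string>>();
  for (const file of files) {
    if (
      file.uid !== NOBODY_UID ||
      REQUIRED_LOCKFILES.indexOf(file.fileName) === -1
    ) {
      continue;
    }
    const names = found.get(file.service) ?? new Set<string>();
    names.add(file.fileName);
    found.set(file.service, names);
  }
  const locked: string[] = [];
  found.forEach((names, service) => {
    if (names.size === REQUIRED_LOCKFILES.length) {
      locked.push(service);
    }
  });
  return locked;
}

const lockedServices = servicesLockedBySupervisor([
  { service: 'api', fileName: 'updates.lock', uid: NOBODY_UID },
  { service: 'api', fileName: 'resin-updates.lock', uid: NOBODY_UID },
  { service: 'db', fileName: 'updates.lock', uid: NOBODY_UID },
  { service: 'db', fileName: 'resin-updates.lock', uid: 0 }, // user-owned lock
]);
```

Here only `api` is reported, because one of the `db` lockfiles belongs to a different UID and is therefore treated as a lock taken by the user app.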
Christina Ying Wang
10f294cf8e Add takeLock to state funnel
A takeLock step should be generated before any of the following steps:
* kill
* start
* stop
* updateMetadata
* restart
* handover

ALL services in an app will be locked for any of the above actions,
unless the action is generated through Supervisor API's
`POST /v2/applications/:appId/(start|stop|restart)-service` endpoints,
in which case only the target service will be locked.

A lock will be taken for a service before it starts by creating the
directory in /tmp before the Engine creates it through bind mounts.

Also, the commit simplifies the generation of service kill
steps from network/volume changes or removals.

Signed-off-by: Christina Ying Wang <christina@balena.io>
2024-04-04 14:07:47 -07:00
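Note on 10f294cf8e: the rule described above (a takeLock step before any lock-requiring action, covering all services in the app unless the request targets a single service) can be sketched as follows. This is a simplified model and not the state funnel's real code. `PlanStep`, `addTakeLock`, and the `targetOnly` flag are hypothetical.

```typescript
// Actions that the commit message lists as requiring a lock first.
const LOCK_REQUIRED_ACTIONS: ReadonlySet<string> = new Set([
  'kill',
  'start',
  'stop',
  'updateMetadata',
  'restart',
  'handover',
]);

// Hypothetical, simplified step shape.
interface PlanStep {
  action: string;
  services: string[];
}

// Prepends a takeLock step when any step needs a lock that is not yet held.
// With targetOnly set (modelling the per-service API endpoints), only the
// services named in the steps are locked; otherwise the whole app is.
function addTakeLock(
  steps: PlanStep[],
  appServices: string[],
  alreadyLocked: ReadonlySet<string>,
  targetOnly: boolean = false,
): PlanStep[] {
  const wanted: string[] = [];
  let lockNeeded = false;
  for (const step of steps) {
    if (!LOCK_REQUIRED_ACTIONS.has(step.action)) {
      continue;
    }
    lockNeeded = true;
    for (const service of step.services) {
      if (wanted.indexOf(service) === -1) {
        wanted.push(service);
      }
    }
  }
  if (!lockNeeded) {
    return steps;
  }
  const toLock = (targetOnly ? wanted : appServices).filter(
    (service) => !alreadyLocked.has(service),
  );
  if (toLock.length === 0) {
    return steps;
  }
  return [{ action: 'takeLock', services: toLock }, ...steps];
}

const restartApi: PlanStep[] = [{ action: 'restart', services: ['api'] }];
const appWide = addTakeLock(restartApi, ['api', 'db'], new Set<string>());
const targeted = addTakeLock(restartApi, ['api', 'db'], new Set<string>(), true);
const noExtraStep = addTakeLock(
  restartApi,
  ['api', 'db'],
  new Set<string>(['api', 'db']),
);
```

In this model `appWide` begins with a takeLock step for both `api` and `db`, `targeted` locks only `api`, and `noExtraStep` is returned unchanged because every service is already locked.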
Christina Ying Wang
cf8d8cedd7 Simplify lock interface to prep for adding takeLock to state funnel
This commit changes a few things:

* Pass `force` to `takeLock` step directly. This allows us to remove
the `lockFn` used by app manager's action executors, setting takeLock
as the main interface to interact with the update lock module. Note
that this commit by itself will not pass tests, as no update locking
occurs where it once did. This will be amended in the next commit.

* Remove locking functions from doRestart & doPurge, as this is
the only area where skipLock is required.

* Remove `skipLock` interface, as it's redundant with the functionality
of `force`. The only time `skipLock` is true is in doRestart/doPurge,
as those API methods are already run within a lock function. We removed
the lock function which removes the need for skipLock, and in the next
commit we'll add locking as a composition step to replace the
functionality removed here.

* Remove some methods not in use, such as app manager's `stopAll`.

Signed-off-by: Christina Ying Wang <christina@balena.io>
2024-04-04 14:07:47 -07:00
Christina Ying Wang
2f728ee43e Change lock directory in tests to tmpfs
This prevents leftover lockfiles from interfering with tests
in between test runs.

Signed-off-by: Christina Ying Wang <christina@balena.io>
2024-04-04 14:07:47 -07:00
Christina Ying Wang
af6359f7ae Take lock before updating service metadata
Change-type: minor
Signed-off-by: Christina Ying Wang <christina@balena.io>
2024-04-04 14:07:47 -07:00
Christina Ying Wang
e6df78a22b Implement takeLock composition step + tests
This commit only implements the action that a takeLock step
results in. It does not add takeLock step generation logic
to the state funnel yet.

Signed-off-by: Christina Ying Wang <christina@balena.io>
2024-04-04 14:07:47 -07:00
Christina Ying Wang
f2843e1382 Add update lock release functionality to state funnel
releaseLock is a step that will be inferred if there are services
in target state, and if some of those services have locks taken by
the Supervisor.

The releaseLock composition step calls the method of the same name
in the updateLock module, which takes the exclusive process lock before
disposing all Supervisor lockfiles in the target appId.

This is half of the update lock incorporation into the state funnel, as
we also need to introduce a takeLock step which triggers during crucial
stages of device state transition.

Signed-off-by: Christina Ying Wang <christina@balena.io>
2024-04-04 14:07:47 -07:00
Christina Ying Wang
7cfc42e197 Separate rwlock functionality from update-lock for clarity
Signed-off-by: Christina Ying Wang <christina@balena.io>
2024-04-04 14:07:47 -07:00
Christina Ying Wang
d18a740a40 Add methods for easier checking of lockfile existence
Signed-off-by: Christina Ying Wang <christina@balena.io>
2024-04-04 14:07:47 -07:00
Christina Ying Wang
b9a6a6b685 Improve types & remove some lodash from state engine
Signed-off-by: Christina Ying Wang <christina@balena.io>
2024-04-04 14:07:47 -07:00
flowzone-app[bot]
4596149d0e
v16.1.10 2024-03-28 16:51:52 +00:00
Florin Sarbu
6ce8da3f45
Merge pull request #2258 from DynamicDevices/shreya/revpi-connect-4-support
Add revpi-connect-4 to RPi variants
2024-03-28 18:50:57 +02:00
Shreya Patel
b5dbef82d7 Add revpi-connect-4 to RPi variants
We need the supervisor to be able to manage config.txt changes for the
RevPi Connect 4.

Change-type: patch
Signed-off-by: Shreya Patel <shreya@dynamicdevices.co.uk>
2024-03-27 11:55:15 +00:00
flowzone-app[bot]
14e91779f4
v16.1.9 2024-03-25 18:36:46 +00:00
Page-
b5e098b249
Merge pull request #2254 from balena-os/report-failure-log-error
Log the full error on device state report failure as it is more useful
2024-03-25 18:36:02 +00:00
Pagan Gazzard
20e57f7f16 Log the full error on device state report failure as it is more useful
The message can be an empty string or similarly unhelpful. Logging the
entire error gives us whatever the message may be, along with the stack
trace and other info that will be helpful even when the message is not.

Change-type: patch
2024-03-25 15:17:09 -03:00
flowzone-app[bot]
073373192c
v16.1.8 2024-03-25 18:11:11 +00:00
flowzone-app[bot]
9b89b9ead5
Merge pull request #2255 from balena-os/es-version-es2022
Set @balena/es-version to es2022 to match tsconfig.json
2024-03-25 18:10:17 +00:00
Pagan Gazzard
6b0500cdbc Set @balena/es-version to es2022 to match tsconfig.json
Change-type: patch
2024-03-25 16:56:27 +00:00
flowzone-app[bot]
1e2d9a71f9
v16.1.7 2024-03-25 15:25:29 +00:00
Page-
3c1763c6a8
Merge pull request #2253 from balena-os/increase-auto-select-family-timeout
Increase the timeout for auto select family to 5000ms to avoid issues
2024-03-25 15:24:27 +00:00
Pagan Gazzard
5cd37e73ac Increase the timeout for auto select family to 5000ms to avoid issues
On slower networks the default of 250ms can cause problems, as all
attempts will fail rather than only the ones for interfaces that do not
actually work correctly. Increasing this timeout to 5000ms will help to
avoid these issues.

Change-type: patch
2024-03-25 15:05:13 +00:00
flowzone-app[bot]
7dd0323c2b
v16.1.6 2024-03-18 21:37:36 +00:00
flowzone-app[bot]
77e596cc13
Merge pull request #2252 from balena-os/pin-iptables-to-legacy
Pin iptables to 1.8.9 (legacy)
2024-03-18 21:36:20 +00:00
Christina Ying Wang
3d881347e7 Pin iptables to 1.8.9 (legacy)
With Alpine 3.19, iptables gets bumped to 1.8.10 which uses nftables.
The host OS still uses iptables 1.8.7 (legacy), and we should
use legacy as well until the OS uses nftables.

See: https://balena.zulipchat.com/#narrow/stream/345889-balena-io.2Fos/topic/iptables.20host.20vs.2E.20nftables.20Supervisor
Change-type: patch
Signed-off-by: Christina Ying Wang <christina@balena.io>
2024-03-18 14:15:24 -07:00
flowzone-app[bot]
8b173918ea
v16.1.5 2024-03-12 13:33:10 +00:00