4975 Commits

Author SHA1 Message Date
balena-renovate[bot]
9f24c16b7c
Update deep-object-diff to v1.1.9
Update deep-object-diff from 1.1.0 to 1.1.9

Change-type: patch
2025-05-28 16:10:43 +00:00
flowzone-app[bot]
f90d72231d
v17.1.0 v17.1.0 2025-05-28 15:42:43 +00:00
flowzone-app[bot]
c33ea5e474
Merge pull request #2417 from balena-os/target-state-cancellation
Support target state apply cancellation
2025-05-28 15:41:54 +00:00
Christina Ying Wang
8c69166271
Support target state apply cancellation
The current target state apply is cancelled when either:
- /v1/update is called with cancel: true
- A different target state is received from the cloud (with a non-304 status)

Following apply cancellation, a target state apply is re-triggered. This ensures
that the user can force a device out of a dead-locked situation where a long-running
task such as an image fetch fails to cede control back to the Supervisor, which is
the behavior observed in an Engine bug with infinite pull retries with a bad network.

Change-type: minor
Signed-off-by: Christina Ying Wang <christina@balena.io>
2025-05-28 07:46:10 -07:00
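The cancel-then-re-trigger flow described in 8c69166271 can be sketched as follows. This is a minimal illustration, not the Supervisor's code: `ApplyLoop`, `TargetState`, and the `applyTarget` callback are hypothetical names, and completion handling for a finished apply is left out.

```typescript
// Hypothetical target shape; the real target state is much richer.
type TargetState = { release: string };
type ApplyFn = (target: TargetState, signal: AbortSignal) => void;

class ApplyLoop {
  private inFlight: { target: TargetState; controller: AbortController } | null = null;
  private applyTarget: ApplyFn;

  constructor(applyTarget: ApplyFn) {
    this.applyTarget = applyTarget;
  }

  // Called when a target arrives from the cloud (non-304) or via /v1/update.
  trigger(target: TargetState, opts: { cancel?: boolean } = {}): void {
    if (this.inFlight) {
      const differs = this.inFlight.target.release !== target.release;
      if (!opts.cancel && !differs) {
        return; // same target and no explicit cancel: keep the running apply
      }
      this.inFlight.controller.abort(); // cancel the current apply
    }
    // Re-trigger an apply with a fresh controller.
    const controller = new AbortController();
    this.inFlight = { target, controller };
    this.applyTarget(target, controller.signal);
  }
}
```

Each apply gets its own `AbortController`, so aborting a stuck apply cannot affect the one started right after it.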
Christina Ying Wang
0af915d815
Pass AbortSignal to image pull functions
When abortController.abort() is called, this signal is passed down
to the functions that interface with Docker Engine for image pulls,
cancelling those pulls.

The next commit will limit when abortController.abort() is called.

Signed-off-by: Christina Ying Wang <christina@balena.io>
2025-05-27 11:09:19 -07:00
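As a rough illustration of 0af915d815, the snippet below shows one way an `AbortSignal` could be threaded down to a long-running pull. `PullStream` and `startPull` are hypothetical stand-ins for whatever the Engine client actually returns. Only `AbortController` and `AbortSignal` are real platform APIs here.

```typescript
// Hypothetical stand-in for an in-progress Engine pull.
interface PullStream {
  destroy(reason: Error): void;
}

function pullWithSignal(
  startPull: (image: string) => PullStream,
  image: string,
  signal: AbortSignal,
): PullStream {
  // Do not start work for an apply that has already been cancelled.
  if (signal.aborted) {
    throw new Error(`pull of ${image} aborted before start`);
  }
  const stream = startPull(image);
  // When abortController.abort() is called upstream, tear down the pull.
  signal.addEventListener(
    'abort',
    () => stream.destroy(new Error(`pull of ${image} aborted`)),
    { once: true },
  );
  return stream;
}
```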
Christina Ying Wang
4558b5a20b
Bump docker-progress to v5.3.1
Signed-off-by: Christina Ying Wang <christina@balena.io>
2025-05-27 11:09:09 -07:00
flowzone-app[bot]
1d34d90b2d
v17.0.5 v17.0.5 2025-05-27 15:32:23 +00:00
Ken Bannister
d3801b2092
Merge pull request #2420 from balena-os/adjust_used_memory
Exclude reclaimable slab memory from used memory metric
2025-05-27 11:31:40 -04:00
Ken Bannister
49e91a2639
Exclude reclaimable slab memory from used memory metric
Aligns metric value with used memory reported by the free and htop
utilities.

Change-type: patch
2025-05-25 11:51:06 -04:00
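To make the metric change in 49e91a2639 concrete, here is a small sketch that computes used memory from `/proc/meminfo`-style text with reclaimable slab excluded. The commit does not say how the Supervisor reads these values, so both the parsing and the exact formula (total minus free, buffers, page cache, and `SReclaimable`) are assumptions, modeled on the definition that some versions of `free` use.

```typescript
// Parse "Key:   12345 kB" lines into a map of KiB values.
function parseMeminfo(text: string): Record<string, number> {
  const out: Record<string, number> = {};
  for (const line of text.split('\n')) {
    const m = /^(\w+):\s+(\d+)/.exec(line);
    if (m) {
      out[m[1]] = Number(m[2]);
    }
  }
  return out;
}

// Assumed formula: reclaimable slab is treated like cache, not as used memory.
function usedMemoryKiB(info: Record<string, number>): number {
  const get = (key: string): number => info[key] ?? 0;
  return (
    get('MemTotal') -
    get('MemFree') -
    get('Buffers') -
    get('Cached') -
    get('SReclaimable')
  );
}
```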
flowzone-app[bot]
f3aa2a2c90
v17.0.4 v17.0.4 2025-05-22 18:47:30 +00:00
flowzone-app[bot]
58bd22511e
Merge pull request #2418 from balena-os/container-contracts-ignore-extra
Remove unsupported fields from contract requirements
2025-05-22 18:46:44 +00:00
Felipe Lalanne
4318272844
Remove unsupported fields from contract requirements
A contract including extra requirement fields, such as "name" would fail
validation. This PR removes any extra fields from the validated contract
to prevent services with these extra fields from getting rejected by the
contract validation.

Change-type: patch
2025-05-15 17:38:03 -04:00
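The idea in 4318272844 of dropping extra requirement fields before validation might look like this. The list of supported fields (`type`, `version`) is an assumption made for illustration. The commit only names `name` as an example of an unsupported field.

```typescript
// Hypothetical allow-list; the real set of supported fields may differ.
const SUPPORTED_FIELDS = ['type', 'version'] as const;

type Requirement = Record<string, unknown>;

// Return a copy of the requirement containing only supported fields.
function stripUnsupported(req: Requirement): Requirement {
  const out: Requirement = {};
  for (const key of SUPPORTED_FIELDS) {
    if (key in req) {
      out[key] = req[key];
    }
  }
  return out;
}
```

Stripping rather than rejecting means a service whose contract carries a stray field is validated on the fields that matter instead of being refused outright.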
flowzone-app[bot]
aecefd400a
v17.0.3 v17.0.3 2025-05-13 14:50:16 +00:00
flowzone-app[bot]
2cb201f808
Merge pull request #2416 from balena-os/container-contracts-refactor
Simplify contract validation module
2025-05-13 14:49:06 +00:00
Felipe Lalanne
7c83eaef80
Simplify contract validation module
Use `satisfiesChildContract` instead of Blueprints as the previous
implementation did.

Change-type: patch
2025-05-08 19:33:12 -04:00
flowzone-app[bot]
01585c688e
v17.0.2 v17.0.2 2025-04-02 20:16:11 +00:00
flowzone-app[bot]
eeac56efc3
Merge pull request #2415 from balena-os/local-leftover-locks
Fix search for app leftover locks
2025-04-02 20:15:12 +00:00
Felipe Lalanne
d475b1d830
Fix search for app leftover locks
The leftover locks search was creating an array rather than an object
keyed by the appId. This could affect the lock cleanup and make leftover
locks from one app affect the install of the app in local mode.

Change-type: patch
2025-04-01 17:56:06 -03:00
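The fix in d475b1d830 hinges on returning an object keyed by appId instead of a flat array. A minimal sketch of that grouping follows. `LeftoverLock` and the lock paths used with it are hypothetical.

```typescript
// Hypothetical record for a lock file found on disk.
interface LeftoverLock {
  appId: number;
  path: string;
}

// Group lock paths by the app they belong to, so that cleanup for one app
// never sees (or is blocked by) another app's leftover locks.
function groupLocksByApp(locks: LeftoverLock[]): Record<number, string[]> {
  const byApp: Record<number, string[]> = {};
  for (const { appId, path } of locks) {
    const list = byApp[appId] ?? [];
    list.push(path);
    byApp[appId] = list;
  }
  return byApp;
}
```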
flowzone-app[bot]
49b18b4a37
v17.0.1 v17.0.1 2025-03-25 20:41:22 +00:00
Christina Wang
623a1638c1
Merge pull request #2413 from balena-os/clarify-firewall-docs-on-host-network-containers
Clarify firewall docs on behavior with host network containers
2025-03-25 13:40:28 -07:00
Christina Ying Wang
caed4dcca0
Clarify firewall docs on behavior with host network containers
Change-type: patch
Signed-off-by: Christina Ying Wang <christina@balena.io>
2025-03-25 13:10:52 -07:00
flowzone-app[bot]
7efdeea0f7
v17.0.0 v17.0.0 2025-03-24 22:18:11 +00:00
flowzone-app[bot]
2d1871e16d
Merge pull request #2400 from balena-os/network-custom-ipam-label
Add Docker network label if custom ipam config
2025-03-24 22:17:12 +00:00
Christina Ying Wang
b596c77ce2
Add Docker network label if custom ipam config
In a target release where the only change is the addition or removal
of a custom ipam config, the Supervisor does not recreate the network
due to ignoring ipam config differences when comparing current and target
network (in network.isEqualConfig). This commit implements the addition of
a network label if the target compose object includes a network with custom
ipam. With the label, the Supervisor will detect a difference between a
network with a custom ipam and a network without, without needing to compare
the ipam configs themselves.

This is a major change, as devices running networks with custom ipam configs
will have their networks recreated to add the network label.

Closes: #2251
Change-type: major
See: https://balena.fibery.io/Work/Project/Fix-Supervisor-not-recreating-network-when-passed-custom-ipam-config-1127
Signed-off-by: Christina Ying Wang <christina@balena.io>
2025-03-24 14:55:19 -07:00
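The labeling approach in b596c77ce2 can be illustrated with a toy comparison that ignores ipam but does compare labels. The label key, the `NetworkConfig` shape, and this simplified `isEqualConfig` are all hypothetical. Only the idea of marking custom-ipam networks with a label comes from the commit message.

```typescript
// Hypothetical label key; the real one is not given in the commit message.
const CUSTOM_IPAM_LABEL = 'example.supervisor.custom-ipam';

interface NetworkConfig {
  driver: string;
  ipam?: { config: Array<{ subnet?: string; gateway?: string }> };
  labels: Record<string, string>;
}

// Add the marker label when the target network has a custom ipam config.
function withIpamLabel(net: NetworkConfig): NetworkConfig {
  const labels = { ...net.labels };
  if (net.ipam && net.ipam.config.length > 0) {
    labels[CUSTOM_IPAM_LABEL] = 'true';
  }
  return { ...net, labels };
}

function sortedLabels(net: NetworkConfig): string {
  const entries = Object.entries(net.labels).sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
  return JSON.stringify(entries);
}

// Simplified comparison: ipam is deliberately ignored, labels are not.
function isEqualConfig(a: NetworkConfig, b: NetworkConfig): boolean {
  return a.driver === b.driver && sortedLabels(a) === sortedLabels(b);
}
```

Without the label, a network with custom ipam and one without compare as equal. With it, they differ and the network is recreated.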
flowzone-app[bot]
8c6e3df7d9
v16.12.9 v16.12.9 2025-03-20 18:43:08 +00:00
flowzone-app[bot]
94cdd3fcd7
Merge pull request #2411 from balena-os/service-dependencies
Start a dependent if all dependencies are started
2025-03-20 18:42:13 +00:00
Felipe Lalanne
7764f98c9d
Start a dependent if all dependencies are started
The previous behavior required that dependencies were running before
starting the dependent service. As a result, services that depend on
a one-shot service would never get started, which goes against the default
Docker behavior.

Depending on a service to be running will require the implementation of
[long syntax depends_on](https://docs.docker.com/reference/compose-file/services/#long-syntax-1) and the condition
`service_healthy`.

Change-type: patch
Closes: #2409
2025-03-20 14:51:32 -03:00
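The rule introduced in 7764f98c9d, "started" rather than "running", can be sketched as a small predicate. The status names and the way a service's state is tracked here are assumptions for illustration only.

```typescript
// Hypothetical container states.
type ServiceStatus = 'created' | 'running' | 'exited';

// A service counts as started if it is running or has already run and exited,
// which is what lets a dependent start after a one-shot dependency finishes.
function hasStarted(status: ServiceStatus): boolean {
  return status === 'running' || status === 'exited';
}

function canStartDependent(
  dependsOn: string[],
  statuses: Record<string, ServiceStatus | undefined>,
): boolean {
  return dependsOn.every((name) => {
    const status = statuses[name];
    return status !== undefined && hasStarted(status);
  });
}
```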
flowzone-app[bot]
b8032edc04
v16.12.8 v16.12.8 2025-03-12 14:50:35 +00:00
flowzone-app[bot]
175872b358
Merge pull request #2408 from balena-os/fix-socket-timeout
Ensure poll socket timeout is defined early
2025-03-12 14:49:34 +00:00
Felipe Lalanne
ae337a1dd7
Remove GOT retries on state poll
The state poll already has retry implementation, making the GOT default
unnecessary.

Change-type: patch
2025-03-12 10:59:16 -03:00
Felipe Lalanne
bdbc6a4ba4
Ensure poll socket timeout is defined early
We have observed that even when setting the socket timeout on the
state poll https request, the timeout is only applied once the socket is
connected. This causes issues with Node's auto family selection (happy
eyeballs), as the default https timeout is 5s, which means that a larger
[auto select attempt timeout](https://nodejs.org/docs/latest-v22.x/api/net.html#netgetdefaultautoselectfamilyattempttimeout) may result in the socket timing out before all connection attempts have been tried.

This commit sets a different https Agent for state polling, with a
timeout matching the `apiRequestTimeout` used for other request events.

Change-type: patch
2025-03-12 10:59:11 -03:00
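A dedicated agent of the kind bdbc6a4ba4 describes could be set up roughly as below, using Node's built-in `https` module. The `timeout` option on an `Agent` is applied when the socket is created, not only once it is connected. The timeout value and the `pollOptions` helper are illustrative assumptions. The Supervisor itself issues the poll through GOT, which is not shown here.

```typescript
import * as https from 'node:https';

// Illustrative value in milliseconds; the real one comes from Supervisor config.
const apiRequestTimeout = 59_000;

// Separate agent for state polling, so its socket timeout is set at creation.
const pollAgent = new https.Agent({ timeout: apiRequestTimeout });

// Hypothetical helper building request options for a state poll.
function pollOptions(url: URL): https.RequestOptions {
  return {
    hostname: url.hostname,
    path: url.pathname,
    method: 'GET',
    agent: pollAgent,
    timeout: apiRequestTimeout,
  };
}
```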
flowzone-app[bot]
978652b292
v16.12.7 v16.12.7 2025-03-06 19:11:20 +00:00
flowzone-app[bot]
7771c0e96b
Merge pull request #2406 from balena-os/release-locks-on-app-remove
Release locks when removing apps
2025-03-06 19:10:38 +00:00
Felipe Lalanne
026dc0aed2
Release locks when removing apps
This prevents leftover locks that can prevent other operations from
taking place.

Change-type: patch
2025-03-06 11:50:31 -03:00
flowzone-app[bot]
5ef6b054fd
v16.12.6 v16.12.6 2025-03-04 14:25:09 +00:00
flowzone-app[bot]
3cca2b7ecd
Merge pull request #2404 from balena-os/polling-improvements
Polling improvements
2025-03-04 14:24:18 +00:00
Felipe Lalanne
3d8bd28f5a
Update GOT to v14.4.6
2025-03-04 10:46:47 -03:00
Felipe Lalanne
6d00be2093
Log non-API errors during state poll
The supervisor was failing silently if an error happened while establishing the
connection (e.g. requesting the socket).

Change-type: patch
2025-03-04 10:46:45 -03:00
Felipe Lalanne
f8bdb14335
Fix target poll healthcheck
The Target.lastFetch time checked by the healthcheck was reset any time
a poll was attempted, no matter the outcome. This changes the behavior
so that the time is reset only on a successful poll.

Change-type: patch
2025-03-04 10:45:31 -03:00
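The healthcheck fix in f8bdb14335 boils down to updating the timestamp only on success. A minimal sketch with an injected clock follows. `PollTracker` and its method names are hypothetical, and only the `lastFetch` idea comes from the commit.

```typescript
class PollTracker {
  private lastFetch: number;

  constructor(now: number) {
    this.lastFetch = now;
  }

  // Record the outcome of a poll attempt; failed attempts leave lastFetch alone.
  recordPoll(success: boolean, now: number): void {
    if (success) {
      this.lastFetch = now;
    }
  }

  // Healthy while the last successful fetch is recent enough.
  isHealthy(now: number, maxAgeMs: number): boolean {
    return now - this.lastFetch <= maxAgeMs;
  }
}
```

With the previous behavior, a device whose polls kept failing would still look healthy, because every attempt refreshed the timestamp.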
flowzone-app[bot]
c88cf6a259
v16.12.5 v16.12.5 2025-03-04 13:35:28 +00:00
Page-
906ce6dc0d
Merge pull request #2405 from balena-os/fix-api-request-timeout
Decrease balenaCloud api request timeout from 15m to 59s
2025-03-04 13:34:35 +00:00
Pagan Gazzard
49163e92a0
Decrease balenaCloud api request timeout from 15m to 59s
This was mistakenly increased due to confusion between the timeout for
requests to the supervisor's api vs the timeout for requests from the
supervisor to the balenaCloud api. This separates the two configs and
documents the difference between the timeouts whilst also decreasing
the timeout for balenaCloud api requests to the correct/expected value.

Change-type: patch
2025-03-04 12:29:18 +00:00
flowzone-app[bot]
f67e45f432
v16.12.4 v16.12.4 2025-03-03 13:42:20 +00:00
flowzone-app[bot]
91335051ac
Merge pull request #2403 from balena-os/dont-revert-to-regular-pull-if-401
Don't revert to regular pull if delta server 401
2025-03-03 13:41:29 +00:00
Christina Ying Wang
2dc9d275b1
Don't revert to regular pull if delta server 401
If the Supervisor receives a 401 Unauthorized from the delta server
when requesting a delta image location, we should surface the error
instead of falling back to a regular pull immediately, as there could
be an issue with the delta auth token, which refreshes after
DELTA_TOKEN_TIMEOUT (10min), or some other edge case.

Change-type: patch
Signed-off-by: Christina Ying Wang <christina@balena.io>
2025-02-24 16:17:15 -08:00
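Taken together with the earlier 4xx fallback commit (1fc242200f), the behavior in 2dc9d275b1 can be sketched as a single decision on the delta server's status code. The function name and the error name are hypothetical, and statuses outside the 4xx range are deliberately out of scope for this sketch.

```typescript
// Decide what to do when the delta server rejects a delta image request.
function onDeltaRequestRejected(status: number): 'regular-pull' {
  if (status < 400 || status > 499) {
    throw new RangeError('this sketch only handles 4xx responses');
  }
  if (status === 401) {
    // Possibly a stale delta auth token: surface the error instead of falling back.
    const err = new Error('delta server returned 401 Unauthorized');
    err.name = 'DeltaUnauthorizedError'; // hypothetical error name
    throw err;
  }
  // Other 4xx: the server cannot supply the delta, so use a regular pull.
  return 'regular-pull';
}
```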
flowzone-app[bot]
b6f0ecba18
v16.12.3 v16.12.3 2025-02-19 20:51:55 +00:00
flowzone-app[bot]
dd0253ff1f
Merge pull request #2396 from balena-os/switch-to-image-pull-if-delta-failure
Switch to image pull if delta failure
2025-02-19 20:50:58 +00:00
Christina Ying Wang
5936af37e7
Bump docker-progress to 5.2.4
Signed-off-by: Christina Ying Wang <christina@balena.io>
2025-02-12 13:49:09 -08:00
Christina Ying Wang
341111f1f9
Retry DELTA_APPLY_RETRY_COUNT (3) times during delta apply fail before reverting to regular pull
This prevents an image download error loop where the delta image on the delta server is present,
but some aspect of the delta image or the base image on the device does not match up, causing
the delta to fail to be applied to the base image.

Delta apply errors don't raise status codes as they are thrown from the Engine (although they should),
so if an error with a status code is raised during this time, throw an error to the handler
indicating that the delta should be retried until success. Errors with status codes raised during
this time are largely network related, so falling back to a regular pull won't improve anything.

Upon delta apply errors exceeding DELTA_APPLY_RETRY_COUNT, revert to a regular pull.

Change-type: patch
Signed-off-by: Christina Ying Wang <christina@balena.io>
2025-02-11 12:19:53 -08:00
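The retry policy described in 341111f1f9 can be sketched as a pure decision function. `DELTA_APPLY_RETRY_COUNT` and its value of 3 come from the commit message. The `statusCode` property, the result shape, and the exact boundary (fall back once failures exceed the count) are assumptions.

```typescript
const DELTA_APPLY_RETRY_COUNT = 3;

type DeltaError = Error & { statusCode?: number };
type Decision = 'retry-delta' | 'regular-pull';

function onDeltaApplyError(
  err: DeltaError,
  applyFailures: number,
): { decision: Decision; applyFailures: number } {
  if (err.statusCode !== undefined) {
    // Errors with a status code are largely network related: a regular pull
    // would not help, so keep retrying the delta without counting the failure.
    return { decision: 'retry-delta', applyFailures };
  }
  // No status code: treat as a delta apply failure raised by the Engine.
  const failures = applyFailures + 1;
  const decision: Decision =
    failures > DELTA_APPLY_RETRY_COUNT ? 'regular-pull' : 'retry-delta';
  return { decision, applyFailures: failures };
}
```

A caller would carry `applyFailures` between attempts and reset it once an image is downloaded successfully.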
Christina Ying Wang
1fc242200f
Revert to regular pull immediately on delta server failure (code 400s)
If the delta server responds immediately with HTTP 4xx upon requesting a delta image,
this means the server is not able to supply the resource, so fall back to a regular pull
immediately.

Change-type: patch
Signed-off-by: Christina Ying Wang <christina@balena.io>
2025-02-11 10:58:51 -08:00