Compare commits

48 Commits

Author SHA1 Message Date
flowzone-app[bot]
b8032edc04
v16.12.8 2025-03-12 14:50:35 +00:00
flowzone-app[bot]
175872b358
Merge pull request #2408 from balena-os/fix-socket-timeout
Ensure poll socket timeout is defined early
2025-03-12 14:49:34 +00:00
Felipe Lalanne
ae337a1dd7
Remove GOT retries on state poll
The state poll already has retry implementation, making the GOT default
unnecessary.

Change-type: patch
2025-03-12 10:59:16 -03:00
Felipe Lalanne
bdbc6a4ba4
Ensure poll socket timeout is defined early
We have observed that even when setting the socket timeout on the
state poll https request, the timeout is only applied once the socket is
connected. This causes issues with Node's auto family selection (happy
eyeballs), as the default https timeout is 5s which means that larger
[auto select attempt timeout](https://nodejs.org/docs/latest-v22.x/api/net.html#netgetdefaultautoselectfamilyattempttimeout) may result in the socket timing out before all connection attempts have been tried.

This commit sets a different https Agent for state polling, with a
timeout matching the `apiRequestTimeout` used for other request events.

Change-type: patch
2025-03-12 10:59:11 -03:00
flowzone-app[bot]
978652b292
v16.12.7 2025-03-06 19:11:20 +00:00
flowzone-app[bot]
7771c0e96b
Merge pull request #2406 from balena-os/release-locks-on-app-remove
Release locks when removing apps
2025-03-06 19:10:38 +00:00
Felipe Lalanne
026dc0aed2
Release locks when removing apps
This prevents leftover locks that can prevent other operations from
taking place.

Change-type: patch
2025-03-06 11:50:31 -03:00
flowzone-app[bot]
5ef6b054fd
v16.12.6 2025-03-04 14:25:09 +00:00
flowzone-app[bot]
3cca2b7ecd
Merge pull request #2404 from balena-os/polling-improvements
Polling improvements
2025-03-04 14:24:18 +00:00
Felipe Lalanne
3d8bd28f5a
Update GOT to v14.4.6 2025-03-04 10:46:47 -03:00
Felipe Lalanne
6d00be2093
Log non-API errors during state poll
The supervisor was failing silently if an error happened while establishing the
connection (e.g. requesting the socket).

Change-type: patch
2025-03-04 10:46:45 -03:00
Felipe Lalanne
f8bdb14335
Fix target poll healthcheck
The Target.lastFetch time compared when performing the healthcheck
resets any time a poll is attempted no matter the outcome. This changes
the behavior so the time is reset only on a successful poll

Change-type: patch
2025-03-04 10:45:31 -03:00
flowzone-app[bot]
c88cf6a259
v16.12.5 2025-03-04 13:35:28 +00:00
Page-
906ce6dc0d
Merge pull request #2405 from balena-os/fix-api-request-timeout
Decrease balenaCloud api request timeout from 15m to 59s
2025-03-04 13:34:35 +00:00
Pagan Gazzard
49163e92a0 Decrease balenaCloud api request timeout from 15m to 59s
This was mistakenly increased due to confusion between the timeout for
requests to the supervisor's api vs the timeout for requests from the
supervisor to the balenaCloud api. This separates the two configs and
documents the difference between the timeouts whilst also decreasing
the timeout for balenaCloud api requests to the correct/expected value

Change-type: patch
2025-03-04 12:29:18 +00:00
flowzone-app[bot]
f67e45f432
v16.12.4 2025-03-03 13:42:20 +00:00
flowzone-app[bot]
91335051ac
Merge pull request #2403 from balena-os/dont-revert-to-regular-pull-if-401
Don't revert to regular pull if delta server 401
2025-03-03 13:41:29 +00:00
Christina Ying Wang
2dc9d275b1 Don't revert to regular pull if delta server 401
If the Supervisor receives a 401 Unauthorized from the delta server
when requesting a delta image location, we should surface the error
instead of falling back to a regular pull immediately, as there could
be an issue with the delta auth token, which refreshes after
DELTA_TOKEN_TIMEOUT (10min), or some other edge case.

Change-type: patch
Signed-off-by: Christina Ying Wang <christina@balena.io>
2025-02-24 16:17:15 -08:00
flowzone-app[bot]
b6f0ecba18
v16.12.3 2025-02-19 20:51:55 +00:00
flowzone-app[bot]
dd0253ff1f
Merge pull request #2396 from balena-os/switch-to-image-pull-if-delta-failure
Switch to image pull if delta failure
2025-02-19 20:50:58 +00:00
Christina Ying Wang
5936af37e7 Bump docker-progress to 5.2.4
Signed-off-by: Christina Ying Wang <christina@balena.io>
2025-02-12 13:49:09 -08:00
Christina Ying Wang
341111f1f9 Retry DELTA_APPLY_RETRY_COUNT (3) times during delta apply fail before reverting to regular pull
This prevents an image download error loop where the delta image on the delta server is present,
but some aspect of the delta image or the base image on the device does not match up, causing
the delta to fail to be applied to the base image.

Delta apply errors don't raise status codes as they are thrown from the Engine (although they should),
so if an error with a status code is raised during this time, throw an error to the handler
indicating that the delta should be retried until success. Errors with status codes raised during
this time are largely network related, so falling back to a regular pull won't improve anything.

Upon delta apply errors exceeding DELTA_APPLY_RETRY_COUNT, revert to a regular pull.

Change-type: patch
Signed-off-by: Christina Ying Wang <christina@balena.io>
2025-02-11 12:19:53 -08:00
Christina Ying Wang
1fc242200f Revert to regular pull immediately on delta server failure (code 400s)
If the delta server responds immediately with HTTP 4xx upon requesting a delta image,
this means the server is not able to supply the resource, so fall back to a regular pull
immediately.

Change-type: patch
Signed-off-by: Christina Ying Wang <christina@balena.io>
2025-02-11 10:58:51 -08:00
flowzone-app[bot]
5c94c61b0a
v16.12.2 2025-02-11 01:04:24 +00:00
balena-renovate[bot]
43426a4a26
Merge pull request #2401 from balena-os/renovate/balena-io-deploy-to-balena-action-2.0.x
Update balena-io/deploy-to-balena-action action to v2.0.92
2025-02-11 01:03:38 +00:00
balena-renovate[bot]
c57622e226
Update balena-io/deploy-to-balena-action action to v2.0.92
Update balena-io/deploy-to-balena-action from 2.0.74 to 2.0.92

Change-type: patch
2025-02-11 00:32:00 +00:00
flowzone-app[bot]
5fca7c25bc
v16.12.1 2025-02-10 22:51:54 +00:00
flowzone-app[bot]
e901c38df0
Merge pull request #2399 from balena-os/dependency-updates
Dependency updates
2025-02-10 22:50:54 +00:00
Felipe Lalanne
f99e19f8a9
Update mocha-pod to v2.0.10 2025-02-10 15:54:25 -03:00
Felipe Lalanne
f4b1acba89
Pin deep-object-diff to v1.1.0
Newer patches of the package seem to change the interface causing the
code to no longer compile. More investigation is needed
2025-02-05 18:27:25 -03:00
Felipe Lalanne
88e821ed8e
Pin io-ts version to v2.2.20
gcanti/io-ts#705 fixes an issue with io-ts and non-enumerable
properties, but that results in objects with invalid properties to get
removed during `decode`, which breaks our validation tests.

Need to figure out what is the right behavior for us

Change-type: patch
2025-02-05 18:27:10 -03:00
Felipe Lalanne
58824066e0
Update more dependencies 2025-01-27 12:52:02 +00:00
Felipe Lalanne
f71f98777c
Update network-manager to v1
Change-type: patch
2025-01-23 23:40:52 -03:00
Felipe Lalanne
25e46574ab
Update development dependencies 2025-01-23 11:00:59 -03:00
Felipe Lalanne
52081ba15e
Update balena-request and balena-register-device
Change-type: patch
2025-01-23 10:16:39 -03:00
Felipe Lalanne
342a2d4dac
Update pinejs-client-request to v8
Change-type: patch
2025-01-23 10:07:32 -03:00
Felipe Lalanne
e474a9d95d
Update @balena/compose to v6 2025-01-22 17:16:14 -03:00
Felipe Lalanne
3a3889546d
Update chai utility modules
Updating chai will be done in a future PR as it requires overhauling all
tests since chai is now ESM

Change-type: patch
2025-01-22 10:43:45 -03:00
flowzone-app[bot]
3fbd98e218
v16.12.0 2025-01-20 22:14:38 +00:00
flowzone-app[bot]
84b9d869e1
Merge pull request #2398 from balena-os/node-22
Update supervisor to Node 22
2025-01-20 22:13:51 +00:00
Felipe Lalanne
85fc5784bc
Update contrato to v0.12.0
Change-type: patch
2025-01-15 18:56:24 -03:00
Felipe Lalanne
55f22dbc0f
Update alpine base image to 3.21
This allows to update Node to v22 on production supervisor images

Change-type: patch
2025-01-15 18:52:26 -03:00
Felipe Lalanne
ea594b18ab
Update Node support to v22
Updates @types/node and expands module support to v22.
Support for v20 will be removed on a future version.

Change-type: minor
2025-01-15 18:50:53 -03:00
flowzone-app[bot]
2637d997b6
v16.11.0 2025-01-14 18:15:59 +00:00
flowzone-app[bot]
bc306c1bc9
Merge pull request #2381 from balena-os/reboot-required
Add support for `io.balena.update.requires-reboot` label
2025-01-14 18:15:04 +00:00
Felipe Lalanne
e416ad0daf
Add support for io.balena.update.requires-reboot
This label can be used by user services to indicate that a reboot is
required after the install of a service in order to fully apply an update.

Change-type: minor
2025-01-14 11:20:35 -03:00
Felipe Lalanne
75127c6074
Move reboot breadcrumb check to device-state
This was on device-config before, but we'll need to set the reboot
breadcrumb from the application-manager as well when we introduce
`requires-reboot` as a label.

Change-type: patch
2025-01-09 14:31:55 -03:00
Felipe Lalanne
51f1fb0f30
Refactor device-config as part of device-state
Move the device-config module to the device-state folder and export only
those functions that are needed elsewhere in the codebase

This moves us closer to making the device-state module the only way to
modify application and configuration.

Change-type: patch
2025-01-09 14:31:43 -03:00
39 changed files with 2312 additions and 1239 deletions
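
The delta-related commits above (v16.12.3 and v16.12.4) describe a fallback policy in prose. A minimal sketch of that policy follows, based only on the commit messages. Apart from `DELTA_APPLY_RETRY_COUNT`, every name here is hypothetical, the exact retry boundary is an assumption, and status codes the messages do not mention are left as `unspecified`. It is not the supervisor's actual implementation.

```typescript
// Illustrative sketch; only DELTA_APPLY_RETRY_COUNT (3) comes from the commit messages.
type DeltaDecision =
	| 'surface-error'
	| 'regular-pull'
	| 'retry-delta'
	| 'unspecified';

const DELTA_APPLY_RETRY_COUNT = 3;

// Failure while requesting the delta image location from the delta server.
function onDeltaRequestFailure(statusCode: number): DeltaDecision {
	if (statusCode === 401) {
		// Possibly a stale delta auth token: surface the error, do not fall back.
		return 'surface-error';
	}
	if (statusCode >= 400 && statusCode < 500) {
		// The server cannot supply the resource: fall back immediately.
		return 'regular-pull';
	}
	// Other codes are not covered by the commit messages.
	return 'unspecified';
}

// Failure while applying the delta. `applyFailures` counts apply errors so far,
// including the current one (assumed counting convention).
function onDeltaApplyFailure(
	applyFailures: number,
	statusCode?: number,
): DeltaDecision {
	if (statusCode !== undefined) {
		// Errors carrying a status code are described as largely network related.
		return 'retry-delta';
	}
	return applyFailures > DELTA_APPLY_RETRY_COUNT
		? 'regular-pull'
		: 'retry-delta';
}
```

Under these assumptions a 404 from the delta server falls back at once, a 401 is reported, and an apply error without a status code falls back on the fourth failure.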

View File

@ -13,7 +13,7 @@ inputs:
  runs:
  using: 'composite'
  steps:
- - uses: balena-io/deploy-to-balena-action@72b7652cd8b4b0b49376f60fe790eef9ba76e3f0 # v2.0.74
+ - uses: balena-io/deploy-to-balena-action@3cb4217ab3347a885b4fcdc44d5f3a4153145633 # v2.0.92
  with:
  balena_token: ${{ fromJSON(inputs.secrets).BALENA_STAGING_TOKEN }}
  fleet: ${{ env.matrix_value }}

View File

@ -13,7 +13,7 @@ inputs:
  runs:
  using: "composite"
  steps:
- - uses: balena-io/deploy-to-balena-action@72b7652cd8b4b0b49376f60fe790eef9ba76e3f0 # v2.0.74
+ - uses: balena-io/deploy-to-balena-action@3cb4217ab3347a885b4fcdc44d5f3a4153145633 # v2.0.92
  with:
  balena_token: ${{ fromJSON(inputs.secrets).BALENA_STAGING_TOKEN }}
  fleet: ${{ env.matrix_value }}

View File

@ -1,3 +1,301 @@
- commits:
- subject: Remove GOT retries on state poll
hash: ae337a1dd7743b0ee0a05c32a5ce01965c5bafef
body: |
The state poll already has retry implementation, making the GOT default
unnecessary.
footer:
Change-type: patch
change-type: patch
author: Felipe Lalanne
nested: []
- subject: Ensure poll socket timeout is defined early
hash: bdbc6a4ba4766f9466891497bc02bd33aff1d4c7
body: |
We have observed that even when setting the socket timeout on the
state poll https request, the timeout is only applied once the socket is
connected. This causes issues with Node's auto family selection (happy
eyeballs), as the default https timeout is 5s which means that larger
[auto select attempt timeout](https://nodejs.org/docs/latest-v22.x/api/net.html#netgetdefaultautoselectfamilyattempttimeout) may result in the socket timing out before all connection attempts have been tried.
This commit sets a different https Agent for state polling, with a
timeout matching the `apiRequestTimeout` used for other request events.
footer:
Change-type: patch
change-type: patch
author: Felipe Lalanne
nested: []
version: 16.12.8
title: ""
date: 2025-03-12T14:50:33.204Z
- commits:
- subject: Release locks when removing apps
hash: 026dc0aed29ce7d66cfdd8616d80d1f5daf3ad46
body: |
This prevents leftover locks that can prevent other operations from
taking place.
footer:
Change-type: patch
change-type: patch
author: Felipe Lalanne
nested: []
version: 16.12.7
title: ""
date: 2025-03-06T19:11:18.704Z
- commits:
- subject: Log non-API errors during state poll
hash: 6d00be20930398699da1006176dac1e81b2dbbd6
body: >
The supervisor was failing silently if an error happened while
establishing the
connection (e.g. requesting the socket).
footer:
Change-type: patch
change-type: patch
author: Felipe Lalanne
nested: []
- subject: Fix target poll healthcheck
hash: f8bdb1433508dcaeff12a78d746256041ba1c414
body: |
The Target.lastFetch time compared when performing the healthcheck
resets any time a poll is attempted no matter the outcome. This changes
the behavior so the time is reset only on a successful poll
footer:
Change-type: patch
change-type: patch
author: Felipe Lalanne
nested: []
version: 16.12.6
title: ""
date: 2025-03-04T14:25:06.565Z
- commits:
- subject: Decrease balenaCloud api request timeout from 15m to 59s
hash: 49163e92a013250f72ca7231e11945b465c4dd45
body: |
This was mistakenly increased due to confusion between the timeout for
requests to the supervisor's api vs the timeout for requests from the
supervisor to the balenaCloud api. This separates the two configs and
documents the difference between the timeouts whilst also decreasing
the timeout for balenaCloud api requests to the correct/expected value
footer:
Change-type: patch
change-type: patch
author: Pagan Gazzard
nested: []
version: 16.12.5
title: ""
date: 2025-03-04T13:35:26.801Z
- commits:
- subject: Don't revert to regular pull if delta server 401
hash: 2dc9d275b15a0802264bcd49e2f0dddbbadd2225
body: |
If the Supervisor receives a 401 Unauthorized from the delta server
when requesting a delta image location, we should surface the error
instead of falling back to a regular pull immediately, as there could
be an issue with the delta auth token, which refreshes after
DELTA_TOKEN_TIMEOUT (10min), or some other edge case.
footer:
Change-type: patch
change-type: patch
Signed-off-by: Christina Ying Wang <christina@balena.io>
signed-off-by: Christina Ying Wang <christina@balena.io>
author: Christina Ying Wang
nested: []
version: 16.12.4
title: ""
date: 2025-03-03T13:42:18.045Z
- commits:
- subject: Retry DELTA_APPLY_RETRY_COUNT (3) times during delta apply fail before
reverting to regular pull
hash: 341111f1f94cd9f17fd7be9b6f21e3bc22c9ad3a
body: >
This prevents an image download error loop where the delta image on the
delta server is present,
but some aspect of the delta image or the base image on the device does
not match up, causing
the delta to fail to be applied to the base image.
Delta apply errors don't raise status codes as they are thrown from the
Engine (although they should),
so if an error with a status code is raised during this time, throw an
error to the handler
indicating that the delta should be retried until success. Errors with
status codes raised during
this time are largely network related, so falling back to a regular pull
won't improve anything.
Upon delta apply errors exceeding DELTA_APPLY_RETRY_COUNT, revert to a
regular pull.
footer:
Change-type: patch
change-type: patch
Signed-off-by: Christina Ying Wang <christina@balena.io>
signed-off-by: Christina Ying Wang <christina@balena.io>
author: Christina Ying Wang
nested: []
- subject: Revert to regular pull immediately on delta server failure (code 400s)
hash: 1fc242200f78e4219aafc5bb91de8cf0916236af
body: >
If the delta server responds immediately with HTTP 4xx upon requesting a
delta image,
this means the server is not able to supply the resource, so fall back
to a regular pull
immediately.
footer:
Change-type: patch
change-type: patch
Signed-off-by: Christina Ying Wang <christina@balena.io>
signed-off-by: Christina Ying Wang <christina@balena.io>
author: Christina Ying Wang
nested: []
version: 16.12.3
title: ""
date: 2025-02-19T20:51:53.085Z
- commits:
- subject: Update balena-io/deploy-to-balena-action action to v2.0.92
hash: c57622e2264e41078e907d6ba8de9d5206bb6293
body: |
Update balena-io/deploy-to-balena-action from 2.0.74 to 2.0.92
footer:
Change-type: patch
change-type: patch
author: balena-renovate[bot]
nested: []
version: 16.12.2
title: ""
date: 2025-02-11T01:04:22.736Z
- commits:
- subject: Pin io-ts version to v2.2.20
hash: 88e821ed8e36e10d6429dc31950b5aeed968aa3f
body: |
gcanti/io-ts#705 fixes an issue with io-ts and non-enumerable
properties, but that results in objects with invalid properties to get
removed during `decode`, which breaks our validation tests.
Need to figure out what is the right behavior for us
footer:
Change-type: patch
change-type: patch
author: Felipe Lalanne
nested: []
- subject: Update network-manager to v1
hash: f71f98777cbf7198745f1dcb8467b8cc62719d85
body: ""
footer:
Change-type: patch
change-type: patch
author: Felipe Lalanne
nested: []
- subject: Update balena-request and balena-register-device
hash: 52081ba15e84be794a906d5cbccc343b24748bba
body: ""
footer:
Change-type: patch
change-type: patch
author: Felipe Lalanne
nested: []
- subject: Update pinejs-client-request to v8
hash: 342a2d4dac737274ab13a8b05eac0f1f036a5075
body: ""
footer:
Change-type: patch
change-type: patch
author: Felipe Lalanne
nested: []
- subject: Update chai utility modules
hash: 3a3889546d8546793914bc2b5da10e202ebb14b1
body: |
Updating chai will be done in a future PR as it requires overhauling all
tests since chai is now ESM
footer:
Change-type: patch
change-type: patch
author: Felipe Lalanne
nested: []
version: 16.12.1
title: ""
date: 2025-02-10T22:51:51.632Z
- commits:
- subject: Update contrato to v0.12.0
hash: 85fc5784bcd187d086bffbd0c2167ce9eb34650f
body: ""
footer:
Change-type: patch
change-type: patch
author: Felipe Lalanne
nested: []
- subject: Update alpine base image to 3.21
hash: 55f22dbc0f4792033b6253af89c6adde6a727ab0
body: |
This allows to update Node to v22 on production supervisor images
footer:
Change-type: patch
change-type: patch
author: Felipe Lalanne
nested: []
- subject: Update Node support to v22
hash: ea594b18abb6b82f498071e50f71422dedb5b280
body: |
Updates @types/node and expands module support to v22.
Support for v20 will be removed on a future version.
footer:
Change-type: minor
change-type: minor
author: Felipe Lalanne
nested: []
version: 16.12.0
title: ""
date: 2025-01-20T22:14:35.646Z
- commits:
- subject: Add support for `io.balena.update.requires-reboot`
hash: e416ad0daf61fba14cd8c2012c5b2f66d8fb5f4a
body: >
This label can be used by user services to indicate that a reboot is
required after the install of a service in order to fully apply an
update.
footer:
Change-type: minor
change-type: minor
author: Felipe Lalanne
nested: []
- subject: Move reboot breadcrumb check to device-state
hash: 75127c6074531fd20199ed07d6860687b4105cfb
body: |
This was on device-config before, but we'll need to set the reboot
breadcrumb from the application-manager as well when we introduce
`requires-reboot` as a label.
footer:
Change-type: patch
change-type: patch
author: Felipe Lalanne
nested: []
- subject: Refactor device-config as part of device-state
hash: 51f1fb0f30e04ece6a00d2d8b9420b49703a2fde
body: |
Move the device-config module to the device-state folder and export only
those functions that are needed elsewhere in the codebase
This moves us closer to making the device-state module the only way to
modify application and configuration.
footer:
Change-type: patch
change-type: patch
author: Felipe Lalanne
nested: []
version: 16.11.0
title: ""
date: 2025-01-14T18:15:55.879Z
- commits:
- subject: Update systeminformation to v5.23.8 [SECURITY]
hash: 92b26c7ae2d8d329be18806abe24ab312e92db68

View File

@ -4,6 +4,67 @@ All notable changes to this project will be documented in this file
automatically by Versionist. DO NOT EDIT THIS FILE MANUALLY!
This project adheres to [Semantic Versioning](http://semver.org/).
# v16.12.8
## (2025-03-12)
* Remove GOT retries on state poll [Felipe Lalanne]
* Ensure poll socket timeout is defined early [Felipe Lalanne]
# v16.12.7
## (2025-03-06)
* Release locks when removing apps [Felipe Lalanne]
# v16.12.6
## (2025-03-04)
* Log non-API errors during state poll [Felipe Lalanne]
* Fix target poll healthcheck [Felipe Lalanne]
# v16.12.5
## (2025-03-04)
* Decrease balenaCloud api request timeout from 15m to 59s [Pagan Gazzard]
# v16.12.4
## (2025-03-03)
* Don't revert to regular pull if delta server 401 [Christina Ying Wang]
# v16.12.3
## (2025-02-19)
* Retry DELTA_APPLY_RETRY_COUNT (3) times during delta apply fail before reverting to regular pull [Christina Ying Wang]
* Revert to regular pull immediately on delta server failure (code 400s) [Christina Ying Wang]
# v16.12.2
## (2025-02-11)
* Update balena-io/deploy-to-balena-action action to v2.0.92 [balena-renovate[bot]]
# v16.12.1
## (2025-02-10)
* Pin io-ts version to v2.2.20 [Felipe Lalanne]
* Update network-manager to v1 [Felipe Lalanne]
* Update balena-request and balena-register-device [Felipe Lalanne]
* Update pinejs-client-request to v8 [Felipe Lalanne]
* Update chai utility modules [Felipe Lalanne]
# v16.12.0
## (2025-01-20)
* Update contrato to v0.12.0 [Felipe Lalanne]
* Update alpine base image to 3.21 [Felipe Lalanne]
* Update Node support to v22 [Felipe Lalanne]
# v16.11.0
## (2025-01-14)
* Add support for `io.balena.update.requires-reboot` [Felipe Lalanne]
* Move reboot breadcrumb check to device-state [Felipe Lalanne]
* Refactor device-config as part of device-state [Felipe Lalanne]
# v16.10.3
## (2024-12-20)

View File

@ -1,8 +1,8 @@
  ARG ARCH=%%BALENA_ARCH%%
  ARG FATRW_VERSION=0.2.21
- ARG NODE="nodejs~=20"
+ ARG NODE="nodejs~=22"
  ARG NPM="npm~=10"
- ARG ALPINE_VERSION="3.19"
+ ARG ALPINE_VERSION="3.21"
  ###################################################
  # Build the supervisor dependencies

View File

@ -1 +1 @@
- 16.10.3
+ 16.12.8

View File

@ -2,6 +2,6 @@ name: balena-supervisor
  description: 'Balena Supervisor: balena''s agent on devices.'
  joinable: false
  type: sw.application
- version: 16.10.3
+ version: 16.12.8
  provides:
  - slug: sw.compose.long-volume-syntax

package-lock.json (generated), 2440 changes: file diff suppressed because it is too large

View File

@ -1,7 +1,7 @@
  {
  "name": "balena-supervisor",
  "description": "This is balena's Supervisor, a program that runs on IoT devices and has the task of running user Apps (which are Docker containers), and updating them as the balena API informs it to.",
- "version": "16.10.3",
+ "version": "16.12.8",
  "license": "Apache-2.0",
  "repository": {
  "type": "git",
@ -35,17 +35,18 @@
  "sqlite3": "^5.1.6"
  },
  "engines": {
- "node": ">=20 <21",
+ "node": ">=20 <23",
  "npm": ">=10"
  },
  "devDependencies": {
- "@balena/compose": "^3.2.1",
- "@balena/contrato": "^0.9.4",
+ "@balena/compose": "^6.0.0",
+ "@balena/contrato": "^0.12.0",
  "@balena/es-version": "^1.0.3",
  "@balena/lint": "^8.0.2",
+ "@balena/sbvr-types": "^9.1.0",
  "@types/bluebird": "^3.5.42",
- "@types/chai": "^4.3.14",
- "@types/chai-as-promised": "^7.1.8",
+ "@types/chai": "^4.3.20",
+ "@types/chai-as-promised": "^8.0.1",
  "@types/chai-like": "^1.1.3",
  "@types/chai-things": "0.0.38",
  "@types/common-tags": "^1.8.4",
@ -57,7 +58,7 @@
  "@types/memoizee": "^0.4.11",
  "@types/mocha": "^10.0.6",
  "@types/morgan": "^1.9.9",
- "@types/node": "^20.12.7",
+ "@types/node": "^22.10.6",
  "@types/request": "^2.48.12",
  "@types/rewire": "^2.5.30",
  "@types/rwlock": "^5.0.6",
@ -70,71 +71,72 @@
  "@types/webpack": "^5.28.5",
  "@types/yargs": "^17.0.32",
  "balena-auth": "^6.0.1",
- "balena-register-device": "^9.0.2",
- "balena-request": "^13.3.1",
+ "balena-register-device": "^9.0.4",
+ "balena-request": "^14.0.1",
  "blinking": "^1.0.1",
  "bluebird": "^3.7.2",
- "chai": "^4.3.4",
- "chai-as-promised": "^7.1.1",
- "chai-like": "^1.1.1",
+ "chai": "^4.5.0",
+ "chai-as-promised": "^8.0.1",
+ "chai-like": "^1.1.3",
  "chai-things": "^0.2.0",
- "chokidar": "^3.5.1",
+ "chokidar": "^4.0.3",
  "common-tags": "^1.8.0",
  "copy-webpack-plugin": "^12.0.0",
- "deep-object-diff": "^1.1.0",
+ "deep-object-diff": "1.1.0",
  "docker-delta": "^4.1.0",
- "docker-progress": "^5.2.3",
+ "docker-progress": "^5.2.4",
  "dockerode": "^4.0.2",
  "duration-js": "^4.0.0",
- "express": "^4.17.1",
+ "express": "^4.21.2",
  "fork-ts-checker-webpack-plugin": "^9.0.2",
  "fp-ts": "^2.16.5",
- "got": "14.4.1",
- "husky": "^9.0.11",
- "io-ts": "^2.2.20",
+ "got": "^14.4.6",
+ "husky": "^9.1.7",
+ "io-ts": "2.2.20",
  "io-ts-reporters": "^2.0.1",
  "json-mask": "^2.0.0",
  "JSONStream": "^1.3.5",
  "knex": "^3.1.0",
- "lint-staged": "^15.2.2",
+ "lint-staged": "^15.4.3",
  "livepush": "^3.5.1",
  "lodash": "^4.17.21",
  "mdns-resolver": "1.1.0",
  "memoizee": "^0.4.14",
  "mocha": "^10.4.0",
- "mocha-pod": "^2.0.5",
+ "mocha-pod": "^2.0.10",
  "morgan": "^1.10.0",
- "network-checker": "^0.1.1",
- "nock": "^13.1.2",
- "node-loader": "^2.0.0",
- "nodemon": "^3.1.0",
- "pinejs-client-request": "^7.3.5",
+ "network-checker": "^1.0.2",
+ "nock": "^13.5.6",
+ "node-loader": "^2.1.0",
+ "nodemon": "^3.1.9",
+ "pinejs-client-core": "^7.2.0",
+ "pinejs-client-request": "^8.0.1",
  "pretty-ms": "^7.0.1",
  "request": "^2.88.2",
  "resumable-request": "^2.0.1",
  "rewire": "^7.0.0",
- "rimraf": "^5.0.0",
+ "rimraf": "^5.0.10",
  "rwlock": "^5.0.0",
  "semver": "7.6.3",
- "shell-quote": "^1.7.2",
+ "shell-quote": "^1.8.2",
  "sinon": "^18.0.0",
  "sinon-chai": "^3.7.0",
  "strict-event-emitter-types": "^2.0.0",
  "supertest": "^7.0.0",
- "systeminformation": "^5.22.7",
+ "systeminformation": "^5.25.11",
  "tar-stream": "^3.1.7",
- "terser-webpack-plugin": "^5.3.6",
- "ts-loader": "^9.4.0",
+ "terser-webpack-plugin": "^5.3.11",
+ "ts-loader": "^9.5.2",
  "ts-node": "^10.0.0",
- "tsconfig-paths": "^4.1.0",
+ "tsconfig-paths": "^4.2.0",
  "typed-error": "^3.2.1",
- "typescript": "^5.5.4",
- "webpack": "^5.74.0",
- "webpack-cli": "^5.0.0",
- "winston": "^3.3.3",
+ "typescript": "^5.7.3",
+ "webpack": "^5.97.1",
+ "webpack-cli": "^5.1.4",
+ "winston": "^3.17.0",
  "yargs": "^17.7.2"
  },
  "versionist": {
- "publishedAt": "2024-12-20T20:43:23.891Z"
+ "publishedAt": "2025-03-12T14:50:33.763Z"
  }
  }

View File

@ -5,7 +5,6 @@ import _ from 'lodash';
  import type { PinejsClientRequest } from 'pinejs-client-request';
  import * as config from '../config';
- import * as deviceConfig from '../device-config';
  import * as eventTracker from '../event-tracker';
  import { loadBackupFromMigration } from '../lib/migration';
@ -64,7 +63,7 @@ export async function healthcheck() {
  }
  // Check last time target state has been polled
- const timeSinceLastFetch = process.hrtime(TargetState.lastFetch);
+ const timeSinceLastFetch = process.hrtime(TargetState.lastSuccessfulFetch);
  const timeSinceLastFetchMs =
  timeSinceLastFetch[0] * 1000 + timeSinceLastFetch[1] / 1e6;
@ -332,10 +331,10 @@ async function reportInitialEnv(
  );
  }
- const defaultConfig = deviceConfig.getDefaults();
- const currentConfig = await deviceConfig.getCurrent();
- const targetConfig = deviceConfig.formatConfigKeys(targetConfigUnformatted);
+ const defaultConfig = deviceState.getDefaultConfig();
+ const currentConfig = await deviceState.getCurrentConfig();
+ const targetConfig = deviceState.formatConfigKeys(targetConfigUnformatted);
  if (!currentConfig) {
  throw new InternalInconsistencyError(

View File

@ -3,6 +3,7 @@ import url from 'url';
import { setTimeout } from 'timers/promises'; import { setTimeout } from 'timers/promises';
import Bluebird from 'bluebird'; import Bluebird from 'bluebird';
import type StrictEventEmitter from 'strict-event-emitter-types'; import type StrictEventEmitter from 'strict-event-emitter-types';
import { Agent } from 'https';
import type { TargetState } from '../types/state'; import type { TargetState } from '../types/state';
import { InternalInconsistencyError } from '../lib/errors'; import { InternalInconsistencyError } from '../lib/errors';
@ -87,7 +88,8 @@ const emitTargetState = (
* We set a value rather then being undeclared because having it undefined * We set a value rather then being undeclared because having it undefined
* adds more overhead to dealing with this value without any benefits. * adds more overhead to dealing with this value without any benefits.
*/ */
export let lastFetch: ReturnType<typeof process.hrtime> = process.hrtime(); export let lastSuccessfulFetch: ReturnType<typeof process.hrtime> =
process.hrtime();
/** /**
* Attempts to update the target state * Attempts to update the target state
@ -101,11 +103,11 @@ export const update = async (
): Promise<void> => { ): Promise<void> => {
await config.initialized(); await config.initialized();
return Bluebird.using(lockGetTarget(), async () => { return Bluebird.using(lockGetTarget(), async () => {
const { uuid, apiEndpoint, apiTimeout, deviceApiKey } = const { uuid, apiEndpoint, apiRequestTimeout, deviceApiKey } =
await config.getMany([ await config.getMany([
'uuid', 'uuid',
'apiEndpoint', 'apiEndpoint',
'apiTimeout', 'apiRequestTimeout',
'deviceApiKey', 'deviceApiKey',
]); ]);
@ -119,6 +121,13 @@ export const update = async (
const got = await getGotInstance(); const got = await getGotInstance();
const { statusCode, headers, body } = await got(endpoint, { const { statusCode, headers, body } = await got(endpoint, {
retry: { limit: 0 },
agent: {
https: new Agent({
keepAlive: true,
timeout: apiRequestTimeout,
}),
},
headers: { headers: {
Authorization: `Bearer ${deviceApiKey}`, Authorization: `Bearer ${deviceApiKey}`,
'If-None-Match': cache?.etag, 'If-None-Match': cache?.etag,
@ -126,12 +135,12 @@ export const update = async (
timeout: { timeout: {
// TODO: We use the same default timeout for all of these in order to have a timeout generally // TODO: We use the same default timeout for all of these in order to have a timeout generally
// but it would probably make sense to tune them individually // but it would probably make sense to tune them individually
lookup: apiTimeout, lookup: apiRequestTimeout,
connect: apiTimeout, connect: apiRequestTimeout,
secureConnect: apiTimeout, secureConnect: apiRequestTimeout,
socket: apiTimeout, socket: apiRequestTimeout,
send: apiTimeout, send: apiRequestTimeout,
response: apiTimeout, response: apiRequestTimeout,
}, },
}); });
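A minimal sketch of the request options this hunk assembles, assuming Node's `https.Agent` and the option names shown in the diff. `buildPollOptions` is a hypothetical helper that is not part of the change; it only illustrates that the agent-level `timeout` is applied when the socket is created, while the per-phase `timeout` values all reuse the same `apiRequestTimeout`.

```typescript
import { Agent } from 'node:https';

// Hypothetical helper (not in the diff): builds the options object that the
// hunk above passes to the state poll request.
function buildPollOptions(
	apiRequestTimeout: number,
	deviceApiKey: string,
	etag?: string,
) {
	return {
		// The poll loop already retries with its own backoff
		retry: { limit: 0 },
		agent: {
			// The agent-level timeout is set when the socket is created,
			// so it applies before the connection is established
			https: new Agent({ keepAlive: true, timeout: apiRequestTimeout }),
		},
		headers: {
			Authorization: `Bearer ${deviceApiKey}`,
			'If-None-Match': etag,
		},
		timeout: {
			lookup: apiRequestTimeout,
			connect: apiRequestTimeout,
			secureConnect: apiRequestTimeout,
			socket: apiRequestTimeout,
			send: apiRequestTimeout,
			response: apiRequestTimeout,
		},
	};
}

// 59000 is the default for apiRequestTimeout in the schema hunk further down
const opts = buildPollOptions(59000, 'example-key');
console.log(opts.retry.limit, opts.timeout.socket);
```

The key and timeout values passed in the usage line are placeholders for illustration only.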
@ -154,8 +163,6 @@ export const update = async (
// Emit the target state and update the cache // Emit the target state and update the cache
cache.emitted = emitTargetState(cache, force, isFromApi); cache.emitted = emitTargetState(cache, force, isFromApi);
}).finally(() => {
lastFetch = process.hrtime();
}); });
}; };
@ -188,7 +195,11 @@ const poll = async (
await update(); await update();
// Reset fetchErrors because we successfully updated // Reset fetchErrors because we successfully updated
fetchErrors = 0; fetchErrors = 0;
} catch { lastSuccessfulFetch = process.hrtime();
} catch (e) {
if (!(e instanceof ApiResponseError)) {
log.error('Target state poll failed', e);
}
// Exponential back off if request fails // Exponential back off if request fails
pollInterval = Math.min(appUpdatePollInterval, 15000 * 2 ** fetchErrors); pollInterval = Math.min(appUpdatePollInterval, 15000 * 2 ** fetchErrors);
++fetchErrors; ++fetchErrors;
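The backoff expression in the hunk above can be read in isolation. The sketch below restates it as a hypothetical `nextPollInterval` helper (the name is an assumption; the 15000 ms base and the formula come from the diff) and prints the resulting schedule, assuming a 15 minute `appUpdatePollInterval`.

```typescript
// Base delay taken from the diff: 15000 * 2 ** fetchErrors
const BASE_BACKOFF_MS = 15000;

// Hypothetical helper: the delay before the next poll after `fetchErrors`
// previous consecutive failures, capped at the configured poll interval.
function nextPollInterval(
	appUpdatePollInterval: number,
	fetchErrors: number,
): number {
	return Math.min(appUpdatePollInterval, BASE_BACKOFF_MS * 2 ** fetchErrors);
}

// The delay doubles on every failure until it reaches the cap
const schedule = [0, 1, 2, 3, 4, 5, 6].map((n) => nextPollInterval(900000, n));
console.log(schedule);
// → [15000, 30000, 60000, 120000, 240000, 480000, 900000]
```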


@ -41,14 +41,17 @@ export let stateReportErrors = 0;
type StateReportOpts = { type StateReportOpts = {
[key in keyof Pick< [key in keyof Pick<
config.ConfigMap<SchemaTypeKey>, config.ConfigMap<SchemaTypeKey>,
'apiEndpoint' | 'apiTimeout' | 'deviceApiKey' | 'appUpdatePollInterval' | 'apiEndpoint'
| 'apiRequestTimeout'
| 'deviceApiKey'
| 'appUpdatePollInterval'
>]: SchemaReturn<key>; >]: SchemaReturn<key>;
}; };
type StateReport = { body: Partial<DeviceState>; opts: StateReportOpts }; type StateReport = { body: Partial<DeviceState>; opts: StateReportOpts };
async function report({ body, opts }: StateReport) { async function report({ body, opts }: StateReport) {
const { apiEndpoint, apiTimeout, deviceApiKey } = opts; const { apiEndpoint, apiRequestTimeout, deviceApiKey } = opts;
if (!apiEndpoint) { if (!apiEndpoint) {
throw new InternalInconsistencyError( throw new InternalInconsistencyError(
@ -69,7 +72,7 @@ async function report({ body, opts }: StateReport) {
const [{ statusCode, body: statusMessage, headers }] = await request const [{ statusCode, body: statusMessage, headers }] = await request
.patchAsync(endpoint, params) .patchAsync(endpoint, params)
.timeout(apiTimeout); .timeout(apiRequestTimeout);
if (statusCode < 200 || statusCode >= 300) { if (statusCode < 200 || statusCode >= 300) {
throw new StatusError( throw new StatusError(
@ -203,7 +206,7 @@ export async function startReporting() {
// Get configs needed to make a report // Get configs needed to make a report
const reportConfigs = (await config.getMany([ const reportConfigs = (await config.getMany([
'apiEndpoint', 'apiEndpoint',
'apiTimeout', 'apiRequestTimeout',
'deviceApiKey', 'deviceApiKey',
'appUpdatePollInterval', 'appUpdatePollInterval',
])) as StateReportOpts; ])) as StateReportOpts;


@ -247,6 +247,16 @@ class AppImpl implements App {
} }
} }
// Release locks (if any) for all services before settling state
if (state.lock || state.hasLeftoverLocks) {
return [
generateStep('releaseLock', {
appId: this.appId,
lock: state.lock,
}),
];
}
return []; return [];
} }
@ -654,6 +664,7 @@ class AppImpl implements App {
context.targetApp, context.targetApp,
needsDownload, needsDownload,
servicesLocked, servicesLocked,
context.rebootBreadcrumbSet,
context.appsToLock, context.appsToLock,
context.availableImages, context.availableImages,
context.networkPairs, context.networkPairs,
@ -682,6 +693,8 @@ class AppImpl implements App {
context.appsToLock, context.appsToLock,
context.targetApp.services, context.targetApp.services,
servicesLocked, servicesLocked,
context.rebootBreadcrumbSet,
context.bootTime,
); );
} }
@ -761,6 +774,8 @@ class AppImpl implements App {
appsToLock: AppsToLockMap, appsToLock: AppsToLockMap,
targetServices: Service[], targetServices: Service[],
servicesLocked: boolean, servicesLocked: boolean,
rebootBreadcrumbSet: boolean,
bootTime: Date,
): CompositionStep[] { ): CompositionStep[] {
// Update container metadata if service release has changed // Update container metadata if service release has changed
if (current.commit !== target.commit) { if (current.commit !== target.commit) {
@ -774,16 +789,38 @@ class AppImpl implements App {
return []; return [];
} }
} else if (target.config.running !== current.config.running) { } else if (target.config.running !== current.config.running) {
// Take lock for all services before starting/stopping container
if (!servicesLocked) {
this.services.concat(targetServices).forEach((s) => {
appsToLock[target.appId].add(s.serviceName);
});
return [];
}
if (target.config.running) { if (target.config.running) {
// if the container has a reboot
// required label and the boot time is before the creation time, then
// return a 'noop' to ensure a reboot happens before starting the container
const requiresReboot =
checkTruthy(
target.config.labels?.['io.balena.update.requires-reboot'],
) &&
current.createdAt != null &&
current.createdAt > bootTime;
if (requiresReboot && rebootBreadcrumbSet) {
// Do not return a noop to allow locks to be released by the
// app module
return [];
} else if (requiresReboot) {
return [
generateStep('requireReboot', {
serviceName: target.serviceName,
}),
];
}
return [generateStep('start', { target })]; return [generateStep('start', { target })];
} else { } else {
// Take lock for all services before stopping container
if (!servicesLocked) {
this.services.concat(targetServices).forEach((s) => {
appsToLock[target.appId].add(s.serviceName);
});
return [];
}
return [generateStep('stop', { current })]; return [generateStep('stop', { current })];
} }
} else { } else {
@ -796,6 +833,7 @@ class AppImpl implements App {
targetApp: App, targetApp: App,
needsDownload: boolean, needsDownload: boolean,
servicesLocked: boolean, servicesLocked: boolean,
rebootBreadcrumbSet: boolean,
appsToLock: AppsToLockMap, appsToLock: AppsToLockMap,
availableImages: UpdateState['availableImages'], availableImages: UpdateState['availableImages'],
networkPairs: Array<ChangingPair<Network>>, networkPairs: Array<ChangingPair<Network>>,
@ -832,8 +870,10 @@ class AppImpl implements App {
} }
return [generateStep('start', { target })]; return [generateStep('start', { target })];
} else { } else {
// Wait for dependencies to be started // Wait for dependencies to be started unless there is a
return [generateStep('noop', {})]; // reboot breadcrumb set, in which case we need to allow the state
// to settle for the reboot to happen
return rebootBreadcrumbSet ? [] : [generateStep('noop', {})];
} }
} else { } else {
return []; return [];
@ -897,11 +937,11 @@ class AppImpl implements App {
return false; return false;
} }
const depedencyUnmet = _.some(target.dependsOn, (dep) => const dependencyUnmet = _.some(target.dependsOn, (dep) =>
_.some(servicePairs, (pair) => pair.target?.serviceName === dep), _.some(servicePairs, (pair) => pair.target?.serviceName === dep),
); );
if (depedencyUnmet) { if (dependencyUnmet) {
return false; return false;
} }
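The reboot handling added to the start branch earlier in this file can be summarised as a three-way decision. The sketch below is a simplified, self-contained restatement under stated assumptions: `ServiceInfo`, `Step`, `isTruthyLabel` and `stepsForStoppedService` are hypothetical names, and `isTruthyLabel` only approximates `checkTruthy` from `lib/validation`, whose exact accepted values are not shown in this diff.

```typescript
type Step =
	| { action: 'start'; serviceName: string }
	| { action: 'requireReboot'; serviceName: string };

interface ServiceInfo {
	serviceName: string;
	labels: Record<string, string>;
	createdAt: Date | null;
}

// Assumption: a rough stand-in for checkTruthy
function isTruthyLabel(value: string | undefined): boolean {
	return value != null && ['1', 'true', 'on'].includes(value.toLowerCase());
}

// Decide what to do with a service that should be running but is not
function stepsForStoppedService(
	svc: ServiceInfo,
	bootTime: Date,
	rebootBreadcrumbSet: boolean,
): Step[] {
	const requiresReboot =
		isTruthyLabel(svc.labels['io.balena.update.requires-reboot']) &&
		svc.createdAt != null &&
		svc.createdAt.getTime() > bootTime.getTime();

	if (requiresReboot && rebootBreadcrumbSet) {
		// Nothing to do: let the state settle so locks can be released
		return [];
	}
	if (requiresReboot) {
		return [{ action: 'requireReboot', serviceName: svc.serviceName }];
	}
	return [{ action: 'start', serviceName: svc.serviceName }];
}
```

A container created after the last boot with the label set yields `requireReboot` once, then an empty step list while the breadcrumb is present; after the reboot the creation time is older than the boot time, so the service starts normally.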


@ -40,6 +40,8 @@ import type {
Image, Image,
InstancedAppState, InstancedAppState,
} from './types'; } from './types';
import { isRebootBreadcrumbSet } from '../lib/reboot';
import { getBootTime } from '../lib/fs-utils';
type ApplicationManagerEventEmitter = StrictEventEmitter< type ApplicationManagerEventEmitter = StrictEventEmitter<
EventEmitter, EventEmitter,
@ -127,6 +129,7 @@ export async function getRequiredSteps(
config.getMany(['localMode', 'delta']), config.getMany(['localMode', 'delta']),
]); ]);
const containerIdsByAppId = getAppContainerIds(currentApps); const containerIdsByAppId = getAppContainerIds(currentApps);
const rebootBreadcrumbSet = await isRebootBreadcrumbSet();
// Local mode sets the image and volume retention only // Local mode sets the image and volume retention only
// if not explicitly set by the caller // if not explicitly set by the caller
@ -149,6 +152,7 @@ export async function getRequiredSteps(
availableImages, availableImages,
containerIdsByAppId, containerIdsByAppId,
appLocks: lockRegistry, appLocks: lockRegistry,
rebootBreadcrumbSet,
}); });
} }
@ -161,6 +165,7 @@ interface InferNextOpts {
availableImages: UpdateState['availableImages']; availableImages: UpdateState['availableImages'];
containerIdsByAppId: { [appId: number]: UpdateState['containerIds'] }; containerIdsByAppId: { [appId: number]: UpdateState['containerIds'] };
appLocks: LockRegistry; appLocks: LockRegistry;
rebootBreadcrumbSet: boolean;
} }
// Calculate the required steps from the current to the target state // Calculate the required steps from the current to the target state
@ -176,6 +181,7 @@ export async function inferNextSteps(
availableImages = [], availableImages = [],
containerIdsByAppId = {}, containerIdsByAppId = {},
appLocks = {}, appLocks = {},
rebootBreadcrumbSet = false,
}: Partial<InferNextOpts>, }: Partial<InferNextOpts>,
) { ) {
const currentAppIds = Object.keys(currentApps).map((i) => parseInt(i, 10)); const currentAppIds = Object.keys(currentApps).map((i) => parseInt(i, 10));
@ -184,6 +190,7 @@ export async function inferNextSteps(
const withLeftoverLocks = await Promise.all( const withLeftoverLocks = await Promise.all(
currentAppIds.map((id) => hasLeftoverLocks(id)), currentAppIds.map((id) => hasLeftoverLocks(id)),
); );
const bootTime = getBootTime();
let steps: CompositionStep[] = []; let steps: CompositionStep[] = [];
@ -245,6 +252,8 @@ export async function inferNextSteps(
force, force,
lock: appLocks[id], lock: appLocks[id],
hasLeftoverLocks: withLeftoverLocks[id], hasLeftoverLocks: withLeftoverLocks[id],
rebootBreadcrumbSet,
bootTime,
}, },
targetApps[id], targetApps[id],
), ),
@ -261,6 +270,8 @@ export async function inferNextSteps(
force, force,
lock: appLocks[id], lock: appLocks[id],
hasLeftoverLocks: withLeftoverLocks[id], hasLeftoverLocks: withLeftoverLocks[id],
rebootBreadcrumbSet,
bootTime,
}), }),
); );
} }
@ -287,6 +298,8 @@ export async function inferNextSteps(
force, force,
lock: appLocks[id], lock: appLocks[id],
hasLeftoverLocks: false, hasLeftoverLocks: false,
rebootBreadcrumbSet,
bootTime,
}, },
targetApps[id], targetApps[id],
), ),


@ -6,6 +6,7 @@ import * as networkManager from './network-manager';
import * as volumeManager from './volume-manager'; import * as volumeManager from './volume-manager';
import * as commitStore from './commit'; import * as commitStore from './commit';
import { Lockable, cleanLocksForApp } from '../lib/update-lock'; import { Lockable, cleanLocksForApp } from '../lib/update-lock';
import { setRebootBreadcrumb } from '../lib/reboot';
import type { DeviceLegacyReport } from '../types/state'; import type { DeviceLegacyReport } from '../types/state';
import type { CompositionStepAction, CompositionStepT } from './types'; import type { CompositionStepAction, CompositionStepT } from './types';
import type { Lock } from '../lib/update-lock'; import type { Lock } from '../lib/update-lock';
@ -157,6 +158,9 @@ export function getExecutors(app: { callbacks: CompositionCallbacks }) {
// Clean up any remaining locks // Clean up any remaining locks
await cleanLocksForApp(step.appId); await cleanLocksForApp(step.appId);
}, },
requireReboot: async (step) => {
await setRebootBreadcrumb({ serviceName: step.serviceName });
},
}; };
return executors; return executors;


@ -19,7 +19,7 @@ import {
isStatusError, isStatusError,
} from '../lib/errors'; } from '../lib/errors';
import * as LogTypes from '../lib/log-types'; import * as LogTypes from '../lib/log-types';
import { checkInt, isValidDeviceName } from '../lib/validation'; import { checkInt, isValidDeviceName, checkTruthy } from '../lib/validation';
import { Service } from './service'; import { Service } from './service';
import type { ServiceStatus } from './types'; import type { ServiceStatus } from './types';
import { serviceNetworksToDockerNetworks } from './utils'; import { serviceNetworksToDockerNetworks } from './utils';
@ -27,6 +27,7 @@ import { serviceNetworksToDockerNetworks } from './utils';
import log from '../lib/supervisor-console'; import log from '../lib/supervisor-console';
import logMonitor from '../logging/monitor'; import logMonitor from '../logging/monitor';
import { setTimeout } from 'timers/promises'; import { setTimeout } from 'timers/promises';
import { getBootTime } from '../lib/fs-utils';
interface ServiceManagerEvents { interface ServiceManagerEvents {
change: void; change: void;
@ -233,7 +234,7 @@ export async function remove(service: Service) {
} }
} }
async function create(service: Service) { async function create(service: Service): Promise<Service> {
const mockContainerId = config.newUniqueKey(); const mockContainerId = config.newUniqueKey();
try { try {
const existing = await get(service); const existing = await get(service);
@ -242,7 +243,7 @@ async function create(service: Service) {
`No containerId provided for service ${service.serviceName} in ServiceManager.updateMetadata. Service: ${service}`, `No containerId provided for service ${service.serviceName} in ServiceManager.updateMetadata. Service: ${service}`,
); );
} }
return docker.getContainer(existing.containerId); return existing;
} catch (e: unknown) { } catch (e: unknown) {
if (!isNotFoundError(e)) { if (!isNotFoundError(e)) {
logger.logSystemEvent(LogTypes.installServiceError, { logger.logSystemEvent(LogTypes.installServiceError, {
@ -287,7 +288,9 @@ async function create(service: Service) {
reportNewStatus(mockContainerId, service, 'Installing'); reportNewStatus(mockContainerId, service, 'Installing');
const container = await docker.createContainer(conf); const container = await docker.createContainer(conf);
service.containerId = container.id; const inspectInfo = await container.inspect();
service = Service.fromDockerContainer(inspectInfo);
await Promise.all( await Promise.all(
_.map((nets || {}).EndpointsConfig, (endpointConfig, name) => _.map((nets || {}).EndpointsConfig, (endpointConfig, name) =>
@ -299,7 +302,7 @@ async function create(service: Service) {
); );
logger.logSystemEvent(LogTypes.installServiceSuccess, { service }); logger.logSystemEvent(LogTypes.installServiceSuccess, { service });
return container; return service;
} finally { } finally {
reportChange(mockContainerId); reportChange(mockContainerId);
} }
@ -310,13 +313,25 @@ export async function start(service: Service) {
let containerId: string | null = null; let containerId: string | null = null;
try { try {
const container = await create(service); const svc = await create(service);
const container = docker.getContainer(svc.containerId!);
const requiresReboot =
checkTruthy(
service.config.labels?.['io.balena.update.requires-reboot'],
) &&
svc.createdAt != null &&
svc.createdAt > getBootTime();
if (requiresReboot) {
log.warn(`Skipping start of service ${svc.serviceName} until reboot`);
}
// Exit here if the target state of the service // Exit here if the target state of the service
// is set to running: false // is set to running: false or we are waiting for a reboot
// QUESTION: should we split the service steps into // QUESTION: should we split the service steps into
// 'install' and 'start' instead of doing this? // 'install' and 'start' instead of doing this?
if (service.config.running === false) { if (service.config.running === false || requiresReboot) {
return container; return container;
} }


@ -128,7 +128,6 @@ class ServiceImpl implements Service {
service.releaseId = parseInt(appConfig.releaseId, 10); service.releaseId = parseInt(appConfig.releaseId, 10);
service.serviceId = parseInt(appConfig.serviceId, 10); service.serviceId = parseInt(appConfig.serviceId, 10);
service.imageName = appConfig.image; service.imageName = appConfig.image;
service.createdAt = appConfig.createdAt;
service.commit = appConfig.commit; service.commit = appConfig.commit;
service.appUuid = appConfig.appUuid; service.appUuid = appConfig.appUuid;


@ -12,6 +12,8 @@ export interface UpdateState {
hasLeftoverLocks: boolean; hasLeftoverLocks: boolean;
lock: Lock | null; lock: Lock | null;
force: boolean; force: boolean;
rebootBreadcrumbSet: boolean;
bootTime: Date;
} }
export interface App { export interface App {


@ -76,6 +76,7 @@ export interface CompositionStepArgs {
appId: string | number; appId: string | number;
lock: Lock | null; lock: Lock | null;
}; };
requireReboot: { serviceName: string };
} }
export type CompositionStepAction = keyof CompositionStepArgs; export type CompositionStepAction = keyof CompositionStepArgs;


@ -90,7 +90,7 @@ export const fnSchema = {
'deviceArch', 'deviceArch',
'deviceType', 'deviceType',
'apiEndpoint', 'apiEndpoint',
'apiTimeout', 'apiRequestTimeout',
'registered_at', 'registered_at',
'deviceId', 'deviceId',
'version', 'version',
@ -107,7 +107,7 @@ export const fnSchema = {
provisioningApiKey: conf.apiKey, provisioningApiKey: conf.apiKey,
deviceApiKey: conf.deviceApiKey, deviceApiKey: conf.deviceApiKey,
apiEndpoint: conf.apiEndpoint, apiEndpoint: conf.apiEndpoint,
apiTimeout: conf.apiTimeout, apiRequestTimeout: conf.apiRequestTimeout,
registered_at: conf.registered_at, registered_at: conf.registered_at,
deviceId: conf.deviceId, deviceId: conf.deviceId,
supervisorVersion: conf.version, supervisorVersion: conf.version,


@ -12,6 +12,9 @@ export const schemaTypes = {
type: t.string, type: t.string,
default: '', default: '',
}, },
/**
* The timeout for the supervisor's api
*/
apiTimeout: { apiTimeout: {
type: PermissiveNumber, type: PermissiveNumber,
default: 15 * 60 * 1000, default: 15 * 60 * 1000,
@ -118,6 +121,13 @@ export const schemaTypes = {
type: PermissiveBoolean, type: PermissiveBoolean,
default: false, default: false,
}, },
/**
* The timeout for requests to the balenaCloud api
*/
apiRequestTimeout: {
type: PermissiveNumber,
default: 59000,
},
deltaRequestTimeout: { deltaRequestTimeout: {
type: PermissiveNumber, type: PermissiveNumber,
default: 59000, default: 59000,
@ -218,7 +228,7 @@ export const schemaTypes = {
provisioningApiKey: t.union([t.string, NullOrUndefined]), provisioningApiKey: t.union([t.string, NullOrUndefined]),
deviceApiKey: t.string, deviceApiKey: t.string,
apiEndpoint: t.string, apiEndpoint: t.string,
apiTimeout: PermissiveNumber, apiRequestTimeout: PermissiveNumber,
registered_at: t.union([PermissiveNumber, NullOrUndefined]), registered_at: t.union([PermissiveNumber, NullOrUndefined]),
deviceId: t.union([PermissiveNumber, NullOrUndefined]), deviceId: t.union([PermissiveNumber, NullOrUndefined]),
supervisorVersion: t.union([t.string, t.undefined]), supervisorVersion: t.union([t.string, t.undefined]),


@ -4,6 +4,9 @@ export const schema = {
mutable: false, mutable: false,
removeIfNull: false, removeIfNull: false,
}, },
/**
* The timeout for the supervisor's api
*/
apiTimeout: { apiTimeout: {
source: 'config.json', source: 'config.json',
mutable: false, mutable: false,
@ -120,6 +123,11 @@ export const schema = {
mutable: true, mutable: true,
removeIfNull: false, removeIfNull: false,
}, },
apiRequestTimeout: {
source: 'db',
mutable: true,
removeIfNull: false,
},
delta: { delta: {
source: 'db', source: 'db',
mutable: true, mutable: true,

View File

@ -11,7 +11,6 @@ import { Volume } from '../compose/volume';
import * as commitStore from '../compose/commit'; import * as commitStore from '../compose/commit';
import * as config from '../config'; import * as config from '../config';
import * as db from '../db'; import * as db from '../db';
import * as deviceConfig from '../device-config';
import * as logger from '../logging'; import * as logger from '../logging';
import * as images from '../compose/images'; import * as images from '../compose/images';
import * as volumeManager from '../compose/volume-manager'; import * as volumeManager from '../compose/volume-manager';
@ -512,7 +511,7 @@ router.get('/v2/device/tags', async (_req, res) => {
}); });
router.get('/v2/device/vpn', async (_req, res) => { router.get('/v2/device/vpn', async (_req, res) => {
const conf = await deviceConfig.getCurrent(); const conf = await deviceState.getCurrentConfig();
// Build VPNInfo // Build VPNInfo
const info = { const info = {
enabled: conf.SUPERVISOR_VPN_CONTROL === 'true', enabled: conf.SUPERVISOR_VPN_CONTROL === 'true',

View File

@ -1,34 +1,24 @@
import _ from 'lodash'; import _ from 'lodash';
import { inspect } from 'util'; import { inspect } from 'util';
import { promises as fs } from 'fs';
import * as config from './config'; import * as config from '../config';
import * as db from './db'; import * as db from '../db';
import * as logger from './logging'; import * as logger from '../logging';
import * as dbus from './lib/dbus'; import * as dbus from '../lib/dbus';
import type { EnvVarObject } from './types'; import type { EnvVarObject } from '../types';
import { UnitNotLoadedError } from './lib/errors'; import { UnitNotLoadedError } from '../lib/errors';
import { checkInt, checkTruthy } from './lib/validation'; import { checkInt, checkTruthy } from '../lib/validation';
import log from './lib/supervisor-console'; import log from '../lib/supervisor-console';
import * as configUtils from './config/utils'; import { setRebootBreadcrumb } from '../lib/reboot';
import type { SchemaTypeKey } from './config/schema-type';
import { matchesAnyBootConfig } from './config/backends'; import * as configUtils from '../config/utils';
import type { ConfigBackend } from './config/backends/backend'; import type { SchemaTypeKey } from '../config/schema-type';
import { Odmdata } from './config/backends/odmdata'; import { matchesAnyBootConfig } from '../config/backends';
import * as fsUtils from './lib/fs-utils'; import type { ConfigBackend } from '../config/backends/backend';
import { pathOnRoot } from './lib/host-utils'; import { Odmdata } from '../config/backends/odmdata';
const vpnServiceName = 'openvpn'; const vpnServiceName = 'openvpn';
// This indicates the file on the host /tmp directory that
// marks the need for a reboot. Since reboot is only triggered for now
// by some config changes, we leave this here for now. There is planned
// functionality to allow image installs to require reboots, at that moment
// this constant can be moved somewhere else
const REBOOT_BREADCRUMB = pathOnRoot(
'/tmp/balena-supervisor/reboot-after-apply',
);
interface ConfigOption { interface ConfigOption {
envVarName: string; envVarName: string;
varType: string; varType: string;
@ -39,10 +29,7 @@ interface ConfigOption {
// FIXME: Bring this and the deviceState and // FIXME: Bring this and the deviceState and
// applicationState steps together // applicationState steps together
export interface ConfigStep { export interface ConfigStep {
// TODO: This is a bit of a mess, the DeviceConfig class shouldn't action: keyof DeviceActionExecutors | 'noop';
// know that the reboot action exists as it is implemented by
// DeviceState. Fix this weird circular dependency
action: keyof DeviceActionExecutors | 'reboot' | 'noop';
humanReadableTarget?: Dictionary<string>; humanReadableTarget?: Dictionary<string>;
target?: string | Dictionary<string>; target?: string | Dictionary<string>;
} }
@ -117,10 +104,12 @@ const actionExecutors: DeviceActionExecutors = {
await setBootConfig(backend, step.target as Dictionary<string>); await setBootConfig(backend, step.target as Dictionary<string>);
} }
}, },
setRebootBreadcrumb: async () => { setRebootBreadcrumb: async (step) => {
// Just create the file. The last step in the target state calculation will check const changes =
// the file and create a reboot step step != null && step.target != null && typeof step.target === 'object'
await fsUtils.touch(REBOOT_BREADCRUMB); ? step.target
: {};
return setRebootBreadcrumb(changes);
}, },
}; };
@ -152,6 +141,11 @@ const configKeys: Dictionary<ConfigOption> = {
varType: 'bool', varType: 'bool',
defaultValue: 'true', defaultValue: 'true',
}, },
apiRequestTimeout: {
envVarName: 'SUPERVISOR_API_REQUEST_TIMEOUT',
varType: 'int',
defaultValue: '59000',
},
delta: { delta: {
envVarName: 'SUPERVISOR_DELTA', envVarName: 'SUPERVISOR_DELTA',
varType: 'bool', varType: 'bool',
@ -210,7 +204,7 @@ const configKeys: Dictionary<ConfigOption> = {
}, },
}; };
export const validKeys = [ const validKeys = [
'SUPERVISOR_VPN_CONTROL', 'SUPERVISOR_VPN_CONTROL',
'OVERRIDE_LOCK', 'OVERRIDE_LOCK',
..._.map(configKeys, 'envVarName'), ..._.map(configKeys, 'envVarName'),
@ -413,6 +407,7 @@ function getConfigSteps(
target: Dictionary<string>, target: Dictionary<string>,
) { ) {
const configChanges: Dictionary<string> = {}; const configChanges: Dictionary<string> = {};
const rebootingChanges: Dictionary<string> = {};
const humanReadableConfigChanges: Dictionary<string> = {}; const humanReadableConfigChanges: Dictionary<string> = {};
let reboot = false; let reboot = false;
const steps: ConfigStep[] = []; const steps: ConfigStep[] = [];
@ -448,6 +443,9 @@ function getConfigSteps(
} }
if (changingValue != null) { if (changingValue != null) {
configChanges[key] = changingValue; configChanges[key] = changingValue;
if ($rebootRequired) {
rebootingChanges[key] = changingValue;
}
humanReadableConfigChanges[envVarName] = changingValue; humanReadableConfigChanges[envVarName] = changingValue;
reboot = $rebootRequired || reboot; reboot = $rebootRequired || reboot;
} }
@ -457,7 +455,7 @@ function getConfigSteps(
if (!_.isEmpty(configChanges)) { if (!_.isEmpty(configChanges)) {
if (reboot) { if (reboot) {
steps.push({ action: 'setRebootBreadcrumb' }); steps.push({ action: 'setRebootBreadcrumb', target: rebootingChanges });
} }
steps.push({ steps.push({
@ -544,24 +542,16 @@ async function getBackendSteps(
return [ return [
// All backend steps require a reboot except fan control // All backend steps require a reboot except fan control
...(steps.length > 0 && rebootRequired ...(steps.length > 0 && rebootRequired
? [{ action: 'setRebootBreadcrumb' } as ConfigStep] ? [
{
action: 'setRebootBreadcrumb',
} as ConfigStep,
]
: []), : []),
...steps, ...steps,
]; ];
} }
async function isRebootRequired() {
const hasBreadcrumb = await fsUtils.exists(REBOOT_BREADCRUMB);
if (hasBreadcrumb) {
const stats = await fs.stat(REBOOT_BREADCRUMB);
// If the breadcrumb exists and the last modified time is greater than the
// boot time, that means we need to reboot
return stats.mtime.getTime() > fsUtils.getBootTime().getTime();
}
return false;
}
export async function getRequiredSteps( export async function getRequiredSteps(
currentState: { local?: { config?: EnvVarObject } }, currentState: { local?: { config?: EnvVarObject } },
targetState: { local?: { config: EnvVarObject } }, targetState: { local?: { config: EnvVarObject } },
@ -584,19 +574,6 @@ export async function getRequiredSteps(
: await getBackendSteps(current, target)), : await getBackendSteps(current, target)),
]; ];
// Check if there is either no steps, or they are all
// noops, and we need to reboot. We want to do this
// because in a preloaded setting with no internet
// connection, the device will try to start containers
// before any boot config has been applied, which can
// cause problems
const rebootRequired = await isRebootRequired();
if (_.every(steps, { action: 'noop' }) && rebootRequired) {
steps.push({
action: 'reboot',
});
}
return steps; return steps;
} }
@ -642,7 +619,7 @@ export function executeStepAction(
step: ConfigStep, step: ConfigStep,
opts: DeviceActionExecutorOpts, opts: DeviceActionExecutorOpts,
) { ) {
if (step.action !== 'reboot' && step.action !== 'noop') { if (step.action !== 'noop') {
return actionExecutors[step.action](step, opts); return actionExecutors[step.action](step, opts);
} }
} }
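The `isRebootRequired` function removed from this file compared the breadcrumb's modification time with the boot time. Its new home in `lib/reboot` is not shown in this section, so the sketch below only restates the removed logic as a pure function. The signature is an assumption: the original read the file's mtime itself with `fs.stat`, whereas here the caller passes it in, with `null` meaning the breadcrumb does not exist.

```typescript
// Hypothetical pure variant of the removed check: a reboot is pending only
// if the breadcrumb exists and was written after the device last booted.
function isRebootRequired(
	breadcrumbMtime: Date | null,
	bootTime: Date,
): boolean {
	if (breadcrumbMtime == null) {
		return false;
	}
	return breadcrumbMtime.getTime() > bootTime.getTime();
}

const bootedAt = new Date('2025-03-01T00:00:00Z');
console.log(isRebootRequired(new Date('2025-03-01T00:05:00Z'), bootedAt)); // true
console.log(isRebootRequired(new Date('2025-02-28T23:00:00Z'), bootedAt)); // false
console.log(isRebootRequired(null, bootedAt)); // false
```

Comparing against the boot time means a stale breadcrumb from before the last boot is ignored rather than triggering another reboot.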

View File

@ -9,7 +9,7 @@ import * as config from '../config';
import * as logger from '../logging'; import * as logger from '../logging';
import * as network from '../network'; import * as network from '../network';
import * as deviceConfig from '../device-config'; import * as deviceConfig from './device-config';
import * as constants from '../lib/constants'; import * as constants from '../lib/constants';
import * as dbus from '../lib/dbus'; import * as dbus from '../lib/dbus';
@ -19,6 +19,7 @@ import * as updateLock from '../lib/update-lock';
import { getGlobalApiKey } from '../lib/api-keys'; import { getGlobalApiKey } from '../lib/api-keys';
import * as sysInfo from '../lib/system-info'; import * as sysInfo from '../lib/system-info';
import { log } from '../lib/supervisor-console'; import { log } from '../lib/supervisor-console';
import { isRebootRequired } from '../lib/reboot';
import { loadTargetFromFile } from './preload'; import { loadTargetFromFile } from './preload';
import * as applicationManager from '../compose/application-manager'; import * as applicationManager from '../compose/application-manager';
import * as commitStore from '../compose/commit'; import * as commitStore from '../compose/commit';
@ -26,6 +27,12 @@ import type { InstancedDeviceState } from './target-state';
import * as TargetState from './target-state'; import * as TargetState from './target-state';
export { getTarget, setTarget } from './target-state'; export { getTarget, setTarget } from './target-state';
export {
formatConfigKeys,
getCurrent as getCurrentConfig,
getDefaults as getDefaultConfig,
} from './device-config';
import type { DeviceLegacyState, DeviceState, DeviceReport } from '../types'; import type { DeviceLegacyState, DeviceState, DeviceReport } from '../types';
import type { import type {
CompositionStepT, CompositionStepT,
@ -512,7 +519,7 @@ export async function executeStepAction(
} }
} }
export async function applyStep( async function applyStep(
step: DeviceStateStep<PossibleStepTargets>, step: DeviceStateStep<PossibleStepTargets>,
{ {
force, force,
@ -609,11 +616,12 @@ export const applyTarget = async ({
({ action }) => action === 'noop', ({ action }) => action === 'noop',
); );
let backoff: boolean; const rebootRequired = await isRebootRequired();
let backoff = false;
let steps: Array<DeviceStateStep<PossibleStepTargets>>; let steps: Array<DeviceStateStep<PossibleStepTargets>>;
if (!noConfigSteps) { if (!noConfigSteps) {
backoff = false;
steps = deviceConfigSteps; steps = deviceConfigSteps;
} else { } else {
const appSteps = await applicationManager.getRequiredSteps( const appSteps = await applicationManager.getRequiredSteps(
@ -640,6 +648,21 @@ export const applyTarget = async ({
} }
} }
// Check if there are either no steps, or they are all
// noops, and we need to reboot. We want to do this
// because in a preloaded setting with no internet
// connection, the device will try to start containers
// before any boot config has been applied, which can
// cause problems
// For application manager, the reboot breadcrumb should
// be set after all downloads are ready and target containers
// have been installed
if (steps.every(({ action }) => action === 'noop') && rebootRequired) {
steps.push({
action: 'reboot',
});
}
if (_.isEmpty(steps)) { if (_.isEmpty(steps)) {
emitAsync('apply-target-state-end', null); emitAsync('apply-target-state-end', null);
if (!intermediate) { if (!intermediate) {
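The reboot check that moved into `applyTarget` relies on `Array.prototype.every` returning `true` for an empty array, so a reboot step is appended both when there are no steps and when every step is a noop. A minimal sketch, with `withRebootStep` as a hypothetical helper (the diff mutates `steps` in place with `push`; this version returns a new array):

```typescript
interface Step {
	action: string;
}

// Hypothetical helper: append a reboot step only when nothing else is
// pending (no steps, or noops only) and a reboot has been requested.
function withRebootStep(steps: Step[], rebootRequired: boolean): Step[] {
	const onlyNoops = steps.every(({ action }) => action === 'noop');
	return onlyNoops && rebootRequired
		? [...steps, { action: 'reboot' }]
		: steps;
}

console.log(withRebootStep([], true)); // [ { action: 'reboot' } ]
console.log(withRebootStep([{ action: 'start' }], true)); // [ { action: 'start' } ]
```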

View File

@ -6,9 +6,9 @@ import { imageFromService } from '../compose/images';
import { NumericIdentifier } from '../types'; import { NumericIdentifier } from '../types';
import { setTarget } from './target-state'; import { setTarget } from './target-state';
import * as config from '../config'; import * as config from '../config';
import * as deviceConfig from '../device-config';
import * as eventTracker from '../event-tracker'; import * as eventTracker from '../event-tracker';
import * as imageManager from '../compose/images'; import * as imageManager from '../compose/images';
import * as deviceState from '../device-state';
import { import {
AppsJsonParseError, AppsJsonParseError,
@ -126,8 +126,8 @@ export async function loadTargetFromFile(appsPath: string): Promise<boolean> {
await imageManager.save(image); await imageManager.save(image);
} }
const deviceConf = await deviceConfig.getCurrent(); const deviceConf = await deviceState.getCurrentConfig();
const formattedConf = deviceConfig.formatConfigKeys(preloadState.config); const formattedConf = deviceState.formatConfigKeys(preloadState.config);
const localState = { const localState = {
[uuid]: { [uuid]: {
name: '', name: '',


@ -6,7 +6,7 @@ import * as config from '../config';
import * as db from '../db'; import * as db from '../db';
import * as globalEventBus from '../event-bus'; import * as globalEventBus from '../event-bus';
import * as deviceConfig from '../device-config'; import * as deviceConfig from './device-config';
import { TargetStateError } from '../lib/errors'; import { TargetStateError } from '../lib/errors';
import { takeGlobalLockRO, takeGlobalLockRW } from '../lib/process-lock'; import { takeGlobalLockRO, takeGlobalLockRW } from '../lib/process-lock';


@ -111,10 +111,10 @@ export const exchangeKeyAndGetDevice = async (
opts: Partial<KeyExchangeOpts>, opts: Partial<KeyExchangeOpts>,
): Promise<Device> => { ): Promise<Device> => {
const uuid = opts.uuid; const uuid = opts.uuid;
const apiTimeout = opts.apiTimeout; const apiRequestTimeout = opts.apiRequestTimeout;
if (!(uuid && apiTimeout)) { if (!(uuid && apiRequestTimeout)) {
throw new InternalInconsistencyError( throw new InternalInconsistencyError(
'UUID and apiTimeout should be defined in exchangeKeyAndGetDevice', 'UUID and apiRequestTimeout should be defined in exchangeKeyAndGetDevice',
); );
} }
@ -122,7 +122,12 @@ export const exchangeKeyAndGetDevice = async (
// valid, because if it is then we can just use that // valid, because if it is then we can just use that
if (opts.deviceApiKey != null) { if (opts.deviceApiKey != null) {
try { try {
return await fetchDevice(balenaApi, uuid, opts.deviceApiKey, apiTimeout); return await fetchDevice(
balenaApi,
uuid,
opts.deviceApiKey,
apiRequestTimeout,
);
} catch (e) { } catch (e) {
if (e instanceof DeviceNotFoundError) { if (e instanceof DeviceNotFoundError) {
// do nothing... // do nothing...
@ -146,7 +151,7 @@ export const exchangeKeyAndGetDevice = async (
balenaApi, balenaApi,
uuid, uuid,
opts.provisioningApiKey, opts.provisioningApiKey,
apiTimeout, apiRequestTimeout,
); );
} catch { } catch {
throw new ExchangeKeyError(`Couldn't fetch device with provisioning key`); throw new ExchangeKeyError(`Couldn't fetch device with provisioning key`);
@ -165,7 +170,7 @@ export const exchangeKeyAndGetDevice = async (
Authorization: `Bearer ${opts.provisioningApiKey}`, Authorization: `Bearer ${opts.provisioningApiKey}`,
}, },
}) })
.timeout(apiTimeout); .timeout(apiRequestTimeout);
if (res.statusCode !== 200) { if (res.statusCode !== 200) {
throw new ExchangeKeyError( throw new ExchangeKeyError(
@ -220,7 +225,7 @@ export const provision = async (
osVariant: opts.osVariant, osVariant: opts.osVariant,
macAddress: opts.macAddress, macAddress: opts.macAddress,
}), }),
).timeout(opts.apiTimeout); ).timeout(opts.apiRequestTimeout);
} catch (err) { } catch (err) {
if ( if (
err instanceof deviceRegister.ApiError && err instanceof deviceRegister.ApiError &&


@ -128,7 +128,7 @@ export function containerContractsFulfilled(
].map((c) => new Contract(c)), ].map((c) => new Contract(c)),
); );
const solution = blueprint.reproduce(universe); const solution = [...blueprint.reproduce(universe)];
if (solution.length > 1) { if (solution.length > 1) {
throw new InternalInconsistencyError( throw new InternalInconsistencyError(


@ -1,22 +1,23 @@
import type { ProgressCallback } from 'docker-progress';
import { DockerProgress } from 'docker-progress'; import { DockerProgress } from 'docker-progress';
import type { ProgressCallback } from 'docker-progress';
import Dockerode from 'dockerode'; import Dockerode from 'dockerode';
import _ from 'lodash'; import _ from 'lodash';
import memoizee from 'memoizee'; import memoizee from 'memoizee';
import { applyDelta, OutOfSyncError } from 'docker-delta'; import { applyDelta, OutOfSyncError } from 'docker-delta';
import type { SchemaReturn } from '../config/schema-type'; import log from './supervisor-console';
import { envArrayToObject } from './conversions'; import { envArrayToObject } from './conversions';
import * as request from './request';
import { import {
DeltaStillProcessingError, DeltaStillProcessingError,
ImageAuthenticationError, ImageAuthenticationError,
InvalidNetGatewayError, InvalidNetGatewayError,
DeltaServerError,
DeltaApplyError,
isStatusError,
} from './errors'; } from './errors';
import * as request from './request';
import type { EnvVarObject } from '../types'; import type { EnvVarObject } from '../types';
import type { SchemaReturn } from '../config/schema-type';
import log from './supervisor-console';
export type FetchOptions = SchemaReturn<'fetchOptions'>; export type FetchOptions = SchemaReturn<'fetchOptions'>;
export type DeltaFetchOptions = FetchOptions & { export type DeltaFetchOptions = FetchOptions & {
@ -41,6 +42,18 @@ type ImageNameParts = {
// (10 mins) // (10 mins)
const DELTA_TOKEN_TIMEOUT = 10 * 60 * 1000; const DELTA_TOKEN_TIMEOUT = 10 * 60 * 1000;
// How many times to retry a v3 delta apply before falling back to a regular pull.
// A delta is applied to the base image when pulling, so a failure could be due to
// "layers from manifest don't match image configuration", which can occur before
// or after downloading delta image layers.
//
// Other causes of failure are not documented as clearly as "layers from manifest",
// but may occur as well. It is unclear whether they occur before, during, or after
// downloading delta image layers.
//
// See: https://github.com/balena-os/balena-engine/blob/master/distribution/pull_v2.go#L43
const DELTA_APPLY_RETRY_COUNT = 3;
export const docker = new Dockerode(); export const docker = new Dockerode();
export const dockerProgress = new DockerProgress({ export const dockerProgress = new DockerProgress({
docker, docker,
@ -113,11 +126,7 @@ export async function fetchDeltaWithProgress(
onProgress: ProgressCallback, onProgress: ProgressCallback,
serviceName: string, serviceName: string,
): Promise<string> { ): Promise<string> {
const deltaSourceId = const deltaSourceId = deltaOpts.deltaSourceId ?? deltaOpts.deltaSource;
deltaOpts.deltaSourceId != null
? deltaOpts.deltaSourceId
: deltaOpts.deltaSource;
const timeout = deltaOpts.deltaApplyTimeout; const timeout = deltaOpts.deltaApplyTimeout;
const logFn = (str: string) => const logFn = (str: string) =>
@ -143,7 +152,7 @@ export async function fetchDeltaWithProgress(
} }
// Since the supervisor never calls this function with a source anymore, // Since the supervisor never calls this function with a source anymore,
// this should never happen, but w ehandle it anyway // this should never happen, but we handle it anyway
if (deltaOpts.deltaSource == null) { if (deltaOpts.deltaSource == null) {
logFn('Falling back to regular pull due to lack of a delta source'); logFn('Falling back to regular pull due to lack of a delta source');
return fetchImageWithProgress(imgDest, deltaOpts, onProgress); return fetchImageWithProgress(imgDest, deltaOpts, onProgress);
@ -210,6 +219,18 @@ export async function fetchDeltaWithProgress(
} }
break; break;
case 3: case 3:
// On a 4xx status code, throw a more specific error and fall back immediately to a
// regular pull, unless the code is 401 Unauthorized. In that case, surface the error so
// that the delta server request is retried instead of falling back to a regular pull.
if (res.statusCode >= 400 && res.statusCode < 500) {
if (res.statusCode === 401) {
throw new Error(
`Got ${res.statusCode} when requesting an image from delta server: ${res.statusMessage}`,
);
} else {
throw new DeltaServerError(res.statusCode, res.statusMessage);
}
}
if (res.statusCode !== 200) { if (res.statusCode !== 200) {
throw new Error( throw new Error(
`Got ${res.statusCode} when requesting v3 delta from delta server.`, `Got ${res.statusCode} when requesting v3 delta from delta server.`,
@ -225,24 +246,62 @@ export async function fetchDeltaWithProgress(
`Got an error when parsing delta server response for v3 delta: ${e}`, `Got an error when parsing delta server response for v3 delta: ${e}`,
); );
} }
id = await applyBalenaDelta(name, token, onProgress, logFn); // Try to apply delta DELTA_APPLY_RETRY_COUNT times, then throw DeltaApplyError
let lastError: Error | undefined = undefined;
for (
let tryCount = 0;
tryCount < DELTA_APPLY_RETRY_COUNT;
tryCount++
) {
try {
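id = await applyBalenaDelta(name, token, onProgress, logFn);
// Clear any error from an earlier attempt so that a successful
// retry is not reported as a DeltaApplyError below
lastError = undefined;
break;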
} catch (e) {
if (isStatusError(e)) {
// A status error during delta pull indicates network issues,
// so we should throw an error to the handler that indicates that
// the delta pull should be retried until network issues are resolved,
// rather than falling back to a regular pull.
throw e;
}
lastError = e as Error;
logFn(
`Delta apply failed, retrying (${tryCount + 1}/${DELTA_APPLY_RETRY_COUNT})...`,
);
}
}
if (lastError) {
throw new DeltaApplyError(lastError.message);
}
} }
break; break;
default: default:
throw new Error(`Unsupported delta version: ${deltaOpts.deltaVersion}`); throw new Error(`Unsupported delta version: ${deltaOpts.deltaVersion}`);
} }
} catch (e) { } catch (e) {
// Log appropriate message based on error type
if (e instanceof OutOfSyncError) { if (e instanceof OutOfSyncError) {
logFn('Falling back to regular pull due to delta out of sync error'); logFn('Falling back to regular pull due to delta out of sync error');
return await fetchImageWithProgress(imgDest, deltaOpts, onProgress); } else if (e instanceof DeltaServerError) {
logFn(
`Falling back to regular pull due to delta server error (${e.statusCode})${e.statusMessage ? `: ${e.statusMessage}` : ''}`,
);
} else if (e instanceof DeltaApplyError) {
// A delta apply error is raised from the Engine and doesn't have a status code
logFn(
`Falling back to regular pull due to delta apply error${e.message ? `: ${e.message}` : ''}`,
);
} else { } else {
logFn(`Delta failed with ${e}`); logFn(`Delta failed with ${e}`);
throw e; throw e;
} }
// For handled errors, fall back to regular pull
return fetchImageWithProgress(imgDest, deltaOpts, onProgress);
} }
logFn(`Delta applied successfully`); logFn(`Delta applied successfully`);
return id; return id!;
} }
export async function fetchImageWithProgress( export async function fetchImageWithProgress(
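The retry-then-fall-back flow introduced in this hunk can be sketched in isolation. This is a minimal sketch under stated assumptions, not the supervisor's implementation: `applyWithRetries`, `fetchWithFallback`, `ApplyFailedError`, and the `rethrow` predicate are hypothetical names standing in for the delta apply loop, the surrounding `try`/`catch`, `DeltaApplyError`, and `isStatusError` respectively.

```typescript
// Hypothetical error type standing in for DeltaApplyError
class ApplyFailedError extends Error {
	constructor(message?: string) {
		super(message);
		// Keep `instanceof` working even when compiled to older targets
		Object.setPrototypeOf(this, new.target.prototype);
	}
}

// Run `attempt` up to `maxTries` times. Errors matched by `rethrow` are
// propagated immediately; any other error is remembered and retried.
async function applyWithRetries<T>(
	attempt: () => Promise<T>,
	maxTries: number,
	rethrow: (e: unknown) => boolean = () => false,
): Promise<T> {
	let lastError: Error | undefined;
	for (let tryCount = 0; tryCount < maxTries; tryCount++) {
		try {
			// Returning here means an earlier failure followed by a
			// success is never reported as an error.
			return await attempt();
		} catch (e) {
			if (rethrow(e)) {
				throw e;
			}
			lastError = e instanceof Error ? e : new Error(String(e));
		}
	}
	throw new ApplyFailedError(lastError?.message ?? 'no attempts were made');
}

// Use `fallback` only once every retry of `primary` has failed
async function fetchWithFallback<T>(
	primary: () => Promise<T>,
	fallback: () => Promise<T>,
	maxTries = 3,
): Promise<T> {
	try {
		return await applyWithRetries(primary, maxTries);
	} catch (e) {
		if (e instanceof ApplyFailedError) {
			return fallback();
		}
		throw e;
	}
}
```

In this sketch, errors selected by `rethrow` bypass both the retries and the fallback, mirroring how the diff rethrows status errors so that the caller retries the delta instead of switching to a regular pull.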


@ -70,6 +70,13 @@ export class InvalidNetGatewayError extends TypedError {}
export class DeltaStillProcessingError extends TypedError {} export class DeltaStillProcessingError extends TypedError {}
export class DeltaServerError extends StatusError {}
export class DeltaApplyError extends Error {
constructor(message?: string) {
super(message);
}
}
export class UpdatesLockedError extends TypedError {} export class UpdatesLockedError extends TypedError {}
export function isHttpConflictError(err: { statusCode: number }): boolean { export function isHttpConflictError(err: { statusCode: number }): boolean {


@ -87,5 +87,4 @@ export const touch = (file: string, time = new Date()) =>
); );
// Get the system boot time as a Date object // Get the system boot time as a Date object
export const getBootTime = () => export const getBootTime = () => new Date(Date.now() - uptime() * 1000);
new Date(new Date().getTime() - uptime() * 1000);

40
src/lib/reboot.ts Normal file

@ -0,0 +1,40 @@
import { pathOnRoot } from '../lib/host-utils';
import * as fsUtils from '../lib/fs-utils';
import { promises as fs } from 'fs';
import * as logger from '../logging';
// This is the file in the host /tmp directory that marks the need
// for a reboot. Since a reboot is currently only triggered by some
// config changes, we leave this here for now. There is planned
// functionality to allow image installs to require reboots; at that
// point this constant can be moved somewhere else.
const REBOOT_BREADCRUMB = pathOnRoot(
'/tmp/balena-supervisor/reboot-after-apply',
);
export async function setRebootBreadcrumb(source: Dictionary<any> = {}) {
// Just create the file. The last step in the target state calculation will check
// the file and create a reboot step
await fsUtils.touch(REBOOT_BREADCRUMB);
logger.logSystemMessage(
`Reboot has been scheduled to apply changes: ${JSON.stringify(source)}`,
{},
'Reboot scheduled',
);
}
export async function isRebootBreadcrumbSet() {
return await fsUtils.exists(REBOOT_BREADCRUMB);
}
export async function isRebootRequired() {
const hasBreadcrumb = await fsUtils.exists(REBOOT_BREADCRUMB);
if (hasBreadcrumb) {
const stats = await fs.stat(REBOOT_BREADCRUMB);
// If the breadcrumb exists and the last modified time is greater than the
// boot time, that means we need to reboot
return stats.mtime.getTime() > fsUtils.getBootTime().getTime();
}
return false;
}
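The `isRebootRequired` check above reduces to comparing two timestamps: the breadcrumb's modification time and the system boot time. A minimal sketch of that comparison as a pure function, assuming a hypothetical helper name `rebootStillPending` and plain `Date` inputs in place of the `fs.stat` and `getBootTime` calls:

```typescript
// Hypothetical helper: returns true when the breadcrumb was touched after
// the system booted, i.e. the requested reboot has not happened yet.
// `breadcrumbMtime` is null when the breadcrumb file does not exist.
function rebootStillPending(
	breadcrumbMtime: Date | null,
	bootTime: Date,
): boolean {
	if (breadcrumbMtime == null) {
		return false;
	}
	return breadcrumbMtime.getTime() > bootTime.getTime();
}

const now = Date.now();
const bootedThirtyMinutesAgo = new Date(now - 30 * 60 * 1000);
const touchedFiveMinutesAgo = new Date(now - 5 * 60 * 1000);

// Breadcrumb written after boot: the reboot is still pending
console.log(rebootStillPending(touchedFiveMinutesAgo, bootedThirtyMinutesAgo));
// Breadcrumb older than the current boot: the reboot already happened
console.log(rebootStillPending(touchedFiveMinutesAgo, new Date(now)));
```

Keeping the comparison separate from the file system access makes the two cases easy to exercise with fixed dates, similar to how the tests in this change pass `bootTime` through the context object.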


@ -1,6 +1,7 @@
import _ from 'lodash'; import _ from 'lodash';
import { promises as fs, watch } from 'fs'; import { promises as fs, watch } from 'fs';
import networkCheck from 'network-checker'; import { checkHost as checkNetHost, monitor } from 'network-checker';
import type { ConnectOptions, MonitorChangeFunction } from 'network-checker';
import os from 'os'; import os from 'os';
import url from 'url'; import url from 'url';
@ -20,21 +21,16 @@ const networkPattern = {
let isConnectivityCheckPaused = false; let isConnectivityCheckPaused = false;
let isConnectivityCheckEnabled = true; let isConnectivityCheckEnabled = true;
function checkHost( async function checkHost(opts: ConnectOptions): Promise<boolean> {
opts: networkCheck.ConnectOptions,
): boolean | PromiseLike<boolean> {
return ( return (
!isConnectivityCheckEnabled || !isConnectivityCheckEnabled ||
isConnectivityCheckPaused || isConnectivityCheckPaused ||
networkCheck.checkHost(opts) (await checkNetHost(opts))
); );
} }
function customMonitor( function customMonitor(options: ConnectOptions, fn: MonitorChangeFunction) {
options: networkCheck.ConnectOptions, return monitor(checkHost, options, fn);
fn: networkCheck.MonitorChangeFunction,
) {
return networkCheck.monitor(checkHost, options, fn);
} }
export function enableCheck(enable: boolean) { export function enableCheck(enable: boolean) {
@ -60,7 +56,7 @@ export const startConnectivityCheck = _.once(
async ( async (
apiEndpoint: string, apiEndpoint: string,
enable: boolean, enable: boolean,
onChangeCallback?: networkCheck.MonitorChangeFunction, onChangeCallback?: MonitorChangeFunction,
) => { ) => {
enableConnectivityCheck(enable); enableConnectivityCheck(enable);
if (!apiEndpoint) { if (!apiEndpoint) {


@ -5,7 +5,7 @@ import type { SinonStub, SinonSpy } from 'sinon';
import { stub, spy } from 'sinon'; import { stub, spy } from 'sinon';
import { expect } from 'chai'; import { expect } from 'chai';
import * as deviceConfig from '~/src/device-config'; import * as deviceConfig from '~/src/device-state/device-config';
import * as fsUtils from '~/lib/fs-utils'; import * as fsUtils from '~/lib/fs-utils';
import * as logger from '~/src/logging'; import * as logger from '~/src/logging';
import { Extlinux } from '~/src/config/backends/extlinux'; import { Extlinux } from '~/src/config/backends/extlinux';
@ -84,6 +84,7 @@ describe('device-config', () => {
SUPERVISOR_LOCAL_MODE: 'false', SUPERVISOR_LOCAL_MODE: 'false',
SUPERVISOR_CONNECTIVITY_CHECK: 'true', SUPERVISOR_CONNECTIVITY_CHECK: 'true',
SUPERVISOR_LOG_CONTROL: 'true', SUPERVISOR_LOG_CONTROL: 'true',
SUPERVISOR_API_REQUEST_TIMEOUT: '59000',
SUPERVISOR_DELTA: 'false', SUPERVISOR_DELTA: 'false',
SUPERVISOR_DELTA_REQUEST_TIMEOUT: '59000', SUPERVISOR_DELTA_REQUEST_TIMEOUT: '59000',
SUPERVISOR_DELTA_APPLY_TIMEOUT: '0', SUPERVISOR_DELTA_APPLY_TIMEOUT: '0',


@ -335,7 +335,7 @@ describe('ApiBinder', () => {
before(async () => { before(async () => {
await initModels(components, '/config-apibinder.json'); await initModels(components, '/config-apibinder.json');
previousLastFetch = TargetState.lastFetch; previousLastFetch = TargetState.lastSuccessfulFetch;
}); });
after(async () => { after(async () => {


@ -8,7 +8,7 @@ import { expect } from 'chai';
import * as TargetState from '~/src/api-binder/poll'; import * as TargetState from '~/src/api-binder/poll';
import Log from '~/lib/supervisor-console'; import Log from '~/lib/supervisor-console';
import * as request from '~/lib/request'; import * as request from '~/lib/request';
import * as deviceConfig from '~/src/device-config'; import * as deviceConfig from '~/src/device-state/device-config';
import { UpdatesLockedError } from '~/lib/errors'; import { UpdatesLockedError } from '~/lib/errors';
import { setTimeout } from 'timers/promises'; import { setTimeout } from 'timers/promises';


@ -1,5 +1,4 @@
import chai from 'chai'; import chai from 'chai';
import chaiAsPromised from 'chai-as-promised';
import sinonChai from 'sinon-chai'; import sinonChai from 'sinon-chai';
import chaiThings from 'chai-things'; import chaiThings from 'chai-things';
import chaiLike from 'chai-like'; import chaiLike from 'chai-like';
@ -14,9 +13,11 @@ import chaiLike from 'chai-like';
* If unsure whether to add to global fixtures, refer to the chart above. * If unsure whether to add to global fixtures, refer to the chart above.
* Also, avoid setting global mutable variables here. * Also, avoid setting global mutable variables here.
*/ */
export const mochaGlobalSetup = function () { export const mochaGlobalSetup = async function () {
console.log('Setting up global fixtures for tests...'); console.log('Setting up global fixtures for tests...');
const { default: chaiAsPromised } = await import('chai-as-promised');
/* Setup chai assertion plugins */ /* Setup chai assertion plugins */
chai.use(chaiAsPromised); chai.use(chaiAsPromised);
chai.use(sinonChai); chai.use(sinonChai);


@ -21,6 +21,8 @@ const defaultContext = {
downloading: [] as string[], downloading: [] as string[],
lock: null, lock: null,
hasLeftoverLocks: false, hasLeftoverLocks: false,
rebootBreadcrumbSet: false,
bootTime: new Date(Date.now() - 30 * 60 * 1000), // 30 minutes ago
}; };
const mockLock: Lock = { const mockLock: Lock = {
@ -2111,6 +2113,128 @@ describe('compose/app', () => {
); );
expectSteps('start', steps3, 2); expectSteps('start', steps3, 2);
}); });
it('should set the reboot breadcrumb after a service with `requires-reboot` has been installed', async () => {
// Container is a "run once" type of service so it has exited.
const current = createApp({
services: [
await createService(
{
labels: { 'io.balena.update.requires-reboot': 'true' },
running: false,
},
{ state: { createdAt: new Date(), status: 'Installed' } },
),
],
networks: [DEFAULT_NETWORK],
});
// The target wants the service running, but a reboot must be requested first
const target = createApp({
services: [
await createService({
labels: { 'io.balena.update.requires-reboot': 'true' },
running: true,
}),
],
isTarget: true,
});
const steps = current.nextStepsForAppUpdate(
{
...defaultContext,
rebootBreadcrumbSet: false,
// 30 minutes ago
bootTime: new Date(Date.now() - 30 * 60 * 1000),
},
target,
);
expect(steps.length).to.equal(1);
expectSteps('requireReboot', steps);
});
it('should not try to start a container with `requires-reboot` if the reboot has not taken place yet', async () => {
// Container is a "run once" type of service so it has exited.
const current = createApp({
services: [
await createService(
{
labels: { 'io.balena.update.requires-reboot': 'true' },
running: false,
},
{ state: { createdAt: new Date(), status: 'Installed' } },
),
],
networks: [DEFAULT_NETWORK],
});
// Now test that another start step is not added on this service
const target = createApp({
services: [
await createService({
labels: { 'io.balena.update.requires-reboot': 'true' },
running: true,
}),
],
isTarget: true,
});
const steps = current.nextStepsForAppUpdate(
{
...defaultContext,
rebootBreadcrumbSet: true,
bootTime: new Date(Date.now() - 30 * 60 * 1000),
},
target,
);
expect(steps.length).to.equal(0);
expectNoStep('start', steps);
});
it('should start a container with `requires-reboot` after reboot has taken place', async () => {
// Container is a "run once" type of service so it has exited.
const current = createApp({
services: [
await createService(
{
labels: { 'io.balena.update.requires-reboot': 'true' },
running: false,
},
// Container was created 5 minutes ago
{
state: {
createdAt: new Date(Date.now() - 5 * 60 * 1000),
status: 'Installed',
},
},
),
],
networks: [DEFAULT_NETWORK],
});
// The target wants the service running; since the reboot has happened, a start step is expected
const target = createApp({
services: [
await createService({
labels: { 'io.balena.update.requires-reboot': 'true' },
running: true,
}),
],
isTarget: true,
});
const steps = current.nextStepsForAppUpdate(
{
...defaultContext,
rebootBreadcrumbSet: true,
// Reboot just happened
bootTime: new Date(),
},
target,
);
expect(steps.length).to.equal(1);
expectSteps('start', steps);
});
}); });
describe('image state behavior', () => { describe('image state behavior', () => {
@ -2275,5 +2399,19 @@ describe('compose/app', () => {
const [releaseLockStep] = expectSteps('releaseLock', steps, 1); const [releaseLockStep] = expectSteps('releaseLock', steps, 1);
expect(releaseLockStep).to.have.property('appId').that.equals(1); expect(releaseLockStep).to.have.property('appId').that.equals(1);
}); });
it('should infer a releaseLock step when removing an app', async () => {
const current = createApp({
services: [],
networks: [],
});
const steps = current.stepsToRemoveApp({
...defaultContext,
lock: mockLock,
});
const [releaseLockStep] = expectSteps('releaseLock', steps, 1);
expect(releaseLockStep).to.have.property('appId').that.equals(1);
});
}); });
}); });