We have observed that even when setting the socket timeout on the
state poll https request, the timeout is only applied once the socket is
connected. This causes issues with Node's auto family selection (Happy
Eyeballs), as the default https timeout is 5s, which means that a larger
[auto select attempt timeout](https://nodejs.org/docs/latest-v22.x/api/net.html#netgetdefaultautoselectfamilyattempttimeout) may result in the socket timing out before all connection attempts have been tried.
This commit sets a different https Agent for state polling, with a
timeout matching the `apiRequestTimeout` used for other request events.
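As a rough illustration of the approach, a dedicated Agent could be built as in the sketch below. The helper names and the idea of passing `apiRequestTimeout` as a plain number of milliseconds are assumptions made for the example, not the Supervisor's actual code.

```typescript
import * as https from 'https';

// Hypothetical helper: derive the Agent options for state polling from the
// configured API request timeout (assumed to be in milliseconds).
function statePollAgentOptions(apiRequestTimeout: number): https.AgentOptions {
	return { timeout: apiRequestTimeout };
}

// Hypothetical helper: a separate Agent used only for state poll requests,
// so its socket timeout does not depend on the default https Agent.
function createStatePollAgent(apiRequestTimeout: number): https.Agent {
	return new https.Agent(statePollAgentOptions(apiRequestTimeout));
}
```

The resulting Agent would then be passed as the `agent` option of the state poll request.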
Change-type: patch
The Target.lastFetch time that is compared when performing the healthcheck
is reset any time a poll is attempted, no matter the outcome. This changes
the behavior so the time is reset only on a successful poll.
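The intended behavior can be sketched as follows. `PollState`, `recordPollOutcome`, `isPollHealthy` and the `maxAgeMs` threshold are hypothetical names used only for illustration; only `lastFetch` comes from the description above.

```typescript
// Hypothetical shape holding the time of the last successful poll.
interface PollState {
	lastFetch: number; // epoch milliseconds
}

// Only a successful poll moves lastFetch forward; a failed attempt
// leaves the previous value untouched.
function recordPollOutcome(
	state: PollState,
	succeeded: boolean,
	now: number,
): PollState {
	return succeeded ? { ...state, lastFetch: now } : state;
}

// Hypothetical healthcheck: healthy while the last successful poll is
// recent enough.
function isPollHealthy(state: PollState, now: number, maxAgeMs: number): boolean {
	return now - state.lastFetch <= maxAgeMs;
}
```

With this shape, repeated failed polls eventually make the healthcheck fail instead of being masked by the attempts themselves.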
Change-type: patch
This was mistakenly increased due to confusion between the timeout for
requests to the supervisor's API and the timeout for requests from the
supervisor to the balenaCloud API. This separates the two configs and
documents the difference between the timeouts, whilst also decreasing
the timeout for balenaCloud API requests to the correct/expected value.
Change-type: patch
If the Supervisor receives a 401 Unauthorized from the delta server
when requesting a delta image location, we should surface the error
instead of falling back to a regular pull immediately, as there could
be an issue with the delta auth token, which refreshes after
DELTA_TOKEN_TIMEOUT (10min), or some other edge case.
Change-type: patch
Signed-off-by: Christina Ying Wang <christina@balena.io>
This prevents an image download error loop where the delta image on the delta server is present,
but some aspect of the delta image or the base image on the device does not match up, causing
the delta to fail to be applied to the base image.
Delta apply errors don't raise status codes as they are thrown from the Engine (although they should),
so if an error with a status code is raised during this time, throw an error to the handler
indicating that the delta should be retried until success. Errors with status codes raised during
this time are largely network related, so falling back to a regular pull won't improve anything.
Upon delta apply errors exceeding DELTA_APPLY_RETRY_COUNT, revert to a regular pull.
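A minimal sketch of the decision described above, assuming the failure carries an optional `statusCode` and that the caller counts apply failures. `DeltaFailure`, `nextPullAction` and the `retryCount` parameter (standing in for DELTA_APPLY_RETRY_COUNT, whose value is not given here) are hypothetical.

```typescript
type PullAction = 'retry-delta' | 'regular-pull';

// Hypothetical error shape: Engine apply errors carry no status code,
// while network-related errors do.
interface DeltaFailure {
	statusCode?: number;
}

// applyFailures is the number of delta apply errors seen so far,
// including the current one.
function nextPullAction(
	failure: DeltaFailure,
	applyFailures: number,
	retryCount: number,
): PullAction {
	if (failure.statusCode != null) {
		// Errors with status codes are largely network related, so a
		// regular pull would not help: keep retrying the delta.
		return 'retry-delta';
	}
	// Apply errors: revert to a regular pull once the count is exceeded.
	return applyFailures > retryCount ? 'regular-pull' : 'retry-delta';
}
```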
Change-type: patch
Signed-off-by: Christina Ying Wang <christina@balena.io>
If the delta server responds immediately with HTTP 4xx upon requesting a delta image,
this means the server is not able to supply the resource, so fall back to a regular pull
immediately.
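As a sketch of that check, assuming the status code of the delta server's response is available as a number (the function name is hypothetical):

```typescript
// Any 4xx response to the delta request is taken to mean the server cannot
// supply the resource, so the caller falls back to a regular pull.
function shouldFallBackToRegularPull(statusCode: number): boolean {
	return statusCode >= 400 && statusCode < 500;
}
```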
Change-type: patch
Signed-off-by: Christina Ying Wang <christina@balena.io>
gcanti/io-ts#705 fixes an issue with io-ts and non-enumerable
properties, but it causes objects with invalid properties to be
removed during `decode`, which breaks our validation tests.
We need to figure out what the right behavior is for us.
Change-type: patch
This label can be used by user services to indicate that a reboot is
required after the install of a service in order to fully apply an update.
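A sketch of how such a label might be checked. The label key is passed in as a parameter because its exact name is not given here, and treating the string `true` as "reboot required" is an assumption about the value format.

```typescript
type Labels = Record<string, string>;

// Hypothetical check: the service asks for a reboot when the label is
// present and set to "true" (case-insensitive).
function serviceRequiresReboot(labels: Labels, labelKey: string): boolean {
	return (labels[labelKey] ?? 'false').trim().toLowerCase() === 'true';
}

// A reboot is needed if any newly installed service carries the label.
function anyServiceRequiresReboot(
	services: Array<{ labels: Labels }>,
	labelKey: string,
): boolean {
	return services.some((svc) => serviceRequiresReboot(svc.labels, labelKey));
}
```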
Change-type: minor
This was on device-config before, but we'll need to set the reboot
breadcrumb from the application-manager as well when we introduce
`requires-reboot` as a label.
Change-type: patch
Move the device-config module to the device-state folder and export only
those functions that are needed elsewhere in the codebase.
This moves us closer to making the device-state module the only way to
modify application and configuration.
Change-type: patch