Commit Graph

1588 Commits

Author SHA1 Message Date
Christina Ying Wang
ce5bf89dfc Move /v1 routes in deviceState.router to v1.ts
Signed-off-by: Christina Ying Wang <christina@balena.io>
2022-10-18 14:16:53 -07:00
Christina Ying Wang
a2d9af2407 Move /v1 routes in apiBinder.router to v1.ts
Signed-off-by: Christina Ying Wang <christina@balena.io>
2022-10-18 14:16:53 -07:00
Christina Ying Wang
d08f25f0a3 Consolidate API middlewares, move api-keys to device-api
Signed-off-by: Christina Ying Wang <christina@balena.io>
2022-10-18 14:16:53 -07:00
Christina Ying Wang
5af146ec4e Move supervisor-api.ts to device-api/index.ts
Signed-off-by: Christina Ying Wang <christina@balena.io>
2022-10-18 14:16:53 -07:00
pipex
97ec2a4151 Remove unused code from dbus module 2022-10-17 10:15:36 -03:00
pipex
4de816d1e9 Fix bug in preloading config vars 2022-10-17 10:15:36 -03:00
Ken Bannister
91f93952db Allow most printable ASCII chars for service label key
Change-type: patch
Signed-off-by: Ken Bannister <kb2ma@runbox.com>
2022-10-14 20:50:25 -04:00
Felipe Lalanne
b4514631b1 Start state engine and API binder in parallel
The state engine and preloading are performed before the device gets a
chance to register. While this is desirable for preloaded apps, it
introduces a delay on registration which is known to cause issues since
the VPN is also trying to connect at the same time.

This triggers a simultaneous start of the device engine, the API binder
and the supervisor API to avoid delays.

Change-type: patch
2022-09-30 19:38:10 +00:00
Pagan Gazzard
0237bd7cf4 Update type dependencies
Change-type: patch
2022-10-03 14:38:42 -05:00
Felipe Lalanne
f19f70d690 Migrate update-lock tests as integration tests
Update-lock tests now use the actual filesystem for testing, instead of
relying on stubs and spies.

This commit also fixes a small bug with update-lock that would cause a
`PromiseRejectionHandledWarning` when the lock callback would throw.
2022-09-28 10:37:41 -03:00
Felipe Lalanne
a69fbf6eac Migrate volume-manager tests to integration
Now the tests are run against the actual docker engine instead of
against mockerode.

The new tests actually caught a bug in
`volumeManager.removeOrphanedVolumes`, where that function would try to
remove volumes for stopped containers, causing an exception.
This commit also fixes that bug.
2022-09-28 10:37:41 -03:00
Felipe Lalanne
460659429d Update dependencies to fix NPM build
Change-type: patch
2022-09-26 15:26:48 -03:00
Felipe Lalanne
b168cc35a0 Remove mixpanel configurations
Mixpanel configurations and packages are no longer used. This removes
dead code from the supervisor.
2022-09-20 14:22:24 -03:00
Felipe Lalanne
e00687408c Disable event tracking
The supervisor used to rely on specific event reporting for identifying
issues at runtime. As the platform has grown, it has become much more
difficult to get any signal from the event noise. Recently the API side
for these events has been disabled, meaning these events only
contribute to bandwidth consumption.  This commit disables the
event reporting feature of the supervisor which will be most likely
replaced by something like Sentry in the near future.

Change-type: minor
2022-09-20 14:19:26 -03:00
Pagan Gazzard
5518eb17bd Update to nodejs 16
Change-type: minor
2022-09-19 17:51:48 +01:00
Pagan Gazzard
96418d55b5 Update @balena/lint to 6.2.0
Change-type: patch
2022-09-19 16:41:28 +01:00
Pagan Gazzard
a4c13aa2e9 Update to typescript 4.8.2
Change-type: patch
2022-09-19 16:36:17 +01:00
Pagan Gazzard
65e69f3a83 Update to nodejs 14
Change-type: patch
2022-09-15 22:59:40 +01:00
Felipe Lalanne
f7bc30a310 Remove unnecessary check for docker status code 2022-09-14 10:41:32 -03:00
Felipe Lalanne
c6f911c36b Only install service if running is set to false
The supervisor supports target state `running: false` for services.
This state indicates that the service should be stopped if already
running, or that the container should just be created and never started
if the container does not exist. This commit fixes the latter behavior.

Although nothing in our platform currently sends this target state, this
enables some potential use cases, e.g. only starting some services
in manufacturing and starting the rest of the services when the device
actually connects.

Change-type: patch
Closes: #2014
2022-09-14 10:15:51 -03:00
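The `running: false` behavior described in c6f911c36b can be pictured as a small decision function. This is only a sketch under stated assumptions: the `ContainerStatus` and `Step` names and the `nextStep` function are hypothetical and are not taken from the supervisor source.

```typescript
// Hypothetical container states and steps, for illustration only.
type ContainerStatus = 'missing' | 'created' | 'running' | 'stopped';
type Step = 'install' | 'start' | 'stop' | 'noop';

function nextStep(targetRunning: boolean, current: ContainerStatus): Step {
  // A missing container is always created first, whatever the target says.
  if (current === 'missing') {
    return 'install';
  }
  if (targetRunning) {
    return current === 'running' ? 'noop' : 'start';
  }
  // running: false -> stop a running container, never start a created one.
  return current === 'running' ? 'stop' : 'noop';
}
```

With `running: false`, a missing container yields `install` and the created container then yields `noop`, which matches the "created and never started" case in the message.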
Felipe Lalanne
3e45e9561e Fix withDefault type helper to work with boolean
`withDefault` is a type helper that creates a type that falls back
to a default value when decoding a nullish value.
That type was not correctly working with boolean types, causing `false`
values to be replaced by true. This would specifically cause issues when
parsing the target state, where a `running: false` in a service would
become a `running: true` due to the type decoding.

Change-type: patch
2022-09-13 20:08:32 +00:00
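The symptom described in 3e45e9561e (`false` replaced by `true`) is typical of defaulting with a truthiness check. The message does not say what the actual cause was, so the following is only an illustration of that class of bug; `withDefaultBuggy` and `withDefaultFixed` are hypothetical plain functions, not the supervisor's type helper.

```typescript
// Hypothetical stand-ins; the real helper works at the type-decoding level.
function withDefaultBuggy<T>(value: T | null | undefined, fallback: T): T {
  // `||` treats every falsy value (false, 0, '') as missing.
  return (value || fallback) as T;
}

function withDefaultFixed<T>(value: T | null | undefined, fallback: T): T {
  // `??` falls back only on null or undefined, so `false` survives.
  return value ?? fallback;
}

const target = { running: false };
const buggy = withDefaultBuggy<boolean>(target.running, true);
const fixed = withDefaultFixed<boolean>(target.running, true);
```

Here `buggy` ends up `true` while `fixed` keeps the `false` from the target state.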
Felipe Lalanne
5a57647450 Fix filtering of the supervisor app on the target state
Under some conditions, an aarch64 device may get a reference to an armv7hf
supervisor on the target state. One of the ways this can happen is if
an aarch64 device is added to an armv7hf fleet and the target supervisor
is set before the device fully provisions.

If that happens, the previous filtering for the supervisor app (which
relied on the architecture in device-type.json) would
fail and the user would end up with two supervisor containers, one
running correctly and the other crash looping.

This fixes the filtering and just checks if the supervisor uuid/service
name belongs to a group of known uuids.

Closes: #2006
Change-type: patch
2022-09-12 16:28:22 -03:00
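The filtering change in 5a57647450 amounts to a membership test that does not depend on the device architecture. A minimal sketch, assuming a hypothetical `TargetApp` shape and placeholder uuids (the real list of known uuids is not given in the message):

```typescript
// Hypothetical shape of an app entry in the target state.
interface TargetApp {
  uuid: string;
  name: string;
}

// Placeholder values; the real supervisor app uuids are not in the message.
const SUPERVISOR_APP_UUIDS = new Set(['0000aaaa', '0000bbbb']);

function isSupervisorApp(app: TargetApp): boolean {
  // Membership in a fixed set works regardless of the app's architecture.
  return SUPERVISOR_APP_UUIDS.has(app.uuid);
}

function userApps(apps: TargetApp[]): TargetApp[] {
  return apps.filter((app) => !isSupervisorApp(app));
}
```

A supervisor app built for another architecture is still filtered out, because the check never looks at the architecture.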
Thodoris Greasidis
fadd514463 Set desired es-version for downstream modules that support it
Change-type: patch
2022-09-07 17:07:16 +03:00
Felipe Lalanne
48e0733c7e Remove side effects for module imports
The supervisor uses the following pattern for async module
initialization

```typescript
// module.ts

export const initialised = (async () => {
    // do some async initialization
})();

// somewhere else
import * as module from 'module';

async function setup() {
  await module.initialised;
}
```

The above pattern means that whenever the module is imported, the
initialisation procedure will be run, which is an anti-pattern.

This converts any instance of this pattern into a function

```typescript
export const initialised = _.once(async () => {
    // do some async initialization
});
```

And anywhere else in the code it replaces the usage with a call

```typescript
await module.initialised();
```

Change-type: patch
2022-09-06 15:48:18 -04:00
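To make the effect of the pattern in 48e0733c7e concrete, here is a dependency-free sketch of a `once` wrapper. It is a simplified stand-in for lodash's `_.once` (the real code uses lodash); `once`, `initCount` and `initialisedSketch` are hypothetical names.

```typescript
// Simplified stand-in for lodash's `_.once`: run `fn` on the first call only
// and return the cached result afterwards.
function once<T>(fn: () => T): () => T {
  let called = false;
  let result: T | undefined;
  return () => {
    if (!called) {
      called = true;
      result = fn();
    }
    return result as T;
  };
}

let initCount = 0;

// Nothing runs at import time; the body runs on the first call only.
const initialisedSketch = once(async () => {
  initCount += 1; // stands in for "do some async initialization"
});
```

Every caller that awaits `initialisedSketch()` receives the same promise, so the initialization body runs at most once and only when something asks for it.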
Alexandru Costache
36544b7d6e Add custom DTB support for imx8mm-var-som
Change-type: patch
Signed-off-by: Alexandru Costache <alexandru@balena.io>
2022-09-06 16:33:35 +02:00
Felipe Lalanne
e0e1eacc6e Migrate lockfile tests to testfs
Since tests are run in a container, lockfile tests no longer need to
mock the behavior of the `lockfile` binary.
2022-08-24 16:07:25 -04:00
Christina Wang
12b67742c8 Wait for Stopping services to stop before target apply success
This mitigates an edge case bug introduced in v13.1.3 where services that
are slow to exit may get stuck in a state of Downloaded if a service var is
changed then reverted rapidly. More detailed description in linked issue.

Change-type: patch
Closes: #1991
Signed-off-by: Christina Wang <christina@balena.io>
2022-08-02 14:34:25 -07:00
Felipe Lalanne
0c4e6ce421 Disable healthchecks failing on report errors
State report errors contribute to the supervisor failing healthchecks
and being restarted by the engine. There is no evidence of this
improving the connectivity situation and it is likely to make things
worse for the API as the first report is much more expensive than
subsequent partial reports.

Change-type: patch
Closes: #1986
2022-07-18 15:53:26 -04:00
Felipe Lalanne
861e902d7f Allow directories to be used as lockfiles
Some libraries, like [proper-lockfile](https://www.npmjs.com/package/proper-lockfile)
use directories instead of files for locking. This PR allows the supervisor to be able to
work with those types of locks when lock override is requested.

Closes: #1978
Change-type: patch
2022-07-13 13:05:38 -04:00
Christina Wang
0fc79e87d9 Allow host config patch regardless of running applications
Host config shouldn't be tied to applications in the first place, but
needs to be done so because it uses update locks to determine when it's
safe to patch host config, and update locks are tied to apps.

Change-type: patch
Signed-off-by: Christina Wang <christina@balena.io>
2022-07-06 12:41:51 -07:00
Christina Wang
a7a0821a3e Read hostname from config.json with container /etc/hostname as backup
We don't need to read the host's hostname through /mnt/root/etc/hostname,
because the hostname is written to config.json on a change. When the hostname
has never changed, it won't be found in config.json, so we can default to
the Supervisor container's /etc/hostname as it will match the host's
/etc/hostname, the network mode being `host`.

Closes: #1968
Change-type: patch
Signed-off-by: Christina Wang <christina@balena.io>
2022-06-15 11:31:36 -07:00
Christina Wang
dfb6bcf0e6 Add custom DTB support for Variscite Dart DT family
Closes: #1963
Relates-to: https://github.com/balena-os/balena-variscite-mx8/pull/134
Relates-to: https://github.com/balena-io/open-balena-api/issues/1033
Change-type: patch
Signed-off-by: Christina Wang <christina@balena.io>
2022-06-14 11:26:45 -07:00
Christina Wang
ffa1c73418 Better document mocked-dbus, add missing dbus interface methods
Change-type: patch
Signed-off-by: Christina Wang <christina@balena.io>
2022-06-09 11:41:50 -07:00
Felipe Lalanne
99efd12acd Do not send name as part of the current state
This fixes a race condition that could occur with the first current
state report, where if the device managed to send the current state
report first, then the device name on the cloud would be set to `local`
(see #1959).

Closes: #1959
Change-type: patch
2022-06-07 15:14:21 -04:00
Christina Wang
be1c01039a Don't use config.get for appId when checking locks in host config PATCH
Change-type: patch
Signed-off-by: Christina Wang <christina@balena.io>
2022-06-06 12:15:23 -07:00
20k-ultra
aad5a9efc5 Use locks before shutdown/reboot instead of stopping containers
Closes: #1940
Change-type: patch
Signed-off-by: 20k-ultra <3946250+20k-ultra@users.noreply.github.com>
2022-06-02 00:31:42 -04:00
20k-ultra
471f0f0615 Refactor update-lock.lock to accept an array of applications to lock
Change-type: patch
Signed-off-by: 20k-ultra <3946250+20k-ultra@users.noreply.github.com>
2022-06-02 00:31:42 -04:00
20k-ultra
ef7371a7ef Refactor update-lock function to avoid callback hell
Change-type: patch
Signed-off-by: 20k-ultra <3946250+20k-ultra@users.noreply.github.com>
2022-06-02 00:31:42 -04:00
Christina Wang
6ccd2178c1 Use Mounts API for engine socket feature label
When upgrading to this Supervisor version, containers using the
engine feature label will be restarted.

Relates-to: https://github.com/balena-os/balena-supervisor/pull/1780
Closes: https://github.com/balena-os/balena-engine/issues/220
Closes: #1933
Change-type: major
Signed-off-by: Christina Wang <christina@balena.io>
2022-05-17 23:57:28 +00:00
Felipe Lalanne
af1a60f7c6 Throw a more explanatory error if migrating apps.json fails 2022-05-26 16:58:15 -04:00
Felipe Lalanne
303c805008 Fix check for preloaded v2 target state 2022-05-24 17:55:05 -04:00
Christina Wang
95bf4718d6 Only migrate apps.json on preload after target has been set
Change-type: patch
Signed-off-by: Christina Wang <christina@balena.io>
Signed-off-by: Felipe Lalanne <felipe@balena.io>
2022-05-24 17:54:38 -04:00
Felipe Lalanne
76553c6b4a Trim newlines from sysinfo files
Change-type: patch
2022-05-23 14:08:08 -04:00
Felipe Lalanne
29867ccf17 Fix serial number support for variscite boards
Closes: #1950

Change-type: patch
2022-05-23 10:29:22 -04:00
Christina Wang
2896444988 Log anonymous volumes
Signed-off-by: Christina Wang <christina@balena.io>
2022-05-17 11:08:23 -07:00
Christina Wang
0a9c7282e8 Add compose support for volumes defined with long syntax
balena-compose already supports this, and with this PR, Supervisor can
have the option of using HostConfig.Mounts for internal bind mounts such as
ones added by feature labels. This will be handled in a future PR.

The only blocker to having users use long syntax is adding this feature
to target state. This PR does not add that feature.

Relates-to: https://github.com/balena-os/balena-supervisor/pull/1780
Relates-to: https://github.com/balena-os/balena-engine/issues/220
Relates-to: #1933
Change-type: patch
Signed-off-by: Christina Wang <christina@balena.io>
2022-05-17 11:08:23 -07:00
20k-ultra
67f9c44a6c Prevent throttling reports when nothing was sent
Change-type: patch
Signed-off-by: 20k-ultra <3946250+20k-ultra@users.noreply.github.com>
2022-05-16 15:19:55 -04:00
Felipe Lalanne
a5ede01b18 Avoid splash image failures if image is corrupt
Splash image backend would throw if the image is not a valid png during
the write step. This could prevent the device from provisioning if some
corruption happens at some point.

Change-type: patch
2022-05-03 15:30:18 +00:00
Felipe Lalanne
c04955354a Use write + sync when writing configs to /mnt/boot
This commit updates all backends that write to /mnt/boot to do it
through a new `lib/host-utils` module. Writes are now done using write +
sync as rename is not an atomic operation in vfat.

The change also applies for writes through the `/v1/host-config`
endpoint.

Finally this change includes some improvements on tests.

Change-type: patch
2022-05-03 11:23:00 -04:00
20k-ultra
2e81a7328e Use delay instead of interval to recursively report state
Change-type: patch
Signed-off-by: 20k-ultra <3946250+20k-ultra@users.noreply.github.com>
2022-04-27 23:16:38 -04:00
20k-ultra
5437aea786 Remove in memory storage of started/stopped containers
Change-type: patch
Signed-off-by: 20k-ultra <3946250+20k-ultra@users.noreply.github.com>
2022-04-19 22:27:15 -04:00
20k-ultra
ca9945bdfb Only start a container once in its lifetime
This will ensure the restart policy specified is not violated

Change-type: patch
Closes: #1668
Signed-off-by: 20k-ultra <3946250+20k-ultra@users.noreply.github.com>
2022-04-19 22:27:15 -04:00
Christina Wang
babe10e2a7 Move Supervisor-specific from lockfile.ts to update-lock.ts to
make lockfile module more generic

BASE_LOCK_DIR, LOCKFILE_UID moved to update-lock.ts

Signed-off-by: Christina Wang <christina@balena.io>
2022-04-12 12:02:57 -07:00
Christina Wang
cfd3f03e4a Make lockfile cleanup multi-app aware
When disposing of resources which include Supervisor-created lockfiles,
only dispose of lockfiles for the specified user application.

Signed-off-by: Christina Wang <christina@balena.io>
2022-04-12 12:02:28 -07:00
Christina Wang
e9738b5f78 Modify update lock module to use new lockfile binary and library
Also uninstall lockfile NPM package as we're no longer using it

Signed-off-by: Christina Wang <christina@balena.io>
2022-04-12 12:02:28 -07:00
Christina Wang
51e63ea22b Add lockfile binary and internal lib for interfacing with it
The linked issue describes the Supervisor not cleaning up locks it creates due
to crashing at just the wrong time. After internal discussion we decided to
differentiate Supervisor-created lockfiles from user-created lockfiles by using
the `nobody` UID (65534) for Supervisor-created lockfiles.

As the existing NPM lockfile lib does not allow creating lockfiles atomically
with different UIDs, we move to using the lockfile binary, which is part of the
procmail package. To allow nonroot users to write to lock directories, permissions
are changed to allow write access by nonroot users.

See: https://www.flowdock.com/app/rulemotion/r-resinos/threads/gWMgK5hmR26TzWGHux62NpgJtVl
Change-type: minor
Closes: #1758
Signed-off-by: Christina Wang <christina@balena.io>
2022-04-12 12:02:26 -07:00
Felipe Lalanne
e6fa22306b Add system id/model support for Compulab IOT-gate
dmidecode for alpine 3.11 doesn't work in this device type. This change
moves to using `/proc/device-tree/product-sn` and
`/proc/device-tree/product-name` for these devices.

Resolves: #1916
Change-type: patch
2022-04-08 12:02:21 -04:00
20k-ultra
c1b5e58ebd Correctly evaluate downloadProgress when computing current state
Change-type: patch
Closes: #1918
Signed-off-by: 20k-ultra <3946250+20k-ultra@users.noreply.github.com>
2022-04-07 21:21:44 -04:00
Felipe Lalanne
eee2460445 Fix database migration for legacyApps
Migration `M00008` had a bug with the check for legacy apps, which
resulted in devices that had at some point been updated from a single
container supervisor to get the error

```
Undefined binding(s) detected when compiling UPDATE. Undefined column(s): [appUuid] query
```

This adds a new migration with the fix, to repair the
inconsistent database state.

Change-type: patch
Closes: #1913
2022-04-01 17:58:20 -03:00
Felipe Lalanne
b11696144f Only report current state of apps in the target state
If an app is not in the target state, it means the supervisor no longer
has permissions to that app, hence it cannot report on it. When moving
between apps, there is a transitional period where containers and images
from both apps can be in the current state, therefore filtering is
needed to prevent getting 401 errors from the API.
2022-03-22 19:28:43 -03:00
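The filtering described in b11696144f can be sketched as follows. The `AppReport` shape and the `reportableApps` function are hypothetical and only illustrate the idea of restricting the current state report to apps present in the target state.

```typescript
// Hypothetical shape of a per-app entry in the current state report.
interface AppReport {
  uuid: string;
  status: string;
}

function reportableApps(current: AppReport[], targetUuids: string[]): AppReport[] {
  const allowed = new Set(targetUuids);
  // Apps missing from the target state are dropped from the report, since
  // the device may no longer have permission to report on them.
  return current.filter((app) => allowed.has(app.uuid));
}
```

During a move between apps, containers from the old app may still be present locally, but they would not be reported.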
Felipe Lalanne
b2b1b111b3 Ignore the supervisor in the target state
Starting with v3 state endpoint, the supervisor may receive the configuration
for the supervisor service on the target state. This commit allows the
supervisor to filter out the supervisor container from the current and target
state to let the update-balena-supervisor script handle the creation and update
of the supervisor container.

Updating and creating the supervisor container will be handled by a
future commit
2022-03-22 19:28:43 -03:00
Felipe Lalanne
8e40f1c2f5 Ignore unknown image classes on the target state
Starting with v3 state endpoint, the supervisor can receive
service configuration for services that are meant to be installed as
overlays or filesets on the host, as well as configuration for services
that are meant to be installed on the root partition. This commit just
ignores those services from the target state until support is added
2022-03-22 19:28:43 -03:00
Felipe Lalanne
8bf8792583 Only uninstall 'fleet' apps when localMode is set
Local mode is still a device level config. Eventually it will become a
property of an app, but for now, we don't want the supervisor trying to
uninstall supervisor or host app when local mode is set
2022-03-22 19:28:43 -03:00
Felipe Lalanne
f1cd3d367c Cleanup unused methods and dependencies on db ids 2022-03-22 19:28:43 -03:00
Felipe Lalanne
381abeadb9 Refactor current state report to patch v3 state
This change makes the `api-binder/report` module more agnostic
to internal device state implementation details, moving necessary
healthchecks and data filtering to getCurrentForReport in device-state.

This also adds generic functions to perform comparison between current
state reports.
2022-03-22 19:28:36 -03:00
Felipe Lalanne
25e9ab4786 Refactor api-binder as a directory
The role of the api-binder module is to be the intermediary
between the cloud API and the device-state. For this reason it makes sense to
isolate target state retrieval and current state reporting into this
module. This change just moves current state reporting to the directory.
2022-03-22 19:08:03 -03:00
Felipe Lalanne
97f3b2a51e Update types and create methods for reporting v3 state 2022-03-22 19:08:03 -03:00
Felipe Lalanne
e9af9d8e83 Allow application manager to match apps between environments
If an app with the same app uuid exists between environments, the
supervisor will match the apps by uuid to prevent stopping the running
app
2022-03-22 19:08:03 -03:00
Felipe Lalanne
5c5483dd3d Rename networks to <appUuid>_<networkName>
This is required as we are phasing out app ids and we need to be able to
get app uuid from the current state of the network. The app-id now
exists as a container in new networks

This commit will restart containers as it needs to recreate the network.
2022-03-22 19:08:03 -03:00
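The `<appUuid>_<networkName>` convention from 5c5483dd3d can be illustrated with a pair of helpers. This is a sketch, not the supervisor's code: `composeNetworkName` and `parseNetworkName` are hypothetical, and the parser assumes the app uuid itself contains no underscore.

```typescript
function composeNetworkName(appUuid: string, networkName: string): string {
  return `${appUuid}_${networkName}`;
}

// Assumes the app uuid contains no underscore, so the first underscore
// separates the uuid from the user-facing network name.
function parseNetworkName(
  fullName: string,
): { appUuid: string; networkName: string } | undefined {
  const idx = fullName.indexOf('_');
  if (idx <= 0) {
    return undefined;
  }
  return {
    appUuid: fullName.slice(0, idx),
    networkName: fullName.slice(idx + 1),
  };
}
```

Splitting on the first underscore keeps network names that themselves contain underscores intact.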
Felipe Lalanne
0835b29874 Add app uuid as metadata to new volumes
We cannot modify older volumes but newly created volumes will contain
app uuid as metadata so they can be migrated at some point in the
future.
2022-03-22 19:08:03 -03:00
Felipe Lalanne
0b19dee511 Cleanup current state reporting methods
Removed redundant `getCurrentAppsForReport` and `getCurrentForComparison` since
the behavior of these methods is already handled by `getCurrentApps` and
`getCurrentState`.
2022-03-22 19:08:03 -03:00
Felipe Lalanne
063bd400a4 Convert target state in local endpoints
Convert target state to v3 in `/v2/local/target-state`. Add tests
for target state conversion
2022-03-22 19:08:03 -03:00
Felipe Lalanne
1edd060143 Clean up migration from legacy target state format
Creates `lib/legacy.ts` and `device-state/legacy.ts` to deal with
migration from legacy target states (single container and v2) for all
apps and for apps.json respectively
2022-03-22 19:08:03 -03:00
Felipe Lalanne
7425d1110b Add support for GET v3 target state
This change updates types and database format in order to allow
receiving the new format of the target state from the cloud and allow
applications to keep working.

This change also updates metadata in the containers, meaning services
will need to be restarted on supervisor update

Change-type: major
2022-03-22 19:08:02 -03:00
Felipe Lalanne
ccae1f7cb8 Rename application manager getStatus as getLegacyState
With the move to v3 target state and the move forward to remove
database ids from the supervisor, we want to ensure the ids are only
used for legacy support (such as within the API). This change renames
the method and sets it as deprecated
2022-03-22 19:08:02 -03:00
Felipe Lalanne
21c1c006f7 Always add status to image download report
It seems that in some cases the supervisor can report
an image without a `status` field leading to a cloud side 401 response.
See #1905 for more details.

Change-type: patch
2022-03-21 14:39:29 -03:00
Felipe Lalanne
e217ff9027 Only count report connectivity errors for healthcheck
Change-type: patch
2022-03-16 17:34:07 +00:00
20k-ultra
2fdb83839c Move report throttle out of reporting logic
Change-type: patch
Signed-off-by: 20k-ultra <3946250+20k-ultra@users.noreply.github.com>
2022-03-15 22:53:34 -04:00
20k-ultra
b069d6b9d5 Apply target state if loaded from file (apps.json)
Closes: #1895
Change-type: patch
See: https://www.flowdock.com/app/rulemotion/r-supervisor/threads/tSN9BgLxkgJKapbQHQJr-R9yLPM
Signed-off-by: 20k-ultra <3946250+20k-ultra@users.noreply.github.com>
2022-03-14 18:49:29 -04:00
Felipe Lalanne
d1956b69cc Fix check for supervisor0 network
The check for the docker network supervisor0 assumed that if the
interface supervisor0 existed, then the network would exist too. However this is not
true in the case of docker directory corruption, which would lead to a
loop with `Error: (HTTP code 404) no such network - network supervisor0 not found`.

Change-type: patch
Closes: #1806
2022-02-25 19:46:59 -03:00
Felipe Lalanne
1b54ce8bfd Ignore selinux security opts when comparing services
The moby engine v20.x.y adds some selinux [security configurations](https://docs.docker.com/engine/reference/run/#security-configuration)
depending on the [container configuration](https://github.com/moby/moby/blob/master/daemon/create.go#L214).
This would cause the supervisor to enter a service restart loop as the
current and target service configurations will never match. The
supervisor now ignores selinux specific security options since those are
not supported by balenaOS.

Closes: #1890
Change-type: patch
2022-02-23 18:12:27 -03:00
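The comparison change in 1b54ce8bfd can be sketched as stripping SELinux entries before comparing. This assumes the SELinux options are the ones prefixed with `label=` (as in Docker's `--security-opt label=...`); `stripSelinuxOpts` and `sameSecurityOpts` are hypothetical names.

```typescript
// Assumption: SELinux-related security options are those starting with `label=`.
function stripSelinuxOpts(securityOpt: string[]): string[] {
  return securityOpt.filter((opt) => !opt.startsWith('label='));
}

function sameSecurityOpts(current: string[], target: string[]): boolean {
  // `filter` returns a copy, so sorting does not mutate the inputs.
  const a = stripSelinuxOpts(current).sort();
  const b = stripSelinuxOpts(target).sort();
  return a.length === b.length && a.every((opt, i) => opt === b[i]);
}
```

With this comparison, an option added by the engine on the current side no longer makes the service look different from its target.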
Felipe Lalanne
e7ec42fadc Use a breadcrumb to mark that a reboot is required
As changes to config.json may restart the supervisor before it can
trigger the reboot (or something can kill the supervisor before it can run that step),
the supervisor needs a persistent signal that a reboot is required
(instead of the current transient signal).

With this commit, the supervisor will now create a breadcrumb in the
host `/tmp` folder, that will be checked as the last step of the
configuration changes.
2022-02-15 12:52:48 -03:00
Felipe Lalanne
a2d6db1e1d Update signature of fsUtils.getPathOnHost
The function now returns either a string array if it receives multiple
arguments or a single string if it receives a single argument.
2022-02-15 12:52:46 -03:00
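The signature described in a2d6db1e1d maps naturally onto TypeScript overloads. The sketch below is a hypothetical re-implementation named `pathOnHostSketch`; the `/mnt/root` prefix is an assumption about where the host root is mounted.

```typescript
// Assumed mount point of the host root filesystem inside the container.
const HOST_ROOT = '/mnt/root';

// One argument returns a string, several arguments return a string array.
function pathOnHostSketch(path: string): string;
function pathOnHostSketch(...paths: string[]): string[];
function pathOnHostSketch(...paths: string[]): string | string[] {
  const resolved = paths.map((p) => `${HOST_ROOT}/${p.replace(/^\/+/, '')}`);
  return paths.length === 1 ? resolved[0] : resolved;
}

const single = pathOnHostSketch('/etc/hostname');
const several = pathOnHostSketch('/etc/hostname', 'tmp/breadcrumb');
```

The overloads let callers use the single-path result as a plain string without narrowing.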
Felipe Lalanne
2917f03452 Perform config.json sequentially to other config changes
As config.json changes may restart the engine (and hence the supervisor)
in newer OS versions, this ensures that the supervisor does not get
interrupted while writing to backends.
2022-02-15 12:49:03 -03:00
Felipe Lalanne
63cb985c53 Split device-config step calculation into separate functions 2022-02-15 12:49:03 -03:00
Felipe Lalanne
118875e12e Fix apiUpdatePollInterval default to line up with API 2022-02-15 12:49:03 -03:00
Felipe Lalanne
a4d91d381a Create touch and getBootTime utility functions
Change-type: patch
2022-02-15 12:49:03 -03:00
Christina Wang
5f1a77da25 Add update lock check to PATCH /v1/device/host-config
This is necessary with the changes as of balenaOS 2.82.6, which watches config.json
and will restart balena-hostname and some other services automatically on file change.

Change-type: patch
Relates-to: #1876
Signed-off-by: Christina Wang <christina@balena.io>
2022-02-14 22:22:00 +00:00
Christina Wang
4f446103f4 Remove lockingIfNecessary in favor of updateLock.lock
The functionality is pretty much the same, so we don't need the two
functions in two different places.

Signed-off-by: Christina Wang <christina@balena.io>
2022-02-14 22:06:18 +00:00
Felipe Lalanne
72f6cbe4c7 Add support for local ipv6 reporting
With more and more devices in ipv6 only networks, this ensures the
local addresses are reported to the cloud as part of the state patch.

Change-type: patch
2022-02-08 19:06:13 -03:00
Felipe Lalanne
d071cd1507 Use writeAndSync when writing to config.json
`/mnt/boot` is a vfat partition which does not support atomic file
rename. The best course of action is to write and sync as fast as
possible to prevent corruption (although it still may happen)

Change-type: patch
2022-02-01 18:56:18 -03:00
Felipe Lalanne
a0ed00d8f3 Perform safeRename on writeFileAtomic
This forces a sync of the file as soon as the rename happens to prevent
corruption.
2022-02-01 18:56:18 -03:00
Felipe Lalanne
fa0e28de6d Clean up image event reporting 2022-02-01 18:35:50 -03:00
Pagan Gazzard
ae501048f5 Ensure the finish event is always reported when fetching images
Change-type: patch
2022-01-18 11:45:13 +00:00
Felipe Lalanne
f471ad736c Throw if target states gets a 304 without an ETAG
The API uses 304 as a mechanism for load management on target state
requests. This may cause the supervisor to receive a 304 response
without having received a copy of the target state first, leading to
issues. This change checks for an etag when receiving a 304, throwing an
exception otherwise.

Change-type: patch
2022-01-26 11:27:15 -03:00
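The guard described in f471ad736c could look roughly like this. It is a sketch under assumptions: the `CachedTarget` and `TargetResponse` shapes and `applyTargetResponse` are hypothetical, and the check is modeled as "a 304 is only acceptable if a previously fetched copy with an etag exists".

```typescript
// Hypothetical shapes for a cached target state and an API response.
interface CachedTarget {
  etag: string;
  body: string;
}

interface TargetResponse {
  status: number;
  etag?: string;
  body?: string;
}

function applyTargetResponse(res: TargetResponse, cached?: CachedTarget): CachedTarget {
  if (res.status === 304) {
    // A 304 is only meaningful if a copy was fetched before.
    if (cached === undefined) {
      throw new Error('Received 304 but no target state has been fetched yet');
    }
    return cached;
  }
  if (res.status === 200 && res.etag !== undefined && res.body !== undefined) {
    return { etag: res.etag, body: res.body };
  }
  throw new Error(`Unexpected target state response: ${res.status}`);
}
```

Throwing in the first case lets the caller retry instead of proceeding with no target state at all.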
Felipe Lalanne
d06b8e053e Use dmidecode to read cpuid in non ARM devices
Cpu id is set to null so far for non ARM devices (e.g. Intel NUC). This
parses the output of dmidecode to get the cpu id and system model.

Change-type: patch
2022-01-13 22:49:42 +00:00
Felipe Lalanne
c7fc7aacf8 Use dmidecode to read cpuid in non ARM devices
Cpu id is set to null so far for non ARM devices (e.g. Intel NUC). This
parses the output of dmidecode to get the cpu id and system model.

Change-type: patch
2022-01-06 21:01:53 +00:00
Pagan Gazzard
157fd95196 Increase delta request timeout to 59s to better align with our backends
Change-type: patch
2022-01-18 10:02:13 +00:00
Pagan Gazzard
fd1f646073 Fix memoization of registry token request
Change-type: patch
2022-01-17 16:52:43 +00:00
Felipe Lalanne
9c6e5ee11f Remove apps.json after initial preload
This avoids the supervisor trying to get back to the preloaded target
state if the database is deleted for any reason. It does this by moving the
used apps.json to a backup location.

Change-type: patch
Depends-on: #1841
2021-12-13 20:11:42 +00:00
Felipe Lalanne
08147e6a86 Ensure happy-eyeballs uses supervisor dns lookup
Happy-eyeballs performs [dns lookups](https://github.com/balena-io-modules/happy-eyeballs/blob/master/src/happy-eyeballs.ts#L23)
for the requested addresses, however, because of the order of imports it
was not using the supervisor custom `dns.lookup` that handles `.local`
name resolution, making address resolution fail in those cases.

Moving the import after the `dns.lookup` patch fixes the problem.
2021-12-16 11:59:59 -03:00
Felipe Lalanne
39c667803d Fix .local dns resolution when returning multiple addresses
The supervisor performs its own local resolution for `.local`
addresses due to a limitation in [musl](https://wiki.musl-libc.org/future-ideas.html).
The resolution function was not following exactly the nodejs [dns.lookup
specification](https://nodejs.org/api/dns.html#dnslookuphostname-options-callback)
which could cause certain clients to fail (in this case happy-eyeballs). This
updates the function to follow the specification.

Change-type: patch
2021-12-16 11:59:54 -03:00
Felipe Lalanne
9015b0e22f Skip initial apply until a target has been set
The supervisor always applies target state on start to ensure that the
device is in the correct state in case of a crash or another reason. This had
the side effect that if the database is deleted, the supervisor would
apply target state (which is empty), stopping services and possibly
causing volume data loss.

This prevents that behavior and ensures that the supervisor only
applies target state if a target has been set either by the cloud, preload or local
mode.

Change-type: patch
2021-12-13 09:31:00 -03:00
Pagan Gazzard
32e3399f7c Fix the "already delayed by" calculation
Change-type: patch
2021-12-10 15:54:30 +00:00
Pagan Gazzard
6554ff5a64 Add exponential backoff on errors for logs reporting
Change-type: patch
2021-12-09 18:30:04 +00:00
Felipe Lalanne
f6b2ec9677 Improve validation messages for env vars and labels
Change-type: patch
2021-12-02 17:19:50 -03:00
Felipe Lalanne
445aefaa29 Ensure target state errors are sent to the log backend
Closes: #1838
2021-12-02 15:29:37 -03:00
Felipe Lalanne
f6692ab918 Convert target state types to io-ts for better validation
This simplifies target state validation and improves validation
messages.

Change-type: patch
2021-12-02 15:29:37 -03:00
Felipe Lalanne
ca7c22d854 Move lib/types.ts to src/types/basic.ts 2021-12-02 15:29:37 -03:00
Zane Hitchcox
9ed2685f63 Add happy eyeballs
Change-type: patch
2021-11-30 12:43:18 -05:00
Pagan Gazzard
2eb00fa0da Increase request timeout to 59s to better align with our backends
Change-type: patch
2021-11-29 17:14:51 +00:00
Felipe Lalanne
6fd516a930 Fix broken local mode after PR #1824
PR #1824 changed app update behavior to test that all images are there
before moving between releases. This check always fails in local mode
since local mode images are handled differently.

This PR fixes local mode again by skipping the check when `localMode` is
set.

Change-type: patch
2021-11-17 17:54:25 -03:00
Alexandru Costache
3b9c68246e backends/extra-uEnv: Extend custom DTB support for Nano 2GB Devkit
Change-type: patch
Signed-off-by: Alexandru Costache <alexandru@balena.io>
2021-11-17 13:48:19 +01:00
Felipe Lalanne
394377e0a1 Fix delete-then-download strategy
The strategy has been broken for a while but it was not clear how to
fix it before the changes to image management. This PR fixes application
manager to remove images before downloading the new image. This will
only have an effect on changing images.

Closes: #1233
Change-type: patch
2021-11-16 16:40:15 -03:00
Felipe Lalanne
7aedc97ee1 Wait for images to be ready before moving between releases
For download-then-kill strategy, this waits for all changing images on the target
release to be available on device before killing the old services. This
prevents multicontainer applications from getting to a state where some
services of the new release start running long before others have been
downloaded.

When adding new services to a multicontainer app, the supervisor will
now wait for other changing services to be downloaded before starting
the new service.

Closes: #1812
Change-type: patch
2021-11-11 14:08:36 -03:00
Felipe Lalanne
969f4225e5 Check config for networks and volumes inside Service
This removes the need for the app module to know about the naming
conventions for networks and volumes since those exist now within the
service itself. This also fixes a small bug where the volume would be
removed before the service itself had been successfully stopped.

Change-type: patch
2021-10-28 10:20:53 -03:00
Alexandru Costache
7d678fa838 backends/extra-uEnv: Extend custom DTB support for Jetson TX2 NX
We just added support for the TX2 NX, which supports u-boot
and thus allows for using custom device-trees. Let's allow
the Jetson TX2 NX and future TX2 NX derived
device types to have device-trees configurable from the dashboard.

Change-type: patch
Signed-off-by: Alexandru Costache <alexandru@balena.io>
2021-08-24 07:24:48 +00:00
Felipe Lalanne
aab000209b Add backoff to state reporting when 503 is received
Current state reporting had a backoff when network or inconsistency
errors were found, but not on API errors. This change adds a backoff
using the `Retry-After` header, if present, to reduce load on the API.

Change-type: patch
2021-09-28 14:53:26 -04:00
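The backoff described in aab000209b can be sketched roughly as follows. This is a minimal illustration, not the supervisor's code: the function name `nextReportDelayMs`, the 1 s base delay, and the 60 s cap are all assumptions, and only the delta-seconds form of `Retry-After` is handled (an HTTP-date value falls through to the exponential fallback).

```typescript
// Hypothetical helper: choose how long to wait before the next state report.
// `retryAfter` is the raw Retry-After header value, if the API sent one.
// The default base delay and cap are illustrative assumptions.
function nextReportDelayMs(
  retryAfter: string | undefined,
  attempt: number,
  baseMs = 1000,
  maxMs = 60000,
): number {
  // Honour the server's hint when it is a non-negative number of seconds.
  if (retryAfter != null && retryAfter.trim() !== '') {
    const seconds = Number(retryAfter);
    if (isFinite(seconds) && seconds >= 0) {
      return seconds * 1000;
    }
  }
  // Otherwise fall back to capped exponential backoff.
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}
```

Preferring the header over a locally computed delay lets the API control how quickly devices come back after a 503.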
Felipe Lalanne
802f26fe71 Improve network interface filter
The supervisor filters out some network interfaces for mac address
reporting, to remove (balena*,lo,tun*,etc). The previous filter
matched any interface whose name contained one of the defined filters, making
it stricter than necessary. This commit fixes the issue.

Change-type: patch
2021-09-24 13:01:17 -03:00
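To illustrate the problem described in 802f26fe71, the sketch below contrasts a substring match with an anchored match. The pattern list is taken from the commit message and may be incomplete; the function names and the anchored-regex approach are assumptions, since the log does not show how the filter was actually rewritten.

```typescript
// Patterns from the commit message; the real list may differ.
const patterns = ['balena*', 'lo', 'tun*'];

// Behaviour as described for the old filter: any name that merely
// contains the pattern stem is filtered out.
function containsMatch(name: string): boolean {
  return patterns.some((p) => name.indexOf(p.replace('*', '')) !== -1);
}

// Hypothetical anchored alternative: `*` matches any suffix, and
// everything else must match the whole interface name.
function anchoredMatch(name: string): boolean {
  return patterns.some((p) => {
    const escaped = p
      .replace(/[.+?^${}()|[\]\\]/g, '\\$&')
      .replace(/\*/g, '.*');
    return new RegExp('^' + escaped + '$').test(name);
  });
}
```

With the substring version, an interface such as `wlo1` is dropped because its name contains `lo`; the anchored version keeps it while still filtering `lo`, `balena0`, and `tun0`.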
Alex Gonzalez
9e0cbe04c6 api-keys: Remove os variant parameter for authentication check
The current code authenticates unmanaged production devices which makes
no sense. Unmanaged devices do not need to authenticate with the API.

Change-type: patch
Signed-off-by: Alex Gonzalez <alexg@balena.io>
2021-08-05 09:30:35 +00:00
Alex Gonzalez
1abd10a129 os-release: Use developmentMode to ascertain OS variant in new releases
Newer BalenaOS releases have replaced OS variants with a developmentMode
configuration setting. This commit uses this variable to set the OS
variant in the absence of `VARIANT_ID` from the os-release file.

Change-type: patch
Signed-off-by: Alex Gonzalez <alexg@balena.io>
2021-08-05 09:30:35 +00:00
Alex Gonzalez
4ad7a3ae91 config: Add developmentMode to schema
Add a `developmentMode` configuration variable to the schema. Do not expose
this on the device target state until local key-based authentication is
sorted.

Relates-to: https://jel.ly.fish/e9525e9e-aa74-478c-b931-52951c679f78
Change-type: patch
Signed-off-by: Alex Gonzalez <alexg@balena.io>
2021-08-05 09:30:35 +00:00
Kyle Harding
669866b4c2 Skip restarting services if they are part of conf targets
Some recent changes to the OS allowed some services to restart
automatically when the associated config files are changed.

In these cases we want to avoid restarting the same services
manually from the supervisor.

Change-type: patch
Signed-off-by: Kyle Harding <kyle@balena.io>
2021-08-24 14:03:55 -04:00
peakyDicers
30c728fae2 Removed fire emoji prefix for firewall logs.
Change-type: patch
2021-08-02 17:24:03 -04:00
Felipe Lalanne
6f5f3bc2f3 Fix regression with local mode push
PR #1749 introduced a bug when pushing local target state. An update to
the [image name normalization](f1bd4b8d9b/src/lib/docker-utils.ts (L81))
failed to consider the local image name format. This results in mangling
of image names in the database, i.e. the image `ubuntu:latest` is stored
as `/ubuntu:latest`. This causes an exception to be returned by the
dockerode `getImage('/ubuntu:latest').inspect()` call.

This sends the supervisor into a crash loop and is shown on the supervisor
journal logs as

```
getaddrinfo ENOTFOUND images
    at GetAddrInfoReqWrap.onlookup [as oncomplete] (dns.js:64:26)
```

Unfortunately if this happens on a user device, since the mangled image
name is already in the database, the easiest way to fix it is to remove the
supervisor database and let the supervisor recreate it. Deleting the
database should be side effect free.

Change-type: patch
2021-08-02 11:52:07 -04:00
Felipe Lalanne
104a8006fb Update apiSecret table to id services by name
It adds a migration replacing the serviceId column by serviceName and
populates serviceNames from services in the target state.
2021-07-28 09:57:38 -04:00
Felipe Lalanne
b67f94802d Remove comparison based on image, release, and service ids
Preparing for the new v3 target state, where the supervisor will make environment
dependent ids optional and rely on using general UUIDs and user known identifiers
for comparison. This PR moves forward in that direction by removing some of those
comparisons for v2 target state.

- imageId to be replaced with imageName
- serviceId to be replaced with serviceName
- releaseId to be replaced by commit (future release_uuid)

This is a backwards compatible change, meaning it doesn't completely get rid of
these identifiers (which are still being used by supervisor API and for state
patch), but will not depend on those identifiers for calculating steps to target state.

Change-type: minor
2021-07-28 09:57:38 -04:00
Felipe Lalanne
77070712a4 Remove image manager appUpdatePollInterval listener 2021-07-28 09:57:36 -04:00
Felipe Lalanne
a1d098d8f3 Refactor image "volatile state" to use state pattern
This replaces stored `volatileState` with a more declarative ImageTask API.
An ImageTask stores volatile image state for operations that cannot be
obtained through an engine query, such as fetching and removing an
image, state that can be updated while the task is running.

Image controller methods can now use the `reportEvent` method to create
and update the state of a longer running task.
2021-07-28 09:56:38 -04:00
Felipe Lalanne
f1bd4b8d9b Use tags to track supervised images in docker
The image manager module now uses tags instead of docker IDs as the main
way to identify docker images on the engine. That is, if the target
state image has a name `imageName:tag@digest`, the supervisor will always use
the given `imageName` and `tag` (which may be empty) to tag the image on
the engine after fetching. This PR also adds checkups to ensure
consistency is maintained between the database and the engine.

Using tags simplifies query and removal operations, since
removing the image now means removing tags matching the image name.

Before this change the supervisor relied only on information in the
supervisor database, and used that to remove images by docker ID. However, the docker
id is not a reliable identifier, since images retain the same id between
releases or between services in the same release.

List of squashed commits
- Remove custom type NormalizedImageInfo
- Remove dependency on docker-toolbelt
- Use tags to track supervised images in docker
- Ensure tag removal occurs in sequence
- Only save database image after download confirmed

Relates-to: #1616 #1579
Change-type: patch
2021-07-26 09:52:25 -04:00
Felipe Lalanne
c05c5803f0 Log the delta URL that will be downloaded on update
Change-type: patch
Closes: #1755
2021-07-22 11:05:00 -04:00
Christina Wang
17e740a4ba Allow users to override HUP lock if device is stuck in invalid state
This functionality is needed when breadcrumbs aren't deleted after a HUP
rollback for whatever reason. Also rename HUP lock function.

Change-type: patch
Connects-to: #1459
Signed-off-by: Christina Wang <christina@balena.io>
2021-07-08 12:43:32 +09:00
Felipe Lalanne
e04e64763f Improve testing for supervisor composition modules
This PR cleans up testing for supervisor compose modules. It also fixes broken
tests for application manager and removes a lot of dependencies for those tests
on DB and other unnecessary mocks. There are probably a lot of cases that tests
are missing but this should make writing new tests a lot easier.

This PR also creates a new mock dockerode (mockerode) module that should make it
easier to test operations that interact with the engine. All references
to the old mock-dockerode have not yet been removed but that should come
soon in another PR

List of squashed commits:
- Add tests for network create/remove
- Move compose service tests to test/src/compose and reorganize test descriptions
- Add support for image creation to mockerode
- Add additional tests for compose volumes
- Update mockerode so unimplemented fake methods throw. This is to ensure
  tests using mockerode fail if an unimplemented method is used
- Update tests for volume-manager with mockerode
- Update tests for compose/images
- Simplify tests using mockerode
- Clean up compose/app tests
- Create application manager tests

Change-type: minor
2021-07-05 17:50:52 -04:00
Christina Wang
a9028e58ec Prevent updates/reboots with locks when HUP breadcrumbs present
On HUP, some healthcheck services need to complete before
it's safe for the Supervisor to reboot the device when
applying state changes. rollback-{health|altboot}-breadcrumb
are the two files that Supervisor looks for and locks the device
on when present in this patch.

Not closing issue 1459 because there is a possible case where,
on altboot rollback, the breadcrumbs are not present. 1459
may be closed when this edge case is investigated.

Change-type: patch
Connects-to: #1459
See: https://www.flowdock.com/app/rulemotion/r-supervisor/threads/cL7YfNOLSfTPfw05h59GEW0kfOt
Signed-off-by: Christina Wang <christina@balena.io>
2021-06-30 13:27:03 +09:00
Felipe Lalanne
2fa0d3dc43 Fix supervisor using wrong source for deltas
This fixes a specific issue when the supervisor cannot find the right
source for deltas (e.g. after the DB gets deleted), where legacy
behavior was to look for any image in the app.

Change-type: patch
Relates-to: #1729
2021-06-25 16:24:51 -04:00
Florin Sarbu
7c26480ada Add revpi-connect, revpi-core-3 to Raspberry Pi variants
We need the supervisor to be able to manage config.txt changes for these
Revolution Pi boards too.

Change-type: patch
Signed-off-by: Florin Sarbu <florin@balena.io>
2021-06-18 20:33:27 +09:00
Pagan Gazzard
ee4d919fca Improve target state typings
Change-type: patch
2021-06-08 13:45:44 +01:00
Miguel Casqueira
ab4fb454e0 Refactor debug log when unmanaged volume is found
Change-type: patch
Signed-off-by: Miguel Casqueira <miguel@balena.io>
2021-06-02 13:07:24 -04:00
Miguel Casqueira
55a344dceb Prevent a recursive loop when reporting current state
Closes: #1673
Change-type: patch
Signed-off-by: Miguel Casqueira <miguel@balena.io>
2021-05-28 16:20:27 -04:00
Christina Wang
dcd863eed8 Add toggleable SUPERVISOR_HARDWARE_METRICS config
On devices with bandwidth sensitivity, this config var
disables sending system information such as memory
usage or cpu temp as current state.

Closes: #1645
Change-type: minor
Signed-off-by: Christina Wang <christina@balena.io>
2021-05-13 13:59:07 +09:00
Christina Wang
ea3e50e96e Create & unify src/device-state/current-state tests
Signed-off-by: Christina Wang <christina@balena.io>
2021-05-12 18:33:01 +09:00
Christina Wang
39601473c0 Fix undervoltage regex, add undervoltage tests, move sysinfo suite to test/src
Signed-off-by: Christina Wang <christina@balena.io>
2021-05-12 18:33:01 +09:00
Pagan Gazzard
74ae31fcfd Simplify/optimize filtering non-significant sys info changes
Change-type: patch
2021-05-06 10:59:49 +00:00
Pagan Gazzard
466ff58871 Avoid double omits whilst filtering current state
Change-type: patch
2021-05-06 10:59:23 +00:00
Kyle Harding
164dd7ccc1 Rename meta-resin to meta-balena
Signed-off-by: Kyle Harding <kyle@balena.io>
2021-05-06 17:05:26 +00:00
Kyle Harding
301aa52f03 Backwards compatibility changes for old resin namespaces
Change-type: patch
Signed-off-by: Kyle Harding <kyle@balena.io>
2021-05-06 17:05:26 +00:00
Kyle Harding
09615c9d82 Change container name to balena_supervisor
Change-type: minor
Signed-off-by: Kyle Harding <kyle@balena.io>
2021-05-06 17:05:25 +00:00
Kyle Harding
5faf9d7686 Rename resin-supervisor to balena-supervisor
Change-type: minor
Signed-off-by: Kyle Harding <kyle@balena.io>
2021-05-06 17:05:25 +00:00
Felipe Lalanne
5197a1330d Show warning instead of exception for invalid network config
A previous PR (#1656) fixed validation for network ipam config,
checking that both network and subnet are defined for each ipam config entry
(as described in the docker documentation).

After that PR, the validation throws an exception if the network target state is incorrect,
but this turns out to be the wrong approach, because that exception is also triggered
when querying target state.

This isn't a problem in normal operation, but it is in local mode, because local
mode queries the old target state before sending a new one. Since the query fails,
the CLI can never push the new target state.

This PR replaces the exception with a warning on the logs, since a
misconfigured network won't cause any engine failures, it will just
prevent containers from communicating through the provided network.

A future improvement should move this validation to an earlier point in the process,
so the target state can get rejected before it even gets to a point it
can be used.

Relates-to: #1693
Change-type: patch
2021-05-06 16:27:40 -04:00
Miguel Casqueira
8b0c2347d8 Patch awaiting response when checking if supervisor0 network exists
Change-type: patch
Signed-off-by: Miguel Casqueira <miguel@balena.io>
2021-05-06 14:41:32 +00:00
quentinGllmt
1408fd7bcb Fix parsing driver_opts from compose to docker network creation
Change-type: patch
Signed-off-by: quentinGllmt <quentin@quentingllmt.fr>
2021-05-06 16:50:11 +02:00
Pagan Gazzard
9e52bb33ac Update balena-register-device and send extra info at provision time
This extra info will mean the API is able to immediately set default
config vars based on the os/supervisor version so that they are
available on the first target state fetch rather than having a delay
whilst waiting for the supervisor to report them as part of a state
patch

Update balena-register-device from 6.1.6 to 7.2.0

Change-type: patch
2021-04-29 13:44:30 +00:00
Felipe Lalanne
2203f78d51 Log error responses from API when reporting state
This adds the error message from the API to journal logs to better
identify those cases where patching to the API fails.

Change-type: patch
Relates-to: #1680
2021-05-04 17:57:55 +00:00
Christina Wang
4a2ac557ef Remove mz, mkdirp, body-parser dependencies
'mz' can be safely replaced with fs.promises
and util.promisify for faster native methods.
'mkdirp' after Node v8 uses native fs.mkdir, thus
is redundant. 'body-parser' is deprecated and
contained within express v4.x.

Closes: #1567
Change-type: patch
Signed-off-by: Christina Wang <christina@balena.io>
2021-04-28 07:20:15 +09:00
Felipe Lalanne
95fb568aae Bump dockerode types to 2.5.34
This commit updates dockerode types to the latest 2.x version, removing the need
for custom composer types for network.

This commit also modifies network tests to use the new types

Change-type: minor
2021-04-27 13:00:56 -04:00
Felipe Lalanne
fd06c06092 Update supervisor to typescript 4
Change-type: patch
2021-04-19 15:18:21 +00:00
Miguel Casqueira
e6eda0fca7 Refactor extra_uEnv to not match with intel nuc
Change-type: patch
Signed-off-by: Miguel Casqueira <miguel@balena.io>
2021-04-14 01:20:26 -04:00
Vipul Gupta (@vipulgupta2048)
d058f43feb patch: Fix substring end parameter for accurate CPU ID
Signed-off-by: Vipul Gupta (@vipulgupta2048) <vipul@balena.io>
2021-04-13 03:18:56 +05:30
Felipe Lalanne
fdb37191e7 Fix broken IPAM network validation
Network validation was failing to identify a bad IPAM network
configuration leading to supervisor failures (see #1618)

Change-type: patch
Closes: #1618
2021-04-09 17:49:09 -04:00
Miguel Casqueira
204475d3dc Improved mutable (/data) file system detection
Change-type: patch
Closes: #1609
Signed-off-by: Miguel Casqueira <miguel@balena.io>
2021-04-08 17:34:27 -04:00
Christina Wang
31effed426 Prevent unintended image removal when calling purge endpoints to remove volumes
Using safeStateClone within doPurge to applyIntermediateTarget after
successful volume purge has led to various type deficiencies being revealed
in common.js. Add several inline types in common.js to satisfy
the type checker (credit: Page <page@balena.io>). Delete common.d.ts
since it's not required and might mistakenly mask true I/O types of
functions in common.js.

Closes: #1611
Change-type: patch
Signed-off-by: Christina Wang <christina@balena.io>
2021-04-05 12:10:09 +00:00
Miguel Casqueira
ecbe9ee9f9 Patch list volumes to always return an array
Change-type: patch
Closes: #1636
Signed-off-by: Miguel Casqueira <miguel@balena.io>
2021-04-01 20:31:09 -04:00
Pagan Gazzard
2ae22b4fbd Enable strict options by default and only disable specific ones
Change-type: patch
2021-03-22 13:29:53 +00:00
Matthew McGinn
f9a157c9ec typos: seperate -> separate
mainly to get the docs one, but figured i could hit them all

Change-type: patch
Signed-off-by: Matthew McGinn <matthew@balena.io>
2021-03-17 14:27:53 -04:00
Miguel Casqueira
183ea88a2a Infer legacy Volumes that do not have the supervised label
Change-type: patch
Closes: #1604
Signed-off-by: Miguel Casqueira <miguel@balena.io>
2021-03-15 19:46:53 -04:00
Felipe Lalanne
8f9254b6b1 Add nebra-hnt to raspberry pi variants
Change-type: patch
Signed-off-by: Felipe Lalanne <felipe@balena.io>
2021-03-12 12:42:28 -03:00
Miguel Casqueira
898b72c7f7 Refactor journalctl monitor to only spawn new process on exit
Change-type: patch
Closes: #1591
Signed-off-by: Miguel Casqueira <miguel@balena.io>
2021-02-24 12:01:19 -05:00
Miguel Casqueira
ec23d1d371 Refactor checkTruthy to return more predictable values
Change-type: patch
Closes: #1595
Signed-off-by: Miguel Casqueira <miguel@balena.io>
2021-02-18 12:36:42 -05:00
Christina Wang
b3b1d47b34 Complete /v1/device/host-config unit tests, modify PATCH route
Change-type: minor
Signed-off-by: Christina Wang <christina@balena.io>
2021-02-18 12:25:44 +09:00
Miguel Casqueira
c602014617 Patch killServicesUsingApi to not get stuck in noop loop
Change-type: patch
Closes: #1594
Signed-off-by: Miguel Casqueira <miguel@balena.io>
2021-02-16 18:33:50 -05:00
Robert Günzler
f009d3a3e9 Fix gpu label support
The device request object was created with untouched fields left unset. When
comparing state to determine if a transition is required this would
result in a mismatch between:

    {
      Driver: '',
      Count: 1,
      DeviceIDs: null,
      Capabilities: [Array],
      Options: null
    }

and

    {
      Count: 1,
      Capabilities: [Array],
    }

Which in turn resulted in the target service being continuously restarted.
The fix is to instantiate the object in full.

Connects-to: https://github.com/balena-io/balena-supervisor/issues/1449
Connects-to: ae646a07ec
Change-type: patch
Signed-off-by: Robert Günzler <robertg@balena.io>
2021-02-09 11:27:03 +01:00
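The mismatch described in f009d3a3e9 can be reproduced with a small sketch. The field names come from the commit message; the `DeviceRequest` interface, the helper `fullDeviceRequest`, the `Count: 0` default, and the JSON-based comparison are assumptions made for illustration and are not the supervisor's actual comparison logic.

```typescript
// Shape of a device request as printed in the commit message.
interface DeviceRequest {
  Driver: string;
  Count: number;
  DeviceIDs: string[] | null;
  Capabilities: string[][];
  Options: Record<string, string> | null;
}

// Hypothetical helper: instantiate the object in full so that fields
// left untouched carry the same empty values the engine reports.
function fullDeviceRequest(partial: Partial<DeviceRequest>): DeviceRequest {
  return {
    Driver: '',
    Count: 0, // assumed default, always overridden in this example
    DeviceIDs: null,
    Capabilities: [],
    Options: null,
    ...partial,
  };
}

// Naive structural comparison, sufficient here because both objects
// list their keys in the same order.
const sameShape = (a: object, b: object): boolean =>
  JSON.stringify(a) === JSON.stringify(b);

const fromEngine: DeviceRequest = {
  Driver: '',
  Count: 1,
  DeviceIDs: null,
  Capabilities: [['gpu']],
  Options: null,
};
const sparseTarget = { Count: 1, Capabilities: [['gpu']] };
```

Comparing `fromEngine` with `sparseTarget` reports a difference on every pass, which is what would keep triggering a restart; comparing it with `fullDeviceRequest(sparseTarget)` does not.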
Miguel Casqueira
277d984af2 Prevent inserting null commit during DB migration
Change-type: patch
Closes: #1581
Signed-off-by: Miguel Casqueira <miguel@balena.io>
2021-02-03 10:44:11 -05:00
Miguel Casqueira
ba1c857c4f Cancel pending apply target after /v1/update request
Closes: #1530
Change-type: patch
Signed-off-by: Miguel Casqueira <miguel@balena.io>
2021-01-20 22:21:14 -05:00
Felipe Lalanne
4aa8090a56 Add support for BALENA_HOST_SPLASH_IMAGE config
Setting this variable to a base64 encoded string will replace the splash
image on the device by rewriting `/mnt/boot/splash/balena-logo.png`.
This will also make a copy of the default balena logo so the splash can
be restored if the variable is removed.

Change-type: minor
Signed-off-by: Felipe Lalanne <felipe@balena.io>
2021-01-06 15:11:31 -03:00
Felipe Lalanne
e66a775c15 Move required configuration check to Backend
The `ensureRequiredOverlay` function is currently run for any backend.
At this moment this causes no issue, since most configuration backends
are defined per single device type. However, with the option to modify splash
images, which is available for all device types, the function would add
unwanted configuration vars to the splash image configuration. Moving it
to the config txt backend solves this issue.
2021-01-05 18:30:07 -03:00
Felipe Lalanne
4cdf26f82f Improve supervisor API behavior when locks are set
This PR adds the following

* Supervisor v1 API application actions now return HTTP status code 423 when locks
  are preventing the action from being performed. Previously this resulted in a
  503 error
* Supervisor API v2 service actions now return HTTP status code 423 when locks are
  preventing the action from being performed. Previously, this resulted in an
  exception logged by the supervisor and the API query timing out
* Supervisor API `/v2/applications/:appId/start-service` now does not
  check for a lock. Lock handling in v2 actions is now performed by each
  step executor
* `/v1/apps/:appId/start` now queries the target state and uses that
  information to execute the start step (as v2 does). Previously start
  resulted in `cannot get appId from undefined`
* Extra tests for API methods

Change-type: patch
Connects-to: #1523
Signed-off-by: Felipe Lalanne <felipe@balena.io>
2020-12-14 10:43:41 -03:00
Felipe Lalanne
a8c4a6683a Add config.txt support for Alliance rpi3
Change-type: patch
Signed-off-by: Felipe Lalanne <felipe@balena.io>
2020-12-11 09:46:48 -03:00
Cameron Diver
2c1fb7110e Add config.txt support for Rocktech rpi
Change-type: patch
Signed-off-by: Cameron Diver <cameron@balena.io>
2020-12-10 11:14:35 +00:00
Miguel Casqueira
8b37df492b Patched /v1/restart exception
Change-type: patch
Closes: #1509
Signed-off-by: Miguel Casqueira <miguel@balena.io>
2020-11-30 15:42:49 -05:00
Rich Bayliss
02aeb4fc1c fix: Scoped keys breaking livepush with existing cloud images on the device
Closes: #1512
Change-type: patch
Signed-off-by: Rich Bayliss <rich@balena.io>
2020-11-16 12:55:40 -05:00
Felipe Lalanne
e4e895630f Ensure the first target state request is applied
During first time run of the supervisor, the target state is queried
by `reportInitialEnv`. Since this happens early on the initialization
process, this target state report is missed by any listeners and this
can lead to the initial target state not being applied (see #1455).

This PR ensures that target state is re-emitted if there were no
listeners set up on call to update.

Change-type: patch
Signed-off-by: Felipe Lalanne <felipe@balena.io>
Connects-to: #1455
2020-11-13 10:19:27 -03:00
Rich Bayliss
bc9bdd1094 validation: Ensure commit lookup has a bound value
Change-type: patch
Signed-off-by: Rich Bayliss <rich@balena.io>
2020-11-11 11:01:20 +00:00
Rich Bayliss
591598e102 fix: Scoped keys not working in LocalMode
Some endpoints filter data based on the scope of the API key
used to make the request. When in LocalMode the check was not
being made correctly and all apps were considered out of scope.

Change-type: patch
Signed-off-by: Rich Bayliss <rich@balena.io>
2020-11-11 10:58:58 +00:00
Cameron Diver
f08316dc57 Allow storing commits against their appIds
This paves the way for running multiple applications and storing
information related to the application against the application itself. A
couple of hacks have been added to v1 and v2 endpoints to maintain
compatibility but these should eventually be removed with the addition
of a v3 api.

Change-type: minor
Signed-off-by: Cameron Diver <cameron@balena.io>
2020-11-10 10:50:08 +00:00
Felipe Lalanne
01477e41b8 Mount docker socket under /host/run for services
Currently, when the label `io.balena.features.balena-socket` is set,
the balena engine socket is mounted under `/run/balena-engine.sock`.

This causes a problem when using systemd inside the container, since
this service remounts `/run` and `/run/lock` as tmpfs, causing the
socket to become unavailable.

Making a mount of the socket into `/host/run` solves this issue. This is
the same approach taken with DBUS.

Change-type: patch
Signed-off-by: Felipe Lalanne <felipe@balena.io>
Connects-to: #1494
2020-10-29 15:54:31 -03:00
Cameron Diver
9d19a45701 Use root mount point to find device-type.json
Change-type: patch
Signed-off-by: Cameron Diver <cameron@balena.io>
2020-10-28 13:03:40 +00:00
Thomas Manning
2c83864f22 Change log source from docker to journalctl
Change-type: minor
Signed-off-by: Thomas Manning <thomasm@balena.io>
2020-10-28 16:09:42 +10:00
Felipe Lalanne
f5183df356 Change source of deviceType to device-type.json
The source of truth for the device-type should be
device-type.json instead of config.json

Change-type: patch
Signed-off-by: Felipe Lalanne <felipe@balena.io>
Connects-to: #1472
2020-10-27 09:40:18 -03:00
Miguel Casqueira
77333f1e11 Fixed evaluating if updates are needed to reach target state
Closes: #1476
Change-type: patch
Signed-off-by: Miguel Casqueira <miguel@balena.io>
2020-10-26 14:54:04 -04:00
Miguel Casqueira
edf23871d9 Improved log message when networks do not match
Change-type: patch
Signed-off-by: Miguel Casqueira <miguel@balena.io>
2020-10-19 12:01:50 -04:00
ab77
0fd9b63762 Fixes check allowing preloading in local (unmanaged) mode
* adds apiEndpoint empty string check

Change-type: patch
2020-10-16 15:19:22 -07:00
Felipe Lalanne
4795c336d0 Handle delete of multiple images with same dockerImageId
A docker-compose.yml with the following structure

```
version: '2.1'
services:
  app_1:
    build: ./noisy-1
    image: noisy1
  app_2:
    build: ./noisy-1
    image: noisy1
  app_3:
    build: ./noisy-1
    image: noisy1
```

Will lead to the supervisor creating multiple image database entries
with the same dockerId (this is because of how the engine handles this
particular case). This case is not handled by the removal process
leading to image pile up and increased disk usage.

Change-type: patch
Signed-off-by: Felipe Lalanne <felipe@balena.io>
Connects-to: #1434
2020-10-16 14:06:10 -04:00
Felipe Lalanne
dd5f62227a Improve calculation for used system memory
The memory information reported by the supervisor currently
estimates the value of used memory as `MemTotal - MemFree`.
However, linux systems will try to cache and buffer as much
memory as possible, which will affect the output of `MemFree`
(from /proc/meminfo) and in consequence the memory usage seen
by the user on the dashboard, which will appear much greater than
it is.

The correct calculation should be `MemTotal - MemFree - Buffers - Cached`,
which is the calculation performed by `htop` and the `free` command.

Change-type: patch
Signed-off-by: Felipe Lalanne <felipe@balena.io>
Connects-to: #1471
2020-10-14 13:15:17 -03:00
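The calculation described in dd5f62227a can be sketched as below. The helper names `parseMeminfo` and `usedMemoryKb` and the sample values are hypothetical; the sketch only assumes the `Key:   value kB` layout of `/proc/meminfo` and is not the supervisor's implementation.

```typescript
// Hypothetical helper: parse /proc/meminfo-style text (values in kB)
// into a lookup table keyed by field name.
function parseMeminfo(text: string): Record<string, number> {
  const fields: Record<string, number> = {};
  for (const line of text.split('\n')) {
    const match = /^(\w+):\s+(\d+)/.exec(line);
    if (match) {
      fields[match[1]] = parseInt(match[2], 10);
    }
  }
  return fields;
}

// Used memory as MemTotal - MemFree - Buffers - Cached, so that memory
// held by the kernel for buffers and page cache is not counted as used.
function usedMemoryKb(text: string): number {
  const m = parseMeminfo(text);
  return (
    (m.MemTotal || 0) - (m.MemFree || 0) - (m.Buffers || 0) - (m.Cached || 0)
  );
}

// Made-up sample values, for illustration only.
const sample = [
  'MemTotal:        1000000 kB',
  'MemFree:          200000 kB',
  'Buffers:           50000 kB',
  'Cached:           300000 kB',
].join('\n');
```

For the sample above, `MemTotal - MemFree` alone gives 800000 kB, while subtracting buffers and cache as well gives 450000 kB, which shows how far the old estimate could overstate usage.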
Cameron Diver
a2ceb5c931 Refactor system information filtering
Change-type: patch
Signed-off-by: Cameron Diver <cameron@balena.io>
2020-10-12 15:44:07 +01:00
Cameron Diver
0e3c026392 Attempt a state report once every maxReportFrequency
With the addition of the system information feature (CPU temp, etc.), if
there weren't any changes in the docker or config state of the device,
updates in system information would not be sent to the API. Now we
attempt to send data once every maxReportFrequency (although this does
not mean that we will be sending data that often, we still only send the
delta, if one exists)

Change-type: patch
Closes: #1481
Signed-off-by: Cameron Diver <cameron@balena.io>
2020-10-12 11:53:19 +01:00
Cameron Diver
975129188a Remove superfluous current state reporting code from api-binder
Change-type: patch
Signed-off-by: Cameron Diver <cameron@balena.io>
2020-10-12 11:53:19 +01:00
Thomas Manning
1eeff698ac Add features label io.balena.features.journal-logs
Change-type: patch
Signed-off-by: Thomas Manning <thomasm@balena.io>
2020-10-12 15:37:35 +10:00
Matthew McGinn
8e65466f2d version: drop SUPERVISOR_VERSION env var
In order to make supervisor upgrades more transparent, let's move away
from this env var since it requires a container restart any time the supervisor
is upgraded. We should ultimately move towards providing the supervisor's
set of capabilities, but that can come later

Connects-to: #1447
Change-type: major
Signed-off-by: Matthew McGinn <matthew@balena.io>
2020-09-29 11:22:30 -04:00
Felipe Lalanne
adffde932e Fix supervisor deadlock during migration
Due to the singleton work, when performing migration M00005 and there
are apps with services created in the database, a deadlock occurs
during database initialization due to a circular
dependency for generating scoped keys.

Change-type: patch
Signed-off-by: Felipe Lalanne <felipe@balena.io>
Connects-to: #1468
2020-09-28 23:52:36 -03:00
Miguel Casqueira
90981a00be Correctly evaluate if scheduledApply.delay is not set
Closes: #1428
Change-type: patch
Signed-off-by: Miguel Casqueira <miguel@balena.io>
2020-09-25 13:14:09 -04:00