316 Commits

Author SHA1 Message Date
Felipe Lalanne
b1fc4e1761 Get image name from DB when getting the app current state
The Service class in `compose/service.ts` cannot get the image name
from the image id when building the object from the container metadata.

We query the metadata in the application manager getCurrentApps method
so the current state can be used as target by API methods

Change-type: patch
2023-04-20 14:58:58 -04:00
Felipe Lalanne
27f0d2e655 Improve net alias comparison to prevent unwanted restarts
Network aliases are now compared checking that the target state is a
subset of the current state. This will prevent service restarts due to
additional aliases created by docker in the container.

Closes: #2134
Change-type: patch
2023-04-20 14:58:58 -04:00
Felipe Lalanne
cb98133717 Exclude containerId from service network aliases
When getting the service from the docker container, remove the
containerId from the list of aliases (which gets added by docker). This
will make it easier to use the current service state as a target.

This will help us remove the `safeStateClone` function in the API in a
future commit

Change-type: patch
2023-04-20 14:58:58 -04:00
Felipe Lalanne
f2ca7dbb6a Skip image delete when applying intermediate state
This replaces the previous flag `isApplyingIntermediate` on application
manager and simplifies the interface of the state engine to make temporary changes to the
general app state.

Change-type: patch
2023-04-20 14:58:58 -04:00
Felipe Lalanne
967cb7747f Make local mode image management work as in cloud mode
There were multiple places in the state engine that skipped some
operations while in local mode. In reality, all it's needed while in
local mode is to skip image and volume deletion.

This commit simplifies application-manager and compose app to be more
local mode agnostic and instead making the image deletion and volume
deletion configurable via function arguments.

This also has the benefit to make the treatment of local mode
applications more similar to cloud mode applications, allowing for
API endpoints to function the same way both modes.

Change-type: patch
2023-04-20 14:58:58 -04:00
Felipe Lalanne
76d5be64e5 Remove ignoreImages argument from getRequiredSteps
The argument was unused and hence unnecesary. This is just a bit of
cleanup

Change-type: patch
2023-04-20 14:58:58 -04:00
Christina Ying Wang
4c948c8854 Mount data and state partitions on container startup
Signed-off-by: Christina Ying Wang <christina@balena.io>
2023-03-27 12:07:01 -07:00
Christina Ying Wang
49ee1042a8 Mount boot partition into container on Supervisor start
As the Supervisor is a privileged container, it has access to host /dev, and therefore has access
to boot, data, and state balenaOS partitions. This commit sets up the framework for the following:

- Finds the /dev partition that corresponds to each partition based on partition label
- Mounts the partitions into set mountpoints in the device
- Removes reliance on env vars and mountpoints provided by host's start-balena-supervisor script
- Simplifies host path querying by centralizing these queries through methods in lib/host-utils.ts

This particular changes env vars for and mounts the boot partition.

Since the Supervisor would no longer rely on container `run` arguments provided by a host script,
this change moves Supervisor closer to being able to start itself (Supervisor-as-an-app).

Change-type: minor
Signed-off-by: Christina Ying Wang <christina@balena.io>
2023-03-27 12:07:01 -07:00
Christina Ying Wang
9522c15ecd Change constants imports to remove 'require'
Signed-off-by: Christina Ying Wang <christina@balena.io>
2023-03-27 12:07:01 -07:00
Felipe Lalanne
89175432af Find and remove duplicate networks
We have seen a few times devices with duplicated network names for some
reason. While we don't know the cause the networks get duplicates, this
can be disruptive for updates as trying to create a container referencing a duplicate
network results in a 400 error from the engine.

This commit finds and removes duplicate networks via the state engine,
this means that even if somehow a container could be referencing a
network that has been duplicated later somehow, this will remove the
container first.

While thies doesn't solve the problem of duplicate networks being
created in the first place, it will fix the state of the system to
correct the inconsistency.

Change-type: minor
Closes: #590
2023-02-10 20:24:36 -05:00
Felipe Lalanne
180c4ff31a Reference networks by Id instead of by name
We have seen a few times devices with duplicated network names for some
reason. While we don't know the cause the networks get duplicates,
this is disruptive of updates, as the supervisor usually queries
resource by name, resulting in a 400 error from the engine because of
the ambiguity.

This replaces those queries by name to queries by id. This includes
network removal. If a `removeNetwork` step is generated, the supervisor
opts to remove all instances of the network with the same name as it
cannot easily resolve the ambiguity.

This doesn't solve the problem of ambiguous networks, because even if
networks are referenced by id when creating a container, the engine will
throw an error (see https://github.com/balena-os/balena-supervisor/issues/590#issuecomment-1423557871)

Change-type: patch
Relates-to: #590
2023-02-10 20:24:36 -05:00
Christina Ying Wang
c4f9d72172 Remove dependent devices content in codebase
This includes:
- proxyvisor.js
- references in docs
- references device-state, api-binder, compose modules, API
- references in tests

The commit also adds a migration to remove the 4 dependent device tables from the DB.

Change-type: minor
Signed-off-by: Christina Ying Wang <christina@balena.io>
2023-02-06 19:34:02 -08:00
Pagan Gazzard
b8891ebb08 Use timers/promises for promisified setTimeout
Change-type: patch
2022-11-21 18:17:34 -05:00
Pagan Gazzard
3dea88c74e Remove some unnecessary usage of .reduce
Change-type: patch
2022-11-18 17:10:36 +00:00
Christina Ying Wang
8174ea9643 Simplify getting images for cleanup
getImagesForCleanup used to query the Engine for the Supervisor
image, which is unnecessary given that the Supervisor has access
to constants.supervisorImage. Thus, this Engine query is removed.

The method is simplified and made more clear, and
imageManager.isCleanupNeeded doesn't need to be stubbed in tests.

Change-type: patch
Signed-off-by: Christina Ying Wang <christina@balena.io>
2022-11-16 12:52:49 -08:00
Christina Ying Wang
f558be0a16 Create default network as config-only when services have host networking
This eliminates chances of host-Docker address collision for apps such
as the Supervisor where all services have host networking.

Closes: #2062
Change-type: patch
Signed-off-by: Christina Ying Wang <christina@balena.io>
2022-11-16 10:19:36 -08:00
Christina Ying Wang
af0bb87def Improve error message if supervisor0 network not found
Signed-off-by: Christina Ying Wang <christina@balena.io>
2022-11-09 10:22:27 -08:00
Christina Ying Wang
1034aa70e6 Convert ensureSupervisorNetwork to native Promises
Also remove system interface check from ensureSupervisorNetwork.

Previously `ensure` was a Bluebird promise which wasn't awaited in
its composition step. This has been here for some time and may contribute
to issues with duplicate networks. The conversion to native Promises
allows `ensure` to be awaited, hopefully reducing instances of duplicate
networks.

Removing the system interface check for /sys/class/net/supervisor0
because it's superfluous given that the Engine creates the interface
with NetworkManager. It also makes testing a lot more difficult to set up
as /sys/class/net isn't a directory that can be written to for emulating
system interface creation / removal.

Relates-to: https://github.com/balena-os/balena-supervisor/issues/1110
Change-type: minor
Signed-off-by: Christina Ying Wang <christina@balena.io>
2022-11-08 16:06:10 -08:00
Christina Ying Wang
75bf2aa3b4 Improve NotFoundError handling
Use isNotFoundError which converts an error of the default type
`unknown` into NotFoundError if the error is an instance of NotFoundError.
Thrown errors are of type `unknown` by default so we should use methods
with type guards for better type narrowing.

Signed-off-by: Christina Ying Wang <christina@balena.io>
2022-11-08 15:41:52 -08:00
Christina Ying Wang
463d73f8a4 Access api-key methods through device API
This makes for better black boxing of device API as a module.

Signed-off-by: Christina Ying Wang <christina@balena.io>
2022-10-18 14:27:19 -07:00
Christina Ying Wang
966d957465 Convert common.js to TypeScript
Change-type: patch
Signed-off-by: Christina Ying Wang <christina@balena.io>
2022-10-18 14:27:19 -07:00
Christina Ying Wang
71b2aea0fe Use v2 router directly instead of through application manager
Signed-off-by: Christina Ying Wang <christina@balena.io>
2022-10-18 14:16:53 -07:00
Christina Ying Wang
a2d9af2407 Move /v1 routes in apiBinder.router to v1.ts
Signed-off-by: Christina Ying Wang <christina@balena.io>
2022-10-18 14:16:53 -07:00
Christina Ying Wang
d08f25f0a3 Consolidate API middlewares, move api-keys to device-api
Signed-off-by: Christina Ying Wang <christina@balena.io>
2022-10-18 14:16:53 -07:00
Felipe Lalanne
a69fbf6eac Migrate volume-manager tests to integration
Now the tests are ran against the actual docker engine instead of
against mockerode.

The new tests actually caught a bug in
`volumeManager.removeOrphanedVolumes`, where that function would try to
remove volumes for stopped containers, causing an exception.
This commit also fixes that bug.
2022-09-28 10:37:41 -03:00
Pagan Gazzard
96418d55b5 Update @balena/lint to 6.2.0
Change-type: patch
2022-09-19 16:41:28 +01:00
Pagan Gazzard
a4c13aa2e9 Update to typescript 4.8.2
Change-type: patch
2022-09-19 16:36:17 +01:00
Felipe Lalanne
f7bc30a310 Remove unnecessary check for docker status code 2022-09-14 10:41:32 -03:00
Felipe Lalanne
c6f911c36b Only install service if running is set to false
The supervisor supports target state `running: false` for services.
This state indicates that the service should be stopped if already
running, or that the container should just be created and never started
if the container does not exist. This commit fixes the latter behavior.

Although nothing in our platform currently sends this target state, this
enables some potential use cases, e.g. only starting some services
in manufacturing and starting the rest of the services when the device
actually connects.

Change-type: patch
Closes: #2014
2022-09-14 10:15:51 -03:00
Felipe Lalanne
5a57647450 Fix filtering of the supervisor app on the target state
Under some conditions, an aarch64 device may get a reference to a armv7hf
supervisor on the target state. One of the ways this can happen is if
an aarch64 device is added to an armv7hf fleet and the target supervisor
is set before the device fully provisions.

If that happens, the previous filtering for the supervisor app (which
relied on the architecture in device-type.json) would
fail and the user would end up with two supervisor containers, one
running correctly and the other crash looping.

This fixes the filtering and just checks if the supervisor uuid/service
name belongs to a group of known uuids.

Closes: #2006
Change-type: patch
2022-09-12 16:28:22 -03:00
Felipe Lalanne
48e0733c7e Remove side effects for module imports
The supervisor uses the following pattern for async module
initialization

```typescript
// module.ts

export const initialised = (async () => {
    // do some async initialization
})();

// somewhere else
import * as module from 'module';

async function setup() {
  await module.initialise;
}
```

The above pattern means that whenever the module is imported, the
initialisation procedure will be ran, which is an anti-pattern.

This converts any instance of this pattern into a function

```typescript
export const initialised = _.once(async () => {
    // do some async initialization
});
```

And anywhere else on the code it replaces the call with a

```typescript
await module.initialised();
```

Change-type: patch
2022-09-06 15:48:18 -04:00
Christina Wang
12b67742c8 Wait for Stopping services to stop before target apply success
This mitigates an edge case bug introduced in v13.1.3 where services that
are slow to exit may get stuck in a state of Downloaded if a service var is
changed then reverted rapidly. More detailed description in linked issue.

Change-type: patch
Closes: #1991
Signed-off-by: Christina Wang <christina@balena.io>
2022-08-02 14:34:25 -07:00
Christina Wang
a7a0821a3e Read hostname from config.json with container /etc/hostname as backup
We don't need to read the host's hostname through /mnt/root/etc/hostname,
because the hostname is written to config.json on a change. When the hostname
has never changed, it won't be found in config.json, so we can default to
the Supervisor container's /etc/hostname as it will match the host's
/etc/hostname, the network mode being `host`.

Closes: #1968
Change-type: patch
Signed-off-by: Christina Wang <christina@balena.io>
2022-06-15 11:31:36 -07:00
20k-ultra
471f0f0615 Refactor update-lock.lock to accept an array of applications to lock
Change-type: patch
Signed-off-by: 20k-ultra <3946250+20k-ultra@users.noreply.github.com>
2022-06-02 00:31:42 -04:00
Christina Wang
6ccd2178c1 Use Mounts API for engine socket feature label
When upgrading to this Supervisor version, containers using the
engine feature label will be restarted.

Relates-to: https://github.com/balena-os/balena-supervisor/pull/1780
Closes: https://github.com/balena-os/balena-engine/issues/220
Closes: #1933
Change-type: major
Signed-off-by: Christina Wang <christina@balena.io>
2022-05-17 23:57:28 +00:00
Christina Wang
2896444988 Log anonymous volumes
Signed-off-by: Christina Wang <christina@balena.io>
2022-05-17 11:08:23 -07:00
Christina Wang
0a9c7282e8 Add compose support for volumes defined with long syntax
balena-compose already supports this, and with this PR, Supervisor can
have the option of using HostConfig.Mounts for internal bind mounts such as
ones added by feature labels. This will be handled in a future PR.

The only blocker to having users use long syntax is adding this feature
to target state. This PR does not add that feature.

Relates-to: https://github.com/balena-os/balena-supervisor/pull/1780
Relates-to: https://github.com/balena-os/balena-engine/issues/220
Relates-to: #1933
Change-type: patch
Signed-off-by: Christina Wang <christina@balena.io>
2022-05-17 11:08:23 -07:00
20k-ultra
5437aea786 Remove in memory storage of started/stopped containers
Change-type: patch
Signed-off-by: 20k-ultra <3946250+20k-ultra@users.noreply.github.com>
2022-04-19 22:27:15 -04:00
20k-ultra
ca9945bdfb Only start a container once in its lifetime
This will ensure the restart policy specified is not violated

Change-type: patch
Closes: #1668
Signed-off-by: 20k-ultra <3946250+20k-ultra@users.noreply.github.com>
2022-04-19 22:27:15 -04:00
20k-ultra
c1b5e58ebd Correctly evaluate downloadProgress when computing current state
Change-type: patch
Closes: #1918
Signed-off-by: 20k-ultra <3946250+20k-ultra@users.noreply.github.com>
2022-04-07 21:21:44 -04:00
Felipe Lalanne
b2b1b111b3 Ignore the supervisor in the target state
Starting with v3 state endpoint, the supervisor may receive the configuration
for the supervisor service on the target state. This commit allows the
supervisor to filter out the supervisor container from the current and target
state to let the update-balena-supervisor script handle the creation and update
of the supervisor container.

Updating and creating the supervisor container will be handled by a
future commit
2022-03-22 19:28:43 -03:00
Felipe Lalanne
8e40f1c2f5 Ignore unknown image classes on the target state
Starting with v3 state endpoint, the supervisor can receive
service configuration for services that are meant to be installed as
overlays or filesets on the host, as well as configuration for services
that are meant to be installed on the root partition. This commit just
ignores those services from the target state until support is added
2022-03-22 19:28:43 -03:00
Felipe Lalanne
f1cd3d367c Cleanup unused methods and dependencies on db ids 2022-03-22 19:28:43 -03:00
Felipe Lalanne
97f3b2a51e Update types and create methods for reporting v3 state 2022-03-22 19:08:03 -03:00
Felipe Lalanne
e9af9d8e83 Allow application manager to match apps between environments
If an app with the same app uuid exists between environments, the
supervisor will match the apps by uuid to prevent stopping the running
app
2022-03-22 19:08:03 -03:00
Felipe Lalanne
5c5483dd3d Rename networks to <appUuid>_<networkName>
This is required as we are phasing out app ids and we need to be able to
get app uuid from the current state of the network. The app-id now
exists as a container in new networks

This commit will restart containers as it needs to recreate the network.
2022-03-22 19:08:03 -03:00
Felipe Lalanne
0835b29874 Add app uuid as metadata to new volumes
We cannot modify older volumes but newly created volumes will contain
app uuid as metadata so they can be migrated at some point in the
future.
2022-03-22 19:08:03 -03:00
Felipe Lalanne
0b19dee511 Cleanup current state reporting methods
Removed redundant `getCurrentAppsForReport` and `getCurrentForComparison` since
the behavior of these methods is already handled by `getCurrentApps` and
`getCurrentState`.
2022-03-22 19:08:03 -03:00
Felipe Lalanne
1edd060143 Clean up migration from legacy target state format
Creates `lib/legacy.ts` and `device-state/legacy.ts` to deal with
migration from legacy target states (single container and v2) for all
apps and for apps.json respectively
2022-03-22 19:08:03 -03:00
Felipe Lalanne
7425d1110b Add support for GET v3 target state
This change updates types and database format in order to allow
receiving the new format of the target state from the cloud and allow
applications to keep working.

This change also updates metadata in the containers, meaning services
will need to be restarted on supervisor update

Change-type: major
2022-03-22 19:08:02 -03:00