The Service class in `compose/service.ts` cannot get the image name
from the image id when building the object from the container metadata.
We query the metadata in the application manager getCurrentApps method
so the current state can be used as target by API methods
Change-type: patch
Network aliases are now compared checking that the target state is a
subset of the current state. This will prevent service restarts due to
additional aliases created by docker in the container.
Closes: #2134
Change-type: patch
When getting the service from the docker container, remove the
containerId from the list of aliases (which gets added by docker). This
will make it easier to use the current service state as a target.
This will help us remove the `safeStateClone` function in the API in a
future commit
Change-type: patch
This replaces the previous flag `isApplyingIntermediate` on application
manager and simplifies the interface of the state engine to make temporary changes to the
general app state.
Change-type: patch
There were multiple places in the state engine that skipped some
operations while in local mode. In reality, all it's needed while in
local mode is to skip image and volume deletion.
This commit simplifies application-manager and compose app to be more
local mode agnostic and instead making the image deletion and volume
deletion configurable via function arguments.
This also has the benefit to make the treatment of local mode
applications more similar to cloud mode applications, allowing for
API endpoints to function the same way both modes.
Change-type: patch
As the Supervisor is a privileged container, it has access to host /dev, and therefore has access
to boot, data, and state balenaOS partitions. This commit sets up the framework for the following:
- Finds the /dev partition that corresponds to each partition based on partition label
- Mounts the partitions into set mountpoints in the device
- Removes reliance on env vars and mountpoints provided by host's start-balena-supervisor script
- Simplifies host path querying by centralizing these queries through methods in lib/host-utils.ts
This particular changes env vars for and mounts the boot partition.
Since the Supervisor would no longer rely on container `run` arguments provided by a host script,
this change moves Supervisor closer to being able to start itself (Supervisor-as-an-app).
Change-type: minor
Signed-off-by: Christina Ying Wang <christina@balena.io>
We have seen a few times devices with duplicated network names for some
reason. While we don't know the cause the networks get duplicates, this
can be disruptive for updates as trying to create a container referencing a duplicate
network results in a 400 error from the engine.
This commit finds and removes duplicate networks via the state engine,
this means that even if somehow a container could be referencing a
network that has been duplicated later somehow, this will remove the
container first.
While thies doesn't solve the problem of duplicate networks being
created in the first place, it will fix the state of the system to
correct the inconsistency.
Change-type: minor
Closes: #590
We have seen a few times devices with duplicated network names for some
reason. While we don't know the cause the networks get duplicates,
this is disruptive of updates, as the supervisor usually queries
resource by name, resulting in a 400 error from the engine because of
the ambiguity.
This replaces those queries by name to queries by id. This includes
network removal. If a `removeNetwork` step is generated, the supervisor
opts to remove all instances of the network with the same name as it
cannot easily resolve the ambiguity.
This doesn't solve the problem of ambiguous networks, because even if
networks are referenced by id when creating a container, the engine will
throw an error (see https://github.com/balena-os/balena-supervisor/issues/590#issuecomment-1423557871)
Change-type: patch
Relates-to: #590
This includes:
- proxyvisor.js
- references in docs
- references device-state, api-binder, compose modules, API
- references in tests
The commit also adds a migration to remove the 4 dependent device tables from the DB.
Change-type: minor
Signed-off-by: Christina Ying Wang <christina@balena.io>
getImagesForCleanup used to query the Engine for the Supervisor
image, which is unnecessary given that the Supervisor has access
to constants.supervisorImage. Thus, this Engine query is removed.
The method is simplified and made more clear, and
imageManager.isCleanupNeeded doesn't need to be stubbed in tests.
Change-type: patch
Signed-off-by: Christina Ying Wang <christina@balena.io>
This eliminates chances of host-Docker address collision for apps such
as the Supervisor where all services have host networking.
Closes: #2062
Change-type: patch
Signed-off-by: Christina Ying Wang <christina@balena.io>
Also remove system interface check from ensureSupervisorNetwork.
Previously `ensure` was a Bluebird promise which wasn't awaited in
its composition step. This has been here for some time and may contribute
to issues with duplicate networks. The conversion to native Promises
allows `ensure` to be awaited, hopefully reducing instances of duplicate
networks.
Removing the system interface check for /sys/class/net/supervisor0
because it's superfluous given that the Engine creates the interface
with NetworkManager. It also makes testing a lot more difficult to set up
as /sys/class/net isn't a directory that can be written to for emulating
system interface creation / removal.
Relates-to: https://github.com/balena-os/balena-supervisor/issues/1110
Change-type: minor
Signed-off-by: Christina Ying Wang <christina@balena.io>
Use isNotFoundError which converts an error of the default type
`unknown` into NotFoundError if the error is an instance of NotFoundError.
Thrown errors are of type `unknown` by default so we should use methods
with type guards for better type narrowing.
Signed-off-by: Christina Ying Wang <christina@balena.io>
Now the tests are ran against the actual docker engine instead of
against mockerode.
The new tests actually caught a bug in
`volumeManager.removeOrphanedVolumes`, where that function would try to
remove volumes for stopped containers, causing an exception.
This commit also fixes that bug.
The supervisor supports target state `running: false` for services.
This state indicates that the service should be stopped if already
running, or that the container should just be created and never started
if the container does not exist. This commit fixes the latter behavior.
Although nothing in our platform currently sends this target state, this
enables some potential use cases, e.g. only starting some services
in manufacturing and starting the rest of the services when the device
actually connects.
Change-type: patch
Closes: #2014
Under some conditions, an aarch64 device may get a reference to a armv7hf
supervisor on the target state. One of the ways this can happen is if
an aarch64 device is added to an armv7hf fleet and the target supervisor
is set before the device fully provisions.
If that happens, the previous filtering for the supervisor app (which
relied on the architecture in device-type.json) would
fail and the user would end up with two supervisor containers, one
running correctly and the other crash looping.
This fixes the filtering and just checks if the supervisor uuid/service
name belongs to a group of known uuids.
Closes: #2006
Change-type: patch
The supervisor uses the following pattern for async module
initialization
```typescript
// module.ts
export const initialised = (async () => {
// do some async initialization
})();
// somewhere else
import * as module from 'module';
async function setup() {
await module.initialise;
}
```
The above pattern means that whenever the module is imported, the
initialisation procedure will be ran, which is an anti-pattern.
This converts any instance of this pattern into a function
```typescript
export const initialised = _.once(async () => {
// do some async initialization
});
```
And anywhere else on the code it replaces the call with a
```typescript
await module.initialised();
```
Change-type: patch
This mitigates an edge case bug introduced in v13.1.3 where services that
are slow to exit may get stuck in a state of Downloaded if a service var is
changed then reverted rapidly. More detailed description in linked issue.
Change-type: patch
Closes: #1991
Signed-off-by: Christina Wang <christina@balena.io>
We don't need to read the host's hostname through /mnt/root/etc/hostname,
because the hostname is written to config.json on a change. When the hostname
has never changed, it won't be found in config.json, so we can default to
the Supervisor container's /etc/hostname as it will match the host's
/etc/hostname, the network mode being `host`.
Closes: #1968
Change-type: patch
Signed-off-by: Christina Wang <christina@balena.io>
balena-compose already supports this, and with this PR, Supervisor can
have the option of using HostConfig.Mounts for internal bind mounts such as
ones added by feature labels. This will be handled in a future PR.
The only blocker to having users use long syntax is adding this feature
to target state. This PR does not add that feature.
Relates-to: https://github.com/balena-os/balena-supervisor/pull/1780
Relates-to: https://github.com/balena-os/balena-engine/issues/220
Relates-to: #1933
Change-type: patch
Signed-off-by: Christina Wang <christina@balena.io>
This will ensure the restart policy specified is not violated
Change-type: patch
Closes: #1668
Signed-off-by: 20k-ultra <3946250+20k-ultra@users.noreply.github.com>
Starting with v3 state endpoint, the supervisor may receive the configuration
for the supervisor service on the target state. This commit allows the
supervisor to filter out the supervisor container from the current and target
state to let the update-balena-supervisor script handle the creation and update
of the supervisor container.
Updating and creating the supervisor container will be handled by a
future commit
Starting with v3 state endpoint, the supervisor can receive
service configuration for services that are meant to be installed as
overlays or filesets on the host, as well as configuration for services
that are meant to be installed on the root partition. This commit just
ignores those services from the target state until support is added
This is required as we are phasing out app ids and we need to be able to
get app uuid from the current state of the network. The app-id now
exists as a container in new networks
This commit will restart containers as it needs to recreate the network.
Removed redundant `getCurrentAppsForReport` and `getCurrentForComparison` since
the behavior of these methods is already handled by `getCurrentApps` and
`getCurrentState`.
Creates `lib/legacy.ts` and `device-state/legacy.ts` to deal with
migration from legacy target states (single container and v2) for all
apps and for apps.json respectively
This change updates types and database format in order to allow
receiving the new format of the target state from the cloud and allow
applications to keep working.
This change also updates metadata in the containers, meaning services
will need to be restarted on supervisor update
Change-type: major