This allows to test that the supervisor build actually runs and opens up the
possibility of running more exhaustive API tests against a working supervisor.
Change-type: patch
When code that is unit tested is part of a file that imports modules which
depend on the dbus module, this breaks the unit test environment because there
is no system socket set up, as the unit test mocha config doesn't import fixtures.ts.
For example, if we change src/compose/utils to import device-config or api-binder, both
of those modules import lib/dbus which invokes a dbus.getBus call at the root level. This
is problematic for unit testing.
We can get around the root-level dbus.getBus call by initializing dbus only when it's first
needed. The mocked-dbus test setup code can also be removed in favor of legacy mocha
hooks, which makes the dbus stubbing in the legacy test environment more clear.
We can remove these legacy hooks when all the legacy tests are migrated to unit/integration.
Signed-off-by: Christina Ying Wang <christina@balena.io>
The state engine and preloading is performed before the device gets a
chance to register, while this is desirable for preloaded apps, it
introduces a delay on registration which is known to cause issues since
the VPN is also trying to connect at the same time.
This triggers a simultaneous start of the device engine, the API binder
and the supevisor API to avoid delays.
Change-type: patch
Update-lock tests now use the actual filesystem for testing, instead of
relying on stubs and spies.
This commit also fixes a small bug with update-lock that would cause a
`PromiseRejectionHandledWarning` when the lock callback would throw.
Now the tests are ran against the actual docker engine instead of
against mockerode.
The new tests actually caught a bug in
`volumeManager.removeOrphanedVolumes`, where that function would try to
remove volumes for stopped containers, causing an exception.
This commit also fixes that bug.
The supervisor used to rely on specific event reporting for identifying
issues at runtime. As the platform has grown, it has become much more
difficult to get any signal from the event noise. Recently the API side
for these events has been disabled, meaning these events only
contribute to bandwidth consumption. This commit disables the
event reporting feature of the supervisor which will be most likely
replaced by something like Sentry in the near future.
Change-type: minor
The supervisor supports target state `running: false` for services.
This state indicates that the service should be stopped if already
running, or that the container should just be created and never started
if the container does not exist. This commit fixes the latter behavior.
Although nothing in our platform currently sends this target state, this
enables some potential use cases, e.g. only starting some services
in manufacturing and starting the rest of the services when the device
actually connects.
Change-type: patch
Closes: #2014
`withDefault` is a type helper that allows to create a type that
defaults to a default value when trying to decode a nullish value.
That type was not correctly working with boolean types, causing `false`
values to be replaced by true. This would specifically cause issues when
parsing the target state, where a `running: false` in a service would
become a `running: true` due to the type decoding.
Change-type: patch
Under some conditions, an aarch64 device may get a reference to a armv7hf
supervisor on the target state. One of the ways this can happen is if
an aarch64 device is added to an armv7hf fleet and the target supervisor
is set before the device fully provisions.
If that happens, the previous filtering for the supervisor app (which
relied on the architecture in device-type.json) would
fail and the user would end up with two supervisor containers, one
running correctly and the other crash looping.
This fixes the filtering and just checks if the supervisor uuid/service
name belongs to a group of known uuids.
Closes: #2006
Change-type: patch
The supervisor uses the following pattern for async module
initialization
```typescript
// module.ts
export const initialised = (async () => {
// do some async initialization
})();
// somewhere else
import * as module from 'module';
async function setup() {
await module.initialise;
}
```
The above pattern means that whenever the module is imported, the
initialisation procedure will be ran, which is an anti-pattern.
This converts any instance of this pattern into a function
```typescript
export const initialised = _.once(async () => {
// do some async initialization
});
```
And anywhere else on the code it replaces the call with a
```typescript
await module.initialised();
```
Change-type: patch
This mitigates an edge case bug introduced in v13.1.3 where services that
are slow to exit may get stuck in a state of Downloaded if a service var is
changed then reverted rapidly. More detailed description in linked issue.
Change-type: patch
Closes: #1991
Signed-off-by: Christina Wang <christina@balena.io>
State report errors contribute to the supervisor failing healthchecks
and being restarted by the engine. There is not evidence of this
improving the connectivity situation and it is likely to make things
worst for the API as the first report is much more expensive than
subsequent partial reports.
Change-type: patch
Closes: #1986
Some libraries, like [proper-lockfile](https://www.npmjs.com/package/proper-lockfile)
use directories instead of files for locking. This PR allows the supervisor to be able to
work with those types of locks when lock override is requested.
Closes: #1978
Change-type: patch
Host config shouldn't be tied to applications in the first place, but
needs to be done so because it uses update locks to determine when it's
safe to patch host config, and update locks are tied to apps.
Change-type: patch
Signed-off-by: Christina Wang <christina@balena.io>
We don't need to read the host's hostname through /mnt/root/etc/hostname,
because the hostname is written to config.json on a change. When the hostname
has never changed, it won't be found in config.json, so we can default to
the Supervisor container's /etc/hostname as it will match the host's
/etc/hostname, the network mode being `host`.
Closes: #1968
Change-type: patch
Signed-off-by: Christina Wang <christina@balena.io>
This fixes a race condition that could occur with the first current
state report, where if the device managed to send the current state
report first, then the device name on the cloud would be set to `local`
(see #1959).
Closes: #1959
Change-type: patch