49 Commits

Author SHA1 Message Date
Cameron Diver
d5f4ac690f
refactor: Only promisify read and write locks once
Change-type: patch
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-01-08 11:59:54 +00:00
Cameron Diver
b32fba43e1
refactor: Convert DeviceConfig module to typescript
Change-type: patch
Signed-off-by: Cameron Diver <cameron@balena.io>
2018-12-21 17:18:35 +00:00
Pagan Gazzard
019190646e Update pinejs-client to pinejs-client-request 5.x
Change-type: patch
2018-12-19 17:54:53 +00:00
Cameron Diver
b977b30dfe
refactor: Convert update-lock module to typescript
Change-type: patch
Signed-off-by: Cameron Diver <cameron@balena.io>
2018-12-19 14:07:47 +00:00
Pablo Carranza Velez
42737cb9e9 Fix a race condition that could cause an unnecessary restart of a service immediately after download
Up to now, there was a slim but non-zero chance that an image would be downloaded between the call to `@getTarget` inside deviceState
(which gets the target state and creates Service objects using information from available images), and the call to
`@images.getAvailable` in ApplicationManager (which is used to determine whether we should keep waiting for a download or start the
service). If this race condition happened, then the ApplicationManager would infer that a service was ready to be started (because
the image appears as available), but would have incomplete information about the service because the image wasn't available when
the Service object was created. The result would be that the service would be started, and then immediately on the next applyTarget
the ApplicationManager would try to kill it and restart it to update it with the complete information from the image.

This patch changes this behavior by ensuring that all of the additional information about the current state, which includes available images,
is gathered *before* building the current and target states that we compare. This means that if the image is downloaded after the call to getAvailable, the Service might be constructed with all the information about the image, but it won't be started until the next pass, because ApplicationManager will treat it as still downloading.
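
A minimal sketch of the ordering described above, not the supervisor's actual code. The names `ImageStoreSketch`, `buildService`, `runPass` and the `betweenCalls` hook are hypothetical and exist only to simulate a download finishing between the two calls. The list of available images is snapshotted first, so a service is never started on the basis of an image that appeared after its information was gathered.

```typescript
class ImageStoreSketch {
  private images: string[] = [];

  add(image: string): void {
    this.images.push(image);
  }

  // Return a copy so that callers hold a stable snapshot.
  getAvailable(): string[] {
    return this.images.slice();
  }
}

interface ServiceSketch {
  image: string;
  hasImageInfo: boolean;
}

type PassDecision = 'start' | 'wait';

// A service only has complete information if its image exists when it is built.
function buildService(store: ImageStoreSketch, image: string): ServiceSketch {
  return { image, hasImageInfo: store.getAvailable().indexOf(image) !== -1 };
}

function runPass(
  store: ImageStoreSketch,
  image: string,
  betweenCalls: () => void,
): { service: ServiceSketch; decision: PassDecision } {
  // 1. Gather the additional information first.
  const available = store.getAvailable();
  // A download may complete at this point (simulated by the hook).
  betweenCalls();
  // 2. Only then build the service that will be compared.
  const service = buildService(store, image);
  // 3. Decide using the earlier snapshot, which treats a late image as still downloading.
  const decision: PassDecision = available.indexOf(image) !== -1 ? 'start' : 'wait';
  return { service, decision };
}
```

With this ordering, a late download costs at most one extra pass: the first pass waits even though the service was built with full information, and the next pass starts it.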

Change-type: patch
Signed-off-by: Pablo Carranza Velez <pablo@balena.io>
2018-12-17 15:41:12 -03:00
Cameron Diver
82602abf8d
config: Replace supervisorOfflineMode and offlineMode with unmanaged
Change-type: major
Signed-off-by: Cameron Diver <cameron@balena.io>
2018-12-14 15:01:41 +00:00
Cameron Diver
5bea0fdc9d
fix: Give unmanaged target states a source of 'local'
Change-type: patch
Signed-off-by: Cameron Diver <cameron@balena.io>
2018-12-14 15:01:41 +00:00
Pablo Carranza Velez
b94921263a Use rimraf package instead of handmade function
Signed-off-by: Pablo Carranza Velez <pablo@balena.io>
2018-12-12 14:22:15 -03:00
Pablo Carranza Velez
af717a3761 Stricter validation for backup file contents
Signed-off-by: Pablo Carranza Velez <pablo@balena.io>
2018-12-12 14:17:34 -03:00
Pablo Carranza Velez
501272266b Add the ability to restore volumes from a backup.tgz in the data partition
Change-type: minor
Signed-off-by: Pablo Carranza Velez <pablo@balena.io>
2018-12-12 14:17:33 -03:00
Heds Simons
80203f29ad
api: Ensure Supervisor API returns IP addresses
The move from pure CoffeeScript to TypeScript has brought a
few changes to the way transpiling happens. Previously, through
serendipity, the way `startIPAddressUpdate` was called worked
because of the binding convention pre-transpiling.

However, with the move to TypeScript this has changed, and
the assumption that omitting the parentheses would call the
method before passing a callback to the returned function
is incorrect. The method must be called explicitly first.
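
The difference can be illustrated with a small sketch. The `startIPAddressUpdate` below is a hypothetical stand-in whose signature is assumed from the description above (a method that returns a function taking a callback); it is not the supervisor's implementation, and the address is made up.

```typescript
type AddressCallback = (addresses: string[]) => void;

// Hypothetical stand-in: calling it returns a function that accepts the callback.
function startIPAddressUpdate(): (callback: AddressCallback) => void {
  return (callback) => callback(['10.0.0.2']);
}

const seenAddresses: string[] = [];
const onAddresses: AddressCallback = (addresses) => {
  for (const address of addresses) {
    seenAddresses.push(address);
  }
};

// Mistake: the callback is handed to the outer function, which ignores it.
// The cast exists only so that this sketch compiles.
const looseCall = startIPAddressUpdate as (...args: unknown[]) => unknown;
looseCall(onAddresses);
const countAfterMistake = seenAddresses.length;

// Fix: call the method first, then pass the callback to what it returns.
startIPAddressUpdate()(onAddresses);
const countAfterFix = seenAddresses.length;
```

In the mistaken form no error is raised; the callback is simply never invoked, which matches the symptom of the API returning no IP addresses.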

Connects-to: #836
Change-type: patch
Signed-off-by: Heds Simons <heds@balena.io>
2018-12-07 10:37:00 +00:00
Cameron Diver
64a8c03eba
unmanaged: Don't require a device name when setting a target state
Also set a default device name of 'local', to avoid an undefined value.

Change-type: patch
Signed-off-by: Cameron Diver <cameron@balena.io>
2018-11-29 11:35:08 +00:00
Pablo Carranza Velez
502167e267
fix: When updating from a legacy supervisor, use updated resource ids and image URL from the API
When updating from old supervisors (<7.0.0), we have so far used a fake id of 1 for serviceId, imageId
and releaseId since these were not available in the old supervisor. This causes problems when the supervisor
tries to report these values to the API. Moreover, the app from the legacy supervisor has an image URL
that doesn't include the content hash - this causes the supervisor to believe the image is not really downloaded
and try to fetch it again.

To fix these issues, we add a request to the API when the supervisor starts up and detects that there's a legacy
app that needs to be normalised. We fetch the appropriate release, and use it to populate the resource ids
and the updated image URL.

This should avoid the unnecessary image download, and errors reporting target state after an update.

Change-type: patch
Signed-off-by: Pablo Carranza Velez <pablo@balena.io>
2018-11-28 17:19:55 +00:00
Cameron Diver
2e80b49da1
Don't start connectivity check when in offlineMode
Change-type: patch
Signed-off-by: Cameron Diver <cameron@balena.io>
2018-11-28 14:53:07 +00:00
Cameron Diver
ce543d820f
Improve UX when apps.json is not present
Change-type: patch
Signed-off-by: Cameron Diver <cameron@balena.io>
2018-11-28 14:53:06 +00:00
Pagan Gazzard
d6e9283a15 Fix coffee-script lint failures
2018-11-02 14:50:12 +00:00
Pablo Carranza Velez
b3860b2b70 fix: Store and retrieve device config without namespaces
This avoids issues on provisioning where the current state
(esp. config.txt) that we want to save is retrieved without
a RESIN_ or BALENA_ prefix, causing those values to be lost.

Change-type: patch
Signed-off-by: Pablo Carranza Velez <pablo@balena.io>
2018-10-20 04:40:55 +02:00
Pablo Carranza Velez
6fb0147d3c Fix preloading in flasher images by reading apps.json if target hasn't been set
i.e. if we're not provisioned or if the target state is empty (of apps), then we
read apps.json to preload. We then mark that the target state has been set to avoid
trying to preload again if we ever get an empty target state from the API.

Change-type: patch
Signed-off-by: Pablo Carranza Velez <pablo@balena.io>
2018-10-19 14:42:29 +02:00
Pablo Carranza Velez
24cbfbb860 deviceConfig: allow BALENA_ config variables
They will take precedence over any existing RESIN_ variables. We strip both namespaces now
whenever we get the target values.

This also fixes preloading with a legacy config (the interface to get the config keys from
the legacy apps.json was broken).
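
A rough sketch of the precedence rule, under stated assumptions: the function name `stripNamespaces` is hypothetical, and dropping keys that carry neither prefix is a choice made for this example only, since the message does not say how such keys are handled.

```typescript
type ConfigVars = { [key: string]: string };

// Strip RESIN_ and BALENA_ prefixes; BALENA_ values win when both are set.
function stripNamespaces(target: ConfigVars): ConfigVars {
  const result: ConfigVars = {};
  // RESIN_ is processed first so that a later BALENA_ entry overwrites it.
  for (const prefix of ['RESIN_', 'BALENA_']) {
    for (const key of Object.keys(target)) {
      if (key.indexOf(prefix) === 0) {
        result[key.slice(prefix.length)] = target[key];
      }
    }
  }
  return result;
}

const strippedExample = stripNamespaces({
  RESIN_HOST_CONFIG_gpu_mem: '16',
  BALENA_HOST_CONFIG_gpu_mem: '32',
  RESIN_SUPERVISOR_VPN_CONTROL: 'true',
});
```

Processing the prefixes in two passes keeps the result independent of the key order in the input object.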

Change-type: minor
Signed-off-by: Pablo Carranza Velez <pablo@balena.io>
2018-10-18 17:20:53 +02:00
Cameron Diver
d939b2b9e6
fix: Remove debugging console logs
Change-type: patch
Signed-off-by: Cameron Diver <cameron@resin.io>
2018-10-18 15:04:04 +01:00
Cameron Diver
479e0a8bb8
state: Don't consider local mode when storing state
Change-type: patch
Signed-off-by: Cameron Diver <cameron@resin.io>
2018-10-13 20:20:05 +01:00
Cameron Diver
19cd310da3
Support setting target state in local mode from supervisor API
Change-type: minor
Closes: #689
Signed-off-by: Cameron Diver <cameron@resin.io>
2018-10-10 13:02:40 +01:00
Cameron Diver
d3a18da573
Refactor: Convert logging module to typescript
Change-type: patch
Signed-off-by: Cameron Diver <cameron@resin.io>
2018-09-03 09:39:06 -07:00
Petros Angelatos
0d812c272c
logger: Use the new logging backend
Change-type: minor
Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
2018-07-18 12:30:59 -07:00
Cameron Diver
c61b16655e
Remove resinApiEndpoint meta-endpoint and use config.json entry instead
The resinApiEndpoint config option existed for legacy reasons, where the
apiEndpoint was passed in via env vars, but this is no longer the case,
and the current supervisor wouldn't run on these older versions of
resinOS anymore anyway, so I've removed the references to this legacy
endpoint, as it made reasoning about offline mode weird.

Change-type: minor
Signed-off-by: Cameron Diver <cameron@resin.io>
2018-07-12 13:33:57 +01:00
Cameron Diver
089f31cb5d
Pin a device to a commit when preload has a pinDevice field
Change-type: minor
Closes: #668
Signed-off-by: Cameron Diver <cameron@resin.io>
2018-06-11 20:27:16 +01:00
Cameron Diver
bc37ee56e4
Check against application source for target applications
The supervisor will now check that a source of an application matches
the current source, and only start it if so.

Change-type: patch
Closes: #658
Signed-off-by: Cameron Diver <cameron@resin.io>
2018-05-22 12:11:57 +01:00
Cameron Diver
393671505c
Respond to reboot and shutdown endpoints with a success object
Change-type: patch
Closes: #607
Signed-off-by: Cameron Diver <cameron@resin.io>
2018-03-29 13:11:02 +01:00
Pablo Carranza Velez
348ff66cee
Replace the gosuper component with a node module that handles communication with systemd, and stop using an init system in the supervisor container
Change-Type: patch
Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-22 15:55:15 +00:00
Pablo Carranza Velez
e1e33b376e Force reboots and shutdowns if lock override is enabled
Closes #440
Change-Type: patch
Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-19 16:30:59 -03:00
Cameron Diver
d27c529ebe
Fix bug in require for migrations for legacy preload
Change-type: patch
Connects-to: #573
Signed-off-by: Cameron Diver <cameron@resin.io>
2018-03-09 14:10:11 +00:00
Cameron Diver
a150dbf329
Convert object to array when normalising legacy target apps
Change-type: patch
Connects-to: #567
Signed-off-by: Cameron Diver <cameron@resin.io>
2018-03-08 15:48:03 +00:00
Pablo Carranza Velez
15da221382 Implement a new logger that sends logs to the resin API, that can be used optionally instead of PubNub
Change-Type: minor
Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 21:09:06 -08:00
Pablo Carranza Velez
dc62418db4 Some fixes in current state reporting, error handling for "container not found", plus more style improvements
Also, ensure the properties argument to eventTracker.track is an object

Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:29 -08:00
Pablo Carranza Velez
58b167b43d Various bugfixes and stylistic improvements
* Use the correct defaults for the delta config variables that have them

* Only mount /lib/firmware and /lib/modules if they exist on the host

* hardcode-migrations.js: Nicer line separation

* APIBinder: switch to using a header for authentication, and keep credentials saved in the API clients

* Fix hrtime measurements in milliseconds

* Do not use classes for routers

* compose: properly initialize networkMode to the first entry in networks if there is one

* Fix some details regarding defaults in validation and service
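
For the hrtime item above, a minimal sketch of the conversion. Node's `process.hrtime()` returns a `[seconds, nanoseconds]` pair; the helper name `hrtimeToMs` is hypothetical, and the message does not say what the original bug was.

```typescript
type HrTime = [number, number];

// Convert a [seconds, nanoseconds] pair to milliseconds.
function hrtimeToMs(time: HrTime): number {
  const seconds = time[0];
  const nanoseconds = time[1];
  return seconds * 1e3 + nanoseconds / 1e6;
}

const elapsedMs = hrtimeToMs([1, 500000000]); // 1 s + 0.5 s = 1500 ms
```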

Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:29 -08:00
Pablo Carranza Velez
7ed27ea203 Some fixes on migrations, dependent devices and deltas
* Switch default dependent device type to generic

* Reduce noise in logs

* Limit to 3 simultaneous delta downloads

* Better check for deltaSource

* When checking volume dependencies, do not compare regular (non-named) volumes

* Store imageId for dependent apps, and don't report dependent images with invalid imageIds
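
The download limit above could be enforced along these lines. This is a sketch only: `DownloadLimiterSketch` and its callback-style tasks are hypothetical, and the commit does not describe how the supervisor implements the limit.

```typescript
// A task receives a `done` callback and calls it when the download ends.
type DownloadTask = (done: () => void) => void;

class DownloadLimiterSketch {
  private active = 0;
  private queue: DownloadTask[] = [];
  private limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  // Start the task now if a slot is free, otherwise queue it.
  run(task: DownloadTask): void {
    if (this.active >= this.limit) {
      this.queue.push(task);
      return;
    }
    this.start(task);
  }

  activeCount(): number {
    return this.active;
  }

  queuedCount(): number {
    return this.queue.length;
  }

  private start(task: DownloadTask): void {
    this.active++;
    let finished = false;
    task(() => {
      if (finished) {
        return; // ignore repeated completion signals
      }
      finished = true;
      this.active--;
      const next = this.queue.shift();
      if (next !== undefined) {
        this.start(next);
      }
    });
  }
}

// At most three downloads run at once; further ones wait for a free slot.
const deltaDownloads = new DownloadLimiterSketch(3);
```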

Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:29 -08:00
Pablo Carranza Velez
e43c9052dd Improve backwards-compatible response of GET /v1/device
Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:29 -08:00
Pablo Carranza Velez
3fd52bb0c7 Simplify the update logic by making fetch and kill (the only long-running actions) happen in the background, and always waiting for all actions before continuing
Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:29 -08:00
Pablo Carranza Velez
484a688dbd Pause updates while purging or restarting apps, and ensure an applyTarget is triggered after the actions run
Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:28 -08:00
Pablo Carranza Velez
8548222a00 Several bugfixes:
* Ensure commit is only reported when update has finished

* Change default delay between actions to 100ms

* Fix envArrayToObject for cases where the env var has an equal sign

* Use shell-quote to properly parse string command and entrypoint

* Fix preloading with a legacy apps.json
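
For the `envArrayToObject` item above, one way to handle values that contain an equals sign is to split on the first `=` only. The function below is a hypothetical sketch, not the supervisor's implementation; skipping entries without a name is an assumption of this example.

```typescript
// Convert ["NAME=value", ...] into { NAME: "value", ... }.
function envArrayToObjectSketch(env: string[]): { [name: string]: string } {
  const result: { [name: string]: string } = {};
  for (const entry of env) {
    // Split on the first '=' only, so the value may contain more of them.
    const separator = entry.indexOf('=');
    if (separator <= 0) {
      continue; // no '=' at all, or an empty variable name
    }
    result[entry.slice(0, separator)] = entry.slice(separator + 1);
  }
  return result;
}

const parsedEnv = envArrayToObjectSketch(['URL=http://host/?q=1', 'EMPTY=', 'NOEQUALS']);
```

A naive `entry.split('=')` followed by taking the first two parts would truncate the `URL` value at its second equals sign.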

Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:28 -08:00
Pablo Carranza Velez
2809d3c2ca Avoid failed updates causing several instances of applyTarget
Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:28 -08:00
Pablo Carranza Velez
534f7d13cb Fix local mode and the host-config endpoint
Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:28 -08:00
Pablo Carranza Velez
5ec8e57aa0 Implement v2 API endpoints to restart and purge apps, and restart a service
This also changes the deviceState object to use promises instead of timeouts to schedule
applying the target state.

Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:28 -08:00
Pablo Carranza Velez
f653fa4961 Add support for service hostname
Plus several small bug fixes:

* Allow target states with apps with no release

* Fix lock override and a TypeError in compareServicesForUpdate

* Lowercase service names when doing migrations and legacy preload

* Fix deltas from scratch

Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:28 -08:00
Pablo Carranza Velez
839ebf8688 Fix preloaded apps and support legacy preloading, and fix some details in the default service when migrating
Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:28 -08:00
Pablo Carranza Velez
25695aade5 Add support for init, mem_reservation, shm_size, read_only and sysctls.
Also several bugfixes:

* Fix VPN control, logging in deviceConfig, and action executors in proxyvisor

* Fix bug in calculation of dependencies due to fields still using snake_case

* Fix snake_case in a migration, and remove unused lib/migration.coffee

* In healthcheck, count deviceState as healthy when a fetch is in progress (as in the non-multicontainer supervisor)

* Set always as default restart policy

* Fix healthcheck, stop_grace_period and mem_limit

* Lint and reduce some cyclomatic complexities

* Namespace volumes and networks by appId, switch default network name to 'default', fix dependencies in networks and volumes, fix duplicated kill steps, fix fat arrow on provisioning

* Check that supervisor network is okay every time we're applying target state

Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:28 -08:00
Pablo Carranza Velez
739fe13cad Use a supervisor0 network interface for the supervisor network API. Remove RESIN_APP_COMMIT and RESIN_APP_RELEASE env vars.
Also add support for several networks per container (but with no configuration yet).
Also includes some bugfixes, implements the healthcheck, and stops disabling the VPN on startup.

Change-Type: major
Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:28 -08:00
Pablo Carranza Velez
b003f48d7b Switch to using knex migrations to set up the database, and change the database format to use integers for ids instead of strings.
Also includes various improvements and bugfixes to services and the migration from legacy /data to volumes.

The switch to migrations involves a dirty hack for webpack to properly resolve the paths to the migrations js files - it uses an expression
that webpack can't resolve, so we hardcode it to a value and use the ContextReplacementPlugin to make that value resolve to the migrations folder.

The downsides to this approach are:
- a change in knex code would break this
- the migration code is added twice to the supervisor image: once in the migrations folder (because knex needs to loop through the directory to find the files),
and once inside app.js (because I can't make webpack treat them as external)

Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:28 -08:00
Pablo Carranza Velez
f77d3e1563 DeviceState: implement a module to manage the device's target and current state
This module will take care of applying the target state for the device and reporting its current state.
The state itself is handled by two other modules, ApplicationManager and DeviceConfig. The former will take care of running applications (including the dependent ones
via its Proxyvisor), and the latter will take care of device configuration like config.txt and supervisor configuration variables.

The way state is applied differs radically from the previous approach: the old application.coffee had a big `update` function that took all of the steps from fetching the target state
to running the containers. DeviceState, instead, does an iterative process through `triggerApplyTarget` of inferring the next steps to perform towards the target state, by looking at the current state and asking the ApplicationManager and DeviceConfig for
the next steps. It then applies the next steps and every time a step is completed, it schedules another round of inferring and applying the next steps.

Special care is taken to ensure `applyTarget` is not called simultaneously more than once.
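
The iterative approach and the guard against simultaneous runs can be sketched as follows. This is a simplified, synchronous illustration and not the module's code: `ApplyLoopSketch`, the three service statuses and the `onStep` hook are all hypothetical.

```typescript
type ServiceStatus = 'absent' | 'downloaded' | 'running';
type StatusMap = { [service: string]: ServiceStatus };

interface Step {
  action: 'fetch' | 'start';
  service: string;
}

// Infer only the immediate next step for each target service.
function inferNextSteps(current: StatusMap, targetServices: string[]): Step[] {
  const steps: Step[] = [];
  for (const service of targetServices) {
    const status: ServiceStatus = current[service] || 'absent';
    if (status === 'absent') {
      steps.push({ action: 'fetch', service });
    } else if (status === 'downloaded') {
      steps.push({ action: 'start', service });
    }
  }
  return steps;
}

class ApplyLoopSketch {
  public rounds = 0;
  public current: StatusMap;
  private applying = false;
  private onStep: (step: Step) => void;

  constructor(current: StatusMap, onStep: (step: Step) => void) {
    this.current = current;
    this.onStep = onStep;
  }

  // Returns false when a run is already in progress, so two runs never overlap.
  applyTarget(targetServices: string[]): boolean {
    if (this.applying) {
      return false;
    }
    this.applying = true;
    try {
      let steps = inferNextSteps(this.current, targetServices);
      while (steps.length > 0) {
        this.rounds++;
        for (const step of steps) {
          this.current[step.service] = step.action === 'fetch' ? 'downloaded' : 'running';
          this.onStep(step);
        }
        // Re-infer from the new current state instead of planning ahead.
        steps = inferNextSteps(this.current, targetServices);
      }
    } finally {
      this.applying = false;
    }
    return true;
  }
}
```

Because each round looks only at the current state, a service that is absent takes two rounds (fetch, then start) without the loop needing a precomputed plan.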

This commit also adds a "device" module to handle reboot and shutdown, and moves gosuper calls to a separate module.

The module also uses a "network" module to manage network-related parts of the device's current state: IP addresses and the connectivity check.

The module implements a "normaliseLegacy" function that allows a migration from the models from older versions of the supervisor to the multicontainer models,
so that in case of a supervisor update we can have minimal downtime and bandwidth consumption when updating to the multicontainer supervisor - this migration allows
us to avoid cleaning up images, and also allows migrating the contents of the old /data for the app.

Changelog-Entry: Infer the current state of the device when applying the target state
Change-Type: patch
Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:27 -08:00