2188 Commits

Author SHA1 Message Date
Pablo Carranza Velez
3a710506a6 Switch to a new image management system keeping the docker image ID in the database, allowing deltas and proper comparison for images that have a digest.
Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:28 -08:00
Pablo Carranza Velez
d84bcf0fb4 When applying host config values like dtoverlay and dtparam, take values not starting with double quotes as single entries instead of arrays to parse
Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:28 -08:00
Pablo Carranza Velez
918372b569 Some bugfixes and style improvements
* Fix validation of 0, fix ulimits, don't compare mem_limit or mem_reservation until OS supports them

* Remove all instances of _.forEach

* ApplicationManager: have separate compareNetworksForUpdate and compareVolumesForUpdate

Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:28 -08:00
Pablo Carranza Velez
1706afa7a2 Remove deprecated and broken OOM protection from gosuper, and clean up its dependencies and unused files
Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:28 -08:00
Pablo Carranza Velez
25695aade5 Add support for init, mem_reservation, shm_size, read_only and sysctls.
Also several bugfixes:

* Fix VPN control, logging in deviceConfig, and action executors in proxyvisor

* Fix bug in calculation of dependencies due to fields still using snake_case

* Fix snake_case in a migration, and remove unused lib/migration.coffee

* In healthcheck, count deviceState as healthy when a fetch is in progress (as in the non-multicontainer supervisor)

* Set always as default restart policy

* Fix healthcheck, stop_grace_period and mem_limit

* Lint and reduce some cyclomatic complexities

* Namespace volumes and networks by appId, switch default network name to 'default', fix dependencies in networks and volumes, fix duplicated kill steps, fix fat arrow on provisioning

* Check that supervisor network is okay every time we're applying target state

Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:28 -08:00
Pablo Carranza Velez
739fe13cad Use a supervisor0 network interface for the supervisor network API. Remove RESIN_APP_COMMIT and RESIN_APP_RELEASE env vars.
Also add support for several networks per container (but with no configuration yet).
Also some bugfixes and implement healthcheck and not disabling VPN on startup.

Change-Type: major
Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:28 -08:00
Pablo Carranza Velez
b003f48d7b Switch to using knex migrations to set up the database, and change the database format to use integers for ids instead of strings.
Also includes various improvements and bugfixes to services and the migration from legacy /data to volumes.

The switch to migrations involves a dirty hack for webpack to properly resolve the paths to the migrations js files - it uses an expression
that webpack can't resolve, so we hardcode it to a value and use the ContextReplacementPlugin to make that value resolve to the migrations folder.

The downsides to this approach are:
- a change in knex code would break this
- the migration code is added twice to the supervisor image: once in the migrations folder (because knex needs to loop through the directory to find the files),
and once inside app.js (because I can't make webpack treat them as external)

Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:28 -08:00
Pablo Carranza Velez
7c98a9d058 Supervisor API: remove the tcp-ping endpoints
Change-Type: major
Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:28 -08:00
Pablo Carranza Velez
7d8a208a06 ApplicationManager: Avoid deadlocks by killing services once their dependencies have been downloaded, and killing services with handover when it is absolutely necessary
Two cases could've caused deadlocks:
1) Two services use a volume, and one service depends on the other. The volume config changes, but we can't update the volume because we need to kill
both services, and yet we can't kill the dependent service because its dependency isn't ready either.
2) A service with handover strategy uses a volume. The volume config changes. We can't update the volume because the running service is using it, and we can't
start the handover because it depends on the volume being ready. So we need to kill the service to update the volume config.

(Same for networks as with volumes)

Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:28 -08:00
Pablo Carranza Velez
0d27658a87 Various improvements and fixes to how compositions are handled
Change the way we get the network gateway to set up the supervisor API address.

Added support for cap_add, cap_drop and devices.

Some fixes like missing fat arrows and removing leftover code.

Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:28 -08:00
Pablo Carranza Velez
5f651c71f7 app.coffee: Switch to the multicontainer supervisor, add missing dependencies, and remove all files that are not used anymore
Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:28 -08:00
Pablo Carranza Velez
14d2bc3f39 APIBinder: implement a module to handle all interactions with the Resin API
This module provisions the device and takes care of getting the target state from the API, calling deviceState to apply it.
It also reports the current state of the device back to the API.

An important change is that the initial values of the device configuration (e.g. config.txt) are reported to the API, creating new config
variables if no values exist for a particular key. This will allow better management of config.txt by giving visibility to the initial configuration.

Changelog-Entry: Remove support for keeping the provisioning apiKey on Resin OS 1.X. Report initial values from config.txt and other device configuration variables to the Resin API.
Change-Type: major
Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:28 -08:00
Pablo Carranza Velez
bc191ee86c Proxyvisor: implement the Proxyvisor for the multicontainer supervisor
This will be quickly replaced by a newer version with a different API, but for now we needed to maintain backwards compatibility (see #508).

This proxyvisor handles dependent apps and devices with a multicontainer parent app.
It also switches to the new update mechanism by inferring and applying updates step by step.

Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:28 -08:00
Pablo Carranza Velez
195697a7e1 compose: implement the models that make up multicontainer applications
This commit adds models to manage services, images, volumes and networks.

The main model for this is ServiceManager, which manages the collection of services on the device. It has functions to query what services are running, and to perform actions like starting, killing or performing handovers.

The Service model allows defining the transformations between a container and its service representation, and includes the functions to compare a running service with a target to determine if an update needs to happen.
This model includes the relevant compose file entries for a service that are supported. Bind mounts are disallowed except for the ones that relate to supervisor features, and persistent data is now stored in named volumes.

The Images model allows fetching and removing images, and includes functionality to determine images that have to be cleaned up - now only dangling and old supervisor images are cleaned up automatically, and ApplicationManager
will remove images that correspond to old services that are no longer needed.

The Networks and Volumes models allow managing named networks and volumes that are part of composed applications.

Changelog-Entry: Remove all bind mounts that were specific to 1.X devices. Move the resin-kill-me file for the handover strategy to /tmp/resin. Add environment variables for the location of resin-kill-me and the lockfile. Use running containers to determine what services are running instead of storing them in the internal database. Use named volumes for persistent data.
Change-Type: major
Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:27 -08:00
Pablo Carranza Velez
be5623cbf1 DockerUtils: implement the docker utilities library as a class
This commit implements what we used to have in docker-utils.coffee now making use of coffeescript classes.

We remove the cleanup function as this is now handled directly by the ApplicationManager.

Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:27 -08:00
Pablo Carranza Velez
dac153eb8c updateLock: implement a module for a file-based update lock
This update lock library allows an application to take a lockfile in several locations (subdirectories inside a base folder). The user of this library must be able
to exclusively create a lockfile in each of the corresponding locations, and if any of the files exist, the locking fails.
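
As a rough illustration of the all-or-nothing locking described above, here is a minimal Python sketch (the module itself is not written in Python). The function names, the `example.lock` filename and the rollback of partially taken locks are assumptions made for the example, not details taken from this commit.

```python
import os


class UpdateLockError(Exception):
    """Raised when at least one lockfile already exists."""


def take_locks(base_dir, subdirs, lock_name="example.lock"):
    """Exclusively create one lockfile per subdirectory of base_dir."""
    created = []
    try:
        for sub in subdirs:
            folder = os.path.join(base_dir, sub)
            os.makedirs(folder, exist_ok=True)
            path = os.path.join(folder, lock_name)
            # O_CREAT | O_EXCL makes creation atomic: it fails if the file exists.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            os.close(fd)
            created.append(path)
    except FileExistsError as err:
        # Assumption: undo the locks taken so far so that locking fails as a whole.
        for path in created:
            os.remove(path)
        raise UpdateLockError(f"lock already held: {err.filename}") from err
    return created


def release_locks(paths):
    """Remove the lockfiles returned by take_locks."""
    for path in paths:
        os.remove(path)
```

A caller would hold the returned paths while it performs the update and pass them to `release_locks` afterwards.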

Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:27 -08:00
Pablo Carranza Velez
cb37f7ebcc ApplicationManager: implement a module to run multicontainer applications
This module takes care of inferring and applying the steps to run multicontainer applications. It will have a Proxyvisor to handle dependent apps and
devices. It understands the relationship between services, networks and volumes to infer the steps in the correct order, also taking update strategies into account.

Changelog-Entry: Allow running docker-compose-like multicontainer applications
Change-Type: major
Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:27 -08:00
Pablo Carranza Velez
7ae7ceab73 gosuper: add internal endpoints to get VPN and log-to-display status, and remove purge and IP address endpoints
Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:27 -08:00
Pablo Carranza Velez
d3e98eab11 DeviceConfig: implement a module to manage device configuration, including config.txt
This model allows modifying config.txt on raspberry pi devices, as well as logging to display, bandwidth control variables and other supervisor
configuration settings. Configuration values are read from the underlying OS and the supervisor configuration where appropriate (i.e. the Config object), instead of storing the current state
in the database. This means that the supervisor will always use the real values to determine if changes have to be made.

This fixes several issues with config.txt, as the current values are now read from the file, and can be reported on the supervisor's first run (which will be implemented in APIBinder).

It also now treats dtoverlay and dtparam values as a JSON array without the enclosing brackets, for instance:

```
RESIN_HOST_CONFIG_dtparam="audio=on","spi=on"
```

Will produce the following lines in config.txt:

```
dtparam=audio=on
dtparam=spi=on
```
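
The mapping above can be sketched in Python as follows (illustrative only; the supervisor is not written in Python and the function names are hypothetical). The handling of values that do not start with a double quote follows the later commit d84bcf0fb4 in this log rather than this commit.

```python
import json


def parse_host_config_list(raw):
    """Split a dtoverlay/dtparam variable value into individual entries."""
    value = raw.strip()
    if not value.startswith('"'):
        # Per the later commit d84bcf0fb4: unquoted values are a single entry.
        return [value]
    # Re-add the enclosing brackets and parse the result as a JSON array.
    return json.loads(f"[{value}]")


def to_config_txt_lines(key, raw):
    """Produce one config.txt line per parsed entry."""
    return [f"{key}={entry}" for entry in parse_host_config_list(raw)]


print(to_config_txt_lines("dtparam", '"audio=on","spi=on"'))
# ['dtparam=audio=on', 'dtparam=spi=on']
```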

Changelog-Entry: Implement inference of device configuration. Allow array values for dtoverlay and dtparam.
Change-Type: major
Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:27 -08:00
Pablo Carranza Velez
2953b745ce Logger: implement a module that handles all logging to pubnub
This module can also send logs for dependent devices (by passing a specific channel to the "log" function).

The log types are also moved to a separate module to be used by modules that perform logging.

Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:27 -08:00
Pablo Carranza Velez
93832d6540 network: implement a module to get IP addresses and check network connectivity
This module now uses the native node `os.networkInterfaces()` to retrieve the addresses,
instead of the gosuper endpoint.

We also add the very simple "blink" library that is also used by the Supervisor API.

Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:27 -08:00
Pablo Carranza Velez
f77d3e1563 DeviceState: implement a module to manage the device's target and current state
This module will take care of applying the target state for the device and reporting its current state.
The state itself is handled by two other modules, ApplicationManager and DeviceConfig. The former will take care of running applications (including the dependent ones
via its Proxyvisor), and the latter will take care of device configuration like config.txt and supervisor configuration variables.

The way state is applied differs radically from the previous approach: the old application.coffee had a big `update` function that took all of the steps from fetching the target state
to running the containers. DeviceState, instead, does an iterative process through `triggerApplyTarget` of inferring the next steps to perform towards the target state, by looking at the current state and asking the ApplicationManager and DeviceConfig for
the next steps. It then applies the next steps and every time a step is completed, it schedules another round of inferring and applying the next steps.

Special care is taken to ensure `applyTarget` is not called simultaneously more than once.
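
The infer-and-apply loop and the guard against simultaneous runs could look roughly like the Python sketch below. It is a simplification under stated assumptions: the real module is asynchronous and not written in Python, and `TargetApplier`, `infer_next_steps` and `apply_step` are hypothetical names. Here a second call made while a run is in progress is simply skipped.

```python
import threading


class TargetApplier:
    """Repeatedly infer and apply steps until none are left."""

    def __init__(self, infer_next_steps, apply_step):
        self._infer_next_steps = infer_next_steps
        self._apply_step = apply_step
        self._guard = threading.Lock()

    def trigger_apply_target(self):
        # A non-blocking acquire ensures only one apply runs at a time.
        if not self._guard.acquire(blocking=False):
            return False
        try:
            while True:
                # Look at the current state and ask what to do next.
                steps = self._infer_next_steps()
                if not steps:
                    return True  # current state matches the target
                for step in steps:
                    self._apply_step(step)
        finally:
            self._guard.release()
```

The point of the design is that each round starts from the observed current state, so an interrupted run can be resumed by simply triggering it again.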

This commit also adds a "device" module to handle reboot and shutdown, and moves gosuper calls to a separate module.

The module also uses a "network" module to manage network-related parts of the device's current state: IP addresses and the connectivity check.

The module implements a "normaliseLegacy" function that allows a migration from the models from older versions of the supervisor to the multicontainer models,
so that in case of a supervisor update we can have minimal downtime and bandwidth consumption when updating to the multicontainer supervisor - this migration allows
us to avoid cleaning up images, and also allows migrating the contents of the old /data for the app.

Changelog-Entry: Infer the current state of the device when applying the target state
Change-Type: patch
Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:27 -08:00
Pablo Carranza Velez
0dc9fea4d3 DB, Config: Implement modules to interact with the sqlite DB and to manage configuration
These modules allow managing the models in the sqlite database and the configuration for the supervisor.

The database will now have a schema version, and the supervisor will normalize any legacy data when migrating
from an older schema (i.e. when doing a supervisor update). This will make model changes cleaner.
If a migration is needed, the DB initialization will return "true" and store the legacy data in a legacyData table. Once the supervisor finishes migrating the data,
it calls `db.finishMigration` to mark the migration complete and clear the legacyData table.

Changes in the models:
* The database implements the tables for multicontainer applications that now have services, networks and volumes as in a docker compose file.
* Dependent apps and devices now have separate tables to store their target states.
* The deviceConfig table now only stores target values, as the current ones will be inferred from the state of the device.
* We keep a table for images as we have no way to label them in docker storage, so we need to keep our own track of what images are relevant for the supervisor.

The Config object allows transparent management of configuration values, mainly through `get`, `getMany` and `set` functions. The values can be stored in config.json or
the database, and this is managed with a schema definition that also defines whether values are mutable and whether they have default values.

Some configuration values are of the "func" type, which means that instead of corresponding to a config.json or database key, they result from a helper function
that aggregates other configuration values or gets the value from other sources, like OS version and supervisor version.
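
A schema-driven lookup of this kind might be sketched in Python as below. This is not the supervisor's code: the schema layout, the key names (`deviceName`, `pollInterval`, `summary`) and the in-memory dictionaries standing in for config.json and the database are all assumptions for illustration.

```python
class Config:
    """Resolve keys through a schema naming where each value lives."""

    def __init__(self, schema, config_json, db, funcs):
        self._schema = schema
        self._stores = {"config.json": config_json, "db": db}
        self._funcs = funcs

    def get(self, key):
        entry = self._schema[key]
        if entry["source"] == "func":
            # "func" values are computed by a helper instead of being stored.
            return self._funcs[key](self)
        return self._stores[entry["source"]].get(key, entry.get("default"))

    def get_many(self, keys):
        return {key: self.get(key) for key in keys}

    def set(self, values):
        # Validate everything first so a rejected key leaves no partial write.
        for key in values:
            entry = self._schema[key]
            if entry["source"] == "func" or not entry.get("mutable", False):
                raise ValueError(f"{key} is not mutable")
        for key, value in values.items():
            self._stores[self._schema[key]["source"]][key] = value


# Hypothetical schema and backing stores, for illustration only.
SCHEMA = {
    "deviceName": {"source": "config.json", "mutable": False},
    "pollInterval": {"source": "db", "mutable": True, "default": 60000},
    "summary": {"source": "func"},
}
FUNCS = {"summary": lambda cfg: f"{cfg.get('deviceName')}@{cfg.get('pollInterval')}"}
config = Config(SCHEMA, {"deviceName": "example"}, {}, FUNCS)
```

Callers only ever name a key; whether it comes from the file, the database or a helper function is decided by the schema.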

Writes to config.json are atomic if a path to the file via /mnt/root can be found. We keep a write-through cache of the file to avoid unnecessary IO.
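
One common way to get atomic writes with a write-through cache is to write a temporary file next to the target and rename it over the original. The Python sketch below shows that technique; whether the supervisor implements atomicity exactly this way is an assumption, and `CachedJsonFile` is a hypothetical name.

```python
import json
import os
import tempfile


class CachedJsonFile:
    """JSON file with cached reads and atomic, write-through updates."""

    def __init__(self, path):
        self._path = path
        with open(path) as handle:
            self._cache = json.load(handle)

    def get(self, key, default=None):
        # Reads are served from memory, so they cost no disk IO.
        return self._cache.get(key, default)

    def set(self, changes):
        updated = {**self._cache, **changes}
        folder = os.path.dirname(os.path.abspath(self._path))
        # The temp file lives in the same folder so the rename stays on one filesystem.
        fd, tmp_path = tempfile.mkstemp(dir=folder, suffix=".tmp")
        try:
            with os.fdopen(fd, "w") as handle:
                json.dump(updated, handle)
                handle.flush()
                os.fsync(handle.fileno())
            # os.replace swaps the file in one step: readers see old or new, never half.
            os.replace(tmp_path, self._path)
        except BaseException:
            if os.path.exists(tmp_path):
                os.remove(tmp_path)
            raise
        # Update the cache only after the file on disk has been replaced.
        self._cache = updated
```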

Changelog-Entry: Implement the multicontainer app models, and change the supervisor configuration management to avoid duplication between fields in config.json and fields in the internal database
Change-Type: patch
Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:27 -08:00
Pablo Carranza Velez
23f81c28f5 EventTracker: add a module to track mixpanel events
Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:27 -08:00
Pablo Carranza Velez
60a4cccfd2 Supervisor: Implement a Supervisor class with a SupervisorAPI
This will be the top level object in the multicontainer supervisor, using the following objects
to perform its duties:

* A DB object to manage the sqlite database models
* A Config object to manage configuration in sqlite and config.json
* An EventTracker to track events and send them to mixpanel
* A DeviceState object to manage the device state, including containers, device configuration and dependent devices
* An APIBinder object to manage all interactions with the Resin API
* The SupervisorAPI, implemented here, which exposes functionality from the other objects over an HTTP API with apikey authentication.

We also include an iptables module that the SupervisorAPI will use to only allow traffic from certain interfaces.

Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-03-06 10:32:27 -08:00
resin-io-versionbot[bot]
b397c998dd
Auto-merge for PR #565 via VersionBot
Update locking docs update, again
2018-03-05 19:31:13 +00:00
resin-io-versionbot[bot]
b0f2335b41 v6.6.9 2018-03-05 19:16:34 +00:00
zwalchuk
d931bf217a Update update-locking.md
Test to see if this updates commit message

Change-type: patch
Signed-off-by: Zach Walchuk zach@resin.io
2018-03-05 07:09:42 -08:00
zwalchuk
2ce831420f Small updates for standardization
A few small updates to standardize terminology and header with the rest of the docs.
2018-03-05 07:09:42 -08:00
resin-io-versionbot[bot]
cea1a4b781
Auto-merge for PR #561 via VersionBot
Allow truthy values for deltas and lock override (i.e. the string 'tr…
2018-02-27 18:42:22 +00:00
resin-io-versionbot[bot]
b6fc45b671 v6.6.8 2018-02-27 18:27:22 +00:00
Pablo Carranza Velez
f2d5a59727 Allow truthy values for deltas and lock override (i.e. the string 'true' besides '1')
We had previously done this for all the other configuration variables, but for some reason we had missed these two.
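
In Python terms the check amounts to something like the sketch below. `check_truthy` is a hypothetical name, and accepting only the exact strings '1' and 'true' is an assumption based on the commit title.

```python
def check_truthy(value):
    """Return True for the string values that count as enabled."""
    # Configuration variables arrive as strings; only '1' and 'true' enable a flag.
    return value in ("1", "true")
```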

Change-Type: patch
Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-02-27 09:27:45 -08:00
resin-io-versionbot[bot]
cb450a7f59
Auto-merge for PR #560 via VersionBot
Since armel builds are disabled, do not pull an armel node base image…
2018-02-27 17:25:26 +00:00
resin-io-versionbot[bot]
590c67333e v6.6.7 2018-02-27 17:11:11 +00:00
Pablo Carranza Velez
c463f7fa5b Since armel builds are disabled, do not pull an armel node base image, and ensure we never deploy an armel supervisor
Change-Type: patch
Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-02-27 08:40:43 -08:00
resin-io-versionbot[bot]
580873d069
Auto-merge for PR #558 via VersionBot
Update docker-delta to 2.0.4
2018-02-27 01:26:52 +00:00
resin-io-versionbot[bot]
7b9a87c347 v6.6.6 2018-02-27 01:11:47 +00:00
Akis Kesoglou
76ac7da1d6 Update docker-delta to 2.0.4
This brings in a fix for an edge case where rsync would exit before we had a chance to register event listeners.

Change-Type: patch
2018-02-23 11:26:32 +02:00
resin-io-versionbot[bot]
28258e3bbf
Auto-merge for PR #556 via VersionBot
circle.yml: escape branch name to ensure we always have a valid tag
2018-02-21 22:35:33 +00:00
resin-io-versionbot[bot]
9bbeea8a72 v6.6.5 2018-02-21 20:54:46 +00:00
Pablo Carranza Velez
7b404d0999 circle.yml: escape branch name to ensure we always have a valid tag
Change-Type: patch
Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-02-20 07:35:40 -08:00
resin-io-versionbot[bot]
650d2086f0
Auto-merge for PR #555 via VersionBot
circle.yml: Do not push images to dockerhub if building without a docker password
2018-02-20 02:42:06 +00:00
resin-io-versionbot[bot]
2361137a98 v6.6.4 2018-02-20 02:20:57 +00:00
Pablo Carranza Velez
7eacd93321 circle.yml: Do not push images to dockerhub if building without a docker password
This will allow us to build contributor PRs without passing secrets.

Change-Type: patch
Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-02-16 14:16:22 -08:00
resin-io-versionbot[bot]
5023030c69
Auto-merge for PR #551 via VersionBot
Update resumable-request to v2.0
2018-02-06 00:28:01 +00:00
resin-io-versionbot[bot]
50e2110ba8 v6.6.3 2018-02-06 00:12:34 +00:00
Akis Kesoglou
dc69917b5a Update resumable-request to v2.0
Turned out that disk I/O can be the bottleneck when applying deltas on some devices. When the disk can’t keep up and consume the downloaded delta, there’s memory bloat due to buffering.

The updated version provides far better reliability when the device is under load and pretty much constant memory consumption with any number of concurrent deltas.

Change-Type: patch
2018-02-05 10:59:56 +02:00
resin-io-versionbot[bot]
79277fea68
Auto-merge for PR #549 via VersionBot
Use i386-nlp for supervisor releases for quark boards
2018-01-24 19:52:48 +00:00
resin-io-versionbot[bot]
bb422b1464 v6.6.2 2018-01-24 19:30:24 +00:00
Pablo Carranza Velez
03c58a2bfa Use i386-nlp for supervisor releases for quark boards
Change-Type: patch
Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
2018-01-24 09:08:37 -08:00