Commit Graph

1006 Commits

Author SHA1 Message Date
Cameron Diver
0fa47f635b
fix: Correctly handle multiple hosts ports pointing to a container port
When assigning multiple host ports to a single container port before
this change, the supervisor would incorrectly take only the first host
port into consideration. This change makes it so that every host port
per container port is considered.

Closes: #986
Change-type: patch
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-04-25 13:48:04 +01:00
Cameron Diver
9e3fae5852
compose: Remove unique expose entries after adding all entries
Prior to this change, we would `_.uniq` the expose value before adding
values from the port mappings. This could cause ports to get added
twice, which would cause the supervisor to think that there is a
configuration mismatch.

Change-type: patch
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-04-11 10:20:20 +01:00
Cameron Diver
892cf1961e
Don't attempt to report any state during local mode
Even though this would never have attempted to report the state to the
api during local mode, it leaves behind artifacts which would cause the
state to be sometimes reported when exiting local mode. This would cause
the api to reject the update unecessarily.

Change-type: patch
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-04-10 15:40:52 +01:00
Cameron Diver
80031b76e4
types: Upgrade dockerode types, and remove fixes which are superceded
Change-type: patch
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-04-08 10:46:28 +01:00
Cameron Diver
5943d3117c
Run database cleanup on startup in addition to once a day
Change-type: patch
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-04-05 10:06:29 +01:00
Cameron Diver
760b18dd2a
fix: Fix non-tty container message parsing
This had a bug where it was using the `in` operator on a list. It may
have worked for some cases, but would have failed for others.

Change-type: patch
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-04-05 10:06:28 +01:00
Cameron Diver
e148ce0529
Report all logs from a container's runtime
We add a database table, which holds information about the last
timestamp of a log successfully reported to a backend (local or remote).
We then use this value to calculate from which point in time to start
reporting logs from the container. If this is the first time we've seen
a container, we get all logs, and for every log reported we save the
timestamp. If it is not the first time we've seen a container, we
request all logs since the last reported time, ensuring no interruption
of service.

Change-type: minor
Closes: #937
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-04-05 10:06:27 +01:00
Cameron Diver
25fd11bed3
Refactor container logging interface and rename logging-backends
Container logging is now handled by a class which attaches and emits
information from the container. We add these to the directory
logging-backends/, and rename it to logging/.

Change-type: minor
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-04-05 10:06:26 +01:00
Cameron Diver
196f173e13
ux: Show a supervisor starting log message in dashboard
Change-type: minor
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-04-05 10:06:25 +01:00
Cameron Diver
0504776169
ux: Remove service already running log message
Change-type: patch
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-04-05 10:06:24 +01:00
Pablo Carranza Velez
9961ebb41d In /v1/update, return 202 when we're not updating immediately
We also add a catch to any errors when getting configuration, and send 503 in this case, even if it's
unlikely.

Change-type: patch
Signed-off-by: Pablo Carranza Velez <pablo@balena.io>
2019-04-03 09:15:50 -07:00
Pablo Carranza Velez
8f07bf62de Add a random jitter to target state polls, and a config var to ignore update notifications and not poll immediately after startup
This commit does two related things:

* We make the poll interval a random time between 0.5 and 1.5 times the configured interval.
* We introduce the BALENA_SUPERVISOR_INSTANT_UPDATE_TRIGGER configuration variable, that defaults to true. If this variable is set
to false, then calls to /v1/update are ignored, and on startup the supervisor waits for a poll interval before getting the target state.

This will help especially on cases where there's a large number of devices on a single network. By disabling instant updates and setting a large
poll interval, we can now achieve a sitation where not all devices apply an update at the same time, which can help avoid
overwhelming the network.

Change-type: minor
Signed-off-by: Pablo Carranza Velez <pablo@balena.io>
2019-04-03 09:15:49 -07:00
Cameron Diver
9a343316b2
Fix service comparison when starting a stopped service
When comparing a stopped container after a start request, the container
ID will be present in the target state (where usually it is not). We
were already filtering this value out of the current state, but
neglected to do so for the target state. This change now ensures we
remove it from both alias lists if it exists.

Change-type: patch
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-04-03 13:08:22 +01:00
Pablo Carranza Velez
c902706600 Fix migration of legacy apps when there's more than one app in the local DB
In an edge case observed in the field, a supervisor's database held two applications
because the device had been moved and the update lock was set in the old app. This causes
the updated supervisor to be unable to start, logging "No compatible releases found in API",
because it can't fetch the release for the app it was moved from.

This commit changes the migration code to iterate through all apps, and remove any for which
we can't get a release.

Change-type: patch
Signed-off-by: Pablo Carranza Velez <pablo@balena.io>
2019-03-28 17:14:06 -07:00
Cameron Diver
175cbfee50
Fix typo in delta request error message
Change-type: patch
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-03-28 12:31:50 +00:00
Cameron Diver
c4b7fb481a
Remove log-timestamp due to having journald logs
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-03-28 10:27:22 +00:00
Cameron Diver
1570935b2c
misc: Fix lint errors
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-03-28 10:27:22 +00:00
Cameron Diver
b3192679c7
fix: Correctly compare and generate network membership aliases
Before this change, service name resolution would only occur in the
default network. This was because we were not explicitly adding aliases
of the service names to the aliases fields.

We also fix the comparison, which would do funny things based on
container IDs, which was correct but unnecessary.

Change-type: patch
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-03-28 10:27:21 +00:00
CameronDiver
e956242d45
Merge pull request #949 from balena-io/use-default-device-config
When a device config variables requested value is not valid, fallback to the default
2019-03-28 09:43:30 +00:00
Cameron Diver
c7499a6b12
device-config: Show invalid values in dashboard logs
Change-type: patch
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-03-27 14:03:45 +00:00
Cameron Diver
83d53cfb56
events: Allow system messages to not be tracked
Change-type: patch
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-03-27 13:14:40 +00:00
Cameron Diver
c211efe399
device-config: Use default values for any invalid target values
If a value is requested which does not pass validation, we instead set
it to the default value, to ensure that the state engine continues to
work and move towards the target state.

Change-type: minor
Closes: #938
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-03-27 13:08:04 +00:00
Pablo Carranza Velez
dab5d7546c Ensure the supervisor0 network uses a subnet less likely to cause conflicts
We put the supervisor0 network in the 10.114.104.0/25 subnet to avoid issues when the device
is in a network using the 172.17.* network.

We also ensure we recreate this network if it was created in the incorrect subnet (i.e. if we're updating
from an old supervisor that didn't do this), for which we have to kill any containers using this network.

Closes #731

Change-type: patch
Signed-off-by: Pablo Carranza Velez <pablo@balena.io>
2019-03-21 16:02:49 -07:00
Pablo Carranza Velez
22a5b33196 fix: When pinning a preloaded device, ensure the pinning is done when retrying after a failure
Without this patch, if for some reason device pinning fails (e.g. connectivity goes down) or anything
interrupts the initialization after provisioning completes but before pinning is completed, after a retry
the supervisor would just skip the pinning code, leaving the device unpinned. This patch ensures that the
pinning procedure is run even if the device was already provisioned (as long as the pinning flag has been set,
of course). This matches the behavior that the CoffeeScript code had from before the TypeScript conversion.

Change-type: patch
Signed-off-by: Pablo Carranza Velez <pablo@balena.io>
2019-03-20 17:50:26 -07:00
Pablo Carranza Velez
6e3bedeb1d fix: Return a promise when retrying provisioning to avoid continuing after a failure
Otherwise we'll keep doing the rest of the APIBinder init steps, like reporting initial config,
potentially before completing the provisioning.

Change-type: patch
Signed-off-by: Pablo Carranza Velez <pablo@balena.io>
2019-03-20 16:48:20 -07:00
Pablo Carranza Velez
b374bd81dd fix: Await reporting the initial config before continuing APIBinder initialization
This avoid a race condition, in which config.txt can be cleared if a target state is fetched before the
initial values have been created as config vars.

Change-type: patch
Signed-off-by: Pablo Carranza Velez <pablo@balena.io>
2019-03-20 12:27:20 -07:00
Pablo Carranza Velez
f32de99aff Fix typo when getting device config default values
Change-type: patch
Signed-off-by: Pablo Carranza Velez <pablo@balena.io>
2019-03-19 13:46:17 -07:00
Pablo Carranza Velez
d64dcb4b40 fix: Correct use of $expand to avoid an exception when updating from a legacy OS
The last update of pinejs-client to pinejs-client-request made the way we were
using $expand on the migration break. This switches to the correct way of doing it now.

Change-type: patch
Signed-off-by: Pablo Carranza Velez <pablo@balena.io>
2019-03-18 09:49:28 -07:00
Cameron Diver
b922789dee
device-api: Add v2/device/tags api endpoint
This endpoint will fetch the device tags from the balena api

Change-type: minor
Closes: #890
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-03-13 14:21:23 +00:00
Cameron Diver
3f231e8ff3
device-api: Add v2/device/name endpoint
This endpoint returns the last known device name from the API. This
differs from the BALENA_DEVICE_NAME_AT_INIT env var because this will
not change throughout the runtime of the container.

Closes: #908
Change-type: minor
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-03-13 14:21:22 +00:00
Cameron Diver
84356b82b8
state-engine: Return a noop when waiting for a dependency
We run the risk of the state engine exiting early when a dependency is
not ready, especially in local mode. This changes forces a noop to be
returned when we are waiting on another service, which is the process
used elsewhere in the state engine.

Change-type: patch
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-03-13 10:34:15 +00:00
Cameron Diver
8f2d6f4d7b
Skip dependency check on kill in local mode
This function would usually check that an image is present for a
dependency, but in local mode the images would have never been inserted
into the database.

Change-type: patch
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-03-13 10:34:14 +00:00
Cameron Diver
1aa58fd7b9
state-engine: Add an exponential backoff for device-config noops
To avoid unnecesarilly using resources, we add an exponential backoff
when the noops explicitly come from the device-config module.

Change-type: patch
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-03-07 18:40:09 +00:00
Cameron Diver
ea1b247d3f
fix: Fix connectivity active VPN check
During the conversion to typescript, the VPN active check was being
performed on the directory, and not the file that the VPN creates,
meaning it would always return true (as we explicitly create the
directory on startup if it does not exist).

Change-type: patch
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-03-07 18:39:32 +00:00
Cameron Diver
6f79702099
state-engine: Add rate limited steps to device-config
In the case of an airgapped supervisor, with a target state that
requests the vpn be enabled, the supervisor will constantly loop on
trying to set the vpn to on. Unfortunately the vpn requires an internet
connection to be configured, so it will never be turned on.

We add the concept of no-ops to the device-config state change steps,
and don't end the state engine transition while these are present
(similar to how image pulls are implemented).

Change-type: minor
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-03-07 18:39:30 +00:00
Rich Bayliss
aeb96aa807
feature: Add BALENA_API_URL environment variable when using the balena-api feature label
When using the label `io.balena.features.balena-api` the supervisor will inject 2 environment
variables into the container:
- BALENA_API_KEY
- BALENA_API_URL

This allows the container to access the currently associated API using the KEY.

Change-type: patch
Signed-off-by: Rich Bayliss <rich@balena.io>
Connects-to: #847
2019-02-28 11:41:28 +00:00
Cameron Diver
987de0e097 debug: Print more information about failing validations
Change-type: patch
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-02-25 13:22:09 +01:00
Cameron Diver
f9626a3ee4 device-config: Add migration for SUPERVISOR_DELTA_APPLY_TIMEOUT
The default value for the delta apply timeout was changed from `''` to
`'0'` (note strings as these are database values) - but if the value
existed in the database already, this would fail validation. We add a
migration which will look explcitily for the failing value and switch it
to the new default.

Change-type: patch
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-02-25 13:22:02 +01:00
Cameron Diver
c9507e013c Increase max payload size in bodyparser to avoid PayloadTooLarge errors
Change-type: patch
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-02-25 13:18:55 +01:00
Cameron Diver
0e3f260978
Fix provisioning workflow when UUID already exists
Change-type: patch
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-02-22 10:53:24 +00:00
Cameron Diver
911ee7f009
Run iptables rules synchronous to avoid locking errors
Change-type: patch
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-02-19 17:43:04 +00:00
Cameron Diver
5f82f6fd3f
Apply iptables rules to ipv6
Change-type: patch
Closes: #867
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-02-13 20:57:05 +00:00
Cameron Diver
7bd7f7e025
Improve error messages, and add description to ImageAuth error
Change-type: patch
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-02-13 15:44:42 +00:00
Cameron Diver
81ec85c581
fix: Request image authentication token with explicitly as json
Change-type: patch
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-02-13 15:43:55 +00:00
Cameron Diver
d9177404b5
Always back off on image fetch failure
Change-type: patch
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-02-13 15:43:18 +00:00
Cameron Diver
06580bf437
Don't treat a non-200 status response on patch as report errors
Non-200 errors were causing the watchdog to restart the supervisor,
which in some cases could cause a restart loop. Instead we change the
code to only treat communication failures as an error, and report status
code failures directly.

Change-type: patch
Closes: #843
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-02-12 13:56:54 +00:00
Cameron Diver
49dbaaba12
Allow newlines to be part of environment variables
We were not allowing newlines previously by virtue of the regex not
allowing them. The docker daemon and supervisor handling code both
support them, so we allow them in the parsing code too.

Change-type: patch
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-02-12 11:19:55 +00:00
Cameron Diver
6bf008cc85
Remove environment variable whitespace trimming
Change-type: patch
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-02-11 17:13:35 +00:00
Cameron Diver
3d6dc88eb0
Make sure to correctly convert config emit events after validation
We were validating the input configuration values by coercing them to
the correct type, and then using the initial value to be saved (which
currently is always converted to a string).

We now use the coerced value as the actual value we will store, and more
importantly emit. This means that the config.on('change' ...) calls will
always be properly typed, which before this change was not a guarantee
that we could make.

Change-type: patch
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-02-11 11:22:08 +00:00
Cameron Diver
88f19b4147
Set default delta apply timeout of 0
Change-type: patch
Signed-off-by: Cameron Diver <cameron@balena.io>
2019-02-11 10:21:04 +00:00