This keeps the source of truth for this variable in one place and reduces confusion, since it's better for dashboard users to reference the dashboard option rather than the actual variable name.
We need the supervisor to be able to manage config.txt changes for the
Revolution Pi Connect S.
Change-type: patch
Signed-off-by: Florin Sarbu <florin@balena.io>
It's not an official status from container inspects, and the Supervisor
doesn't set it internally anywhere. It's better to remove it entirely, as the
Supervisor sets internal service statuses through a global event emitter
(reportNewStatus), which makes things difficult to test.
Change-type: patch
Signed-off-by: Christina Ying Wang <christina@balena.io>
The previous implementation in #2170 of parsing the container status was too general,
because it relied on the mistaken assumption that a container would have a status of
`Stopped` if it was manually stopped. This turned out to be untrue, as manually stopped
containers were also getting restarted by the Supervisor due to their inspect status of
`exited`. With this, parsing the exit message became unavoidable, as there is no other
clear way to distinguish a container that has been manually stopped and shouldn't be started
from a container experiencing the Engine-host race condition issue (again, see #2170).
Since we're just parsing the exit error message, we don't need to worry about different behaviors
amongst restart policies, as any container with the error message on exit should be started.
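As a rough illustration of the check described above, here is a minimal sketch. It assumes the
exit error is read from the `State.Error` field of the container inspect output. The function
name and the pattern argument are hypothetical, and the actual error message matched by the
Supervisor is not reproduced here.

```typescript
// Subset of the `State` object returned by a container inspect.
interface InspectState {
	Status: string;
	Error: string;
}

// Hypothetical helper: an exited container should be started again only if
// its exit error matches the race condition message. The pattern is passed
// in by the caller because the real message is not given in this commit text.
// Note: avoid the `g` flag on the pattern, since `test` is stateful with it.
function shouldStartAfterExit(
	state: InspectState,
	raceErrorPattern: RegExp,
): boolean {
	return state.Status === 'exited' && raceErrorPattern.test(state.Error);
}
```

A manually stopped container exits without the error message, so it is left alone regardless
of its restart policy.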
Change-type: patch
Closes: #2178
Signed-off-by: Christina Ying Wang <christina@balena.io>
It was returning stale information; in particular, the download progress
of the target release images never got updated.
Change-type: patch
Closes: #2174
This is necessary since the builder no longer passes the platform flag
to the build. As a result, Dockerfiles that mix multi-arch and single-arch
stages pull images for the wrong architecture, particularly in
emulated builds (e.g. armv7hf built on aarch64).
Moving the full build to multi-arch solves this, as the Docker Engine is
capable of choosing the right architecture from the manifest.
Relates-to: balena-io/balena-builder#1010
Change-type: patch
As explained in the comments of this commit, a container with the restart policy
of 'on-failure' with a non-zero exit code matches the conditions for the race, so
the Supervisor will also attempt to start it. A container with the 'no' restart
policy that has been started once will not be started again. If a container with
'no' has never been started, its service status will be 'Installed' and the Supervisor
will already try to start it until success, so the service with 'no' doesn't require
special handling.
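The decision described above can be sketched roughly as follows. All names are hypothetical and
this is not the Supervisor's actual code. It only covers containers that have already been
started once and have since exited, and it assumes that 'always' and 'unless-stopped' containers
are started again.

```typescript
// The four restart policies supported by Docker.
type RestartPolicy = 'always' | 'unless-stopped' | 'on-failure' | 'no';

// Hypothetical view of an exited container that was started at least once.
interface ExitedContainer {
	restartPolicy: RestartPolicy;
	exitCode: number;
}

function shouldStartExited(container: ExitedContainer): boolean {
	switch (container.restartPolicy) {
		case 'on-failure':
			// A non-zero exit code matches the conditions for the race.
			return container.exitCode !== 0;
		case 'no':
			// Already started once, so it is not started again.
			return false;
		default:
			// Assumption: 'always' and 'unless-stopped' are started again.
			return true;
	}
}
```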
Signed-off-by: Christina Ying Wang <christina@balena.io>
This ensures the target state has settled, since the reported 'applied' status
does not seem to be fully accurate and the actual Engine state may lag slightly behind.
Signed-off-by: Christina Ying Wang <christina@balena.io>
There exists a race condition between Engine and a host resource that may not
be immediately created. In this race condition, if a container's compose config
depends on the existence of that host resource, such as a network interface, and the
Engine tries to create & start the container before the host resource is created, the
Engine will not reattempt to start the container, regardless of the restart policy.
This is undesirable behavior but seems to be the behavior as implemented by Docker.
To rectify this, the Supervisor state funnel noops for a grace period of 1 minute
after starting a container to see that the container's status has become 'running'.
If the container exits because of the race condition, the status becomes 'exited' and the
Supervisor will attempt to generate another start step. This noop-wait-start step loop
will repeat until the container is able to start.
If the container is never able to start, the host failed to create the
host resource, and that should be fixed at the host level.
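The noop-wait-start loop can be sketched as a pure step calculation like the one below. The
type and function names are hypothetical, not the Supervisor's actual state funnel. The sketch
also assumes that a container which exits inside the grace period is only started again once
the grace period has elapsed.

```typescript
// Hypothetical, simplified types for illustration only.
type ContainerStatus = 'created' | 'running' | 'exited';
type Step = 'noop' | 'start' | 'none';

// Grace period of 1 minute after a start attempt.
const START_GRACE_PERIOD_MS = 60 * 1000;

interface ServiceState {
	status: ContainerStatus;
	// Time of the last start attempt in ms, or null if never attempted.
	lastStartAttempt: number | null;
}

function nextStep(state: ServiceState, now: number): Step {
	if (state.status === 'running') {
		// The container started successfully; nothing left to do.
		return 'none';
	}
	const withinGracePeriod =
		state.lastStartAttempt !== null &&
		now - state.lastStartAttempt < START_GRACE_PERIOD_MS;
	// Wait during the grace period, then generate another start step.
	return withinGracePeriod ? 'noop' : 'start';
}
```

Calling `nextStep` on every pass repeats the loop until the status becomes 'running'.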
This commit does not handle the case of services with restart policies "no" or "on-failure"
which encounter this host race, as metadata from container inspects needs to be introduced
during step calculation in order to figure out whether services with those restart policies
need to be started. This will be fixed in a future PR.
Change-type: patch
Signed-off-by: Christina Ying Wang <christina@balena.io>