balena-supervisor/test/integration/compose
Christina Ying Wang 7f32141958 Handle Engine-host race condition for "always" and "unless-stopped" restart policy
There is a race condition between the Engine and host resources that may not
be created immediately. If a container's compose config depends on such a host
resource, for example a network interface, and the Engine tries to create and
start the container before the resource exists, the Engine will not reattempt
to start the container, regardless of its restart policy. This is undesirable,
but it appears to be the behavior as implemented by Docker.

To rectify this, the Supervisor state funnel noops for a grace period of 1 minute
after starting a container, waiting for the container's status to become 'running'.
If the container exits because of the race condition, its status becomes 'exited' and
the Supervisor generates another start step. This noop-wait-start loop repeats
until the container is able to start.
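
The noop-wait-start loop can be sketched roughly as below. This is a minimal
illustration, not the Supervisor's actual step calculation: the names
StartState, nextStartStep, and START_GRACE_PERIOD_MS are hypothetical, and the
sketch assumes the service uses the "always" or "unless-stopped" restart policy.
It also assumes that an exit inside the grace period is retried only once the
period has elapsed, which the description above does not specify.

```typescript
// Hypothetical types for illustration; not the Supervisor's real interfaces.
type ContainerStatus = 'created' | 'running' | 'exited';
type Step = 'noop' | 'start' | null;

interface StartState {
	status: ContainerStatus;
	// Timestamp (ms) of the last start attempt; undefined if never attempted.
	lastStartAttempt?: number;
}

// Grace period of 1 minute, as stated in the commit message.
const START_GRACE_PERIOD_MS = 60 * 1000;

// Decide the next step for a single container at time `now` (ms).
function nextStartStep(state: StartState, now: number): Step {
	// The container came up: nothing left to do.
	if (state.status === 'running') {
		return null;
	}
	// A start was attempted recently: wait (noop) for the status to settle.
	if (
		state.lastStartAttempt !== undefined &&
		now - state.lastStartAttempt < START_GRACE_PERIOD_MS
	) {
		return 'noop';
	}
	// Never started, or the grace period elapsed without reaching 'running'
	// (for example the container exited): generate another start step.
	return 'start';
}
```

Calling nextStartStep on every pass of the state funnel gives the repeating
loop: 'start', then 'noop' for up to a minute, then 'start' again if the
container has still not reached 'running'.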

If the container is never able to start, the host failed to create the
resource, and that should be fixed at the host level.

This commit does not handle services with restart policies "no" or "on-failure"
that encounter this host race. Handling them requires metadata from container
inspects to be introduced during step calculation, in order to determine whether
services with those restart policies need to be started. This will be fixed in a future PR.

Change-type: patch
Signed-off-by: Christina Ying Wang <christina@balena.io>
2023-05-31 11:32:19 -07:00
application-manager.spec.ts Handle Engine-host race condition for "always" and "unless-stopped" restart policy 2023-05-31 11:32:19 -07:00
commit.spec.ts Migrate simple legacy tests to test/unit and test/integration 2022-10-18 20:36:53 -03:00
images.spec.ts Remove dependent devices content in codebase 2023-02-06 19:34:02 -08:00
network.spec.ts Create default network as config-only when services have host networking 2022-11-16 10:19:36 -08:00
service.spec.ts Access api-key methods through device API 2022-10-18 14:27:19 -07:00
volume-manager.spec.ts Migrate tests for image manager 2022-09-28 10:37:41 -03:00
volume.spec.ts Split compose/volume tests into unit/integration 2022-09-28 10:37:40 -03:00