Fix engine deadlock on network+service change

This fixes a regression on the supervisor state engine computation
(added on v16.2.0) when
the target state removes a network at the same time that a service
referencing that network is changed. Example going from

```
services:
   one:
      image: alpine: 3.18
      networks: ['balena']

networks:
   balena:
```

to

```
services:
   one:
      image: alpine: latest
```

Would never reach the target state as killing the service in order to
remove the network is prioritized, but one of the invariants in the target state calculation is
to not kill any services until all images have been downloaded. These
two instructions were in contradiction leading to a deadlock.

The fix involves only adding removal steps for services depending on a
changing network or volume if the service container is not being removed
already.

Change-type: patch
This commit is contained in:
Felipe Lalanne 2024-06-24 15:55:11 -04:00
parent 5d93f358aa
commit ede27b63ce
No known key found for this signature in database
GPG Key ID: 03E696BFD472B26A

View File

@ -379,9 +379,9 @@ class AppImpl implements App {
);
}
}
const toBeRemoved = _(currentServiceNames)
const maybeRemove = _(currentServiceNames)
.difference(targetServiceNames)
.union(dependentServices.map((s) => s.serviceName))
.map((id) => ({ current: currentByServiceName[id] }))
.value();
@ -409,7 +409,7 @@ class AppImpl implements App {
currentByServiceName[serviceName],
);
for (const service of otherContainers) {
toBeRemoved.push({ current: service });
maybeRemove.push({ current: service });
}
} else {
currentByServiceName[serviceName] = currentServiceContainers[0];
@ -489,6 +489,33 @@ class AppImpl implements App {
);
};
/**
* Services with changing configuration will have their containers recreated
*/
const toBeRecreated = maybeUpdate
.map((serviceName) => ({
current: currentByServiceName[serviceName],
target: targetByServiceName[serviceName],
}))
.filter(
({ current: c, target: t }) => !isEqualExceptForRunningState(c, t),
);
/**
* Services that depend on network changes should be added to the removal
* list if they are not being recreated already
*/
const toBeRemoved = maybeRemove.concat(
dependentServices
.filter(
(s) =>
!toBeRecreated.some(
({ current: c }) => c.serviceName === s.serviceName,
),
)
.map((s) => ({ current: s })),
);
/**
* Checks if a service is destined for removal due to a network or volume change
*/