.. note:: The flow hospital never terminates a flow itself. If an error proves impossible to resolve automatically, the hospital
   propagates it back to the state machine and, ultimately, to end-user code to handle.

This concept is analogous to *exception management handling* in enterprise workflow software, or to
*retry queues/stores* in enterprise messaging middleware, which recover from failures to deliver a message.

Functionality
-------------

Flow hospital functionality is enabled by default in |release|. No explicit configuration settings are required.

There are two aspects to the flow hospital:

- run-time behaviour in the node upon failure, including retry and recovery transitions and policies.
- visualisation of failed flows in the Explorer UI.

.. _flow-hospital-runtime:

Run-time behaviour
~~~~~~~~~~~~~~~~~~

There are two main ways a flow is hospitalized:

1. A counterparty invokes a flow on your node that isn’t installed (i.e. the CorDapp is missing):
   this causes the flow session initialisation mechanism to throw a ``ClassNotFoundException``.
   If this happens, the session initiation attempt is kept in the hospital for observation and is retried if you restart the node.
   Corrective action requires installing the correct CorDapp in the node's ``cordapps`` directory.

   .. warning:: There is currently no retry API. If you don’t want to install the CorDapp, you should be able to call ``killFlow`` with the UUID
      that the node's log messages associate with the failing flow.

2. Once started, if a flow experiences an error, the following failure scenarios are handled:

   * ``SQLException`` mentioning a deadlock:
     if this happens, the flow will retry. If it retries more than once, a back-off delay is applied to try to reduce contention.
     Under the current policy, flows that fail this way retry indefinitely unless explicitly killed. No intervention is required.

   * Database constraint violation:
     this scenario may occur due to natural contention between racing flows, because Corda delegates contention handling to the database's optimistic concurrency control.
     As the likelihood of re-occurrence should be low, the flow will error and fail if it experiences this at the same point more than 3 times. No intervention is required.
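
The two retry policies described above can be pictured with the minimal sketch below. It is not Corda's implementation or API:
the class ``HospitalPolicySketch``, its methods, the one-second base delay and the doubling back-off are all assumptions made for illustration.
Only the rules themselves (a deadlock retries indefinitely, with a delay from the second retry onwards; a constraint violation fails after more than 3 occurrences at the same point) come from the text.

.. code-block:: java

   import java.time.Duration;

   /** Illustrative only: a hypothetical model of the policies, not part of the Corda API. */
   final class HospitalPolicySketch {

       enum Outcome { RETRY, FAIL }

       /** What to do with the flow, and how long to wait before doing it. */
       static final class Decision {
           final Outcome outcome;
           final Duration delay;

           Decision(Outcome outcome, Duration delay) {
               this.outcome = outcome;
               this.delay = delay;
           }
       }

       // Assumed values: the real back-off parameters are not stated in this document.
       static final Duration BASE_DELAY = Duration.ofSeconds(1);
       static final int MAX_BACKOFF_DOUBLINGS = 6;

       // From the text: fail after more than 3 violations at the same point.
       static final int MAX_VIOLATIONS_AT_SAME_POINT = 3;

       /** Deadlock: always retry; the first retry is immediate, later ones back off. */
       static Decision onDeadlock(int retriesSoFar) {
           if (retriesSoFar < 1) {
               return new Decision(Outcome.RETRY, Duration.ZERO);
           }
           // Double the delay on each further retry, capped to keep the wait bounded.
           int doublings = Math.min(retriesSoFar - 1, MAX_BACKOFF_DOUBLINGS);
           return new Decision(Outcome.RETRY, BASE_DELAY.multipliedBy(1L << doublings));
       }

       /** Constraint violation: retry up to the limit at the same point, then fail. */
       static Decision onConstraintViolation(int occurrencesAtSamePoint) {
           Outcome outcome = occurrencesAtSamePoint > MAX_VIOLATIONS_AT_SAME_POINT
                   ? Outcome.FAIL
                   : Outcome.RETRY;
           return new Decision(outcome, Duration.ZERO);
       }

       public static void main(String[] args) {
           for (int retries = 0; retries < 4; retries++) {
               System.out.println("deadlock, retries so far " + retries + ": wait "
                       + onDeadlock(retries).delay.getSeconds() + "s, then retry");
           }
           for (int n = 1; n <= 4; n++) {
               System.out.println("constraint violation #" + n + ": "
                       + onConstraintViolation(n).outcome);
           }
       }
   }

Keeping the two decisions in separate methods mirrors the text: the deadlock path never produces ``FAIL``, while the constraint-violation path is the only one that can give up.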