Remove memory leak endurance test as it spends 8 minutes testing a single failure case that's not end user visible,
and ultimately manifests elsewhere in test failures (which is where this came from in the beginning). It was a good
idea to confirm the change fixed the issue, but this isn't critical enough to retain.
Making statemachine not remove COMPLETED flows' checkpoints from the database
if they are started with a clientId, instead they are getting persisted and retained within
the database along with their result (`DBFlowResult`).
On flow start with a client id (`startFlowDynamicWithClientId`), if the client id maps to
a flow that was previously started with the same client id and the flow is now finished,
then fetch the `DBFlowResult` from the database to construct a
`FlowStateMachineHandle` done future and return it back to the client.
Object stored as results must abide by the storage serializer rules. If they fail to do so
the result will not be stored and an exception is thrown to the client to indicate this.
Enable reloading of a flow after every checkpoint is saved. This
includes reloading the checkpoint from the database and recreating the
fiber.
When a flow and its `StateMachineState` is created it checks the node's
config to see if the `reloadCheckpointAfterSuspend` is set to true. If it is
it initialises `StateMachineState.reloadCheckpointAfterSuspendCount`
with the value 0. Otherwise, it remains `null`.
This count represents how many times the flow has reloaded from its
checkpoint (not the same as retrying). It is incremented every time the
flow is reloaded.
When a flow suspends, it processes the suspend event like usual, but
it will now also check if `reloadCheckpointAfterSuspendCount` is not
`null` (that it is activated) and process a
`ReloadFlowFromCheckpointAfterSuspend`event, if and only if
`reloadCheckpointAfterSuspendCount` is greater than
`CheckpointState.numberOfSuspends`.
This means idempotent flows can reload from the start and not reload
again until reaching a new suspension point.
Flows that skip checkpoints can reload from a previously saved
checkpoint (or from the initial checkpoint) and will continue reloading
on reaching the next new suspension point (not the suspension point that
it skipped saving).
If the flow fails to deserialize the checkpoint from the database upon
reloading a `ReloadFlowFromCheckpointException` is throw. This causes
the flow to be kept for observation.
* CORDA-3844: Add new functions to network map client
* CORDA-3844: Apply new fetch logic to nm updater
* CORDA-3844: Fix base url and warnings
* CORDA-3844: Change response object and response validation
In order to make sure that the returned node infos are not maliciously modified, either a signed list response
or a signed reference object would need to be provided. As providing a signed list requires a lot of effort from NM and Signer services,
the signed network map is provided instead, allowing nodes to validate that the list provided conforms to the entries of the signed network map.
* CORDA-3844: Add clarifications and comments
* CORDA-3844: Add error handling for bulk request
* CORDA-3844: Enhance testing
* CORDA-3844: Fix detekt issues
* EG-3844: Apply pr suggestions
* CORDA-3845: Update BC to 1.64
* CORDA-3845: Upgraded log4j to 2.13.3
* We can remove the use of Manifests from the logging package so that when _it_ logs it doesn't error on the fact the stream was already closed by the default Java logger.
* Some more tidy up
* Remove the logging package as a plugin
* latest BC version
* Remove old test
* fix up
* Fix some rebased changes to log file handling
* Fix some rebased changes to log file handling
* Update slf4j too
Co-authored-by: Adel El-Beik <adel.el-beik@r3.com>
* CORDA-3717: Apply custom serializers to checkpoints
* Remove try/catch to fix TooGenericExceptionCaught detekt rule
* Rename exception
* Extract method
* Put calls to the userSerializer on their own lines to improve readability
* Remove unused constructors from exception
* Remove unused proxyType field
* Give field a descriptive name
* Explain why we are looking for two type parameters when we only use one
* Tidy up the fetching of types
* Use 0 seconds when forcing a flow checkpoint inside test
* Add test to check references are restored correctly
* Add CheckpointCustomSerializer interface
* Wire up the new CheckpointCustomSerializer interface
* Use kryo default for abstract classes
* Remove unused imports
* Remove need for external library in tests
* Make file match original to remove from diff
* Remove maySkipCheckpoint from calls to sleep
* Add newline to end of file
* Test custom serializers mapped to interfaces
* Test serializer configured with abstract class
* Move test into its own package
* Rename test
* Move flows and serializers into their own source file
* Move broken map into its own source file
* Delete comment now source file is simpler
* Rename class to have a shorter name
* Add tests that run the checkpoint serializer directly
* Check serialization of final classes
* Register as default unless the target class is final
* Test PublicKey serializer has not been overridden
* Add a broken serializer for EdDSAPublicKey to make test more robust
* Split serializer registration into default and non-default registrations. Run registrations at the right time to preserve Cordas own custom serializers.
* Check for duplicate custom checkpoint serializers
* Add doc comments
* Add doc comments to CustomSerializerCheckpointAdaptor
* Add test to check duplicate serializers are logged
* Do not log the duplicate serializer warning when the duplicate is the same class
* Update doc comment for CheckpointCustomSerializer
* Sort serializers by classname so we are not registering in an unknown or random order
* Add test to serialize a class that references itself
* Store custom serializer type in the Kryo stream so we can spot when a different serializer is being used to deserialize
* Testing has shown that registering custom serializers as default is more robust when adding new cordapps
* Remove new line character
* Remove unused imports
* Add interface net.corda.core.serialization.CheckpointCustomSerializer to api-current.txt
* Remove comment
* Update comment on exception
* Make CustomSerializerCheckpointAdaptor internal
* Revert "Add interface net.corda.core.serialization.CheckpointCustomSerializer to api-current.txt"
This reverts commit b835de79bd21f0048be741e7fc5f0c3088516d2b.
* Restore "Add interface net.corda.core.serialization.CheckpointCustomSerializer to api-current.txt""
This reverts commit 718873a4e963bad4e327bb200e7bb4de44bc47ad.
* Pass the class loader instead of the context
* Do less work in test setup
* Make the serialization context unique for CustomCheckpointSerializerTest so we get a new Kryo pool for the test
* Rebuild the Kryo pool for the given context when we change custom serializers
* Rebuild all Kryo pools on serializer change to keep serializer list consistent
* Move the custom serializer list into CheckpointSerializationContext to reduce scope from global to a serialization context
* Remove unused imports
* Make the new checkpointCustomSerializers property default to the empty list
* Delegate implementation using kotlin language feature
Refactor `FlowStateMachineImpl.transientValues` and
`FlowStateMachineImpl.transientState` to stop the fields from exposing
the fact that they are nullable.
This is done by having private backing fields `transientValuesReference`
and `transientStateReference` that can be null. The nullability is still
needed due to serialisation and deserialisation of flow fibers. The
fields are transient and therefore will be null when reloaded from the
database.
Getters and setters hide the private field, allowing a non-null field to
returned.
There is no point other than in `FlowCreator` where the transient fields
can be null. Therefore the non null checks that are being made are
valid.
Add custom kryo serialisation and deserialisation to `TransientValues`
and `StateMachineState` to ensure that neither of the objects are ever
touched by kryo.
Introducing a new flow start method (`startFlowDynamicWithClientId`) passing in a `clientId`.
Once `startFlowDynamicWithClientId` gets called, the `clientId` gets injected into `InvocationContext` and also pushed to the logging context.
If a new flow starts with this method, then a < `clientId` to flow > pair is kept on node side, even after the flow's lifetime. If `startFlowDynamicWithClientId` is called again with the same `clientId` then the node identifies that this `clientId` refers to an existing < `clientId` to flow > pair and returns back to the rpc client a `FlowStateMachineHandle` future, created out of that pair.
`FlowStateMachineHandle` interface was introduced as a thinner `FlowStateMachine`. All `FlowStateMachine` properties used by call sites are moved into this new interface along with `clientId` and then `FlowStateMachine` extends it.
Introducing an acknowledgement method (`removeClientId`). Calling this method removes the < `clientId` to flow > pair on the node side and frees resources.
* CORDA-3769: Switched attachments class loader cache to use caffeine with original implementation used by determinstic core.
* CORDA-3769: Removed default ctor arguments.
* CORDA-3769: Switched mapping function to Function type to avoid synthetic method being generated.
* CORDA-3769: Now using a cache created from NamedCacheFactory for the attachments class loader cache.
* CORDA-3769: Making detekt happy.
* CORDA-3769: The finality tests now check for UntrustedAttachmentsException which will actually happen in reality.
* CORDA-3769: Refactored after review comments.
* CORDA-3769: Removed the AttachmentsClassLoaderSimpleCacheImpl as DJVM does not need it. Also updated due to review comments.
* CORDA-3769: Removed the generic parameters from AttachmentsClassLoader.
* CORDA-3769: Removed unused imports.
* CORDA-3769: Updates from review comments.
* CORDA-3769: Updated following review comments. MigrationServicesForResolution now uses cache factory. Ctor updated for AttachmentsClassLoaderSimpleCacheImpl.
* CORDA-3769: Reduced max class loader cache size
* CORDA-3769: Fixed the attachments class loader cache size to a fixed default
* CORDA-3769: Switched attachments class loader size to be reduced by fixed value.
Cancel the future being run by a flow when finishing or retrying it. The
cancellation of the future no longer cares about what type of future it
is.
`StateMachineState` has the `future` field, which holds the 3
(currently) possible types of futures:
- sleep
- wait for ledger commit
- async operation / external operation
Move the starting of all futures triggered by actions into
`ActionFutureExecutor`.
* Add schema migration to smoke tests
* Fix driver to work correctly for out-of-proc node with persistent database.
Co-authored-by: Ross Nicoll <ross.nicoll@r3.com>
Changes to `TopLevelTransition` after merging from earlier releases.
When a flow is kept for observation and its checkpoint is saved as
HOSPITALIZED in the database, we must acknowledge the session init and
flow start events so that they are not replayed on node startup.
Otherwise the same flow will be ran twice when the node is restarted,
one from the checkpoint and one from artemis.
Reduce exception_message and stack_trace lengths in table node_flow_exceptions from 4000 to 2000 to fix Oracle failing with: 'ORA-00910: specified length too long for its datatype'
These tests were removed after doing a merge from 4.4. They needed
updating after the changes from 4.4 anyway. These have been included in
this change.
Also fix kill flow tests and send initial tests.
When an uncaught exception propagates all the way to the flow exception
handler, the flow will be forced into observation/hospitalised.
The updating of the checkpoints status is done on a separate thread as
the fiber cannot be relied on anymore. The new thread is needed to allow
database transaction to be created and committed. Failures to the status
update will be rescheduled to ensure that this information is eventually
reflected in the database.
* Move log messages that are not useful in typical usage from info to debug level to reduce log spam.
* Add node startup check before attempting to connect.
In enterprise, `AuthDBTests` picked up a schema from a unit test and
included it in the cordapp it builds. This schema does not have a
migration and therefore fails the integration tests.
`NodeBasedTest` now lets cordapps to be defined and passed in to avoid
this issue. It defaults to making a cordapp from the tests base
directory if none are provided.