Commit Graph

2695 Commits

Author SHA1 Message Date
LankyDan
b6d649634f ENT-5196 handle errors during flow initialisation (#6378)
Update flow error handling tests after merging from earlier releases
2020-07-08 16:02:17 +01:00
LankyDan
b05c0f0cc1 ENT-5196 handle errors during flow initialisation (#6378)
Changes to `StaffedFlowHospital` after merging from earlier releases.
2020-07-08 14:43:24 +01:00
LankyDan
2204f44332 ENT-5196 handle errors during flow initialisation (#6378)
Changes to `TopLevelTransition` after merging from earlier releases.

When a flow is kept for observation and its checkpoint is saved as
HOSPITALIZED in the database, we must acknowledge the session init and
flow start events so that they are not replayed on node startup.

Otherwise the same flow will be ran twice when the node is restarted,
one from the checkpoint and one from artemis.
2020-07-08 14:41:29 +01:00
LankyDan
fdae04fc28 Merge branch 'release/os/4.5' into dan/os-4.5-to-4.6-merge-2020-07-08 2020-07-08 10:44:47 +01:00
Kyriakos Tharrouniatis
b619356bff
Reduce exception_message and stack_trace lengths in table node_flow_exceptions (#6432)
Reduce exception_message and stack_trace lengths in table node_flow_exceptions from 4000 to 2000 to fix Oracle failing with: 'ORA-00910: specified length too long for its datatype'
2020-07-07 09:07:01 +01:00
Dan Newton
9a8ae0fd32
CORDA-3841 Update session init flow error handling tests (#6431)
These tests were removed after doing a merge from 4.4. They needed
updating after the changes from 4.4 anyway. These have been included in
this change.

Also fix kill flow tests and send initial tests.
2020-07-07 08:47:09 +01:00
Dan Newton
a1e1bf4e6d
CORDA-3848 Uncaught exception hospitalises flow (#6377)
When an uncaught exception propagates all the way to the flow exception
handler, the flow will be forced into observation/hospitalised.

The updating of the checkpoints status is done on a separate thread as
the fiber cannot be relied on anymore. The new thread is needed to allow
database transaction to be created and committed. Failures to the status
update will be rescheduled to ensure that this information is eventually
reflected in the database.
2020-07-06 22:43:48 +01:00
Ryan Fowler
0d5bed5243
ENT-5131: Avoid NPE by throwing a catchable exception when openAttachment fails. (#6408) 2020-07-06 11:42:36 +01:00
Ross Nicoll
6aa19723e6
INFRA-417 Improve driver DSL test stability (#6415)
* Move log messages that are not useful in typical usage from info to debug level to reduce log spam.
* Add node startup check before attempting to connect.
2020-07-03 20:42:29 +01:00
Dan Newton
6bc2c79e23
NOTICK NodeBasedTest take in cordapps (#6424)
In enterprise, `AuthDBTests` picked up a schema from a unit test and
included it in the cordapp it builds. This schema does not have a
migration and therefore fails the integration tests.

`NodeBasedTest` now lets cordapps to be defined and passed in to avoid
this issue. It defaults to making a cordapp from the tests base
directory if none are provided.
2020-07-02 16:14:51 +01:00
Denis Rekalov
adc0879e8e
CORDA-3867: Add tests for AMQ_VALIDATED_USER (#6418)
* CORDA-3867: Add tests for AMQ_VALIDATED_USER

* CORDA-3867: detekt
2020-07-02 09:20:23 +01:00
Ross Nicoll
9f12e6bbc5
INFRA-433 Rebuild node config parsing tests (#6423)
Replace node configuration parsing tests with lighter weight equivalents which just parse the configuration rather than starting a full node.
2020-07-01 22:47:09 +01:00
LankyDan
c06830d851 Merge branch 'release/os/4.4' into dan/os-4.4-to-4.5-merge-2020-07-01
# Conflicts:
#	core/src/main/kotlin/net/corda/core/node/ServiceHub.kt
#	node/src/integration-test-slow/kotlin/net/corda/node/services/statemachine/StatemachineGeneralErrorHandlingTest.kt
#	node/src/integration-test-slow/kotlin/net/corda/node/services/statemachine/StatemachineKillFlowErrorHandlingTest.kt
#	node/src/integration-test/kotlin/net/corda/node/flows/FlowEntityManagerTest.kt
#	node/src/main/kotlin/net/corda/node/internal/AbstractNode.kt
#	node/src/main/kotlin/net/corda/node/services/statemachine/TransitionExecutorImpl.kt
#	node/src/main/kotlin/net/corda/node/services/statemachine/interceptors/HospitalisingInterceptor.kt
2020-07-01 17:32:36 +01:00
Kyriakos Tharrouniatis
5499e2c050
ENT-5384 Rename MAX_SQL_IN_CLAUSE_SET (#6414)
Rename constant to `DEFAULT_SOFT_LOCKING_SQL_IN_CLAUSE_SIZE`
2020-07-01 09:31:12 +01:00
Dan Newton
efd633c7b9 CORDA-3722 withEntityManager can rollback its session (#6187)
* CORDA-3722 withEntityManager can rollback its session

Improve the handling of database transactions when using
`withEntityManager` inside a flow.

Extra changes have been included to improve the safety and
correctness of Corda around handling database transactions.

This focuses on allowing flows to catch errors that occur inside an
entity manager and handle them accordingly.

Errors can be caught in two places:

- Inside `withEntityManager`
- Outside `withEntityManager`

Further changes have been included to ensure that transactions are
rolled back correctly.

Errors caught inside `withEntityManager` require the flow to manually
`flush` the current session (the entity manager's individual session).
By manually flushing the session, a `try-catch` block can be placed
around the `flush` call, allowing possible exceptions to be caught.

Once an error is thrown from a call to `flush`, it is no longer possible
to use the same entity manager to trigger any database operations. The
only possible option is to rollback the changes from that session.
The flow can continue executing updates within the same session but they
will never be committed. What happens in this situation should be handled
by the flow. Explicitly restricting the scenario requires a lot of effort
and code. Instead, we should rely on the developer to control complex
workflows.

To continue updating the database after an error like this occurs, a new
`withEntityManager` block should be used (after catching the previous
error).

Exceptions can be caught around `withEntityManager` blocks. This allows
errors to be handled in the same way as stated above, except the need to
manually `flush` the session is removed. `withEntityManager` will
automatically `flush` a session if it has not been marked for rollback
due to an earlier error.

A `try-catch` can then be placed around the whole of the
`withEntityManager` block, allowing the error to be caught while not
committing any changes to the underlying database transaction.

To make `withEntityManager` blocks work like mini database transactions,
save points have been utilised. A new savepoint is created when opening
a `withEntityManager` block (along with a new session). It is then used
as a reference point to rollback to if the session errors and needs to
roll back. The savepoint is then released (independently from
completing successfully or failing).

Using save points means, that either all the statements inside the
entity manager are executed, or none of them are.

- A new session is created every time an entity manager is requested,
but this does not replace the flow's main underlying database session.
- `CordaPersistence.transaction` can now determine whether it needs
to execute its extra error handling code. This is needed to allow errors
escape `withEntityManager` blocks while allowing some of our exception
handling around subscribers (in `NodeVaultService`) to continue to work.
2020-06-30 11:54:16 +01:00
Dan Newton
f6b5737277
ENT-5196 handle errors during flow initialisation (#6378)
## Summary

This change deals with multiple issues:

* Errors that occur during flow initialisation.

* Errors that occur when handling the outcome of an existing flow error.

* Failures to rollback and close a database transaction when an error
occurs in `TransitionExecutorImpl`.

* Removal of create and commit transaction actions around retrying a flow.

## Errors that occur during flow initialisation

Flow initialisation has been moved into the try/catch that exists inside
`FlowStateMachineImpl.run`. This means if an error is thrown all the way
out of `initialiseFlow` (which should rarely happen) it will be caught 
and move into a flow's standard error handling path. The flow should 
then properly terminate.

`Event.Error` was changed to make the choice to rollback be optional. 
Errors during flow initialisation cause the flow to not have a open 
database transaction. Therefore there is no need to rollback.

## Errors that occur when handling the outcome of an existing flow error

When an error occurs a flow goes to the flow hospital and is given an 
outcome event to address the original error. If the transition that was 
processing the error outcome event (`StartErrorPropagation` and
`RetryFlowFromSafePoint`) has an error then the flow aborts and
nothing happens. This means that the flow is left in a runnable state.

To resolve this, we now retry the original error outcome event whenever 
another error occurs doing so.

This is done by adding a new staff member that looks for 
`ErrorStateTransitionException` thrown in the error code path of 
`TransitionExecutorImpl`. It then takes the last outcome for that flow 
and schedules it to run again. This scheduling runs with a backoff.

This means that a flow will continually retry the original error outcome
event until it completes it successfully.

## Failures to rollback and close a database transaction when an error occurs in `TransitionExecutorImpl`
   
Rolling back and closing the database transaction inside of 
`TransitionExecutorImpl` is now done inside individual try/catch blocks
as this should not prevent the flow from continuing.

## Removal of create and commit transaction actions around retrying a flow

The database commit that occurs after retrying a flow can fail which 
required some custom code just for that event to prevent inconsistent 
behaviour. The transaction was only needed for reading checkpoints from 
the database, therefore the transaction was moved into 
`retryFlowFromSafePoint` instead and the commit removed.

If we need to commit data inside of `retryFlowFromSafePoint` in the 
future, a commit should be added directly to `retryFlowFromSafePoint`. 
The commit should occur before the flow is started on a new fiber.
2020-06-30 11:50:42 +01:00
Tamas Veingartner
18cd24c2b4
NOTICK Ignore pagination large volume test as its completion is environment dependent and might cause issues in slower environments (#6420) 2020-06-30 10:12:31 +01:00
Denis Rekalov
e75dc9e415 Merge branch 'release/os/4.5' into denis/CORDA-3856-amqp-header-os-4.6 2020-06-29 13:13:48 +01:00
Denis Rekalov
7686ca2944 Merge branch 'release/os/4.4' into denis/CORDA-3856-amqp-header-os-4.5 2020-06-29 12:35:05 +01:00
pnemeth
6fc9f2eacf
EG-2854 Failing Test: net.corda.node.internal.NodeStartupCliTest.--nodeconf using absolute path will not be changed (#6393)
* EG-2854

Failing Test: net.corda.node.internal.NodeStartupCliTest.--nodeconf using absolute path will not be changed

* fix bug EG-2854

* PR comment
2020-06-29 09:40:12 +01:00
Denis Rekalov
3f03de6fbd
CORDA-3856: Add Artemis plugin for validating AMQP message header and type (#6407) 2020-06-29 09:23:29 +01:00
Dan Newton
516a4bf3a1
Merge pull request #6403 from corda/NOTICK-rfowler-merge-OS4.5-OS4.6-20200626
NOTICK rfowler merge os4.5 os4.6 20200626
2020-06-29 08:40:48 +01:00
Dan Newton
796e92b512
CORDA-3720 Extract locking of InnerState out of SMM (#6289)
The state machines state is held within `InnerState` which lived inside
the SMM. `InnerState` has been extracted out of the SMM to allow the SMM
to be refactored in the future. Smaller classes can now be made that
focus on a single goal as the locking of the state can be accessed from
external classes. To achieve this, pass the `InnerState` into the class
and request a lock if needed.

The locking of `InnerState` has been made a property of the `InnerState`
itself. It has a `lock` field that allows locks to be taken out when
needed.

An inline `withLock` function has been added to tidy up the code and not
harm performance.

Some classes have been made internal to prevent invalid usage of purely
node internal classes.

As part of this change, flow timeouts have been extracted out into
`FlowTimeoutScheduler`.
2020-06-26 12:48:15 +01:00
Ryan Fowler
31a1ee8803 Merge branch 'release/os/4.5' into NOTICK-rfowler-merge-OS4.5-OS4.6-20200626 2020-06-26 12:14:56 +01:00
Ryan Fowler
881d8d687c Merge branch 'release/os/4.4' into NOTICK-rfowler-merge-OS4.4-OS4.5-20200626 2020-06-26 09:52:16 +01:00
Ryan Fowler
ef582900cf
NOTICK Expand the regex to match what we already do in ENT (#6400) 2020-06-25 17:26:55 +01:00
Kyriakos Tharrouniatis
6ec2910f15
NOTICK Revert node_flow_exceptions type length back to 256 (#6395) 2020-06-25 09:40:58 +01:00
Chris Rankin
6485a025c7
ENT-5430: Increase test coverage when serializing Optional fields. (#6387) 2020-06-22 16:51:40 +01:00
Waldemar Zurowski
1a4efbac7f Merge branch 'release/os/4.5' into merge-45-to-46 2020-06-19 19:48:08 +01:00
pnemeth
d6cab0e131
EG-1557 - Configuration data from "include" section ignored while command line contains the path to config file without leading ./ (#6354)
Configuration data from "include" section ignored while command line contains the path to config file without leading ./
2020-06-19 10:32:55 +01:00
alicer3
e77f7a7546
center console message for registration (#6191) 2020-06-19 09:49:07 +01:00
LankyDan
56d0bbc036 CORDA-3841 Check isAnyCheckpointPersisted in startFlowInternal (#6351)
Only hit the database if `StateMachineState.isAnyCheckpointPersisted`
returns true. Otherwise, there will be no checkpoint to retrieve from the
database anyway. This can prevent errors due to a transient loss of
connection to the database.

Update tests after merging to 4.6
2020-06-18 16:15:15 +01:00
LankyDan
e8b17ff7b9 Merge branch 'release/os/4.5' into dan/os-4.5-to-4.6-merge-2020-06-18
# Conflicts:
#	node/src/main/kotlin/net/corda/node/services/persistence/DBCheckpointStorage.kt
#	node/src/main/kotlin/net/corda/node/services/statemachine/SingleThreadedStateMachineManager.kt
2020-06-18 15:50:46 +01:00
Chris Rankin
1ef62870bb Merge commit 'fe617818895edab334d80c5e8de2b38f39e67af6' into chrisr3-os44-merge 2020-06-17 18:54:54 +01:00
Chris Rankin
d0c0a1d9ba
ENT-5430: Fix deserialisation of commands containing generic types. (#6359) 2020-06-17 17:28:26 +01:00
James Higgs
24b0240d82
EG-2654 - Ensure stack traces are printed to the logs in error reporting (#6345)
* EG-2654 Ensure stack trace is printed to the logs in error reporting

* EG-2654 - Add a test case for exception logging
2020-06-17 14:32:12 +01:00
Dan Newton
7ab6a8f600
CORDA-3841 Check isAnyCheckpointPersisted in startFlowInternal (#6351)
Only hit the database if `StateMachineState.isAnyCheckpointPersisted`
returns true. Otherwise, there will be no checkpoint to retrieve from the
database anyway. This can prevent errors due to a transient loss of
connection to the database.
2020-06-16 09:22:26 +01:00
Tamas Veingartner
26d4bfb89f
CORDA-3578 add corda prefix to conf file names as original issue was … (#6322)
* CORDA-3578 add corda prefix to conf file names as original issue was having non-corda reference.conf files on classpath causes DriverDSLImp failure

it is better to have this naming convention and avoid further conflicts of conf files.

* fixed weird duplicates

* revert renaming changes for web-reference.conf and loadtest-reference.conf
2020-06-16 09:15:51 +01:00
Christian Sailer
35c661b9f6
Merge pull request #6341 from corda/chrisr3-45-merge
NOTICK: Merge from OS 4.5 up to ef00fa1.
2020-06-12 16:42:51 +01:00
Stefano Franz
64f0011a62
Make Checkpoint classes data classes (#6342)
* Make Checkpoint classes data classes

* tidy up null-checks for array equality
2020-06-12 16:35:32 +01:00
Chris Rankin
3f67e314c0 Merge commit 'ef00fa1388db37e155ab8cfed3763c14801f8aa9' into chrisr3-45-merge 2020-06-12 13:14:44 +01:00
James Higgs
6e349f298e
NOTICK - Ignore a potentially dodgy test (#6336) 2020-06-11 16:47:48 +01:00
James Higgs
ab023d0b07 Merge branch 'release/os/4.5' into jamesh/os-4.5-4.6-merge-11062020 2020-06-11 09:40:39 +01:00
James Higgs
58af87c988
EG-2225 - Create log directory in correct place with verbose flag set (#6321)
* Ensure logs directory is written to correct location
* Remove a superfluous set of log path property
* Add a unit test to catch bad log paths
* Address detekt issues
2020-06-10 10:46:57 +01:00
James Higgs
8b7275eb97
EG-2564 - Move printed error to logger (#6323) 2020-06-10 10:45:50 +01:00
Schife
fb184839f4 Merge branch 'release/os/4.5' of https://github.com/corda/corda into razvan-os-4.5-to-4.6-merge 2020-06-05 07:55:48 +01:00
nikinagy
caf5482244
ENT-4064, ENT-4608 - checking for unsigned cordapps in prod mode and checking the PV (#6291)
* checking for unsigned cordapps in prod mode and shutting down if it the contracts are not signed

* inheriting from CordaRuntimeException instead of Exception

* had the same tests twice, removed the shutdown for minimumplatformversion, modified some of the tests

* shut down when we get invalid platformversion and modified the test according to this

* making the message more accurate
2020-06-04 16:24:49 +01:00
Denis Rekalov
45614cf29e
Merge pull request #6266 from corda/denis/CORDA-3805-custom-migration-scripts
CORDA-3805: cut dependency from PersistentIdentityService for custom migration scripts
2020-06-04 14:20:26 +01:00
Dan Newton
f0d2c9fe71
NOTICK Correct StatemachineGeneralErrorHandlingtest (#6306) 2020-06-04 08:04:41 +01:00
Kyriakos Tharrouniatis
b8b462f68e
NOTICK Fix schema validation error for flow parameters (#6304)
Adding 'corda-blob' type, fixed 'Schema-validation: wrong column type'
2020-06-02 17:52:50 +01:00