CORDA-3071 - Checkpoint agent tool (#5295)

* Initial commit based on experimental kryo hook agent.

* WIP

* Added documentation.

* Additional improvements and documentation following more testing.

* Added field level instrumentation + basic type handlers for String, byteArray, charArray, primitive types.

* Working version (without array type handling)

* Missing build.gradle file.

* Handle display of Arrays and String.
Pruning output to avoid repetition (by loop depth, object count).

* Added configurable StackDepth (for display purposes) and filter out ProgressTracker stacks.

* Further array handling (Object arrays, 2D, 3D), improved display and general code cleanup.

* Various fixes and improvements following demo to RP.

* Clean-up

* Further clean-up

* Set checkpoint id before deserialization.

* Update documentation

* Final clean-up.

* Minor documentation fixes.

* Updates following PR review feedback.

* Add changelog entry.
This commit is contained in:
josecoll 2019-08-01 09:40:52 +01:00 committed by Rick Parker
parent eff762a5c3
commit 99e802c5c5
12 changed files with 1117 additions and 5 deletions

.idea/compiler.xml generated

@@ -29,6 +29,8 @@
<module name="buildSrc_test" target="1.8" />
<module name="canonicalizer_main" target="1.8" />
<module name="canonicalizer_test" target="1.8" />
<module name="checkpoint-agent_main" target="1.8" />
<module name="checkpoint-agent_test" target="1.8" />
<module name="cli_main" target="1.8" />
<module name="cli_test" target="1.8" />
<module name="client_main" target="1.8" />


@@ -146,6 +146,30 @@
</RollingRandomAccessFile>
<RollingFile name="Checkpoint-Agent-RollingFile-Appender"
fileName="${log-path}/checkpoints_agent-${date:yyyyMMdd-HHmmss}.log"
filePattern="${archive}/checkpoints_agent.%date{yyyy-MM-dd}-%i.log.gz">
<PatternLayout pattern="[%-5level] %date{ISO8601}{UTC}Z [%t] %c{2}.%method - %msg%n"/>
<Policies>
<TimeBasedTriggeringPolicy/>
<SizeBasedTriggeringPolicy size="100MB"/>
</Policies>
<DefaultRolloverStrategy min="1" max="100">
<Delete basePath="${archive}" maxDepth="1">
<IfFileName glob="${log-name}*.log.gz"/>
<IfLastModified age="60d">
<IfAny>
<IfAccumulatedFileSize exceeds="10 GB"/>
</IfAny>
</IfLastModified>
</Delete>
</DefaultRolloverStrategy>
</RollingFile>
<Rewrite name="Console-ErrorCode-Selector">
<AppenderRef ref="Console-Selector"/>
<ErrorCodeRewritePolicy/>
@@ -190,5 +214,8 @@
<AppenderRef ref="Console-ErrorCode-Appender-Println"/>
<AppenderRef ref="RollingFile-ErrorCode-Appender"/>
</Logger>
<Logger name="CheckpointAgent" level="debug" additivity="false">
<AppenderRef ref="Checkpoint-Agent-RollingFile-Appender"/>
</Logger>
</Loggers>
</Configuration>


@@ -9,6 +9,9 @@ release, see :doc:`app-upgrade-notes`.
Version 5.0
-----------
* Introduced a new low level flow diagnostics tool: checkpoint agent (that can be used standalone or in conjunction with the ``dumpCheckpoints`` shell command).
See :doc:`tools-checkpoint-agent` for more information.
* The MockNet now supports setting a custom Notary class name, as was already supported by normal node config. See :doc:`tutorial-custom-notary`.
* Introduced a new ``Destination`` abstraction for communicating with non-Party destinations using the new ``FlowLogic.initiateFlow(Destination)``


@@ -127,6 +127,8 @@ The main try-catch that handles the yielding may be found `here <https://github.
primitive values). The arrays always have the same length, and they both contain values for each stack frame. The primitive stack
additionally has a "metadata" slot for each stack frame, this is where the "method entry" value is put, as well as frame size data.
.. _flow_internals_checkpoints_ref:
Checkpoints
-----------


@@ -0,0 +1,384 @@
Checkpoint Agent
================
Firstly, please ensure you understand the mechanics and principles of Corda Flows by reading :doc:`key-concepts-flows` and :doc:`flow-state-machines`.
We also recommend you understand the purpose and behaviour of the :doc:`node-flow-hospital` in relation to *checkpoints* and flow recovery.
An advanced explanation of :ref:`*checkpoints* <flow_internals_checkpoints_ref>` within the flow state machine can be found here: :doc:`contributing-flow-internals`.
As a recap,
"A flow *checkpoint* is a serialised snapshot of the flow's stack frames and any objects reachable from the stack. Checkpoints are saved to
the database automatically when a flow suspends or resumes, which typically happens when sending or receiving messages. A flow may be replayed
from the last checkpoint if the node restarts. Automatic checkpointing is an unusual feature of Corda and significantly helps developers write
reliable code that can survive node restarts and crashes. It also assists with scaling up, as flows that are waiting for a response can be flushed
from memory."
The Checkpoint Agent is a very low-level diagnostics tool that can be used to output the type, size and content of flow *checkpoints* at node runtime.
It is primarily targeted at users developing and testing code that may exhibit flow misbehaviour (preferably before going into production).
For a given flow *checkpoint*, the agent outputs:
1. Information about the checkpoint such as its ``id`` (also called a ``flow id``) that can be used to correlate with that flow's lifecycle details in the main Corda logs.
2. A nested hierarchical view of its reachable objects (indented and tagged with depth) and their associated sizes, including the state
of any flows held within the checkpoint.
Diagnostics information is written to standard log files (e.g. a log4j2-configured logger).
This tool is particularly useful when used in conjunction with the ``dumpCheckpoints`` CRaSH shell command to troubleshoot and identify potential
problems associated with checkpoints for flows that appear not to be completing.
The checkpoint agent can be downloaded from `here <https://software.r3.com/artifactory/corda-releases/net/corda/corda-tools-checkpoint-agent/>`_.
To run, simply pass the agent jar to the JVM used to start a Corda node: ``-javaagent:<PATH>/checkpoint-agent.jar[=arg=value,...]``
.. warning:: This tool increases the node's memory footprint; we recommend a minimum heap size of at least 1GB.
The agent can be customised with a number of optional parameters described below.
.. note:: When using the gradle plugin utilities for deploying and running nodes, ensure the checkpoint agent jar is correctly passed to capsule as follows:
``-Dcapsule.jvm.args=-javaagent:checkpoint-agent.jar[=arg=value,...]``
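For reference, the two launch styles can be combined into concrete commands. Assuming a node started directly from ``corda.jar`` in the current directory (paths, jar names and argument values here are illustrative placeholders only):
.. code-block:: shell
java -javaagent:checkpoint-agent.jar=instrumentType=read,graphDepth=6 -jar corda.jar
java -Dcapsule.jvm.args=-javaagent:checkpoint-agent.jar=instrumentType=read,graphDepth=6 -jar corda.jar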
Configuration
~~~~~~~~~~~~~
The checkpoint agent can be started with the following optional parameters:
.. code-block:: shell
checkpoint-agent.jar=[instrumentType=<read|write>],[instrumentClassname=<CLASSNAME>],[minimumSize=<MIN_SIZE>],[maximumSize=<MAX_SIZE>, [graphDepth=<DEPTH>], [printOnce=<true|false>]
* ``instrumentType``: whether to output checkpoints on read or write. Possible values: [read, write]. Default: read.
* ``instrumentClassname``: specifies the base type of objects to log; by default all *Flow* object types are processed. Default: net.corda.node.services.statemachine.FlowStateMachineImpl.
* ``minimumSize``: specifies the minimum size (in bytes) of objects to log. Default: 8192 bytes (8KB)
* ``maximumSize``: specifies the maximum size (in bytes) of objects to log. Default: 20000000 bytes (20MB)
* ``graphDepth``: specifies how many levels deep to display the graph output. Default: unlimited
* ``printOnce``: if true, a full object reference (and its sub-graph) is displayed only once; otherwise an object is displayed repeatedly as referenced. Default: true
These arguments are passed to the JVM along with the agent specification. For example:
.. code-block:: shell
-javaagent:<PATH>/checkpoint-agent.jar=instrumentClassname=net.corda.vega.flows.SimmFlow,instrumentType=read,minimumSize=10240,maximumSize=512000,graphDepth=6,printOnce=false
.. note:: Arguments may be passed into the agent in any order and should **not** contain spaces between them.
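Since the agent receives everything after ``checkpoint-agent.jar=`` as a single comma-separated string of ``key=value`` pairs, argument handling reduces to simple string splitting. A minimal sketch of how such an argument string can be parsed (``parseAgentArgs`` is a hypothetical helper for illustration, not the agent's actual code):

```kotlin
// Parse a javaagent argument string of the form "key1=value1,key2=value2".
// Keys not present in the string would fall back to the agent's documented defaults.
fun parseAgentArgs(argumentString: String?): Map<String, String> {
    if (argumentString.isNullOrBlank()) return emptyMap()
    return argumentString.split(",").mapNotNull { pair ->
        val parts = pair.split("=", limit = 2)
        if (parts.size == 2) parts[0].trim() to parts[1].trim() else null
    }.toMap()
}

fun main() {
    val args = parseAgentArgs("instrumentType=read,minimumSize=10240,graphDepth=6")
    println(args["instrumentType"]) // read
    println(args["graphDepth"])     // 6
}
```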
Checkpoint Dump support
~~~~~~~~~~~~~~~~~~~~~~~
When used in combination with the ``dumpCheckpoints`` shell command (see :ref:`Upgrading CorDapps <upgrading-cordapps-flow-drains>`),
the checkpoint agent will automatically output additional diagnostic information for all checkpoints dumped by the aforementioned tool.
You should therefore see two different output files upon invoking the checkpoint dumper command:
* ``<NODE_BASE>\logs\checkpoints_dump-<date>.zip`` contains zipped JSON representation of checkpoints (from ``dumpCheckpoints`` shell command)
* ``<NODE_BASE>\logs\checkpoints_agent-<date>.log`` contains output from this agent tool (types and sizes of a checkpoint stack)
.. note:: You will only see a separate ``checkpoints_agent-<date>.log`` file if you configure a separate log4j logger as described below.
Otherwise all diagnostics logging will be routed to the standard Corda node log file: ``node-<hostname>.log``.
If you **only** wish to log checkpoint data for failing flows, start the checkpoint agent with the following arguments:
.. code-block:: shell
checkpoint-agent.jar=instrumentType=read,instrumentClassname=NONE
and use the ``dumpCheckpoints`` shell command to trigger diagnostics collection.
.. warning:: The checkpoint agent JAR file must be called "checkpoint-agent.jar" as the checkpoint dump support code uses Java reflection to
determine whether the VM has been instrumented or not at runtime.
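The node detects the agent by looking for the expected jar name in the JVM's agent properties. A portable approximation of this check using the standard management API is sketched below (this is illustrative only; the node's actual implementation uses ``sun.misc.VMSupport`` agent properties):

```kotlin
import java.lang.management.ManagementFactory

// Returns true if this JVM was launched with the checkpoint agent attached,
// by scanning the JVM input arguments for the expected jar name.
fun checkpointAgentAttached(): Boolean =
    ManagementFactory.getRuntimeMXBean().inputArguments.any { arg ->
        arg.startsWith("-javaagent:") && arg.contains("checkpoint-agent.jar")
    }

fun main() {
    println("Checkpoint agent attached: ${checkpointAgentAttached()}")
}
```

This is why the jar must keep its exact name: a renamed agent jar would not match the string the detection code looks for.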
Logging configuration
~~~~~~~~~~~~~~~~~~~~~
The agent will log output to a log4j2 configured logger.
We recommend capturing this information in a separate log file by configuring a dedicated logger and appender as follows:
.. sourcecode:: none
<Logger name="CheckpointAgent" level="info" additivity="false">
<AppenderRef ref="Checkpoint-Agent-RollingFile-Appender"/>
</Logger>
.. warning:: You must specify "CheckpointAgent" as the logger name.
In this instance we are specifying a Rolling File appender with archival rotation as follows:
.. sourcecode:: none
<RollingFile name="Checkpoint-Agent-RollingFile-Appender"
fileName="${log-path}/checkpoints_agent-${date:yyyyMMdd-HHmmss}.log"
filePattern="${archive}/checkpoints_agent.%date{yyyy-MM-dd}-%i.log.gz">
<PatternLayout pattern="[%-5level] %date{ISO8601}{UTC}Z [%t] %c{2}.%method - %msg%n"/>
<Policies>
<TimeBasedTriggeringPolicy/>
<SizeBasedTriggeringPolicy size="100MB"/>
</Policies>
<DefaultRolloverStrategy min="1" max="100">
<Delete basePath="${archive}" maxDepth="1">
<IfFileName glob="${log-name}*.log.gz"/>
<IfLastModified age="60d">
<IfAny>
<IfAccumulatedFileSize exceeds="10 GB"/>
</IfAny>
</IfLastModified>
</Delete>
</DefaultRolloverStrategy>
</RollingFile>
The *log4j2.xml* containing the above configuration must then be passed to the Corda node JVM along with the agent specification:
.. code-block:: shell
-Dlog4j.configurationFile=<PATH>/log4j2.xml
Sample output
~~~~~~~~~~~~~
Using the *log4j2* configuration described above, the following output is generated to a file called ``checkpoints_agent-<DATE>.log`` under
the Corda node ``logs`` directory for a single flow execution:
.. sourcecode:: none
[INFO ] 2019-07-11T18:25:15,723Z [Node thread-1] CheckpointAgent. - [WRITE] Fiber@10000004:[43c7d5c8-aa66-4a98-beed-dc91354d0353][task: co.paralleluniverse.fibers.RunnableFiberTask@4dc8eaf(Fiber@10000004), target: null, scheduler: co.paralleluniverse.fibers.FiberExecutorScheduler@4e468018]
000:net.corda.node.services.statemachine.FlowStateMachineImpl 21,149
[INFO ] 2019-07-11T18:19:51,115Z [FiberDeserializationChecker] CheckpointAgent. - [READ] class net.corda.node.services.statemachine.FlowStateMachineImpl
000:net.corda.node.services.statemachine.FlowStateMachineImpl 21,151
001: net.corda.node.services.statemachine.FlowStateMachineImpl 21,149
002: java.lang.String 107
003: [C 77
002: co.paralleluniverse.fibers.Stack 20,932
003: [J 278
003: [Ljava.lang.Object; 20,054
004: net.corda.finance.flows.CashIssueAndPaymentFlow 7,229
005: net.corda.core.utilities.ProgressTracker 5,664
etc ...
[INFO ] 2019-07-11T18:35:03,198Z [rpc-server-handler-pool-2] CheckpointAgent. - [READ] class net.corda.node.services.statemachine.ErrorState$Clean
Checkpoint id: 15f16740-4ea2-4e48-bcb3-fd9051d5ba59
000:net.corda.node.services.statemachine.FlowStateMachineImpl 21,151
001: [C 77
001: [J 278
001: [Ljava.lang.Object; 20,054
002: java.util.ArrayList 1,658
003: net.corda.core.utilities.ProgressTracker$STARTING 0
etc ...
Note:
* on WRITE (e.g. when a checkpoint is being serialized to disk), we have complete information about the checkpoint object, including the Fiber it is
running on and its checkpoint id (43c7d5c8-aa66-4a98-beed-dc91354d0353)
* on READ (e.g. when a checkpoint is being deserialized from disk), we only have information about the stack class hierarchy.
Additionally, if we are using the CRaSH shell ``dumpCheckpoints`` command, we also see a flow's checkpoint id.
Flow diagnostic process
~~~~~~~~~~~~~~~~~~~~~~~
Let's assume a scenario where we have triggered a flow in a node (e.g. a node acting as a flow initiator) but the flow does not appear to complete.
For example, you may see the following using the CRaSH shell ``flow watch`` command:
.. sourcecode:: none
Id Flow name Initiator Status
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
15f16740-4ea2-4e48-bcb3-fd9051d5b Cash Issue And Payment bankUser In progress
1c6c3e59-26aa-4b93-8435-4e34e265e Cash Issue And Payment bankUser In progress
90613d6f-be78-41bd-98e1-33a756c28 Cash Issue And Payment bankUser In progress
43c7d5c8-aa66-4a98-beed-dc91354d0 Cash Issue And Payment bankUser In progress
Waiting for completion or Ctrl-C ...
Note that "In progress" indicates the flows above have not completed (and will have been checkpointed).
1. Check the main Corda node log file for *hospitalisation* and/or *flow retry* messages: ``<NODE_BASE>\logs\node-<hostname>.log``
.. sourcecode:: none
[INFO ] 2019-07-11T17:56:43,227Z [pool-12-thread-1] statemachine.FlowMonitor. - Flow with id 90613d6f-be78-41bd-98e1-33a756c28808 has been waiting for 97904 seconds to receive messages from parties [O=BigCorporation, L=New York, C=US].
.. note:: Always search for the flow id, in this case **90613d6f-be78-41bd-98e1-33a756c28808**
2. From the CRaSH shell run the ``dumpCheckpoints`` command to trigger diagnostics information.
.. sourcecode:: none
Welcome to the Corda interactive shell.
Useful commands include 'help' to see what is available, and 'bye' to shut down the node.
Thu Jul 11 18:56:48 BST 2019>>> run dumpCheckpoints
You will now see an additional line in the main Corda node log file as follows:
.. sourcecode:: none
[INFO ] 2019-07-11T18:02:47,610Z [rpc-server-handler-pool-0] rpc.CheckpointDumper. - Checkpoint agent processing checkpointId: [90613d6f-be78-41bd-98e1-33a756c28808]
And two additional files will appear in the node's logs directory:
* ``<NODE_BASE>\logs\checkpoints_dump-20190711-180247.zip``
* ``<NODE_BASE>\logs\checkpoints_agent-20190711-185424.log``
3. Unzip the ``<NODE_BASE>\logs\checkpoints_dump-<date>.zip`` file, and you should see a file with a matching flow id as above:
**CashIssueAndPaymentFlow-90613d6f-be78-41bd-98e1-33a756c28808.json**
Its contents include the following diagnostics information:
.. sourcecode:: none
{
"id" : "90613d6f-be78-41bd-98e1-33a756c28808",
"flowLogicClass" : "net.corda.finance.flows.CashIssueAndPaymentFlow",
"flowLogic" : {
"amount" : "200.00 USD",
"issueRef" : "AQ==",
"recipient" : "O=BigCorporation, L=New York, C=US",
"anonymous" : true,
"notary" : "O=Notary Service, L=Zurich, C=CH"
},
"flowCallStack" : [
{
"flowClass" : "net.corda.confidential.SwapIdentitiesFlow",
"progressStep" : "Awaiting counterparty's anonymous identity"
},
{
"flowClass" : "net.corda.finance.flows.CashPaymentFlow",
"progressStep" : "Generating anonymous identities"
},
{
"flowClass" : "net.corda.finance.flows.CashIssueAndPaymentFlow",
"progressStep" : "Paying recipient"
}
],
"suspendedOn" : {
"sendAndReceive" : [
{
"session" : {
"peer" : "O=BigCorporation, L=New York, C=US",
"ourSessionId" : 1443438003030966253
},
"sentPayloadType" : "net.corda.confidential.SwapIdentitiesFlow$IdentityWithSignature",
"sentPayload" : {
"identity" : {
"class" : "net.corda.core.identity.PartyAndCertificate",
"deserialized" : "O=BankOfCorda, L=London, C=GB"
},
"signature" : "t+7hyUnQE08n3ST4krA/7fi1R8ItdrGvpeEbMFgTBDCHibMWiKo/NaTSVUdfwPmsEtl1PFx0MHz5rtRQ+XuEBg=="
}
}
],
"suspendedTimestamp" : "2019-07-10T14:44:58",
"secondsSpentWaiting" : 98268
},
"origin" : {
"rpc" : "bankUser"
},
"ourIdentity" : "O=BankOfCorda, L=London, C=GB",
"activeSessions" : [ ],
"errored" : null
}
4. View the contents of the node agent diagnostics file:
.. sourcecode:: none
[INFO ] 2019-07-11T18:02:47,615Z [rpc-server-handler-pool-0] CheckpointAgent. - [READ] class net.corda.node.services.statemachine.Checkpoint
Checkpoint id: 90613d6f-be78-41bd-98e1-33a756c28808
000:net.corda.node.services.statemachine.Checkpoint 29,200
001: net.corda.node.services.statemachine.ErrorState$Clean 0
001: net.corda.node.services.statemachine.FlowState$Started 26,061
002: net.corda.core.internal.FlowIORequest$SendAndReceive 4,666
003: java.util.Collections$SingletonMap 4,536
004: net.corda.node.services.statemachine.FlowSessionImpl 500
005: net.corda.core.identity.Party 360
005: net.corda.node.services.statemachine.SessionId 28
004: net.corda.core.serialization.SerializedBytes 3,979
002: net.corda.core.serialization.SerializedBytes 21,222
001: net.corda.core.context.InvocationContext 905
002: net.corda.core.context.Actor 259
002: net.corda.core.context.InvocationOrigin$RPC 13
002: net.corda.core.context.Trace 398
001: net.corda.core.identity.Party 156
002: net.i2p.crypto.eddsa.EdDSAPublicKey 45
002: net.corda.core.identity.CordaX500Name 92
001: java.util.LinkedHashMap 327
002: net.corda.node.services.statemachine.SessionState$Initiating 214
001: java.util.ArrayList 1,214
002: net.corda.node.services.statemachine.SubFlow$Inlined 525
003: java.lang.Class 47
003: net.corda.node.services.statemachine.SubFlowVersion$CorDappFlow 328
004: net.corda.core.crypto.SecureHash$SHA256 118
005: [B 33
002: net.corda.node.services.statemachine.SubFlow$Initiating 322
003: java.lang.Class 39
003: net.corda.core.flows.FlowInfo 124
003: net.corda.node.services.statemachine.SubFlowVersion$CorDappFlow 11
002: net.corda.node.services.statemachine.SubFlow$Initiating 250
003: java.lang.Class 41
003: net.corda.core.flows.FlowInfo 99
004: java.lang.String 91
005: [C 85
003: net.corda.node.services.statemachine.SubFlowVersion$CoreFlow 28
5. Take relevant recovery action, which may include:
* killing and retrying the flow:
.. sourcecode:: none
Welcome to the Corda interactive shell.
Useful commands include 'help' to see what is available, and 'bye' to shut down the node.
Thu Jul 11 20:24:11 BST 2019>>> flow kill 90613d6f-be78-41bd-98e1-33a756c28808
[ERROR] 20:24:18+0100 [Node thread-1] corda.flow. - Flow interrupted while waiting for events, aborting immediately {actor_id=bankUser, actor_owning_identity=O=BankOfCorda, L=London, C=GB, actor_store_id=NODE_CONFIG, fiber-id=10000003, flow-id=15f16740-4ea2-4e48-bcb3-fd9051d5ba59, invocation_id=45622dc7-c4cf-4d11-85ad-1c45e0943455, invocation_timestamp=2019-07-11T18:19:40.519Z, origin=bankUser, session_id=02010e15-8e7a-46f7-976b-5e0626451c54, session_timestamp=2019-07-11T18:19:32.285Z, thread-id=176}
Killed flow [90613d6f-be78-41bd-98e1-33a756c28808]
Thu Jul 11 20:26:45 BST 2019>>> flow start CashIssueAndPaymentFlow amount: $1000, issueRef: 0x01, recipient: "Bank B", anonymous: false, notary: "Notary Service"
* attempting to perform a graceful shutdown (draining all outstanding flows and preventing others from starting) and restart of the node:
.. sourcecode:: none
Welcome to the Corda interactive shell.
Useful commands include 'help' to see what is available, and 'bye' to shut down the node.
Thu Jul 11 19:52:56 BST 2019>>> gracefulShutdown
Upon restart, ensure you disable flow draining mode to allow the node to continue to receive requests:
.. sourcecode:: none
Welcome to the Corda interactive shell.
Useful commands include 'help' to see what is available, and 'bye' to shut down the node.
Thu Jul 11 19:52:56 BST 2019>>> run setFlowsDrainingModeEnabled enabled: false
See also :ref:`Flow draining mode <draining-mode>`.
* contacting other participants in the network where their nodes are not responding to an initiated flow.
The checkpoint dump gives good diagnostics on the reason a flow may be suspended (including the destination peer participant node that is not responding):
.. sourcecode:: none
"suspendedOn" : {
"sendAndReceive" : [
{
"session" : {
"peer" : "O=BigCorporation, L=New York, C=US",
"ourSessionId" : 1443438003030966253
},
"sentPayloadType" : "net.corda.confidential.SwapIdentitiesFlow$IdentityWithSignature",
"sentPayload" : {
"identity" : {
"class" : "net.corda.core.identity.PartyAndCertificate",
"deserialized" : "O=BankOfCorda, L=London, C=GB"
},
"signature" : "t+7hyUnQE08n3ST4krA/7fi1R8ItdrGvpeEbMFgTBDCHibMWiKo/NaTSVUdfwPmsEtl1PFx0MHz5rtRQ+XuEBg=="
}
}


@@ -1,7 +1,7 @@
Tools
=====
Corda provides various command line and GUI tools to help you as you work. Along with the three below, you may also
Corda provides various command line and GUI tools to help you as you work. Along with those below, you may also
wish to try the :doc:`blob-inspector`.
.. toctree::
@@ -11,3 +11,4 @@ wish to try the :doc:`blob-inspector`.
network-bootstrapper
demobench
node-explorer
tools-checkpoint-agent


@@ -341,7 +341,7 @@ abstract class AbstractNode<S>(val configuration: NodeConfiguration,
installCoreFlows()
registerCordappFlows()
services.rpcFlows += cordappLoader.cordapps.flatMap { it.rpcFlows }
val checkpointDumper = CheckpointDumper(checkpointStorage, database, services)
val checkpointDumper = CheckpointDumper(checkpointStorage, database, services, services.configuration.baseDirectory)
val rpcOps = makeRPCOps(cordappLoader, checkpointDumper)
startShell()
networkMapClient?.start(trustRoot)


@@ -1,6 +1,7 @@
package net.corda.node.services.rpc
import co.paralleluniverse.fibers.Stack
import co.paralleluniverse.strands.Strand
import com.fasterxml.jackson.annotation.JsonAutoDetect
import com.fasterxml.jackson.annotation.JsonAutoDetect.Visibility
import com.fasterxml.jackson.annotation.JsonFormat
@@ -24,9 +25,11 @@ import net.corda.core.crypto.SecureHash
import net.corda.core.flows.FlowInfo
import net.corda.core.flows.FlowLogic
import net.corda.core.flows.FlowSession
import net.corda.core.flows.StateMachineRunId
import net.corda.core.identity.CordaX500Name
import net.corda.core.identity.Party
import net.corda.core.internal.*
import net.corda.core.node.ServiceHub
import net.corda.core.serialization.SerializeAsToken
import net.corda.core.serialization.SerializedBytes
import net.corda.core.serialization.deserialize
@@ -38,11 +41,12 @@ import net.corda.core.utilities.ProgressTracker
import net.corda.core.utilities.contextLogger
import net.corda.node.internal.NodeStartup
import net.corda.node.services.api.CheckpointStorage
import net.corda.node.services.api.ServiceHubInternal
import net.corda.node.services.statemachine.*
import net.corda.nodeapi.internal.persistence.CordaPersistence
import net.corda.serialization.internal.CheckpointSerializeAsTokenContextImpl
import net.corda.serialization.internal.withTokenContext
import sun.misc.VMSupport
import java.nio.file.Path
import java.time.Duration
import java.time.Instant
import java.time.ZoneOffset.UTC
@@ -53,7 +57,7 @@ import java.util.concurrent.atomic.AtomicInteger
import java.util.zip.ZipEntry
import java.util.zip.ZipOutputStream
class CheckpointDumper(private val checkpointStorage: CheckpointStorage, private val database: CordaPersistence, private val serviceHub: ServiceHubInternal) {
class CheckpointDumper(private val checkpointStorage: CheckpointStorage, private val database: CordaPersistence, private val serviceHub: ServiceHub, val baseDirectory: Path) {
companion object {
private val TIME_FORMATTER = DateTimeFormatter.ofPattern("yyyyMMdd-HHmmss").withZone(UTC)
private val log = contextLogger()
@@ -64,6 +68,10 @@ class CheckpointDumper(private val checkpointStorage: CheckpointStorage, private
private lateinit var checkpointSerializationContext: CheckpointSerializationContext
private lateinit var writer: ObjectWriter
private val isCheckpointAgentRunning by lazy {
checkpointAgentRunning()
}
fun start(tokenizableServices: List<Any>) {
checkpointSerializationContext = CheckpointSerializationDefaults.CHECKPOINT_CONTEXT.withTokenContext(
CheckpointSerializeAsTokenContextImpl(
@@ -91,13 +99,15 @@ class CheckpointDumper(private val checkpointStorage: CheckpointStorage, private
fun dump() {
val now = serviceHub.clock.instant()
val file = serviceHub.configuration.baseDirectory / NodeStartup.LOGS_DIRECTORY_NAME / "checkpoints_dump-${TIME_FORMATTER.format(now)}.zip"
val file = baseDirectory / NodeStartup.LOGS_DIRECTORY_NAME / "checkpoints_dump-${TIME_FORMATTER.format(now)}.zip"
try {
if (lock.getAndIncrement() == 0 && !file.exists()) {
database.transaction {
checkpointStorage.getAllCheckpoints().use { stream ->
ZipOutputStream(file.outputStream()).use { zip ->
stream.forEach { (runId, serialisedCheckpoint) ->
if (isCheckpointAgentRunning)
instrumentCheckpointAgent(runId)
val checkpoint = serialisedCheckpoint.checkpointDeserialize(context = checkpointSerializationContext)
val json = checkpoint.toJson(runId.uuid, now)
val jsonBytes = writer.writeValueAsBytes(json)
@@ -116,6 +126,28 @@ class CheckpointDumper(private val checkpointStorage: CheckpointStorage, private
}
}
private fun instrumentCheckpointAgent(checkpointId: StateMachineRunId) {
log.info("Checkpoint agent diagnostics for checkpointId: $checkpointId")
try {
val checkpointHook = Class.forName("net.corda.tools.CheckpointHook").kotlin
if (checkpointHook.objectInstance == null)
log.info("Instantiating checkpoint agent object instance")
val instance = checkpointHook.objectOrNewInstance()
val checkpointIdField = instance.declaredField<UUID>(instance.javaClass, "checkpointId")
checkpointIdField.value = checkpointId.uuid
}
catch (e: Exception) {
log.error("Checkpoint agent instrumentation failed for checkpointId: $checkpointId\n. ${e.message}")
}
}
private fun checkpointAgentRunning(): Boolean {
val agentProperties = VMSupport.getAgentProperties()
return agentProperties.values.any { value ->
(value is String && value.contains("checkpoint-agent.jar"))
}
}
private fun Checkpoint.toJson(id: UUID, now: Instant): CheckpointJson {
val (fiber, flowLogic) = when (flowState) {
is FlowState.Unstarted -> {


@@ -0,0 +1,102 @@
package net.corda.node.services.rpc
import net.corda.core.context.InvocationContext
import net.corda.core.flows.FlowLogic
import net.corda.core.flows.StateMachineRunId
import net.corda.core.identity.CordaX500Name
import net.corda.core.serialization.SerializedBytes
import net.corda.core.serialization.internal.CheckpointSerializationDefaults
import net.corda.core.serialization.internal.checkpointSerialize
import net.corda.node.services.persistence.DBCheckpointStorage
import net.corda.node.services.statemachine.Checkpoint
import net.corda.node.services.statemachine.FlowStart
import net.corda.node.services.statemachine.SubFlowVersion
import net.corda.nodeapi.internal.persistence.CordaPersistence
import net.corda.testing.core.SerializationEnvironmentRule
import net.corda.testing.core.TestIdentity
import net.corda.testing.node.MockServices
import org.junit.After
import org.junit.Before
import org.junit.Rule
import org.junit.Test
import java.nio.file.Paths
class CheckpointDumperTest {
@Rule
@JvmField
val testSerialization = SerializationEnvironmentRule()
private val myself = TestIdentity(CordaX500Name("Me", "London", "GB"))
private lateinit var database: CordaPersistence
private lateinit var services: MockServices
private lateinit var checkpointStorage: DBCheckpointStorage
@Before
fun setUp() {
val (db, mockServices) = MockServices.makeTestDatabaseAndPersistentServices(
cordappPackages = emptyList(),
initialIdentity = myself,
moreIdentities = emptySet(),
moreKeys = emptySet()
)
database = db
services = mockServices
newCheckpointStorage()
}
@After
fun cleanUp() {
database.close()
}
@Test
fun testDumpCheckpoints() {
val dumper = CheckpointDumper(checkpointStorage, database, services, Paths.get("."))
dumper.start(emptyList())
// add a checkpoint
val (id, checkpoint) = newCheckpoint()
database.transaction {
checkpointStorage.addCheckpoint(id, checkpoint)
}
dumper.dump()
// check existence of output zip file: checkpoints_dump-<date>.zip
}
// This test will only succeed when the VM startup includes the "checkpoint-agent":
// -javaagent:tools/checkpoint-agent/build/libs/checkpoint-agent.jar
@Test
fun testDumpCheckpointsAndAgentDiagnostics() {
val dumper = CheckpointDumper(checkpointStorage, database, services, Paths.get("."))
dumper.start(emptyList())
// add a checkpoint
val (id, checkpoint) = newCheckpoint()
database.transaction {
checkpointStorage.addCheckpoint(id, checkpoint)
}
dumper.dump()
// check existence of output zip file: checkpoints_dump-<date>.zip
// check existence of output agent log: checkpoints_agent-<date>.log
}
private fun newCheckpointStorage() {
database.transaction {
checkpointStorage = DBCheckpointStorage()
}
}
private fun newCheckpoint(version: Int = 1): Pair<StateMachineRunId, SerializedBytes<Checkpoint>> {
val id = StateMachineRunId.createRandom()
val logic: FlowLogic<*> = object : FlowLogic<Unit>() {
override fun call() {}
}
val frozenLogic = logic.checkpointSerialize(context = CheckpointSerializationDefaults.CHECKPOINT_CONTEXT)
val checkpoint = Checkpoint.create(InvocationContext.shell(), FlowStart.Explicit, logic.javaClass, frozenLogic, myself.identity.party, SubFlowVersion.CoreFlow(version), false)
.getOrThrow()
return id to checkpoint.checkpointSerialize(context = CheckpointSerializationDefaults.CHECKPOINT_CONTEXT)
}
}


@@ -56,6 +56,7 @@ include 'tools:shell-cli'
include 'tools:network-builder'
include 'tools:cliutils'
include 'tools:worldmap'
include 'tools:checkpoint-agent'
include 'example-code'
project(':example-code').projectDir = file("$settingsDir/docs/source/example-code")
include 'samples:attachment-demo:contracts'
@@ -97,4 +98,6 @@ if (JavaVersion.current() == JavaVersion.VERSION_1_8) {
include 'core-deterministic:testing:verifier'
include 'serialization-deterministic'
}
include 'tools:checkpoint-agent'
findProject(':tools:checkpoint-agent')?.name = 'checkpoint-agent'


@@ -0,0 +1,64 @@
buildscript {
// For sharing constants between builds
Properties constants = new Properties()
file("$projectDir/../../constants.properties").withInputStream { constants.load(it) }
ext.kotlin_version = constants.getProperty("kotlinVersion")
ext.javaassist_version = "3.12.1.GA"
repositories {
mavenLocal()
mavenCentral()
jcenter()
}
dependencies {
classpath "org.jetbrains.kotlin:kotlin-gradle-plugin:$kotlin_version"
}
}
repositories {
mavenLocal()
mavenCentral()
jcenter()
}
apply plugin: 'kotlin'
apply plugin: 'idea'
description 'A javaagent to allow hooking into Kryo checkpoints'
dependencies {
compile "org.jetbrains.kotlin:kotlin-stdlib-jre8:$kotlin_version"
compile "org.jetbrains.kotlin:kotlin-reflect:$kotlin_version"
compile "javassist:javassist:$javaassist_version"
compile "com.esotericsoftware:kryo:4.0.0"
compile "co.paralleluniverse:quasar-core:$quasar_version:jdk8"
compile (project(':core')) {
transitive = false
}
// Unit testing helpers.
compile "org.junit.jupiter:junit-jupiter-api:${junit_jupiter_version}"
compile "junit:junit:$junit_version"
// SLF4J: commons-logging bindings for a SLF4J back end
compile "org.slf4j:jcl-over-slf4j:$slf4j_version"
compile "org.slf4j:slf4j-api:$slf4j_version"
}
jar {
archiveName = "${project.name}.jar"
manifest {
attributes(
'Premain-Class': 'net.corda.tools.CheckpointAgent',
'Can-Redefine-Classes': 'true',
'Can-Retransform-Classes': 'true',
'Can-Set-Native-Method-Prefix': 'true',
'Implementation-Title': "CheckpointAgent",
'Implementation-Version': rootProject.version
)
}
from { configurations.compile.collect { it.isDirectory() ? it : zipTree(it) } }
}


@@ -0,0 +1,492 @@
package net.corda.tools
import co.paralleluniverse.strands.Strand
import com.esotericsoftware.kryo.io.Input
import com.esotericsoftware.kryo.io.Output
import com.esotericsoftware.kryo.serializers.FieldSerializer
import javassist.ClassPool
import javassist.CtClass
import net.corda.core.internal.ThreadBox
import net.corda.core.utilities.debug
import net.corda.tools.CheckpointAgent.Companion.instrumentClassname
import net.corda.tools.CheckpointAgent.Companion.instrumentType
import net.corda.tools.CheckpointAgent.Companion.log
import net.corda.tools.CheckpointAgent.Companion.maximumSize
import net.corda.tools.CheckpointAgent.Companion.minimumSize
import net.corda.tools.CheckpointAgent.Companion.printOnce
import net.corda.tools.CheckpointAgent.Companion.graphDepth
import org.slf4j.LoggerFactory
import java.io.ByteArrayInputStream
import java.lang.instrument.ClassFileTransformer
import java.lang.instrument.Instrumentation
import java.security.ProtectionDomain
import java.util.*
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.atomic.AtomicInteger
class CheckpointAgent {
// whether to instrument serialized objects on read or on write
enum class InstrumentationType {
WRITE, READ
}
companion object {
// custom argument defaults
val DEFAULT_INSTRUMENT_CLASSNAME = "net.corda.node.services.statemachine.FlowStateMachineImpl"
val DEFAULT_INSTRUMENT_TYPE = InstrumentationType.READ
val DEFAULT_MINIMUM_SIZE = 8 * 1024
val DEFAULT_MAXIMUM_SIZE = 20_000_000
val DEFAULT_GRAPH_DEPTH = Int.MAX_VALUE
// startup arguments
var instrumentClassname = DEFAULT_INSTRUMENT_CLASSNAME
var minimumSize = DEFAULT_MINIMUM_SIZE
var maximumSize = DEFAULT_MAXIMUM_SIZE
var instrumentType = DEFAULT_INSTRUMENT_TYPE
var graphDepth = DEFAULT_GRAPH_DEPTH
var printOnce = true
val log by lazy {
LoggerFactory.getLogger("CheckpointAgent")
}
@JvmStatic
fun premain(argumentsString: String?, instrumentation: Instrumentation) {
parseArguments(argumentsString)
instrumentation.addTransformer(CheckpointHook)
}
fun parseArguments(argumentsString: String?) {
argumentsString?.let {
val nvpList = it.split(",")
nvpList.forEach { nvp ->
val nvpItem = nvp.split("=")
if (nvpItem.size == 2) {
when (nvpItem[0].trim()) {
"instrumentClassname" -> instrumentClassname = nvpItem[1]
"instrumentType" -> try { instrumentType = InstrumentationType.valueOf(nvpItem[1].toUpperCase()) } catch (e: Exception) {
display("Invalid value: ${nvpItem[1]}. Please specify read or write.")
}
"minimumSize" -> try { minimumSize = nvpItem[1].toInt() } catch (e: NumberFormatException) {
display("Invalid value: ${nvpItem[1]}. Please specify an integer value.") }
"maximumSize" -> try { maximumSize = nvpItem[1].toInt() } catch (e: NumberFormatException) {
display("Invalid value: ${nvpItem[1]}. Please specify an integer value.")
}
"graphDepth" -> try { graphDepth = nvpItem[1].toInt() } catch (e: NumberFormatException) {
display("Invalid value: ${nvpItem[1]}. Please specify an integer value.")
}
"printOnce" -> try { printOnce = nvpItem[1].toBoolean() } catch (e: Exception) {
display("Invalid value: ${nvpItem[1]}. Please specify true or false.")
}
else -> display("Invalid argument: $nvpItem")
}
}
else display("Missing value for argument: $nvpItem")
}
}
println("Running Checkpoint agent with following arguments: instrumentClassname=$instrumentClassname, instrumentType=$instrumentType, minimumSize=$minimumSize, maximumSize=$maximumSize, graphDepth=$graphDepth, printOnce=$printOnce\n")
}
private fun display(output: String) {
System.err.println("CheckpointAgent: $output")
}
}
}
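The agent is attached at JVM startup via `-javaagent:checkpoint-agent.jar=<options>`, where `<options>` is the comma-separated `name=value` string handled by `parseArguments` above. A minimal, self-contained sketch of that option parsing (the map-based return type is illustrative only, not the agent's API, and values are trimmed here for simplicity):

```kotlin
// Simplified sketch of parseArguments: split the option string on ",",
// then each entry on "=", keeping well-formed name=value pairs and
// reporting anything malformed to stderr.
fun parseAgentOptions(argumentsString: String?): Map<String, String> {
    val options = linkedMapOf<String, String>()
    argumentsString?.split(",")?.forEach { nvp ->
        val parts = nvp.split("=")
        if (parts.size == 2) options[parts[0].trim()] = parts[1].trim()
        else System.err.println("CheckpointAgent: Missing value for argument: $nvp")
    }
    return options
}
```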
/**
 * The hook records read() and write() entries and exits, together with the stream offset at the time of each call.
 * Events are recorded in a StrandID -> List<StatsEvent> map (alongside a per-strand nesting counter).
 *
 * Later we "parse" these lists into a tree.
 */
object CheckpointHook : ClassFileTransformer {
val classPool = ClassPool.getDefault()
val hookClassName = javaClass.name
override fun transform(
loader: ClassLoader?,
className: String,
classBeingRedefined: Class<*>?,
protectionDomain: ProtectionDomain?,
classfileBuffer: ByteArray
): ByteArray? {
if (className.startsWith("java") || className.startsWith("javassist") || className.startsWith("kotlin")) {
return null
}
return try {
val clazz = classPool.makeClass(ByteArrayInputStream(classfileBuffer))
instrumentClass(clazz)?.toBytecode()
} catch (throwable: Throwable) {
throwable.printStackTrace(System.out)
null
}
}
private fun instrumentClass(clazz: CtClass): CtClass? {
for (method in clazz.declaredBehaviors) {
if (instrumentType == CheckpointAgent.InstrumentationType.WRITE) {
if (method.name == "write") {
val parameterTypeNames = method.parameterTypes.map { it.name }
if (parameterTypeNames == listOf("com.esotericsoftware.kryo.Kryo", "com.esotericsoftware.kryo.io.Output", "java.lang.Object")) {
if (method.isEmpty) continue
log.debug { "Instrumenting on write: ${clazz.name}" }
method.insertBefore("$hookClassName.${this::writeEnter.name}($2, $3);")
method.insertAfter("$hookClassName.${this::writeExit.name}($2, $3);")
return clazz
}
}
}
if (instrumentType == CheckpointAgent.InstrumentationType.READ) {
if (method.name == "read") {
val parameterTypeNames = method.parameterTypes.map { it.name }
if (parameterTypeNames == listOf("com.esotericsoftware.kryo.Kryo", "com.esotericsoftware.kryo.io.Input", "java.lang.Class")) {
if (method.isEmpty) continue
log.debug { "Instrumenting on read: ${clazz.name}" }
method.insertBefore("$hookClassName.${this::readEnter.name}($2, $3);")
method.insertAfter("$hookClassName.${this::readExit.name}($2, $3, (java.lang.Object)\$_);")
return clazz
} else if (parameterTypeNames == listOf("com.esotericsoftware.kryo.io.Input", "java.lang.Object")) {
if (method.isEmpty) continue
log.debug { "Instrumenting on field read: ${clazz.name}" }
method.insertBefore("$hookClassName.${this::readFieldEnter.name}((java.lang.Object)this);")
method.insertAfter("$hookClassName.${this::readFieldExit.name}($2, (java.lang.Object)this);")
return clazz
}
}
}
}
return null
}
// StrandID -> (list of StatsEvent, nesting depth counter) map
val events = ConcurrentHashMap<Long, Pair<ArrayList<StatsEvent>, AtomicInteger>>()
// Global static variable that can be set programmatically by 3rd party tools/code (eg. CheckpointDumper)
// Only relevant to READ operations (as the checkpoint id must have previously been generated)
var checkpointId: UUID? = null
set(value) {
if (value != null)
log.debug { "Diagnosing checkpoint id: $value" }
mutex.locked {
field = value
}
}
private val mutex = ThreadBox(checkpointId)
@JvmStatic
fun readFieldEnter(that: Any) {
if (that is FieldSerializer.CachedField<*>) {
log.debug { "readFieldEnter object: ${that.field.name}:${that.field.type}" }
val (list, _) = events.getOrPut(Strand.currentStrand().id) { Pair(ArrayList(), AtomicInteger(0)) }
list.add(StatsEvent.EnterField(that.field.name, that.field.type))
}
}
@JvmStatic
fun readFieldExit(obj: Any?, that: Any) {
if (that is FieldSerializer.CachedField<*>) {
val (list, _) = events.getOrPut(Strand.currentStrand().id) { Pair(ArrayList(), AtomicInteger(0)) }
val value = that.field.get(obj)
val arrayValue = getArrayValue(that.field.type, value)
if (!that.javaClass.name.endsWith("ObjectField")
|| arrayValue != null || that.field.type == java.lang.String::class.java || value == null) {
log.debug { "readFieldExit basic type value: ${that.field.name}:${that.field.type} = ${arrayValue ?: value}" }
list.add(StatsEvent.BasicTypeField(that.field.name, that.field.type, arrayValue ?: value))
} else {
log.debug { "readFieldExit object value: ${that.field.name}:${that.field.type} = $value" }
list.add(StatsEvent.ObjectField(that.field.name, that.field.type, value))
}
}
}
private fun <T> getArrayValue(clazz: Class<T>, value: Any?): String? {
if (clazz.isArray) {
log.debug { "readFieldExit array type: $clazz, value: $value]" }
if (Array<Number>::class.java.isAssignableFrom(clazz)) {
val numberValue = value as Array<Number>
log.debug { "readFieldExit array of number: $clazz = ${numberValue.joinToString(",")}" }
return numberValue.joinToString(",")
}
else if (clazz == Array<Boolean>::class.java) {
val arrayValue = value as Array<Boolean>
log.debug { "readFieldExit array of boolean: $clazz = ${arrayValue.joinToString(",")}" }
return arrayValue.joinToString(",")
}
// N dimensional arrays
else if (arrayOf<Array<*>>()::class.java.isAssignableFrom(clazz)) {
val arrayValue = value as Array<Array<*>>
return arrayValue.map { arrayEntry ->
log.debug { "N Dimensional: $clazz, $arrayEntry, ${arrayEntry::class.java}" }
"[" + getArrayValue(arrayEntry::class.java, arrayEntry) + "]"
}.toString()
}
// Kotlin array types
else if (clazz == CharArray::class.java) {
val arrayValue = value as CharArray
log.debug { "readFieldExit char array: $clazz = ${arrayValue.joinToString("")}" }
return arrayValue.joinToString("")
}
else if (clazz == ByteArray::class.java) {
val arrayValue = value as ByteArray
log.debug { "readFieldExit byte array: $clazz = ${byteArrayToHex(arrayValue)}" }
return byteArrayToHex(arrayValue)
}
else if (clazz == ShortArray::class.java) {
val arrayValue = value as ShortArray
log.debug { "readFieldExit short array: $clazz = ${arrayValue.joinToString(",")}" }
return arrayValue.joinToString(",")
}
else if (clazz == IntArray::class.java) {
val arrayValue = value as IntArray
log.debug { "readFieldExit int array: $clazz = ${arrayValue.joinToString(",")}" }
return arrayValue.joinToString(",")
}
else if (clazz == LongArray::class.java) {
val arrayValue = value as LongArray
log.debug { "readFieldExit long array: $clazz = ${arrayValue.joinToString(",")}" }
return arrayValue.joinToString(",")
}
else if (clazz == FloatArray::class.java) {
val arrayValue = value as FloatArray
log.debug { "readFieldExit float array: $clazz = ${arrayValue.joinToString(",")}" }
return arrayValue.joinToString(",")
}
else if (clazz == DoubleArray::class.java) {
val arrayValue = value as DoubleArray
log.debug { "readFieldExit double array: $clazz = ${arrayValue.joinToString(",")}" }
return arrayValue.joinToString(",")
}
else if (clazz == BooleanArray::class.java) {
val arrayValue = value as BooleanArray
log.debug { "readFieldExit boolean array: $clazz = ${arrayValue.joinToString(",")}" }
return arrayValue.joinToString(",")
}
log.debug { "ARRAY OF TYPE: $clazz (size: ${(value as Array<Any?>).size})" }
}
return null
}
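`getArrayValue` leans on Kotlin's `joinToString` for the primitive-array cases; the renderings it produces can be checked standalone (hypothetical helper names, a sketch rather than the agent's code):

```kotlin
// Primitive numeric arrays render as comma-separated values and char arrays
// as a plain string, matching the formats produced by getArrayValue above.
fun renderIntArray(a: IntArray): String = a.joinToString(",")
fun renderCharArray(a: CharArray): String = a.joinToString("")
```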
private fun byteArrayToHex(a: ByteArray): String {
val sb = StringBuilder(a.size * 2)
for (b in a)
sb.append(String.format("%02x", b))
return sb.toString()
}
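The hex helper above can also be written with `joinToString`; this variant (a sketch, not the agent's code) produces identical output:

```kotlin
// Same %02x-per-byte rendering as byteArrayToHex: Java's Formatter prints a
// negative byte as its unsigned two-digit hex value (e.g. -1 -> "ff").
fun toHex(a: ByteArray): String = a.joinToString("") { String.format("%02x", it) }
```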
@JvmStatic
fun readEnter(input: Input, clazz: Class<*>) {
val (list, count) = events.getOrPut(Strand.currentStrand().id) { Pair(ArrayList(), AtomicInteger(0)) }
log.debug { "readEnter: adding event for clazz: ${clazz.name} (strandId: ${Strand.currentStrand().id})" }
list.add(StatsEvent.Enter(clazz.name, input.total()))
count.incrementAndGet()
}
@JvmStatic
fun readExit(input: Input, clazz: Class<*>, value: Any?) {
val (list, count) = events[Strand.currentStrand().id]!!
list.add(StatsEvent.Exit(clazz.name, input.total(), value))
log.debug { "readExit: clazz[$clazz], strandId[${Strand.currentStrand().id}], eventCount[$count]" }
if (count.decrementAndGet() == 0) {
if ((checkpointId != null) ||
((clazz.name == instrumentClassname) &&
(input.total() >= minimumSize) &&
(input.total() <= maximumSize))) {
val sb = StringBuilder()
if (checkpointId != null)
sb.append("Checkpoint id: $checkpointId\n")
prettyStatsTree(0, StatsInfo("", Any::class.java), readTree(list, 0).second, sb)
log.info("[READ] $clazz\n$sb")
checkpointId = null
}
log.debug { "readExit: clearing event for clazz: $clazz (strandId: ${Strand.currentStrand().id})" }
events.remove(Strand.currentStrand().id)
}
}
@JvmStatic
fun writeEnter(output: Output, obj: Any) {
// Write events are keyed by the negated strand id so that read and write event
// streams for the same strand do not collide in the shared events map.
val (list, count) = events.getOrPut(-Strand.currentStrand().id) { Pair(ArrayList(), AtomicInteger(0)) }
log.debug { "writeEnter: adding event for clazz: ${obj.javaClass.name} (strandId: ${Strand.currentStrand().id})" }
list.add(StatsEvent.Enter(obj.javaClass.name, output.total()))
count.incrementAndGet()
}
@JvmStatic
fun writeExit(output: Output, obj: Any) {
val (list, count) = events[-Strand.currentStrand().id]!!
list.add(StatsEvent.Exit(obj.javaClass.name, output.total(), null))
log.debug { "writeExit: clazz[${obj.javaClass.name}], strandId[${Strand.currentStrand().id}], eventCount[$count]" }
if (count.decrementAndGet() == 0) {
// always log diagnostics for explicit checkpoint ids (eg. set dumpCheckpoints)
if ((checkpointId != null) ||
((obj.javaClass.name == instrumentClassname) &&
(output.total() >= minimumSize) &&
(output.total() <= maximumSize))) {
val sb = StringBuilder()
prettyStatsTree(0, StatsInfo("", Any::class.java), readTree(list, 0).second, sb)
log.info("[WRITE] $obj\n$sb")
checkpointId = null
}
log.debug { "writeExit: clearing event for clazz: ${obj.javaClass.name} (strandId: ${Strand.currentStrand().id})" }
events.remove(-Strand.currentStrand().id)
}
}
private fun prettyStatsTree(indent: Int, statsInfo: StatsInfo, identityInfo: IdentityInfo, builder: StringBuilder) {
val statsTree = identityInfo.tree
when (statsTree) {
is StatsTree.Object -> {
if (printOnce && identityInfo.refCount > 1) {
log.debug { "Skipping $statsInfo, $statsTree (count:${identityInfo.refCount})" }
}
else if (indent/2 < graphDepth) {
builder.append(String.format("%03d:", indent / 2))
builder.append(CharArray(indent) { ' ' })
builder.append(" ${statsInfo.fieldName} ")
if (statsInfo.fieldType != null && statsInfo.fieldType.isArray) {
val arrayValue = (statsTree.value as Array<Any?>)
builder.append("${statsInfo.fieldType} (array length:${arrayValue.size})")
}
else if (statsInfo.fieldType != null && statsTree.value is Collection<*>) {
builder.append("${statsInfo.fieldType} (collection size:${statsTree.value.size})")
}
else if (statsInfo.fieldType != null && statsTree.value is Map<*,*>) {
builder.append("${statsInfo.fieldType} (map size:${statsTree.value.size})")
}
else {
builder.append("${statsTree.className} (hash:${statsTree.value?.hashCode()}) (count:${identityInfo.refCount})")
}
builder.append(" ")
builder.append(String.format("%,d", statsTree.size))
builder.append("\n")
}
for (child in statsTree.children) {
prettyStatsTree(indent + 2, child.first, child.second, builder)
}
}
is StatsTree.BasicType -> {
if (indent/2 < graphDepth) {
builder.append(String.format("%03d:", indent / 2))
builder.append(CharArray(indent) { ' ' })
builder.append(" ${statsInfo.fieldName} ")
builder.append("${statsTree.value}")
builder.append("\n")
}
}
}
}
}
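Each line emitted by `prettyStatsTree` starts with a zero-padded depth counter, indentation proportional to depth, then the field name, a type/size description, and the serialized byte size. A self-contained sketch of that line format (hypothetical field values; plain size append used here instead of the agent's locale-sensitive `"%,d"`):

```kotlin
// Mirrors the formatting in prettyStatsTree: "%03d:" depth prefix, two spaces
// of indent per level, then field name, description and byte size.
fun formatStatsLine(indent: Int, fieldName: String, description: String, size: Long): String {
    val sb = StringBuilder()
    sb.append(String.format("%03d:", indent / 2))
    sb.append(CharArray(indent) { ' ' })
    sb.append(" $fieldName ")
    sb.append(description)
    sb.append(" ")
    sb.append(size)
    return sb.toString()
}
```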
sealed class StatsEvent {
data class Enter(val className: String, val offset: Long) : StatsEvent()
data class Exit(val className: String, val offset: Long, val value: Any?) : StatsEvent()
data class BasicTypeField(val fieldName: String, val fieldType: Class<*>?, val fieldValue: Any?) : StatsEvent()
data class EnterField(val fieldName: String, val fieldType: Class<*>?) : StatsEvent()
data class ObjectField(val fieldName: String, val fieldType: Class<*>?, val value: Any) : StatsEvent()
}
sealed class StatsTree {
data class Object(
val className: String,
val size: Long,
val children: List<Pair<StatsInfo, IdentityInfo>>,
val value: Any?
) : StatsTree() {
override fun toString(): String {
return "Object [$className, $size, $value]"
}
}
data class BasicType(
val value: Any?
) : StatsTree()
data class Loop(var depth: Int) : StatsTree() {
override fun toString(): String {
return "Loop()"
}
}
}
data class IdentityInfo(val tree: StatsTree, val refCount: Int)
fun readTree(events: List<StatsEvent>, index: Int, idMap: IdentityHashMap<Any, IdentityInfo> = IdentityHashMap()): Pair<Int, IdentityInfo> {
val event = events[index]
when (event) {
is StatsEvent.Enter -> {
val (nextIndex, children) = readTrees(events, index + 1, idMap)
val exit = events[nextIndex] as StatsEvent.Exit
require(event.className == exit.className)
val tree = StatsTree.Object(event.className, exit.offset - event.offset, children, exit.value)
if (idMap.containsKey(exit.value)) {
val identityInfo = idMap[exit.value]!!
idMap[exit.value] = IdentityInfo(identityInfo.tree, identityInfo.refCount + 1)
log.debug { "Skipping repeated StatsEvent.Enter: ${exit.value} (hashcode:${exit.value!!.hashCode()}) (count:${idMap[exit.value]?.refCount})" }
} else idMap[exit.value] = IdentityInfo(tree, 1)
return Pair(nextIndex + 1, idMap[exit.value]!!)
}
else -> {
throw IllegalStateException("Wasn't expecting event: $event")
}
}
}
data class StatsInfo(val fieldName: String, val fieldType: Class<*>?)
fun readTrees(events: List<StatsEvent>, index: Int, idMap: IdentityHashMap<Any, IdentityInfo>): Pair<Int, List<Pair<StatsInfo, IdentityInfo>>> {
val trees = ArrayList<Pair<StatsInfo, IdentityInfo>>()
var i = index
var arrayIdx = 0
var inField = false
while (true) {
val event = events.getOrNull(i)
when (event) {
is StatsEvent.Enter -> {
val (nextIndex, tree) = readTree(events, i, idMap)
if (!inField) {
arrayIdx++
trees += StatsInfo("[$arrayIdx]", Any::class.java) to tree
}
i = nextIndex
}
is StatsEvent.EnterField -> {
i++
inField = true
}
is StatsEvent.Exit -> {
arrayIdx = 0
if (idMap.containsKey(event.value)) {
val identityInfo = idMap[event.value]!!
idMap[event.value] = IdentityInfo(identityInfo.tree, identityInfo.refCount + 1)
log.debug { "Skipping repeated StatsEvent.Exit: ${event.value} (hashcode:${event.value!!.hashCode()}) (count:${idMap[event.value]?.refCount})" }
}
return Pair(i, trees)
}
is StatsEvent.BasicTypeField -> {
trees += StatsInfo(event.fieldName, event.fieldType) to IdentityInfo(StatsTree.BasicType(event.fieldValue), 1)
i++
inField = false
}
is StatsEvent.ObjectField -> {
val identityInfo =
if (idMap.containsKey(event.value)) {
val identityInfo = idMap[event.value]!!
idMap[event.value] = IdentityInfo(identityInfo.tree, identityInfo.refCount + 1)
log.debug { "Skipping repeated StatsEvent.ObjectField: ${event.value} (hashcode:${event.value!!.hashCode()}) (count:${idMap[event.value]?.refCount})" }
identityInfo
}
else {
IdentityInfo(StatsTree.Loop(0), 1)
}
trees += StatsInfo(event.fieldName, event.fieldType) to identityInfo
i++
inField = false
}
null -> {
return Pair(i, trees)
}
}
}
}
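The pairing performed by `readTree`/`readTrees` can be illustrated with a self-contained model that keeps only the essentials (no identity tracking, no field events): nested `Enter`/`Exit` pairs become parent/child nodes, and a node's size is the difference between its exit and enter offsets. The types below are simplified stand-ins, not the classes above.

```kotlin
// Minimal model of the event stream consumed by readTree: Enter/Exit pairs
// nest like parentheses; offsets are stream positions, so exit - enter is the
// number of bytes the object (including its children) occupied.
sealed class Ev {
    data class Enter(val className: String, val offset: Long) : Ev()
    data class Exit(val className: String, val offset: Long) : Ev()
}

data class Node(val className: String, val size: Long, val children: List<Node>)

fun buildTree(events: List<Ev>, index: Int = 0): Pair<Int, Node> {
    val enter = events[index] as Ev.Enter
    val children = mutableListOf<Node>()
    var i = index + 1
    while (events[i] is Ev.Enter) {
        val (next, child) = buildTree(events, i)
        children += child
        i = next
    }
    val exit = events[i] as Ev.Exit
    require(exit.className == enter.className) { "Mismatched Enter/Exit" }
    return Pair(i + 1, Node(enter.className, exit.offset - enter.offset, children))
}
```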