trick/docs/documentation/simulation_capabilities/Data-Record.md
2023-03-06 09:25:50 -06:00


Data Recording provides the capability to specify any number of data recording groups, each with an unlimited number of parameter references, and with each group recording at different frequencies to different files in different formats.

All data is written to the simulation output directory.

Format of Recording Groups

Trick allows recording in three different formats. Each format is readable by different tools outside of Trick.

  • DRAscii - Human readable and compatible with Excel.
  • DRBinary - Readable by previous Trick data products.
  • DRHDF5 - Readable by Matlab.

DRHDF5 recording support is off by default. To enable it, Trick must be built with HDF5 support. Go to http://www.hdf5group.org and download the latest pre-built hdf5 package for your system; source packages are available as well. We recommend the static library packages over the shared ones: static packages make your executable larger, but you will not have to deal with LD_LIBRARY issues. The HDF5 package may be installed anywhere on your system. To tell Trick where HDF5 is installed, run ${TRICK_HOME}/configure --with-hdf5=/path/to/hdf5, then re-compile Trick to enable HDF5 support.

Creating a New Recording Group

To create a new recording group, in the Python input file instantiate a new group by format name: <variable_name> = trick.<data_record_format>() ;

For example:

drg = trick.DRBinary() ;

Note: drg is just an example name. Any name may be used.

Adding a Variable To Be Recorded

To add variables to the recording group call the drg.add_variable("<string_of_variable_name>") method of the recording group. For example:

drg.add_variable("ball.obj.state.output.position[0]")
drg.add_variable("ball.obj.state.output.position[1]")

In this example, position is an array of floating point numbers. DO NOT ATTEMPT TO DATA RECORD C OR C++ STRINGS. THIS HAS BEEN OBSERVED TO CREATE MEMORY ISSUES, AND TRICK DOES NOT CURRENTLY PROVIDE ERROR CHECKING FOR THIS UNSUPPORTED USE CASE.

An optional alias may also be specified as a second argument: drg.add_variable("<string_of_variable_name>" [, "<alias>"]).
If an alias is present, the alias name will be used in the data recording file instead of the actual variable name. For example:

drg.add_variable("ball.obj.state.output.position[0]", "x_pos")
drg.add_variable("ball.obj.state.output.position[1]", "y_pos")

Only individual primitive types can be recorded. Arrays, strings/char *, structured objects, and STL types are not supported; to record an array, add each element individually, as in the examples above.
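Since each array element has to be added on its own, a small helper can generate the element names. The following is a minimal sketch, not part of Trick: `element_names`, `add_array`, and the `FakeGroup` stand-in are hypothetical names introduced here so the snippet runs outside a simulation. In a real input file the group would be the object returned by, for example, trick.DRBinary().

```python
def element_names(base, length):
    """Return one variable name per element of a one-dimensional array."""
    return [f"{base}[{i}]" for i in range(length)]


class FakeGroup:
    """Hypothetical stand-in for a Trick recording group (not part of Trick)."""

    def __init__(self):
        self.variables = []

    def add_variable(self, name, alias=None):
        # Mirrors the call shape drg.add_variable(name [, alias]) shown above.
        self.variables.append((name, alias))


def add_array(group, base, length):
    """Register every element of an array with the recording group."""
    for name in element_names(base, length):
        group.add_variable(name)


drg = FakeGroup()
add_array(drg, "ball.obj.state.output.position", 2)
print([name for name, _ in drg.variables])
```

The array length is passed explicitly because this sketch has no way to query the simulation for it.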

Changing the Recording Rate

To change the recording rate, call the set_cycle() method of the recording group. For example:

drg.set_cycle(0.01) 

Buffering Techniques

Data recording groups have three buffering options:

  • DR_Buffer - the group will save recorded data to a buffer and use a separate thread to write recorded data to disk. This has little impact on the performance of the simulation. The downside is that if the simulation crashes, the most recent recorded points may not be written to disk in time. DR_Buffer is the default buffering technique. (For backwards compatibility, DR_Buffer can also be called DR_Thread_Buffer).
  • DR_No_Buffer - the group will write recorded data straight to disk. All data is guaranteed to be written to disk at simulation termination time. The downside of this method is that it is performed in the main thread of the simulation and could impact real-time performance.
  • DR_Ring_Buffer - the group will save a set number of records in memory and write this data to disk during a graceful simulation termination. The advantage of this method is that there is only a set, usually small, number of records written. The downside of this method is that if the simulation terminates ungracefully, all recorded data may be lost.

To set the buffering technique call the set_buffer_type(trick.<buffering_option>) method of the recording group. For example:

drg.set_buffer_type(trick.DR_Buffer) 

All buffering options (except for DR_No_Buffer) have a maximum amount of memory allocated to holding data. See Trick::DataRecordGroup::set_max_buffer_size for buffer size information.
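To make the DR_Ring_Buffer trade-off concrete, here is a toy model in plain Python. It is only a conceptual illustration and makes no claim about how Trick implements its buffers; `RingBufferModel` and its methods are hypothetical names. It keeps a fixed number of the newest records and hands them over only when `flush()` is called, which stands in for a graceful termination.

```python
from collections import deque


class RingBufferModel:
    """Toy model of a ring buffer: keeps only the newest `capacity` records."""

    def __init__(self, capacity):
        self.records = deque(maxlen=capacity)

    def record(self, time, value):
        # When the deque is full, appending discards the oldest record.
        self.records.append((time, value))

    def flush(self):
        """Return what a graceful shutdown would write to disk."""
        return list(self.records)


ring = RingBufferModel(capacity=3)
for step in range(10):
    ring.record(step * 0.5, step * step)

print(ring.flush())
```

Only the last three of the ten records survive. If `flush()` is never reached, which models an ungraceful termination, nothing is written at all.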

Recording Frequency: Always or Only When Data Changes

Data recording groups have three recording frequency options:

  • DR_Always - the group will record the variable value(s) at every recording cycle. (This is the default).
  • DR_Changes - the group will record the variable value(s) only when a particular watched parameter (or parameters) value changes.
  • DR_Changes_Step - like DR_Changes, except that a before and after value will be recorded for each variable, creating a stair step effect (instead of point-to-point) when plotted.

To set the recording frequency call the set_freq(trick.<frequency_option>) method of the recording group. For example:

drg.set_freq(trick.DR_Changes)

For DR_Changes or DR_Changes_Step, to specify parameter(s) to watch that will control when the variables added with add_variable are recorded, call the add_change_variable(string) method of the recording group. For example:

drg.add_change_variable("ball.obj.state.output.velocity[0]") 

So if we combine the add_variable statements from the example in the section "Adding a Variable To Be Recorded" with the above add_change_variable statement, then ball.obj.state.output.position[0] and ball.obj.state.output.position[1] will be recorded only when ball.obj.state.output.velocity[0] changes. Multiple parameters may be watched by adding more change variables, in which case data will be recorded when any of the watched variable values changes.
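The difference between the three frequency options can be sketched with a small stand-alone model. This is an illustration of the behavior described above, not Trick's code. It assumes that the first sample is always recorded and that the "before" value written by DR_Changes_Step is the last recorded value stamped with the current time. The function name and the mode strings are hypothetical.

```python
def filter_records(samples, mode):
    """Model which rows a group would record.

    samples: list of (time, watched, value) tuples, one per recording cycle.
    mode: "always", "changes", or "changes_step".
    Returns a list of (time, value) rows.
    """
    recorded = []
    last_watched = None   # watched value seen on the previous cycle
    last_recorded = None  # value written by the most recent record
    for index, (time, watched, value) in enumerate(samples):
        # Assumption: the very first sample counts as a change.
        changed = index == 0 or watched != last_watched
        last_watched = watched
        if mode != "always" and not changed:
            continue
        if mode == "changes_step" and index > 0:
            # "Before" point: the old value at the new time (stair step).
            recorded.append((time, last_recorded))
        recorded.append((time, value))
        last_recorded = value
    return recorded


samples = [(0.0, 1.0, 10.0), (0.1, 1.0, 11.0), (0.2, 2.0, 12.0), (0.3, 2.0, 13.0)]
for mode in ("always", "changes", "changes_step"):
    print(mode, filter_records(samples, mode))
```

With these samples the watched value changes only at t=0.2, so "changes" yields two rows and "changes_step" yields three, the extra one being the held value at t=0.2.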

Turn Off/On and Record Individual Recording Groups

At any time during the simulation, model code or the input processor can turn on/off individual recording groups as well as record a single point of data.

/* C code */
dr_enable_group("<group_name>") ;
dr_disable_group("<group_name>") ;
dr_record_now_group("<group_name>") ;

This is the Python input file version:

# Python code
trick.dr_enable_group("<group_name>") ;  # same as <group_name>.enable()
trick.dr_disable_group("<group_name>") ; # same as <group_name>.disable()
trick.dr_record_now_group("<group_name>") ;

Changing the Thread Data Recording Runs On

To change the thread that the data recording group runs on, use the DataRecordGroup::set_thread method. The thread number follows the same numbering as the child threads in the S_define file. This must be done before the add_data_record_group function is called. Trick does not provide data locks for data record groups. It is up to the user to ensure that the data recorded on any thread (including the master) is ready, so that data recording captures a time-homogeneous set of data.

drg.set_thread(<thread_number>)

Changing the Job Class of a Data Record Group

The default job class of a data record group is "data_record". This job class is run after all of the cyclic job classes have completed. The job class of the data record group can be changed through the set_job_class method. The data recording job will be added to the end of the queue of the job class it is assigned to.

drg.set_job_class(<string class_name>)

Changing the Max File Size of a Data Record Group (Ascii and Binary only)

The default maximum file size of a data record group is 1 GiB. A new size, in bytes, can be set through the set_max_file_size method. For unlimited size, pass 0.

drg.set_max_file_size(<uint64 file_size_in_bytes>)
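Because the size is given in bytes, a small conversion helper avoids hand-typed constants. This is a minimal sketch; `gib_to_bytes` is a hypothetical helper, not a Trick function.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def gib_to_bytes(gib):
    """Convert a size in GiB to a whole number of bytes."""
    return int(gib * GIB)


print(gib_to_bytes(1))    # the documented 1 GiB default, expressed in bytes
print(gib_to_bytes(0.5))
```

In an input file this could be used as drg.set_max_file_size(gib_to_bytes(2)) to raise the limit to 2 GiB.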

Example Data Recording Group

This is an example of a data recording group in the input file:

# Data recording HDF5 test
drg0 = trick.DRHDF5("Ball")
drg0.add_variable("ball.obj.state.output.position[0]") 
drg0.add_variable("ball.obj.state.output.position[1]") 
drg0.add_variable("ball.obj.state.output.velocity[0]") 
drg0.add_variable("ball.obj.state.output.velocity[1]") 
drg0.add_variable("ball.obj.state.output.acceleration[0]") 
drg0.add_variable("ball.obj.state.output.acceleration[1]") 
drg0.set_cycle(0.01)
drg0.freq = trick.DR_Always
trick.add_data_record_group(drg0, trick.DR_Buffer)

# This line is to tell python not to free this memory when drg0 goes out of scope
drg0.thisown = 0

User accessible routines

Create a new data recording group:

Trick::DRAscii::DRAscii(string in_name);
Trick::DRBinary::DRBinary(string in_name);
Trick::DRHDF5::DRHDF5(string in_name);

This list of routines is for all recording formats:

int dr_disable_group( const char * in_name );
int dr_enable_group( const char * in_name );
int dr_record_now_group( const char * in_name );

int Trick::DataRecordGroup::add_variable
int Trick::DataRecordGroup::add_change_variable
int Trick::DataRecordGroup::disable
int Trick::DataRecordGroup::enable
int Trick::DataRecordGroup::set_cycle
int Trick::DataRecordGroup::set_freq
int Trick::DataRecordGroup::set_job_class
int Trick::DataRecordGroup::set_max_buffer_size

These routines provide file size configuration for Ascii and Binary:

int set_max_size_record_group (const char * in_name, uint64_t bytes ) ;
int dr_set_max_file_size ( uint64_t bytes ) ;

int Trick::DataRecordGroup::set_max_file_size

These routines provide some additional configuration for the DRAscii format only:

int Trick::DRAscii::set_ascii_double_format
int Trick::DRAscii::set_ascii_float_format
int Trick::DRAscii::set_delimiter
int Trick::DataRecordGroup::set_single_prec_only

DRAscii Recording Format

The DRAscii recording format is a comma separated value file named log_<group_name>.csv. The contents of this file type are readable by the Trick Data Products packages, ASCII editors, and Microsoft Excel. Users are able to change the comma delimiter to another string; doing so changes the file extension from ".csv" to ".txt". The format of the file follows.

name_1 {units_1},name_2 {units_2},etc...
value1,value2,etc...
value1,value2,etc...
value1,value2,etc...
value1,value2,etc...
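A file in this layout can be read with Python's standard csv module. The sketch below is only an illustration: `read_drascii` is a hypothetical helper, the sample text is made up, and it assumes every value is numeric and that the delimiter is a single character (csv.reader does not accept longer delimiter strings).

```python
import csv
import io
import re

# Splits a header cell of the form "name {units}" into its two parts.
HEADER_CELL = re.compile(r"^(?P<name>.*?)\s*\{(?P<units>.*)\}$")


def read_drascii(stream, delimiter=","):
    """Return ([(name, units), ...], [[float, ...], ...]) from a DRAscii stream."""
    reader = csv.reader(stream, delimiter=delimiter)
    columns = []
    for cell in next(reader):
        match = HEADER_CELL.match(cell.strip())
        if match:
            columns.append((match.group("name"), match.group("units")))
        else:
            columns.append((cell.strip(), ""))  # header cell without units
    rows = [[float(value) for value in row] for row in reader if row]
    return columns, rows


# Made-up sample data in the layout shown above.
sample = io.StringIO(
    "sys.exec.out.time {s},x_pos {m}\n"
    "0,1.5\n"
    "0.01,1.6\n"
)
columns, rows = read_drascii(sample)
print(columns)
print(rows)
```

For a real file, pass an open file object for log_<group_name>.csv instead of the in-memory sample.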

DRBinary Recording Format

The DRBinary recording format is a Trick simulation specific format. Files written in this format are named log_<group_name>.trk. The contents of this file type are readable by the Trick Data Products packages from Trick 07 to the current version. The format of the file follows.

DRBinary-File

| Value | Description | Type | #Bytes |
|---|---|---|---|
| Trick-<vv>-<e> | <vv> is trick version (2 chars, "07" or "10"). <e> is endianness (1 char): 'L' -> little endian, 'B' -> big endian. | char | 10 |
| numparms | Number of recorded variables | char | 4 |
| | List of Variable Descriptors | Variable-Descriptor-List | |
| | List of Data Records | Data-Record-List | |
| EOF | End of File | | |

Variable-Descriptor-List

A Variable-Descriptor-List is a sequence of Variable-Descriptors. The number of descriptors in the list is specified by numparms. The list describes each of the recorded variables, starting with the simulation time variable.

| Value | Description | Type | #Bytes |
|---|---|---|---|
| Time-Variable-Descriptor | Descriptor for Variable # 1. This first descriptor always represents the simulation time variable. | Variable-Descriptor | 34 |
| ... | ... | ... | ... |
| | Descriptor for Variable # numparms | Variable-Descriptor | variable |

Variable-Descriptor

A Variable-Descriptor describes a recorded variable.

| Value | Description | Type | Bytes |
|---|---|---|---|
| namelen | Length of Variable Name | int | 4 |
| name | Variable Name | | namelen |
| unitlen | Length of Variable Units | int | 4 |
| unit | Variable Units | | unitlen |
| type | Variable Type (see Notes 2. & 3.) | int | 4 |
| sizeof(type) | Variable Type Size | int | 4 |

Notes:

  1. The size of a Variable-Descriptor in bytes = namelen + unitlen + 16.
  2. If vv = "07", use Trick 07 Data Types.
  3. If vv = "10", use Trick 10 Data Types.

Time-Variable-Descriptor

| Value | Description | Type | Bytes |
|---|---|---|---|
| 17 | Length of Variable Name | int | 4 |
| sys.exec.out.time | Variable Name | char | 17 |
| 1 | Length of Variable Units | int | 4 |
| s | Variable Units | char | 1 |
| 11 | Variable Type (see Note 1.) | int | 4 |
| 8 | Variable Type Size | int | 4 |

Notes:

  1. Here, we are assuming "vv" = "10", and so, referring to Trick 10 Data Types, Variable Type = 11, which corresponds to double.
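The descriptor layout above maps directly onto Python's struct module. The following sketch packs and unpacks a single Variable-Descriptor and confirms the 34-byte size of the time descriptor. The function names are hypothetical, and little-endian byte order ("<") and ASCII names are assumptions made for the example.

```python
import struct


def pack_descriptor(name, units, type_code, type_size, endian="<"):
    """Build the bytes of one Variable-Descriptor."""
    name_b = name.encode("ascii")
    units_b = units.encode("ascii")
    return (
        struct.pack(endian + "i", len(name_b)) + name_b
        + struct.pack(endian + "i", len(units_b)) + units_b
        + struct.pack(endian + "ii", type_code, type_size)
    )


def unpack_descriptor(buf, offset=0, endian="<"):
    """Decode one Variable-Descriptor; return (fields, offset just past it)."""
    (namelen,) = struct.unpack_from(endian + "i", buf, offset)
    offset += 4
    name = buf[offset:offset + namelen].decode("ascii")
    offset += namelen
    (unitlen,) = struct.unpack_from(endian + "i", buf, offset)
    offset += 4
    units = buf[offset:offset + unitlen].decode("ascii")
    offset += unitlen
    type_code, type_size = struct.unpack_from(endian + "ii", buf, offset)
    offset += 8
    return (name, units, type_code, type_size), offset


# The time descriptor from the table: name of 17 chars, units "s", type 11, size 8.
time_desc = pack_descriptor("sys.exec.out.time", "s", 11, 8)
fields, end = unpack_descriptor(time_desc)
print(len(time_desc), fields)
```

The length comes out as namelen + unitlen + 16 = 17 + 1 + 16 = 34, matching the size given for the Time-Variable-Descriptor.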

Data-Record-List

A Data-Record-List contains a collection of Data-Records, at regular times.

| Value | Description | Type | Bytes |
|---|---|---|---|
| | Data-Record #1 | Data-Record | |
| ... | ... | ... | ... |
| | Data-Record #Last | Data-Record | |

Data-Record

A Data-Record contains a collection of values for each of the variables we are recording, at a specific time.

| Value | Description | Type | Bytes |
|---|---|---|---|
| time | Value of Variable #1 (time) | typeof( Variable#1 ) | sizeof( typeof( Variable#1 ) ) = 8 |
| ... | ... | ... | ... |
| value | Value of Variable #numparms | typeof( Variable#numparms ) | sizeof( typeof( Variable#numparms ) ) |

Trick 07 Data Types

The following data types were used in Trick-07 data recording files (that is, for vv = "07").

| Type value | Data Type |
|---|---|
| 0 | char |
| 1 | unsigned char |
| 2 | string (char*) |
| 3 | short |
| 4 | unsigned short |
| 5 | int |
| 6 | unsigned int |
| 7 | long |
| 8 | unsigned long |
| 9 | float |
| 10 | double |
| 11 | Bit field |
| 12 | unsigned Bit field |
| 13 | long long |
| 14 | unsigned long long |
| 17 | Boolean (C++) |

Trick 10 Data Types

The following data types are used in Trick versions >= 10 (that is, for vv = "10").

| Type value | Data Type |
|---|---|
| 1 | char |
| 2 | unsigned char |
| 4 | short |
| 5 | unsigned short |
| 6 | int |
| 7 | unsigned int |
| 8 | long |
| 9 | unsigned long |
| 10 | float |
| 11 | double |
| 12 | Bit field |
| 13 | unsigned Bit field |
| 14 | long long |
| 15 | unsigned long long |
| 17 | Boolean (C++) |
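Putting the pieces together, a DRBinary file with vv = "10" could be read roughly as follows. This is a sketch under stated assumptions, not a reference reader: `read_trk` and `descriptor` are hypothetical helpers, numparms is treated as a 4-byte integer in the file's byte order, records are assumed to be packed back to back with no padding, and only the fixed-size types in `FORMATS` are handled (long and bit fields are deliberately left out). The data is a synthetic in-memory file, not output from a real simulation.

```python
import io
import struct

# Trick 10 type codes -> struct format characters (fixed-size types only).
FORMATS = {1: "b", 2: "B", 4: "h", 5: "H", 6: "i", 7: "I",
           10: "f", 11: "d", 14: "q", 15: "Q", 17: "?"}


def read_trk(stream):
    """Return ([(name, units), ...], [record tuples]) from a vv = "10" stream."""
    magic = stream.read(10).decode("ascii")  # e.g. "Trick-10-L"
    if magic[6:8] != "10":
        raise ValueError("this sketch only handles vv = '10'")
    endian = "<" if magic[9] == "L" else ">"
    (numparms,) = struct.unpack(endian + "i", stream.read(4))
    variables, record_fmt = [], endian
    for _ in range(numparms):
        (namelen,) = struct.unpack(endian + "i", stream.read(4))
        name = stream.read(namelen).decode("ascii")
        (unitlen,) = struct.unpack(endian + "i", stream.read(4))
        units = stream.read(unitlen).decode("ascii")
        type_code, type_size = struct.unpack(endian + "ii", stream.read(8))
        fmt = FORMATS[type_code]
        if struct.calcsize(endian + fmt) != type_size:
            raise ValueError(f"unexpected size for {name}")
        variables.append((name, units))
        record_fmt += fmt
    record_size = struct.calcsize(record_fmt)
    records = []
    while record_size:
        chunk = stream.read(record_size)
        if len(chunk) < record_size:
            break  # end of file (or a truncated final record)
        records.append(struct.unpack(record_fmt, chunk))
    return variables, records


def descriptor(name, units, type_code, type_size):
    """Build one little-endian Variable-Descriptor for the synthetic file."""
    return (struct.pack("<i", len(name)) + name.encode("ascii")
            + struct.pack("<i", len(units)) + units.encode("ascii")
            + struct.pack("<ii", type_code, type_size))


# Synthetic file: time plus one double variable, two records.
blob = (b"Trick-10-L" + struct.pack("<i", 2)
        + descriptor("sys.exec.out.time", "s", 11, 8)
        + descriptor("x_pos", "m", 11, 8)
        + struct.pack("<dd", 0.0, 1.5)
        + struct.pack("<dd", 0.01, 1.6))
variables, records = read_trk(io.BytesIO(blob))
print(variables)
print(records)
```

Checking the descriptor's size field against struct.calcsize guards against silently misreading a type whose width differs from what this sketch expects.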

DRHDF5 Recording Format

The DRHDF5 recording format is an industry-standard HDF5 formatted file. Files written in this format are named log_<group_name>.h5. The contents of this file type are readable by the Trick Data Products packages from Trick 07 to the current version. The contents of the file are binary and are not included here. The HDF5 layout of the file follows.

GROUP "/" {
    GROUP "header" {
        DATASET "file_names" {
            "param_1_file_name", "param_2_file_name", etc...
        }
        DATASET "param_names" {
            "param_1_name", "param_2_name", etc...
        }
        DATASET "param_types" {
            "param_1_type", "param_2_type", etc...
        }
        DATASET "param_units" {
            "param_1_units", "param_2_units", etc...
        }
    }
    DATASET "parameter #1" {
        value1 , value2 , value3 , etc...
    }
    DATASET "parameter #2" {
        value1 , value2 , value3 , etc...
    }
    .
    .
    .
    DATASET "parameter #n" {
        value1 , value2 , value3 , etc...
    }
}

Interaction with Checkpoints

Data recording groups can be checkpointed, reloaded, and restarted without any interaction by the user. When a checkpoint that includes data recording is loaded, the data recording groups are initiated and begin recording at the time in the checkpoint. For example, if a checkpoint was dumped at t=5, then when it is loaded into another run, data recording starts at t=5, no matter at what time in the run it was loaded or whether the run was already data recording. Loading a checkpoint will overwrite any data recording files that were being recorded before the load.

Loading a checkpoint with different data recording groups than the current run will overwrite the current data recording groups.

Refer to test/SIM_checkpoint_data_recording to see the expected behavior in action. Overall, loading a checkpoint should completely overwrite any other data recording the sim is currently doing, and the new recording will start at the time in the checkpoint. If you come across different behavior, please open an issue.

Continue to Checkpointing