trick/trick_source/sim_services/MonteCarloGeneration/mc_variable_file.cc

/*******************************TRICK HEADER******************************
PURPOSE: ( Implementation of a file-lookup assignment )
PROGRAMMERS:
(((Gary Turner) (OSR) (October 2019) (Antares) (Initial)))
(((Isaac Reaves) (NASA) (November 2022) (Integration into Trick Core)))
**********************************************************************/
#include <algorithm> // all_of
#include <cctype> // isspace
#include <sstream> // istringstream
#include "trick/exec_proto.h"
#include "trick/message_proto.h"
#include "trick/message_type.h"
#include "trick/mc_variable_file.hh"
/*****************************************************************************
Constructor
*****************************************************************************/
MonteCarloVariableFile::MonteCarloVariableFile(
const std::string & var_name,
const std::string & filename_,
size_t column_number_,
size_t first_column_number_)
:
MonteCarloVariable(var_name),
max_skip(0),
is_dependent(false),
rand_gen(0),
filename(filename_),
column_number(column_number_),
first_column_number(first_column_number_),
dependents(),
file()
{
// make this a dependent of itself so that when it reads the data file, it
// populates its own "assignment" variable.
dependents.push_back(this);
type = MonteCarloVariable::Prescribed;
}
/*****************************************************************************
initialize_file
Purpose:(Sorts the dependents by column number, opens the file identified by
filename as an ifstream, and verifies that it contains at least one
line of data)
*****************************************************************************/
void
MonteCarloVariableFile::initialize_file()
{
// At this time, the list of dependencies has been finalized. We can sort
// this list by column number of each of the dependencies, in increasing
// order.
if (dependents.size() > 1) {
dependents.sort(sort_by_col_num);
}
// Check that the specified first_column_number is no larger than the
// smallest column number:
MonteCarloVariableFile * first_var = dependents.front();
if (first_column_number > first_var->get_column_number()) {
std::string message =
std::string("File: ") + __FILE__ + ", Line: " +
std::to_string(__LINE__) + ", Configuration Error\nIn configuring " +
"the file for variable " + first_var->get_variable_name().c_str() +
", it was identified that\nit was specified to draw data from column " +
std::to_string(first_var->get_column_number()) + ", but that the " +
"first\ncolumn was identified as having index " +
std::to_string(first_column_number) + ".\n";
message_publish(MSG_ERROR, message.c_str());
exec_terminate_with_return(1, __FILE__, __LINE__, message.c_str());
}
// Now we can get to reading the file.
file.open(filename);
if (file.fail()) {
std::string message =
std::string("File: ") + __FILE__ + ", Line: " +
std::to_string(__LINE__) + ", I/O error\nUnable to open file " +
filename.c_str() + " for reading.\nRequired for variable " +
variable_name.c_str() + ".\n";
message_publish(MSG_ERROR, message.c_str());
exec_terminate_with_return(1, __FILE__, __LINE__, message.c_str());
}
// Sanity check -- make sure the file has at least 1 line of data:
std::string line;
do {
// If we reach the end of the file without finding a usable line, fail out.
if (file.eof()) {
std::string message =
std::string("File: ") + __FILE__ + ", Line: " +
std::to_string(__LINE__) + ", Invalid data file\nData file " +
filename.c_str() + " contains no recognized lines of data\n" +
"Required for variable " + variable_name.c_str() + ".\n";
message_publish(MSG_ERROR, message.c_str());
exec_terminate_with_return(1, __FILE__, __LINE__, message.c_str());
}
std::getline( file, line);
// keep looking if the line is empty, starts with a "#" character or
// "/" character or is completely whitespace.
} while (line.empty() ||
line.front() == '#' ||
line.front() == '/' ||
std::all_of( line.begin(), line.end(), isspace));
// Rewind the file
file.seekg(0, file.beg);
}
/*****************************************************************************
generate_assignment
Purpose:(generates the command line that is to be embedded in the monte-input
file currently being generated.)
*****************************************************************************/
void
MonteCarloVariableFile::generate_assignment()
{
// if this instance is not dependent on another, need to read the file.
if (!is_dependent) {
process_line(); // provides "assignment"
}
generate_command();
insert_units();
}
/*****************************************************************************
register_dependent
Purpose:(Registers another MonteCarloVariableFile instance with this one,
allowing this instance to read the data for the other.)
*****************************************************************************/
void
MonteCarloVariableFile::register_dependent(
MonteCarloVariableFile * new_var)
{
if (new_var == NULL) {
std::string message =
std::string("File: ") + __FILE__ + ", Line: " +
std::to_string(__LINE__) + ", Invalid call\nAttempted to register " +
"a dependent identified with NULL pointer with \nthe " +
"MonteCarloVariableFile for variable " + variable_name.c_str() +
".\nThis is not a valid action.\nRegistration failed, exiting " +
"without action.\n";
message_publish(MSG_ERROR, message.c_str());
return;
}
if (new_var->has_dependents()) {
std::string message =
std::string("File: ") + __FILE__ + ", Line: " +
std::to_string(__LINE__) + ", Invalid configuration\nError in " +
"attempting to make " + new_var->get_variable_name().c_str() +
" be dependent on " + variable_name.c_str() + ".\n" +
new_var->get_variable_name().c_str() + " cannot be marked as " +
"dependent when it has dependencies of its own.\nThe dependency " +
"hierarchy can only be one level deep.\n";
message_publish(MSG_ERROR, message.c_str());
exec_terminate_with_return(1, __FILE__, __LINE__, message.c_str());
}
if (new_var->max_skip != max_skip) {
std::string message =
std::string("File: ") + __FILE__ + ", Line: " +
std::to_string(__LINE__) + ", Invalid configuration\nIt is not " +
"permissible for two variables looking at the same file to\noperate " +
"under different line-selection criteria.\n" +
new_var->get_variable_name().c_str() + "\nwill be switched to the " +
"behavior of\n" + variable_name.c_str() + ",\nwhich " +
"has a setting for the maximum number of lines to skip of " +
std::to_string(max_skip) + "\n";
message_publish(MSG_ERROR, message.c_str());
}
dependents.push_back(new_var);
new_var->is_dependent = true;
}
/*****************************************************************************
process_line
Purpose:(extract and process a line of data from the file, breaking it
into words and extracting the appropriate word number.)
*****************************************************************************/
void
MonteCarloVariableFile::process_line()
{
size_t skip_count = 0;
if (max_skip > 0) {
std::uniform_int_distribution<int> skip_distrib(0,max_skip);
skip_count = skip_distrib( rand_gen);
}
std::string line;
for (size_t ii = 0; ii <= skip_count; ++ii) {
// keep reading the next line until a "good" line is found
do {
// read the next line
std::getline( file, line);
// if reached the end of the file, clear the error flag and go back
// to the beginning.
if (file.eof()) {
file.clear();
file.seekg(0, file.beg);
}
// keep looking if the line is empty, starts with a "#" character or
// "/" character or is completely whitespace.
} while ( line.empty() ||
line.front() == '#' ||
line.front() == '/' ||
std::all_of( line.begin(), line.end(), isspace));
// A good line was found; return to the top of the for loop to see if
// we have skipped over enough good lines yet.
}
// Have the line containing the data. Need to assign some subset of the
// words in this line to a set of variables, knowing the column number for
// each variable.
// Capture the first word, and associate it with the user-specified
// column number of the first word.
std::istringstream word(line);
std::string scratch_assignment;
word >> scratch_assignment;
size_t current_column_number = first_column_number;
// Now for each dependent (including itself) find and assign the appropriate
// word
for (auto it = dependents.begin(); it != dependents.end(); ++it) {
// The dependents have already been sorted according to their column
// number, so the next column we need data from is the column number of
// this next variable.
size_t next_column_needed = (*it)->get_column_number();
// This next-needed column can be no earlier than
// the current column, but it could be the current column -- could be
// the first column, or we could have multiple variables collecting data
// from the same column. So check whether we need to advance the
// string-stream.
while (next_column_needed > current_column_number &&
word >> scratch_assignment) {
// for as long as the next needed column is to the right of where we
// currently are, and another word was successfully extracted, advance
// the current column number. A failed extraction leaves the column
// number untouched, so it always identifies the last word actually
// read and a short line cannot silently reuse its final word.
current_column_number++;
}
// There are two ways to get past the while loop -- we ran out of
// words, or the current column number reached the target next-needed
// column number (including the case where it was already there).
// If we ran out of words before reaching a specified column, we have
// a problem
if (current_column_number < next_column_needed) {
std::string message =
std::string("File: ") + __FILE__ + ", Line: " +
std::to_string(__LINE__) + ", Malformed data file\nData file " +
"for variable " + (*it)->get_variable_name().c_str() +
" includes this line:\n" + line.c_str() + "\nWhich has only " +
std::to_string(current_column_number - first_column_number + 1) +
" values.\nVariable " + (*it)->get_variable_name().c_str() +
" uses the value from position " +
std::to_string(next_column_needed) +
", which does not exist in this line\n";
message_publish(MSG_ERROR, message.c_str());
exec_terminate_with_return(1, __FILE__, __LINE__, message.c_str());
}
// and if we found the desired column, send its value to the variable:
(*it)->assignment = scratch_assignment;
}
}
/*****************************************************************************
sort_by_col_num
Purpose:(Comparison function used to sort the dependent list by column number)
*****************************************************************************/
bool MonteCarloVariableFile::sort_by_col_num(
MonteCarloVariableFile * left,
MonteCarloVariableFile * right)
{
return left->get_column_number() < right->get_column_number();
}
/*****************************************************************************
summarize_variable
Purpose:(Provide a string summarizing the attributes of this MonteCarloVariable)
*****************************************************************************/
std::string MonteCarloVariableFile::summarize_variable() const
{
std::ostringstream ss;
ss << MonteCarloVariable::summarize_variable()
<< ", max_skip=" << max_skip << ", is_dependent=" << is_dependent
<< ", filename=" << filename << ", column_number=" << column_number
<< ", first_column_number=" << first_column_number;
return (ss.str());
}