# Monte Carlo and Optimization
Monte Carlo is the process of repeatedly running a simulation over a set of predetermined or auto-generated inputs.
Trick's Monte Carlo capability is designed to run distributed across a network.
## Structure
Monte Carlo follows a "master/slave" model. The master is in charge of creating slaves and
tasking them with work. There may be any number of slaves distributed over a network. Master and slaves communicate through
sockets. In principle, a master and a slave need not share a file system. Each slave is responsible for requesting
work, accomplishing work, and reporting results. The work at hand is running a single simulation iteratively over an input space.
### The Master
The master is the command center of a Monte Carlo simulation. It tasks slaves to run the simulation, in parallel, with a given set of inputs, and it is responsible for keeping the slaves as busy as possible. To keep things running smoothly, the master reassigns work when a slave is either dead or running too slowly. The master is only in charge of tasking work; it does not run the simulation itself. It continues issuing work to the slaves until it is satisfied that all simulation runs are complete.
The master's life cycle consists of the following:
- Initialize
- While there are unresolved runs:
    - Spawn any uninitialized slaves.
    - Dispatch runs to ready slaves.
    - Resolve run based on slave's exit status.
    - Receive results from finished slave's child.
    - Check for timeouts.
- Shut down the slaves and terminate.
@see Trick::MonteCarlo
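
The master's life cycle can be sketched as a small single-process simulation. This is an illustration only, not Trick's implementation: `run_master`, the tick-based clock, and the rule that a timed-out slave is retired for good are assumptions made for the example, and slave spawning and socket traffic are left out. Each slave is modeled as a hypothetical function that returns how many ticks a run takes, or `None` if the slave never answers.

```python
from collections import deque


def run_master(num_runs, slaves, timeout=3):
    """Dispatch runs to slaves until every run is resolved.

    `slaves` maps a slave name to a function run -> ticks needed,
    or None when the slave is dead (hypothetical model).
    Returns a dict mapping each run number to the slave that completed it.
    """
    pending = deque(range(num_runs))  # runs waiting to be dispatched
    busy = {}                         # slave name -> (run, ticks elapsed)
    retired = set()                   # slaves that timed out
    resolved = {}                     # run -> slave that completed it

    while len(resolved) < num_runs:
        # Dispatch runs to ready slaves.
        for name in slaves:
            if name not in busy and name not in retired and pending:
                busy[name] = (pending.popleft(), 0)
        if not busy:
            raise RuntimeError("no live slaves left to finish the runs")

        # Advance the clock by one tick, then resolve runs and check timeouts.
        for name in list(busy):
            run, elapsed = busy[name]
            elapsed += 1
            needed = slaves[name](run)
            if needed is not None and elapsed >= needed:
                resolved[run] = name      # run finished normally
                del busy[name]
            elif elapsed >= timeout:
                pending.append(run)       # reassign the run to another slave
                retired.add(name)
                del busy[name]
            else:
                busy[name] = (run, elapsed)
    return resolved
```

With one healthy slave and one dead slave, the run first given to the dead slave is put back in the queue after the timeout and completed by the healthy one, so every run still gets resolved.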
### Slaves
A slave consists of a parent and fork()ed children. A slave parent spawns a child using the fork() system call. A
slave child runs the simulation in its own address space. Only one child exists at a time in a slave. Per slave,
simulation execution is sequential.
A slave is responsible for requesting work from the master, running a Trick simulation with inputs given by the master,
dumping recorded data to disk and informing the master when it is finished running its task.
The slave's life cycle consists of the following:
- Initialize
- Connect to and inform the master of the port over which the slave is listening for dispatches.
- Until the connection to the master is lost or the master commands a shutdown:
    - Wait for a new dispatch.
    - Process the dispatch.
    - Slave fork()s child.
    - Child runs simulation with varied input.
    - Write the run number processed to the master at child shutdown.
    - Write the exit status to the master.
- Run the shutdown jobs and terminate.
@see Trick::MonteSlave
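
The parent/child arrangement can be sketched as follows, assuming a POSIX system where `os.fork()` is available. The names `process_dispatch` and `slave_loop` are hypothetical, `None` stands in for the master's shutdown command, and the reports that a real slave would write to the master over a socket are simply returned as a list.

```python
import os


def process_dispatch(run_number, simulate):
    """Fork a child to run one simulation; return (run_number, exit_status)."""
    pid = os.fork()
    if pid == 0:
        # Child: run the simulation in its own address space, then exit
        # immediately so it never falls back into the parent's code path.
        status = 1
        try:
            status = int(simulate(run_number))
        except Exception:
            status = 1
        finally:
            os._exit(status)

    # Parent: wait for the single child, which keeps execution sequential.
    _, wait_status = os.waitpid(pid, 0)
    if os.WIFEXITED(wait_status):
        return run_number, os.WEXITSTATUS(wait_status)
    # Child was killed by a signal; report it as a negative status.
    return run_number, -os.WTERMSIG(wait_status)


def slave_loop(dispatches, simulate):
    """Process dispatches one at a time until a shutdown (None) arrives."""
    reports = []
    for run_number in dispatches:
        if run_number is None:  # hypothetical shutdown command
            break
        reports.append(process_dispatch(run_number, simulate))
    return reports
```

Because the parent blocks in `waitpid` before taking the next dispatch, only one child exists at a time, which matches the sequential, one-child-per-slave behavior described above.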
## Simulation Inputs
The goal of Monte Carlo is to run the simulation over a set of inputs. The inputs that the master passes to the slaves
are either generated by a statistical algorithm or they are hard-coded by the user in a data file. Inputs may also be
generated exclusively by user calculations.
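
The three input sources can be illustrated with a minimal sketch. The function names, the whitespace-separated file format, and the `#` comment syntax are assumptions for the example and do not reflect Trick's actual data file format or API.

```python
import random


def random_inputs(num_runs, mean, sigma, seed=0):
    """Inputs generated by a statistical algorithm (Gaussian here)."""
    rng = random.Random(seed)  # seeded so a suite of runs is repeatable
    return [rng.gauss(mean, sigma) for _ in range(num_runs)]


def file_inputs(text):
    """Inputs hard-coded in a data file: one run per line, '#' starts a comment."""
    rows = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            rows.append([float(token) for token in line.split()])
    return rows


def calculated_inputs(num_runs, func):
    """Inputs produced exclusively by a user calculation on the run number."""
    return [func(run) for run in range(num_runs)]
```

For example, `calculated_inputs(3, lambda run: run * 10)` yields `[0, 10, 20]`, one value per run.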
## Monte Carlo Output
For each Monte Carlo suite of runs, a directory called "MONTE_<name>" is created. Slave output
is directed to this "MONTE_" directory. Trick recorded data is written in a set of "RUN_" directories, one per simulation run, within the parent
"MONTE_" directory. Along with recorded data, stdout, stderr, and send_hs files are written. A file that contains the
summary of all runs is written to the "MONTE_" directory.
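
The directory layout can be pictured with a short sketch that builds it in a scratch directory. Several details here are assumptions rather than facts from this document: the zero-padded `RUN_` numbering, the placement of the stdout, stderr, and send_hs files inside each `RUN_` directory, and the summary file name `summary.txt`.

```python
import tempfile
from pathlib import Path


def create_monte_layout(base, name, num_runs):
    """Create MONTE_<name> with one RUN_ directory per run; return its path."""
    monte_dir = Path(base) / f"MONTE_{name}"
    monte_dir.mkdir(parents=True, exist_ok=True)

    summary_lines = []
    for run in range(num_runs):
        run_dir = monte_dir / f"RUN_{run:05d}"  # padding width is an assumption
        run_dir.mkdir(exist_ok=True)
        for file_name in ("stdout", "stderr", "send_hs"):
            (run_dir / file_name).touch()       # empty placeholder files
        summary_lines.append(f"{run} {run_dir.name}")

    # Hypothetical summary file listing every run in the suite.
    (monte_dir / "summary.txt").write_text("\n".join(summary_lines) + "\n")
    return monte_dir


# Usage: build a three-run layout in a scratch directory.
scratch = tempfile.mkdtemp()
monte_dir = create_monte_layout(scratch, "RUN_example", 3)
```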
### Data Processing
trick_dp is designed to understand "MONTE_" directories. When asked to plot a "MONTE_" directory, trick_dp
overlays all curves from each "RUN_" directory within the parent "MONTE_" directory. The plot widget has built-in
features that let the developer tell which curve belongs to which simulation run.
### Optimization
Optimization is made possible by a framework in which the developer can change simulation inputs based on
simulation results. Trick offers a set of job classes that let the developer enter the Monte Carlo loop and
thereby make decisions on the fly. Trick provides no canned optimization algorithm.
This special set of job classes works in concert across the master and the slaves. Trick schedules jobs within the master
at critical points so that they may create inputs to send to a slave as well as receive results from the slave.
Slave jobs are scheduled to receive simulation inputs from the master as well as send simulation results back to the
master.
The developer writes these jobs and specifies them in the S_define.
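
The feedback loop between results and inputs can be sketched with a toy one-dimensional minimization. Everything named here is hypothetical: `master_pre` and `master_post` stand in for developer-written master jobs, `slave_run` stands in for a full simulation run on a slave, and the step-halving search is just one simple strategy chosen for the example, not something Trick supplies.

```python
class StepHalvingOptimizer:
    """Hypothetical master-side jobs for a 1-D minimization."""

    def __init__(self, start, step):
        self.best_x = start
        self.best_cost = None
        self.step = step
        self.candidate = start

    def master_pre(self):
        """Create the input for the next run (before dispatch)."""
        return self.candidate

    def master_post(self, x, cost):
        """Receive a run's result and decide where to look next."""
        if self.best_cost is None or cost < self.best_cost:
            self.best_x, self.best_cost = x, cost
        else:
            # No improvement: reverse direction and halve the step.
            self.step = -self.step / 2
        self.candidate = self.best_x + self.step


def slave_run(x):
    """Stand-in for a simulation run; returns a cost to minimize."""
    return (x - 3.0) ** 2


def optimize(num_runs=40):
    """Run the pre/run/post loop sequentially and return the best input."""
    optimizer = StepHalvingOptimizer(start=0.0, step=1.0)
    for _ in range(num_runs):
        x = optimizer.master_pre()
        cost = slave_run(x)
        optimizer.master_post(x, cost)
    return optimizer.best_x
```

The loop here is sequential for clarity. In a distributed setting the same two hooks would bracket each dispatch, with the cost travelling back from the slave to the master.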