Running Multiple Simulations

FLAME GPU 2 provides CUDAEnsemble as a facility for executing batch runs of multiple configurations of a model.

Creating a CUDAEnsemble

An ensemble is a group of simulations executed in batch, optionally using all available GPUs. To use an ensemble, construct a RunPlanVector and CUDAEnsemble instead of a CUDASimulation.

First you must define a model as usual, followed by creating a CUDAEnsemble:

flamegpu::ModelDescription model("example model");

// Fully define the model

// Create a CUDAEnsemble
flamegpu::CUDAEnsemble ensemble(model);
// Handle any runtime args
ensemble.initialise(argc, argv);

Creating a RunPlanVector

RunPlanVector is a data structure which can be used to build run configurations, specifying; simulation speed, steps and initialising environment properties. These are a std::vector of RunPlan (which was introduced in the previous chapter), with some additional methods included to enable easy configuration of batches of runs.

Operations performed on the vector, will apply to all elements, whereas individual elements can also be updated directly.

It is also possible to specify subdirectories for a particular runs’ logging output to be sent to, this can be useful when constructing large batch runs or parameter sweeps:

// Create a template run plan
flamegpu::RunPlanVector runs_control(model, 128);
// Ensure that repeated runs use the same Random values within the RunPlans
{  // Initialise values across the whole vector
    // All runs require 3600 steps
    // Random seeds for each run should take the values (12, 13, 14, 15, etc)
    runs_control.setRandomSimulationSeed(12, 1);
    // Initialise environment property 'lerp_float' with values uniformly distributed between 1 and 256
    runs_control.setPropertyLerpRange<float>("lerp_float", 1.0f, 256.0f);
    // Initialise environment property 'step_int' with values 0, 2, 4, 6, 8, etc
    runs_control.setPropertyStep<int>("step_int", 0, 2);

    // Initialise environment property 'random_int' with values uniformly distributed in the range [0, 10]
    runs_control.setPropertyUniformRandom<int>("random_int", 0, 10);
    // Initialise environment property 'random_float' with values from the normal dist (mean: 1, stddev: 2)
    runs_control.setPropertyNormalRandom<float>("random_float", 1.0f, 2.0f);
    // Initialise environment property 'random_double' with values from the log normal dist (mean: 2, stddev: 1)
    runs_control.setPropertyLogNormalRandom<double>("random_double", 2.0, 1.0);

    // Initialise environment property array 'int_array_3' with [1, 3, 5]
    runs_control.setProperty<int, 3>("int_array_3", {1, 3, 5});

    // Iterate vector to manually assign properties
    for (RunPlan &plan:runs_control) {
        // e.g. manually set all 'manual_float' to 32
        plan.setProperty<float>("manual_float", 32.0f);
// Create an empty RunPlanVector, that we will construct by mutating and copying runs_control several times
flamegpu::RunPlanVector runs(model, 0);
for (const float &mutation : {0.2f, 0.5f, 0.8f, 1.5f, 1.9f, 2.5f}) {
    // Dynamically generate a name for mutation sub directory
    char subdir[24];
    sprintf(subdir, "mutation_%g", mutation);
    // Fill in specialised parameters
    runs_control.setProperty<float>("mutation", mutation);
    // Append to the main run plan vector
    runs += runs_control;

Creating a Logging Configuration

Next you need to decide which data will be collected, as it is not possible to export full agent states from a CUDAEnsemble.

A short example is shown below, however you should refer to the previous chapter for the comprehensive guide.

One benefit of using CUDAEnsemble to carry out experiments, is that the specific RunPlan data is included in each log file, allowing them to be automatically processed and used for reproducible research. However, this does not identify the particular version or build of your model.

Agent data is logged according to agent state, so agents with multiple states must have the config specified for each state required to be logged.

// Specify the desired LoggingConfig or StepLoggingConfig
flamegpu::StepLoggingConfig step_log_cfg(model);
    // Log every step (not available to LoggingConfig, for exit logs)
    // Include the current number of 'boid' agents, within the 'default' state
    // Include the current mean speed of 'boid' agents, within the 'alive' state
    step_log_cfg.agent("boid", "alive").logMean<float>("speed");
flamegpu::LoggingConfig exit_log_cfg(model);

// Pass the logging configs to the CUDAEnsemble

Configuring & Running the Ensemble

Now you can execute the CUDAEnsemble from the command line, using the below parameters, it will execute the runs and log the collected data to file.

Long Argument

Short Argument




Print the command line guide and exit.

--devices <device ids>

-d <device ids>

Comma separated list of GPU ids to be used to execute the ensemble. By default all devices will be used.

--concurrent <runs>

-c <runs>

The number of concurrent simulations to run per GPU. By default 4 concurrent simulations will run per GPU.

--out <directory> <format>

-o <directory> <format>

Directory and format (JSON/XML) for ensemble logging.



Don’t print ensemble progress to console.



Print config, progress and timing (-t) information to console.



Output timing information to console at exit.



Silence warnings for unknown arguments passed after this flag.


-e <error level>

The ErrorLevel to use: 0, 1, 2, “off”, “slow” or “fast”. By default the ErrorLevel will be set to “slow” (1).


Allow the operating system to enter standby during ensemble execution. The standby blocking feature is currently only supported on Windows, where it is enabled by default.

You may also wish to specify your own defaults, by setting the values prior to calling initialise():

// Fully declare a ModelDescription, RunPlanVector and LoggingConfig/StepLoggingConfig

// Create a CUDAEnsemble to run the RunPlanVector
flamegpu::CUDAEnsemble ensemble(model);

// Override config defaults
ensemble.Config().out_directory = "results";
ensemble.Config().out_format = "json";
ensemble.Config().concurrent_runs = 1;
ensemble.Config().timing = true;
ensemble.Config().error_level = CUDAEnsemble::EnsembleConfig::Fast;
ensemble.Config().devices = {0};

// Handle any runtime args
// If this is instead performed before overriding defaults, overridden args will be ignored from command line
ensemble.initialise(argc, argv);

// Pass the logging configs to the CUDAEnsemble

// Execute the ensemble using the specified RunPlans
const unsigned int errs = ensemble.simulate(runs);

Error Handling Within Ensembles

CUDAEnsemble has three supported levels of error handling.






Runs which fail do not cause an exception to be raised.



If any runs fail, an EnsembleError will be raised after all runs have been attempted.



An EnsembleError will be raised as soon as a failed run is detected, cancelling remaining runs.

The default error level is “Slow” (1), which will cause an exception to be raised if any of the simulations fail to complete. However, all simulations will be attempted first, so partial results will be available.

Alternatively, calls to simulate() return the number of errors, when the error level is set to “Off” (0). Therefore, failed runs can be probed manually via checking that the return value of simulate() does not equal zero.