4. FLAME GPU Simulation and Visualisation

4.1. Introduction

The processes of building and running a simulation are described within this chapter, as are a number of tools and procedures which simplify simulation code generation and the compilation of simulation executables. In order to use the FLAME GPU SDK it should be placed in a directory whose path does not contain any spaces (preferably directly within the C: drive or the root of the operating system drive). The host machine must also be running Windows with a copy of the .NET runtime (used by the XSLT template processor) and must contain NVIDIA GPU hardware with compute capability 1.0.

4.2. Generating a Functions File Template

The chapter Summary of Agent Function Arguments described the exact argument order for agent function declarations; however, in most cases it is sensible to use the provided XSLT template (functions.xslt, located in the FLAMEGPU/templates directory within the FLAME GPU SDK) to generate an agent function source file with empty agent function declarations automatically from your XMML model file. Once this has been generated, the agent function scripts can be implemented within the function declarations. If the XMML model file is later modified, care must be taken to update the agent function arguments manually where necessary. Likewise, be careful not to overwrite any existing function source file when generating a new one using the XSLT template. Generation of blank function source files is not incorporated into the Visual Studio template project and must be done manually. A .NET based XSLT processor is provided within the FLAME GPU SDK for this purpose (XSLTProcessor.exe, located in the tools directory) and can be used via the command line as follows (or via the GenerateFunctionsFileTemplate batch file located in the tools directory of the FLAME GPU SDK):

XSLTProcessor.exe XMLModelFile.xml functions.xslt functions.c

Alternatively, any compliant XSLT processor such as Xalan, Unicorn or even the Firefox web browser can be used.

4.3. FLAME GPU Template Files

The FLAME GPU SDK contains a number of XSLT templates which are used to generate the dynamic simulation code. A brief summary of the functionality and contents of each template file is as follows:

  • header.xslt This template file generates a header file which contains any agent and message data structures which are common to many of the other dynamically generated simulation source files. The template also generates function prototypes for simulation functions and for functions which are visible externally within custom C or C++ code.
  • main.xslt This template file generates a source file which defines the main execution entry point function which is responsible for handling command line options and initialising the GPU device.
  • io.xslt This template file generates a source file which contains functions for loading initial agent XML data files (see Initial XML Agent Data) into the simulation and saving the simulation state back into XML format.
  • simulation.xslt This template file generates a source file containing the host side simulation code which includes loading data to and from the GPU device and making a number of CUDA kernel calls which perform the simulation process.
  • FLAMEGPU_kernels.xslt This template file generates a CUDA header file which contains the CUDA kernels and device functions which make up the simulation.
  • visualisation.xslt This template file generates a source file which will allow basic visualisation of the simulation using a sphere based representation of agents in 3D space. The source file is responsible for CUDA OpenGL interoperability and rendering using OpenGL. The source file includes a visualisation.h file containing a number of definitions and variables; this header is not generated by any template and should be specified manually.

4.4. Compilation Using Visual Studio

The FLAME GPU SDK and examples are targeted at a specific CUDA and Visual Studio version. The Visual Studio XML editor includes validation support and XML tag auto completion, which makes defining an XMML model much easier. The following subsections describe the various aspects of a FLAME GPU project file and describe the build process.

4.4.1. Visual Studio Project Build Configurations

The FLAME GPU examples and template project file contain build configurations for 64 bit Windows (x64) environments. Since version 1.3.0, 32 bit Windows is no longer supported due to limitations on GPU memory addressing. The project contains four configurations: debug (Debug) and release (Release) versions of both console based simulation and visualisation simulation. The two debug configurations disable all compiler optimisations, generate debug information for debugging host (non GPU) code and enable CUDA device emulation for GPU (device) debugging. The visualisation configurations enable building of the visualisation code and specify a pre-processor macro (VISUALISATION) which is used by a number of pre-processor conditionals to change the simulation's expected arguments (see Simulation Execution Modes and Options).
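
As a rough illustration of how a pre-processor conditional of this kind might change the expected arguments, the following sketch selects the argument count and usage text at compile time. Only the VISUALISATION macro name comes from the text above; the function names requiredArgumentCount and usageText are hypothetical and are not taken from the generated main source file.

```cpp
#include <string>

// Hypothetical helper, not part of the FLAME GPU generated code.
// Returns how many user-supplied arguments the executable expects.
int requiredArgumentCount()
{
#ifdef VISUALISATION
    return 1;  // visualisation builds: initial agent XML file only
#else
    return 2;  // console builds: initial agent XML file and iteration count
#endif
}

// Hypothetical helper returning the matching usage text.
std::string usageText()
{
#ifdef VISUALISATION
    return "[XML model data] [Optional CUDA arguments]";
#else
    return "[XML model data] [Iterations] [Optional CUDA arguments]";
#endif
}
```

Compiling the same source with VISUALISATION defined (as the visualisation configurations do) switches both functions to the single-argument form without any change to the calling code.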

4.4.2. Visual Studio Project Virtual File Structure

Within the FLAME GPU examples and template projects, code is organised into the following virtual folders:

  • FLAME GPU A folder containing the FLAME GPU XML schemas and code generating templates. These files are shared amongst all examples, so editing them will change the simulation code generated for other projects.
  • FLAMEModel Contains the XMML model file and the agent functions file (usually called functions.c). Note that the functions.c file is excluded from the build process, as it is built by the dynamically generated simulation.cu source file which includes it.
  • Dynamic Code Contains the dynamically generated FLAME GPU simulation code. This code will be overwritten each time the project is built, so any changes to these files will be lost unless template transformation is turned off using the FLAME GPU build rule (see FLAME GPU Build Rule Options).
  • Additional Source Code This folder should contain any hard coded simulation specific source or header files. By default the FLAME GPU project template defines a single visualisation.h file in this folder which may be modified to set a number of variables such as viewing distance and clipping. Within the FLAME GPU examples this folder is typically used to store any model specific visualisation code which replaces the dynamically generated visualisation source file.

The physical folders of the SDK structure are self explanatory; however, it is worth noting that executable files generated by the Visual Studio build process are output to the SDK's bin folder, which also contains the CUDA runtime DLLs.

4.4.3. Build Process

The Visual Studio build process consists of a number of stages which call various tools, compilers and linkers. The first of these is the FLAME GPU build tool (described in more detail in the following section), which generates the dynamic simulation code from the FLAME GPU templates and model file. Following this, the simulation code (within the Dynamic Code folder) is built using the CUDA build rule, which compiles the source files using the NVIDIA CUDA compiler nvcc. Finally, any C or C++ source files are compiled using the MSVC compiler and are then linked with the CUDA object files to produce the executable. To start the build process, select the Build menu followed by Build Solution, or use the F7 hotkey. If the first build step in Visual Studio skips the FLAME GPU build tool, a complete rebuild can be forced by selecting the Build menu followed by Rebuild Solution (or Ctrl + Alt + F7).

4.4.4. FLAME GPU Build Rule Options

The FLAME GPU build rule is configured by selecting the XMML model file properties. Within the build rule, the XSLT options tab (see Figure) allows individual template file transformations to be toggled on or off. These options are configuration specific, and therefore console configurations by default do not process the visualisation template.

FLAME GPU Modelling and Simulation Processes

4.4.5. Visual Studio Launch Configuration Command Arguments

In order to set the execution arguments (described in the next section) for the simulation executable in any one of the four launch configurations, the Command Arguments property can be set from the Project Properties Page (select the Project menu followed by FLAMEGPU_Project Properties). The Command Arguments property is located under Configuration Properties -> Debug (see Agent Function Scripts and the Simulation API). Each configuration has its own set of Command Arguments, so when moving between configurations these will need to be set again. Likewise, the Configuration Properties are computer and user specific, so these cannot be preset and must be specified the first time each example is compiled and run. The Visual Studio macro $InputDir can be used to specify the working directory of the project file, which makes locating initial agent data XML files for many of the examples much easier (these are normally located in the iterations folders of each example).

Once the Command Arguments have been set, the simulation executable can be launched by selecting Start Debugging from the Debug menu or by using the F5 hotkey (this is the same in both release and debug launch configurations).

FLAME GPU Project Properties Page

4.5. Compilation using Make (for Linux)

Linux compilation is controlled using make, with makefiles provided for each example.

  1. Install Ubuntu 16.04 or later.
  2. Install all the needed build tools and libraries:
sudo apt-get install g++ git make libxml2-utils

Minimum supported versions are g++ 4.8 and CUDA 7.5.

  3. Download the FLAME GPU SDK release or alternatively clone the project using Git (it will be cloned into the folder FLAMEGPU):
git clone https://github.com/FLAMEGPU/FLAMEGPU.git
  4. Build the SDK in Release mode (this is the default mode):
cd FLAMEGPU/examples
make

This will process the XML model and build both the console and visualisation versions of the model in release mode. You can build the Debug version by specifying the dbg value on the make command line instead (make all dbg=1). Moreover, for each example, executables can also be built in either Visualisation (make Visualisation_mode) or Console (make Console_mode) mode.

cd examples/{example name}
make XSLTPREP
make Visualisation_mode
# or
make Console_mode

Replace {example name} with the name of the specific example you wish to build.

  5. After building the executables, run the examples by executing the relevant bash script inside the "bin/linux-x64" folder:
  • Visualisation mode: ./*_vis.sh
  • Console mode: ./*_console.sh iter='arg'

Note: XML output is disabled but can be re-enabled by setting the XML_OUTPUT definition in the automatically generated src/dynamic/main.cu file to 1. After rebuilding and running the simulation again, an XML file is created (saved in the location of the initial input file) for each iteration, containing the state of the agents after applying a single simulation iteration (in the same format as 0.xml). You can view this file (for example with the cat command) to see how the agent properties have changed.

The parameters passed to the simulation are the initial model file and the number of simulation runs (iterations). By default, the number of iterations is set to 1. To modify the number of iterations, pass an argument to the shell script (e.g. iter=50).

  6. Debugging examples:
cd examples/{folder name}
make Console_mode dbg=1
  • Debugging with cuda-gdb
cuda-gdb ../../bin/x64/Debug_Console/{folder name}_console
..
(cuda-gdb) run iterations/0.xml 2
...
  • Debugging with valgrind
valgrind --tool=memcheck {executable} iterations/0.xml 1

where {executable} is ../../bin/x64/Debug_Console/{folder name}_console.

  7. Clean generated dynamic and object files with make clobber. Note that you need to use make XSLTPREP to generate the .cu files first, then build a specific target (console or visualisation mode). make all generates the dynamic files as well as building the executables, whereas make clean only deletes the object files and leaves the .cu files behind.
  8. For more details on how to build specific targets for each example, run make help.

4.6. Simulation Execution Modes and Options

FLAME GPU simulations require a number of arguments, which depend on whether the executable was built for console or visualisation mode. Both are described in the following subsections.

4.6.1. Console Mode

Simulation executables built for console execution require two arguments (usage shown below). The first is the location of an initial agent XML file containing the initial agent data. The second is the number of simulation iterations which should be processed. A number of optional CUDA arguments may also be passed if required (e.g. device=1 to specify that the second CUDA enabled GPU device within the host machine should be used for simulation).

FLAMEGPU_simulation.exe [XML model data] [Iterations] [Optional CUDA arguments]

The result of running the simulation will be a number of output XML files numbered from 1 to n, where n is the number of simulation iterations specified by the Iterations argument. It is possible to turn XML output on or off by changing the definition of the OUTPUT_TO_XML macro located within the main.xslt template to true (1) or false (0).
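
The argument layout described above can be sketched as a small parser. This is a minimal illustration only: the ConsoleOptions type, the parseConsoleArguments function and the default device index of 0 are assumptions made for this example and do not reflect the code generated by main.xslt. Only the argument order and the device=N form come from the text above.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical container for the parsed options; not a FLAME GPU type.
struct ConsoleOptions {
    std::string inputFile;  // initial agent XML file
    int iterations = 0;     // number of simulation iterations
    int device = 0;         // CUDA device index (default of 0 is an assumption)
    bool valid = false;     // true only if both mandatory arguments parsed
};

// Parses: program [XML model data] [Iterations] [device=N]
ConsoleOptions parseConsoleArguments(int argc, const char* const argv[])
{
    ConsoleOptions options;
    if (argc < 3) {
        return options;  // both mandatory arguments are required
    }
    options.inputFile = argv[1];

    char* end = nullptr;
    const long iterations = std::strtol(argv[2], &end, 10);
    if (end == argv[2] || *end != '\0' || iterations < 1) {
        return options;  // iteration count must be a positive integer
    }
    options.iterations = static_cast<int>(iterations);

    // Scan any remaining arguments for the optional device selection.
    const char prefix[] = "device=";
    const std::size_t prefixLength = sizeof(prefix) - 1;
    for (int i = 3; i < argc; ++i) {
        if (std::strncmp(argv[i], prefix, prefixLength) == 0) {
            options.device = std::atoi(argv[i] + prefixLength);
        }
    }
    options.valid = true;
    return options;
}
```

Calling parseConsoleArguments with the arguments iterations/0.xml 10 device=1 would, under these assumptions, yield ten iterations on the second CUDA device, while omitting the iteration count leaves the result marked as invalid.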

4.6.2. Visualisation Mode

Simulation executables built for visualisation require only a single argument (usage shown below), which is the same as the first argument for console execution (an initial agent XML file). The number of simulation iterations is not required, as the simulation will run indefinitely until the visualisation is closed. As with console execution, it is possible to specify optional CUDA arguments.

Usage: main [XML model data] [Optional CUDA arguments]

Many of the options for the default visualisation are contained within the visualisation.h header file and include the following:

  • SIMULATION_DELAY Many simulations are executed extremely quickly making visualisation a blur. This definition allows an artificial delay by executing this number of visualisation draw loops before each simulation iteration is processed.
  • WINDOW_WIDTH and WINDOW_HEIGHT Specify the size of the visualisation window.
  • NEAR_CLIP and FAR_CLIP Specify the near and far clipping planes used for OpenGL rendering.
  • SPHERE_SLICES The number of slices used to create the sphere geometry representing a single agent in the visualisation.
  • SPHERE_STACKS The number of stacks used to create the sphere geometry representing a single agent in the visualisation.
  • SPHERE_RADIUS The physical size of the sphere geometry representing a single agent in the visualisation. This will need to be a sensible value which corresponds with the environment size and agent locations within your model/simulation.
  • VIEW_DISTANCE The camera viewing distance. Again this will need to be a sensible value which corresponds with the environment size and agent locations within your model/simulation.
  • LIGHT_POSITION The visualisation will contain a single light source which will be located at this position.
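
To make the options above concrete, a visualisation.h might look something like the following sketch. The macro names are those listed above, but every value is a placeholder chosen for illustration, and the four-component form of LIGHT_POSITION is an assumption; sensible values depend on the environment size and agent locations of the model in question.

```cpp
// Sketch of a visualisation.h; all values are placeholder assumptions.
#ifndef VISUALISATION_H_SKETCH
#define VISUALISATION_H_SKETCH

#define SIMULATION_DELAY 1       // draw loops executed before each simulation iteration
#define WINDOW_WIDTH     800     // window size in pixels
#define WINDOW_HEIGHT    600
#define NEAR_CLIP        0.1f    // near clipping plane distance
#define FAR_CLIP         300.0f  // far clipping plane distance
#define SPHERE_SLICES    8       // subdivisions around each agent sphere
#define SPHERE_STACKS    8       // subdivisions along each agent sphere
#define SPHERE_RADIUS    0.01f   // should suit the scale of the environment
#define VIEW_DISTANCE    2.0f    // should suit the scale of the environment
#define LIGHT_POSITION   {0.0f, 0.0f, 2.0f, 1.0f}  // assumed x, y, z, w layout

// Compile-time sanity checks on the placeholder values.
static_assert(NEAR_CLIP > 0.0f && NEAR_CLIP < FAR_CLIP,
              "near clipping plane must lie in front of the far plane");
static_assert(VIEW_DISTANCE > NEAR_CLIP && VIEW_DISTANCE < FAR_CLIP,
              "camera distance should fall between the clipping planes");
static_assert(SIMULATION_DELAY >= 1, "at least one draw loop per iteration");

#endif  // VISUALISATION_H_SKETCH
```

The static_assert lines are not part of the options themselves; they are included here only as one way to catch inconsistent values at compile time.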

4.7. Creating a Custom Visualisation

Customised visualisation can easily be integrated into a FLAME GPU project by extending the automatically generated visualisation file (the output of processing visualisation.xslt). Note: When doing this within Visual Studio it is important to turn off the template processing of the visualisation.xslt file in each of the launch configurations, as processing it will overwrite any custom code! Many of the FLAME GPU SDK examples use customised visualisations in this way. As with the default visualisation, any custom visualisation must implement the following functions, whose prototypes are declared in the automatically generated simulation header.

 extern "C" void initVisualisation();

 extern "C" void runVisualisation();

The first of these can be used to initialise any OpenGL memory and CUDA OpenGL bindings as well as displaying the user interface. The second of these functions must take control of the simulation by repeatedly calling the draw and singleIteration (which advances the simulation by a single iteration step) functions in a recursive loop. A more detailed description of the default rendering technique is provided within other FLAME GPU documentation (listed in Purpose of This Document).
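
The control flow of these two functions can be sketched as follows. This is not the generated visualisation code: draw and singleIteration are replaced by stand-ins that only count calls, windowOpen is a hypothetical substitute for the window's close event, and a plain while loop stands in for whatever main loop the windowing toolkit provides. The SIMULATION_DELAY value is likewise assumed; it would normally come from visualisation.h.

```cpp
#ifndef SIMULATION_DELAY
#define SIMULATION_DELAY 2  // assumed value for this sketch
#endif

// Call counters so the sketch can be exercised without a GPU or a window.
static int drawCalls = 0;
static int iterationCalls = 0;
static bool initialised = false;

// Stand-in for the rendering routine of a real visualisation.
void draw() { ++drawCalls; }

// Stand-in for the generated function that advances the simulation one step.
void singleIteration() { ++iterationCalls; }

// Hypothetical replacement for "the window has not been closed yet":
// reports an open window for six frames, then reports it closed.
static int framesRemaining = 6;
bool windowOpen() { return framesRemaining-- > 0; }

extern "C" void initVisualisation()
{
    // A real implementation would create the window, allocate OpenGL
    // buffers and register them with CUDA here.
    initialised = true;
}

extern "C" void runVisualisation()
{
    int drawsSinceStep = 0;
    while (windowOpen()) {
        draw();
        // Advance the simulation once every SIMULATION_DELAY draw loops.
        if (++drawsSinceStep >= SIMULATION_DELAY) {
            singleIteration();
            drawsSinceStep = 0;
        }
    }
}
```

With the assumed delay of 2 and six frames before the stand-in window closes, calling initVisualisation followed by runVisualisation draws six times and advances the simulation three times.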

4.8. Performance Tips

The GPU offers some enormous performance advantages for agent simulation over more traditional CPU based alternatives. Even so, it is possible to write extremely sub optimal code which will reduce performance. The following is a list of performance tips for creating FLAME GPU model files:

General Usage of FLAME GPU

  • FLAME GPU is optimal where there are very large numbers of relatively simple agents which can be parallelised.
  • Populations of agents with very low numbers will perform poorly (in extreme cases slower than if they were simulated using the CPU). If you require an agent population with very few agents consider writing some custom CPU simulation code and transferring any important information into simulation constants to be read by larger agent populations during the FLAME GPU simulation step.
  • Outputting information to disk (XML files) is painfully slow in comparison with simulation speeds so consider outputting information visually or only after larger numbers of simulation iterations.

Model Specification

  • Minimise the number of variables within agent and message data where possible.
  • Try to conceptualise and fully specify the model before completing the agent functions script to avoid making mistakes with agent function arguments. Try to think in terms of X-Machine agents!

Agent Function Scripting

  • Small compute intensive agent functions are more efficient than functions which only iterate messages. Try to minimise the number of times message lists are iterated.
  • Keep agent functions small and do not define more local variables than is strictly required. Reuse local variables where possible if they are no longer needed and before they go out of scope.

Message Iteration

  • For small populations of agents (generally fewer than 2000, but dependent on hardware and the model), non partitioned messaging has less overhead and performs comparably to spatially partitioned messaging.
  • For large populations of distributed agents with limited communication, spatially partitioned message communication will be much faster.