Class CUDAScatter

Nested Relationships

Nested Types

Class Documentation

class CUDAScatter

Singleton class for performing generic scatters. This is used for optional messages, agent death, and agent birth.

Public Types

typedef CUDAScanCompaction::Type Type

Flag used to decide which scan_flag array should be used

See also

CUDAScanCompaction::Type

Public Functions

inline CUDAScanCompaction &Scan()
inline detail::CubTemporaryMemory &CubTemp(const unsigned int streamId)
unsigned int scatter(unsigned int streamResourceId, cudaStream_t stream, const Type &messageOrAgent, const VariableMap &vars, const std::map<std::string, void*> &in, const std::map<std::string, void*> &out, unsigned int itemCount, unsigned int out_index_offset = 0, bool invert_scan_flag = false, unsigned int scatter_all_count = 0)

Convenience wrapper for scatter(). Scatters agents from SoA to SoA according to the d_position flag. Used for device agent creation and agent death. CUDAScanCompaction::scan_flag is used to decide which items should be scattered. CUDAScanCompaction::position is used to decide where to scatter them to.

Note

This overload is deprecated; it is unclear whether it is still used.

Parameters:
  • streamResourceId – The stream index to use for accessing stream specific resources such as scan compaction arrays and buffers

  • stream – CUDA stream to be used for async CUDA operations

  • messageOrAgent – Flag of whether message or agent CUDAScanCompaction arrays should be used

  • vars – Variable description map from ModelData hierarchy

  • in – Input variable name:ptr map

  • out – Output variable name:ptr map

  • itemCount – Total number of items in input array to consider

  • out_index_offset – The offset to be applied to the output index (e.g. if out already contains data)

  • invert_scan_flag – If true, agents with scan_flag set to 0 will be moved instead

  • scatter_all_count – The number of agents at the start of in to be copied unconditionally; agents after this use scan_flag

unsigned int scatter(unsigned int streamResourceId, cudaStream_t stream, const Type &messageOrAgent, const std::vector<ScatterData> &scatterData, unsigned int itemCount, unsigned int out_index_offset = 0, bool invert_scan_flag = false, unsigned int scatter_all_count = 0)

Scatters agents from SoA to SoA according to the d_position flag. Used for device agent creation and agent death. CUDAScanCompaction::scan_flag is used to decide which items should be scattered. CUDAScanCompaction::position is used to decide where to scatter them to.

Parameters:
  • streamResourceId – The stream index to use for accessing stream specific resources such as scan compaction arrays and buffers

  • stream – CUDA stream to be used for async CUDA operations

  • messageOrAgent – Flag of whether message or agent CUDAScanCompaction arrays should be used

  • scatterData – Vector of scatter configuration for each variable to be scattered

  • itemCount – Total number of items in input array to consider

  • out_index_offset – The offset to be applied to the output index (e.g. if out already contains data)

  • invert_scan_flag – If true, agents with scan_flag set to 0 will be moved instead

  • scatter_all_count – The number of agents at the start of in to be copied unconditionally; agents after this use scan_flag
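The flag-and-position behaviour described for scatter() can be sketched on the host for a single variable. This is an illustrative CPU reference written for this page, not the library's kernel: the name scatter_reference, the use of std::vector, and the running counter that stands in for CUDAScanCompaction::position are all assumptions. Only the parameter names are taken from the documentation above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference for one variable of type T.
// scan_flag holds one entry per item after the first scatter_all_count items.
// Returns the number of items written to out.
template <typename T>
unsigned int scatter_reference(const std::vector<T> &in,
                               std::vector<T> &out,
                               const std::vector<unsigned int> &scan_flag,
                               unsigned int itemCount,
                               unsigned int out_index_offset = 0,
                               bool invert_scan_flag = false,
                               unsigned int scatter_all_count = 0) {
    assert(itemCount <= in.size());
    assert(scatter_all_count <= itemCount);
    assert(scan_flag.size() >= itemCount - scatter_all_count);
    unsigned int written = 0;
    for (unsigned int i = 0; i < itemCount; ++i) {
        bool keep = true;  // the leading scatter_all_count items are always copied
        if (i >= scatter_all_count) {
            const bool flag = scan_flag[i - scatter_all_count] != 0;
            keep = invert_scan_flag ? !flag : flag;
        }
        if (keep) {
            // 'written' plays the role of the exclusive-scan position.
            out.at(out_index_offset + written) = in[i];
            ++written;
        }
    }
    return written;
}
```

With flags {1, 0, 1, 0, 1} and out_index_offset = 2, items 0, 2 and 4 land at output indices 2, 3 and 4; setting invert_scan_flag moves items 1 and 3 instead.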

void scatterPosition_async(unsigned int streamResourceId, cudaStream_t stream, Type messageOrAgent, const std::vector<ScatterData> &scatterData, unsigned int itemCount)

Scatters agents from SoA to SoA according to the d_position flag as input_source; all variables are scattered. Used for host function agent sorting. CUDAScanCompaction::position is used to decide where to scatter to.

Parameters:
  • streamResourceId – The stream index to use for accessing stream specific resources such as scan compaction arrays and buffers

  • stream – CUDA stream to be used for async CUDA operations

  • messageOrAgent – Flag of whether message or agent CUDAScanCompaction arrays should be used

  • scatterData – Vector of scatter configuration for each variable to be scattered

  • itemCount – Total number of items in input array to consider

void scatterPosition_async(unsigned int streamResourceId, cudaStream_t stream, unsigned int *position, const std::vector<ScatterData> &scatterData, unsigned int itemCount)

Scatters agents from SoA to SoA according to the d_position flag as input_source; all variables are scattered. Used for host function agent sorting. CUDAScanCompaction::position is used to decide where to scatter to.

Parameters:
  • streamResourceId – The stream index to use for accessing stream specific resources such as scan compaction arrays and buffers

  • stream – CUDA stream to be used for async CUDA operations

  • position – Buffer containing indexes to move data from

  • scatterData – Vector of scatter configuration for each variable to be scattered

  • itemCount – Total number of items in input array to consider
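Reading the position parameter as "indexes to move data from", the reorder can be pictured as a gather. The sketch below is a host-side illustration for one variable under that reading; reorder_by_position is a hypothetical name and is not part of the class.

```cpp
#include <cassert>
#include <vector>

// Hypothetical host-side reference: position[i] names the input index
// whose value should end up at output index i.
template <typename T>
std::vector<T> reorder_by_position(const std::vector<T> &in,
                                   const std::vector<unsigned int> &position,
                                   unsigned int itemCount) {
    assert(itemCount <= in.size());
    assert(itemCount <= position.size());
    std::vector<T> out(itemCount);
    for (unsigned int i = 0; i < itemCount; ++i) {
        out[i] = in.at(position[i]);
    }
    return out;
}
```

For example, sorting the keys {30, 10, 20} yields the position buffer {1, 2, 0}, and applying it to every variable keeps the variables aligned with the sorted keys.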

void scatterPosition(unsigned int streamResourceId, cudaStream_t stream, Type messageOrAgent, const std::vector<ScatterData> &scatterData, unsigned int itemCount)
unsigned int scatterCount(unsigned int streamResourceId, cudaStream_t stream, const Type &messageOrAgent, unsigned int itemCount, unsigned int scatter_all_count = 0)

Returns the final CUDAScanCompaction::position item. This is the same value that scatter() returns, minus scatter_all_count.

Parameters:
  • streamResourceId – The stream index to use for accessing stream specific resources such as scan compaction arrays and buffers

  • stream – CUDA stream to be used for async CUDA operations

  • messageOrAgent – Flag of whether message or agent CUDAScanCompaction arrays should be used

  • itemCount – Total number of items in input array to consider

  • scatter_all_count – The offset into the array at which the scan began

unsigned int scatterAll(unsigned int streamResourceId, cudaStream_t stream, const std::vector<ScatterData> &scatterData, unsigned int itemCount, unsigned int out_index_offset = 0)

Scatters a contiguous block from SoA to SoA. CUDAScanCompaction::scan_flag/position are not used.

Note

Calling scatter() with itemCount == scatter_all_count has the same effect.

Parameters:
  • streamResourceId – The stream index to use for accessing stream specific resources such as scan compaction arrays and buffers

  • stream – CUDA stream to be used for async CUDA operations

  • scatterData – Vector of scatter configuration for each variable to be scattered

  • itemCount – Total number of items in input array to consider

  • out_index_offset – The offset to be applied to the output index (e.g. if out already contains data)
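A contiguous scatter over several variables can be sketched on the host as one block copy per variable. HostScatterData below is a hypothetical stand-in that mirrors the typeLen/in/out members of ScatterData, and the return value (the number of items copied) is an assumption of this sketch rather than documented behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical host-side stand-in for ScatterData: one entry per variable.
struct HostScatterData {
    std::size_t typeLen;  // size in bytes of one element of this variable
    const char *in;       // input array for this variable
    char *out;            // output array for this variable
};

// Copies the first itemCount elements of every variable to the output,
// starting at element out_index_offset. Returns the number of items copied
// (an assumption made for this sketch).
inline unsigned int scatter_all_reference(const std::vector<HostScatterData> &scatterData,
                                          unsigned int itemCount,
                                          unsigned int out_index_offset = 0) {
    for (const HostScatterData &v : scatterData) {
        std::memcpy(v.out + out_index_offset * v.typeLen, v.in,
                    itemCount * v.typeLen);
    }
    return itemCount;
}
```

Because each variable carries its own typeLen, an int variable and a float variable can share one call while being copied with different strides.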

unsigned int scatterAll(unsigned int streamResourceId, cudaStream_t stream, const VariableMap &vars, const std::map<std::string, void*> &in, const std::map<std::string, void*> &out, unsigned int itemCount, unsigned int out_index_offset)

Convenience wrapper to scatterAll()

Parameters:
  • streamResourceId – The stream index to use for accessing stream specific resources such as scan compaction arrays and buffers

  • stream – CUDA stream to be used for async CUDA operations

  • vars – Variable description map from ModelData hierarchy

  • in – Input variable name:ptr map

  • out – Output variable name:ptr map

  • itemCount – Total number of items in input array to consider

  • out_index_offset – The offset to be applied to the output index (e.g. if out already contains data)

void pbm_reorder(unsigned int streamResourceId, cudaStream_t stream, const VariableMap &vars, const std::map<std::string, void*> &in, const std::map<std::string, void*> &out, unsigned int itemCount, const unsigned int *d_bin_index, const unsigned int *d_bin_sub_index, const unsigned int *d_pbm)

Used for reordering messages from SoA to SoA. Position information is taken from PBM data, rather than d_position. Used by spatial messaging.

Parameters:
  • streamResourceId – The stream index to use for accessing stream specific resources such as scan compaction arrays and buffers

  • stream – CUDA stream to be used for async CUDA operations

  • vars – Variable description map from ModelData hierarchy

  • in – Input variable name:ptr map

  • out – Output variable name:ptr map

  • itemCount – Total number of items in input array to consider

  • d_bin_index – This identifies which bin each index should be sorted to

  • d_bin_sub_index – This identifies where within its bin an index should be sorted to

  • d_pbm – This is the PBM; it identifies the index at which each bin’s storage begins
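Taken together, the three index parameters suggest that the destination of item i is the start of its bin plus its offset within that bin. The host-side sketch below illustrates that reading for one variable; pbm_reorder_reference is a hypothetical name and the formula is inferred from the parameter descriptions rather than from the implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference for one variable.
// pbm[b] is the output index at which bin b's storage begins.
template <typename T>
std::vector<T> pbm_reorder_reference(const std::vector<T> &in,
                                     const std::vector<unsigned int> &bin_index,
                                     const std::vector<unsigned int> &bin_sub_index,
                                     const std::vector<unsigned int> &pbm) {
    std::vector<T> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        // Destination = start of the item's bin + its slot within that bin.
        const unsigned int dest = pbm.at(bin_index.at(i)) + bin_sub_index.at(i);
        out.at(dest) = in[i];
    }
    return out;
}
```

After the reorder, all items of a bin are contiguous, so a reader of bin b only needs to visit the range from pbm[b] up to pbm[b + 1].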

void scatterNewAgents(unsigned int streamResourceId, cudaStream_t stream, const std::vector<ScatterData> &scatterData, size_t totalAgentSize, unsigned int inCount, unsigned int out_index_offset)

Scatters agents from AoS to SoA. Used by host agent creation.

Parameters:
  • streamResourceId – The stream index to use for accessing stream specific resources such as scan compaction arrays and buffers

  • stream – CUDA stream to be used for async CUDA operations

  • scatterData – Vector of scatter configuration for each variable to be scattered

  • totalAgentSize – Total size of all of the variables in an agent

  • inCount – Total number of items in input array to consider

  • out_index_offset – The offset to be applied to the output index (e.g. if out already contains data)
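The AoS to SoA conversion can be illustrated on the host as unpacking fixed-size records into one array per variable. In this sketch the HostNewAgentVar struct, its offset member, and the function name are hypothetical; the documentation above does not state how a variable's position within a record is communicated.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical description of one variable inside a packed agent record.
struct HostNewAgentVar {
    std::size_t typeLen;  // size in bytes of this variable
    std::size_t offset;   // byte offset within one record (assumed layout)
    char *out;            // SoA output array for this variable
};

// Unpacks inCount records of totalAgentSize bytes each from 'aos' into
// per-variable arrays, starting at element out_index_offset.
inline void scatter_new_agents_reference(const char *aos,
                                         const std::vector<HostNewAgentVar> &vars,
                                         std::size_t totalAgentSize,
                                         unsigned int inCount,
                                         unsigned int out_index_offset) {
    for (unsigned int i = 0; i < inCount; ++i) {
        const char *record = aos + i * totalAgentSize;
        for (const HostNewAgentVar &v : vars) {
            std::memcpy(v.out + (out_index_offset + i) * v.typeLen,
                        record + v.offset, v.typeLen);
        }
    }
}
```

The stride through the input is totalAgentSize for every variable, while the stride through each output array is that variable's own typeLen.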

void broadcastInit_async(unsigned int streamResourceId, cudaStream_t stream, const std::list<std::shared_ptr<VariableBuffer>> &vars, unsigned int itemCount, unsigned int out_index_offset)

Broadcasts a single value for each variable to a contiguous block in SoA. Used prior to device agent creation.

Parameters:
  • streamResourceId – The stream index to use for accessing stream specific resources such as scan compaction arrays and buffers

  • stream – CUDA stream to be used for async CUDA operations

  • vars – Variable description map from ModelData hierarchy

  • itemCount – Total number of items in input array to consider

  • out_index_offset – The offset to be applied to the output index (e.g. if out already contains data)
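For a single variable, the broadcast amounts to filling a contiguous range with one value. The following host-side sketch shows that idea only; broadcast_init_reference and its default_value parameter are hypothetical, and where the real function obtains each variable's value is not covered by the documentation above.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical host-side reference for one variable: writes default_value
// to itemCount consecutive elements starting at out_index_offset.
template <typename T>
void broadcast_init_reference(std::vector<T> &out, const T &default_value,
                              unsigned int itemCount,
                              unsigned int out_index_offset) {
    assert(out_index_offset + itemCount <= out.size());
    std::fill_n(out.begin() + out_index_offset, itemCount, default_value);
}
```

Elements before out_index_offset are left untouched, which matches the use case of initialising space for new agents behind the existing ones.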

void broadcastInit(unsigned int streamResourceId, cudaStream_t stream, const std::list<std::shared_ptr<VariableBuffer>> &vars, unsigned int itemCount, unsigned int out_index_offset)
void broadcastInit_async(unsigned int streamResourceId, cudaStream_t stream, const VariableMap &vars, void *const d_newBuff, unsigned int itemCount, unsigned int out_index_offset)
void broadcastInit(unsigned int streamResourceId, cudaStream_t stream, const VariableMap &vars, void *const d_newBuff, unsigned int itemCount, unsigned int out_index_offset)
void arrayMessageReorder(unsigned int streamResourceId, cudaStream_t stream, const VariableMap &vars, const std::map<std::string, void*> &in, const std::map<std::string, void*> &out, unsigned int itemCount, unsigned int array_length, unsigned int *d_write_flag = nullptr)

Used to reorder array messages based on the __INDEX variable; that variable itself is not sorted. Also throws an exception if any indexes are repeated.

Parameters:
  • streamResourceId – The stream index to use for accessing stream specific resources such as scan compaction arrays and buffers

  • stream – CUDA stream to be used for async CUDA operations

  • vars – Map of variable metadata, must correspond to variables within in and out parameters

  • in – Map name:ptr of input buffers to be sorted

  • out – Map name:ptr of output buffers to return sorted variable data into

  • itemCount – Number of items to be reordered

  • array_length – Length of the array messages are to be stored in (max index + 1)

  • d_write_flag – Device pointer to an array for tracking how many messages are output to each bin; it is the caller’s responsibility to ensure it is array_length or longer
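The reorder-with-duplicate-check described above can be sketched on the host with a per-slot write counter. This is an illustration only: the function name, the exception types, and the extra bounds check are assumptions of the sketch, not the library's behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical host-side reference for one variable.
// index[i] is the array slot (the __INDEX value) that message i belongs in.
template <typename T>
std::vector<T> array_message_reorder_reference(const std::vector<T> &in,
                                               const std::vector<unsigned int> &index,
                                               unsigned int array_length) {
    assert(in.size() == index.size());
    std::vector<T> out(array_length);
    std::vector<unsigned int> write_flag(array_length, 0);  // writes per slot
    for (std::size_t i = 0; i < in.size(); ++i) {
        const unsigned int slot = index[i];
        if (slot >= array_length) throw std::out_of_range("index outside array");
        if (++write_flag[slot] > 1) throw std::runtime_error("repeated index");
        out[slot] = in[i];
    }
    return out;
}
```

Counting writes per slot is what makes the repeated-index check possible, which is why the flag array must have at least array_length entries.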

inline CUDAScatter()
CUDAScatter(CUDAScatter const&) = delete
void operator=(CUDAScatter const&) = delete
struct InversionIterator

This utility class provides a wrapper for unsigned int *. When the iterator is dereferenced, the pointed-to unsigned int is evaluated using invert(). This is useful when trying to partition and sort a dataset using only scatter and scan.

Public Types

using difference_type = unsigned int
using value_type = unsigned int
using pointer = unsigned int*
using reference = unsigned int&
using iterator_category = std::random_access_iterator_tag

Public Functions

__host__ __device__ inline explicit InversionIterator(unsigned int *_p)
__device__ inline InversionIterator &operator=(const InversionIterator &other)
__device__ inline InversionIterator operator++(int a)
__device__ inline InversionIterator operator++()
__device__ inline unsigned int operator*()
__device__ inline InversionIterator operator+(const int b) const
__device__ inline unsigned int operator[](int b) const
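To make the role of InversionIterator concrete, the sketch below builds a host-only analogue and feeds it to an exclusive scan. It assumes that invert() maps 0 to 1 and any other value to 0; that definition, HostInversionIterator, and inverted_positions are all hypothetical and written only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed behaviour of invert(): zero becomes 1, anything else becomes 0.
inline unsigned int invert(unsigned int c) { return c == 0 ? 1u : 0u; }

// Hypothetical host-only analogue of InversionIterator.
struct HostInversionIterator {
    explicit HostInversionIterator(const unsigned int *_p) : p(_p) {}
    unsigned int operator*() const { return invert(*p); }
    unsigned int operator[](int b) const { return invert(p[b]); }
    HostInversionIterator operator+(int b) const {
        return HostInversionIterator(p + b);
    }
    HostInversionIterator &operator++() { ++p; return *this; }
    const unsigned int *p;
};

// An exclusive scan over the inverted flags gives the output position of
// every item whose original flag is 0, without modifying the flag array.
inline std::vector<unsigned int> inverted_positions(
        const std::vector<unsigned int> &flags) {
    std::vector<unsigned int> position(flags.size() + 1, 0);
    HostInversionIterator it(flags.data());
    for (std::size_t i = 0; i < flags.size(); ++i) {
        position[i + 1] = position[i] + it[static_cast<int>(i)];
    }
    return position;
}
```

Scanning the flags directly positions the flagged items, and scanning through the inverting wrapper positions the remainder, so the same flag array can drive both halves of a partition.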
struct ScatterData

As we scatter per variable, this structure holds all the data required for a single variable

Public Members

size_t typeLen
char *const in
char *out