VPP  0.8
A high-level modern C++ API for Vulkan
vpp::WArray< ItemT > Class Template Reference

Declares a workgroup-scoped variable of array type.

#include <vppLangAggregates.hpp>

Public Member Functions

 WArray (int s)
 Constructs a workgroup-scoped array of the specified size.
 
template<typename IndexT >
auto operator[] (IndexT index) const
 Allows read/write access to elements of the array.
 
Int Size () const
 Returns a GPU-level value equal to the size of the array.
 
int size () const
 Returns a CPU-level value equal to the size of the array.
 

Detailed Description

template<class ItemT>
class vpp::WArray< ItemT >

Declares a workgroup-scoped variable of array type.

This class creates arrays of a specified item type (scalar, vector or matrix) whose size is determined at shader compilation time and which are shared between the threads of a single workgroup.

This enables the following possibilities:

  • Fast inter-thread communication within a single workgroup.
  • A group-oriented paradigm of parallel programming, in which a single problem is solved on a single workgroup (compute unit). Multiple compute units are allocated to different problem instances, so they do not need to communicate. A single workgroup can also execute a generally sequential algorithm that has parallelizable subroutines (such as sorting, searching or reducing). Shared arrays are a natural way to construct working space for such algorithms. VPP facilitates such scenarios by providing a group-scoped algorithm library in the vpp::ct::group namespace.
  • Flexible memory allocation for individual threads.
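The group-oriented reduction mentioned above can be sketched on the CPU. This is only a model under stated assumptions, not VPP shader code: a std::vector stands in for the workgroup-scoped array, each value of tid stands in for one thread, and groupReduceSum is a hypothetical name that does not exist in VPP.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CPU-side model only. `shared` plays the role of a workgroup-scoped array;
// groupReduceSum is a hypothetical helper, not part of the VPP API.
int groupReduceSum(std::vector<int> shared)
{
    assert(!shared.empty());

    // Pad to a power of two with zeros (the identity of addition),
    // so that every step halves the active range cleanly.
    std::size_t n = 1;
    while (n < shared.size())
        n *= 2;
    shared.resize(n, 0);

    // Each outer iteration models one parallel step. Within a step,
    // "thread" tid adds the element located `stride` positions away.
    for (std::size_t stride = n / 2; stride > 0; stride /= 2)
    {
        for (std::size_t tid = 0; tid < stride; ++tid)
            shared[tid] += shared[tid + stride];
        // On a GPU, a workgroup barrier would separate the steps here.
    }

    return shared[0];
}
```

Within one step the threads read only from the upper half of the active range and write only to the lower half, which is why the inner loop could run in parallel on real hardware.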

Workgroups are also called thread groups in some proprietary APIs. A single workgroup usually maps to a so-called warp (32 threads) or wavefront (64 threads) on the GPU, depending on the GPU vendor. This is not required, though, as the workgroup size is configurable.

Shared arrays consume space inside the shared memory block. On current devices, this block is typically 32 to 48 kB. No general purpose registers are allocated. This type of array usually does not impose performance penalties, as long as all arrays fit inside the block.

Use the ComputeShader::getTotalWorkgroupMemory() and ComputeShader::getFreeWorkgroupMemory() functions to determine the total and the remaining size of the shared memory block. Exceeding the size with too many allocations results in an exception thrown by VPP during shader compilation. It is recommended to check these sizes before allocating large arrays. Different devices may provide different sizes. At the time of writing, the rule of thumb is as follows:

  • NVIDIA GPUs: 48 kB,
  • Radeon GPUs: 32 kB,
  • some mobile GPUs: 16 kB (this is the minimum required by the Vulkan standard).
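The sizes above translate directly into item counts. The following is a minimal sketch of that arithmetic in plain C++. The helper fitsInBlock and the constants are hypothetical names, not part of VPP, and the sketch assumes the array occupies exactly itemCount * itemBytes bytes with no padding, which a real device layout may not guarantee. In a real shader the free size would come from ComputeShader::getFreeWorkgroupMemory() rather than from a constant.

```cpp
#include <cassert>
#include <cstddef>

// Rule-of-thumb block sizes quoted above, in bytes.
constexpr std::size_t kNvidiaBlock  = 48 * 1024;
constexpr std::size_t kRadeonBlock  = 32 * 1024;
constexpr std::size_t kMinimumBlock = 16 * 1024;

// Hypothetical helper (not a VPP function): checks whether an array of
// itemCount items, each itemBytes wide, fits into freeBytes of shared
// memory. Assumes tight packing without padding.
inline bool fitsInBlock(std::size_t itemCount,
                        std::size_t itemBytes,
                        std::size_t freeBytes)
{
    assert(itemBytes > 0);
    // Dividing instead of multiplying avoids overflow for large counts.
    return itemCount <= freeBytes / itemBytes;
}
```

For example, 4096 four-byte items fill a 16 kB block exactly, while 2048 sixteen-byte items (such as four-component float vectors) need 32 kB and would not fit on a minimum-size device.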

Caution: one type of performance penalty that may occur with shared memory is read-after-write (RAW) latency. If you write to a location and immediately read from the same location, a delay may occur that can cause a considerable slowdown. Try to avoid this kind of access.

Constructor & Destructor Documentation

◆ WArray()

template<class ItemT>
vpp::WArray< ItemT >::WArray ( int  s)

Constructs a workgroup-scoped array of the specified size.

The size may come from a CPU-level variable. It cannot be changed after the array is constructed.

The lifetime of the constructed array extends until the end of the shader. Consider reusing the array where possible to avoid overflowing the shared memory block.
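The effect of this lifetime rule can be illustrated with a small CPU-side model. Everything in it is an assumption made for illustration: SharedBlockModel, separateArraysFit and reusedArrayFits are hypothetical names, not VPP types or functions, and the exception type is chosen arbitrarily. The model only captures the idea that space reserved in the shared block is never returned before the shader ends.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical model of the shared memory block; not a VPP type.
// Reservations are never released, mirroring the "until the end of the
// shader" lifetime described above.
class SharedBlockModel
{
public:
    explicit SharedBlockModel(std::size_t totalBytes)
        : d_total(totalBytes), d_used(0) {}

    // Reserves `bytes` and returns the offset of the reservation.
    // Throws std::length_error when the block would overflow.
    std::size_t allocate(std::size_t bytes)
    {
        assert(d_used <= d_total);
        if (bytes > d_total - d_used)
            throw std::length_error("shared memory block exhausted");
        const std::size_t offset = d_used;
        d_used += bytes;
        return offset;
    }

    std::size_t freeBytes() const { return d_total - d_used; }

private:
    std::size_t d_total;
    std::size_t d_used;
};

// Two phases, each with its own scratch array: both reservations stay live.
inline bool separateArraysFit(std::size_t blockBytes, std::size_t scratchBytes)
{
    SharedBlockModel block(blockBytes);
    try
    {
        block.allocate(scratchBytes); // scratch for phase 1
        block.allocate(scratchBytes); // scratch for phase 2
    }
    catch (const std::length_error&)
    {
        return false;
    }
    return true;
}

// Two phases sharing one scratch array: only one reservation is made.
inline bool reusedArrayFits(std::size_t blockBytes, std::size_t scratchBytes)
{
    SharedBlockModel block(blockBytes);
    try
    {
        block.allocate(scratchBytes); // reused by both phases
    }
    catch (const std::length_error&)
    {
        return false;
    }
    return true;
}
```

With a 16 kB block and a 10 kB scratch array, the variant with two separate arrays overflows, while the variant that reuses one array fits.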


The documentation for this class was generated from the following file: