WarpDrive Sampler for Numba

Description: Tutorial covering the basics of WarpDrive Sampler using Numba
Publisher: Salesforce
Latest Version: 1.0
Modified: April 4, 2023
Compressed Size: 21.06 KB

This is the second tutorial on WarpDrive, a PyCUDA-based framework for extremely parallelized multi-agent reinforcement learning (RL) on a single graphics processing unit (GPU). We assume you have already read our first tutorial on WarpDrive basics.

In this tutorial, we describe CUDASampler, a lightweight and fast action sampler that draws actions from the policy distributions of several RL agents across several environment replicas. CUDASampler runs on the GPU, so a large number of actions can be sampled in parallel.

Notably:

  • It reads the distribution on the GPU through PyTorch and samples actions exclusively on the GPU. There is no data transfer between the CPU and the GPU.
  • It maximizes parallelism down to the individual thread level, i.e., each agent in each environment has its own random seed and an independent random sampling process.
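
To make the two points above concrete, here is a minimal CPU-only sketch of the idea in plain Python. It is not WarpDrive's implementation and does not use the CUDASampler API: the class name `ToySampler`, its parameters, and the nested-list layout of the distributions are all hypothetical, chosen only for illustration. Each (environment, agent) pair gets its own seeded random generator, standing in for one GPU thread, and draws an action from its categorical distribution by inverse-CDF sampling.

```python
import random
from itertools import accumulate


def sample_action(probs, rng):
    """Draw one action index from a categorical distribution (inverse CDF)."""
    u = rng.random()  # uniform draw in [0, 1)
    for action, cumulative in enumerate(accumulate(probs)):
        if u < cumulative:
            return action
    return len(probs) - 1  # guard against floating-point round-off


class ToySampler:
    """Hypothetical stand-in for a per-thread sampler (illustration only)."""

    def __init__(self, num_envs, num_agents, base_seed=0):
        self.num_envs = num_envs
        self.num_agents = num_agents
        # One independent generator per (env, agent) pair. On a GPU, each
        # pair would map to one thread holding its own random state.
        self.rngs = [
            [random.Random(base_seed + env_id * num_agents + agent_id)
             for agent_id in range(num_agents)]
            for env_id in range(num_envs)
        ]

    def sample(self, distributions):
        """distributions[env][agent] is a list of action probabilities."""
        # The two loops are sequential here; on a GPU every iteration
        # would run concurrently in its own thread.
        return [
            [sample_action(distributions[e][a], self.rngs[e][a])
             for a in range(self.num_agents)]
            for e in range(self.num_envs)
        ]


# Two environments, two agents, three possible actions per agent.
dists = [
    [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]],
    [[0.0, 0.0, 1.0], [0.25, 0.25, 0.5]],
]
sampler = ToySampler(num_envs=2, num_agents=2, base_seed=7)
actions = sampler.sample(dists)
print(actions)
```

Because every pair owns its random state, the draws are independent of one another and reproducible for a fixed seed, which is the property the second bullet describes. In the GPU setting the distributions would already live in device memory, so no copy to the host is needed.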